On 12/27/17, 2:23 PM, "sqlite-users on behalf of Simon Slavin" <[hidden email] on behalf of [hidden email]> wrote:
> Fair point. Automatic de-duplication would be more beneficial. And it wouldn’t require extreme cleverness to be separately written into each application. APFS does not do automatic de-duplication.
We had Netapp filers at my last gig. It’s about the only thing I really miss from there, but they were *nice*. And automatic deduplication was not the least of the reasons.
On Wed, 27 Dec 2017, Simon Slavin wrote:
> I understand that ZFS does this too, though I’ve never used ZFS.
ZFS currently clones at the filesystem level. Filesystems are cheap to
create and delete, and consume only the space they require. Once a
filesystem has been cloned, only modified file blocks take new storage.
ZFS and some other storage pools/filesystems optionally support
de-duplication at the block level, so copying a block can amount to
incrementing a reference count. The application might do quite a lot
of work copying the data (slow), but the underlying store can realize
that the block matches existing copies and not store a new one.
Inserting just one byte early in a changed file may foil this, though,
since every subsequent block boundary shifts and no later block matches
its stored counterpart.
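The refcounting scheme described above, and the way a single inserted byte defeats it, can be sketched in a few lines. This is an illustrative toy (tiny 4-byte blocks, an in-memory dict as the block store), not how ZFS actually lays out its dedup table:

```python
import hashlib

BLOCK = 4  # toy block size for illustration; real filesystems use 4 KiB or more

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def store(data: bytes, pool: dict):
    """Store data block by block; identical blocks share one copy via a refcount."""
    for b in blocks(data):
        h = hashlib.sha256(b).hexdigest()
        if h in pool:
            pool[h][1] += 1      # duplicate block: bump refcount, no new storage
        else:
            pool[h] = [b, 1]     # new block: store it once
    return pool

pool = {}
store(b"ABCDEFGHIJKL", pool)
store(b"ABCDEFGHIJKL", pool)      # exact copy: only refcounts change
print(len(pool))                  # 3 unique blocks stored, not 6
store(b"XABCDEFGHIJKL", pool)     # one byte inserted at the front
print(len(pool))                  # 7: every boundary shifted, so 4 brand-new blocks
```

The second copy of the file costs no new blocks at all, but prepending a single byte realigns every block (XABC, DEFG, HIJK, L) so none match the originals, which is why content-defined chunking exists in systems that care about this.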
Filesystem tricks still do not solve the most common problem: the
master repository is usually accessed over a network, and networks are
comparatively slow.
Any DVCS is going to cause a penalty when the goal is to check out a
particular version of the files from a remote server and the
repository is large. A hosted VCS like CVS or SVN delivers just the
desired versions of the files (as long as the server remains available
and working), whereas with a DVCS the whole repository normally needs
to be duplicated first.
On Dec 27, 2017, at 1:49 PM, Bob Friesenhahn <[hidden email]> wrote:
> Any DVCS is going to cause a penalty when the goal is to check out a particular version of the files from a remote server
…the first time.
After you’ve got a clone, operations are usually much faster with a DVCS than a remote non-distributed VCS.
> the whole repository normally needs to be duplicated first.
Which is why narrow and shallow cloning features can make the difference between “impractical” and “practical” when converting a sufficiently large repository from a traditional non-distributed VCS to a DVCS. In effect, they roll back some of the features you get from a DVCS, making it more like a non-distributed VCS.
When the repo size is on the order of that of sqlite.org, however, narrow and shallow clones buy you little, and they cost you the inherent backup you get by distributing your entire repository everywhere.
My war story above about my lost repository on Gna! happened because it was hosted in Subversion, so I lost all project history between the last svnadmin backup I made and the working copies on my development machines. That can't happen now that the project is hosted on Fossil: every user of that software holds the whole project history.
sqlite-users mailing list
[hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users