Git LFS vs Git Annex

τэкnoкraτ

I have been using git annex for a few years now. I can't call myself a fan, but among its alternatives it is the one I have managed to work with the longest.

Its alternatives?

Dropbox is slow (maybe only here in Turkey, maybe elsewhere too). Google Drive doesn't have a proper Linux client, and one of its proprietary clients created duplicate files in my drive. I think I have over 200GB of duplicates there, and cleaning them up has become an object of procrastination. I have used unison in the past, but its never-ending comparisons made me sick. I also wrote rsync scripts, but they are tedious and risky, since a small error can cost you files.

When I discovered git annex, it was very good news. It can use multiple remotes and local repositories to store file contents, it is a natural remedy for duplicate files, and it is a logical extension to git. I was using Mercurial more than git when I began using annex, but the change paid off. The downside? It may become slow after 1 million files or so, and its reliance on symbolic links makes it alien to the Windows world. I haven't used annex under Windows. It seems possible, though using symbolic links under Windows always makes me question the operating system preferences of common folk.

Anyhow, when I learned that Bitbucket and GitHub had begun to support git-lfs, an alternative to annex that can store large binary files more automatically than annex, I decided to give it a try. I like new things.

Initial setup is a bit easier in annex, but daily use is simpler in lfs: once you have asked lfs to track certain file types, you don't need to learn any extra commands. However, a similar setup is possible in annex too, if you use git annex add instead of git add and set options that automatically catch the file types. On that front there isn't much to decide.
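To make the "track certain file types" part concrete, here is a small Python sketch of the .gitattributes lines each tool relies on. The first line is what `git lfs track "*.pdf"` writes. The second uses the `annex.largefiles` attribute, which is one way to make `git annex add` pick up matching files automatically. The helper names and the `*.pdf` pattern are made up for illustration, and the exact behaviour depends on your git-annex version, so take it as a sketch rather than a recipe.

```python
# Sketch only: helper names and patterns are hypothetical, chosen for illustration.

def lfs_track_line(pattern: str) -> str:
    """Return the .gitattributes line that `git lfs track "<pattern>"` adds."""
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"


def annex_largefiles_line(pattern: str, expression: str = "anything") -> str:
    """Return a .gitattributes line telling git-annex to treat matches as large files.

    `expression` is a git-annex matching expression; "anything" matches
    every file that the pattern selects.
    """
    return f"{pattern} annex.largefiles={expression}"


if __name__ == "__main__":
    for pattern in ("*.pdf", "*.iso"):
        print(lfs_track_line(pattern))
        print(annex_largefiles_line(pattern))
```

Either line goes into the .gitattributes file at the root of the repository. After that, lfs works through plain `git add`, while annex is still driven by `git annex add`.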

However, you can't use e.g. S3 to store file contents in LFS. It needs server support and a git installation on the other side. I gave it a try in one of my normal repositories to store PDF content, but I wasn't able to push my changes to a bare repository. And the killer for me was that I couldn't name my remotes anything other than origin if I wanted to use the automated sync facilities in LFS.

git-lfs uses a smudge filter to replace file contents at checkout time. However, smudge is reported not to have remote branch info, so it's not possible to know the remote and branch name automatically at this later stage. When I learned about that basic design decision of git-lfs, using smudge instead of symbolic links seemed a mistake. I decided that I would rather learn a few extra commands of git annex than invest in git-lfs.
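For readers who haven't met clean and smudge filters, the following Python sketch mimics the idea. On commit, a "clean" step swaps the real content for a small pointer text. On checkout, a "smudge" step swaps the pointer back for the content. The pointer layout follows the published git-lfs pointer format. Everything else is an assumption for the sake of the example: the `store` dictionary stands in for the local object cache and the LFS server, and the function names are hypothetical. Note that the filter only receives the file content, which fits the complaint above about not knowing the remote or branch.

```python
import hashlib

# Version line used by git-lfs pointer files.
POINTER_VERSION = "https://git-lfs.github.com/spec/v1"


def clean(content: bytes, store: dict) -> str:
    """Replace real content with a pointer; keep the content in `store`.

    `store` is a stand-in (assumption) for the LFS object storage.
    """
    oid = hashlib.sha256(content).hexdigest()
    store[oid] = content
    return f"version {POINTER_VERSION}\noid sha256:{oid}\nsize {len(content)}\n"


def smudge(pointer: str, store: dict) -> bytes:
    """Replace a pointer with the real content looked up in `store`."""
    # Each pointer line is "key value"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in pointer.splitlines() if line)
    algorithm, oid = fields["oid"].split(":", 1)
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    content = store[oid]  # KeyError here means the object was never fetched
    if len(content) != int(fields["size"]):
        raise ValueError("stored object size does not match pointer")
    return content


if __name__ == "__main__":
    objects = {}
    pointer_text = clean(b"%PDF-1.4 pretend this is a large file", objects)
    print(pointer_text)
    print(smudge(pointer_text, objects))
```

What git stores in history is only the three-line pointer. git annex reaches a similar result with a symbolic link that points into .git/annex/objects, so no filter has to run at checkout.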


