@are0h Oh cool. Looked like it might just be a filter-branch wrapper at first, but:
> Taking advantage of the great support for parallelism in Scala and the JVM, the BFG does multi-core processing by default - the work of cleaning your Git repository is spread over every single core in your machine and typically consumes 100% of capacity for a substantial portion of the run.

> All action takes place in a single process (the process of the JVM), so doesn't require the frequent fork-and-exec-ing needed by git-filter-branch's mix of Bash and C code.
> How do you even find out that's something you can do with Git?
There are like 10000 man pages, but then one day you have a specific problem and you happen to read the right line in the right page and you go "omg I so needed to know that last year when there was that thing with the thing".
A couple of weeks later you're like "What, you didn't know that? It's common knowledge, just read the man page".
In this specific case, the man page is `git-rebase`.
@are0h @webinista Yeah, a specialized tool for a very specific but pretty common use case, which lets it run circles around the more generic, composable tool.
One small thing: `git gc --aggressive` doesn't do much for the object size; it just uses window/depth 250/50 instead of the default 10/50. Try e.g. `git repack -f -a -d --window=1000 --depth=1000` (the `-f` discards the old deltas that were calculated with the tiny window).
Or consider raising the global gc.aggressiveWindow and gc.aggressiveDepth parameters considerably. I even have my pack.window and pack.depth at 1000. The machine has the memory and the juice for it, and most packing is done in the background anyway, so it doesn't affect the user experience if it takes some time once in a while.
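Concretely, that's something like this (the config keys are the real ones; the 1000s are just what I happen to use, so tune them to your machine's RAM):

```sh
# Make plain `git repack` / background gc use big values:
git config --global pack.window 1000
git config --global pack.depth 1000

# And/or crank up what `git gc --aggressive` uses:
git config --global gc.aggressiveWindow 1000
git config --global gc.aggressiveDepth 1000
```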
@are0h @webinista The window is how many similar-sized objects it's looking at when determining which delta is the smallest; the depth is how long a chain of delta-upon-delta it will accept.
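If you want to see what that actually produces, `git verify-pack -v` prints a chain-length histogram at the end (the numbers below are made up for illustration):

```sh
git verify-pack -v .git/objects/pack/pack-*.idx | tail -n 5
# chain length = 48: 102 objects
# chain length = 49: 97 objects
# chain length = 50: 813 objects
# .git/objects/pack/pack-….pack: ok
```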
@are0h @webinista It's funny how Linus goes "you should do it overnight" with depth/window 250/250, because he's back in 2007. On my 2013 laptop, linux.git with 1000/1000 doesn't need overnight. :-)
So apparently `--aggressive` doesn't just mean higher depth and window, it also means discard deltas (that `-f` to git-repack). I had forgotten that!
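In other words, the repack that `git gc --aggressive` kicks off amounts to roughly this (simplified; the exact flags vary between git versions, so this is the gist, not a verbatim transcript):

```sh
# -f = recompute deltas from scratch; window/depth come from
# gc.aggressiveWindow / gc.aggressiveDepth:
git repack -f -a -d --window=250 --depth=50
```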
----
So I guess obscure emails on random tangentially-related mailing lists are another source of ... osmosis. The information flow in these projects is so bad, I sometimes think maybe taking a time machine back to 2008 and then not having a life is the easy way to learn the nooks and crannies. I guess time travel can be emulated by reading ancient issues of LWN every weekend. Neither is optimal.
Maybe just hanging around people who already did it, and asking questions, is the reasonable way.
@are0h @webinista I'm in favor of worse is better. If git hadn't attracted the people it did, if Linus hadn't gotten fed up with the loss of BitKeeper and hashed out git over a weekend and then just rolled with it, we wouldn't have git. Maybe we would have had Mercurial, which many people say is much less arcane and has an actually designed user experience in comparison -- but then again, maybe Mercurial wouldn't have been in the state it is without the competition from git.
So maybe people generally use git the https://xkcd.com/1597/ way and maybe that works. And maybe if people mess their repo up they can just nuke it and clone a new one instead of spending the weekend finding out what happened and how git actually really works.
But I'm hoping that something like http://gitless.com/ can replace #xkcd1597 and still let us keep the history of git repos that we have now, and if it's *really* necessary to do some serious git surgery, maybe someone who already wasted time on learning the idiosyncrasies, like you or me, can take care of that with good old/bad git.
You can do some really neat stuff with git, but the problem is you *have* to do some really neat stuff to get by in everyday situations.
@are0h @webinista I like luddite technologists. I don't think there's a contradiction there, only a balance. And the problem as such is *maybe* not someone creating a tight technical solution and then rigging the UI with gum and paper clips. That's fine for hackers. The problem is just when hackers solve problems for hackers, cover the first 80%, and then the product turns out to be generally useful and gets dumped on people just trying to be productive (and there's Nothing Wrong with That!). There needs to be a step of "ok, now we've figured out the proof of concept -- now how do we make a new one that is actually usable rather than just useful?"
Goes back to my old adage "your prototype *will* get put into production", and I'm still not sure there's a generic solution to that problem.
@are0h I simply mean that if one is creating a product, it would make sense to choose e.g. Mercurial over git for its source code, for usability reasons. But then it makes sense to choose git over Mercurial for network-effect reasons, and that's what people often end up doing.
Better git usability would make this not an issue. Maybe people should try gitless.
@are0h In particular, some things are usable only as a CLI or a GUI/CLI mix. Worse is better is not the only reason the CLI persists -- the two richest and most profitable software companies tried to abolish it for decades and then embraced it.