|
Subject: Re: Bazaar, OLPC, and the "no-packs" repository format. Newsgroups: gmane.comp.version-control.bazaar-ng.general Date: Thursday 20th December 2007 13:06:00 UTC (over 10 years ago) On Št, 2007-12-20 at 06:25 -0600, John Arbash Meinel wrote: > Robert has at least conceptualized a hash-based index, so lookups become more > of an O(1) operation. Which makes it O(N) for N packs, but that is better than > O(N log(M)) for our current indexes. I wonder if you could even combine the > hash maps to make it O(1) across all indexes that you have read... I've implemented a prototype of a similar idea about two weeks ago, but gave up on finishing it because I found out that the main bottleneck for me was reading data from all over the big pack files. The indexing code is in the attachment, but it's only a part of the plugin, which itself requires patched bzrlib and breaks 'packs-0.92', so I did't bother with attaching the whole package (it's not very usable, anyway). The general idea is that the index is split into segments, which are grouped by their left-hand history (similar to toposort, but backwards) and the index header contains 32 bit hashes of each key of each segment. The combined index also uses a global cache, so in most cases you don't search in all indexes, but only the ones that actually contain the key (for 15k revisions in bzr.dev I get no hash collisisions, so it's something like O(1+very_small_constant*N)). But that helps only in cases where you have a few big indexes, not a *lot* of tiny ones. Lukas |
||