Subject: [MERGE] Updates to the "auto-buffer" logic of GraphIndex
Newsgroups: gmane.comp.version-control.bazaar-ng.general
Date: Friday 29th August 2008 19:00:06 UTC

Attached are two tweaks to how the GraphIndex code decides to switch over to "_buffer_all" mode. (The tests are a bit noisier than they need to be, because I factored out some common code into helper functions.)

1) If a single readv request returns the whole index, buffer_all and return.

This turns out to happen a lot in real-world repositories with mixed-size packs and remote access. By default, remote access reads 64kB pages, and when a repository grows "organically" it tends to end up with a lot of indexes under the 64kB page size.

For example, bzr.dev:
  http://bazaar-vcs.org/bzr/.bzr/repository/indices/
has 104 index files, and 81 of those are < 64kB in size (only 68 of those should actually be referenced; it seems we have a bit of 'junk' there).

And bzr.dev in Launchpad:
  http://bazaar.launchpad.net/~bzr/bzr/trunk/.bzr/repository/indices/
has 72 index files, and 56 of those are < 64kB in size.

Or mysql-5.1:
  http://bazaar.launchpad.net/~mysql/mysql-server/mysql-5.1/.bzr/repository/indices/
has 52 index files, and 48 of them are < 64kB in size.

The existing code would read the whole file and parse it into "bisect_nodes" pointers, and then later on might decide that it actually needed the whole file anyway (say, because num_requested_keys > total_keys / 20), which would cause it to read the file a second time. My patch just says "if offset == 0 and len(data) == self._size: buffer_all(data)".

2) If we read more than 50% of the index via random readv() calls, trigger a buffer_all.

This is mostly because of some pathological behavior I was seeing while iterating the revision graph to find the full set of nodes to request. Our existing bisect code makes a request for (start, 800), which the Transport layer translates into a (start, 64kB) request. The data is returned, and then we chop off the leading and trailing bits, which don't amount to keys we can actually parse. The problem is that if we then request (start-20, 800) bytes, it gets turned into another (start-20, 64kB) of data, which greatly overlaps what we have already read.

Also, while searching the graph, we would eventually get to a point where we were requesting more than 1/20th of the keys in a single request (bzr gets pretty wide after about 40 get_parent_map() calls because we have a lot of merging), which would trigger a buffer_all anyway. When I looked at it, I saw us read about 1.6x the total size of the file before we ended up triggering "buffer_all" by 'accident'.

50% is just an arbitrary threshold. Robert has mentioned in the past that 10% of the index may be sufficient. (That is the number I remember from my postgresql tuning days: if more than 10% of the data is going to be read, use a sequential scan instead of an index.) We could be conservative and set it to 100%, so that once we have readv()'d enough bytes to be equivalent to reading the whole file, we go ahead and just read the whole thing. Note also that this only triggers on the *next* iter_entries call. Arguably we could use "if this readv would push us past 100% of the file, buffer_all", but I think that would only save 1 round trip (out of 40).

Neither of these changes helps fully-packed repositories much, but that doesn't really matter, as that is not the "common" case. My own bzr repo has 23 pack files.
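Roughly, the two checks boil down to something like the following sketch (this is not the literal diff against GraphIndex; the _bytes_read_via_readv counter is just a name I'm using here for illustration):

# Rough sketch of the two checks (not the actual patch; the
# _bytes_read_via_readv counter name is made up for illustration).

class GraphIndexSketch(object):

    def __init__(self, size):
        self._size = size               # total length of the index file
        self._nodes = None              # set once _buffer_all() has run
        self._bytes_read_via_readv = 0  # running total of readv'd bytes

    def _buffer_all(self, data=None):
        # Read (or reuse 'data') and parse the entire index in one pass.
        self._nodes = {}

    def _should_buffer_all(self):
        # Check 2: once random readv()s have pulled in more than 50% of
        # the file, just buffer the whole thing on the next iter_entries.
        return self._bytes_read_via_readv > self._size // 2

    def _process_readv_result(self, offset, data):
        # Check 1: a single readv that returned the entire file means we
        # already hold every byte -- parse it all instead of bisecting.
        if offset == 0 and len(data) == self._size:
            self._buffer_all(data)
            return
        self._bytes_read_via_readv += len(data)
        # ...otherwise parse just this region into bisect_nodes pointers.

Whether the cutoff stays at 50% or moves toward 10% or 100% only changes _should_buffer_all(); nothing else needs to care.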
In testing the latest bzr.dev, branching over bzr+ssh://localhost with a 200ms delay (400ms ping), I get:

  time     calls   bzr branch
   94.0s    102    bzr.dev packed
  133.7s    278    bzr.dev unpacked
   88.5s     75    new packed
  115.7s    202    new unpacked

"calls" is the number of "hpss call" statements in the log file. So we save about 27 calls on a packed repository (6% of total time), and 76 calls on an unpacked one (15% of total time).

These numbers are latency constrained but not bandwidth constrained (as I'm using the loopback). So I ran another test doing "bzr branch lp:bzr", just to see one that is both latency *and* bandwidth constrained. My ping to "bazaar.lp.net" is:

  rtt min/avg/max/mdev = 112.003/270.721/384.155/84.658 ms

So I average 270ms, but it can go up to 400 and down to 100.

  time     calls   command
  11m04s   231     bzr.dev
  10m43s   172     new

So 59 fewer calls, saving about 21s total, or 3% of total time. Not a huge difference. I should point out that I'm still mostly bandwidth constrained with the new code: 231 calls * 400ms = 92s, or 1m32s, so at best I could get down to 9m30s. 10m43s is actually quite good for my bandwidth: the final disk size of the pack and indexes is 87MB, and 87MB @ 160kB/s = 556.8s = 9m16s. So with the patch we are at 113% of the bandwidth-capped time, and without it we are at 116%.

I had thought about trying to get these into 1.6.1 (to help with the loss of the HPSS.stream_revisions RPC), but there isn't a reviewer, and I didn't want to delay rc1.

Also, I would comment that probably the next best thing we could do to decrease effective latency is to have HPSS.readv() actually stream the chunks back. Right now it waits to buffer the entire thing into a single memory buffer, and then returns that to the RemoteTransport.readv handler, which then yields it back as you go. The actual HPSS request wouldn't even have to change, just the client: right now we use response_handler.read_body_bytes() rather than a function that would let us process the bytes as they come back. It would be more memory efficient (you never have to buffer 30MB of file/inventory data), and it would probably be slightly faster, since while waiting on bytes from the socket you can do some work locally to push the data you already have into your pack file. This could probably shave as much as 30s off of the branch time. At least, that is the "user" time spent during the download, so interleaving 30s of user time with the download time would probably be a decent win. If we could truly do that, we would be down to 107.5% of the bandwidth-capped time.
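To illustrate the client-side change I mean, here is a sketch. read_body_bytes() is what we call today; iter_body_chunks() and write_to_pack are purely hypothetical stand-ins ("hand me the body incrementally" and "whatever consumes the bytes locally"), not existing smart-protocol API:

# Sketch only.  read_body_bytes() is what the client uses today;
# iter_body_chunks() is a hypothetical accessor standing in for "give me
# the response body as it comes off the socket", and write_to_pack is
# whatever pushes the bytes into the local pack file.

def consume_buffered(response_handler, write_to_pack):
    # Today: the whole body (possibly 30MB of file/inventory data) sits
    # in one memory buffer before we do anything useful with it.
    data = response_handler.read_body_bytes()
    write_to_pack(data)

def consume_streamed(response_handler, write_to_pack):
    # The idea: push each chunk into the local pack file while the socket
    # is still delivering the rest, overlapping local work with download.
    for chunk in response_handler.iter_body_chunks():
        write_to_pack(chunk)

The request on the wire and the server stay exactly the same; the only difference is when the bytes become visible to the readv() caller.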
John =:->