Subject: Bug in Data.Map
Date: Tuesday 3rd August 2010 20:59:45 UTC (over 7 years ago)
Hi all! Taylor Campbell found a bug in Data.Map but wasn't able to post to this mailing list so I'm submitting this on his behalf. ---------------------------- Deletion in Data.Map does not always preserve the balance of a tree. Deletion in Data.Set appears to preserve the balance in some cases where Data.Map does not; I don't know whether Data.Set always leaves the tree balanced. I have attached two files to test the balance of trees after a random sequence of insertions and a pathological sequence of deletions (deleteMin -- although this is not pathological for some applications such as priority queues). TestMap.hs reliably demonstrates for me that deleteMin readily yields an unbalanced tree, if I supply a size of 10000 to test_map. TestSet.hs does not. The salient difference between Data.Set and Data.Map, I believe, is the choice of delta (w in Adams' paper), which is 4 in Data.Set and 5 in Data.Map. Since Adams' paper claims that when alpha (i.e. 1/ratio) is 1/2, w must exceed about 4.646, this seems suspect. But I haven't analyzed the reasoning in Adams' paper to find whether his conclusions are wrong. What I do know is that I can break trees when w is 5, but I have not been able to break them when w is 4. I also noticed a comment in Set.hs saying that a value of 4 for w improved performance on many operations. I have also not been able to break trees if I change deletion to use join rather than balance. But that slows everything down a good deal more, I believe, than simply changing w from 5 to 4.