Subject: Re: GHC predictability
Date: Monday 12th May 2008 19:01:53 UTC
Don Stewart wrote:
> jeff.polakow:
>> Hello,
>>
>> One frequent criticism of Haskell (and by extension GHC) is that it has
>> unpredictable performance and memory consumption. I personally do not find
>> this to be the case. I suspect that most programmer confusion is rooted in
>> shaky knowledge of lazy evaluation; and I have been able to fix, with
>> relative ease, the various performance problems I've run into. However I
>> am not doing any sort of performance critical computing (I care about
>> minutes or seconds, but not about milliseconds).
>>
>> I would like to know what others think about this. Is GHC predictable? Is
>> a thorough knowledge of lazy evaluation good enough to write efficient
>> (whatever that means to you) code? Or is intimate knowledge of GHC's
>> innards necessary?
>>
>> thanks,
>> Jeff
>>
>> PS I am conflating Haskell and GHC because I use GHC (with its extensions)
>> and it produces (to my knowledge) the fastest code.
>
> This has been my experience too. I'm not even sure where
> "unpredictability" would even come in, other than through not
> understanding the demand patterns of the code.
>
> It's relatively easy to look at the Core to get a precise understanding
> of the runtime behaviour.
>
> I've not found the GC unpredictable either.

I offer up the following example:

  mean xs = sum xs / fromIntegral (length xs)

Now try, say, "mean [1 .. 1e9]", and watch GHC eat several GB of RAM. (!!)
The list is traversed twice, once by sum and once by length, so the whole
thing has to stay in memory between the two passes.

If we now rearrange this to (with foldl' imported from Data.List)

  mean = (\(s,n) -> s / n) . foldl' (\(s,n) x -> let s' = s+x; n' = n+1 in s' `seq` n' `seq` (s', n')) (0,0)

and run the same example, it runs in constant space.

Of course, the first version is clearly readable, while the second one is
almost utterly incomprehensible, especially to a beginner. (It's even more
fun that you need all those seq calls in there to make it work properly.)
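The single-pass idea can also be written with a small strict accumulator type, which some readers may find easier to follow than a tuple threaded with seq. This is only a minimal sketch, not code from the thread: the names SumCount and meanStrict are made up for illustration, and it assumes the input is a list of Double.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical helper type (not from the original post): both fields are
-- strict, so the running sum and count are evaluated at every step
-- instead of being left as unevaluated thunks.
data SumCount = SumCount !Double !Int

-- One left fold carries the sum and the element count together, so the
-- list is traversed once and can be garbage collected as it is consumed.
-- For an empty list the result is NaN (0/0).
meanStrict :: [Double] -> Double
meanStrict xs = s / fromIntegral n
  where
    SumCount s n = foldl' step (SumCount 0 0) xs
    step (SumCount acc len) x = SumCount (acc + x) (len + 1)

main :: IO ()
main = print (meanStrict [1 .. 1e6])
```

The strict fields do the work that the explicit seq calls do in the tuple version: foldl' evaluates the accumulator to its constructor at each step, and the strictness annotations then force the two numbers inside it.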
The sad fact is that if you just write something in Haskell in a nice,
declarative style, then roughly 20% of the time you get good performance,
and 80% of the time you get laughably poor performance.

For example, I sat down and spent the best part of a day writing an MD5
implementation. Eventually I got it so that all the test vectors work
right. (Stupid little-endian nonsense... mutter mutter...) When I tried it
on a file containing more than 1 MB of data... ooooohhhh dear... I gave up
after waiting several minutes for an operation that the C implementation
can do in milliseconds. I'm sure there's some way of fixing this, but...
the source code is pretty damn large, and very messy as it is. I shudder to
think what you'd need to do to it to speed it up.

Of course, the first step in any serious attempt at performance improvement
is to actually profile the code to figure out where the time is being
spent. Laziness is *not* your friend here. I've more or less given up
trying to comprehend the numbers I get back from the GHC profiles, because
they apparently defy logic. I'm sure there's a reason to the madness
somewhere, but... for nontrivial programs, it's just too hard to figure out
what's going on.

Probably the best part is that almost any nontrivial program you write
spends 60% or more of its time doing GC rather than actual work. Good luck
with the heap profiler. It's even more mysterious than the time profiles. ;-)

In short, as a fairly new Haskell programmer, I find it completely
impossible to write code that doesn't crawl along at a snail's pace. Even
when I manage to make it faster, I usually have no clue why. (E.g., adding
a seq to a mergesort made it 10x faster. Why? Changing from strict
ByteString to lazy ByteString made one program 100x faster. Why?)

Note that I'm not *blaming* GHC for this - I think it's just inherently
very hard to predict performance in a lazy language. (Remember,
deterministic isn't the same as predictable - see Chaos Theory for why.)
I wish it weren't - because I really *really* want to write all my complex
compute-bound programs in Haskell, because it makes algorithms so
beautifully easy to express. But when you're trying to implement something
that takes hours to run even in C...
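One common reason a single seq can change running time so much is thunk build-up in an accumulator. Whether that is what happened in the mergesort mentioned above cannot be told from the post, so the following is only a generic sketch of the effect. The function names are made up for illustration. It is best compiled without optimisation to see the difference, since with -O GHC's strictness analysis may make the lazy version behave like the strict ones. Running the compiled program with `+RTS -s` prints a summary that includes the share of time spent in GC.

```haskell
module Main where

import Data.List (foldl')

-- Lazy left fold: the accumulator is not evaluated while the list is
-- walked, so it can grow into a chain of suspended additions
-- ((0 + 1) + 2) + ... that is only collapsed when the result is demanded.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Hand-written strict loop: `seq` forces the new accumulator before the
-- recursive call, so no chain of thunks builds up.
sumForced :: [Int] -> Int
sumForced = go 0
  where
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x
      in  acc' `seq` go acc' xs

-- The library version of the same idea.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  let n = 1000000 :: Int
  print (sumForced [1 .. n])
  print (sumStrict [1 .. n])
  print (sumLazy   [1 .. n])
```

All three functions return the same value; they differ only in when the additions are performed, and therefore in how much memory is held (and later garbage collected) along the way.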