I'd like to propose bugfixes, documentation fixes and a performance
improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't
effect the expected behaviour of the PP library.
I've written a QuickCheck test suite for the pretty printer (to test
the improvement), and found two bugs and some misconceptions/
ambiguities in the documentation. Additionally, there is a
microbenchmark for the suggested improvement.
Both are available at http://code.haskell.org/~bhuber/Text/
PrettyPrint/. Note that the QuickCheck tests need access to all top-
level names in HughesPJ (i.e. ignore the export list).
In summary, I propose to
* fix a bug in fillNB and one in fillNB/sepNB
* correct documentation on laws and invariants.
* add more efficient implementations of vcat,hsep,hcat
(1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys)
(2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical
(3) Lazy implementations of vcat,hcat and hsep
(4) Law isn't always true
(5) Invariant 5 should be made stronger
(6) Change the comment about negative indentation
The details follow below (maybe a little long).
== Details (Long) ==
(1) Bugfix fillNB
Law states that
> sep (ps++[empty]++qs) = sep (ps ++ qs)
> ...ditto hsep, hcat, vcat, fill...
In the current implementation, this fails for the paragraph fill
> render' $ fill True [ text "c", text "c",empty, text "c", text "b"]
> where render' = renderStyle (Style PageMode 7 1.4)
>> c c c
The reason is a missing test for Empty in fillNB
2) Bugfix: overlap and f?(cat|sep)
The specification for cat/sep:
* oneLiner (hcat/hsep ps)
vcat ps [*]
But currently cat, sep, fcat and fsep attempt to overlap the second
line with the first one, i.e. they use
`foldr ($$) empty ps' instead of `foldr ($+$) empty ps' [*]. I assume
this is a mistake.
This bug can lead to situations, where the line in the right argument
of Union is actually longer:
> prettyDoc$ cat [ text "a", nest 2 ( text "b") ]
>> text "a"; union
>> (text "b"; empty)
>> (nilabove; nest 1; text "b"; empty)
> renderStyle (Style PageMode 1 1) $ cat [ text "a", nest 2 ( text
>> "a b"
In the implementation, we call `nilAbove False' instead of `nilAbove
True' (see patch).
3) Improving the space/time performance of vcat, hsep, hcat
vcat (hsep,cat) is implemented in an unneccessarily strict way.
We only get some output after all of vcat's arguments are evaluated
and checked against being Empty.
This can be improved by only checking the right argument of foldr
against being Empty, and
then applying an Empty-filter on the resulting Doc. Space improvement
The microbenchmark (code.haskell.org/~bhuber/Text/PrettyPrint/
HughesPJPerfCheck.hs) suggests that
the improvements in time are remarkable too.
The QuickCheck tests revealed that
4) Law isn't always true
text "" <> x = x, if x non-empty
only holds if x does not start with nest (obviously, because of law
5) The invariant 5 should be extended:
A @NoDoc@ may only appear on the first line of the left argument of
an union. Therefore, the right argument of an union can never be
to the empty set.
because this assumption is used when rendering.
6) Change the comment about negative indentation
In the source code we have:
-- (spaces n) generates a list of n spaces
-- It should never be called with 'n' < 0, but that can happen for
reasons I don't understand
If we compose a <> b, and the first line of b is deeply nested, but
other lines of b are not,
then, because <> eats the nest, the pretty printer will try to layout
some of b's lines with
d2 | b
Here is the reason why this is rather unavoidable: In John Hughes
paper, there is a line stating that
"composing layouts does not change the layouts being composed".
Another one states that
"<> should eat up nests"
But this leads to negative indentation - imho a user error.
== End ==