Subject: Infinite, open, statically constrained HLists Newsgroups: gmane.comp.lang.haskell.general Date: Thursday 26th October 2006 07:38:38 UTC (over 11 years ago) We show a presentation of heterogeneous sequences that admits infinite sequences and permits post-hoc addition of new elements, even to an already infinite sequence. We can also assert static constraints, e.g., the sequence must be made of non-decreasing Peano numerals. Our sequences permit all the common operations such as `head', `tail', `cons', `map', `take' and `fold'. It is a type error to attempt to take `head' of an empty sequence. Any finite sequence is inter-convertible with the HCons-based HList. The full code is available at http://darcs.haskell.org/HList/CHList.hs The presentation is inspired by the problem posed by Greg Buchholz on the following thread: http://www.haskell.org/pipermail/haskell-cafe/2006-October/018948.html http://www.haskell.org/pipermail/haskell-cafe/2006-October/019030.html He observed how easy it is to define _some_ (potentially) infinite heterogeneous data structures. For example, data SeqI a = ConsI a (SeqI (S a)) | NilI defines the type of a sub-sequence of the sequence of natural Peano numerals. We can construct finite seqi1 = ConsI Z (ConsI (S Z) NilI) and even infinite sequences seqi_inf :: n -> SeqI n seqi_inf n = ConsI n (seqi_inf (S n)) but not seqi2 = ConsI Z (ConsI Z NilI) The latter ill-formed sequence is ill-typed. However, it is difficult to generalize these constructions, for example, to the sequences of Fibonacci Peano numerals or to the sequences of non-decreasing Peano numerals. The HList presentation, with familiar "HCons a b" and "HNil" types, makes it straightforward to define sequences with arbitrary constraints on the adjacent elements. The structure of such HList values is reflected in their types, and so we can statically constrain them. Alas, the type of an HList has as many HCons type constructors as there are HCons data constructors. That seems to rule out infinite sequences. There is a trivial way out of this predicament, via a different presentation. A sequence is a partial function from a range of natural numbers. In other words, a sequence is a mapping from a Natural number to {HNothing, HJust a}. We consider the HNothing element with the smallest index to be the end of the sequence. This obvious idea leads to the straightforward implementation: > class Nat n => CHList l n r | l n -> r where > chel :: l -> n -> r > chel = undefined > > -- Default `cut-off' instance > instance (Nat n, TypeCast r HNothing) => CHList l n r where > data HNothing = HNothing > newtype HJust x = HJust x > fromHJust (HJust x) = x -- this is the total function! The class 'CHList l n r' establishes the n-th element of the sequence named 'l' to be 'r'. The empty sequence is defined then by the single declaration: > data LEmpty = LEmpty Here is a bit more elaborate sequence, of two elements. Both are Ints. Homogeneous sequences are just a subset of heterogenous ones. > data L1 = L1 > instance CHList L1 Z (HJust Int) where > chel _ _ = HJust 0 > instance CHList L1 (S Z) (HJust Int) where > chel _ _ = HJust 1 We can convert such sequences to the regular HList; this is quite handy for showing them. For example, > test1 = show $ chl_to_hl L1 yields "HCons 0 (HCons 1 HNil)" Not surprisingly, we can implement all of the common List API, for example, > cl_head l = fromHJust (chel l Z) It will be a type error attempting to apply cl_head to an empty sequence. We can define tail > newtype CLTail l = CLTail l > instance (Nat n, CHList l (S n) r) => CHList (CLTail l) n r where > chel (CLTail l) n = chel l (S n) > > -- ensure the list to take the tail of is non-empty > cl_tail :: CHList l Z (HJust e) => l -> CLTail l > cl_tail l = CLTail l > > test_h3 = cl_head (cl_tail L1) -- 1 and cons > data CLCons a l = CLCons a l > cl_cons = CLCons > > instance CHList (CLCons a l) Z (HJust a) where > chel (CLCons x _ ) _ = HJust x > > instance (Nat n, CHList l n r) => CHList (CLCons a l) (S n) r where > chel (CLCons _ l) (S n) = chel l n after which we can introduce and extend our sequences in the `normal' way: > test_c1 = cl_head (cl_tail (cl_cons ((-2)::Int) (cl_cons ((-1)::Int) L1))) > test_c2 = cl_head (cl_tail (cl_cons () (cl_cons True LEmpty))) > test_c3 = show $ chl_to_hl $ cl_cons () L1 Defining other functions like `take', `map' and `fold' is straightforward and elided. The latter require the generalized `Apply' class defined in the HList library. We can write, for example, > test_m1 = cl_map (succ :: Int->Int) L1 > test_m2 = show $ chl_to_hl test_m1 because regular functions are already instances of the Apply class and so apply-able. It is worth noting a close resemblance between test_m1 above and > test_m1' = map (succ :: Int->Int) l1 > where l1 = [0,1] The difference seems to be in the case of the identifier: `l1' vs `L1'. Indeed, both l1 and L1 _denote_ certain values. The rules of looking up the value given the name are implicit in the case of l1 and explicit (and defined by us) in the case of L1. The type-checking is more explicit in the latter case too -- that's why we are able to assert more constraints, as we shall see soon. It goes without saying that our sequences can as well be infinite: > data LNats = LNats -- lists of successive numerals > instance Nat n => CHList LNats n (HJust n) where > chel _ n = HJust n which defines the infinite list of Peano numerals. That is quite akin to "seqi_inf Z", described in the beginning of this message. Now, however, we can introduce other infinite sequences, e.g., by mapping arbitrary function over LNats. We can also define sequences by recursion, e.g., LFibs. Perhaps more interesting is the fact that our lists are open, even when they are extensible. For example, if we define only > data LCyc = LCyc > instance (Nat n, CHList LCyc n r) => CHList LCyc (S (S n)) r where > chel l (S (S n)) = chel l n then > test_cyc = show $ chl_to_hl $ cl_take five LCyc yields "HNil". If we later add another instance > instance CHList LCyc Z (HJust ()) where chel _ _ = HJust () the same test_cyc will give "HCons () HNil". The addition the third instance > instance CHList LCyc (S Z) (HJust Bool) where chel _ _ = HJust True makes our list infinite. test_cyc now gives "HCons () (HCons True (HCons () (HCons True (HCons () HNil))))" We may still add instances, for example > instance CHList LCyc (S (S Z)) (HJust Bool) where chel _ _ = HJust False causes test_cyc to return "HCons () (HCons True (HCons False (HCons True (HCons False HNil))))" Finally, we show constrained lists. A constraint is a binary predicate; it is apply-ed to two consecutive elements of the sequence, and must return HTrue. For simplicity, we introduce the following non-generic class of constrained lists, whose elements must be non-decreasing numerals: > data CK l = CK l > class (Nat n, Apply (CK l) (n,r) HTrue) => C1 l n r | l n -> r where > chel1 :: l -> n -> r; chel1 = undefined > instance (Nat n, TypeCast r HNothing, Apply (CK l) (n, r) HTrue) => C1 l n r > > instance Apply (CK l) (Z,r) HTrue > instance Apply (CK l) (S n,HNothing) HTrue > instance (C1 l n (HJust ep), Apply PLEQ (ep,e) r) > => Apply (CK l) (S n,HJust e) r The constraint is specified as a class constraint; therefore, all instances of C1 must obey it. All instances of C1 are CHLists, after we say > newtype C1toCL cl = C1toCL cl > instance C1 cl n r => CHList (C1toCL cl) n r where > chel (C1toCL l) n = chel1 l n We introduce > data LC1 = LC1 > instance C1 LC1 Z (HJust Z) > instance C1 LC1 (S Z) (HJust Z) > instance C1 LC1 (S (S Z)) (HJust (S (S (S Z)))) > test_cl1 = show $ chl_to_hl $ C1toCL LC1 but we cannot add the instance instance C1 LC1 (S (S (S Z))) (HJust Z) because it fails the non-decreasing constraint. The constrained list may be infinite: > data LC2 = LC2 > instance C1 LC2 Z (HJust Z) > -- this constraint suggested by the typechecker > instance (Apply PLEQ (r, S (S r)) HTrue, > C1 LC2 n (HJust r)) > => C1 LC2 (S n) (HJust (S (S r))) > > test_cl2 = show $ chl_to_hl $ cl_take five $ C1toCL LC2 We check the constraint by applying the predicate to the current and the previous element of the sequence (if it exists). Even in case of the infinite sequence, the prefix is finite, so our constraint is effectively verifiable. |
|||