Subject: Even higher-order abstract syntax: typeclasses vs GADT
Date: Monday 22nd January 2007 03:06:33 UTC (over 10 years ago)
We show the typeclass implementation of the example used to make the case for GADTs. We demonstrate the higher-order abstract-syntax-based embedding of a language in Haskell and implement static and dynamic semantics of the language. The interpreter of the language is tagless and statically assured: Only well-typed terms may be evaluated, and the evaluation of those does not get stuck. We use no tags and *no* run-time pattern-matching, therefore, the `eval' function has no possibility of raising a run-time error. Our language is _non_-strongly normalizing and non-structurally inductive due to the presence of Fix; yet the typechecking is decidable and our typeclass programs always terminate. The running example has been used as an argument for generalized recursive data types (one of the variants of GADTs). This message is meant to clarify the expressiveness of GADTs with respect to typeclasses. Our implementation is also aimed at the sometimes exuberant enthusiasm for GADTs. In positive contribution, we demonstrate the _automatic_ promotion of a value-level higher-order abstract syntax (HOAS) term to the type-level. The small-lambda HOAS becomes the big-Lambda HOAS and can be operated by the typechecker. We demonstrate the typechecking of our term language by abstract interpretation. The latter is performed by the Haskell typechecker itself, following straightforward declarative rules expressed via instance declarations. The code works in GHC and Hugs. This message has been inspired by the paper `Guarded recursive datatype constructors' by Hongwei Xi, Chiyan Chen and Gang Chen (POPL2003) to be called `the GRDT paper' below. The particular inspiration comes from a comment by Meng Wang on Haskell-Cafe (Jan 13, 2007). GADTs are often motivated by a tagless interpreter. When we embed a language in Haskell, the corresponding data types may admit meaningless terms like an application of a number to a pair. If we get the Haskell typechecker to reject such terms, we gain the (static) type checking for our object language. Furthermore, the evaluator for the language no longer has to deal with ill-formed terms and report run-time `type' errors. There is no longer any need in `dynamic typing' and in maintaining and checking type tags. The paper `Wobbly types: type inference for generalised algebraic data types' by Peyton-Jones, Washburn, and Weirich used the embedding of a first-order language to motivate GADTs. That example is _trivially_ implemented in Haskell without GADTs, with simple typeclasses. http://pobox.com/~oleg/ftp/Haskell/GADT-interpreter.hs Far more interesting is the embedding of a higher-order language, described in the GRDT paper. The first challenge is representing higher-order syntax at the type level. More demanding is dealing with the fixpoint operator of the GRDT example. The HOAS fixpoint destroys the structural induction principle. If we fully expose the structure of our terms at the type level, how can we write divergent terms? Do we need (equi-)recursive types? Mainly, would the Haskell typechecker itself diverge when processing such terms? Who would have thought that divergent terms can be so interesting. The examples in the code include several terms whose run-time evaluation does diverge, yet the typechecking of those terms terminates. The termination of our typechecker is easy to see as each step eliminates one term constructor. The key to the terminating typechecking is abstract interpretation, which permits the introduction and the elimination of assumptions/hypothesis. The full code is available at http://pobox.com/~oleg/ftp/Haskell/GADT-HOAS.hs The syntax of our language is as follows. We took the first-order language described previously and added higher-order terms and the fixpoint. The resulting language is a little bit larger than that introduced in the GRDT paper. > newtype Lit a = Lit a -- aka, Lift in GRDT > newtype Inc a = Inc a > newtype IsZ a = IsZ a > newtype If a b c = If (a,b,c) > newtype Fst a = Fst a > newtype Snd a = Snd a > newtype Pair a b = Pair (a,b) > > newtype Lam f t = Lam f -- t is the type of the bound arg > newtype App a b = App (a,b) > newtype Fix f t = Fix f -- ditto > > newtype AbsValue t = AbsValue t The Lam and Fix terms with binders contain in their type the type of the bound variable. The GRDT paper imposes the same type annotation requirement. That type, however, can be inferred in many cases. The term `AbsValue' is used for typechecking of functions by abstract interpretation. Here is a sample higher-order term in our language, with latent non-termination > t5f = Lam $ \p -> If( IsZ p, (Pair (p,Inc p)), (Pair (p,Fix id)) ) The body of Lam is the ordinary Haskell function. We define the static semantics by > class Of a t | a -> t asserting that the term 'a' has the type 't'. The type-checking rules, defined by the instances of `Of' are straightforward: > instance Of (Lit t) t > instance Of a Int => Of (Inc a) Int More interesting is the typechecking of functions and fixpoints: the Hyp introduction and elimination rules. > instance (Apply f (AbsValue t1) y, Of y t2) => Of (Lam f t1) (t1->t2) > instance (Apply f (AbsValue t) y, Of y t) => Of (Fix f t) t > instance Of (AbsValue t) t -- Hyp elimination rule The Lam rules says that the function mapping the term `x' to the term 'y' has the type t1->t2, if the function maps the term AbsValue t1 to the term that has the type t2. In other words, if the body of the function has the type t2 _assuming_ the argument has the type t1. The AbsValue makes the typechecking of the HOAS Fix terminating. If we define > teval :: Of e t => e -> t -- static evaluator > teval = undefined -- all computation is static > et5f1 = teval $ t5f we can see the type inferred for t5f > *HOASL> :t et5f1 > et5f1 :: Int -> (Int, Int) We define the dynamic semantics as the big-step evaluator > class Eval a t | a->t where eval:: a -> t Most of the instances are straightforward: > instance Of (Lit a) a => Eval (Lit a) a where > eval (Lit i) = i > instance Eval t Int => Eval (Inc t) Int where > eval (Inc t) = eval t + 1 Again, more interesting is the evaluation of higher-order terms > instance (Eval a (t1->t2), Eval b t1) => Eval (App a b) t2 where > eval (App (a,b)) = (eval a) (eval b) > > instance (Apply f (Lit t1) b, Eval b t2) => Eval (Lam f t1) (t1->t2) where > eval (Lam f) = \x -> eval (apply f (Lit x)) > > instance (Apply f (Lit t) y, Eval y t) => Eval (Fix f t) t where > eval (Fix f) = fix (\x -> eval $ apply f (Lit x)) We can now evaluate the term and its applications > et5f1 = teval $ t5f > et5f2 = eval $ t5f > et5f3 = eval $ App (t5f, Lit (0::Int)) -- yields (0,1) > et5f4t= teval $ App (t5f, Lit (1::Int)) -- typechecking terminates > et5f4 = eval $ App (t5f, Lit (1::Int)) -- evaluation diverges! The full code contains a more meaningful example of using Fix, to add two numbers using recursion with accumulation. That example too is potentially diverging on some inputs.