Features Download
From: Simon Peyton Jones <simonpj <at> microsoft.com>
Subject: patch applied (ghc): Simon's big boxy-type commit
Newsgroups: gmane.comp.lang.haskell.cvs.ghc
Date: Wednesday 25th January 2006 16:31:12 UTC (over 11 years ago)
Wed Jan 25 08:28:32 PST 2006  [email protected]
  * Simon's big boxy-type commit
  This very large commit adds impredicativity to GHC, plus
  numerous other small things.
  *** WARNING: I have compiled all the libraries, and
  ***	     a stage-2 compiler, and everything seems
  ***	     fine.  But don't grab this patch if you 
  ***	     can't tolerate a hiccup if something is
  ***	     broken.
  The big picture is this:
  a) GHC handles impredicative polymorphism, as described in the
     "Boxy types: type inference for higher-rank types and
     impredicativity" paper
  b) GHC handles GADTs in the new simplified (and very sligtly less
     epxrssive) way described in the
     "Simple unification-based type inference for GADTs" paper
  But there are lots of smaller changes, and since it was pre-Darcs
  they are not individually recorded.
  Some things to watch out for:
  c)   The story on lexically-scoped type variables has changed, as per
       my email.  I append the story below for completeness, but I 
       am still not happy with it, and it may change again.  In particular,
       the new story does not allow a pattern-bound scoped type variable
       to be wobbly, so (\(x::[a]) -> ...) is usually rejected.  This is
       more restrictive than before, and we might loosen up again.
  d)   A consequence of adding impredicativity is that GHC is a bit less
       gung ho about converting automatically between
    	(ty1 -> forall a. ty2)    and    (forall a. ty1 -> ty2)
       In particular, you may need to eta-expand some functions to make
       typechecking work again.
       Furthermore, functions are now invariant in their argument types,
       rather than being contravariant.  Again, the main consequence is
       that you may occasionally need to eta-expand function arguments when
       using higher-rank polymorphism.
  Please test, and let me know of any hiccups
  Scoped type variables in GHC
  	January 2006
  0) Terminology.
     A *pattern binding* is of the form
  	pat = rhs
     A *function binding* is of the form
  	f pat1 .. patn = rhs
     A binding of the formm
  	var = rhs
     is treated as a (degenerate) *function binding*.
     A *declaration type signature* is a separate type signature for a
     let-bound or where-bound variable:
  	f :: Int -> Int
     A *pattern type signature* is a signature in a pattern: 
  	\(x::a) -> x
  	f (x::a) = x
     A *result type signature* is a signature on the result of a
     function definition:
  	f :: forall a. [a] -> a
  	head (x:xs) :: a = x
     The form
  	x :: a = rhs
     is treated as a (degnerate) function binding with a result
     type signature, not as a pattern binding.
  1) The main invariants:
       A) A lexically-scoped type variable always names a (rigid)
   	type variable (not an arbitrary type).  THIS IS A CHANGE.
          Previously, a scoped type variable named an arbitrary *type*.
       B) A type signature always describes a rigid type (since
  	its free (scoped) type variables name rigid type variables).
  	This is also a change, a consequence of (A).
       C) Distinct lexically-scoped type variables name distinct
  	rigid type variables.  This choice is open; 
  2) Scoping
  2(a) If a declaration type signature has an explicit forall, those type
     variables are brought into scope in the right hand side of the 
     corresponding binding (plus, for function bindings, the patterns on
     the LHS).  
  	f :: forall a. a -> [a]
  	f (x::a) = [x :: a, x]
     Both occurences of 'a' in the second line are bound by 
     the 'forall a' in the first line
     A declaration type signature *without* an explicit top-level forall
     is implicitly quantified over all the type variables that are
     mentioned in the type but not already in scope.  GHC's current
     rule is that this implicit quantification does *not* bring into scope
     any new scoped type variables.
  	f :: a -> a
  	f x = ...('a' is not in scope here)...
     This gives compatibility with Haskell 98
  2(b) A pattern type signature implicitly brings into scope any type
     variables mentioned in the type that are not already into scope.
     These are called *pattern-bound type variables*.
  	g :: a -> a -> [a]
  	g (x::a) (y::a) = [y :: a, x]
     The pattern type signature (x::a) brings 'a' into scope.
     The 'a' in the pattern (y::a) is bound, as is the occurrence on 
     the RHS.  
     A pattern type siganture is the only way you can bring existentials 
     into scope.
  	data T where
  	  MkT :: forall a. a -> (a->Int) -> T
  	f x = case x of
  		MkT (x::a) f -> f (x::a)
  	class C a where
  	  op :: forall b. b->a->a
  	instance C (T p q) where
  	  op = 
      Clearly p,q are in scope in , but is 'b'?  Not at the moment.
      Nor can you add a type signature for op in the instance decl.
      You'd have to say this:
  	instance C (T p q) where
  	  op = let op' :: forall b. ...
  	           op' = 
  	       in op'
  3) A pattern-bound type variable is allowed only if the pattern's
     expected type is rigid.  Otherwise we don't know exactly *which*
     skolem the scoped type variable should be bound to, and that means
     we can't do GADT refinement.  This is invariant (A), and it is a 
     big change from the current situation.
  	f (x::a) = x	-- NO; pattern type is wobbly
  	g1 :: b -> b
  	g1 (x::b) = x	-- YES, because the pattern type is rigid
  	g2 :: b -> b
  	g2 (x::c) = x	-- YES, same reason
  	h :: forall b. b -> b
  	h (x::b) = x	-- YES, but the inner b is bound
  	k :: forall b. b -> b
  	k (x::c) = x	-- NO, it can't be both b and c
  3a) You cannot give different names for the same type variable in the
same scope
      (Invariant (C)):
  	f1 :: p -> p -> p		-- NO; because 'a' and 'b' would be
  	f1 (x::a) (y::b) = (x::a)	--     bound to the same type variable
  	f2 :: p -> p -> p		-- OK; 'a' is bound to the type variable
  	f2 (x::a) (y::a) = (x::a)	--     over which f2 is quantified
  					-- NB: 'p' is not lexically scoped
  	f3 :: forall p. p -> p -> p	-- NO: 'p' is now scoped, and is bound to
  	f3 (x::a) (y::a) = (x::a)	--     to the same type varialble as 'a'
  	f4 :: forall p. p -> p -> p	-- OK: 'p' is now scoped, and its occurences
  	f4 (x::p) (y::p) = (x::p)	--     in the patterns are bound by the forall
  3b) You can give a different name to the same type variable in different
      disjoint scopes, just as you can (if you want) give diferent names to

      the same value parameter
  	g :: a -> Bool -> Maybe a
  	g (x::p) True  = Just x  :: Maybe p
  	g (y::q) False = Nothing :: Maybe q
  3c) Scoped type variables respect alpha renaming. For example, 
      function f2 from (3a) above could also be written:
  	f2' :: p -> p -> p
  	f2' (x::b) (y::b) = x::b
     where the scoped type variable is called 'b' instead of 'a'.
  4) Result type signatures obey the same rules as pattern types
     In particular, they can bind a type variable only if the result type
is rigid
  	f x :: a = x	-- NO
  	g :: b -> b
  	g x :: b = x	-- YES; binds b in rhs
  5) A *pattern type signature* in a *pattern binding* cannot bind a 
     scoped type variable
  	(x::a, y) = ...		-- Legal only if 'a' is already in scope
     Reason: in type checking, the "expected type" of the LHS pattern is
     always wobbly, so we can't bind a rigid type variable.  (The exception
     would be for an existential type variable, but existentials are not
     allowed in pattern bindings either.)
     Even this is illegal
  	f :: forall a. a -> a
  	f x = let ((y::b)::a, z) = ... 
     Here it looks as if 'b' might get a rigid binding; but you can't bind
     it to the same skolem as a.
  6) Explicitly-forall'd type variables in the *declaration type
     for a *pattern binding* do not scope AT ALL.
  	x :: forall a. a->a	  -- NO; the forall a does 
  	Just (x::a->a) = Just id  --     not scope at all
  	y :: forall a. a->a
  	Just y = Just (id :: a->a)  -- NO; same reason
     THIS IS A CHANGE, but one I bet that very few people will notice.
     Here's why:
  	strange :: forall b. (b->b,b->b)
  	strange = (id,id)
  	x1 :: forall a. a->a
  	y1 :: forall b. b->b
  	(x1,y1) = strange
      This is legal Haskell 98 (modulo the forall). If both 'a' and 'b'
      both scoped over the RHS, they'd get unified and so cannot stand
      for distinct type variables. One could *imagine* allowing this:
  	x2 :: forall a. a->a
  	y2 :: forall a. a->a
  	(x2,y2) = strange
      using the very same type variable 'a' in both signatures, so that
      a single 'a' scopes over the RHS.  That seems defensible, but odd,
      because though there are two type signatures, they introduce just
      *one* scoped type variable, a.
  7) Possible extension.  We might consider allowing
  	\(x :: [ _ ]) -> 
      where "_" is a wild card, to mean "x has type list of something",
      naming the something.

    M ./ghc/compiler/basicTypes/BasicTypes.lhs -1 +5
    M ./ghc/compiler/basicTypes/Id.lhs -3 +9
    M ./ghc/compiler/basicTypes/Module.lhs -1
    M ./ghc/compiler/basicTypes/Name.lhs -3 +7
    M ./ghc/compiler/basicTypes/SrcLoc.lhs -1 +1
    M ./ghc/compiler/basicTypes/Var.lhs -9 +11
    M ./ghc/compiler/basicTypes/VarEnv.lhs -1 +3
    M ./ghc/compiler/coreSyn/PprCore.lhs -13 +26
    M ./ghc/compiler/deSugar/Check.lhs -1 +1
    M ./ghc/compiler/deSugar/DsBinds.lhs -4 +35
    M ./ghc/compiler/deSugar/DsExpr.lhs -3 +6
    M ./ghc/compiler/deSugar/DsMeta.hs -6 +7
    M ./ghc/compiler/deSugar/DsUtils.lhs -3 +6
    M ./ghc/compiler/hsSyn/Convert.lhs -4 +4
    M ./ghc/compiler/hsSyn/HsBinds.lhs -46 +104
    M ./ghc/compiler/hsSyn/HsExpr.lhs -2 +15
    M ./ghc/compiler/hsSyn/HsTypes.lhs -1 +1
    M ./ghc/compiler/hsSyn/HsUtils.lhs -35 +13
    M ./ghc/compiler/iface/IfaceSyn.lhs -3 +3
    M ./ghc/compiler/iface/IfaceType.lhs -1 +1
    M ./ghc/compiler/iface/TcIface.lhs -18 +4
    M ./ghc/compiler/main/GHC.hs -2 +2
    M ./ghc/compiler/main/HscMain.lhs -1 +1
    M ./ghc/compiler/main/HscStats.lhs -3 +3
    M ./ghc/compiler/main/PprTyThing.hs -1 +1
    M ./ghc/compiler/parser/Parser.y.pp -7 +7
    M ./ghc/compiler/parser/RdrHsSyn.lhs -16 +17
    M ./ghc/compiler/rename/RnBinds.lhs -13 +14
    M ./ghc/compiler/typecheck/Inst.lhs -123 +65
    M ./ghc/compiler/typecheck/TcArrows.lhs -45 +38
    M ./ghc/compiler/typecheck/TcBinds.lhs -151 +275
    M ./ghc/compiler/typecheck/TcClassDcl.lhs -34 +37
    M ./ghc/compiler/typecheck/TcEnv.lhs -19 +52
    M ./ghc/compiler/typecheck/TcExpr.lhs -547 +584
    M ./ghc/compiler/typecheck/TcExpr.lhs-boot -10 +5
    M ./ghc/compiler/typecheck/TcForeign.lhs -2 +2
    M ./ghc/compiler/typecheck/TcGenDeriv.lhs -2 +2
    M ./ghc/compiler/typecheck/TcHsSyn.lhs -62 +48
    M ./ghc/compiler/typecheck/TcHsType.lhs -131 +128
    M ./ghc/compiler/typecheck/TcMType.lhs -299 +251
    M ./ghc/compiler/typecheck/TcMatches.lhs -295 +158
    M ./ghc/compiler/typecheck/TcMatches.lhs-boot -5 +5
    M ./ghc/compiler/typecheck/TcPat.lhs -250 +440
    M ./ghc/compiler/typecheck/TcRnDriver.lhs -8 +9
    M ./ghc/compiler/typecheck/TcRnMonad.lhs -28 +44
    M ./ghc/compiler/typecheck/TcRnTypes.lhs -36 +29
    M ./ghc/compiler/typecheck/TcRules.lhs -10 +16
    M ./ghc/compiler/typecheck/TcSimplify.lhs -6 +7
    M ./ghc/compiler/typecheck/TcSplice.lhs -17 +17
    M ./ghc/compiler/typecheck/TcSplice.lhs-boot -3 +3
    M ./ghc/compiler/typecheck/TcTyClsDecls.lhs -24 +5
    M ./ghc/compiler/typecheck/TcTyDecls.lhs -2 +2
    M ./ghc/compiler/typecheck/TcType.hi-boot-6 +1
    M ./ghc/compiler/typecheck/TcType.lhs -175 +339
    M ./ghc/compiler/typecheck/TcType.lhs-boot +2
    M ./ghc/compiler/typecheck/TcUnify.hi-boot-6 -1 +2
    M ./ghc/compiler/typecheck/TcUnify.lhs -697 +851
    M ./ghc/compiler/typecheck/TcUnify.lhs-boot -2 +3
    M ./ghc/compiler/types/Generics.lhs -7 +4
    M ./ghc/compiler/types/TyCon.lhs -4 +7
    M ./ghc/compiler/types/Type.lhs -64 +18
    M ./ghc/compiler/types/Unify.lhs -11 +23
    M ./ghc/compiler/utils/IOEnv.hs -4 +4
    M ./ghc/compiler/utils/UniqFM.lhs -5 +5
CD: 3ms