aboutsummaryrefslogtreecommitdiff
path: root/coclmaindll/doc/trans.doc.txt
diff options
context:
space:
mode:
Diffstat (limited to 'coclmaindll/doc/trans.doc.txt')
-rw-r--r--coclmaindll/doc/trans.doc.txt711
1 files changed, 711 insertions, 0 deletions
diff --git a/coclmaindll/doc/trans.doc.txt b/coclmaindll/doc/trans.doc.txt
new file mode 100644
index 0000000..bd9a491
--- /dev/null
+++ b/coclmaindll/doc/trans.doc.txt
@@ -0,0 +1,711 @@
+This file contains documentation of the transformation phase.
+
+Contents
+ - Overview
+ - The analysation phase
+ - The transformation phase
+ - Overview
+ - Generating a new function
+ - Status
+
+Originally the transformation phase was designed to solve three tasks:
+ - specialisation of overloaded functions (in the following just
+ "overloading specialisation"):
+ - inlining of higher order arguments in functions that are
+ defined locally in a macro (Nijmegen slang: "fold optimalisation")
+ - deforestation
+
+All these optimisations have some things in common (more or less): they
+are all a kind of specialisation. Specialisation involves creating new
+versions of functions. That's why it was decided to implement one algorithm
+that can handle all three optimalisations at the same time.
+We call this algorithm the "fusion" algorithm.
+
+Overview
+********
+
+In fact there are two phases involved:
+ - the analysation phase
+ - the transformation phase
+
+Goal of the analysys is to assign two properties to each icl function argument
+(or: function argument position): The consumer classification (::Int)
+and the linearity (::Bool). These values are returned by analyseGroups in the
+{! ConsClasses} value. Furtheron so called "active" cases are marked as such (these
+markings are stored in the ExpressionHeap). The consumer classification can be
+one of these four values:
+
+ cPassive:
+ if none of the following
+ cActive:
+ Only these function arguments can be specialized. The
+ variable ("x" below) appears at least in one of the following positions:
+ - x.member (record selection: could be overloading specialised)
+ - x @ ... (higher order app: fold optimalisation)
+ - case x of .. (could be deforested)
+ - in a call to a function that's also cActive in that argument
+ cAccumulating:
+ function argument can't be specialized because there
+ is a recursive call where the argument is not a variable, which
+ could lead into non termination of that algorithm.
+ cVarOfMultimatchCase:
+ A multimatch case is a case where one constructor could match
+ twice, e.g.:
+
+ g x = case x of
+ Cons h t | p1 h -> t
+ Cons h t | p2 h -> t
+
+ Such cases were exluded from deforestation in a first
+ approximation because they were difficult to handle (e.g. linearity
+ analysys)
+
+The linearity classification indicates, well, whether a function argument
+is used linearily. Only linear variables can be substituted with expressions
+(otherwise work could be duplicated).
+
+The transformation phase uses this information to generate new functions.
+The transformation algorithm will run over all icl functions, beginning
+with the leafs of the component tree. For every icl function he will run
+through the body expression (postorder!) and when he finds a function
+appliation that can be specialised
+he will do so (if the function wasn't specialised in that way already).
+
+The criteria to specialize a function "f" in an argument "A" are the following:
+ - f must be cActive in that argument.
+ - A must be an application
+ - If A is an application of a dictionary constructor: YEAH, go for it!
+ - If A is a curried application and f is defined as a local macro function:
+ YEAH!!!
+ - If A is a function application (g E1 .. En) and f is linear in that
+ argument: YEAH! Deforest! Cut all trees.
+Otherwise nothing happens.
+
+E.g. if (f E1 (g E2 E3) E4) meets the above condition then a new function
+f_g will be created and that function application will become
+
+ f_g E1 E2 E3 E4 (if g is a function)
+ f_g E1 E4 (if g is a dictionary constructor) (it's not always that simple)
+
+f is called the _consumer_ and g the _producer_.
+
+Of course a new type for f_g has to be derived.
+
+How is f_g created? In case g was
+ - a dictionary:the formal argument in f is substituted the dictionary application
+ - a curried function: the formal (higher order) argument in f is substituted
+ with "g"
+ - an uncurried function: the formal argument in f is substituted with the
+ _body_ of g.
+
+
+The analysation phase
+*********************
+
+The analysys happens component wise, beginning with the leafs.
+
+The fundamental function here is the consumerRequirements class whose
+instances run through an expression. It's main data structure it is working
+on is the *ConsClassSubst:=={#Int}. Each variable that is used in a function
+that is currently analysed is assigned a number. This number is an index into
+the ConsClassSubst array. So the ConsClassSubst assigns a consumer classification
+to every variable. The classification process reminds on typing. If an array
+element is negative then the corresponding variable currently has been assigned
+to one of the four classifications:
+
+ cPassive :== -1
+ cActive :== -2
+ cAccumulating :== -3
+ cVarOfMultimatchCase :== -4
+
+If it is non negative (say i) then this states that the classification is the same
+as the variable which corresponds to i.
+
+The instances of consumerRequirements return a boolean that indicates whether
+the expression is unsafe, e.g. in
+
+ case x of
+ Cons h t -> case p h of True -> t
+ ...
+
+(assumed that these cases result from patterns and guards) the
+consumerRequirements instance would return True for the inner case,
+because this case could fail so that control would GOTO the outer case.
+In this case the boolean would be used to classify x as
+cVarOfMultimatchCase.
+
+The ai_cur_ref_counts field of the AnalyseInfo record is used to
+determine the linearity of all variables, the arguments and the local
+variables.
+
+The ai_cases_of_vars_for_function field of the AnalyseInfo record
+accumulates all cases of which the case_expr is a variable. At the end
+of analyzing a component the case_info_ptr is marked (in function
+set_case_expr_info) with an EEI_ActiveCase tag, if the corresponding
+variable is active. So after the analysation phase every case is
+either "active" or not.
+
+The information whether a case is active is stored with the
+case_info_ptr. But there is already type information stored in the
+ExpressionHeap. For every case the case_info_ptr points to an
+EI_CaseType entry. For every "let" expression the let_info_ptr
+points to an EI_LetType entry. These entries assign the variables
+that are introduced by a pattern or let bind a type. This information
+is needed for lifting (also in the following phases). So I invented
+the EI_Extended constructor which allows to store several independent
+pieces of information for one pointer. At the end of the transformation
+phase the original heap contents has to be restored again
+(function "cleanup_attributes") (the advantages of functional
+programming shine through here very clearly).
+
+The transformation phase
+************************
+
+The transformation phase - Overview
+************************^^^^^^^^^^^
+
+There are two basic transformations that are applied for deforestation
+(overloading specialisation and the fold optimalisation are just simple
+inlining!):
+
+ - The "case in case" transformation:
+
+ case (case x of
+ P1 -> A1
+ P2 -> A2)
+ P3 -> A3
+ P4 -> A4
+
+ transforms into
+
+ case x of
+ P1 -> case A1 of
+ P3 -> A3
+ P4 -> A4
+ P2 -> case A2 of
+ P3 -> A3
+ P4 -> A4
+
+ This happens in function lift_case. Note that this transformation
+ duplicates code.
+
+ - The "match" transformation:
+
+ case (Cons A1 A2) of
+ Cons h t -> A3
+ ..
+
+ transforms into
+
+ let h=A1
+ t=A2
+ in A3
+
+ If "h" or "t" are used linearily in A3 the can be inlined, too.
+ The match transformation reduces the amount of code (all the other patterns).
+ Note that this transformation is only valid, if the case is not
+ a multimatch case
+
+Let's consider some examples first:
+
+ sum l =
+ case l of
+ Nil -> 0
+ Cons h t -> +I h (sum t) // +I: plus on integers
+
+ every_second toggle l2 =
+ case l2 of
+ Cons h2 t2
+ -> case toggle of
+ True -> Cons h2 (every_second (not toggle) t2)
+ False -> (every_second (not toggle) t2)
+ Nil -> Nil
+
+The analysys phase will assign l (the argument of sum) "cActive" and "linear".
+"cActive" because
+ - the recursive call has a variable (t) as argument (in case of a
+ non variable it would be cAccumulating)
+ - l is used as a case_expression (case l of) (otherwise it would be cPassive
+ or cVarOfMultiMatchCase
+ - the case is not a multimatch case
+"linear" because l appears only once
+
+The case in the sum function will be marked as "active case of 'sum' in 'l'":
+
+ sum l =
+ case<*sum*> l of ...
+
+Indeed, sum is a typical list consumer!
+
+The arguments of every_second are classified as:
+ - toggle:cAccumulating, because there is a recursive call (in fact two)
+ with a non variable argument (not toggle).
+ - l2: cActive (for the same reasons as "l" in function sum)
+
+Now if we transform the following function
+
+ Start = sum (every_second True [42,42,42])
+
+we see that we have an uncurried function application in the active
+consumer position of sum. We fuse these two functions by replacing
+"l" in sum with the body of every_second (happens in "generateFunction"):
+
+ Start = sum_every_second True [42,42,42]
+
+ sum_every_second toggle l2 =
+ case<*sum*> (case l2 of
+ Cons h2 t2
+ -> case toggle of
+ True -> Cons h2 (every_second (not toggle) t2)
+ False -> (every_second (not toggle) t2)
+ Nil -> Nil) of
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+
+This newly generated function will be stored in the FunctionHeap.
+
+Now we transform sum_every_second. First we apply the case in case
+transformation.
+
+ sum_every_second toggle l2 =
+ case l2 of
+ Cons h2 t2
+ -> case<*sum*> (case toggle of
+ True -> Cons h2 (every_second (not toggle) t2)
+ False -> (every_second (not toggle) t2))
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+ Nil -> case<*sum*> Nil of
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+
+Now we apply the case in case transformation once more in the
+Cons case and in the Nil case we do the match:
+
+ sum_every_second toggle l2 =
+ case l2 of
+ Cons h2 t2
+ -> case toggle of
+ True -> case<*sum*> Cons h2 (every_second (not toggle) t2) of
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+ False -> case<*sum*> every_second (not toggle) t2 of
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+ Nil -> 0
+
+Finally we apply the match in the True case and we do the FOLD step:
+
+ sum_every_second toggle l2 =
+ case l2 of
+ Cons h2 t2
+ -> case toggle of
+ True -> +I h2 (sum (every_second (not toggle) t2))
+ False -> sum_every_second (not toggle) t2
+ Nil -> 0
+
+The fold step can be applied, because we know that this particular
+case is the root case of the sum function, hence
+
+ case<*sum*> every_second (not toggle) t2 of
+ Nil -> 0
+ Cons h t -> +I h (sum t)
+
+is equivalent to
+
+ sum (every_second (not toggle) t2)
+
+which in turn is known to be specialised already:
+
+ sum_every_second (not toggle) t2
+
+So we can apply the fold step, if the root case comes together with the
+producer again. What do we do when during transformation a non-root case
+comes together with the producer? Answer: We generate a new function.
+Comment: This is bullshit. It would be better to generate functions for
+_all_ non root cases in a seperate phase before,
+because currently after the transformation
+phase (while interfacing with the backend) this is done anyway! BTW, the
+every_second function has no non root case.
+
+Generating a new function would be necessary in the following case:
+
+ consumer l =
+ 1 + (case l of
+ Cons h t -> h + consumer t
+ Nil -> 0)
+
+ filter p l2 =
+ case l2 of
+ Nil -> Nil
+ Cons h2 t2 -> case p @ h2 of
+ True -> Cons h2 (filter p t2)
+ False -> filter p t2
+
+Fusing these two would proceed like follows:
+
+ consumer_filter p l2 =
+ 1 + (case<*consumer*> (case l2 of
+ Nil -> Nil
+ Cons h2 t2 -> case p h2 of
+ True -> Cons h2 (filter p t2)
+ False -> filter p t2) of
+ Cons h t -> h + consumer t
+ Nil -> 0 )
+
+ consumer_filter p l2 =
+ 1 + (case l2 of
+ Nil -> case<*consumer*> Nil of
+ Cons h t -> h + consumer t
+ Nil -> 0
+ Cons h2 t2 -> case<*consumer*> (case p h2 of
+ True -> Cons h2 (filter p t2)
+ False -> filter p t2) of
+ Cons h t -> h + consumer t
+ Nil -> 0)
+
+ consumer_filter p l2 =
+ 1 + (case l2 of
+ Nil -> 0
+ Cons h2 t2 -> case p h2 of
+ True -> case<*consumer*> Cons h2 (filter p t2)
+ Cons h t -> h + consumer t
+ Nil -> 0
+ False -> case<*consumer*> filter p t2 of
+ Cons h t -> h + consumer t
+ Nil -> 0)
+
+ consumer_filter p l2 =
+ 1 + (case l2 of
+ Nil -> 0
+ Cons h2 t2 -> case p h2 of
+ True -> h2 + (consumer_filter p t2)
+ False -> case<*consumer*> filter p t2 of
+ Cons h t -> h + consumer t
+ Nil -> 0)
+
+Now we should _not_ apply the fold step like
+
+ consumer_filter p l2 =
+ 1 + (case l2 of
+ Nil -> 0
+ Cons h2 t2 -> case p h2 of
+ True -> h2 + (consumer_filter p t2)
+ False -> consumer_filter p t2
+
+Then (consumer_filter isPositive [-1]) would evaluate to 2 while
+(consumer (filter isPositive [-1])) would evaluate to 1!
+
+Instead we do so as if consumer was defined like
+
+ consumer l = 1 + (consumer_case l)
+ consumer_case l = case l of
+ Cons h t -> h + consumer t
+ Nil -> 0
+
+Now we can do the fold step. So the result would be
+
+ consumer_filter p l2 = 1+(consumer_case_filter p l2)
+ consumer_case_filter p l2
+ = case l2 of
+ Nil -> 0
+ Cons h2 t2 -> case p h2 of
+ True -> h2 + (consumer_filter p t2)
+ False -> consumer_case_filter p t2
+
+The additional functions that are generated for non root cases
+are created in function "generate_case_function".
+
+To finish this overview I give an example for fold optimalisation:
+
+ foldl f l init =
+ case l of Nil -> init
+ Cons h t -> foldl f t (f @ h init)
+
+ Start = foldl (+) [] 0
+
+Classification:
+ foldl f:(cActive, not linear) l:(cActive, linear) init:(cAccumulating, linear)
+
+The first arg is active because it appears on the RHS of @.
+
+I don't fuse the constructor []! (Sjaak does -> code explosion)
+
+But I specialize in the first arg. Being not linear doesn't matter
+for higher order arguments, because there is no calculation
+duplicated (same with dictionaries)
+
+ foldl_plus l init =
+ case l of Nil -> init
+ Cons h t -> foldl_plus t (h + init)
+
+ Start = foldl_plus [] 0
+
+And I give example for overloading specialisation
+
+The programmer writes:
+
+ gimme_da_first :: (a e) -> e | Array a e
+ gimme_da_first a = a.[0]
+
+ gimme_a_strict_array :: {!Int}
+ gimme_a_strict_array = {42}
+
+ Start = gimme_da_first gimme_a_strict_array
+
+The compiler makes from this while typing:
+
+ :: ArrayDictionary a e =
+ { select :: (a e) Int -> e
+ , update :: (a e) Int e -> (a e)
+ , size :: (a e) -> Int
+ ..
+
+ gimme_da_first :: (ArrayDictionary a e) (a e) -> e
+ gimme_da_first dictionary a = dictionary.select @ a 0
+
+ Start = gimme_da_first { select = strictArraySelect, update=strictArrayUpdate,
+ size = strictArraySize .. }
+ gimme_a_strict_array
+
+We see that the first dictionary argument of "gimme_da_first" is classified as
+a consumer (cActive because of the selection and even linear). So we fuse
+"gimme_da_first" with the record constructor:
+
+ gimme_da_first_StrictArray :: {!e} -> e
+ gimme_da_first_StrictArray a = strictArraySelect a 0
+
+ Start = gimme_da_first_StrictArray gimme_a_strict_array
+
+In this example we can see very clear that through fusion (be it
+overloading specialisation, deforestation or fold optimalisation) the type
+of the newly created function has become specialised, too. In this case
+for (a e) the a has been substituted with {!} yielding {!e}
+
+This happens by unifying the producer's result type with the type of the
+argument of the producer. In this case we unify
+
+ (ArrayDictionary {!} e) // producer { select = strictArraySelect, ...
+ with (ArrayDictionary a e) // consumer gimme_da_first
+
+yielding the substitution a->{!}
+
+
+The transformation phase - Generating a new function
+************************^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The function "generateFunction" is called when a function is specialised.
+
+As the important parameters it gets the consumers function definition, the
+classification of it's arguments and an array "prods" that contains a producer
+description for each consumer argument. So generateFunction indeed
+specialises one consumer with possibly multiple producers. A producer
+description of type "Producer" indicates, whether the producer is a
+dictionary, a curried application or a function.
+
+First some trouble with types happens. As stated before we have to apply
+type unification. Therefore we reuse the infrastructure from the typing
+phase. Unfortunately this infrastructure makes certain assumptions about
+the given types for which some work has to be done to fulfill them.
+
+The unification algorithm assumes type variables in "integer form" (TempV)
+instead of "pointer form" (TV). Furtheron we have to take care about
+uniqueness information. We have to add propagation information (see function
+"add_propagation_attributes). This will transform a type
+
+ consumer :: [*{Int}] -> Int
+
+into
+
+ consumer :: *[*{Int}] -> Int
+
+After giving every involved type variable a number we create an initial
+type variable substitution "subst". This array assigns every type variable
+(number) a type. With every producer this substitution can be further specialised.
+
+The "determine_args" function is the called. To explain what this function
+does here is an example (I write (List a) instead of [a]):
+
+ consumer :: (List a) (List b) (a, b) -> (Int, Int, Int)
+ consumer x y z = (case x of ..., case y of ..., fun0 z z z)
+
+ producer :: (List c) T2 -> (List Int)
+ producer1 a b = fun1 a b
+
+ producer2 :: T3 -> (List Real)
+ producer2 c = fun2 c
+
+Our application is:
+
+ Start = consumer (producer1 E1 E2) (producer E3) E4
+
+Because we are generating a new functions all variables in that function
+should be fresh. The generated function should look like this:
+
+ consumer` a1 b1 c1 z1 = (case fun1 a1 b1 of ..., case fun2 c1 of ..., fun0 z1 z1 z1)
+
+The RHS expression is later created by calling the "unfold" function. "unfold" substitutes
+all local variables of a expression with fresh ones, but the free variables (here x, y and z)
+are replaced with an expression on which the free variable has been bound before.
+So determine_args establishes the following bindings (stored in the var_heap):
+
+ x -> fun1 a1 b1
+ y -> fun2 c1
+ z -> z1
+
+As it's first result value determine_args delivers the new function args: [a1, b1, c1, z1]
+
+What about the types of these arguments? The second result value of determine args
+is an array whose elements either contain the producers argument types or (in case there
+was no producer for that argument) the consumers original type. This would be in the
+upper example:
+
+ {[List c, T2],[T3],[(a, b)]}
+
+But the following type would be wrong:
+
+ consumer` :: (List c) T2 T3 (a, b) -> (Int, Int, Int)
+
+The right type is
+
+ consumer` :: (List c1) T2 T3 (Int, Real) -> (Int, Int, Int) // c1 "fresh"
+
+Fortunately we can infer this type correctly because "determine_args" corrects
+our type variable substitution "subst" with every producer. So the
+substitution would look like this
+
+ a -> Int // from producer1
+ b -> Real // from producer2
+ c -> c1 // fresh
+
+Furtheron determine_args also accumulates uniqueness requirements: inequalities between
+attributed types. Later the inference of the attributes of the type of the generated
+function happens like in the typing phase: the (attribute less) substitution is lifted
+and attribute coercions are derived from the expanded uniqueness requirements,
+the coercion graph is partitionated, unused attribute variables are removed.
+
+Finally, the "integer" type representation is again copied into a "pointer" type
+representation.
+
+After we have let "unfold" create the RHS of our new function, we of course transform
+it.
+
+There is a lot of type information stored in the ExpressionHeap. This information has
+to be kept up to date when creating the RHS expression. Especially the type
+information for case and let expressions has to be refreshened and specialized.
+E.g. when in the upper original consumer for some local variable the polymorph
+type "a" was stored then this should be "Int" in the specialized consumer`.
+
+Status (June 2001)
+******
+
+The following issues are suboptimal:
+
+- The algorithm does not terminate, e.g. when transforming:
+
+ zipWith f l1 l2 :== [f e1 e2 \\ e1<-l1 & e2<-l2]
+
+ l = [1 : zipWith (+) [1] l]
+
+ t [] = []
+ t [e:l] = [e: t l]
+
+ Start = t l
+
+ One could try to find criteria that ensure that fusion will terminate. Alternatively
+ one could also decide while fusing, whether this fusion should be aborted using
+ ad hoc methods (e.g. a counter that counts keeps track of the recursion depth, or
+ whatever). It can be quite nice if a compiler is known to terminate always.
+
+- It is possible that through fusion some functions can get unused. Imagine a function
+ that is called only once. When this call is specialised the original won't be
+ called anymore. But this superflous function still remains in the functions array,
+ doesn't get garbage collected and is further processed (e.g. in the convertcases
+ phase). One could infer the components once more after the transformation phase
+ to deal with that issue.
+
+- Local definitions don't get inlined, e.g. let "c" be a consumer and "p" a producer.
+ The follwing would be fused:
+
+ c p
+
+ while the following would not:
+
+ let pp = p in c pp
+
+ This could be improved.
+
+- Only two functions are fused together, but a third one is not getting involved.
+ E.g. when fusing the follwing:
+
+ i x = x
+ producer x = i [x:producer x]
+ consumer l = case l of ...
+
+ the algorithm stops at
+
+ consumer_producer x = case (i [x:producer x]) of
+
+- Multimatch cases are ignored (see above). The compiler generates lots of
+ multimatch cases (I remember the zip2 function).
+
+- Only function arguments get deforested, but not e.g. the following
+
+ case [] of [] -> 1
+
+The following are real bugs:
+
+- Strict expressions get optimised away although they shouldn't
+
+ :: T = C !Int
+ producer = C (abort "ABORT")
+ consumer x = case x of C _ -> 1
+ Start = consumer producer
+
+ fusion result:
+
+ Start = consumer_producer
+ consumer_producer = 1
+
+ (no abort)
+
+- It can happen that while fusing one sees that a case will never match:
+
+ producer = []
+ consumer [_:_] = 1
+ Start = consumer producer
+
+ fusion result:
+
+ Start = consumer_producer
+ consumer_producer = <never_matching_case>
+
+- Deforestation should better be done _after_ strictness analysys. Look at:
+
+ consumer :: ![a] -> [Int]
+ consumer l
+ = [case l of
+ [] -> 0
+ [_:t] -> 1
+ ]
+
+ producer = undef
+
+ Start = consumer producer
+
+ In this case "consumer" and "producer" may not be fused here although this
+ happens with the current implementation. The problem here is that the
+ argument of "consumer" has been made artificially strict. Possible solutions:
+ - VERBIEDEN!!!!
+ Never fuse consumer arguments that have been marked strict.
+ - If a consumer argument has been marked strict, ensure that it is indeed
+ strict. Therefore one has to reason about strictness of expressions.
+ A primitive strictness analysys could be part of the transformation
+ phase (prima!).
+
+- The algorithm makes no distinction between cases that come from patterns/guards
+ and cases that were specified by the programmer. The algorithm assumes that no
+ case was specified by the programmer. I think two places where this leads into
+ wrong results are:
+ - the second return value of the consumerRequirements class
+ - the removeNeverMatchingSubcases function