1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
4 <META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.7">
5 <TITLE>SDCC Compiler User Guide: Optimizations</TITLE>
6 <LINK HREF="SDCCUdoc-7.html" REL=next>
7 <LINK HREF="SDCCUdoc-5.html" REL=previous>
8 <LINK HREF="SDCCUdoc.html#toc6" REL=contents>
11 <A HREF="SDCCUdoc-7.html">Next</A>
12 <A HREF="SDCCUdoc-5.html">Previous</A>
13 <A HREF="SDCCUdoc.html#toc6">Contents</A>
15 <H2><A NAME="Optimizations"></A> <A NAME="s6">6. Optimizations</A> </H2>
17 <P>SDCC performs a a host of standard optimizations in addition to some MCU
18 specific optimizations.
19 <H2><A NAME="Sub-expression Elimination"></A> <A NAME="ss6.1">6.1 Sub-expression elimination</A>
22 <P>The compiler does local and global common subexpression elimination.
30 <P>will be translated to
38 <P>Some subexpressions are not as obvious as the above example.
46 <P>In this case the address arithmetic a->b[i] will be computed
47 only once; the equivalent code in C would be.
55 <P>The compiler will try to keep these temporary variables in registers.
56 <H2><A NAME="Dead-code elimination"></A> <A NAME="ss6.2">6.2 Dead-Code elimination.</A>
65 i = 1; /* dead store */
70 global = 3; /* unreachable
85 <H2><A NAME="Copy-Propagation"></A> <A NAME="ss6.3">6.3 Copy-Propagation:</A>
111 <P>Note: the dead stores created by this copy propagation will be eliminated
112 by dead-code elimination .
113 <H2><A NAME="Loop Optimizations"></A> <A NAME="ss6.4">6.4 Loop optimizations</A>
116 <P>Two types of loop optimizations are done by SDCC loop invariant lifting
117 and strength reduction of loop induction variables.In addition to the strength
118 reduction the optimizer marks the induction variables and the register allocator
119 tries to keep the induction variables in registers for the duration of the
120 loop. Because of this preference of the register allocator , loop induction
121 optimization causes an increase in register pressure, which may cause unwanted
122 spilling of other temporary variables into the stack / data space . The compiler
123 will generate a warning message when it is forced to allocate extra space either
124 on the stack or data space. If this extra space allocation is undesirable then
125 induction optimization can be eliminated either for the entire source file
126 ( with --noinduction option) or for a given function only (#pragma NOINDUCTION).
127 <H3><A NAME="Loop Invariant"></A> Loop Invariant: </H3>
132 for (i = 0 ; i < 100 ; i ++)
140 for ( i = 0; i < 100; i++ ) f += itemp;
143 <P>As mentioned previously some loop invariants are not as apparent, all static
144 address computations are also moved out of the loop.
145 <H3><A NAME="Strength Reduction"></A> Strength reduction : </H3>
147 <P>This optimization substitutes an expression by a cheaper expression.
151 for (i=0;i < 100; i++) ar[i*5] = i*3;
159 for (i=0;i< 100;i++) {
167 <P>The more expensive multiplication is changed to a less expensive addition.
168 <H3><A NAME="Loop reversing"></A> Loop reversing: </H3>
170 <P>This optimization is done to reduce the overhead of checking loop boundaries
171 for every iteration. Some simple loops can be reversed and implemented using
172 a "decrement and jump if not zero" instruction. SDCC checks for the following
173 criterion to determine if a loop is reversible (note: more sophisticated compiers
174 use data-dependency analysis to make this determination, SDCC uses a more simple
178 <LI>The 'for' loop is of the form
179 "for ( <symbol> = <expression>
180 ; <sym> [< | <=] <expression> ; [<sym>++
182 <for body>"</LI>
183 <LI>The <for body> does not contain "continue" or 'break".</LI>
184 <LI>All goto's are contained within the loop.</LI>
185 <LI>No function calls within the loop.</LI>
186 <LI>The loop control variable <sym> is not assigned any value within
188 <LI>The loop control variable does NOT participate in any arithmetic operation
189 within the loop.</LI>
190 <LI>There are NO switch statements in the loop.</LI>
192 <P>Note djnz instruction can be used for 8-bit values ONLY, therefore it is
193 advantageous to declare loop control symbols as either 'char' or 'short', ofcourse
194 this may not be possible on all situations.
195 <H2><A NAME="Algebraic Simplifications"></A> <A NAME="ss6.5">6.5 Algebraic simplifications:</A>
198 <P>SDCC does numerous algebraic simplifications, the following is a small
199 sub-set of these optimizations.
203 i = j + 0 ; /* changed to */ i = j;
204 i /= 2; /* changed to */ i >>=
206 i = j - j ; /* changed to */ i = 0;
207 i = j / 1 ; /* changed to */ i = j;
210 <P>Note the subexpressions given above are generally introduced by macro expansions
211 or as a result of copy/constant propagation.
212 <H2><A NAME="Switch Statement"></A> <A NAME="ss6.6">6.6 'switch' statements.</A>
215 <P>SDCC changes switch statements to jump tables when the following conditions
219 <LI>The case labels are in numerical sequence , the labels need not be in order,
220 and the starting number need not be one or zero.</LI>
225 switch(i) { switch (i) {
230 case 3:... case 3: ...
236 <P>Both the above switch statements will be implemented using a jump-table.
239 <LI>The number of case labels is at least three, since it takes two conditional
240 statements to handle the boundary conditions.</LI>
241 <LI>The number of case labels is less than 84, since each label takes 3 bytes
242 and a jump-table can be utmost 256 bytes long. </LI>
244 <P>Switch statements which have gaps in the numeric sequence or those that
245 have more that 84 case labels can be split into more than one switch statement
246 for efficient code generation.
263 <P>If the above switch statement is broken down into two switch statements
280 <P>then both the switch statements will be implemented using jump-tables whereas
281 the unmodified switch statement will not be .
282 <H2><A NAME="bit shifting"></A> <A NAME="ss6.7">6.7 bit-shifting operations.</A>
285 <P>Bit shifting is one of the most frequently used operation in embedded programming
286 . SDCC tries to implement bit-shift operations in the most efficient way possible.
295 <P>generates the following code.
304 <P>In general SDCC will never setup a loop if the shift count is known. Another
324 <P>Note that SDCC stores numbers in little-endian format (i.e. lowest order
326 <H3><A NAME="bit rotation"></A> Bit-rotation: </H3>
328 <P>A special case of the bit-shift operation is bit rotation, SDCC recognizes
329 the following expression to be a left bit-rotation.
334 i = ( ( i << 1) | ( i >> 7));
338 <P>will generate the following code.
346 <P>SDCC uses pattern matching on the parse tree to determine this operation
347 .Variations of this case will also be recognized as bit-rotation i.e i = ((i
348 >> 7) | (i << 1)); /* left-bit rotation */
349 <H2><A NAME="Highest Order Bit"></A> <A NAME="ss6.8">6.8 Highest Order Bit.</A>
352 <P>It is frequently required to obtain the highest order bit of an integral
353 type (int,long,short or char types). SDCC recognizes the following expression
354 to yield the highest order bit and generates optimized code for it.
363 = (gint >> 15) & 1;
368 <P>Will generate the following code.
379 000F F5*02 66 mov _foo_hob_1_1,a
382 <P>Variations of this case however will NOT be recognized . It is a standard
383 C expression , so I heartily recommend this be the only way to get the highest
384 order bit, (it is portable). Of course it will be recognized even if it is
385 embedded in other expressions.
388 eg.xyz = gint + ((gint >> 15) & 1);
391 <P>will still be recognized.
392 <H2><A NAME="Peep-Hole"></A> <A NAME="ss6.9">6.9 Peep-hole optimizer.</A>
395 <P>The compiler uses a rule based , pattern matching and re-writing mechanism
396 for peep-hole optimization . It is inspired by 'copt' a peep-hole optimizer
397 by Christopher W. Fraser (cwfraser@microsoft.com). A default set of rules are
398 compiled into the compiler, additional rules may be added with the --peep-file
399 <filename> option. The rule language is best illustrated with examples.
404 mov a,%1 } by { mov %1,a
408 <P>The above rule will the following assembly sequence
421 <P>Note: All occurrences of a '%n' ( pattern variable ) must denote
422 the same string. With the above rule, the assembly sequence
429 <P>will remain unmodified. Other special case optimizations may be added by
430 the user (via --peep-file option), eg. some variants of the 8051 MCU allow
431 only 'AJMP' and 'ACALL' , the following two rules will change all 'LJMP' &
432 'LCALL' to 'AJMP' & 'ACALL'.
435 replace { lcall %1 } by { acall %1 }
437 replace { ljmp %1 } by { ajmp %1 }
440 <P>The inline-assembler' code is also passed through the peep hole optimizer,
441 thus the peephole optimizer can also be used as an assembly level macro expander.
442 The rules themselves are MCU dependent whereas the rule language infra-structure
443 is MCU independent. Peephole optimization rules for other MCU can be easily
444 programmed using the rule language.
445 <P>The syntax for a rule is as follows ,
448 rule := replace [ restart ] '{' <assembly sequence>
452 <assembly sequence> '\n'
454 '}' [if <functionName> ] '\n'
456 sequence> := assembly instruction (each instruction including labels must
457 be on a separate line).
460 <P>The optimizer will apply to the rules one by one from the top in the sequence
461 of their appearance, it will terminate when all rules are exhausted. If the
462 'restart' option is specified, then the optimizer will start matching the rules
463 again from the top, this option for a rule is expensive (performance), it is
464 intended to be used in situations where a transformation will trigger the same
465 rule again. A good example of this the following rule.
476 <P>Note that the replace pattern cannot be a blank, but can be a comment line.
477 Without the 'restart' option only the inner most 'pop' 'push' pair would be
495 <P>with the 'restart' option the rule will be applied again to the resulting
496 code and the all the 'pop' 'push' pairs will be eliminated to yield
503 <P>A conditional function can be attached to a rule. Attaching rules are somewhat
504 more involved, let me illustrate this with an example.
515 <P>The optimizer does a look-up of a function name table defined in function
516 'callFuncByName' in the source file SDCCpeeph.c , with the name 'labelInRange',
517 if it finds a corresponding entry the function is called. Note there can be
518 no parameters specified for these functions, in this case the use of '%5'
519 is crucial, since the function labelInRange expects to find the label in that
520 particular variable (the hash table containing the variable bindings is passed
521 as a parameter). If you want to code more such functions , take a close look
522 at the function labelInRange and the calling mechanism in source file SDCCpeeph.c.
523 I know this whole thing is a little kludgey , may be some day we will have
524 some better means. If you are looking at this file, you will also see the default
525 rules that are compiled into the compiler, you can your own rules in the default
526 set there if you get tired of specifying the --peep-file option.
528 <A HREF="SDCCUdoc-7.html">Next</A>
529 <A HREF="SDCCUdoc-5.html">Previous</A>
530 <A HREF="SDCCUdoc.html#toc6">Contents</A>