------------
A random set of notes about sdcc and how it works.
+Michael on the register allocation stage:
+-----------------------------------------
+Let's trace how the register allocator in the mcs51 port works.
+
+Some concepts:
+eBBlock
+ A basic block. A straight-line run of iCode that is entered only at its
+first instruction and left only at its last, so there are no jumps into or
+out of the middle. This makes it easy to optimise and analyse.
+iCode
+ Intermediate code. Provides the interface between the parser + optimiser
+and the code generator by providing an abstract machine with infinite registers
+which the parser can generate for and the back end can turn into real code.
+iTemps
+ An iTemp is a temporary register used in iCode. These will eventually
+be either replaced, allocated into a register, or placed onto the stack by the
+backend.
+Live range
+ The live range of an iTemp is the part of the code that the iTemp is used
+over. Generally the live range of an iTemp is from its first assignment to its
+last use.
+
+Input to mcs51_assignRegisters is an array of basic blocks. Assign
+registers is normally called at the end of a function.
+
+In pseudo code,
+1. For each basic block, pack the registers in the block.
+   In this case register packing consists of:
+   a. Remove any unneeded iTemps that are just used in assignments.
+   b. Mark anything that can be rematerialised as rematerialisable.
+      There is no way I spelt that correctly. Something is rematerialisable
+      if it can be generated easily and is constant, and hence doesn't need
+      to be cached away in an iTemp. An example is the address of something.
+   c. Pack iTemps that are only used once into normally unavailable registers.
+   Register packing removes unneeded iTemps.
+2. Determine what number and type of registers are needed for each
+   live range.
+   For each live range:
+     If the iTemp lives for zero time, don't bother assigning.
+     If it's not an iTemp, skip for now.
+     If it's a conditional (determined in the register packing), skip as it will
+     be stored in carry.
+     If the iTemp is already packed from 1.c, skip.
+     If the iTemp is remat and some other magic, skip.
+     Else set the number and type of registers based on the size of the iTemp.
+3. Assign registers for each segment.
+   For each iCode, do
+     If it is an IPOP (pop of an iTemp at the end of a block), reset the live range.
+     De-assign the live ranges of the iTemps that expire here.
+       For each iTemp, do
+         If this iTemp is still alive, skip.
+         If this iTemp is spilt on the stack, free the location and continue.
+         If there are no registers assigned (?), continue.
+         Some magic using IFX and IPOP.
+         If the iTemp has no registers, continue.
+         If the result of this iCode doesn't yet have registers, allocate them now. Weird.
+         Deallocate the registers used.
+     Skip instructions that don't need registers (IFX, JUMPTABLE, POINTER_SET).
+     Only assign registers to the result of this iCode.
+     If the iCode has registers, or has been spilt, continue.
+     If this will cause a spill as it needs more registers than are free, then
+       Find those that can be spilt.
+       Spill this if it's easy.
+       Spill this if it's the least used.
+     Allocate registers to the result iTemp.
+     If any registers in the result are shared with the operand, make them line up.
+4. Create the register mask for each segment.
+   For each iCode, do
+     Set the used register bit vector from the used registers.
+     Mark these registers as used in the higher function. This lets the generator
+     decide which registers need to be saved when calling or being called by a function.
+     Hmm. It seems to re-setup the used register bit vector.
+5. Redo the stack offsets.
+6. Turn the basic blocks into an intermediate code chain.
+   Takes the array of basic blocks and pulls them out into one iCode chain.
+7. Optimise the labels in the iCode chain.
+   Skipped if the label optimisations are turned off.
+   Remove any gotos that go to the next line.
+   Simplify any chained gotos.
+   Remove unreferenced labels.
+   Remove unreferenced code.
+8. Generate the mcs51 code from the iCode chain.
+9. Deallocate everything (registers and stack locations).
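
As a rough illustration of step 3, here is a minimal sketch in C of the
de-assign, allocate or spill loop. It is not the sdcc code. LiveRange,
NUM_REGS, SPILLED and assign_registers are hypothetical names, each iTemp is
assumed to need exactly one register, the live ranges are assumed to be sorted
by first definition, and the spill policy is simply to spill the newcomer
rather than the least used iTemp.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 2      /* hypothetical: a tiny register file */
#define SPILLED  (-1)   /* marks an iTemp that lives on the stack */

/* Hypothetical, simplified live range: one register per iTemp. */
typedef struct {
    int first_def;  /* sequence number of the first assignment */
    int last_use;   /* sequence number of the last use */
    int reg;        /* register index or SPILLED, set by the allocator */
} LiveRange;

/* Assign registers to live ranges sorted by first_def.
   Returns the number of live ranges that had to be spilt. */
int assign_registers(LiveRange *lr, size_t n)
{
    int owner[NUM_REGS];    /* live range holding each register, or -1 */
    int spills = 0;
    size_t i;
    int r;

    assert(lr != NULL || n == 0);
    for (r = 0; r < NUM_REGS; r++)
        owner[r] = -1;

    for (i = 0; i < n; i++) {
        /* De-assign the live ranges that expired before this definition. */
        for (r = 0; r < NUM_REGS; r++)
            if (owner[r] >= 0 && lr[owner[r]].last_use < lr[i].first_def)
                owner[r] = -1;

        /* Allocate a free register to the result, or spill it. */
        lr[i].reg = SPILLED;
        for (r = 0; r < NUM_REGS; r++) {
            if (owner[r] < 0) {
                owner[r] = (int)i;
                lr[i].reg = r;
                break;
            }
        }
        if (lr[i].reg == SPILLED)
            spills++;
    }
    return spills;
}
```

With two registers and the live ranges (0,3), (1,2), (2,5) and (4,6), the
third range is spilt because both registers are still held at sequence 2, and
the fourth reuses the first register once the earlier ranges have expired.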
+
Sandeep:
--------
The Register Allocation story.
--- /dev/null
+% ``Test Suite Design''
+% $Id$
+\documentclass{widearticle}
+\usepackage{url}
+
+\begin{document}
+\title{Proposed Test Suite Design}
+\author{Michael Hope (michaelh@juju.net.nz)}
+\date{\today}
+\maketitle
+
+\begin{abstract}
+This article describes the goals, requirements, and suggested
+specification for a test suite for the output of the Small Device C
+Compiler (sdcc). Also included is a short list of existing works.
+\end{abstract}
+
+\section{Goals}
+The main goals of a test suite for sdcc are
+\begin{enumerate}
+ \item To allow developers to run regression tests to check that
+core changes do not break any of the many ports.
+ \item To verify the core.
+ \item To allow developers to verify individual ports.
+ \item To allow developers to test port changes.
+\end{enumerate}
+
+This design only covers the generated code. It does not cover a
+test/unit test framework for the sdcc application itself, which may be
+useful.
+
+One side effect of (1) is that it requires that the individual ports
+pass the tests originally. This may be too hard. See the section on
+Exceptions below.
+
+\section{Requirements}
+\subsection{Coverage}
+The suite is intended to cover language features only. Hardware
+specific libraries are explicitly not covered.
+
+\subsection{Permutations}
+The ports often generate different code for handling different types
+(Byte, Word, DWord, and the signed forms). Meta information
+could be used to permute the different test cases across the different
+types.
+
+\subsection{Exceptions}
+The different ports are all at different levels of development. Test
+cases must be able to be disabled on a per port basis. Permutations
+also must be able to be disabled on a port level for unsupported
+cases. Disabling, as opposed to enabling, on a per port basis seems
+more maintainable.
+
+\subsection{Running}
+The tests must be able to run unaided. The test suite must run on all
+platforms that sdcc runs on. A good minimum may be a subset of the Unix
+command set and common tools, provided by default on a Unix host and
+provided through cygwin on a Windows host.
+
+The test suite should be divisible, so that the failing
+or interesting tests may be run separately.
+
+\subsection{Artifacts}
+The test code within the test cases should not generate artifacts. An
+artifact occurs when the test code itself interferes with the test and
+generates an erroneous result.
+
+\subsection{Emulators}
+sdcc is a cross compiler. As such, an emulator is needed
+for each port to run the tests.
+
+\section{Existing works}
+\subsection{DejaGnu}
+DejaGnu is a toolkit written in Expect designed to test an interactive
+program. It provides a way of specifying an interface to the program,
+and given that interface a way of stimulating the program and
+interpreting the results. It was originally written by Cygnus
+Solutions for running against development boards. I believe the gcc
+test suite is written against DejaGnu, perhaps partly to test the
+Cygnus ports of gcc on target systems.
+
+\subsection{gcc test suite}
+I don't know much about the gcc test suite. It was recently removed
+from the gcc distribution due to issues with copyright ownership. The
+code I saw from older distributions seemed more concerned with
+esoteric features of the language.
+
+\subsection{xUnit}
+The xUnit family, in particular JUnit, is a library of in-test
+assertions, test wrappers, and test suite wrappers designed mainly for
+unit testing. PENDING: More.
+
+\subsection{CoreLinux++ Assertion framework}
+While not a test suite system, the assertion framework is an
+interesting model for the types of assertions that could be used.
+They include pre-condition, post-condition, invariants, conditional
+assertions, unconditional assertions, and methods for checking
+conditions.
+
+\section{Specification}
+This specification borrows from the JUnit style of unit testing and
+the CoreLinux++ style of assertions. The emphasis is on
+maintainability and ease of writing the test cases.
+
+\subsection{Terms}
+PENDING: Align these terms with the rest of the world.
+
+\begin{itemize}
+ \item An \emph{assertion} is a statement of how things should be.
+PENDING: Better description, an example.
+ \item A \emph{test point} is the smallest unit of a test suite,
+and consists of a single assertion that passes if the test passes.
+ \item A \emph{test case} is a set of test points that test a
+certain feature.
+ \item A \emph{test suite} is a set of test cases that test a
+certain set of features.
+\end{itemize}
+
+\subsection{Test cases}
+Test cases shall be contained in their own C file, along with the meta
+data on the test. Test cases shall be contained within functions
+whose names start with 'test' and which are descriptive of the test
+case. Any function that starts with 'test' will be automatically run in
+the test suite.
+
+To make the automatic code generation easier, the C code shall have
+this format
+\begin{itemize}
+ \item Test functions shall start with 'test' to allow
+automatic detection.
+ \item Test functions shall follow the K\&R indentation style for ease
+of detection. That is, the function name shall start in the left
+column on a new line below the return specification.
+\end{itemize}
+
+\subsection{Assertions}
+All assertions shall log the line number, function name, and test
+case file when they fail. Most assertions can have a more descriptive
+message attached to them. Assertions will be implemented through
+macros to get at the line information. This may cause trouble with
+artifacts.
+
+The following definitions use C++ style default arguments where
+optional messages may be inserted. All assertions use double opening
+and closing brackets in the macros to allow them to be compiled out
+without any side effects. While this is not required for a test
+suite, they are there in case any of this code is incorporated into the
+main product.
+
+Borrowing from JUnit, the assertions shall include
+\begin{itemize}
+ \item FAIL((String msg = ``Failed'')). Used when execution should
+not get here.
+ \item ASSERT((Boolean cond, String msg = ``Assertion failed'')).
+Fails if cond is false. Parent to REQUIRE and ENSURE.
+\end{itemize}
+
+JUnit also includes many sub-cases of ASSERT, such as assertNotNull,
+assertEquals, and assertSame.
+
+CoreLinux++ includes the extra assertions
+\begin{itemize}
+ \item REQUIRE((Boolean cond, String msg = ``Precondition
+failed'')). Checks preconditions.
+ \item ENSURE((Boolean cond, String msg = ``Postcondition
+failed'')). Checks postconditions.
+ \item CHECK((Boolean cond, String msg = ``Check failed'')). Used
+to call a function and to check that the return value is as expected,
+e.g. CHECK((fread(buf, 1, 10, in) == 10)). Very similar to ASSERT, but
+the function still gets called in a release build.
+ \item FORALL and EXISTS. Used to check conditions within part of
+the code. For example, can be used to check that a list is still
+sorted inside each loop of a sort routine.
+\end{itemize}
+
+All of FAIL, ASSERT, REQUIRE, ENSURE, and CHECK shall be available.
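
The double bracket convention can be sketched in plain C as follows. This is
only an illustration under stated assumptions, not the suite's implementation:
set\_location, report, check\_cond, fail\_now and the TESTS\_COMPILED\_OUT
switch are hypothetical names, and because C has no default arguments the
optional message is left out, so the calls look like ASSERT((cond)) and
FAIL(()). The bracketed argument list reaches the macro as a single argument,
so it can be handed to a function unchanged or dropped as a whole.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bookkeeping; none of these names come from sdcc. */
static const char *loc_file = "";
static const char *loc_func = "";
static int loc_line = 0;
static int fail_count = 0;

/* Record where the assertion currently being evaluated lives. */
static void set_location(const char *file, const char *func, int line)
{
    assert(file != NULL && func != NULL);
    loc_file = file;
    loc_func = func;
    loc_line = line;
}

/* Log and count a failure at the recorded location. */
static void report(const char *msg)
{
    fail_count++;
    printf("%s:%d: %s: %s\n", loc_file, loc_line, loc_func, msg);
}

static void check_cond(int cond) { if (!cond) report("Assertion failed"); }
static void fail_now(void) { report("Failed"); }

#ifdef TESTS_COMPILED_OUT
/* The whole bracketed list is one macro argument, so it vanishes cleanly. */
#define ASSERT(args) ((void)0)
#define FAIL(args)   ((void)0)
/* CHECK still evaluates its expression, so the function call survives. */
#define CHECK(args)  ((void)args)
#else
#define ASSERT(args) (set_location(__FILE__, __func__, __LINE__), check_cond args)
#define FAIL(args)   (set_location(__FILE__, __func__, __LINE__), fail_now args)
#define CHECK(args)  ASSERT(args)
#endif
```

A test point then reads ASSERT((i == 1)); and on failure the log line carries
the file, line and function recorded by the macro. REQUIRE and ENSURE could be
defined the same way with their own messages.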
+
+\subsection{Meta data}
+PENDING: It's not really meta data.
+
+Meta data includes permutation information, exception information, and
+permutation exceptions.
+
+Meta data shall be global to the file. Meta data names consist of the
+lower case alphanumerics. Test case specific meta data (fields) shall
+be stored in a comment block at the start of the file. This is only
+due to style.
+
+A field definition shall consist of
+\begin{itemize}
+ \item The field name
+ \item A colon.
+ \item A comma separated list of values.
+\end{itemize}
+
+The values shall be stripped of leading and trailing white space.
+
+Permutation exceptions are by port only. Exceptions to a field are
+specified by a modified field definition. An exception definition
+consists of
+
+\begin{itemize}
+ \item The field name.
+ \item An opening square bracket.
+ \item A comma separated list of ports the exception applies to.
+ \item A closing square bracket.
+ \item A colon.
+ \item The values to use for this field for these ports.
+\end{itemize}
+
+An instance of the test case shall be generated for each permutation
+of the test case specific meta data fields.
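
One way to enumerate the permutations is an odometer over the field values.
The sketch below assumes the fields have already been parsed into memory and
that any port exception has already replaced the value list for the port being
built. MetaField, count\_permutations and next\_permutation are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one meta data field. */
typedef struct {
    const char *name;            /* field name, e.g. "type" */
    const char *const *values;   /* values for the current port */
    size_t count;                /* number of values */
} MetaField;

/* Number of test case instances: the product of the value counts. */
size_t count_permutations(const MetaField *fields, size_t nfields)
{
    size_t total = 1;
    size_t i;

    assert(fields != NULL || nfields == 0);
    for (i = 0; i < nfields; i++)
        total *= fields[i].count;
    return total;
}

/* Advance index[] (one entry per field) to the next permutation, like an
   odometer. Returns 1 if a new permutation was produced, 0 once the
   indices have wrapped back to all zeros. */
int next_permutation(const MetaField *fields, size_t nfields, size_t *index)
{
    size_t i;

    assert(index != NULL || nfields == 0);
    for (i = 0; i < nfields; i++) {
        if (++index[i] < fields[i].count)
            return 1;
        index[i] = 0;   /* this digit wrapped, carry into the next field */
    }
    return 0;
}
```

Starting from all zero indices, a generator would emit one instance, call
next\_permutation, and repeat until it returns 0. Three types and three
storage classes give nine instances; a port restricted to two types gives six.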
+
+The runtime meta fields are
+\begin{itemize}
+ \item port - The port this test is running on.
+ \item testcase - The name of this test case.
+ \item function - The name of the current function.
+\end{itemize}
+
+Most of the runtime fields are not very usable. They are there for
+completeness.
+
+Meta fields may be accessed inside the test case by enclosing them in
+curly brackets. The curly brackets will be interpreted anywhere
+inside the test case, including inside quoted strings. Field names that
+are not recognised will be passed through including the brackets.
+Note that it is therefore impossible to use some strings within the
+test case.
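
The substitution rule can be sketched as a small C routine. This is an
assumption about how a generator might do it, not a description of an existing
tool: Field and expand\_fields are hypothetical names. Recognised names in
curly brackets are replaced by their value, and anything else, including the
ordinary braces of a C function body, is copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair for one meta field in one permutation. */
typedef struct {
    const char *name;
    const char *value;
} Field;

/* Copy in to out, replacing each {name} whose name is a known field by its
   value. Unknown names are passed through including the brackets.
   Returns 0 on success, -1 if out is too small. */
int expand_fields(const char *in, const Field *fields, size_t nfields,
                  char *out, size_t outsize)
{
    size_t o = 0;

    assert(in != NULL && out != NULL && outsize > 0);
    while (*in != '\0') {
        const char *text = in;  /* default: copy one character as-is */
        size_t len = 1;

        if (*in == '{') {
            const char *close = strchr(in, '}');
            if (close != NULL) {
                size_t namelen = (size_t)(close - in - 1);
                size_t i;
                for (i = 0; i < nfields; i++) {
                    if (strlen(fields[i].name) == namelen &&
                        strncmp(fields[i].name, in + 1, namelen) == 0) {
                        text = fields[i].value;
                        len = strlen(text);
                        in = close;  /* the in++ below steps past '}' */
                        break;
                    }
                }
            }
        }
        if (o + len >= outsize)
            return -1;           /* keep room for the terminating NUL */
        memcpy(out + o, text, len);
        o += len;
        in++;
    }
    out[o] = '\0';
    return 0;
}
```

With class set to static and type set to char, the line
\{class\} \{type\} i = 0; expands to static char i = 0; while a body such as
\{ i = i + 1; \} is left alone because its contents do not match a field name.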
+
+Test case function names should include the permuted fields in the
+name to reduce name collisions.
+
+\subsection{An example}
+The following code generates a simple increment test for all combinations of the
+storage classes and all combinations of the data sizes. This is a
+bad example as the optimiser will often remove most of this code.
+
+\begin{verbatim}
+/** Test for increment.
+
+    type: char, int, long
+
+    Z80 port does not fully support longs (4 byte)
+    type[z80]: char, int
+
+    class: "", register, static
+*/
+static void
+testInc{class}{type}(void)
+{
+    {class} {type} i = 0;
+
+    i = i + 1;
+
+    ASSERT((i == 1));
+}
+\end{verbatim}
+
+\end{document}
register where possible. This requires knowledge of what the code
generator touches for a given instruction.
-\subsection{Problems}
+The first generation register allocator will only pack assignments and mark
+remat. variables. Only the register management is processor specific. The
+allocator may ask for a given size register or if a given size register is
+available. Note that only whole registers may be returned. For example,
+allocation will fail if a sixteen bit register is requested and no pair
+is available, even if two eight bit registers are available. Note that on
+the Z80, GBZ80, and i186 a request for a 32 bit register will always fail.
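
The whole register rule can be illustrated with a small sketch. It assumes a
hypothetical Z80-like register file of four eight bit registers that form the
pairs BC and DE. The names in\_use and alloc\_reg are made up for this
example and are not the port's real interface.

```c
#include <assert.h>

/* Hypothetical register file: B and C form one pair, D and E the other. */
enum { REG_B, REG_C, REG_D, REG_E, NUM_REGS };
static int in_use[NUM_REGS];

/* Allocate a register of the given size in bits.
   Returns the first register of the allocation, or -1 on failure. */
int alloc_reg(int bits)
{
    int r;

    assert(bits > 0);
    if (bits == 8) {
        for (r = 0; r < NUM_REGS; r++) {
            if (!in_use[r]) {
                in_use[r] = 1;
                return r;
            }
        }
    } else if (bits == 16) {
        /* Only a whole pair will do: both halves must be free. */
        for (r = 0; r + 1 < NUM_REGS; r += 2) {
            if (!in_use[r] && !in_use[r + 1]) {
                in_use[r] = in_use[r + 1] = 1;
                return r;
            }
        }
    }
    return -1;  /* no whole register of this size, e.g. any 32 bit request */
}
```

If BC is taken and D is in use, a sixteen bit request fails even though E is
still free, while an eight bit request succeeds and returns E.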
+
+\subsection{Code generator}
+The possible operations are:
+\begin{itemize}
+ \item NOT - Logical not. 0 $\rightarrow$ 1, others $\rightarrow$ 0.
+ \item CPL - Bitwise complement.
+ \item UMINUS - Unary minus. result = 0 - left.
+ \item IPUSH - Push immediate onto the stack.
+ \item CALL - Call a function.
+ \item PCALL - Call via pointer.
+ \item FUNCTION - Emit the function prologue.
+ \item ENDFUNCTION - Emit the function epilogue.
+ \item RET - Load the return value and jump to end of function.
+ \item LABEL - Generate a local label.
+ \item GOTO - Jump to a local label.
+ \item Arithmetic - +, -, *, /, \%.
+ \item Comparison - LT, GT, LEQ, GEQ, !=, ==.
+ \item Logical - \&\&, \textbar\textbar
+ \item Binary - AND, OR, XOR.
+ \item Shift - RRC, RLC, LSR, LSL.
+ \item Pointer - Set and Get.
+ \item Assign.
+ \item IF jump.
+ \item Misc - Jump table, cast, address of.
+\end{itemize}
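
The three unary operations at the top of the list differ only slightly, so a
small reference model may help. The sketch below assumes an eight bit operand,
and op\_not, op\_cpl, op\_uminus and check\_unary\_ops are hypothetical names
used only for illustration.

```c
#include <assert.h>

typedef unsigned char byte;   /* assumes an 8 bit char */

/* NOT: logical not. 0 becomes 1, anything else becomes 0. */
byte op_not(byte left) { return (byte)(left == 0); }

/* CPL: bitwise complement. */
byte op_cpl(byte left) { return (byte)~left; }

/* UMINUS: result = 0 - left, wrapping modulo 256. */
byte op_uminus(byte left) { return (byte)(0 - left); }

/* Small self-check showing how the three differ on the same input. */
void check_unary_ops(void)
{
    assert(op_not(0x01) == 0x00);
    assert(op_cpl(0x01) == 0xFE);
    assert(op_uminus(0x01) == 0xFF);
}
```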
\end{document}
\ No newline at end of file