X-Git-Url: https://git.gag.com/?a=blobdiff_plain;f=doc%2Frandom-notes.txt;h=301c6de537e14644a9957a6c7883f5e811ec3f4d;hb=4bafa836005a601b321ee04bb15e0c27242036df;hp=7ec7b844360766bc4d3be06153ed9848ae74db0d;hpb=642a343b925965aa06211d624dc4cb0063c2bcfb;p=fw%2Fsdcc diff --git a/doc/random-notes.txt b/doc/random-notes.txt index 7ec7b844..301c6de5 100644 --- a/doc/random-notes.txt +++ b/doc/random-notes.txt @@ -1,8 +1,140 @@ Random notes ------------ - A random set of notes about sdcc and how it works. +Michael on the register allocation stage: +----------------------------------------- +Lets trace how the register allocator in the mcs51 port works. + +Some concepts: +eBBlock + A basic block. I can't remember the conditions, but a basic block +is one that is easy to optimise and analyse. I guess this means that it has +a nice set of assignments and a reasonably straight flow. +iCode + Intermediate code. Provides the interface between the parser + optimiser +and the code generator by providing an abstract machine with infinite registers +which the parser can generate for and the back end can turn into real code. +iTemps + An iTemp is a temporary register used in iCode. These will eventually +be either replaced, allocated into a register, or placed onto the stack by the +backend. +Live range + The live range of an iTemp is the part of the code that the iTemp is used +over. Generally the live range of an iTemp is from its first assignment to its +last use. + +Input to mcs51_assignRegisters is an array of basic blocks. Assign +registers is normally called at the end of a function. + +In pseudo code, +1. For each basic block, pack the registers in the block. + In this case register packing consists of: + Remove any unneded iTemps that are just used in assignments. + Mark anything that can be rematerialised as rematerialisable. + There is no way I spelt that correctly. Something is rematerialisable + if it can be generated easily and is constant, and hence dosn't need + to be cached away in an iTemp. An example is the address of something. + Packs iTemps that are only used once into normally unavailable registers. + Register packing removes unneeded iTemps. +2. Determine what number and type of registers are needed for each + live range. + It does + If the iTemp lives for zero time, don't bother assigning + If its not an iTemp, skip for now. + If its a conditional (determined in the register packing), skip as it will + be stored in carry. + If the iTemp is already packed from 1.c, skip + If the iTemp is remat and some other magic, skip. + Else set the number and type of registers based on the size of the iTemp. +3. Assign registers for each segment. + For each iCode, do + If it is a IPOP (pop of an iTemp at the end of a block), reset the LR. + De-assign the live ranges of the iTemps that expire here. + For each iTemp, do + If this iTemp is still alive, skip + If this iTemp is spilt on the stack, free the location and continue. + If there are no registers assigned (?), continue. + Some magic using IFX and IPOP + If the iTemp has no registers, continue. + If the result of this iCode doesnt yet have registers, allocate them now. Weird. + Deallocate the registers used. + Skip instructions that dont need registers (IFX, JUMPTABLE, POINTER_SET) + Only assign registers to the result of this iCode. + If the iCode has registers, or has been spilt, continue. + If this will cause a spill as it needs more registers than are free, then + Find those that can be spilt. + Spill this if its easy. + Spill this if its the least used. + Allocate registers to the result iTemp + If any registers in the result are shared with the operand, make them line up. +4. Create the register mask for each segment. + For each iCode, do + Set the used register bit vector from the used registers. + Mark these registers as used in the higher function. This lets the generator + decide which registers need to be saved when calling or being called by a function. + Hmm. It seems to re-setup the used register bit vector. +5. Redo the stack offsets. +6. Turn the basic blocks into an intermediate code chain. + Takes the array of basic blocks and pulls them out into one iCode chain. +7. Optimise the labels in the iCode chain. + Skipped if the label optimisations are turned off. + Remove any gotos that go to the next line. + Simplify any chained gotos + Remove unreferenced labels + Remove unreferenced code. +7. Generate the mcs51 code from the iCode chain. +8. Deallocate everything (registers and stack locations). + +Sandeep: +-------- +======= +Sandeep: +-------- +The Register Allocation story. + +Before I get into this there are a few important fields +on the iCode & oprtand data structures that need to be +addressed. + +iCode. +----- + ->key - is an unique number assigned to this + iCode when this icode is allocated. + + ->seq - sequence number of the iCode given in + ascending order of execution. + +operand. +------- + ->key - unique number for an operand when operand + is allocated. + +OP_LIVEFROM(op) - sequence number of the iCode where it was + first defined. Computed in SDCClrange.c + +OP_LIVETO(op) - sequence number of the iCode where it is + last used. Computed in SDCClrange.c + + + + +Sandeep: +-------- +Adding memory maps for AVR, setup default map for +local & global variables in port.h will be initialised +by the __finaliseOptions() [good functions]..some +kludges remain because of the "overlay" segment for 8051. + +The memory segment stuff will have to be made more flexible. +Currently there does not seem to be a 1-to-1 correspondence +between storage class & memory map. (should there be??). + +Also Added check for memory map in SDCCmem.c allocLocal +and allocglobal for "eeprom" memory space (AVR). + + + Michael: -------- Tracing parmBytes and function calls. @@ -48,7 +180,7 @@ It seems that IS_SPEC() is being set, deferencing *s so that it's size is sizeof(char) == 1. It's either a SPECIFIER or a DECLARATOR - seems that were the wrong way around. This is set in SDCCsymt.c, SDCCval.c, and the yacc file. SDCCsymt.c and SDCCval.c havnt really changed in 5 days - must -be SDCC.y. Nope, no changes. diff against 5 days ago shows only intersting +be SDCC.y. Nope, no changes. diff against 5 days ago shows only interesting changes are in SDCCicode. Same with -14 days. Michael @@ -76,3 +208,175 @@ which is more like: First looking in SDCCicode.c for the stuff that generates function calls: Probably a bug in geniCodeParams. Nope, a bug in my stuff :) + +Michael +------- +Comparing the dhrystone code vs Hitech C v7.50 + +cp: + ld hl,__r + ld (_Next_Ptr_Glob),hl +vs: + ld hl,#__r + ld iy,#_Next_Ptr_Glob + ld 0(iy),l + ld 1(iy),h + +cp: + ld a,#<__r + add a,#0x27 + ld iy,#_Ptr_Glob + ld 0(iy),a + ld a,#>__r + adc a,#0x00 + ld 1(iy),a +vs: + ld hl,__r+027h + ld (_Ptr_Glob),hl + +cp: + ld iy,#_Next_Ptr_Glob + ld a,0(iy) + ld (hl),a + inc hl + ld a,1(iy) + ld (hl),a + +vs: + ld bc,(_Next_Ptr_Glob) + ld hl,(_Ptr_Glob) + ld (hl),c + inc hl + ld (hl),b + + +cp: + ld bc,(_Ptr_Glob) + ld l,c + ld h,b + inc hl + inc hl + ld (hl),#0x00 + inc hl + ld (hl),#0x00 +vs: + ld iy,(_Ptr_Glob) + ld (iy+2),0 + ld (iy+3),0 + + +strcpy is inlined as: +u12: + ld a,(hl) + ldi + or a + jp nz,u12 + +cp: + ld a,#<_Arr_2_Glob + add a,#0x2E + ld -80(ix),a + ld a,#>_Arr_2_Glob + adc a,#0x03 + ld -79(ix),a +; genAssign (pointer) +; AOP_STK for _main_sloc1_1_0 + ld l,-80(ix) + ld h,-79(ix) + ld (hl),#0x0A + inc hl + ld (hl),#0x00 +vs: + ld hl,0Ah + ld (_Arr_2_Glob+032Eh),hl + +cp: + ld a,-72(ix) + or a,a + jp nz,00126$ + ld a,-71(ix) + and a,#0x03 +; Rule 4: Changed jp order + jp z,00127$ +00126$: + jp 00102$ +00127$: + +vs: + ld (ix+-7),c + ld (ix+-6),b + ld a,c + ld l,a + ld a,b + and 03h + ld h,a + ld a,l + or h + jp nz,l12 + +cp: + ld a,-82(ix) + or a,-81(ix) + sub a,#0x01 + ld a,#0x00 + rla + ld iy,#_Bool_Glob + ld 0(iy),a + ld 1(iy),#0x00 + +vs: + ld a,l + or h + ld hl,01h + jp z,u42 + dec hl +u42: + ld (_Bool_Glob),hl + +;C:\TEMP\DHRY.C: 173: Int_3_Loc = 5 * Int_1_Loc - Int_2_Loc; +is turned into: + ld l,(ix+-13) + ld h,(ix+-12) + ld d,h + ld e,l + add hl,hl + add hl,hl + add hl,de + +Comapring two types of cmpeq: + ld a,c ; 4 + cp a,#0xFB ; 7 + jp nz,00119$ ; 10 + ld a,b ; 4 + cp a,#0xFF ; 7 +; Rule 5: Changed jump logic + jp z,00101$ ; 10 +00119$: + ; 21 to 42 + +vs: + ld hl,#val ; 10 + or a ; 4 + sbc hl,bc ; 15 + jp z,... ; 10 + ; Always 39 - worse + +Comaring break even point for shift: + ld a,#0x03+1 ; 7 + jp 00114$ ; 10 +00113$: + sra b ; 8 + rr c ; 8 +00114$: + dec a ; 4 + jp nz,00113$ ; 10 + + ; Bytes: 2+3+1+1+1+3 = 11 + ; t-states = 17+n*30 +vs + sra b + rr c ... + ; Bytes: 2*n + ; t-states = 16*n + + Ah, pick 4 \ No newline at end of file