Optimal Code Generation For Machines With Multiregister Operations

Started by cman, January 18, 2010, 11:03:08 PM

Previous topic - Next topic

cman

I've seen a number of algorithms for generating optimal code for expressions with tree traversals methods and dynamic programming formulations for hypothetical machines with a number of general purpose registers and single register operations. Can anyone point me to any specialized algorithms that address the problem of optimal code generation for expressions with multiregister operations ( I've been looking at a few that formulate the problem as a linear programming problem , but have not found anything too thorough on the web ). Thanks for any information.

clive

Well I'm kind of thinking that you want to be searching for academic papers, perhaps dating back to the old RISC days, including Berkeley (SPARC) and Stanford (MIPS).

SPARC has a huge register set, the exact size depended on the implementation, but was basically spilled on to the stack to provide endless registers. MIPS has 32 registers, so might also be a good direction to dig in.

The University of Illinois at Urbana-Champaign did a lot of work with Intel on Itanium related compilers. While not the success Intel wanted, the Itanium had crate loads of registers, and a lots of compiler heuristics to allocate and sequence the use of registers and follow setup time rules normally enforced in hardware with stalls. Surely a gold mine for the kind of analysis and papers you are looking for. Intel also has done a lot of clever compiler work in the x86 field, and ARMs RealView compilers are really quite good.

Some random links
http://compilers.cs.ucla.edu/fernando/projects/puzzles/intro/
http://www.csc.lsu.edu/lcpc05/slides/lcpc05-slides-04.pdf
http://www-inst.eecs.berkeley.edu/~cs164/sp08/lectures/
http://www-inst.eecs.berkeley.edu/~cs164/sp08/lectures/lecture38.pdf

-Clive
It could be a random act of randomness. Those happen a lot as well.

Slugsnack

I recently coded a compiler for a subset of C to MIPS assembler. In MIPS most arithmetic operations take 3 parameters. One destination register, and either 2 source registers or 1 source + 1 immediate.

I had a register class which had a flag for free/not free. Then a registers class which will return a free register. When optimising for this, you need to think of the cases when the 'temporary' registers will be free and free them straight after. Also bear in mind that although a register is free, it can still mirror a variable until that variable changes value in which case that register becomes 'dirty'. To take advantage of this, I had a hash table mapping variables to clean registers so even after I had freed them, my compiler would search for any mappings to minimize memory operations.

Essentially to solve the problem you present, my method was to have a temporary flag and my code would detect when a temporary register was no longer needed and hence could be freed. I don't know about 'algorithm' but if you are interested in the above method I can try to give you a better explanation of it or the code I wrote for that part.