News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Optimising C code tutorial

Started by hutch--, August 22, 2005, 07:00:08 AM

Previous topic - Next topic

hutch--

I wrote this one last year when I was working on sorting algorithms. There is a mountain of old C junk awash on the internet and while a reasonable amount of it is crap code, you often get good algorithm design from some of the examples. After testing a heap of "shell" sorts, the fastest I could find was a design by Janet Incerpi in conjunction with Robert Sedgewick.

The tutorial uses the design as a reference and built the C code using VC6 CL.EXE with the maximum of optimisation turned on. A completely unoptimised version was built as a basis for manually optimising the ASM output from VC6 and a manually optimised version is included to demonstrate what can be done in a hour or so of optimisation using MASM.

Particularly at an algorithm level writing reusable code for libraries, the time involved in manually optimising C code at the asm level is worth the effort to gain additional performance not possible from a compiler. The example is classic 486 compatible code using integer registers only, no MMX or SSE(2) and will run on anything from a 486 upwards.

This code was optimised on my old 1.5gig PIV so the timings on the attached text file will be slow in comparison to a later box. I did rebuild the C code using the version of CL.EXE from the VCTOOLKIT but it was no faster than the code created by the VC6 version of CL.EXE .

The timings I get on the current 2.8 gig PIV are as follows.

Optimised C   = 266 to 282 MS
Unoptimised C = 454 to 469 MS
Optimised asm = 234 to 250 MS


If anything, the sample is a bit small for a later box and needs to be enlarged to get longer and more reliable timings.


[attachment deleted by admin]
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

PBrennick

Hutch,
Very nice presentation.

Paul
The GeneSys Project is available from:
The Repository or My crappy website

Mark Jones

Very nice to know, if I ever make anything worth optimizing. :)
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

hutch--

I like the basic technique and better still, its easy and useful practice at manually optimising assembler code. When I get a bit more time, I have a nice quick sort which was one of Robert Sedgewick's designs that I found written in Pascal dating from about 1988, ported it to basic, ported it from basic to C and converted it to masm with CL.EXE.

Its a good tchnique to this extent that it fully seperates algo design from code optimisation and as long as you have a test piece to make sure you did not mess aything up on the way, it allows you to concentrate on the actual GO FAST code details.

The punchline is that there is a mountain of old C code awash on the internet with a lot of the better stuff dating from before ANSI 89 that performs well when you test it in C with the optimisation maximised and this is a reasonably good test as to whether it will manually optimise well. It gives the assembler programmer the capacity to use a large body of mainstream code and still bash it into shape in assembler performance terms.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php