Preserve carry flag in loop

Eddy · August 29, 2005, 08:33:14 AM

Hi all,

I have a routine that adds two multi-precision (arbitrary-precision) integers, so the operands can be many dwords long.
For speed, I add them dword by dword, with the loop unrolled to handle 4 dwords per iteration.
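
As a reference for what the loop has to compute, here is a rough C sketch of the same addition. It assumes the operands are stored least significant dword first and that the length is a multiple of 4. The names mp_add and adc32 are made up for illustration and are not part of the routine in this post. In C the carry lives in an ordinary variable, so the loop counter cannot disturb it; in assembly it lives in CF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper modelling one ADC step: returns the low 32 bits of
   a + b + *carry and leaves the outgoing carry (0 or 1) in *carry. */
static uint32_t adc32(uint32_t a, uint32_t b, unsigned *carry)
{
    uint32_t s = a + b;          /* may wrap around */
    unsigned c1 = s < a;         /* wrapped while adding a + b */
    uint32_t r = s + *carry;
    unsigned c2 = r < s;         /* wrapped while adding the incoming carry */
    *carry = c1 | c2;            /* at most one of the two can be set */
    return r;
}

/* hr = h1 + h2 over n dwords, least significant dword first.
   n must be a multiple of 4 because the body is unrolled four times.
   Returns the carry out of the most significant dword. */
unsigned mp_add(uint32_t *hr, const uint32_t *h1, const uint32_t *h2, size_t n)
{
    unsigned carry = 0;
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        hr[i]     = adc32(h1[i],     h2[i],     &carry);
        hr[i + 1] = adc32(h1[i + 1], h2[i + 1], &carry);
        hr[i + 2] = adc32(h1[i + 2], h2[i + 2], &carry);
        hr[i + 3] = adc32(h1[i + 3], h2[i + 3], &carry);
    }
    return carry;
}
```

The carry returned by one adc32 call feeds the next one, including across the end of one group of four and the start of the next.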

This is the basic loop. Note that:
- esi starts out negative and is incremented up to zero
- eax, ebx and ecx actually point to the END of the operands and the result (that's why esi must be negative initially)

Code:


    Lab_Add1:
        mov  edx, [eax+esi]       'get dword of h1
        adc  edx, [ebx+esi]       'add dword of h2
        mov  [ecx+esi], edx       'store result in hr

        mov  edx, [eax+esi+4]     'get dword of h1
        adc  edx, [ebx+esi+4]     'add dword of h2
        mov  [ecx+esi+4], edx     'store result in hr

        mov  edx, [eax+esi+8]     'get dword of h1
        adc  edx, [ebx+esi+8]     'add dword of h2
        mov  [ecx+esi+8], edx     'store result in hr

        mov  edx, [eax+esi+12]    'get dword of h1
        adc  edx, [ebx+esi+12]    'add dword of h2
        mov  [ecx+esi+12], edx    'store result in hr

        ADD esi, 16               'increment loop counter
        
    jnz Lab_Add1

There's one little annoyance: 'ADD esi, 16' modifies the carry flag.
I don't want that, since the CF produced by the last ADC must survive into the next iteration.

Here is a (not so elegant/efficient) solution.
Does anyone know of a more efficient solution?

Code:


    Lab_Add1:
        mov  edx, [eax+esi]       'get dword of h1
        adc  edx, [ebx+esi]       'add dword of h2
        mov  [ecx+esi], edx       'store result in hr

        mov  edx, [eax+esi+4]     'get dword of h1
        adc  edx, [ebx+esi+4]     'add dword of h2
        mov  [ecx+esi+4], edx     'store result in hr

        mov  edx, [eax+esi+8]     'get dword of h1
        adc  edx, [ebx+esi+8]     'add dword of h2
        mov  [ecx+esi+8], edx     'store result in hr

        mov  edx, [eax+esi+12]    'get dword of h1
        adc  edx, [ebx+esi+12]    'add dword of h2
        mov  [ecx+esi+12], edx    'store result in hr

        rcl edi, 1                'save the carry flag in bit 0 of edi, since the ADD overwrites it
        ADD esi, 16               'increment loop counter (sets ZF for the jnz)
        rcr edi, 1                'restore the saved carry flag (rcr leaves ZF alone)
        
    jnz Lab_Add1
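
To make the rcl/rcr trick concrete, here is a small C model of the two rotates. It is only a sketch: RegCf, rcl1, rcr1 and save_clobber_restore are hypothetical names, and the model tracks a single register plus CF. It shows that CF comes back unchanged, while bit 31 of the scratch register ends up holding whatever carry the ADD produced, so edi must not contain anything that matters.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one 32-bit register plus the carry flag. */
typedef struct { uint32_t reg; unsigned cf; } RegCf;

/* rcl reg, 1: old CF enters bit 0, old bit 31 becomes the new CF. */
static RegCf rcl1(RegCf s)
{
    RegCf r;
    assert(s.cf <= 1);
    r.reg = (s.reg << 1) | s.cf;
    r.cf = s.reg >> 31;
    return r;
}

/* rcr reg, 1: old CF enters bit 31, old bit 0 becomes the new CF. */
static RegCf rcr1(RegCf s)
{
    RegCf r;
    assert(s.cf <= 1);
    r.reg = (s.reg >> 1) | ((uint32_t)s.cf << 31);
    r.cf = s.reg & 1u;
    return r;
}

/* The three-instruction sequence; clobbered_cf stands for whatever
   carry the ADD on the loop counter happens to produce. */
RegCf save_clobber_restore(RegCf s, unsigned clobbered_cf)
{
    s = rcl1(s);             /* rcl edi, 1  : CF -> bit 0 of edi */
    s.cf = clobbered_cf;     /* add esi, 16 : CF is overwritten  */
    s = rcr1(s);             /* rcr edi, 1  : bit 0 of edi -> CF */
    return s;
}
```

The model leaves ZF out because the rotate instructions do not write it, which is why the jnz still sees the ZF set by the ADD.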

Kind regards
Eddy

Petroizki · August 29, 2005, 08:54:29 AM

Use the lea instruction; it does not affect any flags:

Code:

lea esi, [esi + 16]

Eddy · August 29, 2005, 08:57:56 AM

That was my first thought, but 'jnz Lab_Add1' needs the ZF to be set by the counter update, and lea leaves ZF untouched.
To set it I would have to use CMP, which affects the CF again, so I'm back to the same problem. :(

Kind regards
Eddy

PBrennick · August 29, 2005, 09:55:33 AM

You can store the flag condition you want to preserve in a memory location. You would then react to the contents of that memory location as need be.

Paul

Eddy · August 29, 2005, 10:04:28 AM

Paul,

That's pretty much what I'm doing in my second code snippet, except I'm using the EDI register instead of a memory location.

Code:


      rcl edi, 1             'save the carry flag, since the ADD overwrites it
      ADD esi, 16            'increment loop counter
      rcr edi, 1             'restore the saved carry flag

But I was wondering whether there is a way that avoids saving the CF at all...

Too bad I can't use a scale factor of 16 in the addressing; otherwise I could do this:

Code:


...
mov  edx, [eax+esi*16+4]
...
inc esi
jnz Lab_Add1

INC doesn't affect CF. Unfortunately the scale factor can only be 1, 2, 4 or 8. :(

Kind regards
Eddy

PBrennick · August 29, 2005, 10:52:16 AM

How about pushing the flags, doing the compare, popping the flags and then doing the jnz? It will react to the result of the compare and you get to keep the previous state of the flags.

Paul

Eddy · August 29, 2005, 11:30:46 AM

Paul,

If I do pushf, cmp, popf, the flags set by the cmp are overwritten by the popf, so the jnz never sees them.
Besides, what I did above (rcl/rcr) is faster than pushf/popf.
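
A tiny C model of that sequence, tracking only CF and ZF, shows the problem. This is a sketch under simplifying assumptions; Flags, op_cmp and jnz_taken_after_pushf_cmp_popf are hypothetical names used just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Only the two flags that matter here. */
typedef struct { unsigned cf, zf; } Flags;

/* cmp a, b: computes a - b; CF = borrow (a < b unsigned), ZF = (a == b). */
static Flags op_cmp(uint32_t a, uint32_t b)
{
    Flags f = { a < b, a == b };
    return f;
}

/* pushf / cmp a, b / popf / jnz. Returns 1 if the jnz would be taken. */
int jnz_taken_after_pushf_cmp_popf(Flags *flags, uint32_t a, uint32_t b)
{
    Flags saved;
    assert(flags);
    saved = *flags;            /* pushf                               */
    *flags = op_cmp(a, b);     /* cmp: ZF now reflects a == b ...     */
    *flags = saved;            /* popf: ... and is overwritten again  */
    return flags->zf == 0;     /* jnz tests the restored ZF           */
}
```

Whatever a and b are, the return value depends only on the ZF that was saved before the compare.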

Kind regards
Eddy

roticv · August 29, 2005, 12:00:34 PM

You can try using MMX instructions (you would have to modify the code below, as I just copied and pasted it). Note that paddq on MMX registers was introduced with SSE2, so it needs an SSE2-capable CPU.

Code:


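		movd	mm0, [ecx]		; first dword of operand 1
		movd	mm1, [edx]		; first dword of operand 2
		paddq	mm1, mm0		; 64-bit sum, the carry lands in bit 32
		movd	[eax], mm1		; store low dword of the result
		psrlq	mm1, 32			; mm1 = carry (0 or 1)
	i = 4
	REPT N/2 - 1				; two dwords per repetition
		movd	mm0, [ecx+i]
		movd	mm2, [edx+i]
		paddq	mm2, mm0
		paddq	mm2, mm1		; add carry from the previous dword
		movd	[eax+i],mm2
		psrlq	mm2, 32			; mm2 = carry
		i = i + 4
		movd	mm0, [ecx+i]
		movd	mm1, [edx+i]
		paddq	mm1, mm0
		paddq	mm1, mm2		; add carry from the previous dword
		movd	[eax+i], mm1
		psrlq	mm1, 32			; mm1 = carry
		i = i + 4
	ENDM
		movd	mm0,[ecx+i]
		movd	mm2,[edx+i]
		paddq	mm2, mm0
		paddq	mm2, mm1
		movq	[eax+i], mm2		; store last dword plus the final carry (8 bytes)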

Frank · August 29, 2005, 12:09:24 PM

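This combines the two partial solutions: lea adds 15 without touching any flags, then inc adds the final 1, which sets ZF but leaves CF alone:

Code: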


lea esi, [esi+15]
inc esi
jnz Lab_Add1
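
A toy C model of the flag behaviour makes the difference between the two counter updates visible. It is a sketch that tracks only esi, CF and ZF; the Cpu struct and the op_/advance_ function names are hypothetical and exist just for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state: only the pieces this loop cares about. */
typedef struct { uint32_t esi; unsigned cf, zf; } Cpu;

/* add esi, imm: writes CF (unsigned wraparound) and ZF. */
static Cpu op_add(Cpu c, uint32_t imm)
{
    uint32_t r = c.esi + imm;
    c.cf = r < c.esi;
    c.zf = (r == 0);
    c.esi = r;
    return c;
}

/* lea esi, [esi+disp]: pure address arithmetic, no flags written. */
static Cpu op_lea(Cpu c, uint32_t disp)
{
    c.esi += disp;
    return c;
}

/* inc esi: writes ZF (among others) but leaves CF untouched. */
static Cpu op_inc(Cpu c)
{
    c.esi += 1;
    c.zf = (c.esi == 0);
    return c;
}

/* Original counter update: clobbers the carry from the last adc. */
Cpu advance_add(Cpu c)
{
    assert(c.cf <= 1);
    return op_add(c, 16);
}

/* lea + inc update: same +16 and same ZF, but CF passes through. */
Cpu advance_lea_inc(Cpu c)
{
    assert(c.cf <= 1);
    return op_inc(op_lea(c, 15));
}
```

Starting from esi = -32 with CF set, advance_add leaves CF clear, while advance_lea_inc keeps CF set and reports ZF only once esi reaches zero.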

Eddy · August 29, 2005, 01:23:43 PM

Roticv,
Yes, I really should start getting into that MMX thing, but I haven't so far..
Thanks for your suggestion!

Frank,
How clever! Hadn't thought of that. Thanks!! :U

Kind regards
Eddy
