
ScanMap - App event handler

Started by Tight_Coder_Ex, December 28, 2010, 10:52:00 PM


Tight_Coder_Ex

This is an improvement on
http://www.masm32.com/board/index.php?topic=12756.0

Most significantly, it no longer uses 16-bit variables.  The code has been tested on subclassed and superclassed window procedures.


00  AD              lodsd                   ; Address of subclass procedure, if specified
01  8B D8           mov     ebx, eax        ; Copy that won't be modified by handler
03  AD              lodsd                   ; Number of handlers in map
04  8B C8           mov     ecx, eax        ; Save in counter register


The main loop cycles through each of the app-specified handlers until the list is exhausted or a match is found.


06  AD              lodsd                   ; Read event number from map
07  3B 44 24 08     cmp     eax, [Msg]      ; Do comparison so flag(s) are set
0B  AD              lodsd                   ; Address of handler in the event of a match
0C  75 0A           jne     18

0E  8D 74 24 04     lea     esi, [hWnd]     ; ESI points to hWnd, Msg, wParam, lParam
12  FF D0           call    eax             ; Execute
14  73 16           jnc     2C              ; No other processing if CF = 0
16  EB 02           jmp     1A              ; Handler also requires default processing
18  E0 EC           loopnz  06              ; Until list is exhausted


Note the jumps (rather than calls) to the default procedures.  This works because the stack already holds the parameters in the order those procedures require.


1A  0B DB           or      ebx, ebx        ; Is this window sub or super classed
1C  75 05           jne     23
1E  E9 User32       jmp     DefWindowProc

23  87 1C 24        xchg    [esp], ebx      ; Stack needs one additional parameter for this API
26  53              push    ebx
27  E9 User32       jmp     CallWindowProc

2C  33 C0           xor     eax, eax        ; Exit when window doesn't need default processing
2E  C2 0010         ret     16              ; Kernel even cleans this up if arguments not wasted


31 = 49 Bytes

NOTE: My listings are somewhat unorthodox, but that is on purpose, so plagiarists have to work at it at least a little.
For the regulars on this board, meaning you have 20 or more postings, I will be glad to share source, object or library files.
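
For anyone following along without the source, the map layout that the listing reads with LODSD can be sketched as a MASM data table. This is only a sketch under assumptions: the labels `MainMap`, `OnClose` and `OnDestroy` are hypothetical, the carry-flag convention follows the comments in the listing above, and how ESI comes to point at the map on entry is not shown in the thread. It assumes the usual MASM32 includes (windows.inc, user32.inc).

```asm
; Hypothetical message map in the order ScanMap consumes it.
; Label names are illustrative and not taken from the original source.
        .data
MainMap dd 0                        ; subclass procedure address, 0 = none
        dd 2                        ; number of handlers that follow
        dd WM_CLOSE,   OnClose      ; event number, handler address
        dd WM_DESTROY, OnDestroy

        .code
; Handlers are reached with CALL EAX, so a plain RET returns to ScanMap.
; ESI -> hWnd, Msg, wParam, lParam on entry.
; CF = 0 on return: nothing more to do, ScanMap returns 0.
; CF = 1 on return: ScanMap goes on to default processing.
OnDestroy:
        invoke  PostQuitMessage, 0
        clc                         ; handled, skip default processing
        ret

OnClose:
        stc                         ; let the default procedure handle it
        ret
```

A handler that wants default processing should also leave EBX alone, since the listing keeps the subclass address there across the call.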

dedndave

for the first part, i would use
        lodsd
        xchg    eax,ebx
        lodsd
        xchg    eax,ecx

XCHG EAX,reg is a special one-byte form

XCHG reg,mem, however, is slow
replace
        xchg    [esp],ebx
        push    ebx

with this
        push    [esp]
        mov     [esp+4],ebx

hopefully, you don't care what winds up in EBX   :P

Tight_Coder_Ex

I've implemented and tested the changes Dave.

Is there a significant difference between

xchg    eax, ebx
vs
xchg    ebx, eax

jj2007

Quote from: Tight_Coder_Ex on December 28, 2010, 11:30:06 PM
Is there a significant difference between

xchg    eax, ebx
vs
xchg    ebx, eax


No, they have identical opcode 93h, in Masm and JWasm

dedndave

in practice, no
the assembler replaces the two-byte form of XCHG EBX,EAX with the one-byte form of XCHG EAX,EBX
early versions of masm did not make that replacement   :P
if you want to see the opcode, try XCHG EBX,ECX, then subtract 1 (i think that's right)

NOP is actually XCHG EAX,EAX
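
To illustrate the encodings being discussed: the one-byte form is 90h plus the register number (ECX = 1, EDX = 2, EBX = 3), which matches the 93h and 91h quoted in this thread. A short sketch, with the expected bytes in the comments:

```asm
        xchg    eax, ecx            ; 91h
        xchg    eax, edx            ; 92h
        xchg    eax, ebx            ; 93h
        xchg    ebx, eax            ; also 93h once the assembler swaps the operands
        nop                         ; 90h, the slot XCHG EAX,EAX would occupy
        xchg    ebx, ecx            ; no EAX involved: 87h plus a ModRM byte, two bytes
```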

Tight_Coder_Ex

I find it kind of bizarre that this is more efficient by 1 clock, even though it is 3 bytes more code, but the documentation says so.


FF 34 24 push [esp]
89 5C 24 04 mov [esp + 4], ebx


Is there a reference other than this one, http://www.penguin.cz/~literakl/intel/intel.html, that is representative of newer processors?

dedndave

no - i just learned this from Clive, myself
XCHG reg,mem forces a bus lock
meaning that it takes extra time

it is good to know
in one of my programs, i use it to my advantage
i use XCHG reg,mem to alter a thread semaphore
that way, only one thread may access that memory at any given moment
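
A minimal sketch of that idea, assuming a simple spin lock. This is not the code from the program mentioned above, and the names `LockFlag`, `AcquireLock` and `ReleaseLock` are hypothetical:

```asm
; Spin lock built on XCHG reg,mem, which is atomic because of the implied LOCK.
        .data
LockFlag dd 0                       ; 0 = free, 1 = held

        .code
AcquireLock:
        mov     eax, 1
        xchg    eax, LockFlag       ; swap in 1, get the previous value back
        test    eax, eax            ; previous value 0 means the lock is now ours
        jnz     AcquireLock         ; another thread holds it, try again
        ret

ReleaseLock:
        mov     LockFlag, 0         ; a plain store is enough to release on x86
        ret
```

The same bus lock that makes XCHG reg,mem slow in ordinary code is what makes it safe here.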

as for the times, it is better to measure the time, yourself
no document is going to tell you how many clock cycles code really takes
different instructions execute differently on different CPU's
i am willing to bet there will be more than 1 cycle difference

hutch--

Unless you have good reason to use XCHG, do it in registers, it's a lot faster. Also, code that uses the string instructions WITHOUT the leading REP will be a long way off the pace. With REP you have special-case circuitry that makes most of the string instructions fast over about 500 bytes, but without it they are a bad bottleneck.

Tight_Coder_Ex

Quote from: hutch-- on December 29, 2010, 12:37:44 AM
Unless you have good reason to use XCHG, do it in registers, it's a lot faster.

Out of all that and emitting a little smoke from the grey matter, I've come up with this alternative.


23  58              pop     eax
24  53              push    ebx
25  50              push    eax
26  E9 User32       jmp     CallWindowProc
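
For readers tracing the stack, here is the same sequence annotated. The label `ToPrevProc` is hypothetical, and the stack layout assumes the routine is entered as a normal stdcall window procedure with nothing else pushed:

```asm
; Stack on arrival (EBX = previous window procedure):
;   [esp]     return address
;   [esp+4]   hWnd
;   [esp+8]   Msg
;   [esp+12]  wParam
;   [esp+16]  lParam
ToPrevProc:
        pop     eax                 ; EAX = return address
        push    ebx                 ; lpPrevWndFunc becomes the first argument
        push    eax                 ; return address goes back on top
; The stack now matches CallWindowProc(lpPrevWndFunc, hWnd, Msg, wParam, lParam),
; so a JMP lets the API return straight to the original caller.
        jmp     CallWindowProc
```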




dedndave

glad i thought of it   :lol

nice work, Tight   :U

i was trying not to use any registers   :P

oex

We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv

dedndave

ohhhhhhhhhhhhmmmmmmmmmmmmmm

:lol

look, ma ! - no pants !!!

Tight_Coder_Ex

I've made a code amendment to facilitate cleaner documentation. As a result, it saves two bytes of code and
probably some time, as unnecessary branches are removed from the main loop.


00  AD              lodsd
01  8B D8           mov     ebx, eax
03  AD              lodsd
04  8B C8           mov     ecx, eax

06  AD              lodsd
07  3B 44 24 08     cmp     eax, [esp + 8]
0B  AD              lodsd
0C  74 13           jz      21
0E  E0 F6           loopnz  06

10  0B DB           or      ebx, ebx
12  75 05           jnz     19
14  E9 User32       jmp     DefWindowProc
19  58              pop     eax
1A  53              push    ebx
1B  50              push    eax
1C  E9 User32       jmp     CallWindowProc

21  8D 74 24 04     lea     esi, [esp + 4]
25  FF D0           call    eax
27  73 E7           jnc     10
29  33 C0           xor     eax, eax
2B  C2 0010         ret     16

dedndave

save 2 more bytes
lodsd
xchg eax,ebx
lodsd
xchg eax,ecx

then speed it up a little - this adds a byte
lodsd
cmp eax,[esp+8]
lodsd
jz $+21h          ;use label here
dec ecx
jnz $-1Fh          ;use label here

otherwise - if you want to use loopnz, place the jz after the loopnz
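
A sketch of that second option with labels in place of the raw offsets. The label names are hypothetical; it assumes ECX holds a non-zero handler count and ESI points at the first map entry:

```asm
ScanNext:
        lodsd                       ; EAX = event number from the map
        cmp     eax, [esp+8]        ; compare with Msg, ZF = 1 on a match
        lodsd                       ; EAX = handler address, LODSD leaves flags alone
        loopnz  ScanNext            ; keep going while no match and entries remain
        jz      CallHandler         ; ZF = 1: matched, even if it was the last entry
NoMatch:
        ; list exhausted: default processing goes here
        ; ...
CallHandler:
        ; EAX still holds the address of the matching handler
        ; ...
```

LOOPNZ does not change the flags itself, so the ZF left by CMP is still valid when the JZ is reached.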

Tight_Coder_Ex

I hadn't used XCHG ACCUM/REG as I thought I had read that it would cause a bus lock, but it seems it doesn't.


00  AD              lodsd
01  93              xchg    ebx, eax
02  AD              lodsd
03  91              xchg    ecx, eax

04  AD              lodsd
05  3B 44 24 08     cmp     eax, [Msg]
09  AD              lodsd
0A  74 13           jz      1F
0C  E0 F6           loopnz  04

0E  0B DB           or      ebx, ebx
10  75 05           jnz     17

12  E9 User32       jmp     DefWindowProc

17  58              pop     eax
18  53              push    ebx
19  50              push    eax
1A  E9 User32       jmp     CallWindowProc

1F  8D 74 24 04     lea     esi, [hWnd]
23  FF D0           call    eax
25  73 E7           jnc     0E
27  33 C0           xor     eax, eax
29  C2 0010         ret     16


2C = 44 Bytes

Listing is obfuscated intentionally. Actual source does have labels.