Converting an unsigned dword into a formatted string is usually done in two steps:
1] convert the binary value to a string
2] format the string, in this case with thousands separators.
This is quite slow compared to what a smarter approach could do.
Doing it the usual way, with a macro plus an API call (ustrv$ + GetNumberFormat), I get these results:
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 03:00 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 103 │ 46.629 │ 46.320 │ 46.300 │ 46.293 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
I think it could be done in less than half this time. Feel free to post any suggestions to speed things up a little.
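To make the two steps concrete, here is a rough C sketch of the same idea (C only because it is easy to compile anywhere; the testbed itself is MASM). `two_step_format` and its parameters are made-up names: `snprintf` stands in for ustrv$ and the copy loop stands in for what GetNumberFormat does with the separators. It is an illustration under those assumptions, not the testbed code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the testbed.
   Step 1: binary -> decimal digits.
   Step 2: copy the digits, inserting `sep` between groups of three.
   `out` must hold at least 14 bytes ("4.294.967.295" plus NUL). */
void two_step_format(uint32_t value, char sep, char *out)
{
    char digits[11];                 /* 10 digits max + NUL */
    int len, i;

    assert(out != NULL);

    /* Step 1: plain conversion */
    len = snprintf(digits, sizeof digits, "%lu", (unsigned long)value);

    /* Step 2: add a separator whenever the number of digits still
       to be copied is a non-zero multiple of three */
    for (i = 0; i < len; i++) {
        *out++ = digits[i];
        if (i + 1 < len && (len - 1 - i) % 3 == 0)
            *out++ = sep;
    }
    *out = '\0';
}
```

Calling `two_step_format(4294967295u, '.', buf)` with a 14-byte `buf` leaves `4.294.967.295` in it. The point of the thread is that both passes can be merged into one.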
To keep the test standard, every routine has to convert the same array of values:
NumToTest DWORD 0
DWORD 9
DWORD 10
DWORD 99
DWORD 100
DWORD 999
DWORD 1000
DWORD 9999
DWORD 10000
DWORD 99999
DWORD 100000
DWORD 999999
DWORD 1000000
DWORD 9999999
DWORD 10000000
DWORD 99999999
DWORD 100000000
DWORD 999999999
DWORD 1000000000
DWORD 4294967295
With a common data set the measurements are comparable and can be inserted into the testbed.
There are 2 things to help you get used to the testbed:
Readme.txt and the Info Screen displayed with the key when
the program shows the results.
This is the final release of the testbed. If you want to use it for any purpose, get used to it.
The sources, the info, the screens, everything is included in the zip file.
Enjoy it and post your results. Press [C] and paste the contents of the clipboard into the forum.
That's all.
Frank
Frank,
I usually use wsprintf to convert an unsigned dword to a string because it's easy to use and fast enough. It won't be the fastest.
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 04:30 GMT]─┐
│OS : Microsoft Windows Vista Home Premium Edition, 32-bit Service Pack 2 (build 6002) │
│CPU : Intel(R) Core(TM)2 Duo CPU T5750 @ 2.00GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 103 │ 55.703 │ 51.868 │ 53.009 │ 53.044 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 wsprintf │ 43 │ 10.163 │ 10.177 │ 10.223 │ 10.204 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
; ---------------------------------------------------------------------------
; Algo #02 to test and code to manage the display of the results.
; ---------------------------------------------------------------------------
.data
; ---------------------------------------------------------------------------
; put here the description of your algo to test. max 30 chars.
; ---------------------------------------------------------------------------
ALIGN 4
NumToTest2 DWORD 0
DWORD 9
DWORD 10
DWORD 99
DWORD 100
DWORD 999
DWORD 1000
DWORD 9999
DWORD 10000
DWORD 99999
DWORD 100000
DWORD 999999
DWORD 1000000
DWORD 9999999
DWORD 10000000
DWORD 99999999
DWORD 100000000
DWORD 999999999
DWORD 1000000000
DWORD 4294967295
NumFormat2 BYTE "%u",0
Buffer2 BYTE 32 DUP(0)
; -------------------------<123456789012345678901234567890>------------------
AlgoDesc2 BYTE "wsprintf ",0
; ---------------------------------------------------------------------------
.code
align 4
mov AlgoSize, (EndAlgo2 - Algo2)
jmp Start2
Algo2:
align 4
AlgoN2 proc
; ----------------------------------------------------------------------
; put here your code to test
; ----------------------------------------------------------------------
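mov ecx, 20 ; number of test values
lea eax, NumToTest2 ; eax -> current value
@@:
push ecx ; the API call may trash ecx and returns in eax, so save both
push eax
mov eax, [eax] ; load the dword to convert
INVOKE wsprintf, ADDR Buffer2, ADDR NumFormat2, eax
pop eax
add eax, 4 ; next dword
pop ecx
dec ecx
jnz @B
ret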
; ----------------------------------------------------------------------
; end point of algo to test
; ----------------------------------------------------------------------
AlgoN2 endp
Using udw2str from the masm32 library. It has no formatting options.
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 05:02 GMT]─┐
│OS : Microsoft Windows Vista Home Premium Edition, 32-bit Service Pack 2 (build 6002) │
│CPU : Intel(R) Core(TM)2 Duo CPU T5750 @ 2.00GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 103 │ 52.763 │ 52.645 │ 52.362 │ 52.623 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str │ 35 │ 3.425 │ 3.449 │ 3.437 │ 3.465 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
; ---------------------------------------------------------------------------
; Algo #02 to test and code to manage the display of the results.
; ---------------------------------------------------------------------------
.data
; ---------------------------------------------------------------------------
; put here the description of your algo to test. max 30 chars.
; ---------------------------------------------------------------------------
ALIGN 4
NumToTest2 DWORD 0
DWORD 9
DWORD 10
DWORD 99
DWORD 100
DWORD 999
DWORD 1000
DWORD 9999
DWORD 10000
DWORD 99999
DWORD 100000
DWORD 999999
DWORD 1000000
DWORD 9999999
DWORD 10000000
DWORD 99999999
DWORD 100000000
DWORD 999999999
DWORD 1000000000
DWORD 4294967295
;NumFormat2 BYTE "%u",0
Buffer2 BYTE 32 DUP(0)
; -------------------------<123456789012345678901234567890>------------------
AlgoDesc2 BYTE "udw2str ",0
; ---------------------------------------------------------------------------
.code
align 4
mov AlgoSize, (EndAlgo2 - Algo2)
jmp Start2
Algo2:
align 4
AlgoN2 proc
; ----------------------------------------------------------------------
; put here your code to test
; ----------------------------------------------------------------------
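mov ecx, 20 ; number of test values
lea eax, NumToTest2 ; eax -> current value
@@:
push ecx ; saved in case udw2str changes them
push eax
mov eax, [eax] ; load the dword to convert
INVOKE udw2str, eax, ADDR Buffer2
pop eax
add eax, 4 ; next dword
pop ecx
dec ecx
jnz @B
ret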
; ----------------------------------------------------------------------
; end point of algo to test
; ----------------------------------------------------------------------
AlgoN2 endp
Thanks Greg. These two examples only convert the binary value into an ASCII string,
both need the second step in order to have a fair comparison.
By the way, the second function looks far better than the MACRO and the C function. I didn't know it
existed at all. :red
I'll add the second step and post the results. :U
Frank
hiyas Frank
masm32\help\masmlib.chm :U
Quote from: dedndave on November 18, 2010, 11:22:25 AM
hiyas Frank
masm32\help\masmlib.chm :U
Yeah Dave, I had a look at it after Greg posted his results. :thumbu
I thought the macros were all there was for doing the conversions, but there are
also MASM library functions. Faster than the C equivalent, and smaller too I guess.
Strange results, less of a gain than expected:
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 11:55 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 45.734 │ 45.647 │ 45.622 │ 45.484 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 45.346 │ 45.243 │ 45.188 │ 45.282 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 51.642 │ 51.644 │ 51.640 │ 51.642 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Maybe I made a mistake translating the code Greg posted; the whole thing is attached.
Frank,
You will probably find that the API GetNumberFormat() is much slower than any number conversion algo and it will tend to level the times between different algos.
Quote from: hutch-- on November 18, 2010, 12:07:27 PM
Frank,
You will probably find that the API GetNumberFormat() is much slower than any number conversion algo and it will tend to level the times between different algos.
Yes Steve, I guessed as much; that is why I called the prog
TwoInOne. I'm sure
merging the two steps into one ASM PROC will give us much better results.
I'll start with a chunk of code Clive posted some months ago, when I was taking my first
steps in the MASM world. I hope I'm now able to adapt it to run inside the testbed and time it.
; EAX = 32-bit number
; ESI = string buffer for NUL terminated ASCII
; Uses ESI,EDI,EAX,ECX,EDX
push 0 ; Mark stack end with NUL
divloop:
mov ecx,1000 ; Divide into 3 digit groups
xor edx,edx ; Clear high order 32-bit for divide
idiv ecx ; eax = edx:eax / ecx, edx = edx:eax % ecx
mov edi,eax ; Save division result
mov ecx,10 ; Subdivide in 10's
mov eax,edx ; Get remainder
or edi,edi ; Still number left, so at least 3 digits in remainder
jnz digit000
cmp eax,10 ; remainder has one digit
jb digit0
cmp eax,100 ; remainder has two digits
jb digit00
digit000: ; 3 digits
xor edx,edx ; Clear high order 32-bit for divide
idiv ecx ; eax = edx:eax / ecx, edx = edx:eax % ecx
add edx,30h ; += '0'
push edx ; Stack
digit00: ; 2 digits
xor edx,edx ; Clear high order 32-bit for divide
idiv ecx ; eax = edx:eax / ecx, edx = edx:eax % ecx
add edx,30h ; += '0'
push edx ; Stack
digit0: ; 1 digit
add eax,30h ; += '0' (eax is already a single digit here, no divide needed)
push eax ; Stack
mov eax,edi ; Recover remaining number
or eax,eax ; Zero?
jz poploop
push 2Ch ; Comma added to groups of three digits
jmp divloop
poploop:
pop eax ; Recover next digit
mov [esi],al ; Add to string
inc esi
or eax,eax ; Was it a NUL?
jnz poploop
After that I'll try something a bit more difficult: the reciprocal IMUL. :eek
Frank
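For anyone wondering what "reciprocal IMUL" means: a division by a constant can be replaced by a multiplication with a scaled reciprocal followed by a shift, which is usually cheaper than DIV/IDIV. Below is a small C sketch of that idea, not Clive's routine. `div10_reciprocal` and `format_grouped` are hypothetical names; the constant 0xCCCCCCCD (2^35 / 10 rounded up) with a shift of 35 is the usual choice for 32-bit unsigned values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* n / 10 without a divide instruction: multiply by ceil(2^35 / 10)
   and shift right by 35. Exact for every 32-bit unsigned n. */
uint32_t div10_reciprocal(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical one-pass formatter (illustrative name and signature).
   Builds the string backwards in a scratch buffer, adding `sep` after
   every third digit, then copies it to `out` (at least 14 bytes). */
void format_grouped(uint32_t n, char sep, char *out)
{
    char tmp[14];                       /* "4.294.967.295" + NUL */
    char *p = tmp + sizeof tmp - 1;
    int count = 0;

    assert(out != NULL);
    *p = '\0';
    do {
        uint32_t q = div10_reciprocal(n);
        if (count == 3) {               /* group complete: add separator */
            *--p = sep;
            count = 0;
        }
        *--p = (char)('0' + (n - q * 10u));   /* remainder = n - q*10 */
        count++;
        n = q;
    } while (n != 0);
    strcpy(out, p);
}
```

Conversion and separator insertion happen in the same loop here, so no second pass over the string is needed.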
And with Clive's help I can now confirm that:
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 18:56 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 45.182 │ 45.237 │ 45.148 │ 45.400 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 44.971 │ 45.064 │ 45.037 │ 45.224 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 52.046 │ 51.991 │ 52.041 │ 51.994 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.187 │ 3.185 │ 3.150 │ 3.185 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
That's a lot of gain already, and this is not the fastest around. Maybe we can expect to go below 2000 with
the appropriate algo.
Frank
Quote from: frktons
both need the second step in order to have a fair comparison.
Just what special format do you need an unsigned integer to be in?
Never mind, I looked at Clive's code, you want commas (or whatever) between each group of three digits. Why didn't you say that?
Quote from: GregL on November 18, 2010, 09:43:15 PM
Quote
both need the second step in order to have a fair comparison.
Just what special format do you need an unsigned integer to be in?
unsigned integer = 4294967295
ASCII formatted = 4.294.967.295
This is the reason for using GetNumberFormat.
It looks like the bigger the proc size, the faster it gets :P
┌─────────────────────────────────────────────────────────────[18-Nov-2010 at 22:25 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 45.421 │ 45.113 │ 45.035 │ 44.975 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 44.840 │ 44.676 │ 45.928 │ 44.803 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 51.733 │ 51.713 │ 52.677 │ 51.742 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.186 │ 3.125 │ 3.207 │ 3.167 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2.110 │ 2.101 │ 2.111 │ 2.049 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
We are probably nearing the limit :bg
Quote from: oex on November 18, 2010, 10:58:41 PM
Hey Frank, just a thought for your app.... Honestly I haven't used the working version yet, the copy I downloaded was the initial test that didn't work, but I was looking at the output and thinking that you could highlight or just highlight :lol the relevant copied info, i.e. the best results, for forum posts....
I noticed you were talking about clipboard copying, so that should be easy?
I'm not sure I got what you mean. In the latest release of the testbed, attached to my previous post,
there is the
[C] option to copy the results wrapped in a pair of tags. You then just
paste the contents of the clipboard into the forum and that's all.
Give it a try and let me know what you meant.
Quote from: GregL on November 18, 2010, 09:43:15 PM
Never mind, I looked at Clive's code, you want commas (or whatever) between each group of three digits. Why didn't you say that?
Sorry Greg, I thought it was clear from the code posted, but I was apparently wrong. :P
Thanks for making that point clear. :U
Frank,
Here is a quick scruffy that may do the job for you. It could be tweaked a bit more but it should be reasonably fast.
IF 0 ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
include \masm32\include\masm32rt.inc
format_num_string PROTO :DWORD,:DWORD
.code
start:
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
call main
inkey
exit
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
LOCAL buffer[64]:BYTE
LOCAL pbuf :DWORD
mov pbuf, ptr$(buffer)
fn format_num_string,"1234567890",pbuf
print pbuf,13,10
fn format_num_string,"123456789",pbuf
print pbuf,13,10
fn format_num_string,"12345678",pbuf
print pbuf,13,10
fn format_num_string,"1234567",pbuf
print pbuf,13,10
fn format_num_string,"123456",pbuf
print pbuf,13,10
fn format_num_string,"12345",pbuf
print pbuf,13,10
fn format_num_string,"1234",pbuf
print pbuf,13,10
fn format_num_string,"123",pbuf
print pbuf,13,10
fn format_num_string,"12",pbuf
print pbuf,13,10
fn format_num_string,"1",pbuf
print pbuf,13,10
ret
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
format_num_string proc src:DWORD,dst:DWORD
push ebx
push esi
push edi
; -----------------
; get source length
; -----------------
mov ebx, src
sub ebx, 1
@@:
add ebx, 1
cmp BYTE PTR [ebx], 0
jne @B
sub ebx, src
; -----------------
.data
; --------------------------------------------------
; store the initial spacing counter value in a table
; --------------------------------------------------
align 4
tbl1 dd 0,0,0,0,1,2,3,1,2,3,1,0
; 1=0 0
; 2=0 00
; 3=0 000
; 4=1 0000
; 5=2 00000
; 6=3 000000
; 7=1 0000000
; 8=2 00000000
; 9=3 000000000
; 10=1 0000000000
.code
mov ebx, [tbl1+ebx*4]
mov esi, src
mov edi, dst
sub esi, 1
stlp:
add esi, 1
movzx eax, BYTE PTR [esi]
test eax, eax
jz bye
mov [edi], al
add edi, 1
sub ebx, 1 ; dec the spacing counter
jnz stlp ; loop back if it's not zero
cmp BYTE PTR [esi+1], 0 ; 1 byte look ahead
je bye ; exit if the next char is the zero terminator
mov BYTE PTR [edi], "," ; change the character here
add edi, 1
mov ebx, 3 ; reset the spacing counter to 3
jmp stlp
bye:
mov BYTE PTR [edi], 0
pop edi
pop esi
pop ebx
ret
format_num_string endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end start
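The same table trick in C, as a sketch only: `first_group` copies the values of tbl1, while `insert_separators` and its `sep` parameter are names made up for this example. The table gives the number of digits before the first separator for each string length; after that the counter is simply reset to 3.

```c
#include <assert.h>
#include <string.h>

/* Digits in the leading group, indexed by string length (values of tbl1).
   0 means "three digits or fewer": the counter never reaches zero,
   so no separator is written. */
static const int first_group[12] = {0, 0, 0, 0, 1, 2, 3, 1, 2, 3, 1, 0};

/* Copies the digit string `src` (at most 10 digits) to `dst`,
   inserting `sep` between groups of three. `dst` needs 14 bytes. */
void insert_separators(const char *src, char sep, char *dst)
{
    size_t len = strlen(src);
    int counter;

    assert(len <= 10);
    counter = first_group[len];
    while (*src != '\0') {
        *dst++ = *src++;
        /* one byte look ahead: never end the string with a separator */
        if (--counter == 0 && *src != '\0') {
            *dst++ = sep;
            counter = 3;
        }
    }
    *dst = '\0';
}
```

Because it works on a string that has already been converted, this only replaces the second step; the dword still has to be turned into digits first.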
Quote from: frktons on November 18, 2010, 10:28:18 PM
│02 udw2str + GetNumberFormat │ 65 │ 44.840 │ 44.676 │ 45.928 │ 44.803 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 51.733 │ 51.713 │ 52.677 │ 51.742 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.186 │ 3.125 │ 3.207 │ 3.167 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2.110 │ 2.101 │ 2.111 │ 2.049 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
I was thinking of something along the lines of the above, but I removed the post because code tags can't contain other tags and that removes the formatting
Quote from: oex on November 18, 2010, 11:13:48 PM
Quote from: frktons on November 18, 2010, 10:28:18 PM
│02 udw2str + GetNumberFormat │ 65 │ 44.840 │ 44.676 │ 45.928 │ 44.803 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 51.733 │ 51.713 │ 52.677 │ 51.742 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.186 │ 3.125 │ 3.207 │ 3.167 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2.110 │ 2.101 │ 2.111 │ 2.049 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
I was thinking of something along the lines of the above, but I removed the post because code tags can't contain other tags and that removes the formatting
That would require saving the previous clocks, scanning them once all the tests are done, and scanning the output string buffer to find the places where the
[b][/b]
tags need to be inserted. That would be relatively slow for nothing :P I guess Frank does not want to do work that people can do themselves :lol
Quote from: hutch-- on November 18, 2010, 11:12:24 PM
Frank,
Here is a quick scruffy that may do the job for you. It could be tweaked a bit more but it should be reasonably fast.
Thanks Steve. :U
To see if it is reasonably fast and how fast it is, you should insert it into the testbed. :bg
It should be easy enough for everybody here to do it. It is much harder for me to convert
all the code posted into a suitable form. :(
Please everybody, why don't you start to use the testbed and get used to it? It is not that hard I guess.
Quote from: Antariy on November 18, 2010, 11:22:11 PM
I guess Frank does not want to do work that people can do themselves :lol
Yeah !!!! that's the point my friend. :U
Frank
:bg
I am hard to get work out of.
Here is the same algo tidied up a bit with the stack frame removed and register usage reduced.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
.data
; --------------------------------------------------
; store the initial spacing counter value in a table
; --------------------------------------------------
align 4
tbl1 dd 0,0,0,0,1,2,3,1,2,3,1,0
; 1=0 0
; 2=0 00
; 3=0 000
; 4=1 0000
; 5=2 00000
; 6=3 000000
; 7=1 0000000
; 8=2 00000000
; 9=3 000000000
; 10=1 0000000000
.code
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
format_num_string proc src:DWORD,dst:DWORD
; -----------------
; get source length
; -----------------
mov ecx, [esp+4]
sub ecx, 1
@@:
add ecx, 1
cmp BYTE PTR [ecx], 0
jne @B
sub ecx, [esp+4]
; -----------------
push esi
mov ecx, [tbl1+ecx*4] ; set the initial spacing from the table
mov esi, [esp+4][4]
mov edx, [esp+8][4]
sub esi, 1
stlp:
add esi, 1
movzx eax, BYTE PTR [esi]
test eax, eax
jz bye
mov [edx], al
add edx, 1
sub ecx, 1 ; dec the spacing counter
jnz stlp ; loop back if it's not zero
cmp BYTE PTR [esi+1], 0 ; 1 byte look ahead
je bye ; exit if the next char is the zero terminator
mov BYTE PTR [edx], "," ; write the spacer. <<<<<< change the character here
add edx, 1
mov ecx, 3 ; reset the spacing counter to 3
jmp stlp
bye:
mov BYTE PTR [edx], 0 ; write terminator
pop esi
ret 8
format_num_string endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Frank,
I was just being dense, damn painkillers. I wondered why the heck you were calling GetNumberFormat. At least I used the testbed :bg
Also, some of us like commas and others like periods for the separators. Use GetLocaleInfo with LCType flag set to LOCALE_STHOUSAND.
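GetLocaleInfo is Windows-only, so as a portable illustration here is the standard C counterpart, `localeconv()`. It is a different API that serves the same purpose, not the call suggested above. `thousands_separator` and its `fallback` parameter are hypothetical; the sketch only accepts single-byte separators and falls back otherwise.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Returns the thousands separator of the current C locale when it is
   exactly one byte long, otherwise `fallback` (the "C" locale defines
   an empty separator, and some locales use multi-byte ones). */
char thousands_separator(char fallback)
{
    const struct lconv *lc = localeconv();
    const char *sep;

    assert(fallback != '\0');
    sep = (lc != NULL) ? lc->thousands_sep : NULL;
    if (sep != NULL && sep[0] != '\0' && sep[1] == '\0')
        return sep[0];
    return fallback;
}
```

Call `setlocale(LC_ALL, "")` first to pick up the user's settings; without it the program runs in the "C" locale, which defines no separator, so the fallback is returned.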
Quote from: GregL on November 19, 2010, 12:04:53 AM
Frank,
I was just being dense, damn painkillers. I wondered why the heck you were calling GetNumberFormat. At least I used the testbed :bg
Also, some of us like commas and others like periods for the separators. Use GetLocaleInfo with LCType flag set to LOCALE_STHOUSAND.
I used
GetNumberFormat to have an excuse to start this thread and pick the best code around to replace it. :bg
I'm glad to know that you have started using the testbed. :U
Feel free to post the replacement code when you are off the painkillers, and I'll gladly add it. For the time being I'm quite tired and it is
night here. Tomorrow I'll have a look at it. By the way, the simplest solution is to replace this line yourself:
Tsep DD ".",0 ; used for thousand number separator - choose yours
with this
Tsep DD ",",0 ; used for thousand number separator - choose yours
As the comment says,
choose yours. :lol
Frank.
Frank,
The painkillers won't be going away any time soon. Have a good night.
Quote from: GregL on November 19, 2010, 12:19:27 AM
Frank,
The painkillers won't be going away any time soon. Have a good night.
Good night Greg, I'll stay some more time around. When I feel my eyes are closing, I'll go. :P
I think Greg's suggestion is simple to implement and worthwhile, since Western and European separator conventions differ.
And adding this code at the start of the testbed seems easy enough...
invoke GetLocaleInfo,LOCALE_USER_DEFAULT,LOCALE_STHOUSAND,offset Tsep,4
...But I get a non-breaking space as the thousands separator. That is correct to some degree, but most users here use "." as the thousands separator.
And the main point is: in the OEM encoding the non-breaking space has a different code, so the value returned by GetLocaleInfo should be translated from ANSI to OEM; otherwise the separator looks like a letter and makes a mess.
Quote from: Antariy on November 19, 2010, 12:27:51 AM
I think Greg's suggestion is simple to implement and worthwhile, since Western and European separator conventions differ.
And adding this code at the start of the testbed seems easy enough...
push eax
mov edx,esp
invoke GetLocaleInfo,LOCALE_USER_DEFAULT,LOCALE_STHOUSAND,edx,4
pop Tsep
...But I get a non-breaking space as the thousands separator. That is correct to some degree, but most users here use "." as the thousands separator.
And the main point is: in the OEM encoding the non-breaking space has a different code, so the value returned by GetLocaleInfo should be translated from ANSI to OEM; otherwise the separator looks like a letter and makes a mess.
Alex, if you feel like it, please try it in the latest posted release and see what you get; afterwards others can test it
and tell us whether it works for different countries as well.
If you do, post the new package, or only the part you changed, and we'll give it a try.
Frank
Quote from: frktons on November 19, 2010, 12:33:36 AM
Alex, if you feel like it, please try it in the latest posted release and see what you get; afterwards others can test it
and tell us whether it works for different countries as well.
Frank, if we do it that way, the separator has to be translated from ANSI to OEM before it is displayed. Otherwise you can get this (as I do):
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 184а052 │ 182а061 │ 180а817 │ 184а765 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 181а947 │ 180а363 │ 181а235 │ 181а365 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
CharToOem is a good way to do it.
I get
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 74 845 │ 73 199 │ 72 696 │ 71 789 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 71 465 │ 72 461 │ 72 418 │ 72 388 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 90 419 │ 90 686 │ 89 978 │ 89 189 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
For this code
mov ebx,offset Tsep
invoke GetLocaleInfo,LOCALE_USER_DEFAULT,LOCALE_STHOUSAND,ebx,4
invoke CharToOem,ebx,ebx
Alex
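What goes wrong above can be shown with single bytes: 0xA0 is the non-breaking space in the Windows ANSI code pages, while OEM console code pages use that byte for a letter, which matches the "а" in the first table. CharToOem is the proper fix; the C sketch below is only a cruder, hypothetical alternative (`console_safe_separator` is a made-up name) that swaps the non-breaking space for a plain ASCII space, which has the same code in both encodings.

```c
#include <assert.h>

#define ANSI_NBSP 0xA0u   /* non-breaking space in Windows ANSI code pages */

/* Hypothetical helper: returns a separator that is safe to print on an
   OEM console. Only the non-breaking space is remapped; '.', ',' and
   other ASCII characters pass through unchanged. */
char console_safe_separator(unsigned char ansi_sep)
{
    assert(ansi_sep != 0);
    return (ansi_sep == ANSI_NBSP) ? ' ' : (char)ansi_sep;
}
```

This assumes the separator is a single byte, as in the 4-byte Tsep buffer used above.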
Thanks Alex.
As I said before:
Quote
By the way, the simplest solution is to replace this line yourself:
Tsep DD ".",0 ; used for thousand number separator - choose yours
with this
Tsep DD ",",0 ; used for thousand number separator - choose yours
As the comment says, choose yours. :lol
If somebody posts a working solution, I'll gladly insert it into testbed :U
Quote from: frktons on November 19, 2010, 12:52:12 AM
If somebody posts a working solution, I'll gladly insert it into testbed :U
........
Main PROC
mov ebx,offset Tsep ; THIS IS INSERTED
invoke GetLocaleInfo,LOCALE_USER_DEFAULT,LOCALE_STHOUSAND,ebx,4 ; THIS IS INSERTED
invoke CharToOem,ebx,ebx ; THIS IS INSERTED
mov RowInitialFile, One
mov RowFinalFile, MaxRows
mov ColInitialFile, One
mov ColFinalFile, MaxCols
.........
:P
Alex
Quote from: Antariy on November 19, 2010, 12:57:37 AM
........
Main PROC
invoke GetLocaleInfo,LOCALE_USER_DEFAULT,LOCALE_STHOUSAND,offset Tsep,4 ; THIS IS INSERTED
invoke CharToOem,offset Tsep,offset Tsep ; THIS IS INSERTED
mov RowInitialFile, One
mov RowFinalFile, MaxRows
mov ColInitialFile, One
mov ColFinalFile, MaxCols
.........
:P
Alex
The results you posted don't show point or comma separators, so what kind of display do we get
with these two added instructions?
Quote from: frktons on November 19, 2010, 01:00:00 AM
The results you posted don't show point or comma separators, so what kind of display do we get
with these two added instructions?
That is a question for MS: why do they think a
non-breaking space is used as the thousands separator here? They probably know better which separators European people use.
This code should fetch the thousands separator and convert it to OEM. And it does: it returns the separator the OS provides for the current locale settings. This is not a bug in the code; the OS decides which char to return.
Alex
Quote from: Antariy on November 19, 2010, 01:04:03 AM
That is a question for MS: why do they think a non-breaking space is used as the thousands separator here? They probably know better which separators European people use.
This code should fetch the thousands separator and convert it to OEM. And it does: it returns the separator the OS provides for the current locale settings. This is not a bug in the code; the OS decides which char to return.
Alex
On my pc I get:
┌─────────────────────────────────────────────────────────────[19-Nov-2010 at 01:07 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 45.524 │ 45.230 │ 45.316 │ 45.084 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 45.406 │ 45.070 │ 45.114 │ 45.248 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 52.217 │ 52.330 │ 52.267 │ 52.196 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.006 │ 3.008 │ 3.000 │ 3.006 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 1.966 │ 1.984 │ 1.998 │ 1.951 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Quote from: frktons on November 19, 2010, 01:07:10 AM
On my pc I get:
And these are the right results for a European country :wink
I get a comma.
Quote from: Alex
... but most of users use "." as thousands separator here.
Australia, Canada, U.S.A. and UK among others use a comma.
Quote from: Antariy on November 19, 2010, 01:09:50 AM
And these are the right results for a European country :wink
Now let's see what other countries get :lol
Thanks Alex, always very helpful. :U
┌─────────────────────────────────────────────────────────────[19-Nov-2010 at 01:12 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Celeron(R) CPU 2.13GHz with 1 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 74 098 │ 72 932 │ 72 492 │ 72 319 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 71 791 │ 71 894 │ 71 039 │ 70 818 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 90 070 │ 90 501 │ 91 407 │ 90 484 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 9 060 │ 9 605 │ 8 791 │ 8 743 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 3 785 │ 3 444 │ 3 575 │ 3 566 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Quote from: GregL on November 19, 2010, 01:10:33 AM
I get a comma.
Quote from: Alex
... but most of users use "." as thousands separator here.
Australia, Canada, U.S.A. and UK among others use a comma.
About half the population uses a comma and about half uses a point, more or less; now everybody should be happy :lol
Quote from: GregL on November 19, 2010, 01:10:33 AM
I get a comma.
Quote from: Alex
... but most of users use "." as thousands separator here.
Australia, Canada, U.S.A. and UK among others use a comma.
When I said "here" I meant here - in my country and in Europe. :lol
Alex
Alex, do you use a space separator in your country?
Quote from: frktons on November 19, 2010, 01:13:40 AM
Alex in your country you use a space separator?
Should be
Quote from: frktons on November 19, 2010, 01:13:40 AM
Alex, MS thinks that in your country you use a space separator.
:bg
Quote from: Alex
When I said "here" I meant here - in my country and in Europe. lol
Oh, I thought you meant users here in the forum. :lol
Quote from: Antariy on November 19, 2010, 01:17:10 AM
Quote from: frktons on November 19, 2010, 01:13:40 AM
Alex in your country you use a space separator?
Should be
Quote from: frktons on November 19, 2010, 01:13:40 AM
Alex, MS think that in your country you used a space separator.
:bg
:lol :lol :lol :dazzled: :dazzled: :dazzled: :lol :lol :lol :dazzled: :dazzled: :dazzled: :dazzled: :lol :lol :lol
Going to sleep now. Enjoy.
Frank
Quote from: GregL on November 19, 2010, 01:17:52 AM
Oh, I thought you meant users here in the forum. :lol
:bg
i am sure there is a function you can call to get the right separator for the user's country/code page :P
well - it is in the registry
[HKEY_CURRENT_USER\Control Panel\International]
"sMonThousandSep"=","
:bg
CountryThousandsSeperatorCode PROC USES esi
print "Hello"
mov esi, input("What Country Thousands Seperator Code Are You Looking For? ")
print "Whiz Bang Whir"
print "I'm sorry I dont have that Country Thousands Seperator Code"
mov esi, input("Please Enter your Country Thousands Seperator Code ")
print "Your Country Thousands Seperator Code is: "
print esi
ret
CountryThousandsSeperatorCode ENDP
that'll work - lol
but, i was thinking of a little routine during init that reads the registry value and stores it :P
then, the conversion routine can grab the stored value, or it can be passed as a parm
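For what it's worth, here is a rough C sketch of that "read it once at init, store it, let the conversion routine grab it" idea. It is only an illustration: it uses the standard C localeconv() call as a portable stand-in for the Windows locale/registry lookup discussed in this thread, the names g_thousands_sep and init_thousands_sep are made up for the example, and it assumes a single-byte separator.

```c
#include <locale.h>
#include <stddef.h>

/* Cached separator; ',' is only a placeholder until the init routine runs. */
char g_thousands_sep = ',';

/* Hypothetical init-time helper: query the current locale once and cache
   the result. Uses `fallback` when the locale defines no grouping
   character (the default "C" locale reports an empty string).
   Assumption: only the first byte is kept, so multi-byte separators
   are not handled. */
char init_thousands_sep(char fallback)
{
    const struct lconv *lc = localeconv();

    if (lc != NULL && lc->thousands_sep != NULL && lc->thousands_sep[0] != '\0')
        g_thousands_sep = lc->thousands_sep[0];
    else
        g_thousands_sep = fallback;
    return g_thousands_sep;
}
```

Call setlocale(LC_ALL, "") before the init routine if the user's settings should apply. Without it the program stays in the "C" locale, which has no grouping character, so the fallback is used.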
Quote from: dedndave on November 19, 2010, 03:53:58 AM
that'll work - lol
but, i was thinking of a little routine during init that reads the registry value and stores it :P
then, the conversion routine can grab the stored value, or it can be passed as a parm
This feature has already been implemented. Do you want to replace the one that actually works
with some other weird one? :lol
Having a look at Hutch's example, it looks like the code takes as input an unsigned dword that
has already been converted into a string:
fn format_num_string,"1234567890",pbuf
and this is not the task we are trying to accomplish.
The task here is to convert an unsigned dword into an ASCII string with thousands separators.
So the starting point has to be an array of unsigned dword values, as stated in the first post.
Frank
Just for fun I tried to implement Hutch's code, and guess what?
He got a good result, considering he is working with a two-step
algo:
┌─────────────────────────────────────────────────────────────[19-Nov-2010 at 13:09 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 44.782 │ 45.197 │ 44.310 │ 44.255 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 43.805 │ 45.017 │ 43.724 │ 43.839 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 51.995 │ 51.742 │ 50.983 │ 50.973 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.029 │ 3.007 │ 3.035 │ 3.016 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2.004 │ 1.987 │ 1.954 │ 1.989 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 5.792 │ 5.901 │ 5.862 │ 5.783 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Congrats Hutch, if you try harder, you can get even better than that. :P
Frank
:bg
ustr$() is a MSVCRT function call. The "format_num_string" was designed to do just what it says.
Quote from: hutch-- on November 19, 2010, 01:37:17 PM
:bg
ustr$() is a MSVCRT function call. The "format_num_string" was designed to do just what it says.
I think that by combining the udw2str code with the code you used to format the string,
you could get results that are 30-40% faster than the combination of
ustrv$ and your formatting algo. :U
I think you could combine the numeric conversion and the output formatting with some reasonable gains, but the formatting algo is useful in its own right and may end up in a library. Depending on how you do the conversion, you can skip the length check, since the conversion already gives you the length.
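To illustrate the point about getting the length for free, here is a small C sketch. It is not Hutch's format_num_string and not any library routine; u32_to_grouped is a hypothetical name. It converts the number and inserts the separators in a single right-to-left pass and returns the length of the result, so no separate length check is needed afterwards.

```c
#include <stdint.h>
#include <stddef.h>

/* Writes `value` into `out` with `sep` between groups of three digits.
   `out` needs at least 14 bytes (10 digits + 3 separators + NUL).
   Returns the length of the string that was written. */
size_t u32_to_grouped(uint32_t value, char sep, char *out)
{
    char tmp[13];                     /* filled from the right */
    size_t pos = sizeof tmp;
    unsigned digits = 0;

    do {
        if (digits != 0 && digits % 3 == 0)
            tmp[--pos] = sep;         /* a separator after every third digit */
        tmp[--pos] = (char)('0' + value % 10);
        value /= 10;
        ++digits;
    } while (value != 0);

    size_t len = sizeof tmp - pos;    /* the length is known here for free */
    for (size_t i = 0; i < len; ++i)
        out[i] = tmp[pos + i];
    out[len] = '\0';
    return len;
}
```

With a 14-byte buffer, u32_to_grouped(4294967295u, '.', buf) fills it exactly: 13 characters plus the terminator.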
Quote from: hutch-- on November 19, 2010, 01:56:50 PM
I think you could combine the numeric conversion and the output formatting with some reasonable gains, but the formatting algo is useful in its own right and may end up in a library. Depending on how you do the conversion, you can skip the length check, since the conversion already gives you the length.
Considering that udw2str uses a magic number:
mov ecx,429496730
and it is .386 compatible, there is probably enough room for optimizing the conversion just by using
MMX or XMM registers and SSE2 or later opcodes. An entire thousands-separated unsigned dword
uses only 14 bytes, including the NULL terminator, and an XMM register can hold up to 16 bytes.
I have to think a lot about this simple task. Maybe the limits are still far from what we have got so far.
By the way, the tests started yesterday, there is a lot of time ahead. :P
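A quick C model of that magic number, as a sketch only. 429496730 is 2^32/10 rounded up, so the high dword of value * 429496730 approximates value / 10. Because 10 * 429496730 = 2^32 + 4, the estimate is exact only for values below 2^30; above that it can come out one too high for some last digits (1073741829 is the first such case). None of the 20 test values in the first post is affected. The second function uses the multiplier 0CCCCCCCDh with a 3-bit extra shift, a widely used alternative that is exact over the whole 32-bit range. Both function names are made up for this example.

```c
#include <stdint.h>

/* C model of the `mul ecx` / `mov eax,edx` step in udw2str:
   the quotient estimate is the high dword of value * 429496730. */
uint32_t div10_udw2str_magic(uint32_t value)
{
    return (uint32_t)(((uint64_t)value * 429496730u) >> 32);
}

/* Alternative constant: 2^35/10 rounded up, followed by a 3-bit
   shift of the high dword (35 bits in total). Exact for every
   32-bit input. */
uint32_t div10_shifted_magic(uint32_t value)
{
    return (uint32_t)(((uint64_t)value * 0xCCCCCCCDu) >> 35);
}
```

In assembler terms the second variant would be a `mul` by 0CCCCCCCDh followed by `shr edx,3`; whether the extra shift costs anything measurable in the testbed is something only a timing run can tell.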
Quote from: dedndave on November 19, 2010, 03:36:03 AM
i am sure there is a function you can call to get the right separator for the user's country/code page :P
Dave, do you make suggestions before checking them???
GetLocaleInfo returned the right separator, and IT GETS IT FROM THE REGISTRY. So you could use the RegOpenKey, RegQueryValue and RegCloseKey APIs to do this.
I prefer to use one (one!) API for the same result :P
Alex
Quote from: dedndave on November 19, 2010, 03:53:58 AM
but, i was thinking of a little routine during init that reads the registry value and stores it :P
then, the conversion routine can grab the stored value, or it can be passed as a parm
Same here - read the post above. Of course, making things harder is very interesting, though :P
I'm tempted to try a combination of algos and see what I get.
First: a lookup table of initialized strings, four bytes each, holding a group of three digits and the separator.
Divide the number by 1000 and use the remainder as an index into the look-up table to get the
sequence of digits. Push the 4 bytes onto the stack and go on to the next division, checking whether the
number is > 999 to perform the division by 1000, or use it directly as the table index.
This should be fast enough, I guess: max 3 divisions of integer numbers, so IDIV. Pushing and popping
4 bytes at a time, and building the final formatted string.
I'll try it in the next few days, as I've got time enough.
Frank
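A rough C sketch of the lookup-table idea described above, under a few assumptions of mine: the table holds 1000 entries of "ddd" plus the separator (4000 bytes), the groups are collected from the right in a small temporary buffer instead of being pushed on the stack, and the leading group simply skips its zero padding. init_group_table and u32_to_grouped_lut are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static char group_table[1000][4];    /* "ddd" + separator: 4000 bytes */

/* Fill the table once: three zero-padded digits followed by `sep`. */
void init_group_table(char sep)
{
    for (int i = 0; i < 1000; ++i) {
        group_table[i][0] = (char)('0' + i / 100);
        group_table[i][1] = (char)('0' + i / 10 % 10);
        group_table[i][2] = (char)('0' + i % 10);
        group_table[i][3] = sep;
    }
}

/* `out` needs at least 14 bytes. Returns the length of the string. */
size_t u32_to_grouped_lut(uint32_t value, char *out)
{
    char tmp[16];                    /* four 4-byte slots, filled from the right */
    size_t pos = sizeof tmp;

    while (value > 999) {            /* at most three divisions for 32 bits */
        pos -= 4;
        memcpy(tmp + pos, group_table[value % 1000], 4);
        value /= 1000;
    }

    /* Leading group: same table entry, but skip its zero padding. */
    pos -= 4;
    memcpy(tmp + pos, group_table[value], 4);
    pos += (value < 10) ? 2 : (value < 100) ? 1 : 0;

    size_t len = sizeof tmp - pos - 1;   /* drop the trailing separator */
    memcpy(out, tmp + pos, len);
    out[len] = '\0';
    return len;
}
```

Call init_group_table(',') (or with whatever separator is in use) once before the first conversion. Changing the separator later means refilling the table, which is the price of storing it inside every entry.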
Quote from: frktons on November 19, 2010, 09:30:15 PM
This should be fast enough, I guess, max 3 division of integer numbers, so IDIV. Pushing and popping
4 bytes at a time, and building the final formatted string.
This will be fast, but will require ~4KB table of numbers. :eek :lol
Quote from: Antariy on November 19, 2010, 09:33:10 PM
This will be fast, but will require ~4KB table of numbers. :eek :lol
I can afford this, and maybe more .... :P
Quote from: frktons on November 19, 2010, 09:34:42 PM
Quote from: Antariy on November 19, 2010, 09:33:10 PM
This will be fast, but will require ~4KB table of numbers. :eek :lol
I can afford this, and maybe more .... :P
:P :lol
After that experiment, I'd like to try what happens using MMX and XMM registers to hold data
before filling the formatted string. I have to look for some SSE2/3 opcodes that suit the task.
Not sure at the moment how to do it, but I have a vague intuition that something can be done in
a very effective way. :lol
Probably by prefilling an XMM register with the separators, depending on the magnitude of the
number to format, and then filling the appropriate bytes with the digits extracted with "magic numbers" or
anything fast enough. ::)
For number conversions I have a faster signed DWORD version that was written by Paul Dixon. This may be useful for some of the tasks you have in mind. It also passes exhaustive testing over the full signed range.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align 16
ltoa_ex proc LongVar:DWORD,answer:DWORD
; --------------------------------------------------------------------------------
; this algorithm was written by Paul Dixon and has been converted to MASM notation
; --------------------------------------------------------------------------------
push esi
push edi
mov eax, [esp+4+8] ; LongVar ; get number
mov ecx, [esp+8+8] ; answer ; get pointer to answer string
jmp over
align 16
chartab:
dd "00","10","20","30","40","50","60","70","80","90"
dd "01","11","21","31","41","51","61","71","81","91"
dd "02","12","22","32","42","52","62","72","82","92"
dd "03","13","23","33","43","53","63","73","83","93"
dd "04","14","24","34","44","54","64","74","84","94"
dd "05","15","25","35","45","55","65","75","85","95"
dd "06","16","26","36","46","56","66","76","86","96"
dd "07","17","27","37","47","57","67","77","87","97"
dd "08","18","28","38","48","58","68","78","88","98"
dd "09","19","29","39","49","59","69","79","89","99"
over:
; on entry eax=number to convert, ecx=pointer to answer buffer (minimum 12 bytes)
; on exit, eax,ecx,edx are undefined, all other registers are preserved.
; answer is in location pointed to by ecx on entry
signed:
; do a signed DWORD to ASCII
or eax,eax ; test sign
jns udword ; if +ve, continue as for unsigned
neg eax ; else, make number positive
mov byte ptr [ecx],"-" ; include the - sign
add ecx, 1 ; update the pointer
udword:
; unsigned DWORD to ASCII
mov esi,ecx ; get pointer to answer
mov edi,eax ; save a copy of the number
mov edx, 0D1B71759h ; =2^45\10000 13 bit extra shift
mul edx ; gives 6 high digits in edx
mov eax, 068DB9h ; =2^32\10000+1
shr edx,13 ; correct for multiplier offset used to give better accuracy
jz skiphighdigits ; if zero then don't need to process the top 6 digits
mov ecx,edx ; get a copy of high digits
imul ecx,10000 ; scale up high digits
sub edi,ecx ; subtract high digits from original. EDI now = lower 4 digits
mul edx ; get first 2 digits in edx
mov ecx,100 ; load ready for later
jnc next1 ; if zero, suppress them by ignoring
cmp edx,9 ; 1 digit or 2?
ja ZeroSupressed ; 2 digits, just continue with pairs of digits to the end
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dh ; but only write the 1 we need, suppress the leading zero
add esi, 1
jmp ZS1 ; continue with pairs of digits to the end
next1:
mul ecx ; get next 2 digits
jnc next2 ; if zero, suppress them by ignoring
cmp edx,9 ; 1 digit or 2?
ja ZS1a ; 2 digits, just continue with pairs of digits to the end
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dh ; but only write the 1 we need, suppress the leading zero
add esi, 1
jmp ZS2 ; continue with pairs of digits to the end
next2:
mul ecx ; get next 2 digits
jnc short next3 ; if zero, suppress them by ignoring
cmp edx,9 ; 1 digit or 2?
ja ZS2a ; 2 digits, just continue with pairs of digits to the end
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dh ; but only write the 1 we need, suppress the leading zero
add esi, 1
jmp ZS3 ; continue with pairs of digits to the end
next3:
skiphighdigits:
mov eax,edi ; get lower 4 digits
mov ecx,100
mov edx,28F5C29h ; 2^32\100 +1
mul edx
jnc next4 ; if zero, suppress them by ignoring
cmp edx,9 ; 1 digit or 2?
ja ZS3a ; 2 digits, just continue with pairs of digits to the end
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dh ; but only write the 1 we need, suppress the leading zero
add esi, 1
jmp ZS4 ; continue with pairs of digits to the end
next4:
mul ecx ; this is the last pair so don't suppress a single zero
cmp edx,9 ; 1 digit or 2?
ja ZS4a ; 2 digits, just continue with pairs of digits to the end
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dh ; but only write the 1 we need, suppress the leading zero
mov byte ptr [esi+1],0 ; zero terminate string
jmp xit ; all done
ZeroSupressed:
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dx
add esi,2 ; write them to answer
ZS1:
mul ecx ; get next 2 digits
ZS1a:
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dx ; write them to answer
add esi,2
ZS2:
mul ecx ; get next 2 digits
ZS2a:
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dx ; write them to answer
add esi,2
ZS3:
mov eax,edi ; get lower 4 digits
mov edx,28F5C29h ; 2^32\100 +1
mul edx ; edx= top pair
ZS3a:
mov edx,chartab[edx*4] ; look up 2 digits
mov [esi],dx ; write to answer
add esi,2 ; update pointer
ZS4:
mul ecx ; get final 2 digits
ZS4a:
mov edx,chartab[edx*4] ; look them up
mov [esi],dx ; write to answer
mov byte ptr [esi+2],0 ; zero terminate string
xit:
sdwordend:
pop edi
pop esi
ret 8
ltoa_ex endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
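For readers who find the arithmetic hard to follow in assembler, here is a C model of it, as a sketch only. It keeps the three constants from the listing (0D1B71759h, 68DB9h, 28F5C29h) and the "multiply the fraction by 100 to pull out the next pair of digits" step, but it replaces chartab with a small helper and the zero-suppression branches with a simple scan, so it says nothing about speed. The names put_pair and u32_to_dec_pairs are made up, and only the unsigned part of the routine is covered.

```c
#include <stdint.h>
#include <stddef.h>

/* Two ASCII digits for a value in 0..99 (the job chartab does in the listing). */
static void put_pair(char *dst, uint32_t pair)
{
    dst[0] = (char)('0' + pair / 10);
    dst[1] = (char)('0' + pair % 10);
}

/* `out` needs at least 11 bytes. Returns the length of the string. */
size_t u32_to_dec_pairs(uint32_t value, char *out)
{
    char digits[10];

    /* value / 10000 through a multiplication by 2^45/10000 (rounded up) */
    uint32_t high = (uint32_t)(((uint64_t)value * 0xD1B71759u) >> 45);
    uint32_t low  = value - high * 10000u;        /* lower four digits */

    /* high (0..429496) scaled by 2^32/10000 + 1: the top pair lands in
       the high dword, the remaining digits stay in the low dword as a
       fraction */
    uint64_t f = (uint64_t)high * 0x68DB9u;
    put_pair(digits + 0, (uint32_t)(f >> 32));
    f = (f & 0xFFFFFFFFu) * 100u;                 /* `mul ecx`: next pair */
    put_pair(digits + 2, (uint32_t)(f >> 32));
    f = (f & 0xFFFFFFFFu) * 100u;
    put_pair(digits + 4, (uint32_t)(f >> 32));

    /* low (0..9999) scaled by 2^32/100 + 1 */
    f = (uint64_t)low * 0x28F5C29u;
    put_pair(digits + 6, (uint32_t)(f >> 32));
    f = (f & 0xFFFFFFFFu) * 100u;
    put_pair(digits + 8, (uint32_t)(f >> 32));

    /* Suppress leading zeros but always keep the last digit. */
    size_t first = 0;
    while (first < 9 && digits[first] == '0')
        ++first;

    size_t len = 10 - first;
    for (size_t i = 0; i < len; ++i)
        out[i] = digits[first + i];
    out[len] = '\0';
    return len;
}
```

The assembler version never builds the ten-digit intermediate; it decides on the fly where the first non-zero pair is and jumps into the matching point of the unrolled sequence, which is where the speed comes from.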
Thanks Hutch.
Is this version faster than the one in the m32lib, udw2str?
; #########################################################################
.386
.model flat, stdcall ; 32 bit memory model
option casemap :none ; case sensitive
; ---------------------------------------------------
; The original algorithm was written by comrade
; <comrade2k@hotmail.com>; http://www.comrade64.com/
;
; It has been optimised by Alexander Yackubtchik
; ---------------------------------------------------
; udw2str
; Parameters
; dwNumber - 32-bit double-word to be converted
; pszString - null-terminated string (output)
; Result
; None
.code
; #########################################################################
udw2str proc dwNumber:DWORD, pszString:DWORD
push ebx
push esi
push edi
mov eax, [dwNumber]
mov esi, [pszString]
mov edi, [pszString]
mov ecx,429496730
@@redo:
mov ebx,eax
mul ecx
mov eax,edx
lea edx,[edx*4+edx]
add edx,edx
sub ebx,edx
add bl,'0'
mov [esi],bl
inc esi
test eax, eax
jnz @@redo
jmp @@chks
@@invs:
dec esi
mov al, [edi]
xchg [esi], al
mov [edi], al
inc edi
@@chks:
cmp edi, esi
jb @@invs
pop edi
pop esi
pop ebx
ret
udw2str endp
; #########################################################################
end
Frank
Quote from: frktons on November 19, 2010, 10:14:03 PM
This version is faster than the one in the m32lib, udw2str?
Yes, Frank. Just look at
xchg [esi], al
This dropped the timings so much that the rest of the code doesn't matter.
Is that instruction so powerful? I didn't even suspect it. :lol
Can you explain why this instruction is so important?
I probably didn't say it right. I meant that it dropped the algo down to one of the SLOWEST. Oh... I should choose my words more precisely...
This instruction by itself can cause a stall of 50-100 clocks. It is an atomic instruction, and the CPU waits for all pending transactions on the system bus before it exchanges the values.
The LOCK# signal is asserted implicitly.
Alex
Oh!!! Well. This is what I knew about xchg: it is not an efficient mnemonic, and it is better to use
other solutions. :U
Quote from: frktons on November 19, 2010, 10:40:23 PM
Oh!!! Well. This is what I knew about xchg: it is not an efficient mnemonic, and it is better to use
other solutions. :U
Something like:
@@invs:
dec esi
mov al, [edi]
mov ah, [esi]
mov [edi], ah
mov [esi], al
inc edi
@@chks:
But this does not make the algo faster than Paul's code :lol
Quote from: Antariy on November 19, 2010, 10:49:10 PM
Something like:
@@invs:
dec esi
mov al, [edi]
mov ah, [esi]
xchg [edi], ah
mov [esi], al
inc edi
@@chks:
But this is not make algo faster than Paul's code :lol
Is it not possible to avoid xchg and to use other mnemonics, better ones I mean?
Quote from: frktons on November 19, 2010, 10:52:13 PM
Is it not possible to avoid xchg and to use other mnemonics, better ones I mean?
Pardon :green2, I made the changes carelessly :lol Look at the post again :bg
Quote from: Antariy on November 19, 2010, 10:55:30 PM
Quote from: frktons on November 19, 2010, 10:52:13 PM
Is it not possible to avoid xchg and to use other mnemonics, better ones I mean?
Pardon :green2, I'm make changes not attentively :lol Look to post again :bg
:U
I have just converted the same algo to unsigned. It's an algo that Paul Dixon wrote in PowerBASIC that I have converted to MASM notation. I removed the stack frame and ran it through exhaustive testing over the full unsigned range, 0 to -1.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
utoa_ex proc uvar:DWORD,pbuffer:DWORD
; --------------------------------------------------------------------------------
; this algorithm was written by Paul Dixon and has been converted to MASM notation
; --------------------------------------------------------------------------------
mov eax, [esp+4] ; uvar : unsigned variable to convert
mov ecx, [esp+8] ; pbuffer : pointer to result buffer
push esi
push edi
jmp udword
align 4
chartab:
dd "00","10","20","30","40","50","60","70","80","90"
dd "01","11","21","31","41","51","61","71","81","91"
dd "02","12","22","32","42","52","62","72","82","92"
dd "03","13","23","33","43","53","63","73","83","93"
dd "04","14","24","34","44","54","64","74","84","94"
dd "05","15","25","35","45","55","65","75","85","95"
dd "06","16","26","36","46","56","66","76","86","96"
dd "07","17","27","37","47","57","67","77","87","97"
dd "08","18","28","38","48","58","68","78","88","98"
dd "09","19","29","39","49","59","69","79","89","99"
udword:
mov esi, ecx ; get pointer to answer
mov edi, eax ; save a copy of the number
mov edx, 0D1B71759h ; =2^45\10000 13 bit extra shift
mul edx ; gives 6 high digits in edx
mov eax, 68DB9h ; =2^32\10000+1
shr edx, 13 ; correct for multiplier offset used to give better accuracy
jz short skiphighdigits ; if zero then don't need to process the top 6 digits
mov ecx, edx ; get a copy of high digits
imul ecx, 10000 ; scale up high digits
sub edi, ecx ; subtract high digits from original. EDI now = lower 4 digits
mul edx ; get first 2 digits in edx
mov ecx, 100 ; load ready for later
jnc short next1 ; if zero, suppress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZeroSupressed ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, suppress the leading zero
inc esi ; update pointer by 1
jmp short ZS1 ; continue with pairs of digits to the end
next1:
mul ecx ; get next 2 digits
jnc short next2 ; if zero, suppress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS1a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, suppress the leading zero
inc esi ; update pointer by 1
jmp short ZS2 ; continue with pairs of digits to the end
next2:
mul ecx ; get next 2 digits
jnc short next3 ; if zero, suppress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS2a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, suppress the leading zero
inc esi ; update pointer by 1
jmp short ZS3 ; continue with pairs of digits to the end
next3:
skiphighdigits:
mov eax, edi ; get lower 4 digits
mov ecx, 100
mov edx, 28F5C29h ; 2^32\100 +1
mul edx
jnc short next4 ; if zero, suppress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS3a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, suppress the leading zero
inc esi ; update pointer by 1
jmp short ZS4 ; continue with pairs of digits to the end
next4:
mul ecx ; this is the last pair so don't suppress a single zero
cmp edx, 9 ; 1 digit or 2?
ja short ZS4a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, suppress the leading zero
mov byte ptr [esi+1], 0 ; zero terminate string
jmp short sdwordend ; all done
ZeroSupressed:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx
add esi, 2 ; write them to answer
ZS1:
mul ecx ; get next 2 digits
ZS1a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write them to answer
add esi, 2
ZS2:
mul ecx ; get next 2 digits
ZS2a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write them to answer
add esi, 2
ZS3:
mov eax, edi ; get lower 4 digits
mov edx, 28F5C29h ; 2^32\100 +1
mul edx ; edx= top pair
ZS3a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write to answer
add esi, 2 ; update pointer
ZS4:
mul ecx ; get final 2 digits
ZS4a:
mov edx, chartab[edx*4] ; look them up
mov [esi], dx ; write to answer
mov byte ptr [esi+2], 0 ; zero terminate string
sdwordend:
pop edi
pop esi
ret 8
utoa_ex endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Quote from: hutch-- on November 19, 2010, 11:47:00 PM
I have just converted the same algo to unsigned. Its an algo that Paul Dixon wrote in powerbasic that I have converted to MASM notation. Removed the stack frame and run it through exhaustive testing 0 to -1 full unsigned range.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
utoa_ex proc uvar:DWORD,pbuffer:DWORD
; --------------------------------------------------------------------------------
; this algorithm was written by Paul Dixon and has been converted to MASM notation
; --------------------------------------------------------------------------------
mov eax, [esp+4] ; uvar : unsigned variable to convert
mov ecx, [esp+8] ; pbuffer : pointer to result buffer
push esi
push edi
jmp udword
align 4
chartab:
dd "00","10","20","30","40","50","60","70","80","90"
dd "01","11","21","31","41","51","61","71","81","91"
dd "02","12","22","32","42","52","62","72","82","92"
dd "03","13","23","33","43","53","63","73","83","93"
dd "04","14","24","34","44","54","64","74","84","94"
dd "05","15","25","35","45","55","65","75","85","95"
dd "06","16","26","36","46","56","66","76","86","96"
dd "07","17","27","37","47","57","67","77","87","97"
dd "08","18","28","38","48","58","68","78","88","98"
dd "09","19","29","39","49","59","69","79","89","99"
udword:
mov esi, ecx ; get pointer to answer
mov edi, eax ; save a copy of the number
mov edx, 0D1B71759h ; =2^45\10000 13 bit extra shift
mul edx ; gives 6 high digits in edx
mov eax, 68DB9h ; =2^32\10000+1
shr edx, 13 ; correct for multiplier offset used to give better accuracy
jz short skiphighdigits ; if zero then don; t need to process the top 6 digits
mov ecx, edx ; get a copy of high digits
imul ecx, 10000 ; scale up high digits
sub edi, ecx ; subtract high digits from original. EDI now = lower 4 digits
mul edx ; get first 2 digits in edx
mov ecx, 100 ; load ready for later
jnc short next1 ; if zero, supress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZeroSupressed ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, supress the leading zero
inc esi ; update pointer by 1
jmp short ZS1 ; continue with pairs of digits to the end
next1:
mul ecx ; get next 2 digits
jnc short next2 ; if zero, supress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS1a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, supress the leading zero
inc esi ; update pointer by 1
jmp short ZS2 ; continue with pairs of digits to the end
next2:
mul ecx ; get next 2 digits
jnc short next3 ; if zero, supress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS2a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, supress the leading zero
inc esi ; update pointer by 1
jmp short ZS3 ; continue with pairs of digits to the end
next3:
skiphighdigits:
mov eax, edi ; get lower 4 digits
mov ecx, 100
mov edx, 28F5C29h ; 2^32\100 +1
mul edx
jnc short next4 ; if zero, supress them by ignoring
cmp edx, 9 ; 1 digit or 2?
ja short ZS3a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, supress the leading zero
inc esi ; update pointer by 1
jmp short ZS4 ; continue with pairs of digits to the end
next4:
mul ecx ; this is the last pair so don; t supress a single zero
cmp edx, 9 ; 1 digit or 2?
ja short ZS4a ; 2 digits, just continue with pairs of digits to the end
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dh ; but only write the 1 we need, supress the leading zero
mov byte ptr [esi+1], 0 ; zero terminate string
jmp short sdwordend ; all done
ZeroSupressed:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx
add esi, 2 ; write them to answer
ZS1:
mul ecx ; get next 2 digits
ZS1a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write them to answer
add esi, 2
ZS2:
mul ecx ; get next 2 digits
ZS2a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write them to answer
add esi, 2
ZS3:
mov eax, edi ; get lower 4 digits
mov edx, 28F5C29h ; 2^32\100 +1
mul edx ; edx= top pair
ZS3a:
mov edx, chartab[edx*4] ; look up 2 digits
mov [esi], dx ; write to answer
add esi, 2 ; update pointer
ZS4:
mul ecx ; get final 2 digits
ZS4a:
mov edx, chartab[edx*4] ; look them up
mov [esi], dx ; write to answer
mov byte ptr [esi+2], 0 ; zero terminate string
sdwordend:
pop edi
pop esi
ret 8
utoa_ex endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
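For anyone following the reciprocal trick at skiphighdigits, here is a minimal C sketch of the same arithmetic. It assumes the value in edi is below 10000, as the "lower 4 digits" comment suggests. The constant 28F5C29h comes from the listing; the function name split_pairs is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Split n (0..9999) into two decimal digit pairs without a DIV.
   0x28F5C29 is 2^32/100 rounded up, the constant used in the listing. */
void split_pairs(uint32_t n, uint32_t *hi, uint32_t *lo)
{
    assert(n < 10000);
    uint64_t p = (uint64_t)n * 0x28F5C29u;  /* mul edx: product in edx:eax */
    uint32_t frac = (uint32_t)p;            /* eax: fraction scaled by 2^32 */
    *hi = (uint32_t)(p >> 32);              /* edx: n / 100 */
    *lo = (uint32_t)(((uint64_t)frac * 100u) >> 32);  /* mul ecx: n % 100 */
}
```

For n = 1234 this gives hi = 12 and lo = 34. Each pair then indexes a table of two ASCII digits, which is what chartab appears to hold.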
Thanks Steve. I'm afraid I'll have to do the work to insert and adapt it for the new testbed myself, won't I? :lol
Well, if I come back from the week-end sane enough I'll start to convert it and test it with the other algos.
This one lacks all the separator handling, so I have quite a lot of work to do. :eek
Couldn't you mix this code and your formatting routine to make my life a bit easier? :P
We'll see. :P
Frank,
The trick is to have your testbed use standard MASM algorithms so you don't have to adapt them. Align each algo if it's not aligned itself; if you need to show the size in bytes (I don't personally care), put a label at either end and do the arithmetic at assembly time.
These algos and a number of others are for the masm32 library, so they must be presented in that form.
There is also the qword algo by drizz here (http://www.masm32.com/board/index.php?topic=9857.msg72422#msg72422).
Quote from: hutch-- on November 20, 2010, 12:31:35 AM
Frank,
The trick is to have your testbed use standard MASM algorithms so you don't have to adapt them. Align each algo if it's not aligned itself; if you need to show the size in bytes (I don't personally care), put a label at either end and do the arithmetic at assembly time.
These algos and a number of others are for the masm32 library, so they must be presented in that form.
The code is not prepared for the test I'm doing. If I test an array of 16 unsigned dwords, the timings
refer to converting the whole array. This is the reason I have to adapt the code. Moreover, the task is to convert
an unsigned dword to an ASCII string with thousands separators.
If a routine does only part of that task, I need to fill the gap before I can use it in the testbed.
It is a good exercise for me, I have to admit, but sometimes I'm just too tired, as I am now, and I'm going to sleep.
Maybe after the week-end I can undertake this new task. But it would be better if the code were not just
posted, but inserted into the testbed. It has its own structure, not difficult to grasp, and info on how to use it.
The size is calculated at assembly time from the labels between which the code should be inserted.
There is no need to adapt or rewrite code if you start with the template I have included in the zip.
Apparently not many have read anything about how to use it, or about the task to do in this test.
Thanks anyway for your contribution.
Quote from: jj2007 on November 20, 2010, 12:40:02 AM
There is also the qword algo by drizz here (http://www.masm32.com/board/index.php?topic=9857.msg72422#msg72422).
Thanks Jochen :U
Frank
┌─────────────────────────────────────────────────────────────[20-Nov-2010 at 05:36 GMT]─┐
│OS : Microsoft Windows 7 Home Premium Edition, 64-bit (build 7600) │
│CPU : AMD Athlon(tm) II X2 215 Processor with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 53,977 │ 43,026 │ 41,538 │ 41,453 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 42,022 │ 41,842 │ 41,732 │ 50,379 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 59,343 │ 66,041 │ 48,354 │ 56,633 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 13,839 │ 11,373 │ 13,800 │ 13,972 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 3,371 │ 5,327 │ 6,293 │ 6,316 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 12,321 │ 6,406 │ 12,503 │ 11,744 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Clive, your algo has been adopted into the testbed to format the numbers :U
The attached release also should correct the problems with partial copy of clipboard
due to incompatibilities between MASM 10 that I use, and older versions before MASM 9.
Try this and let me know.
IMPORTANT if you use a machine that is not SSE2 capable, in the main prog: TestBed.asm
set the flag: SSE2 to OFF, otherwise you won't see the screens.
Frank
It works. Results are copied with original EXE and with ML8 recompilation.
Quote from: Antariy on November 21, 2010, 09:57:25 PM
It works. Results are copied with original EXE and with ML8 recompilation.
I put a
mov BYTE PTR [esi], al instead of
mov [esi], al because I already experienced this problem
with other PROCs, when compiling with MASM older than ver. 9.
So it should work from MASM 6.15 up in this fashion. :P
Quote from: frktons on November 21, 2010, 10:04:04 PM
I put a mov DWORD PTR [esi], al
instead of mov [esi], al because I already experienced this problem
with other PROCs, when compiling with MASM older than ver. 9.
So it should work from MASM 6.15 up in this fashion. :P
I have some doubts about that :P
Quote from: Antariy on November 21, 2010, 10:08:45 PM
Quote from: frktons on November 21, 2010, 10:04:04 PM
I put a mov DWORD PTR [esi], al
instead of mov [esi], al because I already experienced this problem
with other PROCs, when compiling with MASM older than ver. 9.
So it should work from MASM 6.15 up in this fashion. :P
I have some doubts about that :P
Why? Oh sure, it should be Byte PTR.
The program is correct, I wrote rubbish here. :P
with MOV [ESI],AL - the assembler knows the size is a byte
if you MOV [ESI],0 - the assembler does not know, so you need MOV BYTE PTR [ESI],0
it is the same for all versions of masm :8)
Quote from: dedndave on November 21, 2010, 10:12:14 PM
with AL, the assembler knows the size is a byte
if you MOV [ESI],0 - the assembler does not know, so you need MOV BYTE PTR [ESI],0
This happens with MASM 8 and below; from 9 upwards it doesn't. :P
Try this new version with your pc and tell me if the clipboard copy is still incomplete.
Quote from: frktons on November 21, 2010, 10:09:17 PM
Quote from: Antariy on November 21, 2010, 10:08:45 PM
Quote from: frktons on November 21, 2010, 10:04:04 PM
I put a mov DWORD PTR [esi], al
instead of mov [esi], al because I already experienced this problem
with other PROCs, when compiling with MASM older than ver. 9.
So it should work from MASM 6.15 up in this fashion. :P
I have some doubts about that :P
Why? Oh sure it should be Byte PTR
The program is correct, I wrote rubbish here. :P
:lol :P
i am sitting with dad - to give mom a break :P
so, i am not on the same machine
i hate this keyboard - lol
in fact, i hate this computer
Quote from: frktons on November 21, 2010, 10:13:12 PM
Quote from: dedndave on November 21, 2010, 10:12:14 PM
with AL, the assembler knows the size is a byte
if you MOV [ESI],0 - the assembler does not know, so you need MOV BYTE PTR [ESI],0
This happens with MASM 8 and below; from 9 upwards it doesn't. :P
Try this new version with your pc and tell me if the clipboard copy is still incomplete.
Frank, Dave is right about immediates: in any case you should specify the data size when you move an immediate to memory. The assembler doesn't know which "0" you are writing: byte, word or dword sized.
This rel 1.51 should be able to run on old and new machines, provided that the user set the SSE2 flag
to the correct state. ON = machine SSE2 capable, OFF = machine SSE2 not capable :lol
Well I guess I was moving al to [esi] and it didn't work with older MASM versions.
Quote from: dedndave on November 21, 2010, 10:16:00 PM
i am sitting with dad - to give mom a break :P
so, i am not on the same machine
i hate this keyboard - lol
Well, when you can, obviously. :U
Quote from: frktons on November 21, 2010, 10:17:28 PM
Well I guess I was moving al to [esi] and it didn't work with older MASM versions.
No, this *should* (should!!!) work. The size of the operand is known at assembly time.
Maybe Dave will test the new release, and we can see what is up? What do you say, Dave?
Quote from: Antariy on November 21, 2010, 10:19:50 PM
Quote from: frktons on November 21, 2010, 10:17:28 PM
Well I guess I was moving al to [esi] and it didn't work with older MASM versions.
No, this *should* (should!!!) work. The size of the operand is known at assembly time.
Maybe Dave will test the new release, and we can see what is up? What do you say, Dave?
Alex you have MASM 6.15 on your machine. You can try it yourself. :P
┌─────────────────────────────────────────────────────────────[21-Nov-2010 at 22:21 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat
Quote from: dedndave on November 21, 2010, 10:22:36 PM
┌─────────────────────────────────────────────────────────────[21-Nov-2010 at 22:21 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat
:lol :lol :lol :lol :lol :lol :lol :lol
; -------------------------------------------------------------------------
; The data read from the Screen Buffer is converted into DOS style.
; The New Buffer is filled with the content of the Screen Buffer, removing
; a char after each byte making it DOS compatible.
; -------------------------------------------------------------------------
ConvertToDOS PROC DestBuffer:DWORD, SourceBuffer:DWORD
mov esi, SourceBuffer
mov edi, DestBuffer
mov ecx, CharNumber
xor eax, eax
xor ebx, ebx
NextChar:
mov eax, [esi]
mov BYTE PTR [edi], al
shr eax, N16
mov BYTE PTR [edi+1], al
add esi, Four
add edi, Two
dec ecx
jnz NextChar
End_cycle:
ret
ConvertToDOS ENDP
Do you see any problem here?
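For readers who find the loop easier to follow in C, here is a rough sketch of what ConvertToDOS seems to do. It assumes each 4-byte source cell holds the character in the low word and the attributes in the high word (the layout of a console CHAR_INFO), and that N16, Four and Two equal 16, 4 and 2. The name cells_to_dos is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 4-byte console cells into DOS-style char/attribute byte pairs,
   i.e. two output bytes per input cell. */
void cells_to_dos(uint8_t *dest, const uint32_t *cells, size_t count)
{
    assert(dest != NULL && cells != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t cell = cells[i];                 /* mov eax, [esi]        */
        dest[2 * i]     = (uint8_t)cell;          /* mov BYTE PTR [edi], al */
        dest[2 * i + 1] = (uint8_t)(cell >> 16);  /* shr eax, 16 then al    */
    }
}
```

Under that reading, each pass consumes one cell and emits one character plus one attribute byte, so the loop count is the number of cells.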
manually..
┌─────────────────────────────────────────────────────────────[21-Nov-2010 at 22:21 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 93,311 │ 92,294 │ 89,304 │ 89,287 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 87,343 │ 87,847 │ 87,956 │ 87,071 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 106,150 │ 106,907 │ 106,463 │ 120,352 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 8,638 │ 8,480 │ 8,414 │ 8,112 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 3,516 │ 3,481 │ 3,530 │ 3,516 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 12,250 │ 11,931 │ 12,162 │ 12,289 │
Quote from: frktons on November 21, 2010, 10:21:04 PM
Alex you have MASM 6.15 on your machine. You can try it yourself. :P
This is something that does not need proving :P It should work, and it works :P
Quote from: Antariy on November 21, 2010, 10:26:07 PM
Quote from: frktons on November 21, 2010, 10:21:04 PM
Alex you have MASM 6.15 on your machine. You can try it yourself. :P
This is something that does not need proving :P It should work, and it works :P
Not on Dave's machine.
Quote from: frktons on November 21, 2010, 10:23:06 PM
Do you see any problem here?
No.
You should search for a runtime dependency: maybe somewhere you did not preserve some register, or you rely on a register having the same/specific value, etc.
Frank,
The notation,
mov [esi], al
Has always worked in MASM. The guys are right that it's only when the data size is not known that you must specify it. MASM is historically a fully specified Intel notation, but where it can determine the size from a register you can use the abbreviated notation.
Either,
mov BYTE PTR [esi], al
; or
mov [esi], al
are valid as the size can be determined from the register AL.
Quote from: frktons on November 21, 2010, 10:27:34 PM
Quote from: Antariy on November 21, 2010, 10:26:07 PM
Quote from: frktons on November 21, 2010, 10:21:04 PM
Alex you have MASM 6.15 on your machine. You can try it yourself. :P
This is something that does not need proving :P It should work, and it works :P
Not on Dave's machine.
I used ML 6.15, just as a favour since you asked, and everything is copied.
See my previous post for the probable reason.
Probably there is something in this PROC that is not compatible with old MASM versions:
; -------------------------------------------------------------------------
; The text is extracted from the Screen Buffer and prepared to be copied
; into the Windows Clipboard with code tags enclosing it and with
; CR+LF at the end of each text line.
; -------------------------------------------------------------------------
CopyScreenText PROC
lea esi, TopCode
lea edi, TextResults
movq mm7, qword ptr [esi]
movq qword ptr [edi], mm7
add edi, Eight
lea esi, SavedScreen
mov eax, AlgoRow
imul eax, MaxCols
mov ecx, eax
xor eax, eax
xor ebx, ebx
NextCharText:
mov eax, [esi]
mov BYTE PTR [edi], al
add esi, Four
add edi, One
inc ebx
cmp ebx, MaxCols
jl GoOn
mov BYTE PTR [edi], CR
mov BYTE PTR [edi + 1], LF
add edi, Two
xor ebx, ebx
GoOn:
dec ecx
jnz NextCharText
SetNULL:
lea esi, BottomCode
movq mm7, qword ptr [esi]
movq qword ptr [edi], mm7
add edi, Eight
mov WORD PTR [edi], 00A0h; CR + NULL
End_cycle:
ret
CopyScreenText ENDP
that conclusion is not valid
i am not assembling it - i only execute it :U
Quote from: frktons on November 21, 2010, 10:31:36 PM
mov WORD PTR [edi], 00A0h; CR + NULL
This code does not put an LF at the end of the string, Frank. 000Ah is LF+NULL.
Quote from: dedndave on November 21, 2010, 10:34:00 PM
that conclusion is not valid
i am not assembling it - i only execute it :U
Dave, do you have any debugger, a 32-bit Windows debugger, on your current machine?
dad's machine - i have nothing set up here :'(
Quote from: dedndave on November 21, 2010, 10:34:00 PM
that conclusion is not valid
i am not assembling it - i only execute it :U
The strange thing is that somebody with Win XP SP2, like yours, is running it correctly. ::)
Quote from: Antariy on November 21, 2010, 10:38:29 PM
Quote from: frktons on November 21, 2010, 10:31:36 PM
mov WORD PTR [edi], 00A0h; CR + NULL
This code does not put an LF at the end of the string, Frank. 000Ah is LF+NULL.
Well, that's right. How does it work on so many machines?
I'm changing it, let's see if something changes... :wink
Quote from: dedndave on November 21, 2010, 10:40:53 PM
dad's machine - i have nothing set up here :'(
I can post an old WinDbg which can be freely distributed, but it is 625 KB in size, so it cannot be attached.
typo !!
00A0h <> 000Ah
hmmmmm
that could mean that some machines do not like "á"
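To make the 00A0h versus 000Ah point concrete, here is a small C sketch of how a WORD store lands in memory on x86 (low byte first). The helper name store_word_le is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value the way "mov WORD PTR [edi], imm16" does on x86:
   low byte at the lower address, high byte after it. */
void store_word_le(uint8_t *p, uint16_t w)
{
    assert(p != NULL);
    p[0] = (uint8_t)(w & 0xFF);
    p[1] = (uint8_t)(w >> 8);
}
```

With this layout 000Ah writes LF followed by NULL, 00A0h writes the byte A0h followed by NULL, and a CR+LF pair written as one word would be 0A0Dh.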
Corrected the typo. :P
Quote from: frktons on November 21, 2010, 10:41:11 PM
Quote from: dedndave on November 21, 2010, 10:34:00 PM
that conclusion is not valid
i am not assembling it - i only execute it :U
The strange thing is that somebody with Win XP SP2, like yours, is running it correctly. ::)
Quote from: Antariy on November 21, 2010, 10:38:29 PM
Quote from: frktons on November 21, 2010, 10:31:36 PM
mov WORD PTR [edi], 00A0h; CR + NULL
This code does not put an LF at the end of the string, Frank. 000Ah is LF+NULL.
Well, that's right. How does it work on so many machines?
I'm changing it, let's see if something changes... :wink
No, this is not the reason, Frank :lol. I just drew your attention to it, but it is not the reason for the uncopied results.
┌─────────────────────────────────────────────────────────────[21-Nov-2010 at 22:47 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat
Quote from: frktons on November 21, 2010, 10:45:05 PM
Corrected the typo. :P
Also, GMEM_DDESHARE is not specified when allocating the memory for the clipboard buffer.
Just strange suggestions :P
Dave, which OS is installed on dad's machine?
This is a serious question, answer without jokes, please :P
Quote from: Antariy on November 21, 2010, 10:45:20 PM
No, this is not the reason, Frank :lol. I just drew your attention to it, but it is not the reason for the uncopied results.
I don't know so far what else could it be. Your explanation is not clear enough for me.
Let's carry on one problem at the time and see what we get.
Quote from: Antariy on November 21, 2010, 10:48:50 PM
Quote from: frktons on November 21, 2010, 10:45:05 PM
Corrected the typo. :P
Also, GMEM_DDESHARE is not specified when allocating the memory for the clipboard buffer.
Just strange suggestions :P
What should I do in your opinion?
Quote from: Antariy on November 21, 2010, 10:49:56 PM
Dave, which OS is installed on dad's machine?
This is a serious question, answer without jokes, please :P
the program says:
Microsoft Windows XP Professional Service Pack 2 (build 2600)
Quote from: frktons on November 21, 2010, 10:52:15 PM
Quote from: Antariy on November 21, 2010, 10:49:56 PM
Dave, which OS is installed on dad's machine?
This is a serious question, answer without jokes, please :P
the program says:
Microsoft Windows XP Professional Service Pack 2 (build 2600)
What time is it at your location? :P
If Dave returns to the thread, I'll describe a way to find the culprit on his machine. Dave should be glad, because this approach will be familiar to him.
:bg
Quote from: Antariy on November 21, 2010, 10:54:17 PM
What time is it at your location? :P
21-Nov-2010 at 22:55 GMT + 1
Quote from: Antariy on November 21, 2010, 10:55:34 PM
If Dave returns to the thread, I'll describe a way to find the culprit on his machine. Dave should be glad, because this approach will be familiar to him.
:bg
:lol :lol :lol :lol :lol :lol :lol :lol
that is correct
the time is not correct - it is now 3:58 PM
the OS is correct
edit - my mistake - it is GMT time
Dave, your OS (in this case) has an integrated debugger. It has a console interface and is very similar to DEBUG.EXE in handling.
Can you do the debugging? It will be simple for you, and will not require installing any software (or updates !!!).
Should I give details?
Quote from: dedndave on November 21, 2010, 10:59:01 PM
the time is not correct - it is now 3:58 PM
the OS is correct
We are talking about GMT not your country time.
sorry
i am taking care of dad, and his dog (lol) and this keyboard is about to be thrown against the wall :bdg
Quote from: dedndave on November 21, 2010, 11:02:45 PM
sorry
i am taking care of dad, and his dog (lol) and this keyboard is about to be thrown against the wall :bdg
So, you do not want to be dedicated and do the debugging with standard tools?
You leave the battlefield at such an interesting time?
:eek
http://www.masm32.com/board/index.php?topic=15365.msg125873#msg125873
Quote from: Antariy on November 21, 2010, 11:05:49 PM
Quote from: dedndave on November 21, 2010, 11:02:45 PM
sorry
i am taking care of dad, and his dog (lol) and this keyboard is about to be thrown against the wall :bdg
So, you do not want to be dedicated and do the debugging with standard tools?
You leave the battlefield at such an interesting time?
:eek
http://www.masm32.com/board/index.php?topic=15365.msg125873#msg125873
Well, Dave is not in the best environment to do that debugging session, better to do it
in another time/day, or leave here the info he needs, and when he's got time he'll see
what to do. :wink
i would not mind if it weren't for this keyboard
it misses half the letters i type - very annoying
also - just realized i am out of smokes :dazzled:
Quote from: Antariy on November 21, 2010, 11:17:21 PM
That should not take a long time.
And it is not clear - maybe he does not want to spend his time on this.
Alex, do you mind if Dave does the debugging session later or tomorrow?
It doesn't seem the right moment to me. ::)
i should be at home in about an hour - we can play, then :U
Quote from: dedndave on November 21, 2010, 11:21:48 PM
i should be at home in about an hour - we can play, then :U
Great! :U
Quote from: frktons on November 21, 2010, 11:21:00 PM
Alex, do you mind if Dave does the debugging session later or tomorrow?
It doesn't seem the right moment to me. ::)
:eek
I'm not insisting on anything. Just:
1. Dave is the first reporter of this issue.
2. Dave likes consoles.
3. My English does not let me tell the difference between his jokes and the truth :lol
Quote from: Antariy on November 21, 2010, 11:28:16 PM
Quote from: dedndave on November 21, 2010, 11:21:48 PM
i should be at home in about an hour - we can play, then :U
No hurry, first destroy the bad keyboard :lol
That's a good idea anyway. :P
i don't know what the problem is
new keyboard - new batteries - receiver is 10 inches away
Quote from: dedndave on November 21, 2010, 11:33:16 PM
i don't know what the problem is
new keyboard - new batteries - receiver is 10 inches away
:lol :lol :lol :lol :lol :lol :lol :lol :lol :lol :lol
Maybe it is too advanced for your dad's computer :lol :lol :lol :lol :lol :lol
Frank, change this:
invoke GlobalAlloc,GMEM_MOVEABLE,DWORD PTR Lenght
to this:
invoke GlobalAlloc,GMEM_MOVEABLE or GMEM_DDESHARE,DWORD PTR Lenght
Just to make sure :lol
Quote from: Antariy on November 21, 2010, 11:36:37 PM
Frank, change this:
invoke GlobalAlloc,GMEM_MOVEABLE,DWORD PTR Lenght
to this:
invoke GlobalAlloc,GMEM_MOVEABLE or GMEM_DDESHARE,DWORD PTR Lenght
Just to make sure :lol
OK! Before posting the 100th release, I'll wait a few hours. Better to test it before posting. :P
Quote from: frktons on November 21, 2010, 11:40:03 PM
Quote from: Antariy on November 21, 2010, 11:36:37 PM
Frank, change this:
invoke GlobalAlloc,GMEM_MOVEABLE,DWORD PTR Lenght
to this:
invoke GlobalAlloc,GMEM_MOVEABLE or GMEM_DDESHARE,DWORD PTR Lenght
Just to make sure :lol
OK! Before posting the 100th release, I'll wait a few hours. Better to test it before posting. :P
I guess the problem is not in the DDESHARE flag, but if you set it and post it now, we can see the results now :P
here it is.
Alex, you are running Win XP Pro SP2, the same OS as Dave.
What could be so different between your machines? Oex has the same problem, same OS,
I really don't understand. ::)
Quote from: frktons on November 22, 2010, 12:16:28 AM
Alex, you are running Win XP Pro SP2, the same OS as Dave.
What could be so different between your machines? Oex has the same problem, same OS,
http://www.masm32.com/board/index.php?topic=15365.msg125844#msg125844 - the first thing it could be:
Quote from: Antariy on November 21, 2010, 10:28:37 PM
You should search for a runtime dependency: maybe somewhere you did not preserve some register,
or you rely on a register having the same/specific value, etc.
I don't think so. Otherwise bad results should appear in many more systems. They show up only
in a few machines, when people probably compile with old MASM versions. A corrupted register
should break the entire logic of the PROC.
Something else in my opinion is working bad here. ::)
Quote from: frktons on November 22, 2010, 12:24:38 AM
I don't think so. Otherwise bad results should appear in many more systems. They show up only
in a few machines, when people probably compile with old MASM versions. A corrupted register
should break the entire logic of the PROC.
All bugs like this rest on the fact that many similar systems have the same layout of things. So it is quite possible that on most machines, for example, some register is zero after some operation, etc. On identical sub-builds of the system this would be more or less the rule.
But this is only a first guess - no more than that.
Quote from: Antariy on November 22, 2010, 12:33:56 AM
All bugs like this rest on the fact that many similar systems have the same layout of things. So it is quite possible that on most machines, for example, some register is zero after some operation, etc. On identical sub-builds of the system this would be more or less the rule.
But this is only a first guess - no more than that.
Of course, but some bugs can depend on things you don't even suspect. Maybe an antivirus, or a system
program, or a corrupted dll. I hope it is only a program bug, because, if it is, we'll find it and correct it.
But if it is something related to OS, or dll, or any other thing, it could be very complex to get rid of it.
Let's see if the changes we've made produce some effect. Otherwise a debug session can help find
where the problem is.
ok - it still has the problem
i am debugging it - i will figure it out :U
it seems only logical to troubleshoot it on a system that exhibits the problem
it cannot be easily solved on a machine that doesn't :bg
Quote from: dedndave on November 22, 2010, 12:48:30 AM
ok - it still has the problem
i am debugging it - i will figure it out :U
it seems only logical to troubleshoot it on a system that exhibits the problem
it cannot be easily solved on a machine that doesn't :bg
You are right, indeed. :U
Frank, this is the culprit:
ProgData.inc:
AlgoDesc BYTE 31 DUP (?)
AlgoDescSize DWORD SIZEOF AlgoDesc
Algo1.asm:
AlgoDesc1 byte "ustrv$ + GetNumberFormat ",0 ; ZERO BYTE SHOULD BE REPLACED WITH A SPACE
...
invoke DisplayAt, dword ptr AlgoColDesc, dword ptr AlgoRow, addr AlgoDesc,
dword ptr AlgoDescSize
So you are displaying a binary zero on the screen, because SIZEOF includes the terminating zero in the length of the string.
You can fix all those strings with:
AlgoDesc1 byte "ustrv$ + GetNumberFormat " ; one space added
or, if you want a zero-terminated string:
AlgoDesc1 byte "ustrv$ + GetNumberFormat " ; one space added
db 0
Alex
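The same off-by-one shows up in C, where sizeof on a string array also counts the terminating zero. Here is a minimal sketch, assuming a shortened copy of the description string; algo_desc and display_length are hypothetical names, not taken from the testbed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy of one description string from the testbed. */
const char algo_desc[] = "ustrv$ + GetNumberFormat";

/* Number of characters to display for a buffer whose size was taken with
   sizeof (SIZEOF in MASM): everything except the terminating zero. */
size_t display_length(const char *s, size_t size_with_nul)
{
    assert(size_with_nul > 0 && s[size_with_nul - 1] == '\0');
    return size_with_nul - 1;
}
```

Passing sizeof algo_desc straight to a display routine would send the zero byte to the screen too; display_length(algo_desc, sizeof algo_desc) gives the same count as strlen.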
i haven't completely solved the problem
but - i have isolated it to some degree
there appears to be a null byte at that point in the text data
something else, not related to this specific problem ...
ConvertToDOS PROC DestBuffer:DWORD, SourceBuffer:DWORD
mov esi, SourceBuffer
mov edi, DestBuffer
mov ecx, CharNumber ;shouldn't this be NumCycles ?????? - we do 2 chars per pass
he may have found it :U
well - i count 6 spaces at the end of that string
in the pasted text, i only count 5 :P
let me see if i can find it
Even better:
invoke lstrlen,addr AlgoDesc
invoke DisplayAt, dword ptr AlgoColDesc, dword ptr AlgoRow, addr AlgoDesc,
eax
Alex
Quote from: dedndave on November 22, 2010, 01:07:21 AM
well - i count 6 spaces at the end of that string
in the pasted text, i only count 5 :P
let me see if i can find it
It would be better to calculate the string length dynamically, as I did in my version of the manager.
Of course, do the search :U
But I guess I have found the culprit, or at least the main reason. :lol There is something strange in how the console treats the results on different systems: on my system, it seems, the zero byte is replaced with a space by the system; on yours it is left as is, and it terminates the string.
this may not be the end solution, but it does solve the problem
should help find the bug
CopyScreenText PROC
lea esi, TopCode
lea edi, TextResults
movq mm7, qword ptr [esi]
movq qword ptr [edi], mm7
add edi, Eight
lea esi, SavedScreen
mov eax, AlgoRow
imul eax, MaxCols
mov ecx, eax
xor eax, eax
xor ebx, ebx
NextCharText:
mov eax, [esi]
cmp al,0
jnz around1
mov al,20h
around1:
mov BYTE PTR [edi], al
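Here is a C sketch of the same workaround, assuming one character per 4-byte cell and a CR LF pair after every row, as in CopyScreenText. The name cells_to_text is hypothetical, and the sketch NUL-terminates the output instead of appending the closing code tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the character byte of each cell to text, turning a zero byte into a
   space and appending CR LF after every `cols` characters.
   Returns the number of bytes written, not counting the final NUL. */
size_t cells_to_text(char *dest, const uint32_t *cells, size_t rows, size_t cols)
{
    assert(dest != NULL && cells != NULL);
    size_t out = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            char ch = (char)(uint8_t)cells[r * cols + c];
            dest[out++] = (ch == 0) ? ' ' : ch;  /* cmp al,0 / mov al,20h */
        }
        dest[out++] = '\r';
        dest[out++] = '\n';
    }
    dest[out] = '\0';
    return out;
}
```

The destination needs rows * (cols + 2) + 1 bytes. The workaround hides the zero byte in the copied text; the string definitions that put it on the screen still need fixing.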
Dave, did you try it? Does it work?
yes - using the "C" command.....
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:21 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 75,602 │ 72,022 │ 73,863 │ 71,714 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 72,385 │ 71,391 │ 71,918 │ 70,983 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 90,708 │ 88,303 │ 91,428 │ 88,650 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 8,627 │ 8,596 │ 8,448 │ 8,441 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 3,353 │ 3,607 │ 3,697 │ 3,357 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 11,877 │ 11,919 │ 12,173 │ 12,370 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Another suggestion: the following code in TestBed.asm could cause problems. Why not just test for SSE2?
SSE2 EQU ON ; If your CPU is SSE2 capable set this var to ON
Procedure(s) to test for SSE2.
;---------------------------------------
ChkSSE2 PROC USES ebx ; cpuid overwrites ebx, which the caller expects to be preserved
; returns: TRUE (1) if SSE2 is supported or
; FALSE (0) if SSE2 is not supported
call ChkCPUID
test eax, eax
jz @F ; CPUID not supported
xor eax, eax
cpuid
test eax, eax
jz @F ; function 1 not supported
mov eax, 1
cpuid
xor eax, eax ; set up for return of FALSE
bt edx, 26 ; SSE2 supported?
jnc @f ; return FALSE
mov eax, 1 ; return TRUE
@@:
ret
ChkSSE2 ENDP
;---------------------------------------
ChkCPUID PROC USES ebx
; Return: True (1) if CPUID supported
; False(0) if CPUID not supported
pushfd
pop eax
btc eax, 21 ; check if CPUID bit can toggle
push eax
popfd
pushfd
pop ebx
xor ebx, eax
xor eax, eax ; set up to return FALSE
bt ebx, 21
jc @F ; CPUID not supported, return FALSE
mov eax, 1 ; CPUID supported, return TRUE
@@:
ret
ChkCPUID ENDP
;---------------------------------------
Frank, have a look into AssignStr ;)
OK, let's see if it works on my system as well :P
Well you have posted 20 corrections. Now which one is the good one?
All secondary points can be solved later... :P
Quote from: GregL on November 22, 2010, 01:23:24 AM
Another suggestion: the following code in TestBed.asm could cause problems. Why not just test for SSE2?
My old manager handles that issue at runtime :lol
while we are discussing SSE :bg
in 2 places, the CopyTextScreen routine uses
movq mm7, qword ptr [esi]
movq qword ptr [edi], mm7
is that necessary ?
i mean - sure it is a little faster, but they are executed only once
that code makes the program incompatible with a P3 machine
i am sure you have several other similar pieces of code in there
i would think it might be better to use non-SSE code, unless speed is an issue
that way, you have fewer pieces of code to write replacement routines for if SSE is not available
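For reference, the MMX pair above copies 8 bytes, and the same copy can be written with plain 32-bit moves. A minimal sketch, assuming EAX and EDX are free at that point in CopyTextScreen (an assumption; the surrounding code is not shown here):

```asm
    ; copy 8 bytes from [esi] to [edi] using only i386 instructions
    mov  eax, dword ptr [esi]
    mov  edx, dword ptr [esi+4]
    mov  dword ptr [edi], eax
    mov  dword ptr [edi+4], edx
```

This form runs on any 32-bit x86 CPU and, unlike the MMX version, needs no EMMS before later FPU code.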
I would recommend a global struct something like
CPUFeatures STRUCT
MMX
SSE
SSE2
ETC ;<---- Awesome feature if you have got it :lol
CPUFeatures ENDS
Check for all features at once and then just check the global struct when needed....
Best always to code everything with basic x86 commands first and only optimise where needed at the end (Obviously I used Dave's post as a reference :lol)
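One possible shape for that idea, as a sketch only: read CPUID function 1 once at startup and keep the raw feature words in two dwords. The names GetCPUFeatures, dwFeatEDX, dwFeatECX and the label NoSSE2 are hypothetical, and the code assumes CPUID and its function 1 have already been verified as available. The bit positions are the documented ones: EDX bit 23 = MMX, bit 25 = SSE, bit 26 = SSE2; ECX bit 0 = SSE3.

```asm
.DATA?
dwFeatEDX dd ?                  ; CPUID function 1, EDX feature bits
dwFeatECX dd ?                  ; CPUID function 1, ECX feature bits

.CODE
GetCPUFeatures PROC USES ebx    ; CPUID overwrites EBX, so preserve it
    mov  eax, 1
    cpuid
    mov  dwFeatEDX, edx         ; MMX = bit 23, SSE = bit 25, SSE2 = bit 26
    mov  dwFeatECX, ecx         ; SSE3 = bit 0
    ret
GetCPUFeatures ENDP

; later, wherever a feature is needed:
    bt   dwFeatEDX, 26          ; SSE2 present?
    jnc  NoSSE2                 ; hypothetical label for the fallback path
```

Keeping the raw words means a new feature test only needs another BT with a different bit number, with no further CPUID calls.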
One thing at a time. :P
1] implemented workaround proposed by Dave.
2] I'm going to implement check for SSE2 capable code, this will solve also:
Quote from: dedndave on November 22, 2010, 01:30:33 AM
while we are discussing SSE :bg
in 2 places, the CopyTextScreen routine uses
movq mm7, qword ptr [esi]
movq qword ptr [edi], mm7
is that necessary ?
i mean - sure it is a little faster, but they are executed only once
that code makes the program incompatible with a P3 machine
i am sure you have several other similar pieces of code in there
i would think it might be better to use non-SSE code, unless speed is an issue
that way, you have fewer pieces of code to write replacement routines for if SSE is not available
Because if the system doesn't support MMX (which is probably supported by the P3), it will assemble and run alternative code.
i suggest this code
http://www.masm32.com/board/index.php?topic=15338.msg125149#msg125149
the value can be stored in a single dword (byte, actually)
Dave, does this EXE run properly?
The problem is that the string size is defined in several places, and the definitions do not match the real sizes.
Alex
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:37 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 44.100 │ 43.737 │ 43.864 │ 43.768 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 43.794 │ 43.569 │ 43.541 │ 43.612 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 50.065 │ 49.866 │ 50.556 │ 50.546 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.134 │ 3.142 │ 3.149 │ 3.113 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 1.981 │ 1.966 │ 1.994 │ 1.960 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 5.576 │ 5.640 │ 5.607 │ 5.592 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
On my pc almost everything works. :lol
no Alex - same problem
Frank - perhaps this is related to the CPU, itself
you have a newer one than i do :'(
you don't have any SSE4 code in there, do you ?
also - i noticed in Hutch's thread that you have common controls disabled or something ???
that makes your machine different than everyone else's
Quote from: dedndave on November 22, 2010, 01:38:33 AM
no Alex - same problem
First row is not copied properly? I have changed only one testing description.
the first row ?
the first algo data row is where it quits
see the previous posts with the short results - it looks the same
Dave, how about this one? The first row should show now.
This is not a CPU issue; it is a tangle of string lengths.
i know - but i was trying to give Frank something to think about, in terms of CPU/SSE :bdg
yes - that one works, Alex
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:46 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 2 (build 2600) │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 74,853 │ 76,643 │ 73,079 │ 77,197 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 71,043 │ 75,563 │ 70,802 │ 86,603 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 92,975 │ 96,372 │ 100,538 │ 90,121 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 8,812 │ 9,840 │ 9,729 │ 11,084 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 3,566 │ 3,369 │ 3,184 │ 3,952 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 12,113 │ 12,253 │ 11,941 │ 11,992 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
Instead of the:
SSE2 EQU ON
I should do something like:
CALL ChkSSE2
.if eax
SSE2 EQU ON
.else
SSE2 EQU OFF
.endif
I don't actually know if this syntax is correct. Is it?
well - you are going to want to test for MMX, SSE, SSE2, SSE3 :bg
use the routine i posted earlier
store the result in a dword and use BT on that dword to see if a feature is present
or the "&"
.if FeatureFlags & 20h
Quote from: dedndave on November 22, 2010, 01:47:35 AM
i know - but i was trying to give Frank something to think about, in terms of CPU/SSE :bdg
yes - that one works, Alex
Thanks! :U
I changed this in ProgData.inc:
AlgoDescSize DWORD SIZEOF AlgoDesc-1
:lol
Try making this change in the original source and tell me the results :U
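To illustrate why the -1 matters, here is a sketch with a hypothetical string (SampleDesc is not the testbed's real data): SIZEOF counts the terminating zero byte, so subtracting 1 gives the number of visible characters.

```asm
.DATA
SampleDesc     db "01 Sample algo", 0          ; 14 characters + terminating zero
SampleDescSize DWORD (SIZEOF SampleDesc) - 1   ; 14: length without the zero byte
```

Using the full SIZEOF as a copy length would carry the zero byte along with the text.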
Looks good here Frank.
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:52 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 3 (build 2600) │
│CPU : Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz with 4 logical core(s) with SSE4.1 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 32,115 │ 32,048 │ 31,269 │ 31,559 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 32,146 │ 31,748 │ 32,162 │ 31,828 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 37,187 │ 37,389 │ 37,251 │ 37,388 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 2,583 │ 2,583 │ 2,583 │ 2,577 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2,052 │ 2,017 │ 2,009 │ 2,019 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│07 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│08 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│09 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│10 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│11 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│12 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│13 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│14 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│15 │ │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│16 │ │ │ │ │ │
├──────────────────────────────────┴─────────┴──────────┴──────────┴──────────┴──────────┤
│ Esc Exit Copy Run View Save Info F1 Help │
└────────────────────────────────────────────────────────────────────────────────────────┘
Quote from: frktons on November 22, 2010, 01:48:02 AM
Instead of the:
SSE2 EQU ON
I should do something like:
CALL ChkSSE2
.if eax
SSE2 EQU ON
.else
SSE2 EQU OFF
.endif
I don't actually know if this syntax is correct. Is it?
No, you cannot mix equates and runtime code; equates are assembly-time constructs.
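A runtime version of the same idea might look like the sketch below: the result of ChkSSE2 is stored in a variable and tested with .IF, which generates real compare-and-jump code. bHasSSE2, AlgoSSE2 and AlgoX86 are hypothetical names used only for illustration.

```asm
.DATA?
bHasSSE2 dd ?                   ; 1 if SSE2 is available, else 0

.CODE
    call ChkSSE2                ; once, at program start
    mov  bHasSSE2, eax

    ; later, when choosing which routine to run:
    .IF bHasSSE2 != 0
        call AlgoSSE2           ; hypothetical SSE2 routine
    .ELSE
        call AlgoX86            ; hypothetical plain x86 routine
    .ENDIF
```

The EQU stays out of the picture entirely, because the decision is made while the program runs rather than when it is assembled.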
Quote from: dedndave on November 22, 2010, 01:49:55 AM
well - you are going to want to test for MMX, SSE, SSE2, SSE3 :bg
Dave, in the original "New TestBed" thread I already posted a fully functional, working manager of the algos, which excludes unsupported algos from testing at runtime. Its decision is based on the result returned by the CPUID code at the start of the program.
that's it Alex - you found it :U
as for mixing code and equates - you can handle that as an assembly-time conditional
but, that isn't how i would do it - lol
Quote from: dedndave on November 22, 2010, 01:49:55 AM
well - you are going to want to test for MMX, SSE, SSE2, SSE3 :bg
use the routine i posted earlier
store the result in a dword and use BT on that dword to see if a feature is present
I think the routine for CPU detection has already everything I need. At least, I think
Alex, the routine is yours, what do you think?
Quote from: Antariy on November 22, 2010, 01:50:49 AM
Quote from: dedndave on November 22, 2010, 01:47:35 AM
i know - but i was trying to give Frank something to think about, in terms of CPU/SSE :bdg
yes - that one works, Alex
Thanks! :U
I'm changed, in ProgData.inc:
AlgoDescSize DWORD SIZEOF AlgoDesc-1
:lol
Try to make this changement in original source, and tell results :U
This is logical as well. :U I'll change it, but the routine already works with Dave's workaround.
Maybe this will make it not necessary to use the workaround. Better if Dave tests it.
Quote from: hutch-- on November 22, 2010, 01:53:37 AM
Looks good here Frank.
Yes Steve. Thanks. It has some problem with oldies.
Frank
my work-around was merely a debugging tool used to help isolate the problem
Alex's fix is the right way to do it - and it works
remove my temporary code
Quote from: dedndave on November 22, 2010, 01:58:43 AM
Frank
my work-around was merely a debuging tool used to help isolate the problem
Alex's fix is the right way to do it - and it works
remove my temporary code
Already done. :P
I think the routine for CPU detection has already everything I need. At least, I think
Alex, the routine is yours, what do you think?
I have already said many times that my CPUID code returns the highest supported instruction set, which can be used to decide which algos to execute. Moreover, that "ultra hidden" feature is used in my "Algos Manager" :P :lol
Just read the comments in the AxCPUid code.
This is logical as well. :U I'll change it, but the routine already works with Dave's workaround.
Dave's workaround is as straightforward as possible, and it shows that the culprit is the zero byte, as mentioned. But it is rather slow: it searches the string and replaces the nulls. Some time ago you didn't want to search for "Here can be your advertisement" for my manager :lol
Maybe this will make it not necessary to use the workaround. Better if Dave tests it.
He already tested it, and it works.
OK guys. It was a nice debugging session. Now it is 3 o'clock in the morning. In few hours
I have to go somewhere else other than Virtual world. We'll carry on the good job another time.
Testbed is still work in progress. Many things will change along the way. And Alex Algo Manager
is still waiting to be implemented. :P
Stay tuned. I'll be back. :lol
Good night
Frank
as to the CPU features.....
i am not just talking about the code in the algos
i am seeing MMX code in the TestBed procs
my suggestion is - you may want to re-think using any code in the testbed
program that prevents it from being used on older CPU's
it is nice to be able to run test algos on older machines
good nite Frank :bg
Quote from: frktons on November 22, 2010, 02:07:21 AM
Testbed is still work in progress. Many things will change along the way. And Alex Algo Manager
is still waiting to be implemented. :P
Well, your testbed is updated so frequently that it is not possible to insert the Manager into each new release :P
I just threw those procedures out there so I wasn't making a suggestion without some code. It doesn't matter to me which code is used. I agree with Dave's last post too.
he'll get there, Alex :bg
good work :U
Quote from: dedndave on November 22, 2010, 02:09:05 AM
my suggestion is - you may want to re-think using any code in the testbed
program that prevents it from being used on older CPU's
it is nice to be able to run test algos on older machines
MMX code can run on a 1997 Pentium MMX, I guess; that is an old enough CPU :eek :lol
Initially Frank wanted to write the algos in SSE2, but I dissuaded him from this, with difficulty. :lol
yah, Greg - i wasn't aware that Alex already had code written
i am sure they will make it fly right :P
Alex - yah - come to think of it, my oldest pentium machine is a P1-MMX (one of the first MMX, i guess)
it is suitable as a win 98 test machine - not really enough guts for XP
200 MHz - i run it at 225 :lol
Quote from: dedndave on November 22, 2010, 02:14:26 AM
Alex - yah - come to think of it, my oldest pentium machine is a P1-MMX (one of the first MMX, i guess)
it is suitible as a win 98 test machine - not really enough guts for XP
200 MHz - i run it at 225 :lol
Yes, for XP a PII is eno-o-o-o-o-o-o-ough... :bg
Good old machines, when the M/B had switches for the multiplier :wink
But I guess you overclocked it via the system bus?
jumpers
no book - i found them by reading the silkscreen on the m/b :P
Quote from: dedndave on November 22, 2010, 02:22:22 AM
no book - i found them by reading the silkscreen on the m/b :P
:lol
I think MMX, SSE and SSE2 code is OK in the TestBed as long as you test for it and provide alternative code. Which is what Frank was talking about doing.
Quote from: GregL on November 22, 2010, 02:32:12 AM
I think MMX, SSE and SSE2 code is OK in the TestBed as long as you test for it and provide alternative code. Which is what Frank was talking about doing.
That's right, of course!
But for something like a testbed, I guess it would be better to provide one code path; there is no need for different paths with runtime switching. That would make the testbed's code rather tangled for no reason. Speed isn't critical in the testbed itself, which runs tests lasting millions of clocks.
That's why I suggested that Frank use MMX code instead of SSE2 code. He strongly wants to make the testbed very fast, and I suggest favoring better compatibility. :lol
Quote from: dedndave on November 22, 2010, 02:09:05 AM
as to the CPU features.....
i am not just talking about the code in the algos
i am seeing MMX code in the TestBed procs
my suggestion is - you may want to re-think using any code in the testbed
program that prevents it from being used on older CPU's
it is nice to be able to run test algos on older machines
good nite Frank :bg
The Testbed is also my learning project. This is one of the reasons it changes so often :P
The conditional assembly will be done the reverse way:
I start with the minimum CPU requirements, and if the user wants to switch to a faster
internal PROC, he has to change the switch(es), because he should know what machine
he is running. The default settings should be "old enough compatible". :lol
And if they aren't, somebody will say so. :P
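As a sketch of that default-to-oldest approach (USE_SSE2 is a hypothetical switch name, not the one used in TestBed.asm): the switch defaults to 0, and the faster block is assembled only when the user changes it. The example reuses the 8-byte copy from CopyTextScreen mentioned earlier in the thread and assumes EAX and EDX are free.

```asm
USE_SSE2 EQU 0                  ; default: assemble the plain x86 path

IF USE_SSE2
    ; SSE2 8-byte copy (needs .686 and .XMM)
    movq  xmm0, qword ptr [esi]
    movq  qword ptr [edi], xmm0
ELSE
    ; same copy with i386 instructions only
    mov   eax, dword ptr [esi]
    mov   edx, dword ptr [esi+4]
    mov   dword ptr [edi], eax
    mov   dword ptr [edi+4], edx
ENDIF
```

Only one of the two blocks ends up in the EXE, so a build made with the default setting contains no SSE2 instructions at all.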
Quote from: GregL on November 22, 2010, 02:32:12 AM
I think MMX, SSE and SSE2 code is OK in the TestBed as long as you test for it and provide alternative code.
Which is what Frank was talking about doing.
Yes, this is my project. I'll learn many ways of doing the same thing, and give the prog
a sort of flexibility. Alex is rightly proud of his flexible "Algo Manager", but I have not had time
to rearrange the code in order to use it. The future is ahead, by the way.
Quote from: Antariy on November 22, 2010, 02:38:20 AM
But such thing as testbed, I guess - would be better to provide one code path, no need in the different patchs with runtime switching. This will make testbed's code entangled enough, without any reason. Speed isn't critical in the testbed, which runs tests for about of millions clocks.
That's because I have suggested to Frank to use MMX code instead of SSE2 code.
He strongly wants to do testbed very fast, and I suggest to do this in way of better compatibility. :lol
You are right, my friend. But I learn more by making more errors than fewer. :lol
Otherwise I would have to use only old code. This way I can also use more recent code.
MMX is not that young; nobody has noticed it in all the tests done.
SSE2/3/4 are another matter. For those, it is better to have switches, so that
people who want to try them are able to do so.
Because I'll be away most of the week, I leave you the latest version, corrected with your
help, which should be able to run on any [I'm exaggerating] PC with a Pentium and Windows
98 upwards. Sorry for not being able to make it DOS compatible and 286 standard :lol :lol :lol :lol
Frank
Tested with Win XP SP3:
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 11:29 GMT]─┐
│OS : Microsoft Windows XP Professional Service Pack 3 (build 2600) │
│CPU : Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 35.571 │ 35.407 │ 35.333 │ 35.281 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 34.130 │ 34.297 │ 34.346 │ 34.246 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 41.227 │ 41.215 │ 41.189 │ 41.197 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 2.952 │ 2.963 │ 2.990 │ 3.030 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 1.931 │ 1.975 │ 1.932 │ 2.012 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 6.262 │ 6.274 │ 6.300 │ 6.280 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
kinda strange, when I re-Run the benchmark, timings differ by 1 to 4 seconds, is this considered accurate ?
Quote from: ramguru on November 22, 2010, 01:47:56 PM
kinda strange, when I re-Run the benchmark, timings differ by 1 to 4 seconds, is this considered accurate ?
What do you mean by 1-4 seconds? The time is measured in CPU cycles, in billionths of a second.
If the tests differ by 1-4 cycles it is quite normal.
sry, the dot confused me, I meant the number before dot 35.571
I guess that would be 1000 to 4000 cycles, the precision ..
Quote from: ramguru on November 22, 2010, 03:01:36 PM
sry, the dot confused me, I meant the number before dot 35.571
I guess that would be 1000 to 4000 cycles, the precision ..
1,000 - 4,000 cycles are next to nothing, about a millionth of a second. Quite normal I guess.
no - not normal
it helps if you restrict execution to a single core during the tests
INVOKE GetCurrentProcess
INVOKE SetProcessAffinityMask,eax,1
you may want to use GetProcessAffinityMask to restore it when an algo is not running
also - not sure which priority you are running with - HIGH_PRIORITY_CLASS usually works well
Quote from: dedndave on November 22, 2010, 03:16:48 PM
no - not normal
it helps if you restrict execution to a single core during the tests
INVOKE GetCurrentProcess
INVOKE SetProcessAffinityMask,eax,1
you may want to use GetProcessAffinityMask to restore it when an algo is not running
also - not sure which priority you are running with - HIGH_PRIORITY_CLASS usually works well
I'm using Michael Timing Macros, and the program uses:
counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
I'm not sure about:
INVOKE GetCurrentProcess
INVOKE SetProcessAffinityMask,eax,1
You should know better than me what the default settings of Michael Macros are.
well - Michael's macros don't mess with the affinity mask
i can tell you this from experience :P
my machine is known to jump around on timing numbers
i think it has something to do with Media Center
Quote from: dedndave on November 22, 2010, 07:04:55 PM
well - Michael's macros don't mess with the affinity mask
i can tell you this from experience :P
my machine is known to jump around on timing numbers
i think it has something to do with Media Center
What if I leave the default settings in Michael Macros?
Quote from: frktons on November 22, 2010, 03:07:10 PM
Quote from: ramguru on November 22, 2010, 03:01:36 PM
sry, the dot confused me, I meant the number before dot 35.571
I guess that would be 1000 to 4000 cycles, the precision ..
1,000 - 4,000 cycles are about nothing, like millionth os seconds. Quite normal I guess.
For 1000 cycles, a difference of 4 cycles is a 0.4% deviation. Almost all instruments allow a tolerance of +/- 3%. So the timings are quite stable :lol
Quote from: Antariy on November 22, 2010, 10:00:31 PM
Quote from: frktons on November 22, 2010, 03:07:10 PM
Quote from: ramguru on November 22, 2010, 03:01:36 PM
sry, the dot confused me, I meant the number before dot 35.571
I guess that would be 1000 to 4000 cycles, the precision ..
1,000 - 4,000 cycles are about nothing, like millionth os seconds. Quite normal I guess.
For 1000 cycles, 4 cycles of difference is 0.4% of flow. Almost all instruments allow flow +/- 3%. So, timings is quite stable :lol
I told them :lol
the affinity mask has nothing to do with Michael's macros :bg
you are simply selecting a single core to run the test
Quote from: frktons on November 22, 2010, 08:30:02 AM
Alex is rightly proud of his flexible "Algo Manager", but I had not time to rearrange the code in order to use it.
"Proud" is too strong a term :P
Well, if I incorporate the Manager into your latest release, and *if you use this tweaked release* for further development, that will not be hard.
It's just impossible to add the manager to each new release :P, so if you stop for a moment and accept the manager as "standard", things will be simpler. Since at the moment the Manager is not "standard" (I have inserted it manually), it is not simple to do this at each step of development.
Frank, it would be better to select a single core for the tests.
If the testbed's thread were switched to another core after the first RDTSC, the timings would not be fair, because each core has its own cycle counter.
Quote from: Antariy on November 22, 2010, 10:05:06 PM
Well, if I'm incorporate the Manager into your latest release, and *if you will use this tweaked release*
for further development - that's will not hard.
It's just impossible to add manager into each new release :P, so, if you will stop for an moment,
and accept a manager as "standard" - then things will be simpler. Since at current moment Manager
is not "standard" (I have insert in manually) - that's not simple to made this for each step of development.
I'll be far from my pc for 4-5 days, so maybe this is the right moment to do it, if you want.
After that, I'll make new improvements using your Algo Manager as the standard. It'll
be easy when I have the complete program, with all the algos [6 up to now] already transformed
to work with the Manager, to add other PROCs. options, and algo as well. :U
Please leave the columns as they are now, don't change them :eek
Quote from: Antariy on November 22, 2010, 10:11:50 PM
Frank, would be better to implement selection of the one core to test.
If thread of the testbed would be switched to other core after first RDTSC, the timings will not fair,
because cores have different counters of clocks.
What and where should I insert the code?
I'll be far from my pc for 4-5 days, so maybe this is the right moment to do it, if you want.
Maybe :P
with all the algos [6 up to now] already transformed to work with the Manager
My latest release (http://www.masm32.com/board/index.php?topic=14871.msg125063#msg125063) contains all 16 files, which are ready for algo inclusion and testing :lol
Please leave the columns as they are now, don't change them :eek
I didn't change them in the previous insertion, what is up??? :eek
What and where should I insert the code?
A few posts above, Dave posted code with "GetCurrentProcess"; that is it.
selecting a single core is simple
it is also a good idea to do this when reading CPUID values to identify a processor
a good idea to:
1) select a single core
2) ensure that CPUID is supported
3) ensure that RDTSC is supported
that way, you know that Michael's timing macro will work on the machine
i assume that Alex's code verifies that
        .DATA?
hProc   dd ?                    ;current process handle
dwPMask dd ?                    ;process affinity mask
dwSMask dd ?                    ;system affinity mask
        .CODE
;------------------------------------------------------------------------------
;initialization code section
;------------------------------------------------------------------------------
;get and save current process handle
        INVOKE  GetCurrentProcess
        mov     hProc,eax
;get and save system and process affinity masks
        INVOKE  GetProcessAffinityMask,hProc,offset dwPMask,offset dwSMask
;------------------------------------------------------------------------------
;------------------------------------------------------------------------------
;run timing test code section
;------------------------------------------------------------------------------
;restrict execution to a single core (mask = 1 selects core 0)
        INVOKE  SetProcessAffinityMask,hProc,1
;
;timing test code goes here
;
;restore original process affinity mask
        INVOKE  SetProcessAffinityMask,hProc,dwPMask
;------------------------------------------------------------------------------
This is the display my version produces:
┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 22:28 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat │ 95 │ 44.362 │ 44.218 │ 44.254 │ 44.123 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat │ 65 │ 44.190 │ 43.981 │ 44.181 │ 43.973 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat │ 73 │ 50.618 │ 50.611 │ 50.737 │ 50.675 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack │ 120 │ 3.068 │ 3.068 │ 3.053 │ 3.063 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL │ 157 │ 2.024 │ 2.006 │ 1.970 │ 1.972 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo │ 159 │ 5.563 │ 5.623 │ 5.565 │ 5.539 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
This is your version:
┌─────────────────────────────────────────────────────────────[22 Nov 2010 at 22:29 GMT]─┐
│OS : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600) │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3 │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│ Algorithm notes │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX │ 55 │ 5.415 │ 5.409 │ 5.411 │ 5.407 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / SSE2 │ 45 │ 4.465 │ 4.464 │ 4.464 │ 4.464 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Here can be your advertisement │ 0 │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Here can be your advertisement │ 0 │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Here can be your advertisement │ 0 │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Here can be your advertisement │ 0 │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│07 Here can be your advertisement │ 0 │ │ │ │ │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
look carefully and see what I mean.
I would like to know where to put the :
INVOKE GetCurrentProcess
INVOKE SetProcessAffinityMask,eax,1
Quote from: dedndave on November 22, 2010, 10:29:30 PM
that way, you know that Michael's timing macro will work on the machine
i assume that Alex's code verifies that
Yes, I check for the presence of CPUID. All the other code in the CPUid routine is i386.
About restoring the affinity: I'm not sure this needs to be done, since the program will exit. Just set the affinity once at the start, with no restoring after the tests.
This is the display my version produces:
...
This is your version:
...
look carefully and see what I mean.
Well, I haven't inserted algos for about a week; I just hadn't seen that the Manager was appreciated or worth the effort. No feedback, no bothering :green2
I would like to know where to put the :
At the very start of the program.
no - the program will not exit
it simply allows use of all cores during execution of the rest of the TestBed code
Michael's macros use RDTSC
if it is executed on a machine that does not support RDTSC, it will hang
Frank - maybe you missed this post...
http://www.masm32.com/board/index.php?topic=15365.msg126090#msg126090
RDTSC support may be verified by reading
CPUID with EAX = 1
EDX, bit 4
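A minimal sketch of that check, modeled on the ChkSSE2 procedure posted earlier in the thread. ChkTSC is a hypothetical name, and the code assumes ChkCPUID has already confirmed that the CPUID instruction exists.

```asm
ChkTSC PROC USES ebx            ; CPUID overwrites EBX, so preserve it
    ; returns: TRUE (1) if RDTSC is supported, FALSE (0) if not
    xor  eax, eax
    cpuid                       ; EAX = highest supported standard function
    test eax, eax
    jz   @F                     ; function 1 not available, EAX is already 0
    mov  eax, 1
    cpuid
    xor  eax, eax               ; set up for return of FALSE
    bt   edx, 4                 ; TSC feature bit
    jnc  @F                     ; not set, return FALSE
    mov  eax, 1                 ; return TRUE
@@:
    ret
ChkTSC ENDP
```

Calling it once before the first timing macro would let the testbed refuse to run tests instead of faulting on a CPU without RDTSC.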
Quote from: dedndave on November 22, 2010, 10:38:17 PM
no - the program will not exit
it simply allows use of all cores during execution of the rest of the TestBed code
Michael's macros use RDTSC
if it is executed on a machine that does not support RDTSC, it will hang
The program will run forever? :P No, eventually it will exit, and running the post-test code on the power of one core is quite possible :lol
I honestly hope that nobody will run the TestBed on an i486 machine :green2
Quote from: Antariy on November 22, 2010, 10:36:55 PM
This is the display my version produces:
...
This is your version:
...
look carefully and see what I mean.
Well, I haven't inserted algos for about a week. I just haven't seen that the Manager is appreciated or worth the effort: no feedback, no bother :green2
I would like to know where to put the :
INVOKE GetCurrentProcess
INVOKE SetProcessAffinityMask,eax,1
At the very start of the program.
I told you that your manager will be the standard. I don't care about too much feed-back, and we had
already a lot of feed-back.
Dave is giving us a lot of feed-back, and others have done as well: oex, ramguru, Michaelw, clive, GregL
Hutch, jj2007 and so on.
I'll leave the Testbed as it is in your hands. Modify the Manager, and leave the columns as they are,
eliminate those rows about "advertisement", and make any optimization you want.
I'll take back control of Testbed in about 2 weeks. During this time do whatever you like with it. :lol
Quote from: dedndave on November 22, 2010, 10:41:38 PM
RDTSC support may be verified by reading
CPUID with EAX = 1
EDX, bit 4
Pentium has it, AFAIK
I told you that your manager will be the standard. I don't care about too much feed-back, and we had
already a lot of feed-back...
...feedback not about manager :lol
I'll leave the Testbed as it is in your hands. Modify the Manager, and leave the columns as they are,
eliminate those rows about "advertisement", and make any optimization you want.
Why do you dislike the "advertisement" strings? They are very funny, in the style of the modern world :green2
Quote from: dedndave on November 22, 2010, 10:39:10 PM
Frank - maybe you missed this post...
http://www.masm32.com/board/index.php?topic=15365.msg126090#msg126090
I didn't. Read what I posted. :U
Quote from: Antariy on November 22, 2010, 10:46:44 PM
I told you that your manager will be the standard. I don't care about too much feed-back, and we had
already a lot of feed-back...
...feedback not about manager :lol
I'll leave the Testbed as it is in your hands. Modify the Manager, and leave the columns as they are,
eliminate those rows about "advertisement", and make any optimization you want.
Why do you dislike the "advertisement" strings? They are very funny, in the style of the modern world :green2
Ask Dave what he thinks about advertisement. I'm preparing the luggage, because tomorrow I have a flight
to catch. No Testbed development for about 2 weeks. Have a nice optimization trip with Dave. I think he
could be a good advisor. :P But don't believe everything he says. :lol
Quote from: frktons on November 22, 2010, 10:50:02 PM
Ask Dave what he thinks about advertisement.
Dave, what do you think about the Manager's default description for the non-existent algos? Did you run it? Then you should have seen an image just like on TV :bg
yes - i was waiting to hear...
"Oh, I wish I were an Oscar Mayer hot dog
for that is truly what I want to be......."
listen guys....
CPUID, RDTSC, affinity....
i am just giving my best knowledge and info
you can use it or ignore it :bg
there are other ways to acquire more stable result numbers
but, if you are not interested in the basics, i know you won't be interested in those
Quote from: dedndave on November 22, 2010, 10:54:31 PM
yes - i was waiting to hear...
"Oh, I wish I were an Oscar Mayer hot dog
for that is truly what I want to be......."
So, is it quite useful as a description, or is it annoying? The second is preferable, and if it is annoying, then so be it :green2
Quote from: dedndave on November 22, 2010, 10:54:31 PM
yes - i was waiting to hear...
"Oh, I wish I were an Oscar Mayer hot dog
for that is truly what I want to be......."
In other words he thinks you can throw that rubbish where you like, but not in the Testbed. :lol
Frank,
Make sure you enjoy yourself while you are away.
that is a song, or as we call it, a "jingle"
a jingle is a song from an advertisement
you have never heard this song ???? :P
watch the movie "Demolition Man"
Quote from: dedndave on November 22, 2010, 10:54:31 PM
listen guys....
CPUID, RDTSC, affinity....
i am just giving my best knowledge and info
you can use it or ignore it :bg
there are other ways to acquire more stable result numbers
but, if you are not interested in the basics, i know you won't be interested in those
We are interested, Dave. Just give Alex time to understand; his English is not
like his ASM, you know?
Quote from: hutch-- on November 22, 2010, 10:56:38 PM
Frank,
Make sure you enjoy yourself while you are away.
I promise. :P
Quote from: frktons on November 22, 2010, 10:56:33 PM
In other words he thinks you can throw that rubbish where you like, but not in the Testbed. :lol
That's a very interesting, funny and simple thing, well formatted for the testbed, not rubbish :(
Quote from: Antariy on November 22, 2010, 10:58:58 PM
That's a very interesting, funny and simple thing, well formatted for the testbed, not rubbish :(
Take care to respect the columns and the number alignment, and don't worry too much about
what you think would be funny to put into the display. :U
You can have your own version with advertisement and songs. Not the Testbed that will
be used in the forum.
Quote from: dedndave on November 22, 2010, 10:54:31 PM
listen guys....
CPUID, RDTSC, affinity....
i am just giving my best knowledge and info
you can use it or ignore it :bg
Dave, I appreciate your suggestion about affinity, it is right :U Just read some of my previous posts :P
But I really guess that nobody will run the testbed on an i486... So we can safely decide (skipping the plain Pentium) on the Pentium MMX as the minimum test machine. Or not? :eek
Quote from: Antariy on November 22, 2010, 11:02:41 PM
Quote from: dedndave on November 22, 2010, 10:54:31 PM
listen guys....
CPUID, RDTSC, affinity....
i am just giving my best knowledge and info
you can use it or ignore it :bg
Dave, I appreciate your suggestion about affinity, it is right :U Just read some of my previous posts :P
But I really guess that nobody will run the testbed on an i486... So we can safely decide (skipping the plain Pentium) on the Pentium MMX as the minimum test machine. Or not? :eek
I agree :U
Quote from: frktons on November 22, 2010, 11:01:21 PM
You can have your own version with advertisement and songs.
Yeah! Playback of some MIDI or MP3 would be very nice :P
Quote from: Antariy on November 22, 2010, 11:04:22 PM
Quote from: frktons on November 22, 2010, 11:01:21 PM
You can have your own version with advertisement and songs.
Yeah! Playback of some MIDI or MP3 would be very nice :P
We can implement a "P"lay key to choose a midi or mp3 to play while doing the tests, but the tests
will be a little less precise then. :lol
Quote from: frktons on November 22, 2010, 10:58:52 PM
Quote from: hutch-- on November 22, 2010, 10:56:38 PM
Frank,
Make sure you enjoy yourself while you are away.
I promise. :P
Don't break off a small piece of the Colosseum as a souvenir :P :lol
Quote from: frktons on November 22, 2010, 11:08:45 PM
We can implement a "P"lay key to choose a midi or mp3 to play while doing the tests, but the tests will be a little less precise then. :lol
Or just test the MP3 decoding algos only :lol
Quote from: Antariy on November 22, 2010, 11:09:02 PM
Don't break off a small piece of the Colosseum as a souvenir :P :lol
Not really. I have some friends to meet, and it will be better than stones. :wink
Quote from: Antariy on November 22, 2010, 11:10:10 PM
Quote from: frktons on November 22, 2010, 11:08:45 PM
We can implement a "P"lay key to choose a midi or mp3 to play while doing the tests, but the tests will be a little less precise then. :lol
Or just test the MP3 decoding algos only :lol
That's good. :U
Quote from: frktons on November 22, 2010, 11:10:57 PM
Not really. I have some friends to meet, and it will be better than stones. :wink
Well, with the help of friends you can take a very big stone from the Colosseum as a souvenir, then. :lol
Quote from: Antariy on November 22, 2010, 11:15:08 PM
Well, with the help of friends you can take a very big stone from the Colosseum as a souvenir, then. :lol
We can transport the Colosseum to my town if we want. :P
Quote from: frktons on November 22, 2010, 11:21:20 PM
We can transport the Colosseum to my town if we want. :P
:lol
By the way, I've already written on paper the code to highlight the best
performing algo row. I'm afraid I can't send the image to you because I don't have a scanner
at home.
When I'm back home, if you have inserted the Manager, I'll implement the highlighting
code, assuming I'm able to grasp your code and your comments in Cyrillic English :lol
Quote from: frktons on November 22, 2010, 11:34:18 PM
your comments in Cyrillic English :lol
:green2 :green2 :green2 :green2 :green2 :green2 :green2 :green2 :green2 :green2 :green2
Maybe it is easier if you write comments in Russian and I google translate them or ask a Russian
friend of mine to do it for me. :dazzled:
Jochen lives over there
someday, if i get to Italy, i may stop and visit him :P
(http://easu.jrc.ec.europa.eu/eas/downloads/images/Staff_Photos_EAS_Images_jesinghaus.jpg)
http://easu.jrc.ec.europa.eu/eas/sipa/staff/index.htm
If I see somebody around who looks like Jochen, I'll offer him a coffee. :bg
Quote from: dedndave on November 22, 2010, 11:48:08 PM
Jochen lives over there
someday, if i get to Italy, i may stop and visit him :P
Give Dave's photo, Dave's photo - scan of the public :P
:lol is he in charge of the website?
http://easu.jrc.ec.europa.eu/eas/sipa/staff/jochen.jesinghaus'at'jrc.ec.europa.eu
Quote from: oex on November 22, 2010, 11:59:48 PM
:lol is he in charge of the website?
http://easu.jrc.ec.europa.eu/eas/sipa/staff/jochen.jesinghaus'at'jrc.ec.europa.eu
The link does not work :P
Give Dave's photo... :P
that is the mail link
they do that to reduce spam :U
http://www.qrz.com/db/K7NL
click on the pic to enlarge - i assume no responsibility if it breaks your monitor
:P
I think this is also Dave :lol
(http://www.hereford.tv/dave.png)
yes - that is me as Count Dracula
as you can see, i now have a beard and glasses :bg
(and thick eyebrows)
.... and pointy ears :lol
Much better picture I think :lol.... I'm a little concerned as to what happened to your jaw though....
It's amazing the things that come back to 'haunt' you, isn't it :lol
I have no picture; fortunately, I am an AI entity lost in a repetitive loop on this forum :bg
Quote from: dedndave on November 23, 2010, 12:05:33 AM
http://www.qrz.com/db/K7NL
click on the pic to enlarge - i assume no responsibility if it breaks your monitor
:P
Quite scary indeed :lol
Quote from: oex on November 23, 2010, 12:08:18 AM
I think this is also Dave :lol
(http://www.hereford.tv/dave.png)
Is that Dave simulating Dracula, or is it Dracula simulating Dave??? :eek
Just kidding :bg
that is a Dracula emulation - not simulation :lol
Quote from: dedndave on November 23, 2010, 12:58:15 AM
that is a Dracula emulation - not simulation :lol
:lol
but, you are missing the other 2 Daves :P
(http://img98.imageshack.us/img98/8356/monsterrap.jpg)
doesn't Zara make a kick-ass witch !!!
(http://www.nestreetriders.com/forum/images/smilies/threadjacked.gif)
Quote from: dedndave on November 23, 2010, 01:03:06 AM
but, you are missing the other 2 Daves :P
(http://img98.imageshack.us/img98/8356/monsterrap.jpg)
doesn't Zara make a kick-ass witch !!!
:U
Algo Manager : The Return: http://www.masm32.com/board/index.php?topic=14871.msg126258#msg126258
:P
:bg
Quote from: dedndave on November 23, 2010, 01:03:06 AM
(http://www.nestreetriders.com/forum/images/smilies/threadjacked.gif)
By the way - this funny image is not optimized yet. It could have a 2 times smaller code size, really :P
i think it was an animated gif that someone screwed up - lol
speaking of icons, here is one for the testbed program :P
(http://img202.imageshack.us/img202/8241/tbicon.png)
what a handsome devil !
Quote from: dedndave on November 24, 2010, 10:26:28 PM
i think it was an animated gif that someone screwed up - lol
speaking of icons, here is one for the testbed program :P
(http://img202.imageshack.us/img202/8241/tbicon.png)
what a handsome devil
No, it is just a one-frame GIF file, which could be at least 2.14 times smaller with no difference visible to the human eye.
The icon is good enough, but too dark... The brightness should be adjusted a bit :P
48x48 would be better :P
(http://img256.imageshack.us/img256/3863/clock1296128.png)
Quote from: dedndave on November 24, 2010, 10:44:52 PM
(http://img256.imageshack.us/img256/3863/clock1296128.png)
This is not Dave, apparently :lol
notice how the bg is transparent ?
Quote from: dedndave on November 24, 2010, 10:49:24 PM
notice how the bg is transparent ?
Yes, I have noticed that it is not transparent. It's just that my current browser does not support alpha-channel transparency by default; some tricks with JS are needed to display transparency via the alpha channel. With my current browser the image is not transparent. I know the background should be transparent, and the clock has a shadow, which is smoothed via the alpha channel, too.
P.S. Firefox supports the alpha channel, which is the reason why you see transparency :P