szLen optimize...

Started by denise_amiga, May 31, 2005, 07:42:44 PM

jj2007

Quote from: tetsu-jp on April 03, 2009, 11:22:04 PM
this is the result for REP SCASD, cache=0

You might have a look at the second line of your screenshot.

tetsu-jp

yes i know, error message. i have not modified the macro, which is generating the string sequence.
so i think it is just a buggy message.


align 16
db 11 dup (0) ; misaligned by 11
szTest_4k db txt50

REPEAT 80-1
db txt50
ENDM
db txt50, "4096 bytes************************************", 0,0,0,0,0

align 16
szTest_16k db txt50


still should be 4K?!?

**************

I was thinking of extending the software:

-use random strings, random length, random location, and get some more parameters for that.
the CPU will behave differently than with one and the same string, same length, again and again.

-link with a VB program, and put the results in a database!
then it can be compared using EXCEL.

-provide a web service, to upload results!
then after a while, numerous CPUs can be compared.

-add more algorithms: memory copy, search for specific pattern

-64 bit support
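
A minimal sketch of the first item (random strings, random length, random location), written in C since the list would be generated outside the assembly program anyway. Every name here (`rng_t`, `rng_next`, `make_random_string`) is hypothetical and not part of the existing test bed; the small linear congruential generator is only there to make runs repeatable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a tiny repeatable pseudo-random generator. */
typedef struct { uint32_t state; } rng_t;

static uint32_t rng_next(rng_t *r)
{
    r->state = r->state * 1664525u + 1013904223u;  /* common LCG constants */
    return r->state >> 8;                          /* drop the weak low bits */
}

/* Writes one random lowercase string into pool at a random byte offset
   (0..max_off) with a random length (0..max_len). Returns the start of
   the string and stores its true length in *len_out, so a strlen routine
   under test can be checked against it.
   Assumption: pool_size >= max_off + max_len + 1. */
static char *make_random_string(rng_t *r, char *pool, size_t pool_size,
                                size_t max_len, size_t max_off,
                                size_t *len_out)
{
    assert(pool_size >= max_off + max_len + 1);

    size_t off = rng_next(r) % (max_off + 1);
    size_t len = rng_next(r) % (max_len + 1);
    char *s = pool + off;

    for (size_t i = 0; i < len; i++)
        s[i] = (char)('a' + rng_next(r) % 26);
    s[len] = '\0';

    *len_out = len;
    return s;
}
```

With `max_off` set to 15, the strings land on every possible misalignment relative to a 16-byte boundary, which is the case the fixed "misaligned by 11" buffer only covers once.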

I can do all this, but...as you write, there are members with superior knowledge.
so, why hijack the project, and steal the show?

I mean, i just made requests, and suggestions.
the question i had was "HOW MUCH FASTER compared to SCAS".

and, yes, why not waste a few bytes, and use longword instructions?
NP if you have Gbytes of memory.

so what do you think about the extensions?

for instance, you could generate a list in C++/C#/VB, and pass it to the assembly program.
this would be "real world data", not just a static string.

I never wrote my solution is superior, or i am the better wizard.

just, there are features missing in this software, i just wrote a few of them.

jj2007

Quote from: tetsu-jp on April 04, 2009, 12:43:48 PM
yes i know, error message. i have not modified the macro, which is generating the string sequence.
so i think it is just a buggy message.


Njet. The message is correct. It's your code that is buggy.

tetsu-jp

unlikely if you compare the relation to 16K (which is about 1:4)

anyway, i will investigate later on today. the source is not that difficult, it's about the level i can follow without major problems.

and the CPU detect- I think I'll borrow that for my own projects- and give a copyright reference.
no need to re-invent such a code...

so i have some fun...

Uhm...i can copy strings (in C)


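StrLenS proc src:DWORD

    push esi
    mov esi,[esp+8]
    xor ecx,ecx             ; ecx = number of 16-byte blocks scanned
    pxor mm0,mm0            ; mm0 = 0, the comparison value
    xor ebx,ebx

_reloop:
    movq mm1,[esi]          ; first 8 bytes of the block
    pcmpeqd mm1,mm0         ; dword compare: a lane is set only if all 4 bytes are zero
    inc ecx
    movq mm2,[esi+8]        ; second 8 bytes of the block
    pcmpeqd mm2,mm0
    lea esi,[esi+16]
    por mm1, mm2
    pmovmskb eax, mm1       ; one mask bit per byte
    test eax,eax
    jz _reloop              ; no zero dword found, next block

    clc
    shl ecx,4               ; eax = blocks * 16, not the exact length
    mov eax,ecx
    pop esi

;   push eax
;   push edi
;   mov edi,[esp+12]
;   add edi,eax
;   mov ecx,32
;   xor eax,eax
;   std
;   repe scasb
;   mov eax,32
;   sub eax,ecx
;   mov ecx,eax
;   pop edi
;
;   cld
;   pop eax
;   sub eax,ecx

    ret 4
StrLenS endp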


i wrote this (using 64bit MMX).
it's a little faster than Agner Fog's stuff.
but i can not fix the string length stuff correctly!
at least, not today.

so you see, i have examined your codes a little.
I've just downloaded the manuals with 128bit instructions a few days ago.
the aligned loads (movdqa) need 16-byte-aligned data, or an exception will happen.

my idea is to use 64-bit loads, not care about alignment at all (maybe enforce it in software anyway),
and fix up the length via SCAS.
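
For illustration, the same idea (scan in 8-byte blocks regardless of alignment, then fix up the exact length bytewise) can be sketched in portable C. This is not the MMX routine above: `strlen_block8` is a hypothetical name, the zero-byte test uses a well-known bit trick instead of `pcmpeqb`, and a plain byte loop stands in for SCAS. It assumes the buffer stays readable up to 8 bytes past the block that holds the terminator, as is the case for padded test strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the thread.
   Assumption: the memory after the string is readable up to the end of
   the 8-byte block that contains the terminator (padded buffers). */
static size_t strlen_block8(const char *s)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    size_t n = 0;

    assert(s != NULL);

    /* Coarse pass: advance 8 bytes at a time until a block has a zero byte. */
    for (;;) {
        uint64_t w;
        memcpy(&w, s + n, sizeof w);      /* unaligned 8-byte load */
        if ((w - ones) & ~w & highs)      /* nonzero iff some byte of w is 0 */
            break;
        n += 8;
    }

    /* Fix-up pass: find the exact terminator inside that block. */
    while (s[n] != '\0')
        n++;

    return n;
}
```

The coarse pass only says which block holds the terminator, so the block count times the block size is an upper bound; the fix-up pass is what turns it into the exact length. The test has to be per byte: a per-dword compare would skip a terminator whose neighbours in the same dword are nonzero.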

short strings can be copied to aligned space.
long strings- unaligned, and determine their length? i can not think of such a case.

i understand your efforts are to align the data, and also to test byte by byte.

is such code really required? i try to think of a real-world software, which has large unaligned strings.



now, i have made modifications...
can't get the correct string length. what's wrong with the code?

it works using 64bit MMX, not 128bits.
so it's hard to beat the 128-bit SSE2 code!

also i think the Genesys is not active at all- and there won't be an explanation what's wrong with the string length.

the code at MyTest is strange (patching bytes). can someone explain? i tried an hour to determine the extra bytes.
