a counter of lines in mmx instructions

Started by ToutEnMasm, March 18, 2009, 09:20:18 AM

Mark Jones

Interesting... WinXP 32-bit.


AMD Athlon(tm) 64 X2 Dual Core Processor 4000+ (SSE3)
Tests for correctness - 100 / 100 / 11 lines
expected, first string 5-byte misaligned:

jj2007=         100 / 100 / 11 lines
Lingo=          100 / 100 / 11 lines
Lingo2=         100 / 100 / 11 lines
ToutEnMasm=     100 / 100 / 11 lines

Codesizes:
getlinesJJ =            153
getlines Lingo =        217
getlines2 Lingo2 =      278
CompteurLignes =        238

Counting lines of \masm32\include\windows.inc:

getlinesJJ: (jj2007)
1087    kilocycles for 22274 lines, 849788 bytes

getlines (Lingo):
955     kilocycles for 22274 lines, 849788 bytes

getlines2 (Lingo2):
843     kilocycles for 22274 lines, 849788 bytes

CompteurLignes: (ToutEnMasm)
1143    kilocycles for 22274 lines, 849788 bytes


Ahh yes LAMPs, was just going to say that there was a name for the ratio of code size to execution speed. :bg
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

Antariy

Oh, I see this is the place where bloated algos are being tested?
Can I join the party with a fat one of mine?

Nothing very interesting, it just seems that the algo does not make funny errors on badly formatted strings :green2

So, I ask you to test this. Remember that the program should be placed on the drive where MASM32 is installed, because the test file is one of the include files.

The difference in this algo is that it takes the line delimiter as its third parameter, so it supports Windows/DOS/Unix text files. Also, a lot of the work in the algo is just alignment stuff and multiple checks so that it works well with "badly formatted" strings. At least, it should work properly with badly formatted strings... maybe... :green2
So, don't swear too much at the spaghetti-style code :P

Here are the timings:
Quote
Tests for correctness - 100 / 100 / 11 lines
expected, first string 5-byte misaligned:

jj2007=         99 / 100 / 11 lines
Lingo=          99 / 100 / 11 lines
Lingo2=         100 / 100 / 11 lines
ToutEnMasm=     100 / 100 / 11 lines
AxCountLines=   100 / 100 / 11 lines

Codesizes:
getlinesJJ =            153
getlines Lingo =        217
getlines Lingo2=        278
CompteurLignes =        238
AxCountLines =          262

Counting lines of \masm32\include\winextra.inc:

getlinesJJ: (jj2007)
13768 kiloLAMPs, 1113 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
15942 kiloLAMPs, 1082 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
15038 kiloLAMPs, 901 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
19351 kiloLAMPs, 1254 kilocycles for 20025 lines, 807877 bytes

AxCountLines: 13991 kiloLAMPs, 864 kilocycles for 20025 lines, 807877 bytes

LAMPs = Lean And Mean Points = cycles * sqrt(codesize)

Program and sources in attached archive.

I used the old testbed, since Jochen did a big job on the correctness testing.
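
For anyone who wants to recompute the LAMPs column by hand, here is a minimal C sketch of the formula printed by the testbed (cycles * sqrt(codesize)). The function names `lamps` and `newton_sqrt` are made up for illustration and are not taken from the testbed sources; the square root is done with a plain Newton iteration only so that the sketch has no math-library dependency.

```c
#include <assert.h>

/* Square root by Newton's iteration, so the sketch needs no math library.
   Starting at max(v, 1) keeps the iterate at or above sqrt(v), from where
   the sequence converges monotonically. */
static double newton_sqrt(double v) {
    if (v <= 0.0) return 0.0;
    double x = v > 1.0 ? v : 1.0;
    for (int i = 0; i < 60; i++)
        x = 0.5 * (x + v / x);
    return x;
}

/* LAMPs = cycles * sqrt(codesize), as printed by the testbed.
   Passing kilocycles in gives kiloLAMPs out. */
static double lamps(double cycles, unsigned code_size_bytes) {
    assert(code_size_bytes > 0);
    return cycles * newton_sqrt((double)code_size_bytes);
}
```

With the getlinesJJ figures above, `lamps(1113.0, 153)` comes out at roughly 13767, against the 13768 kiloLAMPs printed; the small gap is presumably because the printed kilocycles are already rounded.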

dedndave

prescott w/htt
getlinesJJ: (jj2007)
11828 kiloLAMPs, 956 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
13750 kiloLAMPs, 933 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
10250 kiloLAMPs, 614 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
17835 kiloLAMPs, 1156 kilocycles for 20025 lines, 807877 bytes

AxCountLines:
5350 kiloLAMPs, 330 kilocycles for 20025 lines, 807877 bytes


i haven't seen the term "kilocycles" for years - brings back memories   :P

oex

Dammit what's my computer? :lol AMD Sempron 1800+


getlinesJJ: (jj2007)
12357 kiloLAMPs, 999 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
12094 kiloLAMPs, 821 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
12362 kiloLAMPs, 741 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
15156 kiloLAMPs, 982 kilocycles for 20025 lines, 807877 bytes

AxCountLines: 10955 kiloLAMPs, 676 kilocycles for 20025 lines, 807877 bytes

LAMPs = Lean And Mean Points = cycles * sqrt(codesize)
We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv

sinsi

Crashes with an access violation in win7pro x64 at 0000106b.
Light travels faster than sound, that's why some people seem bright until you hear them.

jj2007

Prescott P4

getlinesJJ: (jj2007)
12348 kiloLAMPs, 998 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
14325 kiloLAMPs, 972 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
10848 kiloLAMPs, 650 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
17664 kiloLAMPs, 1145 kilocycles for 20025 lines, 807877 bytes

AxCountLines:
5795 kiloLAMPs, 358 kilocycles for 20025 lines, 807877 bytes

clive

Atom N270


Tests for correctness - 100 / 100 / 11 lines
expected, first string 5-byte misaligned:

jj2007=         99 / 100 / 11 lines
Lingo=          99 / 100 / 11 lines
Lingo2=         100 / 100 / 11 lines
ToutEnMasm=     100 / 100 / 11 lines
AxCountLines=   100 / 100 / 11 lines

Codesizes:
getlinesJJ =            153
getlines Lingo =        217
getlines Lingo2=        278
CompteurLignes =        238
AxCountLines =          262

Counting lines of \masm32\include\winextra.inc:

getlinesJJ: (jj2007)
16757 kiloLAMPs, 1354 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
14778 kiloLAMPs, 1003 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
13690 kiloLAMPs, 821 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
20978 kiloLAMPs, 1359 kilocycles for 20025 lines, 807877 bytes

AxCountLines: 9275 kiloLAMPs, 573 kilocycles for 20025 lines, 807877 bytes

LAMPs = Lean And Mean Points = cycles * sqrt(codesize)
It could be a random act of randomness. Those happen a lot as well.

GregL

Alex,

I get an access violation in AxCountLines at line 121:  0xC0000005: Access violation reading location 0xffffffff.

This is the instruction:  pshufd xmm0,[esp+16],0

I am running Windows Vista 32-bit on this laptop.

jj2007

pshufd needs 16-byte alignment. Otherwise it's a fantastic algo:
Celeron M
getlinesJJ: (jj2007)
5555 kiloLAMPs, 449 kilocycles for 20025 lines, 807877 bytes

getlines (Lingo):
6237 kiloLAMPs, 423 kilocycles for 20025 lines, 807877 bytes

getlines2 (Lingo):
8086 kiloLAMPs, 485 kilocycles for 20025 lines, 807877 bytes

CompteurLignes: (ToutEnMasm)
15304 kiloLAMPs, 992 kilocycles for 20025 lines, 807877 bytes

AxCountLines: 5397 kiloLAMPs, 333 kilocycles for 20025 lines, 807877 bytes
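
On the alignment remark at the top of this post: an SSE instruction with a memory operand, such as the reported pshufd xmm0,[esp+16],0, faults when the address is not a multiple of 16, and [esp+16] is not guaranteed to be one in 32-bit code. One common workaround is to load the value into a register first (movd) and shuffle register to register. The C sketch below only illustrates the alignment arithmetic involved; the helper names are hypothetical and nothing here is taken from the AxCountLines source.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Rounds p up to the next 16-byte boundary
   (returns p itself when it is already aligned). */
static const unsigned char *align_up16(const unsigned char *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t up = (addr + 15u) & ~(uintptr_t)15u;
    return p + (up - addr);
}
```

A string that starts 5 bytes past a 16-byte boundary, like the misaligned test string in the correctness check, fails `is_aligned16`, and `align_up16` returns the address 11 bytes further on; the bytes before that point have to be handled separately before any aligned 16-byte loads.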


drizz

test this:
db "aaaaaaaaaaaaaaa",13,10,"aaaaaaaaaaaaaaa",0
The truth cannot be learned ... it can only be recognized.

Antariy

Quote from: drizz on December 17, 2010, 11:59:36 PM
test this:
db "aaaaaaaaaaaaaaa",13,10,"aaaaaaaaaaaaaaa",0


I tried this, both aligned and misaligned, and the line count is 1. What do you want to say?

drizz

Well my question is :
Is this "db 'a',10,'b',10,0" two lines or zero?

The truth cannot be learned ... it can only be recognized.

Antariy

Quote from: jj2007 on December 17, 2010, 09:53:37 PM
pshufd needs 16-byte alignment. Otherwise it's a fantastic algo:

Yes, that shows what being in a hurry does every time :green2

Antariy

Quote from: drizz on December 18, 2010, 12:34:10 AM
Well my question is :
Is this "db 'a',10,'b',10,0" two lines or zero?

Well, my answer is: the algo supports specifying the line delimiter. If you know that the text is in Unix format, then just specify LF as the delimiter.

Quote
invoke AxCountLines, offset gltestA, len(offset gltestA),0d0d0d0dh

Maybe LF as the default delimiter is a better choice.
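
A note on the constant in the quoted call: 0d0d0d0dh is the CR byte (0Dh) repeated four times. Assuming the third parameter is simply the delimiter byte replicated across a dword (an inference from that single call, not something the thread confirms), the LF equivalent would be 0a0a0a0ah. A tiny C sketch of building such a value, with the hypothetical helper name `replicate_byte`:

```c
#include <stdint.h>

/* Copies one byte into all four bytes of a 32-bit value.
   Multiplying by 0x01010101 never overflows for an 8-bit input:
   0xFF * 0x01010101 == 0xFFFFFFFF. */
static uint32_t replicate_byte(uint8_t b) {
    return (uint32_t)b * 0x01010101u;
}
```

So `replicate_byte(0x0D)` yields the value used in the call above, and `replicate_byte(0x0A)` yields the LF form.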

drizz

Yes, that's the point: it checks for either 13 or 10, not the 13+10 pair. As far as I can see, only the getlines2 algo does this.
The truth cannot be learned ... it can only be recognized.
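
To make the distinction in this last exchange concrete, here is a scalar C sketch of three counting rules: one chosen delimiter byte, either CR or LF, and complete CR+LF pairs only. It is not any poster's algorithm (those are SSE/MMX assembly) and the function names are invented for the example; it only shows how the two test strings from the thread come out under each rule.

```c
#include <assert.h>
#include <stddef.h>

/* Counts occurrences of a single delimiter byte (for example 13 or 10). */
static size_t count_byte(const char *buf, size_t len, char delim) {
    assert(buf != NULL);
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == delim) n++;
    return n;
}

/* Counts every byte that is either CR (13) or LF (10). */
static size_t count_cr_or_lf(const char *buf, size_t len) {
    assert(buf != NULL);
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\r' || buf[i] == '\n') n++;
    return n;
}

/* Counts only complete CR+LF pairs; a lone CR or lone LF is ignored. */
static size_t count_crlf_pairs(const char *buf, size_t len) {
    assert(buf != NULL);
    size_t n = 0;
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            n++;
            i++; /* skip the LF that was just consumed */
        }
    }
    return n;
}
```

For "aaaaaaaaaaaaaaa",13,10,"aaaaaaaaaaaaaaa" the single-byte rule with CR gives 1 (matching the count reported above), the either-byte rule gives 2, and the pair rule gives 1. For 'a',10,'b',10 the LF rule gives 2, while the CR rule and the pair rule both give 0, which is the "two lines or zero" question. All three rules count delimiters, so a final line without a terminator is not counted.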