Does anyone have a genuinely fast SHA1 algo ?

Started by hutch--, January 01, 2011, 02:37:05 AM

dedndave

yah - a lot of dependencies in that macro

perhaps you are trying to use memory operands in var # 5 and 6 ? (e, w)

Glenn9999

Quote from: dedndave on January 05, 2011, 05:39:03 AM
show us the line with the very first error in the list
you may have to look in C:\masm32\bin\asmbl.txt

That's just it and why I'm asking.  I don't know what to look for (what I posted above doesn't make sense).  And I'm doing the same thing as in the MD5 thread (compared it), so I really don't know.  Here's the first macro call if that helps any.  But what I do read from that output is that it's referring to errors in the macro.


Sha1Loop1 [eax], [ebx], [ecx], [edx], [esi], [edi+ 0*4]

dedndave

ADD  e, w

you can't do that   :bg

ADD [esi],[edi+ 0*4]

not allowed to add 2 memory operands
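
A minimal sketch of the usual workaround, assuming a scratch register is free: ADD can encode at most one memory operand, so one of the two values has to be staged in a register first. AddMem is a hypothetical macro name (not the Sha1Loop1 macro from this thread), and the choice of edx is only an assumption.

```asm
; Hypothetical helper macro; AddMem and the use of edx are assumptions.
; edx must not also be needed by the dst or src operand.
AddMem MACRO dst, src
        mov     edx, src        ; stage the source value in a scratch register
        add     dst, edx        ; memory += register is a legal ADD form
ENDM

; Expands to:  mov edx, [edi+0*4]  /  add [esi], edx
        AddMem  [esi], [edi+0*4]
```

Because macro arguments are substituted as text, the macro body has to be legal for whatever operands the caller passes in.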

Glenn9999

Quote from: dedndave on January 05, 2011, 05:51:21 AM
ADD [esi],[edi+ 0*4]

So I see now.  Macros are a little different than what I'm used to, but seems to be what is needed for the algo to work as intended.

donkey

A couple of months ago I started reading this in the hopes of implementing it but never got around to it, maybe someone else would like to give it a shot:

http://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1/

Of course it's Intel, so it's all 64-bit code, but if you want real speed...
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

Glenn9999

Quote from: Glenn9999 on January 05, 2011, 05:52:51 AM
So I see now.  Macros are a little different than what I'm used to, but seems to be what is needed for the algo to work as intended.

I hit a roadblock.   I needed an extra register and ESP turned out to not really be a good choice.  Then what I had with those lines commented wasn't any faster than what I posted in here, which really shocks me since the Delphi version was faster.  At least it was worth a try.

Anyhow, if anyone wants the version of Sha1Compress that doesn't do the memory shifts I can post it in here.

Glenn9999

Quote from: Glenn9999 on January 05, 2011, 07:12:09 AM
I hit a roadblock.   I needed an extra register and ESP turned out to not really be a good choice.  Then what I had with those lines commented wasn't any faster than what I posted in here, which really shocks me since the Delphi version was faster.  At least it was worth a try.

And roadblocks are meant to be torn down.  Figured it out, and got a newer (to me) MASM version, seems to work.  Get this:

Athlon XP 2000 Old (what I posted in here): Processed 25260492 bytes in  468 ms.
Athlon XP 2000 New: Processed 25260492 bytes in  250 ms.
That's a 4.2 second difference between the two versions on my testbed of 660MB of files.

Core2Duo 2900 Old: Processed 25260492 bytes in  265 ms.
Core2Duo 2900 New: Processed 25260492 bytes in  156 ms.

Quote from: Antariy on January 04, 2011, 11:42:52 PM
Glenn, does Delphi's inline ASM support MMX instructions? Have a look at my previous post then.

The version I usually work with doesn't, but the newer version I have evidently does.  However, it blows up with a "floating point instruction error".  Regardless, the newest Sha1LoadW code I have is now a macro in a MASM module, so do I just copy it as is and add ".XMM/.MMX" to the header?

dedndave

        .686p
        .MMX
        .XMM
        .MODEL  Flat,StdCall
        OPTION  CaseMap:None


you need the xmm for sse

Glenn9999

Quote from: dedndave on January 06, 2011, 07:09:55 AM
        .686p
        .MMX
        .XMM
        .MODEL  Flat,StdCall
        OPTION  CaseMap:None


you need the xmm for sse

Okay, but I kinda meant whether the code that was posted was good or not.  I get the same error trying it in my MASM module.

dedndave

oops - sorry, Glenn   :P
not sure which code you are talking about

but....
am i the only one that uses google ? - lol

http://code.google.com/p/pyrit/issues/attachmentText?id=207&aid=4885098494754558868&name=xmm.s.200806291157.2.s&token=f0f6faa218f86cc0a17e32b05ac6b464

kinda bloatish - but i bet it hauls ass

it might make a starting point for a good algo - lol


oh - and - that was the first hit for "sse2 sha1"
c'mon, guys   :lol

jj2007

# More, I used "movaps", "xorps" and other opcode because they are one byte shorter than the
# equivalent "movups" and so on.

Doesn't inspire confidence, though: movaps and movups are the same size but have different functions...
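
A small sketch of the difference, assuming a MASM version that accepts the .XMM directive. LoadDemo and its parameter names are hypothetical. The encodings in the comments are the documented opcodes; one possible reading of the quoted comment (an assumption, not confirmed in this thread) is that its author meant the integer forms movdqa/pxor, which carry an extra 66h prefix byte.

```asm
        .686p
        .XMM
        .MODEL  Flat,StdCall
        OPTION  CaseMap:None

        .CODE

; LoadDemo is a hypothetical name used only for illustration.
LoadDemo PROC pAligned:DWORD, pAny:DWORD
        mov     eax, pAligned
        mov     edx, pAny
        movaps  xmm0, [eax]     ; 0F 28 /r : faults unless eax is 16-byte aligned
        movups  xmm1, [edx]     ; 0F 10 /r : same length, any alignment allowed
        xorps   xmm0, xmm1      ; 0F 57 /r : one byte shorter than pxor (66 0F EF /r)
        ret
LoadDemo ENDP

        END
```

So swapping movups for movaps saves no bytes; it only adds the alignment requirement.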

dedndave

well - he says it works
also - English is not his primary language - not sure what he meant there
anyways - i wasn't suggesting you copy/paste his code - lol
but, you might look at it and get some ideas
as tricky as optimizing pipeline usage is, it probably isn't optimal, anyways

point was - has anyone tried searching the web - you know - that internet thingy
that was the first hit - i didn't even break a sweat

Glenn9999

Quote from: dedndave on January 06, 2011, 07:36:15 AM
oops - sorry, Glenn   :P
not sure which code you are talking about

The code that Antariy asked me to try.

Quote
but....
am i the only one that uses google ? - lol

Lol no.  But sometimes Google doesn't give answers that are all that great, though it was great for researching the algorithm in the first place. :D

dedndave

oh - found it - page 1 of the thread
did you notice that he edited that post ?

Glenn9999

Quote from: dedndave on January 06, 2011, 07:49:48 AM
oh - found it - page 1 of the thread
did you notice that he edited that post ?

Yes, I copied it when I made the other post in order to test it.  Tried 1B and B1 as well in the "Edited" Description.  1B compiled, B1 didn't.  Maybe he can clarify.