News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

SHA1 revisited.

Started by hutch--, January 12, 2011, 05:26:19 AM

Previous topic - Next topic

hutch--

The algo I had been working on works fine but there is a problem with almost all implimentations I have seen, they load the entire file into memory and this limits the size that can be loaded to physical memory limits. The task I have to address is a SHA1 algo that can handle the source file chopped into sections, process the data in each section until the end of the file is reached and produce the correct SHA1 hash at the end of it.

The application it will be used in needs to be able to handle much larger files than fit into memory. What I was hoping was that someone may have seen some source code that does this, I have a couple of C implimentations that work but none that deal with tiling the source into pieces.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Glenn9999

Quote from: hutch-- on January 12, 2011, 05:26:19 AM
The algo I had been working on works fine but there is a problem with almost all implimentations I have seen, they load the entire file into memory and this limits the size that can be loaded to physical memory limits. The task I have to address is a SHA1 algo that can handle the source file chopped into sections, process the data in each section until the end of the file is reached and produce the correct SHA1 hash at the end of it.


Wow.  I thought handling sections of files was typical.  So most of what you see involves file mapping?   Performance wise, I tried it and found no difference compared to caching the file as it is read.  But for my testing, I found a buffer size of 64KB to be most efficient, so that's what I tend to use for most file reading operations.

The source & timings I've been using in posting to your other thread involves processing the file in 64KB sections.  In fact, the procedure calls I've been using (both CAPI and my source) allow the passing of a buffer processing size, which makes either putting the whole file into memory or reading it in sections a possibility.

The Delphi I posted in the other thread illustrates reading the file in sections and processing it.  Or if I'm misunderstanding, could you please clarify what you are talking about?  Thanks.

MichaelW

The Microsoft CryptoAPI functions are designed to operate on discontinuous data streams, and judging from his SHA-1 hash so are the procedures in Drizz's CryptoHash library.

;==============================================================================

    include \masm32\include\masm32rt.inc
    include \masm32\include\advapi32.inc
    includelib \masm32\lib\advapi32.lib

    ;--------------------------------------------------------------------
    ; This from Drizz's CryptoHash Library in the download section here:
    ;   http://www.drizz.eu.pn/
    ;--------------------------------------------------------------------

    include sha1.asm

;==============================================================================

;--------------------------------------------
; These derived from WinCrypt.h in the PSDK:
;--------------------------------------------

ALG_CLASS_HASH  equ 4 SHL 13
ALG_TYPE_ANY    equ 0
ALG_SID_SHA1    equ 4
CALG_SHA1       equ ALG_CLASS_HASH or ALG_TYPE_ANY or ALG_SID_SHA1

;==============================================================================
    .data

      hProv     dd 0
      hHash     dd 0
      hashBuff  db 20 dup(0)
      dwDataLen dd SIZEOF hashBuff

      ;-------------------------------------------------------
      ; Sample messages and digests from FIPS PUB 180-1 here:
      ;   http://www.itl.nist.gov/fipspubs/fip180-1.htm
      ;-------------------------------------------------------

      szSample1 db "abc",0
      szDigest1 db "A9993E364706816ABA3E25717850C26C9CD0D89D",0
      szSample2 db "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq",0
      szDigest2 db "84983E441C3BD26EBAAE4AA1F95129E5E54670F1",0
      szSample3 db "1000000 * 'a'",0
      szDigest3 db "34AA973CD4C4DAA4F61EEB2BDBAD27316534016F",0

    .code
;==============================================================================
start:
;==============================================================================

    mov esi, alloc( 1000000 )
    mov eax, "aaaa"
    invoke memfill, esi, 1000000, eax

    lea edi, hashBuff

    invoke CryptAcquireContext, ADDR hProv,
                                NULL,
                                NULL,
                                PROV_RSA_FULL,
                                CRYPT_VERIFYCONTEXT

    print ADDR szSample1, 13,10
    print ADDR szDigest1,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, ADDR szSample1, SIZEOF szSample1 - 1, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample2, 13,10
    print ADDR szDigest2,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, ADDR szSample2, SIZEOF szSample2 - 1, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, esi, 1000000, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    mov ebx, esi
    REPEAT 10
        invoke CryptHashData, hHash, ebx, 100000, 0
        add ebx, 100000
    ENDM
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    invoke CryptReleaseContext, hProv, 0

    print chr$(13,10,13,10)

    print ADDR szSample1,13,10
    print ADDR szDigest1,13,10
    invoke SHA1Init
    invoke SHA1Update, ADDR szSample1, SIZEOF szSample1-1
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample2, 13,10
    print ADDR szDigest2,13,10
    invoke SHA1Init
    invoke SHA1Update, ADDR szSample2, SIZEOF szSample2-1
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke SHA1Init
    invoke SHA1Update, esi, 1000000
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke SHA1Init
    mov ebx, esi
    REPEAT 10
        invoke SHA1Update, ebx, 100000
        add ebx, 100000
    ENDM
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print chr$(13,10,13,10)

    inkey "Press any key to exit..."
    exit
;==============================================================================
end start


In my limited tests with a 10 MB buffer hashed in a single pass, and running on a P3, the CryptoAPI functions were slightly faster than Drizz's code.
eschew obfuscation

clive

Always thought SHA was agnostic to size, and could be done blockwise


sha_init(&sha);

        while(FileRemains)
  sha_update(&sha, Buffer, Length);

sha_final(digest, &sha);


ideally in a multiple of 64 bytes.
It could be a random act of randomness. Those happen a lot as well.

Glenn9999

Quote from: clive on January 12, 2011, 08:36:56 PM
Always thought SHA was agnostic to size, and could be done blockwise

It is, you just got be sure you process the stream 64 bytes at a time.  As long as you keep properly presenting 64 bytes to the calc routine, SHA-1 is fine.  I guess there are a lot that either think file mapping is more efficient, or can't manage a correct algo to present the 64 byte blocks to calc?


hutch--

#5
Michael,

Thanks for decyphering the crypto API code. I have ported it to basic and it works reliably in the simple context of passing a single memory block to it. I confess to being near illiterate with crypto APIs so I have yet to digest how to stream the data. The folks who will be using this code will be doing checksums on very large files, > 1 gig so streaming the data in blocks is a necessity. The following code has been cleaned up a bit from the 1st version I posted.


' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    MACRO ALG_CLASS_HASH  = 32768          ''' 4 SHL 13
    MACRO ALG_TYPE_ANY    = 0
    MACRO ALG_SID_SHA1    = 4
    MACRO CALG_SHA1       = ALG_CLASS_HASH or ALG_TYPE_ANY or ALG_SID_SHA1

    MACRO PROV_RSA_FULL   = 1
    MACRO CRYPT_VERIFYCONTEXT = &HF0000000
    MACRO HP_HASHVAL      = 2

  ' ------------------------------------------
  ' these API DECLAREs are slightly renamed so
  ' they don't clash with future INCLUDE files
  ' ------------------------------------------

    DECLARE FUNCTION Crypt_Acquire_Context LIB "ADVAPI32.DLL" ALIAS "CryptAcquireContextA"( _
                     ByVal phProv as DWORD,ByVal pszContainer as DWORD, _
                     ByVal pszProvider as DWORD,ByVal dwProvType as DWORD, _
                     ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Create_Hash LIB "ADVAPI32.DLL" ALIAS "CryptCreateHash"( _
                     ByVal hProv as DWORD,ByVal Algid as DWORD, _
                     ByVal hKey as DWORD,ByVal dwFlags as DWORD, _
                     ByVal phHash as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Hash_Data LIB "ADVAPI32.DLL" ALIAS "CryptHashData"( _
                     ByVal hHash as DWORD,ByVal pbData as DWORD, _
                     ByVal dwDataLen as DWORD,ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Get_Hash_Param LIB "ADVAPI32.DLL" ALIAS "CryptGetHashParam"( _
                     ByVal hHash as DWORD,ByVal dwParam as DWORD, _
                     ByVal pbData as DWORD,ByVal pdwDataLen as DWORD, _
                     ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Destroy_Hash LIB "ADVAPI32.DLL" ALIAS "CryptDestroyHash"( _
                     ByVal hHash as DWORD) as DWORD

' ÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-

FUNCTION API_sha1(src$) as STRING

    #REGISTER NONE

    LOCAL psrc as DWORD
    LOCAL phsh as DWORD
    LOCAL hProv as DWORD
    LOCAL hHash as DWORD
    LOCAL dwDataLen as DWORD
    LOCAL phx as DWORD

    psrc = StrPtr(src$)         ' source address
    hashbuff$ = space$(20)      ' 20 byte working buffer
    phsh = StrPtr(hashbuff$)    ' working buffer pointer
    hxout$ = space$(40)         ' allocate 40 bytes for the output string
    phx = StrPtr(hxout$)        ' get its address
    dwDataLen = 20              ' set data length to length of hashbuff$

    Crypt_Acquire_Context VarPtr(hProv),%NULL,%NULL,PROV_RSA_FULL,CRYPT_VERIFYCONTEXT
    Crypt_Create_Hash hProv, CALG_SHA1, 0, 0, VarPtr(hHash)
    Crypt_Hash_Data hHash,psrc,len(src$),0
    Crypt_Get_Hash_Param hHash, HP_HASHVAL, phsh, VarPtr(dwDataLen), 0
    Crypt_Destroy_Hash hHash

  ' --------------------------------------
  ' convert bytes in phsh to hex in hxout$
  ' --------------------------------------
    ! xor ebx, ebx              ' zero the counter
    ! mov esi, phsh             ' load the address of bytes in phsh
    ! mov edi, phx              ' load the output buffer address

  oloop:
    ! movzx eax, BYTE PTR [esi+ebx]
    ! movzx ecx, WORD PTR hextbl[eax*2]
    ! mov [edi], cx
    ! add edi, 2
    ! add ebx, 1
    ! cmp ebx, 20
    ! jle oloop

    FUNCTION = hxout$
    Exit FUNCTION

  #align 4
  hextbl:
    ! dw "00","10","20","30","40","50","60","70","80","90","a0","b0","c0","d0","e0","f0"
    ! dw "01","11","21","31","41","51","61","71","81","91","a1","b1","c1","d1","e1","f1"
    ! dw "02","12","22","32","42","52","62","72","82","92","a2","b2","c2","d2","e2","f2"
    ! dw "03","13","23","33","43","53","63","73","83","93","a3","b3","c3","d3","e3","f3"
    ! dw "04","14","24","34","44","54","64","74","84","94","a4","b4","c4","d4","e4","f4"
    ! dw "05","15","25","35","45","55","65","75","85","95","a5","b5","c5","d5","e5","f5"
    ! dw "06","16","26","36","46","56","66","76","86","96","a6","b6","c6","d6","e6","f6"
    ! dw "07","17","27","37","47","57","67","77","87","97","a7","b7","c7","d7","e7","f7"
    ! dw "08","18","28","38","48","58","68","78","88","98","a8","b8","c8","d8","e8","f8"
    ! dw "09","19","29","39","49","59","69","79","89","99","a9","b9","c9","d9","e9","f9"
    ! dw "0a","1a","2a","3a","4a","5a","6a","7a","8a","9a","aa","ba","ca","da","ea","fa"
    ! dw "0b","1b","2b","3b","4b","5b","6b","7b","8b","9b","ab","bb","cb","db","eb","fb"
    ! dw "0c","1c","2c","3c","4c","5c","6c","7c","8c","9c","ac","bc","cc","dc","ec","fc"
    ! dw "0d","1d","2d","3d","4d","5d","6d","7d","8d","9d","ad","bd","cd","dd","ed","fd"
    ! dw "0e","1e","2e","3e","4e","5e","6e","7e","8e","9e","ae","be","ce","de","ee","fe"
    ! dw "0f","1f","2f","3f","4f","5f","6f","7f","8f","9f","af","bf","cf","df","ef","ff"

End FUNCTION

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Glenn9999

The Delphi I used in the CAPI program mentioned in the other thread.


infile := CreateFile(PChar(OpenDialog1.FileName), GENERIC_READ,
                FILE_SHARE_READ, nil, OPEN_EXISTING,
                FILE_FLAG_SEQUENTIAL_SCAN, 0);
CryptAcquireContext(hProv, nil, nil, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT);
CryptCreateHash(hProv, CALG_SHA1, nil, 0, hHash);
ReadFile(infile, inbuffer, sizeof(inbuffer), amount_read, nil);
repeat
   CryptHashData(hHash, @inbuffer, amount_read, 0);
   ReadFile(infile, inbuffer, sizeof(inbuffer), amount_read, nil);
until amount_read = 0;
cbHashDataLen := 20;
CryptGetHashParam(hHash, HP_HASHVAL, @outHash, cbHashDataLen, 0);
CryptDestroyHash(hhash);
CryptReleaseContext(hprov, 0) ;
CloseHandle(infile);


The output value is in outHash.  Run your ByteToHex against that to get your SHA1

dedndave


hutch--

 :bg

LATER: After slightly diluting my ignorance of Microsoft crypto API functions, Michael's code handles the streaming of blocks of data so all is well and it is even reasonably fast.

Gratsie again !  :U
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

hutch--

Here is a test piece that uses the technique that Michael developed. Hope your PB is up to scratch  :bg

I have also included the Microsoft fciv.exe and an old Unix port for comparison testing. The Microsoft version appears to be the fastest, this one is not far behind and the old Unix port is the slowest.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

we don't allow powerbasic in this forum   :lol

Glenn9999

Quote from: hutch-- on January 14, 2011, 12:27:42 AM
I have also included the Microsoft fciv.exe and an old Unix port for comparison testing. The Microsoft version appears to be the fastest, this one is not far behind and the old Unix port is the slowest.

Actually, I find if you don't turn off the OS file caching (and you don't in your code), that 64KB is almost always best for a buffer size.  You might consider adjusting/testing your code in this way, especially since using such a big buffer causes a performance hit in such an environment.  That performance hit isn't as large on faster computers/drives, but it's still there.

Straight I/O test varying buffer sizes, file cache ON.

Processing against: D:\TEST2
1024 bytes 1062
2048 bytes 703
4096 bytes 531
8192 bytes 454
16384 bytes 421
32768 bytes 391
65536 bytes 375
131072 bytes 375
262144 bytes 359
524288 bytes 375
1048576 bytes 985
2097152 bytes 1015
4194304 bytes 1016
8388608 bytes 1031
16777216 bytes 1016
33554432 bytes 1031
67108864 bytes 1047

hutch--

Glen,

The rough test I used was a 1 gig VOB file and the buffer size did not effect the timing much at all. I played with the CreateFile() flags but saw no timing change from any option. I would have tried the CreateFile() flag FILE_FLAG_NO_BUFFERING but I cannot garrantee the data alignment which is 4 byte aligned.

One it has been run it then lives in the OS file cache and you are effectively only timing the crypto routine.

The main factor at the moment is that its reliable and easy to use.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

hutch--

Glen,

I did some tests with your suggestion and the results have been very good. The smaller buffer, 16, 32 and 64k run about 25% faster so it appears to have something to do with the way the crypto algo works. Gratsie !
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Glenn9999

#14
Quote from: hutch-- on January 16, 2011, 02:14:54 AM
Glen,

I did some tests with your suggestion and the results have been very good. The smaller buffer, 16, 32 and 64k run about 25% faster so it appears to have something to do with the way the crypto algo works. Gratsie !

More to do with the ReadFile API and how it is written, actually.  Note in the test results above that read I/O was being benchmarked and not the CAPI routines.  The delay varies based on CPU (I guess something in the file caching algorithm since I would presume in reading the file that it would already be cached in memory?) and drive speed/ability to handle reads.  Your code actually was about 3-4x slower than FCIV in certain scenarios I ran it through when I was testing it, and I was able to trace it to the buffer size by duplicating the slower speed in my own CAPI program by changing the buffer size to 16MB.

Glad you were able to try it and find the suggestion helpful.

Edit: Actually looking at those test results, I'm curious as to how high you can push the buffer size without impacting performance.  For example, it seems between the 512K and 1024K line is where the performance hit starts to take hold.  Might have to test it sometime soon.   Edit2: I tested another couple of drives and one of the lines was between 32-64K, and another one was slightly greater than that.  So drive performance is playing into this somewhere, too, and not just the file caching algorithm in the OS.  I'm wondering now how you determine what would be fully optimal besides testing/benchmarking like I've been doing.