SHA1 revisited.

hutch-- · January 12, 2011, 05:26:19 AM

The algo I had been working on works fine but there is a problem with almost all implimentations I have seen, they load the entire file into memory and this limits the size that can be loaded to physical memory limits. The task I have to address is a SHA1 algo that can handle the source file chopped into sections, process the data in each section until the end of the file is reached and produce the correct SHA1 hash at the end of it.

The application it will be used in needs to be able to handle much larger files than fit into memory. What I was hoping was that someone may have seen some source code that does this, I have a couple of C implimentations that work but none that deal with tiling the source into pieces.

Glenn9999 · January 12, 2011, 03:39:08 PM

Quote from: hutch-- on January 12, 2011, 05:26:19 AM
The algo I had been working on works fine but there is a problem with almost all implimentations I have seen, they load the entire file into memory and this limits the size that can be loaded to physical memory limits. The task I have to address is a SHA1 algo that can handle the source file chopped into sections, process the data in each section until the end of the file is reached and produce the correct SHA1 hash at the end of it.

Wow. I thought handling sections of files was typical. So most of what you see involves file mapping? Performance wise, I tried it and found no difference compared to caching the file as it is read. But for my testing, I found a buffer size of 64KB to be most efficient, so that's what I tend to use for most file reading operations.

The source & timings I've been using in posting to your other thread involves processing the file in 64KB sections. In fact, the procedure calls I've been using (both CAPI and my source) allow the passing of a buffer processing size, which makes either putting the whole file into memory or reading it in sections a possibility.

The Delphi I posted in the other thread illustrates reading the file in sections and processing it. Or if I'm misunderstanding, could you please clarify what you are talking about? Thanks.

MichaelW · January 12, 2011, 07:55:09 PM

The Microsoft CryptoAPI functions are designed to operate on discontinuous data streams, and judging from his SHA-1 hash so are the procedures in Drizz's CryptoHash library.

Code Select


;==============================================================================

    include \masm32\include\masm32rt.inc
    include \masm32\include\advapi32.inc
    includelib \masm32\lib\advapi32.lib

    ;--------------------------------------------------------------------
    ; This from Drizz's CryptoHash Library in the download section here:
    ;   http://www.drizz.eu.pn/
    ;--------------------------------------------------------------------

    include sha1.asm

;==============================================================================

;--------------------------------------------
; These derived from WinCrypt.h in the PSDK:
;--------------------------------------------

ALG_CLASS_HASH  equ 4 SHL 13
ALG_TYPE_ANY    equ 0
ALG_SID_SHA1    equ 4
CALG_SHA1       equ ALG_CLASS_HASH or ALG_TYPE_ANY or ALG_SID_SHA1

;==============================================================================
    .data

      hProv     dd 0
      hHash     dd 0
      hashBuff  db 20 dup(0)
      dwDataLen dd SIZEOF hashBuff

      ;-------------------------------------------------------
      ; Sample messages and digests from FIPS PUB 180-1 here:
      ;   http://www.itl.nist.gov/fipspubs/fip180-1.htm
      ;-------------------------------------------------------

      szSample1 db "abc",0
      szDigest1 db "A9993E364706816ABA3E25717850C26C9CD0D89D",0
      szSample2 db "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq",0
      szDigest2 db "84983E441C3BD26EBAAE4AA1F95129E5E54670F1",0
      szSample3 db "1000000 * 'a'",0
      szDigest3 db "34AA973CD4C4DAA4F61EEB2BDBAD27316534016F",0

    .code
;==============================================================================
start:
;==============================================================================

    mov esi, alloc( 1000000 )
    mov eax, "aaaa"
    invoke memfill, esi, 1000000, eax

    lea edi, hashBuff

    invoke CryptAcquireContext, ADDR hProv,
                                NULL,
                                NULL,
                                PROV_RSA_FULL,
                                CRYPT_VERIFYCONTEXT

    print ADDR szSample1, 13,10
    print ADDR szDigest1,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, ADDR szSample1, SIZEOF szSample1 - 1, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample2, 13,10
    print ADDR szDigest2,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, ADDR szSample2, SIZEOF szSample2 - 1, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    invoke CryptHashData, hHash, esi, 1000000, 0
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke CryptCreateHash, hProv, CALG_SHA1, 0, 0, ADDR hHash
    mov ebx, esi
    REPEAT 10
        invoke CryptHashData, hHash, ebx, 100000, 0
        add ebx, 100000
    ENDM
    invoke CryptGetHashParam, hHash, HP_HASHVAL, edi, ADDR dwDataLen, 0
    invoke CryptDestroyHash, hHash

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [edi+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    invoke CryptReleaseContext, hProv, 0

    print chr$(13,10,13,10)

    print ADDR szSample1,13,10
    print ADDR szDigest1,13,10
    invoke SHA1Init
    invoke SHA1Update, ADDR szSample1, SIZEOF szSample1-1
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample2, 13,10
    print ADDR szDigest2,13,10
    invoke SHA1Init
    invoke SHA1Update, ADDR szSample2, SIZEOF szSample2-1
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke SHA1Init
    invoke SHA1Update, esi, 1000000
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print ADDR szSample3, 13,10
    print ADDR szDigest3,13,10
    invoke SHA1Init
    mov ebx, esi
    REPEAT 10
        invoke SHA1Update, ebx, 100000
        add ebx, 100000
    ENDM
    invoke SHA1Final

    xor ebx, ebx
    .WHILE ebx < 20
        movzx edx, BYTE PTR [SHA1Digest+ebx]
        invoke crt_printf, cfm$("%02X"), edx
        inc ebx
    .ENDW
    print chr$(13,10)

    print chr$(13,10,13,10)

    inkey "Press any key to exit..."
    exit
;==============================================================================
end start

In my limited tests with a 10 MB buffer hashed in a single pass, and running on a P3, the CryptoAPI functions were slightly faster than Drizz's code.

clive · January 12, 2011, 08:36:56 PM

Always thought SHA was agnostic to size, and could be done blockwise

Code Select


	sha_init(&sha);

        while(FileRemains)
	  sha_update(&sha, Buffer, Length);

	sha_final(digest, &sha);

ideally in a multiple of 64 bytes.

Glenn9999 · January 12, 2011, 09:26:52 PM

Quote from: clive on January 12, 2011, 08:36:56 PM
Always thought SHA was agnostic to size, and could be done blockwise

It is, you just got be sure you process the stream 64 bytes at a time. As long as you keep properly presenting 64 bytes to the calc routine, SHA-1 is fine. I guess there are a lot that either think file mapping is more efficient, or can't manage a correct algo to present the 64 byte blocks to calc?

hutch-- · January 13, 2011, 12:50:42 AM

Michael,

Thanks for decyphering the crypto API code. I have ported it to basic and it works reliably in the simple context of passing a single memory block to it. I confess to being near illiterate with crypto APIs so I have yet to digest how to stream the data. The folks who will be using this code will be doing checksums on very large files, > 1 gig so streaming the data in blocks is a necessity. The following code has been cleaned up a bit from the 1st version I posted.

Code Select


' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    MACRO ALG_CLASS_HASH  = 32768          ''' 4 SHL 13
    MACRO ALG_TYPE_ANY    = 0
    MACRO ALG_SID_SHA1    = 4
    MACRO CALG_SHA1       = ALG_CLASS_HASH or ALG_TYPE_ANY or ALG_SID_SHA1

    MACRO PROV_RSA_FULL   = 1
    MACRO CRYPT_VERIFYCONTEXT = &HF0000000
    MACRO HP_HASHVAL      = 2

  ' ------------------------------------------
  ' these API DECLAREs are slightly renamed so
  ' they don't clash with future INCLUDE files
  ' ------------------------------------------

    DECLARE FUNCTION Crypt_Acquire_Context LIB "ADVAPI32.DLL" ALIAS "CryptAcquireContextA"( _
                     ByVal phProv as DWORD,ByVal pszContainer as DWORD, _
                     ByVal pszProvider as DWORD,ByVal dwProvType as DWORD, _
                     ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Create_Hash LIB "ADVAPI32.DLL" ALIAS "CryptCreateHash"( _
                     ByVal hProv as DWORD,ByVal Algid as DWORD, _
                     ByVal hKey as DWORD,ByVal dwFlags as DWORD, _
                     ByVal phHash as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Hash_Data LIB "ADVAPI32.DLL" ALIAS "CryptHashData"( _
                     ByVal hHash as DWORD,ByVal pbData as DWORD, _
                     ByVal dwDataLen as DWORD,ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Get_Hash_Param LIB "ADVAPI32.DLL" ALIAS "CryptGetHashParam"( _
                     ByVal hHash as DWORD,ByVal dwParam as DWORD, _
                     ByVal pbData as DWORD,ByVal pdwDataLen as DWORD, _
                     ByVal dwFlags as DWORD) as DWORD

    DECLARE FUNCTION Crypt_Destroy_Hash LIB "ADVAPI32.DLL" ALIAS "CryptDestroyHash"( _
                     ByVal hHash as DWORD) as DWORD

' ÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-

FUNCTION API_sha1(src$) as STRING

    #REGISTER NONE

    LOCAL psrc as DWORD
    LOCAL phsh as DWORD
    LOCAL hProv as DWORD
    LOCAL hHash as DWORD
    LOCAL dwDataLen as DWORD
    LOCAL phx as DWORD

    psrc = StrPtr(src$)         ' source address
    hashbuff$ = space$(20)      ' 20 byte working buffer
    phsh = StrPtr(hashbuff$)    ' working buffer pointer
    hxout$ = space$(40)         ' allocate 40 bytes for the output string
    phx = StrPtr(hxout$)        ' get its address
    dwDataLen = 20              ' set data length to length of hashbuff$

    Crypt_Acquire_Context VarPtr(hProv),%NULL,%NULL,PROV_RSA_FULL,CRYPT_VERIFYCONTEXT
    Crypt_Create_Hash hProv, CALG_SHA1, 0, 0, VarPtr(hHash)
    Crypt_Hash_Data hHash,psrc,len(src$),0
    Crypt_Get_Hash_Param hHash, HP_HASHVAL, phsh, VarPtr(dwDataLen), 0
    Crypt_Destroy_Hash hHash

  ' --------------------------------------
  ' convert bytes in phsh to hex in hxout$
  ' --------------------------------------
    ! xor ebx, ebx              ' zero the counter
    ! mov esi, phsh             ' load the address of bytes in phsh
    ! mov edi, phx              ' load the output buffer address

  oloop:
    ! movzx eax, BYTE PTR [esi+ebx]
    ! movzx ecx, WORD PTR hextbl[eax*2]
    ! mov [edi], cx
    ! add edi, 2
    ! add ebx, 1
    ! cmp ebx, 20
    ! jle oloop

    FUNCTION = hxout$
    Exit FUNCTION

  #align 4
  hextbl:
    ! dw "00","10","20","30","40","50","60","70","80","90","a0","b0","c0","d0","e0","f0"
    ! dw "01","11","21","31","41","51","61","71","81","91","a1","b1","c1","d1","e1","f1"
    ! dw "02","12","22","32","42","52","62","72","82","92","a2","b2","c2","d2","e2","f2"
    ! dw "03","13","23","33","43","53","63","73","83","93","a3","b3","c3","d3","e3","f3"
    ! dw "04","14","24","34","44","54","64","74","84","94","a4","b4","c4","d4","e4","f4"
    ! dw "05","15","25","35","45","55","65","75","85","95","a5","b5","c5","d5","e5","f5"
    ! dw "06","16","26","36","46","56","66","76","86","96","a6","b6","c6","d6","e6","f6"
    ! dw "07","17","27","37","47","57","67","77","87","97","a7","b7","c7","d7","e7","f7"
    ! dw "08","18","28","38","48","58","68","78","88","98","a8","b8","c8","d8","e8","f8"
    ! dw "09","19","29","39","49","59","69","79","89","99","a9","b9","c9","d9","e9","f9"
    ! dw "0a","1a","2a","3a","4a","5a","6a","7a","8a","9a","aa","ba","ca","da","ea","fa"
    ! dw "0b","1b","2b","3b","4b","5b","6b","7b","8b","9b","ab","bb","cb","db","eb","fb"
    ! dw "0c","1c","2c","3c","4c","5c","6c","7c","8c","9c","ac","bc","cc","dc","ec","fc"
    ! dw "0d","1d","2d","3d","4d","5d","6d","7d","8d","9d","ad","bd","cd","dd","ed","fd"
    ! dw "0e","1e","2e","3e","4e","5e","6e","7e","8e","9e","ae","be","ce","de","ee","fe"
    ! dw "0f","1f","2f","3f","4f","5f","6f","7f","8f","9f","af","bf","cf","df","ef","ff"

End FUNCTION

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Glenn9999 · January 13, 2011, 03:35:43 AM

The Delphi I used in the CAPI program mentioned in the other thread.

Code Select


infile := CreateFile(PChar(OpenDialog1.FileName), GENERIC_READ,
                FILE_SHARE_READ, nil, OPEN_EXISTING,
                FILE_FLAG_SEQUENTIAL_SCAN, 0);
CryptAcquireContext(hProv, nil, nil, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT);
CryptCreateHash(hProv, CALG_SHA1, nil, 0, hHash);
ReadFile(infile, inbuffer, sizeof(inbuffer), amount_read, nil);
repeat
   CryptHashData(hHash, @inbuffer, amount_read, 0);
   ReadFile(infile, inbuffer, sizeof(inbuffer), amount_read, nil);
until amount_read = 0;
cbHashDataLen := 20;
CryptGetHashParam(hHash, HP_HASHVAL, @outHash, cbHashDataLen, 0);
CryptDestroyHash(hhash);
CryptReleaseContext(hprov, 0) ;
CloseHandle(infile);

The output value is in outHash. Run your ByteToHex against that to get your SHA1

dedndave · January 13, 2011, 04:12:11 AM

Quotehttp://www.drizz.eu.pn/

drizz lives on Pitcairn ?

hutch-- · January 13, 2011, 04:36:05 AM

:bg

LATER: After slightly diluting my ignorance of Microsoft crypto API functions, Michael's code handles the streaming of blocks of data so all is well and it is even reasonably fast.

Gratsie again ! :U

hutch-- · January 14, 2011, 12:27:42 AM

Here is a test piece that uses the technique that Michael developed. Hope your PB is up to scratch :bg

I have also included the Microsoft fciv.exe and an old Unix port for comparison testing. The Microsoft version appears to be the fastest, this one is not far behind and the old Unix port is the slowest.

dedndave · January 14, 2011, 12:37:37 AM

we don't allow powerbasic in this forum :lol

Glenn9999 · January 14, 2011, 08:28:07 PM

Quote from: hutch-- on January 14, 2011, 12:27:42 AM
I have also included the Microsoft fciv.exe and an old Unix port for comparison testing. The Microsoft version appears to be the fastest, this one is not far behind and the old Unix port is the slowest.

Actually, I find if you don't turn off the OS file caching (and you don't in your code), that 64KB is almost always best for a buffer size. You might consider adjusting/testing your code in this way, especially since using such a big buffer causes a performance hit in such an environment. That performance hit isn't as large on faster computers/drives, but it's still there.

Straight I/O test varying buffer sizes, file cache ON.

Code Select


Processing against: D:\TEST2
1024 bytes 1062
2048 bytes 703
4096 bytes 531
8192 bytes 454
16384 bytes 421
32768 bytes 391
65536 bytes 375
131072 bytes 375
262144 bytes 359
524288 bytes 375
1048576 bytes 985
2097152 bytes 1015
4194304 bytes 1016
8388608 bytes 1031
16777216 bytes 1016
33554432 bytes 1031
67108864 bytes 1047

hutch-- · January 15, 2011, 12:45:28 AM

Glen,

The rough test I used was a 1 gig VOB file and the buffer size did not effect the timing much at all. I played with the CreateFile() flags but saw no timing change from any option. I would have tried the CreateFile() flag FILE_FLAG_NO_BUFFERING but I cannot garrantee the data alignment which is 4 byte aligned.

One it has been run it then lives in the OS file cache and you are effectively only timing the crypto routine.

The main factor at the moment is that its reliable and easy to use.

hutch-- · January 16, 2011, 02:14:54 AM

Glen,

I did some tests with your suggestion and the results have been very good. The smaller buffer, 16, 32 and 64k run about 25% faster so it appears to have something to do with the way the crypto algo works. Gratsie !

Glenn9999 · January 16, 2011, 08:03:31 AM

Quote from: hutch-- on January 16, 2011, 02:14:54 AM
Glen,

I did some tests with your suggestion and the results have been very good. The smaller buffer, 16, 32 and 64k run about 25% faster so it appears to have something to do with the way the crypto algo works. Gratsie !

More to do with the ReadFile API and how it is written, actually. Note in the test results above that read I/O was being benchmarked and not the CAPI routines. The delay varies based on CPU (I guess something in the file caching algorithm since I would presume in reading the file that it would already be cached in memory?) and drive speed/ability to handle reads. Your code actually was about 3-4x slower than FCIV in certain scenarios I ran it through when I was testing it, and I was able to trace it to the buffer size by duplicating the slower speed in my own CAPI program by changing the buffer size to 16MB.

Glad you were able to try it and find the suggestion helpful.

Edit: Actually looking at those test results, I'm curious as to how high you can push the buffer size without impacting performance. For example, it seems between the 512K and 1024K line is where the performance hit starts to take hold. Might have to test it sometime soon. Edit2: I tested another couple of drives and one of the lines was between 32-64K, and another one was slightly greater than that. So drive performance is playing into this somewhere, too, and not just the file caching algorithm in the OS. I'm wondering now how you determine what would be fully optimal besides testing/benchmarking like I've been doing.

News:

SHA1 revisited.