StrStrIW (unicode)

Started by KSS, April 07, 2010, 08:30:08 AM

KSS

Oh, now I understand how it works, thank you!

I saw that the function uses SSEx and shows a MessageBox when it is not supported (I used HIEW to see this).
Does it work on a CPU without SSEx? Or can I use some other way on a CPU that does not support SSEx?

As I understand it, the function InstrCi() is published only as a library, without source? (I am writing my Sandbox in Delphi 2010, so if the source is not available, I must use a DLL for this function. Anyway, someday I will port my sandbox to asm.  :toothy)

jj2007

Quote from: KSS on April 09, 2010, 08:06:08 AM
Does it work on a CPU without SSEx? Or can I use some other way on a CPU that does not support SSEx?
The CPU must have SSE2. All Atom processors have it, and there are few people who still work on a pre-P4, so that shouldn't be a big problem.

> As I understand it, the function InstrCi() is published only as a library, without source?
Yes, for practical reasons. The macros are visible, but with the exception of qWord nobody dares to look at them ;-)

> if the source is not available, I must use a DLL for this function
Yes. That shouldn't be too difficult from Delphi. Here is an example of how it can be done from Masm:

Quote
include \masm32\include\masm32rt.inc

.data?
hDll   dd ?
ptrInstr   dd ?

.code
start:
   mov hDll, rv(LoadLibrary, chr$("MbMiniDll"))      ; rv is the retval macro, which returns eax
   .if eax
      mov ptrInstr, rv(GetProcAddress, hDll, chr$("InstrDLL"))
      .if eax
         push 1         ; mode 1, ANSI and case-insensitive
         push chr$("string")      ; the pattern
         push chr$("This is a STRING")   ; the source
         push 1         ; the start pos
         call ptrInstr
         MsgBox 0, eax, chr$("The match is here:"), MB_OK
      .else
         invoke MessageBox, 0, chr$("InstrDLL not found"), 0, MB_OK
      .endif
      invoke FreeLibrary, hDll
   .else
      invoke MessageBox, 0, chr$("DLL not found"), 0, MB_OK
   .endif
   invoke ExitProcess, 0

end start

Modify accordingly for Unicode strings. Full code is attached.
Edit: Version 2 of the dll fixes a bug that occurred for long patterns. New limit is 4000 bytes.
Edit: Version MbMiniDll_call_w_timings2 fixes another bug for patterns longer than the source.

KSS

jj2007, man, you are the best  :8)  Thank you!
I will write a DLL that checks whether the CPU supports SSE2; if it does not, I will fall back to the standard StrStrIW() (I like universal solutions).

Thank you!
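
Editor's sketch of the dispatch idea above, in portable C rather than MASM or Delphi. Everything here is an assumption made for illustration: find_fn, find_ci_portable, cpu_has_sse2 and select_find are hypothetical names, find_ci_portable merely stands in for StrStrIW(), and the SSE2 test relies on the GCC/Clang builtin __builtin_cpu_supports (on other compilers or non-x86 targets the sketch simply reports "no SSE2" and takes the fallback).

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Shared signature: pointer to the first case-insensitive match of pat
   in src, or NULL when there is none. */
typedef const wchar_t *(*find_fn)(const wchar_t *src, const wchar_t *pat);

/* Portable fallback; a stand-in for StrStrIW() in this sketch. */
const wchar_t *find_ci_portable(const wchar_t *src, const wchar_t *pat)
{
    size_t n = wcslen(src), m = wcslen(pat);
    if (m > n)
        return NULL;                      /* pattern cannot fit in source */
    for (size_t i = 0; i + m <= n; i++) {
        size_t k = 0;
        while (k < m &&
               towlower((wint_t)src[i + k]) == towlower((wint_t)pat[k]))
            k++;
        if (k == m)
            return src + i;
    }
    return NULL;
}

/* Nonzero when the CPU reports SSE2 (GCC/Clang on x86 only). */
int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;                             /* unknown: take the safe path */
#endif
}

/* Pick the fast routine only if one was supplied and SSE2 is present. */
find_fn select_find(find_fn fast, find_fn fallback)
{
    return (fast != NULL && cpu_has_sse2()) ? fast : fallback;
}
```

The caller would resolve the pointer once at startup, for example with the address obtained from the DLL as the fast routine, and then call through it everywhere, so the CPU check is not repeated on every search.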

KSS

I am a little confused by the result of the wInstr() function.
Quote
FullFileName: C:\??\C:\Users\DIGGER
TempDir: C:\Users\DIGGER\AppData\Local\Temp\
wInstr(@FullFileName,@TempDir):C:\Users\DIGGER  :eek
StrStrIW(@FullFileName,@TempDir): null
StrStrIW() gave the correct result.

Does wInstr() contain a bug, or am I misunderstanding something?

KeepingRealBusy

For what it's worth, my attack would be somewhat along the lines of Boyer-Moore (sp?). Convert the first character of the search string to both upper and lower case, and do the same for the last character (this means that you also know the length of the search string). Search for the first occurrence of the lowercase first character, then check whether either of the last characters matches the character in the string being searched at the location given by the first-character match point plus the length of the search string - 1. If there is no match, discard the lowercase match and try with the first uppercase match. If there is a match for the lowercase character, then search for the uppercase first character as well, and if it is found, check for a last-character match. You will have 0, 1, or 2 matches. If 0, increment the first-character pointers and start again. If 1 or 2, do a strnicmp on the match or matches, but remember that you already know that the first and last characters match, so start the compare at match + 1 for a length of length - 2. If nothing matches, increment the pointers and continue; if one matches, accept it; if two match, accept the one with the lowest start.

Note, this method will work for MBCS (Multi Byte Characters) as well.

As pointed out earlier, if space is available (at least enough for two copies of the search string), then convert the search string to lowercase, and convert the matched strings to a lowercase string and do a lowercase compare.

Oh, I need to change this slightly. Like BM, search for the last character first; this gives you the end of the string, and you already know the start of the string. Discard any last-character matches that start too early (where the length from start to end is less than the search string length). When you find a matching end, check for a matching front, then proceed to compare the string segments.

Dave.
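
Editor's sketch of the idea above, in C. It is a simplification under stated assumptions, not Dave's exact procedure and not real Boyer-Moore: it walks the source once, rejects a position cheaply unless both the first and the last character match in either case, and only then compares the interior. The name find_ci_ends_first is hypothetical, there are no skip tables, and only single-byte characters are handled (no MBCS).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive search that tests the two end characters first.
   Returns a pointer to the first match, or NULL if there is none. */
const char *find_ci_ends_first(const char *src, const char *pat)
{
    size_t n = strlen(src), m = strlen(pat);
    if (m == 0 || m > n)
        return NULL;              /* empty or too-long pattern: no match */

    /* Both cases of the first and last pattern characters, computed once. */
    unsigned char first_lo = (unsigned char)tolower((unsigned char)pat[0]);
    unsigned char first_up = (unsigned char)toupper((unsigned char)pat[0]);
    unsigned char last_lo  = (unsigned char)tolower((unsigned char)pat[m - 1]);
    unsigned char last_up  = (unsigned char)toupper((unsigned char)pat[m - 1]);

    for (size_t i = 0; i + m <= n; i++) {
        unsigned char a = (unsigned char)src[i];
        unsigned char z = (unsigned char)src[i + m - 1];

        if (a != first_lo && a != first_up)
            continue;             /* first character does not match */
        if (z != last_lo && z != last_up)
            continue;             /* last character does not match */

        /* Ends already match, so compare only the interior:
           positions 1 .. m-2, i.e. length m-2 starting at match+1. */
        size_t k = 1;
        while (k + 1 < m &&
               tolower((unsigned char)src[i + k]) ==
               tolower((unsigned char)pat[k]))
            k++;
        if (k + 1 >= m)
            return src + i;
    }
    return NULL;
}
```

Because the scan runs left to right, the first accepted position is automatically the one with the lowest start, which covers the "if two, accept the lowest" rule.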

jj2007

Quote from: KSS on June 30, 2010, 03:33:05 AM
Does wInstr() contain a bug, or am I misunderstanding something?

Yes, it's a big fat bug. I had been a bit stingy with the local buffer for the pattern. Now you can use patterns up to 4000 bytes with dll version 2 attached above. Sorry for that :red

Quote from: KeepingRealBusy on June 30, 2010, 06:19:39 AM
For what it's worth, my attack would be...
Go ahead - testbed is included now :bg

Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
766     cycles for wInstr (MasmBasic)
31141   cycles for StrStrIW

KSS

jj2007, you fixed a different bug  :toothy

The bug is still present, look at this:

P.S. Do you check the length of the incoming strings?

jj2007

I have used your strings, and it works for me. Can you post a complete example, please?

I don't know how wInstrE is defined, and how you construct your @FullFileName and @TempDir strings. Judging on the basis of snippets is simply not possible...

KSS

wInstrE PROC lpFirst:DWORD, lpSrch:DWORD
     void wInstr(1,lpFirst,lpSrch,1)    ; 1=case-insensitive
RET
wInstrE ENDP


This function is in my DLL.

I will recheck my code and post the result.

P.S. I have DEP and ASLR enabled.


I wrote:
FatalAppExitW(0,wInstrE('C:\??\C:\Users\DIGGER','C:\Users\DIGGER\AppData\Local\Temp\'));
And got this result:
C:\Users\DIGGER
:'(

jj2007

I am able to reproduce the problem. It may have to do with the fact that the pattern is longer than the source string, but in any case it should correctly return zero. I will post a new version tonight.

Thanks for finding this :thumbu

It should work now, please check with MbMiniDll_call_w_timings2.zip above.
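
Editor's sketch, in C, of the two guards discussed in this thread: a pattern longer than the source can never match, and a pattern copied into a fixed local buffer must be checked against that buffer's size first. This is not the MasmBasic source, which is not published. find_ci_checked and the PAT_BUF_* names are hypothetical, and the 4000-byte figure is taken from the posts above.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

#define PAT_BUF_BYTES 4000   /* limit mentioned in the thread */
#define PAT_BUF_CHARS (PAT_BUF_BYTES / sizeof(wchar_t))

/* Case-insensitive wide-string search with explicit length checks.
   Returns a pointer to the first match, or NULL if there is none. */
const wchar_t *find_ci_checked(const wchar_t *src, const wchar_t *pat)
{
    wchar_t lowered[PAT_BUF_CHARS];      /* local copy of the pattern */
    size_t n = wcslen(src);
    size_t m = wcslen(pat);

    if (m == 0 || m > n)
        return NULL;                     /* pattern longer than source */
    if (m > PAT_BUF_CHARS)
        return NULL;                     /* would overflow the buffer */

    /* Lowercase the pattern once instead of on every comparison. */
    for (size_t k = 0; k < m; k++)
        lowered[k] = (wchar_t)towlower((wint_t)pat[k]);

    for (size_t i = 0; i + m <= n; i++) {
        size_t k = 0;
        while (k < m &&
               (wchar_t)towlower((wint_t)src[i + k]) == lowered[k])
            k++;
        if (k == m)
            return src + i;
    }
    return NULL;
}
```

With KSS's strings, passing the FullFileName value as src and the TempDir value as pat returns NULL, because the pattern is the longer of the two.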

KSS


jj2007

Glad to see it works :bg

By the way: Your message box says "error" in Russian. Have you tried the wRes$() function of MasmBasic?

Quote
Res$, wRes$
   Let esi=Res$(1000)      ; get the resource string with ID 1000 from the *.rc string table (ANSI)
   invoke MessageBoxW, 0, wRes$(1001), wRes$(1000), MB_OK
   Open "O", #1, "TestUnicode.txt"
   wPrint #1, wChr$("Unicode, oh yeah!", 13, 10)
   wPrint #1, wRes$(1000), wTb$, wRes$(1001), wCrLf$
   Close
Rem   - returns DWORD in edx
   - you can embed resources in your *.asc file by using two Rsrc bookmarks
     (case-sensitive) below end Start, for example:
Rsrc
STRINGTABLE
BEGIN
   1000,   "This is my title"
   1001,   "This is my text"
END
Rsrc

KSS

Quote from: jj2007 on July 02, 2010, 08:31:20 AM
Your message box says "error" in Russian.
No, in Ukrainian.

Quote from: jj2007 on July 02, 2010, 08:31:20 AM
Have you tried the wRes$() function of MasmBasic?
No, I haven't tried it. I haven't used *.asc yet.

P.S. I use my own ASM IDE (resource editor with ID checker and ID synchronization, code auto-completion, code library, automatic code builder, support for MASM and GoASM (Unicode), and many other features that I want ::) )

jj2007

Quote from: KSS on July 02, 2010, 09:48:00 AM
Quote from: jj2007 on July 02, 2010, 08:31:20 AM
Your message box says "error" in Russian.
No, in Ukrainian.
Oops, sorry :red

Quote
Quote from: jj2007 on July 02, 2010, 08:31:20 AM
Have you tried the wRes$() function of MasmBasic?
No, I haven't tried it. I haven't used *.asc yet.

P.S. I use my own ASM IDE (resource editor with ID checker and ID synchronization, code auto-completion, code library, automatic code builder, support for MASM and GoASM (Unicode), and many other features that I want ::) )


Rolling your own is always better. Or at least, more fun :green

The library does not depend on the *.asc format or RichMasm. You can use plain text *.asm files, too:

include \masm32\MasmBasic\MasmBasic.inc
.data
My$ dd txMyString
txMyString db "Good morning, this is Masm32!", 0  ; just for fun

Init
invoke MessageBoxW, 0, wRes$(1000), wChr$("There it is:"), MB_OK
Exit

end start


(of course, the snippet must be assembled with the stringtable in the rc file, as in the example above...)