The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: zemtex on April 24, 2012, 07:29:57 PM

Title: filename from path
Post by: zemtex on April 24, 2012, 07:29:57 PM
I think I recall an API function for extracting the filename from a path. I'm not quite sure, but I do think it exists.

I guess I could make my own routine to extract the filename by searching from the end of the path and backwards until I find a '\' character. But the problem is that it is not reliable: you could encounter a case where the path is "c:file.ext" and that would be very bad indeed, since a backslash is not a given. A solution to this could be to scan backwards as long as there are letters and not special characters; as soon as I encounter a special character, the position after it would be the beginning of the filename.

But I want to avoid all of this and just ask you people if there is a function for this in the masm32 library or any other libraries?
Title: Re: filename from path
Post by: dedndave on April 24, 2012, 07:52:42 PM
not sure if there is an API function for this
but, you almost have it written   :P

start at the end of the string and work toward the beginning
if you encounter a backslash, slash, or colon, the character you looked at just before it (the one following it in the string) is the beginning of the filename
if you get to the beginning of the string, without encountering one of those, it is the beginning of the filename

the forward slash is not likely, but makes the routine compatible with devices, volumes, registry entries, or even URL's i guess
the colon is not likely, either
as far as i know, API functions that return a fully qualified path, always insert a backslash after the colon
but - it again makes the routine compatible with user-entered paths, etc, that may not
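
The backward scan described in this post can be sketched in MASM roughly as follows. This is only an illustration under stated assumptions: the helper name FindNamePart and its two-argument interface are made up for this example, the string is ANSI, and the caller already knows its length.

```masm
include \masm32\include\masm32rt.inc

FindNamePart PROTO :DWORD, :DWORD   ; hypothetical helper, not part of any library

.data
szTest db "c:file.ext", 0           ; drive-relative path with no backslash

.code

; In:  lpPath  = address of a zero-terminated ANSI path
;      cchPath = its length, not counting the terminator
; Out: EAX = address of the first character of the file name
FindNamePart PROC lpPath:DWORD, cchPath:DWORD
    mov eax, lpPath
    mov ecx, cchPath
    add eax, ecx            ; EAX -> null terminator
scan:
    test ecx, ecx
    jz  done                ; reached the start: the whole string is the name
    mov dl, [eax-1]         ; character just before the current position
    cmp dl, '\'
    je  done
    cmp dl, '/'
    je  done
    cmp dl, ':'
    je  done
    dec eax
    dec ecx
    jmp scan
done:
    ret
FindNamePart ENDP

start:
    invoke FindNamePart, addr szTest, sizeof szTest - 1
    print eax, 13, 10       ; for szTest this is the "file.ext" part
    exit
end start
```

Because the check looks at the character before the current position, a path that ends in a separator yields a pointer to the null terminator, i.e. an empty file name.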
Title: Re: filename from path
Post by: zemtex on April 24, 2012, 08:00:50 PM
I could implement a safety mechanism as soon as I have split the path and filename. I could check if the path exists; if it doesn't, then I have split it badly.
Title: Re: filename from path
Post by: dedndave on April 24, 2012, 08:11:22 PM
probably not a problem, but...
for that matter - check to see if the file exists - that verifies the path for you
Title: Re: filename from path
Post by: hutch-- on April 25, 2012, 04:31:51 AM
 :bg

There are times when I wonder why I write help files. MASM32 Library reference.


Path Functions


GetAppPath
NameFromPath
GetPathOnly
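
A minimal usage sketch for two of the listed functions follows. The argument order (source path first, destination buffer second) is how I recall the MASM32 library reference describing them, so treat it as an assumption and check the help file; the buffer names are made up for the example.

```masm
include \masm32\include\masm32rt.inc

.data
szFull db "C:\Masm32\Macros\Macros.asm", 0

.data?
szName db MAX_PATH dup (?)      ; receives the file name
szDir  db MAX_PATH dup (?)      ; receives the directory part

.code
start:
    ; assumed argument order: source path, then destination buffer
    invoke NameFromPath, addr szFull, addr szName
    invoke GetPathOnly,  addr szFull, addr szDir

    print addr szName, 13, 10
    print addr szDir, 13, 10
    exit
end start
```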
Title: Re: filename from path
Post by: zemtex on April 25, 2012, 05:04:27 AM
Thanks that was exactly what I had in mind.  :U
Title: Re: filename from path
Post by: sinsi on April 25, 2012, 05:58:30 AM
shlwapi has a lot of path functions, ones like PathFindFileName
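
For reference, calling PathFindFileName from MASM32 could look like the sketch below. It assumes the MASM32 SDK is installed under \masm32 and ships shlwapi.inc and shlwapi.lib; the test string is only an example.

```masm
include \masm32\include\masm32rt.inc
include \masm32\include\shlwapi.inc
includelib \masm32\lib\shlwapi.lib

.data
szFull db "c:\path\file", 0

.code
start:
    ; returns a pointer into szFull in EAX; nothing is copied
    invoke PathFindFileName, addr szFull
    print eax, 13, 10
    exit
end start
```

Since the function only returns a pointer into the original string, no output buffer is needed, but the result is only valid as long as the source string is.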
Title: Re: filename from path
Post by: jj2007 on April 25, 2012, 06:23:01 AM
Quote from: sinsi on April 25, 2012, 05:58:30 AM
shlwapi has a lot of path functions, ones like PathFindFileName

OUTPUT (http://msdn.microsoft.com/en-us/library/windows/desktop/bb773589%28v=vs.85%29.aspx):
==========
Search for the file in path       "c:\path\file"
Returns the file part of the path "file"

Search for the file in path       "c:\path"
Returns the file part of the path "path"

Search for the file in path       "c:\path\"
Returns the file part of the path "path\"

Search for the file in path       "c:\"
Returns the file part of the path "c:\"

Search for the file in path       "c:"
Returns the file part of the path "c:"

Search for the file in path       "path"
Returns the file part of the path "path"

Looks messy, but you can get Unicode mess, too  ::)
Title: Re: filename from path
Post by: xandaz on April 25, 2012, 10:04:33 PM
    Zem.... maybe you should write your own.

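lea esi,szPathFileName
mov edx,esi             ; remember where the string starts
cld
align_to_eos:
lodsb
or al,al
jnz align_to_eos
dec esi                 ; lodsb stepped one past the terminator
align_to_filename:
cmp esi,edx
je got_filename         ; no backslash at all: the whole string is the name
dec esi                 ; walk backwards from the terminator
cmp byte ptr [esi],'\'
jne align_to_filename
inc esi                 ; first character after the last backslash
got_filename:
mov eax,esi
ret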

i think this may work but should be tested
Title: Re: filename from path
Post by: dedndave on April 25, 2012, 10:31:49 PM
i would think these should return NULL (i.e. a pointer to the null terminator at the end of the source string)

Search for the file in path       "c:\path\"
Returns the file part of the path "path\"

Search for the file in path       "c:\"
Returns the file part of the path "c:\"

Search for the file in path       "c:"
Returns the file part of the path "c:"

the other results are as expected
Title: Re: filename from path
Post by: fearless on April 26, 2012, 01:51:43 PM
I have a couple of functions that I use that might be what you're looking for: JustFname and JustFnameExt. The first one returns the filename without extension, the second one returns the filename and extension; both strip out the path parts. Seems to work ok, though I haven't verified it for every eventuality. Pass in szFilePathName as the full path of the file, and pass a buffer into szFileName

.data
szMyFilesFullPath db "c:\program files\common\somefile.ext",0
szFileNameBuffer db MAX_PATH dup (0)
.code
Invoke JustFname, Addr szMyFilesFullPath, Addr szFileNameBuffer ; szFileNameBuffer should contain 'somefile'
Invoke JustFnameExt, Addr szMyFilesFullPath, Addr szFileNameBuffer ; szFileNameBuffer should contain 'somefile.ext'


.486                      ; force 32 bit code
.model flat, stdcall      ; memory model & calling convention
option casemap :none      ; case sensitive
include windows.inc
include kernel32.inc
includelib kernel32.lib
include masm32.inc
includelib masm32.lib

.code

;**************************************************************************
; Strip path name to just filename without extension
;**************************************************************************
JustFname PROC USES esi edi szFilePathName:DWORD, szFileName:DWORD
LOCAL LenFilePathName:DWORD
LOCAL nPosition:DWORD

Invoke szLen, szFilePathName
mov LenFilePathName, eax
mov nPosition, eax

mov edi, szFileName ; load the output pointer before anything writes through it
.IF LenFilePathName == 0
mov byte ptr [edi], 0
ret
.endif

mov esi, szFilePathName
add esi, LenFilePathName ; esi -> null terminator

; walk backwards until the character before esi is a separator
.WHILE nPosition != 0
movzx eax, byte ptr [esi-1]
.BREAK .IF al == '\' || al == ':' || al == '/'
dec esi
dec nPosition
.ENDW

; esi -> first character of the filename; copy up to the first '.'
mov eax, nPosition
.WHILE eax != LenFilePathName
movzx eax, byte ptr [esi]
.BREAK .IF al == '.' ; found our full stop - so stop here
mov byte ptr [edi], al
inc edi
inc esi
inc nPosition
mov eax, nPosition
.ENDW
mov byte ptr [edi], 0h ; always terminate, even when there is no '.'
ret
JustFname ENDP

end


.486                      ; force 32 bit code
.model flat, stdcall      ; memory model & calling convention
option casemap :none      ; case sensitive
include windows.inc
include kernel32.inc
includelib kernel32.lib
include masm32.inc
includelib masm32.lib

.code

;**************************************************************************
; Strip path name to just filename with extension
;**************************************************************************
JustFnameExt PROC USES esi edi szFilePathName:DWORD, szFileName:DWORD
LOCAL LenFilePathName:DWORD
LOCAL nPosition:DWORD

Invoke szLen, szFilePathName
mov LenFilePathName, eax
mov nPosition, eax

mov edi, szFileName ; load the output pointer before anything writes through it
.IF LenFilePathName == 0
mov byte ptr [edi], 0
ret
.endif

mov esi, szFilePathName
add esi, LenFilePathName ; esi -> null terminator

; walk backwards until the character before esi is a separator
.WHILE nPosition != 0
movzx eax, byte ptr [esi-1]
.BREAK .IF al == '\' || al == ':' || al == '/'
dec esi
dec nPosition
.ENDW

; esi -> first character of the filename; copy the rest of the string
mov eax, nPosition
.WHILE eax != LenFilePathName
movzx eax, byte ptr [esi]
mov byte ptr [edi], al
inc edi
inc esi
inc nPosition
mov eax, nPosition
.ENDW
mov byte ptr [edi], 0h ; null out filename
ret
JustFnameExt ENDP


end
Title: Re: filename from path
Post by: zemtex on April 26, 2012, 07:45:31 PM
Quote from: xandaz on April 25, 2012, 10:04:33 PM
    Zem.... maybe you should write your own.

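lea esi,szPathFileName
mov edx,esi             ; remember where the string starts
cld
align_to_eos:
lodsb
or al,al
jnz align_to_eos
dec esi                 ; lodsb stepped one past the terminator
align_to_filename:
cmp esi,edx
je got_filename         ; no backslash at all: the whole string is the name
dec esi                 ; walk backwards from the terminator
cmp byte ptr [esi],'\'
jne align_to_filename
inc esi                 ; first character after the last backslash
got_filename:
mov eax,esi
ret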

i think this may work but should be tested


Yes  :P

I have written routines like this before. I always find myself in need of this type of function again, and even though I have written it in the past, I've forgotten where I put it and what language it was in.
I ended up using the masm32 library call "NameFromPath". The only thing I couldn't find in the masm32 library was a function to retrieve the file type extension, "exe" for example.

It can be achieved with a few instructions though:

cld
lea edi, szFileName
mov ecx, MAX_PATH
mov eax, 2Eh
repne scasb
mov eax, edi

eax now points to the first character in the file extension, assuming the name doesn't have double extensions.
Title: Re: filename from path
Post by: dedndave on April 26, 2012, 07:47:31 PM
again - you should start from the end of the string
periods are allowed in folder names, as well
Title: Re: filename from path
Post by: zemtex on April 26, 2012, 07:50:54 PM
Like I said, it's not entirely safe, but fast. ( I didn't say you ought to start from the beginning of the string in my first post, you must have misread something )

Here is a safer way:

cld
lea edi, szFileName
mov ecx, MAX_PATH
mov eax, 2Eh
repne scasb
mov eax, [edi]
and eax, 0FF000000h
jz match
; IF YOU REACH THIS POINT, PERFORM A ROBUST EXTRACTION OF FILE EXTENSION METHOD FROM THIS POINT (slower), but chances are you will succeed in 99% of cases without reaching this place

match:
edi now points to file extension
Title: Re: filename from path
Post by: hutch-- on April 27, 2012, 02:27:09 AM
Most of your path API functions return the length of the string so you just add it to the start offset to get the end, then scan backwards to get either the "\", "." or the start of the string if neither are present.
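
As a rough sketch of that idea, the example below takes the length that GetModuleFileName returns in EAX, jumps straight to the end of the string and scans backwards for the extension. The buffer name and labels are made up for this illustration, and only the backslash is treated as a folder separator to keep it short.

```masm
include \masm32\include\masm32rt.inc

.data?
szSelf db MAX_PATH dup (?)

.code
start:
    invoke GetModuleFileName, NULL, addr szSelf, MAX_PATH
    ; EAX = number of characters copied, so no separate length scan is needed
    lea edx, szSelf[eax]        ; EDX -> null terminator
    mov ecx, edx                ; ECX = result; stays at the terminator if no '.' is found
scan:
    cmp edx, offset szSelf
    je  done                    ; reached the start of the string
    dec edx
    mov al, [edx]
    cmp al, '\'
    je  done                    ; hit the folder part first: no extension
    cmp al, '.'
    jne scan
    lea ecx, [edx+1]            ; first character after the last '.'
done:
    print ecx, 13, 10           ; extension of this program's own file, or an empty line
    exit
end start
```

Stopping at the first backslash keeps a period in a folder name from being mistaken for the start of the extension.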
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 06:26:55 AM
you could also scan for the null terminator, which is a safer bet

cld
mov edi, szFileName
xor al, al
mov ecx, MAX_PATH + 1
repne scasb
sub edi, 4
mov al, byte ptr [edi-1]
cmp al, 2Eh
je match
; PERFORM ROBUST CALCULATION FROM THIS POINT IF YOU GET HERE
match:
edi points to file extension



Another example

cld
mov edi, szFileName
xor al, al
mov ecx, MAX_PATH + 1
repne scasb
mov eax, [edi-5]
cmp al, 2Eh
je match
; PERFORM ROBUST CALCULATION FROM THIS POINT IF YOU GET HERE
match:
the 3 high-order bytes of eax now contain the file extension, and the low byte holds the period. If you are looking for an .exe file, you could quickly find out if it is one by doing:
cmp eax, 6578652Eh
je foundexefile
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 03:42:55 PM
when you scan for the null terminator, you are acquiring the string length
what Hutch is saying is - that may already be done for you if the string came from an API function

of course, if the string is defined in the .DATA section, you can use the SIZEOF operator to get the length (it counts the terminating zero if that is part of the definition, so subtract one)

if the user enters a string, however, you will have to get the length
in some cases, the function that returns the string may also return the length
GetWindowText is one example that comes to mind - the length is in EAX
if you use SendMessage to get the string, it may not (depending on the type of control)
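
A small sketch of the SIZEOF case, assuming the string and its terminating zero are defined on one line in the .DATA section (the string itself is just an example):

```masm
include \masm32\include\masm32rt.inc

.data
szPath db "C:\Masm32\Macros\Macros.asm", 0

.code
start:
    ; SIZEOF counts the terminator too, so subtract one; no run-time scan is involved
    mov ebx, SIZEOF szPath - 1
    lea esi, szPath[ebx]        ; ESI -> null terminator, ready for a backward scan
    print str$(ebx), " characters", 13, 10
    exit
end start
```

The length is computed at assembly time, so this only applies to strings whose content is fixed in the source.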
Title: Re: filename from path
Post by: jj2007 on April 27, 2012, 04:48:16 PM
Quote from: zemtex on April 27, 2012, 06:26:55 AM
or eax, 20202000h
cmp eax, 6578652Eh
je foundexefile

Just to be on the safe side  :wink
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 04:55:59 PM
Yes, I forgot about uppercase and lowercase, but the algorithm is good. I didn't have time to test the code. I wrote that, but I edited the post and deleted it again accidentally  :toothy

dave, nobody is arguing whether you are using the api or not, this is just pure algo.  :naughty:
besides, you don't need to extract the file extension from a file dialog, you use filter settings for that. No backscanning, searching or any mumbo jumbo, you simply set the file dialog to accept the specific file types you want.
There are cases when you don't need X, there are cases where you need X. You are arguing that an algorithm is bad because it is partially needed and partially not needed, which is bad logic.

To give a similar example to you: "I don't need to refill my car at the gas station, I already have gasoline in my car"  :toothy

I have made a diagram so that you understand what I am saying:

(http://i.imgur.com/MwQcp.png)

To take it even further. Here is your line of thinking. "I have solved a problem by using a dead slow function as an excuse to not use a faster one, therefore a different solution is always bad"

What you should have been thinking dave, is "There are two ways of solving a problem, the first solution is just as relevant as the second one, and there are no good reasons to exclude any of them"

....enough logic for now.  :lol

I'm tempted to take it a bit further:

Assumption from your part: "All programmers will put an API function call in front of another call, this is just the way it works. It will happen all the time, in 100% of the cases"
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 06:22:24 PM
i was only trying to help
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 06:24:13 PM
Quote from: dedndave on April 27, 2012, 06:22:24 PM
i was only trying to help

My criticism does not have reverse effect on your help. Perhaps you don't like constructive criticism?  :wink
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 06:46:16 PM
you can criticize me all you like - lol

you seem to have missed the point
you are trying to write something that is "fast"
but, you insist on using CLD and REPNZ SCASB
that's not fast

the fact is - the length is often already there for you, as Hutch mentioned
in cases where it isn't - the masm32 StrLen function performs quite well
there may be faster routines, but probably not a lot faster - and this one is already written for you

as for the file dialogs - i was referring to other places where a user might enter a filename, like an edit box or something
i am aware of the file open/save dialogs functionality

quote from Hutch...
Quote: Most of your path API functions return the length of the string so you just add it to the start offset to get the end, then scan backwards to get either the "\", "." or the start of the string if neither are present.

and your reply was...
Quote: you could also scan for the null terminator, which is a safer bet

cld
mov edi, szFileName
xor al, al
mov ecx, MAX_PATH + 1
repne scasb

i tried to clarify for you, by trying to examine the cases where the length is provided and those where it is not

it's all good, though
if you don't want help - don't ask for it   ::)
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 07:05:18 PM
Quote from: dedndave on April 27, 2012, 06:46:16 PM
you can criticize me all you like - lol

That was not a given, as I deduced  :lol

Quote from: dedndave on April 27, 2012, 06:46:16 PM
you seem to have missed the point
you are trying to write something that is "fast"
but, you insist on using CLD and REPNZ SCASB
that's not fast

I have several points here.

1: Instead of saying "is not fast", why do you not do as you always do and present an improvement? There is no reason to step away from that practice and resort to negativity.
2: When working with strings, string instructions are almost always faster.
3: I used the term "faster one", which means relative to the chunk you proposed to me (1 api + an extra algo of your own). You are making up thoughts of your own here.
4: "that's not fast" is another example of bad logic. "not fast" is a relative term.
5: I am not "trying to write something", I have already written it, without testing it I might add. And I never needed the function, it was just for fun.
6: I have not missed the point, I have shown you very accurately (with pictures) that your logic is bad.
7: About "You can criticize me all you like", the question is whether I could have done that without telling you first. I leave that question unanswered :)

Quote from: dedndave on April 27, 2012, 06:46:16 PM
the fact is - the length is often already there for you, as Hutch mentioned

Nobody is arguing about the length. The length is just a side effect in my algorithm, and in your case it is a tool achieved by a slower algorithm.

Quote from: dedndave on April 27, 2012, 06:46:16 PM
in cases where it isn't - the masm32 StrLen function performs quite well
The StrLen function has got nothing to do with what we are talking about. What you are saying here is that "There are more green apples in africa, even though you have an apple tree in your own garden"

Quote from: dedndave on April 27, 2012, 06:46:16 PM
there may be faster routines, but probably not a lot faster - and this one is already written for you

When did this become a discussion of which is the fastest routine in existence?  :P You have missed the point yourself.

Quote from: dedndave on April 27, 2012, 06:46:16 PM
as for the file dialogs - i was referring to other places where a user might enter a filename, like an edit box or something
i am aware of the file open/save dialogs functionality

I do not care about your private popup where people have to type manually; in windows the default behavior is to use a file dialog. The second thing I want to say about this is that proposing examples of where it does not work is not an argument against the situations where it does work and where it is helpful. And it is definitely not an argument against the algorithm itself.

Quote from: dedndave on April 27, 2012, 06:46:16 PM
quote from Hutch...
Quote: Most of your path API functions return the length of the string so you just add it to the start offset to get the end, then scan backwards to get either the "\", "." or the start of the string if neither are present.

hutch is a wise man.

Quote from: dedndave on April 27, 2012, 06:46:16 PM
and your reply was...
Quote: you could also scan for the null terminator, which is a safer bet
It was not a reply to hutch, it was yet another example. I am well aware of the length being returned by api functions, which is why I didn't comment on it; it is a normal way of doing things in windows.

Quote from: dedndave on April 27, 2012, 06:46:16 PM
it's all good, though
if you don't want help - don't ask for it   ::)


I want help, but the length returned by API functions is not exactly what I was looking for, I would say you have a tremendous grasp of the obvious.
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 07:09:36 PM
knock yourself out, buddy

you want me to draw you a picture, with circles and arrows ?
(http://l.yimg.com/us.yimg.com/i/mesg/emoticons7/24.gif)
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 07:13:53 PM
(http://i.imgur.com/P5ZCR.gif)
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 07:15:59 PM
(http://images.bookcloseouts.com/covers/large/isbn978047/9780471799412-l.jpg)  :U

You might want to practice
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 08:39:00 PM
if extracting a few characters from a string is too difficult for you,
then i suggest that maybe assembly language isn't for you....

http://www.vbforums.com/
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 08:45:27 PM
Quote from: dedndave on April 27, 2012, 08:39:00 PM
if extracting a few characters from a string is too difficult for you,
then i suggest that maybe assembly language isn't for you....

http://www.vbforums.com/

I am going to take this step by step, I realize what your lack is, read very carefully.

1: In my first post I ask for a function in the masm32 library so that I don't have to write it myself.
2: In a later reply of mine, I explicitly write that I have written this kind of function many times before.
3: I have already laid out an example how this can be done.

The problem here is that you are not thinking rationally, you are judging by emotions. Your emotions are stirred up, and these emotions are making you think the problem is real, there are no problems around this functionality that I have described, I have even written an algorithm for it in an earlier post.

You are not thinking clearly.
Title: Re: filename from path
Post by: jj2007 on April 27, 2012, 09:31:19 PM
Wow, this looks like fun :U
My bottle of red wine is already half empty, time to join the club with a modest contribution:

include \masm32\include\masm32rt.inc
.code
ThePath db "C:\Masm32\Stupid folder\Stupid sub folder\this is the file.exe", 0
start: mov edi, offset ThePath
xFile: or ecx, -1
xor eax, eax
repne scasb
not ecx
add eax, 92+0 ; put 92+1 to simulate the "not found" case
std
repne scasb
cld
jecxz NotFound
add edi, 2
NotFound:
.if ecx
print "We found it: ["
.else
print "No luck: ["
.endif
print edi, "] in just "
mov eax, NotFound
sub eax, xFile
inkey str$(eax), " bytes"
exit
end start


Any volunteers for pushing it below 21 bytes? Speed is an issue, of course: On my trusty Celeron, parsing the path above takes a whopping 0.27 microseconds. If you have a Million paths to parse, that makes it 0.27 seconds ::)
Title: Re: filename from path
Post by: qWord on April 27, 2012, 09:49:20 PM
ABI-friendly, 18 Bytes, no sting instructions:
mov edx,offset ThePath
xor ecx,ecx
@@: movzx eax,CHAR ptr [edx]
test eax,eax
jz @F
xor eax,'\'
cmovz ecx,edx
inc edx
jmp @B
@@:
.if ecx
lea ecx,[ecx+1]
print ecx,13,10
.else
print "damm",13,10
.endif

Prost  :bg
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 09:51:41 PM
It's funny to see it went from "Don't need it" to "I can make it faster". I find it ridiculously easy to see the inconsistencies.  :lol
I see this inconsistency in both of you, and it is a strange shift.

Keep in mind

1: It was not a speed issue.
2: My code was laid out in 30 seconds, no testing, no preparations, (not 2 days of preparations in Olly)
Title: Re: filename from path
Post by: jj2007 on April 27, 2012, 10:25:12 PM
Quote from: qWord on April 27, 2012, 09:49:20 PM
ABI-friendly, 18 Bytes, no sting instructions

It was not my intention to sting you, qWord :wink

Ok, 3 bytes shorter and 0.15 instead of 0.27 seconds per Million strings parsed :U
More offers?
Prost :bg
Title: Re: filename from path
Post by: zemtex on April 27, 2012, 10:26:32 PM
(.... just keeping it going, everything is normal...)

I sometimes find people more interesting than programming  :bg
Title: Re: filename from path
Post by: hutch-- on April 27, 2012, 11:05:53 PM
 :bg

The only thing missing in this discussion on algorithm design is a registry check for a valid activated version of Windows followed by a secret internet connection to "big brother" to ensure that the paths use politically correct terminology.

I mean HAY !!!! where do you find a million paths in a hurry on any single computer and if you did, why would you need to parse them faster than disk IO ?

About the only reason I can think of to write even vaguely optimised code for such a simple task is so you don't end up with cache pollution using old junk instructions in the middle of fast stuff.
Title: Re: filename from path
Post by: jj2007 on April 27, 2012, 11:22:00 PM
Quote from: hutch-- on April 27, 2012, 11:05:53 PM...cache pollution using old junk instructions in the middle of fast stuff.

That sounds almost esoteric, Hutch :boohoo:

Howz the red wine down under? My bottle of Primitivo is getting lighter and lighter now... after a long fight with Masm's macro engine :bg
Title: Re: filename from path
Post by: dedndave on April 27, 2012, 11:48:48 PM
i tend to think more along the lines of flexibility
for this kind of routine - the more ways i can use it, the better
mine starts off like this....
ParseFileName PROTO :LPSTR,:UINT

;******************************************************************

        OPTION  PROLOGUE:None
        OPTION  EPILOGUE:None

ParseFileName PROC lpszPathName:LPSTR,uLength:UINT

;Parse File Name String - DednDave - 4, 2012
;
;  Used to separate a fully qualified path string.
;
;Example:
; szPathName = 'C:\Masm32\Macros\Macros.asm',0
;
;After the call, EAX will point to: 'Macros.asm',0
;                ECX will point to: 0
;                EDX will point to: 'asm',0
; ECX may be used to calculate string lengths
; if no '\', '/', or ':' is found, EAX = lpszPathName
; if no '.' is found, EDX points to the null terminator
; only periods after the beginning of the file name are considered
; i.e. EDX is always greater than EAX, unless the path is null
;
;Call With: lpszPathName = address of zero-terminated path string
;                uLength = string length,
;             if uLength = NULL, the routine will find the length
;
;  Returns: EAX = address of possible filename string
;           ECX = address of null terminator
;           EDX = address of possible filename extension string
;
;Also Uses: all other registers are preserved
;
;----------------------------------------------

        push    esi
        mov     eax,[esp+12]     ;EAX = uLength
        mov     esi,[esp+8]      ;ESI = lpszPathName
        or      eax,eax
        jnz     PFile0

        INVOKE  StrLen,esi

PFile0: add     eax,esi          ;EAX = index
Title: Re: filename from path
Post by: hutch-- on April 27, 2012, 11:52:35 PM
JJ,

Pure malt man myself, we call primitive red wine "rough red" and while I know a few people who have the intestinal fortitude to consume such beverages, I am not one of them, almost all wines make me sick so I stick to the straight and narrow and only drink very pure spirits, albeit in extreme moderation.
Title: Re: filename from path
Post by: xandaz on April 28, 2012, 11:21:36 PM
    Nice JJ and Q.