I am having trouble locating the end of some text within a block of data. I tried scanning through the text and keeping track of the last location where I found a pattern, but I couldn't get it to work every time. I was wondering if someone could show me how they would go about finding the end of this text in a block where the text is not always the same length.
WPN_M4AUTO.-1.-1.-1.WPN_M4.-1.-1.-1.WPN_AT4.-1.-1.-1.WPN_GRENADEFB.-1.-1.-1.WPN_GRENADEHE.-1.-1.-1.WPN_GRENADESM.-1.-1.-1.WPN_KNIFE.-1.-1.-1
This is another way it could be listed:
WPN_CLAYMORE.-1.-1.-1PWPN_GRENADEFB.-1.-1.-1-WPN_GRENADEHE.-1.-1.-1-WPN_GRENADESM.-1.-1.-1-WPN_KNIFE.-1.-1.-1
The periods are actually 0 characters.
The text starts at the same point in the block, but the end would be in a different place. Two blocks of text would look like this:
..WPN_M4AUTO.-1.-1.-1.WPN_M4.-1.-1.-1.WPN_AT4.-1.-1.-1.WPN_GRENADEFB.-1.-1.-1.WPN_GRENADEHE.-1.-1.-1.WPN_GRENADESM.-1.-1.-1.WPN_KNIFE.-1.-1.-1..
..WPN_CLAYMORE.-1.-1.-1PWPN_GRENADEFB.-1.-1.-1-WPN_GRENADEHE.-1.-1.-1-WPN_GRENADESM.-1.-1.-1-WPN_KNIFE.-1.-1.-1E..
Again, the periods are 0 characters. When I scanned, I read 4 bytes and compared them to 12589, which worked for the first block but not for the second. I assume it's because of the letter E at the end instead of a 0 character.
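For reference, 12589 is 0000312Dh, which is stored in memory as the bytes 2D 31 00 00, that is, "-1" followed by two zero bytes. The following is a minimal sketch, assuming the MASM32 runtime include used elsewhere in this thread; `goodTail` and `badTail` are made-up four-byte samples, not data read from a real file. It shows why the 4-byte compare matches a block ending in -1 and two nulls but not one ending in -1E:

```asm
include \masm32\include\masm32rt.inc

.data
  goodTail db "-1",0,0        ; hypothetical tail of the first block:  2D 31 00 00
  badTail  db "-1","E",0      ; hypothetical tail of the second block: 2D 31 45 00

.code
start:
    mov eax, DWORD PTR goodTail
    print str$(eax),13,10     ; 0000312Dh = 12589
    mov eax, DWORD PTR badTail
    print str$(eax),13,10     ; 0045312Dh = 4534573
    inkey "Press any key to exit..."
    exit
end start
```

Because the byte after the last -1 is part of the dword being compared, any value other than zero in that position changes the result.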
The crucial issue: what defines the end of your block of text?
Without this, how can the end be known?
How many end sequences are there, and what are they?
Once you have the answers about the end, the code will be straightforward.
Your question is not entirely clear, so please correct my assumptions if they are wrong.
What you have is an array of zero-terminated strings (28 in total? I'm not sure whether your examples are meant to be fully accurate, so is that an exact number?), and you're hoping to find the end of each array (weapons inventory)?
If that's the case, then you can just count out 28 items (or however many there are), each of which ends with a zero, and then you're at the end of the array (and the start of the next one?).
Though the 'extra' letter E causes a problem, maybe that's because, either:
- the end of the array is not really where you think it should be;
- or there is some extra encoding which you don't know about. Are the 'zeroes' always actually ZERO?
P.S. What are you doing here - hacking a game?
No, I am not hacking a game. I actually really dislike people who write trainers and such. This is a map file: the map editor can produce a .mis file, which is plain text, and a .bms file, which is read by the game. The map editor only reopens the .mis file. There have been programs in the past that convert a .bms file back to a .mis, but there isn't one for this series of the game, and many people want one. Most want it so they can recover their maps when they lose the .mis file. The zeros are hex 00, so null characters. The weapons are listed right before the items of the map, and I am not worried about converting the weapons. I need to find where they end so I know where the items start. I assume the map editor does not open the .bms file because there is information the .bms file does not contain, like the descriptions set by the user. When the editor saves the .mis, the descriptions are saved in that file. The number of weapons seems to change from map to map. The list always starts at offset 616. Some end at around 756 and some at 725.
Here's a section in hex and its plain text:
5750 4E5F 4D34 4155 544F 002D 3100 2D31 002D 3100 5750 4E5F 4D34 002D 3100 2D31 002D 3100 5750 4E5F 4154 3400 2D31 002D 3100 2D31 0057 504E 5F47 5245 4E41 4445 4642 002D 3100 2D31 002D 3100 5750 4E5F 4752 454E 4144 4548 4500 2D31 002D 3100 2D31 0057 504E 5F47 5245 4E41 4445 534D 002D 3100 2D31 002D 3100 5750 4E5F 4B4E 4946 4500 2D31 002D 3100 2D31 0000
WPN_M4AUTO.-1.-1.-1.WPN_M4.-1.-1.-1.WPN_AT4.-1.-1.-1.WPN_GRENADEFB.-1.-1.-1.WPN_GRENADEHE.-1.-1.-1.WPN_GRENADESM.-1.-1.-1.WPN_KNIFE.-1.-1.-1..
It looks to me like it's simply a variable-length weapon name followed by three parameters. Each weapon name and parameter is terminated by a zero byte. Repeat for however many weapons there are. What's the problem? Read until you find a zero byte, that's a weapon name. Read until you find a zero byte, that's parameter 1. Same for parameters 2 and 3. Loop back and start over.
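The loop described above can be sketched as follows. This is only an illustration under stated assumptions: `list` is a made-up, well-formed two-record buffer (every field zero-terminated, with an extra zero byte closing the list), so it will not cope with a file in which the third parameter is followed by something other than a zero.

```asm
include \masm32\include\masm32rt.inc

.data
  ; hypothetical two-record list, closed by an extra zero byte
  list db "WPN_M4",0,"-1",0,"-1",0,"-1",0
       db "WPN_KNIFE",0,"-1",0,"-1",0,"-1",0
       db 0

.code
start:
    mov esi, OFFSET list
    xor edi, edi                    ; edi = number of weapon records read
    .WHILE BYTE PTR [esi] != 0      ; an empty name marks the end of the list
        print esi,13,10             ; show the weapon name
        mov ebx, 4                  ; one name plus three parameters
        .REPEAT
            .REPEAT                 ; step past one zero-terminated field
                lodsb
            .UNTIL al == 0
            dec ebx
        .UNTIL ebx == 0
        inc edi
    .ENDW
    print str$(edi)," weapons",13,10
    inkey "Press any key to exit..."
    exit
end start
```

When the outer loop ends, esi points at the closing zero byte, which is where the data following the weapon list begins in this made-up layout.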
The big question is what do you want to do with this information? Write it out in a different format? What format?
jckl,
I'm not sure about the form of the input data, or exactly what component you are trying to find the end of, but this demonstrates one method of locating the double null byte, and one method of splitting the data into individual fields.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
.RADIX 16
sample LABEL WORD
dw 5750,4E5F,4D34,4155,544F,002D,3100,2D31,002D,3100,5750
dw 4E5F,4D34,002D,3100,2D31,002D,3100,5750,4E5F,4154,3400
dw 2D31,002D,3100,2D31,0057,504E,5F47,5245,4E41,4445,4642
dw 002D,3100,2D31,002D,3100,5750,4E5F,4752,454E,4144,4548
dw 4500,2D31,002D,3100,2D31,0057,504E,5F47,5245,4E41,4445
dw 534D,002D,3100,2D31,002D,3100,5750,4E5F,4B4E,4946,4500
dw 2D31,002D,3100,2D31,0000
.RADIX 10
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; ------------------------------------------------------------
; Reverse the bytes and then replace the embedded nulls with a
; suitable character that will not act as a string terminator.
; Note: the loop test only sees a double null that starts on a
; word boundary, as it does in this sample.
; ------------------------------------------------------------
    mov ebx, OFFSET sample
    .WHILE WORD PTR [ebx] != 0      ; while not double null byte
        mov ax, [ebx]
        .IF al == 0
            mov al, 0ffh
        .ENDIF
        .IF ah == 0
            mov ah, 0ffh
        .ENDIF
        xchg ah, al
        mov [ebx], ax
        add ebx, 2
    .ENDW
; ----------------------------------------------------
; Use the CRT strtok function to parse the data using
; the replacement character as a delimiter.
; ----------------------------------------------------
    mov ebx, rv(crt_strtok, ADDR sample, chr$(0ffh))
    .WHILE ebx
        print ebx,13,10
        mov ebx, rv(crt_strtok, NULL, chr$(0ffh))
    .ENDW
    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
Well, all that example shows is that the layout seems to be: (weapon, value, value, value), (weapon, value, value, value), ..., NULL
(the extra null at the end indicates the end of the list.)
Want to post the hex for one that breaks this assumption (maybe with a few extra bytes before and after)?
Obviously there must be a structure to it since it's expected to be read back in again. So, either the messed up example is actually a corrupt file, or there's some value somewhere that indicates the real length of the list, or ...(something else)...
...?..o.....................................WPN_CLAYMORE.-1.-1.-1PWPN_GRENADEFB.-1.-1.-1-WPN_GRENADEHE.-1.-1.-1-WPN_GRENADESM.-1.-1.-1-WPN_KNIFE.-1.-1.-1E.........&...........2.....0.....d...d.......@.......d.d.................................................................null
0000003F09006F000A000000000FA0050000000000000000000000000000000000000000000000000000000057504E5F434C41594D4F5245002D31002D31002D315057504E5F4752454E4144454642002D31002D31002D312D57504E5F4752454E4144454845002D31002D31002D312D57504E5F4752454E414445534D002D31002D31002D312D57504E5F4B4E494645002D31002D31002D314500BF000000000000002600000000000000A1C3DAFF32DFDCFFE21530000A000000640000006400000010000000400100000000000064006400000000000000000003000500FFFF1E0000000000000000000F00000000000000000000000A0000000000000000000000000000000000000000000000000000006E756C6C
This is one that messes it up.
Is this an actual file that was output from the game? Can you attach a couple of example files to a post?
Here are a few of the .bms files.
[attachment deleted by admin]
Without knowing the exact structure of the files it's going to be difficult to do anything with them. I see from a previous post that you have an idea of the item structure:
http://www.masm32.com/board/index.php?topic=4570.0
Yeah, I have the structure of everything I need to read in that file, but I can't get past this part. I can tell you that the items start right after the -1 in both examples. I'll attach the text file where I stored my offsets, but it may not make sense to you since it's mostly notes to myself.
[attachment deleted by admin]
For the file that is dropped on the EXE, this code will display the offset from the start of the file of the last occurrence of "WPN_", or zero if there is no occurrence.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
    pMem   dd 0
    memLen dd 0
    cmd    db 260 dup(0)            ; room for MAX_PATH, since a dropped file arrives as a full path
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
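    invoke GetCL, 1, ADDR cmd
    invoke read_disk_file, ADDR cmd, ADDR pMem, ADDR memLen
    .IF eax == 0                    ; read_disk_file returns zero on failure
        print "unable to read file",13,10
        inkey "Press any key to exit..."
        exit
    .ENDIF
    mov ebx, memLen
    .IF ebx < 4                     ; too short to hold "WPN_"
        xor ebx, ebx
    .ELSE
        sub ebx, 4
    .ENDIF
    mov esi, pMem
    .WHILE ebx                      ; scan backwards from the end of the file
        mov eax, [esi+ebx]
        .IF eax == "_NPW"           ; "WPN_" as it is stored in memory
            .BREAK
        .ENDIF
        dec ebx
    .ENDW
    print uhex$(ebx),13,10
    inkey "Press any key to exit..."
    exit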
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
This will dump out the weapons contents, assuming good formatting. It looks like the first three files are good, and test444 has problems. This program makes no assumptions about the length of the parameters for each weapon. If you could assume that each parameter was exactly two characters long, you could also process the test444 file. Does the test444 file actually work as input to the game and show the correct information?
include \masm32\include\masm32rt.inc
.data
file1 db "1.bms",0
file2 db "2.bms",0
file3 db "Destroy All.bms",0
file4 db "test444.bms",0
mem dd ?
flen dd ?
.code
Program:
    call Main
    invoke ExitProcess,0
ReadBmsFile proc uses esi filename:dword
    print "Processing file: "
    print filename,13,10
    invoke read_disk_file, filename, addr mem, addr flen
    or eax,eax
    jz Failure
    cmp flen,268h
    jbe done                        ; file is too short to hold a weapon list
    mov esi,mem
    add esi,268h                    ; the weapon list starts at offset 616 (268h)
ReadLoop:
    cmp byte ptr [esi],0            ; an empty string marks the end of the list
    je done
    print esi,13,10
    .repeat                         ; step past this zero-terminated string
        lodsb
    .until al==0
    jmp ReadLoop
done:
    print "--------",13,10
    invoke GlobalFree,mem
    ret
Failure:
    print "unable to read file",13,10
    ret
ReadBmsFile EndP
Main Proc
    invoke ReadBmsFile,addr file1
    invoke ReadBmsFile,addr file2
    invoke ReadBmsFile,addr file3
    invoke ReadBmsFile,addr file4
    inkey "Press any key to exit..."
    ret
Main EndP
end Program
Yeah, that file works fine. I was doing well until I ran into this file.
In the sequence that does not work, there is a 00 after the E at the end. I believe this is always the case, so is there a way I can find this point? I am not sure exactly how to do it, but if I use that last function it shows the weapons and values, and the last thing it shows is the first item's ID. What I was thinking of is a way to test whether the string has WPN_ or -1 in it. If it does not, then we know it's the end of the weapons and the start of the items. If that's not a good idea, is there a way to detect the end by that last 00?
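A rough sketch of that idea, under stated assumptions: `sample` is a made-up buffer modeled on the tail of the problem file, not real file data, and the code assumes the item data never begins with "WPN_" or "-1". It walks the zero-terminated strings, remembers where the last string beginning with "-1" started, and stops at the first string that begins with neither marker. The byte right after that last "-1" is then taken as the start of the items.

```asm
include \masm32\include\masm32rt.inc

.data
  ; hypothetical buffer modeled on the tail of the problem file
  sample db "WPN_GRENADESM",0,"-1",0,"-1",0,"-1","-"
         db "WPN_KNIFE",0,"-1",0,"-1",0,"-1","E",0,0BFh,0,0,0

.code
start:
    mov esi, OFFSET sample
    xor edi, edi                    ; edi = start of the last "-1" string seen
  nextString:
    mov eax, [esi]
    cmp eax, "_NPW"                 ; does the string start with "WPN_"?
    je  skipString
    cmp ax, "1-"                    ; does the string start with "-1"?
    jne finished                    ; neither one: the weapon list is over
    mov edi, esi
  skipString:
    lodsb                           ; step past this zero-terminated string
    test al, al
    jnz skipString
    jmp nextString
  finished:
    .IF edi == 0
        print "no -1 field found",13,10
    .ELSE
        add edi, 2                  ; first byte after the last "-1"
        sub edi, OFFSET sample      ; turn the pointer into an offset
        print str$(edi),13,10       ; offset 41, the 'E', for this sample
    .ENDIF
    inkey "Press any key to exit..."
    exit
end start
```

The sketch has no bounds checking, so real code would also need to stop at the end of the file buffer.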
Based on your statement that you are not worried about converting the weapons, I intended that you take the offset returned by my code, scan forwards to the next null byte, and add 9 to get the offset of the first item. This will work for all of the samples.
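The three steps described above (take the offset of the last "WPN_", scan forward to the next null byte, add 9) might be combined along these lines. `FindItemsStart` and `sample` are hypothetical names introduced for illustration, and the buffer is modeled on the tail of the problem file rather than read from disk. The 9 covers the three two-character parameters after the name (2+1+2+1+2 bytes) plus the step off the name's own terminator.

```asm
include \masm32\include\masm32rt.inc

FindItemsStart PROTO :DWORD, :DWORD

.data
  ; hypothetical buffer modeled on the tail of the problem file
  sample db 0,0,"WPN_GRENADESM",0,"-1",0,"-1",0,"-1","-"
         db "WPN_KNIFE",0,"-1",0,"-1",0,"-1","E",0,0BFh,0,0
  sampleLen equ $ - sample

.code
start:
    invoke FindItemsStart, ADDR sample, sampleLen
    print uhex$(eax),13,10          ; eax = 43 (2Bh), the 'E', for this sample
    inkey "Press any key to exit..."
    exit

; Returns the offset of the first item in eax, or zero if not found.
FindItemsStart proc uses esi ebx pData:DWORD, cbData:DWORD
    mov esi, pData
    mov ebx, cbData
    cmp ebx, 4
    jb  notFound                    ; too short to hold "WPN_"
    sub ebx, 4
  scanBack:                         ; step 1: last occurrence of "WPN_"
    test ebx, ebx
    jz  notFound                    ; offset zero means "no match", as in the code above
    mov eax, [esi+ebx]
    cmp eax, "_NPW"
    je  scanFwd
    dec ebx
    jmp scanBack
  scanFwd:                          ; step 2: null that ends the weapon name
    cmp ebx, cbData
    jae notFound
    cmp BYTE PTR [esi+ebx], 0
    je  gotNull
    inc ebx
    jmp scanFwd
  gotNull:
    add ebx, 9                      ; step 3: skip the three "-1" parameters
    cmp ebx, cbData
    jae notFound
    mov eax, ebx
    ret
  notFound:
    xor eax, eax
    ret
FindItemsStart endp

end start
```

In a real program pData and cbData would be the pointer and length returned by read_disk_file.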
Aww, I missed that. Thank you, I'll test it over and over with more and more files. For some reason I can't get any of these console apps to work on their own, but the code works when I put it in a function in my app. They actually do run, but I get nothing returned and they don't close. Anyway, thanks for all the examples and help. I now have more to play with and learn from.