I've been playing around with BEAEngine to disassemble a file.
I downloaded the BEAEngine project from the main website; the latest version is 4.x.
Now I am experiencing a small problem.
On the site there is an explanation which includes:
//BEAEngine Example
//0x89, 0x94, 0x88, 0x00, 0x20, 0x40, 0x00
//mov dword ptr ds:[eax + ecx*4 + 402000h], edx
//MyDisasm.Instruction.Category == GENERAL_PURPOSE_INSTRUCTION + DATA_TRANSFER
//MyDisasm.Instruction.Opcode == 0x89
//MyDisasm.Instruction.Mnemonic == "mov "
//
//MyDisasm.Argument1.ArgMnemonic == "eax + ecx*4 + 402000h"
//MyDisasm.Argument1.ArgType == MEMORY_TYPE
//MyDisasm.Argument1.ArgSize == 32
//MyDisasm.Argument1.AccessMode == WRITE
//MyDisasm.Argument1.Memory.BaseRegister == REG0
//MyDisasm.Argument1.Memory.IndexRegister == REG1
//MyDisasm.Argument1.Memory.Scale == 4
//MyDisasm.Argument1.Memory.Displacement == 0x402000
//MyDisasm.Argument1.SegmentReg == DSReg
//
//MyDisasm.Argument2.ArgMnemonic == "edx"
//MyDisasm.Argument2.ArgType == REGISTER_TYPE + GENERAL_REG + REG2
//MyDisasm.Argument2.ArgSize == 32
//MyDisasm.Argument2.AccessMode == READ
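For reference, the Memory fields in that example can be checked by hand from the raw bytes. The following is only a sketch in C, not BEAEngine code: the MemOperand type and decode_modrm_sib function are made-up names, and it handles just the "opcode, ModRM, SIB, disp32" shape used by the example bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: fields of a ModRM + SIB + disp32 memory operand. */
typedef struct {
    unsigned mod, reg, rm;        /* ModRM byte: 2 + 3 + 3 bits */
    unsigned scale, index, base;  /* SIB byte: scale factor (1/2/4/8), index and base register numbers */
    uint32_t disp;                /* 32 bit displacement, stored little-endian */
} MemOperand;

/* Decode an instruction shaped like: opcode, ModRM, SIB, disp32.
   Only the mod == 2, rm == 4 form is handled. Returns 0 on success, -1 otherwise. */
int decode_modrm_sib(const uint8_t *bytes, size_t len, MemOperand *out)
{
    assert(bytes != NULL && out != NULL);
    if (len < 7) return -1;

    uint8_t modrm = bytes[1];
    uint8_t sib = bytes[2];

    out->mod = modrm >> 6;
    out->reg = (modrm >> 3) & 7;
    out->rm = modrm & 7;
    if (out->mod != 2 || out->rm != 4) return -1;   /* not the SIB + disp32 form */

    out->scale = 1u << (sib >> 6);
    out->index = (sib >> 3) & 7;
    out->base = sib & 7;
    out->disp = (uint32_t)bytes[3] | ((uint32_t)bytes[4] << 8) |
                ((uint32_t)bytes[5] << 16) | ((uint32_t)bytes[6] << 24);
    return 0;
}
```

Fed the bytes 0x89, 0x94, 0x88, 0x00, 0x20, 0x40, 0x00 this gives base register 0 (eax), index register 1 (ecx), scale 4, displacement 0x402000 and reg 2 (edx), which lines up with the BaseRegister, IndexRegister, Scale, Displacement and Argument2 values listed above.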
Now in my project I am disassembling the following instruction:
push dword ptr [esp+ebx*4+30721500h]
According to the example above, the result should be:
MyDisasm.Argument1.ArgMnemonic == "esp + ebx*4 + 30721500h", but instead IDA shows the following for that variable:
MyDisasm_Argument1_ArgMnemonic
data:00435180 MyDisasm_Argument1_ArgMnemonic db 10h
data:00435181 db 0
data:00435182 db 2
data:00435183 db 0
data:00435184 db 0
data:00435185 db 0
data:00435186 db 0
data:00435187 db 0
data:00435188 db 0
data:00435189 db 0
data:0043518A db 0
Another variable which should be filled correctly:
MyDisasm.Argument1.Memory.Scale == 4 --> but instead it returns IDA output:
MyDisasm_Argument1_Memory_Scale dd 10h
The only thing it fills correctly is the CompleteInstr variable.
I've included BEAEngine like this:
#Include BeaEngineGoAsm32.inc
Disasm = BeaEngine.lib:Disasm
I also dynamically link msvcrt.dll: #DynamicLinkFile msvcrt.dll
Has anyone else experienced any of these issues?
Hi Flysky,
BEAEngine is a great disassembly lib and well worth the effort to learn to use. Beatrix (http://www.masm32.com/board/index.php?action=profile;u=5325), the author, is a member here and, though not active lately, you can try sending a PM or visit the dedicated forum on the BEAEngine website. For my own purposes I use a custom build of the library that Beatrix sent me; since it is a one-off and serves my needs quite nicely, I have not attempted to update it. Another, albeit more cumbersome, way to get disassembled code is to use the DbgHelp api. Since it is totally COM based, with event sinks etc., without a good knowledge of COM in assembly you might get bogged down in bug hunting more than producing usable code. I am pretty sure that I have posted code for both BEAEngine and DBGHelp disassembly, though I haven't looked around for it; my DBGHelp code can be found in the debug tools for RadAsm 3 in the appropriate subforum.
Edgar
Hmm, I can't edit my post.
It seems I was using an earlier version of BEAEngine,
since version 4.x from August this year gives me compile errors: it doesn't recognize MyDisasm.Eip.
-----------------------------
EDIT: fixed the problem.
I added the latest version of BEAEngine again.
As for the compile error: it seems the MyDisasm structure changed.
Use MyDisasm.EIP instead of MyDisasm.Eip:
struct _Disasm {
UIntPtr EIP; /* in earlier versions this field was named Eip */
UInt64 VirtualAddr;
UIntPtr SecurityBlock;
char CompleteInstr[INSTRUCT_LENGTH];
UInt32 Archi;
UInt64 Options;
INSTRTYPE Instruction;
ARGTYPE Argument1;
ARGTYPE Argument2;
ARGTYPE Argument3;
PREFIXINFO Prefix;
UInt32 Reserved_[40];
};
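A rename like this usually comes with a changed layout, and an include file that does not match the library it is linked against would explain fields such as Memory.Scale holding garbage. The sketch below assumes that was the cause here; the two structs are cut-down, made-up layouts for illustration, not the real BeaEngine definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old header" layout (NOT the real BeaEngine struct). */
typedef struct {
    uint32_t Eip;
    char     CompleteInstr[64];
    uint32_t Scale;
} OldDisasm;

/* Hypothetical "new library" layout with one extra field inserted. */
typedef struct {
    uint32_t EIP;
    uint64_t VirtualAddr;
    char     CompleteInstr[64];
    uint32_t Scale;
} NewDisasm;

/* Number of bytes by which the Scale field moved between the two layouts.
   Code built against OldDisasm reads Scale this many bytes too early
   when the library actually filled in a NewDisasm. */
long scale_field_shift(void)
{
    return (long)offsetof(NewDisasm, Scale) - (long)offsetof(OldDisasm, Scale);
}
```

Any non-zero shift means the caller and the library disagree about where a field lives, so every field after the first difference is read from the wrong place.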
I've cleaned up my double post.
As stated above, I fixed the BEAEngine problem; it was a pretty stupid mistake on my side.
Now I have one more question.
Is it possible to assemble a string into opcodes?
BEAEngine takes the opcodes and generates the string.
Is there any library which can do the same the other way around,
converting the string back into its opcodes? I haven't seen anything like this
in the BEAEngine documentation.
i recall something about this from before
i think you can use JwAsm to generate one line of code
the forum search tool might be helpful
I've found a couple of topics containing a bit of information.
Hutch provided two tools to convert a string to the opcodes (decimal representation).
From the JWasm website, it seems to support MASM syntax only?
i seem to recall there is a command-line switch for JwAsm to assemble one line
maybe i am thinking of GoAsm :P
Not sure what you mean dedndave but I will continue searching for the solution.
Quote from: donkey on November 30, 2011, 06:54:48 PM
Edgar, I've found your COM based dbghelp functions. Indeed, it became bug hunting instead of writing code that I could use.
The IDebugControl interface also offers a method called Assemble.
So I coded an Assemble function:
invoke Assemble, offset AssemblyResult, offset FullRebuildPattern1 // Passes the addresses of two buffers: FullRebuildPattern1 holds the string to convert and AssemblyResult receives the translation.
Assemble FRAME Location, Instruction
uses Ebx
LOCAL status:%UINT_PTR
LOCAL EndOffset:Q
LOCAL pBuffer:%UINT_PTR
LOCAL DisassemblySize:%UINT_PTR
invoke CoTaskMemAlloc,1024
mov [pBuffer], Eax
invoke DebugCreate,offset IID_IDebugClient, offset pIDebugClient
test Eax, Eax
jnz >>.DBGCFAIL
CoInvoke(pIDebugClient,IDebugClient.QueryInterface,offset IID_IDebugControl,offset pIDebugControl)
test Eax, Eax
jnz >>.QIFAIL
invoke GetCurrentProcessId
CoInvoke(pIDebugClient,IDebugClient.AttachProcess,0,0,eax,DEBUG_ATTACH_NONINVASIVE | DEBUG_ATTACH_NONINVASIVE_NO_SUSPEND)
test Eax, Eax
jnz >>.ATTACHFAIL
CoInvoke(pIDebugControl,IDebugControl.WaitForEvent,DEBUG_WAIT_DEFAULT,INFINITE)
mov Ebx, [Instruction] //Load instruction to convert
mov Edx, [Location] //Load the location where Assembly should be stored
:
CoInvoke(pIDebugControl,IDebugControl.Assemble,edx,ebx,offset EndOffset) //Call function. After this call it always ends up with a error it can't determine.
test Eax, Eax
jnz >>.DISASMFAIL
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
.DBGCFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"DebugCreate failed",0
ret
.QIFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"QueryInterface failed",0
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
.ATTACHFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"Attach to process failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
.DISASMFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,0,[pBuffer],"Disassemble failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
EndF
A quick look at the code shows that you are using 32 bit offsets for the Assemble method; as far as I know, the offsets should be 64 bit whether in 32 or 64 bit mode. You will also have a memory leak on the error paths unless you free the memory (pBuffer) there using CoTaskMemFree.
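One common way to avoid that kind of leak is to route every exit through a single cleanup point. This is only a sketch of the pattern in C: buf_alloc, buf_free and run_steps are invented names, and the three "steps" merely stand in for DebugCreate, AttachProcess and Assemble.

```c
#include <assert.h>
#include <stdlib.h>

static int live_buffers = 0;   /* buffers allocated and not yet freed (demonstration only) */

static void *buf_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p != NULL) { free(p); live_buffers--; }
}

/* fail_at: 0 = every step succeeds, 1..3 = the step with that number fails.
   Returns 0 on success, the number of the failed step otherwise,
   or -1 if the buffer could not be allocated. */
int run_steps(int fail_at)
{
    assert(fail_at >= 0 && fail_at <= 3);

    int result = 0;
    void *buffer = buf_alloc(1024);   /* plays the role of pBuffer */
    if (buffer == NULL) return -1;

    for (int step = 1; step <= 3; step++) {
        if (step == fail_at) { result = step; goto cleanup; }
    }

cleanup:
    buf_free(buffer);                 /* reached on success and on every failure */
    return result;
}
```

In the GoAsm routine the equivalent would be to have each failure label end by calling CoTaskMemFree on pBuffer before its ret, or to jump to one shared exit that does it.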
Could it be the string I am passing to the Assemble function that is causing the problem?
What I did was pretty simple. I created a string which holds
mov esi, [esp+4B0] --> Exactly like this.
In WinDBG you simply type A (Assemble) and then type the instruction to assemble.
I'm not sure whether WinDBG does anything to the input, like converting it.
As for the 32 bit or 64 bit offsets, you kind of lost me there. To be able to use 64 bit offsets, would I need
to have a 64 bit function as well?
A 64 bit offset needs to be passed as 2 32 bit parameters, to do this pass the 32 bit address in the first parameter and NULL in the second:
invoke SomeFunc, Addr64
would be
invoke SomeFunc, Addr32, 0
A good way to tell if you are not passing wide enough addresses is to check the value of ESP before and after the call, in 32 bit mode you will have a difference of 4 bytes. For the string I am not familiar with the function but it requires a pointer to a NULL terminated ANSI string. I have found that pointers to strings in global or local memory can lead to threading problems with the debug api, I prefer to allocate memory using CoTaskMemAlloc and copy the string to that in order to get reliable results (in some cases it is the only way to get any result at all).
Edgar
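To make the "two 32 bit parameters" point concrete, here is a small sketch in C of how one 64 bit value breaks into the two dwords that go on the stack. Slots64, split_offset and assemble_arg_bytes are made-up names; the byte count assumes 32 bit code where the interface pointer, Instr and EndOffset are each one 4 byte pointer.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32 bit stack slots that together carry one 64 bit argument. */
typedef struct {
    uint32_t low;    /* written first in the invoke line (Addr32) */
    uint32_t high;   /* written second (0 for a 32 bit address) */
} Slots64;

Slots64 split_offset(uint64_t offset)
{
    Slots64 s;
    s.low = (uint32_t)(offset & 0xFFFFFFFFu);
    s.high = (uint32_t)(offset >> 32);
    return s;
}

/* Stack bytes a 32 bit caller pushes for Assemble(this, Offset, Instr, EndOffset),
   assuming 4 byte pointers and an 8 byte ULONG64. */
unsigned assemble_arg_bytes(void)
{
    return 4u /* this */ + 8u /* ULONG64 Offset */ + 4u /* PCSTR Instr */ + 4u /* PULONG64 EndOffset */;
}
```

A 32 bit address such as 00435180h therefore goes in as the pair (00435180h, 0), and comparing ESP before the first push with ESP after the call is one way to see whether the number of dwords pushed matches what the method removed.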
Still struggling; here is what I have so far.
I used BEAEngine to disassemble a couple of instructions.
In the DATA section I created:
FullRebuildPattern1 DB 100 Dup (?) // To have a buffer which can store a string.
and a couple more variables which should help me rebuild a string.
RebuildComma DB ',',0
RebuildSpace DB ' ',0
RebuildPlus DB '+',0
RebuildLe DB '[',0
RebuildRe DB ']',0
After playing with a couple of disassemblies returned by BEAEngine, I decided to build my own string and see if I can
code a function which converts that string back to its opcodes, so that the result is a real instruction the CPU understands.
I've been using the lstrcat function to build that string.
invoke lstrcat, offset FullRebuildPattern1, offset MovInstr
invoke lstrcat, offset FullRebuildPattern1, offset RebuildSpace
invoke lstrcat, offset FullRebuildPattern1, offset PopRebuildPattern1
invoke lstrcat, offset FullRebuildPattern1, offset RebuildComma
invoke lstrcat, offset FullRebuildPattern1, offset RebuildSpace
invoke lstrcat, offset FullRebuildPattern1, offset RebuildLe
invoke lstrcat, offset FullRebuildPattern1, offset EspInstr
invoke lstrcat, offset FullRebuildPattern1, offset RebuildPlus
invoke lstrcat, offset FullRebuildPattern1, offset EspResult
invoke lstrcat, offset FullRebuildPattern1, offset RebuildRe
The string created is:
mov esi, [esp+000004b0]
To recreate the opcodes from a given string, I thought a method from the interfaces used in donkey's IDebugClient code would be a nice fit.
The documentation describes it like this:
The Assemble and AssembleWide methods assemble a single processor instruction. The assembled instruction is placed in the target's memory.
HRESULT
IDebugControl::Assemble(
IN ULONG64 Offset,
IN PCSTR Instr,
OUT PULONG64 EndOffset
);
HRESULT
IDebugControl4::AssembleWide(
IN ULONG64 Offset,
IN PCWSTR Instr,
OUT PULONG64 EndOffset
);
#ifdef UNICODE
#define AssembleT AssembleWide
#else
#define AssembleT Assemble
#endif
Parameters
Offset
Specifies the location in the target's memory to place the assembled instruction.
Instr
Specifies the instruction to assemble. The instruction is assembled according to the target's effective processor type (returned by GetEffectiveProcessorType).
EndOffset
Receives the location in the target's memory immediately following the assembled instruction. EndOffset can be used when assembling multiple instructions.
Return Value
S_OK
The method was successful.
I coded a function called Assemble which takes 2 parameters.
invoke Assemble, offset AssemblyResult, offset FullRebuildPattern1
Parameter 1 is pointer to a place which should receive assembled code.
Defined in the data section as:
AssemblyResult Db 100 Dup (?)
Parameter 2 is a pointer to the string to assemble.
The Assemble function is based on donkey's IDebugClient code posted on the forum.
Donkey suggested using 64 bit parameters, so I had to rework it a bit; it takes an ANSI string,
although I noticed when stepping through dbgeng.dll that it calls the MultiByteToWideChar API itself.
Assemble FRAME Location, Instruction
uses Ebx
LOCAL status:%UINT_PTR
LOCAL EndOffset:Q
LOCAL pBuffer:%UINT_PTR
LOCAL DisassemblySize:%UINT_PTR
LOCAL wfmt[MAX_PATH] :W
invoke CoTaskMemAlloc,1024
mov [pBuffer], Eax
invoke DebugCreate,offset IID_IDebugClient, offset pIDebugClient //Create debugging event
test Eax, Eax
jnz >>.DBGCFAIL
CoInvoke(pIDebugClient,IDebugClient.QueryInterface,offset IID_IDebugControl,offset pIDebugControl) //Determine which debugging interface to use
test Eax, Eax
jnz >>.QIFAIL
invoke GetCurrentProcessId //Get Current process ID -- it returns the process id the code is being ran from which is correct.
CoInvoke(pIDebugClient,IDebugClient.AttachProcess,0,0,eax,DEBUG_ATTACH_NONINVASIVE | DEBUG_ATTACH_NONINVASIVE_NO_SUSPEND) //Attach to process - all succeeds till here
test Eax, Eax
jnz >>.ATTACHFAIL
CoInvoke(pIDebugControl,IDebugControl.WaitForEvent,DEBUG_WAIT_DEFAULT,INFINITE) //Wait for a event
:
invoke MultiByteToWideChar,CP_ACP,NULL,[Instruction],-1,offset wfmt,MAX_PATH //I decided to code the MultiByteToWideChar API in the function to create an ANSI string. return value is 18 so it wrote 18 bytes in the buffer.
CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],0 ,offset EndOffset). // After this function it returns the value 80004005.
test Eax, Eax
jnz >>.DISASMFAIL
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
.DBGCFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"DebugCreate failed",0
ret
.QIFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"QueryInterface failed",0
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
.ATTACHFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"Attach to process failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
.DISASMFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,0,[pBuffer],"Disassemble failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
ret
EndF
HRESULT
IDebugControl::Assemble(
IN ULONG64 Offset,
IN PCSTR Instr,
OUT PULONG64 EndOffset
);
CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],0 ,offset EndOffset). // After this function it returns the value 80004005.
Only Offset is passed as a 64 bit value; according to the interface definition you posted, Instr is a 32 bit pointer and EndOffset is a 32 bit pointer to a 64 bit value. You have 64 bits for both Offset and Instr, so the function sees Instr normally but takes the 0 that follows it as EndOffset, which is not a usable pointer. You also have a period at the end of the line, but that is probably ignored. Try this instead:
CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],offset EndOffset)
Edgar
Edgar,
First of all thanks for all your help.
I tried what you suggested. The Assemble function still returns: 80004005 which is the E_FAIL return code.
Could it be the created string that is causing the error?
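For what it is worth, 80004005 says very little about which argument was rejected. The sketch below splits an HRESULT into its standard parts; HresultParts and split_hresult are illustrative names, not part of any Windows header.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative breakdown of a 32 bit HRESULT value. */
typedef struct {
    unsigned failed;     /* bit 31: 1 means failure */
    unsigned facility;   /* bits 16..26: which subsystem reported it */
    unsigned code;       /* bits 0..15: the status code itself */
} HresultParts;

HresultParts split_hresult(uint32_t hr)
{
    HresultParts p;
    p.failed = (hr >> 31) & 1u;
    p.facility = (hr >> 16) & 0x7FFu;
    p.code = hr & 0xFFFFu;
    return p;
}
```

For 80004005h this yields failed = 1, facility = 0 and code = 4005h, i.e. the generic E_FAIL, so the cause has to be narrowed down by varying the inputs rather than by reading the return value.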
Quote from: FlySky on December 11, 2011, 08:24:47 AM
Yes, it could be; MultiByteToWideChar will create a UNICODE string, not an ANSI string.
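The effect is easy to reproduce without any Windows API. This is a sketch in C that widens plain ASCII into 16 bit units (which is what a UTF-16 conversion does for ASCII input) and then reads the result the way a byte-oriented PCSTR consumer would. widen_ascii and length_seen_as_ansi are invented helper names, and the result described assumes a little-endian machine such as x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen an ASCII string into 16 bit code units.
   Returns the number of units written including the terminator,
   or 0 if the destination is too small. */
size_t widen_ascii(const char *src, uint16_t *dst, size_t dst_units)
{
    assert(src != NULL && dst != NULL);
    size_t n = strlen(src) + 1;
    if (n > dst_units) return 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = (unsigned char)src[i];
    }
    return n;
}

/* Length that a byte-oriented reader sees when handed the wide buffer. */
size_t length_seen_as_ansi(const uint16_t *wide)
{
    return strlen((const char *)wide);
}
```

On x86 the high byte of the first unit is zero and sits right after the 'm', so a function expecting an ANSI string would see just "m" instead of the whole instruction.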
It's indeed the string, donkey. In fact, it seems you have to give the Assemble function a NULL terminated string.
The string I created is filled using the lstrcat function and therefore doesn't have a NULL terminator?
I created the following string definition in the data section:
TestString DB 'mov esi, [esp+4b0]',0 //Test string
Now, running the Assemble function, it allocates the buffer at 51D8000
and then fills it with the opcode bytes. Looking it up in the debugger:
051d8000 8bb424b0040000 mov esi,dword ptr [esp+4B0h] //perfectly translated.
So my problem is not having a NULL terminated string.
Is it possible to add the NULL terminator after a string is created with lstrcat for example?
Glad you got it working FlySky. Yes, PCSTR is always a pointer to a NULL terminated string (Pointer to Constant STRing). The lstrcat function will always NULL terminate the resulting string unless there is an error, so it's not likely that the API was the issue. You should look through your code to find the actual cause, however: since you might want to change the function later, it is important to know what was changed to get it working, to avoid reintroducing the same bug in future versions.
Edgar
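The same behaviour can be seen with the C runtime's strcat, which is used here as a stand-in for lstrcat (an assumption made for the sake of a portable sketch). build_instruction is a made-up helper that appends the pieces one by one, much like the chain of lstrcat calls earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append each piece to dst, which has room for dst_size bytes.
   Returns 0 on success, -1 if the pieces do not fit. */
int build_instruction(char *dst, size_t dst_size,
                      const char *const *pieces, size_t count)
{
    assert(dst != NULL && dst_size > 0 && pieces != NULL);

    dst[0] = '\0';                       /* start from an empty string */
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(pieces[i]);
        if (used + n + 1 > dst_size) return -1;
        strcat(dst, pieces[i]);          /* rewrites the terminator on every call */
        used += n;
    }
    return 0;
}
```

Even when the buffer is pre-filled with junk, the byte right after the last appended character ends up as zero, so the terminator itself is not what goes missing. What does matter is that the destination starts out as an empty string before the first append.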
I am atm testing what's causing the error, it has to do with the string building. So I am testing a couple of instructions which I append using lstrcat.
I will post my findings and final solution when I figure it out.
Close to finishing this part. Thanks for all the input.
I've fixed the error.
It indeed has to do with the way the string is being generated.
I used the htodw function to convert a hexadecimal string into its dword form,
so ESP + 4B0 was translated as ESP + 000004B0.
And this is exactly what the Assemble method from the IDebugClient code does not like:
it doesn't accept all the leading zeros before 4B0.
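Trimming the padding is a small string operation. Below is a sketch in C with two made-up helpers: skip_leading_zeros trims an existing hex string, and format_mov_esi_esp avoids the padding in the first place by formatting the displacement at its natural width. Neither is the routine used in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns a pointer into hex that skips leading '0' characters,
   always leaving at least one digit in place. */
const char *skip_leading_zeros(const char *hex)
{
    assert(hex != NULL);
    while (hex[0] == '0' && hex[1] != '\0') {
        hex++;
    }
    return hex;
}

/* Writes "mov esi, [esp+<disp>]" with the displacement in minimal-width
   uppercase hex. Returns the value snprintf returns. */
int format_mov_esi_esp(char *dst, size_t dst_size, unsigned long disp)
{
    assert(dst != NULL && dst_size > 0);
    return snprintf(dst, dst_size, "mov esi, [esp+%lX]", disp);
}
```

With these, "000004B0" becomes "4B0", an all-zero string collapses to a single "0", and formatting the value 4B0h directly produces the same text as the working test string.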
So I had to cut the zeros off, and I am using the following function to assemble strings:
invoke Assemble, offset AssemblyResult, offset FullRebuildPattern1 // Call assemble function.
AssemblyResult is a pointer to a buffer where the opcodes are stored.
FullRebuildPattern1 is a pointer to a buffer which holds the string to assemble. In my case it holds: mov esi, [esp+4B0]
Assemble FRAME pTarget, pInstr
uses Ebx
// LOCAL status:%UINT_PTR
LOCAL EndOffset:Q
LOCAL pBuffer:%UINT_PTR
// LOCAL DisassemblySize:%UINT_PTR
LOCAL wfmt[MAX_PATH] :W
invoke CoTaskMemAlloc,1024
mov [pBuffer], Rax
invoke DebugCreate,offset IID_IDebugClient, offset pIDebugClient
test Eax, Eax
jnz >>.DBGCFAIL
CoInvoke(pIDebugClient,IDebugClient.QueryInterface,offset IID_IDebugControl,offset pIDebugControl)
test Eax, Eax
jnz >>.QIFAIL
invoke GetCurrentProcessId
CoInvoke(pIDebugClient,IDebugClient.AttachProcess,0,0,eax,DEBUG_ATTACH_NONINVASIVE | DEBUG_ATTACH_NONINVASIVE_NO_SUSPEND)
test Eax, Eax
jnz >>.ATTACHFAIL
CoInvoke(pIDebugControl,IDebugControl.WaitForEvent,DEBUG_WAIT_DEFAULT,INFINITE)
:
// invoke MultiByteToWideChar,CP_ACP,NULL,[pInstr],-1,[pBuffer],MAX_PATH //convert to ansi string
// invoke WideCharToMultiByte,CP_ACP, NULL, [pInstr],
mov Ebx, [pInstr] //holds instruction to convert
mov Edx, [pTarget] //holds place to store the recreated instruction
CoInvoke(pIDebugControl,IDebugControl.Assemble,Edx, 0, Ebx, offset EndOffset) //Offset is 64 bit (low dword, high dword); Instr and EndOffset are 32 bit pointers
test Eax, Eax
jnz >>.DISASMFAIL
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
.DBGCFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"DebugCreate failed",0
invoke CoTaskMemFree,[pBuffer] //free the message buffer on the error path too
ret
.QIFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"QueryInterface failed",0
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
.ATTACHFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,NULL,[pBuffer],"Attach to process failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
.DISASMFAIL
invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
invoke MessageBox,0,[pBuffer],"Assemble failed",0
CoInvoke(pIDebugControl,IDebugControl.Release)
CoInvoke(pIDebugClient,IDebugClient.Release)
invoke CoTaskMemFree,[pBuffer]
ret
EndF
Thanks for all the input, everyone. I hope this solution is helpful to someone.
Hi FlySky,
Since you are not using pBuffer there is no need to allocate it with CoTaskMemAlloc or to free it with CoTaskMemFree. You should remove the calls.