http://msdn2.microsoft.com/en-us/library/hha180wt(VS.80).aspx
I saw this and for a moment got excited, thinking that doing a MOD would be that easy, but it doesn't appear to work. :\ I keep running into situations where I need MOD and DIV, and my own attempts kept crashing, so I wrote these simple macros to do it (both return the result in eax).
MODx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
mov ecx,num2
div ecx
mov eax,edx
endm
DIVx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
IF num2
mov ecx,num2
div ecx
ENDIF
endm
Uptime example. This isn't the best way to get your uptime, as GetTickCount has a duration restriction (its 32-bit millisecond count wraps after about 49.7 days), but it makes good use of my macros.
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\advapi32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32
includelib \masm32\lib\user32
includelib \masm32\lib\advapi32
includelib \masm32\lib\masm32
Uptime proto
CTEXT MACRO text:VARARG
local TxtName
.data
TxtName BYTE text,0
.code
EXITM <ADDR TxtName>
ENDM
MODx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
mov ecx,num2
div ecx
mov eax,edx
endm
DIVx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
mov ecx,num2
div ecx
endm
.code
start:
invoke Uptime
invoke ExitProcess,0
Uptime proc
LOCAL buffer[50]:BYTE
LOCAL total:DWORD
LOCAL days:DWORD
LOCAL hours:DWORD
LOCAL minutes:DWORD
;total = GetTickCount() / 1000
invoke GetTickCount
DIVx eax,1000
mov total,eax
;days = total / 86400;
DIVx total,86400
mov days,eax
;hours = (total % 86400) / 3600;
MODx total,86400
DIVx eax,3600
mov hours,eax
;minutes = ((total % 86400) % 3600) / 60
MODx total,86400
MODx eax,3600
DIVx eax,60
mov minutes,eax
comment ~
DWORD total = GetTickCount() / 1000 - startup;
DWORD days = total / 86400;
DWORD hours = (total % 86400) / 3600;
DWORD minutes = ((total % 86400) % 3600) / 60;
~
@@:
invoke wsprintf,addr buffer, CTEXT("%dd %dh %dm"), days, hours, minutes
invoke MessageBox,0,addr buffer,addr buffer,MB_OK
ret
Uptime endp
end start
It might be a good idea to check for ecx = 0 before the div, to avoid a divide-by-zero exception.
Also, if they were macro functions, returning EXITM <edx> for MODx and EXITM <eax> for DIVx then I think they would be more flexible, if not more efficient.
I tried to make a version of the MODx macro that would do the MOD with an AND for immediate values that are a power of 2, but the code turned out to be more difficult than I expected (and too difficult for this late at night).
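For what it's worth, here is a minimal sketch of the macro-function form suggested above. MODf and DIVf are made-up names (not part of masm32), and the operand order is my own choice: loading num1 before clearing EDX means a previous remainder left in EDX can be passed as num1. Both still clobber EAX, ECX and EDX, and num2 must not be EAX. The variables total and hours are assumed to be DWORDs, as in the uptime example.

```asm
; Sketch only: MODf/DIVf are hypothetical names, not part of masm32.
MODf MACRO num1:REQ, num2:REQ
    mov eax, num1       ; load the dividend first, so num1 may be EDX
    mov ecx, num2       ; num2 must not be EAX (already overwritten)
    xor edx, edx        ; unsigned 32-bit dividend: high dword is zero
    div ecx             ; EAX = quotient, EDX = remainder
    EXITM <edx>
ENDM

DIVf MACRO num1:REQ, num2:REQ
    mov eax, num1
    mov ecx, num2
    xor edx, edx
    div ecx
    EXITM <eax>
ENDM

; usage: the macro body is emitted before the line that calls it
    mov eax, MODf(total, 86400)     ; seconds into the current day
    mov hours, DIVf(eax, 3600)      ; hours = (total % 86400) / 3600
```

There is still no check for a zero divisor here, so that advice applies to this form as well.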
Thanks for the advice guys, feel free to add to or change them to fit your needs (and share if you like, so others can benefit :bg)
Here's the new DIVx that doesn't crash if the number you're dividing by equals 0
DIVx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
IF num2 ; evaluated at assembly time, so num2 must be a constant expression here
mov ecx,num2
div ecx
ENDIF
endm
Also, MichaelW, using AND instead for immediate values that are a power of 2 sounds interesting. I'm assuming that's for optimization?
Here's my go at what you're saying:
MODx2 MACRO num1:REQ,num2:REQ
mov eax,num1
mov ecx,num1
and ecx,-num1
.if ecx==eax ;Is Power Of Two
;mov eax,num1
and eax,num2
.else
xor edx,edx
mov eax,num1
mov ecx,num2
div ecx
mov eax,edx
.endif
endm
Quote: I'm assuming that's for optimization
Yes, AND being much faster than DIV.
http://www.masm32.com/board/index.php?topic=1102.msg8055#msg8055
Just to jab my elbow in :bg
If num2 is a power of 2 in DIVx, then you can use SHR (by its power of 2, not the number itself :wink)
Annd.. if num2 is a constant value (as opposed to register/memory) then you may as well use it directly.
Isn't writing macros fun! :bdg
I know people like their nice tidy .IF .ELSE structures, but it obfuscates the actual ops being used and which registers/flags may be trashed by them. For instance, in your MODx2 sample you reload EAX with an identical value three times unnecessarily, E^cube. This could be a memory load... :eek
Wouldn't this be better? Plus on exit you always have a "was num2 zero" flag set so the macro reports argument errors for you as well - the second test in this case is "free" as it occurs in parallel with the slow DIV.
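DIVx MACRO num1:REQ,num2:REQ ; possibly add num1_hi for 64-bit values
LOCAL div_exit, div_overflow ; unique labels for each expansion, so the macro can be used more than once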
mov ecx, num2
xor edx, edx ; or MOV EDX, num1_hi for 64-bit values
mov eax, num1
test ecx, ecx ; set zero flag if would be divide by zero
jz div_exit
; can remove next two ops if not using 64-bit version
cmp edx, ecx ; test for overflow before DIV
jae div_overflow
div ecx
test ecx, ecx ; unset ZF, DIV leaves it undefined
; can remove next two ops if not using 64-bit version
jmp div_exit
div_overflow:
sub eax, eax ; set ZF to show no result
div_exit:
endm ; quotient in EAX (remainder in EDX), ZF is set to show error to the calling code
; if ECX is zero this would have been a divide by zero, else ECX is nonzero showing 64-bit overflow
Apologies if I've done something you can't in macros, I never use them. ::)
EDIT: added possible updates for more general purpose 64-bit macro. I am not familiar enough with macros to know whether you can set a default value (ie. 0) for an argument if it's not explicitly provided. Perhaps someone can provide the necessary extra argument definition.
E^Cube-
Thank you, I had never seen that method of determining if a number was a power of 2 before!
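For anyone else who hadn't seen it, here is a small worked example of the trick. x AND -x isolates the lowest set bit of x, so the result equals x only when x has a single bit set. The values are just illustrations I picked, and note that 0 also passes the test, so it needs its own check.

```asm
; Worked example of the "x AND -x" power-of-two test (illustrative values).
    mov eax, 64         ; 64 = 01000000b
    mov ecx, eax
    neg ecx             ; ECX = -64 = ...11000000b
    and ecx, eax        ; lowest set bit of 64: ECX = 64
    cmp ecx, eax        ; equal, so 64 is a power of two

    mov eax, 96         ; 96 = 01100000b
    mov ecx, eax
    neg ecx             ; ECX = -96 = ...10100000b
    and ecx, eax        ; lowest set bit of 96: ECX = 32
    cmp ecx, eax        ; not equal, so 96 is not a power of two
```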
Quote from: E^cube on May 11, 2007, 06:02:07 PM
Also, MichaelW, using AND instead for immediate values that are a power of 2 sounds interesting. I'm assuming that's for optimization?
Here's my go at what you're saying:
...
.if ecx==eax ;Is Power Of Two
;mov eax,num1
and eax,num2
.else
...
Um, how does this do a MOD? :eek
If you mean num2 is a power of 2, then:
65 MOD 2 = 1 whereas 65 AND 2 = 0
If you mean num1 is a power of 2, then:
64 MOD 3 = 1 whereas 64 AND 3 = 0
This code doesn't work. What you need is the bitmask of all the powers of 2 below num2 if num2 is a power of 2. Fortunately, using your method of detecting the power of 2 you can get this easily using intermediate results. Note that you only need to load num1 and num2 once each, remember they may be memory pulls. :wink
MODx2 MACRO num1:REQ,num2:REQ ; possibly add num1_hi for 64-bit values
LOCAL power2, mod_exit, mod_exit2 ; unique labels for each expansion
mov ecx, num2
xor edx, edx
mov eax, ecx
sub edx, ecx ; sets ZF if divide by zero
jz mod_exit2
and eax, edx ; keep EDX
not edx ; and invert it for bitmask
cmp ecx, eax
mov eax, num1 ; doesn't change flags
je power2
xor edx, edx ; or MOV EDX, num1_hi for 64-bit values
div ecx
mov eax, edx ; DIV leaves the remainder in EDX, so copy it to the result register
jmp mod_exit
power2:
and eax, edx
mod_exit:
test ecx, ecx ; must unset ZF
mod_exit2:
endm ; result in EAX, ZF set if ECX was zero (no result)
Quote from: Tedd on May 12, 2007, 11:50:39 AM
If num2 is a power of 2 in DIVx, then you can use SHR (by its power of 2, not the number itself :wink)
You mean, something like:
DIVx MACRO num1:REQ,num2:REQ
LOCAL power2div, div_exit, div_exit2 ; unique labels for each expansion
mov ecx, num2
xor edx, edx
mov eax, ecx
sub edx, ecx ; sets ZF if divide by zero
jz div_exit2
and eax, edx
xor edx, edx
cmp ecx, eax
mov eax, num1 ; doesn't change flags
je power2div
div ecx
jmp div_exit
power2div:
bsf ecx, ecx ; get power of 2 in CL
shr eax, cl ; quotient in EAX, the same register DIV uses
div_exit:
test ecx, ecx ; must unset ZF
div_exit2:
endm ; result in EAX, ZF is set to show divide by zero error to the calling code
Is it worth the extra overhead, though? Plus this isn't easily adaptable to 64-bit.
Yeah, you're right - it's probably too much overhead (unless there's a really nice way to find the correct power of two.)
But in the constant case then it's okay since the determination can be done at compile time rather than runtime.
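A rough sketch of that constant case, in case it helps. MODc is a made-up name, and I'm relying on the OPATTR operator setting bit 2 (value 4) for immediate expressions, which is how I understand MASM 6 and later to document it. For a constant divisor the IF tests run at assembly time, so the generated code contains either the AND or the DIV, never both. It clobbers EAX, ECX and EDX, and num2 must not be EAX.

```asm
; Sketch only: MODc is a hypothetical name, result is returned in EAX.
MODc MACRO num1:REQ, num2:REQ
    mov eax, num1
    IF (OPATTR (num2)) AND 4                ;; bit 2 set: num2 is an immediate
        IF (num2) EQ 0
            .ERR <MODc: constant divisor is zero>
        ELSEIF ((num2) AND ((num2) - 1)) EQ 0
            and eax, (num2) - 1             ;; power of two: keep the low bits
        ELSE
            mov ecx, num2
            xor edx, edx
            div ecx
            mov eax, edx                    ;; remainder
        ENDIF
    ELSE                                    ;; register or memory divisor
        mov ecx, num2
        xor edx, edx
        div ecx
        mov eax, edx                        ;; remainder
    ENDIF
ENDM

; usage
    MODc total, 64      ; assembles to: mov eax, total / and eax, 63
    MODc total, 60      ; assembles to the DIV sequence
```

A register or memory divisor still gets no zero check in this sketch, so the runtime test from the earlier versions would have to be added for that branch.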
Actually, I was a little surprised by BSF. According to the Opcodes help file that comes with MASM, for 486 BSF is VERY slow, at 6-42 clocks depending on the number of bits it has to search before first match. But according to Agner's Pentium optimisation guide, it is really quite fast, with only a 4-clock latency compared to the 50 or so for a DIV. So for newer processors, including the jump and 6 clocks for the SHR reg CL the special case optimisation isn't that bad, about five times as fast.
But how often are you likely to get a power of 2 to DIV by that you can't identify in advance and write specific code for instead of always using a less efficient do-it-all macro? Hence my extreme antipathy towards them generally.
Wow nice Ian, thanks for the posts. :U
Glad to be of help somewhere... :8)
But it was your neat trick of finding the power of two, as JimG noted, that made it possible. That was a real eyeopener when I checked it out in a calculator. It just needed a little followthrough and careful register handling to use the result effectively.
E^cube,
To do n mod m with AND you must AND n with m-1. This information was on the page that I linked, but I should have made it obvious by pointing out the error in your code.
You could also add handling for the special (trivial) case of DIV by 1:
DIVx MACRO num1:REQ,num2:REQ
LOCAL div_by_1, power2div, div_exit, div_exit2 ; unique labels for each expansion
mov ecx, num2
xor edx, edx
mov eax, ecx
sub edx, ecx ; sets ZF if divide by zero
jz div_exit2
cmp ecx, 1
je div_by_1
and eax, edx
xor edx, edx
cmp ecx, eax
mov eax, num1 ; doesn't change flags
je power2div
div ecx
jmp div_exit
div_by_1:
mov eax, num1 ; result = input
xor edx, edx ; remainder is 0
jmp div_exit
power2div:
bsf ecx, ecx ; get power of 2 in CL
shr eax, cl ; quotient in EAX, the same register DIV uses
div_exit:
test ecx, ecx ; must unset ZF
div_exit2:
endm ; result in EAX, ZF is set to show divide by zero error to the calling code
This macro divides a qword by a dword and returns the result in a dword (eax). I wrote it when I was playing with GetDiskFreeSpaceEx.
DivQbyD MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,dword ptr num1+0
IF num2
mov ecx,num2
div ecx
ENDIF
endm
example
include \masm32\include\masm32rt.inc
CTEXT MACRO text:VARARG
local TxtName
.data
TxtName BYTE text,0
.code
EXITM <ADDR TxtName>
ENDM
DIVx MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,num1
IF num2
mov ecx,num2
div ecx
ENDIF
endm
DivQbD MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,dword ptr num1+0
IF num2
mov ecx,num2
div ecx
ENDIF
endm
.data
FreeBytes dq 0
TotalBytes dq 0
TotalFreeBytes dq 0
kilobytes dd 0
megabytes dd 0
.data?
buff db 1024 dup(?)
.code
start:
invoke GetDiskFreeSpaceEx,CTEXT("c:\"),addr FreeBytes,addr TotalBytes,addr TotalFreeBytes
;FreeBytes is a qword
DivQbD FreeBytes,1024
;results in a dword
mov kilobytes,eax
;lets divide once more to get megabytes
DIVx eax,1024
mov megabytes,eax
invoke wsprintf, addr buff,CTEXT("free disk space: megabytes:%i ,kilobytes:%i "),megabytes,kilobytes
invoke MessageBox,0,addr buff,NULL,MB_ICONINFORMATION
invoke ExitProcess,0
end start
Also, thanks for you guys' optimized versions of these macros. I really hope hutch puts these, or macros similar to these, in masm32 to make life easier for those just trying to do some simple math.
Quote from: E^cube on May 31, 2007, 10:15:55 PM
This macro divides a qword by a dword and returns the result in a dword (eax). I wrote it when I was playing with GetDiskFreeSpaceEx.
DivQbyD MACRO num1:REQ,num2:REQ
xor edx,edx
mov eax,dword ptr num1+0
IF num2
mov ecx,num2
div ecx
ENDIF
endm
I must be missing something.. but how is it a qword when you set edx (the high dword component) to zero?
(Technically any such division is "qword DIV dword", but if edx is zero then it's no different to "dword DIV dword")
I was looking at the DivQbyD macro for use with the GetDiskFreeSpaceEx function, after having tried a number of other code examples. Unfortunately this one doesn't seem to work either.
Does anyone have a simple function to divide a 64 bit number (QWORD) by a dword? I want to be able to break the 64bit number down to a value that can be utilized with a progress bar - PBM_SETRANGE32 and had hoped to divide the qword down to a dword value at most for this reason.
div64 proc dividend_hi:DWORD,dividend_lo:DWORD,divisor:DWORD
;;divides 64-bit dividend by 32-bit divisor
;(dividend_hi is the top 32-bits, dividend_lo the lower 32-bits)
mov eax,dividend_lo
mov edx,dividend_hi
mov ecx,divisor
div ecx
;returns with eax as the quotient
; and edx as the remainder
ret
div64 endp
But you may as well just do the division yourself (without calling a separate function) - the DIV instruction is intended to be used for dividing a 64-bit value.
Use IDIV if you require negative values.
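One caveat with a single DIV: if dividend_hi is greater than or equal to the divisor, the quotient no longer fits in EAX and the CPU raises a divide error. A minimal sketch that avoids this by dividing in two steps is below. div64x is a made-up name, the divisor is assumed to be nonzero, and TotalBytes in the usage comment is the qword from the earlier disk space example.

```asm
; Sketch only: div64x is a hypothetical name.
; Divides an unsigned 64-bit value by a nonzero 32-bit divisor.
; Returns EDX:EAX = 64-bit quotient, ECX = remainder.
div64x proc uses ebx dividend_hi:DWORD, dividend_lo:DWORD, divisor:DWORD
    mov ecx, divisor
    xor edx, edx
    mov eax, dividend_hi
    div ecx             ; EAX = high dword of quotient, EDX = hi mod divisor
    mov ebx, eax        ; save the high dword of the quotient
    mov eax, dividend_lo
    div ecx             ; EDX < ECX here, so this DIV cannot overflow
    mov ecx, edx        ; remainder
    mov edx, ebx        ; EDX:EAX = full quotient
    ret
div64x endp

; usage: bytes to megabytes
;   invoke div64x, dword ptr TotalBytes+4, dword ptr TotalBytes, 1048576
;   EDX is 0 whenever the result fits in a dword
```

For something like PBM_SETRANGE32 the caller can check EDX and pick a larger divisor if it is nonzero.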
Thanks Tedd,
I must have spent a whole day trying everything I could think of, using loads of code snippets in an attempt to resolve my impasse. The solution appears so easy. I suppose it always does, with hindsight :bg
I appreciate your help. It saved me from another day of frustration, and I've been able to use that function to show a progress bar for the amount of space being processed versus the total amount obtained from the GetDiskFreeSpaceEx function. :U
Cheers
Quote from: Ian_B on May 12, 2007, 08:27:44 PM
Actually, I was a little surprised by BSF. According to the Opcodes help file that comes with MASM, for 486 BSF is VERY slow, at 6-42 clocks depending on the number of bits it has to search before first match. But according to Agner's Pentium optimisation guide, it is really quite fast, with only a 4-clock latency compared to the 50 or so for a DIV. So for newer processors, including the jump and 6 clocks for the SHR reg CL the special case optimisation isn't that bad, about five times as fast.
But how often are you likely to get a power of 2 to DIV by that you can't identify in advance and write specific code for instead of always using a less efficient do-it-all macro? Hence my extreme antipathy towards them generally.
Bitscan Forward and Reverse (BSF/BSR) latency:
Pentium 4: 4 cycles
AMD64: 8..10 cycles
Core2: 2 cycles
..so even on the Core2 it is wise to have other work available (more than 1 dependency chain)
If I was to create some sort of "smart division" macro the only real division I would do is 2^32 / divisor (the 32-bit fixed point reciprocal of the divisor)
As long as the divisor is the same as last time, you can use a much cheaper fixed point multiplication strategy at the cost of a compare and conditional (and its potential branch misprediction)
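A rough sketch of how that could look, under my own assumptions: FastDiv, last_divisor and last_recip are made-up names, and the code is meant to sit in a .386 flat-model source like the examples earlier in the thread. Because floor(2^32 / divisor) is truncated, the multiply can come out one below the true quotient, so the sketch adds a single correction step. Divisors of 0 and 1 fall back to a plain DIV, since 2^32 / 1 does not fit in 32 bits.

```asm
; Sketch only: FastDiv, last_divisor and last_recip are hypothetical names.
.data
last_divisor dd 0               ; 0 never matches, the fast path needs divisor >= 2
last_recip   dd 0               ; floor(2^32 / last_divisor)

.code
; Returns EAX = dividend / divisor, EDX = dividend mod divisor (unsigned).
FastDiv proc uses ebx dividend:DWORD, divisor:DWORD
    mov ecx, divisor
    cmp ecx, 1
    jbe use_div                 ; divisor 0 or 1: no usable reciprocal
    cmp ecx, last_divisor
    je have_recip
    mov edx, 1
    xor eax, eax                ; EDX:EAX = 2^32
    div ecx                     ; the one real division: EAX = floor(2^32 / divisor)
    mov last_divisor, ecx
    mov last_recip, eax
have_recip:
    mov eax, dividend
    mul last_recip              ; EDX = estimate: the true quotient or one less
    mov ebx, edx
    mov eax, edx
    mul ecx                     ; EAX = estimate * divisor (fits in 32 bits)
    mov edx, dividend
    sub edx, eax                ; EDX = remainder for the estimate
    cmp edx, ecx
    jb done
    inc ebx                     ; estimate was one too small: correct it
    sub edx, ecx
done:
    mov eax, ebx
    ret
use_div:
    mov eax, dividend
    xor edx, edx
    div ecx                     ; divisor 1 works; divisor 0 faults like a plain DIV
    ret
FastDiv endp
```

Whether two MULs plus the compare actually beat one DIV depends on the processor and on how often the divisor changes, so it would need timing before relying on it.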