OK, I have a question about the two attached files, Fire2a.asm and Fire2b.asm.
Fire2a.asm uses MMX instructions to average the surrounding pixels in a bitmap and compute the next screen to display. This is done in the BLUR_MMX2 procedure.
Now, the original code was posted here in another forum, and can also be found at:
http://www.ronybc.8k.com
I have changed the procedure in Fire2a.asm to be quicker (I think). Basically, I just figured out what the guy did and eliminated more than half of the code, which didn't need to be there to perform the same task.
Now, in Fire2b.asm, I've tried to use more of the MMX registers' capabilities, but the result seems fuzzier: white noise keeps appearing instead of the colors blending naturally. I am wondering WHY? (The Big Question)
In the first file, Fire2a.asm, I simply followed the original code and replaced all the ADDs and PSRLWs, etc. with PAVGW.
In the second file, Fire2b.asm, I tried to use all 64 bits of data from the source bitmap, average each byte with the surrounding pixels (also a very cool fire algorithm BTW...), and then store the result back in the destination bitmap.
My question is, why when I try to use the full potential of the MMX registers in this new "optimization" do I get noise all over the screen?
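For anyone following along without the attachments, here is a rough scalar model in C of what one lane of PAVGB does, next to a plain add-and-shift average. PAVGB and PAVGW round up, computing (a + b + 1) >> 1, so chaining them pairwise does not always give the same answer as summing four neighbours and shifting right by 2. The function names are made up for illustration and are not taken from Fire2a.asm or Fire2b.asm, and this is only one thing worth checking; it may or may not be what causes the noise.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one PAVGB lane: unsigned average, rounded up. */
uint8_t pavgb_lane(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + b + 1u) >> 1);
}

/* Four-neighbour average built from pairwise PAVGB-style steps
   (hypothetical ordering: left/right, then up/down, then combine). */
uint8_t avg4_pairwise(uint8_t l, uint8_t r, uint8_t u, uint8_t d) {
    return pavgb_lane(pavgb_lane(l, r), pavgb_lane(u, d));
}

/* Four-neighbour average done as add, add, add, shift (truncating). */
uint8_t avg4_add_shift(uint8_t l, uint8_t r, uint8_t u, uint8_t d) {
    return (uint8_t)(((unsigned)l + r + u + d) >> 2);
}

/* Worked example: a lone value of 1 surrounded by zeros. */
void rounding_example(void) {
    assert(avg4_pairwise(1, 0, 0, 0) == 1);  /* round-up keeps it lit */
    assert(avg4_add_shift(1, 0, 0, 0) == 0); /* truncation fades it out */
}
```

With round-up averaging, a single faint byte can survive a blur pass that the truncating version would clear, so any fade subtraction has to be strong enough to remove it.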
Hopefully, someone on here is smarter than I am (that's not difficult....) :lol
Let me know what you find out, I'm still working on this program, and an XMM version as well.
More later,
Jeff C
[attachment deleted by admin]
Depending on your chipset, you can cause 'snow' by updating the screen too fast. Waiting on vsync solves it.
Just a thought - may or may not be relevant.
I downloaded the code, found the Blur_MMX2 procedure, but did not understand it too well. I am good with MMX and XMM and making algos, but I am rubbish with bitmap manipulation. Please could you help me understand what goes on before the @@?
Sure, AeroASM. Sorry it took so long to reply here, but I've been busy at work lately. I am going to upload the entire code with full comments shortly too, but this should help.
... taken from BLUR_MMX2
pxor MM7,MM7 ; clear the MM7 register to ZERO
mov eax,fadelvl ; load EAX with fadelvl (multiplier for fade subtraction)
imul eax,00010001h ; copy fadelvl into both WORDs of EAX (assumes fadelvl fits in 16 bits)
mov [ebp-4],eax ; store this at [ebp-4]
mov [ebp-8],eax ; store it at [ebp-8] as well
movq MM6,[ebp-8] ; MM6 now holds fadelvl in each of its four WORDs
mov eax,maxx ; load maxx (set in the WM_SIZE msg, # of columns, x pixels)
lea eax,[eax+eax*2] ; eax = maxx*3 (3 bytes per pixel)
mov ebx,eax ; ebx = maxx*3
imul maxy ; edx:eax = eax * maxy (maxy = size of window, # of rows, y pixels)
push eax ; eax = maxx*3*maxy, the bitmap size in bytes at 24 bits per pixel
; the PUSH also leaves this value on the stack, so it's now at [esp]
lea edx,[ebx-3] ; edx = maxx*3 - 3
lea ebx,[ebx+3] ; ebx = maxx*3 + 3
neg edx ; edx = -maxx*3 + 3
xor eax,eax ; clear EAX to ZERO
lea esi,[esi-3] ; move ESI back 3 bytes (one 24-bit pixel)
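If the register juggling is hard to follow, the same arithmetic can be sketched in C. This is only a model of the setup code above, not part of the program: the struct and function names are hypothetical, and the sketch assumes fadelvl fits in 16 bits and that maxx*3*maxy fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values the setup code leaves behind. */
typedef struct {
    uint64_t fade_words;   /* MM6: fadelvl copied into four 16-bit lanes */
    int32_t  bitmap_bytes; /* value pushed on the stack: maxx*3*maxy */
    int32_t  edx;          /* -(maxx*3 - 3) */
    int32_t  ebx;          /* maxx*3 + 3 */
} BlurSetup;

BlurSetup blur_setup(uint32_t fadelvl, int32_t maxx, int32_t maxy) {
    BlurSetup s;
    /* Assumption: a larger value would spill into the neighbouring lane. */
    assert(fadelvl <= 0xFFFFu);

    uint32_t pair = fadelvl * 0x00010001u;        /* imul eax,00010001h */
    s.fade_words = ((uint64_t)pair << 32) | pair; /* two stores, then movq */

    int32_t row_bytes = maxx * 3;                 /* lea eax,[eax+eax*2] */
    s.bitmap_bytes = row_bytes * maxy;            /* imul maxy */
    s.edx = -(row_bytes - 3);                     /* lea edx,[ebx-3] / neg edx */
    s.ebx = row_bytes + 3;                        /* lea ebx,[ebx+3] */
    return s;
}
```

For example, blur_setup(2, 640, 480) gives fade_words = 0002000200020002h, a bitmap size of 921600 bytes, edx = -1917 and ebx = 1923. Note that the multiply by 00010001h fills 16-bit lanes only. A variant that works on packed bytes would presumably need the fade value in every byte (for example a multiply by 01010101h), though whether that applies to Fire2b.asm can't be checked without the attachment.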
Let me know if this helps.
Later,
Jeff Cummings
P.S. On a sad note, I think the author may have had a run-in with the tsunami, as the code comes from this web site, which is based in Kerala, India:
http://www.ronybc.8k.com
If I find this to be the case, I may include "dedicated to the memory of rony b chandran" in the code somewhere.
I've left a few guestbook messages on his web site and sent a few emails, but have gotten no response from him.
Later guys,
Jeff C