Intro and coding problem/question

Started by robione, December 13, 2009, 08:18:32 PM

FORTRANS

Quote from: robione on December 19, 2009, 07:39:33 AM
A note on color spaces. I think the most intuitive to use would be HSB

   HSV?

Quote
but the goal of this little project is realtime edge detection, segmentation,
feature extraction, etc. In the end I'd rather not convert anything if I can help it.

   Reviewing Pratt, and looking at all the example pictures, straight
RGB is probably usable without conversion to another color space
for most images.  The idea behind converting is to emphasize some
characteristic of the image that is of interest.

Quote
I'm not sure if Sobel or Laplace is the right algorithm to use either.

   Again, looking at the example pictures and scanning the text,
Sobel should be good as a baseline edge detector.  If you find
that it fails to perform adequately on your images, then try a
different algorithm.  It seems to be the best compromise between
simplicity and effectiveness.

   In my own code, Sobel works well except on noisy images.
(For my excursions, that meant "do not apply a high pass filter".)
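A minimal C sketch of a baseline Sobel pass on a single 8-bit channel may help make the discussion concrete. It is not the code from this thread: the function name `sobel_gray`, the row-major layout, and the |Gx| + |Gy| magnitude approximation clamped to 255 are all assumptions made for illustration. Border pixels are set to 0 because the 3x3 mask does not fit there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sobel edge magnitude on an 8-bit single-channel image (row-major).
   Hypothetical helper: uses the |Gx| + |Gy| approximation, clamped to 255.
   Border pixels are written as 0 because the 3x3 mask does not fit there. */
void sobel_gray(const uint8_t *src, uint8_t *dst, int width, int height)
{
    assert(width >= 3 && height >= 3);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (y == 0 || y == height - 1 || x == 0 || x == width - 1) {
                dst[y * width + x] = 0;
                continue;
            }
            const uint8_t *p = src + y * width + x;
            /* Horizontal gradient: right column minus left column. */
            int gx = -p[-width - 1] + p[-width + 1]
                     - 2 * p[-1]    + 2 * p[1]
                     - p[width - 1] + p[width + 1];
            /* Vertical gradient: bottom row minus top row. */
            int gy = -p[-width - 1] - 2 * p[-width] - p[-width + 1]
                     + p[width - 1] + 2 * p[width]  + p[width + 1];
            int sum = abs(gx) + abs(gy);
            dst[y * width + x] = (uint8_t)(sum > 255 ? 255 : sum);
        }
    }
}
```

For a color image, one option is to run this once per RGB channel, which matches the "no conversion" approach discussed above.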

Quote
I'm just trying a bunch of stuff out. I am kinda curious about seeing how the YIQ
comes out. If memory serves me it's a bit easier on the CPU than the RGB->HSB formula
Wikipedia has.

   Yes, it is simpler to calculate.  And the gray scale from individual
RGB components is a good first test.  The RGB channels are highly
correlated in most images.
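To illustrate why YIQ is cheap to compute, here is a small C sketch of the conversion. It is only a linear combination per output component, with no branches or divisions. The function name `rgb_to_yiq`, the packed 3-bytes-per-pixel layout, and the rounded NTSC coefficients are assumptions for illustration, not code from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n packed RGB pixels (3 bytes each) to YIQ triples (3 floats each).
   Hypothetical helper using the commonly quoted, rounded NTSC coefficients. */
void rgb_to_yiq(const uint8_t *rgb, float *yiq, size_t n)
{
    assert(rgb != NULL && yiq != NULL);
    for (size_t k = 0; k < n; k++) {
        float r = rgb[3 * k];
        float g = rgb[3 * k + 1];
        float b = rgb[3 * k + 2];
        yiq[3 * k]     = 0.299f * r + 0.587f * g + 0.114f * b; /* Y: luma (gray scale) */
        yiq[3 * k + 1] = 0.596f * r - 0.274f * g - 0.322f * b; /* I: orange-blue axis */
        yiq[3 * k + 2] = 0.211f * r - 0.523f * g + 0.312f * b; /* Q: purple-green axis */
    }
}
```

The Y component alone is the usual gray scale, so an edge detector can run on it without computing I and Q at all.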

Quote
After my final the 23rd I'll have a bunch more time to tinker with this stuff :)

   Well, if you come to some firm conclusions, let us know.

Regards,

Steve N.

drizz

Quote from: robione on December 18, 2009, 02:34:03 AM
I also don't really get what the purpose of rep stosd is? You're writing -1 in EAX to ES, ECX number of times? How are you later accessing the -1's in ES? Why are there -1's in ES?
It's this code that I moved outside the loop:

/* image boundaries */
if (Y == 0 || Y == originalImage.rows - 1)
    SUM = 0;
else if (X == 0 || X == originalImage.cols - 1)
    SUM = 0;


Since "dest" and "src" have the same size in my code, the edges are filled with 0xFF.

Quote
It is important to notice that pixels in the first and last rows, as well as the first and last columns cannot be manipulated by a 3x3 mask. This is because when placing the center of the mask over a pixel in the first row (for example), the mask will be outside the image boundaries.

Using memset (rep stosd/stosb) on the whole buffer is faster than having a loop that changes only the first/last rows and first/last columns.

This can of course be avoided by giving "dest" a size of "(width-2)*(height-2)", but to simplify things I didn't do it.
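In C terms, the prefill idea might look roughly like the sketch below. The function name `process_with_prefilled_border` is made up, and the interior "mask result" is just a copy of the source pixel standing in for the real 3x3 computation. The point is only that the whole buffer is filled once and the loop then never needs a boundary test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill all of dest with 0xFF in one memset, then overwrite only the interior.
   Hypothetical sketch: the interior assignment is a placeholder for the
   result of a real 3x3 mask such as Sobel. */
void process_with_prefilled_border(const uint8_t *src, uint8_t *dest,
                                   int width, int height)
{
    assert(width >= 3 && height >= 3);
    memset(dest, 0xFF, (size_t)width * (size_t)height);
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            /* No boundary check needed: x and y never touch the edges. */
            dest[y * width + x] = src[y * width + x];
        }
    }
}
```

After the call, every edge pixel of dest still holds 0xFF and every interior pixel holds whatever the loop wrote.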




Quote from: robione on December 18, 2009, 02:34:03 AM
Also I was curious why the following occurs:


mov esi,src
mov edi,dest
sub esi,edi
//loop
lea eax,[esi+edi]


Isn't eax essentially pointing to edi? Ok.. the debugger says it points back to the original src.... Ok I think I understand, sort of. It's done so you can skip the 'inc esi' at the end of the loop?

Yes, it's a trick I learned from a C++ compiler; there are many similar tricks that can be used to optimize loop code.

The one I did:

P3 = P1 - P2

mov ..,[P3+P2] ; address evaluates to P1
mov [P2],...
inc P2         ; incrementing P2 advances both effective addresses

Remember: it's all integer arithmetic before accessing memory.
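The same idea can be sketched in C using integer arithmetic on addresses. This is only an illustration: the function name `copy_with_single_index` is hypothetical, and converting between pointers and `uintptr_t` this way assumes a flat address space such as 32-bit x86. A C compiler would normally do this transformation itself, so it is not something to hand-write in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes while incrementing only one "register" (p2).
   Hypothetical sketch; assumes a flat address space where pointer <-> integer
   conversions round-trip, as on x86. */
void copy_with_single_index(const uint8_t *src, uint8_t *dest, size_t n)
{
    uintptr_t p2 = (uintptr_t)dest;
    uintptr_t p3 = (uintptr_t)src - p2;  /* P3 = P1 - P2 (wraps if src < dest) */
    uintptr_t end = p2 + n;

    /* P3 + P2 evaluates back to P1: unsigned wraparound cancels out. */
    assert(p3 + p2 == (uintptr_t)src);

    while (p2 != end) {
        *(uint8_t *)p2 = *(const uint8_t *)(p3 + p2); /* mov ..,[P3+P2] / mov [P2],.. */
        p2++;                                          /* advances both addresses */
    }
}
```

Only one increment is needed per iteration because the source address is always recomputed as the fixed difference plus the destination address.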



Quote from: robione on December 18, 2009, 02:34:03 AM
But this gets me curious about something else. Doesn't 'lea eax,[esi]' 'lea eax,[esi+edi]' 'lea eax,[esi+edi*3-3]' expand into multiple instructions within the architecture? Like 'loop' pretty much expands into  'dec ecx; cmp ecx,0; jnz label'. That would explain the behavior I had earlier encountered in my Sobel code... as the instruction count shrunk but the time was similar.
"lea" stands for "load effective address". This means that it only calculates the address, without accessing memory. You have to check the optimization manual(s) to see which instructions to avoid when writing optimized code.

Here are some links:
http://www.agner.org/optimize/#manuals
http://www.mark.masmcode.com/
http://graphics.stanford.edu/~seander/bithacks.html
http://www.iti.cs.tu-bs.de/soft/www.goof.com/pcg/doc/opt-pairing.html
http://www.hackersdelight.org/
Plus the Intel and AMD optimization manuals, which you surely have.




Quote from: robione on December 18, 2009, 02:34:03 AM
How long have you been doing this Drizz? My impression is this is like second nature to you.... It takes me forever to code something simple in assembly. Then to add tricks on top... like realizing to multiply the float conversions by 1024 then shift right by 10.... never happen to me.... not yet anyway :).
For me, writing optimized code is fun. You can learn all the tricks too if you want. Also don't credit me for the RGB2GRAY macro, I picked it up from some page, but writing code without floats was the first thing I thought of.
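The float-free trick mentioned here can be sketched in C as follows. This is not the RGB2GRAY macro from the thread, which is not shown. The function name `rgb_to_gray_fixed` and the weights 306/601/117 (the usual 0.299/0.587/0.114 luma weights scaled by 1024 and rounded) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Luma weights scaled by 1024 (2^10) and rounded so that they sum to 1024.
   Assumed values: 0.299, 0.587, 0.114 are the common gray scale weights. */
enum { W_R = 306, W_G = 601, W_B = 117, SHIFT = 10 };

/* Integer-only gray scale: multiply by the scaled weights, then shift right
   by 10 to divide by 1024. Hypothetical helper, no floating point involved. */
uint8_t rgb_to_gray_fixed(uint8_t r, uint8_t g, uint8_t b)
{
    /* The weights must sum to 1 << SHIFT so that white maps exactly to 255. */
    assert(W_R + W_G + W_B == (1 << SHIFT));
    uint32_t sum = (uint32_t)W_R * r + (uint32_t)W_G * g + (uint32_t)W_B * b;
    return (uint8_t)(sum >> SHIFT);
}
```

The largest possible sum is 1024 * 255, so the intermediate value fits easily in 32 bits and the result never exceeds 255. The shift truncates rather than rounds, so results can be one level lower than the floating-point formula.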

have fun  :wink
The truth cannot be learned ... it can only be recognized.

sprint

ahhh Dave...thats a sincere confession :U...now look at this attachment...u will find plenty of them in a country like india....what should i call it !!!   :bg

sorry for the attachment...couldn't find a way to embed the picture along with this post.....  :eek

i guess this is the best way of traveling .....  :lol

UtillMasm


dedndave

Quote
oh my fo, crazy indian!

ROFL !!!



Sorry Rob - we have completely hi-jacked your thread   

robione

Quote from: drizz on December 19, 2009, 06:25:05 PM
Here are some links:
http://www.agner.org/optimize/#manuals
http://www.mark.masmcode.com/
http://graphics.stanford.edu/~seander/bithacks.html
http://www.iti.cs.tu-bs.de/soft/www.goof.com/pcg/doc/opt-pairing.html
http://www.hackersdelight.org/
Plus the Intel and AMD optimization manuals, which you surely have.

Funny you should mention the opt. manuals.... I just downloaded them Friday. Interesting reads. Thx for the links. I'll definitely be checking them out (as I procrastinate for this physics final :) )

Quote from: dedndave on December 20, 2009, 02:50:59 AM
Sorry Rob - we have completely hi-jacked your thread   

For a sec I was trying to figure out who you were talking to :). I haven't given out my name on forums yet. This one seems to be more of a tight community than others I'm part of. My last name is Robidoux... that + Star Wars is where the Robione comes from :)

Quote from: jj2007 on December 19, 2009, 08:03:45 AM
Quote from: robione on December 19, 2009, 07:39:33 AM
I think the grayscale converter has all the speed squeezed out of it possible now.

movdqu xmm0,[esi] is relatively slow; on a P4, lddqu is faster...

I've learned to stop saying such things LOL. I end up biting my tongue. TYVM, I just need to upgrade all my compilers. As far as C++ goes I'm not sure if VC++ Express 2008 had anything lacking from VS6 Enterprise Ed. That's the only thing that's kept me from changing compilers on my netbook. Unfortunately, I'm limited to SSE2 instructions with my VS6 setup. I'll keep that in mind until then :)

I finished the SSE color Sobel but am having some problems. [Edit: just about everything from here down deleted.... which was a lot if you didn't see it before. I basically ran one loop too long, forgot to use width in one spot (was using width-8) and forgot to switch glDrawPixels from GL_LUMINANCE to GL_RGBA. Now I just need to figure out why everything is dull but otherwise all is good.]