
Mouse gestures

Started by Shama, October 06, 2005, 09:03:39 PM


Shama

Hey! I'm a freshman in MASM programming, so please forgive me if I express myself incorrectly! I've started to write a program, something like Sensiva, but I'm not pleased with the way it works: the gesture recognition quality is pretty bad. I'm asking if someone can help me with the recognition algorithm. I have attached my project; please take a look. It's in C++, but there should be no problem porting it to MASM. I'd be happy for any help and/or comments on this topic!
Thanx.

I have deleted your attachment because there was no way of knowing what was in it. There was a pile of junk files, a DLL and an EXE. Remember, this is an assembler forum, so posting unknown binaries apparently written in C++ is too high a risk to allow.

PBrennick

Your zip is filled with garbage.  No project in there.

P
The GeneSys Project is available from:
The Repository or My crappy website

gabor

Hello Shama!

Mouse gestures are interesting and fun. I've seen some games that use this functionality.
Well, if you have something to start with, I could help you...

Greets, Gábor


Shama

Hey, Gábor!
My attachment was deleted; I don't know why. That couple of files was my project  :( If you'd like to take a look at my project I'll e-mail it to you, but my programming style is so sloppy I'm afraid you won't understand the code. But if you can help me, I'll try to make my code look neat!
I'll try to explain how I implemented the gestures:
1. I create a pattern by drawing on the canvas. I divide this image into square areas, like an array. If there's a line in an area I put a 1, otherwise a 0. So my array for the symbol 'V' will look approximately like this:
11000011
11000011
01100110
01100110
00111100
00011000
2. I've already written nice code for drawing on the screen and capturing that image.
3. The screen image passes through step 1 of this algorithm.
4. Two arrays are compared.
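The four steps could be sketched like this in C (a sketch only: the grid size, function name, and mismatch-count scoring are my own illustrative choices, not taken from the attached project):

```c
#define GRID_W 8
#define GRID_H 6

/* Count how many cells differ between the captured grid and a stored
   pattern; the gesture is taken to match the pattern with the fewest
   mismatches. */
int grid_mismatch(char captured[GRID_H][GRID_W],
                  char pattern[GRID_H][GRID_W])
{
    int diff = 0;
    for (int y = 0; y < GRID_H; y++)
        for (int x = 0; x < GRID_W; x++)
            if (captured[y][x] != pattern[y][x])
                diff++;
    return diff;
}
```

Running every stored pattern through `grid_mismatch` and taking the lowest score would be the comparison in step 4.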
If you can explain to me how PenAPI works, or suggest a better gesture algorithm, I'll be very glad  :U
Thanx

P1

Quote from: Shama on October 07, 2005, 04:44:15 PM
My attachment was deleted; I don't know why.
Please read the edit to your first post !!!  :naughty:

Please read the forum rules.  You will make a better first impression when you don't break the rules around here.

Welcome to MASMforum!   :U

Read around the different areas and get a feel for the place.

Read the help files and tutorials of MASM32.  And always try the 'Search' function first.  Then ask your questions.

Regards,  P1  :8)

Tedd

Comparing bitmaps doesn't work very well because it requires the gesture to be made exactly the same way, with little room for error.
eg. all of the following are V:

101   01100001   00001
010   00100001   00001
      00010010   01010
      00001100   00100


A better method is to use vectors, and also to split gestures into combinations of simple gestures.
So if you have 8 simple gestures (up, down, left, right, up-left, up-right, down-left, down-right), then all other gestures can be made from combinations of these, eg. V would be \ (down-right) followed by / (up-right).
Detecting the straight lines is much easier, and it means you don't need to worry about scale; you can also be flexible about the exact angle and other details which could otherwise mess it up.
No snowflake in an avalanche feels responsible.

gabor

Hi!

Once I found a small program that demonstrated neural networks and their learning capabilities. The subject was hand-written letter recognition. The drawings (the user wrote the letters by drawing with the mouse) had two parameters: one was the angle, and the other had something to do with curvature.
This was definitely a method that used vectors. Vectorizing the input makes the task more difficult and complex, but the system will be capable of recognizing far more figures more efficiently, with fewer mistakes.

In the case of mouse gestures I don't think complex figures need to be recognized, so your first solution doesn't look bad to me at all. It does need scaling, but vectorizing (calculating angles and line lengths) isn't a simple operation either.

I would do this:
1. capture the picture of the gesture
2. calculate the size of one block
3. create the matrix
4. compare it to the stored patterns

1. This you've got already. (With vectors there is no need to capture the pixels of the image. This means far less storage capacity is needed, which is an advantage of vectorizing.)
2. Let the captured image be 245*162 px and the dimensions of the figure to be recognized 10*8 blocks.
   The "mapping" then lays blocks of roughly 24*20 px onto the image. The goal is to map every block of the stored figure onto a
   rectangular area of the image.
3. Creating the 10*8 matrix of the image is not simple, because you have to find a good function that decides whether a block is filled or not. This step can introduce errors that lead to mistakes in recognition.
4. Compare the result matrix to the stored ones.
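One way steps 2 and 3 might look in C (a sketch: the "any drawn pixel fills the block" rule is the simplest possible decision function, and the image layout and names are my assumptions):

```c
#define COLS 10
#define ROWS 8

/* Map a captured image (w*h pixels, nonzero = drawn) onto a ROWS*COLS
   matrix: a cell becomes 1 if any pixel in its rectangle is set.
   Block edges are computed with integer division so the whole image is
   covered even when w and h do not divide evenly (e.g. 245*162 px). */
void image_to_matrix(const unsigned char *img, int w, int h,
                     unsigned char m[ROWS][COLS])
{
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            /* pixel rectangle covered by this block */
            int x0 = c * w / COLS, x1 = (c + 1) * w / COLS;
            int y0 = r * h / ROWS, y1 = (r + 1) * h / ROWS;
            unsigned char filled = 0;
            for (int y = y0; y < y1 && !filled; y++)
                for (int x = x0; x < x1; x++)
                    if (img[y * w + x]) { filled = 1; break; }
            m[r][c] = filled;
        }
    }
}
```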

At step 3 I would look for a solution that uses probability. With probability the matches (when more than one figure could match the image) can be ranked.
Example: a binary decision would return
    1, when the block contains a drawn pixel
    0, when the block contains only blank pixels (background color)
I feel that line length is essential, so maybe the scale factor between the captured image and the stored figure should influence the line length. This affects the decision.

Finally I must admit that for "real" recognition the vector-based solution must be chosen. I believe the scheme of the recognition is very similar:
1. capture the vectors of the gesture: angle and length.
2. normalize the vectors to match the scale of the stored vectors
3. the matrix is ready
4. compare it to the stored vector matrices

Capturing angle and length is the most important and most difficult part. The possibility of mistakes lies in this step.

This is what I could think of, or recommend. These are just ideas, some approaches to the problem that I would try first.
I am looking forward to reading other ideas, and to hearing about your progress.

Greets, Gábor

Shama

Hey guys, thank you for the great ideas. The reason I didn't think of vector recognition is probably that I didn't know how to implement it. I remember I once tried to save the captured image as a sequence of directions, like west, north, south, north-east, etc. But a problem occurred with the speed at which the mouse is moved, because I captured the mouse position with a timer. I could move the mouse quickly to the right and then to the left, and my algorithm thought I hadn't moved the mouse at all.
I don't agree with Tedd saying that

101          1000001
010          0100010
             0010100
             0001000

are equal, because when scaling the first bitmap we get

110011
110011
001100

etc., which has more '1' pixels than the pattern.
I like the ideas given and I'll try to start implementing them, but I've no idea how to capture a vector image instead of a bitmap!
Will you help me?
Thanks. 

Tedd

I didn't say they were equal, I said they are both a V.
The point is that they are NOT equal - and that is why trying to match bitmaps is not a good method.

As for capturing vector 'images' - you're thinking about it wrong. Forget images!
A vector is just a line which starts at one point, has a length, and an angle.

The starting point is very easy - WM_LBUTTONDOWN

The length is not so important in this case - gestures should be the same whether big or small; just consider shapes. Though you could choose to ignore gestures that are too small (a slip of the mouse).

Angle is the clever part. The first thing to do is simplify it, so you only consider 8 angles - more than this is hard to use anyway.
A line can be made up of smaller lines, and as long as the angles between them are not too different, you should consider them part of the same line, eg ----- is a straight line, and ___,---- is also a straight line.
Now you're thinking it's hard to calculate all the angles? You don't need to. There are only 8 angles to consider. So you can take each pair of captured mouse points (using WM_MOUSEMOVE, not a timer -- if the user moves the mouse too fast and messes it up, that is their problem) and put the 'angle' of that line into one of the 8 angles (octants) just by looking at how the x and y of the start and end points differ from each other (I'll let you work this out, it's really not so hard; just check which values are bigger than the others - you'll see).
Then all you need to do is look at each of the line sections: if two consecutive lines share the same octant, they are really part of the same line; if they are in different octants, they are different lines.
And from that, you have a list of simple line-gestures ('vectors') - and you match this against your set of recognized gestures (also stored as lists of simple line-gestures) and presto!
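The compare-don't-calculate trick can be sketched like this in C (my own octant numbering and a 2:1 cutoff between "mostly straight" and "diagonal" - both arbitrary illustrative choices):

```c
#include <stdlib.h>

/* Classify a mouse-move segment into one of 8 octants:
   0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE.
   Screen y grows downward, so "N" means dy < 0. Only sign and
   magnitude comparisons are needed, no trigonometry. */
int octant(int dx, int dy)
{
    int ax = abs(dx), ay = abs(dy);
    if (2 * ay < ax) return dx > 0 ? 0 : 4;   /* mostly horizontal */
    if (2 * ax < ay) return dy < 0 ? 2 : 6;   /* mostly vertical   */
    if (dx > 0)      return dy < 0 ? 1 : 7;   /* diagonals         */
    return dy < 0 ? 3 : 5;
}

/* Collapse per-segment octants into the gesture's stroke list:
   consecutive segments in the same octant are one line. */
int simplify(const int *oct, int n, int *out)
{
    int m = 0;
    for (int i = 0; i < n; i++)
        if (m == 0 || out[m - 1] != oct[i])
            out[m++] = oct[i];
    return m;
}
```

A V drawn with the mouse then reduces to the two-stroke list {SE, NE}, which is what gets matched against the stored gestures.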


C'mon, I've done almost everything except written the code for you here :P
No snowflake in an avalanche feels responsible.

gabor

Hello!


A really nice idea, Tedd! Only one point is not clear. You speak about vectors, and a vector as we know has either 2 endpoints (2 coordinate pairs) or a starting point, an angle and a length. When we try to vectorize something, the length is important too, because the length brings scale and ratio into the figure. Without length, how can we separate these images/gestures:
 _
|            _ _ _
|   and   |
|

If I am right, then calculating the angles is not enough; the length must be used as well. To do so, the vectors must be normalized first...

I have another question:
How should this WM_MOUSEMOVE message be used for getting the end point? I mean, whenever the user moves the mouse a message is sent to the window. When should the program take the coordinates of the end point? Surely it cannot take them after every single WM_MOUSEMOVE message.
I suggest some kind of polling is necessary. Again, if I am right, then using a timer is a possible solution...

This is turning out to be a very good task to develop. I am positive that implementing the whole thing is absolutely not difficult, when everything has been designed well beforehand. I mean, every bug, bump and difficulty can be foreseen and avoided during the design phase.
I still have some ideas, but I don't want to bore you.  :bg Of course, if you agree, I'll post them.


Greets, Gábor


Tedd

Quote from: gabor on October 11, 2005, 01:53:34 PM
When we try to vectorize something, the length is important too, because the length brings scale and ratio into the figure. Without length, how can we separate these images/gestures:
_
|            _ _ _
|   and   |
|

If I am right, then calculating the angles is not enough; the length must be used as well. To do so, the vectors must be normalized first...
True, you would not be able to distinguish between them. But IMHO you should not want to - they are the same gesture. Gestures are indications, not precise shapes. Therefore, you should ignore scale and exact angles.
If you want to distinguish, then finding the length of a line between two points really isn't difficult :wink

The problem is that if you make it too complex then it also becomes complex to use. Gestures are meant to be simple so they are simple to perform, and having to do a combination of long and short lines at definite angles to each other can be difficult. If you must include length, then I would say classifying lines as anything more than long or short is overly complex.
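A long/short split only needs a relative threshold. As a sketch (the choice of half the bounding-box diagonal is mine, and squared lengths avoid a square root):

```c
/* Classify a stroke as long (1) or short (0) relative to the whole
   gesture's bounding box, so the result is scale-independent.
   Threshold: stroke length >= half the bounding-box diagonal. */
int is_long(int dx, int dy, int bbox_w, int bbox_h)
{
    long seg2  = (long)dx * dx + (long)dy * dy;
    long diag2 = (long)bbox_w * bbox_w + (long)bbox_h * bbox_h;
    return 4 * seg2 >= diag2;   /* compares len^2 against (diag/2)^2 */
}
```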

As for knowing when the gesture has ended: a gesture begins when the mouse button goes down, so it ends when the button is released. There is no need for a timer :toothy
But yes, you do want to take every single mouse-move message - this is the information used to create the line segments. I don't think there would be so many that it would become a problem.
No snowflake in an avalanche feels responsible.

JHER VON ARBANEL

Shama, I saw your code when you posted it, but then somebody erased it. Could you send me your code and the libs you included, please? I think your work is excellent. Thanks  :bdg

Infro_X

<<- Infro's crazy ideas are about to be said, you were forewarned!

What I would do is the whole vector thing, as follows.

on WM_LBUTTONDOWN, start recording x/y into an array
on WM_MOUSEMOVE, record x/y into the array
on WM_LBUTTONUP, stop recording and calculate the gesture

gesture function-zorz:
find the highest and smallest x's and y's, creating a rectangle
divide the rectangle up into pieces (64 or more)
divide each piece into 4 quadrants
check each quadrant for a line; if it has one, check which direction it goes (bounded by the box)
check the piece for a line, and which direction it goes
1/2 result from the quadrants + 1/2 result from the piece = total result for that piece (rounded to the nearest 15, 30, or 45 degree boundary, depending on how you want to handle it)
take all the results from each piece, and try to form a general idea of how the pieces connect to each other.

example:

    -
-  |
 \/

equates into

    /
\  /
 \/


after that is done, try to match it to a pattern.
example:

if line a and line b connect in the bottom 2 rows (16 squares) and have a ~40 degree angle between them, it's a V
if they have a ~90 degree angle, it's an L
if there are 3 lines, then it's possibly an n or a w, etc. etc. etc.

this method is a lot more difficult if you have many patterns, because each pattern is "unique" (a half circle with the open end on the right is a C; an "a" is a line on the right connected to a circle/half circle with a possible opening at the top and/or bottom).


As for masked pattern matching, it can work, but in reality it isn't very versatile: you'd have to create hundreds of bitmaps, each of which would need to be scaled, stretched and/or skewed, and you wouldn't really be able to "reuse" the bitmaps the way you can with the "unique" pattern matching.

If this is a 1-man project, I'd go with the following idea:
masked pattern matching with versatile functions to scale/stretch/(skew) existing bitmaps into many different shapes and sizes
use WM_LBUTTONDOWN, WM_MOUSEMOVE, and WM_LBUTTONUP to record an array of movements
create a rectangle encompassing the "draw area"
find the centerpoint of the drawing (not of the rectangle - of the vectors)
LOOP1
stretch/scale/(skew) the patterns to 3/4 the size of the rectangle
pattern match, using the center of the drawing and the center of the rectangle for alignment
record all matches in a table
stretch/scale/(skew) the pattern up by some amount x, and redo the pattern matching from LOOP1 until you get to 20% bigger than the rectangle
take the mean value from the matches table and use it as the matched pattern.

If this is a <5-man project, I'd go either way;
for a >5-man project, go with the vectors.
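The recording and bounding-rectangle steps above might start like this in C (a sketch; `Point` and `Rect` are my own minimal stand-ins for the Win32 POINT/RECT structures):

```c
typedef struct { int x, y; } Point;
typedef struct { int left, top, right, bottom; } Rect;

/* "find highest and smallest x's and y's creating a rectangle":
   the bounding rectangle of the recorded mouse points (n >= 1). */
Rect bounding_rect(const Point *p, int n)
{
    Rect r = { p[0].x, p[0].y, p[0].x, p[0].y };
    for (int i = 1; i < n; i++) {
        if (p[i].x < r.left)   r.left   = p[i].x;
        if (p[i].x > r.right)  r.right  = p[i].x;
        if (p[i].y < r.top)    r.top    = p[i].y;
        if (p[i].y > r.bottom) r.bottom = p[i].y;
    }
    return r;
}
```

In the real program, the points would be appended to the array in the WM_MOUSEMOVE handler between WM_LBUTTONDOWN and WM_LBUTTONUP.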

Shama

Aight, fellas! I'm so glad that my project has interested you. Whoever wants to see the code I've already written, please post your e-mail address and I will send it to you. Special thanks to Gabor - man, I would be glad to hear any suggestions; post them without asking me. My brain is probably not powerful enough to analyse all the ideas given and come up with a new solution to my problem on its own!
Some of your ideas seem to require a lot of processor time, and some are not clear to me regarding how the gesture bitmap is created. Anyway, I think I'll try to do something.
What I have now is this:
1. I start capturing the image on WM_RBUTTONDOWN, through WM_MOUSEMOVE, until WM_RBUTTONUP (most of you agree that it's the best way).
2. The image is scaled (I think skewing would be too complicated in this case; that's more of an OCR solution).
3. The pattern bitmap's lines are thicker than the captured ones, so when they are overlaid the pattern should absorb the whole gesture image (you're thinking the 'O' pattern will absorb a 'C' gesture? - yep, and I need a good algorithm for handling this; mine is not so good).
The only problem is that if we have 100 patterns, the program has to run through all of them and pick the best one, which takes time, so the patterns are made tiny (which makes gestures fail sometimes).
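One way to keep 'O' from swallowing 'C' in step 3 is to score not only how much of the gesture the pattern covers, but also how much of the pattern the gesture actually uses. A sketch (the 0.5 penalty weight is an arbitrary choice of mine):

```c
/* Score a gesture bitmap g against a (thicker) pattern bitmap p, both
   n pixels, nonzero = set. Returns the fraction of gesture pixels the
   pattern covers, minus a penalty for pattern pixels the gesture never
   touches - so 'C' drawn inside the 'O' pattern is fully covered but
   still loses points for the untouched part of the ring. */
double overlay_score(const unsigned char *g, const unsigned char *p, int n)
{
    int drawn = 0, covered = 0, pat = 0, used = 0;
    for (int i = 0; i < n; i++) {
        if (g[i]) { drawn++; if (p[i]) covered++; }
        if (p[i]) { pat++;   if (g[i]) used++;   }
    }
    if (drawn == 0 || pat == 0) return 0.0;
    return (double)covered / drawn - 0.5 * (1.0 - (double)used / pat);
}
```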
If I'm able to combine all ideas suggested, I think the program will be pretty nice.
For all of you who want to dig into my code, please post your e-mails; I don't like browsing profiles  ::)
Thanks everyone, and don't forget to think up new ideas!  :dance: