News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Hash table

Started by lamer, April 20, 2005, 08:51:25 PM

Previous topic - Next topic

thomasantony

Hi,
   I have a system in my OS for matching the commands to their coressponding functions. First I go through a list of commands declared as

length,"command"

and find the index of the command entered. Now I use this index into a table of addresses of functions. Is this a hash table?

Thomas :U
There are 10 types of people in the world. Those who understand binary and those who don't.


Programmer's Directory. Submit for free

AeroASM

NO, a hash table is where you use a hash function to make your lookup table (can't think why you would wnat to do that, but hey)

hutch--

Aero,

Have a look at the example I posted, for any word in the equate file, it will replace that equate in a source file with the value of that equate and the important thing with this type of application is that it does ALL of the replacements in 1 pass. The example is a hash table but you can also use a tree to do the same thing. In comparison a look up table is very slow if you have to call  it from beginning until you find the word you are searching for.

Typical use of this type of capacity is for assemblers and compilers and another application for this type of data structure is for the keys to data in a BIG database where searching it sequentially would be extremely slow.

With both hash tables and trees you can build either as you scan through text while you are testing other words to see if they are already in the data structure. Neither technique is simple to code but they can do things that are much faster than simplistic tecniques.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

pbrennick

Actually, it appears that both Thomas and AeroASM are correct although the method described by Hutch and Thomas is not one that I am familiar with.  Programming in Perl uses a method to build a hash table 'on the fly' similar to what AeroASM describes so I guess he is referring to that method.  It is used to build a table as a result of a search function involving a large database.  The data is then hex encoded in a certain way as an ascii representation with is why it is called a 'hash.'  This method described by Thomas and Hutch is not a 'proper' hash and I wonder why it is called such?

Paul

hutch--

The logic of a hash table is the use of a "hash"algo to correlate a numeric value derived from a word or text generally to a location in an array of predetermined member lengths.the method that TenKey described handles collisions by appending to or aranging a sequential list at every bucket that is used where the alternative is looking for the next empty bucket.

The method Thomas uses is a length based tree with a lookup table for each length. Works fine on small word counts but feed it 100000 words and you will understand why you need a more sophisticated data structure.

Speed is the method of comparison as a well written collision handling technique is fast so you balance the hash complexity against its speed and you trade between collisions and complexity to pick the fastest method. The advantage of a hash over a lookup is its calculation time and direct jump to the location  is shorter on any reasonable amount of data than methods of sequentially scanning for words.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

AeroASM

Does it work like this: you compute the hash of a string, then look that hash up using the hash value as an index into an array, then the other field of that row contains a pointer or something?

pbrennick

Who are you asking, me or Hutch.  If it is me then that kind of explains it.  Here is an example:


  # Read in all the variables set by the form
  read (STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
  @cgiPairs = split(/&/,$buffer);
  foreach $cgiPair (@cgiPairs)
  {
    # Create the hash
    ($name,$value) = split(/=/,$cgiPair);
    $name =~ s/\+/ /g;
    $name =~ s/%(..)/pack("c",hex($1))/ge;
    $value =~ s/\+/ /g;
    $value =~ s/%(..)/pack("c",hex($1))/ge;

    # Plug SSI security hole
    $value =~ s/<!--(.|\n)*-->//g;

    $form{$name} .= "\0" if (defined($form{$name}));
    $form{$name} .= "$value";
  }


This is actually a cgi example that takes input from html forms.  This is not really proper discussion for this forum so I think we should go back to talking about the other version of a hash which does.

Paul

hutch--

Aero,

That is basically what a has table does. The action is in producing a wide enough variation within the number range of the array or "buckets" so that you get a natural distribution that does not get too many collisions. What I did with the example that I posted was to use an array of prime numbers in conjunction with the characters of each word to try for a fast method of getting the final number within the array for each word.

It has a reasonable distribution but the next step was to try to make the collision handling as fast as possible as you always get collisions. From testing and in conjunction with most of the hash table design I researched, fast collision handling allows you to reduce the complexity of the hashing algo to get its speed up.

You can think of a hash table as a one step tree where you have a root node but very many branches.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

pbrennick

Hutch,
You are missing his point, which is "are you computing the hash of the string," I think not, it is a search in ascii which is something very different.

Here is an example of a hashed string:

"Hi there" is the same as "9E472F09082BB797429D07545C323836B7D175C0"  minus the quotes, of course.

If I am wrong, then I apologize.
Paul

hutch--

Paul,

Its probably among the range of algo variation although I am not familiar with the technique you have used, it does look like an interesting one. The crypto guys use the term hash in a similar way but they use far more complicated hashes because of the application they are working with.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

pbrennick

It is Perl's method and yes, the Crypto guys use the same method.  Perl needs the method for CGI as all transmissions to HTML are in the clear, it needs to somehow be encoded.  This was the result and has been around for a long time.

Paul

pbrennick

Hutch,
I think I have wasted enough of this topic, I just thought it was interesting and I knew the method you are using is really a souped up look up table and not really a hash though everyone calls it a hash (pretty cheeky).  I will stop posting in this topic, now.

Paul

Mark Jones

 Just for completeness, this is not a hash table:
  (but google lists lots of real hash tables.)  :bg
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

hutch--

Mark,

Is that before or after you have digested it ?
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Mark Jones

Before... else it would be a "gnash" table.  :toothy

I don't know why that photo came up when searching google for "hash table"... I can't think of anything that is "hashed" for breakfast. :)
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08