News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Associative arrays (Hash tables)

Started by mitchi, May 08, 2009, 01:41:56 AM

Previous topic - Next topic

mitchi

I'm interested to know how an associative array like this is implemented.
Example from php :

$salaries["Bob"] = 2000;
$salaries["Sally"] = 4000;
$salaries["Charlie"] = 600;
$salaries["Clare"] = 0;

echo "Bob is being paid - $" . $salaries["Bob"] . "<br />";
echo "Sally is being paid - $" . $salaries["Sally"] . "<br />";
echo "Charlie is being paid - $" . $salaries["Charlie"] . "<br />";
echo "Clare is being paid - $" . $salaries["Clare"];


I have a few ideas but I'm a bit confused by something. The hash function takes the small string as input and outputs an index number. But the index number can't be too big otherwise we would need an enormous amount of memory right? So it probably generates a small array index number (word sized maybe) and lots of collisions can potentially happen?

dedndave


mitchi

Yes I went there before asking here. But there's no code !  :bg
I'm hoping that someone has a snippet of code that shows how to do this.

hutch--

mitchi,

There is in fact an example in the masm32 example code on a hash table but it has never been simplified or particularly tidied up. If you do them right they are faster than a tree but suffer the problem that the array size must be fixed before the locations for members can be calculated. In another language I use hash tables that use variable length arrays for each member so you make massive table counts with empty members and only use memory for those "buckets" that get used.

The action in hash table design is a good fast hash algo that distributes the "buckets" across the number of slots and fast collision detection and handling.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

I used to work on satellite equipment.
One of the companies I worked for (Telonix) made transmitters that were attached to animals.
The satellite connection was of course, secure.
They used "hamming" codes to encrypt the access account numbers.
The satellite data modems at Fairchild/SSE Datacom used the same type of coding for encryption and for error correction/recovery.
I found it interesting, and what I classify as "FM" math magic.
I have not used hashing since, and have forgotten most of what I knew.
I think this is a little different than what you may be talking about.

What comes to mind, though is the decoder ring found at the bottom of a cereal box when I was a kid - lol
XLAT ?

hutch--

Dave,

Thats just about the other end of hash table design, encryption when done this way properly is very useful but in comparison to the type of hash required for hash tables that are used to replace trees, it is far too slow, the two factors are hashing speed and collission detection and handling. XLATB works OK for character conversion in examples like EBCDIC to ASCII conversions but it can be done faster with direct register pointers with normal Intel complex addressing and you can use any table size you like with the latter method.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

FORTRANS

Hi,

   I implemented a hash for a GIF encoder to keep track of the LZW
tokens.  As a basis for my code I used some GIF code in C from some
books I bought, a discussion on hashing in Knuth's AoCP chapter
on hashing, and some apparently broken LZW assembly code.  The
only really tricky part is defining a good hash function to efficiently
access the data array without collisions.  Even breaking some of
Knuth's rules doesn't seem to hurt, though in this simple case, it
should not matter too much.

   Try and define a good hash function for your application.  For a
text string a CRC might be a good starting point.

Regards,

Steve N.