
Fastest data structure (generally)

Started by Seb, March 02, 2008, 01:23:08 PM


Seb

Hey there!

I've got a question which isn't really related to Assembly (which is why I'm posting it here), but first a little background on why I'm asking it.

I'm developing a project in C at the moment (with some routines written in Assembly), and I'm currently in a position where I've got to choose what type of data structure to use with regard to performance and speed, but I'm unsure what to choose. I'll only insert elements once (i.e. when parsing a file), I'll never be removing any elements, and I'll never access the data randomly (access will be sequential). So what I'll be doing is: iterate through the data structure (which could be big) from start (0) to end (count) to find an element, and I'll be doing this operation many, many times over and over again (with a different element to find each time, of course).

So, all I'll be doing is essentially this (pseudo-code):

for each element in elements
    check if other_element matches element
        if so, do_something
    end check
next element


"Element" here is a pointer to a structure.

What data structure should I use to gain as much performance as possible? Would a simple dynamically allocated array be enough here? Or perhaps a linked list?

Thanks a lot! :bg

Regards,
Seb

Jimg

This all depends upon what you are actually searching for. 

1.  You will search for something within the data pointed to by an element in the array of elements (pointers).

2.  You have a pointer into the data, and you want to know which element this is, i.e. the record number.

3.  Something else we can't guess.


Each choice will bring up other questions.  Perhaps a concrete example of the data (fake data is fine) would help get an answer.

Seb

Jimg,

thanks for the reply. I'd say the first point is what I'll be doing.

More pseudo-code:


for each element in elements
    if element->value = other_element then
        call do_something
    end if
next element


"Value" and "other_element" in this case are DWORDs. So I'll be comparing DWORDs throughout the iteration, but this design might change in the future (I might switch to a byte array which'll mean I'll be doing memcmp()'s instead).

So yes, I'll be searching an array of elements (pointers to structures) for a DWORD value. The structure itself in which the value might be is quite small (< 50 bytes), and all structures will have been allocated and stored in memory after the parsing of the file is done. This actually draws me to another question I've got: what memory allocation method is preferred for many small (< 50 bytes) and quick allocations? Global/Local/Heap/VirtualAlloc?

Thanks again. :U

Regards,
Seb

Jimg

To answer your last question first, I would normally make my best guess as to how much memory I will need, perhaps based upon the size of the file or other criteria, then use HeapAlloc with the next larger 16K amount.  If I find that wasn't enough, I would then use HeapReAlloc to add another 16K as needed.  You will find that everyone has their preferred method of allocating memory, so hopefully someone else will respond also.
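Jimg's grow-in-16K-chunks idea can be sketched in portable C. Here malloc/realloc stand in for the Win32 HeapAlloc/HeapReAlloc calls he names, and the Buffer type and buffer_append name are mine, purely for illustration:

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK (16 * 1024)  /* grow in 16K steps, as suggested above */

typedef struct {
    char  *data;
    size_t used;
    size_t capacity;
} Buffer;

/* Append 'len' bytes, growing the buffer to the next 16K multiple as needed.
   Returns 0 on success, -1 on allocation failure. */
int buffer_append(Buffer *b, const void *src, size_t len)
{
    if (b->used + len > b->capacity) {
        /* round the required size up to the next multiple of CHUNK */
        size_t newcap = ((b->used + len + CHUNK - 1) / CHUNK) * CHUNK;
        char *p = realloc(b->data, newcap);  /* HeapReAlloc on Win32 */
        if (!p) return -1;
        b->data = p;
        b->capacity = newcap;
    }
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}
```

Growing in fixed 16K steps keeps the number of reallocations small while wasting at most one chunk of slack.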

As for the other part, I'm just not getting it.  If you are storing the data sequentially, then the pointer you are storing will also be in increasing order, so I would do a binary search.  But since you already know the value, what's the point of searching?  You already have the pointer to the data.  So obviously, I don't understand what you are saying yet.

Seb

Jimg,

thanks for the tip regarding the memory allocation.

As for my first question, I'm not searching for a specific value (as I may have made it sound), but I want to identify the structure holding the ID I'm using to search, because I need the other information stored in that structure.


for each element in elements
    if element->value = my_value then
        do something with element
    end if
next element


"my_value" is not an ID or index, so that's why I've got an "if" in there to compare the specific values. And the pseudo-code above is assuming "elements" is an array.

So if I've got an array/linked list/whatever which holds all pointers to the structures, and eg. I want to check if there's a pointer to a structure in there with the "value" member set to 0x10000000, and such an entry exists, I want the loop/iteration/whatever to find that entry and return me a pointer to the structure or do something with the information inside the structure.

So my question from the start was, what data structure would be the fastest and most efficient to use for this kind of operation, also taking the other facts into consideration (eg. that I'll never modify/remove entries in the data structure after the data structure has been built up, etc.)? It may sound complicated, but it's really not.

Thanks! :bg

Regards,
Seb

Jimg

Ok, for an example, assume you have the following:

       array of pointers to data (structures)

(note: all addresses are decimal in this example)

address within    actual pointer
pointer array     value

10000             20000
10004             20050
10008             20100
10012             20150


       Data

20000  Id (4 bytes)
20004  Name (20 bytes)
20024  Quantity (4 bytes)
20028  Spares (22 bytes)

20050 Id (4 bytes)
20054 Name
  etc.

20100 Id
  etc.
  .
  .
  .



As I understand it, you want to find the structure (record) that contains an Id of, say, 12345.

Do you want the address within the pointer array, or the actual address of the data itself?

Will the data record (which you call a structure) be a fixed size or a variable size? I.e. will all records be 50 bytes, or will some be larger, some smaller?

Can the addresses within the pointer array change?
If not, are they assigned sequentially?

I'll probably have more questions after I understand these.

Seb

Hi Jimg,

yep, you got it right. Using your example, I'd want to find a structure (or data record) with an ID of e.g. 12345, but this ID will not be constant, as I'm fetching new IDs and doing the same thing, i.e. checking whether they're in the array/linked list/whatever I end up using.

I want the address of the data itself, that is, I want a pointer to the structure with the "ID" itself, so I easily can process the rest of the information in the structure.

The structure will be fixed-size, and the structure type itself will be byte-aligned as I'm parsing a binary file and then mapping the file data into multiple sequential structures. By the way, will there be performance penalties because of the byte-alignment of the structure when accessing individual member fields in a structure from the pointer array? If so, would copying the data to a new structure of 4/8/16-byte alignment and then freeing the old structure perhaps speed things up when iterating through the pointer array and accessing the pointers? I'm not very concerned with the performance when parsing the file; the most important thing is that I've got an efficient method of storing the structures, and accessing them from beginning to end (or the other way round) many, many times in a row.

I guess the addresses within the pointer array could change, but if they do, it wouldn't be by me - all I'll be doing is accessing all the structure pointers over and over again, and this has to be fast and efficient.

I don't know yet if they'll be assigned sequentially, I guess that depends on what data structure (ie. linear array, linked list, etc.) I'll end up using and how/where the memory allocator stores the data, or? Or did I misunderstand you?

Regards,
Seb

Jimg

To answer what questions I can, yes, there is almost always a penalty for accessing data that is not dword aligned.

For the best performance, I would create my structure to be a multiple of 4 bytes.  If I could afford the memory, I would align each interior element on a 4 byte boundary, assuming you might want to search on other elements rather than just ID.
This can waste a lot of memory because of the padding, but it results in the quickest access.

Now, since the data is stored in a fixed size structure, and the array of pointers itself doesn't seem to contain anything other than the start of each data block, and the data doesn't seem to be in any particular order, I would not use the array of pointers at all, but just step through the data itself (only if you can align the start of each structure on a dword boundary).

I would do something like this-

mov esi, startofdata    ; startofdata is the pointer that HeapAlloc returns to you
mov eax, valuetofind
.repeat
    .if eax == [esi]
        mov eax, esi    ; return the pointer to the matching record
        jmp done
    .endif
    add esi, sizeof structure
.until esi > endofdata  ; endofdata = last address used when storing data
mov eax, -1             ; meaning id not found
ret
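For reference, the same linear scan written in C, assuming the fixed-size record layout from Jimg's earlier example (the Record and find_record names are hypothetical, not from the thread):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t id;          /* the DWORD being searched for */
    char     name[20];
    uint32_t quantity;
    char     spares[22];
} Record;                 /* fixed-size record, as in the example above */

/* Step through the records themselves (no pointer array) and return a
   pointer to the matching record, or NULL when the id is not present. */
Record *find_record(Record *start, size_t count, uint32_t id)
{
    Record *end = start + count;
    for (Record *r = start; r < end; r++)
        if (r->id == id)
            return r;
    return NULL;
}
```

Because the records are contiguous and fixed-size, each iteration is just a compare and a constant-stride pointer bump, which is exactly what the assembly version does.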




MichaelW

Perhaps I'm missing something here, but if there will be very many elements the binary versus sequential search decision will have a bigger impact on performance than the alignment of the data. If the data will be copied to memory anyway, perhaps it could be inserted in a binary tree (or trees if there will be more than one search field) and in the process manipulated to align the search field(s).

Jimg

Hi Michael-
I agree.  My ineloquence is causing trouble dragging the info needed out of Seb.  This would definitely be the next step if the performance warranted the added complexity.

u

 :toothy It's actually not too hard.  A linked list of structures of arrays.


Your unoptimized one:

ILKO struct
var1 dd ?
var2 dd ?
var3 dd ?
ILKO ends


Your optimized:

CLUSTER_SIZE equ 1024


ILKO_CLUSTER struct
;----[ linked-list stuff ]--------[
pPrev dd ? ; ptr to previous ILKO_CLUSTER in the linked-list. Optional
pNext dd ? ; ptr to next ILKO_CLUSTER in the linked-list
;---------------------------------/

;-----[ allocated bitfields stuff ]---------[
NumAlloc dd ? ; 0..CLUSTER_SIZE
AllocBitfield dd (CLUSTER_SIZE/32) dup (?)
;-------------------------------------------/

_padding db 20 dup (?)

;----[ structure of arrays ]-----------[
var1_a dd CLUSTER_SIZE dup (?)
var2_a dd CLUSTER_SIZE dup (?)
var3_a dd CLUSTER_SIZE dup (?)
;--------------------------------------/
ILKO_CLUSTER ends



;------[ assemble-time watchdog macro ]----------[
CheckStructMemberAlignment macro structMember
local intX
intX = structMember and 31

if intX ne 0
.err <please set _padding appropriately to align to 32-byte cache boundaries>
endif
endm
;-----------------------------------------------/


; do check on our struct
CheckStructMemberAlignment ILKO_CLUSTER.var1_a

u

In addition, in each "cluster" (yea, my terminology here was bad) you can put indexing info, hashing (easy),  min/max values-info (easiest). And as you can see, saving/loading to/from a file is a breeze. By adding an IsDirty flag to "clusters", you can also minimize HDD activity (though you need to align to 64kB size of saved struct).
And also you can have a "NextFreeElementIndex dd ?", when appending new data (when NumAlloc<CLUSTER_SIZE), as you will mostly (or only) be appending new elements and will rarely (or never) be deleting elements.

xmetal

Hash tables are definitely the fastest data structures, since they almost always take O(1) time to insert/access any element. However, they can incur slowdowns (as well as wastage of space) if their hashing algorithm does not distribute the elements uniformly enough.

If you can settle for O(log n) time, then I would recommend a height-balanced tree such as an AVL tree. It is relatively "worry free".
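A minimal sketch of the chained hash table idea in C, keyed on the DWORD id from earlier in the thread. The bucket count, the multiplicative hash, and all names here are my own choices for illustration, not anything xmetal specified:

```c
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 256   /* power of two, so the mask below is cheap */

typedef struct Node {
    uint32_t     id;
    void        *payload;   /* pointer to the caller's structure */
    struct Node *next;      /* chain of ids that hash to this bucket */
} Node;

typedef struct {
    Node *bucket[NBUCKETS];
} Table;

/* Knuth-style multiplicative hash; keeps the top 8 bits. */
static unsigned hash_id(uint32_t id)
{
    return (id * 2654435761u) >> 24;
}

/* O(1) insert at the head of the chain. Returns 0 on success. */
int table_insert(Table *t, uint32_t id, void *payload)
{
    Node *n = malloc(sizeof *n);
    if (!n) return -1;
    n->id = id;
    n->payload = payload;
    unsigned h = hash_id(id) & (NBUCKETS - 1);
    n->next = t->bucket[h];
    t->bucket[h] = n;
    return 0;
}

/* Expected O(1) lookup: walk only the one chain the id hashes to. */
void *table_find(const Table *t, uint32_t id)
{
    for (Node *n = t->bucket[hash_id(id) & (NBUCKETS - 1)]; n; n = n->next)
        if (n->id == id)
            return n->payload;
    return NULL;
}
```

As xmetal notes, the constant-time behaviour depends on the hash spreading the ids evenly; a skewed hash degrades a bucket's chain back into a linear scan.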

u

Here's a way to not-waste space via hashtables:

(for simplicity, 4 hash values are possible: )


int IndexOfHashElements[5]; // IndexOfHashElements[4] serves as a count of all elements
void* AllHashElements[1000];

void AddElement(int ElementHash, void* pElement){ // ElementHash=0..3
    if(IndexOfHashElements[4]>=1000){ // debugging
        debugCRASH("AllHashElements[] is already full here");
        return;
    }

    //-----[ make space for the new element ]----------[
    // i.e. AllHashElements[] is "aaabbcddd0" and we're inserting a value with hash "b"
    // 0 = empty space, a/b/c/d are elements with those 4 hashes,
    // B is the new element we're about to insert
    // so, we make the arrays like this:
    // pass 1:  aaabbc0ddd // we move the first "d" element to the first untaken cell
    // pass 2:  aaabb0cddd // we move the first "c" element to the cell emptied in pass 1
    // pass 3:  aaabbBcddd // we put the new element in the cell emptied in pass 2

    for(int curhash=3;curhash>ElementHash;curhash--){ // these are pass 1 and pass 2 above
        int arraystart    = IndexOfHashElements[curhash];   // first cell of this hash's run
        int arrayafterend = IndexOfHashElements[curhash+1]; // one past its last cell

        AllHashElements[arrayafterend] = AllHashElements[arraystart];
    }
    //-------------------------------------------------/

    // insert current element (pass 3 above)
    int elemIndex = IndexOfHashElements[ElementHash+1];
    AllHashElements[elemIndex] = pElement;

    //----[ increment indices, including the total count in [4] ]----[
    for(int curhash=4;curhash>ElementHash;curhash--){
        IndexOfHashElements[curhash]++;
    }
    //---------------------------------------------------------------/
}



And encapsulate this into a linked-list structure. The above code makes uneven hashing waste less space, i.e. 1000 dwords max here instead of NumHashes*1000 dwords max with linked lists of arrays.


You just have to KNOW almost exactly how fast some operations are, how much RAM you can waste, what could go wrong, and what exactly you want the data to be. That way you can create the fastest solution to any problem.

Jimg

All this really depends upon how many items there are, and how many times Seb needs to look up a value.  A one-time sort of the addresses of a million items would take under a second, and a binary search each time after that would probably be very quick, and much simpler to set up initially than a linked list.
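That sort-once-then-binary-search approach might look like this in C, using the standard qsort/bsearch on the pointer array (the Record layout and function names are illustrative assumptions):

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t id;
    /* ...rest of the record... */
} Record;

/* Compare two Record pointers by the id they point at. */
static int cmp_ptr_by_id(const void *a, const void *b)
{
    const Record *ra = *(const Record * const *)a;
    const Record *rb = *(const Record * const *)b;
    return (ra->id > rb->id) - (ra->id < rb->id);
}

/* One-time sort of the pointer array by id. */
void sort_records(Record **ptrs, size_t count)
{
    qsort(ptrs, count, sizeof *ptrs, cmp_ptr_by_id);
}

/* O(log n) lookup after the sort; returns NULL when the id is absent. */
Record *find_by_id(Record **ptrs, size_t count, uint32_t id)
{
    Record key = { id };
    Record *pkey = &key;
    Record **hit = bsearch(&pkey, ptrs, count, sizeof *ptrs, cmp_ptr_by_id);
    return hit ? *hit : NULL;
}
```

Since Seb never inserts or removes elements after parsing, the sorted order never goes stale, which is what makes the one-time sort pay off.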