The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: Glenn9999 on December 22, 2010, 07:57:04 AM

Title: Multi-Threading as it applies to Multi-Processing
Post by: Glenn9999 on December 22, 2010, 07:57:04 AM
Has anyone studied the topic as indicated in the subject?  I've done some work thinking about it and doing a couple of algorithms, but just curious if anyone here has worked on it, especially in light of utilizing the multiple cores on most newer CPUs and so forth.  For example, where the ability to use the multiple cores outweighs the code it takes to set up and manage the cores and so forth?

Curious more than anything...
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: oex on December 22, 2010, 12:41:33 PM
1 thread per core is optimal
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: dedndave on December 22, 2010, 12:56:36 PM
that seems like kind of a terse answer for a complicated subject   :P

you might use the forum search tool
Hutch was playing with some ideas on this a while back

there are some good reasons for creating alternate threads that aren't necessarily on a per-core basis
for example, a message box with a cancel button on one thread while the code runs on another
this allows a user to cancel an operation that has tied up the CPU

as for trying to get better performance by controlling affinity...
the general consensus is that it's usually best to let the OS handle it   :P
there may be cases where selecting a specific core may help on one machine and hinder on another
of course, it depends a lot on your application

however, there are some interesting possibilities
in a manner of speaking, having alternate cores (or hyper-threaded cores) may be used as an extra set of registers
while this idea may sound appealing, i have yet to see any particularly "cool" implementation of it

multi-threading certainly has a place in code-performance profiling
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: redskull on December 22, 2010, 01:12:21 PM
Quote from: oex on December 22, 2010, 12:41:33 PM
1 thread per core is optimal

1 running thread per core is optimal.

But beware, as this guidline can be misleading.  A running thread is not waiting for I/O, user-input, semaphores, mutxes, disk reads, or anything else; it is just crunching numbers.  Most programs do more waiting than running, so the best thread/core setup is highly application depedenent.

-r
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: oex on December 22, 2010, 02:32:23 PM
There is some less terse info here :lol

http://www.masm32.com/board/index.php?topic=13374.0

Always good to understand the most important parts first :bg

Like using adjectives in English sentences :lol
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: dedndave on December 22, 2010, 03:35:54 PM
QuoteThere is some less terse info here

QuoteLike using adjectives in English sentences

less terse = more verbose   :P
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: Glenn9999 on December 22, 2010, 04:00:10 PM
There's some good information here.  Thank you for posting it.  It will be helpful to study as I need to improve my thread handling and control code that I already have.

But maybe I should clarify what I was looking to discuss in making this post.  What I'm talking about is the utilization of those multiple cores on a job in order to make it go faster.   Given that, the answer to be able to use the multiple cores is to have multiple threads.  Of course, the job in question would have to be an algorithm that tends to parallelism.

For example, one of the algorithms I've identified is quick sort.   As a divide and conquer algorithm, you can piece out sections initially and instead of recursing the procedure on them, you can call a number of threads = number of processors and get a performance increase if you have enough data to make it worthwhile.  I've done this and get a clock time anywhere from n to (n / p) where p is the number of processors on the computer and n is the single core clock time.  However, this variation occurs depending on the sizes of the sections for this algorithm.  N / P occurs when the sections are almost evenly divided by the algorithm.  Eventually I'd like to try getting the code so I can keep track of the number of active threads and restart them on a new section if one is terminated before the others are done.  I wouldn't expect the code speed to be (n/p), but I would expect it to be more consistent towards that.

Now by "ability to use the multiple cores outweighs the code it takes to set up and manage the cores and so forth", I will note there is an overhead in setting up and managing all the threads.  The multi-threaded algorithm against 1 processor is slower than a single-threaded model on a similar amount of data.  As well, the single-threaded algorithm is faster on a multi-core system against a small amount of data.

I was wondering if anyone had any observations along these lines.

(I'll probably pick this thought back up as a project as soon as the one I'm working on is over.)
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: dedndave on December 22, 2010, 04:50:17 PM
i think if you go to all the trouble to set up threads (which isn't that difficult),
then it should be a simple matter to make some pre-determinations about the size of the data and the strategy to be used   :P
you may not be able to easily optimize it for all platforms, but you should be able to achieve "close enough"
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: oex on December 22, 2010, 05:07:44 PM
Quote from: dedndave on December 22, 2010, 03:35:54 PM
less terse = more verbose   :P

:lol How embarrassing being taught English by Americanz :red

:lol Now I know how my computer feels
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: dedndave on December 22, 2010, 06:03:03 PM
shooooot
wifee is a Brit
i take lessons every day - and not just in English
if it weren't for her, i wouldn't know how to act  :eek
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: zemtex on December 23, 2010, 03:45:38 PM
Quote from: oex on December 22, 2010, 12:41:33 PM
1 thread per core is optimal

...and 2 for hyper threading  :bdg

If you have a six core cpu you have 6 physical cores and 12 logical cores.
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: oex on December 23, 2010, 04:16:30 PM
:lol I was talking about logical cores I was just being less verbose :P
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: hutch-- on December 23, 2010, 04:30:15 PM
Glen,

Its always useful to start at the other end, core count matching thread count where intensive data is being processed. The issue that will dictate in part what you do with multiple threads is interval granularity versus thread overhead. When you are performing tasks that take a reasonable amount of time then distributing that load across multiple cores makes sense but as the intervals get shorter the thread handling overhead gets larger and when you get down to low millisecond counts that same thread handling overhead will kill any speed advantage of multiple cores running multiple threads.

Unless you know for sure what the core count will be you need to make the data processing scalable so that it has some chance of adapting to the core count. The worst scenario is a vast thread count with a low core count as you pick up all the thread handling overhead without the processing powewr to handle it.
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: zemtex on December 23, 2010, 09:46:29 PM
Amount of threads can be automatically adjusted in the program/game during runtime to find the optimal use of threads. But to use 1 thread per core is a good template to begin with. What works good for you during compile-time is not the same result that will show up when your friend runs it (ofcourse  :lol)

Some programs run a test in the background in a black screen to find this optimal setting and then it launch to give you the interface.
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: Glenn9999 on December 25, 2010, 08:56:43 AM
Quote from: hutch-- on December 23, 2010, 04:30:15 PM
Its always useful to start at the other end, core count matching thread count where intensive data is being processed.

Yes, that's what I did in my test.  Query the number of processors and that becomes the maximum number of threads my multi-processing code spawns.

Quote
the intervals get shorter the thread handling overhead gets larger and when you get down to low millisecond counts that same thread handling overhead will kill any speed advantage of multiple cores running multiple threads.

Exactly.  This number varies among systems, I'm sure, but you can always pick an amount of data where single processing needs to kick in.

On this, I actually got to thinking about the MD5 stuff I've been trying to optimize.  Since my testing routine pulls in a much larger block of data than 64 bytes when reading from the hard disk, it might not hurt to try testing multi-processing on that.   And it'd be much easier to scale than quicksort.   For handling the middle of the data (always multiple of 64 bytes), divide it up evenly, then let threads = processor count handle that (provided pointers and correct data sizes for each piece).  It might not net a huge performance advantage but it just might, too.

Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: hutch-- on December 25, 2010, 10:57:10 AM
If you can see the processing duration in well over milliseconds then if the data allows it to be chopped up in blocks, parallel processing makes sense as you can process blocks in parallel then write them back in sequence. You will need to have some thread synchronisation but its to your advantage to minimise this as it adds to the total overhead. Now for example if you are streaming data from disk, its possible to run one thread that does the disk input and more than one other to process the data once it is in memory.

It can get very messy but if you are streaming data fast enough for long enough you can justify the work to set up a technique like this, especially if you can set up the data processing with later instructions that are designed to handle parallel processing in each core. You basically need somewhat more processing power than the maximum disk input so that you have a bit of headroom for thread synchronisation.
Title: Re: Multi-Threading as it applies to Multi-Processing
Post by: dedndave on December 25, 2010, 12:34:18 PM
i don't think the number of threads per core is the real issue, within reason
it's more important to talk about what they're doing
i.e., create a new thread when it's meaningful to do so
a simple example might be a window update loop
in that case, the new thread has purpose
it wasn't created in an attempt to enhance performance, per se
it has a task to perform, which frees up the main program to perform other tasks

now, in some cases, you might be able to split a single task to enhance performance
i think you could read a chunk of a file and split that into sections to search for text - something like that
unfotunately, so many functions that we'd like to speed up must be performed in serial order