Message boards : Number crunching : Hyper Threading or not?
Author | Message |
---|---|
Christian Diepold Joined: 23 Sep 05 Posts: 37 Credit: 300,225 RAC: 0 |
Hi! It's the same old question as with most projects, but does Rosetta benefit from HT being turned on, or would it be better to turn it off? I'm not asking about the scientific benefit of getting results returned faster, but about the sheer number of completed WUs. I'm asking because my P4 2.8 HT has the same RAC (180) as my Athlon XP 2000+, even though it has 2 WUs running at the same time. Both have been crunching for a month now, so RAC should be pretty accurate by now. |
Ethan Volunteer moderator Joined: 22 Aug 05 Posts: 286 Credit: 9,304,700 RAC: 0 |
Evening. A P4 2.8 is middle of the road in terms of modern CPUs. Hyperthreading 'should' give you a couple of extra percent, but I'm guessing that is only true if you allow both threads to persist in memory. If I owned a similar setup (which I do), I'd suggest setting it to one CPU and allowing it to reside in memory. True, it may be faster to run 2x, but down the road R@H may run larger proteins and your computer may not like it. |
Joined: 16 Sep 05 Posts: 59 Credit: 99,832 RAC: 0 |
"... but down the road R@H may run larger proteins and your computer may not like it." Ethan, could you explain what you mean by this? It won't like it in what sense? The amount of memory used by 2 work units running at the same time? BOINC.BE: For Belgians who love the smell of glowing red cpu's in the morning Tutta55's Lair |
Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
"Ethan, could you explain what you mean by this? It won't like it in what sense? The amount of memory used by 2 work units running at the same time?" OK, I'm not Ethan, but... Two WUs take up more RAM than one. Even my 1GB 3.4GHz P4 (running Windows 2003 Server) gets up to 800 or 900MB used at times. If I had 512MB or less, it would be a lot slower because of constant memory swapping to the hard disk. This would be more evident on WUs that need more RAM. Also, two WUs running at the same time make the CPU work harder, so it will likely get hotter and start to clock-throttle. Mine runs at 67-69 Celsius with one thread, and often hits 72 when running two. (Yes, I'm about to fix that: I have ordered a Zalman CNPS9500 cooler.) All that heat ~may~ not do the computer much good in the long run either (and I'm not just talking about the CPU). If we're swapping to disk all the time and the CPU is clock-throttling, running only 1 thread ~may~ be more productive. *** Join BOINC@Australia today *** |
Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 0 |
My 3.2GHz HT Prescott was getting warmer. It seemed to peak if I had 2 Predictor units running at the same time. It started to climb towards 70C. I opened it up and cleaned the fins of my CPU heatsink. Now, with 2 Predictors running, it gets up to 62C. If you are running into CPU throttling, I'd suggest giving your system a good clean, and if that doesn't help, adding one or more case fans, or upgrading your CPU heatsink. *** edited for spelling *** Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Andrew Joined: 19 Sep 05 Posts: 162 Credit: 105,512 RAC: 0 |
"It's the same old question as with most projects, but does Rosetta draw a benefit from HT being turned on or would it be better to turn it off? I'm not talking about the scientific benefit, to get results returned faster but about the mere number of completed WUs." In general, if your computer can handle 2 WUs running, then yes, it's better to have HT enabled. Just FYI, there's a thread (here) which talks about Intel and AMD. EDIT: Interestingly, if you're running SETI's super-optimized client, then HT is of little gain because the bottleneck is the FSB. (link) |
Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
"If you are running into CPU throttling, I'd suggest giving your system a good clean, and if that doesn't help, adding one or more case fans, or upgrading your CPU heatsink." I already have 2 front intakes and 3 rear outlets (including the PSU). But the stock cooler is on the way out in a matter of days, so there's little point cleaning it now. Thanks for the suggestion anyway; it may help someone else. |
Christian Diepold Joined: 23 Sep 05 Posts: 37 Credit: 300,225 RAC: 0 |
|
BadThad Joined: 8 Nov 05 Posts: 30 Credit: 71,834,523 RAC: 0 |
<------ From emjem, not BadThad: Windows Task Manager shows that my CPU is spending ~50% of its time on each of the two WUs running. So on the surface it would appear that an HT CPU does twice as much work as a non-HT unit. But this could very well be false logic, since not all is as it appears in the CPU world. It would be nice to have some FACTS on this issue. As for the memory usage issue with HT, I don't see a problem. Two of my P4 3.2 systems use ~69 MB for each 'CPU'. So it would seem that any system with at least 256 MB of RAM has room for around 80% growth in WU size. |
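The memory headroom estimate above can be checked with a little arithmetic. This is a minimal sketch, not anything from the project itself: the function name and the `reserved_mb` parameter (RAM set aside for the operating system and other programs) are assumptions added for illustration, and the figures are the ones quoted in the post.

```python
def growth_headroom(total_ram_mb: float, per_wu_mb: float, n_wus: int,
                    reserved_mb: float = 0.0) -> float:
    """Fraction by which each WU could grow before the WUs fill usable RAM.

    reserved_mb is a hypothetical allowance for the OS and other programs;
    the post's estimate effectively treats it as zero.
    """
    usable_mb = total_ram_mb - reserved_mb
    used_mb = per_wu_mb * n_wus
    if used_mb <= 0 or usable_mb <= 0:
        raise ValueError("memory figures must be positive")
    return usable_mb / used_mb - 1.0


if __name__ == "__main__":
    # Figures from the post: 256 MB of RAM, two WUs at ~69 MB each.
    print(f"no OS allowance:   {growth_headroom(256, 69, 2):.0%}")
    # Same machine with an assumed 64 MB kept back for the OS.
    print(f"64 MB held for OS: {growth_headroom(256, 69, 2, reserved_mb=64):.0%}")
```

The estimate is sensitive to how much RAM everything else on the machine needs, which is why the second call shows noticeably less room than the first.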
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
"Windows task manager shows that my CPU is spending ~50% of its time on each of the two WUs running. So on the surface it would appear that an HT cpu does twice as much work as a non-HT unit. But this could very well be false logic since not all is as it appears in the cpu world. It would be nice to have some FACTS on this issue." Each result is getting 50% of the CPU time, but each result also takes longer to complete than it would if it were running alone. You can easily get the facts for your specific CPU by downloading the SETI "reference" WU, running it by itself with HT turned off, then running two copies in parallel with HT on, and comparing the times. With a bit more effort (and with your network connection turned off to prevent uploading) you can do the same with any given WU from any project. (Not for the nervous; requires a lot of file moving and/or XML editing.) The gain seen from HT varies dramatically with the project and the application used, as well as with the specific CPU, size of cache, etc., but seems to _generally_ be between 3 and 10%. Definitely not double. The largest improvement is seen when the two threads are running totally different applications, such as SETI and Rosetta, rather than two Rosettas. HT isn't dual-core; there aren't REALLY two CPUs in there, just some smart logic that says "this program isn't using the floating point unit at this instant, I'll let this other program have it for a while". Without HT, only one program can be running on the CPU at any moment; with HT, two programs can run, but if they both want the same PART of the CPU at the same time, one has to wait. Some heavily used parts may actually be present twice, but if the entire CPU were there twice, it'd be dual-core and not HT! |
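The timing comparison described in this post (one WU alone versus two in parallel) boils down to a throughput ratio. A minimal Python sketch, assuming you have measured both wall-clock times yourself; the function name and the sample timings are hypothetical and do not come from the thread.

```python
def ht_throughput_gain(solo_seconds: float, paired_seconds: float) -> float:
    """Fractional throughput gain from running two WUs at once with HT on.

    solo_seconds:   wall-clock time for one WU running alone (HT off)
    paired_seconds: wall-clock time for each of two identical WUs
                    running side by side (HT on)
    """
    if solo_seconds <= 0 or paired_seconds <= 0:
        raise ValueError("times must be positive")
    solo_rate = 1.0 / solo_seconds      # WUs per second, one at a time
    paired_rate = 2.0 / paired_seconds  # WUs per second, two at a time
    return paired_rate / solo_rate - 1.0


if __name__ == "__main__":
    # Made-up timings: 60 minutes alone, 113 minutes each when paired.
    gain = ht_throughput_gain(60 * 60, 113 * 60)
    print(f"HT throughput gain: {gain:.1%}")
```

A gain of zero means the paired WUs each took exactly twice as long, so HT bought nothing; a gain of 100% would mean no slowdown at all, which is what a true second core would give.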
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,724,195 RAC: 678 |
"Bumping this thread for HT in Nehalem Intel chips? I'd guess their HT technology has come a long way by now..." It still has the same problems though: there are some internal memory parts that are shared, and BOINC just seems to want to use those A LOT! One problem with the older designs (I do not know if the newer chips do it or not) was that when workunit A wanted to use the memory and then workunit B wanted to use the same memory, the memory was flushed and then reloaded for B. That slowed things down a lot, but if they weren't doing exactly the same thing it probably had to be done anyway. Faster internal clock speeds may make HT faster, but compared with non-HT on the same CPU, the gain is still less than twice as fast. |
Joined: 3 Nov 05 Posts: 1834 Credit: 124,260,318 RAC: 9 |
"Bumping this thread for HT in Nehalem Intel chips? I'd guess their HT technology has come a long way by now..." From past posts, Nehalem HT does improve throughput considerably and is a lot better than it was on the P4, but it will need twice the RAM. I couldn't give any figures, although I might be able to soon. I've got a build that will be either a Phenom II X6 or an i7... |
M Joined: 24 Oct 07 Posts: 5 Credit: 119,215 RAC: 0 |
Hi, just chipping in. I have been running Rosetta on an i7-920 with 6 GB of RAM for a few months. This is what I observed: with HT off, the CPU works on 4 WUs at once; with HT on, it works on 8 WUs at once. I have not seen any time penalty from having HT on. It's practically a 100% improvement; I was quite impressed. The CPU is a little warmer with HT on and running at 100%, which is understandable given the improvement it gives. Y. |
Jochen Joined: 6 Jun 06 Posts: 133 Credit: 3,847,433 RAC: 0 |
You won't see a time penalty; you will just get fewer credits per task with HT on. I would guess the gain with HT on is approx. 50 percent. But there is no way it's going to be a gain of 100 percent. cu Joe |
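If run time per task stays about the same and the HT effect shows up in credit per task, as this reply suggests, the net gain can be estimated from credit rather than from timings. A rough sketch under that assumption; the function name and the credit figures are made up for illustration.

```python
def credit_rate_gain(tasks_off: int, credit_per_task_off: float,
                     tasks_on: int, credit_per_task_on: float) -> float:
    """Fractional change in total credit per batch of tasks, HT on vs. HT off.

    Assumes every task runs for about the same wall-clock time in both
    configurations, so total credit per batch stands in for throughput.
    """
    batch_off = tasks_off * credit_per_task_off
    batch_on = tasks_on * credit_per_task_on
    if batch_off <= 0:
        raise ValueError("HT-off credit must be positive")
    return batch_on / batch_off - 1.0


if __name__ == "__main__":
    # Hypothetical numbers: 4 tasks at 100 credits each with HT off,
    # 8 tasks at 75 credits each with HT on.
    print(f"net gain: {credit_rate_gain(4, 100.0, 8, 75.0):.0%}")
```

With these made-up figures, twice as many tasks each earning three quarters of the credit works out to the roughly 50 percent gain guessed at in the post; a full 100 percent would require credit per task not to drop at all.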
Speedy Joined: 25 Sep 05 Posts: 163 Credit: 826,597 RAC: 0 |
One thing I've noted on my i7 980X is that with HT on, for ab_07_02 & 07_16 tasks, the steps tick over very slowly when I open the graphics. When HT is off, the steps roll over very fast. Have a crunching good day!! |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"One thing I've noted on my i7 980X is that with HT on, for ab_07_02 & 07_16 tasks, the steps tick over very slowly when I open the graphics. When HT is off, the steps roll over very fast." Some proteins roll over fast simply because they are smaller. But you'd expect each WU to run a bit slower with HT on than with it off. There's still a performance boost, since you run twice the number of WUs. |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,276,053 RAC: 0 |
Hope this is not too OT... Is there a "big" difference between HT on the i7s and the (much) older Xeon 7020s? For some reason, I seem to be fighting against the idea that the dual-core Xeon 7020s actually have "real" HT, and thus are actually capable of running 4 threads each. With four 7020s (32 GB RAM), I'm expecting 8 cores / 8 threads (at least by how a current i7 would define a "thread"). But the specs say it would be 8 cores, 16 threads... Anyone with more experience care to opine? Defeat Censorship! Wikileaks needs OUR help! Learn how you can help (d/l 'insurance' file), by clicking here. "Whoever would overthrow the liberty of a nation must begin by subduing the freeness of speech" B. Franklin |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"Hope this is not too OT..." Each CPU has two cores. Each core can run 2 threads. So... you have four 7020s, which is 8 cores, and since each core can squeeze in two threads at a time, you get 16 threads. |
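The count in this reply is a straight product of sockets, cores per socket and threads per core. A small Python sketch of the same arithmetic; the function name is hypothetical, while `os.cpu_count()` is the standard-library call that reports how many logical CPUs the operating system exposes on the machine running the script.

```python
import os


def logical_threads(sockets: int, cores_per_socket: int,
                    threads_per_core: int) -> int:
    """Number of hardware threads the OS should see for a given layout."""
    return sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    # Four dual-core Xeon 7020s with HT enabled, as described in the post.
    print("4 x Xeon 7020, HT on: ", logical_threads(4, 2, 2))
    # The same box with HT disabled in the BIOS.
    print("4 x Xeon 7020, HT off:", logical_threads(4, 2, 1))
    # What this machine reports (may be None if it cannot be determined).
    print("logical CPUs on this machine:", os.cpu_count())
```

BOINC schedules one single-threaded task per logical CPU by default, so the product above is also the number of WUs that would normally run at once.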
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,276,053 RAC: 0 |
Thanks, Chilean. Perhaps my question is better rephrased as: how much (performance) difference is there between a dual-core CPU with "NetBurst microarchitecture" HT and a dual-core CPU with "Nehalem microarchitecture" HT? |
©2025 University of Washington
https://www.bakerlab.org