Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> All Geforce 10X0 GPUs have DP

At 1:32, which is laughable. My AMDs are 1:4.

> and Moo! doesn't need it at all.

Sure? Most projects need some DP. MW is entirely DP, but other projects need it sometimes. At 1:32 that's gonna slow things down immensely.

> But I'm testing a GTX 275 since yesterday and I get there the same issue with Moo!, 100% of a CPU core to feed it, the HD3850 I had before needed 1-2%. On Milkyway the GTX 275 however does not need that much, so maybe it's a Moo! (or distributed.net) thing.

They're not the best designed tasks. I have a dual GPU card and it doesn't notice the second chip, yet every other project does. Mind you I've heard of Nvidia needing high CPU usage on a number of projects, I can't remember what the cause is. Why do you say "or distributed.net"? Are they not one and the same? |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
> > All Geforce 10X0 GPUs have DP
>
> At 1:32, which is laughable. My AMDs are 1:4.

IIRC only Tahiti-Chips and the Radeon VII have 1:4, others are worse, down to at least 1:16, but that doesn't matter for CPU usage. If the GPU has it and the app supports and needs it, it will use it regardless of the ratio, which the app doesn't even know anything about.

> > and Moo! doesn't need it at all.
>
> Sure? Most projects need some DP. MW is entirely DP, but other projects need it sometimes.

Sure, since it runs on SP cards and doesn't have additional CPU load there. IIRC it's doing integer only like Collatz did. The only project I know that needs DP "sometimes" and is able to run on SP cards and doing the DP part on CPU is Einstein.

> At 1:32 that's gonna slow things down immensely.

Still several times faster than doing it on CPU, hence Einstein is doing the DP part even on 1:64 GPUs.

> > But I'm testing a GTX 275 since yesterday and I get there the same issue with Moo!, 100% of a CPU core to feed it, the HD3850 I had before needed 1-2%. On Milkyway the GTX 275 however does not need that much, so maybe it's a Moo! (or distributed.net) thing.
>
> They're not the best designed tasks. I have a dual GPU card and it doesn't notice the second chip, yet every other project does. Mind you I've heard of Nvidia needing high CPU usage on a number of projects, I can't remember what the cause is.

Don't remember the cause exactly either. On Milkyway it seems to be acceptable for my card at least, but I've seen other Nvidias there using 100% of a CPU core as well, so YMMV.

> Why do you say "or distributed.net"? Are they not one and the same?

No, they are separate projects (even if Moo tasks come from distributed.net). Moo! is responsible for the wrapper application, distributed.net for the actual client (and that was using all the CPU time, not the wrapper). |
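The point about ratios can be put into rough arithmetic. The sketch below assumes the whole task stays on the GPU and that the DP:SP figure is purely a throughput ratio; the function name and every number in it are made up for illustration and do not describe any real card or project.

```python
def run_time_seconds(total_ops, dp_fraction, sp_rate, dp_ratio):
    """Time to run a workload entirely on one GPU.

    total_ops   -- total floating-point operations in the task
    dp_fraction -- share of those operations that are double precision (0..1)
    sp_rate     -- single-precision throughput in operations per second
    dp_ratio    -- DP:SP throughput ratio, e.g. 1/32 for a "1:32" card
    """
    sp_ops = total_ops * (1.0 - dp_fraction)
    dp_ops = total_ops * dp_fraction
    dp_rate = sp_rate * dp_ratio
    return sp_ops / sp_rate + dp_ops / dp_rate


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
TOTAL_OPS = 1e12
SP_RATE = 1e12  # same single-precision speed assumed for both cards

for label, ratio in (("1:4", 1 / 4), ("1:32", 1 / 32)):
    t = run_time_seconds(TOTAL_OPS, dp_fraction=0.25, sp_rate=SP_RATE, dp_ratio=ratio)
    print(f"{label} card: {t:.2f} s")
```

Under these assumptions a task with a quarter of its operations in DP simply takes longer on the 1:32 card. Nothing in the model moves work to the CPU; the ratio only changes how long the DP part takes.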
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> IIRC only Tahiti-Chips

Which is why I go for those.

> and the Radeon VII have 1:4, others are worse, down to at least 1:16

But always 2 times better than Nvidia.

> but that doesn't matter for CPU usage, if the GPU has it and the app supports and needs it, it will use it regardless of the ratio, which the app doesn't even know anything about.

Wrong, if it needs 1/4 of its stuff done in DP, and the card can't do 1:4, then some will have to go to the CPU.

> Sure, since it runs on SP cards and doesn't have additional CPU load there, IIRC it's doing integer only like Collatz did. The only project I know that needs DP "sometimes" and is able to run on SP cards and doing the DP part on CPU is Einstein.

There are many of them, Folding, Primegrid, World Community Grid for example.

> Still several times faster than doing it on CPU, hence Einstein is doing the DP part even on 1:64 GPUs.

Not if you have a Ryzen 9 CPU, those are pretty fast even compared to GPUs. Provided you can multithread. For some reason I don't see many multithread CPU + GPU tasks. In fact I was surprised at Greg's 2CPU+2NV Moo task. Although it probably just split its workload in two and did 1 CPU + 1 NV for each half.

> Don't remember the cause exactly either, on Milkyway it seems to be acceptable for my card at least, but I've seen there other Nvidias using 100% of a CPU core as well, so YMMV.

Odd, since MW is one of the least CPU intensive GPU tasks.

> No, they are separate projects (even if Moo tasks come from distributed.net), Moo! is responsible for the wrapper application, distributed.net for the actual client (and that was using all the CPU time, not the wrapper).

So Moo sometimes does stuff from places other than distributed net? And distributed net is a non-Boinc program you can run separately? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
But MOO is sharing the cards with FAH on my system. So maybe that downgrades the time a bit? And I don't have the fastest cards. A 1080 plain and a 1050TI |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> But MOO is sharing the cards with FAH on my system.

If the GPUs are also doing Folding, then Moo is using a CPU core for less than a GPU. Anyway, you're doing almost as much Moo as I expect from those cards, I think it was 1.2 times as much as me and I thought it should be 1.5, so all is well. Rule 1: If heat is pouring off the chip, it's doing a lot of work. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
> > but that doesn't matter for CPU usage, if the GPU has it and the app supports and needs it, it will use it regardless of the ratio, which the app doesn't even know anything about.
>
> Wrong, if it needs 1/4 of its stuff done in DP, and the card can't do 1:4, then some will have to go to the CPU.

The code of the app needs to support it, like in the case of Einstein. GPU and CPU are not one thing where the code gets executed and can choose by itself where it feels it might run faster, or do "some" stuff on the CPU if the DP performance is low on the GPU. The detection of SP/DP on the GPU and the fallback to CPU for DP calculations must be implemented, otherwise a DP app will just error out on an SP card.

> > Sure, since it runs on SP cards and doesn't have additional CPU load there, IIRC it's doing integer only like Collatz did. The only project I know that needs DP "sometimes" and is able to run on SP cards and doing the DP part on CPU is Einstein.
>
> There are many of them, Folding, Primegrid, World Community Grid for example.

OPNG shouldn't need it IIRC, no idea about the others, but unless they implemented it like Einstein, if they need DP, the app will not run on SP cards, just like Milkyway.

> > Still several times faster than doing it on CPU, hence Einstein is doing the DP part even on 1:64 GPUs.
>
> Not if you have a Ryzen 9 CPU, those are pretty fast even compared to GPUs. Provided you can multithread. For some reason I don't see many multithread CPU + GPU tasks.

It will still run on GPU, the app doesn't know if the CPU is faster: https://einsteinathome.org/content/fgrp5-cpu-and-fgrpb1g-gpu-why-does-crunching-seem-pause-90

> In fact I was surprised at Greg's 2CPU+2NV Moo task. Although it probably just split its workload in two and did 1 CPU + 1 NV for each half.

Never seen that either.

> So Moo sometimes does stuff from places other than distributed net?

No, but like yoyo, which does as one of their projects the OGR project from distributed.net, they could if the admin would want to.

> And distributed net is a non-Boinc program you can run separately?

Yes. https://www.distributed.net/Download_clients |
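A minimal sketch of the kind of check described above, assuming an OpenCL-style device that advertises double-precision support through its extension string (`cl_khr_fp64` is the standard Khronos name; `cl_amd_fp64` appeared on some older AMD drivers). The function names and the `has_cpu_fallback` flag are hypothetical and are not taken from any project's source. In a real application the extension string would come from the OpenCL runtime rather than a literal.

```python
def supports_double(extensions):
    """True if an OpenCL extension string advertises double precision."""
    names = set(extensions.split())
    return "cl_khr_fp64" in names or "cl_amd_fp64" in names


def choose_dp_path(extensions, has_cpu_fallback):
    """Decide where the double-precision stage of a task runs.

    Returns "gpu" or "cpu"; raises RuntimeError if the device has no DP
    and the application was written without a CPU code path.
    """
    if supports_double(extensions):
        # The DP:SP ratio is never consulted: if DP exists, it is used.
        return "gpu"
    if has_cpu_fallback:
        return "cpu"
    raise RuntimeError("device lacks double precision and app has no CPU fallback")


# Hypothetical extension strings for two imaginary devices.
dp_card = "cl_khr_fp64 cl_khr_icd"
sp_card = "cl_khr_icd"

print(choose_dp_path(dp_card, has_cpu_fallback=False))  # DP runs on the GPU
print(choose_dp_path(sp_card, has_cpu_fallback=True))   # app with both code paths
try:
    choose_dp_path(sp_card, has_cpu_fallback=False)      # DP-only app on an SP card
except RuntimeError as err:
    print("error:", err)
```

The last case is the "error out on an SP card" behaviour: without a second code path written into the application, there is nothing for the task to fall back to.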
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
> Rule 1: If heat is pouring off the chip, it's doing a lot of work.

The fans get their workout every day. At the moment they are just doing the easy stuff with FAH. Apparently TN needs everything cpu, though I suppose I can write a script to put that down to 14 or something. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> The code of the app needs to support it, like in the case of Einstein. GPU and CPU are not one thing where the code gets executed and can choose by itself where it feels it might run faster, or do "some" stuff on the CPU if the DP performance is low on the GPU. The detection of SP/DP on the GPU and the fallback to CPU for DP calculations must be implemented, otherwise a DP app will just error out on an SP card.

Isn't it possible to write the program so it does as much DP as possible on the GPU, but if that's not enough, use the CPU as well?

> OPNG shouldn't need it IIRC, no idea about the others, but unless they implemented it like Einstein, if they need DP, the app will not run on SP cards, just like Milkyway.

I'm sure I saw someone mention there's some DP in it. I would assume a decent program would use the CPU for that bit if the card was SP only (those actually exist? I thought all cards had at least a tiny bit of DP). Anyway, judging by the speed OPNG runs on my different cards, there's DP in it.

> It will still run on GPU, the app doesn't know if the CPU is faster: https://einsteinathome.org/content/fgrp5-cpu-and-fgrpb1g-gpu-why-does-crunching-seem-pause-90

Surely the app can get the benchmark data from Boinc?

> Never seen that either.

Have you tried Moo on two Nvidia cards?

> Yes. https://www.distributed.net/Download_clients

ROFL at "Trojan horses and other perverted versions have been known to have been circulated". Perverted maths programs? Ooooh sexy numbers! |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> > Rule 1: If heat is pouring off the chip, it's doing a lot of work.
>
> The fans get their workout every day. At the moment they are just doing the easy stuff with FAH. Apparently TN needs everything cpu, though I suppose I can write a script to put that down to 14 or something.

Some maths projects seem to use more electricity (and therefore produce more heat) on a GPU. My Fury card seems to have been badly designed. At normal settings, it drops the clock dramatically on some projects. I found this was due to it hitting the power limit of the VRMs. So I cranked the power limit up to 150% in Afterburner, which worked for about 3 weeks, then the power connector melted, deforming the plastic shroud and oxidising the contacts. So I soldered the wires on directly :-) |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
|
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously. There is no OC on this. Forgot to turn that on. So this is default mode. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
> Isn't it possible to write the program so it does as much DP as possible on the GPU, but if that's not enough, use the CPU as well?

Possible: maybe, depending on whether the calculation can be split into a few smaller ones. Rational: very likely no. You would need the DP part twice in your application (OK, Einstein has that) AND a scheduler, which monitors the performance and assigns parts to the CPU and GPU. That would always cost performance, and you would eventually need to assign more CPU cores to the task because of that (especially if they should be mt like you said before). At the end of the day you might have sped up some tasks while slowing down the overall production of the computer. And that for some rare hardware configurations with higher DP performance on CPU than on GPU?

> > OPNG shouldn't need it IIRC, no idea about the others, but unless they implemented it like Einstein, if they need DP, the app will not run on SP cards, just like Milkyway.
>
> I'm sure I saw someone mention there's some DP in it.

It was a bug in the beta.

> I would assume a decent program would use the CPU for that bit if the card was SP only (those actually exist? I thought all cards had at least a tiny bit of DP).

So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing. That's the same for CPUs: a program which requires, for example, SSE2 won't run if the CPU is missing it. And no, not all cards have DP. Well, the newer ones I think do, but on older generations only the high end ones had it, otherwise there would never have been any question about it on Milkyway, it would always run on that "tiny bit", even if slow.

> Anyway, judging by the speed OPNG runs on my different cards, there's DP in it.

As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.

> > It will still run on GPU, the app doesn't know if the CPU is faster: https://einsteinathome.org/content/fgrp5-cpu-and-fgrpb1g-gpu-why-does-crunching-seem-pause-90
>
> Surely the app can get the benchmark data from Boinc?

No idea if it theoretically could, but it's not doing it. If DP is available on the card, it will use it, because according to the devs anything else would be nonsense. BOINC has just the SP flops anyway + DP yes/no, see coproc_info.xml.

> > Never seen that either.
>
> Have you tried Moo on two Nvidia cards?

No. But I meant it more like "never heard of it". |
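The cost-benefit argument about a hybrid CPU+GPU scheduler can be sketched as an upper bound. The model below assumes the DP work is perfectly divisible and that scheduling and data transfer cost nothing, so a real gain would be smaller. The function names and rates are hypothetical, not measurements.

```python
def best_case_speedup(gpu_dp_rate, cpu_dp_rate):
    """Upper bound on speedup from splitting DP work across GPU and CPU.

    Both rates are in the same unit (e.g. GFLOPS). With an ideal split the
    combined rate is the sum, so the speedup over GPU-only is 1 + cpu/gpu.
    """
    return (gpu_dp_rate + cpu_dp_rate) / gpu_dp_rate


def split_fractions(gpu_dp_rate, cpu_dp_rate):
    """Share of the work each device needs so that both finish together."""
    total = gpu_dp_rate + cpu_dp_rate
    return gpu_dp_rate / total, cpu_dp_rate / total


# Hypothetical rates: the GPU is four times faster at DP than the CPU cores
# the task would be allowed to use.
gpu_rate, cpu_rate = 200.0, 50.0

print(f"best-case speedup: {best_case_speedup(gpu_rate, cpu_rate):.2f}x")
gpu_share, cpu_share = split_fractions(gpu_rate, cpu_rate)
print(f"GPU share: {gpu_share:.0%}, CPU share: {cpu_share:.0%}")
```

Even in this idealised case the gain is 1.25x for that one task, and the CPU cores doing 20% of the work are no longer free for other tasks. That is the trade-off being weighed when asking whether a hybrid scheduler is worth writing.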
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
> Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously.

For comparison it would be interesting to see what the GPU load looks like with just Moo running and with just FAH running. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> And that for some rare hardware configurations with higher DP performance on CPU than on GPU?

I thought that was common, as CPUs are damn good at everything, and Nvidia sorely lack DP.

> It was a bug in the beta.

That doesn't say much about it. The fact is it needs DP. Perhaps it was causing an error if there weren't enough, but now it makes do.

> So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing.

But it isn't missing from the computer, it's on the CPU. A GPU task actually runs on the CPU and passes relevant parts to the GPU, which is sometimes all of it, and sometimes half chunks of it. And if DP is required when you have an SP card it should just run those bits on the CPU.

> As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.

That's nothing like what I see here, I've not even had to double them up on my GPUs. And I have some pretty shit CPUs.

> No idea if it theoretically could, but it's not doing it. If DP is available on the card, it will use it, because according to the devs anything else would be nonsense. BOINC has just the SP flops anyway + DP yes/no, see coproc_info.xml.

Yeah that was a bit daft of me to assume Boinc would be even remotely that clever. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
I gave up on Folding at Home. Until they join Boinc they can get lost. It's ridiculous trying to use both at once because they aren't aware of what each other is doing, so it's impossible to fully load my computers. Also their scheduler is even stupider than Boinc.

> > Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously.
>
> For comparison it would be interesting to see what the GPU load looks like with just Moo running and with just FAH running. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
> > And that for some rare hardware configurations with higher DP performance on CPU than on GPU?
>
> I thought that was common, as CPUs are damn good at everything, and Nvidia sorely lack DP.

See the thread from Einstein, which I posted above, there are run times for CPU vs. GPU for the DP stage.

> > It was a bug in the beta.
>
> That doesn't say much about it. The fact is it needs DP. Perhaps it was causing an error if there weren't enough, but now it makes do.

"Double precision is not a requirement. During BETA, some hosts (mostly Intel IIRC) were getting some errors due to a lack of double precision floats but those errors were fixed in the application code and haven't been reported since." From the link I posted above. That sounds pretty clear to me, feel free to post something that states the opposite. BTW, "not enough DP" does not exist. Like every other instruction set, it's either available or not.

> > So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing.
>
> But it isn't missing from the computer, it's on the CPU. A GPU task actually runs on the CPU and passes relevant parts to the GPU, which is sometimes all of it, and sometimes half chunks of it. And if DP is required when you have an SP card it should just run those bits on the CPU.

It doesn't, Milkyway won't run on an SP card. It's possible the way you describe it, if the application supports it, i.e. has both paths in its code like Einstein's FGRPB1G. But then it does not require DP, it's optional.

> > As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.
>
> That's nothing like what I see here, I've not even had to double them up on my GPUs. And I have some pretty shit CPUs.

No idea, it's just what I've read on the forums, however it might not apply for all cards. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
"Double precision is not a requirement. During BETA, some hosts (mostly Intel IIRC) were getting some errors due to a lack of double precision floats but those errors were fixed in the application code and haven't been reported since." From the link I poted above. That sounds pretty clear to me, feel free to post something, that states the opposite. BTW, "not enough DP" does not exist, like every other instruction set, it's either available or not.I have run it on many different cards, and comparing the time per task, the DP and SP speed of each card, I can see it's using some DP. It might not have to, but it can. My cards with a more DP do the tasks faster than they should be able to. No idea, it's just what I've read on the forums, however it might not apply for all cards.I've seen lots of forums with people complaining about high CPU usage on Nvidia cards. Cuda often needs a lot more CPU help than OpenCL. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 2,588 |
WCG now appears to be trying to get more useful work done by sending mostly tasks with small total sizes of the input files, such as tasks for the OPN1 subproject. Is that what others are also seeing? I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in. In other words, this may change soon. They were previously sending so many MCM1 tasks that the download server was often slow to respond, although it tended to make it easier to download large input files than small ones. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 2,588 |
> I've seen lots of forums with people complaining about high CPU usage on Nvidia cards. Cuda often needs a lot more CPU help than OpenCL.

Depends on how it's written. CUDA allows more of the work that would normally be done on the CPU to be done on the GPU instead, but using the GPU clock instead of the CPU clock. If it's something that cannot be done in parallel, this usually means that the GPU will take about four times as long to do it.

Such complaints could mean that there is only one version of the application, which does all DP work on the CPU even if the GPU could also do DP if it uses a different version of the application for GPUs that can handle DP. Moving DP work between the CPU and the GPU is NOT automatic - the application or applications must be written so that they know how to do so. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,717,270 RAC: 11,974 |
> WCG now appears to be trying to get more useful work done by sending mostly tasks with small total sizes of the input files, such as tasks for the OPN1 subproject.

I was doing WCG a day or three ago and was getting 90% cancer and 10% CPU COVID. I was getting no GPU COVID. Why don't they put all the COVID onto GPU and leave the CPUs free for cancer? Or do they have different types of tasks and some need a CPU?

EDIT: I've just switched it back on and one of my computers requested work and got a tonne of CPU COVID, and no cancer. I want the GPU work. Even my phone's got some COVID work.

What happened to the rainfall project? I know they had difficulties with the input data before the move to Krembil, but I thought that was all sorted out, and when Krembil started it up, I got lots of rainfall. |
©2024 University of Washington
https://www.bakerlab.org