Discussion on increasing the default run time

Message boards : Number crunching : Discussion on increasing the default run time



Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 94834 - Posted: 19 Apr 2020, 7:16:08 UTC - in response to Message 94829.  

Do you think I should upgrade RAM to 32GB because of higher usage? I have two Ryzen 5 3600 (12 threads) systems with 16GB each. It has always been enough so far.

In the past (all of 3 weeks), the most RAM I have seen a single Task require is 1.3GB. If you were to get nothing but those Tasks, 12 of them running at once would need about 15.6GB, leaving too little of the 16GB for the system to function and for all Tasks to be processed.

There are plans to release some Tasks that may require as much as 4GB of RAM. So more RAM would allow you to run more Tasks (even those requiring huge amounts of RAM) at a given time. Otherwise cores will sit unused until there is enough RAM to start processing other Tasks again.
If you can afford it, it certainly won't go to waste.
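To put rough numbers on that, here is a minimal sketch of the worst-case arithmetic. It assumes 12 threads and 16GB per system (from the question) and the 1.3GB and 4GB per-Task figures mentioned above. The function name is made up for illustration, and the sketch ignores whatever the OS and other programs need.

```python
def max_concurrent_tasks(total_ram_gb: float, ram_per_task_gb: float, threads: int) -> int:
    """Return how many Tasks of one size can run at the same time.

    The result is capped by the number of CPU threads. Sizes are converted
    to whole MB so the division is done on integers.
    """
    total_mb = round(total_ram_gb * 1000)
    per_task_mb = round(ram_per_task_gb * 1000)
    return min(threads, total_mb // per_task_mb)


print(max_concurrent_tasks(16, 1.3, 12))  # 12 (but that needs 15.6GB of the 16GB)
print(max_concurrent_tasks(16, 4.0, 12))  # 4
print(max_concurrent_tasks(32, 4.0, 12))  # 8
```

So with 16GB, a run of 4GB Tasks would leave 8 of the 12 threads idle, while 32GB would leave only 4 idle.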
Grant
Darwin NT
ID: 94834 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 94843 - Posted: 19 Apr 2020, 8:23:16 UTC
Last modified: 19 Apr 2020, 8:54:17 UTC

Presently processing rb_04_16_21806_21365_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_03_09_918009_94_1

So far,
10hrs 11min Runtime, 10hrs 04min 36sec CPU Time and still no checkpoint.
Grant
Darwin NT
ID: 94843 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 94855 - Posted: 19 Apr 2020, 10:24:53 UTC - in response to Message 94843.  
Last modified: 19 Apr 2020, 10:56:08 UTC

Grant (SSSF) wrote:
Presently processing rb_04_16_21806_21365_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_03_09_918009_94_1

Finished.
12hr 13min 42sec Runtime.
No checkpoints made.
Grant
Darwin NT
ID: 94855 · Rating: 0
xii5ku

Joined: 29 Nov 16
Posts: 22
Credit: 13,815,783
RAC: 837
Message 94864 - Posted: 19 Apr 2020, 11:29:10 UTC - in response to Message 94586.  
Last modified: 19 Apr 2020, 11:34:50 UTC

On April 16 Mod.Sense wrote:

CPU time: 43761.3s, 14400s + 28800s


That is how you know the watchdog ended the task: 14,400 seconds is the 4 hours the watchdog allows, added to the WU target runtime (28,800 seconds).

So two were ended by watchdog. All three of them got over 300 points of credit. So that implies the batch has some incredibly tough models.
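
The check described in the quote can be sketched as below. This is only an illustration of the arithmetic, assuming the 4-hour allowance and the 8-hour (28,800 s) target from the quoted figures; the function and constant names are hypothetical and not taken from the Rosetta application.

```python
WATCHDOG_ALLOWANCE_SECONDS = 4 * 3600  # the 14,400 s quoted above


def ended_by_watchdog(cpu_seconds: float, target_seconds: float,
                      allowance_seconds: float = WATCHDOG_ALLOWANCE_SECONDS) -> bool:
    """True if CPU time reached the target runtime plus the watchdog allowance."""
    return cpu_seconds >= target_seconds + allowance_seconds


target = 8 * 3600  # 28,800 s, the WU target runtime in the quoted example
print(ended_by_watchdog(43761.3, target))  # True: 43761.3 >= 14400 + 28800 = 43200
print(ended_by_watchdog(30000.0, target))  # False: ended well before the watchdog limit
```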

The abnormal results came from the Windows x86-32 application version.
Could this have the same or similar bug as the Linux x86-32 v4.12 - v4.15 application (on x86-64 hosts at least)?


On April 19 Grant (SSSF) wrote:
Presently processing rb_04_16_21806_21365_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_03_09_918009_94_1
Finished.
12hr 13min 42sec Runtime.
No checkpoints made.

This task on the other hand was run by the x86-64 application version; i.e. this one was not related to the specific problem of the Linux i686 application builds.
ID: 94864 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2114
Credit: 41,100,175
RAC: 22,181
Message 94939 - Posted: 19 Apr 2020, 20:27:33 UTC - in response to Message 94829.  

I think that might have caused my "problems".
Now changed the settings to 2 days and WU-Runtime to 6 hours.
I will check over the next few days whether there is a change.

Do you think I should upgrade RAM to 32GB because of higher usage? I have two Ryzen 5 3600 (12 threads) systems with 16GB each. It has always been enough so far.

2 days will definitely help you. 6 hours won't help or hinder you.
1.5 days and the default 8hrs is more productive, as the project has indicated that a 2 day total turnaround of tasks is ideal (within the 3-day deadlines).
On RAM, the way things are going, it may become essential before long. Start saving up.
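
As a rough sketch of the turnaround arithmetic behind the cache advice: this assumes a Task waits in the cache for about the cache size and then runs for its target runtime on a machine crunching 24/7. That is a simplification (it ignores upload delays and Tasks that overrun), and the function name is made up for illustration.

```python
def estimated_turnaround_days(cache_days: float, target_runtime_hours: float) -> float:
    """Approximate days from download to completion: time queued plus time running."""
    return cache_days + target_runtime_hours / 24


DEADLINE_DAYS = 3  # the 3-day deadline mentioned above

for cache_days, runtime_hours in [(2.0, 6), (1.5, 8)]:
    days = estimated_turnaround_days(cache_days, runtime_hours)
    print(f"{cache_days} day cache + {runtime_hours}h runtime -> "
          f"{days:.2f} days (deadline {DEADLINE_DAYS} days)")
# 2.0 day cache + 6h runtime -> 2.25 days (deadline 3 days)
# 1.5 day cache + 8h runtime -> 1.83 days (deadline 3 days)
```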
ID: 94939 · Rating: 0
Clive

Joined: 15 Jun 19
Posts: 4
Credit: 844,412
RAC: 0
Message 94957 - Posted: 20 Apr 2020, 0:13:58 UTC

Hi all:

When I learned that I could assist in understanding COVID-19, I decided to pitch in. The hardware I bring with me is:
1. CPU - i7-8700K water cooled
2. GPU - NVIDIA GeForce 1070
3. RAM - 16 GB
4. Fully patched Win 10 64 bit

When I initially d/l the workunits, the estimated completion time was 6 hours, 40 minutes. Quite reasonable I thought. Today I have some workunits hitting 30 hours before they are forecast to be completed.

Why is BOINC so far off in the estimated completion times? Is it in my settings?

Clive Hunt
British Columbia, Canada
ID: 94957 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 94959 - Posted: 20 Apr 2020, 1:25:01 UTC - in response to Message 94957.  
Last modified: 20 Apr 2020, 1:40:53 UTC

Clive wrote:
3. RAM - 16 GB

Won't be enough RAM if you are using all cores & threads.

At present a Task can take up to 1.5GB of RAM, which would use more than the system has if you had nothing but them (extremely unlikely, but still...) and you have allowed BOINC to use 100% of it. And there are set to be Tasks that may use up to 4GB of RAM coming out in the near future.
I'd either take this as an excuse to upgrade to 32GB of RAM, or limit the number of cores/threads in use to 8 or fewer. That way your system can cope with several 1.3GB Tasks at a time, or one 4GB Task, plus plenty of Tasks with more typical RAM requirements (250MB to 850MB).
I'd also allow BOINC to use more of the available RAM. In your Account page, under Computing preferences:
   Memory
      When computer is in use, use at most 95 %
      When computer is not in use, use at most 95 %
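
A quick way to sanity-check a given mix of Tasks against that memory limit is sketched below. It assumes 16GB installed, the 95 % setting shown above, and 12 threads for the i7-8700K. The Task mixes are made-up examples using the sizes mentioned in this post, and the function name is hypothetical.

```python
def fits_in_ram(task_sizes_gb, installed_gb=16.0, boinc_limit_pct=95):
    """Return (fits, needed_gb, allowed_gb) for Tasks running at the same time."""
    allowed_gb = installed_gb * boinc_limit_pct / 100
    needed_gb = sum(task_sizes_gb)
    return needed_gb <= allowed_gb, needed_gb, allowed_gb


# All 12 threads running 1.5GB Tasks (the unlikely worst case).
worst_case = [1.5] * 12
# Limited to 8 threads: one 4GB Task, three 1.3GB Tasks, four 0.85GB Tasks.
limited_mix = [4.0] + [1.3] * 3 + [0.85] * 4

for label, mix in [("12 x 1.5GB", worst_case), ("8-thread mix", limited_mix)]:
    fits, needed, allowed = fits_in_ram(mix)
    print(f"{label}: needs {needed:.1f}GB of {allowed:.1f}GB allowed -> "
          f"{'fits' if fits else 'does not fit'}")
```

With these assumptions the 12-thread worst case needs 18GB and does not fit, while the 8-thread mix needs roughly 11.3GB and does.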




Clive wrote:
When I initially d/l the workunits, the estimated completion time was 6 hours, 40 minutes. Quite reasonable I thought. Today I have some workunits hitting 30 hours before they are forecast to be completed.

Why is BOINC so far off in the estimated completion times? Is it in my settings?

All projects have the issue of Estimated completion times being out for a new Application or Task, as there is no history of how long a system takes to process it. In the case of Rosetta, the run time of a Task is set in your preferences, but it still takes a while for the Estimated completion times to settle down. And where in some projects the initial Estimated completion times are greater than the actual time, with Rosetta they tend to be less.

And while the Target CPU time is set (the default is 8 hours), some Tasks do require more processing than others to give usable results. So Tasks that don't end by their Target CPU Runtime get a 10 hour extension, after which they will be ended if they haven't finished sooner.

Given the short deadlines of most Rosetta Tasks, and the variability in processing time (even though there is a set Target time), a small cache is best. If you run more than one project, a very, very small cache is best. For Rosetta only,
   Other
      Store at least 1 days of work
      Store up to an additional 0.02 days of work
is working pretty well for me.
No missed deadlines yet, even with a few longer (way, way longer) than Target CPU Runtime tasks.
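
For a feel of how much work such a cache setting represents, here is a back-of-the-envelope sketch. It assumes BOINC asks for enough work to keep every thread busy for the whole cache period and that each Task takes exactly its Target CPU time. The real client uses its own estimates, so treat this as an upper-level approximation only. The 12-thread figure and the function name are assumptions for illustration.

```python
import math


def tasks_in_cache(threads: int, cache_days: float, extra_days: float,
                   target_hours: float) -> int:
    """Approximate number of Tasks needed to fill the cache."""
    hours_of_work = (cache_days + extra_days) * 24 * threads
    return math.ceil(hours_of_work / target_hours)


# 12 threads, 1 day + 0.02 days of cache, 8 hour Target CPU time
print(tasks_in_cache(threads=12, cache_days=1, extra_days=0.02, target_hours=8))  # 37
```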
Grant
Darwin NT
ID: 94959 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94992 - Posted: 20 Apr 2020, 13:25:32 UTC - in response to Message 94957.  

Clive, welcome aboard! Can you link to the WUs that are taking so long? Also, be sure to look at the properties of the task and see the CPU time there, rather than going by elapsed time shown in BOINC Manager.
Rosetta Moderator: Mod.Sense
ID: 94992 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2114
Credit: 41,100,175
RAC: 22,181
Message 95019 - Posted: 20 Apr 2020, 22:33:24 UTC - in response to Message 94957.  
Last modified: 20 Apr 2020, 22:38:41 UTC

Clive wrote:
When I learned that I could assist in understanding COVID-19, I decided to pitch in. The hardware I bring with me is:
1. CPU - i7-8700K water cooled
2. GPU - NVIDIA GeForce 1070
3. RAM - 16 GB
4. Fully patched Win 10 64 bit

When I initially d/l the workunits, the estimated completion time was 6 hours, 40 minutes. Quite reasonable I thought. Today I have some workunits hitting 30 hours before they are forecast to be completed.

Why is BOINC so far off in the estimated completion times? Is it in my settings?

It is in your settings.
Looking at this task of yours, go to Options/Computing Preferences and untick all the options under the Computing tab in the section "When to suspend"
However you have them set at the moment is causing this:

Run time 1 days 4 hours 16 min 30 sec
CPU time 7 hours 51 min 19 sec

By all means make the adjustments Grant has suggested too, but it actually looks like everything else you have set isn't causing a problem imo.

It does look like BOINC downloaded too many tasks for you on the assumption you'd complete them more quickly, but they'll be cancelled automatically if they pass their deadline, and the more tasks you complete before then, the more appropriate the number called for next time will be. No need to intervene.
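
To show how far apart those two figures are, here is a small sketch that converts the Run time and CPU time quoted above to seconds and takes the ratio. The helper name is made up for illustration.

```python
def to_seconds(days=0, hours=0, minutes=0, seconds=0):
    """Convert a days/hours/minutes/seconds duration to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


run_time = to_seconds(days=1, hours=4, minutes=16, seconds=30)  # 101790 s
cpu_time = to_seconds(hours=7, minutes=51, seconds=19)          # 28279 s

efficiency = cpu_time / run_time
print(f"CPU time is {efficiency:.1%} of Run time")  # CPU time is 27.8% of Run time
```

In other words, the task was actually computing for only a bit over a quarter of the time it was on the machine.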
ID: 95019 · Rating: 0
Clive

Joined: 15 Jun 19
Posts: 4
Credit: 844,412
RAC: 0
Message 95021 - Posted: 20 Apr 2020, 23:27:21 UTC - in response to Message 95019.  

Thank you Sid, I have made your and Grant's recommended changes to my BOINC settings. Hopefully these changes will speed things up.

Clive Hunt
British Columbia Canada
ID: 95021 · Rating: 0
Clive

Joined: 15 Jun 19
Posts: 4
Credit: 844,412
RAC: 0
Message 95022 - Posted: 20 Apr 2020, 23:36:36 UTC - in response to Message 94959.  

Thank you Grant, I have made your and Sid's recommended changes to my BOINC settings. Hopefully these changes will speed things up.

Clive Hunt
British Columbia Canada
ID: 95022 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 95023 - Posted: 20 Apr 2020, 23:42:31 UTC - in response to Message 95019.  

Sid Celery wrote:
Looking at this task of yours, go to Options/Computing Preferences and untick all the options under the Computing tab in the section "When to suspend"
However you have them set at the moment is causing this:

Run time 1 days 4 hours 16 min 30 sec
CPU time 7 hours 51 min 19 sec

In addition to those changes,
   Suspend when non-BOINC CPU usage is above --- %
is best left blank.
Rosetta runs at Idle priority level, so pretty much everything else will run before Rosetta does anyway. No need to specifically suspend Rosetta.
For a lightly used system, the difference between Run time & CPU time should only be 4min or so for an 8hr CPU time Task. A heavily used system will have a bigger discrepancy. A dedicated cruncher, less than 30sec.
Grant
Darwin NT
ID: 95023 · Rating: 0
MeeeK
Joined: 7 Feb 16
Posts: 31
Credit: 19,737,304
RAC: 0
Message 95112 - Posted: 22 Apr 2020, 4:13:12 UTC

https://boinc.bakerlab.org/rosetta/result.php?resultid=1156119366

I found a lot of these in my list.
All with the same error, and only on this client, which had its RAM upgraded two days ago.

The other client doesn't have these problems.
ID: 95112 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 95115 - Posted: 22 Apr 2020, 4:53:27 UTC - in response to Message 95112.  

MeeeK wrote:
https://boinc.bakerlab.org/rosetta/result.php?resultid=1156119366
I found a lot of these in my list.
All with the same error, and only on this client, which had its RAM upgraded two days ago.

If you can, swap the RAM between systems, and see if the errors move to the other system as well.
Grant
Darwin NT
ID: 95115 · Rating: 0
MeeeK
Joined: 7 Feb 16
Posts: 31
Credit: 19,737,304
RAC: 0
Message 95116 - Posted: 22 Apr 2020, 5:27:12 UTC - in response to Message 95115.  

That's not possible.
This system has 4 slots, the other one just 2.

But the "new additional" RAM in this "faulty" system was in the second system before. They are both 3200 kits with the same timings. Both kits worked fine.

I will check the RAM settings in the BIOS. But there aren't any issues, for sure.
ID: 95116 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 95117 - Posted: 22 Apr 2020, 5:30:41 UTC - in response to Message 95116.  
Last modified: 22 Apr 2020, 5:34:31 UTC

MeeeK wrote:
That's not possible.
This system has 4 slots, the other one just 2.

?
So the system with 4 slots only gets 2 modules; the system with 2 slots can only have 2 modules.

MeeeK wrote:
But the "new additional" RAM in this "faulty" system was in the second system before. They are both 3200 kits with the same timings. Both kits worked fine.

If it is the only change you made, then it's very likely they are the cause of your problems & they don't work fine now.
Are they the same brand & size as well as speed & timings?
And even so, what works in one system may not work in another.
Grant
Darwin NT
ID: 95117 · Rating: 0
hnapel
Joined: 8 Apr 20
Posts: 8
Credit: 835,346
RAC: 0
Message 95149 - Posted: 22 Apr 2020, 19:09:02 UTC - in response to Message 56932.  

Great, I'm replying to a 12 year old post! But I don't see my question answered anywhere: if I increase the runtime from the (now) default 8 hours to, for example, 12 to help reduce the load on the servers, will the longer runtime also increase RAM usage? Will the workunits received actually be different, or does it just try more science on the same units?
ID: 95149 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95160 - Posted: 22 Apr 2020, 22:43:09 UTC - in response to Message 95149.  

No, with a 12hr runtime preference the task just runs along with the same memory footprint for a longer period of time.

The workunit is the same, your system just computes more models from it until it gets near the runtime preference.
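
A toy model of that behaviour, for illustration only: it assumes a new model is started only while the CPU time used so far is below the runtime preference, and draws each model's duration at random around a made-up 20 minute average. None of this is taken from the actual Rosetta code. The names and numbers are hypothetical.

```python
import random


def run_workunit(target_cpu_hours: float, mean_model_minutes: float, seed: int = 0) -> int:
    """Count how many models a simulated Task completes for a runtime preference.

    Only a counter and a running total are kept, so the memory used does not
    depend on how long the Task runs.
    """
    rng = random.Random(seed)
    target_minutes = target_cpu_hours * 60
    used_minutes = 0.0
    models = 0
    while used_minutes < target_minutes:
        # Each model takes between 0.5x and 1.5x the average duration.
        used_minutes += rng.uniform(0.5, 1.5) * mean_model_minutes
        models += 1
    return models


for hours in (8, 12):
    print(f"{hours}h preference -> {run_workunit(hours, mean_model_minutes=20)} models")
```

Same workunit, same footprint: the longer preference just yields more models before the Task reports back.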
Rosetta Moderator: Mod.Sense
ID: 95160 · Rating: 0
KWSN Ekky Ekky Ekky

Joined: 3 Apr 20
Posts: 9
Credit: 5,062,511
RAC: 0
Message 96135 - Posted: 5 May 2020, 21:19:35 UTC

The runtime limitations make things more than silly, to my very limited mind. Running Seti@home presented no such problems: if it took longer than expected, it was fine, but such short termism is driving me nuts. I may well join the great exodus if things are not managed better for us humble crunchers.
ID: 96135 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1670
Credit: 17,496,529
RAC: 24,523
Message 96136 - Posted: 5 May 2020, 21:30:15 UTC - in response to Message 96135.  
Last modified: 5 May 2020, 21:32:31 UTC

KWSN Ekky Ekky Ekky wrote:
The runtime limitations make things more than silly, to my very limited mind. Running Seti@home presented no such problems: if it took longer than expected, it was fine, but such short termism is driving me nuts. I may well join the great exodus if things are not managed better for us humble crunchers.

Looking at your results, the problem isn't the Target CPU time, but a large cache setting and a system that is busy doing things other than processing BOINC work.
Hence the huge difference between CPU time & Runtime on your system: your main system is taking 24 hours to process a Task with an 8 hour Target CPU time.
For an 8 hour Target CPU time, the difference between CPU time & Runtime on a lightly used system is about 4 minutes.

Set your cache to
   Other
      Store at least 0.5 days of work
      Store up to an additional 0.02 days of work
and that should stop you from missing deadlines.
If you choose to determine what is using your system so heavily, that will also help get more work done by not taking 3 times longer to do it than the actual Target time.


Recent changes to how work is sent out will also help stop systems from getting more work than they can handle when new applications are released in the future, or new systems are added to do Rosetta work.
Grant
Darwin NT
ID: 96136 · Rating: 0




©2024 University of Washington
https://www.bakerlab.org