Looooooooooong Running Task

Message boards : Number crunching : Looooooooooong Running Task


dcdc

Joined: 3 Nov 05
Posts: 1834
Credit: 124,260,318
RAC: 8
Message 30287 - Posted: 30 Oct 2006, 14:28:44 UTC
Last modified: 30 Oct 2006, 14:45:21 UTC

[edit] Changed the title: I believe it's technically a 'task' that's being run rather than a 'WU'.
I don't usually pay much attention to the WUs, unless one's not uploaded for a while, but I just noticed that this Celeron 2.66GHz (256MB RAM) XP machine has been running this task:

1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_ANGLES_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_16116_1 using Rosetta version 5.34

for 12hrs and is sitting at 8.500%. I took a peek at the graphics and it's still wiggling, so I'll leave it be. The run time is set to 4hrs on this machine, so this decoy has well and truly smashed that! It's not a problem for me, but I think it will be a big problem for the project's overall production, as there'll be lots of computers running these that won't stay switched on long enough to reach a checkpoint.

I think there need to be a few categories that computers are placed into, based on the memory available (physical RAM is pretty much irrelevant; it's the available RAM divided by the number of cores that needs to be considered!) and how much crunching the computer can get through between power cycles. The big jobs need to go to the most capable computers, of which there are plenty, but there's not much point in sending them to my cousin's P3-550!
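To make the idea concrete, here is a rough sketch of how such a bucketing might look. Everything in it is an assumption for illustration: the `HostInfo` fields, the `classify_host` function, the category names and all the thresholds are made up, and none of it reflects how the project's scheduler actually works.

```python
from dataclasses import dataclass


@dataclass
class HostInfo:
    """Hypothetical per-host summary; field names are illustrative only."""
    available_ram_mb: float  # RAM actually free for crunching, not physical RAM
    cores: int               # number of cores running tasks at once
    uptime_hours: float      # typical time between power cycles
    relative_speed: float    # 1.0 = some reference machine (assumed scale)


def classify_host(host: HostInfo) -> str:
    """Place a host in a 'small', 'medium' or 'large' job category.

    All thresholds below are invented for the sake of the example.
    """
    # Memory each concurrently running task can count on.
    ram_per_core = host.available_ram_mb / max(host.cores, 1)
    # Work the host can get through before it is likely switched off,
    # expressed in reference-machine hours.
    work_per_session = host.uptime_hours * host.relative_speed

    if ram_per_core >= 512 and work_per_session >= 12:
        return "large"
    if ram_per_core >= 256 and work_per_session >= 4:
        return "medium"
    return "small"
```

With these made-up numbers, a dual-core box with 2048MB free that stays on for a day would land in "large", while a slow machine with 128MB free that is only on for a few hours would land in "small" and never be sent the big jobs.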

cheers
Danny
Marky-UK

Joined: 1 Nov 05
Posts: 73
Credit: 1,689,495
RAC: 0
Message 30292 - Posted: 30 Oct 2006, 15:40:25 UTC - in response to Message 30287.  

I aborted a WU this morning that had been running for 3 days over the weekend, despite the runtime limit being set to 3 hours! And the percentage wasn't moving either. I really can't be bothered to babysit Rosetta all the time.
dcdc

Joined: 3 Nov 05
Posts: 1834
Credit: 124,260,318
RAC: 8
Message 30300 - Posted: 30 Oct 2006, 18:42:24 UTC

I've got another one on a different machine, 10hrs in and at 1%, running:
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_ANGLES_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_2624_0

It's running as a service so I can't see whether it's moving atm...

Should I abort these?
scsimodo

Joined: 17 Sep 05
Posts: 93
Credit: 946,359
RAC: 0
Message 30306 - Posted: 30 Oct 2006, 19:23:48 UTC - in response to Message 30300.  

I've got another one on a different machine, 10hrs in and at 1%, running:
1hz6A_BOINC_NATIVEJUMPS_CLOSE_CHAINBREAKS_VARY_ALL_BOND_ANGLES_ALL_BOND_DISTANCES_SAVE_ALL_OUT__1306_2624_0

It's running as a service so I can't see whether it's moving atm...

Should I abort these?


The longest 1hz6A model took about 1h30 for me on a 3GHz machine. I guess yours is stuck! It happens pretty often with the new 5.34 (aborted WUs too!!)



©2025 University of Washington
https://www.bakerlab.org