Questions and Answers : Unix/Linux : Many client errors
Author | Message |
---|---|
Pushkin Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Hi guys, a few days ago I looked at my Rosetta account and found that it has been generating a great many client errors: I have not had a single successful workunit since December 2012. The output in the task details looks like this:

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
[2013- 1-10 10:50:47:] :: BOINC:: Initializing ... ok.
[2013- 1-10 10:50:47:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev52077.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/input_rb_01_09_35680_67579__t000__0_C2_robetta.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
======================================================
DONE :: 1 starting structures 5422.32 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: WS_max 5.25381e-287
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish
</stderr_txt>
]]>

I don't see any kind of error there, but the result of the task is always Client Error. Unfortunately, I cannot tell when Rosetta started to behave like this, since all recent workunits have this error.
I am running Debian Wheezy 64-bit on a PC with an Intel Core i5 3570 and BOINC Client 7.0.27:

root@pushkin:/home/pushkin# lshw -short
H/W path Device Class Description
=========================================================
system HP Compaq Elite 8300 MT (QV994AV)
/0 bus 3397
/0/0 memory 64KiB BIOS
/0/4 processor Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
/0/4/5 memory 256KiB L1 cache
/0/4/6 memory 1MiB L2 cache
/0/4/7 memory 6MiB L3 cache
/0/2b memory 16GiB System Memory
/0/2b/0 memory 4GiB DIMM DDR3 Synchronous 1600 MHz (0,6 ns)
/0/2b/1 memory 4GiB DIMM DDR3 Synchronous 1600 MHz (0,6 ns)
/0/2b/2 memory 4GiB DIMM DDR3 Synchronous 1600 MHz (0,6 ns)
/0/2b/3 memory 4GiB DIMM DDR3 Synchronous 1600 MHz (0,6 ns)
/0/100 bridge Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
/0/100/1 bridge Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port
/0/100/1/0 display NVIDIA Corporation
/0/100/1/0.1 multimedia NVIDIA Corporation
/0/100/1a bus 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2
/0/100/1b multimedia 7 Series/C210 Series Chipset Family High Definition Audio Controller
/0/100/1d bus 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1
/0/100/1e bridge 82801 PCI Bridge
/0/100/1f bridge Q77 Express Chipset LPC Controller
/0/100/1f.2 scsi0 storage 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode]
/0/100/1f.2/0 /dev/sda disk 500GB Hitachi HDS72105
/0/100/1f.2/0/1 /dev/sda1 volume 100MiB Windows NTFS volume
/0/100/1f.2/0/2 /dev/sda2 volume 97GiB Windows NTFS volume
/0/100/1f.2/0/3 /dev/sda3 volume 368GiB Extended partition
/0/100/1f.2/0/3/5 /dev/sda5 volume 65GiB Linux filesystem partition
/0/100/1f.2/0/3/6 /dev/sda6 volume 65GiB Linux filesystem partition
/0/100/1f.2/0/3/7 /dev/sda7 volume 19GiB Linux filesystem partition
/0/100/1f.2/0/3/8 /dev/sda8 volume 41GiB Linux swap / Solaris partition
/0/100/1f.2/0/3/9 /dev/sda9 volume 175GiB Linux filesystem partition
/0/100/1f.2/1 /dev/cdrom disk CDDVDW SH-216BB
/0/100/1f.3 bus 7 Series/C210 Series Chipset Family SMBus Controller
/0/100/14 bus 7 Series/C210 Series Chipset Family USB xHCI Host Controller
/0/100/16 communication 7 Series/C210 Series Chipset Family MEI Controller #1
/0/100/16.3 communication 7 Series/C210 Series Chipset Family KT Controller
/0/100/19 eth0 network 82579LM Gigabit Network Connection
/0/1 scsi7 storage
/0/1/0.0.0 /dev/sdb disk 500GB SCSI Disk
/0/1/0.0.0/1 /dev/sdb1 volume 19GiB Windows NTFS volume
/0/1/0.0.0/2 /dev/sdb2 volume 445GiB W95 FAT32 (LBA) partition
/1 power To Be Filled By O.E.M.

Can you please point me to where I could start looking for a solution? Thank you, Pushkin |
Pushkin Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Today I let Rosetta compute another workunit, and again it ended with a client error (see WU558909674). Does really no one have any idea what to do about this issue? |
Polian Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
Sorry for the lack of response. I'm afraid this section of the forums is pretty underutilized and frequently overlooked.

Looking at your lshw output and the symptoms you describe, two things stand out:

/0/4 processor Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz

and

/0/100/1/0 display NVIDIA Corporation
/0/100/1/0.1 multimedia NVIDIA Corporation

Many users have reported the same symptoms, but the pattern is hard to establish. DK is actively working on it. There is much speculation as to the cause, but I'm not sure we've nailed it down completely yet. At this point, one possibility is that it is related to NVIDIA drivers; some users have reported success with downgrading. I want to say that it seems to happen only with Ivy Bridge processors, although this could be incorrect. I have been unable to reproduce the problem with my Nehalem i7 and GTX460.

Please see this thread for the latest: https://boinc.bakerlab.org/forum_thread.php?id=6177. The problem is mentioned in other threads under the "number crunching" section as well. Again, sorry for not noticing your post sooner. |
Pushkin Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Hi, thank you for your answer. At least I know that I am not alone with this problem. I'll start following the thread you linked, and we'll see how things develop. Thanks, Pushkin |
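For anyone comparing their own machine against the pattern Polian points out (an Ivy Bridge CPU together with an NVIDIA display device), the processor and display rows can be picked out of saved `lshw -short` output with a few lines of Python. This is only a rough sketch: the function name `summarize_lshw_short` and the `SAMPLE` text are made up for illustration, the class list is limited to the class names that appear in the output posted in this thread, and none of it comes from BOINC or Rosetta@home.

```python
# Hypothetical helper for illustration only; not part of BOINC or Rosetta@home.

# Class names as they appear in the lshw output posted in this thread.
KNOWN_CLASSES = {
    "system", "bus", "memory", "processor", "bridge", "display",
    "multimedia", "storage", "disk", "volume", "network",
    "communication", "power",
}
WANTED = ("processor", "display")


def summarize_lshw_short(text):
    """Collect the descriptions of processor and display rows."""
    found = {cls: [] for cls in WANTED}
    for line in text.splitlines():
        fields = line.split()
        # Hardware rows start with a path such as /0/4; skip everything else.
        if not fields or not fields[0].startswith("/"):
            continue
        # The class is the second field, or the third when a device
        # name (eth0, /dev/sda, ...) is present.
        for idx in (1, 2):
            if idx < len(fields) and fields[idx] in KNOWN_CLASSES:
                if fields[idx] in found:
                    found[fields[idx]].append(" ".join(fields[idx + 1:]))
                break
    return found


# A few rows copied from the output in the first post.
SAMPLE = """\
/0/4 processor Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
/0/100 bridge Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
/0/100/1/0 display NVIDIA Corporation
/0/100/19 eth0 network 82579LM Gigabit Network Connection
"""

if __name__ == "__main__":
    for cls, descriptions in summarize_lshw_short(SAMPLE).items():
        print(cls, descriptions)
```

The class is looked up by position rather than by searching the whole row because descriptions can contain the same words; the bridge row above contains "processor" in its description and must not be counted as a CPU. To try it on a real machine, save the output of `lshw -short` to a file and pass the file's contents to the function.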
©2024 University of Washington
https://www.bakerlab.org