Bad workunits

Message boards : Number crunching : Bad workunits
BetelgeuseFive

Joined: 30 Mar 19
Posts: 1
Credit: 2,372,259
RAC: 0
Message 899 - Posted: 16 Apr 2019, 15:27:45 UTC

It looks like some bad workunits were distributed on April 14:

https://rake.boincfast.ru/rakesearch/workunit.php?wuid=19424958
https://rake.boincfast.ru/rakesearch/workunit.php?wuid=19424055
https://rake.boincfast.ru/rakesearch/workunit.php?wuid=19424378
https://rake.boincfast.ru/rakesearch/workunit.php?wuid=19424051

Based on the fact that all hosts seem to have problems with these units, I assume the problem is server-side.

Any clues as to what happened here?

Tom
ID: 899 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 903 - Posted: 17 Apr 2019, 7:21:20 UTC - in response to Message 899.  
Last modified: 17 Apr 2019, 7:21:34 UTC

Hello Tom!

... It looks like some bad workunits were distributed on April 14 ...

Thank you for the information!
We are now collecting and analyzing the results received during the sprint and regenerating workunits like these.
ID: 903 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
pschoefer

Joined: 1 Jan 19
Posts: 4
Credit: 32,381,006
RAC: 12,838
Message 915 - Posted: 25 Apr 2019, 5:16:15 UTC

There's another bunch of bad workunits being distributed right now. Almost all tasks on my computers crash immediately (e.g., https://rake.boincfast.ru/rakesearch/results.php?hostid=5553&offset=0&show_names=0&state=6), and they crashed with the same 0x80000003 exception on other computers.
ID: 915 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
JugNut

Joined: 6 Jan 18
Posts: 7
Credit: 16,825,117
RAC: 9,311
Message 916 - Posted: 25 Apr 2019, 7:10:40 UTC
Last modified: 25 Apr 2019, 8:10:28 UTC

Yeah, same here, just had 26 in a row error out. As above, the error returned was an Unhandled Exception "-2147483645 (0x80000003)": https://rake.boincfast.ru/rakesearch/results.php?hostid=7574&offset=0&show_names=0&state=6&appid=

EDIT: Just found another dozen that just happened on this PC. https://rake.boincfast.ru/rakesearch/results.php?hostid=2190&offset=0&show_names=0&state=6&appid=
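
In case it helps anyone reading the error pages: the negative number and the hex value in that message are the same 32-bit pattern, read once as signed and once as unsigned. A minimal Python sketch of the conversion (the helper names are made up for illustration):

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def to_unsigned32(value: int) -> int:
    """Interpret value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(to_signed32(0x80000003))          # -2147483645
    print(hex(to_unsigned32(-2147483645)))  # 0x80000003
```

On Windows, 0x80000003 is the NTSTATUS value STATUS_BREAKPOINT, which would be consistent with the application aborting on bad input rather than with a fault on any particular host.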
ID: 916 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 917 - Posted: 25 Apr 2019, 8:02:49 UTC
Last modified: 25 Apr 2019, 10:05:33 UTC

Hello folks!

Thank you for the information! During generation of a bunch of workunits, some of them "crashed". We plan to regenerate them in the next few days.
We also check the integrity of bunches of workunits and results during the receiving and archiving phases, and if we detect missing or crushed workunits, we make new copies of them.

Thank you!
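
As a rough illustration of what a check for missing or crushed files could look like, here is a minimal Python sketch. It is not the project's actual check: the directory layout, the function name, and the rule that an empty file counts as crushed are all assumptions made for the example.

```python
from pathlib import Path


def find_bad_files(batch_dir: Path, expected_names: list[str]) -> dict[str, list[str]]:
    """Return expected names that are missing from batch_dir or present but empty.

    Assumption: an empty file is treated as "crushed"; a real check might
    compare checksums or validate the file contents instead.
    """
    missing, empty = [], []
    for name in expected_names:
        path = batch_dir / name
        if not path.is_file():
            missing.append(name)
        elif path.stat().st_size == 0:
            empty.append(name)
    return {"missing": missing, "empty": empty}
```

Anything reported under either key would then be a candidate for a fresh copy.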
ID: 917 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
jozef j
Joined: 11 Sep 17
Posts: 51
Credit: 194,388,032
RAC: 3,439
Message 918 - Posted: 26 Apr 2019, 6:14:01 UTC

Hi, the number of bad work units is still increasing: https://rake.boincfast.ru/rakesearch/results.php?userid=67&offset=0&show_names=0&state=6&appid=

750 and 1000 waiting for validation... During the night the project fell back to a "24 hours refresh", and my PC ran out of units.
Hope you find the problem soon.
ID: 918 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
AlexxSaigon
Joined: 17 Mar 19
Posts: 4
Credit: 6,693,878
RAC: 161
Message 920 - Posted: 26 Apr 2019, 14:09:03 UTC

Unfortunately, some tasks are downloaded empty.
https://thumbsnap.com/LHQvN3JJ
ID: 920 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Suicyder

Send message
Joined: 29 Mar 19
Posts: 1
Credit: 4,913,428
RAC: 0
Message 921 - Posted: 26 Apr 2019, 20:41:32 UTC

Yeah, had a ton of them again as well. The worst part is that computation errors have a negative effect on the BOINC client's connection retries (it waits longer and longer, which is really annoying).
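
To show why the waits keep growing, here is a minimal Python sketch of a capped exponential backoff. This is only an illustration of the general technique; the thread does not say what schedule the BOINC client actually uses, so the base delay, the doubling factor, and the 24-hour cap are all assumed values.

```python
def backoff_delays(failures: int, base: float = 60.0, cap: float = 86400.0) -> list[float]:
    """Delay in seconds before each retry after 1..failures consecutive errors.

    Assumed policy: start at `base` seconds, double after every further
    failure, and never wait longer than `cap` seconds.
    """
    return [min(cap, base * 2 ** n) for n in range(failures)]


if __name__ == "__main__":
    print(backoff_delays(5))  # [60.0, 120.0, 240.0, 480.0, 960.0]
```

With a policy like this, a long run of failed tasks pushes the next contact further and further out until the cap is reached.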
ID: 921 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 922 - Posted: 27 Apr 2019, 16:30:43 UTC

Hello folks!

The total number of "crushed workunits" is about 7200. We know how to extract the list of them from the database, by workunit attributes.
Today 3812 of the "crushed workunits" were archived and regenerated.
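
For a sense of scale, the two figures above work out as follows. This is plain arithmetic on the numbers in this post; since the total is only "about 7200", the results are approximate.

```python
total_crushed = 7200      # "about 7200", so everything derived from it is approximate
regenerated_today = 3812  # archived and regenerated so far

remaining = total_crushed - regenerated_today
share_done = regenerated_today / total_crushed

print(f"remaining: about {remaining}")   # remaining: about 3388
print(f"regenerated: {share_done:.0%}")  # regenerated: 53%
```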
ID: 922 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
RobertN

Joined: 9 Oct 18
Posts: 2
Credit: 3,853,343
RAC: 0
Message 924 - Posted: 29 Apr 2019, 19:31:25 UTC

It seems that another batch of bad work units is being distributed.
ID: 924 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Dr Who Fan
Joined: 7 Sep 17
Posts: 35
Credit: 1,706,725
RAC: 496
Message 925 - Posted: 30 Apr 2019, 0:29:23 UTC

Been receiving LOTS (50)+ BAD TASKS -- MOSTLY IN LAST HOUR
Error tasks for Dr Who Fan

ID: 925 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 926 - Posted: 30 Apr 2019, 6:14:00 UTC - in response to Message 924.  

Hello RobertN!

Interesting! Please post a link to one or several tasks.

Thank you!
It seems that another batch of bad work units are being distributed.
ID: 926 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 927 - Posted: 30 Apr 2019, 6:21:47 UTC - in response to Message 925.  

Hello Dr Who Fan!
Been receiving LOTS (50)+ BAD TASKS

Your hosts are hidden, but we can look at the failed tasks later, in the database.
ID: 927 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
JugNut

Joined: 6 Jan 18
Posts: 7
Credit: 16,825,117
RAC: 9,311
Message 928 - Posted: 30 Apr 2019, 6:53:37 UTC
Last modified: 30 Apr 2019, 6:59:14 UTC

From what I can see, they're from just two batches.

Tasks still being issued whose names start with either "R9_02026****" or "R9_02025****" all give a computation error.
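
If you keep a list of task names, a quick way to pick out the suspect ones is a prefix match. A minimal Python sketch follows; the two prefixes come from the observation above, while the function names and the sample task names are hypothetical and invented only for illustration.

```python
# Prefixes reported in this thread as always failing.
BAD_PREFIXES = ("R9_02026", "R9_02025")


def is_suspect(task_name: str) -> bool:
    """True if the task name starts with one of the reported bad prefixes."""
    return task_name.startswith(BAD_PREFIXES)


def split_tasks(task_names: list[str]) -> tuple[list[str], list[str]]:
    """Split task names into (suspect, other), preserving their order."""
    suspect = [n for n in task_names if is_suspect(n)]
    other = [n for n in task_names if not is_suspect(n)]
    return suspect, other


if __name__ == "__main__":
    # Hypothetical task names, not taken from the project.
    names = ["R9_020261111_0", "R9_020251111_1", "R9_020301111_0"]
    print(split_tasks(names))
```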
ID: 928 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Dr Who Fan
Joined: 7 Sep 17
Posts: 35
Credit: 1,706,725
RAC: 496
Message 929 - Posted: 30 Apr 2019, 7:25:57 UTC - in response to Message 927.  

Hello Dr Who Fan!
Been receiving LOTS (50)+ BAD TASKS

Your hosts hidden, but we can view failed tasks later - in database.


My hosts are NOT hidden. I have the check box ticked for "Should RakeSearch show your computers on its web site?"

I never had a problem linking to my publicly viewable task list at other projects in the past.
Could it be that a recent change to the default BOINC server code removed the ability to view other people's task lists?

ID: 929 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
JugNut

Joined: 6 Jan 18
Posts: 7
Credit: 16,825,117
RAC: 9,311
Message 930 - Posted: 30 Apr 2019, 7:37:35 UTC - in response to Message 929.  
Last modified: 30 Apr 2019, 7:44:33 UTC

@Dr Who Fan: In your link titled "Error tasks for Dr Who Fan" you tried to link to the error tasks for all your PCs at once. You can only link to the tasks of one PC at a time. It's always been that way.
For some reason, only a logged-in user has access to all of their PCs at once. Why? Who knows.
ID: 930 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jaari

Joined: 31 Mar 19
Posts: 7
Credit: 1,204,769
RAC: 0
Message 931 - Posted: 30 Apr 2019, 17:58:28 UTC - in response to Message 926.  
Last modified: 30 Apr 2019, 18:01:37 UTC

Eh, been getting lots of bad WUs myself. Nvm the ratelimit, getting more bad WUs again :P
ID: 931 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 932 - Posted: 30 Apr 2019, 18:49:36 UTC
Last modified: 30 Apr 2019, 18:52:23 UTC

Thank you for your attention, folks! Some "crushed workunits" were themselves generated normally, and the problem arose later, during generation of their tasks. Others really were crushed from the start, during generation of the workunit files. Both groups were spread across the whole set of failed tasks. The first group had already been detected, and most of its workunits have already been processed, but the second group we have detected only now, with your help!

We are now making new copies of the workunits. Once the existing "failed workunits" reach "completion", we will delete them from the database and make new copies of the tasks.

Thank you!
ID: 932 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,435,583
RAC: 12,832
Message 935 - Posted: 1 May 2019, 13:30:33 UTC

Minor update: about 5600 workunits based on "broken" files are now in progress. Most of them already have 6 to 9 results completed with errors, and I think that in the next few days the lifecycle of most of them will come to an end and we can replace them with new ones. Half an hour ago we replaced 1849 workunits.
ID: 935 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Dingo
Joined: 15 Sep 17
Posts: 10
Credit: 11,208,449
RAC: 58
Message 936 - Posted: 1 May 2019, 16:58:19 UTC

Still coming through; all my work is aborting:

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
terminate called after throwing an instance of 'char const*'
SIGABRT: abort called
Stack trace (14 frames):
[0x40bf2d]
[0x493e00]
[0x493ccb]
[0x49e095]
[0x47927d]
[0x4772a6]
[0x4772d3]
[0x4773f2]
[0x404846]
[0x404bdb]
[0x400bdc]
[0x4013d6]
[0x49563b]
[0x400449]

Exiting...

</stderr_txt>
]]>
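
For anyone comparing several of these logs, a small Python sketch like the one below can pull out the exit code and stack frame addresses. The regular expressions are written against the log format shown above only, and the function name and the shortened sample text are assumptions for the example. Note that 193, 0xc1, and -63 are the same byte: 0xc1 is 193, and read as a signed 8-bit value it is 193 - 256 = -63. The "terminate called after throwing an instance of 'char const*'" line is what the C++ runtime prints when a thrown string is never caught.

```python
import re

# Matches e.g. "process exited with code 193 (0xc1, -63)"
EXIT_RE = re.compile(r"process exited with code (\d+) \((0x[0-9a-fA-F]+), (-?\d+)\)")
# Matches stack frame lines such as "[0x40bf2d]"
FRAME_RE = re.compile(r"^\[(0x[0-9a-fA-F]+)\]$", re.MULTILINE)


def summarize(stderr_text: str) -> dict:
    """Extract the exit code, frame addresses, and abort flag from a task log."""
    match = EXIT_RE.search(stderr_text)
    return {
        "exit_code": int(match.group(1)) if match else None,
        "frames": FRAME_RE.findall(stderr_text),
        "aborted": "SIGABRT" in stderr_text,
    }


# Shortened copy of the log above, used only as sample input.
SAMPLE = """process exited with code 193 (0xc1, -63)
terminate called after throwing an instance of 'char const*'
SIGABRT: abort called
Stack trace (14 frames):
[0x40bf2d]
[0x493e00]
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```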

Proud Founder of BOINC@AUSTRALIA
Have a look at my WebCam
My best Prime 91655310^131072 + 1
ID: 936 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote

©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences