Bad workunits
Author | Message |
---|---|
Joined: 11 Aug 17 Posts: 644 Credit: 22,393,426 RAC: 12,551 |
Hello Dingo! "Still coming through all my work is aborting: .." I should add some explanation. Take two workunits as examples: R9_020263562 and R9_020266751. Both were initially generated from "broken files" several days ago. The first workunit was regenerated yesterday, but before we detected the group of workunits based on incorrect files. (Many files were generated normally; the mistakes arose during workunit generation, after the base files were generated.) The new replica of the first workunit (made yesterday) of course also produces tasks that fail immediately after computation starts. Tasks keep being generated for a workunit like this until its error count reaches 8. Once the count of failed tasks for a workunit reaches 8, the workunit is marked as completed with an error. After that, we can find these workunits in the database, correctly delete this information, and create a new copy from a correctly generated workunit file. The second workunit listed above is an example of a correctly regenerated workunit. The project database now contains about 5400 workunits generated from incorrect files (since my previous post, more than 200 workunits have completed their lifecycle). Most of them are close to 8 task errors; they will complete their lifecycle in the next 2 or 3 days, and we can then correctly replace them with new replicas. During these days, incorrect tasks may still arrive on participants' computers, but they do not consume CPU time and fail immediately after start. The most correct approach now is simply to wait until the workunits with incorrect tasks reach the end of their life. My computers receive incorrect tasks also. :) At the same time, new correct workunits are being added to the project database and are sent to computers as well. Thank you for your attention! |
Joined: 1 Jan 19 Posts: 4 Credit: 32,381,006 RAC: 12,838 |
I spotted another kind of bad WU the other day: https://rake.boincfast.ru/rakesearch/workunit.php?wuid=21236432 It caught my attention because it was already running for more than 2 hours on my fastest computer. On all other hosts, it crashed with EXIT_TIME_LIMIT_EXCEEDED after more than a day, so I decided to abort it. |
Joined: 11 Aug 17 Posts: 644 Credit: 22,393,426 RAC: 12,551 |
"I spotted another kind of bad WU the other day: https://rake.boincfast.ru/rakesearch/workunit.php?wuid=21236432" You did absolutely the right thing! We found only one(!) workunit like this in the project database, and we marked it as invalid so that it can be recreated later. Thank you for your attention! |
Joined: 11 Sep 17 Posts: 51 Credit: 194,388,032 RAC: 3,439 |
Hi, good work on trying to fix this problem. I am getting some now: https://rake.boincfast.ru/rakesearch/results.php?userid=67&offset=0&show_names=0&state=6&appid= Can you see those failed WUs and their stderr? |
Joined: 11 Aug 17 Posts: 644 Credit: 22,393,426 RAC: 12,551 |
"Hi, good work, you try fix this problem." Hi! We looked at the list of 2990WX tasks with errors: they were produced by workunits from incorrect files. Most of those workunits are near the end of their life, and in the next 1-2 days they will be ready for replacement. Thank you! |
Joined: 6 Jan 18 Posts: 7 Credit: 16,825,117 RAC: 9,311 |
They certainly are a pain. I just noticed this box that received 56 bad WUs in a row. Because of all the comp errors, BOINC chucked a hissy fit and put in a 24 hr delay. When I just now found it, the PC had already been idle for 10 hrs. :( https://rake.boincfast.ru/rakesearch/results.php?hostid=2197&offset=0&show_names=0&state=6&appid= |
Joined: 11 Aug 17 Posts: 644 Credit: 22,393,426 RAC: 12,551 |
The problem is mostly solved. A few hundred incorrect workunits are still not completed, but only because each has 1 or 2 results that were sent to computers and have not been reported yet. If you see a massive bunch of errors for new tasks, please post to this thread. |
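
For readers following along, the workunit lifecycle described earlier in the thread (tasks keep being issued until 8 of them have failed, the workunit is then marked as completed with an error, and only after that is it replaced by a new replica) can be sketched roughly as below. This is a simplified illustration, not the project's server code: the class, the state names, the workunit name, and the helper functions are all hypothetical, and the real scheduler also handles quorums and deadlines that are ignored here. Only the limit of 8 errors comes from the thread.

```python
from dataclasses import dataclass

# Error limit quoted in the thread; assumed here to apply per workunit.
MAX_ERROR_TASKS = 8


@dataclass
class Workunit:
    """Hypothetical, heavily simplified stand-in for a project workunit."""
    name: str
    input_is_broken: bool
    error_count: int = 0
    state: str = "in_progress"  # hypothetical state names


def run_task(wu: Workunit) -> bool:
    """Pretend to run one task; tasks built from broken input fail at once."""
    return not wu.input_is_broken


def process(wu: Workunit) -> Workunit:
    """Issue tasks until one succeeds or the error limit is reached."""
    while wu.state == "in_progress":
        if run_task(wu):
            wu.state = "completed"
        else:
            wu.error_count += 1
            if wu.error_count >= MAX_ERROR_TASKS:
                wu.state = "completed_with_error"
    return wu


def replace_failed(wu: Workunit) -> Workunit:
    """Create a fresh replica from a correctly generated file.

    Only allowed once the old workunit has reached the end of its life.
    """
    if wu.state != "completed_with_error":
        raise ValueError("workunit has not finished its lifecycle yet")
    return Workunit(name=wu.name, input_is_broken=False)


bad = process(Workunit("WU_example", input_is_broken=True))
fixed = process(replace_failed(bad))
print(bad.state, bad.error_count)
print(fixed.state, fixed.error_count)
```

The sketch also shows why waiting is the simplest option: a broken workunit cannot be replaced cleanly until its error count has run out, and each failing task costs almost no CPU time.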
©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences