WU processing for 28 hours so far

Adam J Bavier

Joined: 28 May 18
Posts: 7
Credit: 3,416,960
RAC: 0
Message 473 - Posted: 15 Jul 2018, 2:29:02 UTC

Name: R9_005200353
http://rake.boincfast.ru/rakesearch/workunit.php?wuid=5338671

It has been processing for 1d 04:26:52 and is at 14.909%. Is it stuck? Should I abort it?

I'm running the SSE2 Optimized application.

Here is the text from the project directory on my computer:
# Move search of pairs OLDS status

# Generation of DLS status

9

{
0 1 2 3 4 5 6 7 8 
8 3 -1 -1 -1 -1 -1 0 -1 
-1 -1 7 -1 -1 -1 5 -1 -1 
-1 -1 -1 2 -1 7 -1 -1 -1 
-1 -1 -1 -1 6 -1 -1 -1 -1 
-1 -1 -1 1 -1 4 -1 -1 -1 
-1 -1 3 -1 -1 -1 8 -1 -1 
-1 2 -1 -1 -1 -1 -1 5 -1 
4 -1 -1 -1 -1 -1 -1 -1 1 
}

56

1 2
1 3
1 4
1 5
1 6
1 8
2 0
2 1
2 3
2 4
2 5
2 7
2 8
3 0
3 1
3 2
3 4
3 6
3 7
3 8
4 0
4 1
4 2
4 3
4 5
4 6
4 7
4 8
5 0
5 1
5 2
5 4
5 6
5 7
5 8
6 0
6 1
6 3
6 4
6 5
6 7
6 8
7 0
7 2
7 3
7 4
7 5
7 6
7 8
8 1
8 2
8 3
8 4
8 5
8 6
8 7

1 2 -1
1 2 0

0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 

0 0 0 0 0 0 0 0 0 
0 1 1 0 1 1 1 1 0 
1 1 1 1 1 0 1 0 1 
1 1 0 1 1 1 1 0 1 
1 1 1 1 1 1 0 1 1 
1 0 1 1 0 1 1 1 1 
1 1 1 0 1 1 1 1 0 
1 1 0 1 1 0 1 1 1 
1 0 1 1 0 1 1 1 1 

0 1 1 1 1 1 1 0 1 
1 0 1 0 1 1 1 1 0 
1 0 0 0 1 1 1 1 1 
1 0 0 0 1 1 1 1 1 
0 1 1 1 0 0 1 1 1 
1 1 1 1 1 0 0 0 1 
1 1 1 1 0 1 0 1 1 
1 1 0 1 1 0 1 0 1 
0 1 1 1 1 1 0 1 0 

0 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 0 1 
1 1 1 1 1 1 0 1 1 
1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 0 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 0 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 

1 0 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 0 1 1 1 1 1 
1 1 1 1 0 1 1 1 1 
1 1 1 0 1 0 1 1 1 
1 1 0 1 1 1 0 1 1 
1 1 1 1 1 1 1 0 1 
1 1 1 1 1 1 1 1 0 

1 1 0 1 1 1 1 1 1 
0 0 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 0 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 0 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 

1 1 1 0 1 1 1 1 1 
1 0 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 

1 1 1 1 0 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 

1 1 1 1 1 0 1 1 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 0 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 0 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 
1 1 1 1 1 1 1 0 1 
1 1 1 1 1 1 1 1 1 

1 1 1 1 1 1 0 1 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 0 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 

1 1 1 1 1 1 1 0 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 

1 1 1 1 1 1 1 1 0 
0 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 
1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 


0

# Move search component status

0 0 0
0 0
ID: 473 · Rating: 0
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 645
Credit: 22,431,455
RAC: 12,765
Message 474 - Posted: 15 Jul 2018, 4:09:40 UTC - in response to Message 473.  
Last modified: 15 Jul 2018, 7:53:37 UTC

Hello Adam!

Thank you for bringing this to our attention!

This is abnormal behavior. Your checkpoint (?) file corresponds to the initial state of the computation. I have started testing this workunit on my PC, using the workunit file and your checkpoint file. No deviation has been detected so far, but the test is still in progress.

Can you:
1. Stop BOINC, check that all computing processes have exited, start BOINC again, and watch this task for a while?

If (1) does not help:
2. Stop BOINC, remove checkpoint.txt from this task's slot directory, and start BOINC again.

If (2) does not help, abort the task.

Did it help?
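
For anyone following step 2: BOINC keeps each running task in a numbered subdirectory of the "slots" folder inside its data directory. The following is a minimal Python sketch for locating checkpoint files there before deleting anything by hand. The function name find_checkpoints is made up for illustration, and the data directory you pass in is an assumption that depends on your installation. The sketch only lists files; it does not stop BOINC or remove anything.

```python
import time
from pathlib import Path


def find_checkpoints(data_dir, name="checkpoint.txt"):
    """List checkpoint files under <data_dir>/slots/<n>/.

    Returns a list of (path, size_in_bytes, seconds_since_last_write).
    The file name "checkpoint.txt" is the one mentioned in this thread;
    other applications may use a different name.
    """
    slots = Path(data_dir) / "slots"
    now = time.time()
    found = []
    for path in sorted(slots.glob(f"*/{name}")):
        info = path.stat()
        found.append((path, info.st_size, now - info.st_mtime))
    return found
```

On a default Windows installation the call would look like find_checkpoints(r"C:\ProgramData\BOINC"). A checkpoint whose last write time is many hours old while the task still shows as running is a hint that the task is not making progress.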
ID: 474 · Rating: 0
Adam J Bavier

Message 475 - Posted: 15 Jul 2018, 14:57:39 UTC

I tried troubleshooting option 1. That did not work. Still stuck.

Then I tried option 2:

This is the entire checkpoint file as it was before I deleted it. It appears to be truncated and missing data.
# Move search of pairs OLDS status

# Generation of DLS status

9
{
0 1 2 3 4 5 6 7 8 
8 3 1 6 5 2 4 0 7 
1 4 7 0 3 8 5 2 6 
5 8 6 2 1 7 3 4 0 
7 0 5 8 6 1 2 3 4 
3 5 8 1 7 4 0 6 2 
2 7 3 4 0 6 8 1 5 
6 2 4 7 8 0 1 5 3 
4 6 0 5 2 3 7 8 1 
}
56

1 2 
1 3 
1 4 
1 5 
1 6 
1 8 
2 0 
2 1 
2 3 
2 4 
2 5 
2 7 
2 8 
3 0 
3 1 
3 2 
3 4 
3 6 
3 7 
3 8 
4 0 
4 1 
4 2 
4 3 
4 5 
4 6 
4 7 
4 8 
5 0 
5 1 
5 2 
5 4 
5 6 
5 7 
5 8 
6 0 
6 1 
6 3 
6 4 
6 5 
6 7 
6 8 
7 0 
7 2 
7 3 
7 4 
7 5 
7 6 
7 8 
8 1 
8 2 
8 3 
8 4 
8 5 
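
The truncation can be checked mechanically. The sketch below is Python and rests on an assumption drawn only from the files pasted in this thread, not from any project documentation: that the first non-blank line after the closing "}" is a count, and that it is followed by that many two-number lines. The function name declared_and_listed_pairs is hypothetical.

```python
def declared_and_listed_pairs(text):
    """Compare the declared pair count with the pairs actually present.

    Assumed layout (inferred from this thread only):
      ... "}" / <count> / blank / <count> lines of two integers / blank ...
    Returns (declared, listed).
    """
    lines = [ln.strip() for ln in text.splitlines()]
    rest = lines[lines.index("}") + 1:]  # ValueError if "}" is missing

    i = 0
    while i < len(rest) and not rest[i]:  # skip blank lines before the count
        i += 1
    if i == len(rest):
        raise ValueError("no count found after the closing brace")
    declared = int(rest[i])
    i += 1

    while i < len(rest) and not rest[i]:  # skip blank lines before the pairs
        i += 1

    listed = 0
    while i < len(rest) and len(rest[i].split()) == 2:
        listed += 1
        i += 1
    return declared, listed
```

Under that assumption, the file above declares 56 pairs but lists only 54: the last two entries, "8 6" and "8 7", are missing, along with everything that follows them in a complete file.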
        


Now, after I restarted BOINC Manager, this is the checkpoint file that was created. The number at the bottom keeps getting larger:

# Move search of pairs OLDS status

# Generation of DLS status

9
{
0 1 2 3 4 5 6 7 8 
8 3 1 7 5 6 4 0 2 
1 6 7 0 8 2 5 4 3 
5 0 8 2 1 7 3 6 4 
2 5 0 4 6 8 1 3 7 
3 7 5 1 0 4 2 8 6 
7 4 3 6 2 0 8 1 5 
6 2 4 8 3 1 7 5 0 
4 8 6 5 7 3 0 2 1 
}
56

1 2 
1 3 
1 4 
1 5 
1 6 
1 8 
2 0 
2 1 
2 3 
2 4 
2 5 
2 7 
2 8 
3 0 
3 1 
3 2 
3 4 
3 6 
3 7 
3 8 
4 0 
4 1 
4 2 
4 3 
4 5 
4 6 
4 7 
4 8 
5 0 
5 1 
5 2 
5 4 
5 6 
5 7 
5 8 
6 0 
6 1 
6 3 
6 4 
6 5 
6 7 
6 8 
7 0 
7 2 
7 3 
7 4 
7 5 
7 6 
7 8 
8 1 
8 2 
8 3 
8 4 
8 5 
8 6 
8 7 

1 2 -1
8 7 55

0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 

0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 

0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 

0 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 0 1 
1 0 1 0 1 1 0 1 1 
1 0 1 1 1 0 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 0 0 1 1 1 1 
1 1 0 1 1 0 1 1 1 
1 0 1 1 1 1 0 1 0 
1 1 1 1 1 1 0 1 1 

1 0 1 1 1 1 1 1 1 
0 1 0 1 1 1 1 1 1 
0 1 0 1 1 1 1 1 1 
1 1 1 0 0 1 1 1 1 
1 1 1 1 0 0 0 1 1 
1 1 1 0 1 0 1 1 1 
1 1 0 1 1 1 0 0 1 
1 1 1 1 1 0 1 0 1 
1 1 1 1 1 1 1 1 0 

1 1 0 1 1 1 1 1 1 
0 0 1 1 0 0 0 1 0 
1 1 1 1 0 0 1 1 1 
1 1 1 0 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 
1 1 1 1 0 1 1 1 1 
1 0 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 0 1 

1 1 1 0 1 1 1 1 1 
1 0 1 1 1 1 1 1 1 
1 1 1 1 0 1 1 0 0 
0 1 1 1 1 1 0 1 1 
1 1 1 1 1 0 1 0 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 0 1 1 1 1 
0 1 1 1 1 0 1 1 1 

1 1 1 1 0 1 1 1 1 
1 1 1 0 1 1 0 1 1 
1 0 0 1 1 1 1 0 1 
1 1 0 1 1 1 1 1 0 
1 0 1 0 1 1 1 1 1 
1 1 1 1 1 0 1 1 1 
1 0 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 

1 1 1 1 1 0 1 1 1 
0 1 1 0 0 1 1 1 1 
1 1 0 1 1 1 0 1 1 
0 1 1 1 1 1 1 1 1 
1 0 1 1 0 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 0 1 1 0 1 0 
1 1 1 1 1 1 1 0 1 
1 1 1 0 1 1 1 1 1 

1 1 1 1 1 1 0 1 1 
0 1 1 0 1 0 1 1 1 
1 0 0 1 1 1 1 1 1 
1 1 0 1 1 1 1 0 1 
1 1 1 1 0 1 1 1 1 
1 1 1 1 1 1 1 1 0 
0 1 1 0 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 

1 1 1 1 1 1 1 0 1 
0 1 1 0 1 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 1 0 
1 0 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 
1 1 1 1 0 1 1 1 1 

1 1 1 1 1 1 1 1 0 
0 1 1 1 1 1 1 1 1 
1 1 1 1 0 1 1 1 1 
1 1 0 1 1 1 1 1 1 
1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 0 1 
1 1 1 1 1 1 0 1 1 
1 1 1 0 1 1 1 1 1 
1 0 1 1 1 1 1 1 1 


80000000

# Move search component status

0 0 0
0 80000000


The task is now up to 33%.

Could the code be improved to look for this issue and not hang? The task was using 100% of a processor core when it was stuck.
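
One common way a short checkpoint file like the one above can arise is a write that was interrupted partway through. Whether that is what happened here is not established in this thread. As an illustration only, the Python sketch below shows the usual guard against it: write to a temporary file in the same directory, then rename it over the old checkpoint, so a reader sees either the old file or the complete new one. The project's actual application code is not shown in this thread, and the function name write_checkpoint_atomically is hypothetical.

```python
import os
import tempfile


def write_checkpoint_atomically(path, text):
    """Replace the file at `path` with `text` without exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target
    # for the final rename to be atomic, so create it alongside the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # swap the new file in as a single step
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A complementary measure would be to validate the checkpoint when loading it and to restart from the workunit file if it is incomplete, rather than continuing with partial data.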
ID: 475 · Rating: 0
Adam J Bavier

Message 476 - Posted: 15 Jul 2018, 15:00:53 UTC - in response to Message 475.  

I forgot to mention this earlier: the data I pasted into the very first post of this thread came from the "wu_005200353.txt" file in the C:\ProgramData\BOINC\projects\rake.boincfast.ru_rakesearch directory, so it was the workunit file, not the checkpoint. I didn't know about the "slots" directory. Now I know for the future.
ID: 476 · Rating: 0
Adam J Bavier

Message 477 - Posted: 15 Jul 2018, 15:36:16 UTC

The work unit completed! Yay.

Stopping BOINC Manager and the tasks, deleting the checkpoint file in the slot directory, and then restarting BOINC was definitely the fix.
ID: 477 · Rating: 0
hoarfrost
Message 478 - Posted: 15 Jul 2018, 19:09:31 UTC

Adam, thank you for the information!
ID: 478 · Rating: 0


©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences