Server upgrade shifted to May 24th, 06:00-16:00 UTC

Message boards : News : Server upgrade shifted to May 24th, 06:00-16:00 UTC

Profile Natalia
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 103
Credit: 1,973,929
RAC: 15
Message 412 - Posted: 23 May 2018, 11:30:50 UTC

Dear crunchers,

The server's hardware upgrade (hopefully the final one!) is planned for tomorrow, May 24th, 06:00-16:00 UTC. We are sorry for shifting it once again. The delay is because the server is shared with other people's projects, including the institute's local ShareLaTeX server.

Have a good day!
Profile Bill F

Joined: 1 Jan 18
Posts: 10
Credit: 701,430
RAC: 311
Message 413 - Posted: 25 May 2018, 22:09:00 UTC

Natalia

Well, how did the upgrade go on May 24th? Were you able to complete it?

What improvements were involved?

Thanks
Bill F
Profile Natalia
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 103
Credit: 1,973,929
RAC: 15
Message 414 - Posted: 26 May 2018, 10:52:20 UTC
Last modified: 26 May 2018, 11:04:52 UTC

We installed a second CPU and a second HDD and set up RAID 1 (mirroring). This is in addition to the previous upgrade, in which we increased the RAM. In summary, we were able to enlarge the MySQL caches, speed up database queries, and improve the filesystem cache and overall system speed. We are also protected if one of the HDDs fails.
And now it looks like this:
Profile PDW

Joined: 8 Sep 17
Posts: 34
Credit: 100,058,938
RAC: 8
Message 416 - Posted: 28 May 2018, 7:50:51 UTC - in response to Message 414.  

Hi Natalia,

The web site was unavailable last night from shortly after 21:00 UTC for well over an hour (I stopped checking and went to bed).

Was this just due to the generation of new tasks?
The number of available WUs had dropped below 5k before 21:00 UTC, and this morning it is over 20k.

If the outage was caused by WU generation, are there any plans to reduce the impact on the server when this happens, please?
Profile Natalia
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 103
Credit: 1,973,929
RAC: 15
Message 417 - Posted: 28 May 2018, 9:27:44 UTC - in response to Message 416.  

Dear PDW,

At the moment, WUs are generated by a script that checks every 15 minutes whether there are fewer than 4000 tasks ready to send. If so, the next batch of 32,000 tasks is generated. The outages might have been caused by failures in the server's local network.
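
For readers who want to picture the refill rule described above, here is a minimal Python sketch. It is not the project's actual script: the function names, the simulated send rate, and the way the queue is modelled are all assumptions. Only the 15-minute interval, the 4000-task threshold, and the 32,000-task batch size come from the post.

```python
# Sketch only: hypothetical names, not the project's real work generator.
LOW_WATER_MARK = 4000        # refill when fewer tasks than this are ready to send
BATCH_SIZE = 32_000          # tasks generated per refill
CHECK_INTERVAL_S = 15 * 60   # the check runs every 15 minutes


def tasks_to_generate(ready_to_send: int) -> int:
    """Return how many tasks a single check should generate."""
    return BATCH_SIZE if ready_to_send < LOW_WATER_MARK else 0


def simulate(ready: int, sent_per_interval: int, checks: int) -> list[int]:
    """Return the queue size observed right after each periodic check.

    sent_per_interval is a made-up figure for how many tasks hosts
    download between two consecutive checks.
    """
    history = []
    for _ in range(checks):
        ready += tasks_to_generate(ready)
        history.append(ready)
        ready = max(0, ready - sent_per_interval)
    return history
```

For example, `simulate(5000, 2000, 3)` gives `[5000, 35000, 33000]`. Under this rule alone, the queue seen right after a check stays below 4000 + 32,000 = 36,000 tasks.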
Profile PDW

Joined: 8 Sep 17
Posts: 34
Credit: 100,058,938
RAC: 8
Message 418 - Posted: 29 May 2018, 7:14:45 UTC - in response to Message 417.  

Dear Natalia,

I woke this morning to find machines backed off, unable to upload completed work.
I manually retried the transfers, which went through, and then manually updated to get more work.

More troubles in the server's local network ?

I note that the number of tasks available is now over 470,000 which is a lot more than the expected <4000 + 32,000 :)
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 644
Credit: 22,403,961
RAC: 12,638
Message 419 - Posted: 29 May 2018, 7:28:08 UTC - in response to Message 418.  

PDW, hello!

"More troubles in the server's local network ?"

Yes. We are now thinking about increasing the number of tasks per CPU core on hosts; this would reduce the impact of network troubles.

"I note that the number of tasks available is now over 470,000 which is a lot more than the expected <4000 + 32,000 :)"

That's normal: last night I manually generated a big bunch of workunits and tasks.
Profile PDW

Joined: 8 Sep 17
Posts: 34
Credit: 100,058,938
RAC: 8
Message 420 - Posted: 29 May 2018, 7:39:14 UTC - in response to Message 419.  

Hello hoarfrost,

An increase in tasks per core would help, thank you.

A bigger bunch of tasks could mean a bigger bunch of flowers for more crunchers !
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 644
Credit: 22,403,961
RAC: 12,638
Message 421 - Posted: 29 May 2018, 8:48:08 UTC - in response to Message 420.  
Last modified: 29 May 2018, 16:34:50 UTC

The number of tasks per core has been doubled: the limit is now 16 tasks in progress per core (or CPU thread).
I think the server RAM upgrade and the larger MySQL caches are what allowed us to do this.
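
To make the arithmetic concrete, here is a small Python sketch of what a per-core limit means for a single host. The function names and the half-hour task runtime in the usage note are hypothetical; only the figure of 16 tasks per core (double the previous 8) comes from the post. BOINC's server configuration has an option called `max_wus_in_progress` that scales with a host's CPU count, but whether this project set the limit that way is an assumption.

```python
TASKS_PER_CORE = 16  # stated in the post; previously half of that (8)


def max_tasks_in_progress(ncpus: int, per_core: int = TASKS_PER_CORE) -> int:
    """Upper bound on tasks a host may hold at once under a per-core limit."""
    return per_core * ncpus


def outage_cover_hours(task_hours: float, per_core: int = TASKS_PER_CORE) -> float:
    """Hours a fully stocked host can keep every core busy without the server.

    Each core works through `per_core` tasks one after another, so the cache
    lasts per_core * task_hours regardless of the core count.
    task_hours is a hypothetical average runtime, not a project figure.
    """
    return per_core * task_hours
```

With these assumptions, a 4-thread host may hold `max_tasks_in_progress(4)` = 64 tasks. If tasks took half an hour each, a full cache would cover `outage_cover_hours(0.5)` = 8.0 hours of server unavailability, compared with 4.0 hours at the old limit of 8 per core.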
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 644
Credit: 22,403,961
RAC: 12,638
Message 422 - Posted: 30 May 2018, 7:04:19 UTC - in response to Message 420.  

"A bigger bunch of tasks could mean a bigger bunch of flowers for more crunchers !"

And an additional stress test for the project server. It seems the server works fine! :)


©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences