Server upgrade shifted to May 24th, 06:00-16:00 UTC
Author | Message |
---|---|
Send message Joined: 11 Aug 17 Posts: 103 Credit: 1,973,929 RAC: 15 |
Dear crunchers, the server's hardware upgrade (hopefully the final one!) is planned for tomorrow, May 24th, 06:00-16:00 UTC. We are sorry for shifting it once again. It is due to the fact that the server is shared with other people's projects, including a local ShareLaTeX server of the institute. Have a good day! |
Send message Joined: 1 Jan 18 Posts: 10 Credit: 701,430 RAC: 311 |
Natalia, how did the upgrade go on May 24th? Were you able to complete it? What improvements were involved? Thanks, Bill F |
Send message Joined: 11 Aug 17 Posts: 103 Credit: 1,973,929 RAC: 15 |
We installed the second CPU and the second HDD, and set up RAID 1 (mirroring). This is in addition to the previous upgrade, where we increased the RAM. To summarize, we were able to increase the MySQL caches, speed up database queries, and improve the filesystem cache and the system speed in general. Also, we are safe in case one of the HDDs fails. And now it looks like this: |
Send message Joined: 8 Sep 17 Posts: 34 Credit: 100,058,938 RAC: 8 |
Hi Natalia, The web site was unavailable last night from shortly after 21:00 UTC for well over an hour (I stopped checking and went to bed). Was this just due to the generation of new tasks? Available WUs had dropped below 5k before 21:00 UTC, and this morning the count is over 20k. If the outage was caused by WU generation, are there any plans to reduce the impact on the server when this happens, please? |
Send message Joined: 11 Aug 17 Posts: 103 Credit: 1,973,929 RAC: 15 |
Dear PDW, at the moment WUs are generated by a script which checks every 15 minutes whether there are <4000 tasks ready to send. If so, the next batch of 32,000 tasks is generated. The outages might have been caused by failures in the server's local network. |
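The refill rule described above can be sketched as follows. This is only an illustration of the stated thresholds (check every 15 minutes, refill with 32,000 tasks when fewer than 4000 are ready to send); the function names and the simulated send rate are hypothetical and are not taken from the project's actual script.

```python
# Minimal sketch of the threshold-based refill rule described in the post.
# All names are hypothetical; the real generator script is not shown in the thread.

LOW_WATER_MARK = 4_000            # refill when fewer than this many tasks are ready
BATCH_SIZE = 32_000               # tasks generated per refill
CHECK_INTERVAL_SECONDS = 15 * 60  # the script runs a check every 15 minutes


def tasks_to_generate(ready_to_send: int,
                      low_water_mark: int = LOW_WATER_MARK,
                      batch_size: int = BATCH_SIZE) -> int:
    """Return how many tasks one check would generate."""
    return batch_size if ready_to_send < low_water_mark else 0


def simulate(initial_ready: int, sent_per_interval: int, checks: int) -> list[int]:
    """Simulate several checks; sent_per_interval is an assumed send rate."""
    ready = initial_ready
    history = []
    for _ in range(checks):
        ready += tasks_to_generate(ready)           # refill if below the mark
        history.append(ready)                       # queue size right after the check
        ready = max(0, ready - sent_per_interval)   # tasks sent before the next check
    return history


if __name__ == "__main__":
    print(simulate(initial_ready=5_000, sent_per_interval=2_000, checks=3))
```

Under this rule alone, the queue right after an automatic refill stays below 4000 + 32,000 tasks, so a much larger count points to generation outside the script.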
Send message Joined: 8 Sep 17 Posts: 34 Credit: 100,058,938 RAC: 8 |
Dear Natalia, I woke this morning to find machines backed off, unable to upload completed work. I manually retried the transfers and they were sent, and then I manually updated to get more work. More troubles in the server's local network? I note that the number of tasks available is now over 470,000, which is a lot more than the expected <4000 + 32,000 :) |
Send message Joined: 11 Aug 17 Posts: 644 Credit: 22,403,752 RAC: 12,641 |
PDW, hello! "More troubles in the server's local network?" Yes. We are now thinking about increasing the number of tasks per CPU core on hosts, which would reduce the impact of network troubles. "I note that the number of tasks available is now over 470,000 which is a lot more than the expected <4000 + 32,000 :)" That's normal: tonight I manually generated a big bunch of workunits and tasks. |
Send message Joined: 8 Sep 17 Posts: 34 Credit: 100,058,938 RAC: 8 |
Hello hoarfrost, An increase in tasks per core would help, thank you. A bigger bunch of tasks could mean a bigger bunch of flowers for more crunchers! |
Send message Joined: 11 Aug 17 Posts: 644 Credit: 22,403,752 RAC: 12,641 |
The number of tasks per core has been doubled. It is now 16 tasks in progress per core or CPU thread. The server RAM upgrade and the increased MySQL caches, I think, allowed us to do this. |
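As a rough illustration of why a higher per-thread limit softens network outages, the sketch below computes the per-host cap and how long a host could keep working from its local queue. The limit of 16 (previously 8, since it was doubled) comes from the post; the function names and the task runtime of 0.5 hours are purely assumed example values.

```python
# Rough sketch: effect of the per-thread in-progress limit on a single host.
# The 16-task limit is from the post; the 0.5 h task runtime is an assumption.

def max_tasks_in_progress(cpu_threads: int, per_thread_limit: int = 16) -> int:
    """Upper bound on tasks a host may hold at once."""
    return cpu_threads * per_thread_limit


def outage_cover_hours(per_thread_limit: int, hours_per_task: float) -> float:
    """Hours one thread stays busy on queued tasks if the server is unreachable."""
    return per_thread_limit * hours_per_task


if __name__ == "__main__":
    threads = 4            # hypothetical host
    hours_per_task = 0.5   # hypothetical task runtime
    for limit in (8, 16):  # limit before and after the change
        print(limit,
              max_tasks_in_progress(threads, limit),
              outage_cover_hours(limit, hours_per_task))
```

With these assumed numbers, doubling the limit doubles the time a fully stocked host can ride out an outage without fetching new work.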
Send message Joined: 11 Aug 17 Posts: 644 Credit: 22,403,752 RAC: 12,641 |
"A bigger bunch of tasks could mean a bigger bunch of flowers for more crunchers!" And an additional stress test for the project server. It seems the server works fine! :) |
©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences