Optimized RakeSearch app for rank 9 (computations finished)

Profile Stephen Uitti

Joined: 12 Nov 17
Posts: 7
Credit: 6,169,826
RAC: 0
Message 328 - Posted: 12 Mar 2018, 16:03:06 UTC - in response to Message 280.  

Thanks, Daniel. I grepped for sse2 on the Phenom but didn't think to grep for neon on the ARM boards.

It turns out that both the Pi 2 and the Pi 3 ARM processors support NEON. Both systems have completed units, and both have received credit for NEON units.
Pi Zeros don't work with the accelerated apps; they error out right away, so I've turned them off. One Zero was running Jessie and the other Stretch, but I'm sure it's the processor, not the OS.

I've verified that the AMD A8 is in fact running the AVX-accelerated app, and doing so successfully. It's about 20% slower than the Phenom II, which doesn't have AVX and is running the SSE2 app. It's not unusual for the A8 to run 20% faster or 20% slower than the Phenom II on different apps or benchmarks. I might try the SSE2 app on the A8. I time these by pasting the stats for 20 valid units into a spreadsheet and averaging.

Stephen.
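
The grep check described in this post can be scripted. Here is a minimal Python sketch, assuming a Linux system where /proc/cpuinfo lists capabilities on a "flags" line (x86) or a "Features" line (ARM). The function names and the build labels are illustrative only; they are not part of BOINC or of the RakeSearch apps.

```python
def cpu_features(cpuinfo_text):
    """Return the set of feature names found in /proc/cpuinfo-style text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the list "flags"; ARM kernels label it "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            features.update(value.split())
    return features


def pick_build(features):
    """Pick the most capable matching label (labels are hypothetical)."""
    for name in ("avx2", "avx", "sse2", "neon"):
        if name in features:
            return name
    return "generic"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(pick_build(cpu_features(f.read())))
    except OSError:
        print("no readable /proc/cpuinfo on this system")
```

A board whose Features line has no "neon" entry falls through to "generic", which would be consistent with the NEON builds failing on the Pi Zeros.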
[AF>FAH-Addict.net]toTOW

Joined: 30 Nov 17
Posts: 10
Credit: 10,045,366
RAC: 0
Message 331 - Posted: 12 Mar 2018, 22:51:37 UTC

We understood with the first post ...
Profile [B@P] Daniel

Joined: 8 Sep 17
Posts: 89
Credit: 375,708,628
RAC: 88,093
Message 332 - Posted: 13 Mar 2018, 8:43:07 UTC - in response to Message 327.  

Thanks Daniel. I grep'ed for sse2 on the Phenom, didn't think to grep for neon on the Arms.

It turns out that both the pi 2 and the pi 3 Arm processors support NEON. Both processor systems have completed units. The pi 2 and pi 3 systems have gotten credit for NEON units.
Pi zeros don't work with the accelerated apps. They error out right away. (I've turned them off.) One zero was running Jessie, and the other Stretch, but I'm sure it's the processor, not the OS.

I've verified that the AMD A8 is in fact running the AVX accelerated app, and is successful. It's about 20% slower than the Phenom II, which doesn't have AVX, and is running SSE2. It's not unusual for the A8 to run 20% faster or 20% slower than the Phenom II on different apps or benchmarks. I might try the SSE2 app on the A8. I time these by pasting 20 valid units stats into a spreadsheet, and averaging.

Stephen.

Thanks for the info. The ARM app on the Pi Zero crashed after receiving signal 4, that is SIGILL (illegal instruction). It looks like there should be a separate app for ARMv6, or the non-NEON one should have some instruction sets disabled. I will look into this when I find some free time.
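
Signal 4 is SIGILL on Linux, the signal raised when a CPU meets an instruction it does not implement. Below is a small Python sketch of turning a child process's return code into that diagnosis. `describe_exit` is a hypothetical helper; it relies on the POSIX convention that Python's `subprocess` module reports death by signal as a negative return code.

```python
import signal


def describe_exit(returncode):
    """Describe how a child process ended, given a subprocess return code."""
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # -4 is what subprocess reports for a child killed by signal 4.
    print(describe_exit(-4))
```

The value to pass in would be the `returncode` attribute of the result of `subprocess.run(...)` for whichever binary is being tested.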
Profile Beyond
Joined: 20 Mar 18
Posts: 6
Credit: 60,977,506
RAC: 0
Message 343 - Posted: 21 Mar 2018, 18:56:17 UTC

Hi Daniel, thanks for working on this! To clarify/summarize this long thread:
What's the current status/version of the optimized app, how much faster is it than the stock version, and approximately when is it due to be made the default?

Repository for optimized versions: https://github.com/sirzooro/RakeSearch/releases/tag/v1.0

Any thoughts on which version is currently best for most 64-bit machines?
Profile [B@P] Daniel

Joined: 8 Sep 17
Posts: 89
Credit: 375,708,628
RAC: 88,093
Message 346 - Posted: 26 Mar 2018, 8:57:34 UTC - in response to Message 343.  

Hi Daniel, thanks for working on this! To clarify/summarize this long thread:
What's the current status/version of the optimized app, how much faster is it than the stock version and approximately when is it due to be made default?

Repository for optimized versions: https://github.com/sirzooro/RakeSearch/releases/tag/v1.0

Any thoughts on which version is currently best for most 64bit machines?


Hi,
Most information about this optimized app is in my first post in this thread; please check it.

The new app is about 10 times faster than the original version (for the AVX2 version running on an Intel CPU). The app versions for older CPUs are slower, but still a lot faster than the original one; e.g. the SSE2 version is about 9 times faster.

It turned out that the AVX2+BMI2 app is slower than the AVX one on AMD Ryzen/Threadripper. To address this, I created a new AVX2 app that does not use the PEXT instruction. I have not received any feedback about its speed on AMD CPUs, so I do not know whether it is really faster there (on Intel it is a bit faster than the AVX app).
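
For readers unfamiliar with PEXT (a BMI2 instruction): it gathers the bits of a value selected by a mask and packs them into the low bits of the result. The Python function below is only a software model of that behaviour for illustration. How the optimized app actually uses PEXT is not described in this thread, and `pext` here is a hypothetical helper, not code from the app.

```python
def pext(value, mask):
    """Software model of PEXT: pack the bits of value selected by mask."""
    result = 0
    out_bit = 0
    while mask:
        lowest = mask & -mask      # isolate the lowest set bit of the mask
        if value & lowest:
            result |= 1 << out_bit
        out_bit += 1
        mask &= mask - 1           # clear that bit and move to the next one
    return result


if __name__ == "__main__":
    # Extract the high nibble of 0b10110110.
    print(bin(pext(0b10110110, 0b11110000)))
```

A build without PEXT has to get the same effect with ordinary shifts and masks, as the loop above does. On a CPU where the hardware instruction is slow, that can come out ahead.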

Some time ago the project admins announced that the current optimized app will be released as the official one. They are going to do this after finishing other planned tasks here. The optimized app can be installed manually as an "anonymous platform" app, so this is not their highest priority now.

I am still working on a new version of the optimized app. The x86 version is ready; I still have some work to do on the ARM versions.
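
For context on the "anonymous platform" mechanism: the BOINC client reads an app_info.xml file in the project directory that names the application, its executable file and a version number. The Python sketch below generates a minimal file of that shape. The element layout follows the general BOINC anonymous-platform format as I understand it, and the app name, file name and version number are placeholders, not the real RakeSearch values.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, exe_name, version_num):
    """Build a minimal anonymous-platform app_info.xml document as a string."""
    root = ET.Element("app_info")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Describe the executable file the client should use.
    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = exe_name
    ET.SubElement(file_info, "executable")

    # Tie the application name to that file and a version number.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "version_num").text = str(version_num)
    file_ref = ET.SubElement(version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = exe_name
    ET.SubElement(file_ref, "main_program")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All three arguments are hypothetical placeholders.
    print(build_app_info("example_app", "example_app_avx2", 100))
```

In practice the app_info.xml shipped with the optimized binaries should be used as-is; this sketch only shows what kind of information the file carries.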
Millenium

Joined: 27 Jun 18
Posts: 44
Credit: 9,724,885
RAC: 218
Message 462 - Posted: 1 Jul 2018, 18:17:40 UTC

Time for some feedback about AMD CPUs, then.

CPU: AMD Ryzen 1700
Motherboard: ASUS Prime B350 Plus, updated to BIOS version 806, AGESA 1.0.0.6a
RAM: one 8 GB DIMM at 2666 MHz
OS: Linux Kubuntu 18.04

I downloaded the apps from the first page and ran them; here are the results:

Rakesearch Linux 64 AVX:

Started RakeSearch test...
288.36user 0.00system 4:50.38elapsed 99%CPU (0avgtext+0avgdata 2984maxresident)k
1536inputs+40outputs (5major+149minor)pagefaults 0swaps


Rakesearch Linux 64 AVX2:

Started RakeSearch test...
353.64user 0.01system 5:55.66elapsed 99%CPU (0avgtext+0avgdata 3016maxresident)k
0inputs+48outputs (0major+131minor)pagefaults 0swaps

Note: it created a checkpoint.txt


Rakesearch Linux 64 AVX2 NOPEXT:

Started RakeSearch test...
278.79user 0.00system 4:40.80elapsed 99%CPU (0avgtext+0avgdata 3028maxresident)k
0inputs+40outputs (0major+148minor)pagefaults 0swaps


Rakesearch Linux 64 SSE2:

Started RakeSearch test...
298.48user 0.00system 5:00.50elapsed 99%CPU (0avgtext+0avgdata 3064maxresident)k
0inputs+40outputs (0major+146minor)pagefaults 0swaps


The Rakesearch Linux 64 AVX512 one does not work, of course, since this CPU does not have those instructions.
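
The four timings above can be compared directly. Here is a short Python sketch that parses the user and system fields from the `time` summary lines quoted in this post and ranks the builds by CPU seconds. The regular expression assumes the default GNU `time` output format seen here, and the function names are illustrative.

```python
import re

# Matches the "288.36user 0.00system" part of a GNU time summary line.
TIME_LINE = re.compile(
    r"(?P<user>\d+(?:\.\d+)?)user\s+(?P<system>\d+(?:\.\d+)?)system"
)

# Summary lines copied from the post above (truncated after %CPU).
RESULTS = {
    "AVX": "288.36user 0.00system 4:50.38elapsed 99%CPU",
    "AVX2": "353.64user 0.01system 5:55.66elapsed 99%CPU",
    "AVX2 NOPEXT": "278.79user 0.00system 4:40.80elapsed 99%CPU",
    "SSE2": "298.48user 0.00system 5:00.50elapsed 99%CPU",
}


def cpu_seconds(time_output):
    """Return user + system CPU seconds from a GNU time summary line."""
    match = TIME_LINE.search(time_output)
    if match is None:
        raise ValueError("no time summary found")
    return float(match.group("user")) + float(match.group("system"))


def rank(results):
    """Return (name, seconds, ratio to fastest) tuples, fastest first."""
    times = {name: cpu_seconds(line) for name, line in results.items()}
    best = min(times.values())
    rows = [(name, t, t / best) for name, t in times.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, seconds, ratio in rank(RESULTS):
        print(f"{name:12s} {seconds:8.2f} s  {ratio:.2f}x")
```

On these numbers the AVX2 NOPEXT build is the fastest on this Ryzen 1700 and the AVX2 build with PEXT is the slowest. This is a single run per build, so small differences should be read with care.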
Dark Angel

Joined: 2 Jun 18
Posts: 4
Credit: 3,666,080
RAC: 0
Message 503 - Posted: 28 Jul 2018, 2:13:11 UTC

I realise this is a low priority, but is there any chance of a Linux 32-bit SSE/SSE2 app, or at least instructions on how to build from source?
Dark Angel

Joined: 2 Jun 18
Posts: 4
Credit: 3,666,080
RAC: 0
Message 518 - Posted: 29 Jul 2018, 21:06:07 UTC

Thanks, I got it going. Just have to wait and see if everything validates. It cut 24-hour work units down to two hours or less.
Dark Angel

Joined: 2 Jun 18
Posts: 4
Credit: 3,666,080
RAC: 0
Message 521 - Posted: 30 Jul 2018, 4:29:08 UTC

If anyone wants the binaries: https://www.dropbox.com/s/qhp8c2vd5prcmy8/rakesearch_linux_i686_sse2.tar.gz?dl=0

Same app_info.xml as the other files. Compiled from Daniel's source code (I claim no credit) on kernel 3.13.0-151-generic, on a 4-core 32-bit system (Sossaman CPUs), with the latest BOINC from git.

I don't plan on leaving them up forever though, so grab 'em if you want 'em.
jozef j
Joined: 11 Sep 17
Posts: 34
Credit: 174,672,333
RAC: 49,159
Message 567 - Posted: 26 Aug 2018, 10:40:24 UTC

I see a nice increase in GigaFLOPS, to 43400.37, while in recent weeks we had 42xxx.

The AVX app is best for AMD CPUs (for now, in my tests).
Profile [AF>Amis des Lapins] Phil1966

Joined: 19 Oct 17
Posts: 3
Credit: 59,276,383
RAC: 15,729
Message 623 - Posted: 26 Oct 2018, 3:50:29 UTC
Last modified: 26 Oct 2018, 3:56:16 UTC

Hello!
I tried to run the optimized app on an Intel(R) Xeon(R) CPU E5-2667 v3, HT off, BOINC 7.14.2.
Once the app_info.xml is installed, it is impossible to receive any WUs,
and there is no specific error message.
Any idea?
Thank you,
Phil1966

EDIT: SOLVED :)
Profile [AF>Amis des Lapins] Phil1966

Joined: 19 Oct 17
Posts: 3
Credit: 59,276,383
RAC: 15,729
Message 624 - Posted: 26 Oct 2018, 7:57:40 UTC - in response to Message 623.  


26/10/2018 09:55:41 | Rake search of diagonal Latin squares | Sending scheduler request: To fetch work.
26/10/2018 09:55:41 | Rake search of diagonal Latin squares | Requesting new tasks for CPU
26/10/2018 09:55:42 | Rake search of diagonal Latin squares | Scheduler request completed: got 0 new tasks
26/10/2018 09:55:42 | Rake search of diagonal Latin squares | No tasks sent
26/10/2018 09:55:42 | Rake search of diagonal Latin squares | Message from server: Your app_info.xml file doesn't have a usable version of RakeSearch.
26/10/2018 09:55:42 | Rake search of diagonal Latin squares | This computer has finished a daily quota of 16 tasks


This is after having reinstalled the optimized AVX app for Windows 10 64-bit :(
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 361
Credit: 8,955,579
RAC: 5,116
Message 625 - Posted: 26 Oct 2018, 20:55:27 UTC - in response to Message 624.  

26/10/2018 09:55:42 | Rake search of diagonal Latin squares | Message from server: Your app_info.xml file doesn't have a usable version of RakeSearch.

This is after having reinstalled the optimized AVX app for Windows 10 64-bit :(

Hello! Is there an error now?
Profile [AF>Amis des Lapins] Phil1966

Joined: 19 Oct 17
Posts: 3
Credit: 59,276,383
RAC: 15,729
Message 636 - Posted: 9 Nov 2018, 16:49:58 UTC - in response to Message 625.  

Hello!
Sorry for the late reply.
I reinstalled and everything is OK :)
Thank you.

Any results with pairs in November? ;)
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 11 Aug 17
Posts: 361
Credit: 8,955,579
RAC: 5,116
Message 637 - Posted: 9 Nov 2018, 20:32:34 UTC - in response to Message 636.  

Any results with pairs in November ? ;)

Yes! Just uploaded. :)

©2019 The searchers team, Karelian Research Center of the Russian Academy of Sciences