Optimized RakeSearch app for rank 9 (computations finished)
Author | Message |
---|---|
Send message Joined: 12 Nov 17 Posts: 7 Credit: 6,461,078 RAC: 0 |
Thanks Daniel. I grep'ed for sse2 on the Phenom, didn't think to grep for neon on the Arms. It turns out that both the Pi 2 and the Pi 3 ARM processors support NEON. Both systems have completed units and have gotten credit for NEON units. Pi Zeros don't work with the accelerated apps; they error out right away. (I've turned them off.) One Zero was running Jessie and the other Stretch, but I'm sure it's the processor, not the OS. I've verified that the AMD A8 is in fact running the AVX accelerated app successfully. It's about 20% slower than the Phenom II, which doesn't have AVX and is running SSE2. It's not unusual for the A8 to run 20% faster or 20% slower than the Phenom II on different apps or benchmarks. I might try the SSE2 app on the A8. I time these by pasting the stats of 20 valid units into a spreadsheet and averaging. Stephen. |
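The grep check described above can also be scripted. The following is only a minimal sketch, assuming a Linux system where `/proc/cpuinfo` lists CPU features on a `flags` line (x86) or a `Features` line (32-bit ARM); the function names `cpu_features` and `supported` are made up for illustration and are not part of RakeSearch or BOINC.

```python
def cpu_features(cpuinfo_text):
    """Collect feature names from 'flags' (x86) or 'Features' (ARM) lines."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            features.update(value.split())
    return features


def supported(cpuinfo_text, wanted=("sse2", "avx", "avx2", "bmi2", "neon")):
    """Map each wanted feature name to True/False."""
    found = cpu_features(cpuinfo_text)
    return {name: name in found for name in wanted}


if __name__ == "__main__":
    # Assumption: Linux. On other systems the file is missing and
    # every feature is simply reported as "no".
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""
    for name, ok in supported(text).items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

Note that 64-bit ARM kernels usually report `asimd` rather than `neon`, so the list of wanted names may need adjusting there.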
Send message Joined: 30 Nov 17 Posts: 10 Credit: 10,045,366 RAC: 0 |
We understood with the first post ... |
Send message Joined: 8 Sep 17 Posts: 99 Credit: 402,603,726 RAC: 0 |
Thanks Daniel. I grep'ed for sse2 on the Phenom, didn't think to grep for neon on the Arms. Thanks for the info. The ARM app on the Pi Zero crashed after receiving signal 4, that is SIGILL, illegal instruction. It looks like there should be a separate app for ARMv6, or the non-NEON one should have some instruction sets disabled. I will look into this when I find some free time. |
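For anyone who wants to confirm the same kind of crash when running an app by hand, here is a small sketch, assuming a POSIX system, where Python's `subprocess` reports a child killed by signal N as return code -N. The helper name `describe_exit` is hypothetical, and the demo child just sends SIGILL to itself instead of running the real app.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Stand-in for the real binary: a child process that raises SIGILL
    # on itself, which is what an unsupported instruction would trigger.
    child = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGILL)"]
    result = subprocess.run(child)
    print(describe_exit(result.returncode))
```

Replacing the `child` command with the path to an actual app binary would show whether it dies with SIGILL on a given board.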
Send message Joined: 20 Mar 18 Posts: 6 Credit: 60,978,227 RAC: 60 |
Hi Daniel, thanks for working on this! To clarify/summarize this long thread: What's the current status/version of the optimized app, how much faster is it than the stock version, and approximately when is it due to be made the default? Repository for optimized versions: https://github.com/sirzooro/RakeSearch/releases/tag/v1.0 Any thoughts on which version is currently best for most 64-bit machines? |
Send message Joined: 20 Mar 18 Posts: 6 Credit: 60,978,227 RAC: 60 |
Double post due to reported "gateway timeout"... |
Send message Joined: 8 Sep 17 Posts: 99 Credit: 402,603,726 RAC: 0 |
Hi Daniel, thanks for working on this! To clarify/summarize this long thread: Hi, Most information about this optimized app is provided in my first post in this thread; please check it. The new app is about 10 times faster than the original version (for the AVX2 version running on an Intel CPU). Other app versions for older CPUs are slower, but still a lot faster than the original one, e.g. the SSE2 version is about 9 times faster. It turned out that the AVX2+BMI2 app on AMD Ryzen/Threadripper is slower than the AVX one. I created a new AVX2 app without the PEXT instruction to address this. I have not received any feedback about its speed on AMD CPUs, so I do not know if it is really faster there (on Intel it is a bit faster than the AVX app). Some time ago the project admins announced that the current optimized app will be released as the official one. They are going to do this after finishing other planned tasks here. The optimized app can be installed manually as an "anonymous platform", so this is not the highest priority for them now. I am still working on a new version of the optimized app. The x86 app version is ready; I still have some work to do on the ARM versions. |
Send message Joined: 27 Jun 18 Posts: 47 Credit: 9,875,775 RAC: 0 |
Then it's time for some feedback about AMD CPUs. CPU: AMD Ryzen 1700; Motherboard: ASUS Prime B350 Plus, updated to BIOS version 806, AGESA 1.0.0.6a; RAM: one 8 GB DIMM at 2666 MHz; OS: Linux Kubuntu 18.04. I downloaded the apps from the first page and ran them; here are the results: Rakesearch Linux 64 AVX: Started RakeSearch test... 288.36user 0.00system 4:50.38elapsed 99%CPU (0avgtext+0avgdata 2984maxresident)k 1536inputs+40outputs (5major+149minor)pagefaults 0swaps Rakesearch Linux 64 AVX2: Started RakeSearch test... 353.64user 0.01system 5:55.66elapsed 99%CPU (0avgtext+0avgdata 3016maxresident)k 0inputs+48outputs (0major+131minor)pagefaults 0swaps Note: it created a checkpoint.txt Rakesearch Linux 64 AVX2 NOPEXT: Started RakeSearch test... 278.79user 0.00system 4:40.80elapsed 99%CPU (0avgtext+0avgdata 3028maxresident)k 0inputs+40outputs (0major+148minor)pagefaults 0swaps Rakesearch Linux 64 SSE2: Started RakeSearch test... 298.48user 0.00system 5:00.50elapsed 99%CPU (0avgtext+0avgdata 3064maxresident)k 0inputs+40outputs (0major+146minor)pagefaults 0swaps The Rakesearch Linux 64 AVX512 one does not work, of course, since that CPU does not have those instructions. |
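To make the four timings above easier to compare, here is a short sketch that converts the `elapsed` values from the `time` output into seconds and ranks the builds. The elapsed strings are copied from the post; the helper names `to_seconds` and `rank` are only illustrative.

```python
# Elapsed wall-clock times copied from the `time` output in the post above.
RESULTS = {
    "AVX": "4:50.38",
    "AVX2": "5:55.66",
    "AVX2 NOPEXT": "4:40.80",
    "SSE2": "5:00.50",
}


def to_seconds(elapsed):
    """Convert 'm:ss.ss' or 'h:mm:ss' (as printed by time) to seconds."""
    total = 0.0
    for part in elapsed.split(":"):
        total = total * 60 + float(part)
    return total


def rank(results):
    """Return (name, seconds, ratio to fastest) tuples, fastest first."""
    secs = {name: to_seconds(t) for name, t in results.items()}
    best = min(secs.values())
    rows = [(name, s, s / best) for name, s in secs.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, seconds, ratio in rank(RESULTS):
        print(f"{name:12s} {seconds:7.2f} s  x{ratio:.2f}")
```

On this single Ryzen 1700 run, AVX2 NOPEXT is the fastest; AVX, SSE2, and the plain AVX2 build took roughly 3%, 7%, and 27% longer, respectively. One run per build is a small sample, so the smaller gaps may be within noise.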
Send message Joined: 27 Jun 18 Posts: 47 Credit: 9,875,775 RAC: 0 |
Double post, sorry. |
Send message Joined: 2 Jun 18 Posts: 6 Credit: 3,670,795 RAC: 0 |
I realise this is a low priority, but is there any chance of a Linux 32-bit SSE/SSE2 app, or at least instructions on how to build from source? |
Send message Joined: 2 Jun 18 Posts: 6 Credit: 3,670,795 RAC: 0 |
Thanks, I got it going. Just have to wait and see if everything validates. It cut 24-hour work units down to two hours or less. |
Send message Joined: 2 Jun 18 Posts: 6 Credit: 3,670,795 RAC: 0 |
If anyone wants the binaries: https://www.dropbox.com/s/qhp8c2vd5prcmy8/rakesearch_linux_i686_sse2.tar.gz?dl=0 Same app_info.xml as the other files. Built from Daniel's source code (I claim no credit) on kernel 3.13.0-151-generic, on a 4-core 32-bit system (Sossaman CPUs), with the latest BOINC from git. I don't plan on leaving them up forever though, so grab 'em if you want 'em. |
Send message Joined: 11 Sep 17 Posts: 51 Credit: 194,406,895 RAC: 2,340 |
I see a nice increase in GigaFLOPS to 43400.37, while for weeks we had 42xxx. The AVX app is the best for AMD CPUs, for now, in my tests. |
Send message Joined: 19 Oct 17 Posts: 3 Credit: 62,552,930 RAC: 16 |
Hello! I tried to run the optimized app on an Intel(R) Xeon(R) CPU E5-2667 v3, HT off, BOINC 7.14.2. Once the app_info is installed, it is impossible to receive any WU, and there is no specific error message. Any ideas? Thank you, Phil1966. EDIT: SOLVED :) |
Send message Joined: 19 Oct 17 Posts: 3 Credit: 62,552,930 RAC: 16 |
After having reinstalled the optimized AVX app for Windows 10 64-bit :( |
Send message Joined: 11 Aug 17 Posts: 648 Credit: 22,550,169 RAC: 13,564 |
26/10/2018 09:55:42 | Rake search of diagonal Latin squares | Message from server: Your app_info.xml file doesn't have a usable version of RakeSearch. Hello! Is there an error now? |
Send message Joined: 19 Oct 17 Posts: 3 Credit: 62,552,930 RAC: 16 |
Hello! Sorry for the late reply. I reinstalled and everything is OK :) Thank you. Any results with pairs in November? ;) |
Send message Joined: 11 Aug 17 Posts: 648 Credit: 22,550,169 RAC: 13,564 |
Any results with pairs in November? ;) Yes! Just uploaded. :) |
©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences