Optimized RakeSearch app for rank 9 (computations finished)

JockMacMad TSBT

Send message
Joined: 30 Nov 18
Posts: 3
Credit: 2,450
RAC: 0
Message 653 - Posted: 2 Dec 2018, 0:28:41 UTC

Timings for a Cavium ThunderX 1 48-core ARM 64

Started RakeSearch test...
1093.05 user 0.05 system

18:15.04 elapsed 99%CPU (0avgtext+0avgdata 3248maxresident)k

16 inputs+64 outputs (1 major + 525 minor) pagefaults 0 swaps
Files result.txt and result-ok.txt are identical
ID: 653 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
hoarfrost
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Send message
Joined: 11 Aug 17
Posts: 624
Credit: 20,554,828
RAC: 7,971
Message 654 - Posted: 2 Dec 2018, 7:23:57 UTC - in response to Message 653.  

Thank you for the report!
ID: 654 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Stephen Uitti

Send message
Joined: 12 Nov 17
Posts: 7
Credit: 6,461,078
RAC: 0
Message 660 - Posted: 8 Dec 2018, 15:18:57 UTC

I recently got a new Pi 3 and couldn't recall how to install the app. Google got me to my own instructions here. I had forgotten the restart. I now have the instructions saved locally. Apparently, I use the NEON version on both the Pi 2 and the Pi 3.

My notes suggest that the NEON version may be 10% faster than the other one on the Pi 3. That's within experimental error.

I don't have a comparison on the Pi 2. I only have NEON times. Weird. I also started routinely overclocking the Pi 2 about then.
The overclocked Pi 2 comes out 23% faster than the Pi 3. Also weird.

Stephen.
ID: 660 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[SG]Felix

Send message
Joined: 14 Dec 17
Posts: 11
Credit: 3,282,877
RAC: 959
Message 674 - Posted: 26 Dec 2018, 16:08:10 UTC

Just a quick question.
I am explaining the installation of this app to a few other people, and I found out that the standard BOINC path in Windows has changed.
Maybe you could update this.
ID: 674 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
M0CZY
Avatar

Send message
Joined: 24 Aug 18
Posts: 6
Credit: 104,687
RAC: 26
Message 691 - Posted: 12 Jan 2019, 8:25:56 UTC
Last modified: 12 Jan 2019, 8:42:25 UTC

I looked at the list of optimized apps, but I couldn't see the one that my computer needs, which is Linux 32-bit with SSE2.
Whereabouts on the page is it?

Can one be built for me please?
ID: 691 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile [B@P] Daniel

Send message
Joined: 8 Sep 17
Posts: 99
Credit: 402,603,726
RAC: 0
Message 696 - Posted: 15 Jan 2019, 21:07:55 UTC - in response to Message 691.  

I looked at the list of optimized apps, but I couldn't see the one that my computer needs, which is Linux 32-bit with SSE2.
Whereabouts on the page is it?

Can one be built for me please?

Hi,
I am working on a new version of my optimized app. I am going to release it before the end of January. I will add 32-bit Linux versions together with the other ones.
ID: 696 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Millenium

Send message
Joined: 27 Jun 18
Posts: 47
Credit: 9,875,775
RAC: 0
Message 703 - Posted: 27 Jan 2019, 19:22:39 UTC

I'm curious to hear more about that and, once it's ready, to benchmark it! Keep up the good work!
ID: 703 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
jozef j
Avatar

Send message
Joined: 11 Sep 17
Posts: 51
Credit: 193,013,442
RAC: 1,960
Message 707 - Posted: 30 Jan 2019, 14:24:53 UTC

Hi Daniel, how is the new app coming along? Can't wait. :)
I use just AVX, which is still the best on Win 10 with a THR 2 CPU. I experimented with CorePrio last month, but for BOINC there is no improvement; maybe there is for gaming or web use. I also tried Cinebench and other tests from HWBOT, but the difference was only 100-200 points in Cinebench, for example.
Thank you very much for your work.
ID: 707 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile [B@P] Daniel

Send message
Joined: 8 Sep 17
Posts: 99
Credit: 402,603,726
RAC: 0
Message 711 - Posted: 30 Jan 2019, 21:14:34 UTC - in response to Message 707.  

Hi Daniel, how is the new app coming along? Can't wait. :)
I use just AVX, which is still the best on Win 10 with a THR 2 CPU. I experimented with CorePrio last month, but for BOINC there is no improvement; maybe there is for gaming or web use. I also tried Cinebench and other tests from HWBOT, but the difference was only 100-200 points in Cinebench, for example.
Thank you very much for your work.

Things look good; most things are ready now. If everything goes well, I will release the new version this week.
ID: 711 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile [B@P] Daniel

Send message
Joined: 8 Sep 17
Posts: 99
Credit: 402,603,726
RAC: 0
Message 715 - Posted: 3 Feb 2019, 14:32:17 UTC

Hi all,
I have (finally!) released a new version of my optimized RakeSearch app, Opti v1.1. It can be downloaded from here: https://github.com/sirzooro/RakeSearch/releases/tag/v1.1. The installation instructions are the same as before, so please refer to the 1st post in this thread for details.

There are 4 versions available as before: SSE2, AVX, AVX2 and AVX512. There are also apps for 32-bit Windows and Linux.

For comparison, here are the results for the previous (Opti v1.0) version:

SSE2:
real    6m2.431s
user    6m0.451s
sys     0m0.030s

AVX:
real    5m45.740s
user    5m43.759s
sys     0m0.026s

AVX2:
real    5m24.624s
user    5m22.626s
sys     0m0.042s


And these are the results for the new version:

SSE2:
real    4m14.850s
user    4m12.858s
sys     0m0.047s

AVX:
real    3m58.809s
user    3m56.813s
sys     0m0.035s

AVX2:
real    3m51.881s
user    3m49.885s
sys     0m0.040s


As you can see, the new app version is about 30% faster than the previous one.
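The "about 30%" figure can be checked from the `real` times quoted above. The following is a minimal sketch, not part of the original post: the dictionary and function names are made up for illustration, and "faster" is read here as the fraction of wall-clock time saved.

```python
# "real" timings copied from the post above, as (minutes, seconds).
OLD = {"SSE2": (6, 2.431), "AVX": (5, 45.740), "AVX2": (5, 24.624)}   # Opti v1.0
NEW = {"SSE2": (4, 14.850), "AVX": (3, 58.809), "AVX2": (3, 51.881)}  # Opti v1.1


def seconds(minutes, secs):
    """Convert a (minutes, seconds) pair to seconds."""
    return minutes * 60 + secs


def time_reduction(old, new):
    """Fraction of the old run time that the new version saves."""
    return 1.0 - seconds(*new) / seconds(*old)


# Hypothetical helper table: variant name -> fraction of time saved.
reductions = {name: time_reduction(OLD[name], NEW[name]) for name in OLD}

for name, r in reductions.items():
    print(f"{name}: {r:.1%} less wall-clock time than Opti v1.0")
```

All three variants land within a couple of percentage points of 30%, which matches the claim in the post.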

The new AVX2 app does not use the PEXT instruction. This means that on AMD Ryzen and Threadripper this app version will also be faster than the AVX one.
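For readers unfamiliar with PEXT (parallel bit extract, part of the BMI2 extension): it gathers the bits of a value selected by a mask and packs them into the low bits of the result. The sketch below only illustrates what the instruction computes. It is a plain Python emulation written for this note and says nothing about how the optimized app replaces it.

```python
def pext(value, mask):
    """Software emulation of PEXT: pack the bits of `value` selected by `mask`
    into the low-order bits of the result, preserving their order."""
    result = 0
    out_bit = 0
    while mask:
        lowest = mask & -mask          # isolate the lowest set bit of the mask
        if value & lowest:
            result |= 1 << out_bit     # copy that bit to the next output position
        out_bit += 1
        mask &= mask - 1               # clear the bit just processed
    return result


# The mask selects bits 4..7; those bits of the value are 1011.
print(bin(pext(0b10110110, 0b11110000)))
```

A loop like this does one step per set mask bit, which gives a feel for why a slow hardware implementation of the instruction can hurt in a hot inner loop.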

I also changed the AVX512 app version a bit: it now uses the new instructions only on the old (SSE/AVX) registers. This means that it will not suffer from the CPU frequency throttling associated with the new AVX512 registers. I have some results captured for this app version. Unfortunately, this machine had other things running on it, so the results look high in comparison to the ones above. Anyway, you can see that the AVX512 app is the fastest:

SSE2:
real    6m47,701s
user    0m0,000s
sys     0m0,046s

AVX:
real    6m22,165s
user    0m0,015s
sys     0m0,062s

AVX2:
real    6m14,271s
user    0m0,000s
sys     0m0,031s

AVX512:
real    6m12,458s
user    0m0,000s
sys     0m0,062s
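Note that this second set of timings uses a comma as the decimal separator, so it cannot be compared directly with the earlier numbers without some parsing. A small sketch that accepts both forms and compares each variant against SSE2 is given below. The function and variable names are hypothetical, and only the `real` values from the post are used.

```python
import re

# Matches strings such as "6m47,701s" or "6m2.431s" (comma or dot as decimal separator).
_TIME_RE = re.compile(r"(\d+)m(\d+)[.,](\d+)s")


def parse_time(text):
    """Convert a time string like '6m47,701s' to seconds."""
    match = _TIME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, whole, frac = match.groups()
    return int(minutes) * 60 + float(f"{whole}.{frac}")


# "real" values copied from the post above.
REAL = {"SSE2": "6m47,701s", "AVX": "6m22,165s", "AVX2": "6m14,271s", "AVX512": "6m12,458s"}

secs = {name: parse_time(value) for name, value in REAL.items()}
baseline = secs["SSE2"]

for name, s in secs.items():
    print(f"{name}: {s:.3f} s ({1 - s / baseline:.1%} less time than SSE2)")
```

On these numbers the gap between AVX2 and AVX512 is under two seconds, so on a machine that was busy with other work it should be read as indicative only.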


I have not created ARM/AARCH64 app versions yet; I am going to release them soon.
ID: 715 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Natalia
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Send message
Joined: 11 Aug 17
Posts: 103
Credit: 1,973,929
RAC: 15
Message 716 - Posted: 3 Feb 2019, 15:05:58 UTC

Thank you very much, Daniel! It is a pleasure to see your interest in the project, and we appreciate how much effort you are contributing.
ID: 716 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
pututu

Send message
Joined: 1 Apr 18
Posts: 6
Credit: 19,413,439
RAC: 73
Message 717 - Posted: 3 Feb 2019, 18:25:39 UTC

Thanks Daniel! Great job.
ID: 717 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
JugNut

Send message
Joined: 6 Jan 18
Posts: 7
Credit: 16,825,117
RAC: 9,311
Message 719 - Posted: 4 Feb 2019, 19:14:52 UTC - in response to Message 718.  
Last modified: 4 Feb 2019, 19:19:16 UTC

Yea great work Daniel thanks again :)
ID: 719 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Beyond
Avatar

Send message
Joined: 20 Mar 18
Posts: 6
Credit: 60,978,227
RAC: 60
Message 720 - Posted: 4 Feb 2019, 19:49:34 UTC

Thanks Daniel for the updates! Have the AVX running on a couple older machines and the AVX2 on a Ryzen. Unfortunately Panda AV flagged both exe files on all 3 machines as a virus (and deleted them) and I had to exclude them to get them to run. Never had this happen with the older versions. Is it possible to clue in the Panda people concerning this?
ID: 720 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile [B@P] Daniel

Send message
Joined: 8 Sep 17
Posts: 99
Credit: 402,603,726
RAC: 0
Message 722 - Posted: 4 Feb 2019, 20:27:12 UTC - in response to Message 720.  

Thanks Daniel for the updates! Have the AVX running on a couple older machines and the AVX2 on a Ryzen. Unfortunately Panda AV flagged both exe files on all 3 machines as a virus (and deleted them) and I had to exclude them to get them to run. Never had this happen with the older versions. Is it possible to clue in the Panda people concerning this?

Hmm, interesting. I scanned them with VirusTotal, which checks a file with 69 scanners (including Panda), and they are clean. MetaDefender (37 scanners) also confirmed this. Anyway, I sent info about this false positive to Panda.
ID: 722 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Conan
Avatar

Send message
Joined: 8 Sep 17
Posts: 44
Credit: 10,377,099
RAC: 3,219
Message 725 - Posted: 5 Feb 2019, 0:06:04 UTC

Thanks Daniel for your efforts in getting this out there.

However I seem to have a problem with your new programme, all I get is

<core_client_version>7.4.25</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
SIGILL: illegal instruction
Stack trace (10 frames):
[0x411740]
[0x4f0b10]
[0x40c282]
[0x409d6c]
[0x40cdc4]
[0x4063dd]
[0x404f5c]
[0x4f21a4]
[0x4f2421]
[0x4058b6]

Exiting...

</stderr_txt>

This is trashing a heap of work units. I re-downloaded and re-installed the files, but this made no difference.
I noticed that the line <platform>xxxx</platform> had been added to the app_info.xml file, so I removed it to see if that made a difference, but alas it did not.
So I will re-install your older version and see if everything starts working again.

Thanks
Conan
ID: 725 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
mmonnin

Send message
Joined: 8 Sep 17
Posts: 22
Credit: 18,374,830
RAC: 11,948
Message 729 - Posted: 5 Feb 2019, 0:41:46 UTC - in response to Message 725.  

Thanks Daniel for your efforts in getting this out there.

However I seem to have a problem with your new programme, all I get is

<core_client_version>7.4.25</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
SIGILL: illegal instruction
Stack trace (10 frames):
[0x411740]
[0x4f0b10]
[0x40c282]
[0x409d6c]
[0x40cdc4]
[0x4063dd]
[0x404f5c]
[0x4f21a4]
[0x4f2421]
[0x4058b6]

Exiting...

</stderr_txt>

This is trashing a heap of work units. I re-downloaded and re-installed the files, but this made no difference.
I noticed that the line <platform>xxxx</platform> had been added to the app_info.xml file, so I removed it to see if that made a difference, but alas it did not.
So I will re-install your older version and see if everything starts working again.

Thanks
Conan


Which app was used? The 955 doesn't have AVX so the SSE app would be required.
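A SIGILL like the one above is what you would expect when an app variant uses instructions the CPU lacks. On Linux, one way to check before installing is to read the CPU flags from /proc/cpuinfo. The sketch below is only an illustration: the function names are hypothetical, and the exact flags required by the AVX512 build are an assumption (`avx512f` plus `avx512vl`), not something stated in this thread.

```python
def pick_variant(flags):
    """Return the most capable app variant supported by the given CPU flags,
    or None if not even SSE2 is available."""
    flags = set(flags)
    # Assumption: the AVX512 build needs at least these two feature flags.
    if {"avx512f", "avx512vl"} <= flags:
        return "AVX512"
    if "avx2" in flags:
        return "AVX2"
    if "avx" in flags:
        return "AVX"
    if "sse2" in flags:
        return "SSE2"
    return None


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the x86 feature flags of the first CPU listed in /proc/cpuinfo."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1].split()
    except OSError:
        pass  # not Linux, or the file is not readable
    return []


if __name__ == "__main__":
    print("Suggested variant:", pick_variant(read_cpu_flags()))
```

For a CPU that reports `sse2` but not `avx`, this returns "SSE2", which is the case described in this reply.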

The exe name is the same, so these should be a drop-in replacement.

Thanks for the great work Daniel.
ID: 729 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote

©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences