41)
Questions and Answers :
Web site :
Cannot display tak list
(Message 308)
Posted 28 Jan 2018 by [B@P] Daniel Post: This bug is now fixed in BOINC; you can selectively apply the fix for it or upgrade everything to the latest version.
42)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 306)
Posted 27 Jan 2018 by [B@P] Daniel Post: "is the optimized app now part of the official package ?" Not yet, but it is planned. BTW, I am going to release a new optimized app version soon. Stay tuned!
43)
Message boards :
Science :
Source code of the project application
(Message 293)
Posted 14 Jan 2018 by [B@P] Daniel Post: Thanks for the answers! Hello! Do you have any plans to start using non-negative values of keyValue in the future? I wonder if you could use a longer path prefix instead and remove this field completely. 2. Can I assume that every new WU will have Generator::cellId set to 0 at the beginning? I asked these two questions because I wonder if I can assume a "clean slate" state at the beginning, i.e. only the bits corresponding to cells in the constant path prefix are changed, and all others would be changed by the app during WU processing. I would like to use cellsHistory to store cell value candidates in a similar way to MovePairSearch::MoveRows. If the cells on the path were already partially processed in a new WU, this would complicate things for me. 4. How do you check if a result is valid: do you check whether files are binary identical, or examine their contents more closely? I wonder what will happen if for some WU the app finds two square pairs but reports them in a different order than now. Will the server be able to handle this? Or do I need to sort the pairs in such a case to make the result "canonical"? OK, so the GPU app would have to take care of this (I have started working on it). It would run a few hundred generators (or maybe thousands?), each of them processing its part of the search space. They would run in parallel, so the order of results is no longer predetermined. One more question: do you always generate WUs with all values on the diagonals set? I checked a few WUs and noticed this. I wanted to optimize Generator::Start by processing diagonal elements before non-diagonal ones, saving some cycles spent on the 'primary' and 'secondary' variables in the latter part, but now I wonder if this part of the code could be eliminated completely. Yes, this is what I meant. Generator is the 2nd most time-consuming function (~40% of total time), so eliminating these checks would speed up everything.
44)
Message boards :
Science :
Source code of the project application
(Message 290)
Posted 14 Jan 2018 by [B@P] Daniel Post: One more question: do you always generate WUs with all values on the diagonals set? I checked a few WUs and noticed this. I wanted to optimize Generator::Start by processing diagonal elements before non-diagonal ones, saving some cycles spent on the 'primary' and 'secondary' variables in the latter part, but now I wonder if this part of the code could be eliminated completely.
45)
Message boards :
Science :
Source code of the project application
(Message 288)
Posted 13 Jan 2018 by [B@P] Daniel Post: Hi, I have a few questions about things which are not clear to me:
1. Is the Generator::keyValue field needed? I checked a few WUs and it was always set to -1 there. I wonder if this field and the code which uses it can be removed.
2. Can I assume that every new WU will have Generator::cellId set to 0 at the beginning?
3. Can I assume that at the beginning of every new WU, every cell in Generator::path will have all corresponding values (bits) in Generator::cellsHistory set to 1 (i.e. all numbers from 0 to 8 can potentially be used for the cell)?
4. How do you check if a result is valid: do you check whether files are binary identical, or examine their contents more closely? I wonder what will happen if for some WU the app finds two square pairs but reports them in a different order than now. Will the server be able to handle this? Or do I need to sort the pairs in such a case to make the result "canonical"?
46)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 285)
Posted 9 Jan 2018 by [B@P] Daniel Post: "I run tests again several times with offset set the same for AVX2 and AVX512 to 0 =4300MHz and the results are:" Thanks! These results look reasonable; I was expecting something like this. Real WUs are about 6 times longer, so with AVX512 computations would complete about 40 seconds faster. A PC running 24/7 would be able to complete 5 more WUs per core per day.
47)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 281)
Posted 8 Jan 2018 by [B@P] Daniel Post: "Sure, results for AVX2 and AVX512 on i9-7920X (offset for AVX2 is set to 4GHz, for AVX512 is set to 3.8GHz) under Windows 10:" Thanks for the results. This is interesting; I thought the AVX512 version would be a bit faster. I wonder if it is really slower, or if this was just random variation in execution time. If you execute the test a few times (e.g. 3 times), you will see that the numbers are different each time. CPU load also influences the results. Could you repeat these tests a few times with BOINC suspended to confirm whether the AVX512 version is really slower instead of faster?
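One hedged way to compare repeated runs is to take the median elapsed time per app version, since a single run can be skewed by CPU load. The sketch below assumes the elapsed times have already been collected in seconds, one per line; `median` is a hypothetical helper name and the three sample values are made up for illustration, not measurements from this thread.

```shell
#!/bin/sh
# Print the median of the numbers read from standard input, one per line.
# "median" is a hypothetical helper name, not part of the RakeSearch test kit.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit 1                      # no input: report failure
            if (NR % 2) print v[(NR + 1) / 2]        # odd count: middle value
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2   # even count: mean of the two middle values
        }'
}

# Three made-up elapsed times in seconds (illustrative only).
printf '%s\n' 101.8 99.4 100.2 | median
```

Running the same helper on the times of each app version gives one number per version that is less sensitive to a single slow or fast run than any individual measurement.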
48)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 280)
Posted 8 Jan 2018 by [B@P] Daniel Post: "I downloaded rakesearch_linux_arm_v7l.tgz from github." Good to hear that! If you want to check whether your RPI supports NEON, please execute the following command. If it prints something, your CPU supports NEON instructions.
grep 'neon\|asimd' /proc/cpuinfo | head -1
"I'm running the rakesearch_linux_64_sse2.tgz version on an AMD Phenom (running Linux Mint 13). It's not young enough to support AVX. I also have an AMD A8 also on Mint 13, which does have AVX. I haven't attempted to run that as yet." Well, most of this complicated stuff is done by the compiler :). I had to find the proper intrinsics to do what I need, and this was the most complicated part for me. Besides this, things are similar to SSE/AVX programming :)
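The same check can be wrapped in a small script that answers yes or no instead of printing the matching line. This is a minimal sketch: `has_neon` is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a saved copy of /proc/cpuinfo.

```shell
#!/bin/sh
# Report whether a cpuinfo-style file lists the "neon" (32-bit ARM) or
# "asimd" (AArch64) feature. has_neon is a hypothetical helper name;
# the file argument defaults to /proc/cpuinfo.
has_neon() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # -q suppresses output; the exit status alone tells us if a line matched.
    if grep -E -q 'neon|asimd' "$cpuinfo_file" 2>/dev/null; then
        echo "yes"
    else
        echo "no"
    fi
}

echo "NEON supported: $(has_neon)"
```

On a machine without /proc/cpuinfo, or on a non-ARM CPU, the script simply prints "no".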
49)
Message boards :
News :
RakeSearch project technical update 2018-01-06
(Message 274)
Posted 7 Jan 2018 by [B@P] Daniel Post: "Minor update at 2018-01-07." Please modify this a bit, to add new tasks when there are fewer than 1000 tasks in the queue. That way some tasks should always be in the queue.
50)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 273)
Posted 7 Jan 2018 by [B@P] Daniel Post: "Hi, I am trying running AVX512 app and it is not triggering my AVX512 offset set in BIOS on my i9-7920X cpu, it runs with offset for AVX2. Is this app really using AVX512?" The answer is more complicated. In its most performance-critical place, this app version uses a new AVX512 instruction which works on the old AVX registers. Besides this, there are some places where memory blocks are copied, which uses AVX512 registers. However, these copies are made rarely. This matches what you are observing. BTW, could you test the performance of the various app versions on your machine? In the post linked below I wrote short instructions on how to do this. I am mainly interested in how the AVX512 version compares with the AVX2 one; I do not have any hardware to run such a benchmark. http://rake.boincfast.ru/rakesearch/forum_thread.php?id=39&postid=237
51)
Message boards :
Number crunching :
ARM chip support: Raspberry Pi (Linux/Raspbian) or Android
(Message 269)
Posted 31 Dec 2017 by [B@P] Daniel Post: Hello Brian! Please check the Optimized RakeSearch app thread here. I have released ARM apps there (for v7l CPUs, with and without NEON, and one for AARCH64). Let me know if they work for you, or if you need one for some older CPU.
52)
Questions and Answers :
Web site :
Cannot display tak list
(Message 257)
Posted 21 Dec 2017 by [B@P] Daniel Post: "This memory allocation error has been happening on quite a few projects for quite a few years." Thanks for the info. I checked the list of open issues in the BOINC GitHub repository and found no open issue for this, so I logged a new one: https://github.com/BOINC/boinc/issues/2277.
53)
Questions and Answers :
Web site :
Cannot display tak list
(Message 253)
Posted 20 Dec 2017 by [B@P] Daniel Post: "I guess the system need an automatic process to purge the older WUs from this page ..." It may be surprising, but it is enough to load and process 20 entries at once, plus one or a few extra ones to display the various totals shown there. As far as I can tell from the BOINC server code, it should already work this way. Something strange is going on there. And yes, periodic removal of old WUs would help keep the server load low. I read somewhere that the BOINC server comes with a script which can do this, so there is no need to reinvent the wheel. Edit: BTW, there are people who have a higher RAC than me :) http://rake.boincfast.ru/rakesearch/top_users.php
54)
Questions and Answers :
Web site :
Cannot display tak list
(Message 249)
Posted 17 Dec 2017 by [B@P] Daniel Post: Hi, I cannot display the task list. I get the following error:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2 bytes) in .../html/inc/db_conn.inc on line 125
It is a bit surprising that 128MB is not enough there. I checked the original code and it loads only 20 results from the DB at once. If you did not make any changes in this area, please log a bug at https://github.com/BOINC/boinc/issues saying that something else consumes lots of memory.
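For reference, the 128MB figure follows directly from the byte count in the error message. A minimal sketch of the conversion, where `bytes_to_mib` is a hypothetical helper used only for this illustration:

```shell
#!/bin/sh
# Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
# bytes_to_mib is a hypothetical helper, not part of BOINC or PHP.
bytes_to_mib() {
    echo "$(( $1 / 1024 / 1024 ))"
}

bytes_to_mib 134217728   # prints 128
```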
55)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 248)
Posted 17 Dec 2017 by [B@P] Daniel Post: "On my Odroid-XU4 i get this error:" I have rebuilt the ARM apps, and this lib is now linked statically. Please download the new app; it should work now.
56)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 244)
Posted 15 Dec 2017 by [B@P] Daniel Post: "Hi Daniel, and Anyone Else involved in this Optimized App," This code is specific to this project, so it cannot be integrated directly into other projects or BOINC itself. However, other projects may review all the changes I made, get familiar with the optimization techniques I used, and then apply them to their apps. The only one I wonder about is the ODLK project; it also works with Latin squares. Maybe it could integrate some code directly.
57)
Message boards :
News :
New badges for total credit!
(Message 240)
Posted 12 Dec 2017 by [B@P] Daniel Post: I am also OK with removing these badges. When you do this, I will have more motivation to get them back and make the app even faster :)
58)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 237)
Posted 11 Dec 2017 by [B@P] Daniel Post: I have fixed the avx2nopext app for Windows; please try it again. The Linux version was fine. I also added a NEON app version for ARM CPUs. It is about 22% faster than the non-NEON one. Before installing it, please check that your device supports NEON instructions: open the /proc/cpuinfo file and check whether there is "neon" in the "Features" line.
ARM: real 20m37.322s user 20m35.665s sys 0m0.155s
ARM+NEON: real 15m58.774s user 15m57.060s sys 0m0.080s
Edit: I have added the test.tgz archive, which contains the files needed to run a benchmark test. To use it, unpack the archive somewhere, copy the rakesearch file to the same dir and run the test.sh script. It is also possible to test Windows apps. You need to install Cygwin and then follow the steps above. Please do not rename rakesearch.exe to rakesearch; Cygwin will be able to run it as-is. Note: for some reason Cygwin now displays 0.000 as the user time, which is incorrect. It used to work properly when I was using Win7; I suspect that Win10 broke this. Please post your results. I am especially interested in how the AVX512 app compares with the other app versions.
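The "about 22%" figure can be reproduced from the two "real" times above as the reduction in run time. The sketch below is one way to do the arithmetic; `to_seconds` and `time_saved_percent` are hypothetical helper names, and the input format is assumed to be the `NmS.SSSs` form printed by the shell's `time`.

```shell
#!/bin/sh
# Convert a `time`-style value such as 20m37.322s into seconds.
# Splitting on "m" and "s" leaves the minutes in $1 and the seconds in $2.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Percentage of wall-clock time saved by the faster run:
# (slow - fast) / slow * 100, rounded to one decimal place.
time_saved_percent() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", (slow - fast) / slow * 100 }'
}

# "real" times from the post: ARM vs. ARM+NEON.
time_saved_percent 20m37.322s 15m58.774s   # prints 22.5
```

Note that this measures the reduction in elapsed time (1237.322 s down to 958.774 s); expressed as throughput, the same pair of numbers corresponds to roughly 29% more work per unit of time.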
59)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 224)
Posted 9 Dec 2017 by [B@P] Daniel Post: The PEXT instruction is very slow on Ryzen, as I wrote above. Please try the avx2nopext app version; it does not use it, and should be a bit faster than the AVX one for you. Good to know that it does not work :) I suspect I know what may be wrong, but I do not have access to my PC today; I will look at it tomorrow.
60)
Message boards :
Number crunching :
Optimized RakeSearch app for rank 9 (computations finished)
(Message 223)
Posted 9 Dec 2017 by [B@P] Daniel Post: "Could someone please explain the process for correctly unpacking the optimized app files in Linux? I have successfully downloaded and extracted the files to my desktop, but when attempting to place them in the rakesearch folder I hit a dead end. I must be going about this the wrong way. I am trying to use the same process as setting up a cc_config file and it is not working." I do not use a desktop on Linux, only the shell :) Here are the commands to execute. You may have to adjust paths and URLs:
su -
cd /var/lib/boinc/projects/rake.boincfast.ru_rakesearch/
wget https://github.com/sirzooro/RakeSearch/releases/download/v1.0/rakesearch_linux_64_avx.tgz
tar zxvf rakesearch_linux_64_avx.tgz
systemctl restart boinc-client
The above commands are enough to download, unpack and install the AVX app on CentOS 7. You may have to adjust them a bit for your Linux version. You may have BOINC in the /var/lib/boinc-client/... dir, and its service may be called boinc instead of boinc-client. BTW, BOINC prints the path to its dir in the event log when it starts; you can look for it there.
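Since the BOINC data directory differs between distributions, a small check can pick whichever candidate exists before unpacking. This is only a sketch: `first_existing_dir` is a hypothetical helper, and the second candidate path assumes that the projects/ subdirectory sits under /var/lib/boinc-client/ in the same way as under /var/lib/boinc/.

```shell
#!/bin/sh
# Print the first existing directory among the arguments; fail if none exists.
# first_existing_dir is a hypothetical helper, not part of BOINC.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

project=rake.boincfast.ru_rakesearch
# Candidate locations: the first is from the post, the second is an assumed layout.
if project_dir=$(first_existing_dir \
        "/var/lib/boinc/projects/$project" \
        "/var/lib/boinc-client/projects/$project"); then
    echo "Unpack the app archive in: $project_dir"
else
    echo "Project directory not found; check the BOINC event log for the data directory."
fi
```

If neither candidate exists, the event log remains the reliable source for the actual path.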
©2024 The searchers team, Karelian Research Center of the Russian Academy of Sciences