
I took some time to benchmark different plot configurations in order to provide another data point for anyone out there researching what their plotter build should be. My machine specs are listed on the about page, which is linked at the top right of this post. Below are a few configurations I tried for a K32 plot to see what the differences are. All times are in seconds and were recorded on v1.0.4 of the Chia client:
K32 | 4 Thread Single NVMe 3408 RAM | 6 Thread Single NVMe 3416 RAM | 6 Thread Split NVMe 3416 RAM | 6 Thread RAID0 NVMe 3416 RAM |
F1 | 149.9 | 119.98 | 123.1 | 169.1 |
Phase 1 | 5393.3 | 4929.3 | 4608.7 | 4617.9 |
Phase 2 | 2615.5 | 2627.8 | 2581.3 | 2507.1 |
Phase 3 | 5329.2 | 5304.2 | 5015.5 | 5002.0 |
Phase 4 | 373.5 | 387.1 | 371.0 | 378.4 |
Total Time | 13711.7 | 13248.5 | 12576.7 | 12505.5 |
To add context, these were all single runs with no other plotting running in parallel. Where I labeled “Split NVMe”, I used the “-2” flag on the CLI to specify a secondary temp directory. The secondary location is used during phases 3 and 4. This helps because the phase 2 data is read from disk 1, compressed, and then written to disk 2, so the reads and writes no longer compete for the same drive. There are a few takeaways from the data above.
- The F1 time seems a bit random and not hardware dependent. I’m assuming this is because of the randomness involved in generating the seed data.
- There are speed ups when using more than one drive for the temp files.
- The difference between a single NVMe and RAID0 of two NVMes was only roughly 12 minutes.
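As a quick check on the takeaways above, here is a minimal Python sketch that converts the totals from the first table into hours and into minutes saved against the 6 thread single NVMe run. The dictionary keys and the function name are my own labels for illustration; the numbers are copied from the table.

```python
# Total plot times in seconds, copied from the first table in this post.
TOTAL_SECONDS = {
    "4 Thread Single NVMe": 13711.7,
    "6 Thread Single NVMe": 13248.5,
    "6 Thread Split NVMe": 12576.7,
    "6 Thread RAID0 NVMe": 12505.5,
}


def minutes_saved(baseline: str, candidate: str) -> float:
    """Minutes saved by `candidate` relative to `baseline` (positive means faster)."""
    return (TOTAL_SECONDS[baseline] - TOTAL_SECONDS[candidate]) / 60.0


if __name__ == "__main__":
    baseline = "6 Thread Single NVMe"
    for name, seconds in TOTAL_SECONDS.items():
        print(
            f"{name}: {seconds / 3600:.2f} h, "
            f"{minutes_saved(baseline, name):+.1f} min vs {baseline}"
        )
```

With these numbers, the split setup recovers most of the gain that RAID0 gives over a single drive.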
Remember, this is single plot performance, not parallel performance. The next table shows another set of tweaks with different parameters. This time I did not run the plot to the end. I only allowed it to compute Table 2.
K32 | 6 Thread RAID0 NVMe 128 Buckets 3416 RAM | 6 Thread RAID0 NVMe 32 Buckets 13644 RAM | 24 Thread RAID0 NVMe 32 Buckets 13644 RAM |
F1 | 169.1 | 107.0 | 91.381 |
Compute Table 2 | 550.4 | 549.5 | 638.8 |
This small dataset revealed a potential issue with AMD processors. The 5900X's 12 cores are split across two separate groups of 6, called CPU complexes or CCXs for short. My theory is that configuring a plotter with more threads than one of these complexes has cores causes a slowdown. From the data above, computing Table 2 with 24 threads took about a minute and a half longer (638.8 versus 549.5 seconds) than with only 6 threads. Another item we can read from this data is that the number of buckets doesn’t really affect the speed: the 128 bucket and 32 bucket runs finished Table 2 within a second of each other. You may have also noticed that the RAM is much higher when using 32 buckets. This is the trade-off. If you choose fewer buckets, you will need more RAM.
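The RAM settings in the table suggest a rough rule of thumb: the RAM setting grows in inverse proportion to the bucket count. The sketch below is only that rule of thumb, an assumption drawn from the two settings I used (3416 at 128 buckets, 13644 at 32 buckets), not the plotter's own formula. The function name is made up for illustration.

```python
BASE_BUCKETS = 128
BASE_RAM_SETTING = 3416  # RAM setting used with 128 buckets in the runs above


def estimated_ram_setting(buckets: int) -> int:
    """Rule-of-thumb estimate (an assumption): RAM scales inversely with buckets."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    return round(BASE_RAM_SETTING * BASE_BUCKETS / buckets)


if __name__ == "__main__":
    for buckets in (128, 64, 32):
        print(f"{buckets} buckets -> about {estimated_ram_setting(buckets)} RAM")
```

For 32 buckets this estimate lands within about 20 of the 13644 I actually used, so treat it as a starting point and check the plotter output rather than trusting the number.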
Below is the last data set I tried. Here I tried to modify two other parameters that were on my mind:
K32 | 6 Thread Single NVMe 4k Allocation 3416 RAM | 6 Thread Single NVMe 64k Allocation 3416 RAM | 6 Thread Single NVMe 64k Allocation 2400 RAM |
F1 | 119.98 | 112.3 | 168.5 |
Phase 1 | 4929.3 | 4664.8 | 4982.9 |
Phase 2 | 2627.8 | 2574.8 | 2544.4 |
Phase 3 | 5304.2 | 5094.7 | 6291.2 |
Phase 4 | 387.1 | 376.9 | 370.5 |
Total | 13248.5 | 12711.3 | 14189.2 |
Since I’m using Windows, my NVMe drives are formatted NTFS. During formatting you can select the Allocation Size. In the table above, this is the setting I modified.
- It looks like there was a small time boost with a 64k allocation, about 9 minutes on the total. Although this might be good news, I don’t know all the ramifications it could have.
- For the last run I lowered the RAM to 2400. Since I am RAM limited with 32GB of RAM, I wanted to see how much longer it would take with less RAM. It looks like about 16 minutes longer than the 4k run at 3416 RAM, and about 25 minutes longer than the 64k run at 3416 RAM.
I still need to do more testing and I’ll create another update on here once I have enough data. Happy Plotting!
Nice post!
How do I set up 2 temp drives (split)?
Hi mtjk, To use the 2nd temp drive, you use the “-2” flag in the CLI. If you use the GUI to plot, I think it is under advanced settings and it’s called “2nd Temporary Directory”.
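For anyone scripting this, here is a small Python sketch that assembles a `chia plots create` command with a second temp directory. The `-k`, `-r`, `-b`, `-u`, `-t`, `-2` and `-d` flags are the plotter's CLI options; the helper name and all three paths are placeholders I made up, so swap in your own drives.

```python
import shlex


def build_plot_command(tmp_dir, tmp2_dir, final_dir,
                       k=32, threads=6, ram=3416, buckets=128):
    """Build the argument list for a plot that uses a second temp directory."""
    return [
        "chia", "plots", "create",
        "-k", str(k),          # plot size
        "-r", str(threads),    # number of threads
        "-b", str(ram),        # RAM setting
        "-u", str(buckets),    # number of buckets
        "-t", tmp_dir,         # first temp directory
        "-2", tmp2_dir,        # second temp directory (the "split" setup)
        "-d", final_dir,       # final plot destination
    ]


if __name__ == "__main__":
    # Placeholder paths, not my real drive layout.
    command = build_plot_command("/mnt/nvme1/tmp", "/mnt/nvme2/tmp", "/mnt/farm")
    print(shlex.join(command))
```

The sketch only prints the command. Pass the list to `subprocess.run` if you want Python to launch the plotter.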
Hi Alex
I found your blog on the Chia forum and have a question about plotting.
My spec: i7 – 10700 8/16 23 gskill 3600 ram, nvme – 1tb samsung 980.
I set k-32, 4 threads, 3408 MB, 2 plots in parallel, and it took 9h :/ After this I set up another 2 plots in parallel with 4 threads and 4600 MB, and it took about 12h. I use Linux and start plotting via the GUI. Maybe you have some advice?
thanks and Best regards
Kamil
Hi Kamil, plotting in parallel will slow down each individual plot, but single plot speed is not what wins at the end of the day. The more plots you can have in parallel, the better. With a 1TB NVMe, you can have 4 plotters at one time. I suggest you start plotters with 2 threads, 3389 RAM and have a delay of 1 hr. Set the queue to however many plots you need. This should put you in a good cycle to produce the most plots in 24 hrs.
Thanks Alex for the answer. Today for testing I set 1 plot with 4 threads and 5000MB and it took 7h. When I’m looking at your times, I know that Intel lost this war 🙂
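The numbers in this thread show why parallel plotting wins even though each plot is slower. Below is a minimal sketch that turns "N plots finished in H hours" into plots per day. It assumes batches run back to back with no gaps and ignores the final copy and any stagger delay, and the three runs used different RAM settings, so this is a rough comparison only. The function name is hypothetical.

```python
def plots_per_day(parallel_plots: int, hours_per_batch: float) -> float:
    """Steady-state plots per 24 hours, assuming back-to-back batches."""
    if hours_per_batch <= 0:
        raise ValueError("hours_per_batch must be positive")
    return parallel_plots * 24.0 / hours_per_batch


if __name__ == "__main__":
    # (plots in parallel, hours for the batch), as reported in the comments above.
    runs = {
        "2 parallel, 9 h": (2, 9),
        "2 parallel, 12 h": (2, 12),
        "1 plot, 7 h": (1, 7),
    }
    for label, (count, hours) in runs.items():
        print(f"{label}: {plots_per_day(count, hours):.2f} plots/day")
```

Even the slower 12 hour pair comes out ahead of the single 7 hour plot on a per-day basis.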
How would you fit 4 plots on a 1 TB NVMe, if the working space used for one plot is ca. 270 GiB ? Are you minimizing the temp space used at a time with the delay ?
Hi MF, Yes someone just mentioned the same thing to me. I haven’t noticed this before. An investigation must be done! Thanks.
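The arithmetic behind the question can be sketched like this. It takes the roughly 270 GiB per plot figure from the comment above as given, and it assumes the worst case where every plot reaches its peak temp usage at the same moment. Staggering the starts may avoid that overlap, which is the part that still needs investigating. The names are hypothetical.

```python
GIB = 2**30   # bytes in one GiB
TB = 10**12   # bytes in one TB, as drive makers count it


def max_simultaneous_plots(drive_bytes: int, peak_temp_gib_per_plot: float) -> int:
    """Worst-case count of plots that fit if all hit peak temp usage at once."""
    return int(drive_bytes // (peak_temp_gib_per_plot * GIB))


if __name__ == "__main__":
    print(f"1 TB drive = {TB / GIB:.0f} GiB")
    print(f"Worst-case plots that fit: {max_simultaneous_plots(TB, 270)}")
```

Under that worst-case assumption a 1 TB drive holds 3 plots at peak, not 4, so a fourth plotter only fits if the peaks do not line up.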
Thanks for documenting your tests in a proper way. I’m allocating 4 threads to my plots, which is actually double what I have available (12), and the system is fine. Staggered plotting and the 2nd temp drive have been speeding up my Frankenstein, too.
Hello, I don’t know how to set up the 2nd directory without the command line. Do I just use a hard drive for the 2nd drive, or an NVMe drive?
Hi Alex. Excellent Blog! I have a 2xXeon server (32 logical cores) + 64GB RAM. It appears that it’s better to have fewer buckets (32 or even 16) and also more cores. It would be nice if RAM was used to temporarily store the plots (instead of NVMe, SSD, HD). Any suggestions?
Hello George! Thank you for the comment. I would not touch the bucket setting. In my testing, there really isn’t any benefit with fewer buckets. Maybe this is different with the Xeons? A RAM disk also doesn’t see much benefit. The best way is to have as many plotters in parallel as you can.
Hi Alex, my name is Diego. I am from Argentina. I have a 10 Gen I5, 32 GB of mem and 2 NVMe of 1 TB. Yesterday I did a test with K32 and this configuration per plot: 3390 mem, 2 threads, 128 cubes.
I plotted six in parallel, three to an NVMe drive and three to another NVMe.
The total plotting time was 9 hours.
My question is: if I make a RAID 0 array for two NVMe, will I have more speed when plotting?
Thank you!
I’m not sure if RAID0 will help the NVMes yet. I’m currently testing this and will write a post about it.
cubes = buckets. Sorry.
Excellent work!!!
Let me ask you a question please,
I have a 7700k with 32gb of ram and 4tb m.2
Do you suggest I lower to 2400 RAM per plot? Where will I have the bottleneck?
any advice?
Thanks!!!
What brand of NVMe do you have? I’m finding out that there are some brands of NVMe that cannot provide enough speed for lots of plots in parallel.
PNY CS2130 2TB SSD M.2 NVMe PCIe Gen3 x4
TY SO MUCH for ur answer!
I can’t find a sustained write performance graph for that model of drive. So I’m not sure if the performance drops after writing to it for a while.
Hello Alex, I am using a WD Black NVMe and I was shocked when I saw your blog post about the speed drop. I have a question: which brand are you using, or do you recommend any? Thanks a lot.
Hi Salah, My computer specs are displayed on the “about” page. I haven’t found data on my NVMe so I’m unsure if it has the same issue.
You’re tweaking for speed but you’re using windows? Isn’t there still a performance hit under windows compared to linux?
Hi Ren, Yes, there is a performance hit under Windows because 128-bit integer (uint128) support isn’t active in the Windows version of the Chia client. Although it would be easy to go to Linux, there are many people who want to plot on their Windows machine, and I still want to play games on my machine :).
Hello,
What is maximum throughput in MBps or GBps needed by 1 plot job?
I am trying to optimize my plotting throughput!
Hi Ashar, This metric is different for every system. This is because one plotter does not max out the PCIe bus. Look at my blog post about “Plots per Day”. There is an additional element to an NVMe that needs to be considered also. Good Luck!
Hello,
I went through that post and it was very insightful. I am plotting in the cloud and needed to set my parameters for the plot drives accordingly.
I also have another question regarding the temp 2 drive. What is the maximum amount of data written to temp 2 per K32 plot? If you have that information available, it will really help me set up good parallelism.
Hey Alex,
new farmer here.
Would you advise going for 64 buckets rather than 128 for plotting on an HDD on a system with an excess of RAM?
Sounds like the trade-off should be worth it.
When starting the plot, do I need to manually double the allocated RAM (so 4 threads, buffer 6816, 64 buckets) or does the plotter double the size on its own?
Many thanks!
If you really have extra RAM that is doing nothing, then you can try it. You have to change the settings so that it works. Those settings look correct, remember to check the plotter so that you don’t get QuickSorts.
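One way to check for QuickSorts is to scan the plotter log after a run. The sketch below counts sort lines by type. The exact log wording is an assumption on my part: I am assuming the plotter prints lines containing "uniform sort" for the fast path and "QS" for the QuickSort fallback, so compare this against your own log before relying on it. The function name is hypothetical.

```python
import re

# Assumed markers in the plotter log; verify them against a real log file.
UNIFORM_MARKER = "uniform sort"
QUICKSORT_PATTERN = re.compile(r"\bQS\b")


def count_sorts(log_text: str) -> tuple:
    """Return (uniform_sort_lines, quicksort_lines) found in the log text."""
    uniform = 0
    quick = 0
    for line in log_text.splitlines():
        if UNIFORM_MARKER in line:
            uniform += 1
        elif QUICKSORT_PATTERN.search(line):
            quick += 1
    return uniform, quick


if __name__ == "__main__":
    # Illustrative lines in the assumed format, not copied from a real run.
    sample = "\n".join([
        "Bucket 0 uniform sort. Ram: 6.5GiB",
        "Bucket 1 uniform sort. Ram: 6.5GiB",
        "Bucket 2 QS. Ram: 6.5GiB",
    ])
    uniform, quick = count_sorts(sample)
    print(f"uniform sorts: {uniform}, QuickSorts: {quick}")
```

If the QuickSort count is above zero, the RAM setting is probably too low for the bucket count you picked.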