We like challenges, and as is well known, Proxmox does not recommend software RAID for its platform because the performance is poor. That is true, but only half true. We reached this conclusion after many tests on a 3-node Proxmox platform with GlusterFS on local disks: three 2 TB 7k disks, in RAID 5 for the Gluster brick partition and in RAID 1 for the Proxmox system partition.
As you can see, the disks are 7k, and on top of that the data is replicated across 3 servers over a 1 Gb connection, so good performance is hard to get. That holds, but only if we stick to the recommendations found on the Internet. They are logical, and in fact that is where we started: reading recommendations and testing them. Once we had exhausted them all and the machines still had truly horrible write performance, 10 MB/s whatever we did, we decided to analyse the setup from scratch and configure it according to our own criteria and calculations.
We are not going to explain how to install Proxmox with software RAID or how to install Gluster, as there are many good guides on the Internet. We will only cover the parameters we changed. Even so, if you would like a guide, please contact us and we will be happy to make time to write one.
First of all, going by theory, we chose the raw format because it is faster and carries no metadata, although it loses the advantage of snapshots. The choice also depends on the purpose: for example, you can use qcow2 for the system disk and raw for the data disk. In our case we were after performance, and raw is faster in "theory". Once the machines were created in raw format on the VirtIO Block bus (the one that gave us the best performance), we started testing both with write back as the cache mode and without it, and these are the values we reached:
Not bad for many machines sharing a datastore on 7k disks.
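The post does not say which tool produced these numbers, so the following is only a minimal sketch of one way to repeat a sequential write check inside a guest. It assumes GNU dd is available, and write_test is a hypothetical helper name, not something from the original setup.

```sh
# write_test: hypothetical helper, not part of the original setup.
# Writes <size_mb> MiB of zeros into <target_dir> and removes the file again.
write_test() {
    target_dir=$1
    size_mb=${2:-1024}
    test_file="$target_dir/write_test.tmp"
    # conv=fdatasync makes dd flush the data to storage before it reports the
    # transfer rate, so the guest page cache does not inflate the figure.
    dd if=/dev/zero of="$test_file" bs=1M count="$size_mb" conv=fdatasync
    status=$?
    rm -f "$test_file"
    return "$status"
}
# Usage inside the VM, on a filesystem that lives on the disk under test:
#   write_test /var/tmp 1024
```

dd prints the measured throughput on its last line. Running the same call before and after a change (cache mode, disk format) gives comparable figures, as long as the size and target directory stay the same.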
To get these values, which were the best we saw after many tests, we configured the following parameters in Gluster:
gluster volume get <VOLNAME> all shows every parameter, defaults included, but what interests us is gluster volume info, because under the line Options Reconfigured it lists only the parameters that have been changed from their defaults. We will not explain them one by one, since there is already plenty of information on the Internet and the names more or less give away their purpose.
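To pull out just that section, a small filter can help. This is a sketch that assumes the usual layout of the volume info output, where the changed options follow a line reading Options Reconfigured: and each volume block starts with Volume Name:. The name reconfigured_options is hypothetical.

```sh
# reconfigured_options: hypothetical helper. Reads `gluster volume info`
# output on stdin and keeps only the changed options.
reconfigured_options() {
    awk '
        /^Options Reconfigured:/ { show = 1; next }  # start after the header
        /^Volume Name:/          { show = 0 }        # next volume block begins
        show && NF               { print }           # non-empty option lines
    '
}
# Usage on one of the nodes (gv0 is the volume name from this post's example):
#   gluster volume info gv0 | reconfigured_options
```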
These parameters are set with:
gluster volume set <VOLNAME> <OPT-NAME> <OPT-VALUE>
Example: gluster volume set gv0 performance.cache-size 1024MB
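When several parameters have to be applied, and applied identically after every rebuild of the test setup, it can be convenient to keep them in a list. The sketch below wraps the command above. The helper name set_volume_options and the DRY_RUN switch are our own assumptions for illustration, and the only option shown is the one from the example.

```sh
# set_volume_options: hypothetical helper. Reads "<OPT-NAME> <OPT-VALUE>"
# pairs from stdin and applies each one with `gluster volume set`.
# With DRY_RUN=1 it only prints the commands it would run.
set_volume_options() {
    volname=$1
    while read -r opt value; do
        # Skip blank lines and comment lines.
        case $opt in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "gluster volume set $volname $opt $value"
        else
            # </dev/null keeps gluster from consuming the option list on stdin.
            gluster volume set "$volname" "$opt" "$value" </dev/null || return 1
        fi
    done
}
# Usage:
#   set_volume_options gv0 <<'EOF'
#   performance.cache-size 1024MB
#   EOF
```

A dry run first makes it easy to review exactly what will be changed before touching a volume that is in use.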
Even so, the performance we were getting did not add up. It was not bad, but we had the feeling that the cache was not being used correctly: it was being used, but not all of what we had assigned. After many tests changing drivers, parameters and so on without gaining much more, we decided to try qcow2 by converting the machine's disk to qcow2 (make a backup first).
Shut down the machine and add a second, small qcow2 disk. It is small so that it is created quickly; after the conversion and reattachment it will take on the size of the disk we already have.
Remove the machine's protection in Options.
Detach both disks.
Open a console on one of the nodes, go to the machine's directory inside the Gluster brick and convert the disk with qemu-img convert -f raw -O qcow2 disco.raw disco.qcow2
Depending on how much data the disk holds this will take more or less time. Once it has finished, reattach the disk: it will appear as unused in the Hardware tab, so edit it there and enable write back as shown in the image.
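The conversion step can be wrapped in a small function that checks the source first and prints the result afterwards. This is a minimal sketch: convert_raw_to_qcow2 is a hypothetical helper, disco.raw and disco.qcow2 are the file names from the command above, and it assumes the VM is powered off.

```sh
# convert_raw_to_qcow2: hypothetical helper around the qemu-img call above.
# The raw source is left untouched; qemu-img only reads it.
convert_raw_to_qcow2() {
    src=$1
    # Default target: same name with .qcow2 instead of .raw.
    dst=${2:-${src%.raw}.qcow2}
    [ -f "$src" ] || { echo "source not found: $src" >&2; return 1; }
    # -p shows progress; -f and -O give the input and output formats.
    qemu-img convert -p -f raw -O qcow2 "$src" "$dst" || return 1
    # Print format and virtual size so the result can be checked before use.
    qemu-img info "$dst"
}
# Usage, from the machine's directory inside the Gluster brick:
#   convert_raw_to_qcow2 disco.raw disco.qcow2
```

The info output should report qcow2 as the file format and the same virtual size as the original disk. If it does not, do not reattach the converted file.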
Once this is done, we start the machine and run the same tests.
As you can see, the cache is now being used properly and performance has increased dramatically. Once you have verified that the machine runs perfectly, you can remove the raw disk, which will appear as unused in the hardware options, and re-enable the protection.
This change has improved not only the performance of the machine but also that of the hypervisor. When we had machines with raw disks, the Proxmox I/O delay graphs could shoot up to 50%, whereas now they stay at steady values that are good for our scenario, never exceeding 6%.
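To watch the same figure from a node's console, the sketch below computes the share of CPU time spent waiting on I/O between two samples of /proc/stat. It assumes that the I/O delay graph roughly corresponds to the kernel's iowait counter, which we have not verified against the Proxmox source. iowait_percent is a hypothetical helper.

```sh
# iowait_percent: hypothetical helper. Takes two "cpu ..." lines sampled from
# /proc/stat and prints the percentage of time spent in iowait between them.
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Fields after the "cpu" label:
        # user nice system idle iowait irq softirq steal
        { total[NR] = 0; for (i = 2; i <= 9; i++) total[NR] += $i; iow[NR] = $6 }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (iow[2] - iow[1]) / dt
        }'
}
# Usage: take two samples five seconds apart.
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_percent "$a" "$b"
```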
So we can say that raw on Proxmox with GlusterFS and software RAID is far from the best choice: it is a bottleneck.
And with 3 nodes and no disk arrays, we have achieved a decent HA platform.