How to scale down a Pinot cluster without downtime on Kubernetes?
I have been using Apache Pinot for an analytics project at my company for a while now. We deployed it on Kubernetes, and in the test environment the infrastructure we had set up was too large for the traffic we were getting. So I had to learn how to downsize a Pinot cluster that already has data on it, without affecting the application or losing data. The way to do that is by rebalancing the servers on which the tables are hosted.
Below are the steps I followed manually to scale down the cluster.
Some context about my setup
- Kubernetes cluster with 5 Pinot servers. The pods were named pinot-server-0 through pinot-server-4.
- 1 table with 10 segments distributed across the 5 servers.
My target
- Downsize the number of pinot-servers from 5 to 4.
Steps:
1. First, make sure the setting `pinot.set.instance.id.to.hostname=true` is enabled on the cluster.
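The curl examples further down assume the controller API is reachable at `localhost:9000`, for example via `kubectl port-forward svc/pinot-controller 9000:9000` (the service name depends on your Helm release, so treat it as an assumption). A quick way to sanity-check the setting is to list the instances and confirm the server IDs contain the pod hostnames rather than pod IPs:

```bash
# List all instances registered with the controller. With
# pinot.set.instance.id.to.hostname=true the server IDs should look like
# Server_pinot-server-0.<headless-service>..._<port> instead of Server_<pod-ip>_<port>.
curl -s "http://localhost:9000/instances"
```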
2. Since we are running on K8s, the server pods are named pinot-server-0 through pinot-server-4. When we scale the StatefulSet down, Kubernetes removes the highest ordinal first, so only pinot-server-0 to pinot-server-3 will remain. That means we must detach pinot-server-4 first. In general, start from pod n, then n-1, n-2 and so on when deciding which servers to remove.
3. Go to the Pinot UI, select pinot-server-4 under the server instances and click "Edit Tags". Remove all the tags so that no new segments get assigned to this server.
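I did this through the UI, but the controller also exposes an endpoint for updating instance tags if you want to script it. This is only a sketch; the instance ID (hostname-based name plus default port 8098) and whether the endpoint accepts an empty tag list are assumptions, so verify against your controller's Swagger page:

```bash
# Remove all tags from pinot-server-4 so the rebalance stops considering it
# a valid host for segments. Check the exact instance ID with GET /instances first.
curl -s -X PUT \
  "http://localhost:9000/instances/Server_pinot-server-4.pinot-server-headless.pinot.svc.cluster.local_8098/updateTags?tags="
```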
4. Now, since I have only 1 table, I go to the table config UI and click "Rebalance Servers", which then shows a status of "IN_PROGRESS". You will also notice that it now excludes pinot-server-4 from the segment assignment. Wait for this to complete and monitor the controller logs if need be. There are a few configs you should set as per your need; they are listed in a table here: https://docs.pinot.apache.org/operators/operating-pinot/rebalance/rebalance-servers#running-a-rebalance . In my case, since the table had replication=1, I had to set minAvailableReplicas to 0 in the rebalance options for the rebalance to proceed.
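If you prefer to trigger the rebalance from the controller REST API instead of the UI, a call like the one below should be equivalent. The table name `myTable`, the OFFLINE type and the exact parameter set are assumptions, so cross-check them against the rebalance docs linked above:

```bash
# Rebalance the table so its segments move off the untagged server.
# minAvailableReplicas=0 is needed here because the table has replication=1.
curl -s -X POST \
  "http://localhost:9000/tables/myTable/rebalance?type=OFFLINE&dryRun=false&downtime=false&minAvailableReplicas=0"
```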
5. Wait for some time, since the rebalance can take a while to complete depending on the size of the table. Once it is done, you should no longer see pinot-server-4 listed under the instances on the table's config UI page.
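You can also verify from the API that no segment is mapped to the removed server any more; this sketch checks the table's ideal state (the table name is again an assumption):

```bash
# The ideal state should no longer reference pinot-server-4 once the rebalance is done.
curl -s "http://localhost:9000/tables/myTable/idealstate" \
  | grep "pinot-server-4" || echo "no segments mapped to pinot-server-4"
```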
6. If you have more tables, all of them need to be rebalanced so that every segment on this server is moved to other servers. Make sure to do that, otherwise you will not be able to drop the server.
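With many tables, a small shell loop over the controller's table list saves some clicking. This is only a sketch: it assumes `jq` is installed and that all tables are OFFLINE tables with replication=1, so adjust the type and rebalance options to match your setup:

```bash
# Rebalance every table in the cluster so nothing remains on the server being removed.
for t in $(curl -s "http://localhost:9000/tables" | jq -r '.tables[]'); do
  curl -s -X POST \
    "http://localhost:9000/tables/${t}/rebalance?type=OFFLINE&downtime=false&minAvailableReplicas=0"
done
```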
7. Next, scale down the pinot-server StatefulSet in k8s so that the pinot-server-4 pod is terminated. For instance, these are the commands I used:
kubectl get statefulsets
kubectl scale statefulsets pinot-server --replicas=4
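Before moving on, it is worth waiting for the scale-down to settle and confirming the pod is really gone (the StatefulSet name `pinot-server` matches my setup; yours may differ):

```bash
# Wait for the StatefulSet to reach the new replica count, then confirm
# pinot-server-4 is no longer listed among the pods.
kubectl rollout status statefulset/pinot-server
kubectl get pods | grep pinot-server
```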
8. Once the pod pinot-server-4 is gone, go back to the Pinot UI server page, then disable and drop the server. It should already be in a dead state since the pod was terminated.
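The drop can also be done via the REST API once the rebalance has emptied the server. The instance ID below is an assumption (hostname-based ID with the default server port 8098), so check the exact name with `GET /instances` first:

```bash
# Drop the now-dead server from the cluster. This only succeeds once the
# instance no longer hosts any segments (i.e. after the rebalance above).
curl -s -X DELETE \
  "http://localhost:9000/instances/Server_pinot-server-4.pinot-server-headless.pinot.svc.cluster.local_8098"
```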
Those were the steps for scaling down the pinot-servers without any loss of data. Let me know if you have further questions.