I was thinking about this while sitting in Vivek’s session at OCYatra in Gurgaon. He was talking about the techniques one can use to optimize badly performing SQLs. But there is the other side of this story too, i.e. throwing hardware at a performance problem to fix it. Of course, it only works up to a point and it can’t solve every damn performance problem out there. But it is relatively easy to do (costs money, though) and, depending upon where you are hosting your workloads, it can be quicker too. When you see CPU usage going up or users complaining, the simple thing to do is to throw more cores at the database. If the system is CPU starved, that is going to give some temporary relief and let you get the situation under control. The same goes for storage IOPS. In some extreme cases, you could even migrate the database to more powerful hardware. With systems like Exadata around, which can make even a bad SQL do fairly well, this lever becomes even more powerful. But is it sustainable, and is it the right way to deal with this kind of issue?

(Image generated using ChatGPT)

Tuning SQL is a long and very involved process. You need a good technical guy who understands database performance. He will need to understand the problem, and may need to talk to the development team and the business users to know what they are trying to achieve in that particular scenario. Enabling all of this to-and-fro communication isn’t easy. Only once he has all the information can he decide whether something can be done in the database/OS/storage layer or whether the business logic needs to be rewritten. Getting all of this done isn’t easy. On the other hand, just increasing the cores/IOPS is much easier. Get the admin to do it and voila! Problem solved! The only sad part here is… it costs money!
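
Just to illustrate what that first diagnostic step can look like, here is a rough sketch against the standard Oracle dynamic performance views (the exact queries, versions and licences will vary with your environment, and '&sql_id' is just a placeholder for the statement you are chasing). The technical guy would typically start by finding the expensive SQL and then pulling its actual execution plan:

-- Top SQL by CPU since it was loaded into the shared pool (FETCH FIRST is 12c+ syntax)
SELECT sql_id, child_number, plan_hash_value, executions,
       ROUND(cpu_time / 1e6)     AS cpu_secs,
       ROUND(elapsed_time / 1e6) AS ela_secs,
       buffer_gets, disk_reads
FROM   v$sql
ORDER  BY cpu_time DESC
FETCH  FIRST 10 ROWS ONLY;

-- Actual plan and row-source statistics for a suspect statement
-- (row-source stats need STATISTICS_LEVEL=ALL or the gather_plan_statistics hint)
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'ALLSTATS LAST'));

Everything after this, i.e. talking to the developers and business users about what the statement is really meant to do, is where the real time goes.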

I think a balanced approach is what is needed. There are situations where there is no option other than throwing more hardware at the problem. Say some month-end or quarter-end process’s performance is messed up. In such a situation, you may not have the luxury of time to diagnose the problem, fix it or restart it. If enabling a few more cores or some extra IOPS helps complete it in time, it is a reasonable deal. But it is important to then diagnose the problem and make sure you don’t need that extra hardware the next time it runs. That way the problem at hand is dealt with in time, and by identifying the issue and planning a fix, you make sure that all goes well the next time it is executed.
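
And to be sure you really won’t need the extra hardware next time, it helps to check how the offending SQL has behaved historically, e.g. whether its plan changed around the slow run. A minimal sketch, assuming you have AWR data and the Diagnostics Pack licence to query it (again, '&sql_id' is a placeholder):

-- Per-snapshot history of one SQL: look for a plan_hash_value change or a jump
-- in elapsed time / buffer gets around the slow month-end run
SELECT s.begin_interval_time,
       q.plan_hash_value,
       q.executions_delta                AS execs,
       ROUND(q.elapsed_time_delta / 1e6) AS ela_secs,
       q.buffer_gets_delta               AS buffer_gets
FROM   dba_hist_sqlstat  q
JOIN   dba_hist_snapshot s
       ON  s.snap_id         = q.snap_id
       AND s.dbid            = q.dbid
       AND s.instance_number = q.instance_number
WHERE  q.sql_id = '&sql_id'
ORDER  BY s.begin_interval_time;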

What are your views on this?