Tips and Tricks to Making Your SQL Queries Faster
Introduction
However much we’d like to avoid it there are times when a DBA has to perform a query on the production database. When you do this you want to make sure that the queries are efficient and don’t hog the system resources which could cause problems for your users. In this article we’re gonna talk about some tips to running more efficient queries for scenarios lie this one.
Don’t SELECT *
SELECT *
will give you all the data in a table but typically you don’t need all the data, instead you typically only need a few of the fields. Selecting all the data in the table when you don’t need it will needlessly hog resources. Instead you should select only the fields you know you’ll be utilizing.
A better way would be to use a query like SELECT id, zip FROM users
where you only select the exact fields you need to accomplish what you need to.
Don’t SELECT DISTINCT unless you really have to
The SELECT DISTINCT command is great but it as an expensive operation. We’ve seen it misused so we just like to state that you should save it for when it’s required.
Beware of Expensive Joins
Joins like the one below can be very expensive:
1 2 3 | SELECT users.id, users.zip, orders.id FROM users, orders WHERE users.id = orders.id |
This is what’s known as a Cross Join and is extremely expensive operation.
You can accomplish the same task more efficiently like this:
1 2 3 4 | SELECT users.id, users.zip, orders.id FROM users INNER JOIN orders ON users.id = orders.id |
WHERE over HAVING
Using HAVING instead of WHERE to just filter a query by a condition is much more efficient using WHERE. Why? Because HAVING is executed after WHERE, so it could pull lots of unneeded data that could have been accomplished with WHERE.
Let’s take a look at an example:
1 2 3 4 5 6 | SELECT users.id, users.zip, Count(orders.id) FROM users INNER JOIN orders ON users.id = orders.id GROUP BY users.id, users.zip HAVING orders.date BETWEEN #1/1/2019# AND #2/1/2019# |
Let’s say the orders
table has 500 entries, then this query would pull all 500 and then use HAVING to filter by date.
The better way to do it would be to use the WHERE clause like so:
1 2 3 4 5 6 | SELECT users.id, users.zip, Count(orders.id) FROM users INNER JOIN orders ON users.id = orders.id WHERE orders.date BETWEEN #1/1/2019# AND #2/1/2019# GROUP BY users.id, users.zip |
This query would only pull the orders within the data range instead of pulling all of them and then filtering.
HAVING does have it’s place though when you need to filter on field that’s been aggregated. Just be aware of what’s going on and you’ll save yourself from some needlessly expensive operations.
Beware Wildcards on Both Ends
Wildcards are plain inefficient so we need to be careful when we use these queries especially in production.
A query with wildcards on both ends is very expensive. It has to look for the text anywhere within the field. Take a look at this query:
1 2 | SELECT name from products WHERE name LIKE '%Almond%' |
We’ve seen this used all the time, and it will work but it will be needlessly expensive if you know that the product name begins with ‘Almond’ which in case you’d use LIKE 'Almond%'
which will make a huge difference.
Sometimes we can get into the pattern of using wildcards haphazardly so our point is to be aware how you are using them.
LIMIT is your friend
A lot of times a DBA will run multiple queries building towards the final query. If that’s what you’re doing or you’re just getting a sample set, then add a LIMIT onto the query. It will make the query much snappier:
For example maybe you are just viewing what fields are in a table:
SELECT * FROM users;
That is an insanely expensive operation, where you can get a good sample of the data and all the fields using:
1 | SELECT * FROM users LIMIT 5; |
Run these Queries at Night
You also typically have a little more freedom at night or whenever your system has the least amount of users. Take advantage of that, it will help you breath a little easier.
Conclusion
We hope these tips help you think about how you make queries especially when it really counts on a production environment. Some of these tips are common sense but often we get into bad habits and we need remind ourselves of the right way of doing things. Thank you for joining us and don’t hesitate to reach out to Object Rocket if you want to turn your data over to responsible hands.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.
Get Started