SQL Query Optimization Techniques - KDnuggets


At the beginner level, we only focus on writing and executing the SQL queries. We don’t care about how long it takes to run or if it can handle millions of records. But at the intermediate level, people expect their query to be optimized and take minimal time to execute.

It is imperative to write optimized queries in large applications with millions of records, such as e-commerce platforms or banking systems. Suppose you own an e-commerce company with over a million products, and a customer wants to search for a product. What if the query you wrote in the backend takes more than a minute to fetch that product from the database? Do you think customers will buy products from your website?

You must understand the importance of SQL query optimization. In this tutorial, I’ll show you some tips and tricks to optimize your SQL queries and make them run faster. The main prerequisite is that you must have a basic understanding of SQL.

To check whether a specific element is present in a table, use the EXISTS keyword instead of COUNT(); the query will execute in a more optimized way.

With COUNT(), the query has to count every occurrence of that element, which can be inefficient when the table is large. EXISTS, on the other hand, checks only for the first occurrence and stops as soon as it finds one. This saves a lot of time.

Also, here you only care whether the element is present, not how many times it occurs. For that reason as well, EXISTS is the better fit.

SELECT 
  EXISTS(
    SELECT 
      * 
    FROM 
      myTable 
    WHERE 
      myColumn = 'val'
  );

The above query returns 1 if at least one row of the table has the value 'val' in the column named myColumn; otherwise, it returns 0. Some database systems return true or false instead of 1 or 0.
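The difference between the two checks can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database purely as an assumption for illustration; the helper names value_exists and value_count are hypothetical, and the table and column names follow the query above.

```python
import sqlite3

# Hypothetical in-memory table; myTable and myColumn follow the article's query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, myColumn TEXT)")
conn.executemany(
    "INSERT INTO myTable (myColumn) VALUES (?)",
    [("val",), ("other",), ("val",)],
)


def value_exists(value):
    """Return True if at least one row matches (EXISTS can stop at the first hit)."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM myTable WHERE myColumn = ?)", (value,)
    ).fetchone()
    return bool(row[0])


def value_count(value):
    """Return how many rows match (COUNT has to visit every matching row)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM myTable WHERE myColumn = ?", (value,)
    ).fetchone()
    return row[0]


print(value_exists("val"))      # True
print(value_exists("missing"))  # False
print(value_count("val"))       # 2
```

Both helpers answer the presence question, but only value_exists asks the database for exactly what is needed.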

Both the char and varchar data types are used to store strings in a table, but varchar is usually much more memory efficient than char.

The char data type stores a fixed-length character string. If the string is shorter than the defined length, it is padded with blanks until it reaches that length, so memory is wasted on padding. For example, CHAR(100) takes 100 bytes of memory (assuming a single-byte character set) even if only a single character is stored.

On the other hand, the varchar data type stores a variable-length character string of up to the specified maximum length. It is not padded with blanks and takes only as much memory as the actual string needs. For example, VARCHAR(100) takes only about 1 byte of memory when storing a single character, plus a small length prefix in most database systems.

CREATE TABLE myTable (
  id INT PRIMARY KEY, 
  charCol CHAR(10), 
  varcharCol VARCHAR(10)
);

In the example above, a table myTable is created with two columns, charCol and varcharCol, which have the char and varchar data types respectively. charCol always occupies 10 bytes of memory, whereas varcharCol occupies memory equal to the actual size of the character string stored in it.
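The padding behaviour can be imitated in a few lines of Python. This is only a simulation under simplifying assumptions (single-byte ASCII text, no length prefix), not how any particular database engine lays out rows; char_storage and varchar_storage are hypothetical helper names.

```python
def char_storage(value, length):
    """Simulate CHAR(length): pad the value with spaces up to the fixed length."""
    if len(value) > length:
        raise ValueError("value is longer than the column length")
    return value.ljust(length).encode("ascii")


def varchar_storage(value, max_length):
    """Simulate VARCHAR(max_length): keep only the characters actually supplied."""
    if len(value) > max_length:
        raise ValueError("value is longer than the column maximum")
    return value.encode("ascii")


print(len(char_storage("a", 10)))     # 10
print(len(varchar_storage("a", 10)))  # 1
```

The gap between the two sizes grows with the declared length, which is why a wide CHAR column is costly when most stored values are short.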

To optimize an SQL query, we should avoid using subqueries inside the WHERE clause where possible, because subqueries can be expensive to execute when they return a large number of rows.

Instead of using the subquery, you can often get the same result with a join operation or with a correlated subquery. A correlated subquery is a subquery in which the inner query depends on the outer query. Depending on the database and the query, it can be more efficient than an uncorrelated subquery, for example when it is combined with EXISTS, but it may also be evaluated once per outer row, so it is worth checking the execution plan.

Below is an example to understand the difference between the two.

-- Using a subquery
SELECT 
  * 
FROM 
  orders 
WHERE 
  customer_id IN (
    SELECT 
      id 
    FROM 
      customers 
    WHERE 
      country = 'INDIA'
  );

-- Using a join operation
SELECT 
  orders.* 
FROM 
  orders 
  JOIN customers ON orders.customer_id = customers.id 
WHERE 
  customers.country = 'INDIA';

In the first example, the subquery first collects the IDs of all customers belonging to INDIA, and then the outer query gets all orders for the selected customer IDs. In the second example, we achieve the same result by joining the customers and orders tables and then selecting only the orders whose customers belong to INDIA.

In this way, we can optimize the query by avoiding subqueries inside the WHERE clause, and the query also becomes easier to read and understand.
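To check that the two forms really return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the sample rows are assumptions made for illustration; the table and column names follow the queries above.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers (id, country) VALUES (1, 'INDIA'), (2, 'USA'), (3, 'INDIA');
    INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

SUBQUERY_SQL = """
    SELECT orders.id
    FROM orders
    WHERE customer_id IN (SELECT id FROM customers WHERE country = 'INDIA')
    ORDER BY orders.id
"""

JOIN_SQL = """
    SELECT orders.id
    FROM orders
    JOIN customers ON orders.customer_id = customers.id
    WHERE customers.country = 'INDIA'
    ORDER BY orders.id
"""

subquery_ids = [row[0] for row in conn.execute(SUBQUERY_SQL)]
join_ids = [row[0] for row in conn.execute(JOIN_SQL)]

print(subquery_ids)  # [10, 12, 13]
print(join_ids)      # [10, 12, 13]
```

The results match here because customers.id is a primary key, so the join cannot duplicate an order. Which form is faster depends on the database and its optimizer.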

Applying the JOIN operation from a larger table to a smaller table is a common SQL optimization technique, because joining in that direction can make your query run faster. If we apply a JOIN from a smaller table to a larger table, the SQL engine has to look for matching rows in the larger table, which requires more resources and takes more time. If instead the JOIN is applied from a larger table to a smaller table, the SQL engine only has to search the smaller table for matching rows. Keep in mind that many query optimizers choose the join order themselves, so the effect depends on your database system.

Here is an example for your better understanding.

-- The Orders table is larger than the Customer table

-- Join from a larger table to a smaller table
SELECT 
  * 
FROM 
  Orders 
  JOIN Customer ON Customer.id = Orders.customer_id;

-- Join from a smaller table to a larger table
SELECT 
  * 
FROM 
  Customer 
  JOIN Orders ON Customer.id = Orders.customer_id;
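Whether the join order matters on your system is something you can inspect rather than guess. The sketch below uses Python's built-in sqlite3 module, runs both join orders, confirms they return the same rows, and prints SQLite's query plan for each. The table name Orders (used because ORDER is a reserved word), the customer_id column, the sample data, and the plan helper are all assumptions for illustration, and the plan text varies between SQLite versions.

```python
import sqlite3

# Hypothetical schema: a small Customer table and a larger Orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO Customer (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")
conn.executemany(
    "INSERT INTO Orders (id, customer_id) VALUES (?, ?)",
    [(i, i % 3 + 1) for i in range(1, 31)],
)

LARGE_TO_SMALL = (
    "SELECT Orders.id, Customer.id FROM Orders "
    "JOIN Customer ON Customer.id = Orders.customer_id"
)
SMALL_TO_LARGE = (
    "SELECT Orders.id, Customer.id FROM Customer "
    "JOIN Orders ON Customer.id = Orders.customer_id"
)


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


rows_large_first = sorted(conn.execute(LARGE_TO_SMALL).fetchall())
rows_small_first = sorted(conn.execute(SMALL_TO_LARGE).fetchall())

# Both orders return the same rows; only the chosen plan may differ.
print(rows_large_first == rows_small_first)
for sql in (LARGE_TO_SMALL, SMALL_TO_LARGE):
    print(plan(sql))
```

Comparing the printed plans shows whether the engine actually treats the two statements differently.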

Like the LIKE clause, regexp_like is also used for pattern matching. The LIKE clause is a basic pattern-matching operator that supports only simple wildcards such as _ or %, which match a single character or any number of characters, respectively. In many cases, the LIKE clause must scan the entire table to find the particular pattern, which is slow for large tables.

On the other hand, regexp_like is a more powerful pattern-matching technique that can also be more efficient. It uses regular expressions to find specific patterns in a string. These regular expressions are more specific than plain wildcard matching because they let you describe the exact pattern you are looking for. Because of this, the amount of data that must be searched can be reduced and the query can run faster.

Note that regexp_like may not be present in all database management systems. Its syntax and functionality may vary on other systems.

Here is an example for your better understanding.

-- Query using the LIKE clause
SELECT 
  * 
FROM 
  mytable 
WHERE 
  (
    name LIKE 'A%' 
    OR name LIKE 'B%'
  );
  
# Query using regexp_like clause
SELECT 
  * 
FROM 
  mytable 
WHERE 
  regexp_like(name, '^[AB].*');

The above queries find the elements whose name starts with A or B. In the first example, LIKE is used to find all names beginning with A or B: A% means the first character is A, and any number of characters can follow. In the second example, regexp_like is used with the pattern ^[AB].*, where ^ anchors the match at the beginning of the string, [AB] means that the first character can be either A or B, and .* matches all the characters after that.

Using regexp_like, the database can quickly filter out rows that do not match the pattern, which can improve performance and reduce resource usage.
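The two queries can be compared side by side with Python's built-in sqlite3 module. SQLite does not ship a regexp_like function, so the sketch registers a hypothetical stand-in backed by Python's re module; this is an assumption for illustration and not the implementation used by Oracle or MySQL. The sample names are also made up and are all capitalized, because LIKE in SQLite ignores ASCII case by default.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")


def regexp_like(text, pattern):
    """Stand-in for regexp_like: 1 if the pattern matches somewhere in text, else 0."""
    if text is None:
        return 0
    return 1 if re.search(pattern, text) else 0


# Register the Python function so SQL statements can call regexp_like(name, pattern).
conn.create_function("regexp_like", 2, regexp_like)

conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [("Alice",), ("Bob",), ("Carol",), ("Aaron",), ("Dave",)],
)

like_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM mytable "
        "WHERE name LIKE 'A%' OR name LIKE 'B%' ORDER BY name"
    )
]
regex_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM mytable "
        "WHERE regexp_like(name, '^[AB].*') ORDER BY name"
    )
]

print(like_names)   # ['Aaron', 'Alice', 'Bob']
print(regex_names)  # ['Aaron', 'Alice', 'Bob']
```

The sketch only shows that both predicates select the same rows. It says nothing about relative speed, which depends on the database system and its indexes.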

In this article, we have discussed various methods and tips to optimize SQL queries. It should give you a clear understanding of how to write efficient SQL queries and why optimizing them matters. There are many more ways to optimize queries, such as preferring integer values over characters, or using UNION ALL instead of UNION when your tables do not contain duplicates.

Aryan Garg is a B.Tech. Electrical Engineering student, currently in the final year of the degree. His interest lies in the field of Web Development and Machine Learning. He has pursued this interest and is eager to work more in these directions.
