Structured Query Language or SQL is a powerful programming language that is used in many different technical fields and industries. If you are looking to land a career as a data analyst, data engineer, or data scientist then you'll need to learn and master SQL.
Luckily for you, we've done extensive research to accumulate some of the most common SQL questions from beginner to advanced to help you learn, practice, and understand SQL. Plus, enjoy bonus content surrounding the most common SQL interview concepts and frequently asked questions during SQL interviews.
We'll start by covering beginner SQL interview concepts such as teaching you how to properly query and filter the data using SELECT, WHERE, and HAVING. Next, we’ll then move on to more intermediate SQL interview questions covering subqueries and aggregate functions. Finally, we'll bring it all together by testing your abilities with advanced SQL interview questions that require you to solve problems with complex mathematical and multiple table operations.
Each problem will include an in-depth explanation that will take you through the step-by-step solution so you can get the most out of this article.
In this section, we'll cover the basics of structured query language (SQL) interview questions and answers, including how to write simple SELECT statements, use WHERE clauses to filter results, and work with basic SQL functions and joins.
Prompt: The HR manager wants to know how many gift baskets are needed for the spring retreat and in order to do so needs to get a headcount. Please write a query to find the total count of employees.
Step 1: We need to familiarize ourselves with the employee table, which contains all of our data about employees. This includes their ID, first name, last name, manager ID, salary, department, start date, and building ID. This is a relational database table and will be useful for writing SQL queries.
Step 2: To get the total count of employees in our SQL query, we'll use a SELECT statement to retrieve this information from the employee table and simply use the SQL keyword COUNT to count the number of rows in the employee table that match our criteria.
In this example, we are using COUNT to get the total count of employees in our SQL query. Additionally, we’ll use the DISTINCT clause to count only the unique rows.
Step 3: We can then simply run the query to get the final result
Prompt: Your portfolio manager wants to know which startups need attention and asks you to write a query to show the startups in the US that have no sales yet.
Step 1: To identify startups in the US that have no sales, we will first need to join our orders and catalog tables using a LEFT JOIN.
A LEFT JOIN is used to return all rows in the relational database from the left table (orders) even if there are no matching values in the right table (catalog).
In this case, the orders table will be on the left-hand side and the catalog table will be on the right-hand side of our query. You can join them by using orders o on c.item_id = o.item_id.
Step 2: Next, we’ll use a WHERE statement to find only the startups that are based in the US. We can do this with the equal sign with ' ' since the column is a string.
Step 3: Lastly, we will retrieve the final output by removing the rows from our resulting table that have a value of zero.
We'll use o.order_id is NULL because a SQL query will always return zero if the SQL statement is successful, so we can use this to filter out any results from our orders table that have no sales.
This will ensure that only startups with no sales are returned in our SQL query. After this, we have our SQL query properly structured and can simply run it to get the list of startups in the US that have yet to make any sales.
Prompt: Write a query to find the average salary by department. Round the results to 2 decimal places.
Step 1: To calculate the average salary by department, we will need to first run a SQL query that selects the dept column from the employee table in the relational database.
Step 2: Next, we can use a simple calculation to find the average salary for each department. In SQL, this is done using the AVG() function and assigning it to the respective column in our SQL query. Because we're finding the average by department, we'll need to use GROUP BY to group the results by dept.
Step 3: Finally, we can run our SQL query and round the results to 2 decimal places for accuracy. This will give us an output of the average salary for each department in our table.
Tags: SELECT, FROM, ROUND, GROUP BY, AVG
Intermediate SQL Interview Questions and Answers
All right, how did you do with the beginner SQL interview question section? If you're feeling confident, then let's move on to some intermediate SQL interview questions!
In this intermediate SQL interview question and answers section, we'll expand upon what we've already covered in the beginner SQL interview section and incorporate more intermediate SQL topics, including CTE or common table expressions, DATE_PART, and window functions.
Prompt: The finance director wants to cut costs and asks you to write a query to return a list of buildings where there are no employees in them.
Step 1: To find the buildings with no employees, we will need to start by writing a SQL query that selects the building_id and building_name columns from our building table.
Step 2: Next, we will need to use a NOT IN clause in our query to filter out any buildings where there are employees present. To do this, we will need to use a subquery that selects the building_id column from our employee table. Subqueries are SQL queries that are nested within another SQL query. You'll need to have a good understanding of SQL syntax in order to write subqueries.
Step 3: Finally, we can run our SQL query and return the list of buildings that have no employees in them. This will give us an output of the building_id and building_name columns for any buildings where there are no employees present.
Prompt: Write a query to show how many users made more than 5 searches in the last 7 days from June 4th, 2022.
Step 1: First, we’ll get a COUNT of the total searches by user_id. In order to do this we'll need to use a GROUP BY clause. GROUP BY is a SQL clause that allows us to group data together according to a specified column.
Step 2: Next, we want to filter for the last seven days. We can use the WHERE clause with a DATE - INTERVAL ‘6 days’ because it will be inclusive of June 6th, 2022.
Step 3: Lastly, we want to use a HAVING clause to get the users with more than 5 searches. HAVING is SQL syntax that allows us to filter SQL query results based on a specific condition.
Tags: SELECT, FROM, *, WHERE, BETWEEN, GROUP BY, COUNT, HAVING
Prompt: Write a query to find the customer(s) who made two different orders within a two day window. Return the results by user_id ASC.
Step 1: First, we need to find the previous order for each customer to measure the time between orders. We'll be using a LAG function to partition by order_date to find the previous order. LAG is a type of window function that returns a value from the previous row in a partition. For our query, we'll need to LAG(order_date) OVER (PARTITION BY user_id ORDER BY order_date ASC) AS last_order. By doing this, we'll be able to calculate the difference between two dates. We'll use a query like the below code.
Step 2: Next, we'll want to create a CTE called "base" and then write a SQL query to find all records where the last order was within 2 days of the current order for all unique users. CTE or common table expression is a SQL construct that can be used to write temporary result sets. We'll need to use a DATE_PART function to get the difference in days between the current order. DATE_PART is a SQL function that returns the day of the month from a specified date. The syntax is date_part('day',order_date-last_order)<=2 because we want to identify all customers who made two different orders within a two day window.
Step 3: Finally, now that our query is returning all of the unique users, we can easily order by user_id and return the results in ascending order.
Tags: SELECT, FROM, WHERE, ORDER BY, DISTINCT, DATE_PART, CTE, LAG
Advanced SQL Interview Questions
Well done, if you've made it this far in the article you've covered some of the most common beginner to intermediate SQL interview questions!
In this last section, we'll build upon what we've already covered in the beginner and intermediate SQL interview questions with more complex SQL questions that test your ability to retrieve the right solution using a combination of functions and clauses.
Prompt: Write a query to find the second earliest bid for each customer on the day they place 2 or more bids on the same day. Please return the customer id, order date, and second bid id.
Note: For this problem, we are assuming that today is 2022-01-19
Step 1: First, we need to find an order sequence for all the orders by the day they were placed, excluding the time. We are casting order_datetime as a ::DATE because that is the return value requested in the output. We RANK() all orders by customer id and order datetime casted as a ::DATE so that our generated sequence will restart the ranking for each customer and for each day, regardless of the time of the order.
Step 2: We'll then make step 1 into a CTE (called “ranked”). We can simply query the second orders of each day. CTE stands for Common Table Expression and can be used to simplify queries in SQL. We then use the WITH clause to name the CTE (called “ranked”) and reference it in the SQL query itself.
Step 3: Finally, we use a WHERE clause to filter out all results except for the order_datetime of the second bid.
Tags: SELECT, FROM, WHERE, ORDER BY, RANK, ORDER_DATETIME, CTE
Prompt: Write a query to find all second and third level managers who've worked at the company longer than one year. Please order the managers from longest to shortest tenure.
Step 1: First, we need to get the list of top level managers; employees who don’t have a manager to return employees who don’t have a manager and are therefore top level, and we assign a level of 1. We can do this by querying the employee table and using a WHERE clause for managerid is NULL.
Step 2: Next we’ll need to query the same data for subordinates who have managers listed from the first result by using a CTE on the first step. We can identify level 2 by querying only employees who have a manager at level 1 as indicated by the CTE results.
Step 3: Now that we have a list of second-level candidates, let’s use a RECURSIVE CTE to get all levels of employees.
The RECURSIVE command for the CTE instructs the query to run the non-recursive term (the query before the UNION ALL) first, and then run the recursive term (the query after the UNION ALL) as many times as there are returned results.
The use of UNION ALL combines the data from all possible result sets without removing duplicates so that we get a full employee roster. We INNER JOIN the recursive term to the non-recursive term in order to limit the results to only those employees who have a manager in the previous result set.
Step 4: Since we don’t want all results and all columns, update the final SELECT statement to provide only the requested data. The WHERE clause uses a BETWEEN statement to specify that we want only the 2nd and 3rd level managers, based on the level we assigned in our CTE.
We ORDER BY date_joined ascending to place the managers in the requested order (oldest dates are higher in an ordered list).
Tags: SELECT, FROM, WHERE, ORDER BY, RANK, ORDER_DATETIME, CTE
Prompt: You are asked to review a 14 day log starting on June 1st, 2022 for missing entries. Determine which users are missing some kind of entry and on which days.
Step 1: First, we need to generate a list of dates that should be present in the log data. We are using a RECURSIVE CTE to cycle through the possible dates in the log range. RECURSIVE CTE is a type of SQL function that allows us to reference itself in its own SQL query.
We will use this CTE to iterate through each possible date. To start the sequence, we find the earliest (MIN()) entry date. We continue the recursion by selecting the next day (+ INTERVAL ‘1 day’) until we reach the end of the log.
The WHERE clause uses a subquery to find the last (MAX()) date in the log and ends the recursion there. We are using this recursion to cycle through all possible dates in the range. If we were to SELECT DISTINCT entry dates and there was a date entirely missing from the dataset (i.e. for all users), we wouldn’t be able to return that date as a result of what should be present.
Step 2: Next, we’ll find a list of all users who should be in the dataset in order to join that with all dates that should be in the dataset and create a list of total expected values.
Step 3: Finally, we’ll join the existing data onto the expected data and check for missing information. We LEFT JOIN our users CTE to create a list of possible combinations between users and dtrng. A CROSS JOIN could also be appropriate as well.
The condition ON 1=1 simply allows all rows from one table to be joined to all rows from the other table, creating a list of all possible combinations. We LEFT JOIN log entries again to pull in all the actual data and attempt to match it with the expected data.
Any records missing from log entries that should match the possible results will return a NULL for those fields. The WHERE criteria searches for these NULL records, giving us returned results of what was expected but not received.
Tags: SELECT, FROM, LEFT JOIN, RECURSIVE, CTE, ORDER BY, UNION, MAX, INTERVAL, MIN
Most Common SQL Interview Concepts
1. What is the difference between normalization and denormalization?
Answer: Normalization is the process of organizing data in a database in order to minimize redundancy and ensure data integrity. Denormalization involves breaking this structure down by adding redundant data to improve performance, while potentially sacrificing some degree of accuracy and consistency.
2. What are SQL joins, and how do they work?
Answer: SQL joins are used to combine data from two or more tables in a database, based on a common field between them. There are several different types of SQL joins, including inner, left, right, and outer joins. Each type works slightly differently to achieve the desired result set based on the type of query being executed.
For example, a SQL inner join will only return records that have matching values for the specified fields, while a SQL left join will return all records from one table, along with any matching records from the other table. In general, SQL joins are used to efficiently and accurately combine data from multiple tables in order to answer complex business questions or perform analytical tasks.
3. What is the difference between a table and a database?
Answer: A table is typically a collection of data within a database, organized into rows and columns. A database, on the other hand, is a collection of one or more related tables, along with other SQL objects like views and stored procedures. In general, a database is used to store and organize larger amounts of data while tables are used to store and manipulate smaller datasets within that database.
4. Can you explain the different types of SQL commands?
Answer: SQL commands can generally be divided into two categories: data manipulation commands, which are used to update or retrieve data from a SQL database, and SQL statements, which are used for a variety of different administrative and analytical purposes.
Data manipulation commands include: SELECT, INSERT, UPDATE, and DELETE.
SQL commands include: CREATE TABLE, ALTER TABLE, TRUNCATE TABLE, and DROP TABLE.
5. What are SQL subqueries, and how do you use them to perform complex queries?
Answer: SQL subqueries are used to query data within a SQL database. They allow you to perform more complex and powerful queries by taking advantage of SQL's powerful JOIN operators and other SQL functions.
For example, you could use a SQL subquery to filter data based on criteria that is not available in the main SQL query. Subqueries can also be used to perform more advanced SQL operations, such as combining data from multiple tables, performing self joins, or retrieving data from related SQL tables.
6. What is an example of a primary key? And how does it work?
Answer: A primary key is a column or group of columns within a table that uniquely identifies each record in the table. It typically consists of one or more fields that contain unique values and can never be null or empty. SQL uses primary keys to enforce data integrity by ensuring that each record in a table has a unique identifier, and can be used to perform joins more efficiently by eliminating records that do not have a matching primary key value.
7. What is an example of a unique key? And how does it work?
Answer: A unique key is a column or group of columns that identify records in a SQL table, but does not need to be a primary key. Unlike a primary key, a unique key can contain null or empty values, and unique within the table. SQL uses unique keys to enforce data consistency by ensuring that there are no duplicate records in a table.
8. What is an example of a foreign key? And how does it work?
Answer: A SQL foreign key is a column or group of columns under a relational database management system in one SQL table that references the primary key of another table. In other words, a foreign key acts as a relationship or link between two tables. When used in a query, foreign keys allow you to combine and query data from multiple tables at once, and can also be used to enforce referential integrity.
9. How do tables and fields work?
Answer: Tables are the basic units of SQL data storage and organization. Each table contains one or more fields, which are used to store data within records. SQL tables can contain any type of SQL data, including text, numbers, dates, and binary data. Each field typically corresponds to a column, with one or more records stored in each table.
10. When would you want to use a HAVING() instead of a WHERE clause?
Answer: HAVING clauses are used to perform filtering or aggregation on data, similar to WHERE clauses. However, HAVING clauses are often used to filter data based on aggregation functions, such as SUM(), AVG(), MAX(), MIN(), and others. This allows you to perform filtering on aggregated data, rather than raw data. For example, you could use a HAVING clause to filter data based on the average price of all records in a table, or to filter data based on the total number of records that match a certain criteria.
11. What are SQL operators? And how does it work?
Answer: SQL operators are commands that are used to perform specific database operations, such as filtering data, performing joins and subqueries, selecting from tables, and performing SQL grouping and aggregation. Operators can be used individually or in combination to perform a wide range of operations, such as filtering, grouping, sorting, aggregation, and SQL statistical calculations.
Frequently Asked Questions
1. How do I prepare for a SQL interview?
Answer: There are a number of tips and strategies that can help you prepare for a SQL interview. These include studying SQL best practices, familiarizing yourself with SQL syntax and concepts, practicing SQL queries and database coding, and reviewing SQL interview resources on bigtechinterviews.com
2. What are the 5 basic SQL commands?
Answer: The 5 basic SQL commands are SELECT, INSERT, DELETE, UPDATE, and CREATE.
3. What is RDBMS in SQL interview questions?
Answer: RDBMS stands for relational database management system, which is a type of SQL database that organizes SQL data into related tables and rows. RDBMS is commonly used in SQL interviews because it is the most commonly used SQL database, and SQL interview questions often focus on SQL queries and operations within a SQL RDBMS database.
4. How difficult is SQL to learn?
Answer: SQL can be a relatively difficult programming language to learn, especially for those with no prior SQL experience. However, there are many SQL tutorials and SQL training resources available online, such as bigtechinterviews.com, that can help you master SQL concepts and become more proficient in SQL coding and SQL queries.
5. What are good SQL interview questions?
Answer: We recommend you practice the free top 5 BTI questions to prepare for SQL interview questions. These SQL interview questions cover a range of SQL interview concepts that will help you prepare for your SQL interview.
As you've learned SQL is an extremely valuable and dynamic query language used by data scientists, data analysts, and data engineers. To master SQL and pass the daunting SQL interview process, you'll need to understand important concepts such as,
Single-Table Operations: SELECT, WHERE, HAVING, GROUP BY, ORDER BY
Making Calculations: AVG(), MAX()/MIN(), ROUND(), SUM(), WINDOW FUNCTIONS
Multi-Table Operations: JOINS, SUBQUERIES, CTE, UNION
It's not enough to have the concepts or questions memorized, you need to understand how they relate to one another and when is the right time to mix and match various concepts.
We recommend you use a simple framework like the one below to help you understand a SQL problem and identify the proper solution.
Familiarize yourself with the table and nuances by writing a simple query to select the first 10 columns.
Write out the possible approaches to uncover potential solutions and nuances.
Explore the most prominent solution and test for accuracy and efficiency.
Remember, mastering SQL takes time and practice so make sure you're consistently practicing and solving questions on Big Tech Interviews and Codewars to help you prepare for your SQL interview. Good luck!
Share this post
Feeling inspired to start coding? Check out some kata that have been hand selected based on this article.
For this challenge you need to create a UNION statement, there are two tables ussales and eusales the parent company tracks each sale at its respective location in each table, you must all filter the sale price so it only returns rows with a sale greater than 50.00. You have been tasked with combining that data for future analysis. Order by location (US before EU), then by id.