SQL Interview Quick Guide: From Basics to Interview Ready
Whether you're a beginner or looking to polish your skills, this blog post breaks down essential SQL concepts with clear explanations and practical examples of potential SQL interview questions. Prepare to your next SQL interview!

SQL Interview Preparation: Ace Your Next Interview
So you're gearing up for a SQL interview? You've come to the right place! Whether you're a fresh graduate or a professional, this blog will walk you through the essential SQL concepts you need to know to impress your interviewer. We'll tackle common questions, look into detailed explanations, and each topic with clear, practical examples. Let's get started!
1. What is SQL?
SQL stands for Structured Query Language. It's the standard language for interacting with relational databases (databases where data is stored in tables with relationships between them). Think of it as the language you use to talk to the database and tell it what information you need.
- Create and manage databases: Imagine building a library to store all your data.
- Add new data (INSERT): Like adding new books to your library.
- Update existing data (UPDATE): Correcting outdated information or adding new details.
- Delete data (DELETE): Removing books you no longer need.
- Retrieve specific data (SELECT): Finding the exact information you're looking for.
Example:
SELECT name, age FROM employees WHERE department = 'Sales';This SQL statement tells the database to find the names and age of all employees in the "Sales" department from the "Employees" table.
2. What are SQL Dialects?
Just like spoken languages, SQL has variations called dialects. These dialects are similar in core functionality but differ in some features and syntax. Popular dialects include:
- MySQL: Open source, known for its speed and ease of use.
- PostgreSQL: Open source, robust and feature-rich.
- Microsoft SQL Server: Microsoft's proprietary database system.
- Oracle: Powerful enterprise-level database.
Knowing the specific dialect used by the company you're interviewing for is beneficial.
3. What is a Database Management System (DBMS)?
A DBMS is a software application that you use to interact with a database. It allows you to:
- Create and manage databases.
- Define data structures (tables).
- Insert, update, delete, and retrieve data.
Popular DBMS examples include MySQL, PostgreSQL, and Oracle.
4. What are Tables and Fields in SQL?
- Table: An organized collection of data stored in rows and columns. Think of it like a spreadsheet.
- Field (Column): Represents a specific type of information in a table. For instance, a "Customer" table might have fields like "CustomerID," "Name," "Address," etc.
- Record (Row): A single entry in a table. For example, a single customer's information would be one record in the "Customer" table.
Example:
{{< table >}}
| CustomerID | Name | Address |
|---|---|---|
| 1 | John Doe | 123 Main St |
| 2 | Jane Smith | 456 Oak Ave |
| {{< /table >}} |
5. What are the different subsets/Sublanguages of SQL?
SQL commands are categorized into sublanguages based on their purpose:
-
DDL (Data Definition Language): Used for defining database structures. Key commands:
- CREATE TABLE: Creates a new table.
- ALTER TABLE: Modifies an existing table.
- DROP TABLE: Deletes a table.
- TRUNCATE: Used to delete all records from a table while preserving its structure
- RENAME: Used to rename a database object
-
DML (Data Manipulation Language): Used for manipulating data within tables. Key commands:
- INSERT: Adds new data.
- UPDATE: Modifies existing data.
- DELETE: Removes data.
- SELECT: Retrieves data (we'll explore this in detail later).
-
DCL (Data Control Language): Manages user permissions and access control. Key commands:
- GRANT: Gives permissions to users.
- REVOKE: Takes away permissions.
-
TCL (Transaction Control Language): Manages database transactions (a group of SQL statements that should be executed together). Key commands:
- COMMIT: Saves changes permanently.
- ROLLBACK: Undoes changes made during a transaction.
Understanding these sublanguages and their key commands is fundamental for working with SQL.
6. What are Joins in SQL?
Joins are used to combine data from multiple tables based on a shared field. JOINs are fundamental for querying relational databases effectively. They allow you to retrieve related data from multiple tables as if it were in a single table.
There are different types of joins:
- Inner Join: Returns rows where the shared field matches in both tables.
- Left Join: Returns all rows from the left table and matching rows from the right table.
- Right Join: Returns all rows from the right table and matching rows from the left table.
- Full Join: Returns all rows from both tables, even if there's no match in the shared field.
Example Scenario: Imagine you have a database with two tables:
- Customers: Contains customer information (CustomerID, Name, Email, etc.)
- Orders: Contains order information (OrderID, CustomerID, OrderDate, etc.)
Example:
SELECT Customers.Name, Orders.OrderID, Orders.OrderDate
FROM Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
Choosing the Right Join
INNER JOIN
The most common type. It retrieves rows only when there is a match in both tables based on the specified join condition.
Example:
SELECT c.Name, o.OrderID, o.OrderDate
FROM Customers c
INNER JOIN Orders o ON c.CustomerID = o.CustomerID;
LEFT JOIN (or LEFT OUTER JOIN)
Retrieves all rows from the left table (the first table in the join) and the matching rows from the right table. If there's no match, it fills in NULLs for the right table's columns.
Example:
SELECT c.Name, o.OrderID
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID;
RIGHT JOIN (or RIGHT OUTER JOIN)
Retrieves all rows from the right table and the matching rows from the left table. If there's no match, it fills in NULLs for the left table's columns.
Example:
SELECT c.Name, o.OrderID
FROM Customers c
RIGHT JOIN Orders o ON c.CustomerID = o.CustomerID;
FULL JOIN (or FULL OUTER JOIN)
Retrieves all rows from both tables. If there's no match in one table, NULLs are used.
Example:
SELECT c.Name, o.OrderID
FROM Customers c
FULL JOIN Orders o ON c.CustomerID = o.CustomerID;
Self-JOIN: Joining a Table to Itself
Sometimes, you need to compare or combine data within the same table. A self-join is used to join a table to itself, usually using aliases to distinguish the "two copies" of the table.
Example:
SELECT e.Name AS EmployeeName, m.Name AS ManagerName
FROM Employees e
INNER JOIN Employees m ON e.ManagerID = m.EmployeeID;
Extra Tip:
Understanding Primary and Foreign Keys in JOINs
JOINs are often based on the relationships between primary keys (unique identifiers in a table) and foreign keys (columns in one table that reference a primary key in another table). Foreign keys establish connections between tables, making JOINs more meaningful and efficient.
7. What is a Subquery?
A subquery is simply a query within another query. Subqueries are used to retrieve data that will be used in the main query. Think of it like asking a question within a question.
Types of Subqueries: Where and How They're Used
- Non-Correlated: Can run independently of the outer query.
- Correlated: Depends on the outer query to run.
Example:
Employees: {{< table >}}
| EmployeeID | Name | DepartmentID | Salary |
|---|---|---|---|
| 1 | John | 1 | 50000 |
| 2 | Jane | 1 | 55000 |
| 3 | Bob | 2 | 60000 |
| 4 | Alice | 2 | 62000 |
| {{< /table >}} |
Departments: {{< table >}}
| DepartmentID | Name |
|---|---|
| 1 | HR |
| 2 | IT |
| {{< /table >}} |
Correlated Subquery: Depends on the outer query and cannot be executed independently. It's like asking "What is the average salary of employees in this department?" where "this" refers to a department specified in the outer query.
SELECT Name, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);Result:
{{< table >}}
| Name | Salary |
|---|---|
| Bob | 60000 |
| Alice | 62000 |
| {{< /table >}} |
This finds employees whose salaries are higher than the average salary of all employees. The subquery (SELECT AVG(Salary) FROM Employees) calculates the average salary first, and then the main query uses that result for comparison.
-
Scalar Subqueries
Return a single value. Used in WHERE and HAVING clauses, as well as within other expressions.
Example:
SELECT productName
FROM Products
WHERE price = (SELECT MAX(price) FROM Products);-
Row Subqueries
Return a single row. Used in WHERE and HAVING clauses when comparing to a single row.
Example:
SELECT *
FROM Employees
WHERE salary = (SELECT salary FROM Employees WHERE EmployeeID = 101);-
Table Subqueries
Return multiple rows and columns. Often used in FROM clauses as if they were tables.
Example:
SELECT *
FROM (SELECT CustomerID, SUM(OrderAmount) AS TotalSpent FROM Orders GROUP BY CustomerID) AS HighSpenders
WHERE TotalSpent > 500;Non-Correlated Subquery: Can be executed independently of the outer query. It's like asking "What is the highest salary in the company?" and then using that result in the main query.
SELECT e1.Name, e1.Salary, e1.DepartmentID
FROM Employees e1
WHERE e1.Salary = (
SELECT MAX(e2.Salary)
FROM Employees e2
WHERE e2.DepartmentID = e1.DepartmentID
);Result:
{{< table >}}
| Name | Salary | DepartmentID |
|---|---|---|
| Jane | 55000 | 1 |
| Alice | 62000 | 2 |
| {{< /table >}} |
8. What are Aggregate Functions?
Aggregate functions perform calculations on a set of values and return a single value as a result. They are essential for summarizing data and getting insights.
Common Aggregate Functions:
COUNT(): Counts the number of rows.SUM(): Calculates the sum of values in a column.AVG(): Calculates the average of values in a column.MAX(): Finds the highest value in a column.MIN(): Finds the lowest value in a column.
GROUP BY Clause: Grouping Data for Aggregation
The GROUP BY clause is used with aggregate functions to group rows with the same values in one or more columns.
Example:
SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department;This calculates the average salary for each department. The GROUP BY clause groups employees by their departments, and the AVG(Salary) function calculates the average salary for each group.
9. What is the Difference between WHERE and HAVING?
Both WHERE and HAVING are used to filter data, but they work at different stages of a query:
WHERE: Filters rows before grouping or aggregation. Think of it like selecting specific books from your library before organizing them by genre.HAVING: Filters groups after grouping or aggregation. It's like selecting specific genres after you've already organized the books.
Example:
SELECT Department, AVG(Salary) AS AverageSalary
FROM Employees
WHERE HireDate > '2022-01-01'
GROUP BY Department
HAVING AVG(Salary) > 70000;This finds departments where the average salary of employees hired after January 1, 2022, is greater than $70,000. The WHERE clause filters employees based on their hire date, and the HAVING clause filters departments based on their average salary.
10. What is the Purpose of DISTINCT?
The DISTINCT keyword is used to eliminate duplicate rows from the result set of a SELECT query. It's like selecting only one copy of each book from your library, even if you have multiple copies.
Example:
SELECT DISTINCT Department
FROM Employees;This returns a list of all the unique departments in the "Employees" table, without repeating any department names.
11. What are Stored Procedures?
Stored procedures are like pre-written scripts that you can store in the database and execute whenever you need them. They can accept input parameters, perform complex operations, and return results.
Advantages:
- Reusability: You can use the same stored procedure multiple times without rewriting the code.
- Performance: Stored procedures are pre-compiled and optimized, so they execute faster.
- Security: You can control access to data through stored procedures, enhancing database security.
Example:
CREATE PROCEDURE GetEmployeesByDepartment
@DepartmentName VARCHAR(50)
AS
BEGIN
SELECT EmployeeName, Salary
FROM Employees
WHERE Department = @DepartmentName;
END;This creates a stored procedure called "GetEmployeesByDepartment" that takes a department name as input and returns the names and salaries of employees in that department.
12. What are Window Functions?
Window functions perform calculations across a set of rows related to the current row, without grouping the rows into a single output like aggregate functions. Think of it as looking at a "window" of data around each row to perform calculations.
Common Window Functions:
ROW_NUMBER(): Assigns a unique sequential number to each row within a partition.RANK(): Assigns a rank to each row within a partition, allowing for ties (same rank for equal values).DENSE_RANK(): Assigns a rank to each row without gaps, even if there are ties.LAG(): Accesses data from a previous row.LEAD(): Accesses data from a subsequent row.
How would you rank rows in a SQL query? Example:
SELECT EmployeeName, Salary,
RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RankInDepartment
FROM Employees;This query ranks employees within each department based on their salary, using the RANK() window function. The PARTITION BY clause divides the data into partitions (departments), and the ORDER BY clause determines the ranking order.
13. What is the Purpose of Common Table Expressions (CTEs)?
CTEs are named temporary result sets that you can define within a single query. Think of them like creating a temporary table that exists only for the duration of the query. They help break down complex queries into smaller, more manageable parts, improving readability and organization.
Example:
WITH TopSalaries AS (
SELECT EmployeeName, Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 10
)
SELECT *
FROM TopSalaries
WHERE Salary = 60000;This CTE, named "TopSalaries," retrieves the top 10 highest-paid employees. The main query then selects from this CTE, filtering for employees in the "Marketing" department.
14. How Can You Optimize SQL Queries for Better Performance?
Query optimization is crucial for ensuring that your SQL queries run efficiently, especially when dealing with large datasets.
Optimization Techniques:
- Use Indexes: Indexes speed up data retrieval by creating a lookup table.
- Avoid
SELECT *: Select only the necessary columns instead of retrieving all columns. - Use
WHEREClause Wisely: Filter data early in the query to reduce the amount of data processed. - Optimize Joins: Choose the appropriate join type and ensure that join conditions are based on indexed columns.
- Use Stored Procedures: Pre-compiled and optimized stored procedures can execute faster than ad-hoc queries.
- Analyze Query Plans: Use database tools to analyze the execution plan of your query and identify potential bottlenecks.
Example:
-- Inefficient query
SELECT *
FROM Employees
WHERE DepartmentID = (SELECT DepartmentID FROM Departments WHERE DepartmentName = 'Sales');
-- Optimized query with join
SELECT e.EmployeeName, e.Salary
FROM Employees e
INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID
WHERE d.DepartmentName = 'Sales';The optimized query uses a join instead of a subquery, which can often be more efficient.
15. What is the Difference Between UNION and UNION ALL?
Both UNION and UNION ALL combine the results of two or more SELECT statements, but:
UNION: Removes duplicate rows from the final result set.UNION ALL: Includes all rows, including duplicates.
Example:
SELECT EmployeeName FROM Employees
UNION
SELECT CustomerName FROM Customers;This returns a list of unique names from both the "Employees" and "Customers" tables.
What is the purpose of a UNION operator? How is it different from a JOIN?
- The UNION operator is used to combine the results of two or more SELECT statements into a single result set. It's important to note that UNION removes duplicate rows from the final output.
Difference from JOIN:
- JOIN: Combines columns from different tables based on a shared column.
- UNION: Combines rows from different SELECT statements.
16. What is the purpose of the CASE statement in SQL? Provide an example.
The CASE statement in SQL is used for conditional logic within a query. It allows you to perform different actions or return different values based on certain conditions.
Example:
You want to categorize customers based on their total spending:
SELECT
CustomerID,
SUM(TotalAmount) AS TotalSpent,
CASE
WHEN SUM(TotalAmount) > 1000 THEN 'High Value'
WHEN SUM(TotalAmount) BETWEEN 500 AND 1000 THEN 'Medium Value'
ELSE 'Low Value'
END AS CustomerCategory
FROM Orders
GROUP BY CustomerID;Explanation: The CASE statement checks the TotalSpent and assigns a CustomerCategory based on the specified conditions.
17. What are Triggers in SQL?
Triggers are special stored procedures that automatically execute when a specific event occurs in the database. They are commonly used for:
- Maintaining data integrity: Enforcing business rules or preventing invalid data changes.
- Auditing: Tracking changes made to data.
- Performing actions based on data changes: For example, updating a related table when a row is inserted into another table.
Example:
CREATE TRIGGER UpdateSalaryHistory
ON Employees
AFTER UPDATE
AS
BEGIN
IF UPDATE(Salary)
BEGIN
INSERT INTO SalaryHistory (EmployeeID, OldSalary, NewSalary, UpdateDate)
SELECT inserted.EmployeeID, deleted.Salary, inserted.Salary, GETDATE()
FROM inserted
INNER JOIN deleted ON inserted.EmployeeID = deleted.EmployeeID;
END;
END;This trigger "UpdateSalaryHistory" automatically inserts a record into the "SalaryHistory" table whenever an employee's salary is updated.
18. EXISTS and NOT EXISTS Operators: Checking for Existence
- EXISTS: Returns true if the subquery returns at least one row.
- NOT EXISTS: Returns true if the subquery returns no rows.
These operators are particularly useful in correlated subqueries to check if a related record exists in another table.
IN Operator: Comparing to Multiple Values
The IN operator checks if a value exists within a set of values returned by a subquery.
Example:
SELECT * FROM Employees
WHERE DepartmentID IN (SELECT DepartmentID FROM Departments WHERE Location = 'New York');
19. Describe the concept of data warehousing and how it differs from traditional online transaction processing (OLTP) databases?
Data Warehousing:
- A system designed for analytical and reporting purposes.
- Stores historical data from various sources.
- Optimized for complex queries and data aggregations.
Online Transaction Processing (OLTP):
- Focuses on handling day-to-day business transactions.
- Optimized for fast data inserts, updates, and retrievals.
- Emphasizes data integrity and concurrency control.
Key Differences:
| Feature | Data Warehouse | OLTP Database |
|---|---|---|
| Purpose | Analytical reporting | Transaction processing |
| Data | Historical, aggregated | Current, transactional |
| Queries | Complex, analytical | Simple, transactional |
| Schema | Denormalized, star schema | Normalized, relational |
| Performance | Optimized for read operations | Optimized for write operations |
20. What are Views in Databases?
A view is a virtual table that doesn't physically store data but presents a customized perspective of data from one or more base tables.
Key Points:
- Virtual Table: A view doesn't hold data itself; it retrieves data from the underlying tables when queried.
- Customized Perspective: Views allow you to select specific columns, filter data, join tables, and even use aggregate functions to create a tailored view of the data.
- Security: Views can be used to restrict access to sensitive data by exposing only specific columns or rows to certain users.
Example:
CREATE VIEW EmployeeDepartmentView AS
SELECT EmployeeName, DepartmentID
FROM Employees;You can then query the view:
SELECT * FROM EmployeeDepartmentView;Types of Views:
- Simple View: Based on a single table.
- Complex View: Involves multiple tables, joins, subqueries, or aggregate functions.
- Materialized View: Physically stores the view's data for faster access.
21. Explain about Materialized Views?
A materialized view is a pre-computed view that stores the results of a query as a physical table.
Advantages:
- Improved Query Performance: Data is readily available, eliminating the need to execute the underlying query each time.
- Reduced Query Complexity: Complex queries can be pre-computed and stored in the materialized view, simplifying subsequent queries.
- Data Aggregation: Materialized views can be used to pre-aggregate data, making reporting and analysis faster.
Drawbacks:
- Data Staleness: Materialized views can become outdated if the underlying data changes. Refresh mechanisms are needed to keep them synchronized.
- Storage Overhead: They require additional storage space to store the pre-computed results.
22. Describe the concept of database partitioning and its benefits?
Database partitioning is a technique used to divide large tables or indexes into smaller, more manageable pieces called partitions.
Key Aspects:
- Data Distribution: Partitions spread data across multiple storage units or servers.
- Logical Division: Each partition is a subset of the data, typically based on a specific criterion.
Types of Partitioning:
1. Range Partitioning:
- Divides data based on a range of values in a particular column (often a date or numeric column).
- Common use cases: Archiving historical data, separating data by date ranges (e.g., monthly sales data).
Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderDate DATE,
-- ... other columns
)
PARTITION BY RANGE (YEAR(OrderDate)) (
PARTITION p0 VALUES LESS THAN (2022),
PARTITION p1 VALUES LESS THAN (2023),
PARTITION p2 VALUES LESS THAN (2024),
PARTITION p3 VALUES LESS THAN MAXVALUE
);2. List Partitioning:
- Partitions data based on a list of specific values in a column.
- Useful for categorizing data based on discrete values (e.g., regions, departments).
Example:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
Region VARCHAR(50),
-- ... other columns
)
PARTITION BY LIST (Region) (
PARTITION NorthAmerica VALUES ('USA', 'Canada', 'Mexico'),
PARTITION Europe VALUES ('UK', 'France', 'Germany'),
PARTITION Asia VALUES ('China', 'India', 'Japan')
);3. Hash Partitioning:
- Distributes data based on a hash function applied to a partitioning key.
- Useful for evenly spreading data across partitions, especially when there's no natural range or list to partition by.
Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
-- ... other columns
)
PARTITION BY HASH (OrderID)
PARTITIONS 4;Benefits of Partitioning:
- Improved Query Performance: Queries that target specific partitions can run significantly faster.
- Enhanced Manageability: Easier to perform maintenance tasks (backups, index rebuilds) on smaller partitions.
- Increased Availability: Partitions can be distributed across multiple storage devices, improving data availability.
23. Transactions: Ensuring Data Consistency
Transactions are fundamental to maintaining data integrity. The ACID properties guarantee reliable transaction processing.
ACID Properties in Detail:
- Atomicity: Ensures that a transaction is treated as a single, indivisible unit of work. Either all changes within the transaction are applied, or none are.
- Consistency: Guarantees that a transaction brings the database from one valid state to another, preserving data integrity rules.
- Isolation: Prevents concurrent transactions from interfering with each other. Changes made by one transaction are not visible to other transactions until the transaction is committed.
- Durability: Ensures that once a transaction is committed, the changes are permanently stored and will survive system failures.
24. Stored Procedure Best Practices
Stored procedures offer many advantages, but it's essential to use them effectively. Here are some best practices for writing maintainable and efficient stored procedures.
Stored Procedure Best Practices:
- Modular Design: Break down complex logic into smaller, reusable procedures.
- Parameterization: Use parameters to pass input values, improving security and preventing SQL injection.
- Error Handling: Implement error handling using TRY...CATCH blocks to gracefully manage exceptions.
- Naming Conventions: Use clear and consistent naming conventions for procedures and parameters.
- Documentation: Document the purpose, parameters, and expected behavior of stored procedures.
- Version Control: Track changes to stored procedures using version control systems.
Example:
CREATE PROCEDURE UpdateCustomerInfo
@CustomerID INT,
@NewEmail VARCHAR(100) = NULL, -- Optional parameter
@NewPhone VARCHAR(20) = NULL -- Optional parameter
AS
BEGIN
BEGIN TRY
UPDATE Customers
SET Email = ISNULL(@NewEmail, Email), -- Update only if provided
Phone = ISNULL(@NewPhone, Phone) -- Update only if provided
WHERE CustomerID = @CustomerID;
END TRY
BEGIN CATCH
-- Handle errors and log information
END CATCH;
END;This stored procedure updates customer information, allowing for optional parameters and error handling.
25. Indexing Strategies: Beyond the Basics
Indexing is crucial for performance, but choosing the right type of index and optimizing its use can be challenging.
Advanced Indexing Strategies:
- Composite Indexes: Create indexes on multiple columns to optimize queries that filter on those columns together.
- Covering Indexes: Include all columns needed for a query in the index to avoid accessing the actual table.
- Filtered Indexes: Create indexes on a subset of data based on specific criteria.
- Index Usage Statistics: Monitor index usage to identify unused or inefficient indexes.
- Index Fragmentation: Manage index fragmentation to maintain performance.
26. Security Considerations: Protecting Your Data
Database security is paramount. Here are some key practices for safeguarding your SQL databases.
SQL Security Best Practices:
- Least Privilege Principle: Grant users only the necessary permissions to perform their tasks.
- Strong Passwords and Authentication: Enforce strong password policies and use secure authentication mechanisms.
- Data Encryption: Encrypt sensitive data both in transit and at rest.
- Regular Auditing: Track database activity and monitor for suspicious behavior.
- Input Validation: Validate and sanitize user input to prevent SQL injection attacks.
27. What is the purpose of normalization in database design?
Normalization is a process of organizing data in a database to:
- Minimize data redundancy: Reduce duplicated data.
- Improve data integrity: Ensure data consistency and accuracy.
- Optimize data storage: Use space more efficiently.
Normalization involves creating multiple tables with relationships between them, ensuring that each piece of information is stored only once.
SQL Interview: Code-Focused Challenges
Here are some SQL interview questions for practice:
1. Find employees with above-average department salary
Write a SQL query to find the employees who have a salary greater than the average salary of their department.
SELECT e.EmployeeID, e.EmployeeName, e.Salary
FROM Employees e
INNER JOIN (
SELECT DepartmentID, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID
) AS DeptAvg ON e.DepartmentID = DeptAvg.DepartmentID
WHERE e.Salary > DeptAvg.AvgSalary;Explanation:
- Subquery (DeptAvg): Calculates the average salary for each department using AVG(Salary) and groups them by DepartmentID.
- Join: Joins the Employees table with the DeptAvg subquery results based on DepartmentID.
- Filter: Selects employees whose salary (e.Salary) is greater than the average salary of their department (DeptAvg.AvgSalary).
2. Top 3 customers by spending in the last quarter
Given a table "Orders" with columns (OrderID, CustomerID, OrderDate, TotalAmount), write a query to find the top 3 customers who have spent the most in the last quarter.
SELECT CustomerID, SUM(TotalAmount) AS TotalSpent
FROM Orders
WHERE OrderDate >= DATE_SUB(CURDATE(), INTERVAL 3 MONTH)
GROUP BY CustomerID
ORDER BY TotalSpent DESC
LIMIT 3;Explanation:
- Filter Orders: Selects orders from the last quarter using DATE_SUB to calculate the date three months ago.
- Group by Customer: Groups orders by CustomerID.
- Sum Total Amount: Calculates the total amount spent by each customer using SUM(TotalAmount).
- Order and Limit: Orders the results by TotalSpent in descending order and retrieves the top 3 using LIMIT.
3. List products with category names (including products without categories)
You have a table "Products" with columns (ProductID, ProductName, CategoryID) and a table "Categories" with columns (CategoryID, CategoryName). Write a query to list all products along with their category names, even if a product doesn't have a category assigned.
SELECT p.ProductName, c.CategoryName
FROM Products p
LEFT JOIN Categories c ON p.CategoryID = c.CategoryID;Explanation:
- Left Join: A left join ensures that all products from the "Products" table are included in the result set.
- Join Condition: Joins based on the shared CategoryID column.
- Missing Categories: If a product has no matching category, the CategoryName column will be NULL.
4. Find the second highest salary
Write a SQL query to find the second highest salary in the "Employees" table.
SELECT MAX(Salary) AS SecondHighestSalary
FROM Employees
WHERE Salary < (SELECT MAX(Salary) FROM Employees);Explanation:
- Subquery: Finds the highest salary in the table.
- Outer Query: Selects the maximum salary (MAX(Salary)) from the "Employees" table, excluding the highest salary identified by the subquery.
5. Delete all records from a table without dropping it
Write a query to delete all records from the 'Temporary' table without dropping the table itself.
TRUNCATE TABLE Temporary;Explanation:
- TRUNCATE is a DDL (Data Definition Language) command that quickly removes all rows from a table without logging individual deletions, making it faster than DELETE for clearing an entire table.
6. Count events by type in the last week
You have a table called 'Events' with columns (EventID, EventDate, EventType). Write a query to display the number of events of each type that occurred in the last week.
SELECT EventType, COUNT(*) AS EventCount
FROM Events
WHERE EventDate >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
GROUP BY EventType;Explanation:
- Date Filtering: Selects events from the last week using DATE_SUB to calculate the date one week ago.
- Group by EventType: Groups events by EventType.
- Count Events: Counts the number of events for each type using COUNT(*).
7. Find customers who ordered at least two different products
Write a SQL query to find all customers who have placed orders for at least two different products.
SELECT CustomerID
FROM Orders
GROUP BY CustomerID
HAVING COUNT(DISTINCT ProductID) >= 2;Explanation:
- Group by Customer: Groups orders by CustomerID.
- Count Distinct Products: Counts the number of distinct products ordered by each customer using COUNT(DISTINCT ProductID).
- Filter: Selects customers who have ordered at least two different products (>= 2).
8. Calculate years of service for each employee
You have a table "Employees" with a column "HireDate." Write a query to calculate the number of years each employee has been working for the company.
SELECT
EmployeeID,
EmployeeName,
DATEDIFF(YEAR, HireDate, GETDATE()) AS YearsOfService
FROM Employees;Explanation:
- DATEDIFF(YEAR, HireDate, GETDATE()) calculates the difference in years between the HireDate and the current date (GETDATE()).
9. Average weekend transaction amount per customer
Given a table 'Transactions' with columns (TransactionID, CustomerID, Amount, TransactionDate), write a query to find the average transaction amount for each customer on weekends.
SELECT CustomerID, AVG(Amount) AS AverageWeekendTransaction
FROM Transactions
WHERE DATEPART(WEEKDAY, TransactionDate) IN (1, 7)
GROUP BY CustomerID;Explanation:
- Filter by Weekends: Selects transactions that occurred on weekends using DATEPART(WEEKDAY, TransactionDate) and checking for Saturday (7) or Sunday (1).
- Group by Customer: Groups transactions by CustomerID.
- Calculate Average: Calculates the average transaction amount (AVG(Amount)) for each customer.
10. Find employees hired in the same month and year as their manager
Write a query to find the employees who were hired in the same month and year as their manager.
SELECT e.EmployeeID, e.EmployeeName
FROM Employees e
INNER JOIN Employees m ON e.ManagerID = m.EmployeeID
WHERE MONTH(e.HireDate) = MONTH(m.HireDate)
AND YEAR(e.HireDate) = YEAR(m.HireDate);Explanation:
- Self-Join: Joins the "Employees" table to itself, using aliases e for the employee and m for the manager.
- Join Condition: Joins based on the employee's ManagerID matching the manager's EmployeeID.
- Date Comparison: Filters for employees hired in the same month and year as their manager using MONTH() and YEAR() functions.
11. Display full names in reverse order
You have a table "Students" with columns (StudentID, FirstName, LastName). Write a query to display the full names of all students in reverse order (LastName, FirstName).
SELECT LastName + ', ' + FirstName AS FullName
FROM Students;Explanation:
- The + operator is used to concatenate the LastName, a comma, and the FirstName to create the full name.
12. Update status field for specific app_ids
Write a query to update the field "status" in table "applications" from 0 to 1 where the app_id is greater than 1000.
UPDATE applications
SET status = 1
WHERE app_id > 1000;Explanation:
- UPDATE sets the status field to 1.
- WHERE specifies the condition for updating rows (only where app_id is greater than 1000).
13. Find the most recent action for each user
You have a table 'Logs' with columns (LogID, UserID, Action, Timestamp). Write a query to find the most recent action performed by each user.
SELECT UserID, Action, Timestamp
FROM (
SELECT UserID, Action, Timestamp,
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Timestamp DESC) AS RowNum
FROM Logs
) AS RankedLogs
WHERE RowNum = 1;Explanation:
- Subquery (RankedLogs):
- Partitions the data by UserID.
- Assigns a row number (ROW_NUMBER()) to each action within the partition, ordered by Timestamp in descending order (most recent first).
- Outer Query:
- Selects rows where RowNum is 1, effectively retrieving the most recent action for each user.
14. Find products that have never been ordered
Write a query to find the products that have never been ordered.
Tables:
- Products: (ProductID, ProductName, ...)
- Orders: (OrderID, CustomerID, ProductID, ...)
SELECT p.ProductName
FROM Products p
LEFT JOIN Orders o ON p.ProductID = o.ProductID
WHERE o.OrderID IS NULL;Explanation:
- Left Join: Ensures that all products are included in the result, even if they haven't been ordered.
- NULL Check: The WHERE clause filters for rows where o.OrderID is NULL, indicating that there's no matching order for that product.
15. Find customers who placed orders every month this year
Write a query to find the customers who have placed orders in every month of the current year.
Table: Orders: (OrderID, CustomerID, OrderDate, ...)
SELECT CustomerID
FROM Orders
WHERE YEAR(OrderDate) = YEAR(GETDATE())
GROUP BY CustomerID
HAVING COUNT(DISTINCT MONTH(OrderDate)) = MONTH(GETDATE());Explanation:
- Filter for Current Year: Selects orders placed in the current year.
- Group by Customer: Groups orders by CustomerID.
- Count Distinct Months: Counts the distinct months in which each customer placed orders.
- Compare with Current Month: Checks if the count of distinct months equals the current month, indicating orders in all months.
16. Find employees who are not managers
You have a table "Employees" with columns (EmployeeID, EmployeeName, ManagerID). Write a query to find the employees who are not managers (i.e., their EmployeeID doesn't appear in the ManagerID column).
-- Using LEFT JOIN
SELECT e.EmployeeID, e.EmployeeName
FROM Employees e
LEFT JOIN Employees m ON e.EmployeeID = m.ManagerID
WHERE m.EmployeeID IS NULL;
-- Using NOT EXISTS
SELECT EmployeeID, EmployeeName
FROM Employees e
WHERE NOT EXISTS (
SELECT 1
FROM Employees m
WHERE m.ManagerID = e.EmployeeID
);Explanation:
- LEFT JOIN Approach: The left join includes all employees. The WHERE clause filters for employees where m.EmployeeID is NULL, indicating they are not managers.
- NOT EXISTS Approach: The subquery checks if the employee's ID exists as a manager's ID. The NOT EXISTS clause selects employees for whom the subquery returns no rows.
17. Calculate running total of order amounts for each customer
Write a query to calculate the running total of order amounts for each customer, ordered by order date.
Table: Orders: (OrderID, CustomerID, OrderDate, TotalAmount)
SELECT
OrderID,
CustomerID,
OrderDate,
TotalAmount,
SUM(TotalAmount) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal
FROM Orders;Explanation:
- SUM(TotalAmount) OVER (PARTITION BY CustomerID ORDER BY OrderDate) calculates the running total of TotalAmount for each customer, ordered by OrderDate.
18. Find overlapping calls
You have a table "Calls" with columns (CallID, StartTime, EndTime). Write a query to find the overlapping calls, where the start time of one call falls within the duration of another call.
SELECT c1.*
FROM Calls c1
INNER JOIN Calls c2 ON c1.CallID <> c2.CallID
WHERE c1.StartTime BETWEEN c2.StartTime AND c2.EndTime;Explanation:
- Self-Join: Joins the "Calls" table to itself, using aliases c1 and c2.
- Exclude Same Call: The c1.CallID <> c2.CallID condition prevents a call from being compared to itself.
- Overlap Check: The WHERE clause checks if c1.StartTime falls between the StartTime and EndTime of another call (c2).
19. Average number of orders per day of the week
Write a query to find the average number of orders placed by customers each day of the week.
Table: Orders: (OrderID, CustomerID, OrderDate, ...)
SELECT
DAYNAME(OrderDate) AS DayOfWeek,
AVG(OrderCount) AS AverageOrders
FROM (
SELECT OrderDate, COUNT(*) AS OrderCount
FROM Orders
GROUP BY OrderDate
) AS DailyOrders
GROUP BY DayOfWeek
ORDER BY DayOfWeek;Explanation:
- Inner Query (DailyOrders):
- Calculates the number of orders placed on each day (OrderCount).
- Outer Query:
- Extracts the day of the week using DAYNAME(OrderDate).
- Calculates the average number of orders for each day of the week using AVG(OrderCount).
- Orders the results by DayOfWeek.
20. Find customers with highest order count in each category
Write a query to find the customers who have placed the highest number of orders in each product category.
Tables:
- Customers: (CustomerID, CustomerName, ...)
- Orders: (OrderID, CustomerID, ProductID, ...)
- Products: (ProductID, ProductName, CategoryID, ...)
- Categories: (CategoryID, CategoryName, ...)
WITH CustomerOrderCounts AS (
SELECT c.CustomerID, cat.CategoryName, COUNT(*) AS OrderCount,
RANK() OVER (PARTITION BY cat.CategoryName ORDER BY COUNT(*) DESC) AS RankInCategory
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
JOIN Products p ON o.ProductID = p.ProductID
JOIN Categories cat ON p.CategoryID = cat.CategoryID
GROUP BY c.CustomerID, cat.CategoryName
)
SELECT CustomerID, CategoryName, OrderCount
FROM CustomerOrderCounts
WHERE RankInCategory = 1;Explanation:
- CTE (CustomerOrderCounts):
- Joins all relevant tables to get customer, category, and order information.
- Groups by CustomerID and CategoryName.
- Counts the orders for each customer in each category (OrderCount).
- Ranks customers within each category based on OrderCount using RANK() OVER (PARTITION BY ... ORDER BY ...).
- Outer Query:
- Selects the customer with the highest order count (rank = 1) in each category.
21. Find missing values in a sequence
Write a query to find the missing values in a sequence of numbers (e.g., order IDs, invoice numbers).
Table: Orders: (OrderID, ...)
-- Assuming OrderID is an integer sequence
WITH ExpectedOrderIDs AS (
SELECT generate_series(MIN(OrderID), MAX(OrderID)) AS ExpectedID
FROM Orders
)
SELECT ExpectedID
FROM ExpectedOrderIDs
EXCEPT
SELECT OrderID
FROM Orders;Explanation:
- CTE (ExpectedOrderIDs): Generates a series of numbers from the minimum to the maximum OrderID using the generate_series() function (PostgreSQL specific).
- EXCEPT Clause: The EXCEPT clause returns values that exist in the first set (expected IDs) but not in the second set (actual OrderIDs).
22. Distribute employees into salary groups
You have a table "Employees" with a "Salary" column. Write a query to distribute the employees into three salary groups (Low, Medium,, High) based on equal-sized salary ranges.
This question requires dividing data into quantiles.
SELECT EmployeeID, EmployeeName, Salary,
CASE
WHEN Salary <= (SELECT PERCENTILE_CONT(0.33) WITHIN GROUP (ORDER BY Salary) FROM Employees) THEN 'Low'
WHEN Salary <= (SELECT PERCENTILE_CONT(0.66) WITHIN GROUP (ORDER BY Salary) FROM Employees) THEN 'Medium'
ELSE 'High'
END AS SalaryGroup
FROM Employees;
Explanation:
- The PERCENTILE_CONT() (or a similar function depending on your SQL dialect) is used to calculate the 33rd and 66th percentiles of salaries. The CASE statement assigns salary groups based on the calculated percentiles.
23. Write a query to find the busiest hour of the day in terms of the number of calls made.
This question combines time manipulation, grouping, and aggregation.
Table: Calls: (CallID, StartTime, ...)
SELECT
EXTRACT(HOUR FROM StartTime) AS CallHour,
COUNT(*) AS CallCount
FROM Calls
GROUP BY CallHour
ORDER BY CallCount DESC
LIMIT 1;Explanation:
- Extract Hour: Extracts the hour of the day from the StartTime using EXTRACT(HOUR FROM ...).
- Group by Hour: Groups calls by the extracted hour.
- Count Calls: Counts the number of calls in each hour.
- Order and Limit: Orders the results by call count in descending order and retrieves the top hour (busiest hour).
24. Write a query to find the pairs of employees who have the same manager.
This question requires self-joining the table and comparing manager IDs.
Table: Employees: (EmployeeID, EmployeeName, ManagerID)
SELECT e1.EmployeeName, e2.EmployeeName
FROM Employees e1
INNER JOIN Employees e2 ON e1.ManagerID = e2.ManagerID AND e1.EmployeeID < e2.EmployeeID;Explanation:
- Self-Join: Joins the "Employees" table to itself, using aliases e1 and e2.
- Matching Managers: Joins based on employees having the same ManagerID.
- Avoid Duplicates: The e1.EmployeeID < e2.EmployeeID condition prevents duplicate pairs and ensures each pair is listed only once.
Practice, Practice, Practice!
The key to acing SQL interview questions is consistent practice. Work through various SQL problems, experiment with different query approaches, and analyze the results.
Online SQL Practice Platforms:
Sample Databases:
- Use sample databases like the Northwind database or create your own to test queries.
Real-World Datasets:
- Explore publicly available datasets on Kaggle or government websites to practice with real-world data.
On this page
- 1. What is SQL?
- 2. What are SQL Dialects?
- 3. What is a Database Management System (DBMS)?
- 4. What are Tables and Fields in SQL?
- 5. What are the different subsets/Sublanguages of SQL?
- 6. What are Joins in SQL?
- Choosing the Right Join
- Extra Tip:
- 7. What is a Subquery?
- 8. What are Aggregate Functions?
- 9. What is the Difference between WHERE and HAVING?
- 10. What is the Purpose of DISTINCT?
- 11. What are Stored Procedures?
- 12. What are Window Functions?
- 13. What is the Purpose of Common Table Expressions (CTEs)?
- 14. How Can You Optimize SQL Queries for Better Performance?
- 15. What is the Difference Between UNION and UNION ALL?
- 16. What is the purpose of the CASE statement in SQL? Provide an example.
- 17. What are Triggers in SQL?
- 18. EXISTS and NOT EXISTS Operators: Checking for Existence
- 19. Describe the concept of data warehousing and how it differs from traditional online transaction processing (OLTP) databases?
- Data Warehousing:
- Online Transaction Processing (OLTP):
- Key Differences:
- 20. What are Views in Databases?
- Key Points:
- Example:
- Types of Views:
- 21. Explain about Materialized Views?
- Advantages:
- Drawbacks:
- 22. Describe the concept of database partitioning and its benefits?
- Key Aspects:
- Types of Partitioning:
- 1. Range Partitioning:
- 2. List Partitioning:
- 3. Hash Partitioning:
- Benefits of Partitioning:
- 23. Transactions: Ensuring Data Consistency
- 24. Stored Procedure Best Practices
- 25. Indexing Strategies: Beyond the Basics
- 26. Security Considerations: Protecting Your Data
- 27. What is the purpose of normalization in database design?
- 1. Find employees with above-average department salary
- 2. Top 3 customers by spending in the last quarter
- 3. List products with category names (including products without categories)
- 4. Find the second highest salary
- 5. Delete all records from a table without dropping it
- 6. Count events by type in the last week
- 7. Find customers who ordered at least two different products
- 8. Calculate years of service for each employee
- 9. Average weekend transaction amount per customer
- 10. Find employees hired in the same month and year as their manager
- 11. Display full names in reverse order
- 12. Update status field for specific app_ids
- 13. Find the most recent action for each user
- 14. Find products that have never been ordered
- 15. Find customers who placed orders every month this year
- 16. Find employees who are not managers
- 17. Calculate running total of order amounts for each customer
- 18. Find overlapping calls
- 19. Average number of orders per day of the week
- 20. Find customers with highest order count in each category
- 21. Find missing values in a sequence
- 22. Distribute employees into salary groups
- 23. Write a query to find the busiest hour of the day in terms of the number of calls made.
- 24. Write a query to find the pairs of employees who have the same manager.
- Practice, Practice, Practice!
Keep exploring
matched by tag + title overlap
Read next
SQL Server Setup on Mac with Docker: A Step-by-Step Guide
A straightforward guide to setting up SQL Server on Mac using Docker. From installation to running a SQL Server container, this article provides easy-to-follow steps tailored for developers and database enthusiasts on macOS
#data-analyst#data-engineeringPower BI Interview Ready
Let's look at all the possible Power BI interview questions
#data-analyst#data-engineeringReal Time - Analytics with Azure Databricks:
This beginner-friendly article explores Azure Databricks in IIoT, detailing its role in enhancing data analytics from ingestion to data visualization, machine learning. It includes a practical case study on wind turbine optimization,…
#data-analyst#data-engineeringOptmizing Query Performance with Clustering Keys in Snowflake
This blog explores how Clustering Keys Data Pruning, and the Search Optimization Service (SOS) enhance query efficiency in Snowflake. It explains how clustering keys physically organize data into micro-partitions enabling faster queries by…
#data-analyst#data-engineering