Getting Started with SQL Queries - Exercises for Beginners Part-1

Introduction

This must-have guide for SQL queries will help you to brush up on your skills, regain your confidence and be job-ready! This comprehensive article features the essential queries that all SQL developers need.

In this first part of SQL Queries for Beginners, we will take a deep dive into the basics of SQL queries to teach you how to write them properly. We will share with you many common queries for solving by taking a dataset as a reference.

We are taking the “Used Car Dataset” from Kaggle as a reference. It is a good dataset to implement some of the SQL techniques if you are a beginner.

Let’s see how the dataset looks like:

id url region region_url price year manufacturer model condition cylinders fuel odometer title_status transmission VIN drive size type paint_color image_url description county state lat long posting_date
0 7222695916 https://prescott.craigslist.org/cto/d/prescott-2010-ford-ranger/7222695916.html prescott https://prescott.craigslist.org 6000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN az NaN NaN NaN
1 7218891961 https://fayar.craigslist.org/ctd/d/bentonville-2017-hyundai-elantra-se/7218891961.html fayetteville https://fayar.craigslist.org 11900 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ar NaN NaN NaN
2 7221797935 https://keys.craigslist.org/cto/d/summerland-key-2005-excursion/7221797935.html florida keys https://keys.craigslist.org 21000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN fl NaN NaN NaN
3 7222270760 https://worcester.craigslist.org/cto/d/west-brookfield-2002-honda-odyssey-ex/7222270760.html worcester / central MA https://worcester.craigslist.org 1500 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ma NaN NaN NaN
4 7210384030 https://greensboro.craigslist.org/cto/d/trinity-1965-chevrolet-truck/7210384030.html greensboro https://greensboro.craigslist.org 4900 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN nc NaN NaN NaN
To learn how to run your queries on Dbeaver or Python, visit the link below:
DBeaver vs Python: Two Methods To Run Your Queries
This is a step-by-step guide on how to run your SQL queries using Dbeaver and Python..

We will jump straight to the questions and try to understand and learn about different functions as they come.

  • Note: Before we move on, it's important to note that SQL by default is case insensitive. So feel free to use lower cases for your functions and statements.

Let's get started!

💡
But before that, check us out on Twitter to reach more such content.

Q1. Which 5 regions have the lowest average pricing?

Answer:

SELECT Avg(price) AS average_price,
       region
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY region
ORDER  BY Avg(price)
LIMIT  5; 

It's important to understand the functions before applying them, so let's take a quick look at each one before moving to the explanation.

SELECT and FROM statements

We need to specify two pieces of information to use the SELECT statement, i.e., what you want and where you want to select it from.

  • SELECT : It is used to select data from the database.
  • FROM : It is used to specify from where the data is being taken

Syntax:

SELECT column1, column2,..
FROM table_name

AVG() function

  • AVG(): This function returns the average value of a numeric column.

AVG() is one of the aggregate functions. Some of the other aggregate functions are, COUNT(), SUM(), MAX(), MIN().

Here, we are storing the average value of the column price in a new column named average_price.

Syntax:

AVG(numerical_column)

WHERE clause

  • WHERE : It is used to filter records and extract only those records that fulfil a specified condition.

The following operators can be used with the WHERE clause.

Operator Description
0 = Equal to
1 > Greater than
2 < Less than
3 >= Greater than or equal
4 <= Less than or equal
5 <> Not Equal
6 BETWEEN Between a certain range
7 LIKE Search for a pattern
8 IN To specify multiple possible values for a column

In the above query, we are using the BETWEEN operator with the WHERE clause.

Syntax:

SELECT column1
FROM table_name
WHERE condition;

GROUP BY statement

  • GROUP BY : It groups rows that have the same values into summary rows. It is often used with aggregate functions to group the result-set by one or more columns.

Syntax:

SELECT COUNT(column1),column2
FROM table_name
GROUP BY column2;

ORDER BY keyword

  • ORDER BY: It is used to sort the result-set in ascending or descending order. It sorts the records in ascending order by default. To sort the records in descending order, we can use the DESC keyword.

Syntax:

SELECT column1, column2,...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;

Note: LIMIT 5 will print the first 5 rows for us.

Now, let's look at the explanation of the above query.

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding region, w.r.t. the condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping all the different regions together using GROUP BY.
  4. Arranging the result in ascending order using ORDER BY.
  5. Printing only the top 5 rows using the LIMIT function.

This gives us the top 5 regions with the lowest average price.

Q2. Which 5 regions have the highest average pricing?

Answer:

SELECT Avg(price) AS average_price,
       region
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY region
ORDER  BY Avg(price) DESC
LIMIT  5; 

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding region, w.r.t. the condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping all the different regions together using GROUP BY.
  4. Arranging the result w.r.t AVG(price) in descending order using ORDER BY and DESC functions.
  5. Printing only the top 5 rows using the LIMIT function.

This will give us the top 5 regions with the highest average pricing.

Q3. How does the price of a vehicle vary across the distance travelled by it?

Answer:

SELECT Avg(price) AS average_price,
       CASE
         WHEN odometer >= 0
              AND odometer <= 20000 THEN '0-20000'
         WHEN odometer > 20000
              AND odometer <= 40000 THEN '20000-40000'
         WHEN odometer > 40000
              AND odometer <= 60000 THEN '40000-60000'
         WHEN odometer > 60000
              AND odometer <= 80000 THEN '60000-80000'
         WHEN odometer > 80000
              AND odometer <= 100000 THEN '80000-100000'
         WHEN odometer > 100000 THEN '<100000'
         ELSE 'NA'
       END        AS distance_travelled
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY distance_travelled; 

Let's first understand what the CASE function means.

CASE statement

  • CASE: It goes through conditions and returns a value when the first condition is met. The conditions are defined using the WHEN and THEN functions. Once a condition is true, it will stop the loop and return the result. If no conditions are true, it returns the value in the ELSE clause. If there is no ELSE part and no conditions are true, it returns NULL. It ends with a END command.

Syntax:

CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    .
    .
    .
    ELSE final_result
END;

Now, let's look at the explanation of the above query.

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Defining cases for different conditions using the CASE statement, i.e.: when odometer>=0 and odometer <=20000 then we are printing '0-20000' and so on.. and naming this new column distance_travelled using the END command.
  3. Setting the price limit between 1 to 655000.
  4. Grouping the results with respect to the new column distance_travelled.

Q4. How has the price of vehicles changed over the years?

Answer:

SELECT Avg(price)AS average_price,
       year
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY year; 

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding year, w.r.t. the condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping the results w.r.t. the year using the GROUP BY function.

Q5. How does a vehicle's condition impact its price?

SELECT Avg(price)AS average_price,
       condition
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY condition;

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding condition, w.r.t. the following condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping the results w.r.t. condition using the GROUP BY command.

Q6. Which color vehicle sells for the highest price?

Answer:

SELECT Avg(price)AS average_price,
       paint_color
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY paint_color
ORDER  BY average_price DESC
LIMIT  1;

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding paint_color, w.r.t. the condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping the results w.r.t. paint colors using GROUP BY.
  4. Arranging the result in descending order using the ORDER BY and DESC commands.
  5. Printing the first row only (to get the highest) using the LIMIT 1 command.

Q7. Which is the highest-priced model across each manufacturer?

Answer:

SELECT Max(price)AS max_price,
       model,
       manufacturer
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY manufacturer; 

Explanation:

  1. Calculating maximum price using MAX() function and storing it in a new column named max_price.
  2. Extracting max_price and corresponding model and manufacturer, w.r.t. the following condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping the results w.r.t. manufacturer using the GROUP BY command.

Q8. What are the topmost 5 states that have the highest number of cars along with their corresponding average car price?

Answer:

SELECT Count(DISTINCT id)AS vehicle_counts,
       state,
       Avg(price)        AS average_price
FROM   vehicles
WHERE  price BETWEEN 1 AND 655000
GROUP  BY state
ORDER  BY vehicle_counts DESC
LIMIT  5; 

Let's first understand what the function DISTINCT() mean.

SELECT DISTINCT Statement

  • SELECT DISTINCT: It returns only the distinct or unique values.

Syntax:

SELECT DISTINCT column1
FROM table_name;

Below is the explanation of the above query.

Explanation:

  1. Calculating count of unique IDs using COUNT() and DISTINCT() functions and storing it in a new column named vehicle_counts().
  2. Calculating average price using AVG() function and storing it in a new column named average_price.
  3. Extracting vehicle_counts(), average_price and corresponding state, w.r.t. the following condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  4. Grouping the results w.r.t. state using the GROUP BY command.
  5. Arranging in descending order using the ORBER BY and DESC commands
  6. Printing the first five rows using the LIMIT 5 function.

Q9. Which top 10 models have the highest average price for automatic transmission?

Answer:

SELECT Avg(price) AS average_price,
       model
FROM   vehicles
WHERE  ( transmission = 'automatic' )
       AND ( price BETWEEN 1 AND 655000 )
GROUP  BY model
ORDER  BY average_price DESC
LIMIT  10; 

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding model, w.r.t. the following conditions:

Fixing transmission to automatic
Setting the price limit from 1 to 655000

3.   Grouping the results w.r.t. model using the GROUP BY command.

4.  Arranging in descending order and printing the first ten rows.

Q10. Find out the model that has the highest average price from the vehicles that have an odometer reading greater than 50000?

Answer:

SELECT Avg(price) AS average_price,
       model
FROM   vehicles
WHERE  ( odometer > 50000 )
       AND ( price BETWEEN 1 AND 655000 )
GROUP  BY model
ORDER  BY average_price DESC
LIMIT  1; 

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding model, w.r.t. the following conditions:

Fixing ODOMETER to 50000
Setting the price limit from 1 to 655000

3.   Grouping the results w.r.t. model using the GROUP BY command.

4.  Arranging in descending order and printing the first row.

Q11. For the year 2021 which model was sold at the highest average price across transmission types?

Answer:

SELECT Avg(price)AS average_price,
       model,
       transmission
FROM   vehicles
WHERE  ( year = 2021 )
       AND ( price BETWEEN 1 AND 655000 )
GROUP  BY transmission
ORDER  BY Avg(price) DESC; 

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding model and transmission, w.r.t. the following conditions:

Fixing YEAR to 2021
Setting the price limit from 1 to 655000

3.   Grouping the results w.r.t. transmission using the GROUP BY command.

4.  Arranging the average price in descending order.

Q12. Which manufacturers have the highest average price across regions?

Answer:

SELECT Max(average_price)AS max_average_price,
       manufacturer,
       region
FROM   (SELECT Avg(price)AS average_price,
               manufacturer,
               region
        FROM   vehicles
        WHERE  price BETWEEN 1 AND 655000
               AND manufacturer <> ''
        GROUP  BY manufacturer,
                  region
        ORDER  BY average_price DESC)
GROUP  BY region
ORDER  BY manufacturer,
          region,
          max_average_price; 

Explanation:

  1. First, we are extracting average_price , manufacturer and region using the procedure followed in the above questions -- (second SELECT statement).
  2. Next, using the column average_price, calculate the maximum of the average price using MAX(average_price) command and store it in the column name max_average_price.
  3. Now, we extract max_average_price, manufacturer and region from the result obtained from step 1 -- (first SELECT statement).
  4. In the end, we group the results w.r.t the regions using the GROUP BY command.

Q13. For the manufacturer that has the cheapest 4 wheel drive model, mention the average price for all its models?

Answer:

SELECT Avg(price) AS average_price,
       model
FROM   vehicles
WHERE  ( manufacturer IN (SELECT manufacturer
                          FROM   vehicles
                          WHERE  ( drive = "4wd" )
                                 AND ( price BETWEEN 1 AND 655000 )
                                 AND ( manufacturer != "none" )
                          GROUP  BY model,
                                    manufacturer
                          ORDER  BY Avg(price)
                          LIMIT  1) )
       AND ( price BETWEEN 1 AND 655000 ); 

Explanation:

  1. First, we are extracting manufacturer w.r.t. the following conditions (second SELECT statement):

Fixing DRIVE=4wd
Setting price between 1 and 655000
Fixing manufacturer not equal to None

2.  Grouping the result w.r.t. model and manufacturer.

3.   Arranging the result in ascending order w.r.t. average price and printing the first row.

4.  Next, using the result obtained above, we extract the average price and model w.r.t. the following conditions (first SELECT statement):

MANUFACTURER obtained from above result
Setting price between 1 to 655000

Q14. For the model that has the highest volume, mention its average price across states?

Answer:

SELECT AVG(price) as average_price, state 
FROM vehicles 
WHERE (model IN
(
  SELECT model
  FROM vehicles
  WHERE price BETWEEN 1 AND 655000
  GROUP BY model
  ORDER BY COUNT(DISTINCT id) desc
  LIMIT 1
))
AND (price between 1 and 655000) 
GROUP BY state

Explanation:

  1. First, we are extracting model w.r.t. the following conditions (second SELECT statement):

Setting price between 1 and 655000

2.  Grouping the result w.r.t. the model.

3.   Arranging the result in descending order w.r.t. the number of unique IDs i.e.,     ( COUNT(DISTINCT id) ) and printing the first row.

4.  Next, using the result obtained above, we extract the average price and state w.r.t. the following conditions (the first SELECT statement):

MODEL obtained from above result
Setting price between 1 to 655000

5.  Grouping the results w.r.t. the state.

Q15. How does the average car price vary across the number of cylinders?

Answer:

SELECT AVG(price) as average_price,cylinders
FROM vehicles 
WHERE price BETWEEN 1 AND 655000
GROUP BY cylinders
ORDER BY average_price

Explanation:

  1. Calculating average price using AVG() function and storing it in a new column named average_price.
  2. Extracting average_price and corresponding cylinders, w.r.t. the condition: Setting the price limit from 1 to 655000 using the WHERE clause.
  3. Grouping all the different cylinders together using GROUP BY.
  4. Arranging the result in ascending order using ORDER BY.

Wrap Up

Trying this specially curated list of questions on SQL will help you get a better hold of SQL and will help you feel more confident with your queries. Whether you are just starting on SQL or wish to bring yourself up-to-date, the list of questions here has all that you need.

We know that the best way to learn SQL is by doing. That’s why we have provided you with more such exercises which will help you grow your skills. With every exercise you complete, you will develop stronger SQL abilities.

Find the link to the next part of the exercise below:

Getting Started with SQL Queries - Exercises for Beginners Part-2

See you there!

Show us some 💛 on Twitter