Getting Started with SQL Queries - Exercises for Beginners Part-1
Introduction
This must-have guide for SQL queries will help you to brush up on your skills, regain your confidence and be job-ready! This comprehensive article features the essential queries that all SQL developers need.
In this first part of SQL Queries for Beginners, we will take a deep dive into the basics of SQL queries to teach you how to write them properly. We will share with you many common queries for solving by taking a dataset as a reference.
We are taking the “Used Car Dataset” from Kaggle as a reference. It is a good dataset to implement some of the SQL techniques if you are a beginner.
Let’s see how the dataset looks like:
id | url | region | region_url | price | year | manufacturer | model | condition | cylinders | fuel | odometer | title_status | transmission | VIN | drive | size | type | paint_color | image_url | description | county | state | lat | long | posting_date | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7222695916 | https://prescott.craigslist.org/cto/d/prescott-2010-ford-ranger/7222695916.html | prescott | https://prescott.craigslist.org | 6000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | az | NaN | NaN | NaN |
1 | 7218891961 | https://fayar.craigslist.org/ctd/d/bentonville-2017-hyundai-elantra-se/7218891961.html | fayetteville | https://fayar.craigslist.org | 11900 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ar | NaN | NaN | NaN |
2 | 7221797935 | https://keys.craigslist.org/cto/d/summerland-key-2005-excursion/7221797935.html | florida keys | https://keys.craigslist.org | 21000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | fl | NaN | NaN | NaN |
3 | 7222270760 | https://worcester.craigslist.org/cto/d/west-brookfield-2002-honda-odyssey-ex/7222270760.html | worcester / central MA | https://worcester.craigslist.org | 1500 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ma | NaN | NaN | NaN |
4 | 7210384030 | https://greensboro.craigslist.org/cto/d/trinity-1965-chevrolet-truck/7210384030.html | greensboro | https://greensboro.craigslist.org | 4900 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | nc | NaN | NaN | NaN |
To learn how to run your queries on Dbeaver or Python, visit the link below:
We will jump straight to the questions and try to understand and learn about different functions as they come.
- Note: Before we move on, it's important to note that SQL by default is case insensitive. So feel free to use lower cases for your functions and statements.
Let's get started!
Q1. Which 5 regions have the lowest average pricing?
Answer:
SELECT Avg(price) AS average_price,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY region
ORDER BY Avg(price)
LIMIT 5;
It's important to understand the functions before applying them, so let's take a quick look at each one before moving to the explanation.
SELECT and FROM statements
We need to specify two pieces of information to use the SELECT
statement, i.e., what you want and where you want to select it from.
SELECT
: It is used to select data from the database.FROM
: It is used to specify from where the data is being taken
Syntax:
SELECT column1, column2,..
FROM table_name
AVG() function
AVG()
: This function returns the average value of a numeric column.
AVG()
is one of the aggregate functions. Some of the other aggregate functions are, COUNT()
, SUM()
, MAX()
, MIN()
.
Here, we are storing the average value of the column price
in a new column named average_price
.
Syntax:
AVG(numerical_column)
WHERE clause
WHERE
: It is used to filter records and extract only those records that fulfil a specified condition.
The following operators can be used with the WHERE
clause.
Operator | Description | |
---|---|---|
0 | = | Equal to |
1 | > | Greater than |
2 | < | Less than |
3 | >= | Greater than or equal |
4 | <= | Less than or equal |
5 | <> | Not Equal |
6 | BETWEEN | Between a certain range |
7 | LIKE | Search for a pattern |
8 | IN | To specify multiple possible values for a column |
In the above query, we are using the BETWEEN
operator with the WHERE
clause.
Syntax:
SELECT column1
FROM table_name
WHERE condition;
GROUP BY statement
GROUP BY
: It groups rows that have the same values into summary rows. It is often used with aggregate functions to group the result-set by one or more columns.
Syntax:
SELECT COUNT(column1),column2
FROM table_name
GROUP BY column2;
ORDER BY keyword
ORDER BY
: It is used to sort the result-set in ascending or descending order. It sorts the records in ascending order by default. To sort the records in descending order, we can use theDESC
keyword.
Syntax:
SELECT column1, column2,...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;
Note: LIMIT 5
will print the first 5 rows for us.
Now, let's look at the explanation of the above query.
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingregion
, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping all the different regions together using
GROUP BY
. - Arranging the result in ascending order using
ORDER BY
. - Printing only the top 5 rows using the
LIMIT
function.
This gives us the top 5 regions with the lowest average price.
Q2. Which 5 regions have the highest average pricing?
Answer:
SELECT Avg(price) AS average_price,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY region
ORDER BY Avg(price) DESC
LIMIT 5;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingregion
, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping all the different regions together using
GROUP BY
. - Arranging the result w.r.t
AVG(price)
in descending order usingORDER BY
andDESC
functions. - Printing only the top 5 rows using the
LIMIT
function.
This will give us the top 5 regions with the highest average pricing.
Q3. How does the price of a vehicle vary across the distance travelled by it?
Answer:
SELECT Avg(price) AS average_price,
CASE
WHEN odometer >= 0
AND odometer <= 20000 THEN '0-20000'
WHEN odometer > 20000
AND odometer <= 40000 THEN '20000-40000'
WHEN odometer > 40000
AND odometer <= 60000 THEN '40000-60000'
WHEN odometer > 60000
AND odometer <= 80000 THEN '60000-80000'
WHEN odometer > 80000
AND odometer <= 100000 THEN '80000-100000'
WHEN odometer > 100000 THEN '<100000'
ELSE 'NA'
END AS distance_travelled
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY distance_travelled;
Let's first understand what the CASE
function means.
CASE statement
CASE
: It goes through conditions and returns a value when the first condition is met. The conditions are defined using theWHEN
andTHEN
functions. Once a condition is true, it will stop the loop and return the result. If no conditions are true, it returns the value in theELSE
clause. If there is noELSE
part and no conditions are true, it returns NULL. It ends with aEND
command.
Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
.
.
.
ELSE final_result
END;
Now, let's look at the explanation of the above query.
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Defining cases for different conditions using the
CASE
statement, i.e.: when odometer>=0 and odometer <=20000 then we are printing '0-20000' and so on.. and naming this new columndistance_travelled
using theEND
command. - Setting the price limit between 1 to 655000.
- Grouping the results with respect to the new column
distance_travelled
.
Q4. How has the price of vehicles changed over the years?
Answer:
SELECT Avg(price)AS average_price,
year
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY year;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingyear
, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping the results w.r.t. the year using the
GROUP BY
function.
Q5. How does a vehicle's condition impact its price?
SELECT Avg(price)AS average_price,
condition
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY condition;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingcondition
, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping the results w.r.t. condition using the
GROUP BY
command.
Q6. Which color vehicle sells for the highest price?
Answer:
SELECT Avg(price)AS average_price,
paint_color
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY paint_color
ORDER BY average_price DESC
LIMIT 1;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingpaint_color
, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping the results w.r.t. paint colors using
GROUP BY
. - Arranging the result in descending order using the
ORDER BY
andDESC
commands. - Printing the first row only (to get the highest) using the
LIMIT 1
command.
Q7. Which is the highest-priced model across each manufacturer?
Answer:
SELECT Max(price)AS max_price,
model,
manufacturer
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY manufacturer;
Explanation:
- Calculating maximum price using
MAX()
function and storing it in a new column namedmax_price
. - Extracting
max_price
and correspondingmodel
andmanufacturer
, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping the results w.r.t. manufacturer using the
GROUP BY
command.
Q8. What are the topmost 5 states that have the highest number of cars along with their corresponding average car price?
Answer:
SELECT Count(DISTINCT id)AS vehicle_counts,
state,
Avg(price) AS average_price
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY state
ORDER BY vehicle_counts DESC
LIMIT 5;
Let's first understand what the function DISTINCT()
mean.
SELECT DISTINCT Statement
SELECT DISTINCT
: It returns only the distinct or unique values.
Syntax:
SELECT DISTINCT column1
FROM table_name;
Below is the explanation of the above query.
Explanation:
- Calculating count of unique IDs using
COUNT()
andDISTINCT()
functions and storing it in a new column namedvehicle_counts()
. - Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
vehicle_counts(), average_price
and correspondingstate
, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping the results w.r.t. state using the
GROUP BY
command. - Arranging in descending order using the
ORBER BY
andDESC
commands - Printing the first five rows using the
LIMIT 5
function.
Q9. Which top 10 models have the highest average price for automatic transmission?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( transmission = 'automatic' )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY model
ORDER BY average_price DESC
LIMIT 10;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingmodel
, w.r.t. the following conditions:
Fixing
transmission
toautomatic
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. model using the GROUP BY
command.
4. Arranging in descending order and printing the first ten rows.
Q10. Find out the model that has the highest average price from the vehicles that have an odometer reading greater than 50000?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( odometer > 50000 )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY model
ORDER BY average_price DESC
LIMIT 1;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingmodel
, w.r.t. the following conditions:
Fixing
ODOMETER
to50000
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. model using the GROUP BY
command.
4. Arranging in descending order and printing the first row.
Q11. For the year 2021 which model was sold at the highest average price across transmission types?
Answer:
SELECT Avg(price)AS average_price,
model,
transmission
FROM vehicles
WHERE ( year = 2021 )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY transmission
ORDER BY Avg(price) DESC;
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingmodel
andtransmission
, w.r.t. the following conditions:
Fixing
YEAR
to2021
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. transmission using the GROUP BY
command.
4. Arranging the average price in descending order.
Q12. Which manufacturers have the highest average price across regions?
Answer:
SELECT Max(average_price)AS max_average_price,
manufacturer,
region
FROM (SELECT Avg(price)AS average_price,
manufacturer,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
AND manufacturer <> ''
GROUP BY manufacturer,
region
ORDER BY average_price DESC)
GROUP BY region
ORDER BY manufacturer,
region,
max_average_price;
Explanation:
- First, we are extracting
average_price
,manufacturer
andregion
using the procedure followed in the above questions -- (secondSELECT
statement). - Next, using the column
average_price
, calculate the maximum of the average price usingMAX(average_price)
command and store it in the column namemax_average_price
. - Now, we extract
max_average_price
,manufacturer
andregion
from the result obtained from step 1 -- (firstSELECT
statement). - In the end, we group the results w.r.t the regions using the
GROUP BY
command.
Q13. For the manufacturer that has the cheapest 4 wheel drive model, mention the average price for all its models?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( manufacturer IN (SELECT manufacturer
FROM vehicles
WHERE ( drive = "4wd" )
AND ( price BETWEEN 1 AND 655000 )
AND ( manufacturer != "none" )
GROUP BY model,
manufacturer
ORDER BY Avg(price)
LIMIT 1) )
AND ( price BETWEEN 1 AND 655000 );
Explanation:
- First, we are extracting
manufacturer
w.r.t. the following conditions (secondSELECT
statement):
Fixing
DRIVE
=4wd
Setting price between 1 and 655000
Fixingmanufacturer
not equal toNone
2. Grouping the result w.r.t. model and manufacturer.
3. Arranging the result in ascending order w.r.t. average price and printing the first row.
4. Next, using the result obtained above, we extract the average price and model w.r.t. the following conditions (first SELECT
statement):
MANUFACTURER
obtained from above result
Setting price between 1 to 655000
Q14. For the model that has the highest volume, mention its average price across states?
Answer:
SELECT AVG(price) as average_price, state
FROM vehicles
WHERE (model IN
(
SELECT model
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY model
ORDER BY COUNT(DISTINCT id) desc
LIMIT 1
))
AND (price between 1 and 655000)
GROUP BY state
Explanation:
- First, we are extracting
model
w.r.t. the following conditions (secondSELECT
statement):
Setting price between 1 and 655000
2. Grouping the result w.r.t. the model.
3. Arranging the result in descending order w.r.t. the number of unique IDs i.e., ( COUNT(DISTINCT id)
) and printing the first row.
4. Next, using the result obtained above, we extract the average price and state w.r.t. the following conditions (the first SELECT
statement):
MODEL
obtained from above result
Setting price between 1 to 655000
5. Grouping the results w.r.t. the state.
Q15. How does the average car price vary across the number of cylinders?
Answer:
SELECT AVG(price) as average_price,cylinders
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY cylinders
ORDER BY average_price
Explanation:
- Calculating average price using
AVG()
function and storing it in a new column namedaverage_price
. - Extracting
average_price
and correspondingcylinders
, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHERE
clause. - Grouping all the different cylinders together using
GROUP BY
. - Arranging the result in ascending order using
ORDER BY
.
Wrap Up
Trying this specially curated list of questions on SQL will help you get a better hold of SQL and will help you feel more confident with your queries. Whether you are just starting on SQL or wish to bring yourself up-to-date, the list of questions here has all that you need.
We know that the best way to learn SQL is by doing. That’s why we have provided you with more such exercises which will help you grow your skills. With every exercise you complete, you will develop stronger SQL abilities.
Find the link to the next part of the exercise below:
Getting Started with SQL Queries - Exercises for Beginners Part-2
See you there!
Show us some 💛 on Twitter