Getting Started with SQL Queries - Exercises for Beginners Part-1
Here's a selection of the most useful SQL queries every beginner must practice. This article will not only provide you with the answers but also the explanation to each query.
Introduction
This must-have guide for SQL queries will help you to brush up on your skills, regain your confidence and be job-ready! This comprehensive article features the essential queries that all SQL developers need.
In this first part of SQL Queries for Beginners, we will take a deep dive into the basics of SQL queries to teach you how to write them properly. We will share with you many common queries for solving by taking a dataset as a reference.
We are taking the “Used Car Dataset” from Kaggle as a reference. It is a good dataset to implement some of the SQL techniques if you are a beginner.
Let’s see how the dataset looks like:
| id | url | region | region_url | price | year | manufacturer | model | condition | cylinders | fuel | odometer | title_status | transmission | VIN | drive | size | type | paint_color | image_url | description | county | state | lat | long | posting_date | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7222695916 | https://prescott.craigslist.org/cto/d/prescott-2010-ford-ranger/7222695916.html | prescott | https://prescott.craigslist.org | 6000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | az | NaN | NaN | NaN |
| 1 | 7218891961 | https://fayar.craigslist.org/ctd/d/bentonville-2017-hyundai-elantra-se/7218891961.html | fayetteville | https://fayar.craigslist.org | 11900 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ar | NaN | NaN | NaN |
| 2 | 7221797935 | https://keys.craigslist.org/cto/d/summerland-key-2005-excursion/7221797935.html | florida keys | https://keys.craigslist.org | 21000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | fl | NaN | NaN | NaN |
| 3 | 7222270760 | https://worcester.craigslist.org/cto/d/west-brookfield-2002-honda-odyssey-ex/7222270760.html | worcester / central MA | https://worcester.craigslist.org | 1500 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ma | NaN | NaN | NaN |
| 4 | 7210384030 | https://greensboro.craigslist.org/cto/d/trinity-1965-chevrolet-truck/7210384030.html | greensboro | https://greensboro.craigslist.org | 4900 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | nc | NaN | NaN | NaN |
To learn how to run your queries on Dbeaver or Python, visit the link below:
We will jump straight to the questions and try to understand and learn about different functions as they come.
- Note: Before we move on, it's important to note that SQL by default is case insensitive. So feel free to use lower cases for your functions and statements.
Let's get started!
Q1. Which 5 regions have the lowest average pricing?
Answer:
SELECT Avg(price) AS average_price,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY region
ORDER BY Avg(price)
LIMIT 5; It's important to understand the functions before applying them, so let's take a quick look at each one before moving to the explanation.
SELECT and FROM statements
We need to specify two pieces of information to use the SELECT statement, i.e., what you want and where you want to select it from.
SELECT: It is used to select data from the database.FROM: It is used to specify from where the data is being taken
Syntax:
SELECT column1, column2,..
FROM table_nameAVG() function
AVG(): This function returns the average value of a numeric column.
AVG() is one of the aggregate functions. Some of the other aggregate functions are, COUNT(), SUM(), MAX(), MIN().
Here, we are storing the average value of the column price in a new column named average_price.
Syntax:
AVG(numerical_column)WHERE clause
WHERE: It is used to filter records and extract only those records that fulfil a specified condition.
The following operators can be used with the WHERE clause.
| Operator | Description | |
|---|---|---|
| 0 | = | Equal to |
| 1 | > | Greater than |
| 2 | < | Less than |
| 3 | >= | Greater than or equal |
| 4 | <= | Less than or equal |
| 5 | <> | Not Equal |
| 6 | BETWEEN | Between a certain range |
| 7 | LIKE | Search for a pattern |
| 8 | IN | To specify multiple possible values for a column |
In the above query, we are using the BETWEEN operator with the WHERE clause.
Syntax:
SELECT column1
FROM table_name
WHERE condition;GROUP BY statement
GROUP BY: It groups rows that have the same values into summary rows. It is often used with aggregate functions to group the result-set by one or more columns.
Syntax:
SELECT COUNT(column1),column2
FROM table_name
GROUP BY column2;ORDER BY keyword
ORDER BY: It is used to sort the result-set in ascending or descending order. It sorts the records in ascending order by default. To sort the records in descending order, we can use theDESCkeyword.
Syntax:
SELECT column1, column2,...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;Note: LIMIT 5 will print the first 5 rows for us.
Now, let's look at the explanation of the above query.
Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingregion, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping all the different regions together using
GROUP BY. - Arranging the result in ascending order using
ORDER BY. - Printing only the top 5 rows using the
LIMITfunction.
This gives us the top 5 regions with the lowest average price.
Q2. Which 5 regions have the highest average pricing?
Answer:
SELECT Avg(price) AS average_price,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY region
ORDER BY Avg(price) DESC
LIMIT 5; Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingregion, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping all the different regions together using
GROUP BY. - Arranging the result w.r.t
AVG(price)in descending order usingORDER BYandDESCfunctions. - Printing only the top 5 rows using the
LIMITfunction.
This will give us the top 5 regions with the highest average pricing.
Q3. How does the price of a vehicle vary across the distance travelled by it?
Answer:
SELECT Avg(price) AS average_price,
CASE
WHEN odometer >= 0
AND odometer <= 20000 THEN '0-20000'
WHEN odometer > 20000
AND odometer <= 40000 THEN '20000-40000'
WHEN odometer > 40000
AND odometer <= 60000 THEN '40000-60000'
WHEN odometer > 60000
AND odometer <= 80000 THEN '60000-80000'
WHEN odometer > 80000
AND odometer <= 100000 THEN '80000-100000'
WHEN odometer > 100000 THEN '<100000'
ELSE 'NA'
END AS distance_travelled
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY distance_travelled; Let's first understand what the CASE function means.
CASE statement
CASE: It goes through conditions and returns a value when the first condition is met. The conditions are defined using theWHENandTHENfunctions. Once a condition is true, it will stop the loop and return the result. If no conditions are true, it returns the value in theELSEclause. If there is noELSEpart and no conditions are true, it returns NULL. It ends with aENDcommand.
Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
.
.
.
ELSE final_result
END;Now, let's look at the explanation of the above query.
Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Defining cases for different conditions using the
CASEstatement, i.e.: when odometer>=0 and odometer <=20000 then we are printing '0-20000' and so on.. and naming this new columndistance_travelledusing theENDcommand. - Setting the price limit between 1 to 655000.
- Grouping the results with respect to the new column
distance_travelled.
Q4. How has the price of vehicles changed over the years?
Answer:
SELECT Avg(price)AS average_price,
year
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY year; Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingyear, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping the results w.r.t. the year using the
GROUP BYfunction.
Q5. How does a vehicle's condition impact its price?
SELECT Avg(price)AS average_price,
condition
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY condition;Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingcondition, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping the results w.r.t. condition using the
GROUP BYcommand.
Q6. Which color vehicle sells for the highest price?
Answer:
SELECT Avg(price)AS average_price,
paint_color
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY paint_color
ORDER BY average_price DESC
LIMIT 1;Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingpaint_color, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping the results w.r.t. paint colors using
GROUP BY. - Arranging the result in descending order using the
ORDER BYandDESCcommands. - Printing the first row only (to get the highest) using the
LIMIT 1command.
Q7. Which is the highest-priced model across each manufacturer?
Answer:
SELECT Max(price)AS max_price,
model,
manufacturer
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY manufacturer; Explanation:
- Calculating maximum price using
MAX()function and storing it in a new column namedmax_price. - Extracting
max_priceand correspondingmodelandmanufacturer, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping the results w.r.t. manufacturer using the
GROUP BYcommand.
Q8. What are the topmost 5 states that have the highest number of cars along with their corresponding average car price?
Answer:
SELECT Count(DISTINCT id)AS vehicle_counts,
state,
Avg(price) AS average_price
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY state
ORDER BY vehicle_counts DESC
LIMIT 5; Let's first understand what the function DISTINCT() mean.
SELECT DISTINCT Statement
SELECT DISTINCT: It returns only the distinct or unique values.
Syntax:
SELECT DISTINCT column1
FROM table_name;Below is the explanation of the above query.
Explanation:
- Calculating count of unique IDs using
COUNT()andDISTINCT()functions and storing it in a new column namedvehicle_counts(). - Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
vehicle_counts(), average_priceand correspondingstate, w.r.t. the following condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping the results w.r.t. state using the
GROUP BYcommand. - Arranging in descending order using the
ORBER BYandDESCcommands - Printing the first five rows using the
LIMIT 5function.
Q9. Which top 10 models have the highest average price for automatic transmission?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( transmission = 'automatic' )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY model
ORDER BY average_price DESC
LIMIT 10; Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingmodel, w.r.t. the following conditions:
Fixing
transmissiontoautomatic
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. model using the GROUP BY command.
4. Arranging in descending order and printing the first ten rows.
Q10. Find out the model that has the highest average price from the vehicles that have an odometer reading greater than 50000?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( odometer > 50000 )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY model
ORDER BY average_price DESC
LIMIT 1; Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingmodel, w.r.t. the following conditions:
Fixing
ODOMETERto50000
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. model using the GROUP BY command.
4. Arranging in descending order and printing the first row.
Q11. For the year 2021 which model was sold at the highest average price across transmission types?
Answer:
SELECT Avg(price)AS average_price,
model,
transmission
FROM vehicles
WHERE ( year = 2021 )
AND ( price BETWEEN 1 AND 655000 )
GROUP BY transmission
ORDER BY Avg(price) DESC; Explanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingmodelandtransmission, w.r.t. the following conditions:
Fixing
YEARto2021
Setting the price limit from 1 to 655000
3. Grouping the results w.r.t. transmission using the GROUP BY command.
4. Arranging the average price in descending order.
Q12. Which manufacturers have the highest average price across regions?
Answer:
SELECT Max(average_price)AS max_average_price,
manufacturer,
region
FROM (SELECT Avg(price)AS average_price,
manufacturer,
region
FROM vehicles
WHERE price BETWEEN 1 AND 655000
AND manufacturer <> ''
GROUP BY manufacturer,
region
ORDER BY average_price DESC)
GROUP BY region
ORDER BY manufacturer,
region,
max_average_price; Explanation:
- First, we are extracting
average_price,manufacturerandregionusing the procedure followed in the above questions -- (secondSELECTstatement). - Next, using the column
average_price, calculate the maximum of the average price usingMAX(average_price)command and store it in the column namemax_average_price. - Now, we extract
max_average_price,manufacturerandregionfrom the result obtained from step 1 -- (firstSELECTstatement). - In the end, we group the results w.r.t the regions using the
GROUP BYcommand.
Q13. For the manufacturer that has the cheapest 4 wheel drive model, mention the average price for all its models?
Answer:
SELECT Avg(price) AS average_price,
model
FROM vehicles
WHERE ( manufacturer IN (SELECT manufacturer
FROM vehicles
WHERE ( drive = "4wd" )
AND ( price BETWEEN 1 AND 655000 )
AND ( manufacturer != "none" )
GROUP BY model,
manufacturer
ORDER BY Avg(price)
LIMIT 1) )
AND ( price BETWEEN 1 AND 655000 ); Explanation:
- First, we are extracting
manufacturerw.r.t. the following conditions (secondSELECTstatement):
Fixing
DRIVE=4wd
Setting price between 1 and 655000
Fixingmanufacturernot equal toNone
2. Grouping the result w.r.t. model and manufacturer.
3. Arranging the result in ascending order w.r.t. average price and printing the first row.
4. Next, using the result obtained above, we extract the average price and model w.r.t. the following conditions (first SELECT statement):
MANUFACTURERobtained from above result
Setting price between 1 to 655000
Q14. For the model that has the highest volume, mention its average price across states?
Answer:
SELECT AVG(price) as average_price, state
FROM vehicles
WHERE (model IN
(
SELECT model
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY model
ORDER BY COUNT(DISTINCT id) desc
LIMIT 1
))
AND (price between 1 and 655000)
GROUP BY stateExplanation:
- First, we are extracting
modelw.r.t. the following conditions (secondSELECTstatement):
Setting price between 1 and 655000
2. Grouping the result w.r.t. the model.
3. Arranging the result in descending order w.r.t. the number of unique IDs i.e., ( COUNT(DISTINCT id) ) and printing the first row.
4. Next, using the result obtained above, we extract the average price and state w.r.t. the following conditions (the first SELECT statement):
MODELobtained from above result
Setting price between 1 to 655000
5. Grouping the results w.r.t. the state.
Q15. How does the average car price vary across the number of cylinders?
Answer:
SELECT AVG(price) as average_price,cylinders
FROM vehicles
WHERE price BETWEEN 1 AND 655000
GROUP BY cylinders
ORDER BY average_priceExplanation:
- Calculating average price using
AVG()function and storing it in a new column namedaverage_price. - Extracting
average_priceand correspondingcylinders, w.r.t. the condition: Setting the price limit from 1 to 655000 using theWHEREclause. - Grouping all the different cylinders together using
GROUP BY. - Arranging the result in ascending order using
ORDER BY.
Wrap Up
Trying this specially curated list of questions on SQL will help you get a better hold of SQL and will help you feel more confident with your queries. Whether you are just starting on SQL or wish to bring yourself up-to-date, the list of questions here has all that you need.
We know that the best way to learn SQL is by doing. That’s why we have provided you with more such exercises which will help you grow your skills. With every exercise you complete, you will develop stronger SQL abilities.
Find the link to the next part of the exercise below:
Getting Started with SQL Queries - Exercises for Beginners Part-2
See you there!
Show us some 💛 on Twitter
Struggling to understand SELECT and GROUP BY and those gnarly WHERE clauses???
— datascience.fm (@DatascienceFm) January 22, 2022
We have you covered.
Check out a 4 part series that is the only SQL 101 you will ever need.https://t.co/NnsmUT6Jl7#sql #101 #startwithsql