# Getting Started with SQL Queries - Exercises for Beginners Part-3

Here's a selection of the most useful SQL queries every beginner must practice. This article will not only provide you with the answers but also the explanation to each query.

# Introduction

Welcome to the third part of the SQL series! We will discuss more challenging yet exciting questions on the Cars Dataset.

If you are new here, we recommend you to check out parts one and two first, since many concepts are carried forward in this tutorial.

Dataset: Used Cars Dataset

Getting Started with SQL Queries - Exercises for Beginners Part-1

Getting Started with SQL Queries - Exercises for Beginners Part-2

To learn how to run your queries on Dbeaver or Python, visit the link below:

Now, let's get jump straight to the questions.

💡

## Q1. How has the percentage market share of manufacturers changed across the years?

``````select table1.year,manufacturer_grouped_price,total_price,
(100*manufacturer_grouped_price/total_price) as market_share
from
(select sum(price) as manufacturer_grouped_price,manufacturer,year
from vehicles
where (price between 1 and 655000) and (manufacturer !="") and (year
!="")
group by manufacturer,year
order by year) table1
inner join
(select sum(price) as total_price,year
from vehicles
where (price between 1 and 655000) and (manufacturer !="") and (year
!="")
group by year
order by year) table2
on table1.year=table2.year``````

Explanation:

We will divide the whole query into small parts and understand each one in-depth.

1. Let's first look at the part of the query given below:
``````(select sum(price) as manufacturer_grouped_price,manufacturer,year
from vehicles
where (price between 1 and 655000) and (manufacturer !="") and (year
!="")
group by manufacturer,year
order by year) table1``````

Here, we are extracting sum of price as `manufacturer_grouped_price`, `manufacturer` and `year`, w.r.t the following conditions using the `WHERE` clause:

Fixing `manufacturer`not equal to " "
Fixing `year` not equal to " "
Setting `price` between 1 to 655000

After this, we group our results w.r.t the `manufacturer`and `year`. Next, we arrange the results in ascebding order w.r.t `year`. We call this resultant table, `table1`.

2.  Now, let's understand another chunk of the query below:

``````(select sum(price) as total_price,year
from vehicles
where (price between 1 and 655000) and (manufacturer !="") and (year
!="")
group by year
order by year) table2``````

Here, we are extracting the sum of price as `total_price` and `year`, w.r.t the following conditions using the `WHERE` clause:

Fixing `manufacturer` not equal to " "
Fixing `year` not equal to " "
Setting `price` between 1 to 655000

After this, we group our results w.r.t the `year`. Arrange the result in ascending order w.r.t `year`. We call this resultant table, `table2`.

3.  Next, we use `INNER JOIN` on `table1` and `table2` and we specify the join-predicate to be `year` using the query:

`on table1.year=table2.year`

4. Lastly, from this resultant table we extract the following columns:

• `table1.year`
• `manufacturer_grouped_price`
• `total_price`
• (100* `manufacturer_grouped_price` / `total_price` ) as `market_share`.

The final table will give us the percentage market share of manufacturers across the years.

## Q2. In which month the highest number of cars are listed for sale?

``````select count(id) vehicle_counts,SUBSTRING(posting_date,6,2) as month
from vehicles
where (posting_date!="") and (price between 1 and 655000)
group by month
order by vehicle_counts desc
limit 1
``````

First, let's learn about the `SUBSTRING()` function.

### SUBSTRING()

• `SUBSTRING()`: This function helps to extract some characters from a string.

Syntax:

``SUBSTRING(string, start, length)``

Explanation:

1. We extract no. of ids using `COUNT(id)` as `vehicle_counts`and `months` by applying the `substring()`function on the column `posting_date`.
2. We get the above result w.r.t some conditions using the `WHERE()`clause, i.e.:-

Fixing `posting_date` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the results w.r.t. the `month`.

4.  Arranging the result in descending order w.r.t. `vehicle_counts` and printing the 1st-row using `limit 1`.

## Q3. What are the most frequent conditions across colours?

``````select count(id) vehicle_counts,condition,paint_color
from vehicles
where (paint_color!="") and (condition!="") and (price between 1 and 655000)
group by condition,paint_color
order by vehicle_counts desc
``````

Explanation:

1. We extract the number of ids using `count(id)`, `condition` and `paint_color`.
2. We filter the above results w.r.t some conditions using the `WHERE()` clause, i.e.,

Fixing `paint_color` not equal to " "
Fixing `condition` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the results w.r.t. `condition` and `paint_color`.

4.  Arranging the final table in descending order of `vehicle_count`.

## Q4. Which colours are the most frequent across vehicle types, ranked in decreasing order of frequency?

``````select count(id) vehicle_counts,type,paint_color,
rank() over(partition by type order by count(id) desc)  rank_col
from vehicles
where (paint_color!="") and (type!="") and (price between 1 and 655000)
group by type,paint_color``````
We already know what the `RANK()`function does (Exercise -2, Question #10), so jumping straight to the explanation.

Explanation:

1. We extract no. of ids using `count(id)` as `vehicle_counts`, `type` and `paint_color`.
2. We also extract rank for each row using the function `rank()`. To specify the ranks, we use the criteria that we partition the result w.r.t. `type` and arrange it in descending order w.r.t `count(id)` (using order by) ~ saving this in the column name `rank_col`.
3. We want the above result w.r.t the following conditions using the `WHERE` clause:

Fixing `paint_color`not equal to " "
Fixing `type` not equal to " "
Setting `price` between 1 to 655000

4.  Grouping the above result w.r.t. `type` and `paint_color`.

## Q5. How is the frequency of vehicles distributed across their condition and type?

``````select count(id) as total_vehicles,condition,type
from vehicles
where (condition!="") and (price between 1 and 655000) and (type!="")
group by condition,type``````

Explanation:

1. Extracting no. of ids using `count(id)` as `total_vehicles`, `condition` and `type`.
2. Filtering the above result w.r.t the following conditions using the `WHERE` clause:

Fixing `condition`not equal to " "
Fixing `type` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the final result w.r.t. `condition` and `type`.

## Q6. Rank (without repetition) the models according to their frequency across posting years?

``````select count(id) as total_vehicles,model,SUBSTRING(posting_date,1,4) as post_year,
ROW_NUMBER () over(partition by SUBSTRING(posting_date,1,4) order by count(id) desc)  rank_col
from vehicles
where (model!="") and (price between 1 and 655000) and (posting_date!="")
group by model,post_year
``````
We already know what the `ROW_NUMBER()`function does (Exercise -2, Question #13), so jumping straight to the explanation.

Explanation:

1. We extract the number of ids using `count(id)` and save it in the column `total_vehicles`.
2. We also extract `model`, `post_year` using the `substring` function.
3. Next, we use `row_number` over `post_year`. Then, we sort it by arranging them in descending order using the code: `order by count(id) desc` . We save this in a column named `rank_col`.
4. We want the above result w.r.t the following conditions using the `WHERE` clause:

Fixing `model`not equal to " "
Fixing `posting_date`not equal to " "
Setting `price` between 1 to 655000

5.  Grouping the final result w.r.t `model`and `post_year`.

## Q7. The highest number of vehicles were posted for sale in which year?

``````select count(id) as total_vehicles,SUBSTRING(posting_date,1,4) as post_year
from vehicles
where  (price between 1 and 655000) and (posting_date!="")
group by post_year
order by total_vehicles
limit 1
``````

Explanation:

1. We extract no. of ids using `COUNT(id)` as `total_vehicles`and `post_year` by applying the `substring()`function on the column `posting_date`.
2. We get the above result w.r.t some conditions using the `WHERE()`clause, i.e.:-

Fixing `posting_date` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the results w.r.t. the `post_year`.

4.  Arranging the result in ascending order w.r.t. `total_vehicles` and printing the 1st-row using `limit 1`.

## Q8. Mention the top 10 costliest models for every manufacturer?

``````select *
from(
select max(price) as max_price,model,manufacturer,
ROW_NUMBER () over(partition by manufacturer order by max(price) desc)  rank_col
from vehicles
where  (price between 1 and 655000) and (model!="") and (manufacturer !="")
group by model,manufacturer
)
where rank_col<=10
``````
Refer to Question 6 for the explanation.

## Q9. What is the most occurring description phrase across vehicle types?

``````select count(description) as description_occurance,description,type
from vehicles
where  (price between 1 and 655000) and (type!="") and (description !="")
group by type,description
order by description_occurance desc
``````

Explanation:

1. We extract no. of description using `count(description)` as `description_occurance`, `description` and `type`.
2. We filter the above result w.r.t. some conditions using the `where()`clause:

Fixing `type` not equal to " "
Fixing `description` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the above result w.r.t `type` and `description`.

4.  Arranging the final result in descending order w.r.t. `description_occurance`.

## Q10. What is the average description character length across the vehicle condition column?

``````select avg(description_length) as average_length, condition
from
(select length(description) as description_length,description,condition
from vehicles
where  (price between 1 and 655000) and (condition!="") and (description 	!=""))
group by condition
``````

Explanation:

1. First, we are extracting the length of the description using `length(description)` and save it in a column name `description_lenght`. We also extract the columns `description` and `condition` -- (second `SELECT` statement).
2. Filtering the above result w.r.t the following conditions using `WHERE` clause:

Fixing `condition` not equal to " "
Fixing `description` not equal to " "
Setting `price` between 1 to 655000

3.  Now, we extract `avg(description_length)` as `average_length` and `condition` from the result obtained from step 1 -- (first `SELECT` statement).

4.  In the end, we group the results w.r.t `condition` using the `GROUP BY` command.

## Q11. Mention the most selling model across every state?

``````select *
from(
select COUNT(id) vehicle_counts,model,state,
ROW_NUMBER () over(partition by state order by COUNT(id) desc)  rank_col
from vehicles
where  (price between 1 and 655000) and (model!="") and (state !="")
group by model,state)
where rank_col=1
``````
Refer to Question 6 for the explanation.

## Q12. Arrange the number of postings in decreasing order of their posting time?

``````select COUNT(id) as vehicle_counts, SUBSTRING(posting_date,12,2) as hour
from vehicles
where  (price between 1 and 655000) and (posting_date!="")
group by SUBSTRING(posting_date,12,2)
order by vehicle_counts desc``````
Refer to Question 7 for the explanation.

## Q13. What is the percentage distribution of trucks across fuel type?

``````select table1.fuel,fuel_vehicle_counts,total_vehicle_counts,
(100*fuel_vehicle_counts/total_vehicle_counts)
from
(select COUNT(id) as fuel_vehicle_counts,fuel,type
from vehicles
where  (price between 1 and 655000) and (fuel!="")  and (type="truck")
group by fuel,type) table1
inner join
(select COUNT(id) as total_vehicle_counts,type
from vehicles
where  (price between 1 and 655000) and (fuel!="")  and (type="truck")
group by type) table2
on table1.type=table2.type``````
Refer to Question 1 for the explanation.

## Q14. How does the price vary for Sedans across their condition ?

``````select avg(price) as average_price,condition
from vehicles
where  (price between 1 and 655000) and (condition!="")  and
(type="sedan")
group by condition``````

Explanation:

1. Calculating average price using `AVG()` function and storing it in a new column named `average_price`.
2. Extracting `average_price` and corresponding `condition`, w.r.t. the conditions:

Fixing `condition` not equal to " "
Fixing `type` equal to "sedan"
Setting `price` between 1 to 655000

3.  Grouping all the result w.r.t. `condition` using `GROUP BY`.

## Q15. Arrange the number of postings in decreasing order of their posting time?

``````select COUNT(id) as vehicle_counts, SUBSTRING(posting_date,12,2) as hour
from vehicles
where  (price between 1 and 655000) and (posting_date!="")
group by SUBSTRING(posting_date,12,2)
order by vehicle_counts desc
``````

Explanation:

1. We extract no. of ids using `COUNT(id)` as `vehicle_counts`and `hours` by applying the `substring()`function on the column `posting_date`.
2. We get the above result w.r.t some conditions using the `WHERE()`clause, i.e.:-

Fixing `posting_date` not equal to " "
Setting `price` between 1 to 655000

3.  Grouping the results w.r.t. the `hour`.

4.  Arranging the result in descending order w.r.t. `vehicle_counts`.

# Conclusion

Here, we come to the end of our practice questions. You can walk through these on your own.

But, how much practice do you need to master SQL?

Find out how many questions you can answer without getting stuck or looking back at the explanation. If you get stuck, come back to Exercise Parts 1,2 and 3 and you will find the explanation here.

Click on the link below for more challenging questions on the same dataset.

Getting Started with SQL Queries - Exercises for Beginners Part-4

Come meet us there!