SQL Exercises with Google Play Store Data - all in your own Docker Container
Have you ever wanted to learn SQL or enhance your SQL skills with real-world data? Today, we'll guide you through a journey where you can run SQL exercises on genuine Google Play Store data. The best part? All this will be encapsulated within a Docker container.
This means you don't need to worry about setup, configuration, or any database installations on your local machine!
Getting Started with Docker
Before we dive into the SQL exercises, let's set up our playground using Docker.
1. Pull the Docker Image
For this exercise, we have a ready-made Docker image that contains not only a PostgreSQL database but also PgWeb - a web-based database browser for PostgreSQL. This Docker image also comes with the "playstore_apps" table pre-loaded with real-world Google Play Store app data.
To get started, you first need to pull the Docker image:
docker pull harshsinghal/pgimage-tutor:latest
2. Run the Docker Container
Once you've pulled the image, you can run a Docker container with the following command:
docker run -d --name 'pgwebtutor' -p 5432:5432 -p 8081:8081 -e POSTGRES_USER=root -e POSTGRES_PASSWORD=root -e POSTGRES_DB=data harshsinghal/pgimage-tutor:latest
This command will start a Docker container with PostgreSQL on port 5432 and PgWeb on port 8081.
It also sets the PostgreSQL username and password to root and creates a database named data.
3. Access PgWeb
After running the Docker container, open your web browser and navigate to:
http://localhost:8081
You should see the PgWeb login screen. Fill in the connection details as:
- Host: localhost
- Username: root
- Password: root
- Database: data
- SSL Mode: disable
Click on "Connect", and you should be inside the PostgreSQL database ready to run SQL queries!
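If you would rather connect from psql or another client than from PgWeb, the same settings can be combined into a single connection URI. The sketch below is a minimal illustration, not part of the Docker image: build_postgres_uri and port_is_open are hypothetical helper names, and the values are the ones passed to docker run above. It only builds the URI and checks whether the two ports accept TCP connections; it does not log in to the database.

```python
import socket
from urllib.parse import quote


def build_postgres_uri(user, password, host, port, database, sslmode="disable"):
    """Build a libpq-style connection URI from the individual settings."""
    # Percent-encode the parts that may contain reserved characters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Values taken from the docker run command in this tutorial.
uri = build_postgres_uri("root", "root", "localhost", 5432, "data")
print(uri)  # postgresql://root:root@localhost:5432/data?sslmode=disable

# Quick reachability check for both published ports.
for name, port in (("PostgreSQL", 5432), ("PgWeb", 8081)):
    state = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

If either port shows as not reachable, check that the container is running with docker ps before retrying the PgWeb login.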
The playstore_apps table holds a sample of the dataset available at https://www.kaggle.com/datasets/lava18/google-play-store-apps
We sampled the data and cleaned up several columns so that the exercises run smoothly.
SQL Case Studies
With our database playground set up, it's time to delve into some SQL exercises:
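Case Study 1: Most Popular Categories
Find the five highest-rated categories, considering only categories that contain at least 1,000 apps.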
SELECT
category,
AVG(rating) AS average_rating,
COUNT(*) AS app_count
FROM
playstore_apps
GROUP BY
category
HAVING
COUNT(*) >= 1000
ORDER BY
average_rating DESC
LIMIT 5;
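To see how GROUP BY, HAVING, ORDER BY, and LIMIT interact in the query above, here is a small sketch you can run without the container. It is an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the rows are made up, the app column is hypothetical, and the threshold is lowered from 1000 to 2 so the toy data produces a result.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playstore_apps (app TEXT, category TEXT, rating REAL)")

# Made-up rows for illustration only; these are not from the real dataset.
rows = [
    ("App A", "GAME", 4.5),
    ("App B", "GAME", 4.1),
    ("App C", "TOOLS", 3.9),
    ("App D", "TOOLS", 4.1),
    ("App E", "TOOLS", 4.0),
    ("App F", "COMICS", 4.9),
]
conn.executemany("INSERT INTO playstore_apps VALUES (?, ?, ?)", rows)

MIN_APPS = 2  # the tutorial query uses 1000; lowered for the toy data

query = """
SELECT category, AVG(rating) AS average_rating, COUNT(*) AS app_count
FROM playstore_apps
GROUP BY category
HAVING COUNT(*) >= ?
ORDER BY average_rating DESC
LIMIT 5
"""
result = conn.execute(query, (MIN_APPS,)).fetchall()

for category, average_rating, app_count in result:
    print(category, round(average_rating, 2), app_count)
```

COMICS has the highest rating in the toy data but only one app, so the HAVING clause removes it before the ordering is applied. The same filter in the real query keeps small categories with a handful of highly rated apps from topping the list.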
Case Study 2: Performance of Paid vs Free Apps
Compare the average rating of free apps against that of paid apps.
SELECT
CASE WHEN free = 1 THEN 'Free' ELSE 'Paid' END AS app_type,
AVG(rating) AS average_rating
FROM
playstore_apps
GROUP BY
free;
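The sketch below runs the same CASE expression on a few made-up rows so you can see what the grouping produces. As before, it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. It assumes that free is stored as an integer flag (1 for free, 0 for paid), as the query implies; if the column were a PostgreSQL boolean, the comparison free = 1 would raise a type error and the condition would be written as WHEN free instead. The app column and the two extra count columns are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playstore_apps (app TEXT, free INTEGER, rating REAL)")

# Made-up rows; App D has no rating (NULL) to show how AVG treats it.
conn.executemany(
    "INSERT INTO playstore_apps VALUES (?, ?, ?)",
    [
        ("App A", 1, 4.0),
        ("App B", 1, 4.5),
        ("App C", 0, 4.5),
        ("App D", 1, None),
    ],
)

query = """
SELECT
    CASE WHEN free = 1 THEN 'Free' ELSE 'Paid' END AS app_type,
    AVG(rating) AS average_rating,
    COUNT(*) AS app_count,        -- all apps in the group
    COUNT(rating) AS rated_count  -- only apps with a non-NULL rating
FROM playstore_apps
GROUP BY free
ORDER BY app_type
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

AVG skips NULL ratings, so the Free average is computed over two apps even though the group contains three. Comparing COUNT(*) with COUNT(rating) is a quick way to check how many rows actually contribute to each average.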
Conclusion
Docker has made it incredibly convenient for users to learn and practice SQL without the hassles of installing and configuring databases.
Combining Docker with real-world datasets further enriches the learning experience. Dive into these exercises, explore the dataset, and enhance your SQL proficiency!
Stay Tuned for More!
To all our passionate data enthusiasts and aspiring data scientists reading this, we have some thrilling news for you!
In our commitment to providing you with the most enriching learning experiences, we are in the process of developing a new Docker container. This isn't just any container. It will come pre-loaded with a plethora of real-world datasets spanning multiple industries and domains.
Whether you're looking to dig into e-commerce analytics, explore healthcare data, or work through social media metrics, we'll have something in store for you!
But that's not all!
As we release this new container, we'll also be launching an eBook centered on "Learning Product Analytics." This guide will be your go-to resource, taking you on a comprehensive journey from the basics of product analytics to advanced strategies, all illustrated with real-world examples.
Exclusive for Our Subscribers: This eBook will be available absolutely FREE for all datascience.fm subscribers. If you haven’t subscribed yet, now is the perfect time. Not only will you gain access to this invaluable resource, but you'll also be first in line for future updates, tutorials, and exclusive content.
Become a datascience.fm subscriber now and unlock a world of data-driven opportunities!
Thank you for being a part of our community.
We're excited about the journey ahead and are grateful to have you with us every step of the way. Stay curious, keep learning, and remember: the world of data is vast and waiting for you to explore!