SQL Exercises with Google Play Store Data - all in your own Docker Container

Have you ever wanted to learn SQL or enhance your SQL skills with real-world data? Today, we'll guide you through a journey where you can run SQL exercises on genuine Google Play Store data. The best part? All this will be encapsulated within a Docker container.

This means you don't need to install or configure a database on your local machine. All you need is a working Docker installation.

Getting Started with Docker

Before we dive into the SQL exercises, let's set up our playground using Docker.

1. Pull the Docker Image

For this exercise, we have a ready-made Docker image that contains not only a PostgreSQL database but also PgWeb - a web-based database browser for PostgreSQL. This Docker image also comes with the "playstore_apps" table pre-loaded with real-world Google Play Store app data.

To get started, you first need to pull the Docker image:

docker pull harshsinghal/pgimage-tutor:latest

2. Run the Docker Container

Once you've pulled the image, you can run a Docker container with the following command:

docker run -d --name pgwebtutor -p 5432:5432 -p 8081:8081 -e POSTGRES_USER=root -e POSTGRES_PASSWORD=root -e POSTGRES_DB=data harshsinghal/pgimage-tutor:latest

This command starts a container in the background (-d) with PostgreSQL published on port 5432 and PgWeb on port 8081.

It also sets both the PostgreSQL username and password to root, and names the default database data.
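
If you want to confirm that both services are reachable before opening the browser, a small check like the following may help. It is only a sketch using Python's standard library: the helper name port_open is our own, and the ports are the ones published by the docker run command above. A port that reports "not reachable" right after startup may simply need a few more seconds.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Ports taken from the docker run command above.
for name, port in [("PostgreSQL", 5432), ("PgWeb", 8081)]:
    status = "listening" if port_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {status}")
```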

3. Access PgWeb

After running the Docker container, open your web browser and navigate to:

http://localhost:8081

You should see the PgWeb login screen. Fill in the connection details as:

  • Host: localhost
  • Username: root
  • Password: root
  • Database: data
  • SSL Mode: disable

Click on "Connect", and you should be inside the PostgreSQL database ready to run SQL queries!
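
Many PostgreSQL clients accept the same details as a single connection URL instead of separate fields. The sketch below assembles one from the values listed above, assuming the default port 5432 published by the docker run command. The function build_pg_url is a hypothetical helper written for this illustration, not part of PgWeb or PostgreSQL.

```python
from urllib.parse import quote


def build_pg_url(user, password, host, port, database, sslmode="disable"):
    """Assemble a PostgreSQL connection URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


# Values from the connection details above; port 5432 comes from docker run.
url = build_pg_url("root", "root", "localhost", 5432, "data")
print(url)  # postgresql://root:root@localhost:5432/data?sslmode=disable
```

Percent-encoding matters only when a credential contains characters such as @ or /, which is not the case for root, but it keeps the helper safe to reuse.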

The "playstore_apps" table holds a sample of the Google Play Store Apps dataset available at https://www.kaggle.com/datasets/lava18/google-play-store-apps

We sampled the data and cleaned up several columns so that the queries below run without extra preparation.

SQL Case Studies

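With our database playground set up, it's time to work through some SQL exercises.

Case Study 1: Top-Rated Categories

Find the five categories with the highest average rating, considering only categories that contain at least 1,000 apps.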

SELECT
  category,
  AVG(rating) AS average_rating,
  COUNT(*) AS app_count
FROM 
  playstore_apps
GROUP BY 
  category
HAVING
  COUNT(*) >= 1000
ORDER BY
  average_rating DESC
LIMIT 5;
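
To see how GROUP BY, HAVING, and ORDER BY interact in the query above, it can help to run the same shape of query on a handful of rows. The sketch below is not the real dataset: it uses Python's built-in sqlite3 module in place of PostgreSQL, six invented rows, a hypothetical app column, and a HAVING threshold lowered from 1000 to 2 so the toy data produces results.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL container.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playstore_apps (app TEXT, category TEXT, rating REAL)")

# Invented rows for illustration only.
rows = [
    ("a1", "GAME", 4.5), ("a2", "GAME", 4.1), ("a3", "GAME", 4.3),
    ("b1", "TOOLS", 3.9), ("b2", "TOOLS", 4.0),
    ("c1", "DATING", 5.0),
]
conn.executemany("INSERT INTO playstore_apps VALUES (?, ?, ?)", rows)

query = """
SELECT category, AVG(rating) AS average_rating, COUNT(*) AS app_count
FROM playstore_apps
GROUP BY category
HAVING COUNT(*) >= 2          -- threshold lowered for the toy data
ORDER BY average_rating DESC
LIMIT 5;
"""
top_categories = conn.execute(query).fetchall()
for category, average_rating, app_count in top_categories:
    print(category, round(average_rating, 2), app_count)
```

DATING has the highest rating in the toy data but only one app, so HAVING filters it out before the ordering is applied. That is the purpose of the threshold: a category with very few apps cannot top the list on the strength of one or two ratings.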

Case Study 2: Performance of Paid vs Free Apps

Compare the average rating of free apps against that of paid apps.

SELECT
  CASE WHEN free = 1 THEN 'Free' ELSE 'Paid' END AS app_type,
  AVG(rating) AS average_rating
FROM 
  playstore_apps
GROUP BY 
  free;
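
An average on its own can hide how many apps it is based on. The sketch below runs a variant of the query above on invented rows, again using Python's sqlite3 module rather than the PostgreSQL container, and assumes that free is stored as the integer 1 or 0, as the comparison free = 1 suggests. The app column, the extra COUNT columns, and the ORDER BY are our additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playstore_apps (app TEXT, free INTEGER, rating REAL)")

# Invented rows; app "c" has no rating (NULL).
conn.executemany(
    "INSERT INTO playstore_apps VALUES (?, ?, ?)",
    [("a", 1, 4.0), ("b", 1, 3.0), ("c", 1, None), ("d", 0, 4.5), ("e", 0, 4.0)],
)

query = """
SELECT
  CASE WHEN free = 1 THEN 'Free' ELSE 'Paid' END AS app_type,
  AVG(rating) AS average_rating,
  COUNT(*) AS app_count,        -- all apps in the group
  COUNT(rating) AS rated_count  -- only apps with a non-NULL rating
FROM playstore_apps
GROUP BY free
ORDER BY app_type;
"""
paid_vs_free = conn.execute(query).fetchall()
for row in paid_vs_free:
    print(row)
```

AVG skips NULL values, so the free group's average is computed from two ratings even though the group contains three apps. Comparing app_count with rated_count shows how many apps actually contribute to each average.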

Conclusion

Docker has made it incredibly convenient for users to learn and practice SQL without the hassles of installing and configuring databases.

Combining Docker with real-world datasets further enriches the learning experience. Dive into these exercises, explore the dataset, and enhance your SQL proficiency!

Stay Tuned for More!

To all our passionate data enthusiasts and aspiring data scientists reading this, we have some thrilling news for you!

In our commitment to providing you with the most enriching learning experiences, we are in the process of developing a new Docker container. This isn't just any container. It will come pre-loaded with a plethora of real-world datasets spanning multiple industries and domains.

Whether you're looking to dig into e-commerce analytics, explore healthcare data, or work with social media metrics, we'll have something in store for you!

But that's not all!

As we release this new container, we'll also be launching an eBook centered on "Learning Product Analytics." This guide will be your go-to resource, taking you on a comprehensive journey from the basics of product analytics to advanced strategies, all illustrated with real-world examples.

Exclusive for Our Subscribers: This eBook will be available absolutely FREE for all datascience.fm subscribers. If you haven’t subscribed yet, now is the perfect time. Not only will you gain access to this invaluable resource, but you'll also be first in line for future updates, tutorials, and exclusive content.

Become a datascience.fm subscriber now and unlock a world of data-driven opportunities!

Thank you for being a part of our community.

We're excited about the journey ahead and are grateful to have you with us every step of the way. Stay curious, keep learning, and remember: the world of data is vast and waiting for you to explore!