January 10, 2022 - SQL Queries tutorial, Data Analysis using Pandas & Plotly, and more

Wrote 2021 instead of 2022 in the heading - we are still in that stage. Check out today's newsletter to catch up on informative content from around the data world.

Photo by Annie Spratt / Unsplash


Welcome to the latest newsletter! Today we will go through some informative articles from around the web, interesting tutorials, job postings, and tweets.

Dive in to learn more!

Around the Web:

  • ML 101 Series: Support Vector Machines
    Support vector machines (SVMs) are powerful yet flexible supervised machine learning algorithms. They can be used for both classification and regression, though in practice they are mostly applied to classification problems. This post covers SVMs as a part of the ML 101 series.
  • Data Science and the Art of Persuasion by Harvard Business Review
    Despite heavy investments to acquire talented data scientists and take advantage of the analytics boom, many companies have been disappointed in the results. The problem is that those scientists are trained to ask smart questions, wrangle the relevant data, and uncover insights—but not to communicate what those insights mean for the business. To be successful, the author writes, a data science team needs six talents: project management, data wrangling, data analysis, subject expertise, design, and storytelling. He outlines four steps for achieving that success: (1) Define talents, not team members. (2) Hire to create a portfolio of necessary talents. (3) Expose team members to talents they don’t have. (4) Structure projects around talents.
  • Forbes: EarthOptics Is Using Artificial Intelligence To Help Consumers Choose Climate-Smart Products
    EarthOptics' new Soil Carbon Project labeling initiative is designed to help growers and the food industry quantitatively demonstrate to consumers that the climate-smart products they purchased contributed to the world's carbon neutrality goals.
  • ML 101 Series: K-Means Clustering: An Overview
    Clustering is the process of grouping similar data points together. K-means clustering is a typical unsupervised algorithm: it draws inferences from the input vectors alone, without referring to labeled outputs. The primary purpose of the K-means algorithm is to find groups in the data, with the number of groups represented by the variable K. This post covers K-Means Clustering as a part of the ML 101 series.
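
To make the K-means item above a little more concrete, here is a minimal sketch of the usual assign-then-update loop (often called Lloyd's algorithm) in plain Python. It is not taken from the linked post: the function name `kmeans`, the 2-D toy points, and the fixed random seed are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Toy K-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Nothing moved, so the assignment is stable.
        centroids = new_centroids
    return centroids, labels


# Hypothetical data: two loose groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Note that no labels are passed in: the groups come only from the input points and the chosen K, which is what makes the method unsupervised.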

Tutorials & Interesting Python Libraries:

  • Getting Started with SQL Queries - Exercises for Beginners Part-1
    "Here's a selection of the most useful SQL queries every beginner must practice. This article will not only provide you with the answers but also the explanation to each query." This is part-1 of a four-part tutorial series.
  • Analysis of Flipkart Sales Dataset using Pandas & Plotly
    As one of Flipkart's analysts, you are asked to present a detailed report for the management on various aspects. You are provided with a dataset consisting of 20,000 purchases. We will use the Pandas library to analyze the dataset. Moreover, it is recommended to visualize the results using various graphs and charts. Therefore, in addition to Pandas, we will make use of the Plotly library as well.
  • Tutorial: Create a Neural Network from Scratch in Python 3
    Neural networks are a foundational concept in machine learning. "In this post, we’re going to build a fully connected deep neural net (DNN) from scratch in Python 3."
  • Library: Attrs - Classes without Boilerplate
    attrs is the Python package that will bring back the joy of writing classes by relieving you from the drudgery of implementing object protocols. Trusted by NASA for Mars missions since 2020!
    Its main goal is to help you to write concise and correct software without slowing down your code.
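
For a taste of the kind of beginner query the SQL tutorial in the list above practices, here is a small sketch using Python's built-in `sqlite3` module. The `purchases` table, its columns, and the sample rows are hypothetical and are not taken from the tutorial or from the Flipkart dataset.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, category TEXT, price REAL)"
)

# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO purchases (category, price) VALUES (?, ?)",
    [
        ("books", 12.5),
        ("books", 30.0),
        ("toys", 8.0),
        ("toys", 45.0),
        ("games", 20.0),
    ],
)

# Filter with WHERE, aggregate with GROUP BY, and sort with ORDER BY.
rows = conn.execute(
    """
    SELECT category, COUNT(*) AS n, SUM(price) AS total
    FROM purchases
    WHERE price > ?
    GROUP BY category
    ORDER BY total DESC
    """,
    (10,),
).fetchall()

for category, n, total in rows:
    print(category, n, total)

conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is a good habit to pick up early even for practice queries.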

Interesting Jobs for Data Folks:

  • Calm: Data Scientist, Analytics (Remote, USA)
    Calm's vision is to make the world a happier and healthier place. Currently, Calm is offering a Data Scientist position in the Analytics team focused on leveraging its growing dataset to help prioritize product and content development, identify optimization opportunities, and automate data flows and analyses for key decision-makers.
  • TrueAccord: Data Scientist (Remote, USA)
    TrueAccord combines machine learning with a human-based approach to transform debt resolution and to get people on the path towards financial health. TrueAccord is looking for a data scientist to join its growing Data Science team, who will build machine learning models that optimize and elevate their automated debt collection strategy.
  • CircleCI: Data Analyst for Product (Remote US, Remote Canada)
    CircleCI is a pre-built CI/CD system that enables you to bring your creativity forward and scale it quickly. CircleCI is looking for a Data Analyst for Product who will lead all things data for a product team, analyze rich data sets, and improve their product to help 100,000+ developers ship code faster.



We hope this issue brought some important content from the data world to your attention. For more such content, subscribe.

See you next time!