Welcome to the final newsletter of this year (and get ready for the 'see you next year' jokes). Let's get into some articles, recent developments, and around-the-web chatter from the world of data.
- Top Python Libraries of 2021
Take a look at the 7th edition of the yearly Top Python Libraries list. Compiled since 2015, this list presents the 'best Python libraries that are launched or popularized'. Django Ninja made the list! What else?
- The Art of Automation: Automation in Financial Services
AI-powered automation has made its way well into the financial services industry, transforming organizations' operations and enhancing their interactions with customers. This article is a condensed conversation between Jerry Cuomo and Oscar Roque on the application of automation in the enterprise.
- Spotify: How we improved Data Discovery for our Data Scientists?
Spotify diagnosed a data discovery problem back in 2016, when it migrated to Google Cloud Platform and saw an explosion of dataset creation in BigQuery. As the company produced more research and insights, the crux of the problem was believed to be the lack of a centralized catalog for this data and these insights. "In early 2017, we released Lexikon, a library for data and insights, as the solution to this problem."
- Ways to use 'Testing' as a Data Scientist
"As a data scientist, I wear many different hats, which also made learning about testing difficult. There’s plenty of material on testing from a software development perspective, but if I’m doing analysis and not developing software, I found many of those concepts difficult to translate and apply in my work.
In that spirit, I thought I would write a blog post on the many ways I use testing in my work, in hopes that other data scientists will find it helpful when they’re trying to figure out what to test and how to test in the code they write."
- Introducing Skippa 0.1.10!
Released on December 28, 2021, Skippa is a SciKit-learn Pre-processing Pipeline in Pandas. It lets you create a pre-processing and modeling pipeline that is based on scikit-learn transformers but preserves the pandas DataFrame format throughout all pre-processing. This can drastically simplify development by serializing data cleaning and pre-processing together with your model pipeline.
- What's new in PyCharm 2021.3
"This release cycle introduces Poetry support, the new FastAPI project type, the Beta version of our Remote Development support, and a redesigned Jupyter Notebook experience."
- Announcing the NeurIPS 2021 Award Recipients
Neural Information Processing Systems (NeurIPS) is one of the largest AI conferences of the year. This year, six papers were chosen as recipients of the Outstanding Paper Award. The committee selected these papers for their excellent clarity, insight, creativity, and potential for lasting impact.
Around the Web:
- By @TIME
“With still emerging AI technologies creating an insatiable hunger for more computation, Huang’s team is well-positioned to keep driving technological advances for decades to come,” writes @AndrewYNg #TIME100 https://t.co/JUC0eQwPc7 pic.twitter.com/DC1NeemKVb— TIME (@TIME) September 15, 2021
Would love your feedback on this: AI Systems = Code (model/algorithm) + Data. Most academic benchmarks/competitions hold the Data fixed, and let teams work on the Code. Thinking of organizing something where we hold the Code fixed, and ask teams to work on the Data. (1/2)— Andrew Ng (@AndrewYNg) May 24, 2021
If one of your New Year's resolutions is to stay up to date in the field of data, this newsletter is one way to go. Consider subscribing for more such issues.
See You Next Year (Hah!)