Become a Python data science expert with these hands-on tutorials

Dunder Data Challenge #3 - Optimal Solution

dunder data challenges Sep 17, 2019

In this article, I will present an ‘optimal’ solution to Dunder Data Challenge #3. Please refer to that article for the problem setup. Work on this challenge directly in a Jupyter Notebook right now by clicking this link.

Naive Solution — Custom function with apply

The naive solution was presented in detail in the previous article. The end result was a massive custom function containing many boolean filters used to find specific subsets of data to...
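To make the naive approach concrete, here is a minimal sketch of a custom function with boolean filters passed to a groupby `apply`. It is not the function from the challenge: the DataFrame, its column names (`country`, `region`, `revenue`), and the `summarize` helper are all hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data; the challenge uses a different dataset
df = pd.DataFrame({
    "country": ["US", "US", "US", "CA", "CA"],
    "region": ["East", "West", "East", "East", "West"],
    "revenue": [100, 200, 300, 50, 150],
})

def summarize(group):
    # Boolean filter that picks out one subset of the group
    is_east = group["region"] == "East"
    # Each entry aggregates a different subset of the same group
    return pd.Series({
        "east_revenue": group.loc[is_east, "revenue"].sum(),
        "max_west_revenue": group.loc[~is_east, "revenue"].max(),
    })

# The custom function is called once per group
result = df.groupby("country")[["region", "revenue"]].apply(summarize)
print(result)
```

Every additional aggregation means another filter inside the function, which is how such a function grows large.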

Use the brackets to select a single pandas DataFrame column and not dot notation

pandas Sep 13, 2019

pandas offers two ways to select a single column of data: brackets or dot notation. In this article, I suggest using brackets rather than dot notation for the following ten reasons.

Select column names with spaces
Select column names that have the same name as methods
Select columns with variables
Select non-string columns
Set new columns
Select multiple columns
Dot notation is a strict subset of the brackets
Use one way which works for all situations
...
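Several of the reasons listed above can be seen in a short sketch. The DataFrame and its column names here are hypothetical and chosen only to trigger each case.

```python
import pandas as pd

# Hypothetical columns: one with a space, one named like a method, one non-string
df = pd.DataFrame({
    "unit price": [1.5, 2.0, 3.25],
    "count": [10, 20, 30],
    2019: [1, 2, 3],
})

# Column name with a space: df.unit price would be a syntax error
price = df["unit price"]

# Column named like a method: df.count refers to the DataFrame.count method
counts = df["count"]
print(callable(df.count))

# Column name held in a variable
col = "count"
via_variable = df[col]

# Non-string column name: df.2019 would be a syntax error
year = df[2019]

# Setting a new column
df["total"] = df["unit price"] * df["count"]

# Selecting multiple columns with a list
subset = df[["unit price", "total"]]
```

In each case the brackets behave the same way, so a single habit covers every situation.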

Dunder Data Challenge #3 - Naive Solution

dunder data challenges Sep 12, 2019

To view the problem setup, go to the Dunder Data Challenge #3 post. This post will contain the solution.

Master Data Analysis with Python

Master Data Analysis with Python is an extremely comprehensive course that will help you learn pandas to do data analysis.

I believe that it is the best resource available for learning how to do data analysis with pandas, and I provide a 30-day, 100% money-back guarantee if you are not satisfied.

Solution

I will first present a naive solution that...

Dunder Data Challenge #3 - Multiple Custom Grouping Aggregations

dunder data challenges Sep 09, 2019

Welcome to the third edition of the Dunder Data Challenge series designed to help you learn Python, data science, and machine learning. Begin working on any of the challenges directly in a Jupyter Notebook courtesy of Binder (mybinder.org).

This challenge is going to be fairly difficult, but should answer a question that many pandas users face — What is the best way to perform a groupby that does many custom aggregations? In this context, a ‘custom...

Dunder Data Challenge #2 - Explain the 1,000x Speed Difference when taking the Mean

dunder data challenges Sep 08, 2019

Welcome to the second edition of the Dunder Data Challenge series designed to help you learn Python, data science, and machine learning. Begin working on any of the challenges directly in a Jupyter Notebook courtesy of Binder (mybinder.org).

In this challenge, your goal is to explain why taking the mean of the following DataFrame is more than 1,000x faster when setting the parameter numeric_only to True.
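For readers unfamiliar with the parameter, the call in question looks like the sketch below. The DataFrame here is a hypothetical stand-in, not the one from the challenge, so it only shows the syntax. It does not reproduce or explain the speed difference.

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric DataFrame; the challenge uses its own data
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 4)), columns=list("abcd"))

# numeric_only=True restricts the calculation to numeric columns
means = df.mean(numeric_only=True)
print(means)
```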

Learn Data Science with Python

I have...

Dunder Data Challenge #1 - Optimize Custom Grouping Function

dunder data challenges Sep 07, 2019

This is the first edition of the Dunder Data Challenge series designed to help you learn Python, data science, and machine learning. Begin working on any of the challenges directly in a Jupyter Notebook thanks to Binder (mybinder.org).

In this challenge, your goal is to find the fastest solution while only using the Pandas library.

Begin Mastering Data Science Now for Free!

Take my free Intro to Pandas course to begin your journey mastering data analysis with Python.

...

From pandas to scikit-learn - An Exciting New Workflow

machine learning Sep 05, 2019

Scikit-Learn’s new integration with Pandas

Scikit-Learn will make one of its biggest upgrades in recent years with its mammoth version 0.20 release. For many data scientists, a typical workflow consists of using Pandas to do exploratory data analysis before moving to scikit-learn for machine learning. This new release will make the process simpler, more feature-rich, robust, and standardized.

Become an Expert

If you want to be trusted to make decisions using...

The Five-Step Process for Data Exploration in a Jupyter Notebook

pandas Aug 07, 2019

Video available

I also have a video from the Dunder Data YouTube channel where I demonstrate this entire process. I believe this is a post that is better viewed as a demonstration, so if you have the time see the video below.

Tutorial

A major pain point for beginners is writing too many lines of code in a single cell. When you are learning, you need to get feedback on every single line of code that you write and verify that it is in fact correct. Only once you have verified the...

Pandas Cookbook — Develop Powerful Routines for Exploring Real-World Datasets

pandas Jul 18, 2019

In this article, I will discuss the overall approach I took to writing Pandas Cookbook along with highlights of each chapter.

New Book — Master Data Analysis with Python

I have a new book titled Master Data Analysis with Python that is far superior to Pandas Cookbook. It contains over 300 exercises and projects to reinforce all the material and will receive continuous updates. If you are interested in Pandas Cookbook, I would strongly suggest purchasing Master...

Python for Data Analysis — A Critical Line-by-Line Review

book review pandas python Jul 09, 2019

In this post, I will offer my review of the book Python for Data Analysis (2nd edition) by Wes McKinney. My name is Ted Petrou, and I am a pandas expert and author of the recently released Pandas Cookbook. I thoroughly read through PDA and created a very long review that is available on GitHub. This post provides some of the highlights from that full review.

What is a critical line-by-line review?

I read this book as if I were the only technical reviewer and I was...
