Official Release of bar_chart_race - A Python Package for Creating Animated Bar Chart Races

 

I'm excited to announce the official release of bar_chart_race, a Python package for creating bar chart races. In this post, I'll cover many of the major available options. Navigate to the official documentation for a full breakdown of all of the options.

Motivation

Bar chart races have become very popular over the last year, yet no Python package existed to create them. I also built some for my coronavirus dashboard.

Installation

Install with:

pip install...
Continue Reading...

Minimally Sufficient Pandas Cheat Sheet

pandas Jan 03, 2020

This article summarizes the very detailed guide presented in Minimally Sufficient Pandas.

Begin Mastering Data Science Now for Free!

Take my free Intro to Pandas course to begin your journey mastering data analysis with Python.

What is Minimally Sufficient Pandas?

  • It is a small subset of the library that is sufficient to accomplish nearly everything that it has to offer.
  • It allows you to focus on doing data analysis rather than on syntax.

How will Minimally Sufficient Pandas benefit...

Continue Reading...

Minimally Sufficient Pandas

pandas Jan 01, 2020

In this article, I will offer an opinionated perspective on how to best use the Pandas library for data analysis. My objective is to argue that only a small subset of the library is sufficient to complete nearly all of the data analysis tasks that one will encounter. This minimally sufficient subset of the library will benefit both beginners and professionals using Pandas. Not everyone will agree with the suggestions I put forward, but they are how...

Continue Reading...

The Craziness of Subset Selection in Pandas

pandas Nov 21, 2019

Selecting subsets of data in pandas is not a trivial task, as there are numerous ways to do the same thing. Different pandas users select data in different ways, so these options can be overwhelming. I wrote a long four-part series on it to clarify how it's done. For instance, take a look at the following options for selecting a single column of data (assuming it's the first column):

  • df['colname']
  • df[['colname']]
  • df.colname
  • df.loc[:,...
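
As a rough illustration of the first three options listed above, the sketch below builds a small hypothetical DataFrame (the data and the second column name are assumptions, not taken from the article) and shows what each form returns. The truncated .loc option is left out because its full form is not shown in this excerpt.

```python
import pandas as pd

# Hypothetical data; 'colname' is the first column, as the excerpt assumes
df = pd.DataFrame({"colname": [1, 2, 3], "other": [4, 5, 6]})

as_series = df["colname"]    # brackets with a string -> Series
as_frame = df[["colname"]]   # brackets with a list -> one-column DataFrame
via_dot = df.colname         # dot notation -> Series

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_series.equals(via_dot))
```

The single brackets and the dot notation return the same Series here, while the double brackets return a DataFrame with one column.
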
Continue Reading...

Use the brackets to select a single pandas DataFrame column and not dot notation

pandas Sep 13, 2019
 

pandas offers its users two ways to select a single column of data: brackets or dot notation. In this article, I suggest using the brackets and not dot notation for the following ten reasons.

  1. Select column names with spaces
  2. Select column names that have the same name as methods
  3. Select columns with variables
  4. Select non-string columns
  5. Set new columns
  6. Select multiple columns
  7. Dot notation is a strict subset of the brackets
  8. Use one way which works for all situations
  9. ...
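
A minimal sketch of the first six reasons above, using a hypothetical DataFrame whose column names are chosen only to trigger each case (they are assumptions, not examples from the article):

```python
import pandas as pd

# Hypothetical columns: one with a space, one that shares a name with a
# DataFrame method, and one with a non-string (integer) name
df = pd.DataFrame({"unit price": [2.5, 4.0], "count": [10, 20], 2020: [1, 2]})

spaced = df["unit price"]             # 1. column name with a space
counts = df["count"]                  # 2. df.count is the DataFrame.count method
col = "unit price"
by_variable = df[col]                 # 3. column name held in a variable
year = df[2020]                       # 4. non-string column name
df["total"] = df["unit price"] * df["count"]   # 5. set a new column
pair = df[["unit price", "count"]]    # 6. select multiple columns

print(callable(df.count))   # the attribute resolves to the method, not the column
print(df["total"].tolist())
```

None of these selections can be written with dot notation alone, which is the practical argument for reaching for the brackets by default.
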
Continue Reading...

The Five-Step Process for Data Exploration in a Jupyter Notebook

pandas Aug 07, 2019

Video available

I also have a video from the Dunder Data YouTube channel where I demonstrate this entire process. I believe this post is better viewed as a demonstration, so if you have the time, see the video below.

Tutorial

A major pain point for beginners is writing too many lines of code in a single cell. When you are learning, you need to get feedback on every single line of code that you write and verify that it is in fact correct. Only once you have verified the...

Continue Reading...

Pandas Cookbook — Develop Powerful Routines for Exploring Real-World Datasets

pandas Jul 18, 2019

In this article, I will discuss the overall approach I took to writing Pandas Cookbook along with highlights of each chapter.

I have a new book titled Master Data Analysis with Python that is far superior to Pandas Cookbook. It contains over 300 exercises and projects to reinforce all the material and will receive continuous updates indefinitely. If you are interested in Pandas Cookbook, I would strongly suggest purchasing Master...

Continue Reading...

Python for Data Analysis — A Critical Line-by-Line Review

book review pandas python Jul 09, 2019

In this post, I will offer my review of the book Python for Data Analysis (2nd edition) by Wes McKinney. My name is Ted Petrou and I am an expert at pandas and author of the recently released Pandas Cookbook. I thoroughly read through PDA and created a very long review that is available on GitHub. This post provides some of the highlights from that full review.

What is a critical line-by-line review?

I read this book as if I were the only technical reviewer and I was...

Continue Reading...

Selecting Subsets of Data in Pandas: Part 4

pandas Jun 30, 2019

This is the fourth and final part of the series “Selecting Subsets of Data in Pandas”. Pandas offers a wide variety of options for subset selection, which necessitates multiple articles. This series is broken down into the following topics.

  1. Selection with [], .loc and .iloc
  2. Boolean indexing
  3. Assigning subsets of data
  4. How NOT to select subsets of data
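
For orientation, here is a minimal sketch of the first three topics above. The DataFrame, its labels, and its values are hypothetical and do not reproduce the series' own examples.

```python
import pandas as pd

# Hypothetical data with string row labels
df = pd.DataFrame(
    {"age": [25, 32, 47], "city": ["Austin", "Boston", "Chicago"]},
    index=["ann", "bob", "cat"],
)

ages = df["age"]                    # [] selects a column
bob_city = df.loc["bob", "city"]    # .loc selects by row and column label
first_age = df.iloc[0, 0]           # .iloc selects by integer position
over_30 = df[df["age"] > 30]        # boolean indexing keeps rows where the mask is True

# Assigning to a subset: one .loc call with a row mask and a column label
df.loc[df["age"] > 30, "city"] = "Denver"

print(bob_city, first_age)
print(df["city"].tolist())
```

Doing the assignment through a single .loc call, rather than chaining two selections, means the change is written to df itself.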

Become an Expert

If you want to be trusted to make decisions using pandas and scikit-learn, you must become an...

Continue Reading...