Jul 18, 2019

In this article, I will discuss the overall approach I took to writing Pandas Cookbook along with highlights of each chapter.

I have a new book titled Master Data Analysis with Python that is far superior to Pandas Cookbook. It contains over 300 exercises and projects to reinforce all the material and will receive continuous updates through 2020. If you are interested in Pandas Cookbook, I would strongly suggest purchasing Master Data Analysis with Python instead.

If you want to learn Python, data analysis, and machine learning, then the All Access Pass will give you access to all my current and future material for one low price.

I had three main guiding principles when writing the book:

- Use of real-world datasets
- Focus on doing data analysis
- Writing modern, idiomatic pandas

First, I wanted you, the reader, to explore real-world datasets and not randomly...

Jul 09, 2019

In this post, I will offer my review of the book Python for Data Analysis (2nd edition) by Wes McKinney. My name is Ted Petrou and I am an expert at pandas and author of the recently released Pandas Cookbook. I thoroughly read through PDA and created a very long review that is available on GitHub. This post provides some of the highlights from that full review.

I read this book as if I were the only technical reviewer and was counted on to find all the possible errors. Every single line of code was scrutinized and explored to see if a better solution existed. Having spent nearly every day of the last 18 months writing and talking about pandas, I have formed strong opinions about how it should be used. This critical examination led to me finding fault with quite a large percentage of the code.

The main focus of PDA is on the pandas library but it does have material on basic Python, IPython...

Jul 01, 2019

In this tutorial, I will describe a process for setting up a lean and robust Python data science environment on your system. By the end of the tutorial, your system will be set up such that:

- Python is installed with only the most common and useful packages for data science
- Conda is installed to manage packages and environments
- You’ll have a single, robust environment which minimizes dependency issues by relying on the conda-forge channel

I am extraordinarily dedicated to producing the absolute best content for doing data science using Python. For all my courses and live training visit Dunder Data.

- Get Master Data Analysis with Python to form an extremely strong foundation for analyzing data with Python.
- Sign up for the **FREE** Intro to Pandas class
- Take a live in-person bootcamp (next course in Toronto, Aug 26–30)
- Follow me on Twitter @TedPetrou for my daily data science tricks
- Take 10% off all products with...

Jun 30, 2019

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

This is part 3 of a 4-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection which necessitates multiple articles. This series is broken down into the following topics.

- Selection with `[]`, `.loc` and `.iloc`
- Boolean indexing
- Assigning subsets of data
- How NOT to select subsets of data

When you see the word **assign** used during a discussion on programming, it usually means that a variable is set equal to some value. For most programming languages, this means using the equal sign. For instance, to assign the value 5 to the variable **x** in Python, we do the following:

`>>> x = 5`

This is formally called an assignment statement. More generally, we can...

Jun 30, 2019

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

This is part 2 of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection which necessitates multiple articles. This series is broken down into the following 4 topics.

- Selection with `[]`, `.loc` and `.iloc`
- Boolean indexing
- Assigning subsets of data
- How NOT to select subsets of data

Part 1 of this series covered subset selection with `[]`, `.loc` and `.iloc`. All three of these **indexers** use either the row/column labels or their integer location to make selections. The actual **data** of the Series/DataFrame is not used at all during the selection.
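As a quick refresher, the three indexers can be sketched as follows. This is only a minimal illustration: the DataFrame, its index labels, and its values are made up for this example and do not come from the series' notebooks.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
df = pd.DataFrame(
    {"state": ["NY", "TX", "CA"], "age": [30, 45, 27]},
    index=["Jane", "Niko", "Aaron"],
)

# Just the brackets: select a single column by its label (returns a Series)
ages = df["age"]

# .loc: select by row label and column label
niko_age = df.loc["Niko", "age"]

# .iloc: select by integer location (second row, second column)
same_value = df.iloc[1, 1]

print(ages.tolist())
print(niko_age, same_value)
```

Notice that each selection is driven only by a label or a position. None of them looks at the values stored in the DataFrame to decide what to return.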

In Part 2 of this series, on **boolean indexing**, we will select...

Jun 05, 2019

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

`[ ]`, `.loc` and `.iloc`

This is the beginning of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection which necessitates multiple articles. This series is broken down into the following four topics.

- Selection with `[]`, `.loc` and `.iloc`
- Boolean indexing
- Assigning subsets of data
- How NOT to select subsets of data

If you’d like to learn more and support my work:

- Get the book Master Data Analysis with Python (300+ Exercises)
- Sign up for the **FREE** Intro to Pandas class
- Follow me on Twitter @TedPetrou

This series of articles assumes you have no knowledge of pandas, but that you understand the fundamentals...
