Selecting Subsets of Data in Pandas: Part 3

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

Part 3: Assigning subsets of data

This is part 3 of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection, which necessitates multiple articles. The series is broken down into the following four topics.

  1. Selection with [], .loc and .iloc
  2. Boolean indexing
  3. Assigning subsets of data
  4. How NOT to select subsets of data

Assignment

When you see the word assign used during a discussion on programming, it usually means that a variable is set equal to some value. For most programming languages, this means using the equal sign. For instance, to assign the value 5 to the variable x in Python, we do the following:

>>> x = 5

This is formally called an assignment statement. More generally, we can...

Selecting Subsets of Data in Pandas: Part 2

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

Part 2: Boolean Indexing

This is part 2 of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection, which necessitates multiple articles. The series is broken down into the following four topics.

  1. Selection with [], .loc and .iloc
  2. Boolean indexing
  3. Assigning subsets of data
  4. How NOT to select subsets of data

Part 1 vs Part 2 subset selection

Part 1 of this series covered subset selection with [], .loc and .iloc. All three of these indexers use either the row/column labels or their integer location to make selections. The actual data of the Series/DataFrame is not used at all during the selection.
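
As a quick reminder of what label-based and integer-location selection look like, here is a minimal sketch. The DataFrame, its column names, and its index labels are made up for illustration and do not come from the original article.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
df = pd.DataFrame(
    {"age": [30, 25, 41], "city": ["Austin", "Boston", "Denver"]},
    index=["ann", "bob", "cat"],
)

col = df["age"]                # [] selects a column by its label
row_by_label = df.loc["bob"]   # .loc selects a row by its index label
row_by_position = df.iloc[1]   # .iloc selects a row by its integer location

# Both indexers pick out the same row; the values stored in the
# DataFrame played no part in deciding what was selected.
same = row_by_label.equals(row_by_position)
```

In each case the selection is driven only by labels or positions, which is the point the paragraph above makes.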

In Part 2 of this series, on boolean indexing, we will select...

Selecting Subsets of Data in Pandas: Part 1

This article is available as a Jupyter Notebook complete with exercises at the bottom to practice and detailed solutions in another notebook.

Part 1: Selection with [], .loc and .iloc

This is the beginning of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection, which necessitates multiple articles. The series is broken down into the following four topics.

  1. Selection with [], .loc and .iloc
  2. Boolean indexing
  3. Assigning subsets of data
  4. How NOT to select subsets of data

Assumptions before we begin

This series of articles assumes you have no knowledge of pandas, but that you understand the fundamentals...
