Build a Data Analysis Library from Scratch in Python

Immerse yourself in a long, comprehensive project where you build an entire data analysis library on your own from scratch.


What you'll learn

Build an entire library

You'll produce the code to build a fully-functioning Python library all from scratch.

Advanced Python

You'll learn advanced Python topics such as how to implement special methods and decorators.

Test-Driven Development

You'll learn the importance of test-driven development to build robust software by having to pass 100 tests in order to complete the library.

Course Description


Build a Data Analysis Library from Scratch in Python is for those who want to immerse themselves in a single long, comprehensive project that covers several advanced Python concepts. By the end of the project, you'll have built a fully functioning Python library that can complete many common data analysis tasks. The library is called Pandas Cub and has functionality similar to the popular pandas library.

This course focuses on developing software within the massive ecosystem of tools available in Python. There are 40 detailed steps that you must complete to finish the project. In each step, you write code that adds functionality to the library. To complete a step, your code must pass the unit tests that have already been written for it. Once you pass all the unit tests, the project is complete. The nearly 100 unit tests give you immediate feedback on whether your code completes each step correctly.
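The test-first workflow described above can be sketched as follows. This is only an illustration, not code from the course: the function `column_mean`, its behavior, and the use of the standard-library `unittest` module are all assumptions, since the page does not say which test framework the project uses.

```python
import unittest


def column_mean(values):
    """Hypothetical step: average a column, skipping missing (None) values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no values to average")
    return sum(present) / len(present)


class TestColumnMean(unittest.TestCase):
    """Tests like these would exist before the step's code is written."""

    def test_mean_skips_missing(self):
        self.assertEqual(column_mean([1, None, 3]), 2.0)

    def test_empty_column_raises(self):
        with self.assertRaises(ValueError):
            column_mean([None])


# Run the tests programmatically; the step is done once they all pass.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestColumnMean)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In this style the tests define what "done" means for a step, so a failing run points directly at the behavior that still needs work.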

There are many important concepts that you will learn while building Pandas Cub.

  • Creating a development environment with conda

  • Using test-driven development to ensure code quality

  • Using the Python data model to allow your objects to work seamlessly with built-in Python functions and operators

  • Building a DataFrame class with the following functionality:

    • Select subsets of data with the brackets operator

    • Aggregation methods such as sum, min, max, mean, and median

    • Non-aggregation methods such as isna, unique, rename, drop

    • Group by one or two columns to create pivot tables

    • Specific methods for handling string columns

    • Read in data from a comma-separated value file

    • A nicely formatted display of the DataFrame in the notebook
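To give a feel for how the Python data model ties into a DataFrame-like class, here is a minimal sketch. The class `MiniFrame` and its methods are hypothetical and written for illustration only; this is not the Pandas Cub API, and it stores columns as plain Python lists to stay dependency-free.

```python
class MiniFrame:
    """Toy column store; a hypothetical stand-in, not the Pandas Cub API."""

    def __init__(self, data):
        # data maps column names to equal-length sequences of values
        lengths = {len(values) for values in data.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self._data = {name: list(values) for name, values in data.items()}

    def __len__(self):
        # Special method: lets the built-in len() report the row count
        return len(next(iter(self._data.values()), []))

    def __getitem__(self, key):
        # Special method: frame["a"] or frame[["a", "b"]] selects columns
        names = [key] if isinstance(key, str) else list(key)
        return MiniFrame({name: self._data[name] for name in names})

    def sum(self):
        # Aggregation: reduce every column to a single-row frame
        return MiniFrame({name: [sum(values)] for name, values in self._data.items()})

    def to_dict(self):
        # Return a copy of the underlying columns for inspection
        return {name: list(values) for name, values in self._data.items()}


frame = MiniFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
subset = frame[["a"]]   # brackets operator via __getitem__
totals = frame.sum()    # one aggregated value per column
```

Implementing `__len__` and `__getitem__` is what lets the object respond to `len(frame)` and `frame[...]` like a built-in container, which is the idea behind the data model bullet above.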

In my experience, many people learn just enough of a programming language like Python to complete basic tasks, but do not develop the skills to complete larger projects or build entire libraries. This course is intended for students who are looking for a challenging and exciting project that will take serious effort and a long time to complete.

This course is taught by expert instructor Ted Petrou, author of Pandas Cookbook, Master Data Analysis with Python, and Exercise Python.

Who this course is for:
  • Students who understand the fundamentals of Python and are looking for a longer, more comprehensive project covering advanced topics that they can immerse themselves in.
Start building the data analysis library now!

What your purchase includes

  • 40 detailed steps, each of which is completed by passing one or more of the 100 unit tests
  • 7.5 hours of video detailing exactly how to complete each step
  • A fully-functioning data analysis library similar to pandas upon completion


Requirements

  • Students must know the fundamentals of Python. This is an intermediate/advanced course.
  • Must feel comfortable using and iterating through lists, tuples, sets, and dictionaries
  • Exposure to numpy and pandas is helpful

"I'm going to say it again: it's a fantastic course for someone like me who wants to go further and work on an intermediate coding/data project in Python! I'm not going to say that I could just write a data analysis library from scratch right now, but I definitely have a much better understanding of Python, NumPy, pandas, data analysis, and the programming itself. Ted Petrou is really a great instructor, he clearly explains the mechanisms behind the library and the way it's all done is very clean, concise and readable. In fact, the code readability is one of the things you can learn here, as well as the test-driven approach, there's a strong focus on that. So that was my course, it was definitely a lot of fun and a lot of new things I didn't know before. I will definitely go for Ted's next course, "Master Data Analysis with Python - Intro to Pandas", and I'm looking forward to finishing reading "Pandas cookbook". Once again, I really recommend this course and I'm very proud and happy to have finished it."

Michal Lasak
Build a Data Analysis Library from Scratch in Python student
