The most comprehensive course available to learn data analysis and visualization in Python. Get access to over 600 pages of text, 300 exercises, 13 hours of video, multiple projects, and detailed solutions.
Many other courses use poor practices to teach the data science libraries such as pandas, matplotlib, and seaborn. With Master Data Analysis with Python, you will be given the absolute best practices to use the libraries to help you rapidly transform into an expert.
Instructor Ted Petrou has taught more than 1,000 hours of live classes using this course. Each time, he uses his experience to improve explanations and clarify results. Ted Petrou has become one of the foremost authorities on how to best use the Python data science libraries.
Reading texts or listening to lectures gives the false impression of learning. To demonstrate learning, you must be able to complete tasks on your own. Master Data Analysis with Python comes with 300+ exercises and detailed solutions along with several longer and more complex projects.
The Python data science libraries are in a state of flux, with new features added and other parts deprecated. This course is continuously updated to reflect the newest changes to the libraries. You will have lifetime access to all the updates.
The course is divided into the following 11 parts. Each part can be considered its own mini-course with exercises and projects.
In Python, pandas is a popular and powerful library to explore, analyze, and visualize data. You will be introduced to the DataFrame and the Series, the two main containers of data within pandas. You will learn the components of these objects and a few basic operations.
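As a rough illustration of the two containers and their components, here is a minimal sketch. The data and variable names are invented for this example and are not taken from the course.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [31, 25, 40]},
    index=["a", "b", "c"],
)

# The three components of a DataFrame
index = df.index        # row labels
columns = df.columns    # column labels
values = df.to_numpy()  # the underlying data as a NumPy array

# Selecting a single column returns a Series,
# which has an index and values but no columns
ages = df["age"]
print(type(ages).__name__)  # Series
print(ages.mean())          # 32.0
```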
Intro to pandas is available to take for free and is bundled together with the next part, Selecting Subsets of Data.
One of the most common tasks during a data analysis is to select some subset of the data. In pandas, you can select data by row/column label or integer location as well as with conditional logic applied to the values. Although this is a rather simple task, pandas offers multiple ways to complete it, which causes confusion to the novice user.
In this part, you will be given clear instruction on the best practices for subset selection. You will also learn which methods of subset selection you should avoid.
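The three styles of selection mentioned above (by label, by integer location, and by condition) might look like the following sketch. It assumes `.loc` and `.iloc` as the explicit indexers; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame(
    {"fruit": ["apple", "kiwi", "mango"], "price": [1.5, 0.5, 2.0]},
    index=["r1", "r2", "r3"],
)

# Selection by row/column label
by_label = df.loc["r2", "fruit"]

# Selection by integer location (first row, second column)
by_position = df.iloc[0, 1]

# Selection with conditional logic applied to the values
expensive = df.loc[df["price"] > 1.0, "fruit"]

print(by_label)         # kiwi
print(by_position)      # 1.5
print(list(expensive))  # ['apple', 'mango']
```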
In this part, you'll begin performing calculations on your data. You'll begin by learning how to operate on a single column of data, a pandas Series. You'll learn the difference between methods that aggregate (return a single value) and those that do not. You'll learn how to access string-only and datetime-only operations to process Series with those specific data types.
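To make the distinction concrete, here is a small sketch contrasting an aggregating method with a non-aggregating one, followed by the `.str` and `.dt` accessors. The values are made up for illustration.

```python
import pandas as pd

prices = pd.Series([10.0, 12.5, 9.5])

# Aggregating method: returns a single value
total = prices.sum()

# Non-aggregating method: returns a Series of the same length
running = prices.cumsum()

# String-only operations live under the .str accessor
names = pd.Series(["ada lovelace", "alan turing"])
upper = names.str.upper()

# Datetime-only operations live under the .dt accessor
dates = pd.Series(pd.to_datetime(["2021-01-15", "2021-07-04"]))
months = dates.dt.month

print(total)          # 32.0
print(list(running))  # [10.0, 22.5, 32.0]
print(list(months))   # [1, 7]
```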
After learning how to process a single column of data, you'll learn how to process multiple columns at the same time by calling methods on a DataFrame. You'll learn how to change the direction of the operations from vertical to horizontal.
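Changing the direction of an operation is typically done with the `axis` parameter. A minimal sketch with invented numbers:

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"q1": [1, 2], "q2": [3, 4]}, index=["x", "y"])

# Default direction is vertical: one result per column
col_totals = df.sum()

# axis="columns" switches to horizontal: one result per row
row_totals = df.sum(axis="columns")

print(list(col_totals))  # [3, 7]
print(list(row_totals))  # [4, 6]
```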
There are a huge number of data types that are available for your DataFrames. In this part, you'll get a comprehensive tour of the exact definitions of each data type and how to convert to and from each one.
You'll also learn about the categorical data type, which is unique to pandas, and has the ability to save a tremendous amount of memory.
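The memory savings are easiest to see on a column with few distinct values repeated many times. The sketch below uses made-up data and compares memory usage before and after converting with `astype`; exact byte counts depend on your pandas version, so none are shown.

```python
import pandas as pd

# A column of strings with only three distinct values (hypothetical data)
colors = pd.Series(["red", "green", "blue"] * 1000)

# Convert to the categorical data type
as_cat = colors.astype("category")

# deep=True counts the memory used by the strings themselves
mem_before = colors.memory_usage(deep=True)
mem_after = as_cat.memory_usage(deep=True)
print(mem_after < mem_before)  # True

# astype also converts between other data types, e.g. strings to integers
nums = pd.Series(["1", "2", "3"]).astype("int64")
print(nums.dtype)  # int64
```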
Up to this point in the course, all operations were applied to the entire dataset. You will learn how to apply operations to independent groups within your data instead of the whole.
You will also learn how to display the results of grouping in a more human-readable way with pivot tables.
Grouping data can be tricky in pandas and has potential to be one of the slowest performing operations. You will learn best practices on how to optimize performance along with the newest syntax available.
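As a sketch of what grouping and pivot tables look like in practice, the example below uses named aggregation in `agg` (assumed here to be the "newest syntax" referred to above; the course may cover other forms) and `pivot_table`. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "points": [10, 20, 30, 50],
})

# Apply operations to independent groups, naming each output column
summary = df.groupby("team").agg(
    total=("points", "sum"),
    best=("points", "max"),
)

# The same grouping idea, displayed as a pivot table:
# one row per team, one column per year
wide = df.pivot_table(index="team", columns="year",
                      values="points", aggfunc="sum")

print(summary.loc["A", "total"])  # 30
print(wide.loc["B", 2021])        # 50
```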
A time series is a sequence of data observed over a period of time. Each observation is labeled by a specific moment in time, and the entire set of observed data is ordered by its time component. You will learn how to sample time series data at evenly spaced intervals, operate over a rolling window of time, and group by any time period you desire.
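The three operations named above can be sketched on a tiny, made-up daily series as follows:

```python
import pandas as pd

# Six daily observations (hypothetical data)
idx = pd.date_range("2021-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Sample at evenly spaced intervals: totals for each 2-day bin
two_day = ts.resample("2D").sum()

# Operate over a rolling window of 3 observations
rolling = ts.rolling(window=3).mean()

# Group by a time period, here the calendar month
by_month = ts.groupby(ts.index.month).sum()

print(list(two_day))     # [3, 7, 11]
print(rolling.iloc[2])   # 2.0
print(by_month.loc[1])   # 21
```

The first two values of `rolling` are missing because a full window of three observations is not yet available.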
Regular expressions are a miniature programming language on their own that help you match patterns within text. They can be extremely useful when combined with the pandas string-only methods to manipulate and analyze strings in almost any way.
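For a sense of how regular expressions combine with the string-only methods, here is a brief sketch using `str.contains`, `str.extract`, and `str.replace` on invented text:

```python
import pandas as pd

# Hypothetical data, invented for illustration
s = pd.Series(["order 123 shipped", "order 45 pending", "no number here"])

# Does each string contain one or more digits?
has_digits = s.str.contains(r"\d+")

# Pull out the first run of digits; strings without a match become missing
numbers = s.str.extract(r"(\d+)", expand=False)

# Replace every run of digits with a placeholder
masked = s.str.replace(r"\d+", "#", regex=True)

print(list(has_digits))  # [True, True, False]
print(numbers.iloc[0])   # 123
print(masked.iloc[1])    # order # pending
```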
Tidy data is a structure of data that makes analysis easier. Often, it is necessary to rearrange, transform, and extract data so that it conforms to tidy data principles. You will learn how to tidy a variety of 'messy' data sets with the tools given to you by pandas.
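One common kind of messy data stores a variable's values as column names. The sketch below uses `melt` to tidy a small invented table of that kind; it is only one of several reshaping tools pandas provides.

```python
import pandas as pd

# Messy: the years are spread across column names (hypothetical data)
messy = pd.DataFrame({
    "country": ["X", "Y"],
    "2019": [1, 3],
    "2020": [2, 4],
})

# Tidy: one row per observation, one column per variable
tidy = messy.melt(id_vars="country", var_name="year", value_name="value")

print(tidy.shape)           # (4, 3)
print(list(tidy["value"]))  # [1, 3, 2, 4]
```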
In this part, you will learn how to work with multiple data sets together. You will learn how pandas automatically aligns on the index when combining datasets, a behavior that often causes problems for novices. You will also learn how to make SQL-like joins by interacting with a relational database.
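Here is a sketch of why index alignment surprises people, followed by a SQL-like inner join. The join uses the in-memory `merge` method rather than an actual relational database, and all data is invented.

```python
import pandas as pd

# Two Series whose labels only partly overlap (hypothetical data)
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "w"])

# Addition aligns on index labels, not on position.
# Labels present in only one Series produce missing values.
total = a + b
print(total["y"])          # 22.0
print(total.isna().sum())  # 2

# A SQL-like inner join on a shared key column
left = pd.DataFrame({"id": [1, 2, 3], "name": ["p", "q", "r"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [0.5, 0.7, 0.9]})
inner = left.merge(right, on="id", how="inner")
print(list(inner["id"]))   # [2, 3]
```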
A good visualization can make for easier understanding and decision making. In this part, you will learn a straightforward approach to using the powerful, yet confusing library matplotlib. You will then learn how to plot data using pandas, before simplifying the process with the seaborn library.
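As one possible starting point, the sketch below draws the same invented data twice: once with matplotlib directly and once through the pandas `plot` method, which draws onto a matplotlib Axes. The `Agg` backend is an assumption made so that the example runs without a display; seaborn is left out to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window opens
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, 4, 9, 16]})

# matplotlib directly: create a Figure and an Axes, then call Axes methods
fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(df["x"], df["y"], marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Plotted with matplotlib")

# pandas plotting: the DataFrame draws itself onto an existing Axes
fig2, ax2 = plt.subplots(figsize=(4, 3))
df.plot(x="x", y="y", ax=ax2, title="Plotted with pandas")
```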
Before you start analyzing data, you will learn how to set up a robust environment on your system to do data science. You will install the Miniconda distribution along with all the data science libraries used throughout the course.
You will also learn how to best use Jupyter Notebooks, our main tool for exploring data.
This course is taught by Ted Petrou, an expert in Python, data exploration, and machine learning. Ted is the author of the highly rated text Pandas Cookbook. Ted has taught hundreds of students Python and data science in in-person classroom settings. He sees firsthand exactly where students struggle and continually upgrades his material to minimize these struggles by providing simple and direct paths forward.
Ted is one of the foremost authorities on using the pandas library to do data analysis. His blog posts have totaled well over 1 million views. He is also a prolific contributor on Stack Overflow, having answered over 400 questions.
Ted holds a master's degree in statistics from Rice University and is the author of Exercise Python and Master Machine Learning with Python.
Video lessons for parts 2, 3, and 4 of Master Data Analysis have been produced. There are a total of 13 hours of video that cover nearly every section in the text. The following video is one of several dozen from the part Essential pandas Commands covering string Series methods.
All exercises from the course are solved by Ted on video so that you can see exactly how he thinks through a problem. The following video shows exercise solutions from the second chapter of Essential pandas Commands.
You are purchasing a digital download along with access to all videos currently produced. The digital download includes the following:
"in my opinion what distinguishes you from everyone is your deep understanding of Python and Pandas. I follow lot of people on twitter, linkedin, Medium who share tips/tricks/codes on Python, Pandas, scikit-learn but no one comes close to you when writing efficient code and explaining the finer nuances"
Master Data Analysis assumes you already have a solid understanding of the fundamentals of Python. If you do not, you should master these fundamentals first. Exercise Python provides the necessary prerequisite knowledge.
This book assumes no knowledge of any of the Python data science libraries. Each part progresses slowly and thoroughly beginning with the basics. Advanced topics are covered towards the last chapters in each part.