This is a 90-minute hands-on, interactive tutorial with Q&A to follow.
Before you start analyzing data, you will learn how to set up a robust data science environment on your system. You will install the Miniconda distribution along with all the data science libraries used throughout the course.
You will also learn how to best use Jupyter Notebooks, our main tool for exploring data.
In Python, pandas is a popular and powerful library for exploring, analyzing, and visualizing data.
You will be introduced to the DataFrame and the Series, the two main containers of data within pandas. You will learn the components of these objects and a few basic operations.
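As a rough preview of what those components look like, here is a minimal sketch. The city data and the variable names are invented for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data, made up for this illustration.
df = pd.DataFrame(
    {
        "city": ["Austin", "Boston", "Chicago"],
        "population_m": [0.96, 0.65, 2.70],
    },
    index=["a", "b", "c"],
)

# The three components of a DataFrame: index, columns, and values.
index = df.index          # row labels
columns = df.columns      # column labels
values = df.to_numpy()    # the underlying data as a NumPy array

# Selecting a single column returns a Series, which has an index and values
# but only one dimension.
population = df["population_m"]

# A couple of basic operations.
shape = df.shape                      # (number of rows, number of columns)
mean_population = population.mean()   # average of the column
```

A Series keeps the same row labels as the DataFrame it came from, which is why operations between the two align on the index.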
One of the most common tasks in a data analysis is selecting a subset of the data. In pandas, you can select data by row/column label, by integer location, or with conditional logic applied to the values. Although the task itself is simple, pandas offers multiple ways to complete it, which often confuses novice users.
In this part, you will be given clear instruction on the best practices for subset selection. You will also learn which methods of subset selection to avoid.
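The three kinds of selection described above can be sketched as follows. The data and names are hypothetical, and the use of `.loc`, `.iloc`, and boolean masks is one common approach, not necessarily the exact set of practices the tutorial recommends.

```python
import pandas as pd

# Hypothetical sample data, made up for this illustration.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy", "Dee"],
        "age": [25, 35, 45, 28],
        "score": [88, 92, 79, 95],
    },
    index=["r1", "r2", "r3", "r4"],
)

# Selection by label: .loc takes row labels and column labels.
by_label = df.loc[["r1", "r3"], ["name", "score"]]

# Selection by integer location: .iloc takes positions, end-exclusive.
by_position = df.iloc[0:2, 0:2]

# Selection with conditional logic: a boolean Series filters the rows.
by_condition = df[df["age"] > 30]

# A condition and a column label can be combined in a single .loc call.
names_over_30 = df.loc[df["age"] > 30, "name"]
```

Note that a label slice in `.loc` includes its end label, while a position slice in `.iloc` excludes its end position, which is one source of the confusion mentioned above.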