A career in data science requires a fundamental understanding of statistics, specifically inferential statistics. Explore how different t-tests can be performed using the SciPy library to test hypotheses. This course also covers how to calculate the skewness and kurtosis of data using SciPy and how to compute regressions using scikit-learn.

Data Science Statistics: Applied Inferential Statistics

Course Overview

test a hypothesis about a sample by comparing it to the general population using the one-sample t-test available in the SciPy library

compare a sample with another independent sample using the independent t-test, and with a related sample using the paired t-test, both available in the SciPy library

apply independent t-tests to a real dataset to test the hypothesis that managers at a firm have higher salaries than non-managerial employees

work with Pandas and Matplotlib to analyze the stock price of Volkswagen in 2008, which was affected by some extreme events

compute the skewness and kurtosis of the returns on Volkswagen stock in 2008 and recognize how a few days of extreme behavior increased those numbers

perform pre-processing operations on a dataset containing close prices for stocks and indices to analyze it using linear regression

use the scikit-learn library to fit a linear regression model on the returns on a stock and the returns on the S&P 500 index

use two explanatory variables - the returns on the S&P 500 index and on an index tracking the strength of the US Dollar - to perform a regression on the returns on individual stocks

recall different types of t-tests and identify the values they return, calculate percentage returns from time series data using Pandas, and measure the skew and kurtosis values for a series
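
The three t-tests named in the objectives above can be sketched with SciPy as follows. This is a minimal illustration, not the course's own code: the data is synthetic, and the variable names (sample, other_sample, after, population_mean) are assumptions made for the example. Each call returns a result carrying a t-statistic and a p-value.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only (not the course's dataset).
rng = np.random.default_rng(seed=0)
population_mean = 50.0                                    # assumed known population mean
sample = rng.normal(loc=52.0, scale=5.0, size=40)         # the sample under test
other_sample = rng.normal(loc=50.0, scale=5.0, size=40)   # an unrelated, independent sample
after = sample + rng.normal(loc=1.0, scale=1.0, size=40)  # same subjects, measured again

# One-sample t-test: does the sample mean differ from the population mean?
one_sample = stats.ttest_1samp(sample, popmean=population_mean)

# Independent t-test: do two unrelated samples have different means?
# equal_var=False selects Welch's variant, which does not assume equal variances.
independent = stats.ttest_ind(sample, other_sample, equal_var=False)

# Paired t-test: did the same subjects change between two measurements?
paired = stats.ttest_rel(sample, after)

for name, result in [("one-sample", one_sample),
                     ("independent", independent),
                     ("paired", paired)]:
    print(f"{name}: t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

A small p-value suggests the observed difference in means would be unlikely if the null hypothesis of equal means were true.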
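
Percentage returns, skewness, and kurtosis can be computed along these lines. The closing prices below are hypothetical values chosen to include one extreme jump; the course itself works with Volkswagen's 2008 prices, which are not reproduced here.

```python
import pandas as pd
from scipy import stats

# Hypothetical closing prices with one extreme day (not real Volkswagen data).
close = pd.Series([100.0, 101.0, 99.5, 100.5, 180.0, 120.0, 119.0, 120.5])

# Percentage return from one close to the next; the first value is NaN and is dropped.
returns = close.pct_change().dropna() * 100

# Skewness measures asymmetry; kurtosis measures how heavy the tails are.
# scipy.stats.kurtosis uses the Fisher definition by default,
# so a normal distribution scores 0.
skewness = stats.skew(returns.to_numpy())
excess_kurtosis = stats.kurtosis(returns.to_numpy())

print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```

Removing the extreme days from such a series and recomputing is one way to see how strongly a few observations drive both measures.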
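
The regressions described in the objectives can be sketched with scikit-learn as below. The returns are simulated, and the column names (sp500, usd_index, stock) and the coefficients used to generate the data are assumptions for illustration, not results from the course.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Simulated daily returns (illustrative only; not real market data).
rng = np.random.default_rng(seed=1)
n = 250
sp500 = rng.normal(0.0, 1.0, n)
usd_index = rng.normal(0.0, 0.5, n)
stock = 1.2 * sp500 - 0.4 * usd_index + rng.normal(0.0, 0.3, n)
returns = pd.DataFrame({"sp500": sp500, "usd_index": usd_index, "stock": stock})

# Simple regression: stock returns explained by S&P 500 returns alone.
single = LinearRegression().fit(returns[["sp500"]], returns["stock"])

# Multiple regression: add the US Dollar index as a second explanatory variable.
features = returns[["sp500", "usd_index"]]
multi = LinearRegression().fit(features, returns["stock"])

print("single: coef =", single.coef_, "R^2 =", single.score(returns[["sp500"]], returns["stock"]))
print("multi:  coef =", multi.coef_, "R^2 =", multi.score(features, returns["stock"]))
```

The fitted coefficient on sp500 estimates how strongly the stock's returns move with the index, and the in-sample R^2 of the two-variable model is never lower than that of the one-variable model.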