Data Science Statistics: Applied Inferential Statistics

On the career path to Data Science, a fundamental understanding of statistics, specifically inferential statistics, is required. Explore how different t-tests can be performed using the SciPy library to test hypotheses. This course also covers how to calculate the skewness and kurtosis of data using SciPy and how to compute regressions using scikit-learn.
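As a rough illustration of the three t-tests named above, the sketch below runs each one with SciPy. The data are synthetic and the population mean of 50 is an assumption made up for the example; none of it comes from the course material.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic measurements, invented purely for illustration
sample_a = rng.normal(loc=52.0, scale=5.0, size=40)
sample_b = rng.normal(loc=50.0, scale=5.0, size=40)
# A second measurement on the same 40 subjects as sample_a
after = sample_a + rng.normal(loc=1.0, scale=2.0, size=40)

# One-sample t-test: sample mean vs. an assumed population mean of 50
one_sample = stats.ttest_1samp(sample_a, popmean=50.0)

# Independent t-test: two unrelated samples
independent = stats.ttest_ind(sample_a, sample_b)

# Paired t-test: two related measurements on the same subjects
paired = stats.ttest_rel(sample_a, after)

for name, res in [
    ("one-sample", one_sample),
    ("independent", independent),
    ("paired", paired),
]:
    print(f"{name}: t={res.statistic:.3f}, p={res.pvalue:.4f}")
```

Each call returns a test statistic and a p-value; a small p-value suggests the observed difference in means would be unlikely if the null hypothesis were true.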

1.3

• test a hypothesis about a sample by comparing it to the general population using the one-sample t-test available in the SciPy library
• compare a sample with another independent sample using the independent t-test, and with a related sample using the paired t-test, both available in the SciPy library
• apply independent t-tests on a real dataset to test a hypothesis that managers at a firm have higher salaries than non-managerial employees
• work with Pandas and Matplotlib to analyze Volkswagen's stock prices in 2008, which were affected by some extreme events
• compute the skewness and kurtosis of the returns on Volkswagen stock in 2008 and recognize how a few days of extreme behavior increased those values
• perform pre-processing operations on a dataset containing close prices for stocks and indices to analyze it using linear regression
• use the scikit-learn library to fit a linear regression model on the returns on a stock and the returns on the S&P 500 index
• use two explanatory variables - the returns on the S&P 500 index and on an index tracking the strength of the US Dollar - to perform a regression on the returns on individual stocks
• recall the different types of t-tests and identify the values they return, calculate percentage returns from time series data using Pandas, and measure the skewness and kurtosis of a series
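The later objectives (percentage returns, skewness and kurtosis, and a regression on two explanatory variables) could be sketched roughly as follows. All prices here are synthetic, and the column names `stock`, `sp500`, and `usd_index` are hypothetical stand-ins rather than the course's actual dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n_days = 250

# Synthetic daily returns; the 1.2 and -0.5 sensitivities are made up
market_ret = rng.normal(0.0, 0.01, n_days)
dollar_ret = rng.normal(0.0, 0.005, n_days)
stock_ret = 1.2 * market_ret - 0.5 * dollar_ret + rng.normal(0.0, 0.002, n_days)

# Turn the returns into close-price series, as a raw dataset would provide
prices = pd.DataFrame({
    "stock": 100 * np.cumprod(1 + stock_ret),
    "sp500": 100 * np.cumprod(1 + market_ret),
    "usd_index": 100 * np.cumprod(1 + dollar_ret),
})

# Pre-processing: percentage returns, dropping the first row (NaN)
returns = prices.pct_change().dropna()

# Shape of the return distribution
skewness = stats.skew(returns["stock"])
excess_kurtosis = stats.kurtosis(returns["stock"])  # Fisher definition: normal gives 0

# Regress stock returns on the two explanatory return series
X = returns[["sp500", "usd_index"]]
y = returns["stock"]
model = LinearRegression().fit(X, y)

print("skewness:", skewness, "excess kurtosis:", excess_kurtosis)
print("coefficients:", model.coef_, "intercept:", model.intercept_)
print("R^2:", model.score(X, y))
```

With real data, a handful of extreme daily moves can push skewness and kurtosis far from zero, which is the effect the Volkswagen example is meant to show.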
it_dssds2dj_02_enus

Intermediate