# Core Statistical Concepts: Statistics & Sampling with Python

Statistics    |    Beginner
• 11 videos | 1h 35m 56s
• Includes Assessment
Rating: 4.6 (21 users)
Data is one of the most valuable assets a business has, but it's only as valuable as the methods used to interpret it. Data science, which at its core includes statistics and sampling, is the key to data interpretation. In this course, practice using the pandas library in Python to work with statistics and sampling. Practice loading data from a CSV file into a pandas DataFrame. Compute a variety of statistics on data. While doing so, see how to visualize the relationship between data and computed statistics. Moving along, implement several sampling techniques, such as stratified sampling and cluster sampling. Then, explore how a balanced sample can be created from an imbalanced dataset using the imblearn module in Python. Upon completion, you'll be able to generate samples and compute statistics using various tools and methods.
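
As a rough illustration of the statistics mentioned above, here is a minimal sketch that computes the mean, median, and variance with hand-written functions and checks them against Python's standard `statistics` module. The data values and function names are hypothetical and do not come from the course. In pandas, the equivalent built-ins would be `Series.mean()`, `Series.median()`, `Series.var()`, and `Series.std()`.

```python
import math
import statistics


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (average of the two middle values for an even count)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def variance(values):
    """Sample variance (divides by n - 1), matching the pandas default of ddof=1."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

print("mean:    ", mean(data), "vs", statistics.mean(data))
print("median:  ", median(data), "vs", statistics.median(data))
print("variance:", variance(data), "vs", statistics.variance(data))
print("std dev: ", math.sqrt(variance(data)), "vs", statistics.stdev(data))
print("mode:    ", statistics.mode(data))
```

The standard deviation is simply the square root of the variance, so it is expressed in the same units as the data, which is why it is usually the easier of the two to interpret on a plot.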

## WHAT YOU WILL LEARN

• Discover the key concepts covered in this course
• Install the latest versions of pandas and visualization modules used to analyze data
• Load data from a CSV file into a pandas DataFrame and perform some initial analysis
• Calculate the mean and median of a distribution using your own function and compare it with the built-in pandas function
• Use seaborn and matplotlib to visualize a distribution and where the mean, median, and mode fit in
• Compute and visualize the standard deviation and variance of a distribution
• Implement simple random and stratified sampling on a DataFrame
• Use pandas to generate a sample using cluster and systematic sampling
• Create a balanced sample using random undersampling and oversampling
• Generate synthetic data in order to create a balanced sample using the synthetic minority over-sampling technique (SMOTE)
• Summarize the key concepts covered in this course
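
The four sampling techniques named in the objectives above can be sketched in pandas roughly as follows. This is not the course's own code: the DataFrame, its column names, and the sample sizes are hypothetical, and `groupby(...).sample(...)` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical dataset: 12 customers spread unevenly across 3 regions.
df = pd.DataFrame({
    "customer_id": range(1, 13),
    "region": ["north"] * 6 + ["south"] * 4 + ["west"] * 2,
})

# Simple random sampling: every row has the same chance of being selected.
simple = df.sample(n=4, random_state=0)

# Stratified sampling: draw the same fraction from each region,
# so the sample keeps the regional proportions of the full data.
stratified = df.groupby("region").sample(frac=0.5, random_state=0)

# Cluster sampling: pick whole groups at random and keep all of their rows.
chosen_regions = pd.Series(df["region"].unique()).sample(n=2, random_state=0)
cluster = df[df["region"].isin(chosen_regions)]

# Systematic sampling: take every k-th row. A fixed start of 0 keeps the
# example deterministic; in practice the start is usually drawn at random
# from the first k rows.
k = 3
systematic = df.iloc[0::k]

print(stratified["region"].value_counts())
print(systematic["customer_id"].tolist())
```

Stratified sampling samples within every group, whereas cluster sampling samples the groups themselves, so a cluster sample can omit some groups entirely.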

## IN THIS COURSE

• 4.  Computing the Mean and Median of a Distribution
• 5.  Visualizing Distributions with Seaborn & Matplotlib
• 6.  Computing Variance and Standard Deviation
• 7.  Generating Random and Stratified Samples
• 8.  Implementing Cluster and Systematic Sampling
• 9.  Implementing Undersampling and Oversampling
• 10.  Oversampling with SMOTE
• 11.  Course Summary
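
Topics 9 and 10 in the outline above deal with balancing an imbalanced dataset. The course uses the imblearn module for this. The sketch below instead uses plain pandas to show the idea behind random undersampling and oversampling, with a hypothetical DataFrame and column names.

```python
import pandas as pd

# Hypothetical imbalanced dataset: 8 rows of class 0 and 2 rows of class 1.
df = pd.DataFrame({
    "feature": range(10),
    "label": [0] * 8 + [1] * 2,
})

counts = df["label"].value_counts()
minority_size = int(counts.min())
majority_size = int(counts.max())

# Random undersampling: shrink every class to the size of the smallest class.
under = df.groupby("label").sample(n=minority_size, random_state=0)

# Random oversampling: keep all original rows and top up each smaller class
# with rows drawn with replacement until it matches the largest class.
parts = []
for _, group in df.groupby("label"):
    shortfall = majority_size - len(group)
    if shortfall > 0:
        extra = group.sample(n=shortfall, replace=True, random_state=0)
        group = pd.concat([group, extra])
    parts.append(group)
over = pd.concat(parts)

print(under["label"].value_counts())
print(over["label"].value_counts())
```

Random oversampling only duplicates existing minority rows. SMOTE differs in that it creates new synthetic minority points by interpolating between a minority sample and its nearest minority-class neighbors.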

## EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
