Core Statistical Concepts: Statistics & Sampling with Python
Statistics | Beginner
- 11 Videos | 1h 35m 56s
- Includes Assessment
- Earns a Badge
Data is one of the most valuable assets a business has, but it's only as valuable as the methods used to interpret it. Data science, which at its core includes statistics and sampling, is the key to data interpretation. In this course, practice using the pandas library in Python to work with statistics and sampling. Practice loading data from a CSV file into a pandas DataFrame. Compute a variety of statistics on data. While doing so, see how to visualize the relationship between data and computed statistics. Moving along, implement several sampling techniques, such as stratified sampling and cluster sampling. Then, explore how a balanced sample can be created from an imbalanced dataset using the imblearn module in Python. Upon completion, you'll be able to generate samples and compute statistics using various tools and methods.
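As a rough illustration of the first half of that description, the sketch below loads CSV data into a pandas DataFrame and compares hand-written mean and median functions with the pandas built-ins. The CSV content, the column name `height_cm`, and the helper names `manual_mean` and `manual_median` are hypothetical and are not taken from the course; an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
CSV_TEXT = """name,height_cm
A,150
B,160
C,160
D,170
E,210
"""


def manual_mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    values = list(values)
    return sum(values) / len(values)


def manual_median(values):
    """Middle value of the sorted data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# pd.read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))
heights = df["height_cm"]

stats = {
    "manual_mean": manual_mean(heights),
    "pandas_mean": heights.mean(),
    "manual_median": manual_median(heights),
    "pandas_median": heights.median(),
    "mode": heights.mode().iloc[0],  # mode() returns a Series; take the first value
    "variance": heights.var(),       # sample variance (ddof=1 by default)
    "std": heights.std(),            # sample standard deviation (ddof=1 by default)
}
print(stats)
```

The outlier at 210 pulls the mean above the median, which is the kind of relationship between data and computed statistics that a plot would make visible.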
WHAT YOU WILL LEARN
- discover the key concepts covered in this course
- install the latest versions of pandas and visualization modules used to analyze data
- load data from a CSV file into a pandas DataFrame and perform some initial analysis
- calculate the mean and median of a distribution using your own function and compare it with the built-in pandas function
- use Seaborn and Matplotlib to visualize a distribution and where the mean, median, and mode fit in
- compute and visualize the standard deviation and variance of a distribution
- implement simple random and stratified sampling on a data frame
- use pandas to generate a sample using cluster and systematic sampling
- create a balanced sample using random undersampling and oversampling
- generate synthetic data in order to create a balanced sample using the Synthetic Minority Over-sampling Technique (SMOTE)
- summarize the key concepts covered in this course
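The four sampling techniques named in the objectives above can be sketched in a few lines of pandas. This is a minimal illustration, not the course's own code: the DataFrame, the `region` column, the sample sizes, and the fixed starting offset for systematic sampling are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical data: 12 rows, with 'region' serving as both stratum and cluster label.
df = pd.DataFrame({
    "id": range(12),
    "region": ["north"] * 6 + ["south"] * 4 + ["east"] * 2,
})

# Simple random sampling: every row has the same chance of being selected.
simple = df.sample(n=4, random_state=0)

# Stratified sampling: draw the same fraction from each region,
# so the sample keeps the regional proportions of the full data.
stratified = df.groupby("region").sample(frac=0.5, random_state=0)

# Systematic sampling: take every k-th row. The start is fixed at row 0 here;
# a randomly chosen start in [0, k) is a common variant.
k = 3
systematic = df.iloc[::k]

# Cluster sampling: choose whole regions at random and keep all of their rows.
chosen = pd.Series(df["region"].unique()).sample(n=2, random_state=0)
cluster = df[df["region"].isin(chosen)]

print(len(simple), len(stratified), len(systematic), len(cluster))
```

Stratified sampling draws from every group, while cluster sampling keeps only some groups but takes them whole.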
IN THIS COURSE
- 1. Course Overview | 2m 43s
- 2. Installing pandas and Data Visualization Modules | 6m 24s
- 3. Loading and Analyzing Data Using pandas | 11m 42s
- 4. Computing the Mean and Median of a Distribution | 8m 59s
- 5. Visualizing Distributions with Seaborn & Matplotlib | 11m 16s
- 6. Computing Variance and Standard Deviation | 12m 26s
- 7. Generating Random and Stratified Samples | 12m 31s
- 8. Implementing Cluster and Systematic Sampling | 10m 38s
- 9. Implementing Undersampling and Oversampling | 10m 45s
- 10. Oversampling with SMOTE | 6m 42s
- 11. Course Summary | 1m 51s
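For a sense of what the undersampling and oversampling videos deal with, here is a minimal sketch that balances a toy dataset using pandas alone. The course itself uses the imblearn module for this; the pandas-only approach, the DataFrame, and the column names `feature` and `label` are assumptions made for illustration. SMOTE is not shown: unlike random oversampling, which repeats existing minority rows, it creates new synthetic minority points by interpolating between existing ones.

```python
import pandas as pd

# Hypothetical imbalanced data: 8 rows of class 0 and 2 rows of class 1.
df = pd.DataFrame({"feature": range(10), "label": [0] * 8 + [1] * 2})

counts = df["label"].value_counts()
minority_size = int(counts.min())
majority_size = int(counts.max())

# Random undersampling: shrink every class to the size of the smallest class.
under = df.groupby("label").sample(n=minority_size, random_state=0)

# Random oversampling: grow every class to the size of the largest class
# by sampling with replacement, so minority rows are repeated.
over = df.groupby("label").sample(n=majority_size, replace=True, random_state=0)

print(under["label"].value_counts().to_dict())
print(over["label"].value_counts().to_dict())
```

Undersampling discards majority rows and oversampling duplicates minority rows, so each trades information loss against the risk of overfitting to repeated examples.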
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.
Digital badges are yours to keep, forever.