Python for Data Science: Petl 1.6 beginner

  • 4 Courses | 5h 57m 30s
  • 6 Books | 23h 15m
  • Includes Lab
  • 3 Courses | 4h 1m 23s
  • 1 Course | 1h 10m 56s
  • 5 Books | 20h 22m
  • Includes Lab
  • 2 Courses | 3h 7m 57s
  • 3 Courses | 4h 52m 16s
  • Includes Lab
  • 2 Courses | 1h 42m 19s
  • 2 Books | 9h 53m
  • Includes Lab
  • 2 Courses | 3h 20m 12s
  • 2 Courses | 3h 12m 2s
  • 3 Books | 20h 37m
  • Includes Lab
  • 11 Courses | 17h 4m 34s
  • 8 Books | 38h 47m
  • Includes Lab
  • 1 Course | 1h 7m 42s
  • 4 Courses | 4h 55m 16s
Rating 5.0 of 1 users (1)
 
Data Science has become the de facto field in computational and predictive statistical analysis, and Python has become an indispensable tool for it. Explore key Python tools and libraries for data science, including NumPy and Pandas.

GETTING STARTED

Analyzing Data Using Python: Data Analytics Using Pandas

  • 2m 45s
  • 11m 32s

GETTING STARTED

Python with Altair: An Introduction to Altair

  • 2m 19s
  • 6m 36s

GETTING STARTED

Python - Using Pandas to Work with Series & DataFrames

  • 1m 29s
  • 4m 49s

GETTING STARTED

Probability Distributions: Getting Started with Probability Distributions

  • 1m 47s
  • 6m 12s

GETTING STARTED

Operations with petl: Introduction

  • 2m 30s
  • 7m 55s

GETTING STARTED

Streaming Data Architectures: An Introduction to Streaming Data in Spark

  • 2m 29s
  • 7m 18s

GETTING STARTED

Python Statistical Plots: Visualizing & Analyzing Data Using Seaborn

  • 2m 41s
  • 5m 6s

GETTING STARTED

Python & Matplotlib: Getting Started with Matplotlib for Data Visualization

  • 2m 13s
  • 10m 41s

GETTING STARTED

Python - Introduction to NumPy for Multi-dimensional Data

  • 2m 6s
  • 2m 11s

GETTING STARTED

Applied Predictive Modeling

  • 1m 30s
  • 5m 55s

GETTING STARTED

Probability Distributions: Uniform, Binomial, & Poisson Distributions

  • 1m 55s
  • 7m 1s

COURSES INCLUDED

Analyzing Data Using Python: Data Analytics Using Pandas
Built on the Python programming language, pandas provides a flexible and open source tool for data manipulation. In this course, you'll develop the skills you need to get started with this library. You'll begin by installing pandas from a Jupyter notebook using pip. Next, you'll instantiate pandas objects, including a Series and a DataFrame, and practice several ways of instantiating DataFrames - for instance, from lists, dictionaries of lists, and tuples created from lists using the zip function. You'll round out this course by performing filter operations on DataFrames using the loc and iloc operations - fundamental techniques used to access specific rows and columns. You'll use loc to identify rows based on labels and iloc to access rows based on the index offset position starting from 0.
9 videos | 1h 15m has Assessment available Badge
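The difference between loc and iloc described above can be sketched as follows. This is a minimal illustration rather than course material; the names, scores, and index labels are made up for the example.

```python
import pandas as pd

# Hypothetical data: build a DataFrame from two lists zipped into tuples.
names = ["Ana", "Ben", "Cy"]
scores = [88, 92, 79]
df = pd.DataFrame(list(zip(names, scores)), columns=["name", "score"],
                  index=["a", "b", "c"])

# loc selects by index label; iloc selects by integer position (from 0).
by_label = df.loc["b", "score"]
by_position = df.iloc[1, 1]

# Label slices include the end label; positional slices exclude the end offset.
label_slice = df.loc["a":"b"]
position_slice = df.iloc[0:2]
```

Both slices return the same two rows here, even though one is written with labels and the other with offsets.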
Analyzing Data Using Python: Importing, Exporting, & Analyzing Data With Pandas
You can analyze a myriad of data formats through pandas - all you need to know is how. In this course, you'll bring various data types into pandas and perform several operations on the data. You'll practice using common file types such as CSV, Excel, JSON, and HTML through pandas. You'll not only learn how to open and read files of different types, but you'll also serialize objects and copy them to the in-memory clipboard. You'll move on to perform various fundamental operations on DataFrame objects. Lastly, you'll learn to compute basic statistics, access metadata, and modify and sort data in rows.
10 videos | 1h 22m has Assessment available Badge
Analyzing Data Using Python: Filtering Data in Pandas
Not all data is useful. Luckily, there are some powerful filtering operations available in pandas. The course begins with a detailed look at how loc and iloc can be used to access specific data from a DataFrame. You'll move on to filter data using the classic pandas lookup syntax and the pandas filter and query methods. You'll illustrate how the filter function accepts wildcards as well as regular expressions and use various methods such as the .isin method to filter data. Furthermore, you'll filter data using either two pairs of square brackets - in which case the resulting subset is itself a DataFrame - or a single pair of square brackets, in which case the returned data takes the form of a Series. You'll drop rows and columns from a pandas DataFrame and see how rows can be filtered out of a DataFrame. Lastly, you'll identify a possible gotcha that arises when you drop rows in-place but neglect to reset the index labels in your object.
10 videos | 1h 26m has Assessment available Badge
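As a rough sketch of the filtering techniques named above, the snippet below contrasts single and double square brackets, applies isin, query, and filter, and reproduces the index gotcha after an in-place drop. The city and temperature data are invented for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Kyiv"],
    "temp_c": [4, 19, 31, 9],
    "temp_f": [39, 66, 88, 48],
})

as_series = df["city"]      # single brackets -> Series
as_frame = df[["city"]]     # double brackets -> DataFrame

# Row filters: isin() builds a boolean mask; query() takes an expression string.
picked = df[df["city"].isin(["Oslo", "Pune"])]
warm = df.query("temp_c > 10")

# filter() selects columns by label; regex is matched against column names.
temps = df.filter(regex="^temp_")

# Gotcha: an in-place drop keeps the old labels, so label 1 is now missing.
df.drop(index=1, inplace=True)
labels_after_drop = list(df.index)
df.reset_index(drop=True, inplace=True)
labels_after_reset = list(df.index)
```

After the drop the labels run 0, 2, 3, so a later df.loc[1] would raise a KeyError until the index is reset.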
Analyzing Data Using Python: Cleaning & Analyzing Data in Pandas
For data analysis to be useful and accurate, the analyzed data needs to be cleaned and curated. There are copious methods to achieve this in pandas. In this course, you'll learn how to identify and eliminate duplicates in pandas. You'll start by using the pandas cut method to discretize data into bins, using bins to plot histograms and identify outliers using box-and-whisker plots. You'll parse and work with datetime objects read in from strings and convert string columns to datetime using the dateutil Python library. Moving on, you'll master different pandas methods for aggregating data - including the groupby, pivot, and pivot_table methods. Lastly, you'll perform various joins - inner, left outer, right outer, and full outer - using both the merge and join methods.
13 videos | 1h 53m has Assessment available Badge
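The binning, aggregation, and join steps mentioned above might look like the following sketch. The two tables, the bin edges, and the bin labels are assumptions chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical order and customer tables for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [15.0, 250.0, 40.0, 90.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 4],
    "name": ["Ana", "Ben", "Dee"],
})

# cut() discretizes a numeric column into labeled bins (right edge inclusive).
orders["size"] = pd.cut(orders["amount"], bins=[0, 50, 100, 500],
                        labels=["small", "medium", "large"])

# groupby() aggregates rows that share a key.
totals = orders.groupby("customer_id")["amount"].sum()

# merge() joins two frames; `how` picks inner, left, right, or outer.
inner = orders.merge(customers, on="customer_id", how="inner")
outer = orders.merge(customers, on="customer_id", how="outer")
```

The inner join keeps only customers present in both tables, while the outer join also keeps the unmatched order and the unmatched customer, filling the gaps with missing values.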

COURSES INCLUDED

Python with Altair: An Introduction to Altair
This course will get you familiar with the building blocks of Altair visualizations and some of the important chart settings. You will touch upon some of the fundamentals of plotting graphs in Altair. You'll start off by learning about the basic data structures that can form the basis of Altair visualizations, including JSON data and Pandas DataFrames in both wide-form and long-form. You'll then move on to plotting one of the simpler graphs, histograms, to visualize the distribution of values for a quantitative field in your dataset. While doing so, you'll get to explore the different ways in which Altair graphs can be customized including augmenting your chart with text, layering histograms to view two distributions together, and making histograms interactive.
8 videos | 52m has Assessment available Badge
Python with Altair: Plotting Fundamental Graphs
This course will introduce you to a breadth of charts available in Altair and how you can use them to get an all-round understanding of your data. The focus is to get you familiar with the wide variety of graphs that are available. You'll begin by visualizing a distribution of numeric values using box plots and violin charts, each of which has its own strengths and limitations when analyzing distributions. You'll then move on to bar charts to analyze numbers associated with categories in your data. While doing so, you will get to explore a variety of aggregate operations that are available in Altair in order to calculate a sum, mean, median, and so on. You'll then use line charts to visualize the changes in a particular value over a period of time and also its related visual - the area chart. Finally, you'll produce scatter plots to visualize the relationship between a pair of fields in your data. Throughout this course, you'll delve into a number of customizations which are available in Altair for each of the graphs which you plot.
13 videos | 1h 38m has Assessment available Badge
Python with Altair: Working with Specialized Graphs
This course introduces you to the use of Altair visualizations which can convey very detailed information for specialized datasets. You will cover some of the graphs that can be used to convey the information in very specific kinds of datasets, while also giving you some hands-on experience with advanced chart configurations. You'll begin by plotting information on a map, both to mark locations of places as well as to convey numerical information about regions. You'll then build a heatmap to analyze the numbers associated with a combination of two categorical variables. Next, you'll implement candlestick charts to visualize stock price movements, dot plots to analyze the range of movement for some values, and Gantt charts to view a project plan. Finally, you'll explore the use of window functions to analyze the top K elements in each category of your dataset.
13 videos | 1h 31m has Assessment available Badge

COURSES INCLUDED

Python - Using Pandas to Work with Series & DataFrames
Pandas, a popular Python library, is part of the open-source PyData stack. In this 10-video Skillsoft Aspire course, you will learn that Pandas represents data in a tabular format, which makes it easy and intuitive to perform data manipulation, cleaning, and exploration. You will use the Pandas DataFrame, a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). To take this course, you should already be familiar with the Python programming language; all code is written in Jupyter notebooks. You will work with basic Pandas data structures, including Series objects, which represent a single column of data and can store numerical values, strings, Booleans, and more complex data types. Learn how to use the Pandas DataFrame, which represents data in table form. Finally, learn to append and sort Series values, add missing data, add columns, and aggregate data in a DataFrame. The closing exercise involves instantiating a Pandas Series object by using both a list and a dictionary; changing the Series index to something other than the default value; and practicing sorting Series values in place.
11 videos | 1h 10m has Assessment available Badge
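The closing exercise described above could be approached along these lines. This is only a sketch with made-up values, not the course's own solution.

```python
import pandas as pd

# A Series built from a list gets the default index 0, 1, 2.
from_list = pd.Series([30, 10, 20])

# A Series built from a dictionary uses the keys as its index.
from_dict = pd.Series({"x": 30, "y": 10, "z": 20})

# Replace the default index with custom labels, then sort the values in place.
from_list.index = ["c", "a", "b"]
from_list.sort_values(inplace=True)
```

Sorting in place reorders the values and carries each label along with its value.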

COURSES INCLUDED

Probability Distributions: Getting Started with Probability Distributions
Probability distributions are statistical models that show the possible outcomes and statistical likelihood of any given event and are often useful for making business decisions. Get familiar with the theoretical concepts around statistics and probability distributions through this course and delve into applying statistical concepts to analyze your data using Python. Start by exploring statistical concepts and terminology that will help you understand the data you want to use for estimations on a population. You'll then examine probability distributions - the different forms of distributions, the types of events they model, and the various functions available to analyze distributions. Finally, you'll learn how to use Python to calculate and visualize confidence intervals, as well as the skewness and kurtosis of a distribution. After completing this course, you'll have a foundational understanding of statistical analysis and probability distributions.
13 videos | 1h 25m has Assessment available Badge
Graph Data Structures: Understanding Graphs & Knowledge Graphs
Graphs are used to model a large number of real-world scenarios, including professional networks, flight networks, and schedules. Working in these problem domains involves a deep understanding of how graphs are represented and how graph algorithms work. Learn the basic components of a graph and how nodes and edges can be used to model relationships. Examine how domains such as social networks, purchases on an e-commerce platform, and connected devices can be modeled using graphs. Next, explore how to use an organizing principle to add semantic meaning and context to graphs. Discover how to apply higher-level organizing principles to knowledge graphs using taxonomies and ontologies. Finally, get hands-on experience creating and manipulating graphs, and running graph algorithms using the NetworkX library in Python. When you have completed this course, you will have a solid understanding of how graphs model entities and relationships in the real world.
12 videos | 1h 42m has Assessment available Badge

COURSES INCLUDED

Operations with petl: Introduction
Extract, Transform, and Load (ETL) tasks help in collecting and manipulating data from diverse sources to fit the user's requirements. In this course, you'll explore different interfaces available in the petl library and perform basic ETL tasks using petl. You will begin by examining how to import data from various data sources, including delimited text files, Microsoft Excel, and structured JSON data. You'll also recognize how to load and save data in these formats. Next, you'll outline how to integrate petl with a relational database using SQLAlchemy and SQLite3. Finally, you'll perform transform operations on data using different petl features to filter specific data needed by you. Once you have completed this course, you'll have a clear understanding of the role played by petl in simplifying ETL tasks.
15 videos | 2h 2m has Assessment available Badge
Operations with petl: Basic Data Transformations
Software development often requires manipulation of data that has been extracted from different data sources to make it compatible with the user's specifications and requirements. petl's data transformation features can help achieve this. In this course, you'll investigate fundamental data transformations that can be performed using the petl library. You'll demonstrate how to load data into a petl table, filter columns, and combine multiple tables using different forms of concatenation operations. Next, you'll outline how to convert data in a petl table into a form that is compatible with your requirements. This includes transforming strings to numbers, applying calculations to numeric fields, and replacing specific values in the table. Lastly, you'll explore ways to filter content in petl tables using the facet() function and different select operations.
11 videos | 1h 34m has Assessment available Badge
Operations with petl: Advanced Extractions & Transformations
Petl facilitates and streamlines tasks related to data extraction and manipulation, often required by software developers to make data fit for actionable business intelligence (BI). In this course, you'll work with complex operations in petl and outline how to extract data from a source and convert it to a format that complies with your requirements. You'll begin by investigating the use of regular expressions to analyze, search, and extract specific rows and columns in a petl table. You'll then create transform functions and apply them to your data. These include operations on numeric as well as string fields. Moving on, you'll implement sort operations to organize data in a petl table and arrange it in a sequence that suits your purposes. Finally, you'll investigate how to perform joins and set operations on data tables and meaningfully reduce the data in them using aggregation functions.
10 videos | 1h 15m has Assessment available Badge

COURSES INCLUDED

Streaming Data Architectures: An Introduction to Streaming Data in Spark
Learn the fundamentals of streaming data with Apache Spark. During this course, you will discover the differences between batch and streaming data. Observe the types of streaming data sources. Learn how to process streaming data, transform the stream, and materialize the results. Decouple a streaming application from the data sources with a message transport. Next, learn about techniques used in Spark 1.x to work with streaming data and how they contrast with processing batch data; how structured streaming in Spark 2.x is able to ease the task of stream processing for the app developer; and how streaming processing works in both Spark 1.x and 2.x. Finally, learn how triggers can be set up to periodically process streaming data, and the key aspects of working with structured streaming in Spark.
9 videos | 50m has Assessment available Badge
Streaming Data Architectures: Processing Streaming Data with Spark
Process streaming data with Spark, an analytics engine that can run on Hadoop. In this course, you will discover how to develop applications in Spark to work with streaming data and generate output. Topics include the following: Configure a streaming data source; Use Netcat and write applications to process the data stream; Learn the effects of using the Update mode on your stream processing application's output; Write a monitoring application that listens for new files added to a directory; Compare the Append output mode with the Update mode; Develop applications to limit files processed in each trigger; Use Spark's Complete mode for output; Perform aggregation operations on streaming data with the DataFrame API; Process streaming data with Spark SQL queries.
11 videos | 52m has Assessment available Badge

COURSES INCLUDED

Python Statistical Plots: Visualizing & Analyzing Data Using Seaborn
The wealth of Python data visualization libraries makes it hard to decide the best choice for each use case. However, if you're looking for statistical plots that are easy to build and visually appealing, Seaborn is the obvious choice. You'll begin this course by using Seaborn to construct simple univariate histograms and use kernel density estimation, or KDE, to visualize the probability distribution of your data. You'll then work with bivariate histograms and KDE curves. Next, you'll use box plots to concisely represent the median and the inter-quartile range (IQR) and define outliers in data. You'll work with boxen plots, which are conceptually similar to box plots but employ percentile markers rather than whiskers. Finally, you'll use Violin plots to represent the entire probability density function, obtained via a KDE estimation, for your data.
17 videos | 1h 46m has Assessment available Badge
Python Statistical Plots: Time Series Data & Regression Analysis in Seaborn
Seaborn's smartly designed interface lets you illuminate data through aesthetically pleasing statistical graphics that are incredibly easy to build. In this course, you'll discover Seaborn's capabilities. You'll begin using strip plots and swarm plots and recognizing how they work together using low-intensity noise. You'll then work with time series data through various techniques, like resampling data at different time frequencies and plotting with confidence intervals and other types of error bars. Next, you'll visualize both logistic and linear regression curves. Moving on, you'll use the pairplot function to visualize the relationships between columns in your data, taken two at a time, in a grid format. You'll change the chart type being visualized and create pair plots with multiple chart types in each plot. Lastly, you'll create and format a heatmap of a correlation matrix to identify relationships between dataset columns.
13 videos | 1h 33m has Assessment available Badge

COURSES INCLUDED

Python & Matplotlib: Getting Started with Matplotlib for Data Visualization
Matplotlib is a Python plotting library used to create dynamic visualizations using pyplot, a state-based interface. You'll learn how to correctly install and use Matplotlib to build line charts, bar charts, and histograms in this course. You'll create basic line charts out of randomly generated data. You'll learn how to use the plt.subplots() function, import data from a CSV file using pandas, and create and customize various line charts. Additionally, you'll create figures holding more than one axes object, learn why and how to use the twinx() function, and create multiple lines in the same line chart with different y-axes for each line. Moving on, you'll construct histograms that visualize multiple variables and approximate the cumulative probability density function. Lastly, you'll create some bar charts to represent categorical data.
13 videos | 1h 43m has Assessment available Badge
Python & Matplotlib: Creating Box Plots, Scatter Plots, Heatmaps, & Pie Charts
Matplotlib can be used to create box-and-whisker plots to display statistics. These dense visualizations pack much information into a compact form, including the median, 25th and 75th percentiles, interquartile range, and outliers. In this course, you'll learn how to work with all aspects of box-and-whisker plots, such as the use of confidence-interval notches, mean markers, and fill color. You'll also build grouped box-and-whisker plots. Next, you'll create scatter plots and heatmaps, powerful tools in exploratory data analysis. You'll build standard scatter plots before customizing various aspects of their appearance. You'll then examine the ideal uses of scatter plots and correlation heatmaps. You'll move on to visualizing composition, first using pie charts, building charts that explode out specific slices. Lastly, you'll build treemaps to visualize data with multiple levels of hierarchy.
11 videos | 1h 28m has Assessment available Badge

COURSES INCLUDED

Python - Introduction to NumPy for Multi-dimensional Data
This Skillsoft Aspire course explores NumPy, a Python library used in data science and big data. NumPy provides a framework to express data in the form of arrays, and is the fundamental building block for several other Python libraries. For this course, you will need to know the basics of programming in Python 3, and should also have some familiarity with Jupyter notebooks. You will learn how to create NumPy arrays and perform basic mathematical operations on them. Next, you will see how to modify, index, slice, and reshape the arrays, and examine the NumPy library's universal array functions that operate on an element-by-element basis. Conclude by learning the various options for iterating through NumPy arrays.
11 videos | 58m has Assessment available Badge
Python - Advanced Operations with NumPy Arrays
NumPy is one of the fundamental packages for scientific computing and allows data to be represented in multidimensional arrays. This course covers the array operations you can undertake, such as image manipulation, fancy indexing, and broadcasting. To take this Skillsoft Aspire course, you should be comfortable with how to create, index, and slice NumPy arrays, and apply aggregate and universal functions. Among the topics, you will learn about the several options available in NumPy to split arrays. You will learn how to use NumPy to work with digital images, which are multidimensional arrays. Next, you will observe how to manipulate a color image, perform slicing operations to view sections of the image, and use a SciPy package for image manipulation. You will learn how to use masks or arrays of index values to access multiple elements of an array simultaneously, referred to as fancy indexing. Finally, this course covers broadcasting to perform operations between arrays of mismatched shapes.
13 videos | 1h 7m has Assessment available Badge
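Fancy indexing, masks, and broadcasting, as described above, can be illustrated with a small array. This is a minimal sketch; the array contents and offsets are arbitrary.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # rows 0..2, columns 0..3

# Fancy indexing: a list (or array) of index values picks several rows at once.
picked_rows = grid[[0, 2]]

# Boolean mask: a same-shaped array of True/False selects matching elements.
evens = grid[grid % 2 == 0]

# Broadcasting: the shape-(4,) row is stretched across all 3 rows of the
# shape-(3, 4) grid, so no explicit loop is needed.
offsets = np.array([10, 20, 30, 40])
shifted = grid + offsets

# A shape-(3, 1) column broadcasts across the 4 columns instead.
scaled = grid * np.array([[1], [10], [100]])
```

Broadcasting works here because each pair of trailing dimensions is either equal or has size 1; shapes such as (3, 4) and (3,) would raise an error instead.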
Python - Introduction to Pandas and DataFrames
Simplify data analysis with Pandas DataFrames. Pandas is a Python library that enables you to work with series and tabular data, including their initialization and population. For this course, learners do not need prior experience working with Pandas, but should be familiar with Python 3 and Jupyter notebooks. Topics include the following: Define your own index for a Pandas Series object; Load data from a CSV (comma-separated values) file to create a Pandas DataFrame; Add and remove data from your Pandas DataFrame; Analyze a portion of your DataFrame; Examine how to reshape or reorient data, and how to create a pivot table. Finally, represent multidimensional data in two-dimensional DataFrames with multi-level or hierarchical indexes.
14 videos | 1h 4m has Assessment available Badge
Python - Manipulating & Analyzing Data in Pandas DataFrames
Explore advanced data manipulation and analysis with DataFrames in Pandas, a Python library that shares similarities with relational databases. To take this course, prior basic experience is needed with Pandas DataFrames, data loading, and data manipulation in Jupyter notebooks. You will learn to iterate over data in your DataFrame. See how to export data to Excel files, JSON (JavaScript Object Notation) files, and CSV (comma-separated values) files. Sort the contents of a DataFrame and manage missing data. Group data with a multi-index. Merge disparate data into a single DataFrame through join and concatenate operations. Finally, you will determine when and where to integrate data with structured queries, similar to SQL.
10 videos | 44m has Assessment available Badge
Python for Data Science: Basic Data Visualization Using Seaborn
Explore Seaborn, a Python library used in data science that provides an interface for drawing graphs that convey a lot of information and are also visually appealing. To take this course, learners should be comfortable programming in Python and using Jupyter notebooks; familiarity with Pandas and NumPy would be helpful, but is not required. The course explores how Seaborn provides higher-level abstractions over Python's Matplotlib, how it is tightly integrated with the PyData stack, and how it integrates with other data structure libraries such as NumPy and Pandas. You will learn to visualize the distribution of a single column of data in a Pandas DataFrame by using histograms and the kernel density estimation curve, and then slowly begin to customize the aesthetics of the plot. Next, learn to visualize bivariate distributions, which show data with two variables in the same plot, and see the various ways to do it in Seaborn. Finally, you will explore different ways to generate regression plots in Seaborn.
11 videos | 1h 6m has Assessment available Badge
Python for Data Science: Advanced Data Visualization Using Seaborn
Explore Seaborn, a Python library used in data science that provides an interface for drawing graphs that convey a lot of information and are also visually appealing. To take this course, learners should be comfortable programming in Python, have some experience using Seaborn for basic plots and visualizations, and be familiar with plotting distributions, as well as simple regression plots. You will work with continuous variables to modify plots and put them into a context that can be shared. Next, learn how to plot categorical variables by using box plots, violin plots, swarm plots, and FacetGrids (lattice or trellis plotting). You will learn to plot a grid of graphs for each category of your data. Learners will explore Seaborn's standard aesthetic configurations, including the color palette and style elements. Finally, this course teaches learners how to tweak displayed data to convey more information from the graphs.
11 videos | 1h 3m has Assessment available Badge
Python - Using Pandas for Visualizations and Time-Series Data
This 12-video Skillsoft Aspire course uses Python, the preferred programming language for data science, to explore data in Pandas with popular chart types such as the bar graph, histogram, pie chart, and box plot. Discover how to work with time series and string data in data sets. Pandas represents data in a tabular format, which makes it easy to perform data manipulation, cleaning, and data exploration, all important parts of any data engineer's toolkit. You will learn how to use Matplotlib, a multiplatform data visualization library built on NumPy, the Python library that is used to work with multidimensional data. Learners will use Pandas features to work with specific kinds of data such as time series data and stream data. This course includes a real-world demonstration that uses Pandas to analyze stock market returns for Amazon. Finally, you will learn how to make data transformations to clean, format, and transform the data into a useful form for further analysis.
13 videos | 1h 28m has Assessment available Badge
Python - Pandas Advanced Features
This course uses Python, the preferred programming language for data science, to explore Pandas, a popular Python library that is part of the open-source PyData stack. In this 11-video Skillsoft Aspire course, learners will use the Pandas DataFrame to perform advanced category grouping, aggregations, and filtering operations. You will see how to use Pandas to retrieve a subset of your data by performing filtering operations on both rows and columns. You will perform analysis on multilevel data by using the groupby operation on a DataFrame. You will then learn to use data masking, or data obfuscation, to protect classified or commercially sensitive data. Learners will work with duplicate data, an important part of data cleaning. You will examine the two broad categories of data: continuous data, which comprises a continuous range of values, and categorical data, which has discrete, finite values. Pandas automatically generates indexes for each DataFrame row, and here you will learn to perform different types of reindexing operations on a DataFrame.
12 videos | 1h 11m has Assessment available Badge
Data Wrangling in Python Bootcamp: Session 1 Replay
This is a recorded Replay of the Data Wrangling in Python Live session that ran on February 23rd.
3 videos | 2h 59m available Badge
Data Wrangling in Python Bootcamp: Session 2 Replay
This is a recorded Replay of the Data Wrangling in Python Live session that ran on February 24th at 11 AM ET.
4 videos | 2h 40m available Badge
Data Wrangling in Python Bootcamp: Session 3 Replay
This is a recorded Replay of the Data Wrangling in Python Live session that ran on February 25th at 11 AM ET.
3 videos | 2h 39m available Badge

COURSES INCLUDED

Applied Predictive Modeling
In this course, you will explore machine learning predictive modeling and commonly used models like regressions, clustering, and Decision Trees that are applied in Python with the scikit-learn package. Begin this 13-video course with an overview of predictive modeling and recognize its characteristics. You will then use Python and related data analysis libraries including NumPy, Pandas, Matplotlib, and Seaborn, to perform exploratory data analysis. Next, you will examine regression methods, recognizing the key features of Linear and Logistic regressions, then apply both a linear and a logistic regression with Python. Learn about clustering methods, including the key features of hierarchical clustering and K-Means clustering, then learn how to apply hierarchical clustering and K-Means clustering with Python. Examine the key features of Decision Trees and Random Forests, then apply a Decision Tree and a Random Forest with Python. In the concluding exercise, learners will be asked to apply linear regression, logistic regression, hierarchical clustering, Decision Trees, and Random Forests with Python.
13 videos | 1h 7m has Assessment available Badge

COURSES INCLUDED

Probability Distributions: Uniform, Binomial, & Poisson Distributions
Python libraries, such as NumPy and SciPy, are used for mathematical and numerical analysis. Through this course, learn how to generate uniform, binomial, and Poisson distributions using these libraries. Begin by exploring uniform distributions and delve into continuous and discrete distributions. You will then explore binomial distributions in depth, including real-life situations where they can be applied. This course will also help you learn more about Poisson distributions and recognize their use cases. While examining these distributions, you will use functions such as probability density functions, probability mass functions, and cumulative distribution functions, among others, to make estimations from your data. Upon completion of this course, you'll have the skills and knowledge to implement and visualize uniform, binomial, and Poisson distributions in Python.
11 videos | 1h 28m has Assessment available Badge
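As a rough illustration of the probability mass and cumulative distribution functions mentioned in the description above, the sketch below writes out the textbook formulas for the discrete uniform, binomial, and Poisson distributions. It deliberately uses only the Python standard library rather than the NumPy and SciPy calls the course covers, and every function name and parameter value here is a hypothetical choice made for the example.

```python
import math


def discrete_uniform_pmf(k, low, high):
    """P(X = k) when every integer in [low, high] is equally likely."""
    return 1 / (high - low + 1) if low <= k <= high else 0.0


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """P(exactly k events) when events occur at the given average rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def cdf(pmf, k, *params):
    """P(X <= k) for a non-negative integer variable, by summing the PMF."""
    return sum(pmf(i, *params) for i in range(k + 1))


# Illustrative values only: a fair die, 10 fair coin flips, 2 events per interval.
die_shows_three = discrete_uniform_pmf(3, 1, 6)
five_heads_in_ten = binomial_pmf(5, 10, 0.5)
at_most_five_heads = cdf(binomial_pmf, 5, 10, 0.5)
no_events = poisson_pmf(0, 2.0)
at_most_three_events = cdf(poisson_pmf, 3, 2.0)
```

Summing the PMF is fine for small counts like these; for large `n` or real datasets a numerical library is the more practical choice.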
Probability Distributions: Understanding Normal Distributions
This course dives deep into normal distributions, also known as Gaussian distributions, while also introducing you to the law of large numbers and the Central Limit Theorem. You will begin by using Python's SciPy library to generate a normal distribution and examine several available functions that allow you to make estimations on normally distributed data. This course will also help you understand and visualize the law of large numbers and explore the Central Limit Theorem by generating multiple samples and analyzing them. After you are done with this course, you'll have the skills and knowledge to analyze data and build your own models.
8 videos | 1h 4m has Assessment available Badge
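The ideas named in the description above can be sketched in a few lines. This is a minimal sketch under stated assumptions: it uses the standard library's `statistics.NormalDist` and `random` modules instead of SciPy, and the seed, sample sizes, and variable names are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary fixed seed so repeated runs match

# Normal distribution: probability mass within one standard deviation
# of the mean, which is about 0.68 for any normal distribution.
standard_normal = statistics.NormalDist(mu=0.0, sigma=1.0)
within_one_sigma = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Law of large numbers: the mean of many fair die rolls settles near 3.5,
# while the mean of only a few rolls can land far from it.
rolls = [rng.randint(1, 6) for _ in range(100_000)]
mean_of_first_ten = statistics.fmean(rolls[:10])
mean_of_all_rolls = statistics.fmean(rolls)

# Central Limit Theorem: means of repeated samples drawn from a
# non-normal (uniform) population cluster around the population mean,
# with spread close to population_sd / sqrt(sample_size).
SAMPLE_SIZE = 50
NUM_SAMPLES = 2_000
sample_means = [
    statistics.fmean([rng.random() for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]
mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
expected_spread = (1 / 12) ** 0.5 / SAMPLE_SIZE**0.5  # uniform(0, 1) has variance 1/12
```

Plotting a histogram of `sample_means` would show the familiar bell shape even though the individual draws are uniform.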
Graph Data Structures: Representing Graphs Using Matrices, Lists, & Sets
In order to really understand how graphs work, it is important to know how they are implemented. There are multiple ways to represent graphs in code and each representation has its own advantages and disadvantages. In this course, you will implement graphs using three different representations - the adjacency matrix, the adjacency list, and the adjacency set. Learn how the adjacency matrix representation uses a square matrix to represent connections between the nodes of a graph and also edge weights. Next, explore how the adjacency list suffers from a major drawback: the same graph can have multiple representations. Finally, discover how the adjacency set representation has exactly one way in which a graph is represented. When you are finished with this course, you will be able to create and work with your own graph structures and optimize them for different purposes.
8 videos | 52m has Assessment available Badge
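To make the three representations named in the description above concrete, the following sketch builds the same small undirected graph as an adjacency matrix, an adjacency list, and an adjacency set. The edge list, node numbering, and helper names are hypothetical and are not taken from the course; only the Python standard library is used.

```python
# Hypothetical 4-node undirected graph; the edges are illustrative only.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3)]
NUM_NODES = 4


def build_adjacency_matrix(num_nodes, edges):
    """Square matrix: cell [u][v] is 1 when an edge joins u and v."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected, so the matrix is symmetric
    return matrix


def build_adjacency_list(num_nodes, edges):
    """Each node maps to a list of neighbours, kept in insertion order."""
    adjacency = {node: [] for node in range(num_nodes)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


def build_adjacency_set(num_nodes, edges):
    """Each node maps to a set of neighbours, so ordering plays no role."""
    adjacency = {node: set() for node in range(num_nodes)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


reversed_edges = list(reversed(EDGES))
matrix = build_adjacency_matrix(NUM_NODES, EDGES)
as_list = build_adjacency_list(NUM_NODES, EDGES)
as_list_reversed = build_adjacency_list(NUM_NODES, reversed_edges)
as_set = build_adjacency_set(NUM_NODES, EDGES)
as_set_reversed = build_adjacency_set(NUM_NODES, reversed_edges)

# The same graph yields different lists depending on edge order,
# while the set representation comes out identical either way.
lists_differ = as_list != as_list_reversed
sets_match = as_set == as_set_reversed
```

Storing an edge weight instead of `1` in the matrix cells is one way to extend the first representation to weighted graphs.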
Graph Data Structures: Implementing Graph Traversal & Shortest Path Algorithms
What makes the graph data structure very interesting and powerful is the large number of algorithms that can be run on graphs to extract insights. Common graph algorithms include traversing a graph and computing the shortest path between nodes. Implementing these algorithms is a great way to learn how graphs are explored and optimized. In this course, learn how graphs can be traversed by studying both depth-first and breadth-first graph traversal and discover how they can be implemented using a stack and a queue respectively. Next, explore how to compute the shortest path in an unweighted graph. And finally, use Dijkstra's algorithm to compute the shortest path in a weighted graph. Upon completion of this course, you will be able to implement optimal algorithms on graphs.
11 videos | 1h 29m has Assessment available Badge
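The traversal and shortest-path techniques listed in the description above can be sketched as follows. This is an illustrative sketch, not the course's own code: both example graphs, their node labels and weights, and all function names are assumptions, and only the standard library (`collections.deque` and `heapq`) is used.

```python
from collections import deque
import heapq

# Hypothetical directed graphs used only for illustration.
UNWEIGHTED = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
WEIGHTED = {"A": {"B": 4, "C": 1}, "B": {"D": 1}, "C": {"B": 2, "D": 5}, "D": {}}


def depth_first(graph, start):
    """Depth-first traversal driven by an explicit stack."""
    visited, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first listed neighbour is explored first.
        stack.extend(reversed(graph[node]))
    return order


def breadth_first(graph, start):
    """Breadth-first traversal driven by a queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


def shortest_path_unweighted(graph, start, goal):
    """Fewest-edges path found by breadth-first search with parent links."""
    parents, queue = {start: None}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # goal is unreachable from start


def dijkstra(graph, start):
    """Smallest total weight from start to each reachable node."""
    distances, heap = {start: 0}, [(0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > distances[node]:
            continue  # stale entry; a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return distances


dfs_order = depth_first(UNWEIGHTED, "A")
bfs_order = breadth_first(UNWEIGHTED, "A")
fewest_edges = shortest_path_unweighted(UNWEIGHTED, "A", "E")
cheapest = dijkstra(WEIGHTED, "A")
```

In this sketch Dijkstra's algorithm assumes non-negative edge weights; with negative weights a different algorithm would be needed.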

EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

BOOKS INCLUDED

Book

Thinking in Pandas: How to Use the Python Data Analysis Library the Right Way
Understand and implement big data analysis solutions in pandas with an emphasis on performance. This book strengthens your intuition for working with pandas, the Python data analysis library, by exploring its underlying implementation and data structures.
book Duration 2h 3m book Authors By Hannah Stepanek

Book

Python Data Analytics: With Pandas, NumPy, and Matplotlib, Second Edition
Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, this book explores the latest Python tools and techniques to help you tackle the world of data acquisition and analysis.
book Duration 5h 49m book Authors By Fabio Nelli

Book

Python Data Analytics: Data Analysis and Science Using Pandas, Matplotlib, and the Python Programming Language
By expertly showing the strength of the Python programming language when applied to processing, managing and retrieving information, this book will help you tackle the world of data acquisition and analysis using the power of the Python language.
book Duration 4h 33m book Authors By Fabio Nelli

Book

Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries
The book will start with quick introductions to Python and its ecosystem libraries for data science such as JupyterLab, Numpy, Pandas, SciPy, Matplotlib, and Seaborn.
book Duration 2h 6m book Authors By PURNA CHANDER RAO. KATHUL

Book

Pandas in 7 Days: Utilize Python to Manipulate Data, Conduct Scientific Computing, Time Series Analysis, and Exploratory Data Analysis
If you're looking to expedite a data science or sophisticated data analysis project, you've come to the perfect place. Each data analysis topic is covered step-by-step with real-world examples.
book Duration 4h 51m book Authors By Fabio Nelli

Book

Data Wrangling: Using Pandas, SQL, and Java
This book is intended primarily for those who plan to become data scientists as well as anyone who needs to perform data cleaning tasks.
book Duration 3h 53m book Authors By Oswald Campesato

BOOKS INCLUDED

Book

Data Science Fundamentals for Python and MongoDB
Helping you build the foundational data science skills necessary to work with and better understand complex data science algorithms, this book provides complete Python coding examples to complement and clarify data science concepts, and enrich the learning experience.
book Duration 1h 39m book Authors By David Paper

Book

Data Science Using Python and R
Written for the general reader with no previous analytics or programming experience, this step-by-step book will show you how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques.
book Duration 3h 50m book Authors By Chantal D. Larose, Daniel T. Larose

Book

Python for Data Science for Dummies, 2nd Edition
Written for people who are new to data analysis, this book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.
book Duration 7h 11m book Authors By John Paul Mueller, Luca Massaron

Book

Python for R Users: A Data Science Approach
Short on theory and long on actionable analytics, this definitive guide provides readers with a detailed comparative introduction and overview of both languages and features concise tutorials with command-by-command translations of R to Python and Python to R.
book Duration 2h 53m book Authors By Ajay Ohri

Book

Practical Data Science with Python 3: Synthesizing Actionable Insights from Data
Providing insight into essential data science skills in a holistic manner, this book will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors.
book Duration 4h 49m book Authors By Ervin Varga

BOOKS INCLUDED

Book

PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes
Using a problem-solution approach, this authoritative resource will teach you how to carry out data analysis with PySpark SQL, graphframes, and graph data processing.
book Duration 2h 42m book Authors By Raju Kumar Mishra, Sundar Rajan Raman

Book

Python for Data Science for Dummies, 2nd Edition
Written for people who are new to data analysis, this book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.
book Duration 7h 11m book Authors By John Paul Mueller, Luca Massaron

BOOKS INCLUDED

Book

Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib, Second Edition
Leverage the numerical and mathematical modules in Python and its standard library as well as popular open source numerical Python packages like NumPy, SciPy, FiPy, matplotlib and more.
book Duration 10h 15m book Authors By Robert Johansson

Book

Python Data Analytics: With Pandas, NumPy, and Matplotlib, Second Edition
Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, this book explores the latest Python tools and techniques to help you tackle the world of data acquisition and analysis.
book Duration 5h 49m book Authors By Fabio Nelli

Book

Python Data Analytics: Data Analysis and Science Using Pandas, Matplotlib, and the Python Programming Language
By expertly showing the strength of the Python programming language when applied to processing, managing and retrieving information, this book will help you tackle the world of data acquisition and analysis using the power of the Python language.
book Duration 4h 33m book Authors By Fabio Nelli

BOOKS INCLUDED

Book

Python Data Analytics: With Pandas, NumPy, and Matplotlib, Second Edition
Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, this book explores the latest Python tools and techniques to help you tackle the world of data acquisition and analysis.
book Duration 5h 49m book Authors By Fabio Nelli

Book

Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib, Second Edition
Leverage the numerical and mathematical modules in Python and its standard library as well as popular open source numerical Python packages like NumPy, SciPy, FiPy, matplotlib and more.
book Duration 10h 15m book Authors By Robert Johansson

Book

Data Science Fundamentals for Python and MongoDB
Helping you build the foundational data science skills necessary to work with and better understand complex data science algorithms, this book provides complete Python coding examples to complement and clarify data science concepts, and enrich the learning experience.
book Duration 1h 39m book Authors By David Paper

Book

Data Analysis and Visualization Using Python: Analyze Data to Create Visualizations for BI Systems
Featuring a detailed business case on effective strategies on data visualization, this book looks at Python from a data science point of view and teaches proven techniques for data visualization as used in making critical business decisions.
book Duration 2h 21m book Authors By Ossama Embarak

Book

Data Science Using Python and R
Written for the general reader with no previous analytics or programming experience, this step-by-step book will show you how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques.
book Duration 3h 50m book Authors By Chantal D. Larose, Daniel T. Larose

Book

Python for Data Science for Dummies, 2nd Edition
Written for people who are new to data analysis, this book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.
book Duration 7h 11m book Authors By John Paul Mueller, Luca Massaron

Book

Python for R Users: A Data Science Approach
Short on theory and long on actionable analytics, this definitive guide provides readers with a detailed comparative introduction and overview of both languages and features concise tutorials with command-by-command translations of R to Python and Python to R.
book Duration 2h 53m book Authors By Ajay Ohri

Book

Practical Data Science with Python 3: Synthesizing Actionable Insights from Data
Providing insight into essential data science skills in a holistic manner, this book will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors.
book Duration 4h 49m book Authors By Ervin Varga

SKILL BENCHMARKS INCLUDED

Python Data Visualization Competency (Intermediate Level)
The Python Data Visualization benchmark will measure your ability to apply data visualization techniques in Python using Python statistical plots, Python with Altair, and Dash Python frameworks. You will be evaluated on your ability to recognize the visual and analytical features of Python. A learner who scores high on this benchmark demonstrates that they have the skills to develop interactive Python applications with visual representations of plots, graphs, and charts.
30m    |   20 questions

SKILL BENCHMARKS INCLUDED

Graph Analytics Literacy (Beginner Level)
The Graph Analytics Literacy benchmark will measure your ability to recall, recognize, and understand graph concepts, fundamentals of graph databases, and graph data structures. You will be evaluated on your ability to recognize the basic concepts of graph data structures and algorithms. A learner who scores high on this benchmark demonstrates that they have the required foundation of graph data structures.
12m    |   12 questions

SKILL BENCHMARKS INCLUDED

Python ETL with petl Literacy (Beginner Level)
The Python ETL with petl Literacy (Beginner Level) benchmark measures your ability to process data belonging to various file formats, connect to a database, and perform basic Extract, Transform, and Load (ETL) tasks using petl. You will be evaluated on your ability to perform fundamental transform operations on numbers, strings, and tables using petl. A learner who scores high on this benchmark demonstrates that they have the skills to perform basic data transformations with petl.
14m    |   14 questions
Python ETL with petl Competency (Intermediate Level)
The Python ETL with petl Competency (Intermediate Level) benchmark measures your ability to perform data operations by implementing replace and type change operations, querying data in petl data tables, and defining filters. You will be evaluated on your ability to extract data using regular expressions, implement joins and set operations on tables, and aggregate data using petl. A learner who scores high on this benchmark demonstrates that they have the skills to implement advanced extractions and transformations with petl.
11m    |   11 questions

SKILL BENCHMARKS INCLUDED

Data Visualization in Python with seaborn and Altair Literacy (Beginner Level)
The Data Visualization in Python with seaborn and Altair Literacy (Beginner Level) benchmark measures your ability to use Seaborn to build univariate and bivariate histograms and kernel density estimation (KDE) curves and plots, as well as box, boxen, and violin plots. You will be evaluated on your ability to recognize the types of data that can be visualized in Altair and plot some of the basic charts available in this tool. A learner who scores high on this benchmark demonstrates that they have the skills to visualize and analyze data using seaborn and Altair.
16m    |   16 questions
Data Visualization in Python with seaborn and Altair Competency (Intermediate Level)
The Data Visualization in Python with seaborn and Altair Competency (Intermediate Level) benchmark measures your ability to use seaborn to work with strip and swarm plots, time series data, error bars, logistic and linear regression curves, pair plots, and heatmaps. You will be evaluated on your ability to plot different forms of charts using Altair in order to analyze a variety of datasets and visualize specialized data using a variety of Altair charts. A learner who scores high on this benchmark demonstrates that they have the skills to visualize data with plots, graphs, and charts using the seaborn and Altair frameworks.
25m    |   25 questions

SKILL BENCHMARKS INCLUDED

Data Visualization Proficiency (Advanced Level)
The Data Visualization Proficiency benchmark will measure your ability to recall, relate, demonstrate, and apply data visualization concepts and techniques in Excel, QlikView, and various Python visualization libraries. You will be evaluated on your ability to recognize and apply the concepts of data visualization techniques, tools, and functions in Excel, QlikView, infographics, and Python. A learner who scores high on this benchmark demonstrates that they have the required data visualization skills to understand, apply, and work independently on the visualizations in their projects.
30m    |   20 questions
Data Visualization in Python with Matplotlib Literacy (Beginner Level)
The Data Visualization in Python with Matplotlib Literacy (Beginner Level) benchmark will measure your ability to recall and relate underlying data visualization concepts using Python and Matplotlib. You will be evaluated on your ability to recognize the foundational concepts of data visualization, its uses, and best practices. A learner who scores high on this benchmark demonstrates that they have the basic data visualization skills to understand and grasp visualization techniques and their uses.
16m    |   16 questions
Data Visualization in Python with Matplotlib Competency (Intermediate Level)
The Data Visualization in Python with Matplotlib Competency (Intermediate Level) benchmark will measure your ability to recall, relate, demonstrate, and apply data visualization concepts and techniques in Python using the Matplotlib library. You will be evaluated on your ability to recognize and apply data visualization concepts, techniques, tools, and functions in Matplotlib. A learner who scores high on this benchmark demonstrates that they have the required data visualization skills to understand, apply, and work independently on visualizations in their projects.
10m    |   10 questions

SKILL BENCHMARKS INCLUDED

Data Analysis with Python Literacy (Beginner Level)
The Data Analysis with Python Literacy benchmark will measure your ability to recall and relate Python concepts, including using the NumPy library and its arrays for manipulating and analyzing data, and a basic idea of Python libraries such as pandas, Matplotlib, and seaborn for data analysis. A learner who scores high on this benchmark demonstrates that they have a basic understanding of Python libraries, visualization libraries such as Matplotlib and seaborn, and basic skills for performing data analysis using NumPy and pandas.
18m    |   18 questions
Data Visualization with Python Literacy (Beginner Level)
The Data Visualization with Python Literacy benchmark will measure your ability to recall and relate the underlying data visualization concepts in Python. You will be evaluated on your ability to recognize the foundational concepts of data visualization, representation, charting, and plotting in Python using libraries such as Matplotlib, Plotly, and Seaborn. A learner who scores high on this benchmark demonstrates that they have basic data visualization skills using Python.
13m    |   13 questions
Data Analysis with Python Competency (Intermediate Level)
The Data Analysis with Python Competency benchmark will measure your ability to recall and relate Python concepts, including NumPy and pandas for manipulating, analyzing, and transforming the data, as well as Matplotlib and seaborn for visualizing data. A learner who scores high on this benchmark demonstrates that they have good Python data analysis, visualization, and data wrangling skills and can work on data analysis projects with minimal supervision.
23m    |   23 questions
Data Visualization with Python Competency (Intermediate Level)
The Data Visualization with Python Competency benchmark will measure your ability to recall and relate underlying data visualization concepts in Python. You will be evaluated on your ability to recognize the concepts of data visualization and advanced data visualization, as well as data representation, charting, and plotting in Python using the pandas, Matplotlib, and Plotly libraries. A learner who scores high on this benchmark demonstrates that they have data visualization skills using Python.
16m    |   16 questions
Data Visualization with Python Proficiency (Advanced Level)
The Data Visualization with Python Proficiency benchmark will measure your ability to perform data visualizations in Python using advanced plotting and charting techniques, as well as various visualization libraries such as Matplotlib, Plotly, seaborn, and Bokeh. A learner who scores high on this benchmark demonstrates that they can independently work on data visualization in Python.
30m    |   20 questions
Data Analysis with Python Proficiency (Advanced Level)
The Data Analysis with Python Proficiency benchmark will measure your ability to recall and relate Python concepts, including using NumPy and pandas for manipulating, analyzing, and transforming the data, as well as Matplotlib and seaborn for visualizing data. A learner who scores high on this benchmark demonstrates that they have very good Python data analysis, visualization, and data wrangling skills and can work independently on data analysis projects.
22m    |   22 questions
Data Visualization in Python with Seaborn Literacy (Beginner Level)
The Data Visualization in Python with Seaborn Literacy (Beginner Level) benchmark will measure your ability to recall and relate underlying data visualization concepts using Python and seaborn. You will be evaluated on your ability to recognize the foundational concepts of data visualization, its uses, and best practices. A learner who scores high on this benchmark demonstrates that they have the basic data visualization skills to understand and grasp visualization techniques and their uses.
8m    |   8 questions
Data Visualization in Python with Seaborn Competency (Intermediate Level)
The Data Visualization in Python with Seaborn Competency (Intermediate Level) benchmark will measure your ability to recall, relate, demonstrate, and apply data visualization concepts and techniques in Python using the seaborn library. You will be evaluated on your ability to recognize and apply data visualization concepts, techniques, tools, and functions in Seaborn. A learner who scores high on this benchmark demonstrates that they have the required data visualization skills to understand, apply, and work independently on visualizations in their projects.
9m    |   9 questions

SKILL BENCHMARKS INCLUDED

Graph Analytics Literacy (Beginner Level)
The Graph Analytics Literacy benchmark will measure your ability to recall, recognize, and understand graph concepts, fundamentals of graph databases, and graph data structures. You will be evaluated on your ability to recognize the basic concepts of graph data structures and algorithms. A learner who scores high on this benchmark demonstrates that they have the required foundation of graph data structures.
12m    |   12 questions
Graph Analytics Proficiency (Advanced Level)
The Graph Analytics Proficiency benchmark will measure your ability to recall, recognize, and understand graph analytics concepts, graph databases, the Cypher Query Language for querying graph data, and graph data science for identifying hidden relationships. A learner who scores high on this benchmark demonstrates that they have the required skills in Neo4j graph analytics, graph data science, and graph data science with Spark, and can work independently on their projects.
20m    |   20 questions