Data Science Track 4: Data Scientist

For this track of the data science journey, the focus will be on the Data Scientist role. Here we will explore areas such as visualization, APIs, and machine learning (ML) and deep learning (DL) algorithms.

Technologies: Data Research, Data Science, Data Visualization, Machine Learning, Node.js 10.15.1, Python 3, Recommendation System, RStudio 1.1.4, Tableau Desktop
In the overview, Chris will explain what will be covered in the Data Science track of the Data Science journey, as well as show you how to use code assets, the lab, and browse the different assets in the track.
Explore how to select the most effective visuals for storytelling, eliminate clutter, and apply best practices for story design. We will also learn to work with Tableau and Power BI to facilitate storytelling with data.
Explore the concept of storytelling with data, the processes involved in storytelling and interpreting data contexts. We will also explore the prominent types of analysis, visualizations, and graphic tools that we can use for storytelling.
Discover how to build advanced charts using Python and Jupyter Notebook. Explore R and ggplot2 visualization capabilities and how to build charts and graphs with them.
Explore approaches to building and implementing visualizations, as well as plotting and graphing using Python libraries like Matplotlib, ggplot, Bokeh, and Pygal.
Discover how to use machine learning methods and visualization tools to manage anomalies and improve data for better insights and accuracy.
Examine statistical and machine learning implementation methods and how to manage anomalies and improve data for better insights and accuracy.
The imbalanced-learn library, which integrates with Pandas ML, offers several techniques to address imbalance in datasets used for classification. Explore oversampling, undersampling, and a combination of these techniques.
Classification, regression, and clustering are some of the most commonly used machine learning techniques and there are various algorithms available for these tasks. Explore their application in Pandas ML.
Examine the fundamentals of machine learning and how Pandas ML can be used to build ML models. How Support Vector Machines perform classification of data is also covered.
Explore the fundamentals of regression and clustering and discover how to use a confusion matrix to evaluate classification models.
Discover how to apply statistical concepts such as the probability density function (PDF), cumulative distribution function (CDF), binomial distribution, and interval estimation for data research. How to implement visualizations to graphically represent the outcomes of data research is also covered.
For an organization to be data science aware, it must evolve and become data-driven. In this course, you will examine the meaning of a data-driven organization and explore analytic maturity, data quality, missing data, duplicate data, truncated data, and data provenance.
To master data science, you must learn the techniques around data research. In this course you will discover how to apply essential data research techniques, including JMP measurement, and how to valuate data using descriptive and inferential methods.
To master data science, you must learn the techniques around data research. In this course you will discover how to use data exploration techniques to derive different data dimensions and derive value from the data. How to practically implement data exploration using R, Python, linear algebra, and plots is also covered.
To master data science, it is important to take raw data and turn it into insights. In this course, you will learn to apply and implement various essential data correction techniques, transformation rules, deductive correction techniques, and predictive modelling using critical data analytical approaches.
To master data science, it is important to take raw data and turn it into insights. In this course, you will explore the concept of statistical analysis and implement data ingestion using various technologies including NiFi, Sqoop, and Wavefront.
Data science skills aren't valuable unless you have data to work with. Automating your data retrieval through APIs is a process that any data scientist must understand. In this course, you will explore how to create RESTful OAuth APIs using Node.js.
The four Vs of big data and data science are a popular paradigm used to extract meaning and value from massive datasets. In this course, you will discover the four Vs (volume, variety, velocity, and veracity), their purpose and uses, and how to extract value using them.
On the career path to Data Science, a fundamental understanding of statistics, specifically inferential statistics, is required. Explore how different t-tests can be performed using the SciPy library to test hypotheses. How to calculate the skewness and kurtosis of data using SciPy and compute regressions using scikit-learn is also covered.
As organizations become more data science aware and learn how to collect more data, integrating that data into recommendation engines becomes an essential skill. In this course, you will explore how recommendation engines can be created and used to provide recommendations for products and content.
On the career path to Data Science, a fundamental understanding of statistics, along with its application and visualization, is required. Discover how to use the NumPy, Pandas, and SciPy libraries to perform various statistical summary operations on real datasets and how to visualize your datasets in the context of these summaries using Matplotlib.
To become a data science expert, you must master the art of data visualization. In this course, you will explore how to create and use real-time dashboards with Tableau.
Explore how to use R to create plots and charts of data.
Seaborn is a data visualization library used for data science that provides a high-level interface for drawing graphs. These graphs are able to convey a lot of information, while also being visually appealing. In this course you will explore how to analyze continuous and categorical variables in a dataset using various plotting options in Seaborn. These include box and violin…
Seaborn is a data visualization library used for data science that provides a high-level interface for drawing graphs. These graphs are able to convey a lot of information, while also being visually appealing. In this course you will explore Seaborn basic plots and aesthetics.