Data Essentials: Data Engineering intermediate
- 2 Courses | 1h 13m 44s
- 3 Books | 9h 42m
- 3 Courses | 2h 37m 12s
- 4 Courses | 3h 42m 8s
- 3 Courses | 2h 51m 39s
- 3 Books | 9h 42m
- 1 Course | 1h 13m 39s
- 1 Course | 45m 26s
- 1 Book | 6h 22m
These days, it is essential for businesses to work with large amounts of data on a daily basis. In this channel, you will explore the basics of data in data-driven organizations.
GETTING STARTED
Traditional Data Architectures: Relational Databases
1. Course Overview | 1m 44s
2. Different Types of Databases | 3m 10s
GETTING STARTED
Setting up the Data Infrastructure in an Organization
1. Course Overview | 2m 12s
2. Data Infrastructure in an Organization | 7m 19s
GETTING STARTED
Data Nuts & Bolts: Fundamentals of Data
1. Course Overview | 1m 16s
2. Data, Information, Knowledge, and Wisdom | 2m
GETTING STARTED
Data Architecture Getting Started
1. Course Overview | 1m 54s
2. Data Defined | 3m 31s
GETTING STARTED
Data Lakes
1. Course Overview | 2m 18s
2. Data Lake Evolution | 7m 59s
GETTING STARTED
Data Engineering Getting Started
1. Course Overview | 1m 23s
2. Overview of Distributed Systems | 5m 38s
COURSES INCLUDED
Traditional Data Architectures: Relational Databases
Databases are essential in working with large amounts of data. Managers, leaders, and decision-makers need to choose the right approach when working on a large data project, distinguishing among multiple database types and their use cases. A relational database is a primary traditional data architecture commonly used by most businesses. Working with relational databases has some key advantages but also poses certain limitations. In this course, learn how to critically evaluate and work with relational databases. Explore normalization and denormalization of datasets along with specific use cases of these opposite approaches. Examine the two main online information processing systems: Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP). Finally, investigate the concepts of data warehousing, data marts, and data mining. Upon completion, you'll be able to identify when and how to use a relational database.
12 videos | 34m
Assessment
Badge
Traditional Data Architectures: Data Warehousing and ETL Systems
Data warehouses are actively used for business intelligence and, because they integrate data from multiple sources, have advantages over simple databases in many instances. Because modern companies often have ETL-based data warehousing systems, decision-makers need to understand how these systems operate and how to manage them appropriately. In this course, learn the necessary concepts and processes required to work with and manage projects related to data warehousing. Study data warehousing architectures and schemas and investigate some core data warehouse elements, such as dimension and fact tables, and keys. Furthermore, examine the extract, transform, and load (ETL) approach for working with data warehouses, specifying process flow, tools, and software as well as best practices. When you're done, you'll know how to adopt data warehousing and ETL systems for your business intelligence and data management needs.
12 videos | 38m
Assessment
Badge
COURSES INCLUDED
Setting up the Data Infrastructure in an Organization
In this course, you will look into the data mesh architecture and the process of selecting data platforms that best fulfill the needs of a large, data-driven organization. Begin by delving into two approaches to managing the data in an organization: a centralized data team and a data mesh architecture, which is a more federated approach. Explore how a data mesh allows individual domain teams in an organization to manage their own data as long as it is made available to other teams and adheres to certain standards. Next, discover the various considerations for selecting data-related tools in an organization. You will get a glimpse into Apache Kafka and RabbitMQ, two widely used messaging tools, and will see use cases where each of them excels. Finally, you will look into two use cases for data stores: one for a web or mobile app and another for a team performing data analysis. Here, you will look into the use of Apache Cassandra and the Snowflake platform.
7 videos | 45m
Assessment
Badge
New Age Data Infrastructures: Factors Driving Data Infrastructures
As technology advances, new ways to store, process, and analyze data emerge. For example, large database systems, which require a lot of storage space, have been moved to the cloud and made remotely accessible to many users. These kinds of data infrastructures require business leaders to understand modern data systems and their working principles fully. Use this course to get to grips with the key differences between legacy data systems and modern infrastructures and explore crucial concepts related to modern data infrastructures. By the end of the course, you'll be able to argue why new age data infrastructures are necessary and traditional data systems are limited.
12 videos | 37m
Assessment
Badge
Data Infrastructure: Databases in FSD Development
In this course, learners discover the role played by databases in the FSD (full stack development) process. The 14-video course explores differences between relational and non-relational databases and the advantages associated with each type; how to install and configure the MySQL, PostgreSQL, and MongoDB database systems; and how these systems are used in both the test and live environments of FSD development. Learn how to recognize best practices associated with the design of database systems in the FSD development process. You will then examine how to download, install, and configure the MySQL relational database system for use in FSD development. Then move on to the installation and configuration of the PostgreSQL, MongoDB NoSQL, and SQL Server relational database systems for use in FSD development. Learners will examine components required in both a test and live environment for FSD development, and the requirements of the FSD test environment and specific challenges. Finally, you will learn about the requirements of the FSD production environment and specific challenges.
14 videos | 1h 13m
Assessment
Badge
COURSES INCLUDED
Data Nuts & Bolts: Fundamentals of Data
Dealing with large amounts of data is essential to any modern business. To become a data-driven organization, leaders and decision-makers must establish a deeply ingrained data culture. Use this course to understand the underlying principles of analyzing data and get familiar with terms related to data in order to properly deliver data-related projects. This course will help you identify the basic concepts and processes related to data analysis, modern data sources, and data pipelines. You'll also discover fundamental principles of data storage, migration, and integration, along with common methods for data visualization and reporting. Having completed the course, you'll be well versed in foundational concepts of data, related terminologies, and various data processing methods.
10 videos | 29m
Assessment
Badge
Modern Data Management: Data Management Systems
As companies transition to the digital age, it is increasingly essential for decision-makers to utilize the vast amount of data in their systems properly. Proper governance and a working knowledge of data management systems ensure a significant competitive advantage, allowing companies to have more insight into their work and utilize their resources more efficiently. Use this course to familiarize yourself with the various strategies for handling and transacting data. Examine how data management systems work, study domain-wise data handling, and outline strategies to develop data management systems. Study how to integrate data management into different domains and identify and prioritize domains in various fields of data technologies and data architectures. When you're done with this course, you'll have a solid foundational comprehension of how to establish appropriate data management solutions in an organizational setting.
12 videos | 1h 2m
Assessment
Badge
Modern Data Management: Data Governance
Data governance is important in data management, as it focuses on the availability, consistency, usability, and security of data sources. Utilizing data governance is important for creating consistent pipelines for data management solutions. Use this course as an introduction to data governance, exploring how it relates to master data management and is implemented into a business program. Then, examine how to create consistent and transparent governance models across multiple domains in data management. Investigate data stewardship, integrity, and security, studying how data governance interacts with information technology in a business enterprise context. Identify the benefits of establishing multi-domain data governance. Lastly, list various ways different data management systems interact to maintain data integrity and enhance data security. Upon completing this course, you'll know how to implement a data governance model correctly for your data management systems.
12 videos | 1h 6m
Assessment
Badge
Modern Data Management: Data Quality Management
Since low-quality data can provide poor insights and be detrimental to an organization, data quality improvement is essential in data management and governance. Use this course to learn how to improve the quality of your data. Learn how to distinguish between high-quality and low-quality data. Then, examine the entire cycle for developing high-quality data, from data acquisition through advanced data process implementation to effective distribution. Recognize the importance of managerial oversight in information processing, data compliance, and governance implementations in developing high-quality data. As you advance, learn how to create an integrated system of good data quality management processes. Upon completing this course, you'll know the best techniques and cloud-based data management solutions to ensure the data used in decision-making is always of the highest quality.
12 videos | 1h 3m
Assessment
Badge
COURSES INCLUDED
Data Architecture Getting Started
In this 12-video course, learners explore how to define data, its lifecycle, the importance of privacy, SQL and NoSQL database solutions, and key data management concepts as they relate to big data. First, look at the relationship between data, information, and analysis. Learn to recognize personally identifiable information (PII), protected health information (PHI), and common data privacy regulations. Then, study the data lifecycle's six phases. Compare and contrast SQL and NoSQL database solutions and look at using Visual Paradigm to create a relational database ERD (entity-relationship diagram). To implement a SQL solution, Microsoft SQL Server is deployed in the Amazon Web Services (AWS) cloud; a NoSQL solution is implemented by deploying DynamoDB in the AWS cloud. Explore definitions of big data and governance. Learners will examine various types of data architecture, including TOGAF (The Open Group Architecture Framework) enterprise architecture. Finally, learners study data analytics and reporting, and how organizations can derive value from the data they have. The concluding exercise looks at implementing effective data management solutions.
13 videos | 1h 2m
Assessment
Badge
Cloud Data Architecture: Cloud Architecture & Containerization
In this course, learners discover how to implement cloud architecture for large-scale data science applications, serverless computing, adequate storage, and analytical platforms using DevOps tools and cloud resources. Key concepts covered here include the impact of implementing containerization on cloud hosting environments; the benefits of container implementation, such as lower overhead, increased portability, operational consistency, greater efficiency, and better application development; and the role of cloud container services. You will study the concept of serverless computing and its benefits; the approaches to implementing DevOps in the cloud; and how to implement OpsWorks on AWS by using Puppet, which provides the ability to define which software and configuration a system requires. See demonstrations of how to classify storage from the perspective of capacity and data access technologies; the benefits of implementing machine learning, deep learning, and artificial intelligence in the cloud; and the impact of cloud technology on BI analytics. Finally, learners encounter container and cloud storage types, container and serverless computing benefits, and advantages of implementing cloud-based BI analytics.
10 videos | 44m
Assessment
Badge
Cloud Data Architecture: Data Management & Adoption Frameworks
Explore how to implement containers and data management on popular cloud platforms like Amazon Web Services (AWS) and Google Cloud Platform (GCP) for data science. Planning big data solutions, disaster recovery, and backup and restore in the cloud are also covered in this course. Key concepts covered here include cloud migration models from the perspective of architectural preferences; prominent big data solutions that can be implemented in the cloud; the impact of implementing Kubernetes and Docker in the cloud; and how to implement Kubernetes on AWS. Next, learn how to implement data management on AWS, GCP, and DBaaS; how to implement big data solutions using AWS; how to build backup and restore mechanisms in the cloud; and how to implement disaster recovery planning for cloud applications. Learners will see prominent cloud adoption frameworks and their associated capabilities, and hear about the benefits of blockchain technologies or solutions in the cloud and how to implement them. Finally, learn how to implement Kubernetes on AWS, build backup and restore mechanisms on GCP, and implement big data solutions in the cloud.
13 videos | 1h 4m
Assessment
Badge
COURSES INCLUDED
Data Lakes
Data lakes are a useful way of storing all your structured and unstructured data in a single repository. They're widely used in the data industry to quickly retrieve data in raw formats and expose it to data pipelines. Anyone working with data technologies would benefit from appreciating the power and intricacies of data lakes. Use this course to explore the different aspects of data lakes, including their evolution, architecture, and maturity stages. Examine the advantages of governed data lakes. Learn about different data lake platforms. Identify the risks and challenges associated with data lakes and distinguish between a data warehouse and a data lake. Upon completion of this course, you'll fully comprehend why and how data lakes are used.
12 videos | 1h 13m
Assessment
Badge
COURSES INCLUDED
Data Engineering Getting Started
Data engineering is the area of data science that focuses on practical applications of data collection and analysis. This 12-video course helps learners explore distributed systems, batch versus in-memory processing, NoSQL uses, and the various tools available for data management/big data and the ETL (extract, transform, and load) process. Begin with an overview of distributed systems from a data perspective. Then look at differences between batch and in-memory processing. Learn about NoSQL stores and their use, and tools available for data management. Explore ETL: what it is, the process, and the different tools available. Learn to use Talend Open Studio to showcase the ETL concept. Next, examine data modeling and creating a data model in Talend Open Studio. Explore the hierarchy of needs when working with AI and machine learning. In another tutorial, learn how to create a data partition. Then move on to data engineering and best practices, with a look at approaches to building and using data reporting tools. Conclude with an exercise designed to create a data model.
13 videos | 45m
Assessment
Badge
EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.
BOOKS INCLUDED
Book
Modern Big Data Architectures: A Multi-Agent Systems Perspective
With practical examples and detailed solutions suitable for a wide variety of applications, this unique, up-to-date volume provides joint analysis of big data and multi-agent systems, with emphasis on distributed, intelligent processing of very large data sets.
3h 24m
By Dominik Ryżko
Book
Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture
Covering real-world, concrete industry use cases, this book is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a big data project and which tools to integrate into that pattern.
1h 51m
By Bahaaldine Azarmi
Book
Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault
Drawing upon years of practical experience and using numerous examples and an easy to understand framework, this timely guide defines the importance of data architecture and how it can be used effectively to harness big data within existing systems.
4h 27m
By Daniel Linstedt, W.H. Inmon
BOOKS INCLUDED
Book
Enterprise Big Data Engineering, Analytics, and Management
Featuring essential big data concepts including data mining, artificial intelligence, and information extraction, this book presents novel methodologies and practical approaches to engineering, managing, and analyzing large-scale data sets with a focus on enterprise applications and implementation.
6h 22m
By Martin Atzmueller, Samia Oussena, Thomas Roth-Berghofer (eds)
SKILL BENCHMARKS INCLUDED
Data Literacy (Beginner Level)
The Data Literacy benchmark will measure your ability to speak the language of data. You will be evaluated on your ability to recognize key topics such as data science concepts, analytics, database types, predictive analytics, data visualization, data stewardship, data compliance, and data governance. A learner who scores high on this benchmark demonstrates that they have the skills to interpret data and incorporate it into their daily life.
37m 30s | 25 questions
Data for Leaders Awareness
The Data for Leaders Awareness benchmark will measure your ability to recall and relate to basic data concepts. You will be evaluated on your ability to recognize the foundational concepts of data such as data formats, sources, various data operations, terminologies, and processing methods. A learner who scores high on this benchmark demonstrates that they have a basic level of awareness of data concepts.
20m | 20 questions
Data for Leaders Competency (Intermediate Level)
The Data for Leaders Competency benchmark will measure whether a learner has had exposure to data concepts and terminology. You will be evaluated on your ability to recognize key concepts of data such as big data, data governance and management, and emerging new age architecture. A learner who scores high on this benchmark demonstrates that they have the basic data skills to understand and grasp various data-related technologies, tools, and frameworks.
20m | 20 questions
Data for Leaders Proficiency (Advanced Level)
The Data for Leaders Proficiency benchmark will measure whether a learner has had significant exposure and experience with data technologies. You will be evaluated on your ability to recognize the concepts of data such as big data analytics, data architecture, data processing, data governance and management, and emerging new age architecture. A learner who scores high on this benchmark demonstrates independent knowledge across a variety of data technologies and platforms.
25m | 25 questions
Data Management and Governance Literacy (Beginner Level)
The Data Management and Governance Literacy (Beginner Level) benchmark will measure your ability to recall and relate the foundational concepts of data management, domains, data sources, and governance. A learner who scores high on this benchmark demonstrates that they have a basic understanding of data management and integration concepts.
11m | 11 questions
Data Management and Governance Competency (Intermediate Level)
The Data Management and Governance Competency (Intermediate Level) benchmark will measure your ability to recall and relate the concepts of data management, quality, compliance, and governance. A learner who scores high on this benchmark demonstrates that they have a good understanding of data management, compliance, quality, and governance elements.
15m | 15 questions