Data Engineer
- 19 Courses | 9h 15m
- 56 Labs | 56h
Welcome to the Data Engineer Career Path
Discover what you will learn on your journey to becoming a Data Engineer!
- 1 Lab | 1h
Python Fundamentals for Data Engineers
Learn the fundamentals of Python, and build your data engineering foundation.
- 10 Labs | 10h
SQL Fundamentals for Data Engineers
Learn how to create, manage, and protect relational databases with SQL.
- 1 Course | 20m
- 7 Labs | 7h
Python Pandas for Data Engineers
Pandas provides tools for working with tabular data, i.e. data that is organized into tables that have rows and columns. Pandas offers much of the same functionality as SQL or Excel, with the added power of Python. After learning Pandas, you’ll be able to ingest, clean, and aggregate large quantities of data, and then use that data with other Python modules like SciPy (for statistical analysis) or Matplotlib (for visualization).
This course will cover how to create Pandas DataFrames, calculate aggregates, and merge multiple tables.
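As a rough illustration of those three operations, the sketch below builds two small DataFrames, computes a per-customer aggregate, and merges the result with a second table. The table names, column names, and values are hypothetical and invented for this example; they are not taken from the course material. It assumes pandas is installed.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [25.0, 15.0, 40.0, 10.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "name": ["Ada", "Grace", "Linus"],
})

# Aggregate: total amount spent per customer.
totals = orders.groupby("customer_id", as_index=False)["amount"].sum()

# Merge: attach customer names to the aggregated totals.
report = totals.merge(customers, on="customer_id", how="left")
print(report)
```

A left merge keeps every aggregated row even if a customer were missing from the second table, which is often a safer default when checking data for gaps.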
Advanced SQL for Data Engineers
Keep building your SQL skills with advanced techniques and hands-on practice.
- 5 Labs | 5h
Data Wrangling, Cleaning, and Tidying
Clean, well-structured data is essential to data science, but cleaning data requires both a keen eye and technical skills. Develop both here!
- 4 Courses | 1h 5m
- 1 Lab | 1h
Getting Started Off-Platform for Data Engineers
This course will introduce you to all of the tools you will need to get started off-platform as well as some helpful tricks for each technology. These are the most common technologies used by Data Professionals across sectors.
This course will teach you how to do data science projects locally on your own computer. In this course, you will learn about:
* Using the command line
* Installing and using Jupyter Notebooks
* Setting up and using PostgreSQL
- 2 Courses | 1h
Data Management Portfolio Project
Put your data management skills to work in this portfolio project.
- 1 Lab | 1h
Introduction to Big Data with PySpark
This course introduces the underlying concepts behind big data through a practical, hands-on approach using PySpark. Big data is everywhere, and it touches data science, data engineering, and machine learning. It is becoming central to marketing, strategy, and research. This course covers the applications and implications of big data for finance, social media, health, and medicine. PySpark makes it easy to start analyzing big data, making the potential of big data accessible to anyone who knows Python.
In this course, you will learn how to handle big data with PySpark. In addition to learning how to manage the data, you will also be exposed to the conceptual underpinnings that make working with big data possible.
- 1 Course | 40m
- 2 Labs | 2h
Intermediate Python for Data Engineers
Dive deeper into Python 3, an essential skill for Data Engineers!
- 2 Courses | 30m
- 7 Labs | 7h
Learn the Command Line
We use our mouse and fingers to click images of icons and access files, programs, and folders on our devices. However, this is just one way for us to communicate with computers.
The command line is a quick, powerful, text-based interface developers use to more effectively and efficiently communicate with computers to accomplish a wider set of tasks. Learning how to use it will allow you to discover all that your computer is capable of!
By the end of the course, you will be able to navigate, access, and modify files and folders on your computer—all without a mouse!
- 5 Labs | 5h
Advanced Python for Data Engineers
Are you a lover of Python looking to advance your skills in the language? This course may be right up your alley! In this course, we will dive into some advanced Python skills that will allow you to take your programming to the next level. We’ll learn new paradigms that will give you the flexibility to create clean, effective code and make you a truly advanced Python programmer.
By taking this course, you will expand your core Python skillset. Here is what you'll be learning:
* How to use logs in Python to help debug and track your software.
* How to use functional programming, a coding paradigm that software engineers sometimes choose instead of object-oriented programming, to create clean, efficient programs.
* How to use concurrent programming to run code more efficiently using the threading, multiprocessing, and asyncio modules.
* How to do database operations in Python using sqlite3.
* How to deploy a simple Python script using Flask.
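To give a flavor of three of these topics together, here is a minimal sketch that combines the standard-library logging and sqlite3 modules with a functional-style map/filter/reduce pipeline. The logger name, table name, helper functions, and sample readings are all hypothetical and invented for this example; the course itself may approach these topics differently.

```python
import logging
import sqlite3
from functools import reduce

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline_demo")  # hypothetical logger name


def load_readings(conn, readings):
    """Create a hypothetical 'readings' table and insert the sample rows."""
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", readings)
    conn.commit()
    logger.info("Inserted %d rows", len(readings))


def total_valid(conn):
    """Sum the non-negative values using map, filter, and reduce."""
    rows = conn.execute("SELECT value FROM readings").fetchall()
    values = map(lambda row: row[0], rows)
    valid = filter(lambda v: v >= 0, values)
    return reduce(lambda acc, v: acc + v, valid, 0.0)


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
load_readings(conn, [("a", 1.5), ("b", -2.0), ("c", 3.0)])
total = total_valid(conn)
logger.info("Total of valid readings: %.1f", total)
conn.close()
```

The log messages record what each step did without interrupting the program, which is the kind of tracking the first bullet describes.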
- 2 Courses | 1h 5m
- 4 Labs | 4h
Learn Git: Introduction to Version Control
Version control (what Git gets you) is essential to working on teams and managing the status of code, edits, and workflows. If you've ever named a file "-final", "final-final", etc., you need version control.
By the end of this unit, you will know everything you need to manage files and contribute to collectively built code.
- 1 Course | 1h 25m
- 4 Labs | 4h
Learn Git II: Git for Deployment
Take your Git and GitHub skills to the next level by utilizing one of the most powerful tools: branching. By creating branches, code can evolve independently and later merge with other pieces of code. In addition, getting familiar with Markdown will help you both communicate within the GitHub ecosystem and become more flexible with your writing.
You'll learn how to read and write Markdown, an essential text formatting tool; how to leverage branching to work concurrently on code with your team; and some of the essential practices that facilitate teamwork.
- 3 Courses | 2h
- 4 Labs | 4h
Learn MongoDB
Learn more about NoSQL databases, MongoDB, its basic operations, and some of its more advanced features.
- 3 Courses | 1h 10m
- 3 Labs | 3h
EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE TRACKS
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.