Data Filtering

Data Science    |    Beginner
  • 11 Videos | 1h 1m 52s
  • Includes Assessment
  • Earns a Badge
Once data is gathered for data science, it is often in an unstructured or raw format and must be filtered for content and validity. Explore examples of practical tools and techniques for data filtering.

WHAT YOU WILL LEARN

  • identify common filtering techniques and tools
  • extract date elements from common date formats
  • parse content types in HTTP headers
  • use csvcut to filter CSV data
  • use sed to replace values in a text data stream
  • drop duplicate records from data
  • extract headers from a JPEG image
  • use pdfgrep to extract data from searchable PDF files
  • detect invalid or impossible data combinations
  • parse robots.txt from a web site to decide what should and shouldn't be crawled or indexed
  • drop records from a CSV file based on date range
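
As a rough illustration of two of the objectives above (dropping duplicate records and dropping records outside a date range), a minimal Python sketch using only the standard library might look like the following. The column names `id`, `date`, and `value`, the sample rows, and the helper function names are assumptions made for this example; they are not taken from the course, which may use different tools for these tasks.

```python
import csv
import io
from datetime import date

# Hypothetical sample data; column names and values are illustrative only.
SAMPLE_CSV = """id,date,value
1,2021-01-15,10
2,2021-03-02,20
2,2021-03-02,20
3,2019-12-31,30
"""


def drop_duplicates(rows):
    """Yield each distinct row once, keeping the first occurrence."""
    seen = set()
    for row in rows:
        # A row is a dict of strings, so its sorted items form a hashable key.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            yield row


def within_range(rows, start, end, column="date"):
    """Yield rows whose ISO-formatted date lies in [start, end], inclusive."""
    for row in rows:
        if start <= date.fromisoformat(row[column]) <= end:
            yield row


def filter_csv(text, start, end):
    """Parse CSV text, drop exact duplicates, then drop out-of-range rows."""
    reader = csv.DictReader(io.StringIO(text))
    return list(within_range(drop_duplicates(reader), start, end))


if __name__ == "__main__":
    for row in filter_csv(SAMPLE_CSV, date(2021, 1, 1), date(2021, 12, 31)):
        print(row)
```

Both helpers are generators, so they can be chained over a large file without loading every row into memory at once; only the set of already-seen rows grows with the input.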

IN THIS COURSE

  1. Data Filtering Techniques and Tools (3m 24s)
  2. Processing Date Formats (6m 7s)
  3. Filtering HTTP Headers (5m 11s)
  4. Filtering CSV Data (4m 52s)
  5. Replacing Values with sed (6m 16s)
  6. Dropping Duplicate Data (4m 44s)
  7. Working with JPEG Headers (6m 49s)
  8. Filtering PDF Files (4m 55s)
  9. Filtering for Invalid Data (5m 51s)
  10. Exercise: Cull Old Data (3m 21s)
  11. Parsing robots.txt (5m 22s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
