Cody's Data Cleaning Techniques Using SAS, Third Edition

  • 3h 2m
  • Ron Cody
  • SAS Institute
  • 2017

Find errors and clean up data easily using SAS!

Thoroughly updated, Cody's Data Cleaning Techniques Using SAS, Third Edition, addresses a task that nearly every data analyst faces: making sure that data errors are located and corrected. Written in Ron Cody's signature informal, tutorial style, this book develops and demonstrates data cleaning programs and macros that you can use as written or modify to make your job of data cleaning easier, faster, and more efficient.

Building on the author’s experience from more than 10 years of teaching a data cleaning course, and on advances in SAS, this third edition includes four new chapters, covering topics such as the use of Perl regular expressions for checking the format of character values (such as zip codes or email addresses) and how to standardize company names and addresses.

With this book, you will learn how to:

  • find and correct errors in character and numeric values
  • develop programming techniques related to dates and missing values
  • deal with highly skewed data
  • develop techniques for correcting your data errors
  • use integrity constraints and audit trails to prevent errors from being added to a clean data set

About the Author

Ron Cody, EdD, a retired professor from the Rutgers Robert Wood Johnson Medical School, now works as a private consultant and a national instructor for SAS Institute Inc. A SAS user since 1977, Ron's extensive knowledge and innovative style have made him a popular presenter at local, regional, and national SAS conferences. He has authored or co-authored numerous books, as well as countless articles in medical and scientific journals.

In this Book

  • Working with Character Data
  • Using Perl Regular Expressions to Detect Data Errors
  • Standardizing Data
  • Data Cleaning Techniques for Numeric Data
  • Automatic Outlier Detection for Numeric Data
  • More Advanced Techniques for Finding Errors in Numeric Data
  • Describing Issues Related to Missing and Special Values (Such as 999)
  • Working with SAS Dates
  • Looking for Duplicates and Checking Data with Multiple Observations per Subject
  • Working with Multiple Files
  • Using PROC COMPARE to Perform Data Verification
  • Correcting Errors
  • Creating Integrity Constraints and Audit Trails
  • A Summary of Useful Data Cleaning Macros