Human-Centered Data Science: An Introduction

  • 5h 1m
  • Cecilia Aragon, Gina Neff, Marina Kogan, Michael Muller, Shion Guha
  • The MIT Press
  • 2022

Best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of large datasets.

Human-centered data science is a new interdisciplinary field that draws from human-computer interaction, social science, statistics, and computational techniques. This book, written by founders of the field, introduces best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of very large datasets. It offers a brief and accessible overview of many common statistical and algorithmic data science techniques, explains human-centered approaches to data science problems, and presents practical guidelines and real-world case studies to help readers apply these methods.

The authors explain how data scientists' choices are involved at every stage of the data science workflow, and show how a human-centered approach can enhance each stage by making the process more transparent, asking questions, and considering the social context of the data. They describe how tools from social science might be incorporated into data science practices, discuss different types of collaboration, and consider data storytelling through visualization. The book shows that data science practitioners can build rigorous and ethical algorithms and design projects that use cutting-edge computational tools and address social concerns.

About the Authors

Cecilia Aragon is Professor in the Department of Human Centered Design and Engineering at the University of Washington.

Shion Guha is Assistant Professor in the Faculty of Information at the University of Toronto.

Marina Kogan is Assistant Professor in the School of Computing at the University of Utah.

Michael Muller is a Research Staff Member at IBM Research.

Gina Neff is Director of the Minderoo Centre for Technology and Democracy at the University of Cambridge and Professor of Technology and Society at the Oxford Internet Institute and the Department of Sociology at the University of Oxford. She is the author of Venture Labor: Work and the Burden of Risk in Innovative Industries and coauthor of Self-Tracking and Human-Centered Data Science (both published by the MIT Press).

In this Book

  • Data Science to Human-Centered Data Science
  • The Data Science Cycle
  • Interrogating Data Science
  • Techniques and Tools for Data Science Models
  • Human-Centered Approaches to Data Science Problems
  • Human-Centered Data Science Methods
  • Collaborations across and beyond Data Science
  • Storytelling with Data
  • The Future of Human-Centered Data Science
  • Glossary
  • References