Data Science with Python: Creating Business Value
We are in the midst of a seismic shift to the data economy.
Winners and losers will be defined by how well they utilize data for competitive advantage. Failure to recognize change may be an existential threat. In the past 15 years, 52% of the Fortune 500 companies have disappeared. Understanding and adapting to current trends and directions, especially concerning data, may be a matter of survival. Agile organizations that think freely and back up decisions with evidence respect these trends.
Ever wonder how Google, Netflix, iRobot, Yelp, Reddit, NASA, and thousands of SMBs handle their most significant data challenges? Highly successful organizations rely on the general-purpose programming language Python to wrangle data and gather insights that may be locked away in their unstructured data.
The story of data science is best told through the experiences of two very different organizations:
- Abt Associates
  - Located in Bedford, MA, Abt Associates is a global leader in research, evaluation, and program implementation that has driven innovation and measurable impact for more than 50 years. Its focus is on using evidence and cutting-edge methods to deliver results for its clients.
- U.S. Department of Health and Human Services (HHS)
  - We also examine HHS's experience in building analytics skills across a massive federal government agency. HHS needed to increase data literacy across the organization so that employees at all levels of decision-making could harness the power of data to identify new insights, automate workflows, and increase the efficiency of the organization. HHS wanted its staff to have more autonomy and independence to use data science to advance the institution.
First, let’s examine the data challenges facing Abt Associates, as described in its case study from Data Society, a Skillsoft content partner:
As is the case for most firms, the challenge facing Abt related to data science was not abstract or hypothetical. Abt was awash in text data and survey statistics, and managers across the organization reported needing “more staff who can carry out machine learning tasks,” as well as the ability to “provide clients with a better understanding of their survey data.” Too many Abt employees processed data manually with Excel, which limited processing speed, precluded exploratory analysis, and restricted their work to “mostly tackling smaller datasets.”
Abt missed opportunities within an exploding subset of private and public-sector projects: data-heavy analysis and data-driven strategy. It could not engage as many of its staff as it wanted on the types of interesting projects that made them feel fulfilled in their jobs, which dampened employee retention and hamstrung talent development. Both of these factors had the potential to suppress Abt’s profits and growth in the years to come.
Abt faced a difficult problem shared by many firms today. Given the upside-down labor market for data talent, hiring data scientists on the job market is expensive, and new hires are typically unequipped with the domain knowledge and company experience that would make them efficient employees.
For another perspective on data challenges, let’s examine the pain points at HHS from the Data Society case study:
HHS identified a few key issues that challenged the organization. They did not have cohesive communities around data sharing and problem-solving. Employees had an overwhelming amount of data to sort through and analyze. And employees were spending a lot of valuable time sorting through text data and proposals.
In summary, both organizations shared some common challenges:
- Lack of the data science skills needed to speed up slow, time-consuming data tasks
- Aversion to going to the open market for data science skills because of the cost and lack of domain expertise
- Lost opportunities from an inability to make data-driven decisions supported by evidence
Will Yang, director at HHS, summed up the situation well for both organizations:
“We have plenty of people who are subject-matter experts and eminent in their fields of study, but it is not sustainable for us to rely on outsourced data science knowledge and skillsets. More importantly, it is hard for us to see the concrete opportunities (and the realities of addressing them) if we do not have a basic handle on data science, data architecture, and the state-of-the-art.”
Abt worked to identify key areas to focus on for data science education. It selected the staff members best suited to upskill in data science and had them build capstone projects that applied data analytics to current challenges at Abt.
Learners became proficient in Python and applied those skills to build unsupervised and supervised machine learning models on text data.
By the end of the workshop, learners were able to:
- Mine data to find latent patterns and groups in different types of data
- Build recommender systems
- Build powerful predictive models
- Develop a framework for analyzing data to improve processes and accuracy
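To make "finding latent groups" in text data a little more concrete, here is a minimal sketch using only the Python standard library. It is not Abt's or Data Society's method: the sample comments, the stopword list, the `group_responses` helper, and the 0.5 similarity threshold are all assumptions made up for illustration. In practice, a team would more likely reach for a dedicated machine learning library.

```python
import math
import re
from collections import Counter

# Hypothetical survey comments, invented for illustration only.
RESPONSES = [
    "The clinic wait time was too long",
    "Long wait time at the clinic",
    "Staff were friendly and helpful",
    "Very helpful and friendly staff",
]

# A tiny, hand-picked stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "was", "were", "at", "and", "too", "very", "a"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_responses(texts, threshold=0.5):
    """Greedy grouping: add each text to the first group whose first member is similar enough."""
    vectors = [Counter(tokenize(t)) for t in texts]
    groups = []  # each group is a list of indices into texts
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([i])
    return groups


groups = group_responses(RESPONSES)
for group in groups:
    print([RESPONSES[i] for i in group])
```

With these sample comments, the sketch separates the remarks about wait times from the remarks about staff without being told those themes in advance, which is the basic idea behind unsupervised grouping.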
The Abt workshop, delivered together with Data Society, created a groundswell of interest among employees who wanted to build their skills and do so for the benefit of their current teams at Abt. Not only did the learners build out their skills, but they also built out their organizational networks and returned to their teams with insights about the challenges they had worked on during the program.
Abt staff members – including those both with and without technical backgrounds – were eager to make the most of the data science learning opportunity and apply the skills to drive bottom-line growth for the firm.
For HHS, Data Society implemented a customized data science bootcamp that included in-person, live-streaming, and on-demand training to help HHS learners learn at their own pace and in the format that worked best for them. The goal was to build a community of practice around learning data science.
The shared understanding of the principles of data-driven decision-making and data science algorithms allowed staff from different parts of HHS (e.g., the CDC and the NIH) to communicate effectively and build cross-departmental tools and capabilities.
The Python programming skills that staff learned increased efficiency and facilitated the development of new tools and solutions.
Learners reported that the program advanced their skills and helped them identify new ways of analyzing data and automate laborious processes. The program resulted in millions of dollars in annual cost savings to HHS.
Abt Associates and HHS committed to building internal data science skills, beginning with intensive, albeit short-term, education in Python programming. And it paid off with clients, partners, and employees. Data science skills are the keys to unlocking competitive advantage in the data economy. Some of the ways businesses and organizations build value with Python include:
- Improving operational efficiency and employee productivity
  - Python is open source, with no license fees
  - Employees are more productive when repetitive tasks are minimized
  - Python makes text data mining efficient and reusable
  - Building data science skills across the business or organization minimizes the risk of a brain drain if a data scientist departs
- Unlocking data insights from different data sources for better decision making
  - Python supports interoperability, saving on hardware and networking costs
  - Employees across the organization in nearly every job role can leverage Python for data analysis
  - Python supports AI, machine learning, and big data to turn data and information into insights that support better decisions
- Investing in data skills because it is good for business
  - Python enables businesses and organizations to control their data
  - Better data control leads to more impactful customer interactions, deeper commitment, and greater customer satisfaction
  - Internal data science skills are less costly and more advantageous to teams because they bring subject matter expertise and domain knowledge that external contract data science resources lack
- Investing in data science because it is good for employees
  - Data science skills are a pivotal capability for team leadership roles
  - Data science skills are in short supply and high demand, expanding career paths
  - Data science skills enable employees to become more innovative and more valuable contributors
  - Python skills are portable across teams, job roles, and departments
Build on your current knowledge to progress to more advanced topics. Where are you now and where do you want to go?
For example, skilled data analysts can start working on their data wrangling skills by learning different methods of gathering, filtering, modifying, and managing quality data inputs and outputs. Data wranglers are typically focused on normalizing, cleaning, structuring, automating, and transforming data in their organizations.
You can make an immediate impact in your organization by learning data wrangling with Python: automate data cleaning and processing, and use the Python libraries NumPy and pandas for data mining.
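As a rough illustration of what automated cleaning can look like, the sketch below uses pandas and NumPy on a small, made-up survey export. The column names, the sample values, and the `clean_survey` helper are hypothetical and chosen only to show the normalizing, cleaning, and structuring steps described above. Real data will need its own rules.

```python
import numpy as np
import pandas as pd

# Hypothetical raw survey export, invented for illustration only.
raw = pd.DataFrame({
    "Respondent ID": [101, 102, 102, 103, 104],
    "Region ": [" north", "South", "South", "NORTH ", None],
    "Score": ["4", "5", "5", "n/a", "3"],
})


def clean_survey(df):
    """Normalize column names, tidy text, coerce numbers, and drop duplicates."""
    out = df.copy()
    # Structure: snake_case column names without stray whitespace.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Normalize: trim whitespace and title-case the region labels.
    out["region"] = out["region"].str.strip().str.title()
    # Clean: convert scores to numbers; non-numeric entries such as "n/a" become missing.
    out["score"] = pd.to_numeric(out["score"], errors="coerce")
    # Remove rows that are exact duplicates after cleaning.
    return out.drop_duplicates().reset_index(drop=True)


clean = clean_survey(raw)

# Average score per region; pandas skips missing scores and missing regions.
summary = clean.groupby("region")["score"].mean()

# The same column as a NumPy array, averaged while ignoring missing values.
scores = clean["score"].to_numpy(dtype=float, na_value=np.nan)
overall_mean = float(np.nanmean(scores))

print(clean)
print(summary)
print(overall_mean)
```

Wrapping the steps in one function means the same cleaning can be rerun on next month's export, which is where the time savings over manual spreadsheet edits tend to come from.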