Increasingly, organizations are turning their attention to “dark data”. But what exactly is dark data?
According to Gartner, it is
“…the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.”
Original source: Databerg
Why dark data is not optimized
Enterprises collect more data than ever before, but much of it is still siloed and not really used, or even directly usable, in a productive fashion. This is about to change, primarily because of the rising popularity of Python and the falling cost of storing data in the cloud. But here’s the catch: while it may be easier and cheaper than ever to store data in the cloud, there is a shortage of people with the skills needed to extract and interpret this hidden treasure.
Although many IT teams include domain experts who understand the dark data best, these individuals often do not possess the skills to exploit the full power of Python. Historically, spreadsheet programs such as Microsoft Excel were the tool of choice, but there are far more efficient ways to accomplish the same tasks – the only drawback is that people need access to the right training.
Skillsoft’s Aspire learning journeys
With this in mind, Skillsoft created the Python Novice to Pythonista Skillsoft Aspire Journey, designed to help enterprises leverage the power of Python. The journey consists of four tracks that build upon each other: Python Novice, Python Apprentice, Python Journeyman, and Pythonista.
In a previous blog post, we discussed in detail the key elements of the first track, Python Novice. Today, we want to talk about the rest of the journey. In the second track, Python Apprentice, learners move on to more advanced topics, including virtual environments, the structure of packages and modules, and object-oriented programming with classes and inheritance. The track also covers several important data structures in Python.
The third track, Python Journeyman, moves into the blocking and tackling of everyday programming in an enterprise setting. Subjects covered include constructing robust, well-tested code with unit testing in Python, working with HTTP requests, and building web apps using Flask. The track ends with a dive into advanced topics such as multi-threading, in which a developer learns to recognize and avoid some of the pitfalls and gotchas associated with concurrency.
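To make the concurrency point concrete, here is a minimal sketch of one common pitfall: several threads updating a shared value at the same time. The post does not say which pitfalls the track covers, so the `Counter` class and `run_workers` helper below are hypothetical names chosen for illustration, not course material.

```python
import threading


class Counter:
    """A shared counter whose increment is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read-modify-write sequence. Without the
        # lock, two threads could interleave here and lose an update.
        with self._lock:
            self.value += 1


def run_workers(counter, n_threads=8, n_increments=10_000):
    """Increment the counter from several threads and return the final value."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every worker before reading the result
    return counter.value
```

With the lock in place, `run_workers(Counter())` should always return `n_threads * n_increments`. Removing the lock makes that guarantee disappear, which is exactly the kind of bug that is hard to reproduce and easy to miss without tests.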
The culminating track in this journey is titled Pythonista and kicks off with a learning path on using the PyCharm IDE to develop and debug robust applications in Python.
The Wrangling Excel Data with Python path follows. It consists of three courses: Working with Excel Spreadsheets from Python, Performing Advanced Operations in Excel using Python, and Constructing Data Visualizations in Excel using Python.
1. Working with Excel Spreadsheets from Python, the first course, explores how Microsoft Excel spreadsheets can be created, opened, and modified programmatically from within Python. The openpyxl library is used to manipulate Excel’s object model from within Python.
2. Performing Advanced Operations in Excel using Python moves on to more complex operations in Excel workbooks, including conditional formatting, named ranges, merged data analysis, and pivot tables, as well as the locking of cell references using the $ symbol.
3. Constructing Data Visualizations in Excel using Python, the third and final course, teaches learners to build various types of visualizations using Python and to control the appearance of their charts at a fine-grained level, down to the tick frequency and cell location. Learners will also be able to use stock charts to represent the opening, high, low, and closing prices of stocks in a single visualization.
The Pythonista track then ends the Journey with detailed coverage of two more topics: Design Patterns in Python and Socket Programming. By this point, the learner has gained a great deal of confidence and hands-on experience, even if she began with absolutely no prior exposure to Python.
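As a rough illustration of the spreadsheet work described in the Excel path above, the sketch below creates a workbook, adds a formula that locks a cell reference with `$`, and attaches a bar chart. It assumes the third-party openpyxl package is installed. The sales figures and the function names `build_sales_workbook` and `round_trip` are invented for this example and are not taken from the courses.

```python
# Assumes the third-party openpyxl package is installed (pip install openpyxl).
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.chart import BarChart, Reference

# Hypothetical sample data, invented for illustration only.
SALES = [("North", 120), ("South", 95), ("West", 140)]


def build_sales_workbook():
    """Create a workbook with a small table, formulas, and a bar chart."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Sales"

    # Header row followed by one row per region.
    ws.append(["Region", "Units", "Share"])
    for region, units in SALES:
        ws.append([region, units])

    last_row = len(SALES) + 1   # last data row (row 4 here)
    total_row = last_row + 1    # row holding the total (row 5 here)
    ws.cell(row=total_row, column=1, value="Total")
    ws.cell(row=total_row, column=2, value=f"=SUM(B2:B{last_row})")

    # $B$5 is an absolute reference: it stays fixed on the total cell
    # even if the formula is copied to another row in Excel.
    for row in range(2, total_row):
        ws.cell(row=row, column=3, value=f"=B{row}/$B${total_row}")

    # Bar chart of units per region, anchored at cell E2.
    chart = BarChart()
    chart.title = "Units by region"
    data = Reference(ws, min_col=2, min_row=1, max_row=last_row)
    categories = Reference(ws, min_col=1, min_row=2, max_row=last_row)
    chart.add_data(data, titles_from_data=True)
    chart.set_categories(categories)
    ws.add_chart(chart, "E2")
    return wb


def round_trip(wb):
    """Save to an in-memory buffer and reopen, as if reading a file."""
    buffer = BytesIO()
    wb.save(buffer)
    buffer.seek(0)
    return load_workbook(buffer)
```

In everyday use you would call `wb.save("sales.xlsx")` with a file path of your choosing instead of an in-memory buffer. Note that openpyxl stores formulas as text and does not evaluate them, so the computed totals appear only once the file is opened in Excel.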
This confidence and hands-on experience, when combined with years of domain expertise, can help with the ultimate in data analysis – shining the light on dark data and finding those priceless insights that have been waiting to be revealed.
Kishan Iyer is a content engineer at Loonycorn, a technical video content studio. He earned his master’s degree in Computer Science at Columbia University and has worked for various companies such as Deutsche Bank, Electric Cloud and WebMD in the US. He now works at Loonycorn and is also a Skillsoft technology instructor. Athira P R is a technical content developer at Loonycorn who is keenly interested in building with Python and Java. Amritha P is a technical content developer at Loonycorn with a strong background in computer engineering, as well as programming in various languages and the use of relational and NoSQL technologies.