Have you found yourself relying on Siri for information, or asking Alexa more questions than you used to? You are not alone. In our day-to-day lives we’ve become very accustomed to being provided with all kinds of information in milliseconds. We take it for granted, and possibly don’t ask ourselves often enough where the information being thrown at us actually comes from.
In our industry we are already seeing evidence of businesses relying on data science and emerging technology. According to a 2017 report by the Financial Stability Board, pure-play AI and machine learning firms managed assets of over $10 billion as of 2017. In 2019 there are many exciting opportunities for asset and wealth management firms to exploit financial data and create a competitive edge. From a skills perspective, many of us will also be looking to find new opportunities and future-proof our careers with a bit of data savviness.
What does it all mean?
If you don’t know your alt data from your geospatial data, or your IoT from your GPU-accelerated statistics, you might be wondering: what exactly is data science?
The classic definition of natural science is “the intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment”. For me, the important phrase here is “physical and natural world”: even when those facts are not visible to my own eyes, I would argue that we study them to create new knowledge.
Despite the association with science, data science is defined slightly differently: it is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured.
So methods, processes, algorithms and systems replace observation and experiment. Does that mean that, instead of taking natural facts as the basis of science, facts become the result of data science? The challenge with science is that it is easy to find “interesting” patterns and label them as knowledge or insight where in fact none exist! Based on its observations, humanity used to think that the world was flat; by “experimenting”, it discovered that the world is round. Today we have satellite images to prove it.
This raises the question of how we should judge whether a “pattern” is interesting. And when should we worry about falsely labelling patterns as “interesting”? The stakes vary widely: compare the mistranslation of a sentence with an incorrect cancer diagnosis.
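To make the risk of falsely “interesting” patterns concrete, here is a minimal Python sketch. It is an illustration only, not a recommended method: it compares one series of pure random noise against many other series of pure random noise and reports the strongest correlation it finds. The function names, the number of observations (30), the number of candidate series (1,000) and the random seed are all assumptions chosen for the example.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_spurious_correlation(n_obs=30, n_candidates=1000, seed=0):
    """Return the strongest correlation between one noise series and many others.

    Every series is independent random noise, so any correlation found
    here is a pattern produced by chance, not by a real relationship.
    """
    rng = random.Random(seed)  # hypothetical seed, fixed for repeatability
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_spurious_correlation()
    print(f"Strongest correlation found in pure noise: {r:.2f}")
```

With settings like these, the search typically turns up a correlation that would look meaningful if presented on its own, even though every series is noise. The more candidate patterns we test, the more likely it is that one of them looks “interesting” purely by chance.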
With the enormous volume of data that has become available over the last 50 years or so, how much do we really know about what the patterns tell us? Is the world of data perhaps still in its flat phase, and not yet round?
Historically, a scientist would leverage existing natural sources, and create new ones as needed, in order to extract meaningful information and actionable insights.
I would argue that data scientists have produced a great deal of meaningful information that has served many a good cause, for example in health and elderly care. Equally, however, data science is being used to make people believe that the earth is flat and that the horizon they are shown is all there is to know: you like a friend’s post and, before you know it, an ad for some (assumed) related product is thrown at you.
In my view, data science in general still has some way to go before its output can be completely trusted. We need to keep setting sail and independently try to discover what is over the horizon. Human intervention will almost always be necessary, especially when it comes to identifying anomalies and nuances. And of course, there will always be the occasional glitch: the Christmas Day 2018 Alexa outage is a prime example!