Data science is one of the hottest roles in technology today. If you have experience and a degree in data science or a related field, you can write your own ticket with regard to whom you work for, and you can command a significant salary. However, will the data scientist be the star of the AI show for long, or is the spotlight on data science going to fade?
In an episode of the AI Today podcast, Anthony Scriffignano, Senior Vice President and Chief Data Scientist at Dun & Bradstreet, shares his experiences, opinions, and insight into the current state of data science as a profession, as well as the potential of artificial intelligence to change the finance industry.
The current role of Data Science
At Dun & Bradstreet, Scriffignano is responsible for innovation and the development of technologies, and works with ‘the world’s largest commercial database of its kind’. He explains that this unprecedented database collects data millions of times a day from every country in the world, with the sole exceptions of North Korea and Cuba.
Incorporating every language and writing system, the database is composed of seven different integrated databases rather than one single database. This composite data system is used to develop a global view of total risk and opportunity while keeping track of company data. The database can thus be used to perform large-scale data analysis, facilitating the detection of supply chain anomalies and changes in customer buying behavior. It comes as no surprise that data science is key to extracting value from such a large repository of information.
One of the biggest challenges for organizations like Dun & Bradstreet is finding skilled data scientists who have both the background and the experience to handle data sets as large as the one Dun & Bradstreet works with. Unfortunately, the market is not keeping pace with organizations’ needs for data science skills. Scriffignano believes the basics of AI are becoming generalized and democratized in a way that will not necessarily require skilled data scientists in the future. At the same time, he believes that the set of skills necessary to be a fully-fledged data scientist is much broader and deeper than the skills required simply to create machine learning models. In essence, true data scientists are focused on the broader issues of extracting value from data, while many who call themselves data scientists now are really machine learning engineers, focused on ML model development.
Scriffignano is of the opinion that we must focus on the scientist aspect of a data scientist. A data scientist must be able to formulate a question or theory from observed data, test that theory through experiment, and subsequently come to a conclusion and share the results. Noting that most data scientists are simply expected to churn out repeatable models, Scriffignano believes that challenging your data scientists to improve and innovate is truly where success lies. He states that the failure to push data scientists to innovate beyond simple model development is one reason why a significant number of organizations struggle with data science and AI.
Challenges: governance and ethics
Besides the issues of extracting value from large data sets, Scriffignano believes that the primary challenges for AI and data science center on governance and ethics. This is especially the case when personal information is involved. How can we make sure we’re making responsible use of private information when building large databases and intelligent models that use that private information?
The increased scrutiny of machine learning models has more to do with issues of privacy and security than with the specific characteristics of those models. Scriffignano makes an interesting point about just how troublesome it will be for AI regulation to cater to everyone, given how widely needs and wants vary. People are looking for more customization and more rapid model development, yet are not willing to compromise on their privacy. Some companies and individuals will benefit from models that use lots of data to create much more precise and accurate predictions, but at the expense of scooping up large amounts of private information. Others might resist the inclusion of their data in those models, even if that results in less accurate models that they might end up depending on. As a result, not everyone will be satisfied by greater expansion of the data used to build machine learning models.
Scriffignano believes that government regulators will need to keep up with evolving technologies if they wish to ensure optimal national security and avoid these privacy-related issues. The resulting laws and regulations will vary heavily across regions of the world, and the notion of ethics itself might not be consistent across jurisdictions. That is already the case today, with Europe taking a more ethical approach, China being less interested in privacy, and the United States somewhere in the middle. Some countries are simply more interested in privacy, while others look towards national security and even economic advancement. The issue with this, as Scriffignano shares, is that machine learning really doesn’t have any geographic boundaries. What might be unacceptable in one location might be perfectly fine in another. Models will be built in one region and then be available for use in others. It might be very difficult to keep models developed in a region with less care for privacy from being used in a location with a higher regard for data ethics.
On the podcast, Scriffignano also shares his dislike of anthropomorphizing AI. Taking a much more practical approach, he reminds us that AI in its current state of evolution functions by means of algorithms and processes. Scriffignano uses Artificial General Intelligence (AGI) as an example from which to share his point of view. Our limitations start when we cannot ask the right question of the copious amount of data we possess. Scriffignano foresees a future in which professionals work alongside AI, and in which we need not fear answering to robots or machines as long as we remain vigilant. To achieve this outcome, we must be stringent about and alert to data ethics and governance issues, so that this progress can be made without harm.
Author: Ron Schmelzer