By Pradeep Gulipalli – Co-Founder and Chief Delivery Officer.
As businesses today understand the value their data can unlock, they are faced with an increasing need for people who can help achieve this – Data Scientists.
Data Science is an overarching term that includes such a wide variety of skills that it is extremely rare to find all of them in one person. It’s not surprising that Data Scientists come in so many flavors, often leaving hiring managers and business sponsors unsure whether they are bringing the right person onto their team.
At Tiger Analytics, we have worked with more than 50 clients at different stages of their analytics journey and executed hundreds of analytics projects, both small and large. From all these interactions with clients, I’ve come to realize that there are multiple views on what makes a good data scientist. These views are often shaped by what has worked in specific settings and there is no single correct answer that works everywhere.
In fact, at Tiger Analytics, we face this question time and again: who will be the right fit for the nature of work in a particular client engagement? Having addressed this question many times, we observe a clear pattern, whether we look at it from the perspective of the business problem or of the skill set needed to address it. We see these skills and problems falling into two broad groups based on the objectives they aim to achieve for clients.
In this article, I share our learnings on this topic: what type of data scientist would be the best fit for your team.
Objective 1: Interpretable Insights for Business Users
This involves deriving insights from the data so that they can be reviewed and acted upon by business users. The models or solution frameworks therefore need to be sufficiently transparent: business users should be able to understand and interpret them at a high level. “Black box” models will rarely find acceptance.
Below are some business problems in this category:
– Understanding key drivers of brand perception
– Quantifying ROI of different marketing initiatives
– Determining whether a credit card application should be approved
– Quantifying impact of various pricing strategies on sales
– Forecasting demand under various economic scenarios
The data scientist working on these problems needs to have a strong understanding of econometric and statistical modeling principles. Advanced engineering/computer science skills are not an absolute necessity, as the data used is not unusually large. The key objective here is to generate insights, not to build computationally optimized algorithms. The ability to communicate and visualize results is very important. The main challenge for the data scientist here is to derive insights from (often incomplete) data and explain them to a non-technical audience to help drive adoption.
Objective 2: Back-end Intelligence for Smart Systems
In these types of problems, data science is used to build intelligence into applications. Interpretability of the models used is not of great interest to users. What matters more is whether the models produce accurate and relevant end results. Therefore, complex models and solution frameworks are very much acceptable.
Here are examples of some business problems:
– Personalizing news articles for a user using recommendation engines
– Quantifying sentiment of conversations on social media
– Detecting credit card fraud in real time
– Digital campaign optimization and execution (bidding, budgeting, targeting etc.)
– Processing unstructured data such as images, audio, or videos
The data scientist working on such problems should have a solid understanding of machine learning and be comfortable writing software for production environments. The problems frequently involve big data and occasionally need real-time analysis. The main challenge for the data scientist here is to bring together their technical, computational, and mathematical skills to build robust, large-scale systems.
The solution approaches and skill sets needed to solve the above two types of problems are quite different. An excellent data scientist in the first area might struggle significantly in the second, and vice versa. One can find further nuances in the data science problems being solved or the skill sets needed, but the two-fold categorization above captures our big-picture learning.
First published on – www.kdnuggets.com/2016/06/identify-right-data-your-team.html