The Rise of the Data Strategist – Part 1

Written by Boyan Angelov

Have you noticed the growing array of job titles with the word “data” in front of them? Back in 2012, Harvard Business Review called “data scientist” “the sexiest job of the 21st century”. Ever since, as predicted, numerous spin-offs and new job titles containing the word “data” have appeared, and the ambition to become a priceless “data science unicorn” has been matched by an ongoing talent war and feeding frenzy among companies and headhunters desperate to discover and catch one.

Figure 1: Data Scientist Searches on Google 2004 – 2018

Figure 2: Data Strategist Searches on Google 2004 – 2018

With data science unicorns seemingly so rare, companies often try to assemble the same skill set by hiring several people.

A skill commonly found in data science job descriptions, in addition to the usual requirements of an advanced science degree, SQL, R, Python, machine learning, etc., is the following:

“the ability to translate business problems to analytical solutions and insights”.

Technical people might scoff at this phrase and dismiss it as a typical example of useless business lingo. It seems so obviously important for the job of a data scientist, so why put it in the job description? The simple reason is that it is a real skill, and while it can be trained and developed, it is rarer than one might think. How many data scientists with PhDs also have a business degree? There are a few, but it is not common.

In reality, the phrase above describes the main skill of a Data Strategist. To illustrate what this means in practice, let’s look at an example from e-commerce:

Client: Can you help us understand our customers better?

The answer to this can be tricky and require a lot of experience and confidence. Let’s see what often follows:

Data Scientist: You can just deploy a neural network on AWS to train a classifier for the different user groups.

Client: Umm, okay. But what kind of data do we need for this?

Data Scientist: A labelled dataset of around several thousand observations – balanced classes of course.

Client: ???

Of all the words in those sentences, only a few are understandable to a manager who simply hopes to understand their customers. The rest could just as well be an arcane alien language. This simplified example illustrates the problem and demonstrates the need for translational roles. The data scientist used terms that people with advanced science degrees may understand, but which are incomprehensible to most of the managers and businesspeople they report to.

Let’s call a data strategist to see how they do:

Client: Can you help us understand our customers better?

Data Strategist: OK, let’s see. So what kind of data do you already have, if any?

Client: We have Google Analytics data, user behaviour and purchasing data from our website, product reviews and comments in a database, and social media data.[1]

Data Strategist: Based on those datasets we can try to answer the question in several different ways. One thing we can do is take the behavioural data and do a sentiment analysis on it, to see how customers feel about your specific products. This is good, because parts of this project can be reused to analyse the product review text. We can go a step further here and see if there are some patterns in there, perhaps groups of different users. By using such models, we can start to understand whether there are common patterns in customer feedback, and automatically tag negative feedback and forward it to the support staff. There is a good chance that this would work; I would say I am 80% confident we will get relevant results.

Client: Oh, this sounds interesting. What else can we do?

Data Strategist: If we have a good dataset on the customers, we can try to predict their probability of churn, in other words the probability that they abandon their provider. We can then use those predictions to target users who are likely to churn with special offers that can keep them on your platform. This can be a bit trickier and depends a lot on the amount of data you have. I would say it has a 50% probability of working out, and building a churn model might take more work, expertise and domain knowledge.

Can you guess which one would have a larger business impact?

Here are the points that make the difference:

  • The first noticeable difference between the two conversations is that the latter is an actual dialogue. There is back and forth between the client and the data strategist, and it feels much more like a collaboration.
  • The data strategist tried to get an understanding first.
  • The data strategist connected the work to the business goal immediately.
  • In the second conversation, the strategist also started to think about code reusability.
  • And finally, the strategist provided a few metrics for comparing the approaches: importance, complexity and probability of success.
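
The comparison in the last point can be made concrete with a small Python sketch. Only the two success probabilities (80% and 50%) come from the conversation above. The importance and complexity ratings, the `ProjectOption` class and the scoring formula are hypothetical placeholders chosen for illustration, not a recommended method.

```python
from dataclasses import dataclass


@dataclass
class ProjectOption:
    name: str
    success_probability: float  # taken from the conversation above
    importance: int             # hypothetical rating, 1 (low) to 5 (high)
    complexity: int             # hypothetical rating, 1 (low) to 5 (high)

    def priority_score(self) -> float:
        # Assumed formula: expected importance per unit of complexity.
        return self.success_probability * self.importance / self.complexity


options = [
    # The importance and complexity values below are made up for illustration.
    ProjectOption("Sentiment analysis", 0.8, importance=3, complexity=2),
    ProjectOption("Churn prediction", 0.5, importance=5, complexity=4),
]

# Rank the options from highest to lowest priority score.
ranked = sorted(options, key=lambda o: o.priority_score(), reverse=True)

for option in ranked:
    print(f"{option.name}: {option.priority_score():.2f}")
```

The ranking depends entirely on the ratings plugged in, so a sketch like this is best treated as a way to structure the discussion with the client rather than as an answer in itself.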

All of those points define the new Data Strategist role. This is also in line with the increasing specialisation and growing maturity of the data science field. Recent articles, some in HBR and Forbes, illustrate this point[2]. Let’s say you are convinced that your business or organisation needs such a role. How do you hire for it? What are the traits that make a good data strategist?

In this blog series, we explore the emerging role of the Data Strategist, and how we expect this role to drive business insights and evolve.

Boyan Angelov is a Senior Data Scientist at DAIN Studios in Berlin, Germany. Boyan has a background in bioinformatics and regularly speaks at events and blogs about data science and machine learning.

[1] Of course, not all clients would immediately have such a great overview, but let’s assume they do for illustration purposes.

[2] They use the term “data translator”.