3 Things You Need to Do Before Hiring a Data Scientist

By BrainStation August 22, 2018

A company decides to make the leap into data science and hires its first Data Scientist. She shows up, and her boss hands her a laptop and tells her to get to work.

If the company is lucky, it has snagged an enterprising Data Scientist who can figure out a way to make an impact. Odds are, though, it will end up with an expensive Data Scientist sitting around in frustration, wishing she had something to work on.

What can you do to avoid this increasingly common situation? Here are some things you should do before hiring a Data Scientist.

Have a Realistic Data Roadmap

We’ve all heard how revolutionary data, AI and machine learning are going to be, so once you hire your first Data Scientist, they ought to have no trouble ushering your organization into the future. Right? Not exactly.

Data Scientists are equipped with in-depth knowledge of statistics, machine learning, and big data technologies…but they most likely don’t know your industry. This means that your Data Scientist will spend a significant amount of time learning your business and figuring out how to apply their knowledge and expertise to your domain.

You can help this process significantly by having a general understanding of what can and can’t be accomplished with data science – even if you don’t have a full understanding of how Data Scientists do what they do. By having some specific ideas about how the Data Scientist could impact your business, you will minimize that lag between your first hire and your first data-driven business decisions.

Understand What Data Scientists Can Do

“Data science” is a broad field, and Data Scientists can do a lot, but today the term is most closely associated with machine learning.

Machine learning is fundamentally about making accurate predictions. As a side note, in machine learning, we use the term “predict” quite broadly: in everyday speech, predictions are about the future, but in machine learning, predicting is the much more general task of filling in data that we don’t have. So you may have a machine learning algorithm that “predicts” what segment a customer belongs to based on their activities and interactions with your company. This prediction isn’t really about the future per se, but we call it a “prediction” because you are using data that you have to fill in data that you don’t have.

Machine learning predictions involve taking data that you have and using it to predict the values of data that you don’t have. Truly understanding this idea is half the battle. To pose a machine learning problem, you begin by identifying some outcome that you want to predict. You might go on to theorize some events or data points that you think would be instrumental in predicting that outcome. Generally these predictive data points are called features.

The machine learning approach is then to get as many historical examples as you can of the outcome that you’re trying to predict, along with the features that you’ve chosen, and have the computer analyze the historical examples to find patterns linking the features to the outcomes.
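To make the idea concrete, here is a minimal sketch of that process in plain Python. Everything in it is an assumption chosen for illustration: the two features (`support_calls` and `late_payments`), the handful of invented historical examples, and the choice of a small logistic regression as the pattern-finding method. A real project would use far more data and an established library.

```python
import math

# Hypothetical historical examples, invented for illustration only.
# Features: (support_calls, late_payments). Outcome: 1 = churned, 0 = stayed.
HISTORY = [
    ((0, 0), 0),
    ((1, 0), 0),
    ((0, 1), 0),
    ((1, 1), 0),
    ((4, 2), 1),
    ((5, 1), 1),
    ((3, 3), 1),
    ((6, 2), 1),
]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability of the outcome, given the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit(examples, learning_rate=0.05, epochs=2000):
    """Learn weights linking features to outcomes (stochastic gradient descent)."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, outcome in examples:
            # How far off was the current guess for this historical example?
            error = predict_proba(weights, bias, features) - outcome
            # Nudge each weight in the direction that shrinks the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = fit(HISTORY)

# "Predict" the outcome for customers whose outcome we do not yet know.
quiet_customer = predict_proba(weights, bias, (0, 0))
troubled_customer = predict_proba(weights, bias, (6, 3))
print(f"quiet customer churn estimate: {quiet_customer:.2f}")
print(f"troubled customer churn estimate: {troubled_customer:.2f}")
```

The point of the sketch is the shape of the problem rather than the algorithm: an outcome to predict, features believed to be related to it, and historical examples where both are known.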

By understanding the outline of this process, you can empower your newly hired Data Scientist to do effective work much more quickly and easily. Moreover, if there is a strong understanding of what can be produced (and what is needed to produce it), you and the business as a whole will be better positioned to use the Data Scientist’s work.

Understand What Can Go Wrong

Consider two competing telecom companies, BigTel and CoolTec. Neither has had a Data Scientist working in the marketing department before, but managers at both have read about how data science is the future and have decided to hire one. Each has a focus on improving customer retention.

The manager at BigTel brings the Data Scientist on, hands her a laptop, and tells her to start using data science to help customer retention. The newly hired Data Scientist at BigTel begins feeling around, getting her hands on whatever data she can find, analyzing different measures of customer churn and retention, and trying to come up with a useful way to apply her skills to the problem of reducing churn.

She spends a few months producing impressive and informative reports and PowerPoint presentations, interviewing people across the business to better understand customer retention, and brainstorming ways to impact retention. Unfortunately, it’s not clear how to use these facts, figures, and ideas to effect real change in customer retention. Many months go by and it begins to look like the return on investment in data science isn’t as great as it was cracked up to be. Everyone is sad.

At CoolTec, the hiring manager has read this article and spent some time contemplating how to use machine learning to help with retention – before even making the first hire. By the time the Data Scientist is hired, there is already a plan: the Data Scientist at CoolTec will create a machine learning model that predicts which customers are likely to churn in the next three months. The model will make these predictions based on customer interactions, historical billing patterns, and geography. That model will be used to generate call lists for retention agents, who will contact the high-risk customers with special offers.
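The last step of that plan, turning model output into a call list, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer IDs and churn probabilities below are made up, the `ScoredCustomer` type and `build_call_list` function are hypothetical names, and the 0.5 risk threshold and cap on calls are arbitrary choices the business would need to set for itself.

```python
from dataclasses import dataclass


@dataclass
class ScoredCustomer:
    customer_id: str
    # Hypothetical model output: estimated chance of churning
    # within the next three months.
    churn_probability: float


def build_call_list(scored, threshold=0.5, max_calls=None):
    """Return customers at or above the risk threshold, highest risk first."""
    at_risk = [c for c in scored if c.churn_probability >= threshold]
    at_risk.sort(key=lambda c: c.churn_probability, reverse=True)
    # Optionally cap the list at what the retention team can handle.
    return at_risk if max_calls is None else at_risk[:max_calls]


# Invented scores standing in for the output of a churn model.
scored_customers = [
    ScoredCustomer("C001", 0.12),
    ScoredCustomer("C002", 0.81),
    ScoredCustomer("C003", 0.57),
    ScoredCustomer("C004", 0.34),
    ScoredCustomer("C005", 0.93),
]

call_list = build_call_list(scored_customers, threshold=0.5, max_calls=2)
for customer in call_list:
    print(customer.customer_id, customer.churn_probability)
```

Sorting by risk and capping the list reflects one possible design choice: retention agents have limited time, so the highest-risk customers are contacted first.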

At CoolTec, the Data Scientist can get started doing what she’s good at right out of the gate. When the problem is posed in the language of machine learning (build a model to predict which customers are likely to churn), the Data Scientist’s education and training are put to their best use. And while she is building this model, managers at CoolTec can make arrangements with the rest of the business for how to consume and act on the predictions.

Conclusion

You don’t need to know everything about machine learning to hire a Data Scientist; after all, that expertise is what you’re hiring the Data Scientist for. But it is useful to understand the general principles of machine learning, and how to use that process to produce positive business outcomes. The Data Scientist’s specialty is in creating predictive models, not in creating positive business outcomes. As a manager, if you can handle the translation from a business question to a machine learning question, the Data Scientist can hit the ground running, producing things that can be truly transformative.
