With the ever-growing abundance of data now available, companies are focusing on data analysis – and a fierce demand for Data Scientists has emerged.
Roles for Data Scientists have increased by 650 percent since 2012, which isn’t surprising given the ROI. Businesses that understand how to use data to drive decision-making are forecast to take $1.2 trillion from their competitors by 2020.
But a wide gap still exists between companies that use data effectively and those that do not. A 2016 Forrester study found that only 22 percent of firms were considered “Insights Leaders” (firms that use data science for competitive advantage) and that those companies were twice as likely to hold a market-leading position.
To join those at the front of the pack, you need to hire not just a Data Scientist but the right Data Scientist, and given the demand for their services, that might not be easy. Jeremy Stanley, Chief Data Scientist and EVP Engineering at Sailthru, estimates that hiring alone can easily consume 20 percent or more of a Data Science team’s time.
To ensure a more efficient process, here are some tips on what to ask, and what to look for.
Know What You Want
It sounds simple, but it’s an unfortunate reality that too many organizations hire Data Scientists without first considering their goals.
The hiring process should reflect the specific needs of your team. Will your candidate be principally responsible for creating engaging visualizations and dashboards for non-analysts, designing and developing deep neural networks and machine learning models, or prototyping real applications?
Hiring a Data Scientist whose strengths don’t align with your company needs won’t work well for either party. Building a realistic data roadmap before beginning the process is crucial.
Develop a Practical Process
Interviewing Data Scientists can be a tricky process because, as we mentioned, your candidates have options. Stanley of Sailthru pointed out that strong candidates in the field can often receive three or more offers, and hiring success rates for managers are commonly below 50 percent.
Keep that in mind at every stage of the hiring process: although you will want to be thorough, it’s equally crucial not to scare off potential candidates.
Consider, for instance, the take-home test that serves as a preliminary step in the hiring process at most organizations. A candidate who is in demand, probably already employed, and possibly being pursued by several other companies will have little incentive to complete a needlessly arduous assignment that requires days of work.
Riley Newman, former Head of Analytics at Airbnb, said the company merely put forth a “basic data challenge” at this step in the process, rather than anything too demanding.
“The goal here is to validate the candidate’s ability to work with data, as described in their resume,” he wrote. “We send a few datasets to them and ask a basic question; the exercise should be easy for anyone who has experience.”
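As a rough illustration of what a challenge at this level might look like, here is a minimal sketch in Python. The dataset, the column names, and the question ("which city has the highest average nightly price?") are all invented for this example and are not taken from Airbnb's actual process.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical dataset, invented purely for illustration.
BOOKINGS_CSV = """city,nights,price_per_night
Lisbon,3,80
Lisbon,2,100
Toronto,4,120
Toronto,1,140
Austin,2,90
"""


def average_price_by_city(csv_text):
    """Return a dict mapping each city to its mean nightly price."""
    prices = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        prices[row["city"]].append(float(row["price_per_night"]))
    return {city: mean(values) for city, values in prices.items()}


averages = average_price_by_city(BOOKINGS_CSV)
top_city = max(averages, key=averages.get)  # city with the highest average
print(averages)
print(top_city)
```

A candidate with real experience should be able to produce something like this quickly, which is the point: the exercise confirms basic fluency with data rather than testing endurance.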
When candidates are eventually brought in for a more rigorous challenge, it should be viewed not just as an opportunity to test their skills, but also to show what makes your company special. Remember: the feeling-out process works both ways.
“We try to be as transparent about (our in-house data challenge) as possible,” Newman said. “You get to see what it’s like working with us and vice versa. So we have the candidate sit with the team, give them access to our data, and a broad question. Then they have the day to attack the problem however they’re inclined, with the support of the people around them.
“We encourage questions, have lunch with them to ease the tension, and check in periodically to make sure they aren’t stuck on something trivial.”
Ask the Right Questions
Sometimes, Hiring Managers are tempted to use the interview to drill down into every last detail of a candidate’s technical skill set, barraging their interviewee with queries about a broad array of competencies that might not even be relevant to the job at hand. That’s a mistake. Limit your questions to tools you actually use, then follow up by investigating how a candidate would apply those tools to a specific problem.
To get a better sense of how your prospective Data Scientist thinks, ask them to explain a recent problem that they set out to solve. Ask how they approached the project, why they made the choices they did, and how they decided to allocate their time and resources. Finally, ask what they learned from the project and what they’d do differently.
Beyond giving you a window into the candidate’s process, this will also give you an insight into whether or not they possess a soft skill that can be crucial for Data Scientists: storytelling ability.
When the situation does demand a deeper look at the candidate’s technical prowess, Data Scientist Jacqueline Nolis recommends keeping questions relatively basic. For example, when probing a candidate’s knowledge of statistics, she suggests asking how they would explain a linear regression to a business executive, or whether they can list some alternative models to a linear regression and explain why those are better or worse.
“Given a particular area, I only ask questions at an introductory level. For instance, if asking a question about machine learning models, I would only ask about linear and logistic regressions and avoid asking about more advanced topics like Random Forests or boosting,” she wrote. “The reason for this is that if they understand the basics, they should be able to pick up the advanced topics on the job.”
Even before getting to that point, Nolis advises asking the candidate first about their familiarity with the topic.
“If they say they don’t have much, I skip it,” she said. “I want to avoid having the candidate feel overwhelmed or frustrated by that topic, as that could jeopardize the rest of the interview.”
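For interviewers who want a concrete reference point for the linear regression question mentioned above, here is a minimal sketch of a one-variable least-squares fit in plain Python. The ad-spend and sales figures are hypothetical, made up so that the answer is easy to check by hand; this is an illustration of the introductory level Nolis describes, not a question she published.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: ad spend and sales, both in thousands of dollars.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_spend, sales)
print(f"sales = {slope:.1f} * ad_spend + {intercept:.1f}")
```

A strong answer for a business executive would translate the slope into plain language, for example: "in this made-up data, each additional thousand dollars of ad spend is associated with about two thousand dollars more in sales."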
Look Out for These Skills and Qualifications
The classic Venn diagram for Data Scientists has always pegged their skill sets as sitting at the nexus of business, computer science, and mathematics. While that largely holds true, not all roles demand a mastery of strategy, engineering, and mathematical modeling. So once again, it’s best to drill down into the specific needs of the position you’re filling.
Still, the general technical skills Data Scientists mostly need to possess can be broken up into three categories:
- To collect and store data using databases, Excel, data formats such as XML, and query languages such as SQL
- To analyze and model data sets using such tools as Python, R, Hadoop, and Spark
- To create engaging and insightful visualizations using tools including Tableau and Power BI
Although it’s true that not every Data Scientist needs to possess a mastery of all aspects of the role, these skills and competencies are near-universally desired.
In fact, Michael Li, a Data Scientist who has worked at Google and Foursquare, lamented that too many of the people who call themselves Data Scientists are really “clickers,” comfortable only with point-and-click tools for data analysis and visualization.
“It’s time to hire Data Scientists who can code with more powerful tools, like R, Python, and TensorFlow,” Li wrote. “And companies should consider training their existing workforces in data science and AI skills in order to teach clickers some of the skills they’ll need to become coders.”
Still, it’s important not to overlook the value of less technical skills like creativity, critical thinking, and business sense.
Sara Vera, Data Scientist at Insightly, suggests that the best approach is to hire for data science talent and send employees to additional training programs as needed.
“Analytical thinking and communication skills are harder to teach than SQL, Python, and R,” she wrote. “The added benefit of training for tech skills is that you create an environment where mentorship is really strong, which creates a more cohesive team. The best candidates will want to always be learning from each other anyway.”
Offer Competitive Salaries
Finally, it’s worth noting that specialists as in-demand as Data Scientists do not come cheap; Glassdoor lists the average Data Scientist salary in the U.S. at $139,840 per year.
After this long process, be sure not to lose your prized candidate over a lowball salary offer. Remember: he or she likely has options.