An online portfolio is key for anyone working in the world of data science — because it’s the best way to show employers evidence of your skill set, be it your Python prowess or your knack for data modeling.
But knowing where to start can be tricky, and you don’t want your data portfolio to just be… a data dump.
According to Adam Thorsteinson, BrainStation’s Lead Educator for Data Science, budding Data Scientists should be aiming for the opposite: a curated, well-rounded showcase of their best work that’s capable of catching an employer’s eye.
With that in mind, he offered a few dos and don’ts for anyone building a data science portfolio.
Don’t Include Your Whole Body of Work
The first thing on your agenda needs to be conducting an inventory of all the data science work you’ve done to date, Thorsteinson says.
And it’s worth thinking outside the box — consider everything from an eye-catching data visualization produced for a big-name client, to a thesis project where you showed off some powerful Python coding skills.
Then, Thorsteinson says, figure out which projects make the cut for your portfolio. You want a few pieces that best showcase your range of skills and the whole data science process: starting with a basic data set, defining a problem, cleaning up the data, building a model, and ultimately finding a solution.
“That’s the arc of any data science project out in the wild,” Thorsteinson says.
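To make that arc concrete, here is a minimal sketch of what a tiny end-to-end portfolio sample might look like. Everything in it is hypothetical and made up for illustration: the study-hours data set, the question being asked, and the helper names `clean`, `fit_line`, and `predict`. It uses only Python’s standard library, and a real project would likely lean on tools such as pandas or scikit-learn instead.

```python
from statistics import mean

# 1. Start with a basic data set.
# Hypothetical rows of (hours_studied, exam_score); None marks a missing value.
raw_rows = [
    (1.0, 52.0), (2.0, 55.0), (None, 61.0), (3.0, 61.0),
    (4.0, None), (5.0, 68.0), (6.0, 71.0),
]

# 2. Define the problem: how does exam score change with hours studied?

def clean(rows):
    """3. Cleanup: drop any row that has a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def fit_line(rows):
    """4. Model: ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in rows)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(model, x):
    """Apply the fitted line to a new value of x."""
    slope, intercept = model
    return slope * x + intercept

rows = clean(raw_rows)
model = fit_line(rows)

# 5. Solution: state the finding in plain language for a non-technical reader.
print(f"Each extra hour of study is associated with about {model[0]:.1f} more points.")
```

The final `print` line is doing as much work as the model: it turns the fitted slope into a sentence a hiring manager can read without opening the code.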
Do Showcase Your Communication Skills
For data-based jobs, employers will want to see your number-crunching and coding abilities, but that’s not the only thing they’re looking for.
“A huge piece of data science in the workplace is being able to communicate,” says Thorsteinson.
In a portfolio, you can show off your communication skills by coupling portfolio samples with an accompanying narrative, showing the work you did to find a solution to each problem. “You could write a whole blog post around a piece of work you’ve done,” Thorsteinson suggests.
It’s also worth including a bit about yourself — like your passions and past work experience — as part of the non-data elements of your portfolio.
When asked what would make him want to hire someone, Thorsteinson put it this way: “A good combination of well-written code, with a strong amount of communication built around that code.”
Do Consider GitHub Instead of a Website
Sure, you can build a basic online portfolio to showcase your work. But why not use a platform where other Data Scientists are already gathering?
GitHub — a popular software development platform — is used by millions of Developers around the world, meaning your work will be hosted in a space frequented by potential future coworkers, mentors, and hiring managers.
“Definitely having a well-rounded, well-populated GitHub portfolio will take things to another level,” Thorsteinson says. “That’s where a lot of Data Scientists will host their portfolio… instead of LinkedIn or a personal website.”
And, he adds, it’s a place where you can easily post more than just the technical side of things: each portfolio sample can be constructed to showcase your code embedded within a larger written piece outlining the problem and process.
Don’t Just Show Code in Isolation
Employers can get a bit glassy-eyed looking at applicants’ portfolios, so it’s important to stand out. And, while it might sound obvious, it’s also crucial to prove you actually know what you’re doing, rather than taking a “fake it ’til you make it” approach to coding.
That’s why just publishing your code in isolation doesn’t really show that you know what you’re doing, Thorsteinson says. “That just highlights you’ve written this code, which may or may not have been copied and pasted from somewhere else.”
Yikes. Not exactly the impression you want to leave in a hiring manager’s mind.
But, thankfully, there’s a fix: “Incorporating data visualization wherever you can — that’s a skill employers look for,” Thorsteinson explains. “It’s one of the best ways to communicate your data findings to a non-technical audience.”
It’s also helpful to showcase a range of different techniques, since data science is a pretty broad field: there are many ways to approach a problem, and a variety of methods you can bring to the table.
“If you want to build a great, well-rounded portfolio, find some well-rounded problems to tackle and showcase,” Thorsteinson suggests.