Irmak Sirer and Laurie Skelly work as data scientists (Irmak is also a partner) at Datascope Analytics, a data-driven consulting and design firm in Chicago. When they met the folks at Metis, who have already proven their propensity for great partnerships with their thoughtbot collaboration, it was clear that a Data Science program was in the cards. Now, Irmak and Laurie are designing the curriculum for the upcoming Metis Data Science course in New York, which is one of the only programs that teaches relative beginners (students don't need PhDs to apply) to be employable data scientists.
Tell us about Datascope Analytics and how you got involved with Metis.
Laurie: Datascope is a data science consulting firm. We work with a wide range of clients from regional not-for-profits all the way to Fortune 500 and national corporations. People usually ask what kind of data we work with. We’re the kind of firm that uses any data source to help people solve their problems. We’re a very broad, general-purpose firm.
Most of us have some kind of academic background and we have continued to come back to the idea of doing some sort of teaching or training. We’re small in size, so finding the bandwidth to run classes and develop new material was really going to be difficult and probably a really long-term idea.
We ran into Jason and Bernardo from Metis at Strata, which is a data science conference. I used to work for Kaplan part-time in graduate school. They were looking for data scientists, people with domain knowledge, to help develop the curriculum, and we were really excited to find a way to do some teaching and training without having to take care of all the logistical aspects ourselves.
Irmak: Laurie met Jason and Bernardo first, the entire Datascope team was really excited about it, and I also jumped at the opportunity. Now Laurie and I are designing the course and we will be the first instructors.
We have seen the bootcamp model expand to data science, product design and other fields. Why do you think the bootcamp model can work for data science?
Laurie: I think that there are a few reasons why it’s a really great option for a lot of people. Usually, people who would consider the bootcamp are comparing it against a master’s program or self-teaching.
A couple of years ago, there were really no options for data science master’s degrees, and this year I think there are 40 new ones starting. But those will take 1-2 years and they’re usually upwards of $50,000.
On the other side, there’s self-teaching. There’s so much available on the internet, but it’s really difficult, even for disciplined people, to stick with it for the amount of time it takes to learn and be employable.
The Metis program is a very happy medium of a 3-month time window where you can take a leave of absence or say, “I’m gonna make the jump and figure this out.” I’m sure that some people will also be sent by their employers.
Irmak: I think with something like a master's program, you absorb a lot of theoretical knowledge. Then you start working as a data scientist, and it takes a long time to get familiar with the real-life practical applications of that theoretical knowledge and to relearn all you were taught. We want to throw you right into the thick of real problem cases and supply the information you need to get through. This gives you valuable experience.
Do you expect that students will graduate ready to be junior data scientists at a company?
Laurie: Absolutely. There are a ton of people out there who have all or nearly all these skills; they’re just filling in a couple of cracks.
I had the experience of knowing a lot of what I needed to know and not understanding what it was called. We come from different academic traditions and a lot of times, when discussing machine learning and regression techniques and modelling, people will be talking about the same space and really have a lot more knowledge in common than they realize. They just need to have that reconciliation. There are also people in the academic world who need to adjust to how different things are in the world of business.
I don’t want to hit this point too hard because I think it’s part of the secret sauce of why data science is so valuable right now, but we don’t know everything. Being on a job, we don’t know everything all the time. Data scientists, even the best ones, in practice do a lot of Googling. It’s about having that skill and that confidence to apply the solution to the problem even if it’s a problem that you’ve never encountered before.
That’s the sort of thing that we can definitely teach in 12 weeks, especially the way that we’ve designed it: the process of fearlessly tackling very technically challenging things, knowing where to look, and knowing how these pieces fit together.
This will be your inaugural cohort. Do you expect people to have experience with coding or analytics or statistics, or could somebody theoretically be a beginner?
Irmak: What we expect is some exposure and experience in statistics and programming but not more than that. We’re actually expecting different people with very different skill sets.
I think that there will be people who have a lot of programming experience and have seen a little bit of statistics, maybe in school, but they don’t really focus on analysis. There could be a student who worked in the sciences, as another example, where they used statistics a little bit and they coded a little bit but they haven’t done anything like data science directly. We would also consider people who do more traditional types of statistics. They’ve done some coding but they’re not that experienced in programming.
The idea here is, not everybody is going to gain the exact same skills going through the boot camp. Wherever you are weaker in the big picture of Data Science, you will learn and strengthen that part.
And we think that personal attributes like curiosity and creativity are also very important because, as Laurie said, a lot of data science is having confidence in your ability to learn something you don’t know, which means that you should be curious and creative with the tools themselves.
Laurie: If your question is whether we are going to be able to take beginners and make them data scientists in 12 weeks, I would say the answer is “not yet.” Because this is our first class, we don’t know how inexperienced someone could be and still be brought up to speed.
Fortunately, I think that since we’re so early in the data science bootcamp game, we’re going to have a pretty competitive applicant pool. Maybe in the future we’ll be able to assess people and say, “You’re really great on your programming but you’re missing some skills in experimental design; go bone up on that and come back and we’ll be ready to take you.”
Our biggest goal is to be telling the truth when we say in 12 weeks you’ll be able to get a job and not drown in that job. We don’t have any interest in putting people in situations they’re not ready for.
Have you thought about what the interview is going to look like?
Laurie: Yes! Compared with what people expect, it’s way more of a culture interview, but there is a technical challenge as well. People think about data scientists and they think about the toolbox and how intimidating it is. But like we said, we really need curious, clever people more than someone with a lot of technical experience. I’d rather have someone who I can tell is going to pick something up quickly or is asking clever follow-up questions and is just really listening closely. We’ll probably learn more as we go, but for this round, we’re going to be setting ourselves up for a better experience if we find a lot of really good personality matches. These will be our pioneers.
What is the technology stack that students learn in a data science boot camp?
Irmak: We think of it not only as a technology stack; we think of it as data science dimensions: the domains and the data, algorithms, tools and the visualization/communication part of a project. Basically, we want the graduates to be equipped in all of these dimensions.
In terms of data, we’ll be teaching SQL and non-SQL databases. We’ll be teaching about APIs and web scraping, where you could get data from different sources and how to clean that data.
In terms of algorithms, we will go over machine-learning algorithms; we will go over regressions, supervised learning and unsupervised learning. The tools you will learn are how you apply those algorithms. We will teach the course using Python as the programming language. There are a whole bunch of packages in Python that you can use to apply these types of algorithms.
In terms of visualization and communication, we think that communication skills, presentation and the relationship with the client are very important. We will teach these skills alongside visualization tools such as d3.
Laurie: In every project, you have to decide what kind of data you’re using, how you get it, how you store it, what you’re doing with it as far as algorithms, what you’re using to implement those algorithms and, when you are done, how you are going to show those results to somebody else.
So for each area, or dimension, of data science, we will provide broad exposure to what’s out there so the students have a good sense of the ‘lay of the land’. In each one we’ll also be working from a kind of ‘home base’: for a programming language, it will be Python; for visualization, it will be d3; for databases, we’ll use MongoDB and MySQL. For each of these they’ll have some repetitive use and training, so they will build up a toolkit that they are familiar with, but also understand what alternatives are out there for each piece and be ready to make some changes in the lineup if a job or project calls for it.
Of our 12 weeks, we use one of them to spend a lot of focus on d3 because everybody wants to be able to make cool visualizations in d3 and we think it’s worth it.
How will your project-based curriculum look?
Students will be working on 4 or 5 projects, and for each one there’ll be some kind of output. In many cases they might be applying similar algorithms, but different people will be picking different data sets, so they’ll have different outcomes, and it’ll be really interesting for people to express themselves as they’re going.
Irmak: I think it’s a great way for them to show potential employers how they can tackle a lot of problems. Instead of just one big project, you get to show completely separate, unique problems that have nothing to do with each other and that show all your skills.
Has it been a challenge to create a bootcamp curriculum for data science?
Laurie: Metis really picked the right data scientists in partnering with Datascope; we’re obsessed with design. When we were at Strata, we were doing a workshop called “Design Thinking for Dummies (Data Scientists).” So this is the kind of stuff we love to chew on anyway.
What will a typical day look like at Metis?
Irmak: First of all, generally, you will be working on a project. The first project will be just a week, another one 3 weeks long, and so on.
You will have questions about how to make progress with your current project; you need to know about the tools, algorithms and approaches. During the day we will give some lectures where you get some of that knowledge, and then for most of the day you are actually working on that project, applying that knowledge and also creating new questions.
Will students have pre-work once they get admitted and how many hours are you expecting that to take?
Laurie: They have online pre-work; we have aimed for a maximum of 30 hours so that people could do it in about 2 hours per day at home, basically 2 weeks.
Pre-work focuses on using the command line, Python and some statistics background. They’ll work through some examples from some books that we love and there’ll be some example problems from those books. If they can submit the answer correctly, they’ll pass pre-work.
Are you going to have the “personal investment days” that the Ruby on Rails boot camp has at Metis, where they give people free Fridays?
Laurie: Yeah; we’re going to have guest speakers come in; we’re going to have more like culture days. There are going to be some cool conferences going on while we’re in New York. There’s that, and the fact that we’re enforcing that the day ends at 6 and that people continue being reasonable, whole people and take breaks. It will be important to try to keep people from burning out too hard.
Will Datascope Analytics be hiring from the Metis graduate pool?
Laurie: We will be a hiring partner, absolutely. We’re really excited to have first look at the fresh recruits. As a goal, it’s absolutely at the forefront of my mind to create a course that can help students get awesome jobs because that’s why people want to be data scientists. In addition to Datascope, we are sharing information about the type of companies Metis should target to become hiring partners.
I’m sure you all have amazing insight into who they should be reaching out to.
Irmak: Metis is in charge of a lot of the organization, but we’re in contact with companies that we have worked with, companies that we know and have relationships with and that we know are in need of data scientists.
Laurie: I don’t know how many data scientists you know but I’ll just get emails from people at random companies asking if I have recommendations. And it actually feels like a big relief to say, “Let me put you in contact with our hiring person at Metis.”