Shin Chin was already working as a data scientist when he decided to take NYC Data Science Academy’s online Data Science Bootcamp. Although he had studied math, engineering and physics at college, he felt he needed more specific practical skills in Python and R in order to move his career in data science forward. He started in October 2015, and talks to us about strengthening his data science skillset, and how learning online with NYC Data Science Academy is already making him a better employee!
What were you up to before you started at NYC Data Science Academy?
My educational background is in electrical engineering. I got my BSc in electrical engineering, an MSc in electrical engineering, and another MSc in physics at the University of Michigan. Then I got my PhD from Penn State in signal processing and pattern recognition. My PhD thesis title was “Anomaly detection in complex dynamical systems” so I implemented an algorithm that I researched and developed to detect anomalies in complex dynamical systems. I didn’t really do machine learning like the kind we do at NYC Data Science Academy.
Right now I’m a data scientist on an Air Force contract. I’m part of a web development team that tries to integrate analytics into the application we’re building. I have knowledge of data science, which is my value-add to the team, but I’m not actually writing code or analyzing large data sets right now. In my previous two jobs, I was a data scientist but I felt I needed to brush up more on my skills in order to succeed.
With that kind of background, why do you need NYC Data Science Academy? What drove you to do a bootcamp style program?
After college I did start interviewing for data science positions. But I felt like my skill level was not up to the degree needed to succeed at big companies like Facebook or LinkedIn, because my background is in electrical engineering, not computer science. My software development and programming skills were not as strong as those of a computer scientist.
Over the last three years I picked up R and Python, but I was not very good. I wasn’t sure how to use machine-learning algorithms in Python and R to analyze data sets, define patterns, and find anomalies. So I thought NYCDSA would help me brush up on those skills, improve my understanding of these wonderful machine learning algorithms, and help me implement them practically in a work environment. I’m more of a research scientist and I want to be a data scientist in real-world industry, rather than just being a theoretician.
Did you look at other data science bootcamps before you made your decision on NYC Data Science Academy?
I did look at the Python course at General Assembly.
How did you find out about NYC Data Science Academy?
I was looking through news articles about data science bootcamps and NYC Data Science Academy had great reviews. I heard they had a more rigorous curriculum in Python and R than other data science bootcamps.
Why did you decide to do the online version of NYC Data Science Academy?
I didn’t want to quit my job and move to New York City from Washington, DC. It would be too expensive. I talked to NYC Data Science Academy founder Vivian Zhang and told her I wasn’t interested in moving to New York City, and she told me about the online version.
Did you have to be convinced of the bootcamp model or the online bootcamp model, because you had done so much traditional education?
I know I have a strong background in math, engineering and physics, but I felt I was lacking practical skills. My traditional academic education gave me around 85% to 90% of the skills I needed to work as a data scientist for a big company, but the bootcamp will give me that last 10% to 15% by teaching me other practical programming skills. With these skills I’ll be able to hit the ground running in my first year at a big company.
What have you learned so far at NYC Data Science Academy?
We started with R, then moved on to Python. I haven’t got into Spark, Hive and Hadoop yet, but those are the next tools I’ll learn.
For beginners who are not totally sure, what is the difference between R and Python?
R is a great statistical computing package that a lot of statisticians use. There are great libraries and packages that can be used for machine learning and visualization. Python is more of a general programming language used for a wide variety of purposes like web development. But Python is catching up very quickly because people have developed modules that implement a lot of the same stuff that R implements. A lot of companies use Python. It’s also very good for integrating into web applications. R is also a little more complex to learn than Python. It’s good to learn both because different companies use one or the other.
Do you like using one or the other more?
I’ve been using R more often, but I started to learn Python in the last year or so. I think both have their uses.
In the online version of the class, is job placement important?
Vivian has always emphasized that NYC Data Science can help you find a job after you graduate. She always gives me encouraging news about hiring companies coming to NYC Data Science to interview students, and tells me about students getting jobs at various companies. Hiring companies are invited to come meet students towards the end of the program, and she is encouraging me to go out to New York City to be present at hiring events. She also sent my resume out to hiring partners such as BlackRock. I just started the interview process.
What is it like to take the online version of NYC Data Science?
It’s 25 to 30 hours a week. They record all the lectures and put them online for me to view. They also put all the lecture notes and lecture slides on the website. I think it’s better than actually being in the classroom because I can stop the video and rewind. I meticulously listen to the videos, and go through the slides, to make sure I understand everything. There are also homework and projects you have to complete.
I have a TA who’s assigned to me. He helped me set up my environment for Git, Python, R and SQL. He reviews my homework, and when I have finished a project, we have a Google Hangout where he goes through it, suggests improvements, then grades it. If I have any questions, I can call him anytime and he will give me the answer.
Do you ever get to talk to other people in the class, or other people doing the online course?
Not really. I think I’m one of the few people doing the online course.
Who is the instructor who is delivering the lectures?
The main instructor, who is very good, is Chris. I’ve never met him personally, but he has a master's in statistics and he’s a great statistician. When he lectures, he gives very good explanations on all the concepts, and includes instructions on how to perform the machine learning.
What types of projects are you working on? Have you done a big project yet?
There’s a final project but I haven’t started working on that yet. I’m still in week 9 and I still have the machine-learning project to finish before I work on my capstone project. I’ve worked on three projects so far, and I’m working on the fourth project now, then the capstone project will be the biggest project.
Do you feel there are things they are teaching in the class that you already know, or has everything been new?
Everything is familiar to me, except they go more in depth and I learn more about the algorithms, R, and Python and all the parameters and things that you can do. I learned more and I find myself thinking, “Oh! I never knew this about R.” So they helped me understand it more and gave me new insights into what is going on.
Is there a good feedback loop when a problem comes up?
Yes. Sometimes when I click on the online classroom and the links don’t work, I immediately communicate with the TA and he gets it fixed within a day.
Do you think somebody should have a PhD in order to do well as a data scientist?
I think it really helps to have at least a master’s in a quantitative subject because it’s not about knowledge, it’s about the method of thinking and analytical skills. The skills you have as a scientist are very helpful as a data scientist.
How are you balancing your studies with a full-time job?
In my job, I’ve been working remotely for the last 10 months, and my entire team works remotely. I work on NYC Data Science right after I finish my work, in the late afternoon and evenings. I’ve not been going out on the weekends. When my friends ask me to go out, I say I have to work on my studies.
You’re working on a data science team now for the Air Force. Have you noticed that what you’re learning at NYCDSA has made you better at your job already?
Yes, yes. I’m not working on a data science team in my job, though; I’m the only data scientist on my team. Most of the people on my team are analysts or web developers.
Also, the reason it’s taken a longer time (I signed up five months ago) is that I’ve taken a couple of vacations in between. I can take six months to finish the course.
What’s your dream scenario when you graduate?
To work as a data scientist with the skill sets I have learned, applying what I’ve learned on a day-to-day basis, and creating value for the company. I like where I’m currently working, so my goal right now is to help them improve their bottom line.
Do you have any advice for people thinking about doing a data science bootcamp?
I think it helps if you have a basic knowledge of statistics and programming skills. Also, be prepared to work hard, because it’s a lot of work. You need to work hard to get the most out of it.