Jupyter Notebooks is an IDE used to interface with Python and it’s becoming more popular in data science bootcamp curriculums. But what exactly is Jupyter Notebooks? Anurag Srivastava, lead instructor and mentor for Lighthouse Labs’ Data Science bootcamp, walks us through the basics of data science and Jupyter Notebooks, and answers the question: can you effectively learn Jupyter at a bootcamp?
Jupyter Notebook is an Integrated Development Environment (IDE) that acts as an interface between the user and Python.
The two most prominent languages used for data science are R and Python. But these are back end languages, working behind the scenes. Therefore you need some kind of a front end – something to interact with and display information for you. Jupyter is essentially a front end for executing Python commands.
What’s the history of Jupyter Notebook?
Jupyter is actually derived from IPython Notebook. In 2014, IPython Notebook was converted to Project Jupyter, an open-source project that's contributed to and maintained by people all over the world.
Jupyter is now used to teach data science around the world. As bootcamps and schools have taught the tool to new data science professionals, it has slowly gained traction and popularity.
Jupyter makes it easy to bring everything that would normally be scattered across multiple applications into one place. Jupyter Notebook stands out because it does many things at once. You can install it through your terminal. You can make your own slides, add headings and subheadings, write text, and execute code. Jupyter Notebook lets you split your code into separate sections (cells) and run them individually, and you can combine code, images, charts, mathematical equations, and more all in one place. Moreover, Jupyter is a web-based application, so you can create and share code and documents. It provides a single environment where you can document code, run it, and visualize the outcome, all without leaving the environment.
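To make the idea of text, equations, and code living in one document more concrete, here is a minimal sketch of how a notebook can be represented as a list of cells. It assumes the JSON layout of the version 4 notebook format; the helper `make_cell` and the cell contents are hypothetical and only for illustration.

```python
import json


def make_cell(cell_type, source):
    """Return a minimal notebook-format-4 style cell (illustrative helper)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally record an execution counter and their outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


# A notebook is a sequence of cells: headings and text, equations, and code.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        make_cell("markdown", "# Customer analysis\nSome explanatory text."),
        make_cell("markdown", "The mean: $\\bar{x} = \\frac{1}{n}\\sum_i x_i$"),
        make_cell("code", "values = [3, 5, 8]\nsum(values) / len(values)"),
    ],
}

# Serialize to the JSON text that would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Saving `notebook_json` to a file ending in `.ipynb` should give something Jupyter can open, with each cell editable and runnable on its own. In practice you would create cells in the browser rather than by hand.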
We teach Jupyter Notebooks at Lighthouse Labs, and just today we were talking about a technique called clustering, which is an unsupervised machine learning technique. Clustering can be used to simulate customer segmentation for a retail company, for example. Jupyter Notebook basically supports an end-to-end data science project, from finding the requirements and identifying business problems through to solutions. You can fetch the data and check it for missing values, outliers, and dirty data. Then you can use machine learning to find things out. In the end, you'll build a nice chart to present it. Everything is possible in Jupyter Notebook!
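As a rough illustration of the customer segmentation idea, the sketch below drops records with missing values and then groups the remaining customers with a small hand-written k-means loop. The customer figures, the choice of two segments, and the starting centroids are all made up for the example, and the features are left unscaled for simplicity. A real project would more likely use a library such as scikit-learn.

```python
import math

# Hypothetical records: (annual spend in dollars, store visits per year).
raw_records = [
    (200.0, 4), (220.0, 5), (250.0, 3),
    (None, 7),  # a record with a missing value
    (900.0, 20), (950.0, 22), (1000.0, 25),
]

# Data cleaning step: keep only records without missing values.
customers = [record for record in raw_records if None not in record]


def mean_point(points):
    """Average a list of points coordinate by coordinate."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    labels = []
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            # Keep the old centroid if no point was assigned to it.
            new_centroids.append(mean_point(members) if members else centroids[k])
        if new_centroids == centroids:
            break  # assignments are stable, so stop early
        centroids = new_centroids
    return labels, centroids


# Start from the first and last customer as initial centroids.
labels, centroids = k_means(customers, centroids=[customers[0], customers[-1]])
print(labels)
```

Each label is the segment assigned to the matching customer, so the low-spend and high-spend customers end up in separate groups. In a notebook, the cleaning, clustering, and a final chart would each typically sit in their own cell.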
Are there specific jobs or companies that love Jupyter?
A company doesn't typically choose to use a specific IDE. These are all open-source IDEs so it almost always boils down to the data scientists themselves. In a job description, you may see companies wanting you to have experience in a specific IDE because the team there is practicing that specific technology.
If you see Jupyter Notebooks as a requirement on a job listing, is it easy to learn?
It's quite easy to learn! In data science, everything that you write in Jupyter is actually Python code. Jupyter Notebook can also work with other kinds of code: cells can render HTML, and other languages such as Java are available through additional kernels.
When should you use Jupyter on the job?
It depends on the audience you're dealing with. If you’re presenting your findings to a project manager or the VP of Data Science, then they would like to see it in Jupyter Notebook. If you’re presenting your findings to clients who aren’t data scientists, then you want to keep things simple for the business and use a PowerPoint presentation.
What does the Data Science Bootcamp at Lighthouse Labs cover?
You'll live, eat, sleep, and breathe data science for 12 weeks during the Lighthouse Labs Data Science bootcamp. We cover everything: Git, Bash, APIs, data types, data transfers, basic statistics, Python, Jupyter Notebooks, machine learning algorithms, deep learning algorithms, and AI.
Coming from a traditional education, were you skeptical that Lighthouse Labs could condense a data science education into 12 weeks?
Actually, no! When I started teaching the postgraduate course, I found that one year is too much time to learn data science. Anybody can be a data scientist. As you have noticed, I don't carry an engineering or mathematical background; I have a master of business administration. I'm not great at math or coding. If I can do it, anybody in the world can do it! Data Science is actually not that complicated. All of these algorithms are logical; if you wanted to be in a job where you’re developing new algorithms, then you would be in a PhD program.
Can you be a Data Scientist if you don't love math?
Absolutely! Data science is not about coding or math; it's about business. That's not to say math and coding aren't important, but the bigger picture is focused on the business. You have to use mathematical models and algorithms, but link them to business insights. You need to have a bigger picture in mind. That's what makes this a hard job. If data science were just about a certain coding language, then data scientists would never be paid this much.
What are your favorite resources for beginners learning Jupyter?
Go to the Project Jupyter website. Who better to teach it to you than Jupyter themselves?
Google it! Everybody has their own preferences and their own learning styles. Me saying that Project Jupyter is the best doesn't make it the best. The word "best" itself is subjective. I suggest starting with Project Jupyter and if that doesn't make sense to you, go to Google! If you're a visual learner go to YouTube. If you learn by reading, find a book or documentation. If what you're watching or reading or trying to learn from does not grab your attention and teach you something within the first five minutes, then try something else.