How did you get into tech and data analytics, Craig?
I started out as a Data Engineer. I was on a team that developed the first database for the Macintosh platform and I ended up joining Apple as a result. Fast forward and I helped code the first 3D multiplayer online computer video game called Spectre. About 12 years ago, I was asked by Electronic Arts to build their data stack. It's been a fun career combination of data analytics and video games – basically the dream career.
Over the past few years, I've worked with General Assembly as a Subject Matter Expert in Data Analytics and particularly Tableau. I've developed GA courses, I'm on their product advisory board, and earlier this year I was recognized as a part of their Distinguished Faculty program. I’m also part of a company called Data Divers. If we're not talking about our next diving trip, we're talking about data. If you want to connect with me on LinkedIn, feel free to do so!
Data visualization is a relatively new concept. Could you give us a high-level overview? Why is data visualization important right now?
Data visualization is the difference between looking at thousands of rows on a spreadsheet versus seeing one chart or graphic that gives you that "aha!" moment. EMC did a study a few years ago that shows that 95% of all data that is collected today goes unanalyzed. Gartner Research, one of my favorite thought leaders and think-tanks, said that by the end of this year there will be 5 million jobs in data analytics, data science, and data engineering in the world with only 2.5 million people to fill them. Now is a great time to be getting into this role and understanding the nature of data analytics.
What kinds of jobs use data visualization? Is this skill only for folks working directly in tech?
The cool thing about data analytics is that the tools are built so that the non-technical user can very easily access the power of data analytics. Users can put visual analysis to work answering questions without any programming experience nowadays.
The range of uses for data visualization includes:
Product analytics – are products being used as designed?
HR and people analytics – is the next hire going to be a good fit? How can I know that objectively?
Marketing analytics – How do I optimize my ad spend to maximize lift on the next campaign?
This is just the start! The best part is that you don't have to be an analyst to ask these analytical questions. We're often teaching managers and leaders so that they can take the most advantage of the data resources they have.
One of the coolest applications of data visualization that I've seen over the years has been in journalism.
Let's dive in! You mentioned that Gartner defined the "4 V's of Data Visualization." Can you tell us about what those 4 V's are?
The incredible opportunity that we have in big data now is because of these four big V's of data that Gartner defined a few years ago. The four V’s are volume of data, variety of data from different sources, the velocity at which it's coming to us, and veracity or trustworthiness of data.
On the job, a Data Analyst pulls this data together into an analysis that answers key stakeholder questions. Their analysis moves the needle in some way – either by increasing revenue, decreasing cost, increasing competitive advantages, or decreasing risk.
Data can be collected by anything now – even your car is more computer than machine these days, spitting out tons of data. Appliances, Alexa, Google Home, and things like that generate many varieties of data, from structured data to graphic data to visual data to sound. All of these are data elements.
What kinds of tools do you use for data visualization?
There are hundreds of tools out there. It can be overwhelming. We've studied the market leaders in developing the course for Data Analytics and we've put it into three buckets:
Spreadsheets – We call spreadsheets the "analytics scratchpad." They're great for rapid prototyping, cleaning data, pivot tables, and graphs. They're not production-ready databases, though. They don't have the rigor to serve as a full small-to-medium business or enterprise data set. But they're good for a first cut at things.
Data Querying – This is where you get into bigger data sets. Excel's limit is roughly one million rows per worksheet. If you want to go above that, you use another tool called Structured Query Language (SQL). That's where you can look at millions, even tens or hundreds of millions of rows, query them with a human-readable language, and get the results in milliseconds.
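To make the "human-readable language" point concrete, here is a minimal sketch of a SQL query run from Python's built-in sqlite3 module. The table name, column names, and rows are all hypothetical and exist only for illustration; a real company database would have its own schema and far more rows.

```python
import sqlite3

# Hypothetical rows for illustration only; not taken from any real data set.
rows = [
    ("West", "Chairs", 120.0),
    ("West", "Phones", 300.0),
    ("East", "Chairs", 80.0),
    ("East", "Binders", 40.0),
    ("East", "Phones", 210.0),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE orders (region TEXT, sub_category TEXT, sales REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# The query reads close to plain English: total sales per region, largest first.
query = """
    SELECT region, SUM(sales) AS total_sales
    FROM orders
    GROUP BY region
    ORDER BY total_sales DESC
"""
sales_by_region = conn.execute(query).fetchall()
conn.close()

print(sales_by_region)
```

The same SELECT, GROUP BY, and ORDER BY pattern applies whether the table holds five rows or many millions; only the database engine behind it changes.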
Exploration Tools – You can bring that data into a data visualization and exploration tool like Tableau. We're actually in Tableau right now! I'm using something called Story Points in Tableau to share this presentation. Back to your point about journalism and data journalism, Tableau developed Story Points after talking to a number of visual journal publications that love to do data journalism! These are slides, but you can put interactive visualizations, charts, and dashboards in Story Points as well.
These are the market leaders that we are teaching in this course. Our goal is to help you develop a portfolio of work through the projects and the capstones so that you are market ready and you can go out and land one of those 5 million open jobs in the marketplace right now.
What does the data visualization process actually look like in Tableau?
The first thing you do when you work on data in Tableau is connect to the data. The cool thing about Tableau – and the reason that over 50% of Fortune 1000 companies use it as their primary tool for data analytics – is the democratization of data access that it brings to the table. Tableau connects to a variety of databases that you might find familiar: MySQL, MongoDB, Postgres. Plus, it's code-free! All you have to know is the credentials to get into your company’s database. You can also connect to popular file formats: Excel, Comma Separated Value Text Files (CSV), PDF.
Tableau has a couple of sample data sets that you can practice with. In the data source, you can see the sample data set that Tableau has provided here called "Super Store." It's 10,000 records of an orders database. I’ve created a quick data detective exploration in the dashboard using this "Super Store" data set.
What are the most important features of Tableau?
As a detective or as a journalist, you often ask the questions "Who?" "What?" "Where?" "When?" "Why?" and "How?" Those same questions apply to analyzing data. Let’s look at the interface of Tableau. It's drag and drop – code-free!
In the Visualization pane, you can see "Measures" which are numeric types of information.
You can use the "Dimensions" to slice and dice the Measures. I've got a scatter plot here of sales and profit for my overall data set. It's not interesting yet. There's only one point so far, the overall profit and sales from our "Super Store" data set.
If I want to slice and dice the data, I could grab the Customer Name and drag that into the Detail pane. These are called "Marks Cards." Detail allows me to slice and dice that data. Maybe I want to visualize this by coloring the Profit. Now I'm looking at the profitability of all of these customers.
I can use the built-in tooltips to hover over points and see that this customer, Tamara, is my most profitable. My other outlier here is Cindy Stewart. There are many ways that I can work with my details here to make it completely beautiful and elegant as I try to communicate my answers to the stakeholder questions.
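For anyone curious what dragging a Dimension onto Detail does conceptually, here is a rough Python sketch. It is not how Tableau is implemented; it only illustrates the idea that Measures (sales, profit) get summed once per value of a Dimension (customer), turning one overall point into one point per customer. The customer labels and numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical order lines: (customer, sales, profit). Illustrative values only.
orders = [
    ("Customer A", 500.0, 120.0),
    ("Customer B", 200.0, -30.0),
    ("Customer A", 300.0, 80.0),
    ("Customer C", 150.0, 10.0),
]

def totals_by_customer(order_lines):
    """Sum the measures (sales, profit) for each value of the dimension (customer)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for customer, sales, profit in order_lines:
        totals[customer][0] += sales
        totals[customer][1] += profit
    return {name: tuple(values) for name, values in totals.items()}

# With no dimension applied there is a single point: the overall totals.
overall = (sum(o[1] for o in orders), sum(o[2] for o in orders))

# With the customer dimension applied there is one point per customer.
points = totals_by_customer(orders)

# The customer with the highest total profit, i.e. the outlier you would hover over.
most_profitable = max(points, key=lambda name: points[name][1])

print(overall)
print(points)
print(most_profitable)
```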
So which questions can you answer with the "Super Store" data set?
Let’s start with: "Who are our most profitable customers by segment and region?" I'll add Region as a filter. Note that I'm not doing any programming here, I'm simply dragging and dropping.
My second question is "What products are those customers purchasing and how many?" I took the number of records and the subcategory of the products and dragged them into the pane here. I can make them into rows, sort them, or add more color-coded labels like quantity to see how many were bought. If I want to see the type of customers, I can color code that segment and drag it in. Now I can see the number of products bought by each of my customer segments: corporate, consumer, and home office.
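The region filter and the segment breakdown described above can be sketched in a few lines of Python. This is an illustration of the logic only, under the assumption that each record carries region, segment, sub-category, and quantity fields; the field names and the records themselves are hypothetical.

```python
from collections import Counter

# Hypothetical order records; field names are assumptions for this sketch.
records = [
    {"region": "West", "segment": "Consumer", "sub_category": "Chairs", "quantity": 2},
    {"region": "West", "segment": "Corporate", "sub_category": "Chairs", "quantity": 1},
    {"region": "East", "segment": "Consumer", "sub_category": "Phones", "quantity": 3},
    {"region": "West", "segment": "Consumer", "sub_category": "Phones", "quantity": 4},
    {"region": "West", "segment": "Home Office", "sub_category": "Chairs", "quantity": 5},
]

def quantity_by_subcategory_and_segment(rows, region=None):
    """Total quantity per (sub-category, segment), optionally filtered to one region."""
    counts = Counter()
    for row in rows:
        if region is not None and row["region"] != region:
            continue  # the region filter drops rows from other regions
        counts[(row["sub_category"], row["segment"])] += row["quantity"]
    return counts

everything = quantity_by_subcategory_and_segment(records)
west_only = quantity_by_subcategory_and_segment(records, region="West")

print(everything)
print(west_only)
```

Each key in the result corresponds to one colored bar segment in the chart: a sub-category row split by customer segment.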
The third thing I wanted to show you is a geovisualization – a map. This is super fun! I'm going to double-click on "State." Tableau understands geography cleanly and it pops up a map of the US. Suddenly I can see all of the profitability by State with a few clicks.
We can pull all of our visualizations onto one page and then connect them together. So I have my US map here, my consumers, and my product visualizations all on one page. When I click on Texas, the other graphs filter their data based on that filter! You can even filter by regions. In only five minutes I've created three visualizations, put together a dashboard and answered multiple stakeholder questions without any coding.
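The linked filtering on a dashboard can be pictured as several summaries computed from the same rows, where a click on one view passes a filter to the others. The sketch below is a conceptual illustration in Python, not Tableau's actual mechanism; the rows, the `profit_by` helper, and its parameters are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (state, segment, profit). Illustrative values only.
rows = [
    ("Texas", "Consumer", -50.0),
    ("Texas", "Corporate", 20.0),
    ("California", "Consumer", 90.0),
    ("New York", "Home Office", 60.0),
    ("California", "Corporate", 40.0),
]

def profit_by(data, column, selected_state=None):
    """Total profit grouped by one column (0 = state, 1 = segment),
    optionally limited to a single selected state."""
    totals = defaultdict(float)
    for row in data:
        if selected_state is not None and row[0] != selected_state:
            continue  # rows outside the selected state are filtered out
        totals[row[column]] += row[2]
    return dict(totals)

map_view = profit_by(rows, column=0)      # profit per state, as on the map
segment_view = profit_by(rows, column=1)  # profit per segment, before any click

# "Clicking" Texas on the map re-filters the segment view.
segment_view_texas = profit_by(rows, column=1, selected_state="Texas")

print(map_view)
print(segment_view)
print(segment_view_texas)
```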
What are your favorite resources for beginners getting started in data visualization and data analysis?
Data.World - This is one of the leading data catalogs and collaboration tools. They work cleanly with Tableau. They offer over 400,000 data sources that you can freely access and work on to understand what it is you're trying to do when you're answering questions.
Ryan Sleeper - Ryan is one of a few certified Tableau "Zen Masters." That's the top certification level of Tableau training you can receive in the world. He puts out a free weekly newsletter that I love, with Tableau tips that you can use.
Is there a free version of Tableau that anyone could use?
Tableau Public is a free fully-functional version of Tableau. You can download Tableau for Mac or Windows. You can go to the Tableau Gallery and see some daily inspiration of what other Data Analysts have built! This is some incredible data journalism. You can download other projects to deconstruct, reconstruct, and learn from.
Do you have any advice for complete beginners who want to work with data?
I typically talk to people that fall into one of two buckets. One is brand new to data and they want to see and explore. The other type of learner wants to upskill and move up in their career.
If you’re brand new to data and you're pivoting into it, I invite you to one of the Free Friday Intro to Data Analytics Webinars. I'll take you through creating hands-on basic analysis in a spreadsheet. If you like what you saw of Tableau, there are one-day, 6-hour bootcamps available specifically about Tableau from General Assembly.