From the course: Actionable Insights and Business Data in Practice

Choose and prepare graphics for insight

- [Instructor] Humans are visual creatures, so showing people your data in graphs is probably the best way to communicate the overall message of your analysis. But you need to keep it simple, and that's a challenge. Remember, the ultimate goal of what you're doing is not to present a comprehensive analysis of the data. Now, that may seem surprising, because analysis is probably in your job title, but your goal is to get actionable insights: the things that your stakeholders can do as a result of your analysis. So focus on the end, the actionable insights, and not just on the means, the data analysis.
When you are showing your data, there are a few things you can do to keep it simple. You can use bar charts. They are universal, easily understood, and easy to create. They're great for comparing groups and quantities: how many people are in this group, or what's the average? You can also use line charts, which are great for showing changes over time, say over the last five years. And maybe you can use scatterplots, which are good for showing the relationships between variables, but I'll explain that "maybe" in just a second.
First off, let's begin with bar charts. You want to keep them as easy as possible. You can do vertical bar charts, and you see I've done these in Google Sheets because it keeps things really, really simple. The chart is flat: there's no false third dimension, there are no outlines, and there's no funny stuff going on with the colors. I've made it as clean and easy as possible. I've also arranged the bars from the highest on the left, by the axis, going down to the lowest on the right. That's usually easiest. If you have long labels, then sometimes it's better to do the bars horizontally so the labels are in the same orientation as the bars themselves. You can also do grouped bar charts, where you're looking at different categories and the values for each of those categories.
So for instance, what I have right here is data from the psychological profiles of the 48 contiguous United States, broken down by region (Midwest, Northeast, South, and West) and by three personality profiles: friendly and conventional in blue, relaxed and creative in red, and temperamental and uninhibited in yellow. You can see that there's a big difference as we switch from one area to another.
Line charts are also really simple, and again, I've done this to make it as easy as possible. I have one vertical reference line for every year. I just have the blue line. Again, there is no false third dimension. There's no changing of the colors. What this is showing is data from Google Trends on the number of searches for the term AI over the past five years in the US, and you can see it took off dramatically in the last year. That's because first, here, we had DALL-E and the generative AIs that could make pictures from words, and then this is where ChatGPT came out, and that's when it just went to a hundred percent.
You can also do line charts with multiple values, as long as there's a strong reason to include all of them together, because it can get kind of messy. This one's a little bit messy, but we're showing three different terms, also from Google Trends, for the same five-year period: machine learning in blue, data science in red, and artificial intelligence, spelled out, in the yellow or gold. What you can see from that is that machine learning and data science are almost exactly the same in terms of the prevalence of search, and that artificial intelligence is only about half as high. That gives you an idea of the relative popularity of these things over a few years, and so a line chart with multiple lines can be a really good thing. It does usually require a little bit of explanation when you show it to your stakeholders.
As for scatterplots, I say "maybe" here because it sometimes confuses people to see these things.
I know that as an analyst you see these as sort of your bread and butter, and they're easy to do, but not everybody knows how to read them, or that each dot represents a combination of values. So for instance, what we have here are the 48 contiguous United States, with a dot for each state that shows how common business intelligence was as a search term there. That's going across the X axis, with data science going up the Y axis, and you can see they're associated, though there are outliers. Or here's another one, where we take the population of a state in millions and compare it to the searches in Google for business intelligence. Really, what we get out of this is that there are a couple of really big states, California, Texas, Florida, and New York, and that while there is an overall uphill pattern, it's not huge. And there's Washington State up there, leading the searches for business intelligence even though it's a relatively small state compared to the others.
The important thing to keep in mind is the difference between analysis and presentation. When you're analyzing data, use anything that helps. When I'm working with data, I use box plots and density plots and scatterplot matrices, and decision trees, and dendrograms for cluster charts. Anything that helps you as the analyst find the meaning in the data is going to be great. The question is whether you want to present it to your stakeholders, and the answer, a lot of the time, is no. You don't need to show them all the drafts of your report. They just need to see the final thing.
Now, I want you to know, this is kind of hard for me because I have a background in design, and there are some things in the data visualization world that I love. I think they're beautiful and amazing, but I wouldn't want to use them in my presentations. That's going to be true for a lot of the really foundational work in data visualization by Edward Tufte, or this great website, the Data Visualisation Catalogue, which gives you 60 different kinds of visualizations, most of which will just be confusing to your audience. Or in the art world, there's the Dear Data Project, which is handmade data visualizations on postcards. It's an amazing project, and it's in the Museum of Modern Art. Or one of my favorites is Jose Duarte's Handmade Visualizations, where he would put physical objects in space and then label them. Those are all really, really cool. I love them, but when it comes to communicating actionable insights to stakeholders who have limited attention, these are usually not the ones that you're going to want to use.
Again, when you're presenting to stakeholders and you are trying to keep the attention on the actionable insights, and not on the data and not on your process, you want to keep it as simple as possible while still getting your message across. For presentations, use bar charts and line charts freely. They're easily understood. They can make your recommendations clear. Use other graphs sparingly, really only with justification. You've got to have a reason for including a scatterplot or a cluster chart in your presentation because, again, it might create more confusion and more problems than it solves. Also, I suggest that you use one graph at a time, and that throughout, your emphasis should always be on actionable insights. That is the end, the goal, the point of your analysis. The presentation can provide the support and justification for your actionable insights, and that's the way you want to keep it.
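The bar chart advice in this transcript (sort the bars from highest to lowest, keep the chart flat with no outlines, and switch to horizontal bars when labels are long) could be sketched in Python roughly as follows. This is only an illustrative sketch, not the instructor's method: the course uses Google Sheets, and the function names, the 12-character label cutoff, and the choice of matplotlib for drawing are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def sort_bars(values: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order categories from the highest value to the lowest."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


def pick_orientation(labels: List[str], max_label_len: int = 12) -> str:
    """Return 'horizontal' if any label is long, otherwise 'vertical'.

    The 12-character cutoff is an arbitrary assumption, not a rule
    taken from the course.
    """
    has_long_label = any(len(label) > max_label_len for label in labels)
    return "horizontal" if has_long_label else "vertical"


def plot_bars(values: Dict[str, float], title: str):
    """Draw a flat, single-color bar chart, sorted highest to lowest.

    matplotlib is imported here so the helpers above work without it.
    """
    import matplotlib.pyplot as plt

    ordered = sort_bars(values)
    labels = [label for label, _ in ordered]
    heights = [value for _, value in ordered]

    fig, ax = plt.subplots()
    if pick_orientation(labels) == "horizontal":
        # Horizontal bars keep long labels readable.
        ax.barh(labels, heights, color="tab:blue", edgecolor="none")
        ax.invert_yaxis()  # put the largest bar at the top
    else:
        ax.bar(labels, heights, color="tab:blue", edgecolor="none")

    ax.set_title(title)
    # Remove the top and right borders to keep the chart uncluttered.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    return fig
```

With made-up values such as `{"A": 3, "B": 10, "C": 7}`, `sort_bars` returns the pairs in the order B, C, A, which is the left-to-right order `plot_bars` would draw them in.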
