From the course: Microsoft Power BI Data Analyst Associate (PL-300) Cert Prep by Microsoft Press
Use grouping, binning, and clustering - Power BI Tutorial
- [Instructor] In this sublesson, we're going to use grouping, binning, and clustering. So let's take a look at the overview of grouping. When Power BI Desktop creates visuals, it aggregates your data into chunks, or groups, based on values found in the underlying data. You can refine how those chunks are presented with grouping. The group type is list for grouping.
All right, so now let's take a look at binning. You can also define a bin size to put values into equally sized groups, which better enables you to visualize data in meaningful ways. This is often called binning, and the group type is bin for binning. Using binning, we can set bins for numerical and time fields. You can set the bin type to size of bins or number of bins. You can make bins for calculated columns, but not for measures.
And finally, let's take a look at clustering. Clustering is a method of identifying similar groups of data in a dataset. Data points in each group are more like the other data points in that group than those in other groups. And in Power BI, we can use the scatter chart to perform clustering. So let's go to Power BI Desktop and take a look.
Okay, so I am in Power BI Desktop here, and we can see that I have a very simple visual with my total sales on the Y axis and the different regions that are part of our dataset, our sales territories, along the X axis. And if I want, I can pick different data points here. So I can maybe say United Kingdom and Germany. So let's just pick these two. And what if I wanted these two presented together? I can right-click on this after I've chosen those and say group data. And what that's going to do is group that data into a group called Germany, United Kingdom, and all the other data points are referred to as other. So that's what I can do by default. So I'm just going to go ahead and delete this. Actually, let's go like this instead.
So what I'm going to do is turn this off, and I'm going to put this region that I have here on the X axis. Let's just go ahead and do that and get rid of this. And now we're going to take a look at some different ways that I've grouped data, based on this group that I've already set up. So what I'm going to do is click on this region group, right-click, and say edit group. And here we can go in and take a look at this in further detail.
So the name of this is region, and the field that we have this set up on is region. The group type is list, and we cannot change that, if you recall that from the slides. And we can see here that I've got a number of values on the left-hand side that are ungrouped. So they do not belong to a group. And what I had done here is I went through and set up a number of groups. So I've got the Commonwealth, which is Australia, Canada, and United Kingdom, and EU, which is France and Germany. And I've gone through and set up a number of other groups.
So I could, if I want, ungroup certain data points from this Commonwealth. So if I want to get rid of the United Kingdom from there, I can do that. Or, if I want to put it back, I can go back over here, click on the Commonwealth, and say group. And that will put the United Kingdom back over here. So that's how I can initially set my groups up and then move things back and forth.
One other thing that I want to point out here is that I can turn on this little checkbox at the bottom of the screen called include other group. So I can create my own groups if I want, or I can automatically set up an other group, which will contain all other ungrouped values, just in case any come in that I had not accounted for. So let's go ahead and click okay. And now we can see what that data looks like with this grouping here.
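
To make the list group idea concrete outside of Power BI, here is a minimal Python sketch of the same logic: each region is mapped to a named group, and anything left ungrouped falls into an other bucket when that option is on. The sales figures are made up, the function names `build_lookup` and `total_by_group` are hypothetical, and the behavior with the other group turned off (ungrouped values stay as themselves) is an assumption, not a statement of how Power BI works internally.

```python
from collections import defaultdict

# Group definitions mirroring the two groups named in the demo.
GROUPS = {
    "Commonwealth": ["Australia", "Canada", "United Kingdom"],
    "EU": ["France", "Germany"],
}

# Made-up sales figures, for illustration only.
SALES = [
    ("Australia", 100.0),
    ("Canada", 80.0),
    ("United Kingdom", 60.0),
    ("France", 50.0),
    ("Germany", 40.0),
    ("United States", 200.0),
]


def build_lookup(groups):
    """Invert {group: [members]} into {member: group}."""
    return {member: name for name, members in groups.items() for member in members}


def total_by_group(rows, groups, include_other=True):
    """Sum sales per group.

    Ungrouped regions go to "Other" when include_other is True;
    otherwise they are kept under their own name (an assumption).
    """
    lookup = build_lookup(groups)
    totals = defaultdict(float)
    for region, amount in rows:
        if region in lookup:
            key = lookup[region]
        elif include_other:
            key = "Other"
        else:
            key = region
        totals[key] += amount
    return dict(totals)


totals = total_by_group(SALES, GROUPS)
for name, amount in totals.items():
    print(name, amount)
```

Moving United Kingdom out of the Commonwealth group, as in the demo, would correspond to removing it from the `GROUPS` list and recomputing the totals.
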
Now I can do something similar with hierarchies, which we talked about in an earlier sublesson, but this is how we could do it using the grouping functionality.
Okay, so I am on this visual here in my Power BI Desktop, and this is where I'm going to perform binning. So in this case, on my X axis, I have the day number of year, so the days across the year, and the total sales by day number of year. And we can see this is an awful lot of data points. So what if we want to group this into different sections? For this, we're going to set up some bins using the binning functionality.
So how we're going to do this is I'm going to go over to my order date here. I am going to go down here and find my day number of year right here. And this is where I initially have that data point, but I've also already gone ahead and created the bins for day number of year. So I'm going to just right-click on this and edit the group to show you what we've done here. So this is the name of the group here, and we can see the field is based on day number of year. The group type is bin for this. The bin type is size of bins. So I can go here and choose the size of the bins that I want, or I can choose the number of bins. So with size of bins, I can go down here and say how many values I want in each bin. Or I can go down here and say number of bins, and then it will automatically give me some value here. And this is the way that I can set up a different number of bins.
Let's just go back over here and go down to the size of bins. And let's just choose 15 to start with. So I'm going to go like that. I'm going to go over here, and let's just take the day number of year off the X axis and then drop this new grouping in here. So what we can see here is that the size of each bin is 15. So here are the total sales for each bin of 15 days of the year.
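
As a rough sketch of what a size-of-bins group does, the Python below assigns each day number to a bin that starts at a multiple of the bin size and then sums sales per bin. The daily sales data is made up, the names `bin_start` and `total_by_bin` are hypothetical, and labelling each bin by its lower bound is an assumption for illustration rather than a description of Power BI's implementation.

```python
from collections import defaultdict


def bin_start(value, bin_size):
    """Return the lower bound of the bin holding value (assumed labelling)."""
    return (value // bin_size) * bin_size


def total_by_bin(rows, bin_size):
    """Sum amounts per bin, returning {bin lower bound: total} in ascending order."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[bin_start(day, bin_size)] += amount
    return dict(sorted(totals.items()))


# Made-up data: a flat 10.0 in sales for each day number 1..365.
daily_sales = [(day, 10.0) for day in range(1, 366)]

binned = total_by_bin(daily_sales, 15)
for start, total in binned.items():
    print(start, total)
```

With a fixed bin size, the number of bins follows from the range of the data, so the first and last bins can hold fewer days than the rest.
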
So this will go through and show me what this looks like through the entire year. I could go over here, so let's just go back over and say edit the group. And let's just say that instead of the size of bins, I want to say number of bins. And perhaps I want to make the bin count, let's just say, 12 bins. Let's see what that looks like. So I can go like this. And now we'll see that the data has been set up differently. And I should see here that I have a dozen different bins in my data. So I have one, two, three, four, five, six, seven, eight, nine, 10, 11, 12. So here I've gone through and I have 12 different bins, and each one of these will contain a different number of data points.
Okay, so that brings us to the end of this sublesson on using grouping, binning, and clustering.
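
For the number-of-bins option shown in the demo, here is a minimal Python sketch under a stated assumption: the range between the smallest and largest value is split into equal-width bins, and the largest value is placed in the last bin. The helper name `bin_index` is hypothetical, the day numbers are made up, and Power BI's exact boundary handling may differ.

```python
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Map an integer value in [lo, hi] to a bin index in 0..n_bins-1.

    Assumes equal-width bins across the full range; the maximum value
    is clamped into the last bin.
    """
    if hi == lo:
        return 0
    index = (value - lo) * n_bins // (hi - lo)
    return min(int(index), n_bins - 1)


# Made-up data: day numbers 1..365.
days = list(range(1, 366))
lo, hi = min(days), max(days)

counts = Counter(bin_index(day, lo, hi, 12) for day in days)
for index in sorted(counts):
    print(index, counts[index])
```

Because 364 does not divide evenly by 12, the bins in this sketch do not all hold the same number of days.
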