Tuesday, 15 July 2014

Packt Tableau Dashboard Cookbook errata and completed files download


Some of you know that I wrote a book on Tableau, called the Tableau Dashboard Cookbook.

I am writing to apologise for some errors that occurred at the point of production. These were not in my control but I do feel very sad about it. I put a year of effort into this book, and it is personally very disappointing for me that it turned out this way. At some point during the final editing and posting to print process, for some unknown reason, some text and some images got mixed up on Chapter 2. I can only apologise for this issue and I have no idea why this occurred.

By way of reparation, I have posted the correct chapter here, with additional images and additional notes to make it easier for people. These things happen and I am sad about it, and I can only try to help people who have got caught up in this issue.

If you'd prefer to download the actual working examples, you can get the working examples for the book from here.

 

Grouping your data with calculations


In the first topic, we specified communication as one of the key features of a dashboard. We need to be able to share the right information to the right audience, at the right time, to the right people, in the right format.

Sometimes, the data needs to be translated so that it matches the business rules and the business understanding of the organization. Tableau offers a number of different ways that help you to translate the data into something that the business decision makers will understand.

Grouping is one way of making data meaningful to the business. In this topic, we will look at grouping some dimension members into a single member. Rolling up some of the members in one dimension is a good way of summarizing the data for dashboards.

In this recipe, we will look at grouping dimension members, and then we will look at more complex grouping of calculations. The business question is an investigation into the characteristics of customers into those who have children, and those who do not. We will group the NumberChildrenAtHome dimension members into the group of customers who have children, and those who do not.

Getting ready


We will need to add in a new data source for this topic.

How to do it...


1. We will connect to an Excel file data called DimCustomer.xls and import it into Tableau's internal data store mechanism. To do this, navigate to Data | Connect to Data.

2. Then select the link Text File and a file browser will appear. Navigate to the folder where you stored the downloaded Excel and CSV files.

3. Navigate to the DimCustomer.csv file and select it.

4. In the Excel Workbook Connection dialog box, change the Step 4 name so that it says Connection_DimCustomer.

5. You can now see the new data source connection on the Data pane.

6. Click on Ctrl + M to get a new worksheet in Tableau. Alternatively, go to the Worksheet menu item and select New Worksheet.

7. Click on the Connection_DimCustomer source and drag Number of Records from the Measures pane to the Rows shelf.

8. Drag the NumberChildrenAtHome dimension attribute to the Tableau canvas, to the left of the NumberOfRecords column. You can see an example in the next screenshot.


 

 

8.     Click on the table in the Show Me Panel. When we look at the NumberChildrenAtHome dimension, we see the following members and the number of customer records associated with each member:



9. We will group the dimension members so it is easier to see which customers have children, and which do not.

10. Go to the dimension called NumberChildrenAtHome and right-click on it. You will see the following menu:



 
11. This produces the Create Group dialog box, which you can see in the following screesnhot:



 
12. In the Field Name field, enter the name of the field.

13. Multi-select the numbers 1 through to 5 by holding the Shift key and clicking to select more than one number at a time.

14. Click on Group, rename the group to CustomersWithChildren, and click on OK.

15. You can then see the new group on the left-hand side, on the Dimensions pane. Drag your new grouping to the Rows shelf.

16. Remove the NumberOfChildrenAtHome pill that is NOT a group. The NumberOfChildrenAtHome group should be on the Rows shelf.

17. Then choose Number of Records from the Rows shelf and put it into the Columns shelf.

18. You can see that the table now only has two rows in it; one with a zero, and the other row has CustomersWithChildren.

19. In order to make it clear, click on the zero and select Edit Alias.

20. Rename the zero to Customers with No Children.

21. You can also rename the Group alias to Customers With Children so that it matches the format used elsewhere. To do this, in the table, select the numbers 1 to 5 and right-click on them.

22. In the pop-up menu, select Group.

23. Right-click on the new group and select Edit Alias.

24. In the textbox, enter Customers With Children and click on OK.

25. It is now clear from the table that over 11,000 customers have no children, whereas just over 7,000 customers do have children. We can visualize this information in a better way, and we will do this for the rest of the exercise.

26. To do this, change the visualization to a heat map using the Show Me panel and click on the stacked bar chart component. Click on the Swap button so that the bar appears  in a horizontal line.

27.  Drag the group to the Label button on the Marks shelf.

28. We can then change the colors by dragging Number of Records to the Color panel on the Marks shelf.

29. In order to show the difference between the customers who do and who do not have children, the custom diverging color palette has been selected here.

30. Two diverging colors of orange and blue are selected. Orange is selected for Customers With No Children because there is a greater number of customers who do not have children; brighter and more intense colors are often used in order to denote higher values, as shown:


 

 

THIS IS a piece which people were missing. This paragraph was originally located here, and then it was moved to the beginning of the chapter. I think that people are missing it. However, the chapter can look disjointed but this recipe is actually two recipes joined together.

The first part focuses on groups by dimensions; the second part focuses on groups by measures. It is included here again, which is where I had it originally.

Then, we will look at a more advanced example of grouping the data by the measure, rather than the dimension. To do this, we can create a calculation that will distinguish the values that are below the average sales amount and above the average. Results that are classified as above average are labelled as Above or Equal to Average, and below average sales are labelled as Below Average. We can then use this calculation to convey a visual message to the business user; for example, we could color the above average sales in one color and the below average sales in another in order to make the distinction easily identifiable.

31. To do this, stay in the Chapter 2 workbook and continue to work in the existing worksheet.

32. Click on the Connection_FactInternetSales data connection at the top left hand side.

33. Click on the broken chain next to the CustomerKey in order to activate the relationship.

34. Locate and Right-click on the SalesAmount measure.

35. Select the measure Create Calculated Field which you can see illustrated in the following screenshot:

 
 


36. When you select this option, you will get the following dialog box:

 
36. In the Name field, enter the name of the calculated field SalesAboveOrBelowAverage.

37. In the Formula field, we will put in a formula that will calculate whether or not the sales amount is above or below the average amount. The formula is as follows:

IF ( SUM([SalesAmount]) - WINDOW_AVG(SUM([SalesAmount]), First(), Last() ) < 0 )

Then Below Average

Else Above or Equal To Average

END

38. When you've placed the formula into the Calculation editor and clicked on OK, you will be returned to the main Tableau interface. You will see your new calculation on the left-hand side in the Measures pane.

39. Remove the NumberChildrenAtHome label from the Marks shelf.

40. Remove SalesAmount from the Dimension pane to the Rows shelf.

41. Drag the SUM(SalesAmount) measure to the Columns shelf.

42. Drag NumberCarsOwned from the Dimension pane to the Rows shelf.

43. Drag your new calculation SalesAboveOrBelowAverage to the Color button on the Marks shelf.

44. Your screen should now look like the following:



To summarize, we have created a calculation that is meaningful to the business user. It drives the color display of the measure, which can help the business user to understand more efficiently. To summarize, it is simple and effective to conduct a grouping of dimension members into a binary grouping. This is useful for dashboards in order to provide an "at a glance" metric visualization that shows the organization has more customers who do not have children, than those who do.

Essentially, this formula uses the WINDOW_AVG function to work out the average of the values that are in the Tableau view of the data. Basically, this average works out the value of the data that is viewable in the Tableau canvas and does not include data that has been filtered.

It uses First() and Last() to work out the average over all the data from the first row right until the last row. The calculation takes the current SalesAmount value and compares it to the average SalesAmount.

How it works...


Tableau allows you to group data together by simply arranging fields of your data source on a Tableau worksheet.

When you group fields in a worksheet, Tableau queries the data using standard drivers and query languages (such as SQL and MDX). It then groups data together wherever necessary. Finally, it presents a visual analysis of the data.

 
Post a Comment