15.11.13

Excel and Tableau: When a Table is just a Table (and not data)


There's all kinds of 'analysts', but perhaps the most prolific is the Excel Analyst.  That pretty much encomapses everyone who works with computers.  Excel has provided everyone with the ability to conduct a wide range of analysis and reporting for years.  Anyone can make a chart or table in Excel - you don't need an understanding or training in data analysis.  However, things can get confusing when we move to an analysis tool like Tableau.
Part of the problem is that most of us use the terms Table and Data interchangeably (because we can).  Here's the rub:
Tables, charts and maps are presentations of Data.
In relational databases, Data is stored in Tables.
So a Table can present Data or contain it.
The problem occurs when we try to take a presentation Table and use it as Data for analysis to make a presentation Table or Chart.  Confused?  Yeah, don't worry, lot's of people get confused by this.  It's why most Open Data is available as a presentation Table and not Data.



It would be nice if all open data was available to us in a Data format, but often it's not.  In order to connect the data to Tableau (or even create a pivot table in Excel), we have to delete empty rows and columns and arrange fields into the proper orientation.  Tableau provides a great (free!) data reshaping add-in tool for Excel which takes a lot of the pain out of the process.

Think of the Excel table as data that you can update with new data in the future and if your calculations are in Tableau, they will automatically happen, without extra steps or extra opportunities for error.  MANUAL MANIPULATIONS IN EXCEL = INCREASED CHANCE OF ERROR that can be very difficult to discover until it's too late.  Especially VLOOKUP.



Before you can reshape the data you need to remove all the blank rows and make sure your field names are in row 1. Then you place the cursor in the first cell with a numeric value and re-shape.  Your table is now data.   For fields like this date field which are text strings, you can re-code this easily in Tableau.  Don't create any fields with formulas or calculations in Excel first.  Especially not VLOOKUP.

BTW, have I mentioned that I hate VLOOKUP?  Here's why.  It is perhaps the most used and abused formula in Excel's arsenal.  VLOOKUP requires proper sorting in order to return accurate results.  If you don't know what I mean by that, you need to stop using the formula immediately.  When I receive a workbook that has the formula, I have to check to ensure that it is being used correctly before I can use the data because VLOOKUP doesn't readily let you know that it's not returning exact match results.  To learn more about this, I highly recommend this post by Charley Kyd's at ExcelUser.com.