From “Zombie Data” to “Smart Reports”
October 4, 2021
The bane of my IT existence is a business user who says, “Please get me the latest version of <random Excel file I have never seen before, named using idiosyncratic or ambiguous words>. Oh, and I need it tomorrow or else we won’t {make our numbers | pass our audit | satisfy the board}.”
I call this “zombie data” because it:
- Lacks any self-awareness
- Doesn’t remember where it came from
- Has no relationship to its current context
- Infects everyone it touches with that same mindlessness.
The best current alternative is when I can replace a manual spreadsheet with “live data” from NetSuite. This typically takes the form of a Saved Search that is:
- Identified by a unique URL
- Emailed to the End Users every month (or more often)
- Presented either inline or via a CSV attachment
- Described by a brief introductory sentence
This is a huge step forward over “zombie data”, but it has a few limitations:
- I can either show the data inline, or as a CSV, but not both
- There is no way to create or embed a graph or supporting tables
- None of the column labels are defined
- Once the user downloads the CSV, I lose all linkage to when and where it came from (other than the filename, if they are lazy enough to keep it)
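One way to keep that linkage inside the file itself is to write the provenance as comment lines above the CSV header. This is only a sketch under assumptions: the function names, the `# key: value` convention, and the example URL are all hypothetical, and nothing here is a NetSuite feature.

```python
import csv
import io
from datetime import datetime, timezone


def stamp_csv(rows, header, source_url, pulled_at=None):
    """Return CSV text with provenance comment lines above the header."""
    pulled_at = pulled_at or datetime.now(timezone.utc)
    out = io.StringIO()
    # Hypothetical convention: one "# key: value" line per provenance fact.
    out.write(f"# source: {source_url}\n")
    out.write(f"# pulled_at: {pulled_at.isoformat()}\n")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def read_provenance(text):
    """Recover the provenance pairs from the leading comment lines."""
    meta = {}
    for line in text.splitlines():
        if not line.startswith("# "):
            break  # first non-comment line is the real CSV header
        key, _, value = line[2:].partition(": ")
        meta[key] = value
    return meta


# Made-up example row and URL, for illustration only.
text = stamp_csv(
    rows=[["SO-1", "shipped"]],
    header=["order", "status"],
    source_url="https://example.invalid/search/123",
    pulled_at=datetime(2021, 10, 1, tzinfo=timezone.utc),
)
print(read_provenance(text))
```

The trade-off is that not every tool skips comment lines by default. pandas can, if `read_csv` is given `comment="#"`, but a spreadsheet will show them as ordinary rows.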
Smart Reports
As we are contemplating moving to a SyncHouse with a Reporting Control Center, my dream would be to go one step beyond “live data” to a Smart Report that is:
- Accessible either as a self-contained email or a persistent URL
- Available inside or outside our organization, without needing per-user licenses
- Triggered by a schedule or a condition
- Versioned, so it is trivial to view prior months’ editions
- Documented with table- and column-level “tool tips”
- Aggregating multiple tables, charts, and summaries
- Editable as a Google Sheet containing a metadata/provenance tab
- Perhaps even with pre-defined filters
Example: Customer Deployment
For example, one common request is to know how effectively we are deploying to a particular high-profile customer. This requires knowing:
- Status of Orders: placed, shipped, billed
- Status of Devices: received, installed, returned
I could imagine the Smart Report for this as:
- Emailed monthly to executives, and weekly to the deployment team
- Precisely defining terms like “billed” and “installed”
- Two bar charts (Orders and Devices), showing monthly trends
- Inline data for the most recent month
- A Google Sheet with all the raw data behind those bar charts
Initially, the static URL could just be a Quilt Data package. Long-term, my hope is to evolve Lightdash into an open-source collaborative data portal à la Hex.
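The monthly trends behind the Orders and Devices bar charts come down to counting status events per month. A minimal sketch, assuming each record is a (date, status) pair; the function name and the sample orders are made up for illustration:

```python
from collections import Counter
from datetime import date


def monthly_status_counts(events):
    """Count events per (YYYY-MM, status) pair."""
    counts = Counter()
    for when, status in events:
        counts[(when.strftime("%Y-%m"), status)] += 1
    return counts


# Made-up order events; real ones would come from the Saved Search.
orders = [
    (date(2021, 8, 3), "placed"),
    (date(2021, 8, 17), "shipped"),
    (date(2021, 9, 2), "placed"),
    (date(2021, 9, 9), "placed"),
    (date(2021, 9, 28), "billed"),
]

counts = monthly_status_counts(orders)
for (month, status), n in sorted(counts.items()):
    print(month, status, n)
```

The same function would serve for Devices (received, installed, returned), which is part of the appeal: one definition of "a monthly count" shared by both charts.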
Addendum: Data Stories
Every Smart Report is a self-contained Data Story that should include:
- Objective: Why this report was created, and for whom
- Summary: Graphs and Tables that fulfill that Objective
- Freshness: When the source data was pulled
- Definitions: What key terms mean and how they were derived
- Validation: Checks, results and approvals showing why (and when) we can trust this report
- Sources: Tables, Analyses, and other Datasets used to construct this report
- Configuration: parameters used to generate (and regenerate) this report
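The list above can be read as a record type. Here is one possible sketch, assuming a plain Python dataclass with one field per section; the class name, field types, and the `missing_sections` helper are my own illustration, not an existing tool's schema:

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class DataStory:
    objective: str = ""                                         # why, and for whom
    summary: List[str] = field(default_factory=list)            # graphs and tables
    freshness: Optional[datetime] = None                        # when data was pulled
    definitions: Dict[str, str] = field(default_factory=dict)   # term -> meaning
    validation: List[str] = field(default_factory=list)         # checks and approvals
    sources: List[str] = field(default_factory=list)            # tables, datasets
    configuration: Dict[str, str] = field(default_factory=dict) # regeneration params

    def missing_sections(self):
        """Names of sections that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up draft report with only two sections filled in.
draft = DataStory(
    objective="Deployment progress for one customer, for executives",
    freshness=datetime(2021, 10, 1),
)
print(draft.missing_sections())
```

A check like `missing_sections` could gate publication, so a report never goes out without, say, its Definitions or Sources.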