From “Zombie Data” to “Smart Reports”

October 4, 2021

The bane of my IT existence is a business user who says, “Please get me the latest version of <random Excel file I have never seen before, named using idiosyncratic or ambiguous words>. Oh, and I need it tomorrow or else we won’t {make our numbers | pass our audit | satisfy the board}.”

I call this “zombie data” because it:

  • Lacks any self-awareness
  • Doesn’t remember where it came from
  • Has no relationship to its current context
  • Infects everyone it touches with that same mindlessness

The best current alternative is when I can replace a manual spreadsheet with “live data” from NetSuite. This typically takes the form of a Saved Search that is:

  • Identified by a unique URL
  • Emailed to the End Users every month (or more often)
  • Presented either inline or via a CSV attachment
  • Described by a brief introductory sentence

This is a huge step forward over “zombie data”, but it has a few limitations:

  • I can show the data either inline or as a CSV, but not both
  • There is no way to create or embed a graph or supporting tables
  • None of the column labels are defined
  • Once the user downloads the CSV, I lose all linkages to when and where it came from
    • Other than the filename, if they are lazy enough to keep it

Smart Reports

As we are contemplating moving to a SyncHouse with a Reporting Control Center, my dream would be to go one step beyond “live data” to a Smart Report that is:

  1. Accessible either as a self-contained email or a persistent URL
    • Available inside or outside our organization, without needing per-user licenses
    • Triggered by a schedule or a condition
    • Versioned, so it is trivial to view prior months’ editions
  2. Documented with table- and column-level “tool tips”
  3. Aggregating multiple tables, charts, and summaries
  4. Editable as a Google Sheet containing a metadata/provenance tab
    • Perhaps even with pre-defined filters
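
To make the metadata/provenance tab a little more concrete, here is a minimal sketch in Python. It assumes the tab is nothing more than key/value rows recording where the data came from, when it was pulled, and which filters were applied. The function names, the example URL, and the filter values are all hypothetical, and actually uploading the rows to a Google Sheet is left out.

```python
import csv
import io
from datetime import datetime, timezone


def provenance_rows(source_url, pulled_at, filters):
    """Build key/value rows for a metadata/provenance tab (hypothetical layout)."""
    rows = [
        ("source_url", source_url),
        ("pulled_at", pulled_at.isoformat()),
    ]
    # One row per pre-defined filter, so a reader can see how the data was narrowed.
    for name, value in sorted(filters.items()):
        rows.append((f"filter.{name}", str(value)))
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text, ready to load into a separate tab."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(("key", "value"))
    writer.writerows(rows)
    return buffer.getvalue()


rows = provenance_rows(
    source_url="https://example.com/reports/customer-deployment",  # made-up URL
    pulled_at=datetime(2021, 10, 1, 6, 0, tzinfo=timezone.utc),
    filters={"customer": "Acme", "period": "2021-09"},  # made-up filters
)
provenance_csv = to_csv(rows)
```

Because the provenance travels inside the same file as the data, a downloaded copy still remembers where it came from, which is exactly what “zombie data” cannot do.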

Example: Customer Deployment

For example, one common request is to know how effectively we are deploying to a particular high-profile customer. This requires knowing:

  1. Status of Orders: placed, shipped, billed
  2. Status of Devices: received, installed, returned

I could imagine the Smart Report for this as:

  1. Emailed monthly to executives, and weekly to the deployment team
  2. Precisely defining terms like “billed” and “installed”
  3. Two bar charts (Orders and Devices), showing monthly trends
  4. Inline data for the most recent month
  5. A Google Sheet with all the raw data behind those bar charts
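
The monthly trend data behind those bar charts could be sketched roughly as follows in Python. This assumes each status change arrives as a (date, status) pair, which is my simplification rather than how any particular system exports it. The function name and the sample events are made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

ORDER_STATUSES = ("placed", "shipped", "billed")


def monthly_status_counts(events, statuses):
    """Count events per month and status; events are (date, status) pairs."""
    counts = defaultdict(Counter)
    for when, status in events:
        if status not in statuses:
            raise ValueError(f"unknown status: {status!r}")
        counts[when.strftime("%Y-%m")][status] += 1
    # Fill in zeros so every month has a bar for every status.
    return {
        month: {s: counts[month][s] for s in statuses}
        for month in sorted(counts)
    }


# Made-up sample events, for illustration only.
order_events = [
    (date(2021, 8, 3), "placed"),
    (date(2021, 8, 17), "shipped"),
    (date(2021, 9, 2), "placed"),
    (date(2021, 9, 9), "placed"),
    (date(2021, 9, 28), "billed"),
]

trend = monthly_status_counts(order_events, ORDER_STATUSES)
latest_month = max(trend)        # most recent month, for the inline data
inline_data = trend[latest_month]
```

The same function would work for Devices by passing ("received", "installed", "returned") as the statuses. The full `trend` dictionary is what would feed the bar charts and the raw-data sheet, while `inline_data` covers the most recent month.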

Initially, the static URL could just be a Quilt Data package. Long-term, my hope is to evolve Lightdash into an open-source collaborative data portal à la Hex.

Addendum: Data Stories

Every Smart Report is a self-contained Data Story that should include:

  • Objective: Why this report was created, and for whom
  • Summary: Graphs and Tables that fulfill that Objective
  • Freshness: When the source data was pulled
  • Definitions: What key terms mean and how they were derived
  • Validation: Checks, results and approvals showing why (and when) we can trust this report
  • Sources: Tables, Analyses, and other Datasets used to construct this report
  • Configuration: Parameters used to generate (and regenerate) this report
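
One way to keep a Data Story honest is to treat those seven sections as a simple record and check which ones are still empty before a report goes out. The sketch below is a Python illustration under that assumption. The class name, the field types, and the `missing_sections` helper are hypothetical, not part of any existing tool.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataStory:
    """The seven sections of a Data Story, one field per section."""
    objective: str = ""
    summary: list = field(default_factory=list)        # graphs and tables
    freshness: str = ""                                # when the data was pulled
    definitions: dict = field(default_factory=dict)    # term -> meaning
    validation: list = field(default_factory=list)     # checks and approvals
    sources: list = field(default_factory=list)        # tables, analyses, datasets
    configuration: dict = field(default_factory=dict)  # generation parameters

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# A made-up draft with only some sections filled in.
draft = DataStory(
    objective="Show deployment progress for a high-profile customer",
    freshness="2021-10-01",
    definitions={"billed": "placeholder definition"},
)
missing = draft.missing_sections()
```

A scheduler could refuse to email any report whose `missing_sections()` list is not empty, so that a report without its Sources or Validation never reaches a reader.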

You are currently reading From “Zombie Data” to “Smart Reports” at iHack, therefore iBlog.

