Beyond (Data) Contracts: A Response to Benn Stancil
September 23, 2022 § Leave a comment
This essay by Benn Stancil provoked me so deeply my intended “comment” evolved into a full-fledged blog post:
Fine, let’s talk about data contracts
Benn’s “rant” feels profound on so many levels, especially if I can assume he’s captured the zeitgeist of our industry as accurately as he usually does.
PipeBook: UX Design Brief
July 9, 2022 § 1 Comment
A key design goal of PipeBook is to break away from the single-browser-window user experience of traditional data notebooks, to take full advantage of the large screens on today’s laptops and desktops.

PipeBook.yml: Reimagining Notebooks as Resilient Data Pipelines
June 30, 2022 § 1 Comment
See Also
- The Data Config by Benn Stancil (Medium)
- https://github.com/TheSwanFactory/pipebook (App)
- PipeBook: UX Design Brief (Blog)
- https://github.com/TheSwanFactory/fridaay (Framework)
- Data on Rails: Solving the Data App Imperative (YouTube)
Overview
The modern data notebook has its roots in academic tools for mathematical research. Because of that, notebooks are fantastic for open-ended exploration, but an awkward match for production data pipelines. In particular, they don’t:
- Explicitly declare and track dependencies
- Enforce organizational quality and reproducibility standards
- Enable easy testing, validation, and alerting
PipeBooks are a simple but radical re-imagining of notebooks as “tools for iteratively constructing resilient data pipelines.” The key is a novel data format called FRIDAAY that allows us to:
- Express arbitrary data transformations
- As a series of idempotent Data Actions
- Via a single, easy-to-parse YAML file
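To make the idea of "a series of idempotent Data Actions" concrete, here is a minimal Python sketch. It is not FRIDAAY's actual schema or API: the step keys (`name`, `do`, `from`, `args`), the action names, and the `run` helper are all hypothetical, and the `PIPELINE` list only stands in for what a parsed YAML file might look like. The point is that each step names its input explicitly and writes a fresh named output, so re-running the pipeline gives the same result.

```python
from typing import Any, Callable, Dict, List

Table = List[Dict[str, Any]]


# Hypothetical actions; the names are illustrative, not FRIDAAY's.
def filter_rows(rows: Table, column: str, equals: Any) -> Table:
    """Keep only rows whose `column` equals the given value."""
    return [r for r in rows if r.get(column) == equals]


def select_columns(rows: Table, columns: List[str]) -> Table:
    """Keep only the listed columns in each row."""
    return [{c: r.get(c) for c in columns} for r in rows]


ACTIONS: Dict[str, Callable[..., Table]] = {
    "filter": filter_rows,
    "select": select_columns,
}

# Assumed layout of a parsed pipeline file: one dict per step.
PIPELINE = [
    {"name": "active", "do": "filter", "from": "users",
     "args": {"column": "status", "equals": "active"}},
    {"name": "emails", "do": "select", "from": "active",
     "args": {"columns": ["email"]}},
]


def run(pipeline: List[Dict[str, Any]], tables: Dict[str, Table]) -> Dict[str, Table]:
    """Run each step in order, storing its output under the step's name."""
    results = dict(tables)  # copy so the caller's mapping is not modified
    for step in pipeline:
        source = results[step["from"]]  # explicit, declared dependency
        results[step["name"]] = ACTIONS[step["do"]](source, **step["args"])
    return results
```

Because every step is a pure function of its declared input, calling `run` a second time on its own output leaves the named results unchanged, which is the property that makes such a pipeline safe to re-run after a failure.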
Analytics Anonymous: The Missing Peace of the Modern Data Stack
May 31, 2022 § Leave a comment
Pitch 2 for Coalesce 2022 (unsubmitted)
Pitch: Data is a Feature, not a Product
May 12, 2022 § 1 Comment
Communal Decision-Making Platforms and the End of the Modern Data Stack
Session Proposal for Coalesce 2022
TL;DR: Businesses may start by developing a technical solution, but only succeed by integrating around a human problem. The same is true of the Modern Data Stack.
How to Build LightDash from Source
November 11, 2021 § Leave a comment
LightDash is a super-cool Open Source business intelligence tool built on top of DBT (which I think of as Node for SQL). While it is distributed as open source, the usual way to deploy it locally is by simply running a Docker container.
If you want to actually build LightDash directly from source yourself, you need to follow the instructions under CONTRIBUTING. However, what was written there (as of November 11, 2021) did not quite work for me, so here are my workarounds.
I will also file this as a GitHub issue, and they are super-responsive so hopefully this page will be obsolete soon!
From “Zombie Data” to “Smart Reports”
October 4, 2021 § Leave a comment
The bane of my IT existence is a business user who says, “Please get me the latest version of <random Excel file I have never seen before, named using idiosyncratic or ambiguous words>. Oh, and I need it tomorrow or else we won’t {make our numbers | pass our audit | satisfy the board}.”
I call this “zombie data” because it:
- Lacks any self-awareness
- Doesn’t remember where it came from
- Has no relationship to its current context
- Infects everyone it touches with that same mindlessness.
The Reporting Control Center
September 5, 2021 § 1 Comment
aka Quilt Data Hub or Lightdash 2.0?
Challenge
Can I evangelize
a corporate data platform
by just emailing out reports
with sufficiently smart URLs?
Rationale
I don’t have the power
to pull others onto a new platform.
But I can push useful data to others
in a way that inspires them to participate more directly with the platform
Proposal
Replace friendly Salesforce Reports and powerful NetSuite Saved Searches with a unified interface for viewing, editing, sharing, and managing:
- versioned reports
- personalized alerts
- variant analyses
that are delivered via self-contained emails that also onboard people into greater use of the platform
Definitions
Friendly
- Browseable
- Drag and Drop
- Live previews
Powerful
- Complex formulas
- Scalable notifications
- Easy joins and relabeling
Motivation
The main value of Quilt to my business
is as a point of leverage
to shift the culture of communication
from “zombie data” in tables
to “smart reports” in a repository
The Coherency Manifesto: Towards Communal Data Platforms
August 21, 2021 § 2 Comments
Version 1.0: Sep 11, 2021 (Interdependence Day)
As a community
who produces, consumes, and manages data
we hold these truths to be self-evident:
Psycho-Analytic Engineering (Coalesce 2021)
June 6, 2021 § Leave a comment
Using Data to Differentiate Our Selves
Keynote Talk Proposal for Coalesce 2021
Based on “DBT as Organizational Therapy”