(2021-04-15) Taylor Spreadsheet Rantifesto
Dorian Taylor: Spreadsheet Rantifesto. The spreadsheet as the go-to data structure is so lamentable. In general, I never want data as it comes in a spreadsheet. I almost always want it as a graph, or at least a tree. Graphs and trees—at least ordered trees like XML or JSON—can embed tabular data, such as that found in a spreadsheet, but the same cannot be said for the other way around.
I almost always want to model one-to-many relationships, which spreadsheets straight-up can't do. I always want to define the data semantics—what the columns and rows unambiguously mean. Spreadsheets can't do this either.
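A minimal sketch of the asymmetry described above, assuming a made-up author/book table (the names `rows`, `embed_table`, and `group_one_to_many` are illustrative, not from the essay). A tree can carry the table as-is, whereas the flat table can only express "one author, many books" by repeating the author on every row.

```python
import json

# Hypothetical flat table: one row per record, as a spreadsheet would hold it.
rows = [
    {"author": "Ada", "book": "Notes"},
    {"author": "Ada", "book": "Letters"},
    {"author": "Grace", "book": "Compilers"},
]

def embed_table(rows):
    """A tree can carry a table unchanged: it is just a list of flat records."""
    return {"table": rows}

def group_one_to_many(rows, parent_key, child_key):
    """Fold repeated parent values into one parent node with many children."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[parent_key], []).append(row[child_key])
    return [
        {parent_key: parent, child_key + "s": children}
        for parent, children in grouped.items()
    ]

embedded = embed_table(rows)
nested = group_one_to_many(rows, "author", "book")
print(json.dumps(nested, indent=2))
```

Going the other way, from `nested` back to rows, means either repeating the parent or cramming a list into one cell, which is the limitation the paragraph complains about.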
Delimited text files—usually but not necessarily by commas—have emerged over the decades as the de facto way to exchange record-oriented data. This situation could be much worse, but it isn't especially good.
Headers
Nulls
Dates and times
Enums
Mixed datatypes
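The five trouble spots listed above can be illustrated with Python's standard `csv` module. This is only a sketch: the sample data and column names are invented, and the comments reflect one reading of what each heading refers to.

```python
import csv
import io

# Hypothetical CSV export; the column names and values are made up.
raw = (
    "id,name,joined,status,score\n"
    "1,Ada,03/04/2021,active,9.5\n"
    "2,,2021-04-15,ACTIVE,n/a\n"
)

# Headers: the first line is a header only by convention; a plain reader
# hands it back as just another row of data.
first_row = next(csv.reader(io.StringIO(raw)))

records = list(csv.DictReader(io.StringIO(raw)))

# Nulls: an empty field comes back as "", so "missing" and "empty string"
# cannot be told apart.
missing_name = records[1]["name"]

# Dates and times: "03/04/2021" may be 3 April or 4 March, and the two rows
# use different formats; nothing in the file says which is intended.
joined_values = [r["joined"] for r in records]

# Enums: nothing constrains the allowed values, so spelling variants pile up.
statuses = {r["status"] for r in records}

# Mixed datatypes: every value arrives as a string, including "9.5" and "n/a".
score_types = {type(r["score"]) for r in records}
```

Any typing, null handling, or date parsing has to be agreed on outside the file, which is the gap in data semantics the essay is pointing at.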
I suppose you could say the spiritual successor to CSV is JSON.
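As a rough illustration of what JSON adds over CSV, assuming a made-up record: nulls, numbers, and booleans survive parsing as distinct types, while dates still arrive as plain strings.

```python
import json

# Hypothetical record; the field names are invented for illustration.
doc = '{"id": 2, "name": null, "score": 9.5, "active": true, "joined": "2021-04-15"}'

record = json.loads(doc)

# Map each field to the Python type the parser produced for it.
kinds = {key: type(value).__name__ for key, value in record.items()}
print(kinds)
```

The `joined` field comes back as `str`, since JSON has no date type, so that part of the problem carries over unchanged.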
Honourable(?) mention goes to YAML, which is yet another markup language, or rather a liability: a borderline-Turing-complete data serialization format.
Anything that originates as an actual spreadsheet
Actual spreadsheets import all the problems of CSV and have plenty of their own.
I worked on a project last year where I interfaced with a teammate primarily through a spreadsheet. Being a site map, it shook out as a pseudo-hierarchy of about 200 entities.
I want to underscore that I believe there is great value in being able to type in data with your fingers and move it around ad-hoc like a spreadsheet affords. I also believe that very same ad-hockery is what has limited the spreadsheet's functionality since it was invented over four decades ago.
The original design was never meant to grow past the confines of an individual PC.
How do you preserve this beefed-up, ad-hoc flexibility while adding the guardrails necessary to make the data more valuable to the larger ecosystem? I see two main challenges: data semantics and user interface.
Getting a modest quantity of data, by hand, into a persistent structure, that doesn't demand any up-front work on schema definition, and provides a rudimentary computational vocabulary, is what I believe to be the core strength of the spreadsheet.