Tom MacWright

          One way to represent things
          I have a theory about the future of programming. I doubt I’m the first
          to have it, but as far as I can tell this isn’t the mainstream thought
          in the area, and I want to see if this connects for other folks.
          There’s this idea about having ‘one way to do things’ that I think is
          most famous in its phrasing from the [6]Zen of Python:
            There should be one– and preferably only one –obvious way to do
            it.
          Perl has the [7]opposite motto: There’s more than one way to do it.
          Whether it’s better to have “one way to do things”, as I’d guess is
          the dictum of Python and Go, or many ways, like Rust, JavaScript,
          Lisps, etc, is sort of undecided.
          But let’s flip the telescope around and peer into the other end.
          Programming is about data structures and the ways you manipulate them.
          There are few languages that can claim to have one way to store things.
          I claim that most simple programming environments are simple because
          their datatypes are simple, not because their control flow or
          statements or expressions are simple. Let’s take a look:
          Excel spreadsheets support sheets, columns, rows, and cells. That’s
          it. Until very recently ([8]2020), cells were extremely limited in
          what they could represent, and even with fancy new cells, those types
          are curated. Excel formulas work, and compose so well, because a
          column of numbers is generally the same in any kind of document.
          Successful visual programming thrives in constrained environments in
          which data is mostly homogeneous. [9]Pure Data has four simple kinds of
          ‘atoms’. [10]Max/MSP has a few more, but still limited and
          What has made R and Python such successful platforms for data science
          isn’t just TensorFlow and ggplot, but the thing that connects the
          parts of the data science toolkit together: dataframes. The Python
          ecosystem is far from perfect, but the fact that there are complex
          datatypes that can handle a wide variety of research data inputs &
          outputs, and that can be used by multiple packages - that pandas can
          talk to [11]seaborn to quickly generate a chart - is remarkable.
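          To make the "shared datatype" point concrete, here is a minimal
          sketch using only the Python standard library. It is not how pandas
          or seaborn work internally: Table, describe, and bar_chart are
          hypothetical names standing in for a dataframe, a statistics
          package, and a charting package. The point is only that two
          independently written functions can cooperate with no conversion
          step because they agree on one representation.

```python
from statistics import mean

# Hypothetical shared type: column name -> list of values (Python 3.9+ syntax).
Table = dict[str, list[float]]


def describe(table: Table) -> dict[str, float]:
    """Stand-in for a 'statistics package': the mean of every column."""
    return {name: mean(values) for name, values in table.items()}


def bar_chart(table: Table, column: str, width: int = 20) -> list[str]:
    """Stand-in for a 'charting package': one text bar per row.

    Assumes the column holds at least one positive value.
    """
    values = table[column]
    largest = max(values)
    return ["#" * round(width * value / largest) for value in values]


# Made-up illustrative data, not taken from any real dataset.
measurements: Table = {"height": [1.0, 2.0, 4.0], "weight": [3.0, 3.0, 6.0]}

summary = describe(measurements)
bars = bar_chart(measurements, "height")
```

          Neither function knows about the other, and neither needs an
          adapter. That is the property the dataframe gives the data science
          ecosystem, at a much larger scale.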
          In comparison, there are lots of systems in which the common data
          types are so low-level and people are so hesitant to accept shared
          definitions that every “computation” problem meets an equal or greater
          “representation” problem.
            * Going to parse a webpage? Is the webpage a DOM? A plain-old nested
              object? Somewhere in between, like a [12]cheerio or jQuery
            * Going to manipulate a color? Is it an RGB triplet in an array, or
              an object? Or is it an instance of a [13]class in a helper module,
              or a hex string?
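          The color case from the list above can be sketched in a few lines
          of Python. Everything named here (Color, to_color, lighten) is
          hypothetical and only for illustration, and the hex parsing
          handles six-digit strings only. Notice how the "representation"
          code outweighs the one-line "computation" it exists to support.

```python
from typing import NamedTuple, Union


class Color(NamedTuple):
    """Hypothetical shared color type: one 8-bit integer per channel."""
    r: int
    g: int
    b: int


# The shapes a color commonly arrives in.
ColorLike = Union[Color, tuple, list, dict, str]


def to_color(value: ColorLike) -> Color:
    """Normalize any of the common representations into a Color."""
    if isinstance(value, Color):
        return value
    if isinstance(value, str):
        # Hex string such as "#ff8800" (six digits only in this sketch).
        digits = value.lstrip("#")
        if len(digits) != 6:
            raise ValueError(f"unsupported hex color: {value!r}")
        return Color(*(int(digits[i:i + 2], 16) for i in (0, 2, 4)))
    if isinstance(value, dict):
        # Plain object with r, g, b keys.
        return Color(value["r"], value["g"], value["b"])
    if isinstance(value, (tuple, list)) and len(value) == 3:
        # RGB triplet in an array.
        return Color(*value)
    raise TypeError(f"cannot interpret {value!r} as a color")


def lighten(color: Color, amount: int) -> Color:
    """The actual computation: shift each channel, clamped to 0..255."""
    return Color(*(max(0, min(255, channel + amount)) for channel in color))
```

          With a shared type, every caller would pass a Color and to_color
          would not need to exist. Without one, each library ends up writing
          its own version of it, for example
          lighten(to_color("#ff8800"), 20).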
          There’s so much energy put into visual programming or functional
          programming so that we can “connect things,” but not nearly as much
          time spent on what those things are. So what you get is the ability to
          connect any “compatible” parts, but a poor definition of what
          compatibility is, what those types are.
          What if a simpler programming language had first-class representations
          of a lot more than strings and arrays? Of course this would rankle
          seasoned developers who want ultimate power and prefer tiny extensible
          systems. When developers think of advanced type systems, they think of
          things like [14]Haskell’s scary-powerful primitives for creating new
          types, not of ecosystem-supported common types.
          But if the aim is ease of use and giving power to people who otherwise
          wouldn’t be doing programming, type-rich systems with lots of
          assumptions seem like a logical first step. And one that doesn’t need
          a visual editor or a new dialect of a rare programming language.
          February 23, 2021 [15]@tmcw