          Map of my personal data infrastructure
          ======================================
       
          Table of Contents
          -----------------
       
            * 1. TODOs
       
            * 2. ---
       
          Well, it's been a year since I started the draft, so I guess it's
          about time to publish this! :)
       
          This is a map of my personal [10]data liberation [11]infrastructure,
          with links to the scripts and tools used; and my blog posts
          elaborating on different parts of it.
       
          My goal for data liberation is approximating the [12]'personal data
          mirror' concept, often [13]despite crappy interoperability (or lack
          thereof) of different platforms.
       
          I prepared this diagram for several reasons:
       
            * to give more context for my blog posts about data liberation and
              tools around it
       
            * to highlight the complexity and the hoops we have to jump through
              because of the lack of interoperability
       
            * it was also sort of fun :)
       
          This time I won't write too much text and just let you explore it.
          Tips for exploring the diagram:
       
            * perhaps open the [14]full size SVG in a new tab
       
            * make sure to read the legend
       
            * links you can follow are marked with blue (and sometimes other
              colours)
       
            * there is a bubble (💬) near some nodes/edges, you can hover over it
              to see the comment
       
            * some integrations are in progress: marked with WIP, construction
              signs (🚧🚧) and dashed edges
       
            * arrows roughly represent the direction of data flow
       
            * arrow colors roughly correspond to the data source (so it's easier
              to track how it flows)
       
            * there are some rendering issues
       
                * it's probably not very mobile friendly (it's barely desktop
                  friendly!)
       
                * SVG support varies among web browsers, so there might be some
                  minor artifacts (chromium works better, but firefox works well
                  enough)
       
                * navbar centering isn't broken on this page – it's just a
                  temporary hack to fit in the diagram till I figure out wide
                  pages properly
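
          The conventions in the tips above (arrows follow the data flow, arrow
          colour tracks the data source, WIP integrations are dashed) map
          directly onto Graphviz edge attributes. Below is a minimal sketch of
          that idea, not the actual source of the diagram: the `Edge` class,
          the `to_dot` helper, the three edges and the colours are all
          hypothetical and only borrow node names from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    source_color: str  # colour tied to the data source, so a flow is easy to trace
    wip: bool = False  # integrations in progress are drawn dashed


def to_dot(edges: list[Edge]) -> str:
    """Render edges as Graphviz DOT text; arrows point in the direction of data flow."""
    lines = ["digraph infra {"]
    for e in edges:
        attrs = [f'color="{e.source_color}"']
        if e.wip:
            attrs.append('style="dashed"')
            attrs.append('label="WIP"')
        lines.append(f'  "{e.src}" -> "{e.dst}" [{", ".join(attrs)}];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical fragment: node names come from the diagram, edges and colours are made up.
EDGES = [
    Edge("Kobo reader", "kobuddy", "blue"),
    Edge("kobuddy", "HPI", "blue"),
    Edge("HPI", "Dashboard", "gray", wip=True),
]

if __name__ == "__main__":
    print(to_dot(EDGES))
```

          The printed text can be fed to the `dot` command line tool to render
          SVG or PNG output.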
       
          [Diagram: map of my personal data infrastructure. The labels
          recovered from it are listed below; see the [14]full size SVG for
          the layout and the edges.]

            * Legend: Device, Cloud service, Automatic script, Manual step,
              Entry from my blog (clickable), User facing interface, Disk
              storage, Dead service/product

            * Meta (why I'm doing all this?): The sad state of personal data
              and infrastructure; How to cope with a human brain; What data I
              collect and why?

            * Android phone: GPS, Garmin app, Runnerup app, Gpslogger app,
              Materialistic (Hackernews app), Bluemaestro app

            * Other devices: Wahoo Tickr X (HR monitor), Jawbone sleep
              tracker, Bluemaestro (environment sensor), Garmin watch, Emfit
              QS sleep tracker, Kobo reader, Remarkable 2 tablet, scales;
              connected via BT, wifi (local API), wifi (cloud API), ssh,
              sqlite

            * Other inputs: Blood tests (GP/Thriva/etc), Sleep data
              (subjective), Exercise

            * Cloud services: Google (Browser history, Location; Takeout),
              Jawbone (dead), Endomondo (dead), Garmin Connect, Emfit,
              Telegram, FB Messenger, VK.com, Twitter, Discord, Pinboard,
              Github, Pocket, Reddit, Instapaper; accessed via API, private
              API, website scraping, archive, GDPR export, Takeout or
              pushshift, some of them flagged as fragile or possibly closed

            * Export layer: telegram_backup, fbmessengerexport, vkexport,
              twint, pinbexport, ghexport, pockexport, rexport,
              pushshift_export, instapexport, kobuddy, emfitexport, jbexport,
              GarminDB, endoexport, plus semi-manual (periodic) steps, manual
              requests (periodic), manual download, manual input and a
              script. Related: Data export infrastructure; Building data
              liberation infrastructure; In search of a friendlier scheduler

            * Filesystem: exports stored as sqlite, json, zip/json, html,
              orgmode, fit, tcx workouts, gpx tracks and a custom format.
              Related: Against unnecessary databases; 🚧 Ensuring backup
              safety; 🚧 Data exports deduplication

            * Human Programming Interface (github/HPI): a DAL per data
              source, feeding modules location.google, gpslogger (sb),
              location & timezones for other modules, messenger, vk, twitter,
              discord (sb), pinboard, github, pocket, reddit, instapaper,
              hackernews, kobo, bluemaestro, body.weight, body.blood,
              body.sleep, body.exercise and more

            * Libraries/patterns: cachew (persistent cache/serialization);
              Configs suck; Using mypy for error handling

            * Orger: Orger: plaintext reflection of your digital self;
              Managing inbound digital content; Orger + Roam Research;
              Github: orger. Mirrors: kobo, twitter, instapaper, youtube,
              hypothesis, github, polar ...and more. Queues: kobo2org,
              ip2org, reddit, hackernews ...and more

            * Promnesia: My journey in fixing browser history; Github:
              promnesia

            * Other consumers and interfaces (several marked 🚧 WIP 🚧):
              Plaintext files, data mirrors (read only), todo lists,
              interactive queues, Emacs (Doom), Logseq, Building personal
              search engine, Browser (extension), Browser (HTML), Archivebox
              (web preservation), Memacs, Jupyter, IPython, HTTP API
              (🚧wip🚧), Spreadsheet-like interface?, Influxdb, Other
              programming languages (FFI), Apache Arrow, Sqlite (via cachew),
              Memri, Timeline/Memex (🚧wip🚧), Dashboard (🚧wip🚧), Solid
              project, Metabase, Grafana (see demo), Datasette, plugin,
              openhumans.org

            * Usecases: Making sense of Endomondo's calorie estimation;
              Extending my personal infrastructure
       
          Some notes regarding the diagram:
       
            * it's plotted via graphviz, and you can find the source [15]here
              (although the code is quite domain specific)
       
            * even though there is a lot of stuff on the diagram, it's still
              incomplete!
       
                * [16]here is an (also incomplete) list of data I
                  collect/export
       
                * [17]HPI modules are a good proxy for the data I'm using
       
            * note that despite some platforms dying (e.g. Jawbone/Endomondo), I
              can still use data produced with them!
       
              E.g. after Endomondo was discontinued, I was able to quickly
              switch to open source [18]RunnerUp app, while [19]preserving
              complete data compatibility.
       
            * note how many services are outright malicious with their
              anti-API/anti-scraping/anti-interoperability measures (yellow/red
              highlight for API nodes)
       
            * probably more platforms have GDPR exports, I just haven't tried
              yet
       
            * indirection is crazy
       
              Note how for some data, before I can get it on my computer, it
              goes as
       
                * device –> phone (over bluetooth)
       
                * phone –> cloud (over internet)
       
                * cloud –> computer (over internet)
       
            * for many phone apps the only way I can sync the data is by rooting
              my phone in order to access the /data/data directory
       
              This is getting worse and worse with every Android version. I
              understand the security concerns, but this is ridiculous.
       
            * some modules/packages (marked with an sb superscript) were
              developed by [20]Sean Breckenridge

              He's forked my [21]HPI package and is working on it [22]in
              parallel.
              For now, we decided to hack on it independently, in the hope that
              eventually we figure out what's a good model for cooperating and
              maintaining the modules.
       
              Also, he's done some cool work on [23]automatic HTTP API for HPI!
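
          The note above about Endomondo and RunnerUp is easier to picture
          with code: once every export is parsed into one common type,
          downstream tools don't care which app produced a workout. The sketch
          below only illustrates that pattern and is not the code from
          my/runnerup.py; the `Workout` type, both parser functions and both
          input schemas are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Workout:
    start: datetime
    duration: timedelta
    sport: str
    source: str  # which app produced the record


def from_endomondo(raw: Iterable[dict]) -> Iterator[Workout]:
    # Hypothetical schema: {"start_time": ISO 8601 string, "duration_s": int, "sport": str}
    for r in raw:
        yield Workout(
            start=datetime.fromisoformat(r["start_time"]),
            duration=timedelta(seconds=r["duration_s"]),
            sport=r["sport"].lower(),
            source="endomondo",
        )


def from_runnerup(raw: Iterable[dict]) -> Iterator[Workout]:
    # Hypothetical schema: {"begin": unix timestamp, "minutes": float, "type": str}
    for r in raw:
        yield Workout(
            start=datetime.fromtimestamp(r["begin"], tz=timezone.utc),
            duration=timedelta(minutes=r["minutes"]),
            sport=r["type"].lower(),
            source="runnerup",
        )


def all_workouts(*sources: Iterable[Workout]) -> list[Workout]:
    """Merge workouts from every source into one list ordered by start time."""
    return sorted((w for s in sources for w in s), key=lambda w: w.start)


# Made-up sample records, one per app.
ENDOMONDO = [{"start_time": "2020-11-01T08:00:00+00:00", "duration_s": 1800, "sport": "Running"}]
RUNNERUP = [{"begin": 1612166400, "minutes": 45.0, "type": "running"}]

if __name__ == "__main__":
    for w in all_workouts(from_endomondo(ENDOMONDO), from_runnerup(RUNNERUP)):
        print(w.start.isoformat(), w.sport, w.duration, w.source)
```

          Keeping the `source` field means the origin of a record is still
          available when it matters, e.g. for debugging a parser.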
       
          1 TODOs
          -------
       
          TODO[C][2021-02-07 19:53] hmm some 'HTML label' boxes seem to have
          extra padding?
       
          although only in svg mode? png renders fine.
       
          STRT[C][2020-02-03 01:57] fix css so it's occupying full screen width
       
            * [2020-02-07 19:49] a bit adhoc, but works for now
       
          STRT[C][2020-02-03 01:57] legend
       
          DONE[B][2020-02-07 19:51] labels don't fit into the boxes??
       
            * [2020-02-14 21:25] apparently only on desktop Firefox =/
       
            * [2021-02-07 19:46] looks fine now?
       
          STRT[C][2020-02-14 21:30] Chrome [24]doesn't support svg side
          attribute, so some labels appear upside down :(
       
          fixing with JS for now…
       
          2 ---
          -----
       
          Let me know what you think, and as always happy to answer your
          questions!
       
          [11]#infra [10]#dataliberation 22 February 2021 [8]🐦 me @twitter [9]💻
          me @github [25]CC BY 4.0
       
          1. https://beepb00p.xyz/
          2. https://beepb00p.xyz/ideas.html
          3. https://beepb00p.xyz/exobrain
          4. https://beepb00p.xyz/tags.html
          5. https://beepb00p.xyz/feed.html
          6. https://beepb00p.xyz/site.html
          7. https://beepb00p.xyz/me.html
          8. https://twitter.com/karlicoss
          9. https://github.com/karlicoss
          10. https://beepb00p.xyz/tags.html#dataliberation
          11. https://beepb00p.xyz/tags.html#infra
          12. https://beepb00p.xyz/sad-infra.html#data_mirror
          13. https://www.eff.org/deeplinks/2019/10/adversarial-interoperability
          14. https://beepb00p.xyz/myinfra_files/myinfra.svg
          15. https://github.com/karlicoss/myinfra
          16. https://beepb00p.xyz/my-data.html
          17. https://github.com/karlicoss/HPI/tree/master/my
          18. https://github.com/jonasoreland/runnerup#readme
          19. https://github.com/karlicoss/HPI/blob/5b501d156266ca8e185d681fab6bc3ee156498a6/my/runnerup.py
          20. https://github.com/seanbreckenridge
          21. https://github.com/karlicoss/HPI
          22. https://github.com/seanbreckenridge/HPI
          23. https://github.com/seanbreckenridge/HPI_API
          24. https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/side#Browser_compatibility
          25. http://creativecommons.org/licenses/by/4.0