Map of my personal data infrastructure
======================================

Table of Contents
-----------------
* 1. TODOs
* 2. ---

Some links into the map require javascript! Sorry!

Well, it's been a year since I started the draft, so I guess it's about time to publish this! :)

This is a map of my personal [10]data liberation [11]infrastructure, with links to the scripts and tools used, and to my blog posts elaborating on different parts of it.

My goal for data liberation is approximating the [12]'personal data mirror' concept, often [13]despite crappy interoperability (or lack thereof) of different platforms.

I prepared this diagram for several reasons:

* to give more context for my blog posts about data liberation and tools around it
* to highlight the complexity and hoops we have to jump through because of the lack of interoperability
* it was also sort of fun :)

This time I won't write too much text and just let you explore it.

Tips for exploring the diagram:

* perhaps open the [14]full size SVG in a new tab
* make sure to read the legend
* links you can follow are marked with blue (and sometimes other colours)
* there is a bubble (💬) near some nodes/edges, you can hover it to see the comment
* some integrations are in progress: marked with WIP, construction signs (🚧🚧) and dashed edges
* arrows roughly represent the direction of data flow
* arrow colors roughly correspond to the data source (so it's easier to track how it flows)
* there are some rendering issues
  * it's probably not very mobile friendly (it's barely desktop friendly!)
  * SVG support varies among web browsers, so there might be some minor artifacts (chromium works better, but firefox works well enough)
  * navbar centering isn't broken on this page: it's just a temporary hack to fit in the diagram till I figure out wide pages properly
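The conventions from the tips above (arrows for data flow, one colour per data source, dashed edges for work in progress) map onto ordinary Graphviz edge attributes. Here is a minimal sketch in Python that emits DOT text for a made-up three-node flow. The node names, the `edge` and `render` helpers and the colour are all hypothetical; this is not the actual source of the diagram, only an illustration of the idea.

```python
# Hypothetical mini-example: node names and helpers are illustrative only,
# they are not taken from the real graphviz source of the diagram.

def edge(src, dst, *, color="black", wip=False):
    """Render one DOT edge; work-in-progress integrations get a dashed line."""
    attrs = [f'color="{color}"']
    if wip:
        attrs.append('style="dashed"')
    return f'  "{src}" -> "{dst}" [{", ".join(attrs)}];'


def render(edges):
    """Assemble a complete DOT digraph from (src, dst, options) triples."""
    lines = ["digraph infra {", "  rankdir=LR;"]
    lines += [edge(src, dst, **opts) for src, dst, opts in edges]
    lines.append("}")
    return "\n".join(lines)


EDGES = [
    # one colour per data source, so a flow is easy to follow end to end
    ("device", "export script", {"color": "blue"}),
    ("export script", "filesystem", {"color": "blue"}),
    # an integration that is still in progress
    ("filesystem", "dashboard", {"color": "blue", "wip": True}),
]

DOT = render(EDGES)

if __name__ == "__main__":
    print(DOT)
```

Saving the printed text to a file and running it through `dot -Tsvg` should give a small left-to-right graph in which the last edge is dashed.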
[Diagram: map of my personal data infrastructure. The labels below are the text recovered from it, grouped roughly by cluster; edges and layout are not reproduced.]

Legend:
* Device
* Cloud service
* Automatic script
* Manual step
* Entry from my blog (clickable)
* User facing interface
* Disk storage
* Dead service/product

Meta (why I'm doing all this?), blog posts:
* The sad state of personal data and infrastructure
* How to cope with a human brain
* What data I collect and why?

Devices and phone apps:
* Android phone: GPS, Garmin app, Runnerup app, Gpslogger app, Materialistic (Hackernews app), Bluemaestro app
* over BT: Wahoo Tickr X (HR monitor), Jawbone sleep tracker, Bluemaestro (environment sensor), Garmin watch
* Emfit QS sleep tracker: wifi (local API), wifi (cloud API)
* Kobo reader (sqlite), Remarkable 2 tablet (ssh)
* scales, Blood tests (GP/Thriva/etc), Sleep data (subjective), Exercise (the label 'dead' appears twice in this group)

Cloud services:
* Google (Browser history, Location): Takeout
* Jawbone (dead): API
* Endomondo (dead): API
* Garmin Connect: website (scraping)
* Telegram: API
* FB Messenger: API (private), fragile
* Emfit: API
* VK.com: API (API closed?)
* Twitter: API, website (scraping), archive, fragile
* Discord: API, archive
* Pinboard: API
* Github: API, archive
* Pocket: API
* Reddit: API, GDPR export, pushshift
* Instapaper: API

Export layer:
* scripts: telegram_backup, fbmessengerexport, vkexport, twint, pinbexport, ghexport, pockexport, rexport, pushshift_export, instapexport, kobuddy, emfitexport, jbexport, GarminDB, endoexport
* manual steps: semi-manual (periodic), manual request (periodic), manual download, script, manual input
* blog posts: Data export infrastructure, Building data liberation infrastructure, In search of a friendlier scheduler

Filesystem:
* formats: sqlite, json, zip/json, html, custom format, orgmode, fit, tcx workouts, gpx tracks
* blog posts: Against unnecessary databases, 🚧Ensuring backup safety, 🚧Data exports deduplication

Human Programming Interface:
* DAL (repeated many times, twice marked 🚧WIP🚧)
* modules: location.google, gpslogger (sb), location&timezones for other modules, messenger, vk, twitter, discord (sb), pinboard, github, pocket, reddit, instapaper, hackernews, kobo, bluemaestro, body.weight, body.blood, body.sleep, body.exercise, and more...
* github/HPI
* Usecases: Making sense of Endomondo's calorie estimation, Extending my personal infrastructure

Orger:
* blog posts: Orger: plaintext reflection of your digital self, Managing inbound digital content, Orger + Roam Research
* Github: orger
* Mirrors: kobo, twitter, instapaper, youtube, hypothesis, github, polar, ...and more
* Queues: kobo2org, ip2org, reddit, hackernews, ...and more

Promnesia:
* blog post: My journey in fixing browser history
* Github: promnesia
* Browser (extension), Archivebox (web preservation)

Plaintext files:
* data mirrors (read only), todo lists, interactive queues
* Emacs (Doom), Logseq
* blog post: Building personal search engine

Libraries/patterns:
* cachew: persistent cache/serialization
* blog posts: Configs suck, Using mypy for error handling

User facing interfaces (WIP markers kept where they appear in the source):
* Memacs 🚧WIP🚧
* Jupyter, IPython
* HTTP API (🚧wip🚧)
* Spreadsheet-like interface? 🚧WIP🚧
* Influxdb 🚧WIP🚧
* Other programming languages (FFI)
* Apache Arrow 🚧WIP🚧
* Sqlite (via cachew)
* Memri 🚧WIP🚧
* Timeline/Memex (🚧wip🚧)
* Dashboard (🚧wip🚧)
* Solid project 🚧WIP🚧
* Metabase
* Grafana (see demo) 🚧WIP🚧
* Datasette 🚧WIP🚧, plugin 🚧WIP🚧
* Browser (HTML)
* openhumans.org 🚧WIP🚧

Some notes regarding the diagram:

* it's plotted via graphviz, and you can find the source [15]here (although the code is quite domain specific)
* even though there is a lot of stuff on the diagram, it's still incomplete!
  * [16]here there is an (also incomplete) list of data I collect/export
  * [17]HPI modules are a good proxy for the data I'm using
* note that despite some platforms dying (e.g. Jawbone/Endomondo), I can still use data produced with them!
  E.g. after Endomondo was discontinued, I was able to quickly switch to the open source [18]RunnerUp app, while [19]preserving complete data compatibility.
* note how many services are outright malicious with their anti-API/anti-scraping/anti-interoperability measures (yellow/red highlight for API nodes)
* probably more platforms have GDPR exports, I just haven't tried them yet
* indirection is crazy
  Note how for some data, before I can get it on my computer, it goes as
  * device -> phone (over bluetooth)
  * phone -> cloud (over internet)
  * cloud -> computer (over internet)
* for many phone apps the only way I can sync the data is by rooting my phone in order to access the /data/data directory
  This is getting worse and worse with every Android version. I understand the security concerns, but this is ridiculous.
* some modules/packages (marked with an sb superscript) were developed by [20]Sean Breckenridge
  He's forked my [21]HPI package and is working on it [22]in parallel. For now, we decided to hack on it independently, in the hope that eventually we figure out what's a good model for cooperating and maintaining the modules.
  Also, he's done some cool work on [23]automatic HTTP API for HPI!

¶1 TODOs
--------

TODO[C][2021-02-07 19:53] hmm some 'HTML label' boxes seem to have extra padding? although only in svg mode? png renders fine.
STRT[C][2020-02-03 01:57] fix css so it's occupying full screen width
* [2020-02-07 19:49] a bit adhoc, but works for now
STRT[C][2020-02-03 01:57] legend
DONE[B][2020-02-07 19:51] labels don't fit into the boxes??
* [2020-02-14 21:25] apparently only on desktop Firefox =/
* [2021-02-07 19:46] looks fine now?
STRT[C][2020-02-14 21:30] Chrome [24]doesn't support svg side attribute, so some labels appear upside down :( fixing with JS for now…

¶2 ---
------

Let me know what you think, and as always happy to answer your questions!

[11]#infra [10]#dataliberation
22 February 2021
[8]🐦 me @twitter [9]💻 me @github [25]CC BY 4.0

1. https://beepb00p.xyz/
2. https://beepb00p.xyz/ideas.html
3. https://beepb00p.xyz/exobrain
4. https://beepb00p.xyz/tags.html
5. https://beepb00p.xyz/feed.html
6. https://beepb00p.xyz/site.html
7. https://beepb00p.xyz/me.html
8. https://twitter.com/karlicoss
9. https://github.com/karlicoss
10. https://beepb00p.xyz/tags.html#dataliberation
11. https://beepb00p.xyz/tags.html#infra
12. https://beepb00p.xyz/sad-infra.html#data_mirror
13. https://www.eff.org/deeplinks/2019/10/adversarial-interoperability
14. https://beepb00p.xyz/myinfra_files/myinfra.svg
15. https://github.com/karlicoss/myinfra
16. https://beepb00p.xyz/my-data.html
17. https://github.com/karlicoss/HPI/tree/master/my
18. https://github.com/jonasoreland/runnerup#readme
19. https://github.com/karlicoss/HPI/blob/5b501d156266ca8e185d681fab6bc3ee156498a6/my/runnerup.py
20. https://github.com/seanbreckenridge
21. https://github.com/karlicoss/HPI
22. https://github.com/seanbreckenridge/HPI
23. https://github.com/seanbreckenridge/HPI_API
24. https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/side#Browser_compatibility
25. http://creativecommons.org/licenses/by/4.0