Controlling your data

The most widely used software forge, GitHub, is teeming with tools that data professionals make for their peers. And I love it. Every day I check the news to learn about new open-source goodies that we can all use. It is great to learn about tools like Polars and DuckDB, ML model releases like Stable Diffusion or LLaMA, and so much more.

But when it comes to managing personal data, I feel there is a lack of options. I want to focus on personal health data. Whether it is fitness trackers, smart scales, smartphones tracking steps, or keeping a food diary, many of us are generating health data. Yet all of this data is kept inside silos. I switched between an Apple and an Android phone some years ago, and there is no easy way to bring all the data together. We are being pushed towards putting more of our information in the same silo. Only then can, for instance, weight tracking and the number of steps per day be combined.

At least when we combine all this data inside a Google Fit or Apple Health silo, we get some insights. But then I think back to the time, some years ago, when I worked closely with healthcare providers, and to how much interest there was from insurance agencies in using and modelling this personal data. I know it is in Apple’s and Google’s best interest to be very careful with this sensitive data, but many parties want access. Data breaches happen every day, and large silos of information are more valuable targets. With all the data kept in a proprietary external service, I never feel like I am fully in control.

Another issue I have is that these apps make little use of the data they are provided. Goals or predictions are almost always based on a single factor. They do not give more than just the most generic advice. Any features they do try to push feel like native advertising.

There are quantified-self apps which are geared towards power users. They have more features, but they can fall into the trap of making users feel bad for not using them. A true health app for everyone does not judge and is simply a good and useful aid. Luckily, more and more FOSS alternatives are being created next to the large proprietary offerings. Some are very bare-bones with minimal interfaces; others are more feature-rich, with sought-after features like automatic reading of scale data over Bluetooth.

These alternatives make it (relatively) easy to export their data. In broad strokes, all health apps are slowly but surely adding data export features. The new issue: there is too little software that makes use of this exported data. This is where we data coders should step in. And by supporting FOSS apps which do one thing very well, we avoid the pitfall of putting all our data in the Google or Apple silo. Letting users try different apps, without being afraid that all that data is useless if they decide to switch, would be amazing. At work people talk about data lakes and analytical processing, yet where is our personal data pool at home? Where can we crunch some very personal, small-sized data and share it directly ourselves when we want to?
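
To make the idea of a personal data pool a bit more concrete, here is a minimal sketch of combining two exports, using only the Python standard library. Everything in it is an assumption for illustration: the column names (`timestamp`, `steps`, `date`, `weight_kg`), the CSV layout, and the sample values are invented and do not match the export format of any particular app.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export from a step-counting app: several readings per day.
STEPS_CSV = """timestamp,steps
2023-05-01T08:15:00,3200
2023-05-01T18:40:00,5100
2023-05-02T09:05:00,7400
2023-05-03T12:30:00,2900
"""

# Hypothetical export from a weight-tracking app: at most one entry per day.
WEIGHT_CSV = """date,weight_kg
2023-05-01,81.4
2023-05-03,81.0
"""


def daily_steps(csv_text):
    """Sum step readings per calendar day (first 10 chars of the timestamp)."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["timestamp"][:10]] += int(row["steps"])
    return dict(totals)


def daily_weight(csv_text):
    """Map each date to the weight recorded on that day."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["date"]: float(row["weight_kg"]) for row in reader}


def combine(steps, weights):
    """Join both sources on date, keeping days that appear in either one."""
    days = sorted(set(steps) | set(weights))
    return [
        {"date": day, "steps": steps.get(day), "weight_kg": weights.get(day)}
        for day in days
    ]


combined = combine(daily_steps(STEPS_CSV), daily_weight(WEIGHT_CSV))
for row in combined:
    print(row)
```

Days missing from one source simply get `None` for that field, so switching apps halfway through does not throw away the rest. With real exports, the same join could be done in Polars or DuckDB; the point is only that the data sits in files we control.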