Using long-term Internet usage data collected by a network-connected Pi-hole to track personal behavior

I recently installed a Pi-hole on my home network, and I’m thoroughly enjoying the tool. But it struck me today that the Internet usage data collected by this device could be extremely useful for tracking my own behavior in general, simply by graphing the data points to see what becomes obvious.

Here is a screenshot of the “long-term data” that has been gathered so far from my device.

After doing more research into how the Pi-hole data is stored, it seems that I could simply export the SQLite database and have access to granular information about which websites I visited, when, and from which device, as well as the volume of requests, associated IPs and much more.
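A minimal sketch of what working with such an export might look like, using only Python's standard library. The table name `queries` and the columns used here (`timestamp`, `type`, `status`, `domain`, `client`) are assumptions based on the Pi-hole long-term database as I understand it, and the exact schema can differ between versions, so check your own copy first. The demo builds a tiny in-memory stand-in with made-up rows rather than reading a real file.

```python
import sqlite3


def top_domains(conn, limit=5):
    """Return (domain, request_count) pairs, most requested first."""
    sql = """
        SELECT domain, COUNT(*) AS hits
        FROM queries
        GROUP BY domain
        ORDER BY hits DESC, domain ASC
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


def make_demo_db():
    """Build an in-memory table that mimics the ASSUMED Pi-hole schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queries ("
        "id INTEGER PRIMARY KEY, timestamp INTEGER, type INTEGER, "
        "status INTEGER, domain TEXT, client TEXT)"
    )
    # Invented sample rows: (unix time, query type, status, domain, client IP)
    rows = [
        (1700000000, 1, 2, "example.com", "192.168.1.10"),
        (1700000060, 1, 3, "example.com", "192.168.1.10"),
        (1700000120, 1, 1, "ads.example.net", "192.168.1.10"),
        (1700003600, 2, 2, "example.org", "192.168.1.11"),
    ]
    conn.executemany(
        "INSERT INTO queries (timestamp, type, status, domain, client) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for domain, hits in top_domains(demo):
        print(domain, hits)
```

To try this on real data, one would presumably replace `make_demo_db()` with `sqlite3.connect(...)` pointed at a copy of the exported database file, not the live one the Pi-hole is writing to.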

This post might as well be #001 for this research project, as I suspect I’ll be returning to the idea of using a Pi-hole as a DNS device to gather (essentially) my entire online life and then analyze the data for trends. If I’m correct, the potential here is strong.

Thoughts?

Can you explain your experiment? An experiment isn’t just about collecting data. It’s about formulating a question or idea first. What question do you want answered? If the answer doesn’t lead to decisions, that’s not really QS, it’s just collecting data for fun…
Also, a lot of questions arise once you start an experiment: How much data should I collect? Do I need to blind myself? What’s the most accurate and convenient way to measure the variable of interest? What effect size am I looking for in my outcome variable? How should I analyze the collected data? Do I need to remove outliers? Can I use correlation or regression analysis on my data? Do I need to account for multiple comparisons? Etc.

4 posts were merged into an existing topic: Experiments in Self-Research

I should not have originally described this research project as an “experiment”. My interest in tracking personal Internet usage data isn’t about formally experimenting using the scientific method. This undertaking is ultimately about growing my mindfulness practice as a Technologist.

@bretbernhoft I didn’t know much about the Pi-hole, so I did a little reading and found this post, which gave me a small window into the data that’s being stored and could be analyzed: Pi-hole: 3 years later.


Thank you. That’s exactly the kind of thing I’m looking to do. If I do return to this, I’ll make sure to share the findings here.

I wasn’t totally clear on the data collected. Is it top-level DNS data only?

The information being tracked (as far as I’m aware) includes IP address(es), client data, IPv4 vs IPv6, domain(s) queried, status, replies and actions. So there’s a bunch of data automatically being tracked by the Pi-hole that any user/owner of one can work with over a given period of time.

With that said, it will be interesting to see which websites I visit and when, graphed as a bar chart of request volume over time.
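As a rough illustration of that kind of chart, here is a small standard-library sketch that buckets query timestamps by day and prints a text bar chart. The input format, a list of (Unix timestamp, domain) pairs, is an assumption about what an export might yield, and the sample rows are invented. A plotting library could replace the text output.

```python
from collections import Counter
from datetime import datetime, timezone


def requests_per_day(rows):
    """Count requests per calendar day (UTC) from (unix_ts, domain) pairs."""
    counts = Counter()
    for ts, _domain in rows:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        counts[day] += 1
    # Sort by date string so the chart reads chronologically.
    return dict(sorted(counts.items()))


def text_bar_chart(counts, width=40):
    """Render one line per label, with bars scaled to the largest count."""
    peak = max(counts.values(), default=0)
    lines = []
    for label, n in counts.items():
        bar = "#" * max(1, round(n / peak * width))
        lines.append(f"{label} {bar} {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample data: three requests on one day, one on the next.
    sample = [
        (1700000000, "example.com"),
        (1700000060, "example.com"),
        (1700003600, "example.org"),
        (1700086400, "example.com"),
    ]
    print(text_bar_chart(requests_per_day(sample)))
```

Days are computed in UTC here for simplicity. For tracking personal habits, converting to local time would probably give a more meaningful picture of when browsing actually happens.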

I have an idea of what most of this will look like, but not “actions.” What are those?

Whether the DNS request was “whitelisted” or “blacklisted”. In other words, whether the Pi-hole allowed or disallowed the lookup.
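To make that concrete, here is a hedged sketch of how stored status values might be grouped into allowed versus blocked. The numeric codes below are an assumption from my reading of the Pi-hole documentation (for example, 2 and 3 for forwarded and cached answers, 1, 4 and 5 for blocklist matches). There are more codes than these, and they may vary by version, so anything unrecognized is counted as "other" rather than guessed.

```python
from collections import Counter

# ASSUMED status codes; verify against the docs for your Pi-hole version.
ALLOWED_CODES = {2, 3}      # forwarded upstream, or answered from cache
BLOCKED_CODES = {1, 4, 5}   # blocklist, regex, or exact-match deny entries


def classify(status):
    """Map a numeric status code to 'allowed', 'blocked', or 'other'."""
    if status in ALLOWED_CODES:
        return "allowed"
    if status in BLOCKED_CODES:
        return "blocked"
    return "other"


def summarize(statuses):
    """Count how many queries fall into each category."""
    return dict(Counter(classify(s) for s in statuses))


if __name__ == "__main__":
    # Invented status values, as they might appear in the status column.
    print(summarize([2, 3, 2, 1, 4, 99]))
```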
