Quantifying Habits using Predictive Models

Howard_Curry · July 6, 2014, 5:21pm

I have been thinking about habits. A habit is a fixed way of doing things; in other words, habits are predictable.

For example: I like to get up early and get work done before my daughter wakes up. On Saturdays and Sundays I often don’t wake up early because I tend to stay up later on Fridays and Saturdays.

The distribution of my start times is bimodal, with peaks at 5 AM and 8 AM. The average (6:30 AM), like any other point estimate, is a terrible way of quantifying my habit because I rarely start at that time. A histogram or kernel density estimate is not a good summary either.

A predictive model seems like a good candidate to describe a habit. The model should predict that I will start working at 8 AM on Saturday and Sunday and at 5 AM the rest of the week, which is exactly my habit.
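
A minimal sketch of that idea, assuming start times are logged as hours after midnight together with the weekday (0 = Monday). The data and the function names are made up for illustration, and the "model" is nothing more than a per-group median:

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical log: (weekday, start time in hours after midnight), 0 = Monday.
observations = [
    (0, 5.0), (1, 5.25), (2, 4.75), (3, 5.0), (4, 5.0), (5, 8.0), (6, 8.25),
    (0, 5.0), (1, 5.0), (2, 5.25), (3, 4.75), (4, 5.0), (5, 7.75), (6, 8.0),
]


def day_type(weekday):
    """Map a weekday number to the group used by the model."""
    return "weekend" if weekday >= 5 else "weekday"


def fit_group_model(data):
    """Return the median start time for each day type."""
    groups = defaultdict(list)
    for weekday, start in data:
        groups[day_type(weekday)].append(start)
    return {name: median(values) for name, values in groups.items()}


def predict(model, weekday):
    """Predict the start time for a given weekday."""
    return model[day_type(weekday)]


model = fit_group_model(observations)
overall_mean = mean(start for _, start in observations)

print("overall mean:", round(overall_mean, 2))
print("predicted Monday start:", predict(model, 0))
print("predicted Saturday start:", predict(model, 5))
```

With this made-up data the overall mean lands between the two peaks, at a time that never actually occurs in the log, while the grouped prediction returns one of the two peaks depending on the day.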

This can carry over to other habits: when and how much coffee I consume, when and how much time I spend watching TV, etc.

So a habit can be quantitatively defined as the output of a predictive model: my habit is what the model predicts.

What do you think of this approach of quantifying habits?

ejain · July 7, 2014, 7:29pm

Simple averages are “predictive models”, too.

I’ve been looking into quantifying habits for this service. Doing some kind of automatic feature detection is tempting, but produces too much garbage. It’s easier to get good results by calculating how closely someone follows specific habits (e.g. getting up early, or sleeping in on weekends).
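
One way to read "how closely someone follows specific habits" is as the fraction of applicable days on which a rule held. The sketch below is an assumption about what such a score could look like, not a description of how the service computes it; the thresholds, the log, and the `adherence` helper are hypothetical:

```python
from datetime import date

# Hypothetical log: (date, wake-up time in hours after midnight).
log = [
    (date(2014, 6, 30), 5.0),   # Monday
    (date(2014, 7, 1), 5.5),    # Tuesday
    (date(2014, 7, 2), 7.0),    # Wednesday
    (date(2014, 7, 3), 5.0),    # Thursday
    (date(2014, 7, 4), 5.25),   # Friday
    (date(2014, 7, 5), 8.0),    # Saturday
    (date(2014, 7, 6), 6.0),    # Sunday
]


def adherence(entries, applies, followed):
    """Fraction of applicable days on which the habit was followed.

    Returns None when the habit never applies in the given entries.
    """
    relevant = [(d, t) for d, t in entries if applies(d)]
    if not relevant:
        return None
    hits = sum(1 for d, t in relevant if followed(t))
    return hits / len(relevant)


# Assumed habit definitions: up by 6 AM on weekdays, not before 7:30 AM on weekends.
early_on_weekdays = adherence(
    log, applies=lambda d: d.weekday() < 5, followed=lambda t: t <= 6.0
)
sleep_in_on_weekends = adherence(
    log, applies=lambda d: d.weekday() >= 5, followed=lambda t: t >= 7.5
)

print("early on weekdays:", early_on_weekdays)
print("sleeping in on weekends:", sleep_in_on_weekends)
```

Each habit is scored independently, so adding a new habit means writing one more pair of rules rather than retraining anything.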

Diogo_Neves · July 8, 2014, 11:05am

Great question. I’d add to that the home-worker routine with occasional visits to the office.

I work from home most of the days but usually meet someone in town on Tuesday and go down to the office on Friday. How would you go about quantifying that kind of behaviour?

Howard_Curry · July 8, 2014, 3:00pm

Eric, you are right: not just simple averages but even constant models are predictive.
So my assertion really should have been written as:

The best predictive model you can create is the best quantitative description of your habit.

“Best” here refers to predictive accuracy under a suitable error measure.
This seems logical to me, but I was curious whether there are better ways to quantify habits.
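
As a rough illustration of choosing the "best" model by an error measure, the sketch below compares a constant model with a day-type model using mean absolute error on held-out days. The data is invented, and mean absolute error is only one possible choice of error measure:

```python
from statistics import mean

# Hypothetical data: (weekday, start time in hours after midnight), 0 = Monday.
train = [(0, 5.0), (1, 5.0), (2, 5.0), (3, 5.0), (4, 5.0), (5, 8.0), (6, 8.0)]
test = [(0, 5.0), (1, 5.5), (2, 5.0), (3, 4.5), (4, 5.0), (5, 8.0), (6, 7.5)]


def mae(predictions, actuals):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(p - a) for p, a in zip(predictions, actuals))


# Model A: a constant model that always predicts the training mean.
constant = mean(start for _, start in train)

# Model B: separate means for weekdays and weekends.
weekday_mean = mean(start for day, start in train if day < 5)
weekend_mean = mean(start for day, start in train if day >= 5)


def predict_constant(day):
    return constant


def predict_day_type(day):
    return weekend_mean if day >= 5 else weekday_mean


actuals = [start for _, start in test]
mae_constant = mae([predict_constant(day) for day, _ in test], actuals)
mae_day_type = mae([predict_day_type(day) for day, _ in test], actuals)

print("constant model MAE:", round(mae_constant, 3))
print("day-type model MAE:", round(mae_day_type, 3))
```

Scoring on days that were not used for fitting matters here: a model with many features can look better on the days it was fitted to without actually describing the habit any better.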

Diogo,
It should be easy to capture the home worker routine that you describe. Day of week as well as location seem like features that a predictive model for start time will typically include.

Diogo_Neves · July 8, 2014, 7:27pm

Howard - do you know of any open source projects where I can look at the models?
Would love to learn more about

ejain · July 9, 2014, 5:50am

Weka is a nice open source tool to play with different machine learning algorithms, but you still need to know what you are doing.

Howard_Curry · July 9, 2014, 7:18pm

R and Weka are probably the most common open source tools for predictive modelling. GNU Octave is also a good choice. I personally use R or MATLAB.

The predictive power and complexity of the model you can support depend on how much data you have.

For start time prediction these are the features that I use:

Day Of Week
Location
Lags (that is, the start time the day before, the day before that, etc.)
Day-of-week lags (the start time 7/14/21… days ago)

If you have just a few months of data, you can probably use only one of the features (probably day of the week) without severe over-fitting.
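
For anyone who wants to try the lag features, here is a rough sketch of how they could be built from a plain list of daily start times. The function name, the column names, and the choice of two daily and two weekly lags are assumptions for illustration; location is left out because it would come from a separate log:

```python
def build_features(start_times, weekdays, daily_lags=(1, 2), weekly_lags=(7, 14)):
    """Build one feature row per day that has enough history for every lag."""
    max_lag = max(daily_lags + weekly_lags)
    rows = []
    for i in range(max_lag, len(start_times)):
        row = {"day_of_week": weekdays[i], "target": start_times[i]}
        # Start times on the immediately preceding days.
        for lag in daily_lags:
            row[f"lag_{lag}"] = start_times[i - lag]
        # Start times on the same weekday in earlier weeks.
        for lag in weekly_lags:
            row[f"weekly_lag_{lag}"] = start_times[i - lag]
        rows.append(row)
    return rows


# Hypothetical three weeks of data starting on a Monday (0 = Monday).
weekdays = [day % 7 for day in range(21)]
start_times = [8.0 if wd >= 5 else 5.0 for wd in weekdays]

rows = build_features(start_times, weekdays)
print(len(rows), "usable rows out of", len(start_times), "days")
print(rows[0])
```

The first 14 days are used up as history for the longest lag, which is one concrete reason a few months of data supports only a small number of features.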

Diogo_Neves · July 9, 2014, 9:55pm

Great responses! Thanks!