Background: I've been gathering data -- tons! -- on my health and well-being and various variables possibly connected to my health (supplement-intake, diet, exercise, etc.) for years in Word files, and want to make use of it, and also continue to be able to add data, even more than I have been doing, in an easier way. Specifically, my goals -- under the obvious overall rubric of maximizing health -- are:
1) to transfer the data I already have to a spreadsheet or some other form that will make it easier to analyze and see connections
2) to be able, from now on, to enter data directly into the spreadsheet (or other systematically organized format)
3) to be able to do regression analyses or the like to find connections among variables. Being able to visualize the data would be nice, but that's minor.
My vision, thus, is that I would have a massive spreadsheet, and could find, use, or create macros (I'm new to Excel -- sorry if "macro" isn't quite the right word!) that find correlations.
Simple, I think:
0) Set up a spreadsheet with time increasing to the right (with the columns [question: how finely divide units of time?]), and variables listed in the rows.
1) Write Visual Basic macros to convert the old Word data into Word tables and then move it over to the spreadsheet. (This part is easy.)
2a) Enter things that don't change much using copy&paste, far into the future. ("Zn 10 mg": copy, paste, paste, paste, paste.... Can delete or alter in the future if I change the amount of Zn I take.) (This part is also easy.)
2b) Enter by hand new data that doesn't come from a separate program ("2016-08-02; 21:00-23:00 / Social interaction: warmth level 7").
2c) Enter new data that does comes from a separate program automatically where possibly. I don't have many such programs right now, and this might have to be done partly manually, like an end-of-the-day or -week dump from the program (the only one I have now is CRON-O-Meter, which records dietary intake), write a macro to reformat if needed, then import into the spreadsheet.
3) Find, or write, macros (or whatever they're called in the spreadsheet world) to find correlations.
I see two problems, but I'm wondering whether there are others that aren't coming to mind.
A) The time-scale needs to be alterable, no doubt. At first I was thinking hour-long time columns would be fine, but that obviously won't work for tracking correlations among diet, exercise, and blood glucose, for example. Ten-minute increments is probably what I'll start with. But whatever fineness of scale I decide upon to begin with, it's likely I'll want to change it later. I think that's manageable, though.
B) The amount of data might be so large that I will need to get something more powerful than a typical personal computer to manage it! Twenty years of data split into 10-minute increments is 1025280 columns, and I'll probably want to track at least 400 variables, so that's over 400 million cells of data (most blank, of course -- many will be values from blood tests taken only occasionally). That's a lot. But I underestimate the power of my HP Pavilion. (Of course, if I want to enter my 23andMe data, the number of cells of data increases even more. But genetic data consists of fixed contextual variables, not sure that would belong in the spreadsheet. It's sort of like height and birth date.)
Anyone see other problems with my approach, or with my reasoning about my goals as stated here?
The post by Irene_Gabashvili here --
-- mentions a couple of websites/programs that I could use my data with, I think. But I'd have to learn how to connect my data to the programs. Surely doable, but maybe complicated.
Mainly, for now, I'd like to be entering all my data into one spreadsheet (or spreadsheet-like program), and I'd like to be able to find important correlations.