13 uBiome samples over 16 months

I recently received another set of OTU data from uBiome for November 2016. I just reran my analysis and updated the plots on my GitHub. I would love feedback about the visualization, analysis methods (not much here so far), and the results. uBiome says that this most recent sample matches 96% to their healthy prototype.

My results are here:

Something I’ve wondered about is what the Nyquist frequency is for these kinds of “signals.” I have samples here, collected within days of each other, that vary greatly. This leads me to believe the Nyquist frequency could be very high. On the other hand, I’d have to examine the Eric Alm paper, but I believe he found the microbiome does not vary much day to day. I wonder what exactly that means.

I think this is an important point, because when correlating this data with other data (say diet, weight, bloodwork) you can end up with incorrect correlations. For example, if your data is sampled at a low frequency but the Nyquist frequency is high, first you will have aliasing effects, and second, you may effectively be correlating only certain frequency components of the data, which will yield a biased result.
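To make the aliasing worry concrete, here is a minimal sketch with made-up numbers (nothing here comes from my actual samples). It assumes a hypothetical quantity that cycles every 2.5 days and is sampled every 2 days. At the sample times it is indistinguishable from a much slower 10-day cycle.

```python
import math

def sample(freq_per_day, times):
    """Evaluate a unit cosine of the given frequency (cycles/day) at each time."""
    return [math.cos(2 * math.pi * freq_per_day * t) for t in times]

# Hypothetical values chosen only for illustration.
true_freq = 0.4          # cycles per day (one cycle every 2.5 days)
sample_interval = 2.0    # days between samples
sampling_rate = 1 / sample_interval   # 0.5 samples/day, so Nyquist is 0.25 cycles/day

sample_times = [k * sample_interval for k in range(10)]

# Fold the true frequency back into the band below the Nyquist frequency.
alias_freq = abs(true_freq - round(true_freq / sampling_rate) * sampling_rate)

fast = sample(true_freq, sample_times)
slow = sample(alias_freq, sample_times)

# Largest difference between the two signals at the sample times.
max_gap = max(abs(a - b) for a, b in zip(fast, slow))

print(f"alias frequency: {alias_freq:.2f} cycles/day")
print(f"largest difference at the sample times: {max_gap:.2e}")
```

Any correlation computed from those samples would be against the slow apparent cycle, not the fast one that is really there.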

Anyone have thoughts about this?

This data needs some context: Did you change your diet at any point, travel, have any illness etc?

The data is meant to be looked at and evaluated as is. Do you think I had travel or illness during the collection? I want to hold off on giving a lot of information because I am finding that people read anything they want into microbiome data. For example, these samples could be completely moot if the Nyquist frequency is high. It wouldn’t matter whether I had traveled or been ill during that time.

What do you think happened during this time period?

Isaac, I think your sampling rate is well below the Nyquist rate.
Normal gut transit time is on the order of one to two days, your microbiome can also change rapidly, and the growth of foodborne illnesses can be as fast as 8 hours. You may also see changes over the length of a stool sample, given that it represents multiple meals of different foods, so give consideration to surface versus interior sampling. There are also other factors, such as exercise and stress, that may change your results. Even how long the sample was exposed to oxygen may have an impact.

You could improve the measurements, but that would take some real dedication and $$$. Setting up a two level Design of Experiments test with daily sampling over a month. Eating the same food for all meals in a day (bracket your sample with the same food). Ingesting a non-toxic marker to know where to sample on the stool…

Then you get to what to do with the results. It may be that diversity is better than boosting the concentration of a specific family of microbes. One potential source of information on your microbiome is the American Gut Project.

Good luck.

Thanks for the feedback. My next step is to present the correlations of the OTUs among the samples. After that, I am going to take my weight and any meds and regress on those to predict OTU percentages. I’ve been able to get all the data into pandas at this point, which has made doing analysis a lot easier. Hope to show some results by the end of the week.

Correlations among the samples as requested.

EDIT: Found an extra sample in the batch that wasn’t mine and removed it. Plots are correct now.

I did a linear regression of phyla against month to look at seasonal effects but found none, with a p-value of ~0.3.

Entered my uBiome wellness match for each sample and did a regression on it with respect to my phyla. Here are the results:

Formula: Y ~ <phylum_Actinobacteria> + <phylum_Bacteroidetes>
             + <phylum_Fibrobacteres> + <phylum_Firmicutes> + <phylum_Fusobacteria>
             + <phylum_Proteobacteria> + <phylum_Verrucomicrobia> + <intercept>

Number of Observations:         13
Number of Degrees of Freedom:   8

R-squared:         0.9173
Adj R-squared:     0.8015

Rmse:              4.4640

F-stat (7, 5):     7.9239, p-value:     0.0184

Degrees of Freedom: model 7, resid 5

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
phylum_Actinobacteria     0.0779     0.0359       2.17     0.0818     0.0076     0.1482
phylum_Bacteroidetes     0.0778     0.0358       2.17     0.0820     0.0076     0.1480
phylum_Fibrobacteres     0.0863     0.0464       1.86     0.1218    -0.0046     0.1772
phylum_Firmicutes     0.0779     0.0358       2.17     0.0818     0.0076     0.1481
phylum_Fusobacteria     0.0898     0.0408       2.20     0.0790     0.0098     0.1698
--------------------------------------------------------------------------------
phylum_Proteobacteria     0.0777     0.0358       2.17     0.0821     0.0075     0.1479
phylum_Verrucomicrobia     0.0780     0.0357       2.18     0.0809     0.0079     0.1480
     intercept -77752.4916 35831.8130      -2.17     0.0822 -147982.8451 -7522.1382
---------------------------------End of Summary---------------------------------

Another interesting regression. I took Nexium after gallbladder removal. Here is a regression against it.

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <phylum_Actinobacteria> + <phylum_Bacteroidetes>
             + <phylum_Fibrobacteres> + <phylum_Firmicutes> + <phylum_Fusobacteria>
             + <phylum_Proteobacteria> + <phylum_Verrucomicrobia> + <intercept>

Number of Observations:         13
Number of Degrees of Freedom:   8

R-squared:         0.2525
Adj R-squared:    -0.7939

Rmse:             34.5822

F-stat (7, 5):     0.2413, p-value:     0.9547

Degrees of Freedom: model 7, resid 5

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
phylum_Actinobacteria    -0.0595     0.2778      -0.21     0.8389    -0.6040     0.4850
phylum_Bacteroidetes    -0.0599     0.2776      -0.22     0.8377    -0.6039     0.4842
phylum_Fibrobacteres    -0.0658     0.3592      -0.18     0.8618    -0.7698     0.6382
phylum_Firmicutes    -0.0599     0.2776      -0.22     0.8376    -0.6040     0.4842
phylum_Fusobacteria    -0.0012     0.3161      -0.00     0.9970    -0.6207     0.6183
--------------------------------------------------------------------------------
phylum_Proteobacteria    -0.0603     0.2773      -0.22     0.8364    -0.6039     0.4833
phylum_Verrucomicrobia    -0.0614     0.2769      -0.22     0.8332    -0.6041     0.4813
     intercept 59928.4745 277587.4209       0.22     0.8376 -484142.8705 603999.8195
---------------------------------End of Summary---------------------------------

This says there is no detectable relationship between Nexium and phyla (overall F-test p-value of 0.95).

One last one for the evening. After my gallbladder removal I had IBS (first time in my life). I was actually diagnosed with visceral hypersensitivity and was put on varying doses of Zoloft for a while to see if it helped (it didn’t). Here is a regression against it.

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <phylum_Actinobacteria> + <phylum_Bacteroidetes>
             + <phylum_Fibrobacteres> + <phylum_Firmicutes> + <phylum_Fusobacteria>
             + <phylum_Proteobacteria> + <phylum_Verrucomicrobia> + <intercept>

Number of Observations:         13
Number of Degrees of Freedom:   8

R-squared:         0.7913
Adj R-squared:     0.4992

Rmse:             21.8532

F-stat (7, 5):     2.7085, p-value:     0.1451

Degrees of Freedom: model 7, resid 5

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
phylum_Actinobacteria    -0.2534     0.1756      -1.44     0.2086    -0.5974     0.0907
phylum_Bacteroidetes    -0.2536     0.1754      -1.45     0.2078    -0.5974     0.0902
phylum_Fibrobacteres    -0.2868     0.2270      -1.26     0.2621    -0.7317     0.1581
phylum_Firmicutes    -0.2537     0.1754      -1.45     0.2077    -0.5975     0.0901
phylum_Fusobacteria    -0.3105     0.1997      -1.55     0.1808    -0.7020     0.0810
--------------------------------------------------------------------------------
phylum_Proteobacteria    -0.2541     0.1753      -1.45     0.2068    -0.5976     0.0894
phylum_Verrucomicrobia    -0.2553     0.1750      -1.46     0.2043    -0.5983     0.0876
     intercept 253763.0286 175413.6885       1.45     0.2076 -90047.8007 597573.8580
---------------------------------End of Summary---------------------------------

Not surprising you’d get a good p-value here. The uBiome algorithm is just a distance similarity metric comparing your samples to a select group of self-reported users. At the Phylum level, it’d be odd if you didn’t correlate with other, Western human beings.

Exactly. This was my sanity check. What is interesting is that PPI usage did not significantly change phyla distribution. I am now looking at the individual regressions of the taxa being predicted by PPI use (Nexium in this case). I would expect Akkermansia to be modulated by PPI use.

@sprague

How do you decide between doing a correlation for each taxon vs. a single linear regression for each?

EDIT 1: Upon reading a bit today, regression is the wrong thing here. I think I want correlation analysis because this is exploratory. If I were to do, say, a prospective study, then I feel regression would be appropriate. Will update plots and post later this week.

I think I get what you’re trying to do. I’ve tried something like this myself. You want to do a pairwise Pearson correlation on every combination of taxa, so you can find which ones rise/fall more closely with each other. Here’s an example of what happened when I tried that on a bunch of my data:

See how some taxa are 100% correlated with each other? Could be that they are perfectly joined: one exclusively eats something that the other exclusively produces. Or could be that uBiome simply goofed and is labelling as separate two things that are really the same.

I played with this a long time and, long story short, I didn’t find convincing reasons why many pairwise combinations would be interesting. More likely it’s a whole cluster of taxa that rise/fall with one another, and to find that you’ll need regression, machine learning, or some such more complex correlation algorithm.
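For anyone who wants to try the pairwise approach, here is a minimal sketch in plain Python. The taxon names and counts are made up for illustration (they are not from anyone's uBiome data), and `taxon_B` is deliberately set to exactly twice `taxon_A` to reproduce the "100% correlated" case.

```python
from itertools import combinations
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical abundances, one value per sample.
abundance = {
    "taxon_A": [10, 20, 15, 30, 25],
    "taxon_B": [20, 40, 30, 60, 50],   # exactly 2x taxon_A
    "taxon_C": [30, 10, 25, 5, 15],
}

# Correlate every unordered pair of taxa.
pairwise = {
    (a, b): pearson(abundance[a], abundance[b])
    for a, b in combinations(abundance, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:+.3f}")
```

The number of pairs grows roughly with the square of the number of taxa, so with only a handful of samples some high correlations will show up by chance alone.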

Incidentally, here are the correlations I calculated for your data:

Hi Isaac,
I am not surprised you did not see much seasonal variation, given that most of us have refrigerated storage in our food chain and have access to produce from the opposite hemisphere. The result is that we eat almost the same thing year round. I wonder whether your results would be much different if you looked at a 6-month seasonal variation instead of a year.

Re IBS, did you try the low FODMAP diet and did that change your gut microbiome?

@OP_Engr I did do low FODMAP for a while. I get your suggestion a lot. IBS is multifaceted and not purely microbiome modulated, as is now being popularized. For example, I don’t meet the Rome III criteria for IBS-C or IBS-D and technically fall under the category of IBS-U. I was also tested for SIBO twice and was negative both times. As best we can tell right now, there are 2 issues at play. The first is that my stomach makes too much acid, which leads to dumping syndrome. The second is that when the first is controlled, I get vagal responses when my colon moves, but only sometimes. Otherwise, I’m very healthy and generally work out several times/week. I have also had lots of bloodwork done and it all comes back normal. My wife is a vegetarian so I eat well also.

All that said, I’m always open to suggestions.

EDIT 1: I will add the analysis you suggest to my list. Thank you.

@sprague Thanks for your time in taking a look at my data. The problem is that with 13 samples, you can’t regress on more than a dozen items before you run out of DoF. Same with a machine learning approach. If I had 50+ samples that were regularly spaced in time, I believe I could do something more fruitful. I take the lower bound to be roughly what the CFS Cornell study did.
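As a quick check on how tight the degrees of freedom already are, the adjusted R-squared values in the three summaries above can be recomputed from the reported R-squared using 13 observations and 7 predictors. This is only a sketch using the standard adjusted R-squared formula; the function and variable names are mine.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R-squared; the intercept uses one extra degree of freedom."""
    resid_dof = n_obs - n_predictors - 1
    return 1 - (1 - r2) * (n_obs - 1) / resid_dof

n_obs = 13          # samples
n_predictors = 7    # phyla in the formula

# (reported R-squared, reported adjusted R-squared) from the summaries above.
reported = {
    "wellness match": (0.9173, 0.8015),
    "Nexium": (0.2525, -0.7939),
    "Zoloft": (0.7913, 0.4992),
}

recomputed = {
    name: adjusted_r2(r2, n_obs, n_predictors)
    for name, (r2, _) in reported.items()
}

print(f"residual DoF: {n_obs - n_predictors - 1}")
for name, (r2, adj) in reported.items():
    print(f"{name}: R2={r2:.4f} reported adj={adj:.4f} recomputed adj={recomputed[name]:.4f}")
```

With only 5 residual degrees of freedom, the penalty factor (n - 1) / (n - p - 1) is 12 / 5 = 2.4, which is why even an R-squared of 0.25 turns into a strongly negative adjusted value.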