Science in Action: Omega-3 (measurement improvement)

I’ve learned a few things. As some of you may know, I’ve been measuring my balance by standing on a board that is balanced on a tiny platform (a pipe plug) — pictures here. Now and then the board would slip off the platform. I supposed this was a failure of balance but I wasn’t sure, especially if it happened as soon as I stood on it. So I got another board into which my brother-in-law kindly drilled the perfect-size hole so that the plug will never slip:

New board (with hole for plug)

To see if this made a difference I did an experiment with a design I have never used before but that I really like: ABABABAB… (one day per condition). In other words, Monday I tested my balance with the old board, Tuesday with the new board, Wednesday with the old board, Thursday with the new board, etc. Simple, efficient, well-balanced. Here are the results:

new board vs. old board

The red line is fit to the red points, the blue line to the blue points. The two lines are constrained to have the same slope.
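For readers who want to try this kind of fit, here is a minimal sketch of two lines constrained to share one slope, each with its own intercept. It is not the code behind the plot above. The function name `fit_parallel_lines` and the example numbers are made up for illustration. The common slope is the pooled within-condition slope, which is the least-squares solution for this model.

```python
from statistics import mean


def fit_parallel_lines(days, scores, labels):
    """Least-squares fit of one line per condition, all sharing a single slope.

    Returns (slope, intercepts), where intercepts maps each label to its intercept.
    """
    # Group the (day, score) points by condition label.
    groups = {}
    for d, s, g in zip(days, scores, labels):
        groups.setdefault(g, []).append((float(d), float(s)))

    sxy = 0.0  # pooled within-condition cross-products
    sxx = 0.0  # pooled within-condition sum of squares of day
    centers = {}
    for g, pts in groups.items():
        mx = mean(x for x, _ in pts)
        my = mean(y for _, y in pts)
        centers[g] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pts)
        sxx += sum((x - mx) ** 2 for x, _ in pts)

    slope = sxy / sxx
    # Each line passes through its own condition's center of mass.
    intercepts = {g: my - slope * mx for g, (mx, my) in centers.items()}
    return slope, intercepts


# Made-up ABAB data, NOT the real measurements: old board on even days, new on odd days.
days = list(range(8))
labels = ["old" if d % 2 == 0 else "new" for d in days]
scores = [1.0 + 0.05 * d + (0.2 if g == "new" else 0.0) for d, g in zip(days, labels)]

slope, intercepts = fit_parallel_lines(days, scores, labels)
board_effect = intercepts["new"] - intercepts["old"]
```

With a shared slope, the difference between the two intercepts (`board_effect` here) is a single number that summarizes the effect of the board, adjusted for any steady trend over days.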

Well, that’s clear. I expected my balance to be better with the new board, actually.

Speaking of the unexpected, I made another measurement improvement that truly surprises me — the surprise is that I never did it before. When I looked at my early balance data (the first 10 or so days of data) I saw that my balance improved for the first 5 trials and was roughly constant after that. Each session was 20 trials so I dropped (excluded) the first 5 trials from my analyses — considering them “warm-up” trials. I took the mean of the last 15 trials. That seemed very reasonable and I thought nothing of it.

Recently I asked again how performance changes over a session. The answer was a bit different: I found that performance improved for the first 10 trials. Now there are 30 trials in a session, so dropping the first 10 of them seemed okay. And that’s what I did.

But then I looked at how variability changed over a session. I expected the earliest trials to be more variable than the rest but the data didn’t show that. Variability was pretty constant from the first trials to the last. Hmm. Maybe I was losing valuable information by not including those early trials in my averages. It occurred to me: why not allow for the warm-up effect by modelling it, rather than by excluding it? (Modelling it means estimating it and then subtracting it.) I did that, and then I looked at the size of the standard errors of the means (standard errors based on the residuals from the fit) for the most recent 40 days. This is essentially the error in measurement. Here is what I found. Median standard errors:

First 10 trials (out of 30) excluded: 0.073
First 5 trials excluded: 0.064
First trial excluded: 0.061
No trials excluded: 0.059

My eyes opened wide when I saw these numbers. Oh my god! I was throwing away so much! A reduction in error from 0.073 to 0.059: that’s almost 20% better.
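The idea of modelling the warm-up effect instead of excluding trials can be sketched as follows. This is one simple way to do it, not necessarily the exact fit used for the numbers above. It assumes an additive model (each trial's score is the session level plus a trial-position effect plus noise). The names `median_standard_error` and `simulate_sessions` are hypothetical, and the data are simulated, not real balance scores.

```python
import math
import random
from statistics import mean, median


def median_standard_error(sessions, n_excluded=0):
    """Median, across sessions, of the standard error of the session mean.

    The first n_excluded trials are dropped. A trial-position (warm-up) effect
    is then estimated from the remaining trials and subtracted, so the
    standard errors are based on the residuals.
    """
    kept = [s[n_excluded:] for s in sessions]
    n_trials = len(kept[0])
    grand_mean = mean(x for s in kept for x in s)
    # Warm-up estimate: how far each trial position sits from the grand mean,
    # averaged over sessions.
    position_effect = [
        mean(s[t] for s in kept) - grand_mean for t in range(n_trials)
    ]
    errors = []
    for s in kept:
        adjusted = [x - e for x, e in zip(s, position_effect)]
        m = mean(adjusted)
        variance = sum((x - m) ** 2 for x in adjusted) / (n_trials - 1)
        errors.append(math.sqrt(variance / n_trials))
    return median(errors)


def simulate_sessions(n_sessions=40, n_trials=30, seed=0):
    """Simulated scores (an assumption): a warm-up curve plus constant noise."""
    rng = random.Random(seed)
    sessions = []
    for _ in range(n_sessions):
        level = rng.gauss(2.0, 0.2)  # that day's underlying score
        sessions.append([
            level - 0.5 * math.exp(-t / 3.0) + rng.gauss(0.0, 0.3)
            for t in range(n_trials)
        ])
    return sessions


sessions = simulate_sessions()
for k in (10, 5, 1, 0):
    print(k, "trials excluded:", round(median_standard_error(sessions, k), 3))
```

If variability really is constant across a session, then once the warm-up effect is removed every extra trial kept should shrink the standard error roughly in proportion to one over the square root of the number of trials. That fits the pattern in the list above.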

2 Responses to “Science in Action: Omega-3 (measurement improvement)”

  1. Tim Lundeen Says:

    Fascinating, both the consistency of the results with/without slippage, and the standard error reduction.

  2. seth Says:

    Thanks, Tim. It was your arithmetic results that led me to the standard error reduction. Your arithmetic results led me to do a very similar task, as you know. I didn’t want to place constraints on the 100 simple arithmetic problems I did each session (it was just too complicated — too many things that might be important) so instead I did my best to equate different days by modelling — by estimating and removing the effects of this and that. That gave me the idea of doing it here, too.