Hello Bloggers!! Just a quick post this morning to get the brain working and blood flowin’! This post is about weird surveys, and to be honest, I believe all surveys are faulty. That may not be the correct word, but it’s too early yet. I’ll think of it later and wish I could insert it into this post…I do that with arguments too. “WHY didn’t I say that?” or “I should have said…” I’m sure we have all experienced that.
Anyway, I will write a blog someday, when I have more time, about all the things wrong with surveys. But, for now…enjoy these “weird surveys” from www.weirdfacts.com (but of course, the witty sarcasm in bold and italics cannot be blamed on the poor innocent writers of weirdfacts.com).
- Nobody yet has explained satisfactorily why couples who marry in January, February, and March tend to have the highest divorce rates. Can anyone explain why people who get married in June, July, and August divorce? This survey named a quarter of the year and acted like it’s a mystery! Some three months have to have the highest number of divorces; it can’t have anything to do with which months they are, can it?
- A recent study conducted by the Shyness Clinic in Menlo Park, California, revealed that almost 90 percent of Americans label themselves as shy. Of course they do; it’s cute and an acceptable term. “I’m loud and obnoxious” doesn’t give the same ooomph.
- The average American/Canadian eats about 11.9 lbs of cereal per year. This should be divided between adults and children.
- The average American/Canadian drinks about 600 sodas per year. That would be less than two a day. Yeah, I can see that.
- More people use blue toothbrushes than red ones. I can see this being correct too. And most of these I don’t think are weird at all. Blue is probably most people’s favorite color, and as for the toothbrush thing, well, guys might think red looks a little too much like pink.
- According to a 1995 survey, 7 out of 10 British dogs get Christmas gifts from their doting owners. Again, not weird. Or if it is, I’m weird too. I buy for my dog every Christmas. He’s one of the family!
- The average American family views television six hours each day. Ok, that’s just ridiculous. This one should be split between kids and adults, and between weekends and weekdays. By saying American families watch TV for six hours a day, they get the shock factor out there that all Americans do is sit around all day and watch TV, but that’s not true for most of us. They should also separate the employed adults from the unemployed.
- About two hundred babies are born worldwide every minute. I believe that, but how many people die per minute? It gives the impression that the population is bursting…ok, yeah we’re living longer and it could be, but we have to consider deaths in order to get a true picture of population numbers.
- Your statistical chance of being murdered is one in twenty thousand. Ok, wow…I’d like to see how they came up with this one. People in the nation murdered divided by the population numbers? It’s the smallest ratio I’ve ever seen. Twenty thousand? Hmmm….glad I don’t believe surveys.
Happy Bloggin’ and have a terrific Wednesday. Thank you for dropping by…

Hmmmm…… the issue is less with the surveys and more with the stats themselves. It’s really helpful to recall one of the biggest truisms in my field: all models are wrong, but some models are useful (I wish I could claim that, but it comes from the statistician George Box). Two of the hardest things about interpreting numbers like these are that most of the time they don’t give any idea of the degree of dispersion, and that even if they did, most people wouldn’t be able to interpret it. This dispersion can come from two places.
The best way to know what a population thinks is to ask the entire population. Most of the time, though, this isn’t feasible, so you have to sample the population. This sampling can build in error and biases that you have to try to minimize. For example, you don’t want to try to get an idea of how well you’re marketing your baby bottles by surveying the elderly, nor would you want to try to get an idea of the national political mood by randomly selecting passers-by outside a Tea Party meeting. You also want to try to get the biggest representative sample you can afford, since the larger the sample size, the more likely it is that the answer you find is reasonably close to the true value for the population you’re trying to measure. This is all part of what’s called sampling error. Indications of sampling error are given in the confidence intervals you often see reported in polling data. They measure the statistical spread of how the population would respond given the response of the sample you’ve selected.
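To put rough numbers on the sample size point, here is a minimal Python sketch of the usual textbook margin of error for a surveyed proportion. It is only an illustration under assumptions that are not in the post: a simple random sample, the normal approximation, a 95% confidence level (z = 1.96), and a worst-case proportion of 0.5. The function name `margin_of_error` is made up for the example.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Bigger samples give tighter intervals, but with diminishing returns.
for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: +/- {margin_of_error(n):.1%}")
```

Going from 100 to 10,000 respondents shrinks the margin from roughly 10 percentage points to roughly 1. Note that this only covers the random part of sampling error. It does nothing about a biased sample: polling a million people outside that Tea Party meeting still gives you the wrong answer, just very precisely.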
Sampling error, however, doesn’t tell you anything about the validity of your instrument. This validity comes from how well your instrument is crafted. In the TV statistic you mentioned, what does “watching TV” mean? Seems like an easy question but it’s not (trust me… one of the studies I’ve done looked at physical activity changes when TVs are disabled or removed from houses). So, does watching TV mean that there’s always a TV on in every room all the time? A surprising number of people do this. Or does it mean butt-in-chair, eyes-on-screen, and totally absorbed? If so, how do you report that? Do you ask the people? (Reporting errors when you do this are through the roof, both for questionnaires and for logs, so validity is low.) Do you record how much time the TV is actually on? (But this biases high because the TV may be on even though nobody’s watching.) Or do you put sensors on them to determine where they are and what they’re doing? (Which is what we do, but our sample size is small because the monitoring is intrusive, the sensors are expensive, and the amount of data generated is massive and difficult to handle, so while our results are valid, the small size we can study means that our sampling error is high, and we probably can’t extrapolate our data as well to the general population.)
Another problem comes from how the numbers are actually reported. So your example about the murder rate probably comes with all sorts of caveats. It’s what we have to do in the realm of epidemiology and diagnostic statistics. It’s a difficult field to understand, but really really important. (I’ve thought about writing my own blog to give people more of an understanding.) A sort of simple example: Let’s say you go to the doctor and get a test that’s 99% accurate (for sake of argument we’ll say it’s accurate on both ends… it has a 99% chance of detecting the disease if it’s there, and a 99% chance of not detecting the disease if it’s not… in reality, most tests have different percentages). So, you get the test results back positive. So there’s a 99% chance you have the disease, right? NOT SO FAST!!!
Let’s say that we test 10,000 people for this disease with this test. The disease affects 1 in 100 people (so not terribly rare). So out of those 10,000 people, 100 will have the disease, and 9,900 won’t. Our 99% accurate test shows that out of the 100 people with the disease, 99 will get back a true positive test, and 1 will get a false negative (they have it but the test says they don’t). Now, out of our 9,900 healthy folks, 9,801 of them will get the all clear, or a true negative. But 99 of them will get a false positive. The trouble is that there’s no answer in the back of the book to find who’s sick and who’s not. So 198 folks get a positive test result, but only half of them are actually sick, so your chance of having the disease with a positive test is 50%. (The chance of having the disease if you test negative is minuscule, about one hundredth of 1%, so a negative result would be said to rule out the disease.) This is why doctors can’t just order tests and magically know what’s going on. They have to consider disease prevalence, the consequences if they’re right, and the consequences if they’re wrong. All this may seem academic and pointless but it’s not. A positive result could lead to an invasive biopsy or surgery, and half of those would be unnecessary. That’s why the recent furor has erupted over breast and prostate cancer screening. Prevalence is fairly close to the same, test accuracy is lower, and results (particularly with prostate cancer, which tends to grow very slowly) are equivocal. This is why it’s important for the public to understand these processes.
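For anyone who wants to check that arithmetic or try other numbers, here is a small Python sketch of the same calculation. The function name `screening_counts` and the dictionary keys are made up for illustration; the inputs are the ones from the example above (10,000 people, 1 in 100 prevalence, 99% accuracy on both ends).

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Split a tested population into true/false positives and negatives.

    sensitivity: chance the test detects the disease when it is there.
    specificity: chance the test comes back clear when it is not.
    """
    sick = population * prevalence
    healthy = population - sick

    true_pos = sick * sensitivity       # sick, test says sick
    false_neg = sick - true_pos         # sick, test says healthy
    true_neg = healthy * specificity    # healthy, test says healthy
    false_pos = healthy - true_neg      # healthy, test says sick

    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        # Chance you are actually sick, given each test result.
        "p_sick_given_positive": true_pos / (true_pos + false_pos),
        "p_sick_given_negative": false_neg / (false_neg + true_neg),
    }


result = screening_counts(10_000, prevalence=0.01,
                          sensitivity=0.99, specificity=0.99)
print(f"Positive tests: {result['true_pos'] + result['false_pos']:.0f}")
print(f"Sick given a positive: {result['p_sick_given_positive']:.1%}")
print(f"Sick given a negative: {result['p_sick_given_negative']:.3%}")
```

Try dropping the prevalence to 1 in 1,000 with the same test and the chance of being sick after a positive result falls to about 9%, which is why a rare disease plus a good-but-not-perfect test produces mostly false alarms.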
Stepping off my soapbox now!