SAS code of the day (simple linear regression):

proc reg;

model x = y;

plot x*y;

run;

quit;

x = dependent variable

y = independent variable

So, let's say you have a large data set, you run a simple linear regression, and you get an r² = 0.03. Pretty crappy, right? But at least you have a scatterplot with 500+ data points showing that the independent variable cannot predict the dependent variable, or, at the very least, that other factors besides that one variable need to be examined in order to explain the dependent variable.

So, instead of examining other factors or, God forbid, just saying the results showed no relationship, you take a bunch of means. That leaves you with four data points. Four. Four out of 500. Then you run a simple linear regression on the FOUR data points. Four. And your r² is 0.98. Pretty good, right? Now you can report the highly significant predictive value of the independent variable, and your research has the appearance of being legitimate. Using only FOUR effing data points. The start of any good statistical procedure.
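If you want to watch the four-means trick happen, here is a rough sketch on simulated data. Everything in it is made up for illustration: the data set names (`raw`, `grpmeans`), the grouping variable `grp`, the slope, the noise level, and the seed are all my assumptions, not anything from the actual study. I'm also assuming the means were taken within four bins of the independent variable, which may not be how it was really done. The numbers are chosen so the true r² on the individual points is about 0.03.

```sas
/* Hypothetical simulation: all names and numbers here are illustrative. */
data raw;
  call streaminit(2008);                   /* fixed seed so the run is repeatable */
  do i = 1 to 500;
    y = rand('uniform') * 10;              /* independent variable, between 0 and 10 */
    x = 0.1 * y + rand('normal', 0, 1.6);  /* dependent variable: weak trend, lots of noise */
    grp = ceil(y / 2.5);                   /* assumed grouping: 4 bins of the independent variable */
    output;
  end;
  drop i;
run;

/* Regression on all 500 points: expect a low r-square. */
proc reg data=raw;
  model x = y;
run;
quit;

/* Collapse the 500 points to one mean per group (4 rows). */
proc means data=raw noprint nway;
  class grp;
  var x y;
  output out=grpmeans mean=x_mean y_mean;
run;

/* Regression on the 4 group means: expect a much higher r-square. */
proc reg data=grpmeans;
  model x_mean = y_mean;
run;
quit;
```

The reason the second r² balloons is that averaging roughly 125 points per group wipes out most of the individual-level noise while leaving the weak trend intact, so the means line up nicely even though the trend explains almost nothing about any single observation.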

In conclusion, science (in this case, statistics) is whatever we want it to be.

## Thursday, January 17, 2008


## 1 comment:

Interesting...I'm an applied math major and I always think of my philosophy teacher whenever people discuss stats. He used to say, "There are liars, then there are damned liars, and then there are statisticians". I know there is a lot of 'statistical analysis' that is seriously outside of the realm of real science. And I'd argue that's why people need to understand math.

Ok, I didn't expect to go there :)

I wanted to say that I totally didn't think of the possibility of another hole in the wall moment. It makes me a little scared! I don't want Andy to get sent away!!

BTW, are you taking statistics?
