Designing Your Procedures to Fit
Convenient Statistical Analysis

To polish off a great science project, you need statistics that give you confidence in your conclusions about your hypotheses. Inferential statistics give you a probability statement, such as p=.01, that tells you how likely it is that you would get results at least as extreme as yours if your null hypothesis (H0) were correct. In the case of p=.01, there is only 1 chance in 100 that chance alone would produce a result like yours. The complement of the probability statement is commonly reported as your confidence level. So if p=.01, you can reject H0 and support your H1 hypothesis (the classic hypothesis) at the 99% confidence level. It is reassuring when you are being interviewed by a judge to say, "There is only 1 chance in 100 that results like mine would happen by chance alone."

While working at Wichita Northeast Science and Technology Magnet, I learned in my first year that I could easily reach brain death by having over 80 students show up with their data after their experiments and ask, "Here's my data. How do I analyze this?" It took less than 1/100th of the effort the next year, when I taught them to design their experiments to fit a few terrific statistical tests. Let us be easy on ourselves and do the same! Here are your options:

1) Design your experiment to fit the TTEST: This is probably the most widely accepted experimental design. It fits the classic CONTROL GROUP and EXPERIMENTAL GROUP setup. You design your experiment to have a minimum of 5 in each group (N ≥ 5). You treat the experimental group with the independent variable. You treat the control group exactly the same as the experimental group except for the one independent variable you have decided on. You then measure the dependent variable for each group. You figure the average of each group's data set. You then figure the difference between the averages. Last of all, you run a TTEST to decide whether the difference between the averages is statistically significant (alpha: p ≤ .05). Scientists are willing to say they have measured a significant difference between the groups if there are 5 chances or fewer in 100 of getting a difference that large when the null hypothesis (H0) is correct.

This method was used for several of our labs earlier in the year. See an example: paired ttest for photosynthesis bubble data using Excel spreadsheet software. If your data set is not paired, the only thing you do differently is type "3" instead of "1" for the "type" of ttest. In Excel's TTEST, type 1 is a paired test and type 3 is a two-sample test that does not assume the two groups have equal variances.

An example data setup for the groups doing reaction time testing to fit a PAIRED TTEST:

Reaction Time Experiment

             control            experimental
             react time (sec)   react time (sec)
Person       without caffeine   with caffeine
1            0.21               0.19
2            0.18               0.18
3            0.23               0.20
4            0.25               0.23
5            0.27               0.24
6            0.19               0.20
7            0.18               0.18
8            0.24               0.17
9            0.27               0.17
10           0.21               0.16
average=     0.223              0.192
difference=  0.031 sec
ttest p=     0.02

Each person is in both the experimental and the control group. This is what qualifies this setup for the paired TTEST.
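The paired TTEST on the reaction time table can also be checked outside of a spreadsheet. What follows is only a minimal sketch in Python using the standard library, not the Excel procedure itself. The helper name t_two_tailed_p and the variable names are made up for this illustration, and the p value is approximated by numerically integrating the t distribution, assuming a two-tailed test.

```python
import math
from statistics import mean, stdev

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p for a t statistic (hypothetical helper, not an Excel function).

    Integrates Student's t density from 0 to |t| with Simpson's rule,
    doubles it to get the central area, and returns what is left in the tails.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - 2.0 * (h / 3.0) * total

# Reaction times (sec) from the table: same 10 people, without and with caffeine.
without_caffeine = [0.21, 0.18, 0.23, 0.25, 0.27, 0.19, 0.18, 0.24, 0.27, 0.21]
with_caffeine = [0.19, 0.18, 0.20, 0.23, 0.24, 0.20, 0.18, 0.17, 0.17, 0.16]

# A paired test works on each person's own difference.
diffs = [c - e for c, e in zip(without_caffeine, with_caffeine)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
p_value = t_two_tailed_p(t_stat, df=n - 1)

print(f"mean difference = {mean(diffs):.3f} sec, t = {t_stat:.2f}, p = {p_value:.2f}")
```

The p value should come out close to the 0.02 shown in the table, which is below the .05 cutoff.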

An example data setup for groups that might do a NON-PAIRED TTEST:

Mung Bean Grown Indoors and Outdoors

             indoor        outdoor
             controls      experimentals
trial        Height (cm)   Height (cm)
1            10            10
2            12            5
3            10            6
4            15            9
5            7             6
6            7             12
7            8             18
8            15            12
9            15            10
10           12            10
average=     11.1          9.8
difference=  1.3 cm
ttest p=     0.4194679

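The two plants in trial 1 (or in any other trial) are not the same plant; therefore the test is for nonpaired data. With p=0.42, far above .05, the 1.3 cm difference is not statistically significant.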
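The nonpaired TTEST on the mung bean table can be sketched the same way. This is a hedged illustration in standard-library Python, assuming the two-tailed, unequal-variance (Welch) form of the test that corresponds to typing "3" for the ttest type. The helper t_two_tailed_p and all variable names are invented for this example.

```python
import math
from statistics import mean, variance

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p for a t statistic (hypothetical helper, not an Excel function).

    Uses Simpson's rule on Student's t density; df may be fractional.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - 2.0 * (h / 3.0) * total

# Plant heights (cm) from the table: different plants in each group.
indoor = [10, 12, 10, 15, 7, 7, 8, 15, 15, 12]
outdoor = [10, 5, 6, 9, 6, 12, 18, 12, 10, 10]

# Each group's variance divided by its size.
va = variance(indoor) / len(indoor)
vb = variance(outdoor) / len(outdoor)

# Welch t statistic and its approximate degrees of freedom.
t_stat = (mean(indoor) - mean(outdoor)) / math.sqrt(va + vb)
df = (va + vb) ** 2 / (va ** 2 / (len(indoor) - 1) + vb ** 2 / (len(outdoor) - 1))
p_value = t_two_tailed_p(t_stat, df)

print(f"difference = {mean(indoor) - mean(outdoor):.1f} cm, t = {t_stat:.2f}, p = {p_value:.2f}")
```

The p value should land near the 0.42 in the table.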

2) Design your experiment to fit the CHI SQUARE analysis: This is a setup for observing choice behaviors and determining how the observed data frequencies compare to the expected data frequencies. Expected frequencies are calculated to fit the null hypothesis that there is no difference between the data sets and your independent variable is having no effect. If your null hypothesis were correct, then the observed and expected frequencies would differ only by chance. If you measured a definite effect and your observed and expected frequencies are definitely different, then you will get a very low probability of seeing a difference that large when your H0 is correct (p ≤ .05). This then allows you to be very confident that you measured an effect from the independent variable (at or above the 95% confidence level).
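For reference, the number a chi square test computes from the observed and expected frequencies can be written as a short formula. The symbols here are notation introduced for this note: O_i is the observed count in category i, E_i is the expected count, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

The larger this number, the lower the p value. For a single list of k choices the test uses k - 1 degrees of freedom; for a table with r rows and c columns it uses (r - 1)(c - 1).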

The following data table is for a chi square test (CHITEST) showing squirrels' preferences for types of food according to which dish they visited. The expected value is figured by taking the 227 total visits and dividing by the number of choices, 5 (227/5 = 45.4). This tells you what you would expect if squirrels do not prefer some food types over others. Again, the expected value is what you would see if the null hypothesis were correct.

Squirrels' Choices of Foods

                   # times squirrels visited
food choices       observed   expected
walnuts            47         45.4
peanuts            30         45.4
almonds            58         45.4
hickory nuts       40         45.4
sunflower seeds    52         45.4
total=             227
chitest p=         0.03
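The squirrel table can be worked through step by step as a check on the CHITEST result. The sketch below uses only Python's standard library. The function chi2_sf is a hypothetical helper written for this example (it sums a standard series for the chi square distribution), and it is not how Excel computes CHITEST internally.

```python
import math

def chi2_sf(x, df):
    """Probability of a chi square value of at least x (hypothetical helper).

    Sums the series for the regularized lower incomplete gamma function
    and returns 1 minus that sum.
    """
    a, z = df / 2, x / 2
    if z <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return 1.0 - total * math.exp(-z + a * math.log(z) - math.lgamma(a))

# Observed visits from the table.
observed = {
    "walnuts": 47,
    "peanuts": 30,
    "almonds": 58,
    "hickory nuts": 40,
    "sunflower seeds": 52,
}

# Under the null hypothesis every food gets an equal share of the visits.
total_visits = sum(observed.values())
expected_each = total_visits / len(observed)

# Add up (observed - expected)^2 / expected over all five foods.
chi_sq = sum((o - expected_each) ** 2 / expected_each for o in observed.values())
df = len(observed) - 1
p_value = chi2_sf(chi_sq, df)

print(f"expected = {expected_each:.1f}, chi square = {chi_sq:.2f}, p = {p_value:.2f}")
```

The p value should round to the 0.03 in the table, so the squirrels' visits are unlikely to be an even split among the five foods.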

The following data table is for another application of the chi square test (CHITEST), showing gender's effects on stop sign behaviors:

Gender's Effects on Stop Sign Behaviors

observed stop sign behavior frequencies
gender     complete stop   roll     run     TOTAL
male       51              29       7       87
female     83              20       3       106
TOTAL=     134             49       10      193

expected stop sign behavior frequencies
gender     complete stop   roll     run
male       60.40           22.09    4.51
female     73.60           26.91    5.49

chitest p=   0.01

To figure the expected frequencies for males: complete stop 87/193*134=60.40; roll 87/193*49=22.09; run 87/193*10=4.51.

To figure the expected frequencies for females: complete stop 106/193*134=73.60; roll 106/193*49=26.91; run 106/193*10=5.49. These are the frequencies you would expect if gender made no difference in stop sign behavior.
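The expected frequencies and the chi square p value for the stop sign table can be reproduced with a short script. This is a sketch under stated assumptions, written in standard-library Python. The helper chi2_sf and the variable names are invented for this example, and the degrees of freedom are taken as (rows - 1) x (columns - 1).

```python
import math

def chi2_sf(x, df):
    """Probability of a chi square value of at least x (hypothetical helper).

    Sums the series for the regularized lower incomplete gamma function
    and returns 1 minus that sum.
    """
    a, z = df / 2, x / 2
    if z <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return 1.0 - total * math.exp(-z + a * math.log(z) - math.lgamma(a))

# Observed counts from the table: [complete stop, roll, run].
observed = {"male": [51, 29, 7], "female": [83, 20, 3]}

row_totals = {g: sum(counts) for g, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count = row total / grand total * column total.
expected = {
    g: [row_totals[g] / grand_total * c for c in col_totals]
    for g in observed
}

chi_sq = sum(
    (o - e) ** 2 / e
    for g in observed
    for o, e in zip(observed[g], expected[g])
)
df = (len(observed) - 1) * (len(col_totals) - 1)
p_value = chi2_sf(chi_sq, df)

for g in observed:
    print(g, [round(e, 2) for e in expected[g]])
print(f"chi square = {chi_sq:.2f}, p = {p_value:.2f}")
```

The expected values should match the second table above, and the p value should round to 0.01.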