# T-tests

### INTRODUCTION

T-tests are normally used to compare the means of two samples of numeric data to determine whether they are significantly different from one another. There is also a one-sample t-test, which has a related but slightly different purpose (see below). The data can be either continuously distributed or discrete, as long as they have a normal distribution.

• Continuously distributed numeric variables are ones that, in principle, can take an infinite number of values if measured precisely enough - for example: body mass, height, nitrogen concentration in a water sample, or cholesterol level in the bloodstream.
• Discrete numeric variables are ones that can take only a certain set of values - for example: the number of leaves on a tree; the number of bacterial colonies on a petri dish. Both these variables can take only integer values, although the number of possible values is very large.

To learn how to determine whether your data have a normal distribution, see Foundational Material. If your data are not normally distributed, alternative tests are available: see Nonparametric Statistics.

If you have three or more samples of data, rather than just one or two, see One-Factor Analysis of Variance.

For testing hypotheses about means when your data have a normal distribution there are three types of t-test:

• A one-sample t-test is used to compare the mean of a single sample with a value that is expected based on some prior knowledge. For example, the long-term average high temperature in the Twin Cities on 1st April is 50°F. If we had data on the high temperature for each 1st April from 1998-2017 (n = 20), and if these data had a normal distribution, we could perform a one-sample t-test to determine whether the average high for the past 20 years was significantly different from the long-term average of 50°F.
• A two-sample t-test is used to compare two sample means to determine if they are significantly different from each other. For example, if we had data on the Twin Cities 1st April high temperatures for the years 1978-1997 and 1998-2017, we could use a two-sample t-test to determine whether the average high for the most recent 20-year period was significantly different from the average high for the previous 20-year period.
• A paired t-test is used to compare two sample means if each value within one of the samples can be sensibly paired with an equivalent value in the other sample. For example, if we had data on the 1st April high temperatures in Duluth for 1998-2017, we could do a paired t-test to determine whether the average Duluth temperature during this period was significantly different from the average Twin Cities temperature during the same period. In this example, the two high temperatures for 1998 (Duluth and Twin Cities) can be sensibly paired with one another, as can every other pair of temperatures taken on 1st April of the same year. Contrast this with the two-sample test above, where there is no sensible justification for pairing 1978 in the first sample with 1998 (or any other year) in the second sample.

All other things being equal, paired tests are more powerful than unpaired tests because they control for other variables that might affect your data. For example, 2010 was a particularly warm spring in Minnesota, and April 1st highs were well above average in both the Twin Cities and Duluth. A paired t-test would take this year-to-year correlated variation into account, but a two-sample t-test would not.
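The effect of pairing can be illustrated with a short Python sketch, separate from the JMP workflow described on this page. The five pairs of temperatures are made up purely for illustration (they are not real weather records), and the function names `welch_t` and `paired_t` are hypothetical. The sketch computes the t-value both ways for the same numbers.

```python
import math
import statistics

# Made-up 1st April highs (°F) for five years; NOT real weather data.
twin_cities = [48.0, 55.0, 41.0, 62.0, 50.0]
duluth = [43.0, 51.0, 37.0, 56.0, 46.0]


def welch_t(a, b):
    """t-value for a two-sample test that does not assume equal variances."""
    se_diff = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def paired_t(a, b):
    """t-value for a paired test: a one-sample test on the pairwise differences."""
    diffs = [x - y for x, y in zip(a, b)]
    se_mean_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se_mean_diff


unpaired_result = welch_t(twin_cities, duluth)
paired_result = paired_t(twin_cities, duluth)
print(f"two-sample t = {unpaired_result:.2f}, paired t = {paired_result:.2f}")
```

In these made-up numbers the year-to-year swings are large but the gap between the two cities is nearly constant, so the paired calculation gives a much larger t-value than the two-sample calculation for the same mean difference.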

### PERFORMING A ONE-SAMPLE T-TEST USING JMP (MAC AND PC)

Below is a link to an Excel spreadsheet of data on the foraging behavior of walleye in Lake Carlos, Minnesota, which has been invaded by non-native zebra mussels. A sample of fourteen fish was taken from the lake and the proportion of the energy in their diet that came from near-shore vs. open-water feeding was estimated.

Walleye energy.xlsx (Excel file)

Based on their knowledge of walleye ecology, the biologists who took these data had reason to expect that, in the absence of invasive zebra mussels, half of the energy in walleye diets would come from near-shore feeding and half from open-water feeding. The data from Lake Carlos were taken to determine whether the presence of zebra mussels changed the normal feeding behavior of walleyes. A one-sample t-test is appropriate here because the comparison is between an expected mean and the mean of a single sample of data.

The null hypothesis for the test is that the proportion of energy obtained from open-water feeding is equal to 0.5.

The analysis can be performed in JMP as follows:

• After importing the spreadsheet into JMP, first notice that the two columns of diet proportions are interpreted as continuous variables, as indicated by blue triangles in the column list on the left.
• From the Analyze menu, select Distribution, and in the window that appears drag or click the Proportion open water energy variable into the Y, Columns box, then hit OK. (We could also analyze the Proportion near-shore energy variable in exactly the same way if we wanted.)
• A new window showing the data in histogram form will appear. If you want, you can turn this histogram sideways by clicking on the red arrow beside Distributions and selecting Stack.
• As well as the histogram, the window also provides summary statistics on the distribution of the data in the Quantiles and Summary Statistics tables.
• The data do not look much like a normal distribution, although this is often the case when sample sizes are small. Statistical analysis (not shown here) indicates that the deviation from normality is not significant, although given the small sample size this conclusion is possibly not reliable. We will proceed with the t-test, but will acknowledge that the analysis might have to be interpreted with caution, especially if the difference between observation and expectation is small. (To learn how to carry out formal statistical tests for normality, see Foundational Material.)
• Click on the red arrow next to Proportion open-water energy and select Test Mean. A new window will appear where you can enter your expected mean value in the Specify Hypothesized Mean box. In this case the expectation is 0.5 (half the diet is expected to come from open-water feeding). This expectation constitutes the null hypothesis for your statistical test. Hit OK to display the test output.
• The output from the t-test is provided in the Test Mean table on the right. At the top, the expected mean (0.5) and the actual mean for the data (0.242) are provided, along with the degrees of freedom (DF) and the standard deviation for the data (0.064). For a one-sample t-test the degrees of freedom is equal to the total sample size minus one (14 – 1 = 13). Below this information is the statistical analysis. The t-value is -15.035 (the negative sign indicating that the observed mean is smaller than the expected value) and below that are three p-values.
• Prob > |t| is for a two-tailed test, which assesses whether the observed mean is either significantly greater than or significantly less than the expected value of 0.5. Under most circumstances, you should report this p-value because you normally have no a priori reason to exclude one of these possibilities, so both must be considered.
• Prob > t is for one of the two possible one-tailed tests, in this case assessing only whether the observed mean is significantly greater than the expected value. You would rarely be justified in reporting this type of test.
• Prob < t is for the other possible one-tailed test, in this case assessing only whether the observed mean is significantly less than the expected value. Again, you would rarely be justified in reporting this type of test.
• The p-value for the two-tailed test is well below the normal threshold for statistical significance (0.05). Thus, we can reject the null hypothesis and accept the alternative hypothesis that the true mean of the population is different from 0.5. The deviation from expectation is far greater than can be explained by chance (sampling variation) alone.
• Returning to our earlier caveats about the data, we can see that the difference between observation and expectation was very large (0.242 vs. 0.5) and that the p-value was very small. It therefore seems reasonable to accept the validity of our conclusions, despite the small sample size and the fact that the distribution may not be perfectly normal.
• When reporting the results of a one-sample t-test in a paper you would normally provide the expected value of the mean (your null hypothesis, H0), your observed mean and some indication of the variability in the data (typically the standard deviation of the data or the standard error of the mean), the degrees of freedom, the t-value, and the p-value. In this case, you would report:

H0 = 0.5, observed mean ± 1 S.E. = 0.242 ± 0.017, d.f. = 13, t = -15.035, p < 0.0001
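As a rough cross-check on the numbers above, the following Python sketch recomputes the standard error, t-value, and the three p-values from the rounded summary statistics in the Test Mean table. It is not how JMP performs the calculation: the p-values here come from a simple numerical integration of the t distribution (Simpson's rule), chosen only to keep the example free of external libraries, and the function name `t_cdf` is hypothetical. Because the inputs are rounded, the t-value comes out near, but not exactly equal to, the -15.035 that JMP reports from the full data.

```python
import math

# Rounded summary statistics taken from the JMP output described above.
expected_mean = 0.5    # null hypothesis
observed_mean = 0.242
std_dev = 0.064
n = 14

df = n - 1                                  # degrees of freedom
se = std_dev / math.sqrt(n)                 # standard error of the mean
t_value = (observed_mean - expected_mean) / se


def t_cdf(t, df, steps=20000):
    """P(T < t) for Student's t, by Simpson's rule on [0, |t|] (steps must be even)."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


prob_less = max(0.0, t_cdf(t_value, df))    # Prob < t (one-tailed)
prob_greater = 1.0 - prob_less              # Prob > t (one-tailed)
two_tailed = 2 * min(prob_less, prob_greater)  # Prob > |t|

print(f"SE = {se:.3f}, df = {df}, t = {t_value:.2f}")
print(f"Prob > |t| = {two_tailed:.2g}, Prob > t = {prob_greater:.4f}, Prob < t = {prob_less:.2g}")
```

The sketch also shows how the three p-values relate to one another: the two one-tailed values sum to one, and the two-tailed value is twice the smaller of them.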

### PERFORMING A TWO-SAMPLE T-TEST USING JMP (MAC AND PC)

Below is a link to an Excel spreadsheet containing data on the sizes of oak seedlings of two possible ecotypes (populations adapted to the local ecological conditions where they evolved). One population (Local) originally came from a site near the Twin Cities, the other (Southern) from a site in south-central Wisconsin. In 2012, when they were one year old and 20-30 cm tall, samples of both were planted in the same habitat restoration site just to the east of the Twin Cities. The scientists studying the seedlings wanted to know whether the two ecotypes were growing at different rates, so measured them again in 2015, after three years of growth. These data can be analyzed with a two-sample t-test because they are normally distributed and are not paired (i.e., there is no sensible way of pairing any individual Local seedling with a particular Southern seedling).

Fish Creek Seedlings 2015.xlsx (Excel file)

The null hypothesis for this test is that the Local and Southern ecotypes have the same mean height after three years of growth.

The analysis can be performed in JMP as follows:

• Import the data file into JMP and check that the Ecotype variable is categorical (indicated by a red histogram) and the Max Height variable is continuous (indicated by a blue triangle) in the column list on the left.
• From the Analyze menu select Fit Y by X and click or drag Ecotype into the X, Factor box and Max Height into the Y, Response box.
• Notice the image in the bottom left of this window. The Fit Y by X option can perform four basic types of analysis, depending on whether your X and Y variables are continuous (as shown by a blue triangle) or categorical (as shown by the red and green histograms). In this case, the X variable (Ecotype) is categorical and the Y variable (Max Height) is continuous, so Fit Y by X automatically carries out a Oneway analysis, which compares the means of two or more samples when there is only one X-variable.
• Hit OK. A new window will appear with all the data points in two vertical columns, one for each ecotype. Before doing the t-test, there are various modifications you can make to the data display.
• Click on the red arrow next to Oneway Analysis… and go to Display options. Unselect Points and select Mean Error Bars and Std Dev Lines. The former shows the mean ± 1 standard error for each sample; the latter shows ± 1 standard deviation for each sample. Also select Histograms to display the full distribution of the data in each sample. This allows a simple visual inspection of the data. Here, both samples show a symmetrical bell-shaped curve which appears to be approximately normal, so we will proceed with the t-test. (To learn how to carry out formal statistical tests for normality, see Foundational Material.)
• Click again on the red arrow and select both the Means and Std Dev option and the t-test option. Two new tables will be added to the analysis window.
• The Means and Std Deviations table provides descriptive statistics: sample sizes, means, standard deviations, standard errors of the means, and 95% confidence limits.
• The t Test table shows the results of the statistical analysis. On the left from top to bottom is the difference between the two means, the standard error of this difference and the 95% confidence limits of the difference. You can see that these limits do not enclose a difference of zero, which itself is an indicator that the samples are significantly different from one another.
• On the right is the t-value (-4.012) with the negative number indicating that the second mean is smaller than the first, then the degrees of freedom (DF). The degrees of freedom is an indicator of the amount of data you have (see Foundational Material for details), and normally you would expect it to be a whole number. In this case, however, we have carried out a t-test that assumes the variances in the two samples are unequal. This is the safest assumption to make because they often are. When variances are unequal, the formula for calculating the degrees of freedom is somewhat complicated and typically does not result in a whole number value. (When variances are equal, the degrees of freedom for a two-sample t-test is equal to the total sample size minus two.)
• Below the DF are three p-values:
• Prob > |t| is for a two-tailed test, which assesses whether the second mean is either significantly greater than or significantly less than the first mean. Under most circumstances, you should report this p-value because you normally have no a priori reason to exclude one of these possibilities, so both must be considered.
• Prob > t is for one of the two possible one-tailed tests, in this case assessing only whether the second mean is significantly greater than the first mean. You would rarely be justified in reporting this type of test.
• Prob < t is for the other possible one-tailed test, in this case assessing only whether the second mean is significantly less than the first mean. Again, you would rarely be justified in reporting this type of test.
• The p-value for the two-tailed test is well below the normal threshold for statistical significance (0.05). Thus, we can reject the null hypothesis and accept the alternative hypothesis that Local and Southern ecotypes do not have the same mean height after three years of growth in this habitat. The difference in means is far greater than can be explained by chance (sampling variation) alone.
• When reporting the results of a two-sample t-test in a paper you would normally provide the sample sizes, means, some indication of the variability in the data (typically the standard deviation of the data or the standard error of the mean), along with the t-value and p-value for the test, in the following way: Local: n = 93, mean ± 1 SE = 61.24 ± 1.70; Southern: n = 245, mean ± 1 SE = 53.32 ± 1.01; t = -4.01, p < 0.0001. Depending on the particulars of the paper, this information might be provided in the text of the Results section or (if multiple tests need to be reported) in a table.
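The non-integer degrees of freedom mentioned above can be illustrated with a short Python sketch that works only from the rounded summary statistics reported for the two ecotypes. It assumes the unequal-variance degrees of freedom are obtained with the Welch-Satterthwaite approximation, which is the standard formula for this kind of test; the variable names are hypothetical, and small differences from JMP's output are expected because the inputs are rounded.

```python
import math

# Rounded summary statistics reported above (mean ± 1 SE, sample size).
n_local, mean_local, se_local = 93, 61.24, 1.70
n_south, mean_south, se_south = 245, 53.32, 1.01

# Squared standard errors, i.e. each sample variance divided by its n.
v_local = se_local ** 2
v_south = se_south ** 2

# Standard error of the difference and the t-value (second mean minus first).
se_diff = math.sqrt(v_local + v_south)
t_value = (mean_south - mean_local) / se_diff

# Welch-Satterthwaite approximation: usually not a whole number.
df_unequal = (v_local + v_south) ** 2 / (
    v_local ** 2 / (n_local - 1) + v_south ** 2 / (n_south - 1)
)

# For comparison: degrees of freedom if the variances were assumed equal.
df_equal = n_local + n_south - 2

print(f"t = {t_value:.2f}")
print(f"df (unequal variances) = {df_unequal:.1f}, df (equal variances) = {df_equal}")
```

The unequal-variance degrees of freedom fall well below the equal-variance value of 336, which makes the unequal-variance test the more conservative of the two.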

### PERFORMING A PAIRED T-TEST USING JMP (MAC AND PC)

Below is a link to an Excel spreadsheet containing data on the root biomass in samples of soil collected at two different depths (0-7 cm and 7-14 cm) from 38 different locations in an old pasture field. The ecologists who collected these data were interested in comparing the amount of root material at different depths, and since the data are clearly paired (the 0-7 cm depth sample from each location can be paired with the 7-14 cm sample from the same location), a paired t-test is appropriate for this analysis.

Fish Creek roots 2014.xlsx (Excel file)

Note that the data are arranged differently from the two-sample (unpaired) t-test described above. There, the data for the two samples of the dependent variable (seedling height) were all in one column and the two categories of the independent variable (Local vs. Southern) were in a second column. Here the data for the two samples (0-7 cm and 7-14 cm depths) are in different columns. The data must be arranged this way to allow JMP to perform a paired t-test.
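If your own paired data start out in the one-column ("long") arrangement, they need to be rearranged into the two-column ("wide") arrangement before analysis. The Python sketch below shows one way this could be done outside JMP. The field names (`location`, `depth`, `biomass`) and the values are hypothetical and are not taken from the spreadsheet above.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per location-and-depth combination.
long_rows = [
    {"location": 1, "depth": "0-7 cm", "biomass": 5.1},
    {"location": 1, "depth": "7-14 cm", "biomass": 1.8},
    {"location": 2, "depth": "0-7 cm", "biomass": 4.6},
    {"location": 2, "depth": "7-14 cm", "biomass": 1.5},
    {"location": 3, "depth": "0-7 cm", "biomass": 5.4},
    {"location": 3, "depth": "7-14 cm", "biomass": 1.9},
]

# Group the measurements by location so each location becomes a single row.
by_location = defaultdict(dict)
for row in long_rows:
    by_location[row["location"]][row["depth"]] = row["biomass"]

# Wide format: one row per location, one column per depth.
wide_rows = [
    {"location": loc, "0-7 cm": depths["0-7 cm"], "7-14 cm": depths["7-14 cm"]}
    for loc, depths in sorted(by_location.items())
]

for row in wide_rows:
    print(row)
```

Each row of the wide table now holds one matched pair, which is the layout that a paired analysis requires.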

The null hypothesis for this test is that the mean root biomass at a soil depth of 0-7 cm is the same as the mean root biomass at a soil depth of 7-14 cm.

The analysis can be performed in JMP as follows:

• Import the data into JMP and check that both columns of root biomass data have been coded as continuous variables, as indicated by the blue triangles in the column list to the left.
• From the Analyze menu, select Specialized Modeling, then Matched Pairs. In the window that appears, click or drag both Root Biomass variables into the Y, Paired Response box, then hit OK.
• In the new window that appears, click on the red arrow beside Matched Pairs and select Plot Dif by Row. The window will now show two plots above a table of results.
• The first image is a little confusing to interpret but plots the difference in root biomass between the two depths at each location against the mean of these two values. The second image is somewhat more intuitive, showing the difference between the two values plotted against the row number of the data points in the JMP spreadsheet. In both cases, the horizontal grey line shows where the data would fall if the root biomass was the same at the two different depths at each location. The horizontal red lines show the mean difference between the two samples (solid line) and the 95% confidence limits above and below this mean (dashed lines).
• The left-hand side of the table below the images shows the mean values for the two samples, the difference between these means, the standard error of this mean difference, the 95% confidence limits above and below the mean difference, the sample size (number of sampling locations) and the correlation coefficient between the two variables: biomass at 0-7 cm vs. biomass at 7-14 cm. (See Correlation and Linear Regression for an explanation of what a correlation coefficient indicates.)
• The right-hand side of the table shows the t-value (t-Ratio) = -17.06, with the negative value indicating that the mean at 7-14 cm is lower than the mean at 0-7 cm, the degrees of freedom (DF) and three p-values. For a paired t-test the degrees of freedom is equal to the number of pairs of data minus one (38 – 1 = 37).
• The p-values can be interpreted as follows:
• Prob > |t| is for a two-tailed test, which assesses whether the second mean is either significantly greater than or significantly less than the first mean. Under most circumstances, you should report this p-value because you normally have no a priori reason to exclude one of these possibilities, so both must be considered.
• Prob > t is for one of the two possible one-tailed tests, in this case assessing only whether the second mean is significantly greater than the first mean. You would rarely be justified in reporting this type of test.
• Prob < t is for the other possible one-tailed test, in this case assessing only whether the second mean is significantly less than the first mean. Again, you would rarely be justified in reporting this type of test.
• The p-value for the two-tailed test is well below the normal threshold for statistical significance (0.05). Thus, we can reject the null hypothesis and accept the alternative hypothesis that the mean root biomass at 0-7 cm is different from the mean root biomass at 7-14 cm. The difference in means is far greater than can be explained by chance (sampling variation) alone.
• When reporting the results of a paired t-test in a paper you would normally provide the sample sizes, means, some indication of the variability in the data (typically the standard deviation of the data or the standard error of the mean), along with the t-value and p-value for the test. Unfortunately, JMP does not provide standard deviations or standard errors for the two means in the Matched Pairs output, so you need to get this information in another way.
• Go back to the JMP data spreadsheet, and from the Analyze menu select Distribution. Click or drag the two root biomass variables into the Y, Columns box and hit OK. The window that appears will show histograms of the two distributions and two tables of descriptive statistics. From the Summary Statistics table at the bottom you can find the means, standard deviations, and standard errors.
• The analysis can then be summarized in the following way: 0-7 cm mean ± 1 SE = 5.056 ± 0.204; 7-14 cm mean ± 1 SE = 1.703 ± 0.097; n = 38, t = -17.06, p < 0.0001. Depending on the particulars of the paper, this information might be provided in the text of the Results section or (if multiple tests need to be reported) in a table.
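The reported values can also be used for a quick consistency check. The Python sketch below works only from the rounded numbers in the summary above. It recovers the standard error of the mean difference implied by the paired t-value and compares it with the standard error that would apply if the two depths were treated as independent samples. The variable names are hypothetical and the results are approximate because the inputs are rounded.

```python
import math

# Rounded values from the summary above.
n_pairs = 38
mean_shallow, se_shallow = 5.056, 0.204   # 0-7 cm
mean_deep, se_deep = 1.703, 0.097         # 7-14 cm
t_paired = -17.06

df = n_pairs - 1                          # degrees of freedom for a paired test
mean_diff = mean_deep - mean_shallow      # second mean minus first mean

# Standard error of the mean difference implied by the paired t-value.
se_diff_paired = mean_diff / t_paired

# Standard error of the difference if the samples were treated as unpaired.
se_diff_unpaired = math.sqrt(se_shallow ** 2 + se_deep ** 2)

print(f"df = {df}, mean difference = {mean_diff:.3f}")
print(f"SE of difference: paired = {se_diff_paired:.3f}, unpaired = {se_diff_unpaired:.3f}")
```

With these numbers the paired standard error is the smaller of the two, which is consistent with the earlier point that pairing accounts for variation shared by both members of a pair.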