Are we Academically Adrift? Evaluating results of the CLA assessment at St. Thomas
The Council for Aid to Education (CAE) developed an instrument in the early 2000s to measure student learning over a defined period of academic study. The instrument permits a judgment about how well students, on average, actually performed compared to a prediction of performance derived from each student's standardized test scores (ACT and SAT), and it also permits comparisons of actual versus predicted performance across the institutions that administer it. Such a measure makes it possible for institutions to judge how well they helped their students increase their learning while still recognizing that students at different institutions vary in their initial level of academic ability, as measured by the standardized test scores used in the admission process.
They called the instrument the Collegiate Learning Assessment (CLA). The CLA consists of a Performance Task and a two-part Analytic Writing Task, and students take the assessment online. The Performance Task poses a particular problem for students to solve: students respond to a question presented on screen, drawing on a set of relevant documents provided to them in developing a solution. The Analytic Writing Task requires students to make an argument in response to a particular prompt and to critique an argument by evaluating the reasoning in an argumentative document presented to them.
In the summer of 2007, UST’s chief academic officer Tom Rochon decided that St. Thomas should participate in the CLA and asked Michael Jordan to take charge of the effort. In the fall of 2007, Jordan recruited 300 first-time first-year students to gather in computer-equipped classrooms on campus for up to three hours to complete the assessment. In the spring of 2008, Jordan recruited a stratified sample of 100 graduating seniors to take the assessment so that we could undertake a cross-sectional analysis comparing the performance of the first-year students to that of the graduating seniors. Then in the spring of 2009, Jordan continued with what CLA calls a longitudinal analysis by recruiting 100 students from among the original 300 to take the assessment a second time during the spring semester of their sophomore year. Finally, another set of 100 students drawn from the original 300 was recruited to complete the assessment during the spring semester of their senior year in 2011.
In general, the results for the cross-sectional analysis completed in the spring of 2008 were excellent. The results of the longitudinal analysis completed in the spring of 2011 are mixed. The Core Curriculum Committee is in the process of considering the significance of the detailed results for purposes of assessing and strengthening the core curriculum, and is focusing especially on the need to strengthen the ability of our students to critique an argument. You will hear more from the Core Curriculum Committee in the future about this effort.
The University of Chicago Press in 2011 published a book titled Academically Adrift by Richard Arum and Josipa Roksa that uses data from the CLA (though their data do not include UST’s students) to construct an argument about the state of learning in American colleges and universities today.
Response to conclusions reached in Academically Adrift.
Although the actual assessment of learning turns out to be a rather small portion of the text, the authors frame their research question, and subsequent conclusions, around the concept of critical thinking. More specifically, the authors ask, “But what if increased educational attainment is not equivalent to enhanced individual capacity for critical thinking and complex reasoning?” (p. 2). In response to this question, the authors of Academically Adrift reach a negative conclusion about student learning based on this study: “They might graduate, but they are failing to develop the higher-order cognitive skills that are widely assumed college students should master” (p. 121). To reach this conclusion, the authors relied on the Performance Task component of the CLA (described above).
Does Academically Adrift provide information that would be helpful in the evaluation of the results of the CLA assessment conducted at St. Thomas? For the sake of brevity, we will focus on two key areas in order to provide the reader with a better understanding of the CLA and the implications of this study regarding UST undergraduate education. These areas include the study sampling design and the longitudinal nature of the measure used by the authors to reach their conclusion.
As mentioned earlier, the evidence the authors use to identify whether significant learning is occurring comes from analyzing one of the three sections of the CLA (the Performance Task). Additionally, the CLA test scores were collected through a convenience sample of 24 four-year institutions, with 2,322 students completing all three phases of the assessment. Of course, there are more than 4,000 colleges and universities in the US, which leads one to question whether the results are generalizable to US higher education as a whole. Further, one may posit that the 24 institutions are not similar to UST, which suggests the results should not be applied to UST without further analysis. The authors acknowledge this sampling methodology as a limitation of the study and indicate that the approach is not a true experimental design.
From this one section of the CLA and this sample, the authors chose to measure the learning of students from the beginning of their first year to the second semester of their sophomore year. The authors and CLA researchers report this learning using a statistical measure called effect size. It is computed by subtracting the average (mean) score of students during their first year from the average (mean) score of these same students during the second semester of their sophomore year, then dividing the result by the standard deviation of the first-year distribution. According to the authors, their analysis revealed an effect size of .18. They go on to argue that an effect size of .50 to 1.00 would have indicated significant learning.
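For readers who want to see the arithmetic, the effect-size computation described above can be sketched in a few lines of Python. The scores below are hypothetical illustrations chosen only to show the calculation, not actual CLA data.

```python
# Effect size as described in the text: (sophomore mean - first-year mean)
# divided by the standard deviation of the first-year distribution.
from statistics import mean, stdev

def effect_size(baseline_scores, followup_scores):
    """Standardized mean difference: follow-up mean minus baseline mean,
    scaled by the baseline standard deviation."""
    return (mean(followup_scores) - mean(baseline_scores)) / stdev(baseline_scores)

# Hypothetical scores for the same cohort at two points in time
first_year = [1000, 1050, 1100, 1150, 1200]
sophomore = [1010, 1065, 1110, 1160, 1215]

print(round(effect_size(first_year, sophomore), 2))  # prints 0.15
```

A small standardized difference like this one illustrates the book's point: even when every student's score rises, the gain can be modest relative to how spread out the baseline scores already were.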
When considering UST, the results of this same analysis covering the first and second years of study reveal a UST effect size of .01, which is lower than the figure presented by the authors. At first look, this comparison would raise significant concerns. However, UST freshmen scored above the national average on the Performance Task during their first semester. This should not come as a surprise, as the undergraduate program at UST is considered highly selective by several external organizations based on the ACT scores and high school grade point averages of incoming UST freshmen. Some studies have suggested that students at highly selective schools show a smaller degree of change in test scores, and that CLA results from highly selective schools will look less impressive as a result. Further discussion of this point would need to take into account how well the test adjusts for this factor by basing its predictions of student performance on the SAT and ACT scores of the students taking it.
But the most important point to note is that when comparing results on the Performance Task from the freshman year to the senior year, UST students exhibited an effect size of .84, which is rated at the top level, “Well Above Expected” (90th–99th percentile), whereas the book focuses on results from tests taken during the first year compared to the sophomore year. The dramatic difference in these results could lead one to ask why the authors focused solely on the first phase of the longitudinal analysis and did not include the full longitudinal analysis in their study. Could it be that we find support here for the extensive core curriculum at UST, which few students complete in their first two years but which seems to produce strong results on the Performance Task over the full period of undergraduate study? Could we be finding support as well for the strength of our major field programs?
As we mentioned, the Core Curriculum Committee will continue to pursue these issues as it tries to determine what the CLA results suggest about student learning at St. Thomas. The broader perspective provided by Academically Adrift will inform that discussion as it continues, although we have found good reason to think critically about that text and about the CLA instrument as we do so. Surely the designers of the test and the authors of Academically Adrift would expect nothing less.