Exploring Variation in Student Academic Performance: Can Achievement in an Immersive Case Study Project Predict Exam Score in an Introductory Accounting Course?

In modern university education, quantitative analytical skills seem best acquired through deep learning of complex, multi-faceted problems. Our quasi-experimental design tested whether student achievement in an immersive classroom case study might affect subsequent academic performance, presumably reflecting deeper learning of fundamental principles in an accounting course. We analyzed exam scores of three behavior-based student groups: (a) “OOP,” who Opted Out of the immersive case study Project, (b) “BMP,” who earned Below Median marks on the Project, and (c) “AMP,” who scored At least the Median on the Project. Results indicate that student academic performance declined at effectively equal rates among the three student groups in any given semester. Surprisingly, students’ self-reported deep strategy at the start of the school term more strongly predicted their academic performance, accounting for more than 30% of exam score variation; group membership explained only 1.93% of exam score variation. These results underscore the need to document student learning approaches explicitly in order to complement observations of student classroom behaviors and academic performance.


Introduction
Case studies are common ways to engage students in complex learning. A case study can challenge students to reflect expansively and profoundly on a particular subject (Healy & McCutcheon, 2010). However, a case study might overwhelm students conceptually and/or procedurally, especially when illustrating a fundamental organizing principle (e.g., accounting cycle) all at once. For example, Phillips and Heiser (2011) found no difference in exam performance between students who were presented the accounting cycle piecemeal, focusing on "transactions that affect only balance sheet accounts" (p. 683), and students who were presented a slightly more complete accounting cycle early in the course, focusing on "transactions that affect both balance sheet and income statement accounts" (p. 683). Given that scholars have not addressed student learning of the complete accounting cycle (i.e., from analyzing transactions through production of financial statements), we decided to investigate how current pedagogical practices in an introductory accounting course might benefit university students. One such student-centered practice is instructor-guided immersion using two consecutive accounting cycles of a hypothetical consulting company.
By the 1970s, in order to increase efficiency of learning, educational researchers shifted attention from teacher improvement to understanding the cognitive processes that students undergo. McKeachie (1990), Duff and McKinstry (2007), and Entwistle (2015) reviewed these developments in fascinating detail. Briefly, Marton and Säljö (1976a, 1976b) noted a sharp contrast in how students processed a reading assignment. Some made little effort to find meaning beyond the printed words, while others sought conceptual connection with prior knowledge; the former were identified as using a "surface approach," the latter as using a "deep approach" (Marton, 1976; Svensson, 1976; Richardson, 2015). British researchers were subsequently able to generalize this reproducible learning dichotomy across various experimental tasks and developed a number of survey instruments to measure these approaches. In addition to standardizing terminology, these survey instruments clarified certain student behaviors associated with surface vs. deep learning approaches.
Hence, the consensus view of modern university education seems to downplay rote memorization of lectured facts and procedures in favor of more interactive, synthesis-driven, and integrative approaches to learning, in short, to foster cognitively deep learning (Hall et al., 2004; Nelson Laird et al., 2014). With the goal of systematic understanding of inherent processes, deep learners characteristically self-reflect often and attempt to make conceptual connections to solve problems. In the classroom, deep learning might take place through interactive discussion of ideas that challenge longstanding beliefs (i.e., "cognitive conflict"; Sargent & Borthick, 2013). Similarly, at program and institutional levels, some "high impact" educational practices that promote student self-reflection include a combination of first-year seminars, cross-disciplinary learning communities, study-abroad opportunities, and writing-intensive courses (Wawrzynski & Baldwin, 2014, p. 56).
Arguably, the "Presage, Process, and Product" (Dunkin & Biddle, 1974; reviewed in Groccia, 2012), or "3-P," model (Biggs, 1979) provides the clearest framework for operationalizing deep learning. As depicted in Duff and McKinstry (2007), Nelson Laird et al. (2014), and Duff and Mladenovic (2015), the 3-P model holds that in the "Presage" stage, a student's personal attributes (e.g., earlier experiences that shaped learning interests) and current learning environment (e.g., physical, curricular, institutional, and social factors) interact to shape student perceptions of specific learning tasks, perceptions that might not always align with those of the instructor. In the "Process" stage, depending on task perception (e.g., easy, challenging, or impossibly difficult to relate to), the student can choose to adopt surface or deep approaches, or some combination of both. Presage factors that cumulatively fail to align with process variables result in low-quality learning outcomes in the "Products" stage. Therefore, an educator can best encourage deep learning in students by aligning presage factors that promote deep learning (e.g., relating course material to students' personal lives, providing prompt feedback to students, etc.) with deep-learning process variables (e.g., immersive classroom demonstrations) to yield the desired product (e.g., commendable exam scores).

Focus on Accounting Education
Students majoring in business comprise a significant portion of the American population attending university (see Table 318.20, National Center for Education Statistics, U.S. Department of Education, https://nces.ed.gov). From 1980 to 2012, the total number of degrees granted by U.S. baccalaureate institutions nearly doubled, growing from 935,140 to more than 1.79 million. In that same time period, the percentage of students graduating with a degree in business (separate from public administration and from legal studies) ranged from 19% to 24%. At the same time, baccalaureate degrees in the social and behavioral sciences, including anecdotally popular fields of psychology, social science, and history, actually declined from 23% to 16%. In other words, approximately one out of every five university students in the United States today has been, is, or will be enrolled in an introductory accounting course.
A fundamental framework for learning accounting principles is the accounting cycle. The accounting cycle is a series of steps grounded in accounting "best practices" for organizing financial information into a standardized format. More specifically, the accounting cycle includes (a) analyzing and recording financial transactions in journal entries, (b) posting those journal entries to a ledger that allows preparation of an unadjusted trial balance, (c) journalizing and posting adjusting entries to prepare an adjusted trial balance, (d) preparing a discretionary end-of-period worksheet to analyze and summarize the data, (e) preparing the financial statements, (f) journalizing and posting closing entries, and (g) preparing a post-closing trial balance. Warren et al. (2014) provided a detailed description of the accounting cycle, along with many helpful examples.
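To make the first two steps of the cycle concrete, the following minimal Python sketch records a few journal entries, posts them to a ledger, and checks that the unadjusted trial balance is in balance. It covers only steps (a) and (b); the account names, amounts, and function names are hypothetical illustrations and are not taken from the course textbook or from the case study used in this research.

```python
from collections import defaultdict

# Each journal entry is a list of (account, debit, credit) lines.
# All accounts and amounts are hypothetical.
journal = [
    [("Cash", 5000, 0), ("Owner's Capital", 0, 5000)],  # owner invests cash
    [("Supplies", 300, 0), ("Cash", 0, 300)],           # supplies bought for cash
    [("Cash", 1200, 0), ("Fees Earned", 0, 1200)],      # services performed for cash
]

def post_to_ledger(entries):
    """Post journal entries to a ledger of net balances (debits positive)."""
    ledger = defaultdict(int)
    for entry in entries:
        total_debits = sum(debit for _, debit, _ in entry)
        total_credits = sum(credit for _, _, credit in entry)
        if total_debits != total_credits:
            raise ValueError("journal entry does not balance")
        for account, debit, credit in entry:
            ledger[account] += debit - credit
    return dict(ledger)

def trial_balance(ledger):
    """Return (total debit balances, total credit balances)."""
    debit_total = sum(balance for balance in ledger.values() if balance > 0)
    credit_total = -sum(balance for balance in ledger.values() if balance < 0)
    return debit_total, credit_total

ledger = post_to_ledger(journal)
debits, credits = trial_balance(ledger)
print(ledger)
print("In balance:", debits == credits)
```

The remaining steps (adjusting entries, statements, and closing entries) repeat the same record-and-post pattern, which is one way to see why the cycle is described as recursive.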
The conceptual breadth and depth of the accounting cycle mandate thoughtful analysis during instruction. The accounting cycle requires students to master and synthesize many interrelated concepts, each usually overviewed separately using various illustrative examples. Unfortunately, the seemingly fragmented presentation of the accounting cycle in traditional accounting instruction (e.g., conceptual "scaffolding"; Phillips & Heiser, 2011) might also reinforce a fragmentary understanding of this inherently recursive process. Engendering a deeper understanding of the accounting cycle might, instead, require greater emphasis on system-wide reflection, one that immerses students in balancing a complete set of business transactions for a single-company case study.

Research Objectives and Hypothesis
Here, we documented the effect of voluntary participation by students in an immersive classroom case study. Our findings are important for three reasons: (a) they extend discussion of how best to present the accounting cycle, a fundamental conceptual framework for understanding accounting "best practices"; (b) they invite accounting educators to confront aspects of their teaching that might discourage naïve students from pursuing accounting careers after exposure to realistic scenarios of accounting work; and (c) they suggest how educators within as well as outside the accounting discipline might apply powerful but rarely used statistical tools to measure the effectiveness of case studies in student learning.
By definition, students who adopt deep approaches to learning the accounting cycle will eventually possess superior understanding of its many interrelated facets. Deep learners presumably learn at different rates; as a result, they are likely to demonstrate content mastery at different times during a school term. By taking careful account of as many quantifiable contributing factors as possible, including student learning approach, we can isolate the effect of voluntary participation in an immersive case study about the accounting cycle on student exam performance. More specifically, assuming classroom behaviors indicate the learning strategy that students adopt (i.e., surface approach versus deep approach), we hypothesized that level of voluntary participation in an immersive case study about the accounting cycle would predict student exam scores.

Ethics Statement
The current project was reviewed by the university's Institutional Review Board (IRB), which exempted the project from full review. We conducted the study in an acceptable educational setting and involved normal educational practices in compliance with subsection 45 CFR 46.101(b)(1) of the U.S. Federal Policy for the Protection of Human Subjects. Student records that we accessed, while confidential, were not highly classified by school administrators and, thus, were appropriate for reporting in the aggregate to guarantee student anonymity. Statistical analyses were performed only after the end of the school term in which data had been collected, precluding any influence of the current research findings on student course performance. In addition, both authors completed the online CITI Training Program on Information Privacy Security and Conflict of Interest.
Journal of the Scholarship of Teaching and Learning, Vol. 20, No. 2, October 2020. josotl.indiana.edu

Course Description
We collected student data from class rosters of Accounting Principles I courses from Summer 2017 through Spring 2019 at a southeastern U.S. university. Successful completion of this course required students (a) to demonstrate an understanding of how accounting generally "works" in any business operation, (b) to analyze, record, and summarize transactions related to financing and operating both service and merchandising businesses, and (c) to prepare and interpret basic financial statements. Readers interested in a lucid description of topical coverage in foundational accounting courses should consult Saudagaran (1996) and Warren and Young (2012). Both Accounting Principles I and its immediate follow-up, Accounting Principles II, ran eight weeks as back-to-back courses in the regular academic calendar (i.e., Fall and Spring semesters), and only five weeks back-to-back during Summer semesters. Typically, students complete Accounting Principles I and II by the end of their second year of university study.

Classroom Setting
A common mode of university classroom instruction emphasizes student collaboration after brief topical lectures. For example, the course instructor might introduce a sample problem with background information, demonstrate a solution to that problem, and then invite students to deliberate among themselves on possible solutions to a different, follow-up problem before the instructor reveals a solution. Typically, students direct questions at the course instructor at any time during the class period, and the instructor might respond forthrightly or with leading questions, invoking the Socratic method (e.g., preparing a case study for classroom presentation; Guess, 2014). In Accounting Principles I, student questions are predominantly procedural (e.g., how to calculate the maturity value of a note receivable) and sometimes conceptual (e.g., how omitting an adjusting entry might impact a financial statement).

Study Population and Focal Assignment
We classified student subjects according to their behavioral response to a voluntary assignment. To facilitate the exploratory nature of the current research, we limited the study population to students enrolled in a single instructor's course sections. The assignment was a comprehensive case-study problem involving a single hypothetical company. All financial transactions and procedural instructions were given in the course textbook (Warren et al., 2014). Though the project was materially based on widely available textbook content, the optional nature and pedagogical presentation of the project (hereafter, the Optional Cumulative Project, or OCP) afforded each student equal opportunity to learn fundamental accounting practices in a relatively low-stress environment. Students were also allowed to opt out of OCP instruction (i.e., leave class early or not attend at all during OCP instruction).
OCP instruction began during the class session immediately preceding the session when students completed the first exam. The first task of OCP was to enter journal entries on accounting paper by hand in the second month of the new (hypothetical) company's business transactions, a task that required them to transfer the post-closing balance from the first month's accounting cycle to the second. Students attempted the bulk of subsequent OCP tasks alongside the instructor in class and were required to complete unfinished tasks as homework (e.g., finish posting to the ledger and prepare the unadjusted, adjusted, and post-closing trial balances). Earning full credit required students to submit all required OCP tasks and correspondingly accurate documentation before the start of the second exam (i.e., one week later in most cases). Students who chose not to participate (i.e., failed to submit any OCP work when due or submitted only a partially completed OCP assignment) were identified as having Opted Out of the Project (i.e., "OOP" students). Students who voluntarily completed the OCP but earned Below Median marks on the Project were identified as "BMP" students, while those who achieved At least the Median score on the Project comprised the "AMP" student group.
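The grouping rule just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name, the use of `None` to mark a student who submitted nothing or only a partial project, and the sample marks are hypothetical, and the median is assumed to be taken over students who completed the project.

```python
from statistics import median

def classify_ocp_groups(marks):
    """Assign each student to 'OOP', 'BMP', or 'AMP'.

    marks maps a student id to an OCP mark, or to None when the student
    opted out (no submission or a partial one; hypothetical encoding).
    """
    completed = [mark for mark in marks.values() if mark is not None]
    if not completed:
        # Nobody completed the project, so everyone opted out.
        return {student: "OOP" for student in marks}
    cutoff = median(completed)  # assumed: median among completers only
    groups = {}
    for student, mark in marks.items():
        if mark is None:
            groups[student] = "OOP"   # Opted Out of the Project
        elif mark < cutoff:
            groups[student] = "BMP"   # Below Median on the Project
        else:
            groups[student] = "AMP"   # At least the Median on the Project
    return groups

example_marks = {"s1": 10, "s2": 7, "s3": None, "s4": 9, "s5": 4}
example_groups = classify_ocp_groups(example_marks)
print(example_groups)
```

Because students at the median fall into the AMP group, the two participating groups need not be equal in size when many students share the median mark.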
In terms of assignment weight, OCP was virtually inconsequential to final grades. The maximum number of points from required course work during any school term was 569 points such that completion of OCP over three consecutive class periods (out of 21 instructional days in each academic term) determined less than 2% (i.e., 10/569) of the final course grade.
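As a quick check of the stated weights, the short Python sketch below computes the share of the 569 required points represented by the 10-point OCP and, for comparison, by the 5 extra-credit points offered for the learning-approach survey described later. The point values come from the text; the function name is only illustrative.

```python
TOTAL_POINTS = 569    # maximum points from required course work
OCP_POINTS = 10       # points attached to the OCP
SURVEY_POINTS = 5     # extra-credit points for the survey

def share_percent(points, total=TOTAL_POINTS):
    """Return points as a percentage of the total required points."""
    return 100 * points / total

print(f"OCP weight: {share_percent(OCP_POINTS):.2f}%")           # 1.76%
print(f"Survey incentive: {share_percent(SURVEY_POINTS):.2f}%")  # 0.88%
```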
Our definition of student treatment groups based solely on observable behavioral response to a completely voluntary classroom exercise might provide insight into student learning motivation. Choy et al. (2012) argued that students should manifest their learning approaches as observable classroom behaviors before such learning approaches can reliably predict academic outcomes. Because OOP students failed to submit a completed OCP prior to Exam 2, we were not able to determine the primary motivation or reasoning for their non-participation. BMP students completed a low-quality OCP prior to Exam 2, a behavior that, nevertheless, suggests higher learning motivation than OOP students. AMP students completed a high-quality OCP prior to Exam 2, suggesting higher learning motivation than either BMP or OOP students.

Survey on Student Approaches to Learning
Within the first week of each academic term, we surveyed the general learning approaches of enrolled students. The survey instrument, Biggs' Revised Two-Factor Study Process Questionnaire, or RSPQ2F (Biggs et al., 2001), was a brief series of statements with which participants rated their agreement. Students submitted both a formal consent form and the completed RSPQ2F with only minimal incentive (i.e., extra-credit points amounting to less than 1% of all points from scheduled course activities [= 5/569]).
Earlier studies suggest that RSPQ2F adequately measures the task perception of students who adopt surface versus deep(er) approaches to learning in general (Justicia et al., 2008; Lake et al., 2017). While scholars have translated it for non-native English students (e.g., Bati et al., 2010; Stes et al., 2013; Mirghani et al., 2014; Zakariya et al., 2020), the instrument's applicability to students beyond the Hong Kong study population highlighted in Biggs et al. (2001) is questionable (Immekus & Imbrie, 2010). Moreover, Vaughan (2018) argued that Biggs' second-order factors distinguishing learning motivation from learning strategy were context-dependent.
Despite these psychometric disagreements, the conciseness and lexical clarity of the survey items on RSPQ2F are strong. Students and teachers understand that strong agreement with statements such as "I only study seriously what's given out in class or in the course outlines," "I learn some things by rote, going over and over them until I know them by heart even if I do not understand them," or "I believe that lecturers shouldn't expect students to spend significant amounts of time studying material everyone knows won't be examined" corresponds to a learning strategy that promotes superficial understanding of the course material. Likewise, strong agreement with statements such as "I find that I have to do enough work on a topic so that I can form my own conclusions before I am satisfied," "I find most new topics interesting and often spend extra time trying to obtain more information about them," or "I spend a lot of my free time finding out more about interesting topics which have been discussed in different classes" reflects a learning strategy for knowledge retention and deep understanding (e.g., accounting principles and practices).
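For readers unfamiliar with how such a questionnaire yields a deep strategy score, the Python sketch below sums item ratings into four subscales. It rests on assumptions that should be checked against Biggs et al. (2001): a 20-item instrument rated from 1 to 5 (consistent with the maximum deep strategy score of 25 reported in this study) and the item-to-subscale key shown in the code. The function and variable names are hypothetical.

```python
# Assumed scoring key: item numbers belonging to each subscale.
# Verify against Biggs et al. (2001) before any real use.
SUBSCALE_ITEMS = {
    "deep_motive": (1, 5, 9, 13, 17),
    "deep_strategy": (2, 6, 10, 14, 18),
    "surface_motive": (3, 7, 11, 15, 19),
    "surface_strategy": (4, 8, 12, 16, 20),
}

def score_rspq2f(responses):
    """Sum ratings into subscale scores.

    responses maps item number (1-20) to an integer rating from 1 to 5.
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected ratings for items 1 through 20")
    if any(rating not in (1, 2, 3, 4, 5) for rating in responses.values()):
        raise ValueError("ratings must be integers from 1 to 5")
    return {
        name: sum(responses[item] for item in items)
        for name, items in SUBSCALE_ITEMS.items()
    }

# Hypothetical respondent: neutral on everything except the deep strategy items.
answers = {item: 3 for item in range(1, 21)}
for item in SUBSCALE_ITEMS["deep_strategy"]:
    answers[item] = 5
scores = score_rspq2f(answers)
print(scores["deep_strategy"])  # 25, the maximum for a five-item subscale
```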

Student Academic Performance and Background Data
We initially considered analyzing two measures of student academic performance: final course grade and individual exam scores. Constituting multiple observations of the same student across time (i.e., time series), individual exam scores provide statistical and logistical advantages over a single final grade (Neter et al., 1990). Likewise, final course grades mask the effects of putative causal factors (e.g., individual student nervousness or misunderstanding of idiomatic written English) that seem better reflected in individual exam scores. We eventually chose exam scores as the sole measure of student academic performance because they essentially determine final course grade and, unlike final grades, provide repeated observations that allow students to act as their own control group. The latter reason is particularly important in quasi-experimental designs for detecting nuanced performance differences between pre-treatment and post-treatment observations (Thyer, 2012).
We included in our formal, statistical analysis two key items from archived student records: gender and academic load per semester. We deduced gender from the 'Mr.' or 'Ms.' designation in student records, while academic load was simply the total number of enrolled credit hours at the start of each semester. Notably, summer-enrolled students tend to carry lower academic loads but are challenged with an accelerated instructional and assessment schedule.

Quantitative Analysis of Exam Scores
We borrowed statistical analytical procedures akin to those successfully implemented in animal behavior research (Olvido & Wagner, 2004; Olvido et al., 2010). First, to minimize confounding effects of heterogeneous variance (Sokal & Rohlf, 1981), we transformed all students' exam scores (originally expressed as percentages) to normalized ranks, thus allowing subsequent statistical tests to reach asymptotic relative efficiencies (~maximum statistical power) equal to or greater than those of any parametric or non-parametric test (Conover, 1999). Also, unlike simple ranks, normalized ranks facilitate unbiased tests of interactions (Conover, 1999).
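One common way to compute normalized ranks is sketched below in Python. The text does not state the exact formula used, so this sketch assumes normal scores of the form Φ⁻¹(R / (n + 1)), where R is the rank of a score among n scores and tied scores share their average rank. The function names and sample scores are hypothetical.

```python
from statistics import NormalDist

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def normalized_ranks(values):
    """Map scores to standard normal quantiles of rank / (n + 1)."""
    n = len(values)
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(rank / (n + 1)) for rank in average_ranks(values)]

exam_percentages = [72.5, 88.0, 88.0, 95.0, 60.0]  # hypothetical scores
print(average_ranks(exam_percentages))   # [2.0, 3.5, 3.5, 5.0, 1.0]
print([round(z, 3) for z in normalized_ranks(exam_percentages)])
```

Dividing by n + 1 rather than n keeps every quantile strictly between 0 and 1, so the highest score still maps to a finite normal score.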
Then, we applied the following repeated-measures analysis of variance (ANOVA) model:
\[
Y_{ijklmn} = \mu_{\cdot\cdot\cdot\cdot\cdot\cdot} + \mathrm{EXAM}_{i} + \mathrm{TERM}_{j} + \mathrm{GEND}_{k(j)} + \mathrm{PROJ}_{l(jk)} + \mathrm{LOAD}_{m(jkl)} + \mathrm{DEEP}_{n(jklm)} + (\mathrm{EXAM} \times \mathrm{TERM})_{ij} + (\mathrm{EXAM} \times \mathrm{GEND})_{ik(j)} + (\mathrm{EXAM} \times \mathrm{PROJ})_{il(jk)} + (\mathrm{EXAM} \times \mathrm{LOAD})_{im(jkl)} + \varepsilon_{(ijklmn)}
\]
where $Y_{ijklmn}$ is the normalized-rank score on the $i$th exam earned by a student in the $j$th academic term, of the $k$th gender and $l$th OCP treatment group, with $m$th academic load and $n$th deep strategy score; $\mu_{\cdot\cdot\cdot\cdot\cdot\cdot}$ and $\varepsilon_{(ijklmn)}$ are model terms for, respectively, the common mean and unexplained variance (= error); $\mathrm{EXAM}_{i}$ denotes the repeated-factor and fixed-effect model term for within-student variation among exam scores (i = 1, … , 7), noting that we excluded students who did not participate in all seven exams; $\mathrm{TERM}_{j}$ is the fixed-effect model term denoting variation in exam scores among Summer, Fall, and Spring academic terms (j = 1, … , 6); $\mathrm{GEND}_{k(j)}$ is the random-effect model term denoting variation in exam scores due to student gender (k = 1, 2) and nested within academic term; $\mathrm{PROJ}_{l(jk)}$ is the random-effect model term denoting variation in exam scores among OCP student groups (i.e., AMP, BMP, and OOP; l = 1, 2, 3) and nested within academic term and student gender; $\mathrm{LOAD}_{m(jkl)}$ is the random-effect model term denoting variation in exam scores due to per-semester academic load (m = 1, … , 13), which varied from 3 to 20 credit-hours, and nested within academic term, student gender, and OCP student group; $\mathrm{DEEP}_{n(jklm)}$ is the random-effect model term denoting variation in exam score due to student self-reported deep strategy score (n = 1, … , 14), which varied from 10 to 23 in our student sample, and nested within academic term, student gender, OCP student group, and student academic load; and the four $(\mathrm{EXAM} \times \cdot)$ terms denote the two-way interactions of exam number with academic term, student gender, OCP student group, and academic load.
Because we measured individual student exam performance only once in each examination period, the ANOVA model's error term not only remained unreplicated but was also conflated with variation due to the two-way interaction between exam number and deep-strategy score group (i.e., an error-inflated $(\mathrm{EXAM} \times \mathrm{DEEP})_{in(jklm)}$ model term). The lack of within-subject, within-treatment replication only renders the corresponding F tests conservative and, thus, less likely to detect statistically significant variation (Neter et al., 1990).
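The repeated-measures layout behind this model, with one observation per student per exam and students retained only if they sat all seven exams, can be sketched in Python as follows. The data structure, function name, and sample scores are hypothetical; the sketch only illustrates the exclusion rule and the resulting count of observations.

```python
N_EXAMS = 7  # number of exams per student, from the text

def to_long_format(scores_by_student):
    """Return (student, exam_number, score) rows for complete records only.

    scores_by_student maps a student id to a list of exam scores,
    with None marking a missed exam (hypothetical encoding).
    """
    rows = []
    for student, scores in scores_by_student.items():
        if len(scores) != N_EXAMS or any(score is None for score in scores):
            continue  # exclude students who did not take all seven exams
        for exam_number, score in enumerate(scores, start=1):
            rows.append((student, exam_number, score))
    return rows

sample_scores = {
    "s1": [90, 85, 80, 82, 78, 75, 70],
    "s2": [70, 72, None, 68, 65, 60, 58],  # missed Exam 3, so excluded
    "s3": [88, 90, 84, 86, 80, 79, 77],
}
long_rows = to_long_format(sample_scores)
print(len(long_rows))  # 14 rows: 2 retained students x 7 exams
```

With 133 retained students, the same layout gives 133 × 7 = 931 observations, each appearing exactly once, which is why the highest-order interaction cannot be separated from the error term.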
Third, to describe the relative contribution of the ANOVA model terms to total variation in exam scores, we used restricted-maximum likelihood (ReML) estimation procedures to quantify variance components. In requiring independence of exam scores (i.e., uncorrelated observations), ReML estimation of variance components cannot appropriately evaluate statistical significance (unlike repeated-measures ANOVA) and only quantitatively describes the importance of each ANOVA model term to student exam performance.
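Once variance components have been estimated, expressing each as a share of total variation is simple arithmetic, as the Python sketch below shows. The component values are hypothetical placeholders chosen for easy checking and are not the estimates from this study; the estimation itself (ReML) is not reproduced here.

```python
def percent_contributions(variance_components):
    """Express each variance component as a percentage of their sum."""
    total = sum(variance_components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        source: 100 * estimate / total
        for source, estimate in variance_components.items()
    }

# Hypothetical variance component estimates (not the study's values).
illustrative_components = {"EXAM": 2, "DEEP": 6, "PROJ": 1, "error": 11}
shares = percent_contributions(illustrative_components)
for source, percent in shares.items():
    print(f"{source}: {percent:.1f}%")
```

Because the shares are descriptive ratios, a source can contribute a visible percentage of variation without reaching statistical significance in the corresponding F test.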
Finally, we performed all statistical computations and summaries using Windows-based computer applications. For most descriptive statistics and calculation of all F-test mean-squares and ReML-based variances, we used the MEANS, GLM, and VARCOMP procedures, respectively, in SAS 9.3 (SAS Institute, 1999), with final P values obtained online (Soper, 2018). Organizing the various results into more reader-friendly format, we used Apache OpenOffice programs (Version 4.1.2; The Apache Software Foundation, 2014) to generate all figures and summary tables.

Sources of Variation in Student Exam Scores
For any given student, exam scores significantly varied across the course (see Line 1, Table 1). Mean exam scores appeared to decline across the seven exams (see Figure 1). The interaction of exam number with other variance factors (see Lines 7-10, Table 1), including OCP student group (see Line 9, Table 1), was not statistically significant, indicating that exam scores generally declined at equal rates over time across the three OCP student groups (see Figure 1).
Note to Table 1. N = 133 students × 7 exams = 931 observations. Due to the nature of a repeated-measures design, the last interaction term (Line 11) cannot be evaluated separately from unexplained (= error) variance. § = degrees of freedom. *** p < 0.001.
Exam scores significantly varied across deep strategy scores (see Line 6, Table 1). Students who self-reported the lowest and the highest deep strategy scores tended to achieve the highest exam scores (see Figure 2). The lowest exam scores (i.e., cumulative final exam) appeared to correspond to those students who had self-reported a deep strategy score of 18 out of a maximum 25 on the Biggs RSPQ2F survey instrument.

Relative Contribution of Variance Sources
Only two of the six main-effect ANOVA factors significantly accounted for variation in exam scores (see Lines 1 and 6 in Table 1). Deep strategy score accounted for the largest share (33.3%; see Line 6, Table 2), with a considerably smaller contribution from exam number (see Line 1 in Table 2). We also found notable non-zero contributions from statistically non-significant variance sources. In descending order, these sources were (a) academic term, (b) interaction between exam number and academic load, (c) academic load, (d) OCP student group, and interactions of exam number with (e) academic term and (f) student gender.

Discussion
The results do not definitively support our hypothesis that OCP participation predicts subsequent exam scores. Given the general decline in exam scores over the course of a semester, if the results had supported our hypothesis, then exam scores of BMP students would have declined at a faster rate than those of AMP students, while exam scores of OOP students would have declined the fastest. Instead, the lack of interaction between exam number and OCP student group (see Line 9, Table 1), together with the statistical non-significance of OCP student group as a main factor (see Line 4, Table 1), suggests that OCP participation had little influence on the overall trajectory of academic performance across a semester. This particular finding is entirely consistent with Wynn-Williams et al. (2016), who found that experimentally assigned unstructured case studies that they expected to promote deeper learning in accounting students actually had no effect on final course grades.
The slight performance increases we found in Exam 2 and Exam 4 are, therefore, worth additional attention (see Figure 1). We attribute the uptick in Exam 2 scores by AMP students to OCP participation: the small (<2%) variance contribution by the PROJ model term (see Line 4, Table 2) seems to reflect, in part, an immediate but temporary benefit of OCP instruction to AMP students (vs. presumably less engaged students in the other OCP groups). Later in the academic term, after further exposure to discrete elements in the accounting cycle, BMP and OOP students (and AMP students to a much lesser degree) scored higher on Exam 4, perhaps reflecting a better grasp of the accounting cycle. Such nuanced gains, however, were fleeting because scores after Exam 4 declined synchronously across the three OCP student groups (see Figure 1). While attributing post-Exam 4 declines to student stress or distraction from upcoming mid-term exams and project deadlines in other courses is tempting, the lack of significant effects from academic load by itself (see Line 5, Table 1) or from its interaction with exam (see Line 10, Table 1) suggests otherwise. Because many students modify their schedules by dropping and adding courses in the first week of an academic term, perhaps scholars might investigate the effects of academic load by comparing the type and amount of homework per week that might be expected in concurrent math and/or writing-intensive courses.
Perhaps our most surprising result is the significant degree to which self-reported deep strategy scores predicted exam scores (see Line 6, Table 1). While students scoring the highest on the deep-strategy scale tended to perform well across the seven exams, students on the opposite end of the deep strategy scale also performed nearly the same. Notably, one student posting the lowest deep-strategy score actually performed as well as the eight students scoring 22 and 23 (out of 25) on the deep strategy scale (see Figure 2). A post hoc correlation analysis seemed to confirm the contours of a "shallow depression" in Figure 2, that is, a nonlinear relationship between self-reported deep strategy score and Exam 1 score (partial Pearson r = 0.047, p > 0.05). Together, these ambiguous results seem to agree with the findings of Gijbels et al. (2005) and Choy et al. (2012) regarding the generally weak but positive relationship between deep learning and classroom achievement.
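A near-zero Pearson coefficient is what one would expect when high scores occur at both ends of a scale, because the coefficient captures only the linear component of a relationship. The Python sketch below illustrates this with made-up data forming a symmetric dip; it uses an ordinary (not partial) Pearson correlation, and none of the values come from this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Ordinary Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(spread_x * spread_y)

# Hypothetical deep strategy scores and a U-shaped pattern of exam scores.
deep_scores = [10, 12, 14, 16, 18, 20, 22]
u_shaped_exam = [60 + (d - 16) ** 2 for d in deep_scores]
linear_exam = [50 + 2 * d for d in deep_scores]

print(pearson_r(deep_scores, u_shaped_exam))  # 0.0 despite a clear pattern
print(pearson_r(deep_scores, linear_exam))    # close to 1 for a straight line
```

In practice, a plot or a model with a quadratic term would be needed to distinguish a curved relationship of this kind from no relationship at all.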
Consistent with our thinking, Turner and Baskerville (2013) suggested that (more) experience of deep learning is a necessary prelude to highly valued social behaviors, including problem solving, teamwork, and communication. Students in the current study, most of whom were in their 2nd year of university study, might have misperceived their own learning intent and study practices, resulting in misleading scores on the student approaches to learning (SAL) survey. Indeed, Ehrlinger et al. (2008) found that underperforming students overestimated their own abilities, even with substantive monetary incentives for accurate self-assessment. Follow-up studies that measure the relationship between self-reported learning approaches and observed classroom behaviors would be worthwhile.
The current study has some limitations that open pathways to future research. For one, a larger and more robust sample of university students across various course sections taught by different instructors would not only facilitate generalization to other student populations but also allow future researchers to incorporate into their statistical analyses other explanatory variables of academic performance (e.g., prior learning; Jones & Fields, 2001). Larger sample sizes would also allow investigation into explicit student behaviors outside the classroom and demographic factors such as ethnicity, socio-economic status, household income, parental education, job-related work hours, etc. In addition, scholars might consider the degree to which procrastination (Rotenstein et al., 2009) and all-night "cramming" (Hershner & Chervin, 2014) affect exam performance, though earlier findings suggest that cramming benefits achievement- and goal-oriented students (e.g., Brinthaupt & Shin, 2001). Researchers should also investigate student academic performance beyond a single course, what Lorz et al. (2013) referred to as the "stability of the dependent variable" (p. 143). If self-identified "slow learners" learned accounting principles at all, then a longer-term longitudinal study, along the lines of Sargent and Borthick (2013), who analyzed prior learning and student GPA, might more properly document deep learning in the form of improved academic performance in later, upper-division coursework.
Moreover, establishing causality requires actual experimental manipulation, along with proper controls for myriad and possibly confounding factors (see Baeten et al., 2010 for an excellent review). Though perfectly applicable to diverse fields of human sciences, including healthcare and higher education (Thyer, 2012), our quasi-experimental design only allowed us to explore possible variables affecting student academic performance. To the extent allowed by ethical practice, scholars should consider running parallel course sections, whereby only one section would contain the experimental variable of interest (e.g., OCP of the current study).
Finally, without proper replication by researchers in other academic fields, our findings remain (only) discipline- and population-specific. While an optional cumulative project about the accounting cycle explained only 1.93% of variation in exam performance by business majors, similarly immersive and discipline-specific tutorials might have different effects on exam performance and learning outcomes in other fields. For example, as the raison d'être of science laboratory courses, immersive simulations and practicums seem indispensable to sound training of future healthcare professionals (Sanko, 2017). Examining board certification exam outcomes as a function of participant presence in training simulations might yield powerful insights (Makransky et al., 2020).

Conclusion
Our findings did not definitively support the hypothesis that voluntary participation by students in an optional cumulative project (OCP) would predict subsequent exam performance. However, we successfully quantified the contribution of salient factors, including academic load, OCP participation, and self-reported adoption of deep learning strategies, to student exam performance (and, by extension, to final course grades). Whether the more successful students (with or without the benefit of interactive tutorials about the accounting cycle) actually exercised deep learning strategies that might explain their relatively high exam scores is a question worth addressing in future studies.