Iteration in an Inquiry-Based Undergraduate Laboratory Strengthens Student Engagement and Incorporation of Scientific Skills

The advantages of active learning approaches have prompted national recommendations for the development of inquiry-based laboratories to replace traditional laboratory classes. However, there is little consensus on the most effective implementation strategies. Frequently, a single inquiry-based exercise is incorporated at the end of a traditional course and students have little opportunity to repeat the experience before moving on to new courses. To test whether multiple rounds of inquiry would be beneficial, we incorporated three rounds of inquiry-based experiments during a redesign of a traditional upper-level undergraduate developmental biology laboratory class. After the second and third rounds of inquiry, students gave slideshow presentations of their projects and received peer and instructor feedback. We then designed and validated a scoring rubric to assess student use of scientific skills. Substantial improvements were observed in five of seven categories of scientific skills when comparing student performance from the third round of projects to the second round. Surprisingly, these gains were not diminished when students in the course were given the rubric to use as a guide. Anecdotal evidence and responses to student questionnaires revealed substantial levels of student interest and engagement in the course. Overall, these results indicate that incorporating iterative rounds of inquiry-based laboratories is a promising strategy for teaching scientific skills, enhancing student engagement, and promoting learning.


Introduction
Science, technology, engineering and mathematics (STEM) education has been a national priority since the onset of the space race in the 1950s, yet recent estimates suggest the United States will need 1 million more STEM professionals over the next decade than it currently produces (Olson & Gerardi Riordan, 2012). A practical way to boost the numbers of STEM professionals is to increase student retention since it is estimated that less than 40% of college-level STEM majors complete their STEM degree (Olson & Gerardi Riordan, 2012). One of the key emerging strategies to improve student retention is through increased levels of student interest and engagement in their STEM courses (Graham, Frederick, Byars-Winston, Hunter, & Handelsman, 2013).
Inquiry-based learning is a pedagogical approach to education that uses active learning to enhance student interest and engagement. Inquiry-based learning incorporates the philosophical concept of inquiry as an approach to the generation of knowledge into the field of education, whereby student learning occurs through a process, including posing questions and interacting with the course materials. This approach can be contrasted with more traditional education practices that rely upon the transmission of a series of established facts and knowledge from teacher to student.
Learning in laboratory-based classes is already active by its very nature, but traditional laboratory classes often use "cookbook" exercises, where students follow a set experimental protocol to generate predetermined results. Education reformers have long advocated for the incorporation of scientific inquiry and discovery-based approaches into laboratory classes (Chiappetta, 2008; National Research Council (NRC), 2000; Schwab & Brandwein, 1962). Inquiry-based laboratory classes have been shown to increase student engagement and learning (Corwin, Graham, & Dolan, 2015; Gormally, Brickman, Hallar, & Armstrong, 2009). Engagement in course-based research experiences has also been shown to positively impact retention and graduation rates in STEM majors (Lopatto, 2007; Rodenbusch, Hernandez, Simmons, & Dolan, 2016; Seymour, Hunter, Laursen, & Deantoni, 2004). These laboratory classes can also help to bridge the disconnect between student coursework and the actual practice of science (Anders, Berg, Christina, Bergendahl, & Lundberg, 2003; Bevins & Price, 2016). The benefits of these approaches have prompted a national recommendation to "advocate and provide support for replacing standard laboratory courses with discovery-based research courses" (Olson & Gerardi Riordan, 2012).
Inquiry-based laboratory exercises can take many forms and the best practices for inquiry-based approaches remain an active topic of discussion (Academies, 2015). For example, inquiry can be teacher-directed, guided inquiry or a more fully student-directed, open inquiry that incorporates hypothesis generation and experimental design by the students (Minner, Levy, & Century, 2010; Sadeh & Zion, 2009). Frequently, inquiry-based experiences consist of a single laboratory session at the end of a traditional laboratory class, but they can also include one semester-long project or multiple rounds of inquiry-based experiments. Unfortunately, there are few examples in the literature analyzing the effects of multiple rounds of inquiry.
In this manuscript, we measured the effects of multiple rounds of inquiry using a Scholarship of Teaching and Learning (SoTL) based approach. SoTL-based approaches typically consist of teams with education experts and university professors that implement innovative teaching strategies, measure the effects of these strategies on student learning, and then share their results with others in the field (Kreber, 2007; O'Brien, 2008; Shulman, 2012). A distinction between SoTL and more traditional educational research is the increased emphasis on discipline-specific learning in university settings. Here, we report results showing that multiple rounds of inquiry have beneficial effects on student engagement and incorporation of scientific skills in an advanced undergraduate developmental biology laboratory class.

Technology. The laboratory portion of the class consists of a three-hour session once a week over a 15-week semester. The course is required for majors in two undergraduate programs: General Biology and Premedical Biology, and is an elective for other undergraduate programs including: Aquaculture, Biomedical Sciences, Conservation Biology and Ecology, Genomics and Molecular Genetics, Marine Biology, and Molecular Biology. The demographics of Florida Tech are 69% male and 31% female, and 51% Caucasian, 22% international, 10% African American, 10% Hispanic, 2% Asian, and 5% other (https://www.fit.edu/institutional-research/student-diversity-data/).

Course redesign
We started with a traditional developmental biology lab class that was divided into two sections. During the first half of the semester, students examined sectioned slides of different embryonic animals and learned the developmental anatomy for major systems (nervous, digestive, reproductive, etc.). During the second half of the semester, the students worked in groups on a series of mini-projects that involved exposure to several different experimental model systems including sea urchins, planaria, Caenorhabditis elegans, and Xenopus tropicalis. At the end of the course, students submitted written lab reports on the mini-projects. Data were collected from this traditional laboratory class with 37 students over two semesters.
The course was redesigned to feature three distinct rounds of structured inquiry-based experiments. In the initial round, students working in groups were introduced to the slime mold Dictyostelium discoideum and fruiting body formation. Students were guided to collect background information and then design and conduct their own hypothesis-driven experiments. There was no assessment after the first round. Then, the students conducted a second round of inquiry-based experiments involving either egg laying or early development using the nematode roundworm C. elegans. At the end of the second round, students gave an oral presentation that included the hypothesis, methods, results, conclusions and limitations of their project. The students were given the option of presenting their projects from either the first or the second round. Presentations were followed by a short peer discussion in which other students were asked to offer the presenters positive remarks about what they liked and constructive criticism about what could be improved. The presentations were graded by the instructor based on student performance, but grades were not based on the rubric that was later used to assess the effectiveness of the course. In the third round of inquiry-based experiments, students were introduced to C. elegans as a model system for aging and neurodegenerative disease and then the students repeated the inquiry process, gave another slideshow presentation, and received feedback as before. Survey data and classroom presentations were collected and analyzed for 54 students working in 18 groups over three semesters.

Rubric development
A rubric was developed to measure student incorporation of scientific skills in order to evaluate the effectiveness of multiple rounds of inquiry. The rubric was developed following recommendations provided in the peer-reviewed literature (Allen & Knight, 2009; Moskal & Leydens, 2000; Stevens & Levi, 2004). The rubric was constructed by a committee consisting of a faculty member specializing in educational research with a strong biology background, two faculty members with their own active research groups in the biological sciences, who were also the previous and current course instructors, and a graduate student in the Biological Sciences PhD program. Each category was scored on a scale of 1-5, with the most complete and least acceptable benchmarks determined first, and then graduated responses were developed to fill in the rubric (Table 1). This was done in order to identify the important standards and to normalize scoring between instructors. For example, in the first category, theoretical framework, presentations were assessed by the amount and quality of background information upon which the project was based and the degree to which it supported the project. The maximum score of 5 was given to projects with relevant and accurate background information that was supported by peer-reviewed research. A score of 4 was given for background information that was supported but could be expanded. A score of 3 was given if projects contained only some background information with little support. A score of 2 was given if projects contained some background information with no support. Finally, a score of 1 was given for no background information and no support.
Wiseman, Carroll, Fowler, and Guisbert. Journal of the Scholarship of Teaching and Learning, Vol. 20, No. 2, October 2020. josotl.indiana.edu
Other categories assessed the quality and testability of the hypothesis, the quality and quantity of quantitative analysis, interpretation of data, discussion and conclusions, presentation of limitations, and general organization of the presentation.
Several different lines of evidence were used to validate and assess the rubric. The rubric uses criterion evidence for its validity, as it reflects the abilities of the students to practice scientific skills in an environment that mimics a research laboratory (Moskal & Leydens, 2000). The rubric construction provides initial a priori construct validity, as the rubric was developed by practicing scientists to assess scientific skills. The rubric is an analytic rubric, as it is designed to measure seven largely independent categories of scientific skills, but it likely retains holistic aspects since these skills are somewhat interdependent (Moskal, 2000). The rubric was optimized after the three faculty members independently used it to assess several student presentations. Similar scoring by the three assessors demonstrated the reliability of the rubric and the robustness of the approach (Andrade, 2005).

Scientific skills assessment
After the course was concluded, student presentations were collected, stripped of identifying information, randomly assigned a number, and then independently judged using the rubric by the three faculty evaluators who were involved in rubric design. Since the Graduate Teaching Assistant worked with the students on the projects, she did not participate in the evaluations, to prevent the introduction of any bias. All of the students in the redesigned course had the same course instructor and the same graduate teaching assistant. In order to avoid bias from race, gender, or student speaking skills, student presentations were not recorded and the scoring was based only on the slideshows. Scores were compiled and the results were evaluated for statistical significance using a Student's t-test (unpaired, two-tailed); raw p-values < 0.05 are marked with an asterisk.
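For readers who want to reproduce this kind of comparison, the following is a minimal Python sketch of an unpaired, two-tailed Student's t-test. It is not the authors' analysis code: the pooled (equal-variance) form is an assumption, since the text specifies only "unpaired, two tailed", and the function names and example scores are hypothetical. The p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t_test(a, b):
    """Unpaired Student's t-test, assuming equal variances (pooled).

    Returns (t statistic, degrees of freedom, two-tailed p-value).
    Raises ZeroDivisionError if both samples have zero variance.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


# Hypothetical rubric scores for one category (not data from the study).
second_round = [2.0, 2.3, 1.7, 2.0, 2.7, 2.0, 1.7, 2.3, 2.0, 2.3]
third_round = [3.7, 4.0, 3.3, 3.7, 4.0, 3.3, 3.7, 4.0, 3.7, 3.3]
t_stat, dof, p_value = student_t_test(second_round, third_round)
print(f"t = {t_stat:.2f}, df = {dof}, significant at 0.05: {p_value < 0.05}")
```

If the two rounds had clearly unequal variances, Welch's variant of the test would be the usual alternative; the pooled form is shown here only because it is the classical "Student" version.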

End of semester surveys
Surveys were distributed to students at the end of each semester as part of the standard procedure for all courses at the Florida Institute of Technology. The surveys are Scantron forms that allow students to evaluate their instructor and the course anonymously. The assessment comprises 29 multiple choice questions and three write-in questions. Fifteen of the multiple choice questions cover instructor performance, six pertain to the course, and eight are for student demographics. The three write-in questions allow the students to explain what they found most valuable, identify areas for improvement, and give additional comments (Table 2). Surveys were completed during the last laboratory class session after the instructor left the room, and they were collected by a student in the course. Instructors do not see these evaluations until after the semester is completed. The results were compiled and evaluated for statistical significance using a Student's t-test (unpaired, two-tailed). IRB approval was obtained (approval number 16-128).

Results
To quantify the effects of multiple rounds of inquiry-based experiments, a traditional, semester-long undergraduate developmental biology laboratory course was redesigned to include multiple rounds of inquiry. Rubrics from prior studies evaluating course-based research experiences have focused on general knowledge gains or emphasized student perception of benefits rather than instructor-evaluated gains (Auchincloss et al., 2014; Luckie et al., 2012). Therefore, we designed a new rubric to assess whether multiple rounds of inquiry-based laboratories were effective at teaching authentic scientific research skills in order to prepare students for careers in the biological sciences. As described in the methods, the rubric was designed and validated by a committee consisting of the past and current instructors for the course, an expert in science education, and the graduate student who served as the graduate teaching assistant (GTA) for the laboratory component of the course. One limitation of our approach is that the rubric is not appropriate for assessing cookbook laboratory classes that only use a subset of authentic scientific research skills. Therefore, we were unable to use the rubric to directly compare student outcomes from the previous course with the course redesign. However, we were able to measure student gains during different rounds of inquiry within the redesigned course. Student presentations were collected over two semesters of the course and used to assess the effectiveness of the course (k = 10 groups). Presentations were independently and blindly scored using the rubric and the results were averaged to generate a mean score for each category. Student incorporation of scientific skills was measured by the rubric in seven categories: theoretical framework, quality of hypothesis, quantitative analysis, data interpretation, discussion/conclusion, limitations, and organization (Table 1).
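The averaging step described above can be sketched in a few lines of Python. This is an illustrative sketch only: the data layout, the function name `category_summary`, and the example scores are hypothetical, and the choice to average the reviewers within each group before computing the standard error of the mean (SEM) across groups is an assumption, since the text does not state how the SEM was calculated.

```python
from math import sqrt
from statistics import mean, stdev


def category_summary(presentations):
    """Return {category: (mean score, SEM)} across student groups.

    presentations: one dict per group, mapping a rubric category to the
    list of 1-5 scores given by the individual reviewers.
    """
    summary = {}
    for category in presentations[0]:
        # Average the reviewers first, so each group contributes one value.
        group_means = [mean(group[category]) for group in presentations]
        sem = stdev(group_means) / sqrt(len(group_means))
        summary[category] = (mean(group_means), sem)
    return summary


# Hypothetical scores for three groups and two of the seven categories.
example = [
    {"quantitative analysis": [2, 2, 3], "limitations": [3, 3, 2]},
    {"quantitative analysis": [4, 4, 4], "limitations": [3, 4, 3]},
    {"quantitative analysis": [3, 3, 3], "limitations": [2, 3, 3]},
]
for category, (avg, sem) in category_summary(example).items():
    print(f"{category}: mean {avg:.2f}, SEM {sem:.2f}")
```

Treating each group, rather than each individual reviewer score, as the unit of analysis keeps the sample size equal to the number of groups (k) reported in the text.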
Comparison of the scores between the second and third rounds of inquiry revealed a significant improvement in five of the seven categories (Figure 1A). The largest improvement was in quantitative analysis, where the scores went from an average of 2.1 in the first round of inquiry to an average score of 3.7 in the second. A score of 2 indicates "little quantification" and a score of 4 demonstrates that "quantification is present and mostly clear", suggesting that student improvement reflects a shift from largely qualitative or descriptive approaches towards more quantitative experiments. Improvements were also seen in student use of background data supported with peer-reviewed research (theoretical framework), clearly defining testable hypotheses (quality of hypothesis), presenting conclusions (discussion/conclusions) and overall presentation skills (organization). Two of the categories, interpretation of data and limitations, had an increase in the average scores that did not reach the level of statistical significance. Together, these results demonstrate numerous improvements in student incorporation of scientific skills between the second and third rounds of inquiry.
One caveat for these results is that the students did not have access to the rubric since the rubric was constructed after the course was finished. To address this issue, the same framework was used in a third semester to analyze the results from students who were given access to the rubric prior to their first presentations (k = 8 groups). Comparison of the scores between the second and third rounds of inquiry for these students revealed almost identical benefits as observed in the initial set of students (Figure 1B). The only difference was that in the second group of students, all seven categories of scientific skills had statistically significant increases. This indicates that the observed improvements in student incorporation of scientific skills were unaffected by student knowledge of the assessment criteria and were remarkably robust.
A student survey was conducted at the end of each semester to evaluate student experiences with the course. Two years of student surveys from the traditional course before the redesign were compared to two years of student surveys from the redesigned inquiry-based course. Student responses on most questions were not substantially different, but a small, statistically significant decrease (p-value = 0.03) was observed for student perception of learning. The number of students who strongly agreed that they learned a great deal in the course dropped from 95% before the redesign to 78% after implementation (Table 2). However, the overall perception of learning in the inquiry-based class still remained high with 98% of students either agreeing or strongly agreeing that they learned a great deal in the course.
Student surveys also contained sections for comments about the course. The comments were classified as positive if the student indicated that they liked an aspect of the course. In the traditional course, 56% of students left comments on their evaluations of which 61% were positive comments regarding how the lab was run (n=37). After implementation of inquiry, the percentage of students leaving comments increased to 70% of which 70% were positive (n=54). Therefore, the fraction of positive student comments went from one-third (34%) of the students in the traditional classes to almost one-half (49%) in the inquiry-based classes. Furthermore, the types of responses were qualitatively different as they focused on the laboratory freedom and the experimental design process. For example, one student wrote: "I really appreciated the liberty we had in planning and designing our own experiments because we were actually able to get into certain topics and areas that interested us." Another student stated: "This lab made me really excited as a scientist to be doing my own experiments about something I'm interested in." All comments that were classified as positive are listed in Table 3. Overall, student comments reflected an increased interest in and engagement with the iterative inquiry-based course compared to the traditional course, which matched anecdotal evidence observed by the course instructors.

Figure 1. A third round of iterative inquiry leads to substantial improvements in scientific skills. Student presentations were independently scored from 1-5 on the seven listed categories of scientific skills by three reviewers. Scores from the second round of inquiry-based experiments were compared to scores from the third round. Error bars reflect SEM. Asterisks reflect a p-value < 0.05.
A) Results from students that did not receive the rubric. k = 10 groups of students. B) Results from students that received the rubric prior to their presentations. k = 8 groups of students.

Table 3: Positive student responses after the course redesign. End of semester student surveys were collected from students after the course redesign and the responses to the question: "What did you find most valuable about the course" are listed in alphabetical order. Duplicated comments are represented only once.
"Designing and performing the experiments"
"Designing our own experiments!"
"Designing your own experiments"
"Developing own experiment was very valuable. Best lab I've ever taken"
"Developing own experiments"
"Freedom to experiment"
"Freedom"
"Helped us develop lab skills. Creating our own experiments is beneficial to the real world"
"How laid back the lab was and how we had to do most of the things on our own"
"I really appreciated the liberty we had in planning and designing our own experiments because we were actually able to get into certain topics and areas that interested us"
"Independent experimentation"
"Lenient lab schedule and making own experiments! Much more exciting"
"Liked designing my own experiments"
"Loved C. elegan lab! Great course!"
"Loved the freeness of the lab"
"Open lab, keep everything as it is"
"Practical experiences"
"Self designed experiments"
"Students were able to design their own experiments which was great and a lot of fun"
"The freedom to do experiments that interested you"
"The freedom to run our own experiments which I loved"
"This lab made me really excited as a scientist to be doing my own experiments about something I'm interested in"
"You were able to think on your own and create your own experiments"

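The shares of students leaving a positive comment reported in the Results (34% and 49%) follow from multiplying the fraction of students who commented by the fraction of those comments that were positive. The short Python check below reproduces that arithmetic from the percentages given in the text; the function name is hypothetical.

```python
def positive_share_of_all_students(commenting, positive_among_comments):
    """Fraction of all students who left a positive comment."""
    return commenting * positive_among_comments


# Percentages as reported in the text.
traditional = positive_share_of_all_students(0.56, 0.61)  # traditional course
redesigned = positive_share_of_all_students(0.70, 0.70)   # inquiry-based course

print(f"Traditional: {traditional:.0%}")
print(f"Inquiry-based: {redesigned:.0%}")
```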
Discussion
We found that implementation of iterative inquiry in an undergraduate developmental biology laboratory led to two specific, measurable positive outcomes. First, student interest and engagement in the class were increased when compared to a traditional laboratory class. Second, students in the course experienced significant gains in scientific skills after the third round of inquiry when compared to the second round. The increase in student interest and engagement is evidenced by positive comments collected in student surveys at the end of the course and anecdotal evidence observed by the course instructor and graduate student assistant. These increases appear to be based on student appreciation for the freedom to pursue their own interests and excitement about the possibility of discovery. However, it must be noted that the traditional laboratory class and the inquiry-based laboratories were supervised by different course instructors and graduate student assistants, which could potentially confound these results. Nevertheless, these results add the benefit of iterative inquiry to a growing body of literature demonstrating that inquiry-based approaches are superior to traditional laboratory classes and support recommendations to replace traditional laboratory classes with inquiry-based and discovery-based approaches.
In contrast, we found that iterative inquiry had a small, negative effect on student perception of learning. This decrease has also been observed in other inquiry-based approaches (Henige, 2011). These results indicate an important limitation of inquiry-based experiments that involve more focused research experiences at the expense of a breadth of experimental approaches. However, it has been reported that multiple rounds of inquiry outperform traditional laboratory classes when assessing performance with standardized testing (Luckie et al., 2012). Therefore, this decrease may reflect only student perceptions and not an actual decrease in comprehension. If the decrease in perceptions results from a focus on scientific skills that can be challenging for students to define and quantify, then the decrease might be ameliorated by incorporation of self-reflective exercises or other methodologies. In the future, it will be interesting to correlate this decrease with overall student performance and demographics to investigate whether this differentially affects particular subsets of students. Unfortunately, we cannot address this issue in the current study as the surveys we used were anonymous.
Most assessments for inquiry-based approaches rely primarily upon student feedback. In contrast, our rubric enables a more unbiased assessment measuring the effects of iterative rounds of inquiry on scientific skillsets. These scientific skills, such as hypothesis generation, reflect how inquiry-based approaches are more similar to the actual practice of research than traditional laboratory courses (Shaffer et al., 2014). Assessment of scientific skills after iterative rounds of inquiry revealed substantial improvements in five out of seven categories in the third round of inquiry compared to the second. For example, the average student group designed and presented an experiment with "little quantification" in the first round of inquiry but designed and presented an experiment with "quantification is present and mostly clear" in the second round. Appreciation for the power of quantitative experiments to clearly support or refute a hypothesis is a scientific skill that is difficult to teach in a traditional laboratory but was clearly demonstrated with iterative inquiry.
The dramatic improvements in scientific skills raise the question of what specific features of the course are important. One possibility is that these gains reflect increased time on task and that they arise merely from students spending more time doing inquiry-based projects. We believe that this is unlikely to be responsible for all of the observed benefits for two reasons. First, almost all of the students in our curriculum have previously experienced at least one inquiry-based experiment during their sophomore year. Second, we are analyzing the difference between the second and third rounds of inquiry-based experiments, so the difference in time on task between the two rounds is not dramatic.
Another possible explanation for the gains in scientific skills is that they arise from the formative assessment after the second round. It is well-established that feedback is an important component of education and our students received feedback from peers and the instructor after the first presentations (Black & Wiliam, 1998). Additionally, the students may benefit from listening to the other student presentations and providing peer feedback to the other students. One way that assessment could help students is through clarification of the instructor's expectations. We tested this by providing one set of students with the scoring rubric at the beginning of the class. Surprisingly, we found that the gains in student performance were not decreased for this cohort. Instead, there were statistically significant gains in all seven of the categories of scientific skills. These results indicate that some other aspect of conducting an inquiry project, doing a formative assessment, and then allowing the students an opportunity to immediately apply their knowledge to a new inquiry project results in significant gains for scientific skills. It is important to note that it is difficult to unravel whether the gains demonstrated in the third round of experiments were actually obtained during the third round, or rather arise from lessons learned during the second round. Therefore, our approach clearly demonstrates that two rounds of inquiry are superior to one round, but only suggests that three rounds may be better than two.
Iterative inquiry may provide additional benefits not assessed in this manuscript. For example, inquiry-based laboratories require the graduate teaching assistant (GTA) to develop greater depth and breadth of knowledge in the subject area in order to effectively supervise the undergraduate projects. Additionally, these courses require increased interaction between the GTA and the students. Not surprisingly, GTAs realize gains in their confidence as instructors after having taught inquiry-based laboratories (French & Russell, 2002). Unexpectedly, GTAs also reported benefits to their own research programs including gains in experimental design and writing skills after teaching an inquiry-based laboratory (French & Russell, 2002).
The educational implication of our research is that implementation of iterative inquiry in undergraduate laboratory classes offers many compelling advantages for students. These initial results with a single undergraduate class establish a precedent that should be generalizable and can be immediately adapted to other courses. As inquiry-based approaches are not limited to laboratory classes, these findings are also generalizable to other fields that use inquiry-based active learning approaches. Iterative rounds of inquiry can even be applied in non-STEM courses such as those in the humanities. For example, instead of focusing on a single inquiry-based project at the end of a humanities course, our findings suggest that a more effective teaching strategy would incorporate iterative rounds of inquiry-based projects with formative assessment. Furthermore, our findings will help to motivate larger, more comprehensive studies on iterative inquiry. In the future, it will be important to measure how robust the effects of iterative inquiry are across different courses, with students at different levels, and with different instructors. Additionally, it will be important to determine the optimum number of rounds of inquiry for a single course or perhaps an entire curriculum and test different balances of traditional and inquiry-based approaches. The powerful benefits of this initial approach open the door to these and other exciting future investigations.