Defining Science Literacy in General Education Courses for Undergraduate Non-Science Majors

This article describes a project focused on identifying science instructors' conceptions of science literacy and using these conceptions to develop a brief science literacy student self-assessment (SCILIT). We present the rationale and process we used to elicit instructors' conceptions of science literacy, drawing on input from faculty and graduate student science experts. Next, we explain how we developed a novel student SCILIT self-assessment based on those expert conceptions. We describe our initial efforts using SCILIT in undergraduate general education science courses to explore students' self-perceived science literacy. We discuss the use of the SCILIT self-assessment to assess potential progression of students' self-rated science literacy over the course of an academic term, and how this student self-assessment relates to instructor ratings of academic proficiency and science literacy. Finally, we reflect on the use of the SCILIT self-assessment to guide instruction and assessment in general education science courses for non-science majors.


Introduction
Although there is an extensive literature about science literacy and consensus on the importance of building a scientifically literate populace, it is difficult to identify a single, adequate definition and an instrument to measure undergraduate student science literacy across disciplines (National Academies of Sciences, Engineering, and Medicine, 2016). The ability of higher education instructors to improve the impact of teaching practices to support student gains in science literacy and reduce students' scientific misconceptions rests on instructors' ability to monitor and clarify students' science literacy (Singer, Nielsen, & Schweingruber, 2012). However, even with the extensive literature about science literacy, faculty within or across institutions, and especially faculty in a variety of scientific disciplines, may lack consensus on a single definition, which could result in the term "science literacy" losing meaning. To develop long-term science literacy, it has been suggested that students learn how to solve "ill-defined" problems that simulate scientific discovery as well as "well-defined" problems with clear solutions (Singer et al., 2012, p. 76). Often, multiple elements and components of science literacy are combined into a single definition.
Given the wide range of available science literacy definitions and contexts, we found it useful to recognize that our science instructors, who are experts in their respective fields, already hold implicit conceptions of science literacy that they apply when teaching science courses for non-science majors. Indeed, implicit conceptions "already exist, in some form, in people's heads" (Sternberg, 1985, p. 608). An important advantage of eliciting implicit conceptions from expert instructors is that the validity of the elicited conceptions is inherent in the domain knowledge of the experts themselves. Indeed, as Baer and McKool (2009) have argued, when one has access to experts, one need not use some externally developed "test, rubric, or some other device to approximate the judgments of experts" (p. 5). Rather, one can go directly to the "most valid yardstick": experts themselves. We therefore sought to elicit a definition of science literacy, using input from instructors in different disciplines within our program, that would operate well across these diverse science disciplines. We further reasoned that, given the varying (and conflicting) definitions of science literacy across the literature, faculty in our program would be more likely to "buy in" to a definition of science literacy that was built upon their implicit conceptions and teaching expertise and developed specifically for this local context. Based on our elicited definition, we developed a student science literacy self-assessment that probed the characteristics and behaviors associated with science literacy rather than any specific content knowledge.

Students' perception of their own science literacy represents a specific type of self-belief. Self-beliefs play an important role in developing one's competence, be it in science or some other domain. Indeed, ability alone is insufficient when it comes to deepening one's understanding and persistence in learning (Bandura, 1997; Beghetto & Baxter, 2012). As Buehl and Alexander (2005) have succinctly stated, "students' beliefs matter" (p. 723). They matter because when students hold positive beliefs about their ability to learn science, they are more likely to value science, engage in classroom learning, and persist in subsequent science learning (Duschl, Schweingruber, & Shouse, 2007).

Context and Goals of Our Project
Our project was carried out in the context of a multidisciplinary general education science program, launched in 2010 with extramural support, at a public research university in the U.S. Pacific Northwest. A program goal is to increase students' science literacy using evidence-based teaching practices, known to improve student learning, by incorporating these practices into the development and teaching of general education courses for non-science majors in astronomy, biology, chemistry, geology, and physics (Freeman et al., 2014). All students at our university are required to take four general education science courses as part of a liberal arts distribution requirement. The science courses in our program were designed to place complex, discipline-specific scientific concepts within a framework relevant to non-science majors. Courses in our study were taught by teams of undergraduate science majors, science graduate students, and science department affiliated faculty, all of whom had at least initial exposure and some significant training in evidence-based, student-centered teaching methodologies (e.g., Handelsman, Miller & Pfund, 2007).
Affiliated faculty had a shared goal of increasing students' science literacy by supporting students' comfort and conversance with scientific thinking. However, consistent with previous studies, we discovered that the language affiliated faculty from different disciplines used to explain their science was not automatically transferrable between disciplines (Marder, 2013). Given that affiliated faculty had a variety of discipline-specific ideas about the elements contributing to science literacy, a key aim of our project was to establish an agreed-upon conception of science literacy that could be used in program courses across disciplines to guide instruction and assess student learning. To this end, we employed an exploratory design with convenience samples: courses taught by instructors affiliated with the program and the students enrolled in those courses.
More specifically, we had three interrelated goals. First, we endeavored to define our science instructors' conceptions of science literacy by having affiliated faculty and graduate student instructors describe behaviors they consider evidence of students' science literacy. Next, we aimed to identify agreement among affiliated faculty and graduate student instructors about the most important of these behaviors, so we could develop a brief science literacy student self-assessment (SCILIT). Finally, we used SCILIT to explore students' self-rated science literacy in program-affiliated undergraduate science courses designed for non-science majors, both to determine whether we could measure student self-assessed gains in science literacy over a term and to meaningfully compare students' self-assessments to faculty assessments of such gains across a diversity of science courses.

Goal 1: Define Science Literacy Across Scientific Disciplines
To attain our first goal, we elicited behavioral characteristics of science literacy from a group of 13 affiliated faculty and senior graduate students, our identified experts. Members of this initial group represented the biology, chemistry, geology, and physics departments, and each had prior engagement with relevant science education literature through a weekly science education journal club and workshops (e.g., Handelsman et al., 2007; Brewer & Smith, 2011). All of the graduate students and faculty were co-teaching or co-developing program-affiliated courses with an expressed goal of improving student science literacy. Using a classical approach similar to that of previous researchers (Lim, Plucker, & Im, 2002; MacKinnon, 1964; Sternberg, 1985), we used an online survey asking respondents to "list as many behaviors characteristic of Science Literacy" as they wished to share. The project team compiled respondents' raw descriptions, eliminated or combined duplicate descriptions, corrected misspellings, and clarified the phrasing of a few descriptions.
The resulting list of initial behavioral characteristics included 48 candidate descriptors of science literacy (Table 1). Characteristics were quite varied, ranging from "understands science as presented in popular media (e.g., at level of New York Times)" to "confident in ability to challenge the ideas of an expert source." To focus and narrow the list displayed in Table 1, we asked affiliated faculty and graduate students to rate the science literacy behaviors they believed were most important in building students' science literacy. This approach represents a modified form of the "Delphi method" (Dalkey & Helmer, 1963; Osborne, Simon, & Collins, 2003), which reduces the total number of items by having a group of experts evaluate the individually generated items and establish consensus on the most important ones.
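The arithmetic behind the consensus step described next is simple: average each behavior's importance ratings across experts and keep the behaviors whose means fall in the top 50th percentile. The following is a minimal sketch in Python; the simulated ratings matrix and variable names are our illustration, not the original data, and the original analysis used SPSS's percentile rank function rather than this code.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated stand-in: 29 experts rate 48 candidate behaviors for
# importance on a 1-5 scale (not the original ratings).
ratings = rng.integers(1, 6, size=(29, 48))

# Mean importance rating per behavior, then a percentile-rank cut:
# keep behaviors whose mean rating falls in the top 50th percentile.
means = ratings.mean(axis=0)
cutoff = np.percentile(means, 50)
keep = np.flatnonzero(means >= cutoff)
print(f"Retained {keep.size} of 48 behaviors (cutoff mean = {cutoff:.2f})")
```

Depending on ties at the cutoff, a cut like this keeps roughly half of the items; in the original study it produced the 25 characteristics described below.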
Accordingly, we had a larger, though partially overlapping, group of 29 science-affiliated faculty and graduate students use an online survey to rate the importance of each of the 48 behavioral characteristics. This group included participants who were affiliated with the program (e.g., teaching affiliated courses, participating in the science education journal club) and represented the biology, chemistry, geology, and physics disciplines. We asked participants to rate how important they thought it was for a scientifically literate, non-science major undergraduate to be able to demonstrate each characteristic, using a 5-point rating scale (1 = not at all important; 5 = very important). We classified respondents' mean ratings of the characteristics into high and low importance groups using the percentile rank function in the Statistical Package for the Social Sciences (SPSS, v. 21). This allowed us to develop a shorter consensus list by reducing the initial list to the 25 characteristics with an average rating in the top 50th percentile (Table 2). Following general recommendations in the measurement literature (e.g., Harpe, 2016), we treated these rating-scale data as continuous in our analysis, though we recognize that some readers may view this as a limitation.

Goal 2: Develop a Science Literacy Student Self-Assessment (SCILIT)

Having identified the 25 characteristics judged most important for science literacy by our larger set of experts, we then modified the descriptions into first person so they could be used as a science literacy student self-assessment (SCILIT) (e.g., "I can tolerate uncertainty" or "I can identify assumptions"), rated on a five-point agreement scale (1 = strongly disagree; 5 = strongly agree). Given that we wanted a brief scale, we endeavored to further reduce the 25 items into a shorter student self-assessment that would be easier for instructors to use with students.
To this end, we asked our affiliated faculty to have students enrolled in their general education science courses rate themselves on the 25 items. We collected data from 258 students across all four program-affiliated courses in winter 2013 (two interdisciplinary courses, chemistry/physics and biology/chemistry, and two single-subject courses, biology and astronomy). These courses were part of the regular general education science offerings of the academic departments and were affiliated with the program through the training and mentoring offered to faculty, graduate student, and undergraduate teaching assistants. Almost two-thirds of respondents (n = 164; 63.6%) reported their gender as female, and slightly more than two-thirds (n = 173; 67.1%) reported their race/ethnicity as white or Caucasian. Respondents represented a broad range of majors and areas of interest (including education, business, art history, science, and Japanese language, as well as students undeclared in a major). The most frequent academic major or primary area of interest reported was psychology (n = 87; 33.7%). The student self-assessment data were analyzed using principal component analysis (PCA) to explore whether we could identify a smaller number of items that were still representative of, and highly correlated with, the full set of 25 items. PCA is a technique used to reduce multiple variables into a smaller number of components. It is a widely used and adaptable exploratory method because it is not constrained by the typical assumptions of inferential statistical techniques and can therefore be used for the descriptive analysis of many types of quantitative data across a wide variety of situations (see Jolliffe & Cadima, 2016). Given the flexibility of this approach, we determined that it was an appropriate tool for exploring the dimensionality of students' responses on the 25 items.
Prior to conducting PCA, we carried out "parallel analysis" (O'Connor, 2000) to determine the number of components to retain from the total data set. This analysis indicated extraction of a single component from the student response data on the 25 science literacy items. To identify a core set of characteristics most representative of the overall science literacy component, we examined the extracted communalities (which ranged from 0.11 to 0.53). We dropped items with communalities below 0.40, because the variance of those items was not well accounted for by the science literacy component (Costello & Osborne, 2005). This resulted in a total of eight retained SCILIT characteristics (Table 3). The eight items represented a unidimensional component accounting for 48.9% of the variance, demonstrated a high level of internal consistency (α = 0.85), and used the same five-point agreement scale (1 = strongly disagree; 5 = strongly agree); the eight-item score was strongly correlated with the 25-item version (r = 0.93, p < 0.001). These analyses suggest that the eight items adequately represent the content of the 25-item list of SCILIT characteristics and can serve as a more parsimonious science literacy self-assessment scale, recognized by both the instructors and the students in our program as representing science literacy.
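For readers who wish to reproduce this style of item reduction on their own survey data, the following is a minimal sketch in Python. The simulated ratings matrix, the variable names, and the use of scikit-learn are our illustration, not the authors' original code or data; we also take the one-component solution as given rather than re-implementing parallel analysis.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Simulated stand-in for the real data: 258 students x 25 items rated 1-5,
# generated from a shared latent "science literacy" factor plus item noise
# of varying strength (so some items will fail the communality cut).
latent = rng.normal(size=(258, 1))
noise_sd = rng.uniform(0.5, 2.0, size=25)
ratings = np.clip(
    np.round(3 + latent + rng.normal(size=(258, 25)) * noise_sd), 1, 5
)

# PCA on standardized items (i.e., on the correlation matrix).
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
pca = PCA(n_components=1).fit(z)

# For a one-component solution, each item's communality is its squared
# loading: loading_j = eigenvector_j * sqrt(component variance).
loadings = pca.components_[0] * np.sqrt(pca.explained_variance_[0])
communalities = loadings ** 2

# Retain items whose variance is adequately captured by the component.
keep = communalities >= 0.40
short_form = ratings[:, keep]
print(f"Retained {keep.sum()} of 25 items")

# Cronbach's alpha for the retained items (internal consistency).
k = short_form.shape[1]
item_vars = short_form.var(axis=0, ddof=1)
total_var = short_form.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1 - item_vars.sum() / total_var)

# Correlation between short-form and full-scale mean scores.
r = np.corrcoef(short_form.mean(axis=1), ratings.mean(axis=1))[0, 1]
print(f"alpha = {alpha:.2f}, short vs. full r = {r:.2f}")
```

Because the data are simulated, the printed values will not match the reported statistics (α = 0.85, r = 0.93); the sketch only shows the shape of the procedure.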

[Table 3. The eight retained SCILIT items and their component loadings.]

Goal 3: Measure Student Self-Assessment Gains in Science Literacy in General Education Courses
To explore the implementation of the SCILIT self-assessment, our third goal had two parts: 1) to determine whether we could measure gains in student self-assessed science literacy during an academic quarter using the eight-item SCILIT instrument, and 2) to compare students' self-assessed science literacy to instructors' evaluations of students' academic proficiency and science literacy in a set of courses across disciplines, including interdisciplinary courses. For part 1, to explore whether students gained in self-reported science literacy through the term, 162 students enrolled in three science courses in winter 2013 taught by graduate student and faculty teaching teams (a convenience sample of students in program-affiliated courses consisting of one section each of two interdisciplinary courses, chemistry/physics and biology/chemistry, and one single-subject course, biology) completed pre- and post-SCILIT self-assessments. We computed a pre-course (beginning of term) SCILIT score and a post-course (end of term) SCILIT score by averaging the responses on the eight science literacy self-assessment items. In this initial exploration, we found a statistically significant but small increase in students' self-reported science literacy from the start of the term (M = 3.68; SD = 0.56) to the end of the term (M = 3.83; SD = 0.55; paired-sample t(161) = 3.44, p = 0.001, d = 0.27); a sketch of this computation appears at the end of this subsection.

For part 2, to explore whether students' end-of-term self-assessments related to their instructors' ratings of demonstrated academic proficiency and science literacy, we had students complete an end-of-term summative assessment (e.g., final exam) designed and scored by their instructors based on a previous instructor rating protocol (Beghetto & Baxter, 2012). Without a separate validated trans-disciplinary instrument for comparison, we chose to examine how typical instructor-designed summative assessments related to student self-reported science literacy, as a comparison to the ways students would normally be evaluated and graded in program-affiliated courses. The summative assessments were written by the faculty and varied between courses so as to align directly with specific course learning outcomes. Additionally, faculty focused on designing an assessment to measure at least one of the eight SCILIT self-assessment statements that best fit the course content. For the end-of-term evaluation, faculty were asked to "Create ONE question that is part of an in-class activity, homework assignment, quiz, or exam that includes reference to one of the eight science literacy behaviors within the question. This may be a written response to a question, a concept map, or other activity. Have your students answer this question." Faculty each developed a question that was best aligned to the learning environment they had created for students and designed to measure students' learning of science literacy as taught in that particular course.
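As referenced above, the part 1 pre/post comparison is a standard paired-samples t-test with Cohen's d as the effect size. A minimal sketch under simulated data follows; the arrays and variable names are ours, and since the authors do not specify their effect-size convention, we use one common choice (mean difference divided by the standard deviation of the differences).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated stand-in for 162 students' mean SCILIT scores at the start
# and end of term (not the original data).
pre = np.clip(rng.normal(3.68, 0.56, size=162), 1, 5)
post = np.clip(pre + rng.normal(0.15, 0.55, size=162), 1, 5)

# Paired-samples t-test on the pre/post difference.
t, p = stats.ttest_rel(post, pre)

# Cohen's d for paired data: mean difference over the SD of differences
# (one common convention among several).
diff = post - pre
d = diff.mean() / diff.std(ddof=1)
print(f"t({len(diff) - 1}) = {t:.2f}, p = {p:.3f}, d = {d:.2f}")
```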
In spring 2014, for the various course summative assessment questions, students: 1) read an article from a popular science magazine, 2) read a real-life case study, 3) read excerpts from newspapers, 4) interpreted data from graphs, 5) imagined a scenario in which they were asked to explain science to a novice or interpret a scientific representation on TV, or 6) watched a science documentary. Students then answered questions developed by faculty, such as: 1) writing descriptions of the reading, 2) interpreting data and graphs, and/or 3) drawing numeric and graphical representations. To align with course learning objectives, some of the summative assessments were presented in a typical final exam format, others were completed during an in-class activity, and some were completed as part of a take-home final exam.
Twelve faculty and graduate students (one astronomy, four biology, five geology, and two physics) rated answers to the questions for both academic proficiency and science literacy. Prior to implementation, faculty received training from the authors on using the five-point scale rubric to score academic and science literacy proficiency and practiced scoring with mock-up assessments (Beghetto & Baxter, 2012). Faculty then provided training to their graduate student teams. Raters scored assignment answers from their own course and discipline (e.g., biology instructors scored biology assignments). If more than one person evaluated the summative assessment responses, rating teams worked together scoring a small sample to establish consistency and then independently scored a portion of the students' work.
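The authors do not report an agreement statistic for this calibration step, but where rating teams double-score a shared sample, one standard way to quantify consistency is Cohen's kappa, weighted to respect the ordinal 1-5 rubric. A minimal sketch with hypothetical ratings (the scores below are invented for illustration):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical rubric scores (1-5) from two raters on the same ten
# student responses, e.g., a shared calibration sample.
rater_a = [3, 4, 2, 5, 3, 4, 1, 3, 5, 2]
rater_b = [3, 4, 3, 5, 2, 4, 1, 3, 4, 2]

# Quadratic weighting penalizes large disagreements more than
# near-misses, which suits an ordinal 1-5 rubric.
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"weighted kappa = {kappa:.2f}")
```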
We examined the relationship between instructor ratings of summative assessment answers and students' responses on the SCILIT self-assessment, both of which were completed by 576 students. We included only students who had completed the SCILIT self-assessment (part 2), had an instructor rating on academic proficiency and science literacy, and were enrolled in a course with at least 30 students. The resulting dataset included students drawn from seven courses representing four subjects: astronomy (n = 101, 17.5%), biology (n = 166, 28.8%), geology (n = 252, 43.8%), and physics (n = 57, 9.9%). The majority of students reported their gender as female (n = 324, 55.6%) and their ethnicity as Caucasian or white (n = 415, 71.2%).
To evaluate students' self-rated science literacy, we had students respond to a survey that included items measuring these behaviors. We then calculated an average SCILIT score for each student based on their responses to the eight items presented in Table 3 (M = 3.82, SD = 0.54, range 2 to 5). Instructors then used a five-point scale (1 = lowest, 5 = highest) to rate the academic proficiency of each student's response to the course assessment (M = 3.47, SD = 1.17, n = 576); this rating was tied to the course grading rubrics for summative assessments directly related to course scientific content and learning objectives.
Instructors also used a five-point scale (1 = lowest, 5 = highest) to rate students' science literacy proficiency on the SCILIT behavior(s) included in the summative assessment (M = 3.34, SD = 1.18, n = 576). For example, after reading an article and answering questions, a student might correctly answer the questions about the science content (e.g., a scale point of 5) but give a partially incorrect answer to the questions about assumptions or the process of science in the article (e.g., a scale point of 3).
We conducted Pearson's correlations to explore the relationship between student self-reported science literacy and instructor-assessed science literacy and academic proficiency. There was a statistically significant but very weak positive correlation between student self-reported science literacy and instructor-assessed science literacy (r = .119, n = 576, p = .004). There was also a statistically significant but very weak positive correlation between student self-reported science literacy and instructor-assessed academic proficiency (r = .164, n = 576, p < .001).
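For completeness, the following is a minimal sketch of this final analysis in Python, assuming a tidy table with one row per student; the column names and simulated values are our illustration, not the original dataset. It applies the inclusion criteria described above (complete cases, courses with at least 30 students) before computing the two Pearson correlations.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(2)

# Simulated stand-in: one row per student with their course, mean SCILIT
# self-rating, and the two instructor ratings (1-5 rubric scores).
n = 700
df = pd.DataFrame({
    "course": rng.choice(
        ["astro", "bio1", "bio2", "geo1", "geo2", "phys", "tiny"],
        size=n, p=[0.17, 0.14, 0.15, 0.22, 0.22, 0.08, 0.02],
    ),
    "scilit_self": np.clip(rng.normal(3.82, 0.54, n), 1, 5),
    "instr_sci_lit": rng.integers(1, 6, n).astype(float),
    "instr_acad_prof": rng.integers(1, 6, n).astype(float),
})
df.loc[rng.random(n) < 0.05, "instr_sci_lit"] = np.nan  # some missing ratings

# Inclusion criteria: complete cases, enrolled in a course with >= 30 students.
df = df.dropna()
df = df[df.groupby("course")["course"].transform("size") >= 30]

# Pearson correlations between self-rated and instructor-rated measures.
r1, p1 = stats.pearsonr(df["scilit_self"], df["instr_sci_lit"])
r2, p2 = stats.pearsonr(df["scilit_self"], df["instr_acad_prof"])
print(f"self vs. instructor science literacy: r = {r1:.3f}, p = {p1:.3f}")
print(f"self vs. instructor academic proficiency: r = {r2:.3f}, p = {p2:.3f}")
```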

Concluding Thoughts
We elicited a science literacy definition from instructors affiliated with our program across science disciplines, and developed and implemented a student self-assessment (SCILIT) to learn how students assessed their science literacy.
We do not claim that our SCILIT definition is unique, the "best" or "correct" one, or applicable to all other contexts. It grew from the initial input of our cross-disciplinary group of instructors, who wanted to better understand the science literacy development of their students. While this group was well-rounded and experienced, similar efforts using a different group may elicit a different SCILIT definition. Our focus here is on the methodology we developed and the initial, exploratory results it produced.
Our focus on students' science literacy self-beliefs is particularly important in general education courses designed to empower students in scientific thinking, especially when these courses may be the last formal science instruction students receive. We compared students' self-assessed science literacy before and after a general education science course, as well as relative to how they were rated by experts in the field. Our initial findings suggest that our eight-item SCILIT self-assessment scale has the potential to complement other forms of assessment and help monitor students' perceptions of science literacy in our and other contexts.
Although self-assessments can serve as an important source of data, particularly when attempting to understand students' learning experiences (Ames, 1992; Stipek, 1981), we recognize they are prone to various forms of misperception, inaccuracy, overestimation, and bias (Dunning, Heath, & Suls, 2004). Gaps often exist between espoused beliefs (what students believe they know and can do) and actual knowledge and behaviors (Argyris & Schön, 1974). The weak correlations observed between instructor-evaluated metrics and student-assessed science literacy may also reflect how well assignments aligned with science literacy behaviors. Future metrics could tease apart the elements of the course that lead to alignment, overestimation, or underestimation of ability.
Science anxiety is high, even in science courses designed for non-science majors (Udo et al., 2004). This is a factor that our affiliated faculty considered when designing courses and assessments. Science anxiety, and instructor efforts to mitigate it, may have led to underestimation or overestimation of self-assessed science literacy. Future studies could examine which design or structural elements of each course increase students' self-assessed science literacy, and the ways in which faculty can continue to support the development of student science literacy.
Even with these limitations, our project provided insights into how students perceive characteristics of science literacy in themselves, how those perceptions might change over a short time (across a term), and how those perceptions may differ from the way instructors view students' demonstrations of science literacy on course assignments and assessments. Insights from evaluating students' views of their science literacy and relating them to academic proficiency with the course content could be helpful to instructors by revealing any divide between what instructors think they are teaching and what students are learning. Such insights could also prompt instructors to re-think their conceptions of science literacy, both generally and in the context of their particular courses and fields of study, and to develop new strategies to help students increase their science literacy. For example, instructors may need to provide better scaffolding for students to align their abilities with course expectations, create opportunities for students to develop more accurate self-judgments through the inclusion of self-reflection and metacognition, or more transparently align course learning objectives and assessments.
Although the focus of our work was to clarify an expert conception of science literacy and develop a student SCILIT self-assessment for use in courses at our university, our initial results have potential applications for continued classroom teaching and learning. In our courses we emphasize creating student-centered classrooms using evidence-based techniques. Teaching teams provide opportunities for students to uncover the relevance of science to their everyday lives through topics such as gene therapy and information technology. Indeed, continuing to provide these types of learning environments is in line with existing research on ways to improve learning outcomes and student success for all students while continuing to focus on the development of students' science literacy (Allen & Tanner, 2005; Crouch & Mazur, 2001; Freeman et al., 2014; Singer et al., 2012). We have continued to use the student SCILIT self-assessment as a framework for developing and revising student learning outcomes and assessments across courses in our program, as this framework for science literacy resonates with both our instructors and students.
Our eight-item student SCILIT self-assessment is well aligned with two Vision and Change core competencies: the ability to apply the process of science and the ability to understand the relationship between science and society (Brewer & Smith, 2011). The six Vision and Change competencies were specifically targeted at developing students' biological literacy; however, there are elements in each competency that are directly transferrable to other disciplines, and applicability to other contexts could be part of future work.
Our initial efforts have raised several important questions for us and for science instructors interested in this topic to explore and revisit, including:
• How might instructors continue to close the gap between their conceptions of science literacy and their students' conceptions?
• How does student self-reported science literacy compare to scores on a concept inventory of science literacy reasoning (Nuhfer et al., 2016)?
• How does student self-reported science literacy change over the course of several general education science courses?
• In what ways does the application of science literacy vary by course content, subject area, or student population?
• How might instructors best target, teach, and assess science literacy in their particular courses, subject areas, and student populations?
• How might we and other instructors refine and improve upon our efforts to clarify and monitor students' science literacy?
In sum, now that we have developed an initial conception of eight science literacy behaviors with the student SCILIT self-assessment and explored the initial relationship between student self-assessed and instructor-assessed science literacy, future research should focus on further testing and refining our (and other) self-report science literacy measures. Our initial results indicate that our SCILIT self-assessment can be used across disciplines at our university. Further expansion of the SCILIT self-assessment to other academic disciplines or contexts may refine the self-assessment and help science instructors and researchers identify, monitor, and support the development of students' science literacy. In this way, developing students' science literacy can be incorporated into the broader goal of promoting science learning. We hope that by sharing our efforts aimed at clarifying what science literacy means amongst our interdisciplinary group of science instructors, others may learn from our efforts and embark on similar projects. We therefore invite science instructors to use, modify, and refine our initial methods for developing a SCILIT definition and our approaches to using such a definition through student SCILIT self-assessment.