Using Indirect vs. Direct Measures in the Summative Assessment of Student Learning in Higher Education

Contradictory results have been reported regarding the accuracy of various methods used to assess student learning in higher education. The current study examined student learning outcomes across a multi-section and multi-instructor psychology research course with both indirect and direct assessments in a sample of 67 undergraduate students. The indirect method measured students' perceived knowledge and abilities on course topics, while the direct method measured actual knowledge by having students answer test questions or solve problems reflecting course content. Both measures independently demonstrated increases from pretest to posttest; however, only the direct measure correlated with final course grades. Results also showed that respondents scoring lower on the direct measure were overconfident (as measured by indirect score) in their perceived knowledge and ability, a pattern consistent with the Dunning-Kruger Effect. Based on our findings, we concluded that the indirect method was not an accurate measure of student learning, but may have benefits as an instructional tool.

Collecting data indirectly may involve focus groups, student interviews, instructor and course ratings, or knowledge surveys (Price and Randall, 2008). Much research has been conducted on the use of knowledge surveys, which require students to indicate their perceived confidence or ability to answer questions but do not require actual answers or problem solving (Nuhfer and Knipp, 2003). Because students do not have to actually solve problems, hundreds of items can be used to cover a wide array of course topics, an option not feasible with direct measures (Nuhfer and Knipp, 2003; Price and Randall, 2008).
Benefits from these types of knowledge surveys can be seen by both the faculty and the student (Nuhfer and Knipp, 2003). Details of course content reviewed prior to the semester aid in course preparation and expectations (Nuhfer and Knipp, 2003). When used in a pretest/posttest methodology, knowledge surveys can provide instructors with the ability to analyze course content and learning objectives, allowing them to modify instruction and ultimately improve student learning (Wirth and Perkins, 2005).
For students, the pretest survey can be used as a preview of the semester's course content (Nuhfer and Knipp, 2003) and further used as a study guide throughout the semester (Bell and Volckmann, 2011; Bowers, Brandon, and Hill, 2005; Clauss and Geedey, 2010; Nuhfer and Knipp, 2003; Wirth and Perkins, 2005). Indirect measures can also afford students the opportunity and ability to recognize and further develop their self-assessment skills (Bell and Volckmann, 2011; Clauss and Geedey, 2010; Wirth and Perkins, 2005).
There are, however, some drawbacks to the use of indirect measures such as knowledge surveys. Traditionally presented as Likert scale items, knowledge surveys may be susceptible to response sets such as social desirability, impression management, and "yea-saying," especially if students are concerned that faculty will have access to their responses. One could postulate that the issue of motivation noted earlier for the direct measures likely applies to indirect measures as well, even though these measures are less onerous for the respondent to complete. Additionally, much of the literature on indirect measures of assessment, such as knowledge surveys, is focused on their use by faculty as formative curriculum tools (Bell and Volckmann, 2011; Bowers et al., 2005; Nuhfer and Knipp, 2003; Wirth and Perkins, 2005). Thus, despite the benefits noted above, there is disagreement as to whether these indirect self-reported measures are valid indicators of student learning.
While Nuhfer and Knipp (2003) concluded that knowledge surveys, the indirect measure used in their research, provide a "detailed assessment of content, learning and levels of thinking" (p. 13), it is critical to keep in mind the context of their research. Nuhfer and Knipp (2003) were not comparing indirect measures to more direct assessments; their focus was on identifying a better measure of course success than student evaluations. They concluded that knowledge surveys "improve planning and preparation, that they validate student learning much better than summative student ratings" (p. 13). Most of their analysis evaluated knowledge surveys as a tool for curriculum design to identify course overlap and topics in need of additional instruction.
Other researchers demonstrated improvement in indirect measures but without extending their investigation to consider the relationship between indirect measures and other criteria of learning. Clauss and Geedey (2010) found increases in knowledge survey scores over the course of the semester. The emphasis of their research was on metacognition, and their analyses focused on the accuracy of the self-assessment measures rather than the relationship of self-assessment to other measures of student learning.

Luce and Kirnan
Journal of the Scholarship of Teaching and Learning, Vol. 16, No. 4, August 2016. josotl.indiana.edu
Beyond an increase in scores over the course of a semester, what evidence exists to demonstrate that indirect measures accurately gauge student learning? A key methodological component to gathering such evidence is in the definition of student learning. As can be seen in the studies that follow, researchers define student learning in a variety of ways: 1) having students answer the indirect items, thus creating a direct measure of knowledge; 2) course grade; 3) exam grade; or 4) score on a direct measure unaffiliated with the course (locally developed or commercially available).
Wirth and Perkins (2005) administered a knowledge survey at several points during the semester, noting increases in student confidence with course topics as well as a positive relationship between knowledge surveys and student performance. Knowledge surveys given before midterm exams correlated positively with midterm exam grade, r = .70, and a post-course knowledge survey was similarly associated with final course grade, r = .72. Based on these findings, they suggested that indirect measures can accurately represent student knowledge. These correlations, however, may have been influenced by the fact that their knowledge survey was used in a formative manner; the survey was administered multiple times during the semester (prior to the course, prior to each exam, and after the course), used as a study guide by students, and had several items in common with the exams.
Bell and Volckmann (2011) demonstrated increases from pre to post knowledge surveys and found a positive correlation between knowledge surveys and final exam performance (r = .58). Their study thus lends support to the conclusion of Wirth and Perkins (2005) that knowledge surveys can accurately assess student learning. However, Bell and Volckmann (2011) reported briefly on a third study that predated the focus of their 2011 article. In this pilot study, the knowledge survey was prepared by someone other than the instructor and results were not shared with the instructor or the students. They reported a substantially lower correlation between post-course knowledge survey scores and final exams, r = .37, compared to r = .58 in the 2011 study, when students had access to the knowledge survey throughout the course and the final exam closely aligned with the knowledge survey items. This raises the question of whether the correlations observed between indirect and direct measures are in fact influenced by the use of the indirect measures as study guides throughout the semester and/or the fact that indirect items are often embedded in the direct measures.
Research by Price and Randall (2008) supports this conjecture, confirming increased levels in both direct and indirect measures of assessment across the semester. However, they failed to find a significant relationship between the two measures. Price and Randall (2008) demonstrated that students did not have the ability to differentiate between their perceived knowledge and their actual knowledge, and results supported the use of direct measures to assess student learning. The direct measure of knowledge used by Price and Randall (2008) was student responses to the indirect items taken at a later time in the semester; however, it is important to note that the indirect measures were used in more of a summative manner, with neither students nor faculty having access to results or answers until the end of the semester.
Bowers et al. (2005) similarly found an increase in knowledge survey scores over the course of a semester, but concluded that indirect measures are not accurate estimates of student knowledge. The magnitude of the correlation of indirect score with final grade was quite variable over the five sections of the Biology course under study (ranging from r = .21 to r = .46). Bowers et al. (2005) additionally analyzed item pairs between the indirect measure and the final exam, and found little agreement. Thus, the literature provides discrepant findings when comparing indirect and direct measures.
Although prior studies of indirect knowledge (Bell and Volckmann, 2011; Bowers et al., 2005; Clauss and Geedey, 2010; Nuhfer and Knipp, 2003; Price and Randall, 2008; Wirth and Perkins, 2005) all provided evidence of increases from pretest to posttest, demonstrating increased familiarity with course topics, results were mixed when indirect measures were compared to direct measures. Findings are further confounded by variations in the use of indirect measures as formative or summative tools. In several studies indirect measures were used as study guides throughout the semester, and often the same indirect items appeared on later direct measures of student knowledge, creating a situation that would inflate the relationship of these measures. Additionally, the motivational level of respondents differed across studies, with some test scores contributing to course standing/grades (Price and Randall, 2008), others providing minimal credit for participation (Bowers et al., 2005), and still others varying in the confidentiality of survey results among students and faculty. To further complicate the findings, several researchers suggested that the accuracy of self-assessment is not consistent across students and that this inconsistency follows a specified pattern (Bell and Volckmann, 2011).

Dunning-Kruger Effect
Kruger and Dunning (1999) were among the first researchers to systematically study a metacognitive deficiency where individuals are unable to accurately self-assess their performance on various tasks. Through a series of four studies spanning humor, grammar, and logical reasoning tasks, they demonstrated an overestimation of self-ability by the lowest performers and underestimation by the highest performers. These miscalibrations were disproportionately large for the lowest performers, suggesting more than a regression effect as an explanation. Through extensive experimentation, Kruger and Dunning (1999) were able to demonstrate that lower performers could not learn to recalibrate their own ability by observing the superior performance of others, but could achieve increases in accuracy if trained in the specific task being assessed (in this case logical reasoning). Kruger and Dunning's (1999) findings appear at odds with Bell and Volckmann (2011), as one would expect that by the end of a course sufficient training and feedback on content would lead to greater accuracy in self-assessment.
An application of these findings to academic assessment can be found in the work of Bell and Volckmann (2011). These researchers found correlations between knowledge survey scores and final exam performance (r = .58), but the relationship was inconsistent across student ability levels. Students who scored lower on the final exam were overconfident in their estimated ability (most dramatically in the post-measure of indirect knowledge, but also relatively overconfident in the pre-measure). Thus, the very students faculty would hope to assist with the use of an indirect measure, those low in ability, may not benefit due to the inaccuracy of their self-assessment. Hacker, Bol, Horgan, and Rakow (2000) confirmed the inaccuracies in the lowest performers in their study of undergraduates in an educational psychology course. Their course incorporated instruction on the benefits of self-assessment and provided opportunities for students to self-assess before and after each exam. They showed that overall accuracy increased over time with repeated self-assessments, but not for the lowest performers.
H1: Average posttest indirect scores will be higher than average pretest indirect scores.
H2: Average posttest direct scores will be higher than average pretest direct scores.
H3: Average posttest direct scores will correlate positively with final course grades.

Current Study
Several earlier studies used the indirect measures in a more formative manner, allowing students and faculty access to the measure throughout the semester. This provided feedback and allowed for changes in study or course focus, but such a practice may have exaggerated the relationship between indirect and direct measures. Our focus is on the use of indirect measures as a purely summative tool. Thus, while we employed a pretest/posttest methodology for administration, the students' performance was not shared with students or faculty. Nor were faculty aware of the assessment items except to the extent that the items mirrored formal course descriptions and learning objectives.
Other studies employed faculty teaching the course to assist in the development of the measures; in our study, however, both measures were developed by a senior faculty member who was not teaching the course. Thus, a broader view of the learning outcomes was employed, avoiding bias toward an individual faculty member's pedagogical perspective or course focus.
We also differed from earlier research in that our indirect measure was Likert scale responses to course terms or concepts, not to actual direct knowledge items. For example, other researchers assessed indirectly by repeating the direct knowledge item and having respondents self-report their level of confidence or ability to solve the problem/answer the question. We, however, had respondents read course terms or concepts and indicate their familiarity/confidence level with each item. Thus, the indirect responses were more general and not a response to a specific test item. We paired indirect items and direct items to ensure each item was represented on both measures (Figure 1).

Development of Indirect Items and Direct Items
Indirect items: Development was based on course topics identified through a review of course outlines and syllabi, followed with a review by faculty.
Direct items: Knowledge or application questions were created that corresponded to the course topics.

Due to the lack of availability of the test items to students and faculty, the use of "paired" items rather than identical items across the indirect and direct measures, the more general focus of both measures, and the potential differences in student motivation, we offer hypotheses in support of Price and Randall (2008) and Bowers et al. (2005).
H4: Posttest indirect scores will fail to correlate positively with posttest direct scores.
H5: Posttest indirect scores will fail to correlate positively with final course grades.
Prior studies disagree as to both the existence and the cause of the Dunning-Kruger Effect. Given the differences already cited between this study and other research (the use of a measure as a summative tool, the general nature of indirect items, test items, and the results not accessible by faculty), we do not propose a specific result but will simply test whether students vary in their ability to accurately self-assess.
H6: Respondents low in ability (measured as direct post score) will differ in their self-assessment of knowledge and abilities (measured as indirect post score) compared to respondents high in ability.

Participants
Students enrolled in six sections of an undergraduate Psychology research methods course at a mid-size 4-year public college in the northeast United States during the fall 2011 semester were invited to participate in both the pretest assessment and the posttest assessment. The pretest, which included both the indirect assessment and the direct assessment, was administered on the first day of classes to all those enrolled, and most students participated, for an initial response rate of 88%. The same instrument was administered as a posttest at a date chosen by the instructor but within the last two weeks of the semester. Of the 108 participants who took both the pretest and posttest, a total of 38% failed a validity check item (34 participants pretest, 13 participants posttest) and were eliminated. The validity check was an instructed response item embedded in the indirect assessment and was of the format "For validation purposes, please respond 'Strongly Disagree' for this item". Analysis was performed on the remaining 67 participants. Gender and ethnicity were not collected; however, data on year in college and status as a Psychology major were. Respondents were 22% freshmen, 37% sophomores, 28% juniors, and 12% seniors. The majority of participants were psychology majors or double majors (30%) or stated that they planned to apply to the major (an additional 34%). In terms of the psychology minor, 18% were declared minors, with an additional 12% indicating that they planned to apply to the minor. Finally, 6% were not affiliated with psychology as a course of study.

Materials and Procedures
The psychology research methods course is the second prerequisite course in a series of four courses in the psychology major's methodological core, and is a requirement for higher level courses in the major. This course is also a requirement for the psychology minor (The College of New Jersey Psychology Department, 2015). Formal course descriptions and syllabi from the previous academic year were used to identify 55 subtopics that were relevant for the course. An on-line survey was developed where the current and recent faculty teaching the course rated each topic as "essential," "optional," or "unessential" for inclusion in the course. A senior faculty member who was not currently teaching the course developed test items to measure the subtopics rated as "essential" by at least 50% of the faculty respondents. The faculty teaching the course were only involved in reviewing the general topics, and did not see the final test items.
Indirect items were created in the manner of Cross and Angelo (1988) as modified by Eder (2009). These items were developed to measure familiarity with terms/concepts of the research methods course (using the same list of subtopics identified above by current faculty). Unlike other indirect measures of knowledge, these items presented course terms and concepts rather than actual exam questions. A total of 49 concepts such as "convenience sampling," "independent variable," and "ordinal level of measurement" were presented along with a 4-point Likert scale to indicate familiarity with the topic (Figure 1). Relative to other studies, ours differed in that the direct measure was created, administered, and evaluated by an objective third-party faculty member. Additionally, faculty teaching the course did not have access to or knowledge of the instrument itself or their course section's pretest or posttest scores until a report was compiled for the department a full year later. Even at that time, data were only available in the aggregate, not by section or by instructor. Thus, our use of the indirect measure was more summative than formative. We sought to determine if the indirect measure, which is simpler to construct and administer, could serve as a "stand alone" measure of student learning. If a knowledge survey or indirect measure could accurately represent actual knowledge gained, we could use these tools in future assessments. The 42 direct knowledge items were all objective, presented in a multiple choice format with a single correct answer. Thus, higher scores indicated greater knowledge of the course content.
The two measures were combined into a single on-line survey which took about 25 minutes to complete. The survey was administered in a pretest/posttest fashion in the fall 2011 semester. Only those respondents who completed both the pretest and the posttest and passed a validity check were retained, resulting in 67 participants. Cronbach's coefficient alpha was used to determine reliability of the measures, resulting in .75 and .92 for direct and indirect, respectively.
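As an illustration of the reliability statistic named above, the following is a minimal sketch of Cronbach's coefficient alpha computed from a respondents-by-items score matrix. The function name `cronbach_alpha` and the small `demo` data set are hypothetical and are not taken from the study; the study's item-level responses are not reproduced here.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = pvariance([sum(row) for row in rows])
    # Using population variance in both places leaves the ratio unchanged
    # relative to using sample variance in both places.
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents x 3 items on a 4-point scale.
demo = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4]]
alpha = cronbach_alpha(demo)
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why a long, homogeneous familiarity scale can be expected to yield a higher coefficient than a shorter knowledge test.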

Results
Prior to conducting the analyses, a series of statistical checks was run to ensure that the responses we eliminated due to a failed validity check did not distort the findings. Failing a validity check is not an unexpected occurrence in low stakes assessments, especially on a pretest measure, when respondents have little knowledge of the test content. However, to ensure that the sample of respondents who were removed did not differ from the final sample used in the analyses, several statistical tests were conducted. A series of t-tests found no differences between these two groups for any of the measures: pretest indirect, posttest indirect, pretest direct, or posttest direct. The measure of final grade reflected the properties of ordinal data more than interval level data; thus, this variable was analyzed using chi-square and, again, no differences were found in the distribution of grades for these two samples. This was true whether analyzing all levels of final grade or final grade recoded into low, medium, and high groups.
The grade distribution was negatively skewed, as 46% of students received a grade of A. This high percentage was expected and mirrors the recent trend of grade inflation in higher education, reported as 43% of letter grades being A (Rampell, 2011). Again, a check of the eliminated responses against the 67 responses retained for analysis was conducted to ensure the removal of these cases did not skew the data. Analyses of all measures revealed that the removal of those who failed the validity check did not impact the distribution of any of the measures. With or without those cases, the data reveal a strong negative skew evident in course grades and in posttest indirect scores for both groups.
Hypothesis 1 examined the difference between pretest and posttest average indirect scores by using a paired samples t test to calculate the mean difference between scores across time. Referring to Figure 2, posttest indirect scores (M = 163.30, SD = 15.22) were statistically significantly higher than pretest indirect scores (M = 123.73, SD = 17.37), t(66) = 14.56, p < .001, d = 2.42, 95% CI [34.14, 44.99], supporting Hypothesis 1.
Hypothesis 2 examined the difference between pretest and posttest average direct scores by using a paired samples t test to calculate the mean difference between direct measure scores across time. Referring to Figure 3, posttest direct scores (M = 36.69, SD = 5.44) were statistically significantly higher than pretest direct scores (M = 29.50, SD = 5.60), t(66) [...]
Hypothesis 3 examined the relationship between posttest direct scores and final course grades, and was addressed by using Spearman correlation. A statistically significant relationship was found between posttest direct scores and final grades, rs(67) = .47, p < .001. Grades correlated in a positive direction with posttest direct scores; therefore, Hypothesis 3 was supported.
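The Hypothesis 1 summary statistics can be cross-checked with a few lines of arithmetic. The sketch below assumes that d was computed by dividing the mean difference by the root mean square of the pretest and posttest standard deviations; the text does not state which formula was used, so this is an assumption that happens to reproduce the reported value. All variable names are illustrative.

```python
import math

# Summary statistics reported for the indirect measure (Hypothesis 1).
m_pre, sd_pre = 123.73, 17.37
m_post, sd_post = 163.30, 15.22
ci_low, ci_high = 34.14, 44.99  # reported 95% CI of the mean difference
t_reported = 14.56

mean_diff = m_post - m_pre  # about 39.57

# Assumption: d standardizes by the root mean square of the two SDs.
sd_pooled = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
cohens_d = mean_diff / sd_pooled

# The confidence interval should be centered on the mean difference.
ci_mid = (ci_low + ci_high) / 2

# Standard error implied by the reported t statistic.
se_implied = mean_diff / t_reported

# Critical t implied by the CI half-width; a value near 2.0 is expected
# for a 95% interval with 66 degrees of freedom.
t_crit_implied = ((ci_high - ci_low) / 2) / se_implied
```

Under that assumption the reported d, t, and confidence interval are mutually consistent: the interval is centered on the mean difference and its half-width corresponds to a critical value of roughly 2.0.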
Hypothesis 4 examined the relationship between posttest indirect scores and posttest direct scores, and was addressed by using Pearson correlation. No statistically significant relationship was found between posttest indirect scores and posttest direct scores, r(67) = .11, p = .357. Students did not demonstrate the ability to self-assess their knowledge, and scores did not correlate significantly in a positive direction; therefore, Hypothesis 4 was supported.
Hypothesis 5 examined the relationship between posttest indirect scores and final course grades, and was addressed by using Spearman correlation. No statistically significant relationship was found between posttest indirect scores and final grades, rs(67) = .06, p = .615. Students did not demonstrate the ability to self-assess their perceived knowledge, and grades had no relationship with posttest indirect scores; therefore, Hypothesis 5 was supported.
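Because final grades are ordinal and heavily tied (many students share the same letter grade), the Spearman coefficient used for Hypotheses 3 and 5 amounts to a Pearson correlation computed on average ranks. The following is a minimal sketch of that calculation; the helper names and the `grades` and `direct` values are hypothetical and are not the study's data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: grade points (ordinal, with ties) and test scores.
grades = [4.0, 4.0, 3.7, 3.0, 4.0, 2.0]
direct = [40, 38, 35, 30, 41, 25]
rho = spearman(grades, direct)
```

Ranking first means the coefficient depends only on the ordering of grades, not on the numeric spacing assigned to letter grades, which suits a variable treated as ordinal.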
Hypothesis 6 examined whether low scorers on the direct measure differed from high scorers in the self-assessment of their knowledge and abilities. This was addressed using a mixed-model ANOVA where the between-subjects variable was the three levels of direct knowledge and the within-subject variable was the measure of indirect knowledge measured at the beginning and end of the course. Referring to Figure 4, direct scores were divided into three intervals comprising the bottom, the middle, and the top third of the frequency distribution. A statistically significant difference was found between pretest (M = 123.73, SD = 17.38) and posttest (M = 163.30, SD = 15.22) indirect scores, F(1, 64) = 205.30, p < .001, η² = .762. However, there was no significant interaction of pretest and posttest scores, F(2, 64) = 0.11, p = .896, η² = .003, across the three ability levels, suggesting that the within group difference over time was similar for all three groups. There was a statistically significant difference for indirect scores across the three levels of direct knowledge: low (M = 142.93, SD = 18.08), moderate (M = 138.65, SD = 15.77), and high (M = 148.91, SD = 13.70), F(2, 64) = 4.76, p = .012, η² = .13. A post-hoc comparison of means using Tukey's HSD at α = .05 revealed that the moderate direct knowledge group was significantly lower in perceived ability than the high direct knowledge group. The low direct knowledge group did not differ from the moderate or high direct knowledge groups, as shown in Figure 4.
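The effect sizes reported for the mixed-model ANOVA can be recovered from the F ratios and their degrees of freedom, assuming the reported values are partial eta squared (the text does not say which variant was used). The sketch below also shows one hypothetical way to assign respondents to bottom, middle, and top thirds by rank position; the study's actual cut points are not given, and the rule shown here breaks ties arbitrarily.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics as reported for the mixed-model ANOVA.
time_effect = partial_eta_squared(205.30, 1, 64)  # pretest vs. posttest
interaction = partial_eta_squared(0.11, 2, 64)    # time x ability level
group_effect = partial_eta_squared(4.76, 2, 64)   # ability level


def split_into_thirds(scores):
    """Label each score low/moderate/high by rank position (hypothetical rule)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for pos, idx in enumerate(order):
        labels[idx] = ("low", "moderate", "high")[min(3 * pos // n, 2)]
    return labels


# Hypothetical direct scores for six respondents.
groups = split_into_thirds([10, 30, 20, 40, 50, 60])
```

Rounded to the reported precision, the three recovered values are .762, .003, and .13, which agrees with the figures in the text under the partial eta squared assumption.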

Discussion
Two different measures of assessing student learning were administered at two points in time to six sections of a psychology research methods course at a higher education institution. Indirect and direct assessments were used to determine levels of student learning by measuring both perceived knowledge and actual knowledge of course content. The indirect assessment measured students' confidence levels on their knowledge of course topics, and was further used to determine if an indirect measure could accurately gauge student learning. The direct assessment addressed students' actual knowledge with items that paralleled the indirect course topic items; results were examined along with final course grades to determine if students acquired the essential knowledge and skills from the goals and objectives contained in the course curriculum.
We found the level of perceived knowledge from our indirect measure increased significantly from pretest to posttest across all sections of the course, indicating increased confidence and familiarity with course topics over time. This finding supports prior research (Bell and Volckmann, 2011; Bowers et al., 2005; Clauss and Geedey, 2010; Nuhfer and Knipp, 2003; Price and Randall, 2008) where student confidence levels showed an increase by the end of the semester. Similarly, the level of actual knowledge from the direct measure increased significantly from pretest to posttest, demonstrating that students completed the course with more knowledge than they had at the beginning of the semester. In addition to statistical significance, both measures demonstrated a strong effect size, further supporting the magnitude of these differences, or gains in knowledge. Our results support Bell and Volckmann (2011) and Price and Randall (2008), who used pretest and posttest direct measures and showed gains in actual knowledge of course content by the end of the semester. Increases in pretest to posttest scores can identify areas in need of improvement and determine if the department and the faculty are "adding any value" demonstrated by increased knowledge in students (Pederson and White, 2011). Although this study, along with prior research, found increased scores from pretest to posttest using indirect perceived knowledge measures and direct knowledge measures, the results do not support the use of indirect measures as indicative of student learning.
This study found indirect measures of perceived knowledge were not an accurate indicator of student learning, as our indirect measure did not correlate with the direct knowledge measure or course grades. Our results support prior studies (Bowers et al., 2005; Price and Randall, 2008) in which students were unable to accurately perceive their knowledge and abilities via indirect measures. Bell and Volckmann (2011) and Wirth and Perkins (2005), on the other hand, found indirect posttest measures of perceived knowledge to be a good predictor of later performance on a test. However, when the indirect assessment was compared to final course grades, mixed results were found. In accordance with Bowers et al. (2005), results from the indirect perceived knowledge measure were not a good indicator of final course grades. Wirth and Perkins (2005), however, differed from these results and found perceived knowledge to be a good indicator of final course grades.
Unlike previous research, we did not provide the indirect assessment to students as a study guide; in the other studies, this may have aided students in posttest preparation. Our small correlation may also be due to a lack of course consistency across the six sections of the psychology research course. Five of the six instructors who taught the course in the fall 2011 semester were adjunct faculty, and three of them were teaching the course for the first time. In developing our assessment, we consulted with faculty teaching the course and created indirect items and direct items based on topics deemed essential; however, individual instructor variation across six sections of the same course can be difficult to control without a more uniform course design. A possible solution to ensure course consistency in future assessments is to involve all faculty members registered to teach the course in the design and the implementation of the assessment measures.
The skewed distribution of final course grades may have weakened the correlation between the indirect scores and final course grades, due to restriction in range. Future researchers might minimize the impact on correlations by obtaining a measure of final course grade that is numerical (on a basis of 0 to 100) as opposed to a graded option (F through A). While skewness would still be evident, the finer measurement might mitigate some of the effect of range restriction. It is important to note, however, that any restriction in final course grade would have similarly affected the direct measure's relationship with course grade, which was statistically significant. It seems reasonable to assume that if the correlations were stronger in a less restrictive sample, the magnitude of the difference in the correlations for direct and indirect score would still exist. Thus, the conclusion that the indirect score is not as effective as a measure of course knowledge is still supported.
Lastly, based on prior research (Bell and Volckmann, 2011), this study attempted to determine if low scorers on the posttest direct measure were more confident in their knowledge and ability than high scorers. In accordance, we found low direct measure scorers overconfident in their abilities (measured by the direct score). Bell and Volckmann (2011) suggested overconfident low scorers were a result of their inability to self-assess accurately. Clauss and Geedey (2010) suggested students with excellent self-assessment skills may do poorly on the measure, then use it as a study guide and do better on the direct measure, thus explaining some of the disconnect between the two measures as maturation/developmental. However, this rationale would only apply in studies that use the indirect measure in a more formative manner.
Previous literature has focused on examining indirect measures and direct measures of student learning in science and quantitative courses. Price and Randall (2008) investigated a quantitative course in the business major; Bell and Volckmann (2011) a general chemistry course; Wirth and Perkins (2005) a geology course; and both Clauss and Geedey (2010) and Bowers et al. (2005) researched biology courses. This study is one of the first to address a course in the social sciences, a core psychology research methodology course. Hypotheses 1, 2, and 3 propose to confirm results from earlier research with a focus on a social science research course.

Figure 4. Consistent Overestimation of Self-Ability by Low Scorers. The moderate direct knowledge group was significantly lower in perceived ability than the high direct knowledge group. The low direct knowledge group did not differ from the moderate or high direct knowledge groups.
Indirect and Direct Item Example. Paired indirect item and direct item for the course topic of "reliability in measurement".