Rubrics on the Fly: Improving Efficiency and Consistency with a Rapid Grading and Feedback System

Many learning management systems (LMS) used in higher education provide customizable rubrics that aid in grading and providing feedback for many forms of assessment commonly used by educators today. Rapid Grade (RG) is a grading and feedback feature built into a non-commercialized LMS developed by a large, public, Midwestern university in the United States. In this research, RG was compared to the grading and feedback system of one of the most widely used LMSs in higher education; that LMS is intentionally not named. The Technology Acceptance Model (TAM) was used to validate empirically that RG improves upon existing methods, and survey results indicate that RG is a significant improvement in terms of ease of use and usefulness when grading and providing feedback for a given assessment. The RG framework, as well as the specific results of the TAM, are presented.


Introduction
One of the most time-consuming tasks from a teaching perspective is the process of grading assessments that have been submitted by students. These duties would be much less exhausting if the act of providing useful feedback were omitted; however, constructive criticism is a major part of the educational equation (Hall et al., 2001; He, Hui, & Quan, 2009). In this paper, a methodology related to the student-teacher feedback loop will be presented. Before outlining the scope of the paper, elements of the student-teacher feedback loop will be defined. The feedback loop is straightforward. Students are given assessments that have been prepared by a subject matter expert. They are then asked to complete and submit their assessments, often through a learning management system (LMS). Based on the type of assessment given, instructors evaluate the students' performance by providing some sort of numeric score. In addition to this score, instructors provide written feedback so that students can learn from their mistakes or successes (Butler, 2011).
From an instructor's perspective, the type of assignment chosen is often based on a few considerations. For example, instructors might choose one type of assessment over another simply based upon the size of the class. If the class size is large, instructors may choose scalable methods like multiple-choice questions. These types of assessments are not always easy to create, but can be graded easily by an LMS. However, some instructors could choose other types of assessments, such as reports or projects, if the size of the class is not large. In many cases, these types of assessments take much longer to evaluate, due to the variety of answers that are submitted. With that said, the most recent trend in higher education is that student-to-faculty ratios are increasing (Spinellis, Zaharias, & Vrechopoulos, 2007). Thus, providing consistent, useful feedback in a timely manner will only become more challenging for instructors in the future.
Moyer, Young, Weckman, Martin, and Cutright. Journal of Teaching and Learning with Technology, Vol. 4, No. 2, December 2015. jotlt.indiana.edu
Of course, there are many other things that instructors take into account when deciding what type of assessment to use. The decision could be based on the type of material being covered in the class. If, for example, an instructor is teaching a course in computer programming, some assessment types, like ones built with multiple-choice questions, may not be the most accurate way to evaluate a student's knowledge of learning outcomes (Cheang, Kurnia, Lim, & Oon, 2003). Certainly, other factors come to mind when deciding what type of assessments should be given to students. Examples include whether or not grading assistance is available to instructors; the time that the instructor has to dedicate to grading; and whether technology can be used and managed effectively. Though only a few reasons why instructors choose certain types of assessments were provided, there is usually one common thread among all assessment types. This thread is the act of providing constructive criticism and/or assigning points to the particular assessment in question. In order to simplify the remaining portions of this manuscript, this act will simply be referred to from this point forward as "grading."
Providing consistent grading and feedback on a particular assessment is often a difficult challenge for instructors. While grading assessments, there are many things that might contribute to inconsistencies. Providing consistent feedback is perhaps the biggest challenge instructors face, because inconsistencies can mean that students do not share a common experience within the classroom. For example, depending on the complexity and the number of students submitting assignments, providing feedback can be very time-consuming. Too often, grading for extended periods can result in inconsistent assessment and feedback (Klein & El, 2010). This grader fatigue occurs when instructors are too tired to continue grading and must often take breaks from the work. This period of discontinuous work can result in inconsistent feedback, since some of the comments and/or points that were given to previously completed work are not as fresh in the instructor's mind as they once were. Using predetermined evaluation methods, such as rubrics, is one way to reduce the inconsistencies caused by grader fatigue. However, this is only part of the solution (Ramey, VandeVusse, & Gosline, 2007). Having a good rubric can go a long way towards making grading less time-consuming for instructors, as well as providing feedback that is more consistent to students; however, there are certainly limitations (Cross, 1990).
The scope of this work includes the presentation of a framework that is easy to use, is easy to understand, and can be built into an LMS, allowing instructors to provide faster, more consistent grading and feedback to students. This framework, which will be referred to as Rapid Grade (RG) from this point forward, has been implemented in an LMS that has been used within a college of a large, public, non-profit Midwestern university in the United States. In order to validate the proposed methodology, expert users benchmark the LMS into which RG has been built against a well-known and highly utilized LMS platform within higher education. It should be noted that the name of the vendor of this popular LMS is intentionally not mentioned within this manuscript; this LMS will simply be referred to as XYZ henceforth. RG will therefore be benchmarked against XYZ based on survey results, which are described more formally in a later section, in order to determine the perceived usefulness and ease of use of the RG technology.
Presenting a new framework that could be developed within any LMS environment is significant in many ways. For example, instructors can leave feedback that is more consistent, at a more rapid pace, than what is currently offered in popular LMSs used in higher education. This is especially true when class sizes are large, or when an instructor is teaching multiple class sections at a time. The implemented framework also allows instructors to start the grading process without a pre-existing rubric. In other words, instructors can develop "rubrics on the fly" while the grading process is ongoing. Simply put, the framework is robust and flexible enough to offer students more consistent feedback, and from an instructor's perspective it is faster and easier to use than the benchmark LMS, XYZ.

Literature Review
A review of literature is provided to obtain a broad view of what others have studied as it relates to rubrics as a grading and feedback system. This review provides an overview of these systems and then goes into more depth, with an overview of current state-of-the-art technologies and a brief description of the general deficiencies of these systems. Finally, since the framework of RG is presented as an LMS methodology, a brief overview of the software development strategies used to develop the framework is presented.

Feedback and Rubric Systems
There are many benefits to utilizing rubrics. For example, rubrics can be used by teachers to communicate their expectations to students; by students to self-assess their work; by teachers who want to communicate learning objectives and outcomes with other teachers; and, of course, by teachers to provide grading and feedback, which is the focus of this article.
There are many ways to provide feedback to students. In terms of using rubrics as a feedback mechanism, there is a wide variety of strategies found in research. For example, the use of rubrics within quantitative courses is rather low (Riddle & Smith, 2008). However, some practitioners have used rubrics within these types of courses successfully (Riddle & Smith, 2008). In fact, Riddle and Smith not only classify rubrics into categories such as checklist, analytical, and hybrid rubrics, but also provide insight as to which rubric type works best for the assignments given.
As previously noted, rubrics can be used as a feedback method and have been shown to improve teaching (Cooper & Gargan, 2009; Wolf & Stevens, 2007). However, in some cases there is a lack of quantitative research on the ability of rubrics to enhance academic performance. In one study, rubrics were found to be more significant to academic performance than college year, major, pre-test score, and gender (Howell, 2011).
A rubric is simply a set of criteria that is used to help articulate the gradations of the quality of work the student has demonstrated in an assessment (Goodrich, 1997). Rubrics as feedback mechanisms are also useful to students, as they clearly state what an instructor is looking for in order to demonstrate a mastery level of course content. Providing students access to the rubric before an assignment is submitted can make the assessment process very clear to the student (Andrade, 2000). Therefore, since the baseline of expectations has been established, providing consistent feedback is often easier (Andrade, 2000).

Current State of Rubrics as Feedback Systems
There have been astonishing findings when it comes to using customized feedback systems. For example, one study showed that using computer-assisted grading rubrics was almost 200% faster than traditional hand grading without rubrics, more than 300% faster than hand grading with rubrics, and nearly 350% faster than typing the feedback into an LMS (Anglin, Anglin, Schumann, & Kaliski, 2008). Not only did this particular system reduce the time required from the teacher's perspective, but the results also indicated that utilizing an online system did not negatively affect students' attitudes or their overall satisfaction with receiving feedback. Though these types of systems are not widely used, there are popular systems that are used within GK-12 and higher education. For example, according to market share, Blackboard®, Moodle®, and Desire2Learn® (CampusComputing, 2013) are the top three LMSs used today, with market shares of 41%, 23%, and 11%, respectively. Based on these commercial options, some practitioners have studied effectiveness from a student and teacher perspective with success. For example, one study showed that from a teacher's perspective there is a 40% reduction in marking time and improved student satisfaction with feedback (Atkinson & Lim, 2013).

Teachers do not always have the luxury of having access to custom-built or commercial LMS. Even if they do, commercial systems are limited in terms of their capabilities, especially when it comes to more subjective assignments (Thompson & Ahn, 2012). Therefore, another option for teachers is to utilize rubric systems that are available online. These systems allow teachers to build, share, and utilize pre-developed rubrics for the classroom. For example, these sites include RubiStar (4Teachers.org, 2012), Rubrics for Teachers (TeacherPlanet.com, 2012), and TeAch-nology (2012).

Deficiencies with Current Rubrics as Feedback Methods
Some researchers state that one limitation of using rubrics as a feedback system is that rubrics are not always self-explanatory (Andrade, 2005). Not only is this a limitation for students, but it is also a limitation for instructors. Rubrics that are unclear often are not reliable and could result in grading inconsistencies even for the same body of work (Andrade, 2005). Therefore, though rubrics can improve consistency with feedback and make the process faster for instructors, they do have their limitations. However, some studies show that these limitations are reduced when rubrics are co-developed, or when they go through a process of peer review before they are used as a grading tool (Andrade, 2005).
Computers can assist with the process of grading and providing feedback for student assessments (Chen, 2004). However, there is certainly room for improvement, as it is often very cumbersome to work with a modern LMS in order to provide individualized feedback. For example, if an instructor is using XYZ as their LMS for a particular class, and the instructor is trying to view all assignments that were submitted, several steps are needed in order to provide a grade and feedback for an individual student. This process of completing several steps is then repeated for every student, which can be a tedious, draining task.
Another limitation of using rubrics is that it is very difficult to accommodate every grading scenario within the pre-defined rubric categories. The limitation here is that if pre-defined categories are used within an LMS environment, instructors often do not have a quick way to provide students with individualized comments. Even if customized comments are left, they are often not available to reuse when a similar grading scenario occurs. For example, instructors are at the mercy of having a complete rubric before the grading process occurs. If there are issues that were not discovered when the rubric was created, then an instructor is left with a decision to make. Does the instructor correct the rubric for the current student and utilize it to grade all the remaining assessments, or does the instructor take the time to go back and redesign the rubric to make all of the necessary changes and then regrade all of the assessments that were already complete? In addition, if an instructor is using a rubric within an LMS across multiple class sections, then it is not easy to apply the changes made in one class section to another. LMSs also do not allow users to modify previously graded assignments easily if changes are made to the rubrics (Dornisch & McLoughlin, 2006).

Software Development Overview
The development of the proposed LMS feedback framework called RG is based on two existing software development approaches called Systems Development Life Cycle (SDLC) and Rapid Application Development (RAD). Therefore, a brief overview is provided.

Systems Development Life Cycle (SDLC)
SDLC is an approach that can be described in general terms as a process for planning, creating, testing, and deploying information systems (Centers for Medicare & Medicaid Services, 2008).
The processes within SDLC can be broken down into phases, activities, and tasks in order to achieve the overall goal of the project (US Department of Justice, 2012). The motivation behind SDLC is that projects can be managed in order to ensure that users get what they want in a timely and cost-effective manner (Necco, Gordon, & Tsai, 1987). The primary phases of the SDLC include initiation, system concept development, planning, requirements analysis, design, development, integration and test, implementation, operations and maintenance, and disposition.

Rapid Application Development (RAD)
RAD describes a methodology for building information systems. In comparison to SDLC, it is more of a user-centered approach (Mackay, Carne, Beynon-Davies, & Tudhope, 2000). For example, in RAD, users are more closely associated with the development process: they do considerably more than supply information and sign off on systems that have been developed. In RAD, users actively participate in the development process, with the motivation being that the applications are completed in a short amount of time (Abdullah, Mateen, Sattar, & Mustafa, 2010). Development comes in the form of iterative prototyping (Beynon-Davies, 2000). In general, RAD is a form of incremental and iterative development, where previous versions of a system or application are tested by users and feedback is gathered to make incremental improvements upon previous versions of the system (Avison & Fitzgerald, 2003).

Methodology
The emphasis for the remainder of this manuscript is on the systems development framework of a feedback and grading methodology called RG. As part of the presentation of this methodology, RG will be explained through technology examples as they relate to the LMS features available to instructors. However, before this framework is presented, a brief introduction of RG is given, followed by the system development methodology, which will be referred to as PBR² (or Plan, Build, Run, and Repeat). Finally, survey results will be investigated with the Technology Acceptance Model (TAM) (Davis, 1989), used to compare the perceived usefulness and ease of use between RG and XYZ, which is one of the most widely used LMSs in higher education (CampusComputing, 2013).

Rapid Grade Overview
RG is a part of a larger LMS called ISMS, or Integrated Site Management System, which has been developed and is currently being utilized by a college in a large Midwestern university. More specifically, ISMS, and therefore RG, is built using the Microsoft C# programming language, and has been deployed on a Microsoft ASP.NET web server environment.
There are many functions of the LMS component of ISMS that could be discussed. However, the scope of this section is strictly limited to the functionality of RG. Figure 1 shows a screen capture of RG when it was utilized to grade a particular assignment in a management information systems course. The screen capture shows a partial roster of the class, which has been blurred in order to protect the identity of the students. The screen capture also shows basic functionality such as controlling which students are selected, which in turn controls other functions such as adding points, assigning points, or assigning comments to the selected students. Also shown in the figure is a button for RG, which is located below the numeric scores for each student. Once activated, a secondary window is opened, which is located in the bottom right portion of the figure. The secondary window shows the total points available for an assignment as well as the pre-defined rubric dimensions that are relevant to the particular assignment. For example, under the total possible points available, the first dimension that can be assigned by a simple click of the mouse is "All Requirements Met," which, if clicked, would mean that zero points would be taken off the student's assessment. However, please note that this is not the case for the particular example shown in the figure. What is selected is the first dimension that is applied to the student's grade, "Code: Inadequate Comments", followed by the rubric comments that students would see once they selected the particular assignment on the LMS. While grading an assignment, instructors can simply select any available rubric dimension that applies to the student's assessment. If a given rubric dimension is not available, or has not previously been defined, instructors can add rubric dimensions by clicking the "Add a New Dimension" button. Even after the creation of a new rubric dimension, instructors can click the "Edit" button to make changes that will be retroactively applied to students who were assigned a given dimension on their assessment. In fact, if personalized feedback is needed, rather than the default feedback that was created when the rubric dimension was added, then an instructor can click the "Edit" button and has the option to apply customized feedback for a given instance, rather than changing the default rubric dimension for all the students. The point deductions can be modified in a similar fashion.
RG is a novel feedback and grading system in comparison to the current state-of-the-art technologies offered on websites and LMS environments. Ultimately, the goal of RG is to create an easy-to-use grading system that provides feedback that is more consistent for instructors and students. In order to develop this system, a hybrid systems development technique was created which merges parts of SDLC and RAD. This hybrid approach will be called PBR², which stands for Plan, Build, Run, and Repeat.
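The behavior described above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the C# used by ISMS, and every class, method, and field name (`Dimension`, `Gradebook`, `add_dimension`, and so on) is a hypothetical stand-in, not the RG implementation. The sketch assumes that each student record stores references to shared rubric dimensions, which is one simple way to make an edit to a dimension carry over to every student who was assigned it. The point values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """One reusable rubric entry (hypothetical structure)."""
    name: str
    comment: str   # feedback text the student would see
    points: float  # negative deducts points, positive adds a bonus


@dataclass
class Gradebook:
    total_points: float
    dimensions: dict[str, Dimension] = field(default_factory=dict)
    # student -> names of the dimensions applied to that student's work
    applied: dict[str, list[str]] = field(default_factory=dict)

    def add_dimension(self, name: str, comment: str, points: float) -> None:
        """Create a dimension while grading is already under way."""
        self.dimensions[name] = Dimension(name, comment, points)

    def apply(self, student: str, name: str) -> None:
        """Assign an existing dimension to one student's assessment."""
        self.applied.setdefault(student, []).append(name)

    def score(self, student: str) -> float:
        """Total points plus the point change of every applied dimension."""
        names = self.applied.get(student, [])
        return self.total_points + sum(self.dimensions[n].points for n in names)

    def feedback(self, student: str) -> list[str]:
        """Comments the student would see, in the order they were applied."""
        return [self.dimensions[n].comment for n in self.applied.get(student, [])]


# Assumed example values: a 20-point assignment and a 2-point deduction.
book = Gradebook(total_points=20)
book.add_dimension("Code: Inadequate Comments", "Add comments to each method.", -2)
book.apply("student_a", "Code: Inadequate Comments")
book.apply("student_b", "Code: Inadequate Comments")

# Editing the shared dimension changes both students' scores retroactively.
book.dimensions["Code: Inadequate Comments"].points = -3
```

Because scores are computed from the shared dimension at read time, no regrading pass is needed after an edit; a system that stored a copy of the deduction per student would instead have to update each record.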
The remaining sections provide a more thorough explanation of PBR² as it relates to the development of RG. Examples from RG will be given in order to illustrate how PBR² was utilized in the development process. Before these sections are presented, a brief overview of PBR² is given. In brief, Planning involves understanding the players, or those whom the solution will help, as well as the non-technical problems and user requirements. Requirements are prioritized into "need" and "want" categories, and then a strategy is developed to produce a potential solution. Building involves creating a functioning prototype through a constant feedback loop involving the players. While building, a prototype is made available for testing, comments, and constant feedback. Running is described as the point where the system is made available to users. However, it is important to note that anything not considered a "need" is moved into the Repeat phase. Repeating allows the process to start over and build upon what has already been implemented. For example, the "wants" from a prior iteration are simply re-evaluated and are potentially classified as "needs" in a later iteration. If a prior "want" is re-classified as a "need," the feature requirement is then carried through the Build and Run phases. Thereafter, repeating provides a constant feedback loop to improve the system continuously.

Plan
In terms of PBR² as it relates to the development of RG, a few considerations of the Plan phase are shown below. In this case, instructors are considered the users of RG. Instructors were asked for input in order to develop a feedback system, and their responses were then categorized as "needs" and "wants." For example, the users stated that they "wanted" or "needed" a feedback system that:
• improved the performance of the user's ability to provide feedback through enhanced consistency and speed of the grading process - a "need"
• is easy to understand and use - a "need"
• allows for easy and immediate historic feedback comparisons to permit continuous improvement - a "want"
• addresses situations where the student has no computer access - a "want"
In terms of the brief list above, two items were classified as a "need" and two were categorized as a "want." If a particular feature is determined to be a "need," the aspects related to the request are addressed in the planning stage. Features determined to be a "want" are simply not addressed until the next iteration, that is, until the Repeat phase of PBR² is invoked.
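The need/want triage in the Plan, Build, Run, and Repeat cycle can be sketched as a small loop. This is only an illustration under stated assumptions: the function name `run_iterations`, the `reprioritize` callback, and the shortened requirement labels are hypothetical, and the Build and Run phases are collapsed into a single step that marks a requirement as released.

```python
def run_iterations(requirements, reprioritize, iterations):
    """Sketch of the need/want cycle.

    requirements: dict mapping a requirement label to "need" or "want".
    reprioritize: callback returning "need" or "want" for a deferred label
                  (stands in for user feedback gathered in the Repeat phase).
    Returns (released, backlog) after the given number of iterations.
    """
    released = []
    backlog = dict(requirements)
    for _ in range(iterations):
        # Plan: only the requirements classified as needs are addressed now.
        needs = [label for label, kind in backlog.items() if kind == "need"]
        # Build and Run (collapsed here): needs are built and released.
        released.extend(needs)
        # Repeat: every want is carried forward and re-evaluated.
        backlog = {
            label: reprioritize(label)
            for label, kind in backlog.items()
            if kind == "want"
        }
    return released, backlog


# The four requirements from the Plan phase, with shortened labels.
plan = {
    "consistency and speed": "need",
    "easy to understand and use": "need",
    "historic feedback comparisons": "want",
    "no computer access": "want",
}

# Assumed feedback: users later promote one want to a need.
def promote(label):
    return "need" if label == "historic feedback comparisons" else "want"

released, backlog = run_iterations(plan, promote, iterations=2)
```

Under these assumptions, the two original needs are released in the first iteration, the promoted want is released in the second, and the remaining want stays in the backlog for a later pass.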
For the first iteration of the development of RG, a prototype was developed. The plan for this prototype is shown in Figure 2. To provide this explanation, the assumption will be made that an instructor does not have a pre-defined rubric to utilize for an assessment that needs grading. When an instructor starts the grading process, they must decide whether or not to create a new rubric dimension. If this is desired, the instructor simply creates the rubric dimension, defining the description that will be available to the student along with a corresponding point deduction. The instructor will continue to grade additional questions, with new dimensions added as they are needed. Again, the assumption is that there is not a pre-defined rubric available to the instructor. However, if one were available, all of the rubric dimensions could be added before the grading process started. Assuming that the instructor has added all of the dimensions relevant to the first student's assessment, the instructor has decisions to make while grading the next student's work. For example, if the rubric dimension is available and applies to the next student's work, then the instructor can simply click on the dimension that needs to be assigned to the student's assessment. The instructor would create additional dimensions as they are needed. During the decision-making process shown in Figure 2, the instructor must decide whether a new dimension is to be created or a previous one needs to be modified. Adding a new dimension has already been discussed; what has not been discussed is the act of modifying an existing dimension. Before presenting the slightly more complex decision process shown in Figure 3, where the instructor can decide if customized feedback is used, a simple example of modifying an existing rubric dimension is given. Given that rubric dimensions have been created, instructors have the ability to edit the dimensions at any time. These edits might be related to a more descriptive comment associated with a given rubric dimension or a change in point values.
The more complicated scenario of modification is presented in Figure 3. The added complexity is associated with the decision of how edits should be managed and applied across the course roster. For example, assume for a given assessment that about half of the student assignments have been graded and the instructor realizes that a change is necessary. Perhaps the instructor wants to take off more or fewer points on a given dimension, or perhaps the comments are not as detailed as desired. The instructor can choose either to apply the changes to a dimension across all previous assignments that have been graded, or to customize the feedback or point values for the specific student assignment that is currently under review. This customized feedback is useful for many situations, including adding a personalized response to a student, which could ultimately promote an instructor's social presence in the classroom. In any case, giving instructors the power to retroactively apply changes to previously graded assignments saves time and improves the accuracy of grading course assignments.
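The choice between applying an edit across the roster and customizing it for one student can be sketched as follows. The names `AppliedDimension`, `edit`, and the two scope labels are hypothetical, and the override rule is an assumption: the sketch lets a per-student customization take precedence over the shared default, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dimension:
    """Shared default for a rubric dimension (hypothetical structure)."""
    comment: str
    points: float


@dataclass
class AppliedDimension:
    """A dimension as assigned to one student's assessment."""
    default: Dimension
    custom_comment: Optional[str] = None
    custom_points: Optional[float] = None

    @property
    def comment(self) -> str:
        # Assumption: a customized value, when present, wins over the default.
        if self.custom_comment is not None:
            return self.custom_comment
        return self.default.comment

    @property
    def points(self) -> float:
        if self.custom_points is not None:
            return self.custom_points
        return self.default.points


def edit(applied, scope, comment=None, points=None):
    """Apply an edit to everyone ("all") or only to this student ("this_student")."""
    if scope == "all":
        # Changing the shared default reaches every previously graded student.
        if comment is not None:
            applied.default.comment = comment
        if points is not None:
            applied.default.points = points
    elif scope == "this_student":
        # Store the change on this student's record only.
        if comment is not None:
            applied.custom_comment = comment
        if points is not None:
            applied.custom_points = points
    else:
        raise ValueError(f"unknown scope: {scope!r}")


# Assumed example: two students were assigned the same dimension.
shared = Dimension("Comments are missing from the code.", -2)
first = AppliedDimension(shared)
second = AppliedDimension(shared)

edit(second, "all", points=-3)  # retroactive: first is affected too
edit(second, "this_student", comment="Good logic, but please add comments.")
```

After these two edits, both students carry the 3-point deduction, while only the second student sees the personalized comment.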

Build
Based upon the input from instructors and teaching assistants, a prototype of RG was developed to address the "needs" listed in the Plan phase. During the Build phase, a simple yet effective feedback loop was followed, which is shown in Figure 4. Simply put, as prototypes were built and reached a stable state, feedback was requested from a small team of users. If the users were not satisfied with the solution presented, then more input was collected and the application was revised. This process continued until the user group was satisfied. Once this occurred, the component of the prototype was released to a larger group of users, which occurred in the Run phase.

Run
From the development side of the process, the Run phase is nothing more than making the software application available to a larger base of users. During this phase, users work with the technology so that ideas can be generated for future enhancements and feature additions. With that said, a pool of "needs" and "wants" is generated by the users and is later assessed when the process is repeated.
For the remainder of this particular section, examples of features built into RG that were deemed acceptable by the small group of users in the Build phase of the project are presented. For example, the screenshot shown in Figure 5 displays various controls that can be set by the instructor that relate to the visibility of the rubric to students. In this particular screenshot, a question is posed to the instructor: should the rubric developed for the assessment be hidden or made available to students before they submit their work?

Figure 5. Feedback Visibility Controls
When users were asked whether they built rubrics before an assessment was to be graded or during the grading process, the feedback was mixed. Since the process of creating rubrics on the fly was previously discussed, the following figure shows an example of when rubric dimensions were built before the grading process. As one can see, the window for creating new rubric dimensions is straightforward. Users define the point values, which can be positive or negative. Positive point values might mean a bonus, whereas negative values would deduct points from a given assessment. The user then fills out a name for the new rubric dimension and gives a description. As noted, these dimensions can be edited later.
Figure 7 shows one way in which rubrics can be managed. In this figure, users can modify the visibility setting as well as a number of other features. For example, users can add or delete dimensions from the "Manage" page. They can also select whether or not the dimension is selected by default when RG is used to grade a student's assignment. In addition, point values, as well as textual information, can be edited. One important feature included in this screen capture is the "Arrange" button. This simply allows instructors to change the order of the rubric dimensions that have been developed. This is useful in making the grading process more user-friendly. In other words, dimensions can be arranged in an order that follows the questions being assessed.

Figure 7. Editing Existing Dimensions
A final example of a feature of Rapid Grade is shown in Figure 8. This screen capture demonstrates the process an instructor would use when grading an assignment. Though the screen capture shows a limited number of dimensions, the instructor would simply need to select the appropriate dimensions that would apply to the assignment being graded.

Repeat
During the final phase of PBR², feedback is solicited from the pool of users. Ideally, the size of the pool would be based upon the scope of the feature requests made, as well as the size of the development team. Once sufficient feedback is obtained, the iterative process of PBR² is repeated.

Results
In order to develop an effective solution that meets the pedagogical needs of the faculty and students, instructional technologies should be validated (Machado & Tao, 2007). Since there is often confusion between verification and validation, a definition is provided. Verification is the act of ensuring that the system is built correctly. In other words, does the system function as it was intended to function from a technical perspective? Validation, on the other hand, is the act of ensuring that the right system has been built. In other words, does the technology accomplish the goals of the users (Miser & Quade, 1988)? In order to validate that RG functions more efficiently than the feedback and grading mechanisms within XYZ, a survey based on the TAM (Davis, 1989) was sent to higher education instructors who have had experience using both RG and XYZ. Thus, the results that will be presented in this manuscript are based upon a convenience sample. Before presenting the remaining portions of the results, a brief overview of TAM is provided.

Technology Acceptance Model
It should be noted that acceptance models are in short supply for researchers developing technology (Davis, 1989). Researchers claim that most models to validate systems are based upon subjective measures and suggest that they are often invalid. Thus, the TAM was chosen in order to test the hypothesis that RG improves upon the functionality of XYZ as a grading and feedback-based methodology.
The TAM survey consists of fifty-six factors, twenty-eight of which are dedicated to the evaluation of RG, and the other twenty-eight to XYZ. The model consists of a set of validated questions that provides quantitative insight as to the perceived level of usefulness of technology, as well as the perceived ease of use (Davis, 1989). Perceived usefulness is the extent to which a user of the system expects the system to improve personal efficiency. Perceived ease of use describes the extent to which a system is easy to use. This survey was developed in and distributed from ISMS. This decision was simply made out of convenience to collect the survey data electronically. Each question of the TAM is based upon a linear, five-point Likert scale (Likert, 1932).

Data Collection
In total, twenty-three survey participants who had experience with both online systems responded to the survey. A one-tailed paired t-test was used to determine whether there was a statistically significant difference between RG and XYZ for any of the categories and questions found in the TAM. For hypothesis testing, it has been suggested that t-tests provide a good indication of the Likert response location (Meyers, Gamst, & Guarino, 2005). As noted, TAM consists of two evaluation categories, with the first being the perceived usefulness and the second being the perceived ease of use. Each category consists of fourteen factors. It should be noted that when the factors were presented to the participants, they were not randomized. The factors, which will later be presented in square brackets, were presented in numeric order to the survey participants.
The survey utilized the standard convention of a 5-point Likert scale, where each response category was assigned a linear value. For example, "Strongly Disagree," "Disagree," "Undecided," "Agree," and "Strongly Agree" were assigned integer values of 1, 2, 3, 4, and 5, respectively. Before running this analysis, the aggregated respondent data was preprocessed in a manner such that a positive difference between RG and XYZ would indicate that RG was a potential improvement over XYZ. Likewise, if the difference were negative, this would indicate that XYZ was a potential improvement over RG.

Data Preprocessing
The data was preprocessed in order to make the results easier to interpret. For example, if the individual paired responses for XYZ were subtracted from those for RG (i.e., $RG_i - XYZ_i$), it would be logical that a positive difference would indicate that RG was an improvement over XYZ. For example, the first factor that appears in the survey is "My job would be difficult to perform without the technology." Assuming one survey participant replied with a value of 5 (i.e., "Strongly Agree") for RG and a 3 (i.e., "Undecided") for XYZ, the difference would be 2, which would indicate that the survey participant more strongly agreed with the notion that RG made his or her work easier to perform. On the other hand, if a survey participant replied with a value of 3 for RG and a 5 for XYZ, the difference would be -2, which would indicate that the survey participant more strongly agreed with the notion that XYZ made his or her work easier to perform.
Based upon the wording of some factors that appeared in the survey, an adjustment to the point values assigned to the Likert scale needed to be made in order to keep the results consistent for the purpose of interpretation. For example, the fifteenth factor is "I often become confused when I use the technology." If a survey respondent indicated an answer of 5 (i.e., "Strongly Agree") for RG and a 3 (i.e., "Undecided") for XYZ, the difference would be 2, but the interpretation of +2 has a different meaning. In this particular example, a +2 would not indicate that RG was an improvement over XYZ; rather, it would indicate that XYZ was an improvement over RG, meaning that XYZ was less confusing to work with. To overcome this issue, the values of the 5-point Likert scale were reversed for such factors so that "Strongly Disagree," "Disagree," "Undecided," "Agree," and "Strongly Agree" were assigned values of 5, 4, 3, 2, and 1, respectively. After this adjustment was made, the interpretation of positive and negative differences is consistent for every factor in the survey. Finally, Table 1 and Table 2 show the numbering convention used after the data preprocessing was complete. Factors for which the standard Likert scoring was changed (i.e., flipped) are indicated by an asterisk (*).
Table 1. Perceived Usefulness Factors

#    Factor
1.   My job would be difficult to perform without the technology.
2.   Using the technology gives me greater control over my work.
3.   Using the technology improves my job performance.
4.   The technology addresses my job-related needs.
5.   Using the technology saves me time.
6.   The technology enables me to accomplish tasks more quickly.
7.   The technology supports critical aspects of my job.
8.   Using the technology allows me to accomplish more work than would otherwise be possible.
9.   Using the technology reduces the time I spend on unproductive activities.
10.  Using the technology enhances my effectiveness on the job.
11.  Using the technology improves the quality of the work I do.
12.  Using the technology increases my productivity.
13.  Using the technology makes it easier to do my job.
14.  Overall, I find the technology useful in my job.

Table 2. Perceived Ease of Use Factors

#    Factor
15.  I often become confused when I use the technology.*
16.  I make errors frequently when using the technology.*
17.  Interacting with the technology is often frustrating.*
18.  I need to consult the user manual often when using the technology.*
19.  Interacting with the technology requires a lot of mental effort.*
20.  I find it easy to recover from errors encountered while using the technology.
21.  The technology is rigid and inflexible to interact with.*
22.  I find it easy to get the technology to do what I want it to do.
23.  The technology often behaves in unexpected ways.*
24.  I find it cumbersome to use the technology.*
25.  My interaction with the technology is easy for me to understand.
26.  It is easy for me to remember how to perform tasks using the technology.
27.  The technology provides helpful guidance in performing tasks.
28.  Overall, I find the technology easy to use.
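The preprocessing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the set of reversed factors is read from the asterisks in Table 2, and the reversal is written as 6 minus the original score, which reproduces the 5, 4, 3, 2, 1 mapping described in the text.

```python
# Factors with negative wording, read from the asterisks in Table 2.
REVERSED_FACTORS = {15, 16, 17, 18, 19, 21, 23, 24}


def adjust_score(factor: int, score: int) -> int:
    """Return the Likert score, reverse-coded for negatively worded factors."""
    if not 1 <= score <= 5:
        raise ValueError("Likert scores must be between 1 and 5")
    # 6 - score maps 1, 2, 3, 4, 5 onto 5, 4, 3, 2, 1.
    return 6 - score if factor in REVERSED_FACTORS else score


def paired_difference(factor: int, rg_score: int, xyz_score: int) -> int:
    """Return the adjusted RG - XYZ difference for one respondent and factor."""
    return adjust_score(factor, rg_score) - adjust_score(factor, xyz_score)


# Factor 1 example from the text: RG = 5 and XYZ = 3 gives +2 (favors RG).
print(paired_difference(1, 5, 3))

# Factor 15 example: RG = 5 and XYZ = 3 becomes 1 - 3 = -2 (favors XYZ).
print(paired_difference(15, 5, 3))
```

After the adjustment, a positive difference favors RG for every factor, whether or not its wording is negative.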

Data Analysis
Ultimately, to determine if RG was an improvement over XYZ on average (i.e., μ), a paired, two-sample, one-tailed t-test was needed to make a conclusion about the statistical hypothesis. In particular, the null hypothesis (or the status quo) is that the mean performance of XYZ is greater than or equal to that of RG. Thus, the alternative hypothesis is that the mean performance of RG exceeds, or is an improvement on, that of the XYZ system. Therefore, the two-sample, one-tailed hypothesis statement is shown below, where Rapid Grade and the comparison LMS are abbreviated as RG and XYZ, respectively.
• $H_0: \mu_{RG} \le \mu_{XYZ}$
• $H_1: \mu_{RG} > \mu_{XYZ}$
After stating the hypothesis test, the significance level of the test was selected. For this particular study, an alpha value of 0.05, or 5%, was selected. As described in the previous section, the results from the survey respondents were preprocessed and the differences between RG and XYZ responses were determined. Thus, the two-sample hypothesis test shown above can be rewritten in an equivalent form in terms of the mean paired difference:
• $H_0: \mu_{RG - XYZ} \le 0$
• $H_1: \mu_{RG - XYZ} > 0$
Moyer, Young, Weckman, Martin, and Cutright. Journal of Teaching and Learning with Technology, Vol. 4, No. 2, December 2015. jotlt.indiana.edu
A statistical hypothesis test derived from the TAM data collected can be performed in many ways. The results in the following section start at a high, aggregated level and are followed by more detailed analyses. The structure of the results is as follows. First, an overall test of the twenty-eight paired factors of RG - XYZ will be conducted and presented. In this "overall case," both the perceived usefulness and ease of use are grouped together. Second, factors associated with the perceived usefulness are aggregated, analyzed, and presented. Third, factors associated with the ease of use are aggregated and the testing results and discussions are presented. Fourth, results of the individual factors are presented. The general descriptive statistics will be shown along with the paired, two-sample, one-tailed t-test results.
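As a minimal sketch of the paired, one-tailed test described above, the following Python code computes the t statistic from the paired differences for a single factor and compares it with a tabulated critical value. The response data and function name are hypothetical and serve only as illustration; the critical value of 1.717 is the standard one-tailed table value for α = 0.05 with 22 degrees of freedom, which corresponds to twenty-three paired responses. The study's own software and exact procedure are not stated in the text, so this should not be read as the authors' implementation.

```python
import math
import statistics

# One-tailed critical value for alpha = 0.05 and 22 degrees of freedom
# (23 paired responses), taken from a standard t table.
T_CRITICAL_DF22 = 1.717


def paired_t_statistic(rg, xyz):
    """Return (t, degrees of freedom) for the paired differences RG - XYZ."""
    if len(rg) != len(xyz) or len(rg) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [r - x for r, x in zip(rg, xyz)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical adjusted scores for one factor from 23 respondents.
rg_scores = [5, 4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 5, 4, 3, 4, 5, 4, 4, 5, 3, 4, 5]
xyz_scores = [3, 4, 2, 3, 3, 4, 2, 3, 4, 3, 3, 2, 4, 3, 2, 4, 3, 3, 2, 4, 3, 3, 3]

t_stat, df = paired_t_statistic(rg_scores, xyz_scores)
# H0 (mean difference <= 0) is rejected only for a large positive t.
reject_null = t_stat > T_CRITICAL_DF22
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

Because the test is one-tailed, only a large positive t counts as evidence for RG; a large negative t would not lead to rejecting the null hypothesis.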

Overall
Table 3 shows the descriptive statistics for the overall test conducted from the survey results. The sample size for this test is 644, which represents the twenty-eight factors on perceived usefulness and ease of use from the twenty-three survey participants. From this table, the mean of the difference between RG and XYZ is positive, with a value of 0.9829, which suggests that, in an overall sense, RG performs better than XYZ as a grading and feedback tool. To determine whether the alternative hypothesis that the performance of RG is greater than that of XYZ is supported, a paired, two-sample, one-tailed t-test was performed, and the results are shown in Table 4. Based upon this aggregated study, the p-value is less than the α of 0.05, which indicates that the null hypothesis can be rejected. Thus, it appears that RG is an overall improvement on the grading and feedback system of XYZ.

Perceived Usefulness
From the previous section, the testing suggests that RG is an improvement over XYZ; however, in order to provide more detailed results, this section presents the results of just the questions related to the perceived usefulness of the online system. Based upon the descriptive statistics found in Table 5, the sample size is 322, which is based upon the fourteen factors associated with the perceived usefulness and the twenty-three survey participants. The mean of this sample is 0.7267, with a standard deviation of 1.3603. Table 6 shows the results of the t-test where, like the overall test, the p-value is less than alpha. This indicates that from a perceived usefulness standpoint based on the TAM, RG is more useful than XYZ on an aggregated level.

Ease of Use
Thus far, statistical testing indicates that on a high, aggregated level, RG is an improvement over XYZ in terms of its overall performance and from a perceived usefulness standpoint. In addition to these tests, factors related to the ease of use were investigated, and the descriptive statistics for this sample are shown in Table 7. Like the perceived usefulness sample, the sample size was 322, which is based upon the fourteen factors associated with the ease of use and the twenty-three survey participants. The mean of this sample is 1.2391, with a standard deviation of 1.5891. Table 8 shows the results of the t-test. Similar to the overall and perceived usefulness tests, the p-value is less than alpha, indicating that from an ease-of-use standpoint, RG is perceived to be better than XYZ on an aggregated level.

Individual Factors
Prior testing was based upon the overall aggregated results, as well as upon the two categories of the TAM (i.e., perceived usefulness and ease of use) and the factors associated with them. This section investigates the individual factors on the survey. To be consistent with the presentation of the prior tests, Table 9 shows the descriptive statistics for all twenty-eight factors that appeared on the survey. For four of the twenty-eight factors, there is not enough evidence to reject the null hypothesis. These questions are 1, 4, 7, and 8, which correspond to "My job would be difficult to perform without the technology," "The technology addresses my job-related needs," "The technology supports critical aspects of my job," and "Using the technology allows me to accomplish more work than would otherwise be possible." It should be noted that the p-value associated with Question 8 was very close to alpha; however, as it currently stands, there is not enough evidence to conclude that RG allows users to accomplish more work than would otherwise be possible. The remaining questions, however, are all statistically significant, and the null hypothesis can be rejected for each of them. To summarize the information found in Table 11, the list below shows the most significant factors based upon the highest sum in relation to perceived usefulness. Finally, in addition to the list above, the list below shows the most significant factors based upon the highest sum related to the perceived ease of use.
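The aggregated tests reported in this section can be sanity-checked from the summary statistics alone. The sketch below is only a rough check under an assumption: it treats the 322 pooled differences in each category as a single sample, as the reported sample sizes suggest, and uses the large-sample critical value of about 1.645 in place of an exact t-table value. The function name is hypothetical, and the resulting values are not taken from the article's tables.

```python
import math
from statistics import NormalDist


def t_from_summary(mean_diff: float, sd_diff: float, n: int) -> float:
    """Return the t statistic for a mean difference given its summary statistics."""
    return mean_diff / (sd_diff / math.sqrt(n))


# Large-sample, one-tailed critical value at alpha = 0.05 (about 1.645).
critical_value = NormalDist().inv_cdf(0.95)

# Mean, standard deviation, and sample size as reported in the text.
reported = {
    "perceived usefulness": (0.7267, 1.3603, 322),
    "ease of use": (1.2391, 1.5891, 322),
}

results = {}
for label, (mean_diff, sd_diff, n) in reported.items():
    results[label] = t_from_summary(mean_diff, sd_diff, n)
    rejected = results[label] > critical_value
    print(f"{label}: t = {results[label]:.2f}, reject H0: {rejected}")
```

Under these assumptions, both statistics fall far above the critical value, which agrees with the reported conclusion that the p-values are less than alpha.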

Conclusions
From a low-level standpoint, RG outperformed XYZ in nearly every category that was evaluated in the TAM. In fact, the null hypothesis was rejected in twenty-four out of the twenty-eight individual tests that were performed. With this evidence, it is not surprising that from a slightly higher-level standpoint, the null hypotheses stating that the perceived usefulness and ease of use of RG were less than or equal to those of XYZ were also rejected. Finally, when all of the respondent data was grouped together, the tests indicated that the RG technology was accepted from an overall standpoint.
From a high-level standpoint, the results of this investigation suggest that students and teachers would benefit if commercialized LMS environments adopted a similar flexible rubric system. The primary features of RG that make the system unique include:
• The ability for teachers to start the grading process with or without a pre-defined rubric
• The ability for teachers to add or modify rubric dimensions during the grading process
• The ability to control whether modifications are enforced across all graded and non-graded assignments or if one-time-only modifications are only applied to special cases
• The ability to manage rubric dimensions so that one can easily delete, modify, or copy from one assignment or course section
From a development standpoint, the RG component was straightforward to implement. However, RG was developed by a seasoned software engineer who was also an instructor and had a clear vision of what was either needed or wanted. The development continued as other instructors used the LMS within their classrooms. Therefore, RG is considered to be highly customized, which can be considered an advantage or a disadvantage. The advantage is that the features that could be developed into RG are endless. However, very few practitioners have the knowledge or resources to develop or modify existing code within non-commercialized LMS environments. In general, most teachers are using commercialized software that cannot be changed. Thus, one disadvantage is simply that this type of flexible feedback system does not exist in the most popular LMS environments that are used in GK-12 and higher education.
Currently, many of the LMS environments used within education have a plethora of features. From a flexibility standpoint, this is good. However, these features are not usually developed from a usability standpoint. If features such as RG are not developed with a usability perspective, then practitioners do not benefit because the features are often considered too cumbersome to use and, therefore, are limited in terms of their benefit. Thus, one conclusion that can be drawn from this study is that commercial suppliers of LMS environments should consider a framework like TAM when developing features for their customers. To say that a feature exists is not sufficient anymore. Features must be developed with the customer in mind and they must be developed in a manner that is easy for the customers to use. If these conditions are satisfied, then features like RG can be used in a variety of classes that vary regarding the forms of assessments that are used. Overall, a vast amount of evidence supports the claim that RG is an improvement over the XYZ system in terms of its usefulness and ease of use as an online grading and feedback system. As this investigation shows, RG enhances the way in which users perform their work. From this study, users are more efficient in grading assignments while also being able to provide high-quality feedback, which is more than what XYZ currently offers at no extra cost. In fact, in terms of the ease of use, there were no individual categories in which the null hypothesis failed to be rejected. Based on these results, an empirically validated conclusion can be made that RG is less confusing to use, less frustrating for users to interact with, less cumbersome for users, and is an easy system to use. The simplicity of RG allows you to accomplish more with less effort, primarily by staying out of your way when grading. Spending less time worrying about making simple mistakes and being able to easily correct them when they inevitably occur is invaluable. Having the right tool for the job is sometimes all that you need.

Figure 2. Example of Adding New Dimensions

Figure 3. Example of Rubric Dimension Modifications

Figure 4. Example of Build Feedback Loop

Figure 6. Creating a New Rubric Dimension

Most significant perceived usefulness factors:
• The technology enables me to accomplish tasks more quickly [6]
• The technology saves me time [5]
• The technology increases my productivity [12]
• The technology improves my job performance [3]
• The technology reduces the time I spend on unproductive activities [9]

Most significant perceived ease of use factors:
• Interacting with the technology is NOT frustrating [17]
• I NEVER become confused when using the technology [15]
• Overall, I find the technology easy to use [28]
• I do NOT find it cumbersome to use the technology [24]
• The technology is NOT rigid and inflexible to interact with [21]

Table 9. Individual Factors Descriptive Statistics
Table 10 shows the t-test results for each factor on the TAM used in this investigation, which are summarized more clearly in Table 11.

Table 10. Individual Questions Test for $H_0: \mu_{RG} \le \mu_{XYZ}$, $H_1: \mu_{RG} > \mu_{XYZ}$
Based on the results shown in Table 11, in only four of the twenty-eight total factors is there not enough evidence to reject the null hypothesis and conclude that RG was an improvement on XYZ as a grading and feedback system.