The Effect of Online Formative Self-Assessment on Academic Performance of Chiropractic Students

The main objectives of this study were: 1. to evaluate the degree of utilization of online formative self-assessment (OFSA); and 2. to evaluate the effect of OFSA on summative final exam (SFE) scores. Students had the opportunity to take a total of eight weekly OFSA quizzes voluntarily, outside of class time, throughout the academic term. Demographic, utilization, and SFE data were collected and analyzed. The participation rate was high: 93% (N = 173) of students took at least one quiz, and 53% (N = 98) took four or more OFSA quizzes. Linear regression showed a 0.72-point increase in SFE score per quiz taken (p = .008; CI: .196 to 1.253). The correlation was weakly positive (r = .194, p < .01). In post hoc analysis, the mean SFE score of frequent OFSA takers (four or more quizzes) was 3.52 points higher than that of infrequent takers (three or fewer quizzes) (p < .01). Based on these results, OFSA may offer a complementary learning tool for students in a chiropractic program.

Educators in various disciplines, including chiropractic education, have given a significant amount of attention to the value of formative assessment.
Formative assessment refers to a process of low-stakes evaluation that is specifically intended for student-centered learning, generating feedback on performance to improve and accelerate learning (Nicol & Macfarlane-Dick, 2006). The focus of formative assessment is on providing feedback to students rather than evaluating them for course grades (Buchanan, 2000). Even when instruction is planned with great care and precision and delivered effectively, it is not uncommon for the learning outcome to bear little or no relation to what was intended. Even when all learners are at the same place when a particular sequence of instruction starts, within minutes students will have reached different understandings and/or different levels of understanding (Wiliam, 2011). However, one of the critical functions of formative assessment is that structured and clearly laid-out feedback can help students stay on track and produce the intended learning outcomes (Nicol & Macfarlane-Dick, 2006).
Few studies have been conducted using sequential weekly formative assessment tools given online, which not only evaluate the utility of formative assessment but also incorporate evaluation of study habits and time management. Sequential weekly formative assessment encourages students to make a regular and consistent commitment to their learning. Chang and Wimmers reported that students' learning was enhanced by regular consecutive weekly quizzes rather than a single occurrence of a large pre-test (Chang & Wimmers, 2017). Therefore, we decided to evaluate the efficacy of Online Formative Self-Assessment (OFSA) based on the number of quizzes taken out of eight weekly OFSA quizzes.
Conventionally, formative assessment has been performed by students taking written quizzes, and most studies evaluating the efficacy of formative assessment have utilized conventional methods such as paper-based quizzes (Costa, Mullan, Kothe, & Butow, 2010; Gikandi, Morrow, & Davis, 2011; Ita, Kecskemety, Ashley, & Morin, 2014). However, there are several limitations to conventional paper-based formative assessments. First, students must be present together at one specific place and time. Second, providing students with individualized feedback can be virtually impossible, considering students' and instructors' busy schedules (Olson & McDonald, 2004). Other limitations include a cumbersome process of item analysis for question reliability and validity, lack of interactivity, and loss of image quality when printed on paper (Velan, Kumar, Dziegielewski, & Wakefield, 2002). By contrast, the online formative assessment process is not limited by space and time. Students can take an Online Formative Self-Assessment (OFSA) quiz at virtually any place and time as long as they have a computer with internet access. They can even use their smartphones or tablets, which makes taking the quiz more convenient.
Buchanan reported that one of the crucial factors for a formative assessment to be useful is that feedback should be provided at an appropriate point in the learning process (Buchanan, 2000). OFSA can provide students with immediate feedback, which minimizes or eliminates the gap between the time students become aware that there is a learning issue and the time that issue is resolved.
Students can get relatively individualized feedback for questions they answered wrong, which increases students' metacognition and helps them monitor their learning (Ibabe & Jauregizar, 2010). Other advantages of OFSA include student-controlled pace of learning (Gikandi et al., 2011), convenience of item analysis of question reliability and validity, high-quality digital images, and ability to monitor students' learning progress to help instructors identify the areas or topics that students may be struggling with (Karami, Heussen, Schmitz-Rode, & Baumann, 2009).
With regard to the research methodologies and results presented in this article, we are unaware of similar reports in which the efficacy of formative assessment via electronic methodologies involving a regular weekly commitment, rather than one-time formative assessment, has been evaluated in chiropractic education. Therefore, in order to add to the knowledge base of the chiropractic education literature, eight weekly consecutive OFSA quizzes were administered online to examine the possible relationship between the number of OFSA quizzes taken and students' academic performance on summative assessment, including the midterm and final examinations.

Method
The methods of this study were deemed exempt from formal review by the Palmer College of Chiropractic institutional review board before the informed consent forms were given to participants.
The participating students were fully informed regarding the nature of the study and a signed consent form showing agreement that de-identified performance assessments would be collected and utilized for this study was obtained from the participants.

Participants
A total of 186 students (n = 186) in three separate cohorts of 9th-quarter students (there is a total of 13 quarters in the program) agreed to participate in this study across three consecutive terms, with 44, 58, and 84 students in each term respectively. The students were at the beginning of the 3rd year of the chiropractic program at Palmer College of Chiropractic Florida and enrolled in a mandatory Soft Tissue Radiology class covering diseases of the chest and abdomen that can be seen on plain radiographs, ultrasonography, CT (computed tomography), and other advanced imaging modalities.
Demographic data including sex, age, highest academic degree, and ethnicity were collected from the participants.

Material
The author, who was also the course instructor, wrote a test bank of one hundred fifty multiple-choice questions for the soft tissue radiology course. Each OFSA quiz was created using BrightSpace®, a cloud-based learning management system. The BrightSpace® learning environment allows instructors to design interactive training courses and evaluate assignments. For this course, BrightSpace® was utilized to electronically administer the quizzes and send out mass emails every week to those enrolled in the course to remind them of newly posted quizzes. Since these functions are already embedded in BrightSpace®, there was no additional cost to purchase extra educational software.
There was a total of eight OFSA quizzes throughout the course. Each OFSA quiz contained fifteen questions chosen from the question bank covering the topic discussed in the previous lecture. The OFSA quizzes were completely voluntary and independent of the course grade; therefore, students were not required, but were encouraged, to take the OFSA quizzes for their learning. Utilization of the OFSA quizzes gave no advantage in terms of an automatic increase in the summative final exam score, and there was no direct punitive or negative consequence for those who did not take the OFSA quizzes.
The format of the OFSA quiz was primarily multiple choice, with a few true/false questions. Feedback for each answer choice was created with explanation, rationale, diagram, and/or snapshots of lecture notes about why certain choices were correct or incorrect. The OFSA quiz results were immediately available electronically once students completed the quiz, as were the correct answers for each question and feedback for each distractor.

Procedure
Because the groups corresponding to the independent variable (OFSA quiz users vs. non-users) could not be randomly assigned, for the ethical reason of providing a non-discriminatory, equal learning opportunity, a non-experimental (observational) research methodology was used.
The course consisted of three hours of lecture per week with a total of eight weeks of lecture time in the quarter, excluding the summative examination week, and a total of eight weekly OFSA quizzes were offered throughout the term. The course material, lecture format, and instructor were consistent throughout the three terms. Table 1 outlines the time frame for OFSA quizzes and summative assessment. The students received extensive orientation in the first class of the term about the purposes of the OFSA quizzes, the goals of this research project, the weekly cycle of the OFSA quizzes, and how to use OFSA to its full advantage, including the feedback function. Students were instructed to take the OFSA quiz voluntarily outside of class time via verbal announcement in the class as well as a notification on the course website of a newly posted quiz. In addition, two reminder emails containing a link to the quiz were sent to the students one day after the quiz was posted and the day before the quiz expired. The OFSA quiz was posted for a week from Monday morning at 8 am through the following Monday morning at 8 am. As soon as the quiz expired, a new quiz was posted for the following week, and the cycle was repeated weekly for eight weeks. Once the quiz expired, students could not review the previous quiz. This was to prevent students from taking all of the quizzes at once immediately prior to the summative assessment and to help students make a regular and consistent commitment to learning the material. Along with the restricted time frame for taking the quiz, other restrictions included disabling the right click to prevent students from printing out the quizzes, a time limit of 30 minutes for 15 questions, and three attempts to take the quiz. After submission of the completed quiz, BrightSpace® allowed identification of the participants who took the quizzes and monitoring of their performance. 
For assessment and data analysis, the participating students were divided into a total of nine groups based on the number of OFSA quizzes they had taken, ranging from those who had never taken any quiz to those who completed all eight quizzes.

Assessment
Summative assessment was composed of two tests, a midterm and a final exam. The midterm exam was given in the 6th week of the course term and the written final in the 11th week, the final exam week. The most relevant variables measured were learning outcomes, assessed by students' summative exam scores, and student effort, estimated by the number of OFSA quizzes taken. In order to take all eight OFSA quizzes into consideration, Summative Final Exam (SFE) scores were used to evaluate the potential relationship between the number of OFSA quizzes taken and students' academic performance. For example, a student may have taken a total of four quizzes; however, if those were taken after the midterm exam, a correlation between the midterm exam score and those four quizzes cannot be validated. Therefore, the only valid dependent variable in this study is the SFE, because it was given a week after all eight quizzes had been administered.

Data Analysis
Data were collected for analysis through the BrightSpace® including the number of students who took each OFSA quiz and the number of quizzes each student took. These data were summarized and analyzed using the statistical computing environment R (Version 3.4.3, R Foundation, Vienna, Austria).
Statistical assumptions were verified, and p-values less than .05 were considered significant. The main hypothesis was evaluated via Pearson's product-moment correlation. As post hoc analyses, linear and nonlinear regression were used to assess the pattern of increase in SFE scores and to calculate the slope coefficient, in addition to an independent-samples t test comparing the mean SFE scores of frequent and infrequent OFSA takers.
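The two core computations described above, Pearson's product-moment correlation and the slope of a simple linear regression, can be sketched with a minimal, self-contained example. The analysis in this study was performed in R; the Python sketch below only illustrates the calculations, and the quiz counts and scores in it are hypothetical, not the study data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_slope(x, y):
    """Slope of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: number of OFSA quizzes taken and SFE scores
quizzes = [0, 1, 2, 3, 4, 5, 6, 7, 8]
sfe = [78, 80, 79, 82, 83, 84, 86, 85, 88]
print(round(pearson_r(quizzes, sfe), 3))
print(round(ols_slope(quizzes, sfe), 3))
```

The slope returned by `ols_slope` has the same interpretation as the study's coefficient: the expected change in SFE score per additional quiz taken.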

Demographic Information
A total of 186 students participated. The highest academic degree was predominantly a bachelor's degree (97%, n = 180), with only 3% (n = 6) holding a master's degree or PhD. The mean student age was 27.3 years (SD 5.1, n = 186); the highest age was 54 years and the lowest 22 years. Ethnically, the majority of students (73%, n = 135) were Caucasian, 14% (n = 26) Hispanic, 6% (n = 12) African American, and 7% (n = 13) other races. Overall, 93% (n = 173) of students took at least one of the eight weekly OFSA quizzes throughout the term. Unfortunately, we did not collect any information to ascertain why the remaining 7% of students elected not to take the OFSA quizzes. The mean number of quizzes taken was 3.98 (SD = 2.438). More students took OFSA quizzes earlier in the term, with the highest participation rate (119 students, 64%) in OFSA quiz 1. However, the participation rate steadily declined as the term went on, with OFSA quiz 8 (the last quiz) showing the lowest participation rate of 32.8% (see Table 4).

Number of OFSA Quizzes Taken and SFE Scores
Simple linear regression was carried out to investigate the relationship between the number of OFSA quizzes taken and SFE scores. The scatterplot showed a weak but positive linear relationship between the two variables, which was confirmed by a Pearson's product-moment correlation coefficient of .194 with statistical significance (p = .008) (see Table 5). The slope coefficient for the number of quizzes taken was .72, meaning that each quiz taken is associated with a .72-point change in SFE score (see Figure 1). Therefore, if students took all eight quizzes, they would potentially receive a benefit of a 5.76-point increase in their SFE scores. Limitations include slight violations of assumptions, including linearity, autocorrelation (DW = 1.746, p = .0255), and independence. Upon post hoc analysis, we found that summative midterm exam grades increased with the number of quizzes taken. Therefore, we are concerned that there may be a tertium quid effect, since midterm grades seemed to increase with OFSA quizzes 5 through 8 even though those quizzes were not taken before the midterm. The R² was .038, which means that 3.8% of the variation in SFE scores can be accounted for by the model containing only the number of OFSA quizzes.
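As a quick arithmetic check on the eight-quiz projection, using only the point estimate and confidence interval reported above (not the underlying data):

```python
slope = 0.72                    # reported SFE-point gain per quiz taken
ci_low, ci_high = 0.196, 1.253  # reported confidence interval for the slope

# Projected SFE benefit for a student taking all eight quizzes
print(round(slope * 8, 2))                          # 5.76-point estimate
print(round(ci_low * 8, 2), round(ci_high * 8, 2))  # interval for the projection
```

The same scaling applied to the confidence limits shows the projected eight-quiz benefit could plausibly range from under 2 points to about 10 points.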

Post Hoc Analysis Comparing Frequent and Infrequent OFSA Takers Relative to the SFE Scores
Graphical analysis of a non-linear regression model demonstrated a sudden increase in slope, or "step-up effect," after three quizzes (see Figure 2). This led to an additional post hoc analysis with an independent-samples t-test comparing SFE scores between infrequent takers, those who took three or fewer OFSA quizzes (fewer than the mean number of quizzes taken, 3.98), and frequent takers, those who took four or more. The SFE scores of frequent OFSA takers (M = 84.47, SD = 7.89) were higher by a statistically significant (p = .008) 3.515 points compared to infrequent OFSA takers (M = 80.95, SD = 10.025) (see Table 6).
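The reported group means and standard deviations can be plugged into a two-sample t statistic as an illustration. This sketch assumes Welch's (unequal-variance) form and infers the group sizes from the reported counts (98 frequent takers; 186 minus 98 = 88 infrequent takers); the paper does not state the exact split per group or which t-test variant was used, so these are assumptions:

```python
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's two-sample t statistic from summary statistics."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / se

# Reported means/SDs; group sizes are inferred, not reported directly
t = welch_t(84.47, 7.89, 98,     # frequent takers (4 or more quizzes)
            80.95, 10.025, 88)   # infrequent takers (3 or fewer quizzes)
print(round(t, 2))
```

Under these assumptions the statistic comes out around 2.6, which is of the right magnitude to be consistent with the reported p = .008.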

Post Hoc Analysis Comparing High and Low GPA Students Relative to the SFE Scores
Additional post hoc analysis was conducted to investigate whether Grade Point Average (GPA) is associated with the number of OFSA quizzes completed and with SFE scores. For graduate programs, a GPA of 3.0 or higher is viewed as an indicator of good academic performance (Sansgiry, Bhosle, & Sail, 2006). Therefore, for the post hoc analysis, students were divided into two groups: low-GPA students, whose GPAs were lower than 3.0, and high-GPA students, whose GPAs were 3.0 or higher. The average number of quizzes taken by high-GPA students (4.074) was slightly higher than that of the low-GPA group (3.725). The difference in the number of quizzes taken between high- and low-GPA students was not statistically significant (p = .386), suggesting that the number of quizzes taken is not related to students' GPAs. However, high-GPA students scored 7.055 points higher than those with low GPAs, with statistical significance (p < .001) (see Table 7). In the linear regression models for the high- and low-GPA groups, both groups benefited from taking OFSA quizzes. However, the slope coefficient for students with high GPAs was greater (.711, p < .05) than for those with low GPAs (.445, p = .453). These results could be interpreted to suggest that high-GPA students gained more benefit in SFE scores as they took a greater number of OFSA quizzes than did low-GPA students (see Figure 3). However, these results should be approached with care due to violations of assumptions, namely heterogeneity of variance and departures from linearity.
Pearson's product-moment correlation demonstrated a strong correlation between GPA and SFE scores, with a coefficient of .569 and statistical significance (p < .001). Therefore, students' GPAs are a much stronger factor related to academic performance than the OFSA quizzes. The R² value was .324; that is, 32.4% of the variation in SFE scores is explained by the model with GPA, approximately 8.5 times higher than the R² (.038) for the number of OFSA quizzes (see Figure 4), which is consistent with the Pearson correlation result.

Discussion
One of the most important goals of this study was to determine whether frequency of use of the OFSA quizzes was related to academic performance, assessed by the means of the final grade in the subject.
We hypothesized that the relationship between the extent of utilization of the OFSA quizzes, ranging from zero to eight quizzes, and academic performance on the SFE would be positive. As expected based on previous studies (Sadler, 1989; Wilson, Boyd, Chen, & Jamal, 2011; Zhang & Henderson, 2015), the results demonstrated better SFE scores among students with a higher degree of utilization of the OFSA quizzes compared to those who took few or none. However, the positive correlation was not as strong as results from other studies in various disciplines of health care education, including medical, nursing, and dental programs (Hill, Guinea, & McCarthy, 1994; Kibble, 2007; Olson & McDonald, 2004). One explanation for this weak correlation is a "ceiling effect": the summative examination may not have been difficult enough to differentiate students' levels of understanding of the subject matter. However, considering that the average SFE score was 82.8%, any "ceiling effect" was probably minimal to mild.
Despite the positive correlation between the number of quizzes taken and SFE scores, this does not necessarily mean that taking the OFSA quizzes automatically improves academic performance. Because equal learning opportunities needed to be provided to all students enrolled in the class, it was impossible to employ an experimental methodology that would have supported more definitive causal conclusions. While there is a large body of literature supporting a positive impact of formative assessment on students' summative exam scores, others have reported no relationship between the two variables, concluding that formative assessment does not enhance students' learning or academic performance (Haberyan, 2003). These contradictory results may be due to failure to identify and account for a tertium quid effect on students' academic performance. Academic performance on the SFE is a byproduct of multiple factors, of which the OFSA quizzes may play a small part. While previous academic performance has been identified as the most significant predictor of academic performance (Graham, 1991; McKenzie & Schweitzer, 2001), other factors include academic competence, test competence, time management, strategic studying, and test anxiety (Sansgiry et al., 2006; Zhang & Henderson, 2014). Our post hoc analysis likewise showed a strong correlation between GPA, reflecting previous academic performance, and SFE scores for the course in this study.
Students with high GPAs took slightly more OFSA quizzes than those with low GPAs (anyone below 3.0 GPA in courses taken in the Chiropractic program). Also, those students with higher GPAs received more benefit from taking the OFSA quizzes by earning higher SFE scores than those with lower GPAs.
This can be interpreted to mean that "better" students are more likely to avail themselves of the educational opportunity to take the OFSA quizzes and, as a result, perform better on the subsequent SFE.
In the post hoc analysis of the relationship between the number of quizzes taken and students' GPAs, students with low GPAs who took four or more OFSA quizzes scored higher than students with similar GPAs who took fewer than four. Even more important than the quantification of OFSA quizzes and GPAs may be students' attitudes toward the OFSA quizzes. It is speculated that "better"-performing students are more likely to show an intrinsic interest in learning, are better at managing their time, are more aware of the educational opportunities available to them, and are more likely to take advantage of those opportunities.
The finding that high-GPA students received more benefit from the OFSA quizzes is contrary to what Olson et al. found: that regardless of GPA, those who took formative quizzes demonstrated statistically significantly higher academic performance on the SFE compared to those who did not take the quizzes (Olson & McDonald, 2004). This difference could be due to various factors, including the design and format of the formative assessments, how closely they simulate the summative assessment, whether there is an automatic incentive (bonus points or extra credit), whether they are paper- or computer-based, how frequently students receive opportunities to participate in formative assessments, and/or how close in time they are to the summative assessments.
Students with high GPAs (≥ 3.0) showed a stronger positive correlation between the number of OFSA quizzes taken and SFE scores compared to those with low GPAs. This may be because the OFSA quizzes were voluntary and required a regular and consistent commitment from students in order to enhance their learning. While students with low GPAs did not receive as much benefit as those with high GPAs, the OFSA quizzes provided students with an additional way of learning, as informally reported by some of the students. Moreover, as evidenced in the results, students with low GPAs still benefited from taking the OFSA quizzes and received higher scores than those who did not utilize them, even though the degree of benefit was not as great as that seen in students with high GPAs. Another noteworthy finding is that some students performed well on the SFE yet did not participate in the OFSA quizzes. These students consistently had high GPAs, so it can be inferred that they had adequate academic and test competence and therefore did not need to spend extra time and effort to boost their scores, since there was not much room for improvement. In the same manner, there were students who scored low (less than 80%) yet took all eight OFSA quizzes. These students' GPAs were all consistently in the low-GPA group (less than 3.0). This can be interpreted in two ways. One is that the OFSA quizzes were not helpful for these students in improving their academic performance. The other is that these students could potentially have scored even lower had they not taken the OFSA quizzes.
However, these speculations require further investigation.
Students' level of use of the OFSA quizzes was considerably high (93% took at least one OFSA quiz), taking into account that it was a voluntary activity, was to be done outside of lecture time, and offered no additional grade incentive. This is significantly higher than the 52% participation rate Kibble observed with two formative assessment quizzes under the same condition of no course credit or incentive (Kibble, 2007). This high level of use may reflect the advantages that an online learning setting can offer, including easy access and the convenience of not being restricted by time and space (Barros, 2018; Demirci, 2007; Gikandi et al., 2011; Wang, Wang, Wang, & Huang, 2006). Email reminders about new OFSA quizzes at the beginning and end of each week may also have contributed to the high participation rate. More than half (53%) of the students took four or more quizzes (frequent takers); these students performed significantly better on the SFE than those who took three or fewer quizzes (infrequent takers), as seen in the post hoc analysis. The participation rate was highest (N = 119, 64%) for OFSA quiz 1 at the beginning of the term. However, fewer students took the OFSA quizzes as the term went on, with the lowest number of students taking the last OFSA quiz (N = 61, 32.8%). This pattern is consistent with what Darby et al. (Darby, Longmire-Avital, Chenault, & Haglund, 2013) found about students' motivation: it is highest at the beginning of the semester and steadily decreases, reaching its lowest point at the end.

Limitations
One limitation of this study is the relatively small sample size and the inability to divide the cohort into experimental and control groups, on the principle of providing an equal educational environment for all students enrolled in the same course. Even though the total number of students was 186, the sample size becomes significantly smaller when divided by the number of OFSA quizzes taken, ranging from 13 to 34 students per group. Generalizability of the results is another limitation. This study was conducted using one chiropractic course, Soft Tissue Radiology, and it is unknown whether the results are transferable to other courses. Therefore, future studies should involve other chiropractic courses, with comparisons made between different courses. As mentioned previously, while OFSA has many advantages over traditional paper-based formative assessment, there is no way to control or monitor how the OFSA quizzes are utilized.
For example, if a student decides to take an OFSA quiz with classmates, only the student who logged in will get credit for taking that quiz, even though the other students are also exposed to the information and effect of the quiz. Even though right-clicking was disabled in the BrightSpace® settings while students were taking the OFSA quizzes, to prevent printing, students could use electronic devices such as a smartphone to take snapshots of each question and share them with their peers. The inability to objectively quantify how closely the OFSA quizzes resemble the SFE is another limiting factor that can affect the results of this type of study. In theory, the closer the OFSA quizzes are to the SFE, the higher the correlation between the number of OFSA quizzes taken and SFE scores. However, it is unknown whether giving formative assessments that simulate summative exams in various ways, including difficulty level and question format, is beneficial for learning or merely for test-taking. That is why the design of the formative assessment is crucial to creating the desired educational effect of learning, resulting in higher academic performance.

Conclusion
In this study, the relationship between the number of OFSA quizzes taken and SFE scores demonstrated a weak, positive correlation. While it is possible that the OFSA quizzes helped students' learning, they account for only a small portion of the variation in SFE scores; learning is a byproduct of multiple factors and processes, of which OFSA quizzes are a small part. Students appeared to value the OFSA quizzes, as evidenced by the high participation rate. In addition to the statistical results, anecdotal reports from some students suggest that the OFSA quizzes provided an important opportunity for a different way of learning and the flexibility to evaluate and enhance their understanding of the content of the Soft Tissue Radiology course.

Funding Sources and Conflicts of Interest
This work was funded internally. The authors have no conflicts of interest to declare relevant to this work.

Acknowledgement
We thank Margaret Thompson-Choi and Kenice Morehouse for their suggestions and editorial assistance.