In a previous post1 I described in broad strokes the components necessary for implementing standards-based grading. One of these is the elimination of grade averaging; specifically, using the mean of all the scores in the gradebook to determine a student’s final score. At our school we’ve done this to the extent recommended by Marzano but stopped short of giving teachers full discretion to determine the final grade. For us the challenge has been deciding which system should replace mean-based averaging – or whether full discretion should be left to the teacher.
The fundamental argument for getting rid of grade averaging is that it’s not a good reflection of what students know. Marzano and Heflebower put it thus:
A student might have received an overall or “omnibus” letter grade of B, not because he had a solid grasp of the target content, but because he was exceptionally well behaved in class, participated in all discussions, and turned in all assignments on time.2
So the problem with traditional grading is that it lacks specificity, for example by confusing effort and attitude with ability. There are other problems with traditional grading as well. Granted, many of these complaints have more to do with the 0-100 grading scale, the replacement of which I’ll discuss in detail in a future post. “Averaging implies…objectivity…[but] one percentage point is the arbitrary cutoff,” notes Wormeli.3 When low grades early in a unit are averaged with strong ones later on, which would imply steady growth to a level of mastery, mean-based averaging leaves the student with a distorted score compared with their actual ability – for example, a student acing the final but ending with a C average in the class.4 Even worse is when zeros are used to punish non-submission of assignments, which can condemn a student to failure regardless of later learning.5 Beyond statistical calculations, mean-based averaging gives rise to teacher concerns about students “gaming” the system by doing the minimum required work to pass, and even the language of mean-based averaging is norm-referenced rather than criterion-referenced, which standards-based grading aspires to be.6 Finally, Ken O’Connor deplores “worshipping averages” when a teacher uses a mean-based average for a final score even when he/she knows it’s not an accurate reflection of student ability.7
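To make the arithmetic behind the zero problem concrete, here is a minimal sketch. The scores are hypothetical, on a 0-100 scale, and listed in chronological order; they are not drawn from any of the sources cited here.

```python
from statistics import mean

# Hypothetical gradebook on a 0-100 scale, in chronological order.
# The first assignment was never submitted and was recorded as a zero.
with_zero = [0, 85, 90, 95, 100]
submitted = with_zero[1:]  # the same student, counting only submitted work

print(mean(with_zero))   # 74
print(mean(submitted))   # 92.5
```

A single zero pulls the mean down by more than 18 points, even though every piece of submitted work scored 85 or higher.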
What should replace mean-based averaging? I’ve heard from a few different sources that automatic calculation of a grade shouldn’t be done at all – the final grade should be decided in each case by the teacher. James Nehring makes the cogent point that assessment is a “less precise enterprise involving evidence, intelligence, conversation, and judgment” and that in the US legal system, “jurors are provided with all the evidence and the best arguments on all sides, but the decision, ultimately, lies with them.”8 A “body of evidence” is what should be used, adds O’Connor, in conjunction with professional judgment.9 But the case for completely eschewing any sort of averaging scheme – and I’m not necessarily talking about mean-based averaging here – is undermined by Guskey’s assertion that teachers’ judgments of student achievement are highly subjective and inaccurate, at least in the context of grading on a 100-point scale.10 If we can’t trust teachers’ judgment there, then why would we let a teacher determine a final grade? One should also question whether we want the grading system to mirror America’s litigious culture, which Nehring’s comparison would seem to invite.
The most compelling argument against eschewing all automatic calculation of grades is that the experts don’t advocate for it. “Emphasize more recent achievement,” teaches O’Connor, advising teachers to ‘crunch’ numbers carefully and to use the median or mode rather than the mean. “The most recent evidence should always be given priority or greater weight,” adds Guskey, and “give priority or greater weight to the most comprehensive forms of evidence,”11 a sentiment echoed by the Great Schools Partnership.12 This would suggest that teachers consider a decaying average model – essentially a weighted average – and it “is probably a more realistic model, especially as assessments are really measuring a multi-dimensional skill space, and assessments that are close in time tend to measure skills that are closely related,” says bioinformatics professor Kevin Karplus.13 Marzano recommends using the power law; as the only calculation method based on actual learning psychology research as opposed to (logical) conjecture and anecdotal evidence, “a final score based on the power law is most probably a better estimate of a student’s true score at the end of a grading period than is the average score.”14 However, be prepared for “the resulting difficulty teachers/students/parents have understanding and explaining it,” warns Jesse Olsen;15 Frank Noschese adds that sticking to a simple calculation system means “kids can quickly and easily calculate their grade. Keep it simple.”16
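The alternatives named above can be compared side by side on one set of scores. What follows is a rough sketch under stated assumptions, not the formula used by any particular gradebook or author. The scores are hypothetical, on a 1-4 scale, in chronological order. The `weight` of 0.5 in the decaying average is an arbitrary illustration. The power-law function is one possible reading of the approach: it fits score = a × (assessment number)^b by least squares on logarithms and reports the fitted value at the most recent assessment.

```python
import math
from statistics import mean, median, mode

def decaying_average(scores, weight=0.5):
    """Blend each new score into a running value; `weight` (an assumed
    setting) is the share given to the newer score at every step."""
    running = scores[0]
    for score in scores[1:]:
        running = weight * score + (1 - weight) * running
    return running

def power_law_estimate(scores):
    """Fit score = a * n**b (n = 1, 2, ...) by least squares on logs and
    return the fitted value at the last assessment.
    Assumes at least two scores, all greater than zero."""
    xs = [math.log(n) for n in range(1, len(scores) + 1)]
    ys = [math.log(s) for s in scores]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return math.exp(intercept + slope * xs[-1])

# Hypothetical scores on a 1-4 scale, oldest first.
scores = [1, 2, 2, 3, 4]

print("mean:            ", mean(scores))              # 2.4
print("median:          ", median(scores))            # 2
print("mode:            ", mode(scores))              # 2
print("decaying average:", decaying_average(scores))  # 3.1875
print("power law:       ", round(power_law_estimate(scores), 2))
```

For this steadily improving student, the median and mode land below the mean, while the decaying average lands well above it because it favors the latest scores. This also illustrates the trade-off Olsen and Noschese raise: the first four numbers can be checked by hand, while the power-law estimate cannot.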
So while prominent figures in the field agree that mean-based averaging should be off the table, there are a number of valid options to replace it, from the mode to a decaying average to the power law. The power law is the only one backed by hard research, but so far I haven’t seen an experiment that attempts to correlate achievement on standardized tests with scores obtained from any of these methods – which would, I imagine, be the whole point of trying to find a more “accurate” grading system (depending on how much you take standardized assessments to be evidence of student learning). In any event, given that no alternative is unambiguously better, it’s probably not a good use of schools’ time to evaluate one against the other at length; as O’Connor says:
There are NO right grades, there are only justifiable grades!
Therefore, use expediency and ease of communication and implementation as your criteria and pick the method that you feel will lead most directly to the goal of focusing on the learning happening in the classroom.
- When Reality Meets Philosophy: An Introduction to Standards Based Grading ↩
- Marzano, Robert J., and Tammy Heflebower. “Grades That Show What Students Know.” Educational Leadership 69.3 (2011): 34-39. Web. 27 Dec. 2015. ↩
- Wormeli, Rick. “It’s Time to Stop Averaging Grades.” Web log post. Association for Middle Level Education. Association for Middle Level Education, Oct. 2012. Web. 28 Dec. 2015. ↩
- Ibid. It is also worth noting that the examples presented in support of this scenario invariably use two Fs averaged with two As to make a C – even though this scenario is exceedingly rare in reality. ↩
- Guskey, Thomas R. “The Case Against Percentage Grades.” Educational Leadership 71.1 (2013): 68-72. Web. 28 Dec. 2015. ↩
- Wormeli, 2012. ↩
- O’Connor, Ken. “Troubleshooting and Implementing Standards-Based Grading and Reporting Part 1: How to Grade for Learning.” NESA Fall Training Institute. Manama, Bahrain. 30 Oct. 2009. Lecture. ↩
- Nehring, James. “To Measure, or to Assess, Learning? Human Judgment Is Not the Problem, It’s the Solution.” Education Week 35.2 (2015): 19-20. ↩
- O’Connor, 2009. ↩
- Guskey, 2013. ↩
- Guskey, Thomas R. “Computerized Gradebooks and the Myth of Objectivity.” Phi Delta Kappan 83.10 (2002): 775-80. Web. ↩
- “Assessment + Verification.” Great Schools Partnership. Great Schools Partnership, Inc., n.d. Web. 28 Dec. 2015. ↩
- Karplus, Kevin. “Re: Sustained Performance and Standards-based Grading.” Weblog comment. Gas Station without Pumps. Kevin Karplus, 29 Aug. 2010. Web. 28 Dec. 2015. Read posts on decaying averages on the JumpRope blog, Haiku Learning, and Riley Lark’s blog. ↩
- Marzano, Robert J. Transforming Classroom Grading. Alexandria, VA: Association for Supervision and Curriculum Development, 2000. Print. Marzano is also fine with mean averaging separate scores for what he calls “student choice” indicators, namely effort and behavior, so long as they’re not lumped in with content standards. At NCPA we call them Process Standards. ↩
- Olsen, Jesse. “Musings on Most Recent, Power Law, and the Decaying Average.” Web log post. JumpRope Blog. JumpRope Inc., 5 Feb. 2015. Web. 28 Dec. 2015. ↩
- Noschese, Frank. “Keep It Simple Standards-Based Grading.” Web log post. Action-Reaction: Reflections on the Dynamics of Teaching. Frank Noschese, 23 Aug. 2012. Web. 28 Dec. 2015. ↩