This post from Kevin Meyer brought back my first university days.
I was a naive 17-year-old with no guidance from a very small school when I decided to enroll in the College of Engineering at the University of Cincinnati (a large school; there were almost as many freshman engineers as people in my village). We all arrived on campus. Then they told us that they would “flunk out” 33% of students the first year and another 33% the second. Only about a third of those freshmen would be kept in the school to graduate. Almost everything was graded on a massive curve. We all competed against each other.
No complaints. In the end, I received a degree in a course of study I essentially designed for myself. The combination of math, literature, and political science groomed me for a career first in management and later in media.
The ruminations began with Kevin Meyer’s thoughts on the ridiculous notion that Harvard awards too many A’s to its students, notwithstanding that most students enrolling there were A students wherever they came from (probably excepting those who get in on a parent’s coattails).
Bill George is someone I usually agree with. He’s been a consistent voice for principled leadership and organizational culture. So I took notice when he posted enthusiastic support on LinkedIn for Harvard’s recent vote to cap undergraduate A grades at 20% of any class, and pointed to HBS’s own 1-2-3 grading system as evidence it “works well.”
Is there a real problem?
I sat with that for a while before concluding he’s wrong on two levels: the problem isn’t diagnosed correctly, and the solution doesn’t follow even if it were.
Is 60% A’s actually a problem?
Does that mean that A’s were too easy, or does it mean that the students are smart and do their work?
More than 60% of undergraduate grades awarded at Harvard in 2025 were A’s, up from 24% in 2005. The faculty subcommittee called that grade inflation and voted to cap A’s at 20% of any class. But is 60% A’s actually a problem, or just a number that feels wrong?
Known unknowns—meaning further study needed.
Consider what we don’t know. Harvard admits around 3% of applicants, selecting for academic ability more aggressively than almost any institution on earth (setting aside legacy admissions and the occasional building named after a donor’s family…). Maybe a significant portion of those students genuinely earn A’s on a well-designed, rigorous exam. Maybe the exam was too easy. Maybe the professor grades generously. Maybe the course material isn’t demanding enough to differentiate outcomes. Maybe some combination of all of the above. The grade distribution alone tells us none of that. It’s a symptom without a diagnosis.
The wrong fix for the wrong problem?
Even granting that something has drifted in Harvard’s grading culture, capping A’s at 20% is the wrong fix. The issue is the difference between criterion-referenced and norm-referenced evaluation. Criterion-referenced grading asks whether work meets a defined standard. Norm-referenced grading asks how a student ranks against peers. Harvard’s new policy is purely norm-referenced: if 30% of students in a class produce work that genuinely merits an A, a third of them (10% of the class) will be graded down anyway, not because their work was deficient but because the math requires a loser.
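For readers who like to see the arithmetic, here is a minimal Python sketch of that 30%-versus-20% scenario. Everything in it is my own illustration, not Harvard’s actual procedure: the 20 scores are invented, the cutoff of 90 for an A is an assumed standard, and the function names `criterion_grades` and `capped_grades` are hypothetical.

```python
def criterion_grades(scores, a_cutoff=90):
    """Criterion-referenced: every score at or above the standard gets an A."""
    return ["A" if s >= a_cutoff else "not A" for s in scores]


def capped_grades(scores, a_cutoff=90, cap=0.20):
    """Norm-referenced cap: at most `cap` of the class may receive an A,
    chosen by rank, even if more students meet the standard."""
    max_as = int(len(scores) * cap)
    # Indices of students, ranked from highest score to lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    allowed = set(ranked[:max_as])
    return [
        "A" if i in allowed and scores[i] >= a_cutoff else "not A"
        for i in range(len(scores))
    ]


# Invented class of 20 students; 6 of them (30%) meet the assumed standard of 90.
scores = [98, 96, 95, 93, 92, 91, 88, 85, 84, 82,
          80, 79, 77, 75, 74, 72, 70, 68, 65, 60]

earned = criterion_grades(scores)
awarded = capped_grades(scores)

# Students who met the standard but lost the A to the cap.
bumped = [s for s, e, a in zip(scores, earned, awarded) if e == "A" and a != "A"]

print(earned.count("A"), "students met the standard")
print(awarded.count("A"), "students received an A under the cap")
print("Graded down despite meeting the standard:", bumped)
```

In this made-up class, the two students graded down did exactly what the standard asked of them. Nothing about their work changed between the two functions; only the rule did.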
Gregory Samanez-Larkin, a professor of psychology and neuroscience at Duke, left the sharpest comment in the LinkedIn thread: “1-2-3 works well for what? Honestly curious.” Nobody produced a clean answer, which is telling.
It seems they lack definitions: a vague solution to a vague problem.
The actual fix, harder but correct, is to define what an A requires and hold that line. If course material isn’t rigorous enough to naturally produce a spread of outcomes, that’s the problem to address. Capping the grade doesn’t make the course harder. It just penalizes students for the instructor’s design choices.
Meyer refers back to the disastrous Jack Welch era at GE.
Jack Welch made the same error at massive scale with his “vitality curve,” the 20-70-10 system at GE where the top 20% were rewarded, the middle 70% coached, and the bottom 10% fired annually. Mark Graban of Lean Blog asked the obvious question: if GE had to remove the bottom 10% every year, why did GE keep hiring turkeys?
The vitality curve assumes any workforce naturally distributes along a bell curve, so you might as well act on it. But a well-hired, well-developed team isn’t guaranteed to produce a bottom 10% of underperformers. A great leader who recruits carefully, trains well, and sets clear expectations might build a team where the weakest performer would be a star somewhere else. Forcing the ranking anyway fires people not because they failed a standard but because the math requires someone at the bottom.
The variables that actually explain underperformance are the same ones the curve ignores: unclear job requirements, inadequate training, a talented person in the wrong role, compensation too low to attract strong candidates, or simply a manager who tolerates mediocrity and then blames the team for it. Removing the bottom 10% addresses none of those. It just produces a vacancy and restarts the cycle.
Conclusion.
Both Harvard’s grade cap and Welch’s vitality curve skip the hard diagnostic work in favor of a mechanical fix that feels rigorous. Mandating a distribution is easy. Defining what mastery requires, building courses or jobs demanding enough to test it, and developing the people who fall short — that’s the actual work.
When you force or rig the curve, you don’t raise performance. You just guarantee somebody loses.




