Employee of the month.

Photographer: Qilai Shen/Bloomberg

Rating the Employee Review: Needs Improvement

Stephen Mihm, an associate professor of history at the University of Georgia, is a contributor to the Bloomberg View. Follow him on Twitter at @smihm.

Few rituals of the modern workplace evoke more dread than the annual performance review. Like a colonoscopy or root canal, it has been viewed as a necessary evil: deeply unpleasant but indispensable for the health of an organization.

The review methods themselves get evaluated, too, and, like so many employees, are sometimes found wanting. In the past year, Accenture, General Electric, Microsoft and Adobe have all instituted review systems that move away from traditional methods of assessment. Last week, Goldman Sachs and Morgan Stanley announced that they would abandon old-fashioned numerical rankings administered annually in favor of more qualitative assessments that use adjectives -- Outstanding, Good, Needs Improvement -- delivered in real time.

Sadly, the history of performance appraisals in the workplace suggests that this latest trend, however well-intentioned, is unlikely to be the last word. For close to a century, management gurus have tinkered endlessly with the hated performance review. The latest “reforms” look suspiciously like warmed-over versions of decades-old doctrines.

In the early 20th century, few companies assessed their employees’ performance in any formal way, and those that did relied on rudimentary methods. For example, a company might rank employees using a 10-point scale, with “1” being excellent and “10” poor. But this system lacked rigor and depended far too much on the subjectivity of the reviewer.

Enter Walter Dill Scott, director of the Bureau of Salesmanship Research in the Division of Applied Psychology at what was then the Carnegie Institute of Technology (now Carnegie Mellon University). In an attempt to figure out what traits helped men succeed in sales, he devised what became known as the “man-to-man” scale around 1915. In this system, a manager would select a trait such as “self-reliance” and then laboriously draw up a list of people he knew and rank them from best to worst for this attribute, with numbers assigned from 15 (the best) to 1 (the worst). Then the manager would try to determine where the employee being assessed belonged relative to the friends and acquaintances on that list.

The system's virtue was that it wasn’t abstract, but judged the employee against a group of people the rater knew well. But it was, one critic noted, “too time-consuming, cumbersome, and difficult to understand for the average rating executive.” And though it provided a yardstick against which managers could measure, every yardstick was unique. Scott’s method failed to catch on in corporate America, but he was a successful salesman, and he managed to persuade the U.S. Army to adopt the method during World War I.

After the war, Scott founded a company that specialized in devising new ways of measuring employees. One of the key players, Beardsley Ruml, devised what became known as the “Graphic Rating Scale.” Rather than ask managers to rate employees relative to outsiders, the new scale required that they rank them relative to other employees within the company -- but in a very unusual way.

A manager would be asked, for example, about an employee’s “initiative” -- the extent to which they were able to “make practical suggestions for doing things in a new way.” The manager would use a horizontal line that listed five adjectives, from “Very Original” on the far left to “Occasionally Suggests” in the center to “Needs Constant Supervision” on the far right. The evaluator could then put a check mark anywhere on the line. This gave managers discretion to split hairs and add shading to their evaluation. But when the human resources department collected the questionnaires, it superimposed a stencil over the paper that sorted each check mark into one of five quintiles.

This system gained acceptance in corporations, and with good reason: It was simple to use; it gave managers some wiggle room in assessing employee performance even as it assigned employees to one of five quintiles; and it ranked employees relative to other people within the organization.

But one thing was missing: how to help employees improve their performance from year to year. Harry Walker Hepner, author of "Psychology in Business" in 1930, observed that most ratings systems failed to give employees guidance about the qualities they should “develop or eliminate.” He lamented that employees “do not improve themselves, because the management does not tell them what to do to improve or how to do it.”

After World War II, the idea of giving “feedback” to employees caught on. It wasn’t enough to judge them; now they would be given the tools necessary to change their ways. An annual review was about the past, yes. But it also looked to the future. And it wasn’t just a once-a-year ritual; it was continuous.

During this same period, industrial psychologists tried to tackle the other flaws of the annual performance review. In the 1950s, concerns about rater bias prompted the oil company Esso to institute a forerunner of today’s 360-degree evaluation process. 

Ongoing frustration with definitions prompted the rise in the 1960s of Behaviorally Anchored Rating Scales, which framed questions in terms of very specific behavior: instead of asking whether an employee was efficient, for example, the rater would be asked to consider whether the employee completed 80 TPS Reports every month (or whatever other metric was deemed valuable). 

And in the 1980s, corporations instituted “vitality curves,” which required managers to rank employees along a set distribution: it was no longer possible for all employees to be above average. The most extreme version of this method was the “rank and yank” system that General Electric helped pioneer: employees at the wrong end of the bell curve would be fired at the end of the year.

All of these systems had drawbacks, which companies have tried to remedy. In the latest flurry of reforms, most companies are signaling a shift from purely quantitative rankings -- e.g., Goldman’s nine-point scale -- to qualitative, continuous rankings aimed at helping employees improve.

And for that, Goldman gets an “Outstanding” for effort. But originality? The words “Needs Improvement” come to mind.

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.
