Some parts of machine learning are incredibly esoteric and hard to
grasp, surprising even seasoned computer science pros; other parts of it
are just the same problems that programmers have contended with since
the earliest days of computation. The problem Amazon had with its
machine-learning-based system for screening job applicants was the
latter.
Amazon understood that it had a discriminatory hiring process: the
unconscious biases of its technical leads resulted in the company
passing on qualified women applicants. This isn’t just unfair; it’s also
a major business risk, because qualified developers are one of the scarcest resources a modern business has.
So Amazon trained a machine-learning system to evaluate incoming resumes,
hoping it would overcome the biases of the existing hiring process.
Of course, the company trained it on the resumes of its existing stable
of successful job applicants – that is, the predominantly male
workforce that had been hired under the very discriminatory system it hoped
to correct.
The computer science aphorism for this is “garbage in, garbage
out,” or GIGO. It’s pretty self-explanatory, but just in case: GIGO is
the phenomenon in which bad data fed through a good system produces bad
conclusions.
Amazon built the system in 2014 and scrapped it in 2017, after
concluding that it was unsalvageable – sources told Reuters that it
rejected applicants from all-women’s colleges and downranked resumes
that included the word “women’s,” as in “women’s chess club captain.”
Amazon says it never relied on the system.
There is a “machine learning is hard” angle to this: while the flawed
outcomes from the flawed training data were totally predictable, the
system’s self-generated discriminatory criteria were surprising and
unpredictable. No one told it to downrank resumes containing “women’s”
– it arrived at that conclusion on its own, by noticing that this was a
word that rarely appeared on the resumes of previous Amazon hires.
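To make that mechanism concrete, here is a minimal, purely hypothetical sketch (not Amazon’s system, whose internals were never published): a toy bag-of-words classifier, trained on made-up resume snippets with biased hire/reject labels, that ends up assigning a negative weight to the token “women’s” even though nobody wrote that rule.

```python
# Purely hypothetical illustration: a toy resume classifier trained on
# fabricated, biased hire/reject labels. Not Amazon's system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up "training data": past outcomes from a biased hiring process.
resumes = [
    "python developer chess club captain",            # hired
    "java developer robotics team lead",              # hired
    "c++ developer hackathon winner",                 # hired
    "python developer women's chess club captain",    # passed over
    "java developer women's coding society founder",  # passed over
    "python developer women's hackathon winner",      # passed over
]
hired = [1, 1, 1, 0, 0, 0]

# Bag-of-words features; keep apostrophes so "women's" stays one token.
vectorizer = CountVectorizer(token_pattern=r"[a-z']+")
X = vectorizer.fit_transform(resumes)

model = LogisticRegression()
model.fit(X, hired)

# Print each token's learned weight. No rule about "women's" was written
# anywhere; its negative coefficient falls out of the biased labels alone.
for token, weight in zip(vectorizer.get_feature_names_out(), model.coef_[0]):
    print(f"{token:12s} {weight:+.2f}")
```

With a tiny made-up dataset the numbers themselves mean nothing, but the pattern is the point: any token that happens to correlate with the biased labels becomes a scoring criterion, whether or not anyone intended it to.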