E.B. White said that bias was impossible to avoid. He is perhaps best known nowadays as the author of children’s books, including ‘Stuart Little’ and ‘Charlotte’s Web’, but he was also a regular contributor to ‘The New Yorker’ magazine and the co-author of one of America’s most influential writing style guides, known to generations of high school and college students. White claimed there was no such thing as objectivity: “I have yet to see a piece of writing, political or non-political, that does not have a slant,” he said. “All writing slants the way a writer leans, and no man is born perpendicular.”
Whether or not White was right about writers, human bias is certainly a fact of life in machine learning. In data science, the term usually refers to a deviation from expectation, or an error in the data, but there is more to bias than that. We are all conditioned by our environments and experiences (“no man is born perpendicular”) and carry with us different kinds of social, political or values-based baggage. Sometimes our horizons are not as broad as we would like to think and, as a result, the vast volumes of data used to train algorithms are not always sufficiently diverse. More often than not, there is actual human bias in the data or in the algorithms, which simply look for patterns in the data we feed them: garbage in, garbage out.