Artificial intelligence has given us algorithms capable of recognizing faces, diagnosing disease, and of course, crushing computer games. But even the smartest algorithms can sometimes behave in unexpected and unwanted ways, for example picking up gender bias from the text or images they are fed.
A new framework for building AI programs suggests a way to prevent aberrant machine-learning behavior by letting developers specify guardrails in the code from the outset. The approach aims to be especially useful for non-experts deploying AI, an increasingly common situation as the technology moves out of research labs and into the real world.
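The idea of guardrails specified up front can be sketched roughly as follows: the developer states a behavioral constraint before training, and the algorithm refuses to return a model unless a statistical safety test on held-out data passes with high confidence. This is a minimal illustrative sketch only; the function names, the Hoeffding-based test, and the interface are assumptions for illustration, not the framework's actual API.

```python
# Illustrative sketch of a "guardrails first" training loop: the caller
# supplies a constraint, and the trainer returns None ("no solution found")
# rather than a model that might violate it. Names here are hypothetical.
import numpy as np

def hoeffding_upper_bound(samples, delta):
    """One-sided (1 - delta)-confidence upper bound on the mean of
    samples bounded in [0, 1], via Hoeffding's inequality."""
    n = len(samples)
    return np.mean(samples) + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def train_with_guardrail(train_fn, constraint_fn, candidate_data,
                         safety_data, epsilon=0.1, delta=0.05):
    """Train a model, then accept it only if the constraint's upper
    confidence bound on held-out safety data stays below epsilon.

    constraint_fn(model, data) must return per-example violation
    scores in [0, 1]; epsilon is the tolerated violation level.
    """
    model = train_fn(candidate_data)
    violations = constraint_fn(model, safety_data)
    if hoeffding_upper_bound(violations, delta) <= epsilon:
        return model
    # Refuse to deploy rather than risk violating the constraint.
    return None
```

The key design choice mirrored here is that failure is an acceptable outcome: if the safety test cannot certify the constraint, the algorithm declines to return any model at all, instead of silently shipping one that may misbehave.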
The approach is one of several proposed in recent years for curbing the worst tendencies of AI programs. Such safeguards could prove vital as AI is used in more critical situations, and as people become suspicious of AI systems that perpetuate bias or cause accidents.
Last week Apple was rocked by claims that the algorithm behind its credit card offers much lower credit limits to women than to men of the same financial means. Apple was unable to prove that the algorithm had not inadvertently picked up some form of bias from its training data. Just the idea that the Apple Card might be biased was enough to turn customers against it.
Similar backlashes could derail adoption of AI in areas like healthcare, education, and government. “People are looking at how AI systems are being deployed and they’re seeing they are not always being fair or safe,” says Emma Brunskill, an assistant professor at Stanford, and one of the researchers behind the new approach. “We’re worried right now that people may lose faith in some forms of AI and therefore the potential benefits of AI might not be realized.”