Machine learning models are increasingly used to inform high-stakes decisions. Discrimination by machine learning becomes objectionable when it places certain privileged groups at systematic advantage and certain unprivileged groups at systematic disadvantage. Bias in training data, due to prejudice in labels and under- or oversampling, yields models with unwanted bias.
Rachel Bellamy, Kush Varshney, Karthikeyan Natesan Ramamurthy, and Michael Hind explain how to use and contribute to AI Fairness 360—a comprehensive Python toolkit that provides metrics to check for unwanted bias in datasets and machine learning models and state-of-the-art algorithms to mitigate such bias. AI Fairness 360 (AIF360) contains 73 fairness metrics and 10 bias mitigation algorithms developed by the broader algorithmic fairness research community. They can all be called in a standard way, very similar to scikit-learn’s fit/predict paradigm.
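To make the "check, then mitigate" workflow concrete, here is a minimal standard-library sketch. It does not use AIF360's actual classes. The names `statistical_parity_difference` and `Reweigher`, and the toy data, are hypothetical and chosen only for illustration. It assumes a binary protected attribute (1 = privileged, 0 = unprivileged) and a binary label (1 = favorable), and it uses reweighing, a preprocessing technique from the fairness literature, behind a fit/transform interface in the spirit of scikit-learn.

```python
from collections import Counter


def statistical_parity_difference(groups, labels, weights=None):
    """Weighted favorable rate of the unprivileged group (0) minus that of
    the privileged group (1). A value of 0 means parity."""
    if weights is None:
        weights = [1.0] * len(labels)

    def favorable_rate(group):
        total = sum(w for g, w in zip(groups, weights) if g == group)
        favorable = sum(
            w for g, y, w in zip(groups, labels, weights)
            if g == group and y == 1
        )
        return favorable / total

    return favorable_rate(0) - favorable_rate(1)


class Reweigher:
    """Hypothetical helper: learns one weight per (group, label) cell so that
    group and label look statistically independent in the weighted data."""

    def fit(self, groups, labels):
        n = len(labels)
        group_counts = Counter(groups)
        label_counts = Counter(labels)
        joint_counts = Counter(zip(groups, labels))
        # weight = P(group) * P(label) / P(group, label)
        self.weights_ = {
            (g, y): group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
            for (g, y) in joint_counts
        }
        return self

    def transform(self, groups, labels):
        # Return one instance weight per training example.
        return [self.weights_[(g, y)] for g, y in zip(groups, labels)]


# Toy data (made up): the privileged group gets the favorable label more often.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

before = statistical_parity_difference(groups, labels)
weights = Reweigher().fit(groups, labels).transform(groups, labels)
after = statistical_parity_difference(groups, labels, weights)

print(f"before reweighing: {before:+.2f}")
print(f"after reweighing:  {after:+.2f}")
```

The weights would then be passed to any learner that accepts per-instance weights. The toolkit itself covers many more metrics and algorithms than this single pairing.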
By capturing existing metrics and mitigation algorithms created by the research community in one extensible toolkit, AIF360 makes it easier for all practitioners interested in AI fairness to work together to improve and apply technical approaches in the future. Compared to existing open source efforts on AI fairness, AIF360 takes a step forward in that it focuses on bias mitigation (as well as bias checking), industrial usability, and software engineering. By integrating these three aspects, AIF360 aims to bring together researchers with an interest in AI fairness and to help translate their collective research results into tools for practicing data scientists, data engineers, and developers deploying solutions in a variety of industries.
Rachel Bellamy is a principal research scientist and manages the Human-AI Collaboration Group at the IBM T. J. Watson Research Center, where she leads an interdisciplinary team of human-computer interaction experts, user experience designers, and user experience engineers. Previously, she worked in the Advanced Technology Group at Apple, where she conducted research on collaborative learning and led an interdisciplinary team that worked with the San Francisco Exploratorium and schools to pioneer the design, implementation, and use of media-rich collaborative learning experiences for K–12 students. She holds many patents and has published more than 70 research papers. Rachel holds a PhD in cognitive psychology from the University of Cambridge and a BS in psychology with mathematics and computer science from the University of London.
Kush R. Varshney is a research staff member and manager at IBM Research AI at the T. J. Watson Research Center, where he leads the Learning and Decision Making Group. He’s the founding codirector of the IBM Science for Social Good initiative. His research applies data science and predictive analytics to human capital management, healthcare, olfaction, computational creativity, public affairs, international development, and algorithmic fairness, which has led to recognitions such as the 2013 Gerstner Award for Client Excellence for contributions to the WellPoint team and the Extraordinary IBM Research Technical Accomplishment for contributions to workforce innovation and enterprise transformation. He also conducts academic research on the theory and methods of statistical signal processing and machine learning. His work has been recognized through best paper awards at the Fusion 2009, SOLI 2013, KDD 2014, and SDM 2015 conferences. He holds a PhD and SM in electrical engineering and computer science from MIT, where he was a National Science Foundation Graduate Research Fellow, and a BS (magna cum laude) in electrical and computer engineering with honors from Cornell University.
Karthikeyan Natesan Ramamurthy is a research staff member in IBM Research AI at the T. J. Watson Research Center. His broad research interests include understanding the geometry and topology of high-dimensional data and developing theory and methods for efficiently modeling the data. He has also been intrigued by the interplay between humans, machines, and data and the societal implications of machine learning. His papers have won best paper awards at the 2015 IEEE International Conference on Data Science and Advanced Analytics and the 2015 SIAM International Conference on Data Mining. He’s an associate editor of Digital Signal Processing and a member of the IEEE. He holds a PhD in electrical engineering from Arizona State University.
Michael Hind is a distinguished research staff member in the IBM Research AI organization. His current research passion is in the general area of trusted AI, focusing on the fairness, explainability, and reliability of the construction of AI systems. Previously, he led departments of dozens of researchers focusing on programming languages, software engineering, cloud computing, and tools for cognitive systems. Michael’s team has successfully transferred technology to various parts of IBM and launched several successful open source projects. Earlier, Michael spent seven years as an assistant/associate professor of computer science at SUNY New Paltz. Michael is an ACM Distinguished Scientist, a member of IBM’s Academy of Technology, and a former associate editor of ACM TACO. He has served on over 30 program committees, given talks at top universities and conferences, and coauthored over 40 publications. His 2000 paper on adaptive optimization was recognized as the OOPSLA’00 Most Influential Paper, and his work on Jikes RVM was recognized with the SIGPLAN Software Award in 2012. He holds a PhD from NYU.