Bias in Data Sets Limits the Use of Artificial Intelligence in Human Resources

Work in artificial intelligence is advancing at record speed, with great leaps in neural networks and natural language processing. In the excitement of getting AI to work at all, key steps needed to ensure quality have been skipped. The most important of these, for AI and deep-learning systems that have any say over human outcomes, such as recruitment and hiring, is the problem of bias in data sets and the lack of quality assurance.

The machines make decisions based on the information we give them, and the data sets of human faces and voices do not reflect the actual population of the earth. The rich diversity of human skills, knowledge, and experience is matched by an equally rich diversity in how we look and sound. Deep neural networks find patterns and make predictions from the data we supply, but large segments of the human population are missing from that data. Those people are not merely misjudged by the resulting systems; they are not even recognized.
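
As a minimal sketch of how such a gap can be measured, the Python snippet below (using the pandas library) compares group proportions in a hypothetical face data set against a reference population. The records, group labels, and proportions are invented for illustration; a real audit would draw its reference figures from census or application-specific demographics.

```python
import pandas as pd

# Hypothetical metadata for a face data set: one row per image, with a
# self-reported demographic label attached to each record. The labels
# and proportions here are invented for illustration.
records = pd.DataFrame({
    "image_id": range(8),
    "group": ["A", "A", "A", "A", "A", "A", "B", "B"],
})

# Share of each group in the data versus a reference population
# (an assumed 50/50 split here, purely for the sake of the example).
observed = records["group"].value_counts(normalize=True)
reference = pd.Series({"A": 0.5, "B": 0.5})

# Positive gap = overrepresented, negative = underrepresented.
gap = observed.reindex(reference.index, fill_value=0.0) - reference
print(gap)
```

Run on the toy data above, this reports group A as overrepresented by 25 points and group B as underrepresented by the same margin, the kind of imbalance that teaches a model to recognize some faces far better than others.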

Several young scientists are working to ensure that meaningful quality assurance happens before these machines are put into general use. The Algorithmic Justice League, born from the experiences of a diverse group of MIT researchers, advocates for inclusion and diversity in data sets across industries. Joy Buolamwini, the League's founder, was recently awarded a $50,000 scholarship for her work recognizing and fighting bias in machine-learning data sets.

Unconscious bias is not new; humans have always recognized this flaw in our nature. We tend to feel most comfortable around people who look and think like we do, and when we are in a position to gather others around us, we tend to look for familiar faces. But we know this about ourselves, and we are always trying to do better. Our goals for diversity and inclusion in the workplace are not merely attempts to meet some arbitrary standard or to improve outcomes on paper. We want to do better and work better, and to reject any personal bias that sneaks in under our radar. We can carefully evaluate our thinking and actions in this way because we are self-aware.

At this point, the machines are not self-aware; they can only work with the information we give them. In the mad rush to deploy our new technology, we have not developed adequate cross-checking and quality assurance. The data sets we give AIs to find patterns and make predictions must reflect the entirety of human diversity.
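
One concrete form that cross-checking can take is disaggregated evaluation: scoring a model separately for each demographic group rather than reporting a single overall number. The sketch below, again in Python with pandas and entirely invented data, illustrates the idea.

```python
import pandas as pd

# Hypothetical evaluation results: whether the model's prediction was
# correct for each test example, with a demographic group label. All
# values are invented for illustration.
results = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B"],
    "correct": [1, 1, 1, 1, 0, 0],
})

# A single overall accuracy (0.67 here) can hide a system that works
# for one population and fails for another.
print(results["correct"].mean())

# Accuracy disaggregated by group exposes the disparity:
# group A scores 1.00 while group B scores 0.33.
print(results.groupby("group")["correct"].mean())
```

A check like this costs almost nothing to run, yet it surfaces exactly the failures that an aggregate benchmark score conceals.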