Mitigating Biases in Crowdsourcing Data Collection
Date: 10th November 2021, 16:00 - 17:00
Data has become the secret sauce for the rapid progress of artificial intelligence (AI). Over the past decade, crowdsourcing has become a prevalent paradigm for obtaining data from people to enhance machine intelligence. However, there is a growing line of literature showing that data collected from crowdsourcing efforts could have significant biases, which not only decrease the quality of the data, but may even negatively impact the downstream algorithmic models built on these data. Many factors could contribute to the biases in crowdsourced data, including the composition of the dataset on which annotations are solicited from the crowd (e.g., sampling bias of the dataset), and the cognitive and behavioral limitations that the crowd is subject to when providing annotations (e.g., cognitive bias, affective bias, social bias). In this talk, I'll present some of our recent efforts in mitigating biases throughout the crowdsourcing data collection lifecycle. In particular, I'll discuss how the wisdom of the crowd can be leveraged to detect sampling bias of the dataset before the data is collected, whether and how adding real-time worker interactions during data collection can help mitigate biases in the crowdwork, and how to account for worker biases after the data is collected to reduce data biases through post-hoc data aggregation.
The Future of A.I. for Social Good
Date: 11th November 2021, 15:30 - 16:30
The A.I. industry has powered a futuristic reality of self-driving cars and voice assistants to help us with almost any need. However, the A.I. industry has also created systematic challenges. For instance, while it has led to platforms where workers label data to improve machine learning algorithms, my research has uncovered that these workers earn less than minimum wage. We are also seeing a surge of A.I. algorithms that privilege certain populations and racially exclude others. If we were able to fix these challenges, we could create greater societal justice and enable A.I. that better addresses people’s needs, especially those of groups we have traditionally excluded.
In this talk, I will discuss some of the urgent global problems that my research has uncovered in the A.I. industry. I will present how we can start to address these problems through my proposed "A.I. For Good" framework, which uses value-sensitive design to understand people's values and rectify harm. I will present case studies where I use this framework to design A.I. systems that improve the labor conditions of the workers operating behind the scenes in the A.I. industry, and I will show how the same framework can be used to safeguard our democracies. I will conclude by presenting a research agenda for studying the impact of A.I. on society and for researching effective socio-technical solutions that support the future of work and counter techno-authoritarianism.
On the Foresight and Measurement of Computational Harms
Date: 12th November 2021, 15:45 - 16:45
Abstract: There is a rich and long-standing literature on detecting and mitigating a wide range of biased, objectionable, or deviant content and behaviors, including hateful and offensive speech, misinformation, and discrimination. There is also a growing literature on fairness, accountability, and transparency in computational systems that is concerned with how such systems may inadvertently engender, reinforce, and amplify such behaviors. While many systems have become increasingly proficient at identifying clear cases of objectionable content and behaviors, many challenges persist.
While existing efforts tend to focus on issues that we know to look for, techniques for preempting future issues that may not yet be on the radar of product teams or the research community are not nearly as well developed or understood. Addressing this gap requires deep dives into specific application areas. In addition, existing approaches to measuring computational harms often embed many unnamed assumptions, with poorly understood implications for the fairness and inclusiveness of a system. I will ground our discussion in some of our recent research examining how language technologies are being evaluated and how we could do better.