About the project
Building a secure pipeline for sharing diverse voices and improving the representation of diverse speech patterns
The Speech Accessibility Project will collect speech samples from individuals representing a diversity of speech patterns. UIUC researchers will recruit paid volunteers to contribute recorded voice samples and will create a private, de-identified dataset that can be used to train machine learning models to better understand a variety of speech patterns. The project will focus first on American English.
Artificial intelligence and machine learning power the speech recognition behind tools such as voice assistants and translation services, which let people operate technology with their voices. Without diverse, representative data, machine learning models cannot learn to understand a diversity of speech. This project aims to change that by creating the dataset needed to train these models more effectively.
Rather than pursuing separate, duplicative initiatives, the participating companies and research teams will collaborate on this project to gather a set of high-quality, representative speech samples. These samples will help accelerate the technologies that support people with diverse speech patterns.
Frequently asked questions
Find answers to common questions about the Speech Accessibility Project.