CommonVoice
What is Common Voice
Mozilla Common Voice is an initiative to help teach machines how real people speak.
This project is an effort to bridge the digital speech divide. Voice recognition technologies bring a human dimension to our devices, but developers need an enormous amount of voice data to build them. Currently, most of that data is expensive and proprietary.
We want to make voice data freely and publicly available, and make sure the data represents the diversity of real people. Together we can make voice recognition better for everyone.
You can contribute today on Common Voice.
How does Common Voice work?
We’re crowdsourcing an open-source dataset of voices. To start and support a language on Common Voice, the following steps are taken:
1. Requesting the new language and localising the Common Voice platform via Pontoon
2. Collecting and validating public domain sentences via the sentence collector, the sentence extractor or a CC0 text waiver agreement
3. Recording and validating the recordings of the sentences on the Common Voice platform
4. Repeating this process to grow the size of the dataset
5. Generating a dataset which is released by the Common Voice team
This dataset can then be used by developers to create voice-enabled technologies.
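As a rough illustration of how a developer might start working with such a dataset, the sketch below reads a small tab-separated listing of clips and their sentences. The column names (`path`, `sentence`, `up_votes`, `down_votes`), the sample rows and the acceptance rule are assumptions made for this example, not a description of the released dataset format or of the platform's validation logic.

```python
import csv
import io

# Hypothetical sample of a tab-separated clip listing. The column names
# and rows are assumptions for this sketch, not the official format.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_0001.mp3\tThe quick brown fox.\t2\t0\n"
    "clip_0002.mp3\tHello world.\t1\t2\n"
    "clip_0003.mp3\tGood morning.\t3\t1\n"
)


def load_clips(tsv_text):
    """Parse tab-separated text into a list of clip records."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        {
            "path": row["path"],
            "sentence": row["sentence"],
            "up_votes": int(row["up_votes"]),
            "down_votes": int(row["down_votes"]),
        }
        for row in reader
    ]


def accepted_clips(clips):
    """Keep clips with more up votes than down votes (illustrative rule only)."""
    return [c for c in clips if c["up_votes"] > c["down_votes"]]


clips = load_clips(SAMPLE_TSV)
pairs = [(c["path"], c["sentence"]) for c in accepted_clips(clips)]
print(pairs)
```

The resulting (audio path, sentence) pairs are the kind of input a speech recognition training pipeline would typically consume. With a real release, the in-memory sample would be replaced by the contents of the downloaded listing file.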
Common Voice Communities
Common Voice wouldn’t be possible without our language communities. As of September 2021, we have 80 languages launched for voice data collection.
Community Playbook
Language community members and organisers mobilise participation, provide valuable feedback and inspire us as a team. Our Community Playbook outlines how communities participate in Common Voice.
Communications Channels
We support our communities through two main channels: Discourse for group and topical discussions, and Matrix for community chats. Our communities also have their own communication channels to help with self-organising.
We share weekly updates from the Common Voice Team on Discourse, coordinated by Hillary, the Common Voice Community Manager.
Community Sessions and Council
As part of our community strategy, we seek to build on existing ways to support our language communities and to create new ones. So far, as part of this strategy, we have hosted Community Sessions on the Common Voice Roadmap, opened Discourse discussions on reward and recognition, and launched V1.2 of the Common Voice Community Playbook.
To further support communal voice, we would like to trial a Common Voice Reps council that gives the community even more say in important decisions. Learn more about the Common Voice Reps programme and how you can apply today on the Common Voice Discourse.
Materials & Assets to Use and Remix
Common Voice assets and presentations can be viewed on our shared drive.