Artificial intelligence has been at the forefront of technological advancement for a long time now. Today, AI can be found almost everywhere, from phones to smart speakers. The technology is largely thought to be in the hands of a few big names, such as Google, Apple, and Facebook. For a company like Mozilla, jumping into the AI game can be difficult, but the company has found a way to collect the data it needs to get started: by giving the power to the public.
Mozilla is exploring an alternative way of gathering data by asking users to pool information to power an open-source AI initiative, The Verge reports. One of the projects, Common Voice, lets users donate voice samples to help create an open-source voice recognition system that works along the lines of Siri and Alexa. Companies like Google and Facebook have the benefit of data collected through their own products to help them build AI. For Mozilla, this initiative is a starting point for collecting the data it needs to build its own AI software.
“Currently, the power to control speech recognition could end up in just a few hands, and we didn’t want to see that,” Sean White, vice president of emerging technology at Mozilla, tells The Verge. Big companies have the benefit of filtering the data that comes in through their products, but companies like Mozilla need other methods. “The interesting question for us, is, can we do it so the people who are creating the data also benefit?”
Mozilla plans to open source its voice-recognition system by the end of the year. Anyone can head over to Common Voice and donate their voice by reading out some sample sentences. The company explains that volunteers who also supply personal details such as age, location, gender, and accent will help the software reduce its error rate in recognising various accents, a problem that the likes of Siri face even today.
Mozilla points to personalisation as a key factor that sets its project apart from existing services. The company explains that traditional AI systems work on collective datasets that don’t necessarily identify the individual, smaller groups within them. This can skew AI-based voice recognition software towards the popular, majority voices.
“For us to be successful with data commons, there has to be a motivation [for users] other than realising one day that they’ve been giving away all their personal data,” says White. “We have to make their experience better because they’ve participated.” White says that the focus here is on collecting as many accents as possible so that the system is spot on. However, this may take Mozilla longer, whereas bigger companies like Google have the financial power to bring in third-party companies to solve problems such as gathering the right accents.
White says that Mozilla’s initiative and vision are more along the lines of other open-source AI organisations like OpenAI, Healthcare.ai, and Comma.ai. But Mozilla understands that relying on open-source data alone may not help it take on the giants. Eventually, it may need help from third parties for some input. Chris Nicholson, CEO of deep learning company Skymind, says, “We may need third parties to step in – NGOs, governments, coalitions of smaller private firms – and pool their data.”