Tinkoff Integrates Its Voice Assistant Into Clubhouse

  • Banking
  • 12.03.2021 12:10 pm

Tinkoff, one of the world’s largest and most profitable digital banks, announces today it has integrated its assistant Oleg into Clubhouse, making it the first voice assistant, speech recognition and synthesis solution available in this audio-chat social network.

Oleg will be a full-fledged user, helping room creators communicate and moderate discussions in Clubhouse by utilising his real-time text-to-speech and speech-to-text capabilities (Tinkoff VoiceKit). Tinkoff’s voice assistant will be able to enter rooms, transcribe speech as it is spoken, and stream the text to his “Oleg in the Clubhouse” Telegram channel. He can also moderate Clubhouse rooms, voice questions to speakers, and remind users about time limits, regulations, etc.

Oleg made his debut appearance in Clubhouse on 11 March, converting speech to text and streaming the results from the Tinkoff Investments room, where Tinkoff Group CEO Oliver Hughes and other senior executives held a conference call to discuss Tinkoff Group’s financial performance and record net profit in 2020.

Pavel Kalaidin, Director of Artificial Intelligence at Tinkoff, said: 

“Our voice assistant team is currently experimenting with various user scenarios in Clubhouse to determine how room creators or listeners can benefit from our technologies. We have already successfully tested Oleg’s ability to transcribe audio calls in real time and stream the transcripts to his Telegram channel. The feature was piloted in the Clubhouse room created to discuss Tinkoff’s 2020 financial results.

Oleg can also come in handy when listeners are unable to voice a question to speakers, for example when it is too noisy or when they do not want to interrupt. For such cases, we are designing an interface through which users can forward their questions to Oleg’s Telegram chat. Oleg will then voice the question with perfect pronunciation, keeping the user anonymous if necessary.

One of the challenges in group speech recognition is the summarisation of information. Interjections, fillers and incoherent speech make reading even a good transcript difficult. For that reason, we are looking into ways of processing the text and capturing the gist of what is said to create a shorter and more readable transcript.

We are open to working with Clubhouse communities to make our voice assistant a useful tool for content makers and listeners.”
