Soniox co-founders Ambroz Bizjak (L) and Klemen Simonic (R)
Klemen Simonic and Ambroz Bizjak met at the University of Ljubljana, Slovenia, during their undergraduate studies. After graduation, they pursued different paths: Simonic joined Facebook and worked on speech systems, while Bizjak was at Cosylab developing software for control systems in various industries.
After gaining experience in the corporate world, they reunited to found Soniox, a venture focused on understanding humans through audio AI technologies.
Soniox, based in the US, developed AudioMind, a foundational AI model that can deeply analyze audio content.
“Through conversations with customers, we identified a need for more than just speech-to-text conversion. Clients wanted features like sentiment detection, summarization, and audio event recognition, leading us to develop AudioMind as a versatile audio intelligence solution,” says Simonic.
Revolutionizing Audio Processing
According to Simonic, AudioMind sets itself apart by taking a comprehensive approach to audio processing. Unlike apps that only transcribe speech, AudioMind treats audio as the primary input and uses all the information available in the signal.
“Our solution goes beyond transcription. With AudioMind, users can specify how they want the audio content to be interpreted through different prompts,” he explains.
AudioMind supports various instructions for speech-to-text conversion, allowing users to customize the output based on their needs.
“AudioMind excels in speaker intelligence, accurately separating and identifying speakers in a conversation,” claims Simonic.
AudioMind also lets users generate speaker-separated transcriptions, summaries, and documents from their prompts.
Understanding Emotions and Tone
AudioMind can decipher tone, intonation, and emotional cues in human communication, providing a more holistic understanding of interactions.
For industries like customer service, recognizing customer tone can enhance service quality and satisfaction levels.
It also aids sentiment analysis and emotion recognition, benefiting fields like mental health care and public opinion analysis.
AudioMind filters out background noise so that it can focus on the meaningful information in the audio input, which improves accuracy.
Endless Possibilities
The founders see vast opportunities for AudioMind across sectors because audio is prevalent in so many applications.
From healthcare documentation to customer service interactions, AudioMind offers unique benefits that enhance user experiences.
The app is designed to interpret voices accurately in virtual assistants and voice-enabled devices, creating personalized experiences for users.
“AudioMind has been trained extensively to understand audio like humans do, recognizing nuances and context in audio environments,” explains Simonic.
The startup aims to expand language support beyond English to cater to a global audience, breaking down language barriers.
“Our goal is to make AudioMind accessible and beneficial to users worldwide, promoting seamless communication across cultures,” concludes Simonic.
The post AudioMind goes beyond speech recognition and discerns tone, gender, emotions appeared first on e27.