Hot Topics at Interspeech 2024: The Latest in Technology of Spoken Language Processing

Our team recently attended the 25th Interspeech Conference, held from September 1st to 5th on Kos Island, Greece. This year’s theme, "Speech and Beyond," highlighted new developments in speech technology, focusing on areas like healthcare diagnostics, virtual assistants, and even animal sound recognition. It was a great opportunity for experts worldwide to share their work and discuss the latest trends. Here are some of the key topics and insights we gathered from the event. A major topic at the

Published: October 1, 2024
# AI / ML
# Speech Processing
Voice Biometrics Recognition and Opportunities It Gives

Voice biometry is changing the way businesses operate by using distinctive features of a person's voice, like pitch and rhythm, to confirm their identity. This technology, a central part of Voice AI, turns these voice characteristics into digital "voiceprints" that are used for secure authentication. Unlike traditional methods such as fingerprint or facial recognition, voice biometry can be used remotely with just standard microphones, making it both practical and non-intrusive. This technology

Published: May 13, 2024
# AI / ML
# EdTech / LMS
# Speech Processing
How to Scale AI in Your Organization

According to WEKA's 2023 Global Trends in AI Report, 69% of organizations now have AI projects up and running, and 28% are using AI across their whole business. This shows a big move from just trying out AI to making it a key part of how companies operate and succeed. However, this is just the beginning: the point is not merely to have AI but to make it work to your benefit. Organizations have to address various challenges such as the collection of data, hiring the right skills, and fittin

Published: April 4, 2024
# AI / ML
# Computer Vision
# NLP
Top Computer Vision Opportunities and Challenges for 2024

Computer vision (CV) is a part of artificial intelligence that enables computers to analyze and understand visual information, both images and videos. It goes beyond simply “seeing” an image and teaches computers to make decisions based on what they see. The AI-driven computer vision market is experiencing rapid growth, rising from $22 billion in 2023 to an expected $50 billion by 2030, with a 21.4% CAGR from 2024 to 2030. This technology imitates human vision but works faster using sophisticate

Published: March 8, 2024
# Computer Vision
# AI / ML
Automatic Speech Recognition (ASR) Systems Compared

Automatic speech recognition (ASR) systems are becoming an increasingly important part of human-machine interaction. At the same time, they are still too expensive to develop from scratch. Companies need to choose between using a cloud API for an ASR system developed by tech giants or playing with open-source solutions. In this post, we compare eight of the most popular ASR systems to help you choose one that fits your project needs and team’s skills. We have conducted our tests to define the word err

Published: July 7, 2021
# Speech Processing
Face Detection Explained: State-of-the-Art Methods and Best Tools

So many of us have used different Facebook applications to see ourselves aged, turned into rock stars, or wearing festive make-up. Such waves of facial transformations are usually accompanied by warnings not to share images of your face; otherwise, they will be processed and misused. But how does AI use faces in reality? Let’s discuss state-of-the-art applications for face detection and recognition. First, detection and recognition are different tasks. _Face detection_ is the crucial part of face re

Published: June 17, 2021
# Computer Vision
# AI / ML
Brain-Computer Interfaces: Your Favorite Guide

At the beginning of April 2021, Neuralink’s new video featuring a monkey playing Pong with his mind hit the headlines. The company’s as-always-bold statements promise to give back the freedom of movement to people with disabilities. We decided to look beyond the hype and define what these brain-computer systems are capable of in reality. Let’s dive right into it. _Brain-computer interfaces (BCIs)_ or _Brain-machine interfaces (BMIs)_ capture a user’s brain activity and translate it into commands f

Published: May 20, 2021
# Computer Vision
# AI / ML
Memorability in Computer Vision

Among many things that define us as humans, there is our ability to remember things such as images in great detail, and sometimes after a single view. What is even more interesting, humans tend to remember and forget the same things, suggesting that there might be some general internal capability to encode and discard the same types of information. What makes certain images more memorable than others? Research suggests that pictures of people, salient actions and events are more memorable than n

Published: March 25, 2020
# Computer Vision
# AI / ML
Text-to-Speech Synthesis: an Overview

In my childhood, one of the funniest interactions with a computer was to make it read a fairy tale. You could copy a text into a window and soon listen to a colorless metallic voice stumble through commas and stops, weaving a weirdly accented story. At that time it was a miracle. Nowadays the goal of TTS, the Text-to-Speech conversion technology, is not to simply have machines talk, but to make them sound like humans of different ages and genders. In the long run, we’ll be able to listen to mac

Published: February 13, 2020
# Speech Processing
# AI / ML
Our Expectations from INTERSPEECH 2019

In less than a month, from Sep. 15–19, 2019, Graz, Austria will become home to INTERSPEECH, the world’s most prominent conference on spoken language processing. The conference unites science and technology under one roof and becomes a platform for over 2000 participants who will share their insights, listen to eminent speakers, and attend tutorials, challenges, exhibitions, and satellite events. What are our expectations of it as participants and presenters? Tanja Schultz, the spokesperson of

Published: August 29, 2019
# Speech Processing
# AI / ML