Stay informed and inspired in the world of AI with us.
Our team recently attended the 25th Interspeech Conference, held from September 1st to 5th on Kos Island, Greece. This year’s theme, "Speech and Beyond," highlighted new developments in speech technology, focusing on areas like healthcare diagnostics, virtual assistants, and even animal sound recognition. It was a great opportunity for experts worldwide to share their work and discuss the latest trends. Here are some of the key topics and insights we gathered from the event. A major topic at the
Published: October 1, 2024
Voice biometry is changing the way businesses operate by using distinctive features of a person's voice, like pitch and rhythm, to confirm their identity. This technology, a central part of Voice AI, turns these voice characteristics into digital "voiceprints" that are used for secure authentication. Unlike traditional methods such as fingerprint or facial recognition, voice biometry can be used remotely with just standard microphones, making it both practical and non-intrusive. This technology
Published: May 13, 2024
According to WEKA's 2023 Global Trends in AI Report, 69% of organizations now have AI projects up and running, and 28% are using AI across their whole business. This shows a big move from just trying out AI to making it a key part of how companies operate and succeed. However, this is just the beginning: the point is not merely to have AI but to make it work to your benefit. Organizations have to address various challenges such as the collection of data, hiring the right skills, and fittin
Published: April 4, 2024
Computer vision (CV) is a part of artificial intelligence that enables computers to analyze and understand visual information, both images and videos. It goes beyond plain “seeing” an image and teaches computers to make decisions based on what they see. The AI-driven computer vision market is experiencing rapid growth, rising from $22 billion in 2023 to an expected $50 billion by 2030, with a 21.4% CAGR from 2024 to 2030. This technology imitates human vision but works faster using sophisticate
Published: March 8, 2024
Automatic speech recognition (ASR) systems are becoming an increasingly important part of human-machine interaction. At the same time, they are still too expensive to develop from scratch. Companies need to choose between using a cloud API for an ASR system developed by tech giants and experimenting with open-source solutions. In this post, we compare eight of the most popular ASR systems to make the choice easier for your project needs and your team’s skills. We have conducted our tests to define the word err
Published: July 7, 2021
So many of us have used different Facebook applications to see ourselves aged, turned into rock stars, or wearing festive make-up. Such waves of facial transformations are usually accompanied by warnings not to share images of your face; otherwise, they will be processed and misused. But how does AI use faces in reality? Let’s discuss state-of-the-art applications for face detection and recognition. First, detection and recognition are different tasks. _Face detection_ is the crucial part of face re
Published: June 17, 2021
At the beginning of April 2021, Neuralink’s new video featuring a monkey playing Pong with his mind hit the headlines. The company’s as-always-bold statements promise to give back the freedom of movement to people with disabilities. We decided to look beyond the hype and define what these brain-computer systems are capable of in reality. Let’s dive right into it. Brain-computer interfaces (BCIs) or brain-machine interfaces (BMIs) capture a user’s brain activity and translate it into commands f
Published: May 20, 2021
Among the many things that define us as humans is our ability to remember things such as images in great detail, sometimes after a single view. What is even more interesting, humans tend to remember and forget the same things, suggesting that there might be some general internal capability to encode and discard the same types of information. What makes certain images more memorable than others? Research suggests that pictures of people, salient actions and events are more memorable than n
Published: March 25, 2020
In my childhood, one of the funniest interactions with a computer was to make it read a fairy tale. You could copy a text into a window and soon listen to a colorless metallic voice stumble through commas and stop weaving a weirdly accented story. At the time, it was a miracle. Nowadays the goal of TTS (text-to-speech conversion technology) is not simply to have machines talk, but to make them sound like humans of different ages and genders. In the future, we’ll be able to listen to mac
Published: February 13, 2020
In less than a month, from Sep. 15–19, 2019, Graz, Austria will become home to INTERSPEECH, the world’s most prominent conference on spoken language processing. The conference unites science and technology under one roof and becomes a platform for over 2000 participants who will share their insights, listen to eminent speakers, and attend tutorials, challenges, exhibitions, and satellite events. What are our expectations of it as participants and presenters? Tanja Schultz, the spokesperson of
Published: August 29, 2019