Technology

Can you hear me now? AI-coustics to fight noisy audio with generative artificial intelligence

Noisy recordings of interviews and speeches are the bane of sound engineers. But one German startup hopes to solve this problem with a unique technical approach that uses generative artificial intelligence to improve the clarity of voices in video.

Today, AI-coustics emerged from stealth with €1.9 million in funding. According to co-founder and CEO Fabian Seipel, AI-coustics’ technology goes beyond standard noise cancellation and works with any device and speaker.

“Our core mission is to ensure that every digital interaction, whether on a conference call, a consumer device, or a regular video on social media, is as clear as a professional studio broadcast,” Seipel told TechCrunch in an interview.

Seipel, an audio engineer by training, founded AI-coustics in 2021 together with Corvin Jaedicke, a lecturer in machine learning at the Technical University of Berlin. Seipel and Jaedicke met while studying audio technology at TU Berlin, where they often encountered poor sound quality in the online courses and tutorials they had to take.

“We are driven by a personal mission to address the pervasive challenge of poor audio quality in digital communications,” said Seipel. “With my hearing somewhat impaired by music production in my early 20s, I have always struggled with online content and lectures, which led us to work primarily on speech quality and speech intelligibility.”

The market for software that uses artificial intelligence to suppress noise and enhance voices is already robust. AI-coustics’ rivals include Insoundz, which uses generative artificial intelligence to enhance streamed and pre-recorded speech clips, and Veed.io, a video editing suite with tools to remove background noise from clips.

But Seipel says AI-coustics has a unique approach to developing the AI mechanisms that do the actual noise reduction.

The startup uses a model trained on speech samples recorded at the startup’s studio in Berlin, AI-coustics’ hometown. People are paid to record samples – Seipel didn’t say how many – that are then added to the data set used to train an artificial intelligence noise reduction model.

“We have developed a unique approach to simulating audio artifacts and issues – e.g. noise, reverberation, compression, band-limited microphones, distortion, clipping, etc. – during the training process,” Seipel said.
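
To make the idea concrete, here is a minimal Python sketch of simulating degradations on clean speech during training. It is not AI-coustics’ code: the function name `degrade`, its parameters and their default values are assumptions made for illustration, a sine tone stands in for a studio recording, and only two of the artifacts Seipel lists (additive noise and clipping) are covered.

```python
import math
import random


def degrade(clean, snr_db=10.0, clip_level=0.5, seed=0):
    """Return a degraded copy of `clean`, a non-empty list of float samples.

    Hypothetical helper: adds Gaussian noise at the requested
    signal-to-noise ratio, then hard-clips the result.
    """
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in clean) / len(clean)
    # Noise power that yields the requested SNR in decibels.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    noisy = [x + rng.gauss(0.0, noise_std) for x in clean]
    # Hard clipping, as an overdriven microphone input would produce.
    return [max(-clip_level, min(clip_level, x)) for x in noisy]


# One second of a 440 Hz tone at 16 kHz stands in for a clean recording.
sample_rate = 16000
clean = [0.8 * math.sin(2 * math.pi * 440 * n / sample_rate)
         for n in range(sample_rate)]
degraded = degrade(clean)

# A model would be trained to map the degraded input back to the clean target.
training_pair = (degraded, clean)
```

Because the degradation is generated on the fly, one clean recording can yield many different training pairs by varying the noise level, the clipping threshold and the seed.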

I’d wager some people will take issue with AI-coustics’ one-time compensation system for creators, given that the model the startup is training could prove quite lucrative in the long term. (There is a healthy debate about whether the creators of training data for AI models deserve to be compensated for their contributions.) But perhaps the larger and more immediate problem is bias.

It is well known that speech recognition algorithms can make errors – errors that ultimately harm users. A study published in the Proceedings of the National Academy of Sciences found that speech recognition from leading firms was twice as likely to incorrectly transcribe audio from Black speakers as from white speakers.

To combat this, Seipel says AI-coustics focuses on recruiting “diverse” contributors of speech samples. He added: “Size and diversity are key to eliminating bias and ensuring the technology works across languages, speaker identities, ages, accents and genders.”

It wasn’t the most scientific test, but I submitted three video clips – an interview with an 18th-century farmer, a car driving demonstration and a protest related to the Israeli-Palestinian conflict – to the AI-coustics platform to see how well it handled each of them. AI-coustics did indeed deliver on its promise of boosting clarity; to my ears, the processed clips had significantly less ambient noise drowning out the speakers.

Here’s the 18th-century farmer clip before processing:


And after:

Seipel sees AI-coustics’ technology being used to enhance real-time and recorded speech, and maybe even being built into devices such as soundbars, smartphones and headphones to automatically increase voice clarity. Currently, AI-coustics offers an online application and API for audio and video post-processing, as well as an SDK that allows the AI-coustics platform to be integrated with existing workflows, applications and hardware.

Seipel says AI-coustics – which makes money through a mix of subscriptions, on-demand pricing and licensing – currently has five enterprise customers and 20,000 users (though not all of them are paying). The plan for the next few months includes expanding the company’s four-person team and refining its underlying speech enhancement model.

“Prior to our initial investment, AI-coustics was operating quite leanly and at a low burn rate to weather the headwinds in the VC investment market,” Seipel said. “AI-coustics now has a significant network of investors and mentors in Germany and the UK who provide advice. A strong technology base and the ability to serve different markets with the same database and core technology gives the company flexibility and the ability to make smaller pivots.”

When asked whether audio mastering technologies such as AI-coustics could steal jobs, as some experts fear, Seipel pointed to the potential of artificial intelligence to speed up time-consuming tasks that currently fall to audio engineers.

“A content creation studio or broadcast manager can save time and money by automating parts of the audio production process using artificial intelligence while maintaining the highest speech quality,” he said. “Speech quality and intelligibility continues to be a vexing issue for nearly every consumer or professional device, as well as when creating and consuming content. Any application that records, processes or transmits speech can potentially benefit from our technology.”

The financing came in the form of an equity and debt tranche from Connect Ventures, Inovia Capital, FOV Ventures and Ableton CFO Jan Bohl.

This article was originally published on techcrunch.com

Technology

Australian government withdraws disinformation law

The Australian government has withdrawn a bill that would have imposed penalties on online platforms of up to 5 percent of their global revenue if they failed to stop the spread of disinformation.

The bill, backed by the Labor government, would have enabled the Australian Communications and Media Authority to create enforceable rules on disinformation on digital platforms.

In a statement, Communications Minister Michelle Rowland said the bill would “provide an unprecedented level of transparency, holding big tech accountable for its systems and processes to prevent and minimise the spread of harmful misinformation and disinformation online.”

However, she said that “based on public statements and conversations with senators, it is clear that there is no way this proposal could be passed through the Senate.”

When a revised version of the bill was introduced in September, Elon Musk, the owner of X (formerly Twitter), criticized it in a one-word post: “Fascists.”

Shadow communications minister David Coleman was a vocal opponent of the bill, arguing it could encourage platforms to suppress free speech to avoid penalties. With the bill now seemingly dead, Coleman posted that it was a “shocking attack on free speech that betrayed our democracy” and called on the Prime Minister to “rule out any future version of this legislation”.

Meanwhile, Rowland in her statement called on Parliament to support “other proposals to strengthen democratic institutions and keep Australians safe online”, including laws to combat deepfakes, enforcement of “truth in political advertising during elections” and regulation of artificial intelligence.

Prime Minister Anthony Albanese is also moving forward with a plan to ban children under 16 from using social media.

This article was originally published on techcrunch.com

Technology

Department of Justice tells Google to sell Chrome

Welcome back to Week in Review. This week, we take a look at how the Department of Justice argued that Google should sell Chrome to break its monopoly, whether OpenAI accidentally deleted potential evidence in a copyright lawsuit filed by The New York Times, and how artificial intelligence companies are exploiting TikTok for research purposes. Let’s do it.

The U.S. Department of Justice argued that Google should sell off its Chrome browser to help break the company’s illegal monopoly on online search. U.S. District Court Judge Amit Mehta ruled in August that Google is an illegal monopoly for abusing its power in the search industry, and the Department of Justice’s latest filing says Google’s ownership of Android and Chrome poses a “significant challenge” to remedies aimed at establishing a competitive search engine market.

Anthropic raised a further $4 billion from Amazon and agreed to make Amazon Web Services the primary place where it trains its flagship generative artificial intelligence models. Anthropic is also working with Annapurna Labs, AWS’s chipmaking division, to develop future generations of Trainium accelerators, custom AWS chips for training artificial intelligence models. Amazon’s latest cash infusion brings the tech giant’s total investment in Anthropic to $8 billion.

OpenAI accidentally deleted potential evidence in The New York Times and Daily News’ copyright lawsuit, say lawyers for the publishers. As part of the lawsuit, OpenAI agreed to provide two virtual machines so that the publishers’ lawyers could search for copyrighted content in its AI training sets. However, in a letter, lawyers for the publishers claim that OpenAI engineers deleted all of the publishers’ search data stored on one of the virtual machines.



News

Image credits: Presley Ann/Getty Images and CFOTO/Future Publishing via Getty Images

Kim Kardashian meets Optimus: The fashion mogul got hands-on with Tesla’s bipedal humanoid robot. In videos posted to X, Kardashian encourages Optimus to make a heart with its hand, dance like it’s at a luau and play rock, paper, scissors. Read more

Oura’s valuation exceeds $5 billion: The smart ring maker has received a $75 million investment from glucose device maker Dexcom. The investment, which constitutes Oura’s Series D financing round, raises the company’s valuation to over $5 billion. Read more

Let’s throw a party for Partiful: The customizable event planning app challenges legacy solutions like Evite, Eventbrite and Facebook Events, is a favorite among Gen Z users, and was just named a top app of 2024 by Google. Read more

Talk to me in your language: Microsoft will soon allow Teams users to clone their voices so that they can talk to others in up to nine languages: English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese and Spanish. Read more

Hackers attack Andrew Tate: According to The Daily Dot, hackers breached an online course founded by the influencer and self-confessed misogynist, exposing data on nearly 800,000 users. Tate is currently under house arrest awaiting trial on sex trafficking and rape charges. Read more

What makes a bank a bank? The U.S. Consumer Financial Protection Bureau has ruled that all digital services that handle significant volumes of transactions should be subject to bank-style supervision, which could impact Apple Pay, Cash App, Google Pay, PayPal and Venmo. Read more

A more conversational Siri: According to sources cited by Bloomberg, Apple is developing a new version of Siri based on advanced large language models in an attempt to catch up with more natural-sounding competitors such as Google Gemini Live. Read more

Making Money With TikTok Brains: Several AI-powered research tools are taking advantage of the “PDF to Brainrot” trend, in which the text of an uploaded document is read in a monotone voice against a backdrop of “weirdly satisfying” vertical videos like Subway Surfers gameplay. Read more

Threads takes on Bluesky: As Bluesky’s user base surpasses 20 million, Instagram Threads has begun rolling out a new feature called custom feeds to capitalize on user demand for more personalization. Read more

ChatGPT in the classroom: OpenAI has released a free online course to help elementary and middle school teachers learn how to introduce ChatGPT into their classrooms. However, some educators are concerned about this technology and its potential for error. Read more

Do we need another daily word game? Normally I’m an evangelist for word games and crosswords, but I feel like we’re quickly approaching market saturation. Netflix has launched a new daily word puzzle game in partnership with TED called TED Tumblewords. Read more

Analysis

Image credits: Real444/Getty Images

Please don’t send X-ray images to the chatbot: People often turn to generative AI chatbots to ask questions about their health concerns and better understand their health. Since October, X users have been encouraged to upload their X-rays, MRIs and PET scans to the AI-powered chatbot, Grok, to help interpret the results. Medical data is a special category subject to federal protections that, for the most part, only you can choose to circumvent. But just because you can doesn’t mean you should. As Zack Whittaker writes, it’s worth remembering that what goes on the Internet never leaves it. Read more

This article was originally published on techcrunch.com

Technology

How a digital “you” can endure your torturous online conference calls

Now you can look like you are on a Zoom call in your office, even while you’re sipping a margarita in a hammock far, far away. Courtesy of a several-month-old startup called Pickle, the premise is simple: upload a five-minute training video of yourself to create an avatar, and 24 hours later you’re seemingly good to go. Need to take a call from your car? This can be your secret. Too lazy to get out of bed? No problem. At the beach club? You’re probably pushing it, although judging by the demo video, that is not the only problem that needs to be solved. (The service is currently available in Basic, Standard and Professional versions, with prices ranging from $300 to $1,150 per year.)

The technology, backed by Los Angeles-based Krew Capital, currently only works with macOS, Pickle says, but a Windows version is expected next month. As for the conferencing apps that customers can choose from, they include Zoom, Google Meet and Teams, according to Pickle. However, you will have to wait to use them. According to the website, “due to high demand, clone generation is currently delayed.”

This article was originally published on techcrunch.com