Behind the Sound: How AI Is Enhancing Audio Search

Sony AI

August 20, 2025

For sound designers, finding the right sound can be a challenge. Traditional methods rely on filenames and tags, which may not always accurately reflect how a sound is perceived. As researchers at Sony AI, we were curious: could we support creators by making this discovery process more intuitive?

That question sparked a research collaboration with Audiokinetic, resulting in Similar Sound Search (SSS), a new audio-to-audio and text-to-audio feature, available in the Wwise Beta as of August 19, 2025. Wwise, Audiokinetic's industry-standard audio middleware, gives game developers comprehensive tools for creating, managing, and implementing interactive audio, letting them design dynamic soundscapes that respond to gameplay events without extensive programming. Its user-friendly interface pairs powerful functionality for sound design, music implementation, mixing, and cross-platform optimization, making it the audio solution of choice for thousands of games across the industry.

Similar Sound Search was developed to accelerate the process of finding the right sound so that designers can spend more time shaping and refining it and less time sourcing it. Our goal wasn’t to replace creativity, but to help remove some of the friction that can interrupt it.

Let’s take a closer look at how Sony AI researchers collaborated with Audiokinetic to explore new ways of sound discovery.

Rethinking Search for Sound: A Purpose-Built Collaboration

The collaboration between Audiokinetic and Sony AI grew from shared questions about how AI could assist in sound discovery. Audiokinetic brought deep expertise in game audio tools, while our researchers contributed deep knowledge of AI for audio. Together, we explored how new models might enhance the way sound designers work.

Many search tools rely heavily on metadata, such as tags and descriptors added by others, which can limit how sounds are discovered. When those tags don't match your intent, the right sound might remain buried, even if it's perfect for the scene.

With Similar Sound Search, sound designers can go beyond descriptors, tapping into the sonic qualities of the audio itself. In early tests, the model revealed unexpected pairings, such as a crackling sound that resembled an animal call, illustrating how similarity in sonic texture can lead to creative opportunities.

Our team developed a model trained on licensed, high-quality sound libraries from BOOM and Pro Sound Effects. This allowed us to better represent professional workflows, with a focus on the types of sounds and descriptive vocabulary designers rely on. We also adapted the underlying architecture to better represent very short or very long sounds, which are common in effects libraries, ensuring the model could capture the detail and diversity professionals expect. More details about the collaboration with Pro Sound Effects can be found in their blog post at blog.prosoundeffects.com/why-sony-audiokinetic-chose-pse.

Two New Ways to Discover Sound

With Similar Sound Search, designers can now explore sound libraries through:

Text-to-Audio: Use a description to surface matches even if they're labeled differently. For example, a sound tagged 'smashing fruit' could surface for a query like 'footsteps in mud,' a match that traditional tag-based search would never make.

Audio-to-Audio: Upload a reference sound to find acoustically similar results. This lets you search by tone, rhythm, or texture, qualities that metadata can't capture; the model ranks results by how similar they sound, not by how they're labeled.
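Sony AI has not published the internals of Similar Sound Search, but both modalities follow the general pattern of embedding-based retrieval: a trained model maps audio (and, for text-to-audio, text queries) into a shared vector space, and the library is ranked by similarity to the query vector. The sketch below illustrates that retrieval step only, using toy NumPy vectors in place of real model embeddings; the function name and the example labels are hypothetical, not part of the actual product.

```python
import numpy as np

def top_k_similar(query_emb, library_embs, k=3):
    """Rank library sounds by cosine similarity to a query embedding.

    Illustrative sketch only: real systems would use embeddings
    produced by a trained audio (or text) encoder, not toy vectors.
    """
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    lib = library_embs / np.linalg.norm(library_embs, axis=1, keepdims=True)
    scores = lib @ q
    # Highest-scoring (most similar) items first.
    order = np.argsort(scores)[::-1][:k]
    return list(order), scores[order]

# Toy 4-dimensional "embeddings" standing in for model output.
library = np.array([
    [1.0, 0.0, 0.0, 0.0],   # hypothetical "insect buzz"
    [0.9, 0.1, 0.0, 0.0],   # hypothetical "chopper"
    [0.0, 0.0, 1.0, 0.0],   # hypothetical "footsteps in mud"
])
query = np.array([1.0, 0.05, 0.0, 0.0])  # acoustically close to the first two

idx, scores = top_k_similar(query, library, k=2)
```

Because ranking happens in embedding space rather than over tags, sounds with different labels but similar sonic texture (here, the "buzz" and "chopper" vectors) land next to each other in the results.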

These new methods have already revealed surprising connections, such as one internal test that surfaced a "chopper" sound from a library of insect buzzes.

As the research team explains, sounds are complex, and our perception of them shifts with the images they accompany. Being able to search with audio, whether recorded or mimicked, changes the results and frees sound designers from how a sound happens to be described or tagged.

Supporting Creative Exploration

Our goal was to support creative professionals by simplifying search so they can spend more time refining and experimenting with the sound itself: AI handles the tedious parts of the workflow, freeing time for creative decision-making. When sound itself becomes the search input, new ideas begin to emerge. Early testing suggests this approach can shorten search time and surface more relevant options, helping professionals stay focused on the creative process.

To make things easier for creators, Similar Sound Search is built directly into Wwise. No new tools, no new workflows, just seamless integration. This lets sound designers preview and experiment with results right where their audio lives. The tool was also designed with usability in mind—it runs efficiently, without requiring special hardware.

For us, this partnership reflects our broader research goals. Our team is focused on exploring ways AI can support professional creators. Unlike music or speech, sound effects have often been treated as secondary in AI research; by applying our work to this underserved area, we hope to make these workflows more intuitive and inspiring, supporting the nuanced, exploratory processes unique to sound designers.

Looking Ahead

We’re excited to see how the audio community engages with Similar Sound Search in the Wwise Beta. “Our mission is to unleash imagination and creativity with AI. We believe that AI can empower all kinds of creators and amplify creativity in new ways,” the researchers note. The team also shared that this beta release marks the first time the tool is in the hands of audio professionals. We look forward to seeing how the Wwise community receives it and to collecting feedback that will continue to improve the technology.

Explore the Beta

Similar Sound Search is now live in the Wwise Beta. Whether you’re designing ambiance, building game audio, or remixing soundscapes, Similar Sound Search is ready to change how you discover sound. Stay tuned for an upcoming livestream from Audiokinetic and a deep dive into the new feature at www.audiokinetic.com/blog.
