Speeding The Audio Search Process

Sidebar to: Search For Tomorrow

Thomas Claburn, Editor at Large, Enterprise Mobility

March 25, 2005

Nexidia Inc., a software company tackling the growing problem of analyzing sounds and speech, has a unique answer to making sense of the growing amount of audio data: speed.

The company has developed a "phonetic search engine" that processes phonemes, the basic units of speech that make up words. Because it works with phonemes instead of trying to analyze whole words, the company's technology can search audio recordings 50 times faster than their actual duration. Traditional speech-to-text systems can process files only four times faster than real time.

Speed is vital for intelligence applications such as monitoring audio traffic for enemy communications. "In Iraq, our military uses the term 'situational awareness,'" says retired Lt. Gen. Kenneth Minihan, a former director of the National Security Agency and a principal at Paladin Capital Group, which invests in Nexidia. "You always want to know what's going on around you." Nexidia's technology, he says, can pick up on unusual sounds, avoiding the time-consuming and error-prone process of converting speech into text. Speech-to-text systems, for example, can't easily distinguish context--such as whether the word "bomb" means an explosive or a bad movie.

The increase in cell-phone traffic and Internet phone calls helps drive demand, Minihan says. Companies that run call centers, or any business with substantial stores of audio data, could benefit from speed and accuracy improvements.

Analyzing voice-over-IP calls for signs of customer stress is another application, says Mark Finlay, Nexidia's development manager. Analysts, he says, want to "infer behaviors that aren't just related to what's being said, but how it's being said."

Illustration By Brian Stauffer

About the Author

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful master's degree in film production. He wrote the original treatment for 3DO's Killing Time, a short story that appeared in On Spec, and the screenplay for an independent film called The Hanged Man, which he would later direct. He's the author of a science fiction novel, Reflecting Fires, and a sadly neglected blog, Lot 49. His iPhone game, Blocfall, is available through the iTunes App Store. His wife is a talented jazz singer; he does not sing, which is for the best.
