Real-Time Estimation of Propagation Delays for Temporal Alignment of Audio Signals in Augmented Reality Applications
In augmented reality audio applications, environmental sounds are superimposed with supplementary audio content to create auditory enhancements for the listener in a broad range of use cases. In some use cases, environmental sounds and supplementary content may be highly correlated, for example in audience services at live events, where live playback through PA speakers is enhanced by augmented reality audio content, e.g. to create an individualized live mix. Without temporal alignment of those signals, their superposition causes comb filtering effects or confusing echoes. This contribution proposes an efficient method that robustly detects the temporal offset between correlated audio signals. It is based on a recursive cross-correlation estimation and a peak detection algorithm. The method focuses on indoor music and speech events and the problems that typically occur there, such as room reflections, crosstalk, tonal components and a large number of correlation lags. The obtained temporal offset is used to delay the supplementary audio content so that the signals are temporally aligned.
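The principle of recursive cross-correlation estimation followed by peak detection can be illustrated with a minimal sketch. It is not the method of this contribution, whose details are not given here. It assumes a first-order exponential smoother with a hypothetical forgetting factor `alpha` and a plain maximum search over the lags in place of the robust peak detector. The function names, the synthetic test signal and all parameter values are illustrative assumptions.

```python
import random


def estimate_delay(reference, delayed, max_lag, alpha=0.99):
    """Estimate the lag (in samples) of `delayed` relative to `reference`.

    For every lag k, the cross-correlation is updated sample by sample:
        r[k] <- alpha * r[k] + (1 - alpha) * delayed[n] * reference[n - k]
    The lag with the largest magnitude is returned (simple argmax,
    assumed here in place of a more robust peak detector).
    """
    r = [0.0] * (max_lag + 1)
    for n in range(max_lag, len(delayed)):
        x = delayed[n]
        for k in range(max_lag + 1):
            r[k] = alpha * r[k] + (1.0 - alpha) * x * reference[n - k]
    peak_lag = max(range(max_lag + 1), key=lambda k: abs(r[k]))
    return peak_lag, r


def delay_signal(signal, lag):
    """Delay `signal` by `lag` samples, keeping the original length."""
    return [0.0] * lag + signal[:len(signal) - lag]


def make_test_signals(num_samples=2000, true_delay=37, seed=0):
    """Synthetic example: white noise as supplementary content, and an
    'environmental' copy with a direct path, one weaker reflection
    20 samples later, and a small amount of uncorrelated noise."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    echo_lag = true_delay + 20
    delayed = []
    for n in range(num_samples):
        direct = 0.6 * reference[n - true_delay] if n >= true_delay else 0.0
        echo = 0.3 * reference[n - echo_lag] if n >= echo_lag else 0.0
        delayed.append(direct + echo + 0.05 * rng.uniform(-1.0, 1.0))
    return reference, delayed


if __name__ == "__main__":
    supplementary, environmental = make_test_signals()
    lag, _ = estimate_delay(supplementary, environmental, max_lag=100)
    aligned = delay_signal(supplementary, lag)
    print("estimated offset in samples:", lag)
```

The recursive update needs only one multiply-accumulate per lag and sample and no block buffering, which is one plausible reason to prefer it in a real-time setting. The cost still grows linearly with the number of lags, and a strongly tonal signal would produce periodic correlation peaks that a plain maximum search cannot disambiguate.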