|
Abstract |
It is standard practice in virtual reality applications to synthesize binaural audio from a discrete set of direction-dependent head-related impulse responses (HRIRs). This set of HRIRs is often time-aligned in a pre-processing step to allow for high-fidelity interpolation between HRIRs corresponding to neighbouring directions. The fidelity of this interpolation depends on the similarity of neighbouring aligned HRIRs. Because similarity is a pairwise property, it is a difficult criterion to optimize globally, and consequently one often resorts to alignment methods based on a specific feature that can be extracted from each HRIR separately, e.g., the first onset of the peak or the group delay. However, such proxies for similarity are very sensitive to noise and therefore require a high signal-to-noise ratio, which makes them less suitable for processing HRIRs acquired outside an anechoic room. In this paper, we propose a novel alignment method that maximizes the similarity between neighbouring aligned HRIRs, defined as the correlation between the full-length HRIRs, for all directions at once. We show that this correlation-based alignment procedure outperforms first-onset alignment with regard to the fidelity of the spherical harmonics representation of both the spectral and the interaural time difference (ITD) information, when tested on the KEMAR HRIR and six human HRIRs. Finally, we show that the correlation-based alignment is more robust to noise. |
|
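To illustrate the distinction drawn above between a per-HRIR feature and a pairwise similarity measure, the following minimal Python sketch estimates the relative delay between two impulse responses in both ways. It is not the global, all-directions optimization proposed in this paper: it covers a single pair only, and the function names (`onset_delay`, `xcorr_delay`), the 10% onset threshold, and the synthetic damped-sinusoid signal are assumptions made purely for illustration.

```python
import numpy as np


def onset_delay(h, threshold=0.1):
    """Per-HRIR feature: index of the first sample whose magnitude reaches
    `threshold` times the peak magnitude (threshold value is an assumption)."""
    mag = np.abs(h)
    return int(np.argmax(mag >= threshold * mag.max()))


def xcorr_delay(h_ref, h):
    """Pairwise measure: lag (in samples) that maximizes the full-length
    cross-correlation between h and h_ref. Positive means h lags h_ref."""
    corr = np.correlate(h, h_ref, mode="full")
    lags = np.arange(-(len(h_ref) - 1), len(h))
    return int(lags[np.argmax(corr)])


def synthetic_pair(delay=5, n=256, start=20):
    """Hypothetical test signal: a damped sinusoid and a delayed copy of it."""
    t = np.arange(n - start)
    base = np.zeros(n)
    base[start:] = np.exp(-t / 15.0) * np.sin(2 * np.pi * 0.12 * t)
    shifted = np.concatenate([np.zeros(delay), base[:-delay]])
    return base, shifted


if __name__ == "__main__":
    h_a, h_b = synthetic_pair(delay=5)
    print("onset-based delay:      ", onset_delay(h_b) - onset_delay(h_a))
    print("correlation-based delay:", xcorr_delay(h_a, h_b))
```

In this noise-free setting both estimates agree. The onset estimate depends on a few low-amplitude samples near the threshold crossing, whereas the correlation estimate uses every sample of both responses, which is the intuition behind the robustness claim made above.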