Language and Speech Laboratory

Energy reallocation strategies for speech enhancement in known noise conditions

Yan Tang, Martin Cooke

Interspeech 2010 in Makuhari, Japan

Speech output, whether live, recorded or synthetic, is often employed in difficult listening conditions. Context-sensitive speech modifications aim to promote intelligibility while maintaining quality and listener comfort. The current study used objective measures of intelligibility and quality to compare five energy reallocation strategies operating under equal energy and preserved duration constraints. Results in both stationary and highly nonstationary backgrounds suggest that time-varying modifications lead to large increases in objective intelligibility, but that speech quality is best preserved by time-invariant modifications. Selective amplification of time-frequency regions with low a priori SNR produced the highest objective intelligibility without severe disruption to quality.
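
To make the idea of energy reallocation under an equal energy constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes speech and noise are already available as per-frame, per-band energies, and it boosts cells whose local SNR falls below a threshold before rescaling so that total speech energy is unchanged. The function name `reallocate_energy`, the 0 dB threshold and the 6 dB boost are hypothetical choices made for illustration only.

```python
import math


def reallocate_energy(speech, noise, snr_threshold_db=0.0, boost_db=6.0):
    """Boost low-SNR time-frequency cells, then restore the original total energy.

    speech, noise: lists of frames, each a list of per-band energies (linear scale).
    The threshold and boost values are illustrative assumptions, not values
    taken from the paper.
    """
    eps = 1e-12  # avoids log(0) and division by zero
    boost = 10.0 ** (boost_db / 10.0)  # dB converted to an energy-domain gain

    modified = []
    for s_frame, n_frame in zip(speech, noise):
        out = []
        for s, n in zip(s_frame, n_frame):
            snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            # Amplify only cells where speech is weak relative to the known noise.
            out.append(s * boost if snr_db < snr_threshold_db else s)
        modified.append(out)

    # Equal energy constraint: rescale so total speech energy is unchanged.
    original_total = sum(sum(frame) for frame in speech)
    modified_total = sum(sum(frame) for frame in modified)
    scale = original_total / modified_total if modified_total > 0 else 1.0
    return [[e * scale for e in frame] for frame in modified]


# Toy input: 2 frames x 2 bands of made-up energies.
speech = [[4.0, 1.0], [2.0, 0.5]]
noise = [[1.0, 2.0], [1.0, 1.0]]
result = reallocate_energy(speech, noise)
print(result)
```

Because the number of frames and bands is untouched, duration is preserved in this sketch; energy simply moves from high-SNR cells to low-SNR ones. A time-invariant strategy would instead apply one fixed gain per band across all frames.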