The Hurricane Challenge

Increasing the intelligibility of natural or synthetic speech without increasing volume or loudness

Purpose

Do you have an algorithm for the modification of natural or synthetic speech which improves intelligibility in known noise conditions without increasing volume or loudness? If so, you are invited to take part in the Hurricane Challenge (*), a co-ordinated international evaluation of modified speech intelligibility.

You will be provided with a corpus of recorded sentences along with separate noise signals at a number of signal-to-noise ratios. Your task is to modify only the speech, in such a way as to promote its intelligibility. Modifications will be expected to meet constraints on changes in RMS level and/or loudness as well as durational constraints. Your modified speech signals will be evaluated centrally by a large listener sample.

Results of the Challenge will be disseminated either at a Special Session of Interspeech 2013 or at a satellite workshop. Results will be returned to participants well before the Interspeech 2013 paper deadline.

If you intend to take part, it is essential that you register your interest with the organisers before 31st October 2012 so that we can plan the scale of the listening tests accordingly.

A previous internal evaluation which took place within the EU-funded Listening Talker project is described in this paper (under review). Please consult this article for further details of motivation, materials and evaluation procedure.

(*) The name Hurricane was suggested as a logical extension to even-more-adverse conditions of the annual Blizzard Challenge for the evaluation of synthetic speech.

Procedure

The data

  • Unmodified ('plain') speech and masker signals
  • Sentences are from the Harvard corpus, spoken by a male British English talker
  • Two maskers: speech-shaped noise and competing speech from a single talker
  • Each masker/speech combination will be presented at 3 SNRs
  • 180 sentences in each of the 6 conditions (2 maskers x 3 SNRs)
  • Maskers lead and lag the plain speech by 0.5s

Your task and constraints

  • You may modify the speech signals in any way, including durational changes up to a maximum of one second (to fit within the lead/lag of the masker)
  • You may use the noise signals to decide on your speech modifications, with one exception: you may not modify the speech by subtracting the noise signal in the time domain
  • You will return only the modified speech signals to us
  • If you modify duration, you should also supply a single separate text file containing the endpoints, with one line per signal in the following format:
    • masker SNR sentence startsample endsample
    • e.g. cs snrHi hvd_001 5810 47173
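
As an illustration of the endpoint file format above, the following Python sketch writes and reads one line per signal and checks the one-second duration limit. The names Endpoint, parse_endpoint and fits_within_masker are hypothetical, the sample rate is left as a parameter because it is not stated here, and the duration check reflects one reading of the constraint (modified speech may be at most one second longer than the plain speech). It is not the organisers' own tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One line of the endpoint file: masker SNR sentence startsample endsample."""
    masker: str
    snr: str
    sentence: str
    start_sample: int
    end_sample: int

    def to_line(self) -> str:
        # Fields are separated by single spaces, as in the example line.
        return (f"{self.masker} {self.snr} {self.sentence} "
                f"{self.start_sample} {self.end_sample}")


def parse_endpoint(line: str) -> Endpoint:
    """Parse one whitespace-separated endpoint line and sanity-check it."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    masker, snr, sentence, start, end = fields
    endpoint = Endpoint(masker, snr, sentence, int(start), int(end))
    if not 0 <= endpoint.start_sample < endpoint.end_sample:
        raise ValueError(f"invalid sample range in line: {line!r}")
    return endpoint


def fits_within_masker(modified_len: int, plain_len: int, sample_rate: int,
                       max_extra_seconds: float = 1.0) -> bool:
    """Assumed reading of the constraint: the modified speech may exceed the
    plain speech by at most max_extra_seconds (the masker lead plus lag)."""
    return modified_len - plain_len <= round(max_extra_seconds * sample_rate)


if __name__ == "__main__":
    example = parse_endpoint("cs snrHi hvd_001 5810 47173")
    print(example.to_line())
```

Writing the file then amounts to joining the to_line() output of every modified signal with newlines.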

What we will then do

  • We will rescale the speech to meet constraints on RMS energy and/or loudness
  • We will remix the scaled speech and maskers at the specified SNRs
  • All modifications will be presented to a large cohort of native British English listeners
  • We will return your individual evaluation results to you along with results for the unmodified ('plain') speech
  • Minimally, your results will include:
    • raw listener responses
    • keywords correct scores
    • gains expressed in dB relative to unmodified speech
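
The rescaling and remixing steps in the list above can be pictured with a small Python sketch. It makes several assumptions that are not confirmed here: the constraint is taken to be a match to a target RMS level (the loudness constraint is not modelled), the SNR is computed from whole-signal RMS values, the masker rather than the speech is scaled to reach the target SNR, and the function names are hypothetical. The organisers' actual procedure may differ.

```python
import math
from typing import List, Sequence


def rms(signal: Sequence[float]) -> float:
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def rescale_to_rms(signal: Sequence[float], target_rms: float) -> List[float]:
    """Apply a single gain so that the signal has the target RMS level."""
    current = rms(signal)
    if current == 0.0:
        raise ValueError("cannot rescale a silent signal")
    gain = target_rms / current
    return [v * gain for v in signal]


def mix_at_snr(speech: Sequence[float], masker: Sequence[float],
               snr_db: float, lead_samples: int) -> List[float]:
    """Scale the masker so that 20*log10(rms(speech)/rms(masker)) equals
    snr_db, then add the speech starting lead_samples into the masker."""
    if lead_samples < 0 or lead_samples + len(speech) > len(masker):
        raise ValueError("speech does not fit within the masker")
    masker_gain = rms(speech) / (rms(masker) * 10 ** (snr_db / 20))
    mixture = [m * masker_gain for m in masker]
    for i, sample in enumerate(speech):
        mixture[lead_samples + i] += sample
    return mixture


if __name__ == "__main__":
    plain = [0.5, -0.5, 0.5, -0.5]
    modified = [1.0, -1.0, 1.0, -1.0]
    masker = [2.0, -2.0, 2.0, -2.0, 2.0, -2.0]
    # Bring the modified speech back to the RMS level of the plain speech.
    scaled = rescale_to_rms(modified, rms(plain))
    print(mix_at_snr(scaled, masker, snr_db=0.0, lead_samples=1))
```

Because the speech is rescaled to a fixed level before mixing, any gain applied by a modification algorithm is cancelled out; only changes in how the energy is distributed can affect the result.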

Special notes for synthetic speech entries

  • We can provide additional speech data to help you to train or adapt models for the target talker
  • We will provide text for the target sentences
  • If there are sufficient entries, we will evaluate synthetic speech separately from natural speech
  • For additional information on synthetic speech, please contact Cassie Valentini-Botinhao

Downloading

At this stage, we are making available the plain speech and masker signals. Please send a request to Martin Cooke to receive download instructions.

Important dates

  • 1st October 2012: Task materials available
  • 31st October 2012: Deadline for registration of participation
  • 1st December 2012: Deadline for receipt of modified speech
  • 30th January 2013: Listener evaluation results returned to participants
  • August 2013: Special session or satellite workshop at Interspeech, Lyon, France

Organisers

  • Martin Cooke, Ikerbasque & University of the Basque Country, Spain
  • Catherine Mayo, CSTR, University of Edinburgh, UK
  • Bastian Sauert, Aachen University, Germany
  • Yannis Stylianou, FORTH Institute of Computer Science, Crete, Greece
  • Cassie Valentini-Botinhao, CSTR, University of Edinburgh, UK
  • Yan Tang, Language and Speech Laboratory, University of the Basque Country, Spain

Last updated: 27th September 2012