RapidEars ignoring secondsOfSilenceToDetect


Viewing 7 posts - 1 through 7 (of 7 total)

  • Author
  • #1020739

    It appears that when setFinalizeHypothesis = FALSE, RapidEars ignores the setting for secondsOfSilenceToDetect. Is this a bug? I don’t need the finalized hypothesis, but I do need to control how long pocketsphinxController waits before considering the utterance ended.

    Halle Winkler


    RapidEars doesn’t use secondsOfSilenceToDetect. Both live and finalized hypotheses are delivered as soon as they are available.


    Sorry if I was unclear. There is a rapidEarsDidDetectEndOfSpeech delegate method which appears to always be called 50-300 ms after pocketsphinxDidDetectFinishedSpeech. I have observed that these end-of-speech callbacks are sensitive to secondsOfSilenceToDetect, but only when setFinalizeHypothesis = TRUE.

    Is this the intended behavior?

    Halle Winkler

    That’s correct. Finalizing the hypothesis means that the end of speech is detected and reported. Using live recognition only means that there is no wait for a pause, and no consequent callback when a pause is detected, because a pause at the end of an utterance isn’t used by the engine for determining the hypothesis.


    Yes, but even when setFinalizeHypothesis = FALSE, the end of speech is still detected and reported. It’s just that the config option (secondsOfSilenceToDetect) is now ignored.

    I can work around it, but it seems like incorrect behavior.

    Halle Winkler

    I hear what you’re saying, but it is by design. Live mode does not wait for a pause in order to derive state, so if the engine is only operating in live mode, it uses its own logic to decide when to call an utterance over, so that continuous recognition can proceed without notable pauses or skips. secondsOfSilenceToDetect and rapidEarsDidDetectEndOfSpeech both refer to pause detection, which isn’t a feature of live mode. It could be documented better, I agree.

    I think that if I wanted to use live mode hypotheses and non-live mode utterance logic, I’d probably turn finalize on and just ignore its hypothesis output. The overhead isn’t that heavy.
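A rough sketch of that approach, for illustration only: finalizing stays enabled so that pause detection (and therefore secondsOfSilenceToDetect) applies, the live hypotheses are the ones the app consumes, and the finalized hypothesis is dropped. The class and method names here are hypothetical stand-ins, not the OpenEars or RapidEars delegate API; in a real app each method would be called from the corresponding delegate callback.

```swift
// Hypothetical handler: the names are illustrative and are NOT the
// OpenEars/RapidEars API. Wire each method to the matching delegate callback.
final class LiveOnlyHandler {
    private(set) var liveHypotheses: [String] = []
    private(set) var utterancesEnded = 0

    // Called for each live hypothesis: this is the output the app actually uses.
    func didReceiveLive(_ hypothesis: String) {
        liveHypotheses.append(hypothesis)
    }

    // Called for the finalized hypothesis: deliberately ignored.
    // Finalizing is enabled only so that pause detection is active.
    func didReceiveFinalized(_ hypothesis: String) {
        // Intentionally empty.
    }

    // Called when the end of speech is reported after the configured silence.
    func didDetectEndOfSpeech() {
        utterancesEnded += 1
    }
}

// Simulated sequence of callbacks for a single utterance.
let handler = LiveOnlyHandler()
handler.didReceiveLive("turn")
handler.didReceiveLive("turn left")
handler.didReceiveFinalized("turn left")
handler.didDetectEndOfSpeech()
print(handler.liveHypotheses, handler.utterancesEnded)
```

The cost of this design is the extra finalizing pass, which the post above describes as not that heavy.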


    Okay, thanks Halle, I appreciate the advice!
