RapidEars startRealtimeListeningWithLanguageModelAtPath memory leaks
- This topic has 50 replies, 2 voices, and was last updated 7 years, 10 months ago by Laurent.
March 31, 2016 at 8:15 am #1029883 Laurent (Participant)
Hello,
I’m working with OpenEars, RapidEars and Rejecto.
I’m on Xcode 7.3 and iOS 9.2.1.
In my app, when I start listening for voice with RapidEars and Rejecto, the app crashes after 2 hours of listening with “Message from debugger: Terminated due to memory issue”.
To debug this, I profiled my app with Xcode’s Instruments.
I could see that when I call the startRealtimeListeningWithLanguageModelAtPath function, I get memory leaks.
Could you please help me find a solution to this problem?
Regards,
Laurent.

March 31, 2016 at 12:46 pm #1029887 Laurent (Participant)
Besides, the OpenEarsSampleApp included in the OpenEars distribution has memory leaks (tested with Xcode 7.3, iOS 9.2.1 and the Leaks instrument in Xcode).
Regards,
Laurent

April 1, 2016 at 6:20 pm #1029896 Halle Winkler (Politepix)
Welcome Laurent,
Pocketsphinx and Flite have some little leaks (not big enough to create a problem even over weeks of use IIRC) so the sample app always shows a few leaks. But it’s possible that there could be a new leak introduced with 2.5 – if you’d like me to check it out, I’d need to know what you saw in Instruments that makes you think it originates with OpenEars, which means the names of the objects and their object sizes that Instruments reports as leaking, thanks. Best is to get a report like “after one minute of operation which included starting listening and stopping listening, Leaks reported a leaked object named __ of size __ and the total size of the leaks for object __ over that time period was ___MB.” And let me know what you did in that timeframe so I can attempt to replicate it and see the same thing.
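For readers who want numbers of the kind requested above without relying only on Instruments, here is a minimal sketch in plain C, which can be compiled into an Objective-C project. It is not part of OpenEars, RapidEars or Rejecto. The function names `current_footprint_bytes`, `growth_mb` and `log_growth` are hypothetical helpers invented for this illustration. It reads the process memory footprint through the Mach `task_info` call on Apple platforms and returns 0 elsewhere. Total footprint growth does not identify which object is leaking, so the Leaks and Allocations instruments are still needed for object names and sizes.

```c
#include <stdint.h>
#include <stdio.h>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

/* Hypothetical helper, not an OpenEars API: returns the physical memory
   footprint of the current process in bytes, or 0 when it cannot be read
   (this sketch has no implementation for non-Apple platforms). */
uint64_t current_footprint_bytes(void) {
#if defined(__APPLE__)
    task_vm_info_data_t info;
    mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
    kern_return_t kr = task_info(mach_task_self(), TASK_VM_INFO,
                                 (task_info_t)&info, &count);
    if (kr == KERN_SUCCESS) {
        return (uint64_t)info.phys_footprint;
    }
#endif
    return 0;
}

/* Difference between two samples in MB (negative if memory shrank). */
double growth_mb(uint64_t before_bytes, uint64_t after_bytes) {
    return ((double)after_bytes - (double)before_bytes) / (1024.0 * 1024.0);
}

/* Prints one line describing the change between two samples. */
void log_growth(const char *label, uint64_t before_bytes, uint64_t after_bytes) {
    printf("%s: footprint changed by %.2f MB\n",
           label, growth_mb(before_bytes, after_bytes));
}
```

One way to use it, assuming the delegate callbacks shown in the sample app: take one sample in `pocketsphinxDidStartListening`, another in `pocketsphinxDidStopListening`, and pass both to `log_growth` along with a note of what was done in between.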
April 5, 2016 at 6:02 am #1029927 Laurent (Participant)
Hello Mr Halle,
I don’t understand why the OpenEarsSampleApp memory increases over time. I launched the sample app, and there are some little leaks, as you said. But if we look at Persistent Bytes for All Heap Allocations, we can see that it increases over time. Even when I click the stop listening button, it still grows.
Do you know why?
Regards,
Laurent

I’m using Rejecto and RapidEars.
this is my OpenEarsSampleApp code :
[spoiler]
// ViewController.m
// OpenEarsSampleApp
//
// ViewController.m demonstrates the use of the OpenEars framework.
//
// Copyright Politepix UG (haftungsbeschränkt) 2014. All rights reserved.
// https://www.politepix.com
// Contact at https://www.politepix.com/contact
//
// This file is licensed under the Politepix Shared Source license found in the root of the source distribution.

// **************************************************************************************************************************************************************
// **************************************************************************************************************************************************************
// **************************************************************************************************************************************************************
// IMPORTANT NOTE: Audio driver and hardware behavior is completely different between the Simulator and a real device. It is not informative to test OpenEars' accuracy on the Simulator, and please do not report Simulator-only bugs since I only actively support
// the device driver. Please only do testing/bug reporting based on results on a real device such as an iPhone or iPod Touch. Thanks!
// **************************************************************************************************************************************************************
// **************************************************************************************************************************************************************
// **************************************************************************************************************************************************************

#import "ViewController.h"
#import <OpenEars/OEPocketsphinxController.h>
#import <OpenEars/OEFliteController.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OELogging.h>
#import <OpenEars/OEAcousticModel.h>
#import <Slt/Slt.h>
#import <RapidEars/OEEventsObserver+RapidEars.h>
#import <Rejecto/OELanguageModelGenerator+Rejecto.h>
#import <RapidEars/OEPocketsphinxController+RapidEars.h>
#import <RapidEars/OEEventsObserver+RapidEars.h>

@interface ViewController()

// UI actions, not specifically related to OpenEars other than the fact that they invoke OpenEars methods.
- (IBAction) stopButtonAction;
- (IBAction) startButtonAction;
- (IBAction) suspendListeningButtonAction;
- (IBAction) resumeListeningButtonAction;

// Example for reading out the input audio levels without locking the UI using an NSTimer
- (void) startDisplayingLevels;
- (void) stopDisplayingLevels;

// These three are the important OpenEars objects that this class demonstrates the use of.
@property (nonatomic, strong) Slt *slt;

@property (nonatomic, strong) OEEventsObserver *openEarsEventsObserver;
@property (nonatomic, strong) OEPocketsphinxController *pocketsphinxController;
@property (nonatomic, strong) OEFliteController *fliteController;

// Some UI, not specifically related to OpenEars.
@property (nonatomic, strong) IBOutlet UIButton *stopButton;
@property (nonatomic, strong) IBOutlet UIButton *startButton;
@property (nonatomic, strong) IBOutlet UIButton *suspendListeningButton;
@property (nonatomic, strong) IBOutlet UIButton *resumeListeningButton;
@property (nonatomic, strong) IBOutlet UITextView *statusTextView;
@property (nonatomic, strong) IBOutlet UITextView *heardTextView;
@property (nonatomic, strong) IBOutlet UILabel *pocketsphinxDbLabel;
@property (nonatomic, strong) IBOutlet UILabel *fliteDbLabel;
@property (nonatomic, assign) BOOL usingStartingLanguageModel;
@property (nonatomic, assign) int restartAttemptsDueToPermissionRequests;
@property (nonatomic, assign) BOOL startupFailedDueToLackOfPermissions;

// Things which help us show off the dynamic language features.
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedDictionary;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedDictionary;

// Our NSTimer that will help us read and display the input and output levels without locking the UI
@property (nonatomic, strong) NSTimer *uiUpdateTimer;

@end

@implementation ViewController
#define kLevelUpdatesPerSecond 18 // We’ll have the ui update 18 times a second to show some fluidity without hitting the CPU too hard.
//#define kGetNbest // Uncomment this if you want to try out nbest
#pragma mark -
#pragma mark Memory Management

- (void)dealloc {
    [self stopDisplayingLevels];
}

#pragma mark -
#pragma mark View Lifecycle

- (void)viewDidLoad {
    [super viewDidLoad];
    self.fliteController = [[OEFliteController alloc] init];
    self.openEarsEventsObserver = [[OEEventsObserver alloc] init];
    self.openEarsEventsObserver.delegate = self;
    self.slt = [[Slt alloc] init];

    self.restartAttemptsDueToPermissionRequests = 0;
    self.startupFailedDueToLackOfPermissions = FALSE;

    // [OELogging startOpenEarsLogging]; // Uncomment me for OELogging, which is verbose logging about internal OpenEars operations such as audio settings. If you have issues, show this logging in the forums.
    // [OEPocketsphinxController sharedInstance].verbosePocketSphinx = TRUE; // Uncomment this for much more verbose speech recognition engine output. If you have issues, show this logging in the forums.

    [self.openEarsEventsObserver setDelegate:self]; // Make this class the delegate of OpenEarsObserver so we can get all of the messages about what OpenEars is doing.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this before setting any OEPocketsphinxController characteristics
    // This is the language model we're going to start up with. The only reason I'm making it a class property is that I reuse it a bunch of times in this example,
    // but you can pass the string contents directly to OEPocketsphinxController:startListeningWithLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF:
    NSArray *firstLanguageArray = @[@"BACKWARD",
                                    @"CHANGE",
                                    @"FORWARD",
                                    @"GO",
                                    @"LEFT",
                                    @"MODEL",
                                    @"RIGHT",
                                    @"TURN"];

    OELanguageModelGenerator *languageModelGenerator = [[OELanguageModelGenerator alloc] init];
    // languageModelGenerator.verboseLanguageModelGenerator = TRUE; // Uncomment me for verbose language model generator debug output.
    // NSError *error = [languageModelGenerator generateLanguageModelFromArray:firstLanguageArray withFilesNamed:@"FirstOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" in order to create a language model for Spanish recognition instead of English.
    NSError *error = [languageModelGenerator generateRejectingLanguageModelFromArray:firstLanguageArray
                                                                      withFilesNamed:@"FirstOpenEarsDynamicLanguageModel"
                                                              withOptionalExclusions:nil
                                                                     usingVowelsOnly:FALSE
                                                                          withWeight:[NSNumber numberWithFloat:2.0] // nil
                                                              forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    if(error) {
        NSLog(@"Dynamic language generator reported error %@", [error description]);
    } else {
        self.pathToFirstDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"FirstOpenEarsDynamicLanguageModel"];
        self.pathToFirstDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"FirstOpenEarsDynamicLanguageModel"];
    }

    self.usingStartingLanguageModel = TRUE; // This is not an OpenEars thing, this is just so I can switch back and forth between the two models in this sample app.
    // Here is an example of dynamically creating an in-app grammar.
    // We want it to be able to response to the speech "CHANGE MODEL" and a few other things. Items we want to have recognized as a whole phrase (like "CHANGE MODEL")
    // we put into the array as one string (e.g. "CHANGE MODEL" instead of "CHANGE" and "MODEL"). This increases the probability that they will be recognized as a phrase. This works even better starting with version 1.0 of OpenEars.
    NSArray *secondLanguageArray = @[@"SUNDAY",
                                     @"MONDAY",
                                     @"TUESDAY",
                                     @"WEDNESDAY",
                                     @"THURSDAY",
                                     @"FRIDAY",
                                     @"SATURDAY",
                                     @"QUIDNUNC",
                                     @"CHANGE MODEL"];

    // The last entry, quidnunc, is an example of a word which will not be found in the lookup dictionary and will be passed to the fallback method. The fallback method is slower,
    // so, for instance, creating a new language model from dictionary words will be pretty fast, but a model that has a lot of unusual names in it or invented/rare/recent-slang
    // words will be slower to generate. You can use this information to give your users good UI feedback about what the expectations for wait times should be.

    // I don't think it's beneficial to lazily instantiate OELanguageModelGenerator because you only need to give it a single message and then release it.
    // If you need to create a very large model or any size of model that has many unusual words that have to make use of the fallback generation method,
    // you will want to run this on a background thread so you can give the user some UI feedback that the task is in progress.

    // generateLanguageModelFromArray:withFilesNamed returns an NSError which will either have a value of noErr if everything went fine or a specific error if it didn't.
    error = [languageModelGenerator generateRejectingLanguageModelFromArray:secondLanguageArray
                                                             withFilesNamed:@"SecondOpenEarsDynamicLanguageModel"
                                                     withOptionalExclusions:nil
                                                            usingVowelsOnly:FALSE
                                                                 withWeight:[NSNumber numberWithFloat:2.0] // nil
                                                     forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    // error = [languageModelGenerator generateLanguageModelFromArray:secondLanguageArray withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" in order to create a language model for Spanish recognition instead of English.
    // NSError *error = [languageModelGenerator generateLanguageModelFromTextFile:[NSString stringWithFormat:@"%@/%@",[[NSBundle mainBundle] resourcePath], @"OpenEarsCorpus.txt"] withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Try this out to see how generating a language model from a corpus works.
    if(error) {
        NSLog(@"Dynamic language generator reported error %@", [error description]);
    } else {
        self.pathToSecondDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We'll set our new .languagemodel file to be the one to get switched to when the words "CHANGE MODEL" are recognized.
        self.pathToSecondDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We'll set our new dictionary to be the one to get switched to when the words "CHANGE MODEL" are recognized.

        // Next, an informative message.
        NSLog(@"\n\nWelcome to the OpenEars sample project. This project understands the words:\nBACKWARD,\nCHANGE,\nFORWARD,\nGO,\nLEFT,\nMODEL,\nRIGHT,\nTURN,\nand if you say \"CHANGE MODEL\" it will switch to its dynamically-generated model which understands the words:\nCHANGE,\nMODEL,\nMONDAY,\nTUESDAY,\nWEDNESDAY,\nTHURSDAY,\nFRIDAY,\nSATURDAY,\nSUNDAY,\nQUIDNUNC");
        // This is how to start the continuous listening loop of an available instance of OEPocketsphinxController. We won't do this if the language generation failed since it will be listening for a command to change over to the generated language.
        [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this once before setting properties of the OEPocketsphinxController instance.
        // [OEPocketsphinxController sharedInstance].pathToTestFile = [[NSBundle mainBundle] pathForResource:@"change_model_short" ofType:@"wav"]; // This is how you could use a test WAV (mono/16-bit/16k) rather than live recognition. Don't forget to add your WAV to your app bundle.
        if(![OEPocketsphinxController sharedInstance].isListening) {
            [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
                                                                                    dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
                                                                                 acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
            // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren't already listening.
        }
        // [self startDisplayingLevels] is not an OpenEars method, just a very simple approach for level reading
        // that I've included with this sample app. My example implementation does make use of two OpenEars
        // methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
        // method of fliteController.
        //
        // The example is meant to show one way that you can read those levels continuously without locking the UI,
        // by using an NSTimer, but the OpenEars level-reading methods
        // themselves do not include multithreading code since I believe that you will want to design your own
        // code approaches for level display that are tightly-integrated with your interaction design and the
        // graphics API you choose.

        [self startDisplayingLevels];
        // Here is some UI stuff that has nothing specifically to do with OpenEars implementation
        self.startButton.hidden = TRUE;
        self.stopButton.hidden = TRUE;
        self.suspendListeningButton.hidden = TRUE;
        self.resumeListeningButton.hidden = TRUE;
    }
}

#pragma mark -
#pragma mark OEEventsObserver delegate methods

// What follows are all of the delegate methods you can optionally use once you've instantiated an OEEventsObserver and set its delegate to self.
// I've provided some pretty granular information about the exact phase of the Pocketsphinx listening loop, the Audio Session, and Flite, but I'd expect
// that the ones that will really be needed by most projects are the following:
//
//- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID;
//- (void) audioSessionInterruptionDidBegin;
//- (void) audioSessionInterruptionDidEnd;
//- (void) audioRouteDidChangeToRoute:(NSString *)newRoute;
//- (void) pocketsphinxDidStartListening;
//- (void) pocketsphinxDidStopListening;
//
// It isn't necessary to have a OEPocketsphinxController or a OEFliteController instantiated in order to use these methods. If there isn't anything instantiated that will
// send messages to an OEEventsObserver, all that will happen is that these methods will never fire. You also do not have to create a OEEventsObserver in
// the same class or view controller in which you are doing things with a OEPocketsphinxController or OEFliteController; you can receive updates from those objects in
// any class in which you instantiate an OEEventsObserver and set its delegate to self.

// This is an optional delegate method of OEEventsObserver which delivers the text of speech that Pocketsphinx heard and analyzed, along with its accuracy score and utterance ID.
- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {

    NSLog(@"Local callback: The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID); // Log it.
    if([hypothesis isEqualToString:@"CHANGE MODEL"]) { // If the user says "CHANGE MODEL", we will switch to the alternate model (which happens to be the dynamically generated model).

        // Here is an example of language model switching in OpenEars. Deciding on what logical basis to switch models is your responsibility.
        // For instance, when you call a customer service line and get a response tree that takes you through different options depending on what you say to it,
        // the models are being switched as you progress through it so that only relevant choices can be understood. The construction of that logical branching and
        // how to react to it is your job; OpenEars just lets you send the signal to switch the language model when you've decided it's the right time to do so.

        if(self.usingStartingLanguageModel) { // If we're on the starting model, switch to the dynamically generated one.
            [[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToSecondDynamicallyGeneratedLanguageModel withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
            self.usingStartingLanguageModel = FALSE;
        } else { // If we're on the dynamically generated model, switch to the start model (this is an example of a trigger and method for switching models).
            [[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToFirstDynamicallyGeneratedLanguageModel withDictionary:self.pathToFirstDynamicallyGeneratedDictionary];
            self.usingStartingLanguageModel = TRUE;
        }
    }

    self.heardTextView.text = [NSString stringWithFormat:@"Heard: \"%@\"", hypothesis]; // Show it in the status box.
    // This is how to use an available instance of OEFliteController. We're going to repeat back the command that we heard with the voice we've chosen.
    [self.fliteController say:[NSString stringWithFormat:@"You said %@",hypothesis] withVoice:self.slt];
}

#ifdef kGetNbest
- (void) pocketsphinxDidReceiveNBestHypothesisArray:(NSArray *)hypothesisArray { // Pocketsphinx has an n-best hypothesis dictionary.
    NSLog(@"Local callback: hypothesisArray is %@",hypothesisArray);
}
#endif

// An optional delegate method of OEEventsObserver which informs that there was an interruption to the audio session (e.g. an incoming phone call).
- (void) audioSessionInterruptionDidBegin {
    NSLog(@"Local callback: AudioSession interruption began."); // Log it.
    self.statusTextView.text = @"Status: AudioSession interruption began."; // Show it in the status box.
    NSError *error = nil;
    if([OEPocketsphinxController sharedInstance].isListening) {
        error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening (if it is listening) since it will need to restart its loop after an interruption.
        if(error) NSLog(@"Error while stopping listening in audioSessionInterruptionDidBegin: %@", error);
    }
}

// An optional delegate method of OEEventsObserver which informs that the interruption to the audio session ended.
- (void) audioSessionInterruptionDidEnd {
    NSLog(@"Local callback: AudioSession interruption ended."); // Log it.
    self.statusTextView.text = @"Status: AudioSession interruption ended."; // Show it in the status box.
    // We're restarting the previously-stopped listening loop.
    if(![OEPocketsphinxController sharedInstance].isListening){
        // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren't currently listening.
        [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
                                                                                dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
                                                                             acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    }
}

// An optional delegate method of OEEventsObserver which informs that the audio input became unavailable.
- (void) audioInputDidBecomeUnavailable {
    NSLog(@"Local callback: The audio input has become unavailable"); // Log it.
    self.statusTextView.text = @"Status: The audio input has become unavailable"; // Show it in the status box.
    NSError *error = nil;
    if([OEPocketsphinxController sharedInstance].isListening){
        error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening since there is no available input (but only if we are listening).
        if(error) NSLog(@"Error while stopping listening in audioInputDidBecomeUnavailable: %@", error);
    }
}

// An optional delegate method of OEEventsObserver which informs that the unavailable audio input became available again.
- (void) audioInputDidBecomeAvailable {
    NSLog(@"Local callback: The audio input is available"); // Log it.
    self.statusTextView.text = @"Status: The audio input is available"; // Show it in the status box.
    if(![OEPocketsphinxController sharedInstance].isListening) {
        // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition, but only if we aren't already listening.
        [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
                                                                                dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
                                                                             acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    }
}

// An optional delegate method of OEEventsObserver which informs that there was a change to the audio route (e.g. headphones were plugged in or unplugged).
- (void) audioRouteDidChangeToRoute:(NSString *)newRoute {
    NSLog(@"Local callback: Audio route change. The new audio route is %@", newRoute); // Log it.
    self.statusTextView.text = [NSString stringWithFormat:@"Status: Audio route change. The new audio route is %@",newRoute]; // Show it in the status box.

    NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling the Pocketsphinx loop to shut down and then start listening again on the new route
    if(error) NSLog(@"Local callback: error while stopping listening in audioRouteDidChangeToRoute: %@",error);
    if(![OEPocketsphinxController sharedInstance].isListening) {
        // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren't already listening.
        [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
                                                                                dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
                                                                             acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    }
}

// An optional delegate method of OEEventsObserver which informs that the Pocketsphinx recognition loop has entered its actual loop.
// This might be useful in debugging a conflict between another sound class and Pocketsphinx.
- (void) pocketsphinxRecognitionLoopDidStart {

    NSLog(@"Local callback: Pocketsphinx started."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx started."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is now listening for speech.
- (void) pocketsphinxDidStartListening {

    NSLog(@"Local callback: Pocketsphinx is now listening."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx is now listening."; // Show it in the status box.

    self.startButton.hidden = TRUE; // React to it with some UI changes.
    self.stopButton.hidden = FALSE;
    self.suspendListeningButton.hidden = FALSE;
    self.resumeListeningButton.hidden = TRUE;
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected speech and is starting to process it.
- (void) pocketsphinxDidDetectSpeech {
    NSLog(@"Local callback: Pocketsphinx has detected speech."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx has detected speech."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected a second of silence, indicating the end of an utterance.
// This was added because developers requested being able to time the recognition speed without the speech time. The processing time is the time between
// this method being called and the hypothesis being returned.
- (void) pocketsphinxDidDetectFinishedSpeech {
    NSLog(@"Local callback: Pocketsphinx has detected a second of silence, concluding an utterance."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx has detected finished speech."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx has exited its recognition loop, most
// likely in response to the OEPocketsphinxController being told to stop listening via the stopListening method.
- (void) pocketsphinxDidStopListening {
    NSLog(@"Local callback: Pocketsphinx has stopped listening."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx has stopped listening."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop but it is not
// Going to react to speech until listening is resumed. This can happen as a result of Flite speech being
// in progress on an audio route that doesn't support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to suspend recognition via the suspendRecognition method.
- (void) pocketsphinxDidSuspendRecognition {
    NSLog(@"Local callback: Pocketsphinx has suspended recognition."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx has suspended recognition."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop and after recognition
// having been suspended it is now resuming. This can happen as a result of Flite speech completing
// on an audio route that doesn't support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to resume recognition via the resumeRecognition method.
- (void) pocketsphinxDidResumeRecognition {
    NSLog(@"Local callback: Pocketsphinx has resumed recognition."); // Log it.
    self.statusTextView.text = @"Status: Pocketsphinx has resumed recognition."; // Show it in the status box.
}

// An optional delegate method which informs that Pocketsphinx switched over to a new language model at the given URL in the course of
// recognition. This does not imply that it is a valid file or that recognition will be successful using the file.
- (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
    NSLog(@"Local callback: Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@",newLanguageModelPathAsString,newDictionaryPathAsString);
}

// An optional delegate method of OEEventsObserver which informs that Flite is speaking, most likely to be useful if debugging a
// complex interaction between sound classes. You don't have to do anything yourself in order to prevent Pocketsphinx from listening to Flite talk and trying to recognize the speech.
- (void) fliteDidStartSpeaking {
    NSLog(@"Local callback: Flite has started speaking"); // Log it.
    self.statusTextView.text = @"Status: Flite has started speaking."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Flite is finished speaking, most likely to be useful if debugging a
// complex interaction between sound classes.
- (void) fliteDidFinishSpeaking {
    NSLog(@"Local callback: Flite has finished speaking"); // Log it.
    self.statusTextView.text = @"Status: Flite has finished speaking."; // Show it in the status box.
}

- (void) pocketSphinxContinuousSetupDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop startup. Turn on [OELogging startOpenEarsLogging] to learn why.
    NSLog(@"Local callback: Setting up the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.", reasonForFailure); // Log it.
    self.statusTextView.text = @"Status: Not possible to start recognition loop."; // Show it in the status box.
}

- (void) pocketSphinxContinuousTeardownDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop startup. Turn on [OELogging startOpenEarsLogging] to learn why.
    NSLog(@"Local callback: Tearing down the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.", reasonForFailure); // Log it.
    self.statusTextView.text = @"Status: Not possible to cleanly end recognition loop."; // Show it in the status box.
}

- (void) testRecognitionCompleted { // A test file which was submitted for direct recognition via the audio driver is done.
NSLog(@”Local callback: A test file which was submitted for direct recognition via the audio driver is done.”); // Log it.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) { // If we’re listening, stop listening.
error = [[OEPocketsphinxController sharedInstance] stopListening];
if(error) NSLog(@”Error while stopping listening in testRecognitionCompleted: %@”, error);
}}
/** Pocketsphinx couldn't start because it has no mic permissions (will only be returned on iOS7 or later).*/
- (void) pocketsphinxFailedNoMicPermissions {
    NSLog(@"Local callback: The user has never set mic permissions or denied permission to this app's mic, so listening will not start.");
    self.startupFailedDueToLackOfPermissions = TRUE;
    if([OEPocketsphinxController sharedInstance].isListening){
        NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // Stop listening if we are listening.
        if(error) NSLog(@"Error while stopping listening in pocketsphinxFailedNoMicPermissions: %@", error);
    }
}

/** The user prompt to get mic permissions, or a check of the mic permissions, has completed with a TRUE or a FALSE result (will only be returned on iOS7 or later).*/
- (void) micPermissionCheckCompleted:(BOOL)result {
    if(result) {
        self.restartAttemptsDueToPermissionRequests++;
        if(self.restartAttemptsDueToPermissionRequests == 1 && self.startupFailedDueToLackOfPermissions) { // If we get here because there was an attempt to start which failed due to lack of permissions, and now permissions have been requested and they returned true, we restart exactly once with the new permissions.
            if(![OEPocketsphinxController sharedInstance].isListening) { // If there was no error and we aren't listening, start listening.
                // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition.
                [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
                self.startupFailedDueToLackOfPermissions = FALSE;
            }
        }
    }
}

#pragma mark -
#pragma mark UI

// This is not OpenEars-specific stuff, just some UI behavior
- (IBAction) suspendListeningButtonAction { // This is the action for the button which suspends listening without ending the recognition loop
    [[OEPocketsphinxController sharedInstance] suspendRecognition];

    self.startButton.hidden = TRUE;
    self.stopButton.hidden = FALSE;
    self.suspendListeningButton.hidden = TRUE;
    self.resumeListeningButton.hidden = FALSE;
}

- (IBAction) resumeListeningButtonAction { // This is the action for the button which resumes listening if it has been suspended
    [[OEPocketsphinxController sharedInstance] resumeRecognition];

    self.startButton.hidden = TRUE;
    self.stopButton.hidden = FALSE;
    self.suspendListeningButton.hidden = FALSE;
    self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) stopButtonAction { // This is the action for the button which shuts down the recognition loop.
    NSError *error = nil;
    if([OEPocketsphinxController sharedInstance].isListening) { // Stop if we are currently listening.
        error = [[OEPocketsphinxController sharedInstance] stopListening];
        if(error) NSLog(@"Error stopping listening in stopButtonAction: %@", error);
    }
    self.startButton.hidden = FALSE;
    self.stopButton.hidden = TRUE;
    self.suspendListeningButton.hidden = TRUE;
    self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) startButtonAction { // This is the action for the button which starts up the recognition loop again if it has been shut down.
    if(![OEPocketsphinxController sharedInstance].isListening) {
        [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
        // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren't already listening.
    }
    self.startButton.hidden = TRUE;
    self.stopButton.hidden = FALSE;
    self.suspendListeningButton.hidden = FALSE;
    self.resumeListeningButton.hidden = TRUE;
}

#pragma mark -
#pragma mark Example for reading out Pocketsphinx and Flite audio levels without locking the UI by using an NSTimer

// What follows are not OpenEars methods, just an approach for level reading
// that I've included with this sample app. My example implementation does make use of two OpenEars
// methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
// method of OEFliteController.
//
// The example is meant to show one way that you can read those levels continuously without locking the UI,
// by using an NSTimer, but the OpenEars level-reading methods
// themselves do not include multithreading code since I believe that you will want to design your own
// code approaches for level display that are tightly-integrated with your interaction design and the
// graphics API you choose.
//
// Please note that if you use my sample approach, you should pay attention to the way that the timer is always stopped in
// dealloc. This should prevent you from having any difficulties with deallocating a class due to a running NSTimer process.

- (void) startDisplayingLevels { // Start displaying the levels using a timer
    [self stopDisplayingLevels]; // We never want more than one timer valid so we'll stop any running timers first.
    self.uiUpdateTimer = [NSTimer scheduledTimerWithTimeInterval:1.0/kLevelUpdatesPerSecond target:self selector:@selector(updateLevelsUI) userInfo:nil repeats:YES];
}

- (void) stopDisplayingLevels { // Stop displaying the levels by stopping the timer if it's running.
    if(self.uiUpdateTimer && [self.uiUpdateTimer isValid]) { // If there is a running timer, we'll stop it here.
        [self.uiUpdateTimer invalidate];
        self.uiUpdateTimer = nil;
    }
}

- (void) updateLevelsUI { // And here is how we obtain the levels. This method includes the actual OpenEars methods and uses their results to update the UI of this view controller.
    self.pocketsphinxDbLabel.text = [NSString stringWithFormat:@"Pocketsphinx Input level:%f",[[OEPocketsphinxController sharedInstance] pocketsphinxInputLevel]]; // pocketsphinxInputLevel is an OpenEars method of the class OEPocketsphinxController.
    if(self.fliteController.speechInProgress) {
        self.fliteDbLabel.text = [NSString stringWithFormat:@"Flite Output level: %f",[self.fliteController fliteOutputLevel]]; // fliteOutputLevel is an OpenEars method of the class OEFliteController.
    }
}

- (void) rapidEarsDidReceiveLiveSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
    NSLog(@"rapidEarsDidReceiveLiveSpeechHypothesis: %@",hypothesis);
}

- (void) rapidEarsDidReceiveFinishedSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
    NSLog(@"rapidEarsDidReceiveFinishedSpeechHypothesis: %@",hypothesis);
}

@end
[/spoiler]
April 5, 2016 at 11:03 am #1029930
Halle Winkler (Politepix)
Hi Laurent,
No problem, if you discover something big I will definitely look into it. Thanks for updating me.
April 5, 2016 at 1:02 pm #1029933
Laurent (Participant)
I’ve edited the previous post. I hadn’t seen your answer, sorry.
Could you take a look at the problem, please?
Regards,
Laurent
April 5, 2016 at 2:09 pm #1029934
Halle Winkler (Politepix)
Hi Laurent,
Take a look at what I wrote previously – to investigate a memory issue when I’m not aware of any existing issue, I need to hear about specifics that you are seeing using Instruments. That means telling me something about which specific objects you see as being reported as growing indefinitely, why it doesn’t seem right for them to use the memory they use, and by what amount over what time period they grow.
When I run the sample app with your code, I see correct behavior: there are some micro leaks from Flite and Pocketsphinx, the memory usage of the OpenEars classes is reasonably frugal during listening, and the large amount of extra memory used during the final recognition is released immediately afterwards. When listening is stopped, it is all released. At no time is an unusual amount of memory allocated (given the task). I don’t see any data in here yet that would lead to a correlation with your crash. I think it’s important to start by recreating the crash condition and looking at the data you see yourself, rather than working from the opposite direction of expecting a correlation without information about it.
A reason that OpenEars or RapidEars will correctly increase the memory footprint over the life of a listening session (not leak) is because the data object where the utterance is held before recognition eventually will become large enough to hold the longest utterance that was spoken in the session. That isn’t a leak, it’s just a single buffer finding the ceiling needed in order to hold the full amount of data it is being asked to hold. This isn’t a sign of a problem since it can be released when it isn’t being used. If you watch the persistent heap bytes for a while in Instruments you will also see the numbers decrease when there is no speech (and decrease dramatically when listening is stopped), if you wait a while.
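To make the difference between a leak and a buffer finding its ceiling concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It is not OpenEars or RapidEars code: the type and function names are hypothetical, and how the frameworks manage their buffers internally is an assumption. It only illustrates a single allocation that grows to fit the longest utterance seen so far and is freed when the session ends.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical utterance buffer: its capacity only grows, so over a session
   it settles at the size of the longest utterance it was asked to hold. */
typedef struct {
    unsigned char *data;
    size_t length;    /* bytes used by the current utterance */
    size_t capacity;  /* bytes allocated, i.e. the high-water mark */
} utterance_buffer;

/* Store one utterance, growing the allocation only when it is too small.
   Returns 0 on success, -1 if the allocation fails. */
static int utterance_buffer_store(utterance_buffer *buf, const void *samples, size_t n) {
    assert(buf != NULL);
    if (n > buf->capacity) {
        unsigned char *grown = realloc(buf->data, n);
        if (grown == NULL) return -1;
        buf->data = grown;
        buf->capacity = n;
    }
    if (n > 0) memcpy(buf->data, samples, n);
    buf->length = n;
    return 0;
}

/* Release everything, as stopping the listening session would. */
static void utterance_buffer_release(utterance_buffer *buf) {
    assert(buf != NULL);
    free(buf->data);
    buf->data = NULL;
    buf->length = 0;
    buf->capacity = 0;
}
```

Storing utterances of 100, 5000 and then 300 bytes leaves the capacity at 5000: the footprint rose once and then stayed flat, and a call to the release function returns all of it. A leak, by contrast, would keep adding allocations that nothing can free.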
I am still up for looking into very specific reports (which include full logging, specific objects, etc) if you can recreate your crash and have some evidence that it relates to one of the libraries.
April 5, 2016 at 2:14 pm #1029935
Halle Winkler (Politepix)
Keep in mind that if you are running RapidEars for extremely long sessions (some apps run it for days), it is probably a good idea to occasionally stop and restart listening so any big buffers from unusually long utterances can be cleaned up.
Does your app respond to low memory conditions (for instance, by stopping listening)? They can also be caused by any number of other events on the device.
April 5, 2016 at 2:58 pm #1029936
Laurent (Participant)
Hi Halle,
Here are some screenshots from my test in Instruments.
I ran the sample app for 6 min 30 s and clicked the stop listening button at 5 min 30 s.
I took screenshots of the Instruments window to show you what I see.
This screenshot is the start of the app:
https://www.dropbox.com/s/vj2rd97mb1xkdka/start.png?dl=0
With this state:
https://www.dropbox.com/s/2kowup4bgsyki8w/Capture%20start.png?dl=0
And this is the end of the test:
https://www.dropbox.com/s/gkmi1fooep0xbqv/end.png?dl=0
With this memory state:
You can see a lot of mallocs by OpenEarsSampleApp, and there are other mallocs from the app that I haven’t captured in the screenshots.
You can count about 755 (762 - 7) mallocs of 256 KB each, which is about 193 MB.
https://www.dropbox.com/s/63y8qa9hujwkz6u/Capture%20end%20.png?dl=0
https://www.dropbox.com/s/3kamesws515ni8n/Capture%20end2.png?dl=0
April 5, 2016 at 3:00 pm #1029937
Laurent (Participant)
For the moment I’m not handling low-memory conditions for the voice recognition in my app. I will do it. Thank you for the tip.
April 6, 2016 at 6:41 am #1029942
Laurent (Participant)
Hi Mr Halle,
Here is the link to a new Instruments trace:
https://www.dropbox.com/s/erzn8sjmr43lmjr/Instruments_report_for_MrHalle.trace.zip?dl=0
I clicked the stop listening button at 5 min 10 s.
April 6, 2016 at 9:55 am #1029946
Halle Winkler (Politepix)
Super, thanks. A couple of questions – is it a particularly noisy area? It’s unusual to see continuous empty utterances like this log has.
My second question is that I noticed that your firm is one of the customers who received my email on March 8th 2016 about needing to replace your licensed copy of RapidEars 2.5. Did you replace it? The email was sent to df@ your company name since that was the address used for the purchase.
April 6, 2016 at 10:02 am #1029947
Laurent (Participant)
It is an office, so sometimes people are speaking.
Yes, we made the changes for the license.
Thank you.
April 6, 2016 at 10:07 am #1029949
Halle Winkler (Politepix)
OK, I will replicate this and figure out what’s happening. You can also remove your downloads that you linked to in this discussion if you want to since I have a copy now.
April 6, 2016 at 10:08 am #1029950
Laurent (Participant)
Thank you!
Regards,
Laurent
April 6, 2016 at 12:21 pm #1029951
Halle Winkler (Politepix)
OK, with the Instruments file and the full logging I can see what is happening here (and why I missed it). It looks like the decoder is orphaned in memory only when the session can’t be stopped gracefully due to audio conflicts, as an alternative to causing an exception, and this edge case is missing from my testbed. I’ll add a test and fix this in the next update as a high-priority fix; thank you for your report. With some luck it could be this week or next. For your own info, it’s an issue with OpenEars and not with the plugins.
April 14, 2016 at 6:13 am #1030050
Laurent (Participant)
Hi Mr Halle,
We will wait for the upgrade then! Please notify us when it is available.
Thank you.
Regards,
Laurent
April 20, 2016 at 3:41 pm #1030105
Halle Winkler (Politepix)
Hi Laurent,
You can read and subscribe to update information for all Politepix frameworks/plugins here so you can get that information automatically: http://changelogs.politepix.com
April 20, 2016 at 5:09 pm #1030106
Halle Winkler (Politepix)
Hi Laurent,
Does this still happen if you set
[OEPocketsphinxController sharedInstance].legacy3rdPassMode = TRUE;
at the time that you are otherwise configuring OEPocketsphinxController?
April 21, 2016 at 6:31 am #1030116
Laurent (Participant)
Hi Mr Halle,
I have tested it and it still has the same behaviour.
I hope you will find the problem!
Thank you.
Regards,
Laurent
April 21, 2016 at 9:08 am #1030121
Halle Winkler (Politepix)
Greetings Laurent,
Unfortunately I haven’t been able to replicate the issue locally, although I’ve been trying. I’m also a bit confused that you say [OEPocketsphinxController sharedInstance].legacy3rdPassMode = TRUE doesn’t affect it: the issue in the logging appears to be caused by an overly-long 3rd-pass search, and using legacy3rdPassMode with RapidEars turns 3rd-pass searches off, so the issue shouldn’t be possible with that setting. That suggests that it could be an issue local to your install.
When I examine your logs more closely, the Rejecto version linked is 2.5, but it looks like the RapidEars version linked is older than 2.5. I think that somehow your project is not really linking to the current version of RapidEars, though I believe you downloaded it – perhaps the linked version is in a different location from the downloaded version.
Can you do some troubleshooting of why the current version of RapidEars doesn’t seem to be linked to the project and then let me know whether you still have this issue? I’m currently at a dead end in demonstrating it in my own testbed and it could be due to linking to an old RapidEars version.
When you’ve successfully linked to RapidEars 2.5, there will be a line of logging in your OELogging output giving the version number of your RapidEars framework, right at the beginning. Thanks!
April 21, 2016 at 10:51 am #1030131
Halle Winkler (Politepix)
Some hints for the troubleshooting process: Xcode links to the framework both via the file navigator and via the Framework Search Paths entry of Build Settings. It is possible that they don’t both point to the same thing. I would probably start by searching my system for RapidEars.framework and removing all copies of it found, making sure that the app project breaks, removing the RapidEars.framework entry from Framework Search Paths, and then downloading your RapidEars 2.5 framework from the licensee site and installing it to the app project fresh. At that point you should see the line of logging stating the version number of RapidEars.
April 21, 2016 at 12:33 pm #1030132
Laurent (Participant)
In my app I can see RapidEars 2.5 in the OELogging output.
Even with legacy3rdPassMode set to TRUE I still have the same behaviour.
But in the sample app it does not work. I’ve tried with the demo framework and with our RapidEars framework.
I have checked that the path is the same in both the file navigator and Framework Search Paths.
This is my code. Perhaps you can help me find out why?
[spoiler]
#import "ViewController.h"
#import <OpenEars/OEPocketsphinxController.h>
#import <RapidEars/OEPocketsphinxController+RapidEars.h>
#import <OpenEars/OEFliteController.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <Rejecto/OELanguageModelGenerator+Rejecto.h>
#import <OpenEars/OELogging.h>
#import <OpenEars/OEAcousticModel.h>
#import <Slt/Slt.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEEventsObserver.h>
#import <RapidEars/OEEventsObserver+RapidEars.h>
@interface ViewController()

// UI actions, not specifically related to OpenEars other than the fact that they invoke OpenEars methods.
- (IBAction) stopButtonAction;
- (IBAction) startButtonAction;
- (IBAction) suspendListeningButtonAction;
- (IBAction) resumeListeningButtonAction;

// Example for reading out the input audio levels without locking the UI using an NSTimer
- (void) startDisplayingLevels;
- (void) stopDisplayingLevels;

// These three are the important OpenEars objects that this class demonstrates the use of.
@property (nonatomic, strong) Slt *slt;

@property (nonatomic, strong) OEEventsObserver *openEarsEventsObserver;
@property (nonatomic, strong) OEPocketsphinxController *pocketsphinxController;
@property (nonatomic, strong) OEFliteController *fliteController;

// Some UI, not specifically related to OpenEars.
@property (nonatomic, strong) IBOutlet UIButton *stopButton;
@property (nonatomic, strong) IBOutlet UIButton *startButton;
@property (nonatomic, strong) IBOutlet UIButton *suspendListeningButton;
@property (nonatomic, strong) IBOutlet UIButton *resumeListeningButton;
@property (nonatomic, strong) IBOutlet UITextView *statusTextView;
@property (nonatomic, strong) IBOutlet UITextView *heardTextView;
@property (nonatomic, strong) IBOutlet UILabel *pocketsphinxDbLabel;
@property (nonatomic, strong) IBOutlet UILabel *fliteDbLabel;
@property (nonatomic, assign) BOOL usingStartingLanguageModel;
@property (nonatomic, assign) int restartAttemptsDueToPermissionRequests;
@property (nonatomic, assign) BOOL startupFailedDueToLackOfPermissions;

// Things which help us show off the dynamic language features.
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedDictionary;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedDictionary;

// Our NSTimer that will help us read and display the input and output levels without locking the UI
@property (nonatomic, strong) NSTimer *uiUpdateTimer;

@end
@implementation ViewController
#define kLevelUpdatesPerSecond 18 // We'll have the ui update 18 times a second to show some fluidity without hitting the CPU too hard.
//#define kGetNbest // Uncomment this if you want to try out nbest
#pragma mark -
#pragma mark Memory Management

- (void)dealloc {
    [self stopDisplayingLevels];
}

#pragma mark -
#pragma mark View Lifecycle

- (void)viewDidLoad {
    [super viewDidLoad];
    self.fliteController = [[OEFliteController alloc] init];
    self.openEarsEventsObserver = [[OEEventsObserver alloc] init];
    self.openEarsEventsObserver.delegate = self;
    self.slt = [[Slt alloc] init];

    self.restartAttemptsDueToPermissionRequests = 0;
    self.startupFailedDueToLackOfPermissions = FALSE;

    [OEPocketsphinxController sharedInstance].verbosePocketSphinx = TRUE; // Uncomment this for much more verbose speech recognition engine output. If you have issues, show this logging in the forums.
    [self.openEarsEventsObserver setDelegate:self]; // Make this class the delegate of OpenEarsObserver so we can get all of the messages about what OpenEars is doing.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this before setting any OEPocketsphinxController characteristics
    // This is the language model we're going to start up with. The only reason I'm making it a class property is that I reuse it a bunch of times in this example,
    // but you can pass the string contents directly to OEPocketsphinxController:startListeningWithLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF:

    NSArray *firstLanguageArray = @[@"BACKWARD",
                                    @"CHANGE",
                                    @"FORWARD",
                                    @"GO",
                                    @"LEFT",
                                    @"MODEL",
                                    @"RIGHT",
                                    @"TURN"];

    OELanguageModelGenerator *languageModelGenerator = [[OELanguageModelGenerator alloc] init];
    [OELogging startOpenEarsLogging]; // Uncomment me for OELogging, which is verbose logging about internal OpenEars operations such as audio settings. If you have issues, show this logging in the forums.

    // languageModelGenerator.verboseLanguageModelGenerator = TRUE; // Uncomment me for verbose language model generator debug output.
    NSError *error = [languageModelGenerator generateRejectingLanguageModelFromArray:firstLanguageArray withFilesNamed:@"FirstOpenEarsDynamicLanguageModel" withOptionalExclusions:nil
                                                                      usingVowelsOnly:FALSE
                                                                           withWeight:[ NSNumber numberWithFloat:2.0 ]
                                                               forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" in order to create a language model for Spanish recognition instead of English.
    if(error) {
        NSLog(@"Dynamic language generator reported error %@", [error description]);
    } else {
        self.pathToFirstDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"FirstOpenEarsDynamicLanguageModel"];
        self.pathToFirstDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"FirstOpenEarsDynamicLanguageModel"];
    }

    self.usingStartingLanguageModel = TRUE; // This is not an OpenEars thing, this is just so I can switch back and forth between the two models in this sample app.
    // Here is an example of dynamically creating an in-app grammar.
    // We want it to be able to respond to the speech "CHANGE MODEL" and a few other things. Items we want to have recognized as a whole phrase (like "CHANGE MODEL")
    // we put into the array as one string (e.g. "CHANGE MODEL" instead of "CHANGE" and "MODEL"). This increases the probability that they will be recognized as a phrase. This works even better starting with version 1.0 of OpenEars.

    NSArray *secondLanguageArray = @[@"SUNDAY",
                                     @"MONDAY",
                                     @"TUESDAY",
                                     @"WEDNESDAY",
                                     @"THURSDAY",
                                     @"FRIDAY",
                                     @"SATURDAY",
                                     @"QUIDNUNC",
                                     @"CHANGE MODEL"];

    // The last entry, quidnunc, is an example of a word which will not be found in the lookup dictionary and will be passed to the fallback method. The fallback method is slower,
    // so, for instance, creating a new language model from dictionary words will be pretty fast, but a model that has a lot of unusual names in it or invented/rare/recent-slang
    // words will be slower to generate. You can use this information to give your users good UI feedback about what the expectations for wait times should be.

    // I don't think it's beneficial to lazily instantiate OELanguageModelGenerator because you only need to give it a single message and then release it.
    // If you need to create a very large model or any size of model that has many unusual words that have to make use of the fallback generation method,
    // you will want to run this on a background thread so you can give the user some UI feedback that the task is in progress.

    // generateLanguageModelFromArray:withFilesNamed returns an NSError which will either have a value of noErr if everything went fine or a specific error if it didn't.
    error = [languageModelGenerator generateRejectingLanguageModelFromArray:secondLanguageArray withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" withOptionalExclusions:nil
                                                             usingVowelsOnly:FALSE
                                                                  withWeight:[ NSNumber numberWithFloat:2.0 ]
                                                      forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" in order to create a language model for Spanish recognition instead of English.

    // NSError *error = [languageModelGenerator generateLanguageModelFromTextFile:[NSString stringWithFormat:@"%@/%@",[[NSBundle mainBundle] resourcePath], @"OpenEarsCorpus.txt"] withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Try this out to see how generating a language model from a corpus works.
    if(error) {
        NSLog(@"Dynamic language generator reported error %@", [error description]);
    } else {
        self.pathToSecondDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We'll set our new .languagemodel file to be the one to get switched to when the words "CHANGE MODEL" are recognized.
        self.pathToSecondDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We'll set our new dictionary to be the one to get switched to when the words "CHANGE MODEL" are recognized.

        // Next, an informative message.
        NSLog(@"\n\nWelcome to the OpenEars sample project. This project understands the words:\nBACKWARD,\nCHANGE,\nFORWARD,\nGO,\nLEFT,\nMODEL,\nRIGHT,\nTURN,\nand if you say \"CHANGE MODEL\" it will switch to its dynamically-generated model which understands the words:\nCHANGE,\nMODEL,\nMONDAY,\nTUESDAY,\nWEDNESDAY,\nTHURSDAY,\nFRIDAY,\nSATURDAY,\nSUNDAY,\nQUIDNUNC");
        // This is how to start the continuous listening loop of an available instance of OEPocketsphinxController. We won't do this if the language generation failed since it will be listening for a command to change over to the generated language.
        [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this once before setting properties of the OEPocketsphinxController instance.
        // [OEPocketsphinxController sharedInstance].pathToTestFile = [[NSBundle mainBundle] pathForResource:@"change_model_short" ofType:@"wav"]; // This is how you could use a test WAV (mono/16-bit/16k) rather than live recognition. Don't forget to add your WAV to your app bundle.
        if(![OEPocketsphinxController sharedInstance].isListening) {
            [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
            // [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren't already listening.
        }
        // [self startDisplayingLevels] is not an OpenEars method, just a very simple approach for level reading
        // that I've included with this sample app. My example implementation does make use of two OpenEars
        // methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
        // method of fliteController.
        //
        // The example is meant to show one way that you can read those levels continuously without locking the UI,
        // by using an NSTimer, but the OpenEars level-reading methods
        // themselves do not include multithreading code since I believe that you will want to design your own
        // code approaches for level display that are tightly-integrated with your interaction design and the
        // graphics API you choose.

        [self startDisplayingLevels];
        // Here is some UI stuff that has nothing specifically to do with OpenEars implementation
        self.startButton.hidden = TRUE;
        self.stopButton.hidden = TRUE;
        self.suspendListeningButton.hidden = TRUE;
        self.resumeListeningButton.hidden = TRUE;
    }
}

#pragma mark -
#pragma mark OEEventsObserver delegate methods// What follows are all of the delegate methods you can optionally use once you’ve instantiated an OEEventsObserver and set its delegate to self.
// I’ve provided some pretty granular information about the exact phase of the Pocketsphinx listening loop, the Audio Session, and Flite, but I’d expect
// that the ones that will really be needed by most projects are the following:
//
//- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID;
//- (void) audioSessionInterruptionDidBegin;
//- (void) audioSessionInterruptionDidEnd;
//- (void) audioRouteDidChangeToRoute:(NSString *)newRoute;
//- (void) pocketsphinxDidStartListening;
//- (void) pocketsphinxDidStopListening;
//
// It isn’t necessary to have a OEPocketsphinxController or a OEFliteController instantiated in order to use these methods. If there isn’t anything instantiated that will
// send messages to an OEEventsObserver, all that will happen is that these methods will never fire. You also do not have to create a OEEventsObserver in
// the same class or view controller in which you are doing things with a OEPocketsphinxController or OEFliteController; you can receive updates from those objects in
// any class in which you instantiate an OEEventsObserver and set its delegate to self.// This is an optional delegate method of OEEventsObserver which delivers the text of speech that Pocketsphinx heard and analyzed, along with its accuracy score and utterance ID.
– (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {NSLog(@”Local callback: The received hypothesis is %@ with a score of %@ and an ID of %@”, hypothesis, recognitionScore, utteranceID); // Log it.
if([hypothesis isEqualToString:@”CHANGE MODEL”]) { // If the user says “CHANGE MODEL”, we will switch to the alternate model (which happens to be the dynamically generated model).// Here is an example of language model switching in OpenEars. Deciding on what logical basis to switch models is your responsibility.
// For instance, when you call a customer service line and get a response tree that takes you through different options depending on what you say to it,
// the models are being switched as you progress through it so that only relevant choices can be understood. The construction of that logical branching and
// how to react to it is your job; OpenEars just lets you send the signal to switch the language model when you’ve decided it’s the right time to do so.if(self.usingStartingLanguageModel) { // If we’re on the starting model, switch to the dynamically generated one.
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToSecondDynamicallyGeneratedLanguageModel withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
self.usingStartingLanguageModel = FALSE;} else { // If we’re on the dynamically generated model, switch to the start model (this is an example of a trigger and method for switching models).
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToFirstDynamicallyGeneratedLanguageModel withDictionary:self.pathToFirstDynamicallyGeneratedDictionary];
self.usingStartingLanguageModel = TRUE;
}
}
self.heardTextView.text = [NSString stringWithFormat:@"Heard: \"%@\"", hypothesis]; // Show it in the status box.
// This is how to use an available instance of OEFliteController. We’re going to repeat back the command that we heard with the voice we’ve chosen.
[self.fliteController say:[NSString stringWithFormat:@”You said %@”,hypothesis] withVoice:self.slt];
}

#ifdef kGetNbest
- (void) pocketsphinxDidReceiveNBestHypothesisArray:(NSArray *)hypothesisArray { // Pocketsphinx has an n-best hypothesis dictionary.
NSLog(@"Local callback: hypothesisArray is %@",hypothesisArray);
}
#endif
// An optional delegate method of OEEventsObserver which informs that there was an interruption to the audio session (e.g. an incoming phone call).
– (void) audioSessionInterruptionDidBegin {
NSLog(@”Local callback: AudioSession interruption began.”); // Log it.
self.statusTextView.text = @”Status: AudioSession interruption began.”; // Show it in the status box.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) {
error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening (if it is listening) since it will need to restart its loop after an interruption.
if(error) NSLog(@”Error while stopping listening in audioSessionInterruptionDidBegin: %@”, error);
}
}

// An optional delegate method of OEEventsObserver which informs that the interruption to the audio session ended.
- (void) audioSessionInterruptionDidEnd {
NSLog(@”Local callback: AudioSession interruption ended.”); // Log it.
self.statusTextView.text = @”Status: AudioSession interruption ended.”; // Show it in the status box.
// We’re restarting the previously-stopped listening loop.
if(![OEPocketsphinxController sharedInstance].isListening){
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t currently listening.
}
}

// An optional delegate method of OEEventsObserver which informs that the audio input became unavailable.
- (void) audioInputDidBecomeUnavailable {
NSLog(@”Local callback: The audio input has become unavailable”); // Log it.
self.statusTextView.text = @”Status: The audio input has become unavailable”; // Show it in the status box.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening){
error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening since there is no available input (but only if we are listening).
if(error) NSLog(@”Error while stopping listening in audioInputDidBecomeUnavailable: %@”, error);
}
}

// An optional delegate method of OEEventsObserver which informs that the unavailable audio input became available again.
- (void) audioInputDidBecomeAvailable {
NSLog(@”Local callback: The audio input is available”); // Log it.
self.statusTextView.text = @”Status: The audio input is available”; // Show it in the status box.
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”] languageModelIsJSGF:FALSE]; // Start speech recognition, but only if we aren’t already listening.
}
}
// An optional delegate method of OEEventsObserver which informs that there was a change to the audio route (e.g. headphones were plugged in or unplugged).
– (void) audioRouteDidChangeToRoute:(NSString *)newRoute {
NSLog(@”Local callback: Audio route change. The new audio route is %@”, newRoute); // Log it.
self.statusTextView.text = [NSString stringWithFormat:@"Status: Audio route change. The new audio route is %@",newRoute]; // Show it in the status box.
NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling the Pocketsphinx loop to shut down and then start listening again on the new route
if(error) NSLog(@"Local callback: error while stopping listening in audioRouteDidChangeToRoute: %@",error);
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t already listening.
}
}

// An optional delegate method of OEEventsObserver which informs that the Pocketsphinx recognition loop has entered its actual loop.
// This might be useful in debugging a conflict between another sound class and Pocketsphinx.
- (void) pocketsphinxRecognitionLoopDidStart {
NSLog(@"Local callback: Pocketsphinx started."); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx started.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is now listening for speech.
- (void) pocketsphinxDidStartListening {
NSLog(@"Local callback: Pocketsphinx is now listening."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx is now listening."; // Show it in the status box.
self.startButton.hidden = TRUE; // React to it with some UI changes.
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected speech and is starting to process it.
- (void) pocketsphinxDidDetectSpeech {
NSLog(@”Local callback: Pocketsphinx has detected speech.”); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx has detected speech.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected a second of silence, indicating the end of an utterance.
// This was added because developers requested being able to time the recognition speed without the speech time. The processing time is the time between
// this method being called and the hypothesis being returned.
– (void) pocketsphinxDidDetectFinishedSpeech {
NSLog(@”Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.”); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx has detected finished speech.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx has exited its recognition loop, most
// likely in response to the OEPocketsphinxController being told to stop listening via the stopListening method.
– (void) pocketsphinxDidStopListening {
NSLog(@”Local callback: Pocketsphinx has stopped listening.”); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx has stopped listening.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop but it is not
// going to react to speech until listening is resumed. This can happen as a result of Flite speech being
// in progress on an audio route that doesn’t support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to suspend recognition via the suspendRecognition method.
– (void) pocketsphinxDidSuspendRecognition {
NSLog(@”Local callback: Pocketsphinx has suspended recognition.”); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx has suspended recognition.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop and after recognition
// having been suspended it is now resuming. This can happen as a result of Flite speech completing
// on an audio route that doesn’t support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to resume recognition via the resumeRecognition method.
– (void) pocketsphinxDidResumeRecognition {
NSLog(@”Local callback: Pocketsphinx has resumed recognition.”); // Log it.
self.statusTextView.text = @”Status: Pocketsphinx has resumed recognition.”; // Show it in the status box.
}

// An optional delegate method which informs that Pocketsphinx switched over to a new language model at the given URL in the course of
// recognition. This does not imply that it is a valid file or that recognition will be successful using the file.
– (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
NSLog(@”Local callback: Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@”,newLanguageModelPathAsString,newDictionaryPathAsString);
}

// An optional delegate method of OEEventsObserver which informs that Flite is speaking, most likely to be useful if debugging a
// complex interaction between sound classes. You don’t have to do anything yourself in order to prevent Pocketsphinx from listening to Flite talk and trying to recognize the speech.
– (void) fliteDidStartSpeaking {
NSLog(@”Local callback: Flite has started speaking”); // Log it.
self.statusTextView.text = @”Status: Flite has started speaking.”; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Flite is finished speaking, most likely to be useful if debugging a
// complex interaction between sound classes.
– (void) fliteDidFinishSpeaking {
NSLog(@”Local callback: Flite has finished speaking”); // Log it.
self.statusTextView.text = @”Status: Flite has finished speaking.”; // Show it in the status box.
}

- (void) pocketSphinxContinuousSetupDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop startup. Turn on [OELogging startOpenEarsLogging] to learn why.
NSLog(@”Local callback: Setting up the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.”, reasonForFailure); // Log it.
self.statusTextView.text = @”Status: Not possible to start recognition loop.”; // Show it in the status box.
}

- (void) pocketSphinxContinuousTeardownDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop teardown. Turn on [OELogging startOpenEarsLogging] to learn why.
NSLog(@”Local callback: Tearing down the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.”, reasonForFailure); // Log it.
self.statusTextView.text = @”Status: Not possible to cleanly end recognition loop.”; // Show it in the status box.
}

- (void) testRecognitionCompleted { // A test file which was submitted for direct recognition via the audio driver is done.
NSLog(@”Local callback: A test file which was submitted for direct recognition via the audio driver is done.”); // Log it.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) { // If we’re listening, stop listening.
error = [[OEPocketsphinxController sharedInstance] stopListening];
if(error) NSLog(@”Error while stopping listening in testRecognitionCompleted: %@”, error);
}
}

- (void) rapidEarsDidReceiveLiveSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
NSLog(@”rapidEarsDidReceiveLiveSpeechHypothesis: %@”,hypothesis);
}

- (void) rapidEarsDidReceiveFinishedSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
NSLog(@”rapidEarsDidReceiveFinishedSpeechHypothesis: %@”,hypothesis);
}
/** Pocketsphinx couldn’t start because it has no mic permissions (will only be returned on iOS7 or later).*/
– (void) pocketsphinxFailedNoMicPermissions {
NSLog(@”Local callback: The user has never set mic permissions or denied permission to this app’s mic, so listening will not start.”);
self.startupFailedDueToLackOfPermissions = TRUE;
if([OEPocketsphinxController sharedInstance].isListening){
NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // Stop listening if we are listening.
if(error) NSLog(@"Error while stopping listening in pocketsphinxFailedNoMicPermissions: %@", error);
}
}

/** The user prompt to get mic permissions, or a check of the mic permissions, has completed with a TRUE or a FALSE result (will only be returned on iOS7 or later).*/
- (void) micPermissionCheckCompleted:(BOOL)result {
if(result) {
self.restartAttemptsDueToPermissionRequests++;
if(self.restartAttemptsDueToPermissionRequests == 1 && self.startupFailedDueToLackOfPermissions) { // If we get here because there was an attempt to start which failed due to lack of permissions, and now permissions have been requested and they returned true, we restart exactly once with the new permissions.
if(![OEPocketsphinxController sharedInstance].isListening) { // If there was no error and we aren’t listening, start listening.
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]];
// startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
// dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
// acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]
// languageModelIsJSGF:FALSE]; // Start speech recognition.
self.startupFailedDueToLackOfPermissions = FALSE;
}
}
}
}

#pragma mark -
#pragma mark UI

// This is not OpenEars-specific stuff, just some UI behavior
- (IBAction) suspendListeningButtonAction { // This is the action for the button which suspends listening without ending the recognition loop
[[OEPocketsphinxController sharedInstance] suspendRecognition];
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = TRUE;
self.resumeListeningButton.hidden = FALSE;
}

- (IBAction) resumeListeningButtonAction { // This is the action for the button which resumes listening if it has been suspended
[[OEPocketsphinxController sharedInstance] resumeRecognition];
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) stopButtonAction { // This is the action for the button which shuts down the recognition loop.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) { // Stop if we are currently listening.
error = [[OEPocketsphinxController sharedInstance] stopListening];
if(error)NSLog(@”Error stopping listening in stopButtonAction: %@”, error);
}
self.startButton.hidden = FALSE;
self.stopButton.hidden = TRUE;
self.suspendListeningButton.hidden = TRUE;
self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) startButtonAction { // This is the action for the button which starts up the recognition loop again if it has been shut down.
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t already listening.
}
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

#pragma mark -
#pragma mark Example for reading out Pocketsphinx and Flite audio levels without locking the UI by using an NSTimer

// What follows are not OpenEars methods, just an approach for level reading
// that I’ve included with this sample app. My example implementation does make use of two OpenEars
// methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
// method of OEFliteController.
//
// The example is meant to show one way that you can read those levels continuously without locking the UI,
// by using an NSTimer, but the OpenEars level-reading methods
// themselves do not include multithreading code since I believe that you will want to design your own
// code approaches for level display that are tightly-integrated with your interaction design and the
// graphics API you choose.
//
// Please note that if you use my sample approach, you should pay attention to the way that the timer is always stopped in
// dealloc. This should prevent you from having any difficulties with deallocating a class due to a running NSTimer process.

- (void) startDisplayingLevels { // Start displaying the levels using a timer
[self stopDisplayingLevels]; // We never want more than one timer valid so we’ll stop any running timers first.
self.uiUpdateTimer = [NSTimer scheduledTimerWithTimeInterval:1.0/kLevelUpdatesPerSecond target:self selector:@selector(updateLevelsUI) userInfo:nil repeats:YES];
}

- (void) stopDisplayingLevels { // Stop displaying the levels by stopping the timer if it’s running.
if(self.uiUpdateTimer && [self.uiUpdateTimer isValid]) { // If there is a running timer, we’ll stop it here.
[self.uiUpdateTimer invalidate];
self.uiUpdateTimer = nil;
}
}

- (void) updateLevelsUI { // And here is how we obtain the levels. This method includes the actual OpenEars methods and uses their results to update the UI of this view controller.
self.pocketsphinxDbLabel.text = [NSString stringWithFormat:@”Pocketsphinx Input level:%f”,[[OEPocketsphinxController sharedInstance] pocketsphinxInputLevel]]; //pocketsphinxInputLevel is an OpenEars method of the class OEPocketsphinxController.
if(self.fliteController.speechInProgress) {
self.fliteDbLabel.text = [NSString stringWithFormat:@”Flite Output level: %f”,[self.fliteController fliteOutputLevel]]; // fliteOutputLevel is an OpenEars method of the class OEFliteController.
}
}

@end
[/spoiler]
April 21, 2016 at 12:45 pm #1030134 Halle Winkler (Politepix)
It is definitely not in the logging for the Instruments case you sent, so I’m not sure how to proceed when I can’t replicate it. I’ve run it with every example of challenging audio that I have and I’m out of ideas for how to get it to happen. This was an issue with much older OpenEars versions, but a fix was added for it, so this is a bit mysterious, especially since your previously-sent logs attributed it to a 3rd-pass search and it is apparently still happening when you turn off 3rd-pass searches. Can you create a full replication case according to this post, one that you can run and see behave in the same way:
https://www.politepix.com/forums/topic/how-to-create-a-minimal-case-for-replication/
And send me a link via the contact form? Thank you.
April 21, 2016 at 12:56 pm #1030135 Halle Winkler (Politepix)
What you’re looking for in a replication case is this: when you use your recorded audio from the SaveThatWave demo as a test audio file via pathToTestFile, the call to stop listening gets stuck, unable to stop gracefully, and afterwards (about a minute afterwards) the full amount of memory that was in use at the time of the stop attempt is still allocated.
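For illustration, the replication run described above might be wired up roughly like this in the sample app. This is only a sketch under stated assumptions: the helper name `startReplicationRun` and the file name `my_recording.wav` are hypothetical placeholders, the 60-second delay just mirrors the "minute afterwards" in the description, and the only OpenEars and RapidEars calls used are ones that already appear in the code posted in this thread. The two methods are meant to be placed inside the existing `@implementation ViewController`, which already has the needed imports and properties; they are not a complete class on their own.

```objective-c
// Hypothetical helper (not part of the sample app): run recognition on a
// recorded WAV from the app bundle instead of on live microphone input.
- (void) startReplicationRun {
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this before setting properties.

    // "my_recording" is a placeholder for the WAV saved from the SaveThatWave demo
    // and added to the app bundle.
    [OEPocketsphinxController sharedInstance].pathToTestFile = [[NSBundle mainBundle] pathForResource:@"my_recording" ofType:@"wav"];

    if(![OEPocketsphinxController sharedInstance].isListening) {
        [[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
                                                                                  dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
                                                                               acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
    }
}

// Delegate callback already present in the posted code: the test file is done.
- (void) testRecognitionCompleted {
    if([OEPocketsphinxController sharedInstance].isListening) {
        NSError *error = [[OEPocketsphinxController sharedInstance] stopListening];
        if(error) NSLog(@"Error while stopping listening in testRecognitionCompleted: %@", error);
    }

    // Log a marker about one minute after the stop request, so the allocation
    // level at that moment can be compared in Instruments with the level at stop time.
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(60 * NSEC_PER_SEC)), dispatch_get_main_queue(), ^{
        NSLog(@"One minute after the stop request: compare allocations in Instruments now.");
    });
}
```

The log marker does not measure memory itself; it only gives a timestamp to line up against the Allocations track in Instruments.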
April 21, 2016 at 1:00 pm #1030136 Laurent (Participant)
I don’t see any RapidEars 2.5 entry in the OELogging output when I run the sample app with the Rejecto demo and the RapidEars demo. Why? This is with the code I just posted.
April 21, 2016 at 1:07 pm #1030137 Halle Winkler (Politepix)
The code you just posted won’t work with the RapidEars demo at all; it links to a licensed version.
April 21, 2016 at 1:10 pm #1030138 Laurent (Participant)
Yes, I have changed the imports to the demo ones. Sorry, this is the correct version, but it still does not log RapidEars 2.5 with the demo:
[spoiler]
#import "ViewController.h"
#import <OpenEars/OEPocketsphinxController.h>
#import <RapidEarsDemo/OEPocketsphinxController+RapidEars.h>
#import <OpenEars/OEFliteController.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <RejectoDemo/OELanguageModelGenerator+Rejecto.h>
#import <OpenEars/OELogging.h>
#import <OpenEars/OEAcousticModel.h>
#import <Slt/Slt.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEEventsObserver.h>
#import <RapidEarsDemo/OEEventsObserver+RapidEars.h>
@interface ViewController()

// UI actions, not specifically related to OpenEars other than the fact that they invoke OpenEars methods.
– (IBAction) stopButtonAction;
– (IBAction) startButtonAction;
– (IBAction) suspendListeningButtonAction;
- (IBAction) resumeListeningButtonAction;

// Example for reading out the input audio levels without locking the UI using an NSTimer
– (void) startDisplayingLevels;
- (void) stopDisplayingLevels;

// These three are the important OpenEars objects that this class demonstrates the use of.
@property (nonatomic, strong) Slt *slt;
@property (nonatomic, strong) OEEventsObserver *openEarsEventsObserver;
@property (nonatomic, strong) OEPocketsphinxController *pocketsphinxController;
@property (nonatomic, strong) OEFliteController *fliteController;

// Some UI, not specifically related to OpenEars.
@property (nonatomic, strong) IBOutlet UIButton *stopButton;
@property (nonatomic, strong) IBOutlet UIButton *startButton;
@property (nonatomic, strong) IBOutlet UIButton *suspendListeningButton;
@property (nonatomic, strong) IBOutlet UIButton *resumeListeningButton;
@property (nonatomic, strong) IBOutlet UITextView *statusTextView;
@property (nonatomic, strong) IBOutlet UITextView *heardTextView;
@property (nonatomic, strong) IBOutlet UILabel *pocketsphinxDbLabel;
@property (nonatomic, strong) IBOutlet UILabel *fliteDbLabel;
@property (nonatomic, assign) BOOL usingStartingLanguageModel;
@property (nonatomic, assign) int restartAttemptsDueToPermissionRequests;
@property (nonatomic, assign) BOOL startupFailedDueToLackOfPermissions;

// Things which help us show off the dynamic language features.
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToFirstDynamicallyGeneratedDictionary;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedLanguageModel;
@property (nonatomic, copy) NSString *pathToSecondDynamicallyGeneratedDictionary;

// Our NSTimer that will help us read and display the input and output levels without locking the UI
@property (nonatomic, strong) NSTimer *uiUpdateTimer;

@end

@implementation ViewController
#define kLevelUpdatesPerSecond 18 // We’ll have the ui update 18 times a second to show some fluidity without hitting the CPU too hard.
//#define kGetNbest // Uncomment this if you want to try out nbest
#pragma mark -
#pragma mark Memory Management

- (void)dealloc {
[self stopDisplayingLevels];
}

#pragma mark -
#pragma mark View Lifecycle

- (void)viewDidLoad {
[super viewDidLoad];
self.fliteController = [[OEFliteController alloc] init];
self.openEarsEventsObserver = [[OEEventsObserver alloc] init];
self.openEarsEventsObserver.delegate = self;
self.slt = [[Slt alloc] init];
self.restartAttemptsDueToPermissionRequests = 0;
self.startupFailedDueToLackOfPermissions = FALSE;
[OEPocketsphinxController sharedInstance].verbosePocketSphinx = TRUE; // Uncomment this for much more verbose speech recognition engine output. If you have issues, show this logging in the forums.
[self.openEarsEventsObserver setDelegate:self]; // Make this class the delegate of OpenEarsObserver so we can get all of the messages about what OpenEars is doing.
[[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this before setting any OEPocketsphinxController characteristics
// This is the language model we’re going to start up with. The only reason I’m making it a class property is that I reuse it a bunch of times in this example,
// but you can pass the string contents directly to OEPocketsphinxController:startListeningWithLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF:
NSArray *firstLanguageArray = @[@"BACKWARD",
@"CHANGE",
@"FORWARD",
@"GO",
@"LEFT",
@"MODEL",
@"RIGHT",
@"TURN"];
OELanguageModelGenerator *languageModelGenerator = [[OELanguageModelGenerator alloc] init];
[OELogging startOpenEarsLogging]; // Uncomment me for OELogging, which is verbose logging about internal OpenEars operations such as audio settings. If you have issues, show this logging in the forums.
// languageModelGenerator.verboseLanguageModelGenerator = TRUE; // Uncomment me for verbose language model generator debug output.
NSError *error = [languageModelGenerator generateRejectingLanguageModelFromArray:firstLanguageArray withFilesNamed:@”FirstOpenEarsDynamicLanguageModel” withOptionalExclusions:nil
usingVowelsOnly:FALSE
withWeight:[ NSNumber numberWithFloat:2.0 ]
forAcousticModelAtPath:[OEAcousticModel pathToModel:@”AcousticModelEnglish”]]; // Change “AcousticModelEnglish” to “AcousticModelSpanish” in order to create a language model for Spanish recognition instead of English.
if(error) {
NSLog(@”Dynamic language generator reported error %@”, [error description]);
} else {
self.pathToFirstDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@”FirstOpenEarsDynamicLanguageModel”];
self.pathToFirstDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@”FirstOpenEarsDynamicLanguageModel”];
}
self.usingStartingLanguageModel = TRUE; // This is not an OpenEars thing, this is just so I can switch back and forth between the two models in this sample app.
// Here is an example of dynamically creating an in-app grammar.
// We want it to be able to respond to the speech “CHANGE MODEL” and a few other things. Items we want to have recognized as a whole phrase (like “CHANGE MODEL”)
// we put into the array as one string (e.g. “CHANGE MODEL” instead of “CHANGE” and “MODEL”). This increases the probability that they will be recognized as a phrase. This works even better starting with version 1.0 of OpenEars.
NSArray *secondLanguageArray = @[@"SUNDAY",
@"MONDAY",
@"TUESDAY",
@"WEDNESDAY",
@"THURSDAY",
@"FRIDAY",
@"SATURDAY",
@"QUIDNUNC",
@"CHANGE MODEL"];
// The last entry, quidnunc, is an example of a word which will not be found in the lookup dictionary and will be passed to the fallback method. The fallback method is slower,
// so, for instance, creating a new language model from dictionary words will be pretty fast, but a model that has a lot of unusual names in it or invented/rare/recent-slang
// words will be slower to generate. You can use this information to give your users good UI feedback about what the expectations for wait times should be.
// I don’t think it’s beneficial to lazily instantiate OELanguageModelGenerator because you only need to give it a single message and then release it.
// If you need to create a very large model or any size of model that has many unusual words that have to make use of the fallback generation method,
// you will want to run this on a background thread so you can give the user some UI feedback that the task is in progress.
// generateLanguageModelFromArray:withFilesNamed returns an NSError which will either have a value of noErr if everything went fine or a specific error if it didn’t.
error = [languageModelGenerator generateRejectingLanguageModelFromArray:secondLanguageArray withFilesNamed:@”SecondOpenEarsDynamicLanguageModel” withOptionalExclusions:nil
usingVowelsOnly:FALSE
withWeight:[ NSNumber numberWithFloat:2.0 ]
forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change “AcousticModelEnglish” to “AcousticModelSpanish” in order to create a language model for Spanish recognition instead of English.
// NSError *error = [languageModelGenerator generateLanguageModelFromTextFile:[NSString stringWithFormat:@"%@/%@",[[NSBundle mainBundle] resourcePath], @"OpenEarsCorpus.txt"] withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Try this out to see how generating a language model from a corpus works.
if(error) {
NSLog(@”Dynamic language generator reported error %@”, [error description]);
} else {
self.pathToSecondDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We’ll set our new .languagemodel file to be the one to get switched to when the words “CHANGE MODEL” are recognized.
self.pathToSecondDynamicallyGeneratedDictionary = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"]; // We’ll set our new dictionary to be the one to get switched to when the words “CHANGE MODEL” are recognized.
// Next, an informative message.
NSLog(@”\n\nWelcome to the OpenEars sample project. This project understands the words:\nBACKWARD,\nCHANGE,\nFORWARD,\nGO,\nLEFT,\nMODEL,\nRIGHT,\nTURN,\nand if you say \”CHANGE MODEL\” it will switch to its dynamically-generated model which understands the words:\nCHANGE,\nMODEL,\nMONDAY,\nTUESDAY,\nWEDNESDAY,\nTHURSDAY,\nFRIDAY,\nSATURDAY,\nSUNDAY,\nQUIDNUNC”);
// This is how to start the continuous listening loop of an available instance of OEPocketsphinxController. We won’t do this if the language generation failed since it will be listening for a command to change over to the generated language.
[[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil]; // Call this once before setting properties of the OEPocketsphinxController instance.
// [OEPocketsphinxController sharedInstance].pathToTestFile = [[NSBundle mainBundle] pathForResource:@”change_model_short” ofType:@”wav”]; // This is how you could use a test WAV (mono/16-bit/16k) rather than live recognition. Don’t forget to add your WAV to your app bundle.
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t already listening.
}
// [self startDisplayingLevels] is not an OpenEars method, just a very simple approach for level reading
// that I’ve included with this sample app. My example implementation does make use of two OpenEars
// methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
// method of OEFliteController.
//
// The example is meant to show one way that you can read those levels continuously without locking the UI,
// by using an NSTimer, but the OpenEars level-reading methods
// themselves do not include multithreading code since I believe that you will want to design your own
// code approaches for level display that are tightly-integrated with your interaction design and the
// graphics API you choose.
[self startDisplayingLevels];
// Here is some UI stuff that has nothing specifically to do with OpenEars implementation
self.startButton.hidden = TRUE;
self.stopButton.hidden = TRUE;
self.suspendListeningButton.hidden = TRUE;
self.resumeListeningButton.hidden = TRUE;
}
}

#pragma mark -
#pragma mark OEEventsObserver delegate methods

// What follows are all of the delegate methods you can optionally use once you’ve instantiated an OEEventsObserver and set its delegate to self.
// I’ve provided some pretty granular information about the exact phase of the Pocketsphinx listening loop, the Audio Session, and Flite, but I’d expect
// that the ones that will really be needed by most projects are the following:
//
//- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID;
//- (void) audioSessionInterruptionDidBegin;
//- (void) audioSessionInterruptionDidEnd;
//- (void) audioRouteDidChangeToRoute:(NSString *)newRoute;
//- (void) pocketsphinxDidStartListening;
//- (void) pocketsphinxDidStopListening;
//
// It isn’t necessary to have a OEPocketsphinxController or a OEFliteController instantiated in order to use these methods. If there isn’t anything instantiated that will
// send messages to an OEEventsObserver, all that will happen is that these methods will never fire. You also do not have to create a OEEventsObserver in
// the same class or view controller in which you are doing things with a OEPocketsphinxController or OEFliteController; you can receive updates from those objects in
// any class in which you instantiate an OEEventsObserver and set its delegate to self.

// This is an optional delegate method of OEEventsObserver which delivers the text of speech that Pocketsphinx heard and analyzed, along with its accuracy score and utterance ID.
- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
NSLog(@"Local callback: The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID); // Log it.
if([hypothesis isEqualToString:@"CHANGE MODEL"]) { // If the user says "CHANGE MODEL", we will switch to the alternate model (which happens to be the dynamically generated model).
// Here is an example of language model switching in OpenEars. Deciding on what logical basis to switch models is your responsibility.
// For instance, when you call a customer service line and get a response tree that takes you through different options depending on what you say to it,
// the models are being switched as you progress through it so that only relevant choices can be understood. The construction of that logical branching and
// how to react to it is your job; OpenEars just lets you send the signal to switch the language model when you’ve decided it’s the right time to do so.
if(self.usingStartingLanguageModel) { // If we’re on the starting model, switch to the dynamically generated one.
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToSecondDynamicallyGeneratedLanguageModel withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
self.usingStartingLanguageModel = FALSE;
} else { // If we’re on the dynamically generated model, switch to the start model (this is an example of a trigger and method for switching models).
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToFirstDynamicallyGeneratedLanguageModel withDictionary:self.pathToFirstDynamicallyGeneratedDictionary];
self.usingStartingLanguageModel = TRUE;
}
}
self.heardTextView.text = [NSString stringWithFormat:@"Heard: \"%@\"", hypothesis]; // Show it in the status box.
// This is how to use an available instance of OEFliteController. We’re going to repeat back the command that we heard with the voice we’ve chosen.
[self.fliteController say:[NSString stringWithFormat:@"You said %@",hypothesis] withVoice:self.slt];
}

#ifdef kGetNbest
- (void) pocketsphinxDidReceiveNBestHypothesisArray:(NSArray *)hypothesisArray { // Pocketsphinx has an n-best hypothesis dictionary.
NSLog(@"Local callback: hypothesisArray is %@",hypothesisArray);
}
#endif
// An optional delegate method of OEEventsObserver which informs that there was an interruption to the audio session (e.g. an incoming phone call).
- (void) audioSessionInterruptionDidBegin {
NSLog(@"Local callback: AudioSession interruption began."); // Log it.
self.statusTextView.text = @"Status: AudioSession interruption began."; // Show it in the status box.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) {
error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening (if it is listening) since it will need to restart its loop after an interruption.
if(error) NSLog(@"Error while stopping listening in audioSessionInterruptionDidBegin: %@", error);
}
}

// An optional delegate method of OEEventsObserver which informs that the interruption to the audio session ended.
- (void) audioSessionInterruptionDidEnd {
NSLog(@"Local callback: AudioSession interruption ended."); // Log it.
self.statusTextView.text = @"Status: AudioSession interruption ended."; // Show it in the status box.
// We’re restarting the previously-stopped listening loop.
if(![OEPocketsphinxController sharedInstance].isListening){
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t currently listening.
}
}

// An optional delegate method of OEEventsObserver which informs that the audio input became unavailable.
- (void) audioInputDidBecomeUnavailable {
NSLog(@"Local callback: The audio input has become unavailable"); // Log it.
self.statusTextView.text = @"Status: The audio input has become unavailable"; // Show it in the status box.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening){
error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling Pocketsphinx to stop listening since there is no available input (but only if we are listening).
if(error) NSLog(@"Error while stopping listening in audioInputDidBecomeUnavailable: %@", error);
}
}

// An optional delegate method of OEEventsObserver which informs that the unavailable audio input became available again.
- (void) audioInputDidBecomeAvailable {
NSLog(@"Local callback: The audio input is available"); // Log it.
self.statusTextView.text = @"Status: The audio input is available"; // Show it in the status box.
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition, but only if we aren’t already listening.
}
}

// An optional delegate method of OEEventsObserver which informs that there was a change to the audio route (e.g. headphones were plugged in or unplugged).
- (void) audioRouteDidChangeToRoute:(NSString *)newRoute {
NSLog(@"Local callback: Audio route change. The new audio route is %@", newRoute); // Log it.
self.statusTextView.text = [NSString stringWithFormat:@"Status: Audio route change. The new audio route is %@",newRoute]; // Show it in the status box.
NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // React to it by telling the Pocketsphinx loop to shut down and then start listening again on the new route
if(error) NSLog(@"Local callback: error while stopping listening in audioRouteDidChangeToRoute: %@",error);
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t already listening.
}
}

// An optional delegate method of OEEventsObserver which informs that the Pocketsphinx recognition loop has entered its actual loop.
// This might be useful in debugging a conflict between another sound class and Pocketsphinx.
- (void) pocketsphinxRecognitionLoopDidStart {
NSLog(@"Local callback: Pocketsphinx started."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx started."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is now listening for speech.
- (void) pocketsphinxDidStartListening {
NSLog(@"Local callback: Pocketsphinx is now listening."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx is now listening."; // Show it in the status box.
self.startButton.hidden = TRUE; // React to it with some UI changes.
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected speech and is starting to process it.
- (void) pocketsphinxDidDetectSpeech {
NSLog(@"Local callback: Pocketsphinx has detected speech."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx has detected speech."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected a second of silence, indicating the end of an utterance.
// This was added because developers requested being able to time the recognition speed without the speech time. The processing time is the time between
// this method being called and the hypothesis being returned.
- (void) pocketsphinxDidDetectFinishedSpeech {
NSLog(@"Local callback: Pocketsphinx has detected a second of silence, concluding an utterance."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx has detected finished speech."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx has exited its recognition loop, most
// likely in response to the OEPocketsphinxController being told to stop listening via the stopListening method.
- (void) pocketsphinxDidStopListening {
NSLog(@"Local callback: Pocketsphinx has stopped listening."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx has stopped listening."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop but it is not
// going to react to speech until listening is resumed. This can happen as a result of Flite speech being
// in progress on an audio route that doesn’t support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to suspend recognition via the suspendRecognition method.
- (void) pocketsphinxDidSuspendRecognition {
NSLog(@"Local callback: Pocketsphinx has suspended recognition."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx has suspended recognition."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop and after recognition
// having been suspended it is now resuming. This can happen as a result of Flite speech completing
// on an audio route that doesn’t support simultaneous Flite speech and Pocketsphinx recognition,
// or as a result of the OEPocketsphinxController being told to resume recognition via the resumeRecognition method.
- (void) pocketsphinxDidResumeRecognition {
NSLog(@"Local callback: Pocketsphinx has resumed recognition."); // Log it.
self.statusTextView.text = @"Status: Pocketsphinx has resumed recognition."; // Show it in the status box.
}

// An optional delegate method which informs that Pocketsphinx switched over to a new language model at the given URL in the course of
// recognition. This does not imply that it is a valid file or that recognition will be successful using the file.
- (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
NSLog(@"Local callback: Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@",newLanguageModelPathAsString,newDictionaryPathAsString);
}

// An optional delegate method of OEEventsObserver which informs that Flite is speaking, most likely to be useful if debugging a
// complex interaction between sound classes. You don’t have to do anything yourself in order to prevent Pocketsphinx from listening to Flite talk and trying to recognize the speech.
- (void) fliteDidStartSpeaking {
NSLog(@"Local callback: Flite has started speaking"); // Log it.
self.statusTextView.text = @"Status: Flite has started speaking."; // Show it in the status box.
}

// An optional delegate method of OEEventsObserver which informs that Flite is finished speaking, most likely to be useful if debugging a
// complex interaction between sound classes.
- (void) fliteDidFinishSpeaking {
NSLog(@"Local callback: Flite has finished speaking"); // Log it.
self.statusTextView.text = @"Status: Flite has finished speaking."; // Show it in the status box.
}

- (void) pocketSphinxContinuousSetupDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop startup. Turn on [OELogging startOpenEarsLogging] to learn why.
NSLog(@"Local callback: Setting up the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.", reasonForFailure); // Log it.
self.statusTextView.text = @"Status: Not possible to start recognition loop."; // Show it in the status box.
}

- (void) pocketSphinxContinuousTeardownDidFailWithReason:(NSString *)reasonForFailure { // This can let you know that something went wrong with the recognition loop teardown. Turn on [OELogging startOpenEarsLogging] to learn why.
NSLog(@"Local callback: Tearing down the continuous recognition loop has failed for the reason %@, please turn on [OELogging startOpenEarsLogging] to learn more.", reasonForFailure); // Log it.
self.statusTextView.text = @"Status: Not possible to cleanly end recognition loop."; // Show it in the status box.
}

- (void) testRecognitionCompleted { // A test file which was submitted for direct recognition via the audio driver is done.
NSLog(@"Local callback: A test file which was submitted for direct recognition via the audio driver is done."); // Log it.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) { // If we’re listening, stop listening.
error = [[OEPocketsphinxController sharedInstance] stopListening];
if(error) NSLog(@"Error while stopping listening in testRecognitionCompleted: %@", error);
}
}

- (void) rapidEarsDidReceiveLiveSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
NSLog(@"rapidEarsDidReceiveLiveSpeechHypothesis: %@",hypothesis);
}

- (void) rapidEarsDidReceiveFinishedSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
NSLog(@"rapidEarsDidReceiveFinishedSpeechHypothesis: %@",hypothesis);
}

/** Pocketsphinx couldn’t start because it has no mic permissions (will only be returned on iOS7 or later).*/
- (void) pocketsphinxFailedNoMicPermissions {
NSLog(@"Local callback: The user has never set mic permissions or denied permission to this app’s mic, so listening will not start.");
self.startupFailedDueToLackOfPermissions = TRUE;
if([OEPocketsphinxController sharedInstance].isListening){
NSError *error = [[OEPocketsphinxController sharedInstance] stopListening]; // Stop listening if we are listening.
if(error) NSLog(@"Error while stopping listening in pocketsphinxFailedNoMicPermissions: %@", error);
}
}

/** The user prompt to get mic permissions, or a check of the mic permissions, has completed with a TRUE or a FALSE result (will only be returned on iOS7 or later).*/
- (void) micPermissionCheckCompleted:(BOOL)result {
if(result) {
self.restartAttemptsDueToPermissionRequests++;
if(self.restartAttemptsDueToPermissionRequests == 1 && self.startupFailedDueToLackOfPermissions) { // If we get here because there was an attempt to start which failed due to lack of permissions, and now permissions have been requested and they returned true, we restart exactly once with the new permissions.
if(![OEPocketsphinxController sharedInstance].isListening) { // If there was no error and we aren’t listening, start listening.
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel
// dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary
// acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
// languageModelIsJSGF:FALSE]; // Start speech recognition.
self.startupFailedDueToLackOfPermissions = FALSE;
}
}
}
}

#pragma mark -
#pragma mark UI

// This is not OpenEars-specific stuff, just some UI behavior
- (IBAction) suspendListeningButtonAction { // This is the action for the button which suspends listening without ending the recognition loop
[[OEPocketsphinxController sharedInstance] suspendRecognition];
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = TRUE;
self.resumeListeningButton.hidden = FALSE;
}

- (IBAction) resumeListeningButtonAction { // This is the action for the button which resumes listening if it has been suspended
[[OEPocketsphinxController sharedInstance] resumeRecognition];
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) stopButtonAction { // This is the action for the button which shuts down the recognition loop.
NSError *error = nil;
if([OEPocketsphinxController sharedInstance].isListening) { // Stop if we are currently listening.
error = [[OEPocketsphinxController sharedInstance] stopListening];
if(error) NSLog(@"Error stopping listening in stopButtonAction: %@", error);
}
self.startButton.hidden = FALSE;
self.stopButton.hidden = TRUE;
self.suspendListeningButton.hidden = TRUE;
self.resumeListeningButton.hidden = TRUE;
}

- (IBAction) startButtonAction { // This is the action for the button which starts up the recognition loop again if it has been shut down.
if(![OEPocketsphinxController sharedInstance].isListening) {
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:self.pathToFirstDynamicallyGeneratedLanguageModel dictionaryAtPath:self.pathToFirstDynamicallyGeneratedDictionary acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:FALSE]; // Start speech recognition if we aren’t already listening.
}
self.startButton.hidden = TRUE;
self.stopButton.hidden = FALSE;
self.suspendListeningButton.hidden = FALSE;
self.resumeListeningButton.hidden = TRUE;
}

#pragma mark -
#pragma mark Example for reading out Pocketsphinx and Flite audio levels without locking the UI by using an NSTimer

// What follows are not OpenEars methods, just an approach for level reading
// that I’ve included with this sample app. My example implementation does make use of two OpenEars
// methods: the pocketsphinxInputLevel method of OEPocketsphinxController and the fliteOutputLevel
// method of OEFliteController.
//
// The example is meant to show one way that you can read those levels continuously without locking the UI,
// by using an NSTimer, but the OpenEars level-reading methods
// themselves do not include multithreading code since I believe that you will want to design your own
// code approaches for level display that are tightly-integrated with your interaction design and the
// graphics API you choose.
//
// Please note that if you use my sample approach, you should pay attention to the way that the timer is always stopped in
// dealloc. This should prevent you from having any difficulties with deallocating a class due to a running NSTimer process.

- (void) startDisplayingLevels { // Start displaying the levels using a timer
[self stopDisplayingLevels]; // We never want more than one timer valid so we’ll stop any running timers first.
self.uiUpdateTimer = [NSTimer scheduledTimerWithTimeInterval:1.0/kLevelUpdatesPerSecond target:self selector:@selector(updateLevelsUI) userInfo:nil repeats:YES];
}

- (void) stopDisplayingLevels { // Stop displaying the levels by stopping the timer if it’s running.
if(self.uiUpdateTimer && [self.uiUpdateTimer isValid]) { // If there is a running timer, we’ll stop it here.
[self.uiUpdateTimer invalidate];
self.uiUpdateTimer = nil;
}
}

- (void) updateLevelsUI { // And here is how we obtain the levels. This method includes the actual OpenEars methods and uses their results to update the UI of this view controller.
self.pocketsphinxDbLabel.text = [NSString stringWithFormat:@"Pocketsphinx Input level:%f",[[OEPocketsphinxController sharedInstance] pocketsphinxInputLevel]]; //pocketsphinxInputLevel is an OpenEars method of the class OEPocketsphinxController.
if(self.fliteController.speechInProgress) {
self.fliteDbLabel.text = [NSString stringWithFormat:@"Flite Output level: %f",[self.fliteController fliteOutputLevel]]; // fliteOutputLevel is an OpenEars method of the class OEFliteController.
}
}

@end
[/spoiler]

April 21, 2016 at 1:11 pm #1030139 Halle Winkler, Politepix
OK, I’ll check it out and get back to you.

April 21, 2016 at 1:20 pm #1030140 Halle Winkler, Politepix
It’s in there after the logging line “Attempting to start listening session from startRealtimeListeningWithLanguageModelAtPath”.

April 21, 2016 at 1:31 pm #1030141 Laurent, Participant
I don’t have this line. I also don’t have the Creating shared instance of OEPocketsphinxController line in my logs.

April 21, 2016 at 1:38 pm #1030142 Halle Winkler, Politepix
I downloaded a new copy of the Rejecto and RapidEars demos from the same link in your demo download email from this morning to make sure we were linking to the identical binaries, and the RapidEars demo definitely prints both of these lines when I copy and paste your code above into the sample app using the shipped version of OpenEars 2.501. Old versions of RapidEars don’t print either of those lines, so the issue is going to be related to that somehow.

April 21, 2016 at 1:39 pm #1030143 Halle Winkler, Politepix
Double check that you are testing this with OELogging started, and started at the time that the view first loads (like in your code above).

April 21, 2016 at 1:55 pm #1030144 Halle Winkler, Politepix
“I also don’t have the Creating shared instance of OEPocketsphinxController line in my logs.”
I’ve gone back and checked again, and the “Creating shared instance of OEPocketsphinxController” line is in the logging in the Instruments output that you sent me (and was in my run of your latest sample app code); it’s only the new RapidEars 2.5 logging which is not visible in your Instruments log.
I think it’s probably somewhere in your local logging as well, provided OELogging is turned on before doing anything else, since that line has been in OpenEars for a couple of years. It’s probably just an oversight due to the large amount of logging output, or maybe a case-sensitive search or similar. I’m not aware of any conditions which suppress the standard OpenEars logging output.

April 21, 2016 at 1:58 pm #1030145 Laurent, Participant
These are my logs for the code above:
[spoiler]
2016-04-21 15:54:12.434 OpenEarsSampleApp[4162:1540274] Starting OpenEars logging for OpenEars version 2.501 on 64-bit device (or build): iPad running iOS version: 9.300000
2016-04-21 15:54:12.440 OpenEarsSampleApp[4162:1540274] Creating shared instance of OEPocketsphinxController
2016-04-21 15:54:12.445 OpenEarsSampleApp[4162:1540274] Rejecto version 2.500000
2016-04-21 15:54:12.479 OpenEarsSampleApp[4162:1540274] Since there is no cached version, loading the language model lookup list for the acoustic model called AcousticModelEnglish
2016-04-21 15:54:12.485 OpenEarsSampleApp[4162:1540274] Returning a cached version of LanguageModelGeneratorLookupList.text
2016-04-21 15:54:12.517 OpenEarsSampleApp[4162:1540274] I’m done running performDictionaryLookup and it took 0.031893 seconds
2016-04-21 15:54:12.517 OpenEarsSampleApp[4162:1540274] I’m done running performDictionaryLookup and it took 0.033159 seconds
2016-04-21 15:54:12.526 OpenEarsSampleApp[4162:1540274] Starting dynamic language model generation
INFO: ngram_model_arpa_legacy.c(504): ngrams 1=49, 2=94, 3=47
INFO: ngram_model_arpa_legacy.c(136): Reading unigrams
INFO: ngram_model_arpa_legacy.c(543): 49 = #unigrams created
INFO: ngram_model_arpa_legacy.c(196): Reading bigrams
INFO: ngram_model_arpa_legacy.c(561): 94 = #bigrams created
INFO: ngram_model_arpa_legacy.c(562): 3 = #prob2 entries
INFO: ngram_model_arpa_legacy.c(570): 3 = #bo_wt2 entries
INFO: ngram_model_arpa_legacy.c(293): Reading trigrams
INFO: ngram_model_arpa_legacy.c(583): 47 = #trigrams created
INFO: ngram_model_arpa_legacy.c(584): 2 = #prob3 entries
INFO: ngram_model_dmp_legacy.c(521): Building DMP model…
INFO: ngram_model_dmp_legacy.c(551): 49 = #unigrams created
INFO: ngram_model_dmp_legacy.c(652): 94 = #bigrams created
INFO: ngram_model_dmp_legacy.c(653): 3 = #prob2 entries
INFO: ngram_model_dmp_legacy.c(660): 3 = #bo_wt2 entries
INFO: ngram_model_dmp_legacy.c(664): 47 = #trigrams created
INFO: ngram_model_dmp_legacy.c(665): 2 = #prob3 entries
2016-04-21 15:54:12.599 OpenEarsSampleApp[4162:1540274] Done creating language model with CMUCLMTK in 0.073387 seconds.
INFO: ngram_model_arpa_legacy.c(504): ngrams 1=49, 2=94, 3=47
INFO: ngram_model_arpa_legacy.c(136): Reading unigrams
INFO: ngram_model_arpa_legacy.c(543): 49 = #unigrams created
INFO: ngram_model_arpa_legacy.c(196): Reading bigrams
INFO: ngram_model_arpa_legacy.c(561): 94 = #bigrams created
INFO: ngram_model_arpa_legacy.c(562): 5 = #prob2 entries
INFO: ngram_model_arpa_legacy.c(570): 3 = #bo_wt2 entries
INFO: ngram_model_arpa_legacy.c(293): Reading trigrams
INFO: ngram_model_arpa_legacy.c(583): 47 = #trigrams created
INFO: ngram_model_arpa_legacy.c(584): 3 = #prob3 entries
INFO: ngram_model_dmp_legacy.c(521): Building DMP model…
INFO: ngram_model_dmp_legacy.c(551): 49 = #unigrams created
INFO: ngram_model_dmp_legacy.c(652): 94 = #bigrams created
INFO: ngram_model_dmp_legacy.c(653): 5 = #prob2 entries
INFO: ngram_model_dmp_legacy.c(660): 3 = #bo_wt2 entries
INFO: ngram_model_dmp_legacy.c(664): 47 = #trigrams created
INFO: ngram_model_dmp_legacy.c(665): 3 = #prob3 entries
2016-04-21 15:54:12.630 OpenEarsSampleApp[4162:1540274] I’m done running dynamic language model generation and it took 0.183863 seconds
2016-04-21 15:54:12.639 OpenEarsSampleApp[4162:1540274] Returning a cached version of LanguageModelGeneratorLookupList.text
2016-04-21 15:54:12.639 OpenEarsSampleApp[4162:1540274] Returning a cached version of LanguageModelGeneratorLookupList.text
2016-04-21 15:54:12.672 OpenEarsSampleApp[4162:1540274] The word QUIDNUNC was not found in the dictionary of the acoustic model /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle. Now using the fallback method to look it up. If this is happening more frequently than you would expect, likely causes can be that you are entering words in another language from the one you are recognizing, or that there are symbols (including numbers) that need to be spelled out or cleaned up, or you are using your own acoustic model and there is an issue with either its phonetic dictionary or it lacks a g2p file. Please get in touch at the forums for assistance with the last two possible issues.
2016-04-21 15:54:12.673 OpenEarsSampleApp[4162:1540274] Using convertGraphemes for the word or phrase quidnunc which doesn’t appear in the dictionary
2016-04-21 15:54:12.685 OpenEarsSampleApp[4162:1540274] the graphemes “K W IH D N AH NG K” were created for the word QUIDNUNC using the fallback method.
2016-04-21 15:54:12.698 OpenEarsSampleApp[4162:1540274] I’m done running performDictionaryLookup and it took 0.058571 seconds
2016-04-21 15:54:12.698 OpenEarsSampleApp[4162:1540274] I’m done running performDictionaryLookup and it took 0.059407 seconds
2016-04-21 15:54:12.709 OpenEarsSampleApp[4162:1540274] Starting dynamic language model generation
INFO: ngram_model_arpa_legacy.c(504): ngrams 1=51, 2=97, 3=49
INFO: ngram_model_arpa_legacy.c(136): Reading unigrams
INFO: ngram_model_arpa_legacy.c(543): 51 = #unigrams created
INFO: ngram_model_arpa_legacy.c(196): Reading bigrams
INFO: ngram_model_arpa_legacy.c(561): 97 = #bigrams created
INFO: ngram_model_arpa_legacy.c(562): 3 = #prob2 entries
INFO: ngram_model_arpa_legacy.c(570): 3 = #bo_wt2 entries
INFO: ngram_model_arpa_legacy.c(293): Reading trigrams
INFO: ngram_model_arpa_legacy.c(583): 49 = #trigrams created
INFO: ngram_model_arpa_legacy.c(584): 2 = #prob3 entries
INFO: ngram_model_dmp_legacy.c(521): Building DMP model…
INFO: ngram_model_dmp_legacy.c(551): 51 = #unigrams created
INFO: ngram_model_dmp_legacy.c(652): 97 = #bigrams created
INFO: ngram_model_dmp_legacy.c(653): 3 = #prob2 entries
INFO: ngram_model_dmp_legacy.c(660): 3 = #bo_wt2 entries
INFO: ngram_model_dmp_legacy.c(664): 49 = #trigrams created
INFO: ngram_model_dmp_legacy.c(665): 2 = #prob3 entries
2016-04-21 15:54:12.784 OpenEarsSampleApp[4162:1540274] Done creating language model with CMUCLMTK in 0.074753 seconds.
INFO: ngram_model_arpa_legacy.c(504): ngrams 1=51, 2=97, 3=49
INFO: ngram_model_arpa_legacy.c(136): Reading unigrams
INFO: ngram_model_arpa_legacy.c(543): 51 = #unigrams created
INFO: ngram_model_arpa_legacy.c(196): Reading bigrams
INFO: ngram_model_arpa_legacy.c(561): 97 = #bigrams created
INFO: ngram_model_arpa_legacy.c(562): 5 = #prob2 entries
INFO: ngram_model_arpa_legacy.c(570): 3 = #bo_wt2 entries
INFO: ngram_model_arpa_legacy.c(293): Reading trigrams
INFO: ngram_model_arpa_legacy.c(583): 49 = #trigrams created
INFO: ngram_model_arpa_legacy.c(584): 3 = #prob3 entries
INFO: ngram_model_dmp_legacy.c(521): Building DMP model…
INFO: ngram_model_dmp_legacy.c(551): 51 = #unigrams created
INFO: ngram_model_dmp_legacy.c(652): 97 = #bigrams created
INFO: ngram_model_dmp_legacy.c(653): 5 = #prob2 entries
INFO: ngram_model_dmp_legacy.c(660): 3 = #bo_wt2 entries
INFO: ngram_model_dmp_legacy.c(664): 49 = #trigrams created
INFO: ngram_model_dmp_legacy.c(665): 3 = #prob3 entries
2016-04-21 15:54:12.822 OpenEarsSampleApp[4162:1540274] I’m done running dynamic language model generation and it took 0.191083 seconds
2016-04-21 15:54:12.822 OpenEarsSampleApp[4162:1540274] Welcome to the OpenEars sample project. This project understands the words:
BACKWARD,
CHANGE,
FORWARD,
GO,
LEFT,
MODEL,
RIGHT,
TURN,
and if you say “CHANGE MODEL” it will switch to its dynamically-generated model which understands the words:
CHANGE,
MODEL,
MONDAY,
TUESDAY,
WEDNESDAY,
THURSDAY,
FRIDAY,
SATURDAY,
SUNDAY,
QUIDNUNC
2016-04-21 15:54:12.827 OpenEarsSampleApp[4162:1540274] User gave mic permission for this app.
2016-04-21 15:54:12.828 OpenEarsSampleApp[4162:1540274] setSecondsOfSilence wasn’t set, using default of 0.700000.
2016-04-21 15:54:12.829 OpenEarsSampleApp[4162:1540292] Starting listening.
2016-04-21 15:54:12.829 OpenEarsSampleApp[4162:1540292] about to set up audio session
2016-04-21 15:54:12.830 OpenEarsSampleApp[4162:1540292] Creating audio session with default settings.
2016-04-21 15:54:12.923 OpenEarsSampleApp[4162:1540302] Audio route has changed for the following reason:
2016-04-21 15:54:12.927 OpenEarsSampleApp[4162:1540302] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2016-04-21 15:54:12.930 OpenEarsSampleApp[4162:1540302] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —SpeakerMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x156d72d30,
inputs = (
“<AVAudioSessionPortDescription: 0x156eb8f00, type = MicrophoneBuiltIn; name = iPad Micro; UID = Built-In Microphone; selectedDataSource = Avant>”
);
outputs = (
“<AVAudioSessionPortDescription: 0x156e7a060, type = Speaker; name = Haut-parleur; UID = Built-In Speaker; selectedDataSource = (null)>”
)>.
2016-04-21 15:54:13.067 OpenEarsSampleApp[4162:1540292] done starting audio unit
INFO: pocketsphinx.c(145): Parsed model-specific feature parameters from /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 40
-compallsen no no
-debug 0
-dict /var/mobile/Containers/Data/Application/CEFFF7C3-6DBE-4C65-83EF-A49F8024AC52/Library/Caches/FirstOpenEarsDynamicLanguageModel.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/noisedict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/feat.params
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm /var/mobile/Containers/Data/Application/CEFFF7C3-6DBE-4C65-83EF-A49F8024AC52/Library/Caches/FirstOpenEarsDynamicLanguageModel.DMP
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/mdef
-mean /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/means
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/transition_matrices
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 69
-vad_prespeech 20 10
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/variances
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: ‘1s_c_d_dd’, ceplen=13, CMN=’current’, VARNORM=’no’, AGC=’none’
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/mdef
INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: ptm_mgau.c(805): Number of codebooks doesn’t match number of ciphones, doesn’t look like PTM: 1 != 46
INFO: acmod.c(119): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(904): Loading senones from dump file /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/sendump
INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4152 * 32 bytes (129 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/CEFFF7C3-6DBE-4C65-83EF-A49F8024AC52/Library/Caches/FirstOpenEarsDynamicLanguageModel.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 47 words read
INFO: dict.c(358): Reading filler dictionary: /var/containers/Bundle/Application/07CBAD80-E067-459E-9522-567D8DF68E72/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(424): Trying to read LM in bin format
INFO: ngram_model_trie.c(457): Header doesn’t match
INFO: ngram_model_trie.c(180): Trying to read LM in arpa format
INFO: ngram_model_trie.c(71): No \data\ mark in LM file
INFO: ngram_model_trie.c(537): Trying to read LM in DMP format
INFO: ngram_model_trie.c(632): ngrams 1=49, 2=94, 3=47
INFO: lm_trie.c(317): Training quantizer
INFO: lm_trie.c(323): Building LM trie
INFO: ngram_search_fwdtree.c(99): 8 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 49 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 49 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 145
INFO: ngram_search_fwdtree.c(339): after: 8 root, 17 non-root channels, 48 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
2016-04-21 15:54:13.162 OpenEarsSampleApp[4162:1540292] There is no CMN plist so we are using the fresh CMN value 40.000000.
2016-04-21 15:54:13.163 OpenEarsSampleApp[4162:1540292] Listening.
2016-04-21 15:54:13.164 OpenEarsSampleApp[4162:1540292] Project has these words or phrases in its dictionary:
___REJ_ZH
___REJ_Z
___REJ_Y
___REJ_W
___REJ_V
___REJ_UW
___REJ_UH
___REJ_TH
___REJ_T
___REJ_SH
___REJ_S
___REJ_R
___REJ_P
___REJ_OY
___REJ_OW
___REJ_NG
___REJ_N
___REJ_M
___REJ_L
___REJ_K
___REJ_JH
___REJ_IY
___REJ_IH
___REJ_HH
___REJ_G
___REJ_F
___REJ_EY
___REJ_ER
___REJ_EH
___REJ_DH
___REJ_D
…and 17 more.
2016-04-21 15:54:13.164 OpenEarsSampleApp[4162:1540292] Recognition loop has started
2016-04-21 15:54:13.222 OpenEarsSampleApp[4162:1540274] Local callback: Pocketsphinx is now listening.
2016-04-21 15:54:13.226 OpenEarsSampleApp[4162:1540274] Local callback: Pocketsphinx started.
2016-04-21 15:54:13.506 OpenEarsSampleApp[4162:1540293] Speech detected…
2016-04-21 15:54:13.507 OpenEarsSampleApp[4162:1540274] Local callback: Pocketsphinx has detected speech.
2016-04-21 15:54:13.507 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-1286) and an utterance ID of 0.
2016-04-21 15:54:13.507 OpenEarsSampleApp[4162:1540293] Hypothesis was null so we aren’t returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController’s property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
2016-04-21 15:54:13.646 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-3591) and an utterance ID of 1.
2016-04-21 15:54:13.791 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-4690) and an utterance ID of 2.
2016-04-21 15:54:13.914 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-6414) and an utterance ID of 3.
2016-04-21 15:54:14.028 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-7325) and an utterance ID of 4.
2016-04-21 15:54:14.179 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-8217) and an utterance ID of 5.
2016-04-21 15:54:14.303 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-10011) and an utterance ID of 6.
2016-04-21 15:54:14.411 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-10959) and an utterance ID of 7.
2016-04-21 15:54:14.580 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-13033) and an utterance ID of 8.
INFO: ngram_search.c(463): Resized backpointer table to 10000 entries
2016-04-21 15:54:14.686 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-14072) and an utterance ID of 9.
2016-04-21 15:54:14.813 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-15302) and an utterance ID of 10.
2016-04-21 15:54:14.952 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-17031) and an utterance ID of 11.
2016-04-21 15:54:15.078 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-17955) and an utterance ID of 12.
2016-04-21 15:54:15.180 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-19053) and an utterance ID of 13.
2016-04-21 15:54:15.348 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-21171) and an utterance ID of 14.
2016-04-21 15:54:15.472 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-22181) and an utterance ID of 15.
2016-04-21 15:54:15.563 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-23068) and an utterance ID of 16.
2016-04-21 15:54:15.719 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-25022) and an utterance ID of 17.
2016-04-21 15:54:15.840 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-25955) and an utterance ID of 18.
INFO: ngram_search.c(463): Resized backpointer table to 20000 entries
2016-04-21 15:54:15.961 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-27724) and an utterance ID of 19.
2016-04-21 15:54:16.067 OpenEarsSampleApp[4162:1540293] Pocketsphinx heard ” ” with a score of (-28966) and an utterance ID of 20.
2016-04-21 15:54:16.219 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-29769) and an utterance ID of 21.
2016-04-21 15:54:16.355 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-31651) and an utterance ID of 22.
2016-04-21 15:54:16.454 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-32748) and an utterance ID of 23.
2016-04-21 15:54:16.621 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-35159) and an utterance ID of 24.
2016-04-21 15:54:16.731 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-36272) and an utterance ID of 25.
2016-04-21 15:54:16.870 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-37163) and an utterance ID of 26.
2016-04-21 15:54:17.003 OpenEarsSampleApp[4162:1540292] Pocketsphinx heard ” ” with a score of (-39099) and an utterance ID of 27.
2016-04-21 15:54:17.120 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-40206) and an utterance ID of 28.
2016-04-21 15:54:17.244 OpenEarsSampleApp[4162:1540295] End of speech detected…
2016-04-21 15:54:17.247 OpenEarsSampleApp[4162:1540274] Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 40.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 40.02 -0.74 -13.25 0.38 -5.00 5.24 -2.31 -1.98 -1.52 1.41 -7.07 -1.30 -3.77 >
INFO: ngram_search_fwdtree.c(1553): 16041 words recognized (42/fr)
INFO: ngram_search_fwdtree.c(1555): 70579 senones evaluated (183/fr)
INFO: ngram_search_fwdtree.c(1559): 24594 channels searched (63/fr), 3048 1st, 18760 last
INFO: ngram_search_fwdtree.c(1562): 17980 words for which last channels evaluated (46/fr)
INFO: ngram_search_fwdtree.c(1564): 241 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.52 CPU 0.134 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 3.86 wall 1.002 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 23 words
INFO: ngram_search_fwdflat.c(948): 6335 words recognized (16/fr)
INFO: ngram_search_fwdflat.c(950): 24855 senones evaluated (65/fr)
INFO: ngram_search_fwdflat.c(952): 7343 channels searched (19/fr)
INFO: ngram_search_fwdflat.c(954): 7343 words searched (19/fr)
INFO: ngram_search_fwdflat.c(957): 3010 word transitions (7/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.06 CPU 0.015 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.06 wall 0.016 xRT
INFO: ngram_search.c(1280): lattice start node <s>.0 end node </s>.381
INFO: ngram_search.c(1306): Eliminated 5 nodes before end node
INFO: ngram_search.c(1411): Lattice has 4227 nodes, 73183 links
INFO: ps_lattice.c(1380): Bestpath score: -41400
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:381:383) = -41884
INFO: ps_lattice.c(1441): Joint P(O,S) = -220787 P(S|O) = -178903
INFO: ngram_search.c(899): bestpath 0.67 CPU 0.176 xRT
INFO: ngram_search.c(902): bestpath 0.67 wall 0.175 xRT
2016-04-21 15:54:17.996 OpenEarsSampleApp[4162:1540295] Pocketsphinx heard ” ” with a score of (-4057) and an utterance ID of 29.
[/spoiler]
April 21, 2016 at 2:27 pm #1030147Halle WinklerPolitepix
“Creating shared instance of OEPocketsphinxController” is the second line. But I really don’t think those logs are from a project successfully linked with a current version of RapidEarsDemo.framework. If the Rejecto 2.5 and OpenEars 2.5 logging work as expected, and all three of them work as expected when I run it locally, and the local behavior when stopping listening is different on your setup, I think the logging problem is most likely somehow connected with linking to an old copy of RapidEars locally.
I believe that it is possible that there is an extraordinary bug of some kind which causes the symptoms of just the RapidEars framework not logging just some of its logging output, so I am not ruling that out, but it seems like a lower probability than linking to an old framework right now. There is some RapidEars logging appearing (“Starting listening”), so the issue isn’t that there is no logging output from RapidEars at all, but that the new logging output that was added to RapidEars 2.5 is not apparent.
April 21, 2016 at 2:55 pm #1030149LaurentParticipant
I am trying to get the RapidEars 2.5 logging to appear; I will tell you as soon as I have it.
Then I will be able to do the replication with the songs and share it with you.
Regards,
Laurent
April 21, 2016 at 2:59 pm #1030150Halle WinklerPolitepix
Thank you, I look forward to checking it out.
April 22, 2016 at 10:32 am #1030164LaurentParticipant
I can’t replicate the same behavior, but while trying I observed the following.
When I profile with the Instruments Leaks tool, I don’t speak at all, yet whenever there is some noise in my office the memory size grows. When it rises above 60MB and I click on stop listening, it doesn’t fall back down to 13MB.
I also noticed that after the memory has gone up to 60MB, it does not fall back to the size it had before the rise.
Example: before it is 13MB there is an increase up to 70MB and falling down at 19Mb…
Could you please test on your side and check whether you see the same behaviour?
April 22, 2016 at 11:23 am #1030167Halle WinklerPolitepix
Hi,
No, sorry, I don’t see that behavior when stopping, that is why I have asked for a replication case. I see the behavior I described at the beginning of the discussion, that both the buffer and the hypothesis search can grow to the size needed (which can get pretty large when there is a long utterance that has noise and the presence of words is unclear) and then is eventually released (the hypothesis search memory releases sometime after the search finishes, and the buffer memory releases sometime after stopping listening). It can take a fair amount of time before Instruments shows it as being released, and of course there are also other things happening in the sample app that can make their own use of memory (such as TTS and language model generation file caching).
I also don’t see leaks in Instruments (other than the very tiny leaks we discussed at the beginning of the discussion, adding up to less than 2k).
When I look at your Instruments example you sent, it doesn’t have leaks other than the tiny <2k leaks (leaks are orphaned memory). But it has live memory that continues to be live, at a time in which it can be expected to be possible to release, and it is a lot of memory. The memory usage is from a 3rd-pass search that doesn't complete successfully when you stop listening (this is shown in the log when you look at the log timing and compare it to the memory usage).
What is happening in that session is that the 3rd-pass search gets very large (probably due to intermittent background noise that builds up in a long utterance without any easily-found words) and it keeps searching even while the stopListening message is in progress and trying to shut down, and after too much time passes, the stopListening method has to stop attempting to release the search because it will lead to an exception. The reason that I've wanted to debug your RapidEars version install is that this behavior used to be possible with RapidEars during a big unclear 3rd-pass search, but this issue was fixed, so it is unexpected that it is happening in your install, and it is also not possible for me to replicate with the current versions of the frameworks using my own audio files or audio input.
It also seems technically impossible that this happens with 3rd-pass searches turned off, so it's a confusing issue. So these are the reasons I'm now hoping for a replication case from you where it replicates using an audio file from your environment. I would _really_ like to fix it if it is a current problem since I have already worked on this and thought it was fixed, but I can't cause it to happen in my own system. My local install demonstrates the fixes that were made, and also doesn't run 3rd-pass searches like the one shown in the Instruments file when I turn 3rd-pass searches off.
If this is an issue:
“before it is 13MB there is an increase up to 70MB and falling down at 19Mb”
It is a different issue – the one I’ve been trying to replicate is the one that was shown in your Instruments file you sent, where there is a far larger allocation with none of it ever being released when stopping listening fails (this is the old issue that I expected to be fixed). An intermittent usage of a large amount of memory is something that can happen briefly in many kinds of apps that have a temporary need for it without it being a problem, and the search data structures can temporarily get pretty big on a 64-bit platform, although this is something that occurs for a matter of seconds.
I’m happy to investigate a new issue like “after a large search is successfully released, there is still 6MB more memory used than expected at the time of restarting listening and it isn’t clear whether it is due to a new memory need or a bug”, but first we need to wrap up the reported issue in the file you sent, which is a huge allocation where none of it is released because stopListening is not successful, which is a different scale of problem.
Is it possible that the reported issue doesn’t replicate for you now because you have done some troubleshooting on your RapidEars install version and you are now linking to RapidEars 2.5, or do you still not see RapidEars 2.5 logging when you try to set up your replication case?
April 22, 2016 at 12:43 pm #1030169LaurentParticipant
I still do not see the RapidEars 2.5 logging.
I don’t know why, but when I tried with the WAV file the memory was released correctly, so I think the problem only occurs with the microphone. If you want, I can privately send you the WAV file with which I can’t replicate the issue, just so you can listen to it. I can also share the Instruments file with you privately if you want.
When I am able to replicate the issue I will come back and share it with you.
Regards,
Laurent
April 22, 2016 at 4:00 pm #1030171LaurentParticipant
This is a link to the Instruments file, the project and the framework dmg:
https://www.dropbox.com/s/so9yfbbv6awwu7j/PP.zip?dl=0
April 22, 2016 at 4:27 pm #1030172Halle WinklerPolitepix
“I believe that it is possible that there is an extraordinary bug of some kind which causes the symptoms of just the RapidEars framework not logging just some of its logging output, so I am not ruling that out, but it seems like a lower probability than linking to an old framework right now.”
It is an extraordinary bug of some kind which causes the symptoms of just the RapidEars framework not logging just some of its logging output :/ . When both OELogging and verbosePocketsphinx are on, verbosePocketsphinx suppresses the earliest part of the RapidEars output. Sorry it was difficult to pin this down and thank you for bringing it to my attention. I will try to fix that in the next version but for now you can easily verify the version by running OELogging with verbosePocketsphinx turned off.
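A minimal sketch of that verification setup, assuming the standard OpenEars headers and that the calls run before listening starts. The helper name `ConfigureLoggingForVersionCheck` is hypothetical and not part of OpenEars:

```objc
#import <OpenEars/OELogging.h>
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper (not an OpenEars API): call it once before starting listening.
static void ConfigureLoggingForVersionCheck(void) {
    // Activate the shared controller before touching any of its properties.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];

    // Keep Pocketsphinx verbosity off so it cannot suppress the early RapidEars output.
    [OEPocketsphinxController sharedInstance].verbosePocketsphinx = FALSE;

    // Turn on OpenEars logging so the framework version lines appear in the console.
    [OELogging startOpenEarsLogging];
}
```

With this configuration the early RapidEars log lines should be visible, which makes it possible to confirm which framework version is actually linked.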
April 22, 2016 at 4:29 pm #1030173Halle WinklerPolitepix
OK, the next thing I noticed is that you have
[OEPocketsphinxController sharedInstance].legacy3rdPassMode = TRUE ;
set before you call this message:
[[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
But setActive has to come before any property setting.
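A short sketch of the corrected ordering, reusing only the two statements quoted above. The import of the RapidEars category header is an assumption about where `legacy3rdPassMode` is declared, and the helper name is hypothetical:

```objc
#import <OpenEars/OEPocketsphinxController.h>
// Assumption: legacy3rdPassMode is declared in the RapidEars category header.
#import <RapidEarsDemo/OEPocketsphinxController+RapidEars.h>

// Hypothetical helper (not an OpenEars API) showing the required call order.
static void ConfigureControllerInOrder(void) {
    // 1. Activate the shared controller first.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];

    // 2. Only then set properties such as legacy3rdPassMode.
    [OEPocketsphinxController sharedInstance].legacy3rdPassMode = TRUE;
}
```

The same rule applies to every other property of the shared controller: set it after `setActive:error:` and before listening starts.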
April 22, 2016 at 4:46 pm #1030174Halle WinklerPolitepix
OK, is the saved Instruments output file made from running the app you sent when using the WAV test file pocketsphinx_sample_log_201604221141402.wav?
April 29, 2016 at 5:52 am #1030227LaurentParticipant
Hello Mr Halle,
I had a problem with Xcode, which didn’t link the framework, so I had to reset Xcode in order to solve it.
I have checked for the RapidEars 2.5 logging, and when I turn off verbosePocketsphinx I can see the log.
I didn’t use this WAV file to make this Instruments file.
When I am able to reproduce the bug I will share it with you.
April 29, 2016 at 7:10 am #1030228Halle WinklerPolitepix
Hi Laurent,
Thank you, I will look at the new version whenever you have it for me. You can erase the other upload since I have a copy.
May 17, 2016 at 10:28 pm #1030342Halle WinklerPolitepix
OK, although I wasn’t able to reproduce this exactly from the sample, I was able to get a similar reproduction on an older device with a different setup, so I am currently investigating this issue, thank you for all of the info.
May 23, 2016 at 7:28 am #1030375LaurentParticipant
Hi Mr Halle,
I’m glad to hear that and happy to help. Please notify me if you find the problem. Thank you.
Regards,
Laurent.
June 7, 2016 at 6:22 pm #1030554Halle WinklerPolitepix
Hi Laurent,
I’ve removed a couple of our side discussions in this thread so it’s easier for later readers who need to get an overview on related issues to get through it quickly – hope you don’t mind since the extra discussion was my fault. Today there is a new OpenEars and RapidEars version 2.502 (more info at https://changelogs.politepix.com and downloadable from https://www.politepix.com/openears and your registered framework customer account) which should fix this issue you’ve reported. Before we talk about it more I wanted to clarify what this update fixes. We’ve talked about four different things in this discussion:
1. Actual leaks in Sphinx which we’ve established are very tiny,
2. Normally-increasing memory usage from OpenEars due to a growing buffer size from longer utterances,
3. The usage of much larger amounts of memory which is then normally released after there is a hypothesis,
4. Large amounts of memory not reclaimable after stopping when there is a big search in progress during the attempted stop.
I think we covered 1 & 2 pretty well earlier in the discussion, so let’s agree to just discuss 3 & 4 now that the updates are out, if that’s OK.
The OpenEars and RapidEars 2.502 updates should fix #4, which is a serious bug and which I’m really happy you told me about and showed me an example of, thank you. In general the updates should also allow faster stops when there is a big search in progress at stopping time, even setting aside the memory usage. Please be so kind as to check this out thoroughly and let me know if it stops the bad memory events at stopping time, and also let me know if you see anything bad due to the changes. The one case which was in your replication cases but which isn’t necessary to test or report is what happens to memory when the app hangs or exits due to the demo framework timing out – this is expected to not be graceful, so the memory usage under those circumstances at the very end of the app session isn’t an issue. In your case you should be able to test against your registered 2.502 framework instead of the demo.
#3 is a more complicated subject and not really a bug as far as I can see, so I wanted to explain it a little bit. I couldn’t actually replicate the situation you were seeing with extremely large allocations during an utterance, although I worked very hard to do so, setting up an external speaker system so I could play your office audio out loud into various device microphones since it didn’t replicate with test file playback. I couldn’t ever get the big memory usage to replicate, but I could see some smaller allocations which were nonetheless bigger than I would have preferred. I believe this is a bit more of an implementation issue than a bug, with a couple of root causes:
• There are some strange noises with unusual echo and doppler in the recordings. I don’t know whether there is some kind of industrial noise in the background where you work, whether this is an artifact of the device mic (it could be a strange result of echo cancelling past a certain physical distance from the mic or similar) or even if it is an artifact of SaveThatWave, but I’ve never heard it before on SaveThatWave recordings so I think it was either really there in the environment or it is a peculiarity of the mic and hardware and usage distance. In any case, this type of audio artifact causes unexpected results with speech recognition and I’ve had the experience that it adds confusion to word searches.
• In the code you shared, the jobs of vadThreshold and Rejecto weight are reversed. Normally you want the highest possible vadThreshold which still allows intentional speech to be perceived by OpenEars, then you add Rejecto to work against real speech that isn’t part of your model, and then after adding Rejecto, in relatively uncommon cases, you can increase the weight a little. In this code, the vadThreshold is left at the default although it is resulting in all environmental sounds being treated as speech (leading to all the null hyps in every recognition round), and then there is the maximum possible Rejecto weight so that nearly all of the speech (which is really incidental noise) is first completely processed and then rejected. In RapidEars, this results in very large search spaces, because every noise is a potential word, but every word has to be analyzed using the smallest possible speech units which can occur in any combination, because your actual vocabulary is weighted very low in probability, and reject-able sounds are rated very high due to the weighting. In combination with the odd noises, this leads to the big, slow hypothesis searches as a result of non-speech, which can be seen in the logs and the profile. Although I couldn’t replicate the memory usage, I believe it is happening, and I think it is due to this circumstance.
It is my expectation that if you turn off Rejecto and first find the right vadThreshold (probably at least 2.5) and then afterwards add in a normally-weighted Rejecto model, you should see more normal memory usage and probably more accuracy. I have made a decision not to make code changes for #3 because it would have big side-effects, and I think it is due to a circumstance which would be better to address via implementation. I am still open to seeing an example which replicates consistently from a test file and giving it more consideration, but so far I haven’t been able to witness it directly so my sense is that it is bound to the environment and the vadThreshold/weight issue.
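A minimal sketch of the first tuning step described above, assuming the standard OpenEars header. The value 2.5 is only the starting point suggested in this post and should be adjusted for the actual environment; the helper name is hypothetical:

```objc
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper (not an OpenEars API): step one of the tuning order.
static void ConfigureVadThresholdFirst(void) {
    // Activate the controller before setting any property.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];

    // Raise vadThreshold from its default (2.0 in the log above) so that
    // environmental noise is less likely to be treated as speech.
    // 2.5 is a starting value, not a recommendation for every room.
    [OEPocketsphinxController sharedInstance].vadThreshold = 2.5;

    // Start listening with a plain (non-Rejecto) language model at this point
    // and keep raising the threshold while intentional speech is still heard.
    // Only afterwards reintroduce Rejecto, at its normal weight.
}
```

Tuning in this order keeps non-speech out of the recognizer in the first place, so Rejecto only has to handle real out-of-vocabulary speech.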
Let me know how the new stopping behavior works for you, and thanks again for providing so much info about this bug so I could fix it.
June 8, 2016 at 6:18 am #1030563LaurentParticipant
Hi Mr Halle,
First, I want to thank you for your help.
I didn’t know about the “vadThreshold” parameter. I will definitely test it.
I will test the updates and share my report with you as soon as possible.
Regards,
Laurent.