Forum Replies Created
chymel (Participant)
Ooooooooooooooo *facepalm* I see how it works now.
I was under the assumption that you had to call start and then stop to get a callback that the audio was saved, so you would end up with one big chunk of recorded audio. I didn’t realize that every time it heard something and analyzed it, it would save that specific bit of audio. Makes complete sense now. Ha!
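For later readers, here is a minimal sketch of that per-utterance flow, assuming the same OEEventsObserver delegate wiring shown further down in this thread; the savedWavPaths property is hypothetical and only there to illustrate collecting each saved file:

// Minimal sketch, assuming the OEEventsObserver (+SaveThatWave) delegate setup used below.
// One WAV is saved per detected utterance, so this callback can fire many times while listening.
- (void) wavWasSavedAtLocation:(NSString *)location {
    NSLog(@"WAV was saved at the path %@", location);
    if (self.savedWavPaths == nil) { // savedWavPaths is a hypothetical NSMutableArray property
        self.savedWavPaths = [NSMutableArray array];
    }
    [self.savedWavPaths addObject:location]; // keep every utterance's file
    self.audioFilePath = location;           // remember the most recent one for playback
}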
Thank you very much again for your help!!!
chymel (Participant)
Do you have a working demo project I can try? I’ve implemented it identically to the known-to-work tutorial implementation. With that, the voice recognition works fine, but I’m never able to get an audio recording. I can’t even get
- (void) wavWasSavedAtLocation:(NSString *)location
to fire.
chymel (Participant)
Shouldn’t calling
[self.saveThatWaveController stop];
cause this delegate method to fire?
- (void) wavWasSavedAtLocation:(NSString *)location
startPlayback:
is just a simple button on screen to test stopping and playing back the audio file.

chymel (Participant)
Do you have any idea from the debug log what could be happening?
chymel (Participant)
(I really appreciate your help and fast responses)
Device: iPhone 5
iOS: 8.3

Log:
2015-05-07 16:09:24.455 OpenEarsTest[7002:2781753] Starting OpenEars logging for OpenEars version 2.03 on 32-bit device (or build): iPhone running iOS version: 8.300000 2015-05-07 16:09:24.457 OpenEarsTest[7002:2781753] Creating shared instance of OEPocketsphinxController 2015-05-07 16:09:24.463 OpenEarsTest[7002:2781753] Attempting to start listening session from startListeningWithLanguageModelAtPath: 2015-05-07 16:09:24.473 OpenEarsTest[7002:2781753] User gave mic permission for this app. 2015-05-07 16:09:24.474 OpenEarsTest[7002:2781753] setSecondsOfSilence wasn't set, using default of 0.700000. 2015-05-07 16:09:24.475 OpenEarsTest[7002:2781753] Successfully started listening session from startListeningWithLanguageModelAtPath: 2015-05-07 16:09:24.475 OpenEarsTest[7002:2781772] Starting listening. 2015-05-07 16:09:24.476 OpenEarsTest[7002:2781772] about to set up audio session 2015-05-07 16:09:24.790 OpenEarsTest[7002:2781787] Audio route has changed for the following reason: 2015-05-07 16:09:24.819 OpenEarsTest[7002:2781787] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord 2015-05-07 16:09:25.181 OpenEarsTest[7002:2781772] done starting audio unit INFO: cmd_ln.c(702): Parsing command line: \ -lm /var/mobile/Containers/Data/Application/530E1CD4-840F-4D4D-95D9-ABB9F30B202F/Library/Caches/NameIWantForMyLanguageModelFiles.DMP \ -vad_prespeech 10 \ -vad_postspeech 69 \ -vad_threshold 2.000000 \ -remove_noise yes \ -remove_silence yes \ -bestpath yes \ -lw 6.500000 \ -dict /var/mobile/Containers/Data/Application/530E1CD4-840F-4D4D-95D9-ABB9F30B202F/Library/Caches/NameIWantForMyLanguageModelFiles.dic \ -hmm /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle Current configuration: [NAME] [DEFLT] [VALUE] -agc none none -agcthresh 2.0 2.000000e+00 -allphone -allphone_ci no no -alpha 0.97 9.700000e-01 -argfile -ascale 20.0 2.000000e+01 -aw 1 1 -backtrace no no -beam 1e-48 1.000000e-48 -bestpath yes yes -bestpathlw 9.5 9.500000e+00 -bghist no no -ceplen 13 13 -cmn current current -cmninit 8.0 8.0 -compallsen no no -debug 0 -dict /var/mobile/Containers/Data/Application/530E1CD4-840F-4D4D-95D9-ABB9F30B202F/Library/Caches/NameIWantForMyLanguageModelFiles.dic -dictcase no no -dither no no -doublebw no no -ds 1 1 -fdict -feat 1s_c_d_dd 1s_c_d_dd -featparams -fillprob 1e-8 1.000000e-08 -frate 100 100 -fsg -fsgusealtpron yes yes -fsgusefiller yes yes -fwdflat yes yes -fwdflatbeam 1e-64 1.000000e-64 -fwdflatefwid 4 4 -fwdflatlw 8.5 8.500000e+00 -fwdflatsfwin 25 25 -fwdflatwbeam 7e-29 7.000000e-29 -fwdtree yes yes -hmm /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle -input_endian little little -jsgf -kdmaxbbi -1 -1 -kdmaxdepth 0 0 -kdtree -keyphrase -kws -kws_plp 1e-1 1.000000e-01 -kws_threshold 1 1.000000e+00 -latsize 5000 5000 -lda -ldadim 0 0 -lextreedump 0 0 -lifter 0 0 -lm /var/mobile/Containers/Data/Application/530E1CD4-840F-4D4D-95D9-ABB9F30B202F/Library/Caches/NameIWantForMyLanguageModelFiles.DMP -lmctl -lmname -logbase 1.0001 1.000100e+00 -logfn -logspec no no -lowerf 133.33334 1.333333e+02 -lpbeam 1e-40 1.000000e-40 -lponlybeam 7e-29 7.000000e-29 -lw 6.5 6.500000e+00 -maxhmmpf 10000 10000 -maxnewoov 20 20 -maxwpf -1 -1 -mdef -mean -mfclogdir -min_endfr 0 0 -mixw -mixwfloor 0.0000001 1.000000e-07 -mllr -mmap yes yes -ncep 13 13 -nfft 512 512 -nfilt 40 40 -nwpen 1.0 1.000000e+00 -pbeam 1e-48 
1.000000e-48 -pip 1.0 1.000000e+00 -pl_beam 1e-10 1.000000e-10 -pl_pbeam 1e-5 1.000000e-05 -pl_window 0 0 -rawlogdir -remove_dc no no -remove_noise yes yes -remove_silence yes yes -round_filters yes yes -samprate 16000 1.600000e+04 -seed -1 -1 -sendump -senlogdir -senmgau -silprob 0.005 5.000000e-03 -smoothspec no no -svspec -tmat -tm2015-05-07 16:09:25.186 OpenEarsTest[7002:2781787] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is ---HeadphonesMicrophoneBuiltIn---. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x1756fa10, inputs = (null); outputs = ( "<AVAudioSessionPortDescription: 0x1756f5e0, type = Headphones; name = Headphones; UID = Wired Headphones; selectedDataSource = (null)>" )>. atfloor 0.0001 1.000000e-04 -topn 4 4 -topn_beam 0 0 -toprule -transform legacy legacy -unit_area yes yes -upperf 6855.4976 6.855498e+03 -usewdphones no no -uw 1.0 1.000000e+00 -vad_postspeech 50 69 -vad_prespeech 10 10 -vad_threshold 2.0 2.000000e+00 -var -varfloor 0.0001 1.000000e-04 -varnorm no no -verbose no no -warp_params -warp_type inverse_linear inverse_linear -wbeam 7e-29 7.000000e-29 -wip 0.65 6.500000e-01 -wlen 0.025625 2.562500e-02 INFO: cmd_ln.c(702): Parsing command line: \ -nfilt 25 \ -lowerf 130 \ -upperf 6800 \ -feat 1s_c_d_dd \ -svspec 0-12/13-25/26-38 \ -agc none \ -cmn current \ -varnorm no \ -transform dct \ -lifter 22 \ -cmninit 40 Current configuration: [NAME] [DEFLT] [VALUE] -agc none none -agcthresh 2.0 2.000000e+00 -alpha 0.97 9.700000e-01 -ceplen 13 13 -cmn current current -cmninit 8.0 40 -dither no no -doublebw no no -feat 1s_c_d_dd 1s_c_d_dd -frate 100 100 -input_endian little little -lda -ldadim 0 0 -lifter 0 22 -logspec no no -lowerf 133.33334 1.300000e+02 -ncep 13 13 -nfft 512 512 -nfilt 40 25 -remove_dc no no -remove_noise yes yes -remove_silence yes yes -round_filters yes yes -samprate 16000 1.600000e+04 -seed -1 -1 -smoothspec no no -svspec 0-12/13-25/26-38 -transform legacy dct -unit_area yes yes -upperf 6855.4976 6.800000e+03 -vad_postspeech 50 69 -vad_prespeech 10 10 -vad_threshold 2.0 2.000000e+00 -varnorm no no -verbose no no -warp_params -warp_type inverse_linear inverse_linear -wlen 0.025625 2.562500e-02 INFO: acmod.c(252): Parsed model-specific feature parameters from /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/feat.params INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none' INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0 INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38 INFO: mdef.c(518): Reading model definition: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/mdef INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file INFO: bin_mdef.c(336): Reading binary model definition: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/mdef INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq INFO: tmat.c(206): Reading HMM transition probability matrices: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/transition_matrices INFO: acmod.c(124): Attempting to use SCHMM 
computation module INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/means INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/variances INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(294): 512x13 INFO: ms_gauden.c(354): 0 variance values floored INFO: s2_semi_mgau.c(904): Loading senones from dump file /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/sendump INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138 INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0 INFO: dict.c(320): Allocating 4111 * 20 bytes (80 KiB) for word entries INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/530E1CD4-840F-4D4D-95D9-ABB9F30B202F/Library/Caches/NameIWantForMyLanguageModelFiles.dic INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones INFO: dict.c(336): 6 words read INFO: dict.c(342): Reading filler dictionary: /private/var/mobile/Containers/Bundle/Application/DF8AF024-0948-44C8-A1D9-F4AB45F1B4F9/OpenEarsTest.app/AcousticModelEnglish.bundle/noisedict INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones INFO: dict.c(345): 9 words read INFO: dict2pid.c(396): Building PID tables for dictionary INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones INFO: dict2pid.c(132): Allocated 25576 bytes (24 KiB) for word-final triphones INFO: dict2pid.c(196): Allocated 25576 bytes (24 KiB) for single-phone word triphones INFO: ngram_model_arpa.c(79): No \data\ mark in LM file INFO: ngram_model_dmp.c(166): Will use memory-mapped I/O for LM file INFO: ngram_model_dmp.c(220): ngrams 1=7, 2=9, 3=6 INFO: ngram_model_dmp.c(266): 7 = LM.unigrams(+trailer) read INFO: ngram_model_dmp.c(312): 9 = LM.bigrams(+trailer) read INFO: ngram_model_dmp.c(338): 6 = LM.trigrams read INFO: ngram_model_dmp.c(363): 4 = LM.prob2 entries read INFO: ngram_model_dmp.c(383): 5 = LM.bo_wt2 entries read INFO: ngram_model_dmp.c(403): 2 = LM.prob3 entries read INFO: ngram_model_dmp.c(431): 1 = LM.tseg_base entries read INFO: ngram_model_dmp.c(487): 7 = ascii word strings read INFO: ngram_search_fwdtree.c(99): 4 unique initial diphones INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 12 single-phone words INFO: ngram_search_fwdtree.c(186): Creating search tree INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 12 single-phone words INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 138 INFO: ngram_search_fwdtree.c(339): after: 4 root, 10 non-root channels, 11 single-phone words INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25 2015-05-07 16:09:25.299 OpenEarsTest[7002:2781772] Restoring SmartCMN value of 49.027588 2015-05-07 16:09:25.299 OpenEarsTest[7002:2781772] Listening. 
2015-05-07 16:09:25.300 OpenEarsTest[7002:2781772] Project has these words or phrases in its dictionary: A A(2) OTHER PHRASE STATEMENT WORD 2015-05-07 16:09:25.301 OpenEarsTest[7002:2781772] Recognition loop has started 2015-05-07 16:09:25.322 OpenEarsTest[7002:2781753] Pocketsphinx is now listening. 2015-05-07 16:09:25.706 OpenEarsTest[7002:2781774] Speech detected... 2015-05-07 16:09:25.707 OpenEarsTest[7002:2781753] Pocketsphinx has detected speech. 2015-05-07 16:09:30.045 OpenEarsTest[7002:2781753] Stopping listening. INFO: ngram_search.c(462): Resized backpointer table to 10000 entries INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.00 CPU nan xRT INFO: ngram_search_fwdtree.c(435): TOTAL fwdtree 0.00 wall nan xRT INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.00 CPU nan xRT INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.00 wall nan xRT INFO: ngram_search.c(307): TOTAL bestpath 0.00 CPU nan xRT INFO: ngram_search.c(310): TOTAL bestpath 0.00 wall nan xRT 2015-05-07 16:09:30.663 OpenEarsTest[7002:2781753] No longer listening. 2015-05-07 16:09:30.667 OpenEarsTest[7002:2781753] Pocketsphinx has stopped listening. 2015-05-07 16:09:30.708 OpenEarsTest[7002:2781787] Audio route has changed for the following reason: 2015-05-07 16:09:30.713 OpenEarsTest[7002:2781787] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord 2015-05-07 16:09:30.720 OpenEarsTest[7002:2781787] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is ---Headphones---. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x17647190, inputs = ( "<AVAudioSessionPortDescription: 0x17647790, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Bottom>" ); outputs = ( "<AVAudioSessionPortDescription: 0x17652f40, type = Headphones; name = Headphones; UID = Wired Headphones; selectedDataSource = (null)>" )>. 2015-05-07 16:09:32.863 OpenEarsTest[7002:2781753] (null) 2015-05-07 16:09:32.935 OpenEarsTest[7002:2781753] There was no audio string set. Error: Error Domain=NSOSStatusErrorDomain Code=-10875 "The operation couldn’t be completed. (OSStatus error -10875.)"
chymel (Participant)
I’m sorry, but it’s still not working. I’ve followed the tutorial; before, I wasn’t adding the linker flags, but now I am, for both the target and the project.
I hate to dump code, but this is my .m file. I can’t find anything wrong.
//
//  ViewController.m
//  OpenEarsTest
//

#import "ViewController.h"
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>
#import <OpenEars/OEPocketsphinxController.h>
#import <OpenEars/OEAcousticModel.h>
#import <OpenEars/OEEventsObserver.h>
#import <SaveThatWaveDemo/OEEventsObserver+SaveThatWave.h>
#import <SaveThatWaveDemo/SaveThatWaveController.h>

@interface ViewController () <OEEventsObserverDelegate>
@property AVAudioPlayer *musicPlayer;
@property NSString *audioFilePath;
@property (strong, nonatomic) OEEventsObserver *openEarsEventsObserver;
@property (strong, nonatomic) SaveThatWaveController *saveThatWaveController;
@end

@implementation ViewController

@synthesize saveThatWaveController;

- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
    NSLog(@"The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);
}

- (void) pocketsphinxDidStartListening {
    NSLog(@"Pocketsphinx is now listening.");
}

- (void) pocketsphinxDidDetectSpeech {
    NSLog(@"Pocketsphinx has detected speech.");
}

- (void) pocketsphinxDidDetectFinishedSpeech {
    NSLog(@"Pocketsphinx has detected a period of silence, concluding an utterance.");
}

- (void) pocketsphinxDidStopListening {
    NSLog(@"Pocketsphinx has stopped listening.");
}

- (void) pocketsphinxDidSuspendRecognition {
    NSLog(@"Pocketsphinx has suspended recognition.");
}

- (void) pocketsphinxDidResumeRecognition {
    NSLog(@"Pocketsphinx has resumed recognition.");
}

- (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
    NSLog(@"Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@", newLanguageModelPathAsString, newDictionaryPathAsString);
}

- (void) pocketSphinxContinuousSetupDidFailWithReason:(NSString *)reasonForFailure {
    NSLog(@"Listening setup wasn't successful and returned the failure reason: %@", reasonForFailure);
}

- (void) pocketSphinxContinuousTeardownDidFailWithReason:(NSString *)reasonForFailure {
    NSLog(@"Listening teardown wasn't successful and returned the failure reason: %@", reasonForFailure);
}

- (void) testRecognitionCompleted {
    NSLog(@"A test file that was submitted for recognition is now complete.");
}

// This is never called
- (void) wavWasSavedAtLocation:(NSString *)location {
    NSLog(@"WAV was saved at the path %@", location);
    self.audioFilePath = location;
}

- (IBAction)startPlayback:(id)sender {
    [self.saveThatWaveController stop];
    [[OEPocketsphinxController sharedInstance] stopListening];
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2 * NSEC_PER_SEC)), dispatch_get_main_queue(), ^{
        NSLog(@"%@", self.audioFilePath); // this returns (null)
        NSURL *fileURL = [NSURL URLWithString:self.audioFilePath];
        NSError *error;
        self.musicPlayer = [[AVAudioPlayer alloc] initWithContentsOfURL:fileURL error:&error];
        if (error) {
            UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Playback Error" message:@"There was a problem playing the file." delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
            [alert show];
            NSLog(@"%@", error);
        } else {
            [self.musicPlayer prepareToPlay];
            [self.musicPlayer play];
        }
    });
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.
    self.openEarsEventsObserver = [[OEEventsObserver alloc] init];
    [self.openEarsEventsObserver setDelegate:self];
    self.saveThatWaveController = [[SaveThatWaveController alloc] init];

    OELanguageModelGenerator *lmGenerator = [[OELanguageModelGenerator alloc] init];
    NSArray *words = [NSArray arrayWithObjects:@"WORD", @"STATEMENT", @"OTHER WORD", @"A PHRASE", nil];
    NSString *name = @"NameIWantForMyLanguageModelFiles";
    NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to create a Spanish language model instead of an English one.

    NSString *lmPath = nil;
    NSString *dicPath = nil;
    if (err == nil) {
        lmPath = [lmGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"NameIWantForMyLanguageModelFiles"];
        dicPath = [lmGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"NameIWantForMyLanguageModelFiles"];
    } else {
        NSLog(@"Error: %@", [err localizedDescription]);
    }

    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
    [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to perform Spanish recognition instead of English.

    [self.saveThatWaveController start]; // For saving WAVs from OpenEars or RapidEars
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

@end
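One thing that stands out in startPlayback: above: URLWithString: expects a URL string, while wavWasSavedAtLocation: appears to hand back a plain filesystem path, and audioFilePath is nil anyway until a WAV has actually been saved (which matches the "(null)" and the -10875 error in the log). A more defensive version might look like this; it is only a sketch using the property names from the code above, not the official SaveThatWave usage:

// Sketch only: guard against a missing path and build a file URL from a local path.
- (IBAction)startPlayback:(id)sender {
    if (self.audioFilePath == nil) { // nothing has been saved yet, so there is nothing to play
        NSLog(@"No WAV has been saved yet.");
        return;
    }
    NSURL *fileURL = [NSURL fileURLWithPath:self.audioFilePath]; // fileURLWithPath: rather than URLWithString:
    NSError *error = nil;
    self.musicPlayer = [[AVAudioPlayer alloc] initWithContentsOfURL:fileURL error:&error];
    if (error) {
        NSLog(@"Playback error: %@", error);
    } else {
        [self.musicPlayer prepareToPlay];
        [self.musicPlayer play];
    }
}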
chymel (Participant)
In the SaveThatWave introduction and installation manual? Do you have a link to the tutorial you are referencing?
chymel (Participant)
I did follow it exactly and I’m able to get OpenEars and OEPocketsphinxController to function properly.
Outside of instantiating SaveThatWaveController, is this all I should be doing to start recording?
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO];
[self.saveThatWaveController start];
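For context, that lines up with the full sequence in the view controller earlier in the thread. A stripped-down sketch of the pieces involved, using the same names and paths as the code above, with error handling omitted:

// Minimal sketch of the recording setup used in this thread (error handling omitted).
self.openEarsEventsObserver = [[OEEventsObserver alloc] init]; // delivers the delegate callbacks
[self.openEarsEventsObserver setDelegate:self];
self.saveThatWaveController = [[SaveThatWaveController alloc] init];

[[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO];
[self.saveThatWaveController start]; // from here, each recognized utterance should produce a wavWasSavedAtLocation: callback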