Forum Replies Created
May 13, 2015 at 5:04 pm in reply to: Using SaveThatWav to capture an "audio only" utterance with no hypothesis? #1025757
ekobres (Participant)
Can you point me to something that explains how “Rejecto syllables” work?
Will I need to set up to receive null hypotheses?
I’ll try the “Ah” dictionary and see what happens.
For what it’s worth, I’ve got this kind of working with a fairly straightforward HTTP POST to Nuance.
I don’t suppose there’s a way to tap into your recognizer to get the audio bits as they arrive? The performance would be a lot better if I could stream the bits to Nuance on the fly rather than sending the whole file at the end. (That’s what their recognizer did – which was the only nice thing about it.)
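In case it helps anyone else, the whole-file upload is roughly the sketch below. The endpoint URL, Content-Type value, auth header, and the pathToSavedWav property are placeholders from memory, not Nuance’s documented values:

// Sketch of the whole-file POST. The endpoint, headers, and
// self.pathToSavedWav are illustrative placeholders only.
NSData *audio = [NSData dataWithContentsOfFile:self.pathToSavedWav]; // WAV captured by SaveThatWave
NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:
    [NSURL URLWithString:@"https://recognizer.example.com/v1/recognize"]]; // placeholder endpoint
request.HTTPMethod = @"POST";
[request setValue:@"audio/x-wav;codec=pcm;bit=16;rate=16000" forHTTPHeaderField:@"Content-Type"]; // assumption
[request setValue:@"MY-API-KEY" forHTTPHeaderField:@"X-Api-Key"]; // placeholder credential

NSURLSessionUploadTask *task =
    [[NSURLSession sharedSession] uploadTaskWithRequest:request
                                               fromData:audio
                                      completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
    if (error) {
        NSLog(@"Upload failed: %@", error);
    } else {
        // The recognizer's reply arrives here once the whole file has been processed.
        NSLog(@"Recognizer replied: %@", [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]);
    }
}];
[task resume];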
So now all my Nuance recognizer gremlins are solved – but the recognition is noticeably slower… :( I just can’t seem to hit a home run.
Thanks again.
ekobres (Participant)
Hmm, I’ll play around with it some more.
All I did was store the path string from the same instance variable I assigned it to originally – the same one that gets passed to startListening. There’s not much that can go wrong there – and I can see it’s the identical string – a file:// URL into the app’s caches directory.
The same call to startListening is being made using the same instance variables – they are just populated from NSUserDefaults rather than directly from pathToSuccessfullyGeneratedLanguageModelWithRequestedName and pathToSuccessfullyGeneratedDictionaryWithRequestedName.
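In code form, the pattern is just this sketch (the NSUserDefaults key names are my own):

// Sketch of the save/restore pattern described above; the key names are made up.
// After a successful generation:
[[NSUserDefaults standardUserDefaults] setObject:self.pathToFirstDynamicallyGeneratedLanguageModel forKey:@"SavedLMPath"];
[[NSUserDefaults standardUserDefaults] setObject:self.pathToFirstDynamicallyGeneratedDictionary forKey:@"SavedDicPath"];

// On a later launch, restore and listen.
// (Caveat: on iOS 8 the app container UUID can change across installs/updates,
// so a stored absolute path can go stale even though the string looks right.)
NSString *lmPath = [[NSUserDefaults standardUserDefaults] stringForKey:@"SavedLMPath"];
NSString *dicPath = [[NSUserDefaults standardUserDefaults] stringForKey:@"SavedDicPath"];
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath
                                                                 dictionaryAtPath:dicPath
                                                              acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
                                                              languageModelIsJSGF:FALSE];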
May 12, 2015 at 9:19 pm in reply to: Using SaveThatWav to capture an "audio only" utterance with no hypothesis? #1025742
ekobres (Participant)
I’m not trying to run both concurrently – I just want to be able to use SaveThatWave to capture the dictation I would like to submit to Nuance. I am willing to use their HTTPS API to submit the bits rather than suffer through their dreadful “SpeechKit” SDK, which behaves erratically. I would like to have the benefit of streaming the audio to them since the reply comes back faster. And I would love the convenience of their recognizer if it actually worked – but it’s full of bugs.
The latest trick it pulls is to posthumously fire a ton of audio route changes after it claims to have finished with recognition. Of course this causes all sorts of problems with the speech synthesizer and with OpenEars.
At least OpenEars is really, actually done when pocketsphinxDidStopListening fires…
So anyway – are you saying that if I pass an empty dictionary to the Rejecto language model generator, it will still generate a WAV file reliably?
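For context, what I’m attempting looks roughly like the sketch below. The Rejecto and SaveThatWave method names are from my reading of the plugin docs and headers, so treat them as assumptions and double-check them:

// Sketch only: a Rejecto model with a near-empty vocabulary, with SaveThatWave
// capturing each utterance. Method names are assumptions from the plugin docs.
OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
NSError *error = [generator generateRejectingLanguageModelFromArray:@[@"AH"] // throwaway vocabulary
                                                     withFilesNamed:@"CaptureModel"
                                             withOptionalExclusions:nil
                                                    usingVowelsOnly:FALSE
                                                         withWeight:nil
                                             forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
if (!error) {
    self.saveThatWaveController = [[OESaveThatWaveController alloc] init];
    [self.saveThatWaveController start]; // WAVs should then arrive via the wavWasSavedAtLocation: delegate callback
}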
ekobres (Participant)
I am using the paid version of Rejecto (and SaveThatWave).
May 12, 2015 at 6:21 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025734
ekobres (Participant)
Hi Halle – did you ever get a chance to look at this one?
April 19, 2015 at 3:42 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025459
ekobres (Participant)
I got your minimal replication case to crash:
Remove firstLanguageArray and secondLanguageArray, and instead declare them both as NSDictionary literals:
NSDictionary *firstLanguageDictionary = @{ ThisWillBeSaidOnce : @[ @{ OneOfTheseCanBeSaidWithOptionalRepetitions : @[ @"BACKWARD", @"CHANGE", @"FORWARD", @"GO", @"LEFT", @"MODEL", @"RIGHT", @"TURN"]} ] };

...

NSDictionary *secondLanguageDictionary = @{ ThisWillBeSaidOnce : @[ @{ OneOfTheseCanBeSaidWithOptionalRepetitions : @[ @"SUNDAY", @"MONDAY", @"TUESDAY", @"WEDNESDAY", @"THURSDAY", @"FRIDAY", @"SATURDAY", @"QUIDNUNC", @"CHANGE MODEL"]} ] };
The languageModelGenerator calls become:
NSError *error = [languageModelGenerator generateGrammarFromDictionary:firstLanguageDictionary withFilesNamed:@"FirstOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];

...

error = [languageModelGenerator generateGrammarFromDictionary:secondLanguageDictionary withFilesNamed:@"SecondOpenEarsDynamicLanguageModel" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
The dictionary path assignments become:
self.pathToFirstDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedGrammarWithRequestedName:@"FirstOpenEarsDynamicLanguageModel"];

...

self.pathToSecondDynamicallyGeneratedLanguageModel = [languageModelGenerator pathToSuccessfullyGeneratedGrammarWithRequestedName:@"SecondOpenEarsDynamicLanguageModel"];
Now run the app and say a few words to prove it’s working.
Next say “CHANGE MODEL”. My version crashes (on another thread) while executing this line:
[self.fliteController say:[NSString stringWithFormat:@"You said %@",hypothesis] withVoice:self.slt];
Here’s the backtrace:
(lldb) bt
* thread #10: tid = 0x209f8, 0x000000010008322c OpenEarsSampleApp`fsg_lextree_init + 516, queue = 'com.apple.root.default-qos', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x000000010008322c OpenEarsSampleApp`fsg_lextree_init + 516
frame #1: 0x0000000100047ef0 OpenEarsSampleApp`fsg_search_reinit + 92
frame #2: 0x000000010001a968 OpenEarsSampleApp`ps_load_dict + 364
frame #3: 0x000000010002e37c OpenEarsSampleApp`usenglish_init + 7008
frame #4: 0x000000010002d16c OpenEarsSampleApp`usenglish_init + 2384
frame #5: 0x000000010002cda8 OpenEarsSampleApp`usenglish_init + 1420
frame #6: 0x0000000100670f94 libdispatch.dylib`_dispatch_client_callout + 16
frame #7: 0x0000000100688848 libdispatch.dylib`_dispatch_source_latch_and_call + 1392
frame #8: 0x00000001006731c0 libdispatch.dylib`_dispatch_source_invoke + 292
frame #9: 0x000000010067e5d4 libdispatch.dylib`_dispatch_root_queue_drain + 772
frame #10: 0x0000000100680248 libdispatch.dylib`_dispatch_worker_thread3 + 132
frame #11: 0x0000000194f2522c libsystem_pthread.dylib`_pthread_wqthread + 816
The fatal call was this one:
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToSecondDynamicallyGeneratedLanguageModel withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
Please let me know if you are unable to replicate. Happens for me every time.
-Erick
April 19, 2015 at 12:04 am in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025457
ekobres (Participant)
Well, looking at your code and where I would replicate it, I found this comment:
// You can only change language models with ARPA grammars in OpenEars (the ones that end in .languagemodel or .DMP).
// Trying to switch between JSGF models (the ones that end in .gram) will return no result.
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToSecondDynamicallyGeneratedLanguageModel withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
self.usingStartingLanguageModel = FALSE;
} else {
    // If we're on the dynamically generated model, switch to the start model (this is just an example of a trigger and method for switching models).
    [[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:self.pathToFirstDynamicallyGeneratedLanguageModel withDictionary:self.pathToFirstDynamicallyGeneratedDictionary];
    self.usingStartingLanguageModel = TRUE;
}
I’m trying to change language models with .gram files – so I suspect that’s why it’s not working. Although it crashes rather than “doing nothing.” Do you still believe it should just work, given these comments in the sample app code?
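In the meantime I may just guard on the file extension – something like this sketch (my own defensive workaround, not an official OpenEars pattern):

// Workaround sketch: only switch in place for ARPA models; for .gram files,
// restart the listening session instead of calling changeLanguageModelToFile:.
NSString *newModel = self.pathToSecondDynamicallyGeneratedLanguageModel;
if ([[newModel pathExtension] isEqualToString:@"gram"]) {
    [[OEPocketsphinxController sharedInstance] stopListening];
    [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:newModel
                                                                     dictionaryAtPath:self.pathToSecondDynamicallyGeneratedDictionary
                                                                  acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
                                                                  languageModelIsJSGF:TRUE];
} else {
    // ARPA-to-ARPA switching is the documented use of this API and works for me.
    [[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:newModel
                                                          withDictionary:self.pathToSecondDynamicallyGeneratedDictionary];
}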
April 11, 2015 at 4:27 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025392
ekobres (Participant)
Same results with or without n-best.
bt doesn’t provide any info (literally nothing shown) but here’s the call stack from the thread that crashed:

Thread 8 Queue : com.apple.root.default-qos (concurrent)
#0 0x00000001001709e0 in fsg_lextree_init ()
#1 0x000000010014221c in fsg_search_reinit ()
#2 0x000000010011cfb0 in ps_load_dict ()
#3 0x000000010012c820 in ___lldb_unnamed_function75$$FieldInsight ()
#4 0x000000010012b610 in ___lldb_unnamed_function66$$FieldInsight ()
#5 0x000000010012b24c in ___lldb_unnamed_function59$$FieldInsight ()
#6 0x0000000100984f94 in _dispatch_client_callout ()
#7 0x000000010099c848 in _dispatch_source_latch_and_call ()
#8 0x00000001009871c0 in _dispatch_source_invoke ()
#9 0x00000001009925d4 in _dispatch_root_queue_drain ()
#10 0x0000000100994248 in _dispatch_worker_thread3 ()
#11 0x0000000195b3522c in _pthread_wqthread ()
And the nearest label (dict_wordid) I could see is included here in the disassembly:
0x100170960 <+388>: bl     0x100173ac8    ; dict_wordid
0x100170964 <+392>: ldr    x8, [x20, #32]
0x100170968 <+396>: cbz    x8, 0x1001709bc    ; <+480>
0x10017096c <+400>: ldr    w9, [x23, #12]
0x100170970 <+404>: asr    w10, w9, #31
0x100170974 <+408>: add    w10, w9, w10, lsr #27
0x100170978 <+412>: asr    w10, w10, #5
0x10017097c <+416>: add    x8, x8, w10, sxtw #2
0x100170980 <+420>: ldr    w8, [x8]
0x100170984 <+424>: and    x9, x9, #0x1f
0x100170988 <+428>: lsl    x9, x27, x9
0x10017098c <+432>: and    x8, x8, x9
0x100170990 <+436>: cbz    x8, 0x1001709bc    ; <+480>
0x100170994 <+440>: ldrsw  x8, [x23]
0x100170998 <+444>: ldr    x9, [x26, #48]
0x10017099c <+448>: ldr    x8, [x9, x8, lsl #3]
0x1001709a0 <+452>: lsl    x9, x25, #1
0x1001709a4 <+456>: strh   w19, [x8, x9]
0x1001709a8 <+460>: ldrsw  x8, [x23, #4]
0x1001709ac <+464>: ldr    x10, [x26, #40]
0x1001709b0 <+468>: ldr    x8, [x10, x8, lsl #3]
0x1001709b4 <+472>: strh   w19, [x8, x9]
0x1001709b8 <+476>: b      0x100170928    ; <+332>
0x1001709bc <+480>: sxtw   x8, w0
0x1001709c0 <+484>: ldr    x9, [x26, #16]
0x1001709c4 <+488>: ldr    x9, [x9, #16]
0x1001709c8 <+492>: add    x8, x9, x8, lsl #5
0x1001709cc <+496>: ldr    w9, [x8, #16]
0x1001709d0 <+500>: ldr    x8, [x8, #8]
0x1001709d4 <+504>: ldrsw  x10, [x23]
0x1001709d8 <+508>: ldr    x11, [x26, #48]
0x1001709dc <+512>: ldr    x10, [x11, x10, lsl #3]
-> 0x1001709e0 <+516>: ldrsh  x11, [x8]
April 10, 2015 at 5:39 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025387
ekobres (Participant)
Here’s the whole thing: I say a few words using the first grammar to show that it’s working. The first set is named “TWPassiveMode” and the set that crashes is named “TWActiveMode”. Both grammars are simple OneOfTheseWillBeSaidOnce cases.
If you want to see the same thing working properly with array-based models I can post that as well.
2015-04-10 11:35:19.465 FieldInsight[276:14187] Starting OpenEars logging for OpenEars version 2.03 on 64-bit device (or build): iPhone running iOS version: 8.300000
2015-04-10 11:35:19.526 FieldInsight[276:14187] Starting dynamic language model generation
2015-04-10 11:35:19.612 FieldInsight[276:14187] Done creating language model with CMUCLMTK in 0.085239 seconds.
2015-04-10 11:35:19.665 FieldInsight[276:14187] I'm done running performDictionaryLookup and it took 0.041713 seconds
2015-04-10 11:35:19.671 FieldInsight[276:14187] I'm done running dynamic language model generation and it took 0.198541 seconds
2015-04-10 11:35:19.678 FieldInsight[276:14187] Starting dynamic language model generation
2015-04-10 11:35:19.753 FieldInsight[276:14187] Done creating language model with CMUCLMTK in 0.074011 seconds.
2015-04-10 11:35:19.786 FieldInsight[276:14187] I'm done running performDictionaryLookup and it took 0.028821 seconds
2015-04-10 11:35:19.794 FieldInsight[276:14187] I'm done running dynamic language model generation and it took 0.121173 seconds
2015-04-10 11:35:19.838 FieldInsight[276:14187] I'm done running performDictionaryLookup and it took 0.028492 seconds
2015-04-10 11:35:19.859 FieldInsight[276:14187] Starting dynamic language model generation
2015-04-10 11:35:19.930 FieldInsight[276:14187] Done creating language model with CMUCLMTK in 0.070039 seconds.
2015-04-10 11:35:19.965 FieldInsight[276:14187] The word AMERANTH was not found in the dictionary /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/LanguageModelGeneratorLookupList.text/LanguageModelGeneratorLookupList.text.
2015-04-10 11:35:19.966 FieldInsight[276:14187] Now using the fallback method to look up the word AMERANTH
2015-04-10 11:35:19.966 FieldInsight[276:14187] If this is happening more frequently than you would expect, the most likely cause for it is since you are using the English phonetic lookup dictionary is that your words are not in English or aren't dictionary words, or that you are submitting the words in lowercase when they need to be entirely written in uppercase. This can also happen if you submit words with punctuation attached – consider removing punctuation from language models or grammars you create before submitting them.
2015-04-10 11:35:19.966 FieldInsight[276:14187] Using convertGraphemes for the word or phrase AMERANTH which doesn't appear in the dictionary
2015-04-10 11:35:20.004 FieldInsight[276:14187] The word NINTEY was not found in the dictionary /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/LanguageModelGeneratorLookupList.text/LanguageModelGeneratorLookupList.text.
2015-04-10 11:35:20.005 FieldInsight[276:14187] Now using the fallback method to look up the word NINTEY
2015-04-10 11:35:20.005 FieldInsight[276:14187] If this is happening more frequently than you would expect, the most likely cause for it is since you are using the English phonetic lookup dictionary is that your words are not in English or aren't dictionary words, or that you are submitting the words in lowercase when they need to be entirely written in uppercase. This can also happen if you submit words with punctuation attached – consider removing punctuation from language models or grammars you create before submitting them.
2015-04-10 11:35:20.005 FieldInsight[276:14187] Using convertGraphemes for the word or phrase NINTEY which doesn't appear in the dictionary
2015-04-10 11:35:20.021 FieldInsight[276:14187] I'm done running performDictionaryLookup and it took 0.086493 seconds
2015-04-10 11:35:20.030 FieldInsight[276:14187] I'm done running dynamic language model generation and it took 0.178032 seconds
2015-04-10 11:35:20.047 FieldInsight[276:14187] Unable to simultaneously satisfy constraints. Probably at least one of the constraints in the following list is one you don't want. Try this: (1) look at each constraint and try to figure out which you don't expect; (2) find the code that added the unwanted constraint or constraints and fix it. (Note: If you're seeing NSAutoresizingMaskLayoutConstraints that you don't understand, refer to the documentation for the UIView property translatesAutoresizingMaskIntoConstraints)
(
"<NSLayoutConstraint:0x174099230 H:[UITextView:0x12700f000(600)]>",
"<NSLayoutConstraint:0x174e9e370 H:[UITextView:0x12700f000]-(0)-| (Names: '|':UIView:0x126e1cdb0 )>",
"<NSLayoutConstraint:0x174e9e3c0 H:|-(0)-[UITextView:0x12700f000] (Names: '|':UIView:0x126e1cdb0 )>",
"<NSLayoutConstraint:0x17588f9b0 'UIView-Encapsulated-Layout-Width' H:[UIView:0x126e1cdb0(320)]>"
)
Will attempt to recover by breaking constraint <NSLayoutConstraint:0x174099230 H:[UITextView:0x12700f000(600)]>
Make a symbolic breakpoint at UIViewAlertForUnsatisfiableConstraints to catch this in the debugger. The methods in the UIConstraintBasedLayoutDebugging category on UIView listed in <UIKit/UIView.h> may also be helpful.
applicationDidBecomeActive()
listen(false -> true)
2015-04-10 11:35:22.732 FieldInsight[276:14187] Creating shared instance of OEPocketsphinxController
2015-04-10 11:35:22.734 FieldInsight[276:14187] Attempting to start listening session from startListeningWithLanguageModelAtPath:
2015-04-10 11:35:22.748 FieldInsight[276:14187] User gave mic permission for this app.
2015-04-10 11:35:22.750 FieldInsight[276:14187] setSecondsOfSilence wasn't set, using default of 0.700000.
2015-04-10 11:35:22.752 FieldInsight[276:14187] Successfully started listening session from startListeningWithLanguageModelAtPath:
2015-04-10 11:35:22.753 FieldInsight[276:14225] Starting listening.
2015-04-10 11:35:22.753 FieldInsight[276:14225] about to set up audio session
2015-04-10 11:35:22.755 FieldInsight[276:14225] audioMode is incorrect, we will change it.
2015-04-10 11:35:22.756 FieldInsight[276:14225] audioMode is now on the correct setting.
2015-04-10 11:35:22.877 FieldInsight[276:14243] Audio route has changed for the following reason:
2015-04-10 11:35:22.881 FieldInsight[276:14243] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-04-10 11:35:23.107 FieldInsight[276:14225] done starting audio unit
INFO: cmd_ln.c(702): Parsing command line: \
-jsgf /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWPassiveMode.gram \
-vad_prespeech 10 \
-vad_postspeech 69 \
-vad_threshold 3.500000 \
-remove_noise yes \
-remove_silence yes \
-bestpath yes \
-lw 1.000000 \
-dict /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWPassiveMode.dic \
-hmm /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none -agcthresh 2.0 2.000000e+00 -allphone -allphone_ci no no -alpha 0.97 9.700000e-01 -argfile -ascale 20.0 2.000000e+01 -aw 1 1 -backtrace no no -beam 1e-48 1.000000e-48 -bestpath yes yes -bestpathlw 9.5 9.500000e+00 -bghist no no -ceplen 13 13 -cmn current current -cmninit 8.0 8.0 -compallsen no no -debug 0 -dict /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWPassiveMode.dic -dictcase no no -dither no no -doublebw no no -ds 1 1 -fdict -feat 1s_c_d_dd 1s_c_d_dd -featparams -fillprob 1e-8 1.000000e-08 -frate 100 100 -fsg -fsgusealtpron yes yes -fsgusefiller yes yes -fwdflat yes yes -fwdflatbeam 1e-64 1.000000e-64 -fwdflatefwid 4 4 -fwdflatlw 8.5 8.500000e+00 -fwdflatsfwin 25 25 -fwdflatwbeam 7e-29 7.000000e-29 -fwdtree yes yes -hmm /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle -input_endian little little -jsgf /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWPassiveMode.gram -kdmaxbbi -1 -1 -kdmaxdepth 0 0 -kdtree -keyphrase -kws -kws_plp 1e-1 1.000000e-01 -kws_threshold 1 1.000000e+00 -latsize 5000 5000 -lda -ldadim 0 0 -lextreedump 0 0 -lifter 0 0 -lm -lmctl -lmname -logbase 1.0001 1.000100e+00 -logfn -logspec no no -lowerf 133.33334 1.333333e+02 -lpbeam 1e-40 1.000000e-40 -lponlybeam 7e-29 7.000000e-29 -lw 6.5 1.000000e+00 -maxhmmpf 10000 10000 -maxnewoov 20 20 -maxwpf -1 -1 -mdef -mean -mfclogdir -min_endfr 0 0 -mixw -mixwfloor 0.0000001 1.000000e-07 -mllr -mmap yes yes -ncep 13 13 -nfft 512 512 -nfilt 40 40 -nwpen 1.0 1.000000e+00 -pbeam 1e-48 1.000000e-48 -pip 1.0 1.000000e+00 -pl_beam 1e-10 1.000000e-10 -pl_pbeam 1e-5 1.000000e-05 -pl_window 0 0 -rawlogdir -remove_dc no no -remove_noise yes yes -remove_silence yes yes -round_filters yes yes -samprate 16000 1.600000e+04 -seed -1 -1 -sendump -senlogdir -senmgau -silprob 0.005 5.000000e-03 -smoothspec no no -svspec -tmat -tmatfloor 0.0001 1.000000e-04 -topn 4 4 -topn_beam 0 0 -toprule -transform legacy legacy -unit_area yes yes -upperf 6855.4976 6.855498e+03 -usewdphones no no -uw 1.0 1.000000e+00 -vad_postspeech 50 69 -vad_prespeech 10 10 -vad_threshold 2.0 3.500000e+00 -var -varfloor 0.0001 1.000000e-04 -varnorm no no -verbose no no -warp_params -warp_type inverse_linear inverse_linear -wbeam 7e-29 7.000000e-29 -wip 0.65 6.500000e-01 -wlen 0.025625 2.562500e-02
2015-04-10 11:35:23.112 FieldInsight[276:14243] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is ---ReceiverMicrophoneBuiltIn---. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174206ce0, inputs = (null); outputs = ( "<AVAudioSessionPortDescription: 0x174206d70, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>" )>.
INFO: cmd_ln.c(702): Parsing command line: \
-nfilt 25 \
-lowerf 130 \
-upperf 6800 \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-agc none \
-cmn current \
-varnorm no \
-transform dct \
-lifter 22 \
-cmninit 40
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none -agcthresh 2.0 2.000000e+00 -alpha 0.97 9.700000e-01 -ceplen 13 13 -cmn current current -cmninit 8.0 40 -dither no no -doublebw no no -feat 1s_c_d_dd 1s_c_d_dd -frate 100 100 -input_endian little little -lda -ldadim 0 0 -lifter 0 22 -logspec no no -lowerf 133.33334 1.300000e+02 -ncep 13 13 -nfft 512 512 -nfilt 40 25 -remove_dc no no -remove_noise yes yes -remove_silence yes yes -round_filters yes yes -samprate 16000 1.600000e+04 -seed -1 -1 -smoothspec no no -svspec 0-12/13-25/26-38 -transform legacy dct -unit_area yes yes -upperf 6855.4976 6.800000e+03 -vad_postspeech 50 69 -vad_prespeech 10 10 -vad_threshold 2.0 3.500000e+00 -varnorm no no -verbose no no -warp_params -warp_type inverse_linear inverse_linear -wlen 0.025625 2.562500e-02
INFO: acmod.c(252): Parsed model-specific feature parameters from /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/feat.params
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/mdef
INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/transition_matrices
INFO: acmod.c(124): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(294): 512x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(904): Loading senones from dump file /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/sendump
INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(320): Allocating 4115 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWPassiveMode.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 10 words read
INFO: dict.c(342): Reading filler dictionary: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: jsgf.c(691): Defined rule: <TWPassiveMode.g00000>
INFO: jsgf.c(691): Defined rule: PUBLIC <TWPassiveMode.rule_0>
INFO: fsg_model.c(215): Computing transitive closure for null transitions
INFO: fsg_model.c(277): 0 null transitions added
INFO: fsg_search.c(227): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -5, pip: 0)
INFO: fsg_model.c(428): Adding silence transitions for <sil> to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for <sil> to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for [BREATH] to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for [COUGH] to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for [NOISE] to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for [SMACK] to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_model.c(428): Adding silence transitions for [UH] to FSG
INFO: fsg_model.c(448): Added 4 silence word transitions
INFO: fsg_search.c(174): Added 0 alternate word transitions
INFO: fsg_lextree.c(110): Allocated 376 bytes (0 KiB) for left and right context phones
INFO: fsg_lextree.c(256): 71 HMM nodes in lextree (36 leaves)
INFO: fsg_lextree.c(259): Allocated 10224 bytes (9 KiB) for all lextree nodes
INFO: fsg_lextree.c(262): Allocated 5184 bytes (5 KiB) for lextree leafnodes
2015-04-10 11:35:23.178 FieldInsight[276:14225] There is no CMN plist so we are using the fresh CMN value 42.000000.
2015-04-10 11:35:23.178 FieldInsight[276:14225] Listening.
2015-04-10 11:35:23.179 FieldInsight[276:14225] Project has these words or phrases in its dictionary: ACCURACY ACTIVE HEADING HELP LISTEN LOCATE MAP MODE PRIVACY TYPE
2015-04-10 11:35:23.179 FieldInsight[276:14225] Recognition loop has started
Pocketsphinx is now listening.
2015-04-10 11:35:29.117 FieldInsight[276:14209] Speech detected...
Pocketsphinx has detected speech.
2015-04-10 11:35:29.989 FieldInsight[276:14209] End of speech detected...
Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 42.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 20.40 3.11 -3.10 25.90 2.22 -5.13 -7.57 -8.02 -6.38 -8.20 -0.03 -0.83 -9.06 >
INFO: fsg_search.c(843): 90 frames, 845 HMMs (9/fr), 4023 senones (44/fr), 109 history entries (1/fr)
2015-04-10 11:35:29.991 FieldInsight[276:14209] Pocketsphinx heard "LOCATE" with a score of (0) and an utterance ID of 0.
INFO: fsg_search.c(1225): Start node LOCATE.0:24:78
INFO: fsg_search.c(1264): End node <sil>.57:59:89 (-561)
INFO: fsg_search.c(1488): lattice start node LOCATE.0 end node <sil>.57
didReceiveHypothesis: "LOCATE" with a score of 0 and an ID of 0
Saying: "Locating." at rate 0.2
didReceiveNBestHypothesisArray: [{ Hypothesis = LOCATE; Score = "-11211"; }, { Hypothesis = LOCATE; Score = "-11211"; }, { Hypothesis = LOCATE; Score = "-11211"; }]
Pocketsphinx has suspended recognition.
2015-04-10 11:35:30.022 FieldInsight[276:14342] |AXSpeechAssetDownloader|error| ASAssetQuery error fetching results Error Domain=ASError Code=21 "The operation couldn’t be completed. (ASError error 21 - Unable to copy asset information)" UserInfo=0x170868800 {NSDescription=Unable to copy asset information}
2015-04-10 11:35:30.026 FieldInsight[276:14342] Building MacinTalk voice for asset: (null)
Finished speaking.
2015-04-10 11:35:31.050 FieldInsight[276:14187] setSecondsOfSilence wasn't set, using default of 0.700000.
Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 20.40 3.11 -3.10 25.90 2.22 -5.13 -7.57 -8.02 -6.38 -8.20 -0.03 -0.83 -9.06 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 20.40 3.11 -3.10 25.90 2.22 -5.13 -7.57 -8.02 -6.38 -8.20 -0.03 -0.83 -9.06 >
INFO: fsg_search.c(843): 0 frames, 0 HMMs (0/fr), 0 senones (0/fr), 1 history entries (0/fr)
2015-04-10 11:35:32.421 FieldInsight[276:14212] Speech detected...
Pocketsphinx has detected speech.
2015-04-10 11:35:33.444 FieldInsight[276:14212] End of speech detected...
Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 20.40 3.11 -3.10 25.90 2.22 -5.13 -7.57 -8.02 -6.38 -8.20 -0.03 -0.83 -9.06 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 21.68 -2.00 0.95 18.29 5.19 -2.97 -14.01 -5.47 0.84 -5.35 -2.85 -4.40 -8.91 >
INFO: fsg_search.c(843): 116 frames, 1192 HMMs (10/fr), 5021 senones (43/fr), 136 history entries (1/fr)
2015-04-10 11:35:33.445 FieldInsight[276:14212] Pocketsphinx heard "ACCURACY" with a score of (0) and an utterance ID of 1.
INFO: fsg_search.c(1225): Start node ACCURACY.0:47:115
INFO: fsg_search.c(1264): End node <sil>.73:75:115 (-553)
INFO: fsg_search.c(1488): lattice start node ACCURACY.0 end node <sil>.73
didReceiveHypothesis: "ACCURACY" with a score of 0 and an ID of 1
Saying: "Accuracy is currently 10.0 meters" at rate 0.2
didReceiveNBestHypothesisArray: [{ Hypothesis = ACCURACY; Score = "-14157"; }, { Hypothesis = ACCURACY; Score = "-14157"; }, { Hypothesis = ACCURACY; Score = "-14157"; }]
Pocketsphinx has suspended recognition.
Finished speaking.
2015-04-10 11:35:35.876 FieldInsight[276:14187] setSecondsOfSilence wasn't set, using default of 0.700000.
Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 21.68 -2.00 0.95 18.29 5.19 -2.97 -14.01 -5.47 0.84 -5.35 -2.85 -4.40 -8.91 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 21.68 -2.00 0.95 18.29 5.19 -2.97 -14.01 -5.47 0.84 -5.35 -2.85 -4.40 -8.91 >
INFO: fsg_search.c(843): 0 frames, 0 HMMs (0/fr), 0 senones (0/fr), 1 history entries (0/fr)
2015-04-10 11:35:36.524 FieldInsight[276:14212] Speech detected...
Pocketsphinx has detected speech.
2015-04-10 11:35:37.282 FieldInsight[276:14212] End of speech detected...
Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 21.68 -2.00 0.95 18.29 5.19 -2.97 -14.01 -5.47 0.84 -5.35 -2.85 -4.40 -8.91 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 20.59 2.41 0.97 12.46 -0.99 -2.22 -11.45 -6.81 -0.32 -3.00 -3.17 -3.84 -7.86 >
INFO: fsg_search.c(843): 82 frames, 701 HMMs (8/fr), 3303 senones (40/fr), 128 history entries (1/fr)
2015-04-10 11:35:37.285 FieldInsight[276:14212] Pocketsphinx heard "HELP" with a score of (0) and an utterance ID of 2.
didReceiveHypothesis: "HELP" with a score of 0 and an ID of 2
INFO: fsg_search.c(1225): Start node HELP.0:19:81
INFO: fsg_search.c(1264): End node <sil>.44:46:81 (-509)
INFO: fsg_search.c(1488): lattice start node HELP.0 end node <sil>.44
Saying: "Currently in passive mode. Available commands are: LOCATE, MAP TYPE, HEADING, ACCURACY, and PRIVACY. To enter active mode, say: LISTEN, or ACTIVE MODE" at rate 0.2
Pocketsphinx has suspended recognition.
didReceiveNBestHypothesisArray: [{ Hypothesis = HELP; Score = "-9629"; }, { Hypothesis = HELP; Score = "-9629"; }, { Hypothesis = HELP; Score = "-9629"; }]
Finished speaking.
2015-04-10 11:35:46.649 FieldInsight[276:14187] setSecondsOfSilence wasn't set, using default of 0.700000.
Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 20.59 2.41 0.97 12.46 -0.99 -2.22 -11.45 -6.81 -0.32 -3.00 -3.17 -3.84 -7.86 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 20.59 2.41 0.97 12.46 -0.99 -2.22 -11.45 -6.81 -0.32 -3.00 -3.17 -3.84 -7.86 >
INFO: fsg_search.c(843): 0 frames, 0 HMMs (0/fr), 0 senones (0/fr), 1 history entries (0/fr)
2015-04-10 11:35:47.280 FieldInsight[276:14449] Speech detected...
Pocketsphinx has detected speech.
2015-04-10 11:35:48.419 FieldInsight[276:14449] End of speech detected...
Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 20.59 2.41 0.97 12.46 -0.99 -2.22 -11.45 -6.81 -0.32 -3.00 -3.17 -3.84 -7.86 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 21.24 4.28 -0.66 12.17 -0.60 -1.92 -10.66 -7.02 2.19 -2.17 -1.69 -2.97 -7.92 >
INFO: fsg_search.c(843): 118 frames, 1288 HMMs (10/fr), 4865 senones (41/fr), 167 history entries (1/fr)
2015-04-10 11:35:48.422 FieldInsight[276:14449] Pocketsphinx heard "ACTIVE MODE" with a score of (0) and an utterance ID of 3.
didReceiveHypothesis: "ACTIVE MODE" with a score of 0 and an ID of 3
INFO: fsg_search.c(1225): Start node ACTIVE.0:30:44
INFO: fsg_search.c(1264): End node <sil>.72:74:117 (-494)
INFO: fsg_search.c(1264): End node MODE.39:50:117 (-1928)
INFO: fsg_search.c(1488): lattice start node ACTIVE.0 end node </s>.118
Saying: "Active Mode. Listening." at rate 0.2
Pocketsphinx has suspended recognition.
didReceiveNBestHypothesisArray: [{ Hypothesis = "ACTIVE MODE"; Score = "-14977"; }, { Hypothesis = "ACTIVE MODE"; Score = "-14977"; }, { Hypothesis = "ACTIVE MODE"; Score = "-14977"; }]
Finished speaking.
2015-04-10 11:35:49.900 FieldInsight[276:14187] setSecondsOfSilence wasn't set, using default of 0.700000.
Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 21.24 4.28 -0.66 12.17 -0.60 -1.92 -10.66 -7.02 2.19 -2.17 -1.69 -2.97 -7.92 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 21.24 4.28 -0.66 12.17 -0.60 -1.92 -10.66 -7.02 2.19 -2.17 -1.69 -2.97 -7.92 >
INFO: fsg_search.c(843): 0 frames, 0 HMMs (0/fr), 0 senones (0/fr), 1 history entries (0/fr)
2015-04-10 11:35:50.072 FieldInsight[276:14212] there is a request to change to the language model file /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWActiveMode.DMP
2015-04-10 11:35:50.073 FieldInsight[276:14212] The language model ID is 1428680150
INFO: cmd_ln.c(702): Parsing command line:
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none -agcthresh 2.0 2.000000e+00 -allphone -allphone_ci no no -alpha 0.97 9.700000e-01 -ascale 20.0 2.000000e+01 -aw 1 1 -backtrace no no -beam 1e-48 1.000000e-48 -bestpath yes yes -bestpathlw 9.5 9.500000e+00 -bghist no no -ceplen 13 13 -cmn current current -cmninit 8.0 8.0 -compallsen no no -debug 0 -dict -dictcase no no -dither no no -doublebw no no -ds 1 1 -fdict -feat 1s_c_d_dd 1s_c_d_dd -featparams -fillprob 1e-8 1.000000e-08 -frate 100 100 -fsg -fsgusealtpron yes yes -fsgusefiller yes yes -fwdflat yes yes -fwdflatbeam 1e-64 1.000000e-64 -fwdflatefwid 4 4 -fwdflatlw 8.5 8.500000e+00 -fwdflatsfwin 25 25 -fwdflatwbeam 7e-29 7.000000e-29 -fwdtree yes yes -hmm -input_endian little little -jsgf -kdmaxbbi -1 -1 -kdmaxdepth 0 0 -kdtree -keyphrase -kws -kws_plp 1e-1 1.000000e-01 -kws_threshold 1 1.000000e+00 -latsize 5000 5000 -lda -ldadim 0 0 -lextreedump 0 0 -lifter 0 0 -lm -lmctl -lmname -logbase 1.0001 1.000100e+00 -logfn -logspec no no -lowerf 133.33334 1.333333e+02 -lpbeam 1e-40 1.000000e-40 -lponlybeam 7e-29 7.000000e-29 -lw 6.5 6.500000e+00 -maxhmmpf 10000 10000 -maxnewoov 20 20 -maxwpf -1 -1 -mdef -mean -mfclogdir -min_endfr 0 0 -mixw -mixwfloor 0.0000001 1.000000e-07 -mllr -mmap yes yes -ncep 13 13 -nfft 512 512 -nfilt 40 40 -nwpen 1.0 1.000000e+00 -pbeam 1e-48 1.000000e-48 -pip 1.0 1.000000e+00 -pl_beam 1e-10 1.000000e-10 -pl_pbeam 1e-5 1.000000e-05 -pl_window 0 0 -rawlogdir -remove_dc no no -remove_noise yes yes -remove_silence yes yes -round_filters yes yes -samprate 16000 1.600000e+04 -seed -1 -1 -sendump -senlogdir -senmgau -silprob 0.005 5.000000e-03 -smoothspec no no -svspec -tmat -tmatfloor 0.0001 1.000000e-04 -topn 4 4 -topn_beam 0 0 -toprule -transform legacy legacy -unit_area yes yes -upperf 6855.4976 6.855498e+03 -usewdphones no no -uw 1.0 1.000000e+00 -vad_postspeech 50 50 -vad_prespeech 10 10 -vad_threshold 2.0 2.000000e+00 -var -varfloor 0.0001 1.000000e-04 -varnorm no no -verbose no no -warp_params -warp_type inverse_linear inverse_linear -wbeam 7e-29 7.000000e-29 -wip 0.65 6.500000e-01 -wlen 0.025625 2.562500e-02
INFO: dict.c(320): Allocating 4118 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/25CEC9CE-BB34-42B4-BE4A-5137BF75D03B/Library/Caches/TWActiveMode.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 13 words read
INFO: dict.c(342): Reading filler dictionary: /private/var/mobile/Containers/Bundle/Application/F57FF245-4FB0-4B57-A2E6-8BC898EA2D13/FieldInsight.app/AcousticModelEnglish.bundle/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: fsg_lextree.c(110): Allocated 376 bytes (0 KiB) for left and right context phones
(lldb)
April 10, 2015 at 5:19 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025386
ekobres (Participant)
Correct me if I’m wrong – there are 2 files in play:
1. A “language model file” (created from an array) OR a “grammar file” (created from a dictionary), and
2. A dictionary file created when either of the above was created.

As an API user, I stop caring which one I am using after the calls to pathToSuccessfullyGeneratedLanguageModelWithRequestedName OR pathToSuccessfullyGeneratedGrammarWithRequestedName
AND
pathToSuccessfullyGeneratedDictionaryWithRequestedName
At this point I have a model file and a dictionary file and should pass the corresponding filename pairs to the appropriate APIs.
Do I have that right?
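In other words, I’m assuming the two paths look like this sketch (names are illustrative; OneOfTheseWillBeSaidOnce is the grammar key my own grammars use):

// My mental model of the two generation paths described above.

// Path A: probabilistic language model from an array.
NSError *error = [languageModelGenerator generateLanguageModelFromArray:@[@"HELLO", @"WORLD"]
                                                         withFilesNamed:@"ArrayModel"
                                                 forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
NSString *modelPath = [languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"ArrayModel"]; // .languagemodel/.DMP
NSString *modelDict = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"ArrayModel"];    // .dic

// Path B: rule-based JSGF grammar from a dictionary.
error = [languageModelGenerator generateGrammarFromDictionary:@{ ThisWillBeSaidOnce : @[ @{ OneOfTheseWillBeSaidOnce : @[@"HELLO", @"WORLD"] } ] }
                                                withFilesNamed:@"GrammarModel"
                                        forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
NSString *grammarPath = [languageModelGenerator pathToSuccessfullyGeneratedGrammarWithRequestedName:@"GrammarModel"];   // .gram
NSString *grammarDict = [languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"GrammarModel"]; // .dic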
April 10, 2015 at 4:59 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025384
ekobres (Participant)
I can confirm this doesn’t work for me.
I get an EXC_BAD_ACCESS at #0 0x00000001000f8d78 in fsg_lextree_init ()
…after I switch to a second JSGF model using the changeLanguageModelToFile API. (Not during the call, which succeeds, but after I resume recognition.)
The models and their respective files were both successfully created. The first JSGF dynamic dictionary works fine.
This works great if I am using models created from Arrays.
Also – although I haven’t made it that far yet, I’m hoping I can switch between array-based and dictionary-based (JSGF) models dynamically, so I can use a relatively large vocabulary (hundreds of words) for taking a note, but a small grammar (a dozen or fewer phrases) when navigating menus.
I’m hoping to avoid having to completely stop and start listening as it creates a number of performance, reliability and structural challenges.
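One fallback I’m considering, since switching works reliably between array-based models: generate the menu phrases as an array-based model too and switch with changeLanguageModelToFile:, giving up the grammar structure. A sketch with illustrative names:

// Fallback sketch (illustrative names): both vocabularies as ARPA models so
// changeLanguageModelToFile: can switch between them without stopping.
[languageModelGenerator generateLanguageModelFromArray:noteTakingWords // the large dictation vocabulary
                                        withFilesNamed:@"NoteModel"
                                forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
[languageModelGenerator generateLanguageModelFromArray:@[@"LOCATE", @"MAP TYPE", @"HEADING", @"ACCURACY", @"PRIVACY"]
                                        withFilesNamed:@"MenuModel"
                                forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
// Later, when the user enters a menu:
[[OEPocketsphinxController sharedInstance] changeLanguageModelToFile:[languageModelGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"MenuModel"]
                                                      withDictionary:[languageModelGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"MenuModel"]];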
Any advice?
April 10, 2015 at 4:09 pm in reply to: Mixing JSGF and default using changeLanguageModelToFile #1025382
ekobres (Participant)
I’ll give it a shot. I sort of assumed that if you needed the languageModelIsJSGF flag in the startListening… API, there would need to be a place for it in the changeLanguageModel… one.
I was also led astray by the fact that the API is called generateGrammarFromDictionary rather than generateLanguageModelFromDictionary.