This topic has 11 replies, 2 voices, and was last updated 9 years ago by rikk.
April 15, 2015 at 10:28 pm #1025421 | rikk (Participant)
(OpenEars 2.03, Obj-C/iPhone app)
My app is quite simple (based on the example code in your tutorial):
– A button to startListening for n seconds.
– OE is correctly identifying words and reporting.
– At the end of the time period, stopListening.
– User can repeat this as many times as they wish.
Problems:
1. I see many “Audio route has changed” messages in the logs. Seems weird.
2. ERROR: [AVAudioSession Notify Thread] AVAudioSessionPortImpl.mm:52: ValidateRequiredFields: Unknown selected data source for Port iPhone Microphone (type: MicrophoneBuiltIn)
3. After first start/stop session is complete and user taps start again, I get LOTS of warnings that OE is already listening (even though I verify that stopListening executed properly).
This session below shows ONE start/stop session.
Thanks!
Rikk
———————————
2015-04-15 12:52:20.562 Dict Shun[7247:2374238] DEBUG> Language model SUCCESSFULLY generated, for keywords: (
DARN,
DRAT,
BUMMER,
DAMN
)!
2015-04-15 12:52:20.563 Dict Shun[7247:2374238] DEBUG> Language model: path name to lang model file: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP!
2015-04-15 12:52:20.564 Dict Shun[7247:2374238] DEBUG> Language model: path name to lang model dict: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic!
2015-04-15 12:52:20.665 Dict Shun[7247:2374238] Starting OpenEars logging for OpenEars version 2.03 on 64-bit device (or build): iPhone running iOS version: 8.100000
2015-04-15 12:52:37.133 Dict Shun[7247:2374238] DEBUG> User has tapped start button ==> START LISTENING!
2015-04-15 12:52:42.259 Dict Shun[7247:2374238] Attempting to start listening session from startListeningWithLanguageModelAtPath:
2015-04-15 12:52:42.259 Dict Shun[7247:2374238] User gave mic permission for this app.
2015-04-15 12:52:42.260 Dict Shun[7247:2374238] setSecondsOfSilence wasn’t set, using default of 0.700000.
2015-04-15 12:52:42.260 Dict Shun[7247:2374238] Successfully started listening session from startListeningWithLanguageModelAtPath:
2015-04-15 12:52:42.260 Dict Shun[7247:2374410] Starting listening.
2015-04-15 12:52:42.260 Dict Shun[7247:2374410] about to set up audio session
2015-04-15 12:52:42.467 Dict Shun[7247:2374301] Audio route has changed for the following reason:
2015-04-15 12:52:42.472 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-04-15 12:52:42.477 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —SpeakerMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203ab0,
inputs = (null);
outputs = (
“<AVAudioSessionPortDescription: 0x174203a80, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
)>.
2015-04-15 12:52:42.706 Dict Shun[7247:2374410] done starting audio unit
INFO: cmd_ln.c(702): Parsing command line:
\
-lm /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP \
-vad_prespeech 10 \
-vad_postspeech 69 \
-vad_threshold 2.000000 \
-remove_noise yes \
-remove_silence yes \
-bestpath yes \
-lw 6.500000 \
-dict /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic \
-hmm /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-keyphrase
-kws
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 10000 10000
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-vad_postspeech 50 69
-vad_prespeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(702): Parsing command line:
\
-nfilt 25 \
-lowerf 130 \
-upperf 6800 \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-agc none \
-cmn current \
-varnorm no \
-transform dct \
-lifter 22 \
-cmninit 40
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 40
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1.300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-vad_postspeech 50 69
-vad_prespeech 10 10
-vad_threshold 2.0 2.000000e+00
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.562500e-02
INFO: acmod.c(252): Parsed model-specific feature parameters from /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/feat.params
INFO: feat.c(715): Initializing feature stream to type: ‘1s_c_d_dd’, ceplen=13, CMN=’current’, VARNORM=’no’, AGC=’none’
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/mdef
INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/transition_matrices
INFO: acmod.c(124): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(904): Loading senones from dump file /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/sendump
INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(320): Allocating 4109 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 4 words read
INFO: dict.c(342): Reading filler dictionary: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(79): No \data\ mark in LM file
INFO: ngram_model_dmp.c(166): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(220): ngrams 1=6, 2=8, 3=4
INFO: ngram_model_dmp.c(266): 6 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(312): 8 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(338): 4 = LM.trigrams read
INFO: ngram_model_dmp.c(363): 3 = LM.prob2 entries read
INFO: ngram_model_dmp.c(383): 3 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(403): 2 = LM.prob3 entries read
INFO: ngram_model_dmp.c(431): 1 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(487): 6 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 4 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 135
INFO: ngram_search_fwdtree.c(339): after: 4 root, 7 non-root channels, 9 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
2015-04-15 12:52:42.783 Dict Shun[7247:2374410] Restoring SmartCMN value of 38.984131
2015-04-15 12:52:42.783 Dict Shun[7247:2374410] Listening.
2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Project has these words or phrases in its dictionary:
BUMMER
DAMN
DARN
DRAT
2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Recognition loop has started
2015-04-15 12:52:42.785 Dict Shun[7247:2374238] Pocketsphinx is now listening.
2015-04-15 12:52:42.886 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:42.987 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:43.089 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:43.912 Dict Shun[7247:2374353] Speech detected…
2015-04-15 12:52:43.912 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
2015-04-15 12:52:44.886 Dict Shun[7247:2374353] End of speech detected…
2015-04-15 12:52:44.887 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 38.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 44.36 16.43 -17.45 -4.13 -1.19 6.09 -1.74 3.96 -0.70 -0.51 0.38 3.79 -5.49 >
INFO: ngram_search_fwdtree.c(1550): 735 words recognized (7/fr)
INFO: ngram_search_fwdtree.c(1552): 8165 senones evaluated (79/fr)
INFO: ngram_search_fwdtree.c(1556): 3320 channels searched (31/fr), 400 1st, 2328 last
INFO: ngram_search_fwdtree.c(1559): 881 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561): 130 candidate words for entering last phone (1/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.07 CPU 0.072 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 1.80 wall 1.729 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
INFO: ngram_search_fwdflat.c(938): 611 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(940): 6232 senones evaluated (60/fr)
INFO: ngram_search_fwdflat.c(942): 2397 channels searched (23/fr)
INFO: ngram_search_fwdflat.c(944): 814 words searched (7/fr)
INFO: ngram_search_fwdflat.c(947): 55 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(950): fwdflat 0.02 CPU 0.023 xRT
INFO: ngram_search_fwdflat.c(953): fwdflat 0.03 wall 0.026 xRT
INFO: ngram_search.c(1215): </s> not found in last frame, using [SMACK].102 instead
INFO: ngram_search.c(1268): lattice start node <s>.0 end node [SMACK].60
INFO: ngram_search.c(1294): Eliminated 132 nodes before end node
INFO: ngram_search.c(1399): Lattice has 269 nodes, 524 links
INFO: ps_lattice.c(1368): Normalizer P(O) = alpha([SMACK]:60:102) = -686508
INFO: ps_lattice.c(1403): Joint P(O,S) = -686508 P(S|O) = 0
INFO: ngram_search.c(890): bestpath 0.00 CPU 0.001 xRT
INFO: ngram_search.c(893): bestpath 0.00 wall 0.002 xRT
2015-04-15 12:52:44.917 Dict Shun[7247:2374353] Pocketsphinx heard “DARN” with a score of (0) and an utterance ID of 0.
2015-04-15 12:52:44.918 Dict Shun[7247:2374238] The received hypothesis is DARN with a score of 0 and an ID of 0
2015-04-15 12:52:45.796 Dict Shun[7247:2374410] Speech detected…
2015-04-15 12:52:45.797 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
2015-04-15 12:52:46.803 Dict Shun[7247:2374410] End of speech detected…
INFO: cmn_prior.c(131): cmn_prior_update: from < 44.36 16.43 -17.45 -4.13 -1.19 6.09 -1.74 3.96 -0.70 -0.51 0.38 3.79 -5.49 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 35.82 12.75 -8.41 0.75 2.41 2.39 -1.16 5.49 -2.92 -7.00 -3.80 0.23 -9.61 >
INFO: ngram_search_fwdtree.c(1550): 912 words recognized (9/fr)
2015-04-15 12:52:46.804 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: ngram_search_fwdtree.c(1552): 6287 senones evaluated (59/fr)
INFO: ngram_search_fwdtree.c(1556): 2033 channels searched (19/fr), 412 1st, 930 last
INFO: ngram_search_fwdtree.c(1559): 930 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561): 128 candidate words for entering last phone (1/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.11 CPU 0.101 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 1.89 wall 1.763 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
INFO: ngram_search_fwdflat.c(938): 719 words recognized (7/fr)
INFO: ngram_search_fwdflat.c(940): 2190 senones evaluated (20/fr)
INFO: ngram_search_fwdflat.c(942): 755 channels searched (7/fr)
INFO: ngram_search_fwdflat.c(944): 755 words searched (7/fr)
INFO: ngram_search_fwdflat.c(947): 76 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(950): fwdflat 0.02 CPU 0.020 xRT
INFO: ngram_search_fwdflat.c(953): fwdflat 0.02 wall 0.021 xRT
INFO: ngram_search.c(1215): </s> not found in last frame, using <sil>.105 instead
INFO: ngram_search.c(1268): lattice start node <s>.0 end node <sil>.2
INFO: ngram_search.c(1294): Eliminated 298 nodes before end node
INFO: ngram_search.c(1399): Lattice has 300 nodes, 1 links
INFO: ps_lattice.c(1368): Normalizer P(O) = alpha(<sil>:2:105) = -5320591
INFO: ps_lattice.c(1403): Joint P(O,S) = -5320591 P(S|O) = 0
INFO: ngram_search.c(890): bestpath 0.00 CPU 0.002 xRT
INFO: ngram_search.c(893): bestpath 0.00 wall 0.001 xRT
2015-04-15 12:52:46.829 Dict Shun[7247:2374410] Pocketsphinx heard “” with a score of (0) and an utterance ID of 1.
2015-04-15 12:52:46.830 Dict Shun[7247:2374410] Hypothesis was null so we aren’t returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController’s property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
2015-04-15 12:52:48.091 Dict Shun[7247:2374410] Speech detected…
2015-04-15 12:52:48.092 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
2015-04-15 12:52:49.492 Dict Shun[7247:2374410] End of speech detected…
INFO: cmn_prior.c(131): cmn_prior_update: from < 35.82 12.75 -8.41 0.75 2.41 2.39 -1.16 5.49 -2.92 -7.00 -3.80 0.23 -9.61 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 37.00 8.96 -8.76 5.71 0.96 0.59 -0.07 5.11 -2.17 -5.44 -4.62 0.82 -10.20 >
INFO: ngram_search_fwdtree.c(1550): 1055 words recognized (7/fr)
INFO: ngram_search_fwdtree.c(1552): 10067 senones evaluated (69/fr)
2015-04-15 12:52:49.493 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
INFO: ngram_search_fwdtree.c(1556): 3670 channels searched (25/fr), 564 1st, 2225 last
INFO: ngram_search_fwdtree.c(1559): 1185 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561): 97 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.12 CPU 0.086 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 2.66 wall 1.836 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 3 words
INFO: ngram_search_fwdflat.c(938): 953 words recognized (7/fr)
INFO: ngram_search_fwdflat.c(940): 6136 senones evaluated (42/fr)
INFO: ngram_search_fwdflat.c(942): 2605 channels searched (17/fr)
INFO: ngram_search_fwdflat.c(944): 1196 words searched (8/fr)
INFO: ngram_search_fwdflat.c(947): 105 word transitions (0/fr)
INFO: ngram_search_fwdflat.c(950): fwdflat 0.03 CPU 0.023 xRT
INFO: ngram_search_fwdflat.c(953): fwdflat 0.04 wall 0.025 xRT
INFO: ngram_search.c(1215): </s> not found in last frame, using <sil>.143 instead
INFO: ngram_search.c(1268): lattice start node <s>.0 end node <sil>.79
INFO: ngram_search.c(1294): Eliminated 248 nodes before end node
INFO: ngram_search.c(1399): Lattice has 430 nodes, 873 links
INFO: ps_lattice.c(1368): Normalizer P(O) = alpha(<sil>:79:143) = -928990
INFO: ps_lattice.c(1403): Joint P(O,S) = -928990 P(S|O) = 0
INFO: ngram_search.c(890): bestpath 0.00 CPU 0.003 xRT
INFO: ngram_search.c(893): bestpath 0.01 wall 0.004 xRT
2015-04-15 12:52:49.536 Dict Shun[7247:2374410] Pocketsphinx heard “DAMN” with a score of (0) and an utterance ID of 2.
2015-04-15 12:52:49.537 Dict Shun[7247:2374238] The received hypothesis is DAMN with a score of 0 and an ID of 2
2015-04-15 12:52:52.574 Dict Shun[7247:2374353] Speech detected…
2015-04-15 12:52:52.575 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
INFO: cmn_prior.c(99): cmn_prior_update: from < 37.00 8.96 -8.76 5.71 0.96 0.59 -0.07 5.11 -2.17 -5.44 -4.62 0.82 -10.20 >
INFO: cmn_prior.c(116): cmn_prior_update: to < 38.40 11.61 -11.22 -1.09 -4.36 2.96 0.25 0.91 -1.84 -3.39 -2.74 0.75 -10.08 >
2015-04-15 12:52:57.157 Dict Shun[7247:2374238] DEBUG> Time over ==> STOP LISTENING!
2015-04-15 12:52:57.158 Dict Shun[7247:2374238] Stopping listening.
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.31 CPU 0.087 xRT
INFO: ngram_search_fwdtree.c(435): TOTAL fwdtree 6.35 wall 1.798 xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.08 CPU 0.022 xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.09 wall 0.024 xRT
INFO: ngram_search.c(307): TOTAL bestpath 0.01 CPU 0.002 xRT
INFO: ngram_search.c(310): TOTAL bestpath 0.01 wall 0.002 xRT
2015-04-15 12:52:57.708 Dict Shun[7247:2374238] No longer listening.
2015-04-15 12:52:57.709 Dict Shun[7247:2374238] DEBUG> Pocketxphinx stopListening = SUCCESSFUL!
2015-04-15 12:52:57.709 Dict Shun[7247:2374238] DEBUG> Pocketsphinx has stopped listening.
2015-04-15 12:52:57.724 Dict Shun[7247:2374301] Audio route has changed for the following reason:
2015-04-15 12:52:57.725 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-04-15 12:52:57.728 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —Speaker—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203b30,
inputs = (
“<AVAudioSessionPortDescription: 0x174202a90, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Bottom>”
);
outputs = (
“<AVAudioSessionPortDescription: 0x1742034b0, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
)>.
2015-04-15 13:00:37.704 Dict Shun[7247:2374301] 13:00:37.703 ERROR: [AVAudioSession Notify Thread] AVAudioSessionPortImpl.mm:52: ValidateRequiredFields: Unknown selected data source for Port iPhone Microphone (type: MicrophoneBuiltIn)
2015-04-15 13:00:37.705 Dict Shun[7247:2374301] Audio route has changed for the following reason:
2015-04-15 13:00:37.708 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-04-15 13:00:37.712 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —ReceiverMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x1700177a0,
inputs = (null);
outputs = (
“<AVAudioSessionPortDescription: 0x170017b30, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
)>.
2015-04-15 13:08:40.085 Dict Shun[7247:2374301] Audio route has changed for the following reason:
2015-04-15 13:08:40.088 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-04-15 13:08:40.091 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —Speaker—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203a60,
inputs = (
“<AVAudioSessionPortDescription: 0x1742036d0, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = (null)>”
);
outputs = (
“<AVAudioSessionPortDescription: 0x1742035d0, type = Receiver; name = Receiver; UID = Built-In Receiver; selectedDataSource = (null)>”
)>.
April 15, 2015 at 10:36 pm #1025422 | rikk (Participant)
btw: I noticed that even in this single session I am seeing multiple:
“A request has been made to start a listening session using startListeningWithLanguageModelAtPath:…, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first ”
… excerpt from orig log posted above…
2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Recognition loop has started
2015-04-15 12:52:42.785 Dict Shun[7247:2374238] Pocketsphinx is now listening.
2015-04-15 12:52:42.886 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:42.987 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:43.089 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
2015-04-15 12:52:43.912 Dict Shun[7247:2374353] Speech detected…
2015-04-15 12:52:43.912 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
2015-04-15 12:52:44.886 Dict Shun[7247:2374353] End of speech detected…
2015-04-15 12:52:44.887 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
April 15, 2015 at 10:43 pm #1025423 | Halle Winkler (Politepix)
Hi,
The route thing isn’t significant unless it is leading to peculiar outcomes with routing. It is normal to see a few route messages in a session. This design is unfortunately not supported:
– A button to startListening for n seconds.
[….]
– At the end of the time period, stopListening.
So I would assume that the warnings about stopping are related to this.
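For example, one way to avoid issuing a second start while a session is still winding down is to gate the start call on the stop callback. This is only a sketch: the sessionActive flag and the method names other than the OpenEars calls quoted in this thread are hypothetical.

```objc
// Sketch, assuming a view controller that already acts as the
// OEEventsObserver delegate. sessionActive and startTapped: are
// hypothetical names; the OpenEars calls are the ones from this thread.
- (IBAction)startTapped:(id)sender {
    if (self.sessionActive) return; // a session is already starting or running
    self.sessionActive = YES;
    [[OEPocketsphinxController sharedInstance]
        startListeningWithLanguageModelAtPath:self.languageModelPath
                             dictionaryAtPath:self.dictionaryPath
                          acousticModelAtPath:self.acousticModelPath
                          languageModelIsJSGF:NO];
}

- (void)pocketsphinxDidStopListening {
    self.sessionActive = NO; // only now is it safe to start a new session
}
```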
April 15, 2015 at 11:04 pm #1025424 | rikk (Participant)
Thanks for the quick reply! I figured that the route changes were red herrings.
Can you please elaborate on “This design is unfortunately not supported.”?
My app simply starts and stops listening for periods of time.
April 15, 2015 at 11:09 pm #1025425 | Halle Winkler (Politepix)
Listening for arbitrary periods of time is unfortunately contrary to the utterance-based design of the framework. There is no testing done on using it in that way, so if there are any particular implications to setting up an app that way, they aren’t something I can help with.
April 15, 2015 at 11:24 pm #1025426 | rikk (Participant)
I apologize for my lack of knowledge in speech recognition science, but I feel like I’m missing something very important. Specifically, what does “utterance-based design” mean?
Note that my time periods are on the order of 2-30 minutes (not seconds).
April 16, 2015 at 12:42 am #1025429 | rikk (Participant)
Ok, you don’t need to explain what “utterance-based design” means. ;-)
My real question is: Why is my use case “contrary to OE’s design”?
Starting, listening for a fixed period of time (minutes), and stopping seems like the simplest and most normal case imaginable.
Am I missing something?
Thanks again for your great support!
April 16, 2015 at 2:09 am #1025430 | rikk (Participant)
fyi: I solved my issues with start/stop and the app appears to work as expected. :-D
I’m still worried about your comment that my use case is not a good fit for OpenEars (see previous message). ;-)
April 16, 2015 at 10:14 am #1025435 | Halle Winkler (Politepix)
Hi rikk,
Very glad to hear it was just an issue with stopping. I misunderstood your application to be something like this, where a timed stop is being used to basically interrupt the user mid-utterance and force recognition (an utterance is a continuous period of user speech):
https://www.politepix.com/forums/topic/stop-speech-recognition-in-desired-time-ex-2-3-sec/
This design (the push-to-talk design mentioned in the linked thread) is trying to sort of fake OpenEars into not being a continuous listener, which would be better solved by using a much more basic Pocketsphinx implementation rather than trying to get OpenEars to be something it’s not with extra code that adds complexity.
What you are talking about sounds a bit different – you are using the continuous listening capabilities, and the user can use the session to speak complete utterances, but you have some kind of arbitrary end to the overall listening session period. I’m going to assume that there is a strong rationale for that design in your app requirements which is why you don’t set the period of listening purely based on user input. I wouldn’t expect your setup to be a problem since your stop is essentially similar to the phone giving an interruption.
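A timed end point of that kind could be scheduled like this. Only stopListening is the real OpenEars call; the 120-second period and the surrounding scheduling are illustrative glue, not something from this thread.

```objc
// Sketch: end the listening session after a fixed period (120 s here,
// purely illustrative) rather than stopping the user mid-utterance.
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(120 * NSEC_PER_SEC)),
               dispatch_get_main_queue(), ^{
    [[OEPocketsphinxController sharedInstance] stopListening];
    // Wait for the OEEventsObserver callback pocketsphinxDidStopListening
    // before re-enabling the start button, so sessions never overlap.
});
```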
April 17, 2015 at 3:00 am #1025442 | rikk (Participant)
Halle,
Sincere thanks for the detailed followup.
My app is trying to be a “bad word detector/trainer.” The idea is that the user chooses a session time (e.g. 2 mins) to practice speaking. As they are talking, my app will detect each time they say a “bad word”, and it will display visual feedback (e.g. unhappy face) and a score (i.e. “You said DAMN 3 times so far”). The app continues detecting and reporting as they talk throughout the session. If they say two “bad words” in succession, I’d like to get two responses from OE (e.g. “DAMN”, “DARN”).
My current app kinda works, but it often misses “bad words” or combines multiple bad words (said in sequence) into a single response (e.g. “DARN BUMMER”, instead of “DARN” and “BUMMER”).
I’m wondering if I would benefit from your suggestion of recording .WAV (even though I don’t understand why that helps), or if RapidEars is the right choice.
Thoughts?
Thanks again,
Rikk
April 17, 2015 at 11:34 am #1025444 | Halle Winkler (Politepix)
Hi,
I’m wondering if I would benefit from your suggestion of recording .WAV
I didn’t recommend this and would not ever do so – that impression is due to a misunderstanding. I didn’t link to the other thread in order to show you advice for your design, I linked to it to explain a specific design that I think should _never_ be done with OpenEars, as a way of explaining to you that I didn’t consider your design to have the same problem. This was directly in response to your statement that you were worried that your design was a mis-fit for the library.
I told the poster in that discussion that the only way it was possible for him to use OpenEars for his design was by recording a WAV file and submitting it to the WAV function, not because it has any advantages or because it is a good idea but because it is otherwise not possible with my support at all. None of this is your problem because you are using OpenEars for continuous listening as it is designed for, just with some kind of eventual end point that is a bit arbitrary, so please consider the question of whether your design is OK to be closed (it is) and please don’t take design advice from that thread.
My current app kinda works, but often misses “bad words” or combines multiple bad words (said in sequence) into a single response (e.g. “DARN BUMMER”, instead of “DARN” and “BUMMER”.
The job of parsing a hypothesis for multiple words you are interested in is an implementation issue for your app. If you receive this hypothesis, you can check it against your word list and see if there is more than one word from it in there. Rejecto may help with your false negatives.
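For instance, a multi-word hypothesis like “DARN BUMMER” can be split and tallied word by word. In this sketch, self.badWords (an NSSet of tracked words) and self.badWordCounts (an NSCountedSet) are hypothetical properties; the delegate method is the OEEventsObserver hypothesis callback seen in the log above.

```objc
// Sketch: count each tracked word in a (possibly multi-word) hypothesis.
- (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis
                        recognitionScore:(NSString *)recognitionScore
                             utteranceID:(NSString *)utteranceID {
    for (NSString *word in [hypothesis componentsSeparatedByString:@" "]) {
        if ([self.badWords containsObject:word]) {
            [self.badWordCounts addObject:word]; // increments this word's tally
            // e.g. update the unhappy face and the "You said DAMN 3 times" label
        }
    }
}
```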
if RapidEars is the right choice.
RapidEars will give you hypotheses sooner, but they will contain similar content to the regular hypotheses. Would that be helpful to you?
April 17, 2015 at 8:18 pm #1025448 | rikk (Participant)
Ok, got it. All makes sense. Thanks for your thoughtful answers!