Forum Replies Created
jepjep7 (Participant)
Yes! As soon as I changed the bundle ID to match the licensed framework, it worked. Thank you!
I read the RapidEars manual several times. For Swift, I am not sure it was clear that one has to add both of these lines to the OpenEarsHeader.h file:
#import <RapidEars/OEEventsObserver+RapidEars.h>
#import <RapidEars/OEPocketsphinxController+RapidEars.h>
And then, after that, you have to follow your patch directions above for everything to work.
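For reference, here is a minimal sketch of what a complete bridging header might look like. The two RapidEars lines are the ones quoted above; the OpenEars imports are assumptions about which OpenEars classes a typical project uses, so the exact list will vary:

// OpenEarsHeader.h (Swift bridging header)
// OpenEars classes commonly used by the sample app (adjust to your project):
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>
#import <OpenEars/OEPocketsphinxController.h>
#import <OpenEars/OEEventsObserver.h>
// RapidEars categories (with the demo framework the prefix is RapidEarsDemo instead):
#import <RapidEars/OEEventsObserver+RapidEars.h>
#import <RapidEars/OEPocketsphinxController+RapidEars.h>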
Thanks again!
jepjep7 (Participant)
PS. I just tried the demo version, and that is working for me. The only two lines I changed were in the bridging header file:
#import <RapidEars/OEPocketsphinxController+RapidEars.h>
to
#import <RapidEarsDemo/OEPocketsphinxController+RapidEars.h>
and
#import <RapidEars/OEEventsObserver+RapidEars.h>
to
#import <RapidEarsDemo/OEEventsObserver+RapidEars.h>
Do you think the error is caused by my framework?
Thanks
jepjep7 (Participant)
After I pasted your fix, I received the error: “Type ‘ViewController’ does not conform to protocol ‘OEEventsObserverRapidEarsDelegate’ Do you want to add the protocol stubs?”
I pressed “Fix”, and Xcode added six more functions. I was then able to compile.
Next, I tried the line:
OEPocketsphinxController.sharedInstance().startRealtimeListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))
When I run this, I receive the error message “XPC connection interrupted”, and my app quits.
Do you have any idea how to fix this?
Thanks,
Joe
September 21, 2015 at 5:17 am in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026836
jepjep7 (Participant)
Here is my replication case. I am using an iPhone 6 running iOS 9 and Xcode 7.
https://www.dropbox.com/s/bnsk2i4j1q0r31r/test3.wav?dl=0
Log:
2015-09-20 23:15:45.655 OpenEarsSampleApp[4345:1045344] Starting OpenEars logging for OpenEars version 2.041 on 64-bit device (or build): iPhone running iOS version: 9.000000
2015-09-20 23:15:45.656 OpenEarsSampleApp[4345:1045344] Creating shared instance of OEPocketsphinxController
2015-09-20 23:15:45.705 OpenEarsSampleApp[4345:1045344] Starting dynamic language model generation
INFO: cmd_ln.c(703): Parsing command line:
sphinx_lm_convert \
-i /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.arpa \
-o /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.DMP
Current configuration:
[NAME] [DEFLT] [VALUE]
-case
-debug 0
-help no no
-i /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.arpa
-ifmt
-logbase 1.0001 1.000100e+00
-mmap no no
-o /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.DMP
-ofmt
INFO: ngram_model_arpa.c(503): ngrams 1=11, 2=18, 3=9
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(542): 11 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(560): 18 = #bigrams created
INFO: ngram_model_arpa.c(561): 3 = #prob2 entries
INFO: ngram_model_arpa.c(569): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(582): 9 = #trigrams created
INFO: ngram_model_arpa.c(583): 2 = #prob3 entries
INFO: ngram_model_dmp.c(518): Building DMP model…
INFO: ngram_model_dmp.c(548): 11 = #unigrams created
INFO: ngram_model_dmp.c(649): 18 = #bigrams created
INFO: ngram_model_dmp.c(650): 3 = #prob2 entries
INFO: ngram_model_dmp.c(657): 3 = #bo_wt2 entries
INFO: ngram_model_dmp.c(661): 9 = #trigrams created
INFO: ngram_model_dmp.c(662): 2 = #prob3 entries
2015-09-20 23:15:45.750 OpenEarsSampleApp[4345:1045344] Done creating language model with CMUCLMTK in 0.045052 seconds.
2015-09-20 23:15:45.798 OpenEarsSampleApp[4345:1045344] I’m done running performDictionaryLookup and it took 0.033995 seconds
2015-09-20 23:15:45.806 OpenEarsSampleApp[4345:1045344] I’m done running dynamic language model generation and it took 0.144854 seconds
2015-09-20 23:15:45.814 OpenEarsSampleApp[4345:1045344] Starting dynamic language model generation
INFO: cmd_ln.c(703): Parsing command line:
sphinx_lm_convert \
-i /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/SecondOpenEarsDynamicLanguageModel.arpa \
-o /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/SecondOpenEarsDynamicLanguageModel.DMP
Current configuration:
[NAME] [DEFLT] [VALUE]
-case
-debug 0
-help no no
-i /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/SecondOpenEarsDynamicLanguageModel.arpa
-ifmt
-logbase 1.0001 1.000100e+00
-mmap no no
-o /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/SecondOpenEarsDynamicLanguageModel.DMP
-ofmt
INFO: ngram_model_arpa.c(503): ngrams 1=12, 2=19, 3=10
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(542): 12 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(560): 19 = #bigrams created
INFO: ngram_model_arpa.c(561): 3 = #prob2 entries
INFO: ngram_model_arpa.c(569): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(582): 10 = #trigrams created
INFO: ngram_model_arpa.c(583): 2 = #prob3 entries
INFO: ngram_model_dmp.c(518): Building DMP model…
INFO: ngram_model_dmp.c(548): 12 = #unigrams created
INFO: ngram_model_dmp.c(649): 19 = #bigrams created
INFO: ngram_model_dmp.c(650): 3 = #prob2 entries
INFO: ngram_model_dmp.c(657): 3 = #bo_wt2 entries
INFO: ngram_model_dmp.c(661): 10 = #trigrams created
INFO: ngram_model_dmp.c(662): 2 = #prob3 entries
2015-09-20 23:15:45.880 OpenEarsSampleApp[4345:1045344] Done creating language model with CMUCLMTK in 0.065760 seconds.
2015-09-20 23:15:45.915 OpenEarsSampleApp[4345:1045344] The word QUIDNUNC was not found in the dictionary /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/LanguageModelGeneratorLookupList.text/LanguageModelGeneratorLookupList.text.
2015-09-20 23:15:45.915 OpenEarsSampleApp[4345:1045344] Now using the fallback method to look up the word QUIDNUNC
2015-09-20 23:15:45.916 OpenEarsSampleApp[4345:1045344] If this is happening more frequently than you would expect, the most likely cause for it is since you are using the English phonetic lookup dictionary is that your words are not in English or aren’t dictionary words.
2015-09-20 23:15:45.916 OpenEarsSampleApp[4345:1045344] Using convertGraphemes for the word or phrase QUIDNUNC which doesn’t appear in the dictionary
2015-09-20 23:15:45.938 OpenEarsSampleApp[4345:1045344] I’m done running performDictionaryLookup and it took 0.053308 seconds
2015-09-20 23:15:45.945 OpenEarsSampleApp[4345:1045344] I’m done running dynamic language model generation and it took 0.138591 seconds
2015-09-20 23:15:45.945 OpenEarsSampleApp[4345:1045344] Welcome to the OpenEars sample project. This project understands the words:
BACKWARD,
CHANGE,
FORWARD,
GO,
LEFT,
MODEL,
RIGHT,
TURN,
and if you say “CHANGE MODEL” it will switch to its dynamically-generated model which understands the words:
CHANGE,
MODEL,
MONDAY,
TUESDAY,
WEDNESDAY,
THURSDAY,
FRIDAY,
SATURDAY,
SUNDAY,
QUIDNUNC
2015-09-20 23:15:45.946 OpenEarsSampleApp[4345:1045344] Attempting to start listening session from startListeningWithLanguageModelAtPath:
2015-09-20 23:15:45.950 OpenEarsSampleApp[4345:1045344] User gave mic permission for this app.
2015-09-20 23:15:45.950 OpenEarsSampleApp[4345:1045344] setSecondsOfSilence wasn’t set, using default of 0.700000.
2015-09-20 23:15:45.951 OpenEarsSampleApp[4345:1045344] Successfully started listening session from startListeningWithLanguageModelAtPath:
2015-09-20 23:15:45.951 OpenEarsSampleApp[4345:1045401] Starting listening.
2015-09-20 23:15:45.951 OpenEarsSampleApp[4345:1045401] about to set up audio session
2015-09-20 23:15:45.952 OpenEarsSampleApp[4345:1045401] Creating audio session with default settings.
2015-09-20 23:15:46.323 OpenEarsSampleApp[4345:1045419] Audio route has changed for the following reason:
2015-09-20 23:15:46.331 OpenEarsSampleApp[4345:1045419] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2015-09-20 23:15:46.335 OpenEarsSampleApp[4345:1045419] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —SpeakerMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x146d47340,
inputs = (
“<AVAudioSessionPortDescription: 0x146d5f180, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Front>”
);
outputs = (
“<AVAudioSessionPortDescription: 0x146d6ec50, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
)>.
2015-09-20 23:15:46.532 OpenEarsSampleApp[4345:1045401] done starting audio unit
INFO: cmd_ln.c(703): Parsing command line:
\
-lm /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.DMP \
-vad_prespeech 10 \
-vad_postspeech 69 \
-vad_threshold 3.000000 \
-remove_noise yes \
-remove_silence yes \
-bestpath yes \
-lw 6.500000 \
-dict /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.dic \
-hmm /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 0
-lm /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.DMP
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 69
-vad_prespeech 20 10
-vad_startspeech 10 10
-vad_threshold 2.0 3.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(703): Parsing command line:
\
-nfilt 25 \
-lowerf 130 \
-upperf 6800 \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-agc none \
-cmn current \
-varnorm no \
-transform dct \
-lifter 22 \
-cmninit 40
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 40
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1.300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-vad_postspeech 50 69
-vad_prespeech 20 10
-vad_startspeech 10 10
-vad_threshold 2.0 3.000000e+00
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.562500e-02
INFO: acmod.c(252): Parsed model-specific feature parameters from /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/feat.params
INFO: feat.c(715): Initializing feature stream to type: ‘1s_c_d_dd’, ceplen=13, CMN=’current’, VARNORM=’no’, AGC=’none’
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/mdef
INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/transition_matrices
INFO: acmod.c(124): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: ptm_mgau.c(805): Number of codebooks doesn’t match number of ciphones, doesn’t look like PTM: 1 != 46
INFO: acmod.c(126): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(294): 512×13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(904): Loading senones from dump file /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/sendump
INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4115 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches/FirstOpenEarsDynamicLanguageModel.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 10 words read
INFO: dict.c(358): Reading filler dictionary: /var/mobile/Containers/Bundle/Application/772400B2-0D97-4986-BA6E-48D839C002FC/OpenEarsSampleApp.app/AcousticModelEnglish.bundle/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(166): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(220): ngrams 1=11, 2=18, 3=9
INFO: ngram_model_dmp.c(266): 11 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(312): 18 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(338): 9 = LM.trigrams read
INFO: ngram_model_dmp.c(363): 3 = LM.prob2 entries read
INFO: ngram_model_dmp.c(383): 3 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(403): 2 = LM.prob3 entries read
INFO: ngram_model_dmp.c(431): 1 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(487): 11 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 10 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 10 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 140
INFO: ngram_search_fwdtree.c(339): after: 10 root, 12 non-root channels, 9 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
2015-09-20 23:15:46.635 OpenEarsSampleApp[4345:1045401] Listening.
2015-09-20 23:15:46.636 OpenEarsSampleApp[4345:1045401] Project has these words or phrases in its dictionary:
EIGHT
FIVE
FOUR
NINE
ONE
ONE(2)
SEVEN
SIX
THREE
TWO
2015-09-20 23:15:46.636 OpenEarsSampleApp[4345:1045401] Recognition loop has started
2015-09-20 23:15:46.638 OpenEarsSampleApp[4345:1045344] the path is /var/mobile/Containers/Data/Application/595AF706-7114-41EC-B60F-29E57BD7A1B3/Library/Caches
2015-09-20 23:15:46.672 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx is now listening.
2015-09-20 23:15:46.674 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx started.
2015-09-20 23:15:51.307 OpenEarsSampleApp[4345:1045401] Speech detected…
2015-09-20 23:15:51.308 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected speech.
2015-09-20 23:15:53.040 OpenEarsSampleApp[4345:1045404] End of speech detected…
2015-09-20 23:15:53.041 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 40.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 42.12 20.93 -15.38 14.24 -16.31 -4.51 -12.58 -2.26 -7.42 -4.90 -0.29 6.59 -8.43 >
INFO: ngram_search_fwdtree.c(1553): 1536 words recognized (8/fr)
INFO: ngram_search_fwdtree.c(1555): 29789 senones evaluated (158/fr)
INFO: ngram_search_fwdtree.c(1559): 15455 channels searched (82/fr), 1840 1st, 12041 last
INFO: ngram_search_fwdtree.c(1562): 1923 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1564): 466 candidate words for entering last phone (2/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.35 CPU 0.187 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 6.16 wall 3.279 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 6 words
INFO: ngram_search_fwdflat.c(948): 1659 words recognized (9/fr)
INFO: ngram_search_fwdflat.c(950): 24944 senones evaluated (133/fr)
INFO: ngram_search_fwdflat.c(952): 13878 channels searched (73/fr)
INFO: ngram_search_fwdflat.c(954): 2173 words searched (11/fr)
INFO: ngram_search_fwdflat.c(957): 434 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.04 CPU 0.020 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.04 wall 0.021 xRT
INFO: ngram_search.c(1280): lattice start node <s>.0 end node </s>.138
INFO: ngram_search.c(1306): Eliminated 5 nodes before end node
INFO: ngram_search.c(1411): Lattice has 616 nodes, 1752 links
INFO: ps_lattice.c(1380): Bestpath score: -18594
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:138:186) = -1207819
INFO: ps_lattice.c(1441): Joint P(O,S) = -1286424 P(S|O) = -78605
INFO: ngram_search.c(899): bestpath 0.01 CPU 0.004 xRT
INFO: ngram_search.c(902): bestpath 0.01 wall 0.004 xRT
2015-09-20 23:15:53.093 OpenEarsSampleApp[4345:1045404] Pocketsphinx heard “THREE FIVE FIVE” with a score of (-78605) and an utterance ID of 0.
2015-09-20 23:15:53.094 OpenEarsSampleApp[4345:1045344] Flite sending interrupt speech request.
2015-09-20 23:15:53.094 OpenEarsSampleApp[4345:1045344] Local callback: The received hypothesis is THREE FIVE FIVE with a score of -78605 and an ID of 0
2015-09-20 23:15:53.096 OpenEarsSampleApp[4345:1045344] I’m running flite
2015-09-20 23:15:53.209 OpenEarsSampleApp[4345:1045344] I’m done running flite and it took 0.113386 seconds
2015-09-20 23:15:53.210 OpenEarsSampleApp[4345:1045344] Flite audio player was nil when referenced so attempting to allocate a new audio player.
2015-09-20 23:15:53.210 OpenEarsSampleApp[4345:1045344] Loading speech data for Flite concluded successfully.
2015-09-20 23:15:53.301 OpenEarsSampleApp[4345:1045344] Flite sending suspend recognition notification.
2015-09-20 23:15:53.303 OpenEarsSampleApp[4345:1045344] Local callback: Flite has started speaking
2015-09-20 23:15:53.308 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has suspended recognition.
2015-09-20 23:15:55.349 OpenEarsSampleApp[4345:1045344] AVAudioPlayer did finish playing with success flag of 1
2015-09-20 23:15:55.501 OpenEarsSampleApp[4345:1045344] Flite sending resume recognition notification.
2015-09-20 23:15:56.003 OpenEarsSampleApp[4345:1045344] Local callback: Flite has finished speaking
2015-09-20 23:15:56.008 OpenEarsSampleApp[4345:1045344] setSecondsOfSilence wasn’t set, using default of 0.700000.
2015-09-20 23:15:56.009 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 42.12 20.93 -15.38 14.24 -16.31 -4.51 -12.58 -2.26 -7.42 -4.90 -0.29 6.59 -8.43 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 42.12 20.93 -15.38 14.24 -16.31 -4.51 -12.58 -2.26 -7.42 -4.90 -0.29 6.59 -8.43 >
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
2015-09-20 23:15:58.301 OpenEarsSampleApp[4345:1045401] Speech detected…
2015-09-20 23:15:58.302 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected speech.
2015-09-20 23:15:59.832 OpenEarsSampleApp[4345:1045401] End of speech detected…
2015-09-20 23:15:59.833 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.
INFO: cmn_prior.c(131): cmn_prior_update: from < 42.12 20.93 -15.38 14.24 -16.31 -4.51 -12.58 -2.26 -7.42 -4.90 -0.29 6.59 -8.43 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 42.18 20.48 -12.38 8.20 -16.77 1.20 -14.39 -5.20 -6.01 -4.37 -1.41 5.09 -7.03 >
INFO: ngram_search_fwdtree.c(1553): 1424 words recognized (9/fr)
INFO: ngram_search_fwdtree.c(1555): 27335 senones evaluated (167/fr)
INFO: ngram_search_fwdtree.c(1559): 14207 channels searched (86/fr), 1600 1st, 11096 last
INFO: ngram_search_fwdtree.c(1562): 1683 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1564): 496 candidate words for entering last phone (3/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.25 CPU 0.151 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 3.73 wall 2.276 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 6 words
INFO: ngram_search_fwdflat.c(948): 1538 words recognized (9/fr)
INFO: ngram_search_fwdflat.c(950): 27476 senones evaluated (168/fr)
INFO: ngram_search_fwdflat.c(952): 16006 channels searched (97/fr)
INFO: ngram_search_fwdflat.c(954): 2008 words searched (12/fr)
INFO: ngram_search_fwdflat.c(957): 490 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.03 CPU 0.020 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.04 wall 0.022 xRT
INFO: ngram_search.c(1280): lattice start node <s>.0 end node </s>.111
INFO: ngram_search.c(1306): Eliminated 3 nodes before end node
INFO: ngram_search.c(1411): Lattice has 568 nodes, 2461 links
INFO: ps_lattice.c(1380): Bestpath score: -15511
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:111:162) = -1087553
INFO: ps_lattice.c(1441): Joint P(O,S) = -1174585 P(S|O) = -87032
INFO: ngram_search.c(899): bestpath 0.01 CPU 0.006 xRT
INFO: ngram_search.c(902): bestpath 0.01 wall 0.005 xRT
2015-09-20 23:15:59.884 OpenEarsSampleApp[4345:1045401] Pocketsphinx heard “THREE FIVE” with a score of (-87032) and an utterance ID of 1.
2015-09-20 23:15:59.885 OpenEarsSampleApp[4345:1045344] Flite sending interrupt speech request.
2015-09-20 23:15:59.885 OpenEarsSampleApp[4345:1045344] Local callback: The received hypothesis is THREE FIVE with a score of -87032 and an ID of 1
2015-09-20 23:15:59.887 OpenEarsSampleApp[4345:1045344] I’m running flite
2015-09-20 23:15:59.962 OpenEarsSampleApp[4345:1045344] I’m done running flite and it took 0.074690 seconds
2015-09-20 23:15:59.962 OpenEarsSampleApp[4345:1045344] Flite audio player was nil when referenced so attempting to allocate a new audio player.
2015-09-20 23:15:59.962 OpenEarsSampleApp[4345:1045344] Loading speech data for Flite concluded successfully.
2015-09-20 23:15:59.987 OpenEarsSampleApp[4345:1045344] Flite sending suspend recognition notification.
2015-09-20 23:15:59.989 OpenEarsSampleApp[4345:1045344] Local callback: Flite has started speaking
2015-09-20 23:15:59.993 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has suspended recognition.
2015-09-20 23:16:01.664 OpenEarsSampleApp[4345:1045344] AVAudioPlayer did finish playing with success flag of 1
2015-09-20 23:16:01.816 OpenEarsSampleApp[4345:1045344] Flite sending resume recognition notification.
2015-09-20 23:16:02.318 OpenEarsSampleApp[4345:1045344] Local callback: Flite has finished speaking
2015-09-20 23:16:02.325 OpenEarsSampleApp[4345:1045344] setSecondsOfSilence wasn’t set, using default of 0.700000.
2015-09-20 23:16:02.326 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 42.18 20.48 -12.38 8.20 -16.77 1.20 -14.39 -5.20 -6.01 -4.37 -1.41 5.09 -7.03 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 42.18 20.48 -12.38 8.20 -16.77 1.20 -14.39 -5.20 -6.01 -4.37 -1.41 5.09 -7.03 >
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
2015-09-20 23:16:05.724 OpenEarsSampleApp[4345:1045401] Speech detected…
2015-09-20 23:16:05.725 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected speech.
2015-09-20 23:16:06.364 OpenEarsSampleApp[4345:1045404] End of speech detected…
INFO: cmn_prior.c(131): cmn_prior_update: from < 42.18 20.48 -12.38 8.20 -16.77 1.20 -14.39 -5.20 -6.01 -4.37 -1.41 5.09 -7.03 >
2015-09-20 23:16:06.365 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.
INFO: cmn_prior.c(149): cmn_prior_update: to < 39.98 17.73 -12.48 7.80 -14.05 1.06 -13.80 -3.82 -5.37 -5.05 -1.22 3.75 -5.35 >
INFO: ngram_search_fwdtree.c(1553): 766 words recognized (10/fr)
INFO: ngram_search_fwdtree.c(1555): 10758 senones evaluated (136/fr)
INFO: ngram_search_fwdtree.c(1559): 5330 channels searched (67/fr), 750 1st, 3921 last
INFO: ngram_search_fwdtree.c(1562): 813 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1564): 203 candidate words for entering last phone (2/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.21 CPU 0.266 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 3.85 wall 4.874 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 5 words
INFO: ngram_search_fwdflat.c(948): 762 words recognized (10/fr)
INFO: ngram_search_fwdflat.c(950): 9527 senones evaluated (121/fr)
INFO: ngram_search_fwdflat.c(952): 5360 channels searched (67/fr)
INFO: ngram_search_fwdflat.c(954): 858 words searched (10/fr)
INFO: ngram_search_fwdflat.c(957): 173 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.01 CPU 0.017 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.02 wall 0.026 xRT
INFO: ngram_search.c(1280): lattice start node <s>.0 end node </s>.21
INFO: ngram_search.c(1306): Eliminated 6 nodes before end node
INFO: ngram_search.c(1411): Lattice has 283 nodes, 269 links
INFO: ps_lattice.c(1380): Bestpath score: -3872
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:21:77) = 1429791
INFO: ps_lattice.c(1441): Joint P(O,S) = 1382256 P(S|O) = -47535
INFO: ngram_search.c(899): bestpath 0.00 CPU 0.002 xRT
INFO: ngram_search.c(902): bestpath 0.00 wall 0.003 xRT
2015-09-20 23:16:06.395 OpenEarsSampleApp[4345:1045404] Pocketsphinx heard “TWO” with a score of (-47535) and an utterance ID of 2.
2015-09-20 23:16:06.395 OpenEarsSampleApp[4345:1045344] Flite sending interrupt speech request.
2015-09-20 23:16:06.396 OpenEarsSampleApp[4345:1045344] Local callback: The received hypothesis is TWO with a score of -47535 and an ID of 2
2015-09-20 23:16:06.397 OpenEarsSampleApp[4345:1045344] I’m running flite
2015-09-20 23:16:06.469 OpenEarsSampleApp[4345:1045344] I’m done running flite and it took 0.071670 seconds
2015-09-20 23:16:06.470 OpenEarsSampleApp[4345:1045344] Flite audio player was nil when referenced so attempting to allocate a new audio player.
2015-09-20 23:16:06.470 OpenEarsSampleApp[4345:1045344] Loading speech data for Flite concluded successfully.
2015-09-20 23:16:06.488 OpenEarsSampleApp[4345:1045344] Flite sending suspend recognition notification.
2015-09-20 23:16:06.490 OpenEarsSampleApp[4345:1045344] Local callback: Flite has started speaking
2015-09-20 23:16:06.495 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has suspended recognition.
2015-09-20 23:16:07.704 OpenEarsSampleApp[4345:1045344] AVAudioPlayer did finish playing with success flag of 1
2015-09-20 23:16:07.856 OpenEarsSampleApp[4345:1045344] Flite sending resume recognition notification.
2015-09-20 23:16:08.358 OpenEarsSampleApp[4345:1045344] Local callback: Flite has finished speaking
2015-09-20 23:16:08.366 OpenEarsSampleApp[4345:1045344] setSecondsOfSilence wasn’t set, using default of 0.700000.
2015-09-20 23:16:08.366 OpenEarsSampleApp[4345:1045344] Local callback: Pocketsphinx has resumed recognition.
INFO: cmn_prior.c(131): cmn_prior_update: from < 39.98 17.73 -12.48 7.80 -14.05 1.06 -13.80 -3.82 -5.37 -5.05 -1.22 3.75 -5.35 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 39.98 17.73 -12.48 7.80 -14.05 1.06 -13.80 -3.82 -5.37 -5.05 -1.22 3.75 -5.35 >
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
August 23, 2015 at 3:57 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026637
jepjep7 (Participant)
Xcode quit unexpectedly today, and now there are no more linker warnings when I run SaveThatWave on my device! I suspect I should have deleted the OpenEars demo from my iPhone before compiling, cleaned the build, and restarted Xcode. Anyway, I am on to the next step!
I used the following line and then tested SaveThatWave on my iPhone:
[self.saveThatWaveController startSessionDebugRecord];
Where is the resultant WAV file stored? I cannot find the “Caches” directory on my Mac. Thanks.
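As an aside, the “Caches” directory in question is inside the app sandbox on the device (the session log above prints it as /var/mobile/Containers/Data/Application/…/Library/Caches), not on the Mac. A minimal sketch, using standard Foundation calls, for logging that path at runtime so the files can be located; this assumes SaveThatWave writes its WAV output there and is an illustration rather than the forum's answer:

// Print the app's on-device Caches directory so its contents can be
// inspected (for example via Xcode's Devices window container download).
NSString *cachesDirectory = [NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES) firstObject];
NSLog(@"Caches directory: %@", cachesDirectory);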
August 18, 2015 at 11:39 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026624
jepjep7 (Participant)
It now compiles and runs on my device, but it has a warning:
ld: warning: directory not found for option '-F"/Users/Joe/Downloads/OpenEarsDistribution/OpenEarsSampleApp/../Framework/"'
August 18, 2015 at 2:27 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026622
jepjep7 (Participant)
Thanks. I keep trying, but it does not compile for my device.
“Other Linker Flags” is set to -ObjC in the Build Settings of the OpenEarsSampleApp target.
I checked “Framework Search Paths”, and it had the “/Users/Joe/Downloads/OpenEarsDistribution/Framework” path listed.
The SaveThatWaveDemo.framework is listed in the Framework folder and is also in the filesystem.
I am running Xcode 6.4.
August 18, 2015 at 4:07 am in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026615
jepjep7 (Participant)
I checked, and the -ObjC linker flag was already added. Here are the Xcode errors:
Ld /Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Products/Debug-iphonesimulator/OpenEarsSampleApp\ Tests.xctest/OpenEarsSampleApp\ Tests normal x86_64
cd /Users/Joe/Downloads/OpenEarsDistribution/OpenEarsSampleApp
export IPHONEOS_DEPLOYMENT_TARGET=8.0
export PATH="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin:/Applications/Xcode.app/Contents/Developer/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin"
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -arch x86_64 -bundle -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator8.4.sdk -L/Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Products/Debug-iphonesimulator -F/Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Products/Debug-iphonesimulator -F/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator8.4.sdk/Developer/Library/Frameworks -F/Users/Joe/Downloads/OpenEarsDistribution/OpenEarsSampleApp/../Framework -F/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/Library/Frameworks -F/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator8.4.sdk/Developer/Library/Frameworks -filelist /Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Intermediates/OpenEarsSampleApp.build/Debug-iphonesimulator/OpenEarsSampleApp\ Tests.build/Objects-normal/x86_64/OpenEarsSampleApp\ Tests.LinkFileList -Xlinker -rpath -Xlinker @executable_path/Frameworks -Xlinker -rpath -Xlinker @loader_path/Frameworks -Xlinker -objc_abi_version -Xlinker 2 -ObjC -fobjc-arc -fobjc-link-runtime -Xlinker -no_implicit_dylibs -mios-simulator-version-min=8.0 -framework Slt -framework OpenEars -framework AudioToolbox -framework AVFoundation -Xlinker -dependency_info -Xlinker /Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Intermediates/OpenEarsSampleApp.build/Debug-iphonesimulator/OpenEarsSampleApp\ Tests.build/Objects-normal/x86_64/OpenEarsSampleApp\ Tests_dependency_info.dat -o /Users/Joe/Library/Developer/Xcode/DerivedData/OpenEarsSampleApp-anzfsmtoufwysiaqbnobltsaejum/Build/Products/Debug-iphonesimulator/OpenEarsSampleApp\ Tests.xctest/OpenEarsSampleApp\ Tests
Undefined symbols for architecture x86_64:
"_OBJC_CLASS_$_SaveThatWaveController", referenced from:
objc-class-ref in ViewController.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
August 17, 2015 at 1:29 am in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026589
jepjep7 (Participant)
I am having trouble getting SaveThatWave to work.
1) I added SaveThatWaveDemo.framework to the “Frameworks” folder in the OpenEarsSampleApp project.
2) In ViewController.h, I added:
#import <SaveThatWaveDemo/OEEventsObserver+SaveThatWave.h>
#import <SaveThatWaveDemo/SaveThatWaveController.h>
@property (strong, nonatomic) SaveThatWaveController *saveThatWaveController;
3) In ViewController.m, I added:
@synthesize saveThatWaveController;
Everything compiled perfectly up to this point.
Then, when I added this line to viewDidLoad:
self.saveThatWaveController = [[SaveThatWaveController alloc] init];
It threw a “linker command failed with exit code 1 (use -v to see invocation)” error.
Do you have any idea why this is not working? I would like to make a test-case for the “three five five” example above. Thanks!
August 16, 2015 at 10:53 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026587
jepjep7 (Participant)
I checked, and the DIC files were exactly the same. The OpenEars demo is only recognizing “three five” when I say “three five five”. I will create a replication case.
August 15, 2015 at 11:35 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026575
jepjep7 (Participant)
I ran the OpenEars demo and it seemed to be working better than my app at recognizing the numbers 1-9. So I checked my code, and I had an extra “nil” in my firstLanguageArray. Here is what it looked like:
NSArray *firstLanguageArray = [[NSArray alloc] initWithArray:[NSArray arrayWithObjects:
@"ONE",
@"TWO",
@"THREE",
@"FOUR",
@"FIVE",
@"SIX",
@"SEVEN",
@"EIGHT",
@"NINE",
nil]];
I changed the array by deleting the nil to look like this:
NSArray *firstLanguageArray = @[@"ONE",
@"TWO",
@"THREE",
@"FOUR",
@"FIVE",
@"SIX",
@"SEVEN",
@"EIGHT",
@"NINE"];
It seems to be working better now! Do you think that was it!?
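One possible explanation, offered as an editorial note rather than a confirmed diagnosis: +arrayWithObjects: treats the first nil it encounters as the end of the argument list, so a stray nil earlier in the list silently truncates the vocabulary, while the literal syntax has no nil terminator at all. A small sketch of the difference:

// With the variadic constructor, the first nil ends the list:
NSString *stray = nil; // e.g. a variable that happens to be nil
NSArray *truncated = [NSArray arrayWithObjects:@"ONE", @"TWO", stray, @"THREE", nil];
// truncated.count == 2 -- @"THREE" never reaches the language model.

// With the literal syntax there is no terminator, and a nil element
// raises an exception at creation time instead of truncating silently:
NSArray *literal = @[@"ONE", @"TWO", @"THREE"];
// literal.count == 3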
Thanks.
August 15, 2015 at 9:58 pm in reply to: Increasing speech recognition accuracy using a custom languagemodel? #1026573
jepjep7 (Participant)
Thank you for your fast response, Halle. I appreciate it!
The reason I was trying to modify the language model in my original post is that I thought I was having problems with number recognition, specifically “fives” and “nines”. After more testing, I narrowed down the problem: I am having trouble recognizing words at the end of utterances. For example, when I try to recognize the phrase “three five five”, OpenEars often only recognizes “three five”. Why is that? It is almost as if the last word gets cut off before it can be recognized.
Settings:
setVadThreshold:3.0
setSecondsOfSilenceToDetect:0.3
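As an editorial aside, here is a minimal sketch of how these two settings are applied in OpenEars 2.x before starting the listening loop, using the same calls quoted later in this thread; lmPath and dicPath are placeholders for the paths produced by OELanguageModelGenerator, so treat the surrounding code as an assumption rather than the forum's answer:

// Apply the tuning values quoted above before starting to listen.
[[OEPocketsphinxController sharedInstance] setVadThreshold:3.0];
// If trailing words are being cut off, a slightly longer silence window may be worth experimenting with.
[[OEPocketsphinxController sharedInstance] setSecondsOfSilenceToDetect:0.3];
// lmPath and dicPath stand in for the generated language model and dictionary paths.
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath
                                                                 dictionaryAtPath:dicPath
                                                              acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
                                                              languageModelIsJSGF:FALSE];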
December 24, 2014 at 5:41 pm in reply to: [Resolved] decrease in speech recognition accuracy after updating to OpenEars 2.0 #1023788
jepjep7 (Participant)
I changed the deployment target to 6.1, and I am running the sample app on an iPhone 6 with iOS 8.1.2 and Apple corded headphones. The Apple headphones helped reduce the noise.
I did some more tests between OpenEars 2.0 and 1.71, and I am still getting a bit more noise with 2.0. I have found these settings to work best for me so far:
[[OEPocketsphinxController sharedInstance] setSecondsOfSilenceToDetect:0.2];
[[OEPocketsphinxController sharedInstance] setVadThreshold:3.0];
I was hoping for better recognition accuracy and better noise reduction in 2.0 because I wanted to add more words to my vocabulary and more functionality to my app. I will continue testing, but I will consider this case closed. Thanks!
December 24, 2014 at 5:21 am in reply to: [Resolved] decrease in speech recognition accuracy after updating to OpenEars 2.0 #1023776
jepjep7 (Participant)
Thanks, Halle. I am pretty sure I followed the guidelines correctly. I deleted the old acoustic model and dragged it in again. The resulting speech recognition was terrible. Then I tried running the OpenEars sample app. The recognition is also bad. When I say the word “turn,” it says “you said turn, you said turn, you said turn…” over and over again. My deployment target is iOS 6.0, and I am testing this on an iPhone 6 device. Any ideas?
December 22, 2014 at 1:07 am in reply to: [Resolved] decrease in speech recognition accuracy after updating to OpenEars 2.0 #1023655
jepjep7 (Participant)
Yes. I tried a few variations of this:
[[OEPocketsphinxController sharedInstance] setSecondsOfSilenceToDetect:0.1];
[[OEPocketsphinxController sharedInstance] setVadThreshold:2.9];
But I could not get the same accuracy as I was getting with version 1.71, where I was using this:
pocketsphinxController.secondsOfSilenceToDetect = 0.1;
pocketsphinxController.calibrationTime = 1;