lytedesigns

Forum Replies Created

Viewing 5 posts - 1 through 5 (of 5 total)

  • in reply to: Detection problems in Spanish words. #1027247
    lytedesigns
    Participant

    Hello, were you able to test anything with the wav, or see anything in the logs?

    If I understand correctly, an update will be released soon. Is that right?
    Do you think the new version would work better in our case?

    Thanks again!

    in reply to: Detection problems in Spanish words. #1027213
    lytedesigns
    Participant

    Hello, I tried to reply but the system showed the error “Your reply cannot be created at this time”.
    I’ve attached a txt file with all the logs and my original reply on Dropbox.
    https://dl.dropboxusercontent.com/u/87410097/last_post_with_reply_problems.rtf

    Sorry, my English is quite poor, so maybe we are misunderstanding each other.

    The main problem is that when we give OpenEars a set of words to detect and then say other words, it recognizes words that were never said. For example, we say “CABEZA” and it recognizes “HOLA”.

    The last wav file I sent was recorded with the SaveThatWave plugin, calling startSessionDebugRecord while the app was running and I was speaking on the device.
    https://dl.dropboxusercontent.com/u/87410097/Rec_device.wav
    (this wav is generated with SaveThatWave on device)

    I’m sorry, I forgot to attach the log to the previous message; I imagine it will be necessary.
    Now I’m attaching the logs from when the session wav was recorded, and from when I tried to play it back as a test file.

    Also, if needed, I can attach the very simple test project, ready to run.
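    For clarity, this is roughly how we play the wav back instead of using the microphone (a sketch from our test project; the file name is ours, and pathToTestFile is the OpenEars property we are using as far as we understand the docs):

```objective-c
#import <OpenEars/OEPocketsphinxController.h>

// Sketch: point OpenEars at the recorded wav instead of live audio.
// "Rec_device.wav" has to be added to the app bundle first.
- (void)runWavAsTestFile {
    NSString *testFilePath = [[NSBundle mainBundle] pathForResource:@"Rec_device"
                                                             ofType:@"wav"];
    [OEPocketsphinxController sharedInstance].pathToTestFile = testFilePath;
    // Then start listening as usual; recognition runs over the wav contents.
}
```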

    in reply to: Detection problems in Spanish words. #1027191
    lytedesigns
    Participant

    OK, I started recording with startSessionDebugRecord and this is the generated wav:
    https://dl.dropboxusercontent.com/u/87410097/Rec_device.wav
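    For reference, we trigger the session debug recording like this (a sketch; the controller class name and import path are as we use them in our project, so adjust them to your SaveThatWave install):

```objective-c
// Sketch: record the whole listening session to a wav with SaveThatWave.
// Class/import names are from our project setup; adjust to yours.
#import <OpenEars/OEPocketsphinxController.h>
#import "OESaveThatWaveController.h"

// Kept as a strong property so it isn't deallocated mid-session:
self.saveThatWaveController = [[OESaveThatWaveController alloc] init];
[self.saveThatWaveController startSessionDebugRecord];
// ...start listening; the full session is written out as a wav file.
```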

    The problem remains the same: no matter how the audio is saved, the recognition problem is still there.
    Thank you. We hope to keep using your software, as we like it a lot; otherwise we will have to try other alternatives.

    in reply to: Detection problems in Spanish words. #1027188
    lytedesigns
    Participant

    Hello,
    we tested a wav generated with SaveThatWave while saying these words in Spanish:
    CARPETA SALUDO COMIDA CARTEL VENEZUELA ELECCIONES GATO DERECHA

    In OpenEars + Rejecto we put the words to search:
    @”CABEZA”, @”IZQUIERDA”, @”DERECHA”, @”SOBRE”, @”HOLA”, @”CAMBIAR”, @”RÁPIDO”, @”CIUDADANO”, @”URNA”

    It still detects words that were not said.
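    For completeness, this is roughly how we generate the Rejecto model from that word list (a sketch: the files name and import path are ours, error handling is simplified, and the method signature is as we understand it from the Rejecto docs):

```objective-c
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>
#import "OELanguageModelGenerator+Rejecto.h" // Rejecto category; adjust import to your install

// Sketch: build a Rejecto language model from our Spanish word list.
OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
NSArray *words = @[@"CABEZA", @"IZQUIERDA", @"DERECHA", @"SOBRE", @"HOLA",
                   @"CAMBIAR", @"RÁPIDO", @"CIUDADANO", @"URNA"];

NSError *error = [generator generateRejectingLanguageModelFromArray:words
        withFilesNamed:@"NameIWantForMyLanguageModelFiles"
        withOptionalExclusions:nil
        usingVowelsOnly:FALSE
        withWeight:nil // nil = default rejection weight
        forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelSpanish"]];
```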

    Here is the wav file generated with SaveThatWave from the device:
    https://dl.dropboxusercontent.com/u/87410097/Rec_device.wav

    Here is the wav file recorded with the device’s own recording software and then converted to wav with ‘afconvert’ (the same one I sent before):
    https://dl.dropboxusercontent.com/u/87410097/Rec2.wav

    Can you test it?
    Thanks

    Console log from running the SaveThatWave wav as a test file:
    [spoiler]
    2015-11-04 09:21:07.611 OpenEarsTest[3506:336821] Starting OpenEars logging for OpenEars version 2.04 on 32-bit device (or build): iPhone running iOS version: 8.100000
    2015-11-04 09:21:07.863 OpenEarsTest[3506:336821] The word URNA was not found in the dictionary /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/LanguageModelGeneratorLookupList.text/LanguageModelGeneratorLookupList.text.
    2015-11-04 09:21:07.865 OpenEarsTest[3506:336821] Now using the fallback method to look up the word URNA
    2015-11-04 09:21:07.865 OpenEarsTest[3506:336821] If this is happening more frequently than you would expect, the most likely cause for it is since you are using the Spanish phonetic lookup dictionary is that your words are not in Spanish or aren’t dictionary words.
    2015-11-04 09:21:07.867 OpenEarsTest[3506:336821] I’m done running performDictionaryLookup and it took 0.028449 seconds
    2015-11-04 09:21:07.870 OpenEarsTest[3506:336821] I’m done running performDictionaryLookup and it took 0.088410 seconds
    2015-11-04 09:21:07.880 OpenEarsTest[3506:336821] Starting dynamic language model generation

    INFO: cmd_ln.c(702): Parsing command line:
    sphinx_lm_convert \
    -i /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.arpa \
    -o /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -case
    -debug 0
    -help no no
    -i /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.arpa
    -ifmt
    -logbase 1.0001 1.000100e+00
    -mmap no no
    -o /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP
    -ofmt

    INFO: ngram_model_arpa.c(504): ngrams 1=36, 2=68, 3=34
    INFO: ngram_model_arpa.c(137): Reading unigrams
    INFO: ngram_model_arpa.c(543): 36 = #unigrams created
    INFO: ngram_model_arpa.c(197): Reading bigrams
    INFO: ngram_model_arpa.c(561): 68 = #bigrams created
    INFO: ngram_model_arpa.c(562): 3 = #prob2 entries
    INFO: ngram_model_arpa.c(570): 3 = #bo_wt2 entries
    INFO: ngram_model_arpa.c(294): Reading trigrams
    INFO: ngram_model_arpa.c(583): 34 = #trigrams created
    INFO: ngram_model_arpa.c(584): 2 = #prob3 entries
    INFO: ngram_model_dmp.c(518): Building DMP model…
    INFO: ngram_model_dmp.c(548): 36 = #unigrams created
    INFO: ngram_model_dmp.c(649): 68 = #bigrams created
    INFO: ngram_model_dmp.c(650): 3 = #prob2 entries
    INFO: ngram_model_dmp.c(657): 3 = #bo_wt2 entries
    INFO: ngram_model_dmp.c(661): 34 = #trigrams created
    INFO: ngram_model_dmp.c(662): 2 = #prob3 entries
    2015-11-04 09:21:07.988 OpenEarsTest[3506:336821] Done creating language model with CMUCLMTK in 0.106777 seconds.
    INFO: cmd_ln.c(702): Parsing command line:
    sphinx_lm_convert \
    -i /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.arpa \
    -o /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -case
    -debug 0
    -help no no
    -i /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.arpa
    -ifmt
    -logbase 1.0001 1.000100e+00
    -mmap no no
    -o /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP
    -ofmt

    INFO: ngram_model_arpa.c(504): ngrams 1=36, 2=68, 3=34
    INFO: ngram_model_arpa.c(137): Reading unigrams
    INFO: ngram_model_arpa.c(543): 36 = #unigrams created
    INFO: ngram_model_arpa.c(197): Reading bigrams
    INFO: ngram_model_arpa.c(561): 68 = #bigrams created
    INFO: ngram_model_arpa.c(562): 3 = #prob2 entries
    INFO: ngram_model_arpa.c(570): 3 = #bo_wt2 entries
    INFO: ngram_model_arpa.c(294): Reading trigrams
    INFO: ngram_model_arpa.c(583): 34 = #trigrams created
    INFO: ngram_model_arpa.c(584): 2 = #prob3 entries
    INFO: ngram_model_dmp.c(518): Building DMP model…
    INFO: ngram_model_dmp.c(548): 36 = #unigrams created
    INFO: ngram_model_dmp.c(649): 68 = #bigrams created
    INFO: ngram_model_dmp.c(650): 3 = #prob2 entries
    INFO: ngram_model_dmp.c(657): 3 = #bo_wt2 entries
    INFO: ngram_model_dmp.c(661): 34 = #trigrams created
    INFO: ngram_model_dmp.c(662): 2 = #prob3 entries
    2015-11-04 09:21:08.000 OpenEarsTest[3506:336821] I’m done running dynamic language model generation and it took 0.356924 seconds
    2015-11-04 09:21:08.012 OpenEarsTest[3506:336821] User gave mic permission for this app.
    2015-11-04 09:21:08.055 OpenEarsTest[3506:336821] Attempting to start listening session from startRealtimeListeningWithLanguageModelAtPath:
    2015-11-04 09:21:08.059 OpenEarsTest[3506:336821] User gave mic permission for this app.
    2015-11-04 09:21:08.060 OpenEarsTest[3506:336821] Valid setSecondsOfSilence value of 1.700000 will be used.
    2015-11-04 09:21:08.062 OpenEarsTest[3506:336821] Successfully started listening session from startRealtimeListeningWithLanguageModelAtPath:
    2015-11-04 09:21:08.064 OpenEarsTest[3506:336888] Starting listening.
    2015-11-04 09:21:08.066 OpenEarsTest[3506:336888] about to set up audio session
    2015-11-04 09:21:08.068 OpenEarsTest[3506:336888] Creating audio session with default settings.
    2015-11-04 09:21:08.178 OpenEarsTest[3506:336907] Audio route has changed for the following reason:
    2015-11-04 09:21:08.547 OpenEarsTest[3506:336907] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
    2015-11-04 09:21:08.564 OpenEarsTest[3506:336907] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —SpeakerMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x16593b00,
    inputs = (null);
    outputs = (
    “<AVAudioSessionPortDescription: 0x165a7f40, type = Speaker; name = Altavoz; UID = Built-In Speaker; selectedDataSource = (null)>”
    )>.
    2015-11-04 09:21:08.575 OpenEarsTest[3506:336888] done starting audio unit
    INFO: cmd_ln.c(702): Parsing command line:
    \
    -lm /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP \
    -vad_prespeech 10 \
    -vad_postspeech 170 \
    -vad_threshold 4.300000 \
    -remove_noise yes \
    -remove_silence yes \
    -bestpath no \
    -lw 6.500000 \
    -dict /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.dic \
    -hmm /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -allphone
    -allphone_ci no no
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes no
    -bestpathlw 9.5 9.500000e+00
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -debug 0
    -dict /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle
    -input_endian little little
    -jsgf
    -keyphrase
    -kws
    -kws_plp 1e-1 1.000000e-01
    -kws_threshold 1 1.000000e+00
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lifter 0 0
    -lm /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.DMP
    -lmctl
    -lmname
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf 30000 30000
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-10 1.000000e-10
    -pl_pip 1.0 1.000000e+00
    -pl_weight 3.0 3.000000e+00
    -pl_window 5 5
    -rawlogdir
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -uw 1.0 1.000000e+00
    -vad_postspeech 50 170
    -vad_prespeech 10 10
    -vad_threshold 2.0 4.300000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(702): Parsing command line:
    \
    -feat s3_1x39

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd s3_1x39
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 0
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -svspec
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -vad_postspeech 50 170
    -vad_prespeech 10 10
    -vad_threshold 2.0 4.300000e+00
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.562500e-02

    INFO: acmod.c(252): Parsed model-specific feature parameters from /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/feat.params
    INFO: feat.c(715): Initializing feature stream to type: 's3_1x39', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: mdef.c(518): Reading model definition: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/mdef
    INFO: bin_mdef.c(181): Allocating 27954 * 8 bytes (218 KiB) for CD tree
    INFO: tmat.c(206): Reading HMM transition probability matrices: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/transition_matrices
    INFO: acmod.c(124): Attempting to use PTM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/means
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/variances
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(354): 16 variance values floored
    INFO: ptm_mgau.c(801): Number of codebooks exceeds 256: 2630
    INFO: acmod.c(126): Attempting to use semi-continuous computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/means
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/variances
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(354): 16 variance values floored
    INFO: acmod.c(128): Falling back to general multi-stream GMM computation
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/means
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/variances
    INFO: ms_gauden.c(292): 2630 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 16×39
    INFO: ms_gauden.c(354): 16 variance values floored
    INFO: ms_senone.c(149): Reading senone mixture weights: /private/var/mobile/Containers/Bundle/Application/07477015-1B5B-413E-8FB2-1B7AF2039661/OpenEarsTest.app/AcousticModelSpanish.bundle/mixture_weights
    INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(207): Not transposing mixture weights in memory
    INFO: ms_senone.c(268): Read mixture weights for 2630 senones: 1 features x 16 codewords
    INFO: ms_senone.c(320): Mapping senones to individual codebooks
    INFO: ms_mgau.c(141): The value of topn: 4
    INFO: phone_loop_search.c(115): State beam -225 Phone exit beam -225 Insertion penalty 0
    INFO: dict.c(320): Allocating 4130 * 20 bytes (80 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/5C951750-9914-4547-B09D-660F7BBE180D/Library/Caches/NameIWantForMyLanguageModelFiles.dic
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(336): 34 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 26^3 * 2 bytes (34 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 8216 bytes (8 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 8216 bytes (8 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(79): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(166): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(220): ngrams 1=36, 2=68, 3=34
    INFO: ngram_model_dmp.c(266): 36 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(312): 68 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(338): 34 = LM.trigrams read
    INFO: ngram_model_dmp.c(363): 3 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(383): 3 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(403): 2 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(431): 1 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(487): 36 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 8 unique initial diphones
    INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 29 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 29 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 164
    INFO: ngram_search_fwdtree.c(339): after: 8 root, 36 non-root channels, 28 single-phone words
    INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
    2015-11-04 09:21:12.627 OpenEarsTest[3506:336888] Listening.
    2015-11-04 09:21:12.631 OpenEarsTest[3506:336888] Project has these words or phrases in its dictionary:
    ___REJ_Y
    ___REJ_X
    ___REJ_V
    ___REJ_U
    ___REJ_T
    ___REJ_S
    ___REJ_RR
    ___REJ_R
    ___REJ_P
    ___REJ_O
    ___REJ_N
    ___REJ_M
    ___REJ_LL
    ___REJ_L
    ___REJ_K
    ___REJ_J
    ___REJ_I
    ___REJ_GN
    ___REJ_G
    ___REJ_F
    ___REJ_E
    ___REJ_D
    ___REJ_CH
    ___REJ_B
    ___REJ_A
    CABEZA
    CAMBIAR
    CIUDADANO
    DERECHA
    HOLA
    IZQUIERDA
    …and 4 more.
    2015-11-04 09:21:12.632 OpenEarsTest[3506:336888] Recognition loop has started
    2015-11-04 09:21:12.637 OpenEarsTest[3506:336821] Pocketsphinx is now listening.
    2015-11-04 09:21:14.284 OpenEarsTest[3506:336888] Speech detected…
    2015-11-04 09:21:14.285 OpenEarsTest[3506:336821] Pocketsphinx has detected speech.
    2015-11-04 09:21:14.287 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-843) and an utterance ID of 0.
    2015-11-04 09:21:14.288 OpenEarsTest[3506:336886] Hypothesis was null so we aren’t returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController’s property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
    2015-11-04 09:21:14.471 OpenEarsTest[3506:336888] Pocketsphinx heard “HOLA” with a score of (-1487) and an utterance ID of 1.
    2015-11-04 09:21:14.472 OpenEarsTest[3506:336821] rapidEarsDidReceiveLiveSpeechHypothesis: The received hypothesis is HOLA with a score of -1487
    2015-11-04 09:21:14.716 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-2476) and an utterance ID of 2.
    2015-11-04 09:21:14.717 OpenEarsTest[3506:336886] Hypothesis was null so we aren’t returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController’s property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
    2015-11-04 09:21:14.926 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-4259) and an utterance ID of 3.
    2015-11-04 09:21:15.130 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-5109) and an utterance ID of 4.
    2015-11-04 09:21:15.326 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-5523) and an utterance ID of 5.
    2015-11-04 09:21:15.632 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-5767) and an utterance ID of 6.
    2015-11-04 09:21:15.901 OpenEarsTest[3506:336886] Pocketsphinx heard ” ” with a score of (-6465) and an utterance ID of 7.
    2015-11-04 09:21:16.166 OpenEarsTest[3506:336886] Pocketsphinx heard “HOLA” with a score of (-8759) and an utterance ID of 8.
    2015-11-04 09:21:16.167 OpenEarsTest[3506:336821] rapidEarsDidReceiveLiveSpeechHypothesis: The received hypothesis is HOLA with a score of -8759
    2015-11-04 09:21:16.458 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA” with a score of (-10135) and an utterance ID of 9.
    2015-11-04 09:21:16.459 OpenEarsTest[3506:336821] rapidEarsDidReceiveLiveSpeechHypothesis: The received hypothesis is URNA HOLA with a score of -10135
    2015-11-04 09:21:16.754 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA” with a score of (-10189) and an utterance ID of 10.
    2015-11-04 09:21:16.986 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA” with a score of (-11234) and an utterance ID of 11.
    2015-11-04 09:21:17.191 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA” with a score of (-12147) and an utterance ID of 12.
    2015-11-04 09:21:17.477 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA” with a score of (-14526) and an utterance ID of 13.
    2015-11-04 09:21:17.752 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-15029) and an utterance ID of 14.
    2015-11-04 09:21:17.753 OpenEarsTest[3506:336821] rapidEarsDidReceiveLiveSpeechHypothesis: The received hypothesis is URNA HOLA DERECHA with a score of -15029
    2015-11-04 09:21:18.047 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-16053) and an utterance ID of 15.
    2015-11-04 09:21:18.319 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-16390) and an utterance ID of 16.
    INFO: ngram_search.c(462): Resized backpointer table to 10000 entries
    2015-11-04 09:21:18.614 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-16724) and an utterance ID of 17.
    2015-11-04 09:21:18.898 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-17003) and an utterance ID of 18.
    2015-11-04 09:21:19.192 OpenEarsTest[3506:336886] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-17244) and an utterance ID of 19.
    2015-11-04 09:21:19.256 OpenEarsTest[3506:336888] End of speech detected…
    2015-11-04 09:21:19.257 OpenEarsTest[3506:336821] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 8.89 0.07 -0.36 0.01 -0.16 -0.01 -0.26 -0.21 -0.05 -0.18 -0.09 -0.07 -0.07 >
    INFO: ngram_search_fwdtree.c(1553): 6748 words recognized (14/fr)
    INFO: ngram_search_fwdtree.c(1555): 119026 senones evaluated (245/fr)
    INFO: ngram_search_fwdtree.c(1559): 28716 channels searched (59/fr), 3815 1st, 18042 last
    INFO: ngram_search_fwdtree.c(1562): 12032 words for which last channels evaluated (24/fr)
    INFO: ngram_search_fwdtree.c(1564): 297 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1567): fwdtree 4.81 CPU 0.990 xRT
    INFO: ngram_search_fwdtree.c(1570): fwdtree 6.56 wall 1.350 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 31 words
    INFO: ngram_search_fwdflat.c(945): 3539 words recognized (7/fr)
    INFO: ngram_search_fwdflat.c(947): 77412 senones evaluated (159/fr)
    INFO: ngram_search_fwdflat.c(949): 22670 channels searched (46/fr)
    INFO: ngram_search_fwdflat.c(951): 7729 words searched (15/fr)
    INFO: ngram_search_fwdflat.c(954): 6070 word transitions (12/fr)
    INFO: ngram_search_fwdflat.c(957): fwdflat 1.97 CPU 0.406 xRT
    INFO: ngram_search_fwdflat.c(960): fwdflat 2.07 wall 0.425 xRT
    2015-11-04 09:21:21.358 OpenEarsTest[3506:336888] Pocketsphinx heard “URNA HOLA DERECHA” with a score of (-17696) and an utterance ID of 20.
    2015-11-04 09:21:21.360 OpenEarsTest[3506:336821] rapidEarsDidReceiveFinishedSpeechHypothesis: The received hypothesis is URNA HOLA DERECHA with a score of -17696
    [/spoiler]

    in reply to: Detection problems in Spanish words. #1027153
    lytedesigns
    Participant

    Nothing. I changed vadThreshold to 4.4, but it gives the same behavior: it still reports words when none were said (although it detects fewer of them than with 4.3).
    With the wav you can see the failures clearly, but most of the problems occur with live voice.
    There are still certain words that work better, but they are rare.

    At this point we don’t really know what else to do.
    If you need the project, I can send it via Dropbox (our project and the modified OpenEars example).
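    In case it helps, this is the one line we are changing for the threshold test (set before starting the listening session; the property is from the OEPocketsphinxController docs):

```objective-c
#import <OpenEars/OEPocketsphinxController.h>

// Raise the voice-activity detection threshold to reject more background
// audio. We tried 4.3 and now 4.4; higher values detect less speech overall.
[OEPocketsphinxController sharedInstance].vadThreshold = 4.4;
```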

    Thanks!
