Forum Replies Created
September 9, 2014 at 5:06 pm in reply to: Pocketsphinx 'heard' different to received hypothesis #1022471
JonLocate, Participant
Hello Halle,
Thanks for your reply!
I have decided to go back to the previous ‘two-phrase’ method. I basically found a cheap cheat to get past the fact that the hypothesis I wanted was delivered after I needed it: a bool switch that waits for the right hypothesis before doing anything with it. Doing this, I found that the hypothesis generated from the wav with the second vocabulary usually returns many words from the vocab rather than just the one I want. That makes the approach useless, because it gives me a wrong word that affects the program before the right one arrives.
For example, when the spoken phrase is “locate white wine”:
hypothesis 1 = “locate white wine”
The “wine” keyword makes the program switch to the wine vocab and rerun voice recognition on the saved wav, which in turn generates hypothesis 2:
hypothesis 2 = “pink white white”
I am looking for the word “white”, but “pink” is operated on because it is the first word, and I have not been able to refine this with various different Rejecto setups. But I think I will be able to use the two-phrase method of:
user – “locate wine”
switch to wine vocab
flite – “any particular type of wine”
user – “white”
flite – “follow me to the white wine”

Not a huge drain on user experience, really! I think it would be an interesting addition to the SDK though: the ability to run voice recognition on wavs that you have saved through SaveThatWave.
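Roughly, the two-phrase flow I have in mind looks like this in the hypothesis callback (only a sketch: the waitingForWineType flag and the wineModelPath/wineDictionaryPath properties are placeholder names from my own project, and the flite prompts are just comments):

// Sketch of the two-phrase flow inside the OpenEarsEventsObserver delegate.
- (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID
{
    if (self.waitingForWineType) { // second phrase: only the wine vocab is active now
        if ([hypothesis rangeOfString:@"WHITE"].location != NSNotFound) {
            // flite - "follow me to the white wine"
            self.waitingForWineType = NO;
        }
        return;
    }

    if ([hypothesis rangeOfString:@"WINE"].location != NSNotFound) { // first phrase
        [self.pocketSphinxController changeLanguageModelToFile:self.wineModelPath withDictionary:self.wineDictionaryPath];
        // flite - "any particular type of wine?"
        self.waitingForWineType = YES;
    }
}

The flag just makes sure the second hypothesis is only ever interpreted against the wine vocabulary.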
Thanks for the help!
Jon

September 2, 2014 at 10:32 pm in reply to: Pocketsphinx 'heard' different to received hypothesis #1022441
JonLocate, Participant

Thanks for the reply,
From my understanding of OpenEars, I think that when you speak a sentence the loop detects speech, waits until the end of speech is detected, and then passes the hypothesis to the receiving method. I also think you can only apply one vocabulary to one phrase at a time, switching only after receiving the hypothesis and then checking the hypothesis for words that determine which vocabulary change needs to happen.
So it would be like:
phrase is spoken,
hypothesis is received,
vocab to switch to is determined based on hypothesis,
vocab is switched,
user speaks another phrase to be worked on by new vocabulary,
new words are given in hypothesis,
world keeps spinning.

I’ve built a test app that works great in this way, but the user is required to speak more than once to get the desired results. What I am trying to do is have it so the user only has to say one phrase, which is then used to generate multiple hypotheses using several different vocabularies.
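In the test app, the switching step after a hypothesis comes in is roughly this (just a sketch; the model and dictionary path properties are placeholder names):

// Decide which vocabulary to switch to based on keywords in the hypothesis.
- (void)switchVocabularyForHypothesis:(NSString *)hypothesis
{
    if ([hypothesis rangeOfString:@"BREAD"].location != NSNotFound) {
        [self.pocketSphinxController changeLanguageModelToFile:self.breadlmPath withDictionary:self.breaddicPath];
    } else if ([hypothesis rangeOfString:@"WINE"].location != NSNotFound) {
        [self.pocketSphinxController changeLanguageModelToFile:self.winelmPath withDictionary:self.winedicPath];
    } else if ([hypothesis rangeOfString:@"FRUIT"].location != NSNotFound) {
        [self.pocketSphinxController changeLanguageModelToFile:self.fruitlmPath withDictionary:self.fruitdicPath];
    }
}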
pseudo example:
commands = [locate, find, bread, wine, fruit]
breadvocab = [brown, white, wholegrain]
fruitvocab = [bananas, kiwis, pineapples]
winevocab = [red, white, pink]

user says – “locate white wine”
phrase is recorded into wav at wavPath
commands vocab is used for first hypothesis generated from [runRecOnWavAt:wavPath with:commandVocabPath]
hypothesis = “locate wine”

vocab is now changed to winevocab and we [runRecOnWavAt:wavPath with:wineVocabPath]
hypothesis = “white” (or “pink white white”, as that phrase usually generates)

From the one spoken phrase, two vocabs were used to find that the user wanted to locate the white wine aisle. In my head this can’t be done with the one spoken phrase that the listening loop picks up, because only one vocab is used per phrase/hypothesis? So by using a recorded wav I can check it multiple times with multiple grammars.
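In OpenEars terms I picture the second pass looking something like this (a sketch only: I am going from memory on the WAV-recognition method, so the exact signature may differ, and wavPath plus the wine model/dictionary path properties are placeholders):

// Second pass: rerun recognition on the wav saved from the first pass, this time with the wine vocabulary.
- (void)runWinePassOnWavAtPath:(NSString *)wavPath
{
    [self.pocketSphinxController runRecognitionOnWavFileAtPath:wavPath
                                      usingLanguageModelAtPath:self.winelmPath
                                              dictionaryAtPath:self.winedicPath
                                           acousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]
                                           languageModelIsJSGF:NO];
    // The resulting hypothesis arrives in pocketsphinxDidReceiveHypothesis:recognitionScore:utteranceID:
    // just like a live utterance, so a flag is needed to know which pass a hypothesis belongs to.
}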
Am I missing something? Can the listening loop check with multiple grammars without having to use a wav to save the spoken phrase?
Thanks!
Jon

JonLocate, Participant

After looking through the example I modified my code to have a second language model and its paths:
NSString *name = @"commandModel"; NSError *commandModel = [lmGenerator generateRejectingLanguageModelFromArray:commandVocab withFilesNamed:name withOptionalExclusions:nil usingVowelsOnly:FALSE withWeight:nil forAcousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]]; NSDictionary *languageGeneratorResults = nil; if([commandModel code] == noErr) { languageGeneratorResults = [commandModel userInfo]; // not sure if i need these file vars as they are not in the tutorial but are in the demo app lmFile = [languageGeneratorResults objectForKey:@"LMFile"]; dicFile = [languageGeneratorResults objectForKey:@"DictionaryFile"]; lmPath = [languageGeneratorResults objectForKey:@"LMPath"]; dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"]; } else { NSLog(@"Error: %@",[commandModel localizedDescription]); } NSString *name1 = @"BreadModel"; NSError *breadModel = [lmGenerator generateRejectingLanguageModelFromArray:breadVocab withFilesNamed:name1 withOptionalExclusions:nil usingVowelsOnly:FALSE withWeight:nil forAcousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]]; NSDictionary *breadlanguageGeneratorResults = nil; if([breadModel code] == noErr) { breadlanguageGeneratorResults = [commandModel userInfo]; breadlmFile = [breadlanguageGeneratorResults objectForKey:@"LMFile"]; breaddicFile = [breadlanguageGeneratorResults objectForKey:@"DictionaryFile"]; breadlmPath = [breadlanguageGeneratorResults objectForKey:@"LMPath"]; breaddicPath = [breadlanguageGeneratorResults objectForKey:@"DictionaryPath"]; } else { NSLog(@"Error: %@",[breadModel localizedDescription]); }
Then when I want to change the model I call this:
[pocketsphinxController changeLanguageModelToFile:breadlmPath withDictionary:breaddicPath];
Not a lot different from the demo app, yet the model doesn’t get changed and the model-changed delegate method doesn’t get called. Any ideas?
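For reference, the delegate method I mean is this one (if I have the OpenEarsEventsObserver callback name right):

// OpenEarsEventsObserverDelegate callback I expect to fire after the switch.
- (void)pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString
{
    NSLog(@"Language model changed to %@ with dictionary %@", newLanguageModelPathAsString, newDictionaryPathAsString);
}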
Thanks
Jon

JonLocate, Participant

Well, it works! I completely copied and pasted the tutorial and it worked perfectly! I didn’t think I needed to create those instance variables in the .h file and synthesize them in the .m, but I guess I needed to! How come you have to do it this way instead of just referencing the properties? I’ve seen it work fine for other things, for example Apple’s location manager class: you create a CLLocationManager property, set the class you are working on as the delegate, and reference it as self.locationManager or _locationManager, and it works fine without having to create the ivars or synthesize them. I guess they are completely different classes though!
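To make the difference I am asking about concrete (assuming I have the nil-messaging behaviour right; this is just my own shorthand, not tutorial code):

// In Objective-C a message to nil silently does nothing.
- (void)viewDidLoad
{
    [super viewDidLoad];

    // Direct ivar access: if nothing has created the controller yet, this is a message to nil
    // and the setting goes nowhere.
    pocketSphinxController.verbosePocketSphinx = TRUE;

    // Going through the lazily-instantiating accessor creates the controller on first use,
    // so the setting actually takes effect.
    self.pocketSphinxController.verbosePocketSphinx = TRUE;
}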
Thanks for the help!
JonLocate, Participant

OK, I’ll copy and paste the tutorial and see how I get on. Is this line the correct way to start the verbosePocketSphinx logging?
pocketSphinxController.verbosePocketSphinx = true;
Thanks

JonLocate, Participant

Hey Halle, thanks for your replies. I’ve updated my code a little to match the tutorial more closely. Also, I am not sure what you mean by ivars?
The delegate methods are still in my code, but I didn’t post them just to keep the code shorter. Also, I turned on verbosePocketSphinx (I think), and the logging output is the same as when only using [OpenEarsLogging startOpenEarsLogging].
header file
#import <UIKit/UIKit.h>
#import <OpenEars/PocketsphinxController.h>
#import <OpenEars/AcousticModel.h>
#import <OpenEars/OpenEarsEventsObserver.h>

@interface SpeechViewController : UIViewController <OpenEarsEventsObserverDelegate>
{
PocketsphinxController *pocketSphinxController;
OpenEarsEventsObserver *openEarsEventObserver;
}

@property (strong, nonatomic) PocketsphinxController *pocketSphinxController;
@property (strong, nonatomic) OpenEarsEventsObserver *openEarsEventObserver;
@property (strong, nonatomic) IBOutlet UILabel *resultsLabel;

- (IBAction)talkButton:(id)sender;
@end
implementation
#import "SpeechViewController.h"
#import <OpenEars/LanguageModelGenerator.h>
#import <OpenEars/OpenEarsLogging.h>

@interface SpeechViewController () <OpenEarsEventsObserverDelegate>
@end
@implementation SpeechViewController
@synthesize pocketSphinxController;
@synthesize openEarsEventObserver;

- (void)viewDidLoad
{
[super viewDidLoad];
// Do any additional setup after loading the view, typically from a nib.
[openEarsEventObserver setDelegate:self];
pocketSphinxController.verbosePocketSphinx = true;
[OpenEarsLogging startOpenEarsLogging];
}

- (void)didReceiveMemoryWarning
{
[super didReceiveMemoryWarning];
// Dispose of any resources that can be recreated.
}

- (PocketsphinxController *)pocketSphinxController
{
if (pocketSphinxController == nil)
{
pocketSphinxController = [[PocketsphinxController alloc] init];
}
return pocketSphinxController;
}

- (OpenEarsEventsObserver *)openEarsEventObserver
{
if (openEarsEventObserver == nil)
{
openEarsEventObserver = [[OpenEarsEventsObserver alloc] init];
}
return openEarsEventObserver;
}

- (IBAction)talkButton:(id)sender
{
LanguageModelGenerator *LangGen = [[LanguageModelGenerator alloc] init];

NSArray *words = [NSArray arrayWithObjects:@"HELLO WORLD", @"HELLO", @"WORLD", @"TEST", @"SPEECH", @"LOCATION", nil];
NSString *name = @"LangModelName";
NSError *err = [LangGen generateLanguageModelFromArray:words withFilesNamed:name forAcousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]];

NSDictionary *languageGeneratorResults = nil;
NSString *lmPath = nil;
NSString *dicPath = nil;

if ([err code] == noErr)
{
languageGeneratorResults = [err userInfo];

lmPath = [languageGeneratorResults objectForKey:@"LMPath"];
dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"];
}
else
{
NSLog(@"Error: %@", [err localizedDescription]);
}

//[openEarsEventObserver setDelegate:self];
[pocketSphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO];
}
- (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID
{
NSLog(@"The Received Hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);

[self.resultsLabel setText:hypothesis];
}

logging
2014-08-21 10:03:46.900 OpenEarsTest[192:60b] Starting OpenEars logging for OpenEars version 1.7 on 32-bit device: iPad running iOS version: 7.100000
2014-08-21 10:03:49.715 OpenEarsTest[192:60b] acousticModelPath is /var/mobile/Applications/7E1E39CB-4194-4F73-B3FC-8997C8C161A0/OpenEarsTest.app/AcousticModelEnglish.bundle
2014-08-21 10:03:49.800 OpenEarsTest[192:60b] Starting dynamic language model generation
2014-08-21 10:03:49.806 OpenEarsTest[192:60b] Able to open /var/mobile/Applications/7E1E39CB-4194-4F73-B3FC-8997C8C161A0/Library/Caches/LangModelName.corpus for reading
2014-08-21 10:03:49.809 OpenEarsTest[192:60b] Able to open /var/mobile/Applications/7E1E39CB-4194-4F73-B3FC-8997C8C161A0/Library/Caches/LangModelName_pipe.txt for writing
2014-08-21 10:03:49.811 OpenEarsTest[192:60b] Starting text2wfreq_impl
2014-08-21 10:03:49.834 OpenEarsTest[192:60b] Done with text2wfreq_impl
2014-08-21 10:03:49.836 OpenEarsTest[192:60b] Able to open /var/mobile/Applications/7E1E39CB-4194-4F73-B3FC-8997C8C161A0/Library/Caches/LangModelName_pipe.txt for reading.
2014-08-21 10:03:49.838 OpenEarsTest[192:60b] Able to open /var/mobile/Applications/7E1E39CB-4194-4F73-B3FC-8997C8C161A0/Library/Caches/LangModelName.vocab for reading.
2014-08-21 10:03:49.840 OpenEarsTest[192:60b] Starting wfreq2vocab
2014-08-21 10:03:49.842 OpenEarsTest[192:60b] Done with wfreq2vocab
2014-08-21 10:03:49.844 OpenEarsTest[192:60b] Starting text2idngram
2014-08-21 10:03:49.867 OpenEarsTest[192:60b] Done with text2idngram
2014-08-21 10:03:49.878 OpenEarsTest[192:60b] Starting idngram2lm
2014-08-21 10:03:49.892 OpenEarsTest[192:60b] Done with idngram2lm
2014-08-21 10:03:49.894 OpenEarsTest[192:60b] Starting sphinx_lm_convert
2014-08-21 10:03:49.908 OpenEarsTest[192:60b] Finishing sphinx_lm_convert
2014-08-21 10:03:49.915 OpenEarsTest[192:60b] Done creating language model with CMUCLMTK in 0.114059 seconds.
2014-08-21 10:03:50.093 OpenEarsTest[192:60b] I’m done running performDictionaryLookup and it took 0.147953 seconds
2014-08-21 10:03:50.101 OpenEarsTest[192:60b] I’m done running dynamic language model generation and it took 0.384307 seconds

I doubt it has anything to do with how I imported OpenEars into my project, since all the classes and methods are fully visible and showing no errors, but I could be wrong; maybe there is a missing reference to something?
Thanks very much!

JonLocate, Participant

Thank you! I deleted that Stack post.