Forum Replies Created
hohlParticipant
Yes, the last approach sounds like the best solution. I just have to look up how the LanguageModelGenerator works and how to use it with already pre-created pronunciations.
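A minimal sketch of the direction I have in mind, assuming the OpenEars 1.x call generateLanguageModelFromArray:withFilesNamed: and that reusing the previously generated files counts as “pre-created pronunciation”; the array, file name and output path here are made up for illustration:

#import <OpenEars/LanguageModelGenerator.h>

// Hypothetical: only regenerate when no previously generated dictionary exists,
// so the expensive pronunciation lookup runs once instead of on every launch.
NSArray *artistNames = [NSArray arrayWithObjects:@"KONTRUST", @"SKRILLEX", @"DEICHKIND", nil];
NSString *cachesDirectory = [NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES) objectAtIndex:0];
NSString *dictionaryPath = [cachesDirectory stringByAppendingPathComponent:@"ArtistModel.dic"]; // assumed output location

if (![[NSFileManager defaultManager] fileExistsAtPath:dictionaryPath]) {
    LanguageModelGenerator *generator = [[LanguageModelGenerator alloc] init];
    NSError *error = [generator generateLanguageModelFromArray:artistNames withFilesNamed:@"ArtistModel"];
    NSLog(@"Generation finished: %@", error); // exact success check depends on the OpenEars version
}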
hohlParticipant
The recognition already works well; only the creation takes very long, which has surprised me.
Another thought of mine was to create a cache for “- (NSString *) convertGraphemes:(NSString *)phrase {”. But I haven’t debugged how long this method actually takes and whether caching would improve anything.
I’ll have a look at creating a custom cmu07a.dic addition which adds the most-used names from some kind of public charts and gets updated regularly by an automated routine that doesn’t run on the device itself (it only delivers the result).
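One way such a cache could look, simply wrapping the method signature quoted above in a memoizing helper; the dictionary, its lifetime and the wrapper name are illustrative assumptions:

// Hypothetical memoization around the existing (expensive) grapheme conversion.
static NSMutableDictionary *graphemeCache = nil;

- (NSString *)cachedConvertGraphemes:(NSString *)phrase {
    if (graphemeCache == nil) {
        graphemeCache = [[NSMutableDictionary alloc] init];
    }
    NSString *cached = [graphemeCache objectForKey:phrase];
    if (cached != nil) {
        return cached; // cache hit: skip the heavy text-to-speech lookup
    }
    NSString *result = [self convertGraphemes:phrase]; // original slow path
    if (result != nil) {
        [graphemeCache setObject:result forKey:phrase];
    }
    return result;
}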
hohlParticipant
I just looked up what ‘convertGraphemes’ does and it looks like a very heavy task (it appears to create pronunciations via text-to-speech). And yes, everything is upper case. The problem is that the language map only contains names! It has around 100 names like ‘KONTRUST’, ‘SKRILLEX’ or ‘DEICHKIND’, none of which are English words.
How much work is it to create such a phonetic dictionary? Would it help to create such a dictionary for artist names?
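For reference, such a dictionary is just a plain text file with one word per line followed by its phonemes in the same phone set cmu07a.dic uses; the pronunciations below are rough illustrative guesses, not verified transcriptions:

DEICHKIND    D AY SH K IH N T
KONTRUST     K AA N T R AH S T
SKRILLEX     S K R IH L EH K S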
hohlParticipant
I am only using GCD since it is much cleaner. It must be dynamic since I am creating a language map from the artist and album names on the user’s device. The problem is that there isn’t any nice notification when the user adds new music to the library, so I need to update the language model in the background on a schedule. And while this is happening, the user shouldn’t be blocked from using other parts of the app or even other parts of iOS.
Maybe it is because the artist names aren’t typical English words? When using OpenEarsLogging I get log entries about fallback methods being used for nearly all of the words in the created language model. ( https://www.sourcedrop.net/4Loa58d7ba3b3 )
hohlParticipant
I’ll let you know if I can find out what causes the lags. At least there aren’t any exceptions or unexpected results when running it in the background. In the case of my application (with around 3×~100 entries generated in the background) it just takes some time, and it also blocks the main thread (noticeable as an unresponsive UI).
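A minimal sketch of the dispatch pattern I mean, where regenerateArtistLanguageModel and languageModelDidUpdate are hypothetical method names standing in for the real work and the UI update:

// Keep the slow generation off the main queue; only hop back for UI updates.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [self regenerateArtistLanguageModel];      // grapheme conversion, file writes, etc.
    dispatch_async(dispatch_get_main_queue(), ^{
        [self languageModelDidUpdate];         // anything touching the UI stays on main
    });
});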
hohlParticipant
Thanks for your response.
While debugging the application without finding anything specific that blocks, I’ve started to think that the blocking is caused by I/O to the device’s flash memory. Does OpenEars make heavy use of the disk, which might block other resources (an SQLite database) from loading? What would be a good approach to throttle the disk I/O usage of OpenEars?
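One idea I could try, assuming the generation can simply be pushed onto the background-priority global queue, which the system runs with throttled disk I/O:

// Run the regeneration at background priority so its disk I/O is deprioritised
// against the rest of the app (regenerateArtistLanguageModel is a hypothetical wrapper).
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
    [self regenerateArtistLanguageModel];
});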
hohlParticipant
Are you looking for this?
NSError *audioSessionError = nil;
// Configure the shared audio session for simultaneous playback and recording.
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord error:&audioSessionError];
[[AVAudioSession sharedInstance] setActive:YES error:&audioSessionError];
if (audioSessionError != nil) {
    NSLog(@"Something went wrong with initialising the audio session!");
}
// Also activate the session via the C AudioSession API and listen for route changes.
AudioSessionSetActive(true);
AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange, ARAudioSessionPropertyListener, nil);

AVPlayer is just playing, and the OpenEars session starts when triggered by the user. AVPlayer keeps playing in the background, but I’m going to lower its volume during the OpenEars session in the future to get better results.
hohlParticipant
I am just using AVPlayer for playback. https://developer.apple.com/library/mac/#documentation/AVFoundation/Reference/AVPlayer_Class/Reference/Reference.html
hohlParticipant
But I need high-quality playback since my application is a media player, and 16k isn’t acceptable for that kind of application. Why does OpenEars need to change the global playback quality?
hohlParticipant
Ah, OK, I understand. But since it still works with the small dictionary I am using, I’ll leave it like that.
hohlParticipant
Something is wrong with the code tag in this forum, so I uploaded the change to line 400 of AudioSessionManager.m here: https://www.sourcedrop.net/Tyj72cb2147c9
Will this have an influence on OpenEars?
hohlParticipant
Changed it to:
if (fabs(preferredSampleRateCheck - kSamplesPerSecond) 0.0) {
in AudioSessionManager.m:400, and it still works and the reduction doesn’t take place anymore.
hohlParticipant
I extended the logging a bit and recompiled the lib. This is what I am getting:
2012-09-05 12:29:57.733 Autoradio[5778:707] preferredBufferSize is incorrect, we will change it. Current value: 0.023000
2012-09-05 12:29:57.747 Autoradio[5778:707] PreferredBufferSize is now on the correct setting of 0.128000.
2012-09-05 12:29:57.755 Autoradio[5778:707] preferredSampleRateCheck is incorrect, we will change it. Current value: 44100.000000
2012-09-05 12:29:57.945 Autoradio[5778:707] preferred hardware sample rate is now on the correct setting of 16000.000000.
Sounds like a reduction of the hardware sample rate? Could I change the check to accept anything at or above the preferred kSamplesPerSecond, or would that break the functionality of OpenEars?
hohlParticipant
What I’ve found out when using logging is:
2012-09-05 12:13:51.599 Autoradio[5729:707] preferredBufferSize is incorrect, we will change it.
2012-09-05 12:13:51.604 Autoradio[5729:707] PreferredBufferSize is now on the correct setting of 0.128000.
2012-09-05 12:13:51.609 Autoradio[5729:707] preferredSampleRateCheck is incorrect, we will change it.
2012-09-05 12:13:51.698 Autoradio[5729:707] preferred hardware sample rate is now on the correct setting of 16000.000000.
Could this result in the reduction?
It’s hard to describe, maybe because I am not a musician. I would say everything sounds duller. Could it be a lowering of the bitrate?
hohlParticipant
Doesn’t matter anymore. After your last comment I found out that the framework folder must be included in flat form (using groups instead of folder references). Now it works.
Thanks for the support.
hohlParticipant
Oh, I mixed it up with the grammar model. All I have in my resources is: http://cl.ly/image/0b3m3C2L0q37 (which is just the whole framework folder).
hohlParticipant
I already had [OpenEarsLogging startOpenEarsLogging];
And setting verbosePocketSphinx doesn’t change anything. It just starts listening and then crashes.
The acoustic model and language model are generated dynamically, so they shouldn’t be missing.