I am debugging an application which runs the language model creation in the background, but sometimes the main thread gets blocked while the creation is running, even though the language model creator is launched in a background thread. I am not yet sure whether there is some failure in my implementation or whether it's something internal to OpenEars?
While debugging the application without finding anything obviously blocking, I've started to suspect that the blocking is caused by I/O to the device's flash memory. Does OpenEars make heavy use of the disk, which might block other resources (such as an SQLite database) from loading? What would be a good approach to throttling the disk I/O usage of OpenEars?
My other question is about the kind of multithreading. If you've been using NSThread, have you experimented with doing it with a block (or even a timer)? I do a lot of multithreaded coding, and I used to use NSThread but have entirely switched over to blocks/GCD because it's so much cleaner.
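For reference, here's a minimal sketch of the blocks/GCD pattern I mean, written in Swift. `generateLanguageModel(from:)` is a hypothetical stand-in for whatever language model creation call you're actually making, not a real OpenEars method:

```swift
import Foundation

// Hypothetical stand-in for the actual language model creation call;
// substitute whatever LanguageModelGenerator invocation you really use.
func generateLanguageModel(from words: [String]) {
    // ...heavy parsing and disk I/O happens here...
}

// GCD version: the work runs on a background queue, and only the
// completion step explicitly hops back to the main queue.
func generateInBackground(words: [String]) {
    DispatchQueue.global(qos: .utility).async {
        generateLanguageModel(from: words)
        DispatchQueue.main.async {
            // Update UI or notify observers here; this is the only part
            // that touches the main thread.
            print("Language model generation finished")
        }
    }
}
```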
Maybe it is because the artist names aren't classic English words? When using OpenEarsLogging I get log entries about the fallback method being used for nearly all the words in the created language model. ( https://www.sourcedrop.net/4Loa58d7ba3b3 )
How much work is it to create such a phonetic dictionary? Would it help to create such a dictionary for artist names?
1. Source some kind of comprehensive list of the most popular tracks, with whatever cutoff for popularity and timeframe makes sense for you. Feed the whole thing into LanguageModelGenerator using the Simulator and save everything that triggers the fallback method.
2. Attempt to pick all the low-hanging fruit by identifying any compound nouns which consist of words that are actually in the master phonetic dictionary, just not joined with each other (a naive way to check this is shown in the sketch after this list).
3. Do the tedious step with the remaining words of transcribing their phonemes using the same phoneme system found in cmu07a.dic.
4. Add them all to cmu07a.dic.
5. Sort cmu07a.dic alphabetically (this is very important), then use it.
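If it's useful, here is a minimal sketch of how steps 2, 4 and 5 could be scripted, assuming cmu07a.dic follows the usual CMU convention of one entry per line: a word, whitespace, then its ARPAbet phonemes. The path, the compound check and the appended entry are illustrative assumptions, not a tested tool:

```swift
import Foundation

// Illustrative path; point this at your actual copy of cmu07a.dic.
let dictionaryURL = URL(fileURLWithPath: "/path/to/cmu07a.dic")
var entries = try! String(contentsOf: dictionaryURL, encoding: .utf8)
    .split(separator: "\n")
    .map(String.init)

// Index the words already in the master dictionary so that compound
// candidates can be tested against it (step 2).
let knownWords = Set(entries.compactMap {
    $0.components(separatedBy: .whitespaces).first?.uppercased()
})

// Naive compound check: try every split point and see whether both
// halves are already known words.
func isCompoundOfKnownWords(_ word: String) -> Bool {
    let upper = word.uppercased()
    guard upper.count > 1 else { return false }
    for offset in 1..<upper.count {
        let splitIndex = upper.index(upper.startIndex, offsetBy: offset)
        if knownWords.contains(String(upper[..<splitIndex])),
           knownWords.contains(String(upper[splitIndex...])) {
            return true
        }
    }
    return false
}

// Step 2: test a fallback word before bothering to hand-transcribe it.
print(isCompoundOfKnownWords("DAFTPUNK")) // true only if both halves are known

// Steps 3-4: append hand-transcribed entries. These phonemes are a
// placeholder, not a verified transcription.
entries.append("SOMEBAND\tS AH M B AE N D")

// Step 5: sort alphabetically before writing the dictionary back out.
entries.sort()
try! entries.joined(separator: "\n")
    .write(to: dictionaryURL, atomically: true, encoding: .utf8)
```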
With some good choices about your source data you could make a big impact on the processing time. The rest of it might have to be handled via UX methods such as further restricting the number of songs parsed or only using songs which are popular with the user, and informing the user in cases where this could lead to confusion.
Remember that for Deichkind and similar, you aren't just dealing with an unknown word but also a word that will be pronounced with phonemes which aren't present in the default acoustic model, since it is made from North American speech. There's nothing like that "ch" sound in English, at least as it is pronounced in Hochdeutsch. If your target market is German-speaking and you expect some German-named bands in there, it might be worthwhile to think about doing some acoustic model adaptation to German-accented speech. I just did a blog post about it.