I am debugging an application which runs the language model creation in the background, but sometimes the main thread gets blocked while the creation is running, even though the language model creator is launched in a background thread. I am not yet sure whether there is some failure in my implementation or whether it's something internal to OpenEars?
While debugging the application without finding anything obviously blocking, I've started to suspect that the blocking is caused by I/O to the device's flash memory. Does OpenEars make heavy use of the disk, which might block other resources (such as an SQLite database) from loading? What would be a good approach to throttling the disk I/O usage of OpenEars?
My other question is about the kind of multithreading. If you've been using NSThread, have you experimented with doing it with a block (or even a timer)? I do a lot of multithreaded coding, and I used to use NSThread but have entirely switched over to blocks/GCD because it's so much cleaner.
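For reference, here's a minimal sketch of the blocks/GCD pattern I mean, written in Swift. `generateLanguageModel(from:)` is a hypothetical stand-in for whatever language model creation call you're actually making, not a real OpenEars method:

```swift
import Foundation

// Hypothetical stand-in for the actual language model creation call;
// substitute whatever LanguageModelGenerator invocation you really use.
func generateLanguageModel(from words: [String]) {
    // ...heavy parsing and disk I/O happens here...
}

// GCD version: the work runs on a background queue, and only the
// completion step explicitly hops back to the main queue.
func generateInBackground(words: [String]) {
    DispatchQueue.global(qos: .utility).async {
        generateLanguageModel(from: words)
        DispatchQueue.main.async {
            // Update UI or notify observers here; this is the only part
            // that touches the main thread.
            print("Language model generation finished")
        }
    }
}
```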
Maybe it is because the artist names aren't classic English words? When using OpenEarsLogging I get log entries about the fallback method being used for nearly all the words in the created language model. ( https://www.sourcedrop.net/4Loa58d7ba3b3 )
How much work is it to create such a phonetic dictionary? Would it help to create such a dictionary for artist names?
1. Source some kind of comprehensive list of the most popular tracks, with whatever cutoff for popularity and timeframe makes sense for you. Feed the whole thing into LanguageModelGenerator using the Simulator and save everything that triggers the fallback method.
2. Attempt to pick all the low-hanging fruit by identifying any compound nouns which consist of words that are actually in the master phonetic dictionary, just not joined with each other (a naive way to check this is shown in the sketch after this list).
3. Do the tedious step with the remaining words of transcribing their phonemes using the same phoneme system found in cmu07a.dic.
4. Add them all to cmu07a.dic.
5. Sort cmu07a.dic alphabetically (this is very important), then use it.
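If it's useful, here is a minimal sketch of how steps 2, 4 and 5 could be scripted, assuming cmu07a.dic follows the usual CMU convention of one entry per line: a word, whitespace, then its ARPAbet phonemes. The path, the compound check and the appended entry are illustrative assumptions, not a tested tool:

```swift
import Foundation

// Illustrative path; point this at your actual copy of cmu07a.dic.
let dictionaryURL = URL(fileURLWithPath: "/path/to/cmu07a.dic")
var entries = try! String(contentsOf: dictionaryURL, encoding: .utf8)
    .split(separator: "\n")
    .map(String.init)

// Index the words already in the master dictionary so that compound
// candidates can be tested against it (step 2).
let knownWords = Set(entries.compactMap {
    $0.components(separatedBy: .whitespaces).first?.uppercased()
})

// Naive compound check: try every split point and see whether both
// halves are already known words.
func isCompoundOfKnownWords(_ word: String) -> Bool {
    let upper = word.uppercased()
    guard upper.count > 1 else { return false }
    for offset in 1..<upper.count {
        let splitIndex = upper.index(upper.startIndex, offsetBy: offset)
        if knownWords.contains(String(upper[..<splitIndex])),
           knownWords.contains(String(upper[splitIndex...])) {
            return true
        }
    }
    return false
}

// Step 2: test a fallback word before bothering to hand-transcribe it.
print(isCompoundOfKnownWords("DAFTPUNK")) // true only if both halves are known

// Steps 3-4: append hand-transcribed entries. These phonemes are a
// placeholder, not a verified transcription.
entries.append("SOMEBAND\tS AH M B AE N D")

// Step 5: sort alphabetically before writing the dictionary back out.
entries.sort()
try! entries.joined(separator: "\n")
    .write(to: dictionaryURL, atomically: true, encoding: .utf8)
```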
With some good choices about your source data you could make a big impact on the processing time. The rest of it might have to be handled via UX methods such as further restricting the number of songs parsed or only using songs which are popular with the user, and informing the user in cases where this could lead to confusion.
Remember that for Deichkind and similar, you aren't just dealing with an unknown word but also a word that will be pronounced with phonemes which aren't present in the default acoustic model, since it is made from North American speech. There's nothing like that "ch" sound in English, at least as it is pronounced in Hochdeutsch. If your target market is German-speaking and you expect some German-named bands in there, it might be worthwhile to think about doing some acoustic model adaptation to German-accented speech. I just did a blog post about it.