Frequently Asked Questions/Support
If you have trouble with some aspect of using OpenEars™ and you have carefully re-read the documents and examined the example app without it helping, you can ask a question in the OpenEars forum (please turn on OELogging and, if the issue relates to recognition, OEPocketsphinxController’s verbosePocketsphinx property before posting an issue so I have some information to troubleshoot from). The forum is a place to ask questions free of charge, but free private email support is not given for OpenEars. However, you can purchase a support incident or contract if you would like to discuss a question via private email (note: email support purchases are temporarily not available, so please bring all questions to the forums).
FAQ
Q: Do OpenEars and its plugins support Swift?
A: Yes, OpenEars works great in Swift apps and it ships with a Swift 3 sample app to get you started. You can also use the OpenEars Swift 3 custom tutorial tool for a walkthrough on constructing your own app using OpenEars or any of its plugins. Support is given for Swift 3 integrations at the forums.
Q: Where are the changelogs for OpenEars™ and its plugins?
A: There is a unified changelog for OpenEars and all of its plugins here, which offers a couple of different options to subscribe so you can always stay up to date.
Q: How do I update a demo plugin? How do I update a registered plugin?
A: To update a demo plugin, download the new version using the link you were originally sent in your demo request email. To update a registered plugin, visit the licensee site at the link sent to you via email when you first purchased your registered plugin.
Q: Do the licensed plugins have more/different features than the demo plugins?
A: No, they have the same features and behavior, but the licensed plugins are legally allowed to be used in an App Store app and they don’t have a timeout after a few minutes.
Q: I am open-sourcing my app or otherwise working in a public repo or providing my code for others to use. Can I include the plugins in some form?
A: Sorry, the plugins can’t be redistributed anywhere, licensed or demo.
Q: I’d like to recognize exact phrases or exact words, or define a rules-based grammar for recognition. Can I do this with OpenEars or the plugins?
A: Yes, you can do this with regular OpenEars using the new API for dynamically generating rules-based grammars at runtime (this is the best way to identify fixed phrases with the words in a certain order). If you need grammars which have faster response times than JSGF or which are compatible with RapidEars, you can also try the new plugin RuleORama, which uses the same API to output a new format that is as fast to recognize as OpenEars’ language models and compatible with RapidEars.
Q: OpenEars recognizes noises or random spoken words as words in my vocabulary, and I want to reduce this.
A: Rejecto is designed to deal with this issue (it is called the out-of-vocabulary problem) which affects all speech recognition with optimized smaller language models. Before trying Rejecto out, please make sure you aren’t testing on the Simulator since the issue with noises being recognized as speech is much worse on the Simulator than a real device that users will use. Across-the-board noise reduction can be achieved by increasing the value of vadThreshold – please read the next FAQ entry for more on this, even if you are using the English acoustic model.
Q: I’m trying to use a non-English acoustic model and recognition results are mixed out of the box.
A: OpenEars ships with the standard vadThreshold setting for the English model, so for good recognition results each non-English acoustic model needs its own ideal vadThreshold setting, found via some experimentation. The vadThreshold setting controls the cutoff level between speech and non-speech when listening: with too low a value, too much incidental noise will be processed as if it were speech, and with too high a value, real speech can be ignored. For this reason, if you don’t test and change the vadThreshold setting to one appropriate for your app, recognition quality will be impaired. When experimenting, it’s recommended to increase or decrease vadThreshold by only .1 or maybe .5 at a time. Set it a bit higher to reject more unwanted speech and set it lower to process sounds more readily and attempt to detect speech within them. Find the right setting here before adding Rejecto to a project, since Rejecto is intended to refine these results. English-language projects will also benefit from some testing to find an ideal vadThreshold level.
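As a rough illustration of the stepwise experimentation described above, this Swift sketch simply lists candidate values to try one at a time on a real device. The function name, the starting value of 2.0, and the number of steps are assumptions chosen for illustration, not OpenEars defaults or API; each candidate would then be assigned to vadThreshold in the way the OEPocketsphinxController documentation describes.

```swift
import Foundation

// Hypothetical helper (not part of OpenEars): lists vadThreshold values to try,
// one test run per value, moving a small step at a time around a starting point.
func vadThresholdCandidates(around start: Double,
                            step: Double = 0.1,
                            stepsEachWay: Int = 3) -> [Double] {
    return (-stepsEachWay...stepsEachWay).map { offset in
        let value = start + Double(offset) * step
        // Round to one decimal place so the list prints cleanly;
        // this assumes steps no finer than 0.1.
        return (value * 10).rounded() / 10
    }
}

// The starting value 2.0 is an assumed example, not a documented default.
for candidate in vadThresholdCandidates(around: 2.0) {
    print("Try vadThreshold = \(candidate)")
}
```

Testing each candidate against the same set of spoken phrases and background noise makes the runs comparable.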
Q: Does this project support adapting its acoustic models?
A: No, sorry, this project has never supported acoustic model adaptation. Before OpenEars 2.5 this project provided links to external adaptation instructions with the proviso that this was unsupported. Since OpenEars 2.5, which moved to a custom acoustic model bundle format to enable more languages than English, even these links have been removed, since it is unlikely that adaptation would be successful with the new models. There is no way to assist with adaptation needs – it is important to test with OpenEars and the plugin demos that the shipping models work for your requirements.
Q: I followed the tutorial and I’m sure that I did every step, but I’m getting an error similar to “‘Slt/Slt.h’ file not found”.
A: Please start by upgrading to Xcode 8 or later since it is the least buggy in this area. When you add the framework/s, remember to check the box that says “Copy items into destination group’s folder (if needed)”, or you may receive errors that header files can’t be found in frameworks which were added. If the issues persist, take a look at what is found in your Framework Search Paths build setting for the app, since it is this entry where bugs in this feature manifest. If the path to the framework is missing, add it (there is more specific info on this in the tutorial tool) or if it has extraneous quotes or backslashes, remove them.
Q: Can I use OpenEars without defining a vocabulary? Can I use OpenEars to recognize any word the user might say?
A: OpenEars only works with vocabularies you define in advance – from the documentation: “Highly-accurate large-vocabulary recognition (that is, trying to recognize any word the user speaks out of many thousands of known words) is not yet a reality for local in-app processing on a small handheld device given the hardware limitations of the platform; even Siri does its large-vocabulary recognition on the server side. However, Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition of vocabularies with hundreds or even thousands of words depending on the environment and other factors, and performs very well with medium-sized language models (vocabularies).”
Q: I just tried the tutorial and OEPocketsphinxController didn’t understand the words that I said.
A: 95% of the time, this is either because you were saying words which aren’t in the vocabulary that OEPocketsphinxController is listening for, meaning that it doesn’t have a way of recognizing those words, or you are testing recognition on the Simulator. Take a look at which words the app is listening for and test recognition of those words, and make sure to test on a real device. It can also be very damaging to recognition accuracy to have a misspelled word in your vocabulary array, since OELanguageModelGenerator will not be able to successfully look it up in the pronunciation dictionary and will have to make its best guess, which may be different from what you or a user is saying to the device.
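Since a misspelled vocabulary word can hurt accuracy, it may help to check a vocabulary array for unknown words before handing it to OELanguageModelGenerator. The following is only a sketch under stated assumptions: the function is hypothetical, and the pronunciation dictionary is stood in for by a plain Set of correctly spelled uppercase words. It does not reproduce how OELanguageModelGenerator performs its own lookup.

```swift
// Hypothetical pre-flight check (not an OpenEars API): reports vocabulary words
// that are absent from a set of known, correctly spelled words.
func unknownWords(in vocabulary: [String], knownWords: Set<String>) -> [String] {
    var unknown: [String] = []
    for phrase in vocabulary {
        // A vocabulary entry may be a multi-word phrase, so check each word separately.
        for piece in phrase.uppercased().split(separator: " ") {
            let word = String(piece)
            if !knownWords.contains(word) && !unknown.contains(word) {
                unknown.append(word)
            }
        }
    }
    return unknown
}

// Assumed sample data for illustration only; "RIHGT" is a deliberate typo.
let known: Set<String> = ["CHANGE", "MODEL", "LEFT", "RIGHT"]
let vocabulary = ["CHANGE MODEL", "LEFT", "RIHGT"]
print(unknownWords(in: vocabulary, knownWords: known))
```

A word flagged this way is not necessarily wrong (it may be a name or an unusual term), but it is worth a second look before testing recognition.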
Q: But I want to write an app that uses different words from the ones in the sample app.
A: OELanguageModelGenerator is the class in OpenEars which lets you define which words to listen for. OpenEars works by creating a specific vocabulary to listen for. The tutorial explains how to create your own vocabulary and there are also examples of creating custom vocabularies in the sample app.
Q: I’m trying to use a sound framework like Finch or another OpenAL wrapper and things aren’t working as expected.
A: OEPocketsphinxController has very specific audio session and audio unit requirements and it can’t be run simultaneously with another framework which requires control over the audio session and audio input. I’m not aware of any other frameworks with specific audio session requirements which are able to fulfill their audio functionality while another audio framework with different session requirements is running simultaneously.
Q: I’m not using an audio framework as in the previous question, but I am getting strange results that aren’t being reported elsewhere.
A: This is likely to be for the same reasons as described in the previous section – something in the app is making changes to the audio session or audio settings in a way that conflicts with the settings that OpenEars needs to be able to rely on and manage itself. This often happens due to using a framework where it isn’t obvious that it makes audio session changes (I believe Unity does this, as do other speech recognition SDKs) and it can also happen as a result of copy/pasting Cocoa audio SDK code which includes an extraneous AVAudioSession call (for whatever reason, there was a fad for Stack Overflow answers about audio code to start with AVAudioSession calls without a clear need for it so there is a misconception that they should prepend any audio code). If you are getting unusual audio results, the most important troubleshooting step is to search your app for any calls to AVAudioSession or low-level “audiosession” and turn them off, and to remove third-party SDKs that have any relationship to audio processing (i.e. another speech SDK, a game development platform, etc). Even if the code you are turning off is needed by your app, it is important to know when seeking support here that you are seeking help with an audio coexistence conflict, and not to report the behavior as a bug without sharing the important information that it is happening as a result of an audio coexistence conflict that is not expected to work.
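The search step described above can be done with Xcode’s project-wide find, but as a minimal sketch of the idea, the following Swift function scans source text for anything mentioning an audio session. The function name and the sample input are assumptions for illustration; the case-insensitive match on “audiosession” is intended to catch both AVAudioSession calls and the older C-level AudioSession functions.

```swift
import Foundation

// Hypothetical helper: returns the 1-based line numbers of source lines that
// mention an audio session (case-insensitive match on "audiosession").
func audioSessionLines(in source: String) -> [Int] {
    var hits: [Int] = []
    for (index, line) in source.components(separatedBy: "\n").enumerated() {
        if line.lowercased().contains("audiosession") {
            hits.append(index + 1)
        }
    }
    return hits
}

// Assumed sample source text for illustration only.
let sample = [
    "import AVFoundation",
    "let session = AVAudioSession.sharedInstance()",
    "print(\"hello\")",
    "AudioSessionInitialize(nil, nil, nil, nil)"
].joined(separator: "\n")

print(audioSessionLines(in: sample))
```

This only finds calls in code you can read; audio session changes made inside a compiled third-party SDK will not show up, which is why removing such SDKs is a separate troubleshooting step.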
Q: I have a Bluetooth device that is giving unexpected results.
A: Unfortunately, different Bluetooth devices do seem to have different levels of compatibility with Cocoa audio APIs, although it is meant to be a standard. Some devices aren’t compatible with non-Apple apps at all, while others can do playback with 3rd-party apps but not recording; this may come down to buffering behavior in the hardware. This is the entire reason that Bluetooth support in OpenEars is marked as experimental. Before making a support request for a Bluetooth issue, please first search for current discussions about Bluetooth devices in the forums, and also make sure you are aware of and have tried out the post-2.5.x Bluetooth compatibility methods in OpenEars, disablePreferredSampleRate, disablePreferredBufferSize, and disablePreferredChannelNumber, which will solve most issues (to date, they have helped with all reported Bluetooth issues). Please read the OEPocketsphinxController documentation in order to learn about these methods and do a forum search for Bluetooth in order to see more granular discussion of them.
For more in-depth reporting of issues, in order to first test whether there should be any expectation of your device working with OpenEars, verify that it can do low-latency recording with any third party app – the best way to check this is to see whether it works with a non-Apple app that does realtime voice chat or voice calls. If you are sure that low-latency recording works with some apps, it isn’t possible to do troubleshooting of issues via the forums since it involves unknown hardware, but you can give Politepix one of the devices (either your own example of the device sent by post, or a new or used one via Amazon – you can use the contact form for more info), and Politepix will take a look at whether there is something straightforward that can be done in order to expand support (this is unfortunately neither a commitment to support the device, nor a commitment to invest a long period of troubleshooting in the specific device, but it will definitely be looked at and tested with the goal of finding out why it doesn’t work or getting it to work if that is straightforward).
Given the huge variety of devices, their different behavior, their expense, and the fact that previously verified-to-be-working hardware may change behavior with different hardware or software versions or iOS version changes, it is not prudent for Politepix to attempt to maintain its own testbed of Bluetooth devices, and it is unfortunately also not possible to commit to in-depth troubleshooting with the goal of supporting every device or any one device. This offer is based on time availability so it may not always be possible to undertake, and is intended to refer to a single hardware example at a time from an independent app developer/producer/company which makes apps.
Q: When I license an OpenEars plugin (not OpenEars, one of its plugins), the license is for one app. Does that mean I need a license for each app user?
A: No, the license is for the app itself, so you need one license for one listing in the App Store. No matter how many users your app gets, it’s just one license needed, and Politepix hopes you get a whole lot.
Q: My app crashes when listening starts.
A: The exact reason for this will always be reported by logging once you turn on OELogging and verbosePocketsphinx. Both are described in the documentation. If the logging doesn’t explain it clearly enough to fix, it is fine to show the complete logging in a question in the forums and ask for help.
Q: There is a bug on the Simulator/recognition isn’t good on the Simulator.
A: OpenEars has a low-latency audio driver written using the Audio Unit API which requires an Audio Session setting in order to work, and it isn’t supported by the Simulator. Because it can be slow to debug app logic without using the Simulator, OpenEars has a fallback audio approach that is compatible with the Simulator. However, it isn’t as good as the device approach and very little time has been spent trying to debug it since it is only provided as a nicety. With that understanding, please don’t evaluate OpenEars’ accuracy or behavior based on the Simulator, since it uses a completely different audio driver, and please don’t report Simulator-only bugs since there’s no way to fairly allocate resources towards fixing Simulator-only audio issues when no users run apps on the Simulator.
Q: If I purchase RapidEars, will OpenEars be able to recognize anything a user says?
A: No, RapidEars does exactly the same small-vocabulary offline recognition that OpenEars does, but it does it in realtime on speech that is still in-progress rather than having to wait for the user to pause for a second before beginning to do recognition. That’s pretty cool, actually! Both OpenEars and RapidEars are recommended for use with vocabularies that are fewer than 1000-2000 words.
Q: I’m using RapidEars or OpenEars with an acoustic model that I made or downloaded elsewhere and I’m getting the following unexpected results…
A: Politepix can only support the acoustic models that it ships, since it can only test against these models.
Q: I’m getting a linker error with RapidEars, NeatSpeech, Rejecto, SaveThatWave, or another plugin — what should I do?
A: It is necessary to set the linker flag -ObjC for your target when using the plugins. If this isn’t the issue, it is otherwise always due to the fact that the plugin requires a certain version of OpenEars or later, and you are using an old version of OpenEars, or an earlier version is still somehow linked to your project, so update to the current version of OpenEars. In the case of NeatSpeech, it is also necessary to give extra attention to this step from the instructions: “For the last step, change the name of the implementation source file in which you are going to call NeatSpeech methods from .m to .mm (for instance, if the implementation is named ViewController.m, change its name to ViewController.mm and verify in the Finder that the name of the file has changed) and then make sure that in your target Build Settings, under the section “C++ Standard Library”, the setting “libstdc++ (Gnu C++ standard library)” is selected. If you receive errors like “Undefined symbols for architecture i386: std::basic_ios >::widen(char) const”, that means that this step needs special attention.” In the Swift 3 version, this involves instead adding a C++ library and is equally important.
Q: I have tried a fix for a known issue which others have been able to solve definitively, but it doesn’t work for me.
A: This is often solved by cleaning your project before testing again.
Q: What license does OpenEars use?
A: There are actually selections from five libraries in use by OpenEars-enabled projects, only one of which is the OpenEars framework, and you can see their licenses (which are commercial-friendly) in their header files in the distribution. OpenEars is licensed under the Politepix Public License version 1.0. It gives you the right to use OpenEars to make apps for the App Store. You have some obligations (such as crediting the libraries involved, including OpenEars, either in your app or on its web page) so please read the license.
Q: Can I or should I reference OpenEars in my support/marketing/etc materials?
A: I’d love it if you want to talk about OpenEars in your marketing! If you want to discuss it in your support documents, just please do so in a way that it doesn’t cause any confusion for your endusers about where to seek support (i.e. it must be clear that you are responsible for supporting your app) and it doesn’t imply an endorsement of your app by Politepix or any of the maintainers of the libraries that OpenEars links to (unless one of those parties does actively endorse your project!).
Q: How can I trim down the size of the final binary for distribution?
A: There are instructions on doing this here.
Q: But the framework is very large and I don’t want all of that file size added to my app.
A: The framework is not added to your app. It is a static framework and only the parts of its code you link to are added to your app in binary form. The size of the framework is not related to the eventual size of your app; it only represents the overall size of all of the source that is inside of the framework. Not all of the source in the framework is even available via OpenEars’ API, so there is no scenario in which it is possible for your app to link to all of the code in the framework and cause the size of the part of your app binary which addresses the code in the framework to become as large as the framework.
Q: But after I added OpenEars my app size got much larger.
A: If you have multiple architectures and you have bitcode on, this will be the result for you when you link to any significantly complex framework; you may only be linking to 12MB of compiled code but it is multiplied several times in the archive created due to all of the slices present. This is an app architecture issue on our platform, but to be honest, it isn’t a very important one in an era in which even a single photo taken by an iPhone is 3-12MB in size. Rather than extensively try to optimize the size of the app performing offline recognition in order to save the size of 1-4 photos on the phone, it is probably a healthier perspective to compare it to the data that will be saved from going over the network, which would exceed the amount added to your app in a relatively brief period of active use.
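To make the multiplication concrete, here is a back-of-the-envelope sketch in Swift. The 12MB figure comes from the answer above; the slice count of 3 is purely an assumed example, and the function ignores any additional overhead such as bitcode, so treat the result as a rough lower-bound estimate rather than a measurement.

```swift
// Rough estimate only: assumes every architecture slice in the archive contains
// a similarly sized copy of the linked code, and ignores bitcode overhead.
func estimatedArchiveContributionMB(linkedCodeMB: Double, sliceCount: Int) -> Double {
    return linkedCodeMB * Double(sliceCount)
}

// 12MB of linked code (from the answer above) across an assumed 3 slices.
let estimate = estimatedArchiveContributionMB(linkedCodeMB: 12, sliceCount: 3)
print("Estimated contribution to the archive: \(estimate)MB")
```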
Q: I thought that this version of OpenEars supported the -all_load linker flag, but I’m getting a duplicate symbol error when I use OpenEars with the flag enabled.
A: Starting with OpenEars and plugins version 1.64, the -all_load linker flag is no longer supported and using it will prevent building. Any use of -all_load that another library requires can be substituted with -force_load and a reference to that library only, and this has been the case since early versions of Xcode 4, so there is no reason to use -all_load.
Q: Have any apps ever been rejected for using OpenEars?
A: I have never heard of any apps being rejected for using OpenEars, and I wouldn’t expect them to be since I’ve taken care to make sure OpenEars doesn’t do anything questionable, and where I’ve had any questions I’ve just written Apple and asked them for guidance directly. There is a very long list of apps that were (unsurprisingly) accepted that used OpenEars so it is fine to use OpenEars. I have heard of two apps in the last three years being rejected that linked to OpenEars, but they were not rejected because they linked to OpenEars or because of anything related to OpenEars, but because of other details of the apps that did not originate with OpenEars.
Something that is quite important as of iOS 7 for easy, painless app acceptance is that when you obtain a device capability permission, it is necessary to make it clear to the user what the permission is being used for — there can’t be any “stealth” usage of a device capability happening without it being transparent to the user. This is a great, positive development since we want to be building a user-respecting, forthright platform where users have a basis for trusting their apps. What that means in practice is if you perform speech recognition, and the user is asked to give microphone permission, there has to be some kind of explanation or indication in the app UI or description or introductory text that speech recognition is performed in the app. If you ask for mic permission and then perform speech recognition but there is nothing in the UI that would indicate that recognition is being performed, Apple will probably ask you to improve that so that the user knows what the mic stream is used for. OpenEars gives you UI hooks such as the decibel levels of incoming and outgoing speech so that it is easy for you to build a UI, but it isn’t a UI framework, so questions like how to best show the user what is being done with the mic stream are outside of the support that is given here, but I wanted to mention that this is something that you need to consider for your app now that there is a permission system and an Apple UI guideline for use of capabilities with permission.
Q: I still have a question, how do I get more support?
A: You can always ask for help in the forums and I’ll do my best to answer your question. Please turn on OELogging and (if the issue relates to recognition) OEPocketsphinxController’s verbosePocketsphinx property before posting an issue so I have some information to troubleshoot from. Free private email support is not given for OpenEars, but you can purchase a support incident if you would like to discuss a question via private email (note: email support purchases are temporarily not available, so please bring all support questions to the forums). Forum support is free. Other emails regarding OpenEars (i.e. not support requests) can be sent via the contact form.
Q: What kinds of questions can I ask in the forums?
A: Questions about using the APIs available in OpenEars and its plugins to implement speech recognition or speech synthesis in native Objective-C apps in ways that are currently possible with their use (or questions about whether an application is currently possible), in order to get the best results with OpenEars and its plugins as they currently function. Sorry, it isn’t possible to help with the development of novel software or hardware techniques in the course of tech support for the OpenEars SDK. Relatedly, just like with the compiled software products of other firms, tech support doesn’t supply information about how the binary products were implemented and/or their internal implementation details.
Q: Why was a question, or a reply, or an account removed from the forums?
A: Posts and replies might be closed or removed if they are off-topic, too unclear to be able to help with, lacking information required for debugging after it has been requested a couple of times, or bringing an unconstructive tone to the troubleshooting process. Accounts will be removed if their posts habitually have these characteristics. Constructive questions may also be closed (but not removed) when they cover ground already comprehensively discussed in other questions, the documents, or the FAQ, or regard an unsupported feature, or in which no further answer would be possible to provide even if discussion were to continue, in order to preserve the time available for development and support rather than as any form of critique of the content of the question.
Q: Can I hire you to create an OpenEars-enabled app for me or adapt OpenEars, or consult on a speech project using OpenEars?
A: Sorry, Politepix does not do any consulting or contracting.
Q: Anything else?
A: Politepix would like to take this opportunity to thank the CMU Sphinx project for all of its excellent work, and Nickolay Shmyrev very specifically for answering many questions from this project.