Speech to Text on Android
I am looking to create an app which has Speech to text.
I am aware of this kind of ability using the RecognizerIntent: http://android-developers.blogspot.com/search/label/Speech%20Input
However - I do not want a new Intent to be popped up, I want to do the analysis a certain points in my current app, and I dont want it to pop something up stating that it is currently attempting to record your voice.
Has anybody any ideas on how best to do this. I was perhaps thinking of trying Sphinx 4 - but I dont know if this would be able to run on Android - has anyone got any advice or experience?!
I was wondering if I could alter the code here to perhaps not bothering to show the UI or button and just do the processing: http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html
Cheers,
Solution 1:
If you don't want to use the RecognizerIntent
to do speech recognition, you could still use the SpeechRecognizer
class to do it. However, using that class is a little bit more tricky than using the intent. As a final note, I would highly suggest to let the user know when he is recorded, otherwise he might be very set up, when he finally finds out.
Edit: A small example inspired (but changed) from, SpeechRecognizer causes ANR... I need help with Android speech API
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
"com.domain.app");
SpeechRecognizer recognizer = SpeechRecognizer
.createSpeechRecognizer(this.getApplicationContext());
RecognitionListener listener = new RecognitionListener() {
@Override
public void onResults(Bundle results) {
ArrayList<String> voiceResults = results
.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
if (voiceResults == null) {
System.out.println("No voice results");
} else {
System.out.println("Printing matches: ");
for (String match : voiceResults) {
System.out.println(match);
}
}
}
@Override
public void onReadyForSpeech(Bundle params) {
System.out.println("Ready for speech");
}
/**
* ERROR_NETWORK_TIMEOUT = 1;
* ERROR_NETWORK = 2;
* ERROR_AUDIO = 3;
* ERROR_SERVER = 4;
* ERROR_CLIENT = 5;
* ERROR_SPEECH_TIMEOUT = 6;
* ERROR_NO_MATCH = 7;
* ERROR_RECOGNIZER_BUSY = 8;
* ERROR_INSUFFICIENT_PERMISSIONS = 9;
*
* @param error code is defined in SpeechRecognizer
*/
@Override
public void onError(int error) {
System.err.println("Error listening for speech: " + error);
}
@Override
public void onBeginningOfSpeech() {
System.out.println("Speech starting");
}
@Override
public void onBufferReceived(byte[] buffer) {
// TODO Auto-generated method stub
}
@Override
public void onEndOfSpeech() {
// TODO Auto-generated method stub
}
@Override
public void onEvent(int eventType, Bundle params) {
// TODO Auto-generated method stub
}
@Override
public void onPartialResults(Bundle partialResults) {
// TODO Auto-generated method stub
}
@Override
public void onRmsChanged(float rmsdB) {
// TODO Auto-generated method stub
}
};
recognizer.setRecognitionListener(listener);
recognizer.startListening(intent);
Important: Run this code from the UI Thread, and make sure you have required permissions.
<uses-permission android:name="android.permission.RECORD_AUDIO" />
Solution 2:
What is built into Android (that you launch via the intent) is a client activity that captures your voice and sends the audio to a Google server for recognition. You could build something similar. You could host sphinx yourself (or use cloud recognition services like Yapme.com), capture the voice yourself, send the audio to a recognizer, and get back text results to your app. I don't know of a way to leverage the Google recognition services without use of the Intent on Android (or through Chrome).
The general consensus I've seen so far is that today's smartphones don't really have the horsepower to do Sphinx-like speech recognition. You may want to explore running a client recognizer for yourself, but Google uses server recognition.
For some related info see:
- Google's voice search speech recognition service
- Is it possible to use Android API outside an Android project?
- Speech Recognition API
Solution 3:
In your activity do the following:
Image button buttonSpeak = findView....;// initialize it.
buttonSpeak.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View v) {
promptSpeechInput();
}
});
private void promptSpeechInput() {
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
getString(R.string.speech_prompt));
try {
startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
} catch (ActivityNotFoundException a) {
Toast.makeText(getApplicationContext(),
getString(R.string.speech_not_supported),
Toast.LENGTH_SHORT).show();
}
}
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent
data) {
super.onActivityResult(requestCode, resultCode, data);
switch (requestCode) {
case REQ_CODE_SPEECH_INPUT: {
if (resultCode == RESULT_OK && null != data) {
result = data
.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
EditText input ((EditText)findViewById(R.id.editTextTaskDescription));
input.setText(result.get(0)); // set the input data to the editText alongside if want to.
}
break;
}
}
}