pod afFancomSapi
Fancom Classes for Microsoft Speech API (SAPI) 5.4
Mixins
SpeechConstants |
SpeechStringConstants |
Classes
ISpeechAudio |
Supports the control of real-time audio streams, such as those connected to a live microphone or telephone line. |
ISpeechAudioBufferInfo |
Defines the audio stream buffer information. |
ISpeechAudioStatus |
Provides control over the operation of real-time audio streams. |
ISpeechBaseStream |
Defines properties and methods for manipulating data streams. |
ISpeechDataKey |
Provides read and write access to the speech configuration database. |
ISpeechGrammarRule |
Defines the properties and methods of a speech grammar rule. |
ISpeechGrammarRuleState |
Presents the properties and methods of a speech grammar rule state. |
ISpeechGrammarRuleStateTransition |
Returns data about a transition from one rule state to another, or from a rule state to the end of a rule. |
ISpeechGrammarRuleStateTransitions |
Represents a collection of |
ISpeechGrammarRules |
Represents a collection of |
ISpeechLexicon |
Provides access to lexicons. |
ISpeechLexiconPronunciation |
Provides access to the pronunciations of a speech lexicon word. |
ISpeechLexiconPronunciations |
Represents a collection of ISpeechLexiconPronunciation objects. |
ISpeechLexiconWord |
Provides access to a lexicon word. |
ISpeechLexiconWords |
Represents a collection of |
ISpeechMMSysAudio |
Supports audio implementation for the standard Windows wave-in multimedia layer. |
ISpeechObjectTokens |
Represents a collection of |
ISpeechPhraseAlternate |
Enables applications to retrieve alternate phrase information from a speech recognition (SR) engine, and to update the SR engine's language model to reflect committed alternate changes. |
ISpeechPhraseAlternates |
Represents a collection of ISpeechPhraseAlternate objects. |
ISpeechPhraseElement |
Provides access to information about a word or phrase. |
ISpeechPhraseElements |
Represents a collection of |
ISpeechPhraseInfo |
Contains properties detailing phrase elements. |
ISpeechPhraseProperties |
Represents a collection of ISpeechPhraseProperty objects. |
ISpeechPhraseProperty |
Stores the information for a semantic property. |
ISpeechPhraseReplacement |
Specifies a replacement, or text normalization, of one or more spoken words in a recognition result. |
ISpeechPhraseReplacements |
Represents a collection of |
ISpeechPhraseRule |
Represents the part of a recognition result that returns information about the grammar rule that produced the recognition. |
ISpeechPhraseRules |
Represents a collection of |
ISpeechRecoContext |
Defines a recognition context. |
ISpeechRecoGrammar |
Enables applications to manage the words and phrases for the SR engine. |
ISpeechRecoResult |
Returns information about a recognition attempt. |
ISpeechRecoResult2 |
Returns information about a recognition attempt. |
ISpeechRecoResultTimes |
Contains the time information for speech recognition results. |
ISpeechRecognizer |
Represents a speech recognition engine. |
ISpeechRecognizerStatus |
Returns the status of the speech recognition (SR) engine represented by the recognizer object. |
ISpeechResourceLoader |
Gives applications control over loading resources. |
ISpeechVoiceStatus |
Defines the types of information returned by the SpVoice.Status method. |
ISpeechXMLRecoResult |
Gets recognition results from the ISpXMLRecoResult as an SML document. |
SPSEMANTICFORMAT |
Lists the various values of a grammar's tag-format attribute. |
SpAudioFormat |
Represents an audio format. |
SpCustomStream |
Supports the use of existing IStream objects in SAPI. |
SpFileStream |
Enables data streams to be read and written as files. |
SpInProcRecoContext |
Defines a recognition context. |
SpInProcRecognizer |
Represents a speech recognition engine. |
SpLexicon |
Provides access to lexicons. |
SpMMAudioIn |
Supports audio implementation for the standard Windows wave-in multimedia layer. |
SpMMAudioOut |
Supports audio implementation for the standard Windows wave-out multimedia layer. |
SpMemoryStream |
Supports audio stream operations in memory. |
SpObjectToken |
Represents an available resource of a type used by SAPI. |
SpObjectTokenCategory |
Represents a class of object tokens. |
SpPhoneConverter |
Supports conversion between phoneme symbols and phoneme IDs. |
SpPhraseInfoBuilder |
Provides the ability to rebuild phrase information from audio data saved to memory. |
SpSharedRecoContext |
Defines a recognition context. |
SpSharedRecognizer |
Represents a speech recognition engine. |
SpTextSelectionInformation |
Provides access to the text selection information pertaining to a word sequence buffer. |
SpUnCompressedLexicon |
Provides access to lexicons, which contain information about words that can be recognized or spoken. |
SpVoice |
The SpVoice object brings the text-to-speech (TTS) engine capabilities to applications using SAPI automation. |
SpWaveFormatEx |
Represents the format of waveform-audio data. |
SpeechDiscardType |
Lists flags indicating portions of a recognition result to be removed or eliminated once they are no longer needed. |
SpeechDisplayAttributes |
Lists the possible ways of displaying a word. |
SpeechEmulationCompareFlags |
Values of comparison options in emulation. |
SpeechLexiconType |
Lists the allowed lexicon types. |
SpeechRecoEvents |
Lists speech recognition (SR) events. |
SpeechRecognitionType |
Lists the types of speech recognition. |
SpeechRuleAttributes |
Lists the possible attributes of a grammar rule. |
SpeechVoiceEvents |
Lists the types of events which a text-to-speech (TTS) engine can send to an SpVoice object. |
SpeechVoiceSpeakFlags |
Lists flags that control the SpVoice.Speak method. |
Enums
SPCATEGORYTYPE |
Lists the different states of Speech Recognizer as categories. |
SPXMLRESULTOPTIONS |
Used to designate whether the main result or the alternates are desired. |
SpeechAudioFormatType |
Lists the supported stream formats. |
SpeechAudioState |
Lists the four possible audio input and output states. |
SpeechBookmarkOptions |
Lists bookmark options. |
SpeechDataKeyLocation |
Lists the top-level speech configuration database keys. |
SpeechEngineConfidence |
Specifies levels of confidence. |
SpeechFormatType |
Requests either the input format for the original audio source, or the format that actually arrives at the speech engine. |
SpeechGrammarRuleStateTransitionType |
Lists the types of transitions for the speech recognition engine. |
SpeechGrammarState |
Lists the possible states of a speech grammar. |
SpeechGrammarWordType |
The |
SpeechInterference |
Lists factors that can interfere with accurate recognition of speech input. |
SpeechLoadOption |
Lists the options available when loading a speech grammar. |
SpeechPartOfSpeech |
Lists the parts-of-speech categories used in SAPI. |
SpeechRecoContextState |
Lists the states of a recognition context. |
SpeechRecognizerState |
Lists the states of a Recognizer object. |
SpeechRetainedAudioOptions |
Lists the options for retaining data from an audio stream. |
SpeechRuleState |
Lists the states of a speech grammar rule. |
SpeechRunState |
Lists the running states of a TTS voice. |
SpeechSpecialTransitionType |
Lists special transitions for the speech recognition engine. |
SpeechStreamFileMode |
Lists the access modes of a file stream. |
SpeechStreamSeekPositionType |
Lists the types of positioning from which a Seek method can be performed. |
SpeechTokenContext |
Lists the context in which the code managing the newly created object runs. |
SpeechTokenShellFolder |
Lists possible locations storing token information. |
SpeechVisemeFeature |
Lists the features of phonemes and visemes. |
SpeechVisemeType |
Lists the visemes supported by the SpVoice object. |
SpeechVoicePriority |
Lists the possible Priority settings of an SpVoice object. |
SpeechWordPronounceable |
Lists the possible return values from the |
SpeechWordType |
Lists the change state of a word/pronunciation combination in a lexicon. |
All Types
- ISpeechAudio
- ISpeechAudioBufferInfo
- ISpeechAudioStatus
- ISpeechBaseStream
- ISpeechDataKey
- ISpeechGrammarRule
- ISpeechGrammarRuleState
- ISpeechGrammarRuleStateTransition
- ISpeechGrammarRuleStateTransitions
- ISpeechGrammarRules
- ISpeechLexicon
- ISpeechLexiconPronunciation
- ISpeechLexiconPronunciations
- ISpeechLexiconWord
- ISpeechLexiconWords
- ISpeechMMSysAudio
- ISpeechObjectTokens
- ISpeechPhraseAlternate
- ISpeechPhraseAlternates
- ISpeechPhraseElement
- ISpeechPhraseElements
- ISpeechPhraseInfo
- ISpeechPhraseProperties
- ISpeechPhraseProperty
- ISpeechPhraseReplacement
- ISpeechPhraseReplacements
- ISpeechPhraseRule
- ISpeechPhraseRules
- ISpeechRecoContext
- ISpeechRecoGrammar
- ISpeechRecoResult
- ISpeechRecoResult2
- ISpeechRecoResultTimes
- ISpeechRecognizer
- ISpeechRecognizerStatus
- ISpeechResourceLoader
- ISpeechVoiceStatus
- ISpeechXMLRecoResult
- SPCATEGORYTYPE
- SPSEMANTICFORMAT
- SPXMLRESULTOPTIONS
- SpAudioFormat
- SpCustomStream
- SpFileStream
- SpInProcRecoContext
- SpInProcRecognizer
- SpLexicon
- SpMMAudioIn
- SpMMAudioOut
- SpMemoryStream
- SpObjectToken
- SpObjectTokenCategory
- SpPhoneConverter
- SpPhraseInfoBuilder
- SpSharedRecoContext
- SpSharedRecognizer
- SpTextSelectionInformation
- SpUnCompressedLexicon
- SpVoice
- SpWaveFormatEx
- SpeechAudioFormatType
- SpeechAudioState
- SpeechBookmarkOptions
- SpeechConstants
- SpeechDataKeyLocation
- SpeechDiscardType
- SpeechDisplayAttributes
- SpeechEmulationCompareFlags
- SpeechEngineConfidence
- SpeechFormatType
- SpeechGrammarRuleStateTransitionType
- SpeechGrammarState
- SpeechGrammarWordType
- SpeechInterference
- SpeechLexiconType
- SpeechLoadOption
- SpeechPartOfSpeech
- SpeechRecoContextState
- SpeechRecoEvents
- SpeechRecognitionType
- SpeechRecognizerState
- SpeechRetainedAudioOptions
- SpeechRuleAttributes
- SpeechRuleState
- SpeechRunState
- SpeechSpecialTransitionType
- SpeechStreamFileMode
- SpeechStreamSeekPositionType
- SpeechStringConstants
- SpeechTokenContext
- SpeechTokenShellFolder
- SpeechVisemeFeature
- SpeechVisemeType
- SpeechVoiceEvents
- SpeechVoicePriority
- SpeechVoiceSpeakFlags
- SpeechWordPronounceable
- SpeechWordType
Overview
Fancom Sapi is a complete collection of classes that wrap the Microsoft Speech API (SAPI) 5.4 for Fantom programs running on a JVM.
Speech
Making your computer speak couldn't be simpler:
    SpVoice().speak("It's time to kick ass 'n' chew bubble gum!")
A more complete example, which initialises COM threading, lists the available voices, and speaks in the background:
    static Void main(Str[] args) {
      afFancom::ComThread.initSta

      spVoice := afFancomSapi::SpVoice()

      Obj.echo("Available voices:")
      spVoice.getVoices.each { Obj.echo(" - ${it->getDescription}") }

      name := spVoice.voice.getDescription.split('-')[0]
      spVoice.speak("Hello, I'm $name", SpeechVoiceSpeakFlags.SVSFlagsAsync)

      concurrent::Actor.sleep(3sec)
      afFancom::ComThread.release
    }
Speech Recognition
Speech recognition is a bit more involved, as you need to initialise an input stream, register some grammar to listen for, and set up an event sink to receive callbacks. Nevertheless, a complete example is given below:
    using gfx
    using fwt
    using afFancom
    using afFancomSapi

    class SpeechRecognition {

      static Void main(Str[] args) {
        ComThread.initSta

        recoCtx := SpInProcRecoContext()

        // initialise the input stream / microphone
        // not needed with an SpSharedRecoContext
        category := SpObjectTokenCategory()
        category.setId(SpeechStringConstants.SpeechCategoryAudioIn)
        token := SpObjectToken()
        token.setId(category.default_)
        recoCtx.recognizer.audioInput = token

        // register some commands to listen for
        grammar := recoCtx.createGrammar
        rule := grammar.rules.add("awesome", SpeechRuleAttributes.SRATopLevel)
        rule.initialState.addWordTransition(null, "Kick Ass")
        rule.initialState.addWordTransition(null, "Chew Bubblegum")
        grammar.rules.commit
        grammar.cmdSetRuleState("awesome", SpeechRuleState.SGDSActive)

        // register an event sink
        recoCtx.withEvents(SpeechRecognition())

        window := Window {
          it.size = Size(320, 240)
          it.title = "Say Kick Ass!"
        }.open

        ComThread.release
      }

      Void onRecognition(Int streamNumber, Variant streamPosition, SpeechRecognitionType recognitionType, ISpeechRecoResult result) {
        utterance := result.phraseInfo.getText.capitalize
        if (utterance.contains("gum"))
          Obj.echo("Chewing gum.")
        else
          Obj.echo("Hur hur, you said, 'Ass'!!!")
      }
    }
See ISpeechRecoContext (Events) for a list of possible callback events.
Release Notes
v1.0.4
- Example src code is now bundled with the Pod, see
- Fixed error in Speech Recognition example
v1.0.2
- Enums with values were not auto-generated with a Variant surrogate fromVariant() static factory method
v1.0.0
- Initial release