Transcript

VoiceBase employs various methods to affect and improve the text transcript returned by the speech engine:

  • Stereo - separates the transcript by speaker turn, which enables analytics on a per-speaker basis. (Note: The recording must be multi-channel or dual-channel; otherwise the API rejects the request.)

  • Custom Vocabulary - biases the VoiceBase speech engine toward transcribing specific, uncommon words supplied in a vocabulary list. The most common use cases are proper nouns, company and product names, and first and last names.

  • Formatting and Punctuation - improves the readability of the text transcript by formatting numerical digits and inserting punctuation.

  • Priority - affects the turnaround time of processing the submitted recording.

  • Swear Word Filter - filters and replaces swear words in the text transcript.

  • Transcoding - converts the submitted audio file to a different audio codec. This is mainly used to convert an audio file to a codec that is playable in the VoiceBase Player.

  • Voicemail - applies a short-form speech model to improve transcript accuracy for calls under 30 seconds.
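
Two of the options above depend on properties of the recording itself: Stereo requires a multi-channel or dual-channel file, and Voicemail targets calls under 30 seconds. As a rough illustration, the sketch below checks both locally before a file is submitted. It uses only Python's standard wave module, so it assumes an uncompressed WAV file. The helper names inspect_wav and preflight and the returned keys are hypothetical; they are not part of the VoiceBase API, and the API's own validation remains authoritative.

```python
import wave

# Threshold taken from the Voicemail description above.
VOICEMAIL_MAX_SECONDS = 30.0


def inspect_wav(path):
    """Return (channel_count, duration_seconds) for an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        duration = wav.getnframes() / wav.getframerate()
    return channels, duration


def preflight(path):
    """Hypothetical helper: report which options fit this recording."""
    channels, duration = inspect_wav(path)
    return {
        # Stereo needs at least two channels, or the request is rejected.
        "stereo_ok": channels >= 2,
        # Voicemail is intended for calls shorter than 30 seconds.
        "voicemail_candidate": duration < VOICEMAIL_MAX_SECONDS,
    }
```

A caller could run preflight("call.wav") and enable Stereo only when "stereo_ok" is true. Recordings in other formats (for example MP3) would need a different tool to read the channel count and duration.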