Enabling Media Services in a VoIP Network

Voice Message Playout Function

The voice message playout function of the media server delivers voice data to the RTP media stream, or to a TDM media stream when the media server is part of a media gateway. The voice message data is provided to the media server by an application, typically running on the application server. The voice message usually consists of pre-recorded user prompts and greetings such as "Welcome to Acme Corporation" or "Please enter your four-digit PIN now." Sometimes, however, the voice message is a concatenation of pre-recorded voice messages and text-to-speech-generated messages, for example, "Your account balance is… five hundred thirty-three dollars and ninety-five cents." The media server receives the voice message data in G.711 or linear PCM format. It converts the data to the negotiated format (e.g., G.711, G.723.1, or G.729A) and then packetizes and transmits the voice data in RTP packets. While performing voice message playout, the media server must simultaneously support DTMF detection, voice activity detection, and voice message record so that it can detect caller inputs and pass incoming voice data to the host application.
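The packetization step described above can be sketched in a few lines of Python. This is a minimal illustration, not a media server implementation. It assumes the negotiated format is G.711 mu-law (static RTP payload type 0), so no transcoding is needed, and it assumes 20 ms packets at 8 kHz. The function name `packetize_g711` and its parameters are hypothetical.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0                # static RTP payload type for G.711 mu-law
SAMPLES_PER_PACKET = 160   # assumed 20 ms at 8 kHz; one byte per G.711 sample


def packetize_g711(payload, ssrc, first_seq=0, first_ts=0):
    """Split G.711 bytes into RTP packets (12-byte header plus payload)."""
    packets = []
    seq, ts = first_seq, first_ts
    for offset in range(0, len(payload), SAMPLES_PER_PACKET):
        chunk = payload[offset:offset + SAMPLES_PER_PACKET]
        marker = 1 if offset == 0 else 0   # marker bit on the first packet only
        header = struct.pack(
            "!BBHII",
            RTP_VERSION << 6,              # V=2, no padding, no extension, CC=0
            (marker << 7) | PT_PCMU,
            seq & 0xFFFF,                  # 16-bit sequence number wraps
            ts & 0xFFFFFFFF,               # 32-bit timestamp wraps
            ssrc,
        )
        packets.append(header + chunk)
        seq += 1
        ts += len(chunk)                   # timestamp counts 8 kHz samples
    return packets
```

A real playout path would also pace transmission at one packet per 20 ms and transcode when a compressed codec such as G.729A has been negotiated. Both are omitted here.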

Voice Message Record Function

The voice message record function of the media server records an RTP media stream, or a TDM media stream when the media server is part of a media gateway. This function is essential for voicemail and call center recording applications. Voice record is also used to provide input to speech recognition engines on the application server. Incoming RTP packets are buffered and placed in their proper order, much as a jitter buffer does in a two-way conversation, but without the real-time constraints. The voice data is then decoded into G.711 or linear PCM and passed to the application. When recording from the TDM side, the process is somewhat simpler, but the end result is the same: voice data passed to the application.
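Because recording has no real-time deadline, the reordering step can simply sort what has been buffered. The Python sketch below illustrates that idea under stated assumptions: packets carry a plain 12-byte RTP header (no CSRC list or header extension), all buffered packets lie within 32768 sequence numbers of the first one received, and decoding to linear PCM happens elsewhere. The names `parse_rtp` and `reorder_for_recording` are hypothetical.

```python
import struct


def parse_rtp(packet):
    """Return (sequence_number, payload), assuming a bare 12-byte header."""
    _, _, seq, _, _ = struct.unpack("!BBHII", packet[:12])
    return seq, packet[12:]


def reorder_for_recording(packets):
    """Order buffered RTP payloads by sequence number and drop duplicates."""
    if not packets:
        return b""
    parsed = [parse_rtp(p) for p in packets]
    base = parsed[0][0]
    by_offset = {}
    for seq, payload in parsed:
        # Signed 16-bit distance from the first packet received, so that a
        # wrap from 65535 to 0 still sorts in the right order.
        offset = ((seq - base + 0x8000) & 0xFFFF) - 0x8000
        by_offset.setdefault(offset, payload)   # keep the first copy only
    return b"".join(by_offset[k] for k in sorted(by_offset))
```

Lost packets are simply skipped in this sketch. A recorder that must preserve timing would insert silence for the gaps instead.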

DTMF Detection

Also known as digit detection, DTMF detection is an important function of the media server. While speech recognition is gaining popularity, DTMF digits are still the primary method that callers use to communicate with an interactive voice response application. In a VoIP network, DTMF digits are transmitted both as tones encoded in the voice stream and as tone events per RFC 2833. A media server solution must reliably detect both types of DTMF transmission on the RTP side and must also detect DTMF transmissions from the TDM side if the media server interfaces to the PSTN.
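The two detection paths can be illustrated with a short Python sketch. The first function decodes the 4-byte RFC 2833 telephone-event payload (event code, end bit, volume, duration). The second applies the Goertzel algorithm, a common choice for in-band tone detection that the article itself does not name, to a block of decoded linear PCM. All function names are hypothetical. The in-band classifier assumes the block already contains a DTMF tone; it has none of the energy thresholds or twist checks a production detector needs to reject speech and silence.

```python
import math
import struct

EVENT_TO_DIGIT = "0123456789*#ABCD"    # RFC 2833 event codes 0-15
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def parse_telephone_event(payload):
    """Decode a 4-byte RFC 2833 telephone-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(EVENT_TO_DIGIT):
        return None                    # not a DTMF event (e.g. flash)
    return {
        "digit": EVENT_TO_DIGIT[event],
        "end": bool(flags & 0x80),     # E bit: this packet ends the event
        "volume": flags & 0x3F,        # tone level in dBm0, sign dropped
        "duration": duration,          # in RTP timestamp units
    }


def goertzel_power(samples, freq_hz, sample_rate=8000):
    """Signal power at one frequency, using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def classify_dtmf_block(samples, sample_rate=8000):
    """Return the keypad digit whose row and column tones are strongest."""
    rows = [goertzel_power(samples, f, sample_rate) for f in ROW_HZ]
    cols = [goertzel_power(samples, f, sample_rate) for f in COL_HZ]
    return KEYPAD[rows.index(max(rows))][cols.index(max(cols))]
```

As a usage example, `parse_telephone_event(bytes([5, 0x8A, 0x03, 0x20]))` yields digit "5" with the end bit set, volume 10, and a duration of 800 timestamp units (100 ms at 8 kHz).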

Together, voice message playout, voice message record, voice activity detection, and DTMF detection make up the core media processing functions for voice messaging service applications such as voicemail, auto-attendant, directory service, and IVR. The following diagram shows the relationship of the media server function to the application and the data flows to and from the packet and TDM networks.

[Figure: Relationship of the media server function to the application and the data flows to and from the packet and TDM networks]
