
Quality Assurance Analyst, Windows Phone developer and father. Contributing author for Windows Phone 8 in Action from Manning Publications. Adam is a DZone MVB and is not an employee of DZone and has posted 15 posts at DZone.

Text to speech on Windows Phone 8

08.10.2012

Ever since the Windows Phone 8 SDK leaked, I have been digging through the documentation looking for interesting additions in the upcoming release. After spending a couple of evenings trying, to no avail, to get the Wallet to work in the emulator using sample code from the docs, I turned my attention to the new SpeechSynthesizer and InstalledVoices classes.

As per my usual format, I start with the XAML. The first thing WP7 developers might notice is that the ApplicationTitle and page names are no longer hard-coded by default but are resources shared across the app. This way you no longer have to set the app name on every page one at a time.

<phone:PhoneApplicationPage
    x:Class="TextToSpeechDemo.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:phone="clr-namespace:Microsoft.Phone.Controls;assembly=Microsoft.Phone"
    xmlns:shell="clr-namespace:Microsoft.Phone.Shell;assembly=Microsoft.Phone"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d"
    FontFamily="{StaticResource PhoneFontFamilyNormal}"
    FontSize="{StaticResource PhoneFontSizeNormal}"
    Foreground="{StaticResource PhoneForegroundBrush}"
    SupportedOrientations="Portrait" Orientation="Portrait"
    shell:SystemTray.IsVisible="True">

    <!--LayoutRoot is the root grid where all page content is placed-->
    <Grid x:Name="LayoutRoot" Background="Transparent">
        <Grid.RowDefinitions>
            <RowDefinition Height="Auto"/>
            <RowDefinition Height="*"/>
        </Grid.RowDefinitions>

        <!--TitlePanel contains the name of the application and page title-->
        <StackPanel x:Name="TitlePanel" Grid.Row="0" Margin="12,17,0,28">
            <TextBlock Text="{Binding Path=LocalizedResources.ApplicationTitle, Source={StaticResource LocalizedStrings}}" Style="{StaticResource PhoneTextNormalStyle}"/>
            <TextBlock Text="{Binding Path=LocalizedResources.PageTitle, Source={StaticResource LocalizedStrings}}" Margin="9,-7,0,0" Style="{StaticResource PhoneTextTitle1Style}"/>
        </StackPanel>

        <!--ContentPanel - place additional content here-->
        <Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
            <StackPanel>
                <ScrollViewer Height="200">
                    <ComboBox HorizontalAlignment="Left" Width="456" Name="voicesComboBox" DisplayMemberPath="Name" />
                </ScrollViewer>
                <StackPanel Orientation="Horizontal" HorizontalAlignment="Center">
                    <RadioButton Content="Male" IsChecked="true" Name="MaleRadioButton"/>
                    <RadioButton Content="Female"/>
                </StackPanel>
                <TextBox HorizontalAlignment="Left" Height="230" TextWrapping="Wrap" Width="456" Text="I may be a sorry case, but I don't write jokes in base 13." Name="inputTextBox"/>
                <Button Content="Speak to me" HorizontalAlignment="Left" Width="456" Click="SpeakToMe_Click"/>
            </StackPanel>
        </Grid>
    </Grid>
</phone:PhoneApplicationPage>

Next is the code-behind file for MainPage.xaml. Its content is heavily borrowed from the WP8 SDK docs. I added some error checking and the ability to choose the gender and language of the voice used.

using System;
using System.Linq;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Synthesis;
 
namespace TextToSpeechDemo
{
    public partial class MainPage : PhoneApplicationPage
    {
        SpeechSynthesizer synth;
        // Constructor
        public MainPage()
        {
            InitializeComponent();
            voicesComboBox.ItemsSource = new MyLocals().Items();
        }
 
        private async void SpeakToMe_Click(object sender, RoutedEventArgs e)
        {
            if (voicesComboBox.SelectedIndex == -1)
            {
                MessageBox.Show("Please select a language.");
            }
            else
            {
                if (string.IsNullOrEmpty(inputTextBox.Text))
                {
                    MessageBox.Show("Please enter some text.");
                }
                else
                {
                    try
                    {
                        // Initialize the SpeechSynthesizer object.
                        synth = new SpeechSynthesizer();
 
                        var myLocal = (MyLocale)voicesComboBox.SelectedItem;
 
                        // Query for a voice. Results are ordered by Gender so that the order is always Female, then Male.
                        var voices = (from voice in InstalledVoices.All
                                      where voice.Language == myLocal.Lcid
                                      select voice).OrderByDescending(v => v.Gender);
 
                        // gender: 0 = Female, 1 = Male. Corresponds to the index of the above results.
                        int gender = (MaleRadioButton.IsChecked == true) ? 1 : 0;
 
                        // Set the voice as identified by the query.
                        synth.SetVoice(voices.ElementAt(gender));
 
                        // Speak
                        await synth.SpeakTextAsync(inputTextBox.Text);
                    }
                    catch (Exception ex)
                    {
                        MessageBox.Show(ex.Message);
                    }
                }
            }
        }
    }
}

Finally, a couple of classes help fill the ComboBox of languages. Checking InstalledVoices.All shows me there are 30 items in it (15 languages x 2 voices each). I was unable to view the list directly, for reasons I don't know. All I know is that it is a COM object and the only way I found to get the required VoiceInformation objects is via a LINQ query. I was able to determine 12 languages, but need to look deeper to find the other 3.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
 
namespace TextToSpeechDemo
{
    class MyLocale
    {
        public MyLocale(string name, string lcid)
        {
            _name = name;
            _lcid = lcid;
        }
 
        private string _name;
        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }
 
        private string _lcid;
        public string Lcid
        {
            get { return _lcid; }
            set { _lcid = value; }
        }
    }
 
    class MyLocals
    {
        private IList<MyLocale> _myLocals;
 
        public MyLocals()
        {
            _myLocals = new List<MyLocale>();
 
            _myLocals.Add(new MyLocale("Chinese Simplified (PRC)", "zh-CN"));
            _myLocals.Add(new MyLocale("Chinese Traditional (Taiwan)", "zh-TW"));
            _myLocals.Add(new MyLocale("English (United States)", "en-US"));
            _myLocals.Add(new MyLocale("English (United Kingdom)", "en-GB"));
            _myLocals.Add(new MyLocale("French (France)", "fr-FR"));
            _myLocals.Add(new MyLocale("German (Germany)", "de-DE"));
            _myLocals.Add(new MyLocale("Italian (Italy)", "it-IT"));
            _myLocals.Add(new MyLocale("Japanese (Japan)", "ja-JP"));
            _myLocals.Add(new MyLocale("Polish (Poland)", "pl-PL"));
            _myLocals.Add(new MyLocale("Portuguese (Brazil)", "pt-BR"));
            _myLocals.Add(new MyLocale("Russian (Russia)", "ru-RU"));
            _myLocals.Add(new MyLocale("Spanish (Spain)", "es-ES"));
        }
 
        public IEnumerable<MyLocale> Items()
        {
            // IList<MyLocale> already implements IEnumerable<MyLocale>, so no cast is needed.
            return _myLocals;
        }
    }
}

The next step is to set the ID_CAP_SPEECH_RECOGNITION capability in the app manifest (WMAppManifest.xml); otherwise an exception is thrown.
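
For reference, the capability can be declared by hand in the manifest's Capabilities section. This is only a sketch of the relevant fragment: the manifest generated by the project template already lists other capabilities, which are left out here and should stay as they are.

```xml
<!-- WMAppManifest.xml: fragment only, the rest of the manifest is unchanged -->
<Capabilities>
  <!-- capabilities already listed by the project template stay here -->
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
</Capabilities>
```

The same setting can also be ticked in the manifest designer in Visual Studio instead of editing the XML directly.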

 

Now if we run the app, we are shown the page with the language list, the gender options, the text box and the button. Choosing a language and clicking the button will cause the emulator to start talking to you.

One thing I noticed: if I choose a non-English language and enter English text, it speaks with an accent. When I set the text to French 1, 2, 3, 4 and set the language to another similar one (Spanish), it pronounces the French "4" in Spanish, not French. An interesting result :)

Published at DZone with permission of Adam Benoit, author and DZone MVB. (source)
