Enable read-aloud for your application with Azure neural TTS


This post is co-authored with Yulin Li, Yinhe Wei, Qinying Liao, Yueying Liu, Sheng Zhao

Voice is becoming increasingly popular in providing useful and engaging experiences for customers and employees. The Text-to-Speech (TTS) capability of Speech on Azure Cognitive Services allows you to quickly create an intelligent read-aloud experience for your scenarios.


 


In this blog, we’ll walk through an exercise that you can complete in under two hours, to get started using Azure neural TTS voices and enable your apps to read content aloud. We’ll provide high-level guidance and sample code to get you started, and we encourage you to play around with the code and get creative with your solution!


 


What is read-aloud   


 


Read-aloud is a modern way to help people read and consume content like emails and Word documents more easily. It is a popular feature in many Microsoft products and has received highly positive user feedback. A few recent examples:



  • Play My Emails: In Outlook for iOS, users can listen to their incoming email during their commute to the office. They can choose a female or a male voice to read the email aloud, anytime their hands are busy with other things.

  • Edge read aloud: In the recent Chromium-based Edge browser, people can listen to web pages or PDF documents while multitasking. The read-aloud voice quality has been enhanced with Azure neural TTS, which has become a ‘favorite’ feature for many (read the full article).

  • Immersive Reader is a free tool that uses proven techniques to improve reading for people regardless of their age or ability. It has adopted Azure neural voices to read content aloud to students.

  • Listen to Word documents on mobile: This is an eyes-off, potentially hands-off modern consumption experience for those who want to multitask on the go. Specifically, this feature supports longer listening scenarios for document consumption, and it is now available in Word on Android and iOS.


With all these examples and more, we’ve seen a clear trend toward providing voice experiences for users who consume content on the go, who multitask, or who prefer to take in content by listening. With Azure neural TTS, it is easy to implement your own read-aloud experience that is pleasant for your users to listen to.


 


The benefit of using Azure neural TTS for read-aloud


 


Azure neural TTS allows you to choose from more than 140 highly realistic voices across 60 languages and variants that enable fluid, natural-sounding speech, with rich customization capabilities available at the same time.


 


High AI quality


Why is neural TTS so much better? Traditional TTS is a multi-step pipeline and a complex process. Each step can involve human expert rules or individual models, and there is no end-to-end optimization in between, so the quality is not optimal. The AI-based neural TTS voice technology simplifies the pipeline into three major components, each of which can be modeled by an advanced deep neural network: a neural text analysis module, which generates more correct pronunciations for TTS to speak; a neural acoustic model, such as uni-TTS, which predicts prosody much better than traditional TTS; and a neural vocoder, such as HiFiNet, which creates audio with higher fidelity.


 


With all these components, Azure neural TTS makes the listening experience much more enjoyable than traditional TTS. Our studies repeatedly show that a read-aloud experience built on the highly natural voices of the Azure neural TTS platform significantly increases the time people spend listening to synthetic speech continuously, and greatly improves the effectiveness of their consumption of the audio content.


 


Broad locale coverage


Usually, reading content is available in many different languages. To read aloud more content and reach more users, TTS needs to support various locales. Azure neural TTS now supports more than 60 languages off the shelf. Check out the details in the full language list.


 


By offering more voices across more languages and locales, we anticipate that developers across the world will be able to build applications that change experiences for millions. With our innovative voice models for low-resource settings, we can also extend to new languages much faster than ever.


 


Rich speaking styles


Azure neural TTS provides a rich choice of speaking styles that resonate with your content. For example, the newscast style is optimized for reading news content in a professional tone. The customer service style helps you create a friendlier reading experience for conversational content focused on customer support. In addition, various emotional styles and role-play capabilities can be used to create vivid audiobooks with synthetic voices.


 


Here are some examples of the voices and styles used for different types of content.

| Language | Content type | Sample | Note |
| --- | --- | --- | --- |
| English (US) | Newscast | Aria, in the newscast style | |
| English (US) | Newscast | Guy, in the general/default style | |
| English (US) | Conversational | Jenny, in the chat style | |
| English (US) | Audiobook | Jenny, in multiple styles | |
| Chinese (Mandarin, simplified) | Newscast | Yunyang, in the newscast style | |
| Chinese (Mandarin, simplified) | Conversational | Yunxi, in the assistant style | |
| Chinese (Mandarin, simplified) | Audiobook | Multiple voices used: Xiaoxiao and Yunxi | Different styles used: lyrical, calm, angry, disgruntled, embarrassed, with different style degrees applied |
 


These styles can be adjusted using SSML, together with other tuning capabilities, including rate, pitch, pronunciation, pauses, and more.
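For example, a minimal SSML sketch like the one below selects a voice and applies one of its styles (the voice name, style, and prosody value are illustrative; check the voice list for the styles each voice supports):

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
  <voice name="en-US-AriaNeural">
    <mstts:express-as style="newscast">
      <prosody rate="-5%">
        The morning headlines, read aloud in a professional newscast tone.
      </prosody>
    </mstts:express-as>
  </voice>
</speak>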


 


Powerful customization capabilities


Besides the rich choice of prebuilt neural voices, Azure TTS gives you the powerful capability to create a one-of-a-kind custom voice that differentiates your brand from others. Using Custom Neural Voice, you can build a highly realistic voice using less than 30 minutes of audio as training data. You can then use your customized voice to create a unique read-aloud experience that reflects your brand identity or resonates with the characteristics of your content.


 


Next, we’ll walk you through the coding exercise of developing the read-aloud feature with Azure neural TTS.  


 


How to build read-aloud features with your app    


 


It is incredibly easy to add a read-aloud capability to your application with Azure neural TTS and the Speech SDK. Below we describe two typical designs that enable read-aloud for different scenarios.


 


Prerequisites


If you don’t have an Azure subscription, create a free account before you begin. If you have a subscription, log in to the Azure portal and create a Speech resource.


 


Client-side read-aloud


In this design, the client interacts directly with Azure TTS using the Speech SDK. The following steps, with JavaScript code samples, walk you through the basic process of implementing read-aloud.


 


Step 1: Create the synthesizer


First, create the synthesizer with the selected language and voice. Make sure you select a neural voice to get the best quality.


 


// Create the speech config from an authorization token issued by your token service.
const config = SpeechSDK.SpeechConfig.fromAuthorizationToken("YourAuthorizationToken", "YourSubscriptionRegion");
config.speechSynthesisVoiceName = voice; // e.g. "en-US-JennyNeural"
config.speechSynthesisOutputFormat = SpeechSDK.SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm;

// Route the audio to the default speaker through a player object we can track.
const player = new SpeechSDK.SpeakerAudioDestination();
const audioConfig = SpeechSDK.AudioConfig.fromSpeakerOutput(player);
var synthesizer = new SpeechSDK.SpeechSynthesizer(config, audioConfig);

 


Then you can hook up the events from the synthesizer and the player. These events can be used to update the UX while the read-aloud is playing.


 


player.onAudioEnd = function (_) {
    window.console.log("playback finished");
};

 


 


Step 2: Collect word boundary events


The word boundary event fires during synthesis. Usually, synthesis runs much faster than the playback of the audio, so the word boundary event fires before you receive the corresponding audio chunk. The application can collect each event and the time stamp information of the audio for the next step.


 


    synthesizer.wordBoundary = function (s, e) {
          window.console.log(e);
          wordBoundaryList.push(e);
        };
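With the events hooked up, you can kick off the synthesis. A minimal sketch, assuming synthesisText is the input element that holds the text to read (as in the highlight sample below):

    synthesizer.speakTextAsync(
        synthesisText.value,
        function (result) {
          // The full audio has been synthesized; playback continues through the player.
          window.console.log("synthesis finished, " + result.audioData.byteLength + " bytes total");
          synthesizer.close();
        },
        function (error) {
          window.console.log(error);
          synthesizer.close();
        });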

 


 


Step 3: Highlight word boundary during audio playback


You can then highlight the word as the audio plays, using the code sample below (the sample assumes a highlight CSS class for the word being spoken).


 


    setInterval(function () {
        if (player !== undefined) {
          const currentTime = player.currentTime; // seconds of audio played so far
          var wordBoundary;
          for (const e of wordBoundaryList) {
            // audioOffset is in 100-nanosecond ticks; convert to milliseconds.
            if (currentTime * 1000 > e.audioOffset / 10000) {
              wordBoundary = e;
            } else {
              break;
            }
          }
          if (wordBoundary !== undefined) {
            // Wrap the word currently being spoken in a highlight element.
            highlightDiv.innerHTML = synthesisText.value.substr(0, wordBoundary.textOffset) +
                    "<span class='highlight'>" + wordBoundary.text + "</span>" +
                    synthesisText.value.substr(wordBoundary.textOffset + wordBoundary.wordLength);
          } else {
            highlightDiv.innerHTML = synthesisText.value;
          }
        }
      }, 50);

 


 


See the full example here for more details.


 


Server-side read-aloud


In this design, the client interacts with a middle-layer service, which in turn interacts with Azure TTS through the Speech SDK. It is suitable for the scenarios below:



  • The authentication secret (e.g., the subscription key) must be kept on the server side.

  • There is additional related business logic, such as text preprocessing or audio postprocessing.

  • A service already exists that interacts with the client application.


 


Below is a reference architecture for such design:


 

Reference architecture design for the server-side read-aloud


 


 


The roles of each component in this architecture are described below.



  • Azure Cognitive Services – TTS: the cloud API provided by Microsoft Azure, which converts text to human-like natural speech.

  • Middle Layer Service: the service built by you or your organization, which serves your client app by hosting the cross-device / cross-platform business logic.

  • TTS Handler: the component that handles TTS-related business logic, with the following responsibilities:

    • Wraps the Speech SDK to call the Azure TTS API.

    • Receives the text from the client app, performs preprocessing if necessary, then sends it to the Azure TTS API through the Speech SDK.

    • Receives the audio stream and the TTS events (e.g., word boundary events) from Azure TTS, performs postprocessing if necessary, and sends them to the client app.

  • Client App: your app running on the client side, which interacts with end users directly. It has the following responsibilities:

    • Sends the text to your service (“Middle Layer Service”).

    • Receives the audio stream and TTS events from your service (“Middle Layer Service”), plays the audio to your end users, and renders UI such as real-time text highlighting driven by the word boundary events.




 


Check here for the sample code to call the Azure TTS API from the server.
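To make this concrete, here is a minimal sketch of a middle-layer TTS endpoint, assuming Node.js with Express and the microsoft-cognitiveservices-speech-sdk package (the endpoint path, environment variable names, and voice are illustrative, not part of the original sample):

const express = require("express");
const sdk = require("microsoft-cognitiveservices-speech-sdk");

const app = express();
app.use(express.json());

app.post("/tts", (req, res) => {
  const config = sdk.SpeechConfig.fromSubscription(process.env.SPEECH_KEY, process.env.SPEECH_REGION);
  config.speechSynthesisVoiceName = "en-US-JennyNeural"; // illustrative voice choice
  // Passing null for the audio config keeps the audio in memory instead of playing it.
  const synthesizer = new sdk.SpeechSynthesizer(config, null);
  synthesizer.speakTextAsync(
    req.body.text,
    result => {
      synthesizer.close();
      res.set("Content-Type", "audio/wav");
      res.send(Buffer.from(result.audioData)); // forward the audio to the client app
    },
    error => {
      synthesizer.close();
      res.status(500).send(error);
    });
});

app.listen(3000);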


 


Compared to the client-side read-aloud design, the server-side read-aloud is a more advanced solution. It can cost more, but it is more powerful for handling complicated requirements.


 


Recommended practices for building a read-aloud experience


 


The sections above show you how to build a read-aloud feature in the client-side and server-side scenarios. Below are some recommended practices that can help make your development more efficient and improve your service experience.


 


Segmentation


When the content to read is long, it’s good practice to segment the reading content into sentences or short paragraphs, one per request. Such segmentation has several benefits.



  • The response is faster for shorter content.

  • Long synthesized audio costs more memory.

  • The Azure speech synthesis API requires the synthesized audio to be shorter than 10 minutes. Audio longer than that is truncated to 10 minutes.


Using the Speech SDK’s PullAudioOutputStream, the synthesized audio from each request can easily be merged into one stream.
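Here is a minimal sketch of such segmentation on the client, reusing the synthesizer created earlier (the sentence-splitting regex is illustrative; production code would handle abbreviations and other edge cases):

// Split long text into sentences and synthesize them one request at a time.
function splitIntoSentences(text) {
  return text.match(/[^.!?]+[.!?]+\s*/g) || [text];
}

async function readAloudInSegments(synthesizer, text) {
  for (const sentence of splitIntoSentences(text)) {
    // Wait for each segment's synthesis to complete before sending the next request.
    await new Promise((resolve, reject) => {
      synthesizer.speakTextAsync(sentence, resolve, reject);
    });
  }
}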


 


Streaming


Streaming is critical for lowering latency. When the first audio chunk is available, you can start the playback or immediately forward the audio chunks to your clients. The Speech SDK provides PullAudioOutputStream, PushAudioOutputStream, the Synthesizing event, and AudioDataStream for streaming. You can select the one that best suits the architecture of your application. Find the samples here.
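For example, a minimal sketch of consuming audio chunks through the Synthesizing event, which fires as each chunk is produced (the handler body is illustrative):

synthesizer.synthesizing = function (s, e) {
    // e.result.audioData holds the newly synthesized chunk (an ArrayBuffer);
    // forward it to your player or to your clients immediately.
    window.console.log("received a chunk of " + e.result.audioData.byteLength + " bytes");
};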


 


Besides, with the stream objects of the Speech SDK, you can get a seekable in-memory audio stream, which works easily with any downstream services.


 


Tell us your experiences!


 


Whether you are building a voice-enabled chatbot or IoT device, an IVR solution, adding read-aloud features to your app, converting e-books to audiobooks, or even adding speech to a translation app, you can make all these experiences natural-sounding and fun with Neural TTS.


 


Let us know how you are using or plan to use Neural TTS voices in this form. If you prefer, you can also contact us at mstts [at] microsoft.com. We look forward to hearing about your experience and developing more compelling services together with you for the developers around the world.


 


Get started


  • Add voice to your app in 15 minutes

  • Explore the available voices in this demo

  • Build a voice-enabled bot

  • Deploy Azure TTS voices on prem with Speech Containers

  • Build your custom voice

  • Learn more about other Speech scenarios


 

Creating Subscriptions with ARM Templates


 


As more and more enterprises embrace Azure, complete end-to-end automation for standing up workloads in the cloud is one of the most important steps to running at scale. The latest release (2020-09-01) of the Microsoft.Subscription resource provider enables subscription creation via templates. To get started, you first need to ensure billing agreements are in place; you can find details on that process here. Once this is done, a new subscription can be created for the proper workload and billing account. Once created, the subscription can be referred to by an alias throughout your code or templates.


 


Here’s a look at a subscription resource in a template:


 

"resources": [
    {
        "scope": "/",
        "type": "Microsoft.Subscription/aliases",
        "apiVersion": "2020-09-01",
        "name": "[parameters('subscriptionAlias')]",
        "properties": {
            "workload": "[parameters('subscriptionWorkload')]",
            "displayName": "[parameters('subscriptionDisplayName')]",
            "billingScope": "[tenantResourceId('Microsoft.Billing/billingAccounts/enrollmentAccounts', parameters('billingAccount'), parameters('enrollmentAccount'))]"
        }
    }
]

 


 


Of particular note is the “scope” property. Subscriptions are a tenant-level resource in Azure and must be PUT at the tenant scope, which this property allows.


 


The prerequisite for creating subscriptions is to identify the billing scope for the subscription. You can find more information about billing scopes in this doc. A sample script for looking up billing information can be found here.


 


The next step, when using templates for subscription creation, is to determine the scope for the template deployment itself.  All templates are deployed to a specific scope; most commonly this is a resourceGroup, but template deployments can be done at the subscription, managementGroup or tenant scope as well.  The scope of the deployment does not need to match the scope of the resources that are deployed though it often does.  For our template sample, we’ll describe a scenario, or rather tenant, where no subscriptions (or resourceGroups) exist and we’ll deploy to a managementGroup.  It doesn’t matter which managementGroup we choose for the deployment because the subscription itself will be created at the tenant scope and placed in the default managementGroup, unless a different one is specified.  If this is still a little confusing, just focus on the subscription resource itself.  This resource must be deployed to the tenant scope and the examples will show how to use the “scope” property to indicate that.


 


Note that you must have permissions to create template deployments at the scope you target. Also, permission to deploy a template at a scope does not automatically grant permission to create any other resource, so you also need to ensure that you have the necessary permissions to create the resources in your template, if any. That should cover everything about permissions.


 


A QuickStart sample for deploying a subscription can be found here.  The command for deploying this template is just like deploying any other template and following our example would be:


 

New-AzManagementGroupDeployment -ManagementGroupId (Get-AzContext).Tenant.id -Location westeurope -TemplateFile azuredeploy.json -TemplateParameterFile myParameters.json

 


 


This will deploy the template to the “root” managementGroup for the tenant.  Again, remember that you must have permission to deploy to that scope, in this case the root managementGroup.  If you don’t have that permission, you can deploy the template to any other managementGroup, subscription or resourceGroup.  Also a reminder that even though the subscription is created at the tenant scope, the template deployment does not need to match that scope.


 


So far, this is a very simple example, but you can also create a subscription and deploy resources to that subscription in the same template.  There is a little more orchestration required here because you’re actually targeting multiple scopes within the same template.  And, in order to target the subscription, you need the subscriptionId or GUID that was assigned to the subscription when it was created.  This next sample will perform each of these steps:


 



  1. Create the subscription (this is shown in the previous sample)

  2. Retrieve the subscriptionId from the newly created alias


 

"outputs": {
    "subscriptionId": {
        "type": "string",
        "value": "[reference(parameters('subscriptionAlias')).subscriptionId]"
    }
}

 


 


3. Pass that subscriptionId to the next deployment in the template


 

"type": "Microsoft.Resources/deployments",
"apiVersion": "2020-10-01",
"name": "[concat('nested-createResourceGroup-', parameters('resourceGroupName'))]",
"location": "[parameters('location')]",
"properties": {
    "expressionEvaluationOptions": {
        "scope": "inner"
    },
    "mode": "Incremental",
    "parameters": {
        "subscriptionId": {
            // this cannot be referenced directly on the subscriptionId property of the deployment so needs to be nested one level
            "value": "[reference(resourceId('Microsoft.Resources/deployments', concat('createSubscription-', parameters('subscriptionAlias')))).outputs.subscriptionId.value]"
        },
...

 


 


4. Create a new deployment in the new subscription that creates the resourceGroup


 

{
    "type": "Microsoft.Resources/deployments",
    "apiVersion": "2020-10-01",
    "name": "[concat('createResourceGroup-', parameters('resourceGroupName'))]",
    "subscriptionId": "[parameters('subscriptionId')]",
    "location": "[parameters('location')]",
    "properties": { ... }
...

 


 


And then finally deploy the resources to that resourceGroup.


 


That’s more complex than just creating a subscription because all of the orchestration is handled within a single template. 


 


If your scenario requires a different deployment scope or more steps, you may not want to include everything in that single template. If I wanted to break this down into multiple steps for orchestration in a pipeline, it can be as simple as two steps.


 


Step 1 – Create the Subscription


Performing this step separately can be useful if you do not want to give a user or service principal permission to create template deployments at a given scope. Once the subscription is created, the principal that created it is an owner of that subscription and can deploy templates to the newly created subscription. This means that the only permission the principal needs outside of the subscription is the permission to create one.


 


At this writing, Azure PowerShell does not have a built-in command to create a subscription, but you can always invoke any Azure REST API using Invoke-AzRestMethod. This script shows how to do that to create a subscription through an alias resource, using the following command.


 

.\Create-SubscriptionAlias.ps1 -aliasName "newSub" -DisplayName "demo subscription" -billingAccount "1234567" -enrollmentAccount "654321" -workLoad DevTest
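Under the hood, such a script presumably issues a PUT against the alias resource at the tenant scope. A minimal sketch with Invoke-AzRestMethod (the alias name and billing values below are illustrative):

# Build the alias resource payload; the billingScope path follows the
# billingAccounts/enrollmentAccounts format used in the template above.
$body = @{
    properties = @{
        displayName  = "demo subscription"
        workload     = "DevTest"
        billingScope = "/providers/Microsoft.Billing/billingAccounts/1234567/enrollmentAccounts/654321"
    }
} | ConvertTo-Json -Depth 4

# PUT the alias at the tenant scope to create the subscription.
Invoke-AzRestMethod -Method PUT `
    -Path "/providers/Microsoft.Subscription/aliases/newSub?api-version=2020-09-01" `
    -Payload $body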

 


 


You need to set the correct parameter values for the billingAccount and enrollmentAccount, which you can discover using the script mentioned at the top of this article.


 


Step 2 – Deploy the template


Next, keeping with our greenfield scenario, where the subscription is created in the same workflow or pipeline that deploys this next template, we’ll create a subscription-level deployment. If we were running an automated pipeline, this sample would be a good example of the next step. The sample creates a new resourceGroup in the subscription, locks it, and assigns a principal access to that resourceGroup. From here you could deploy resources to the subscription (or resourceGroup), or simply make it available for the principal to use.


 


That’s a quick overview of how to leverage this new capability in just a few of the scenarios you can use to automate new workloads in Azure. Let me know how it goes, or if you have any questions about automating subscription creation in your environments.


 


 

Experiencing Data Access Issue in Azure portal for Log Analytics – 04/28 – Investigating


Initial Update: Wednesday, 28 April 2021 11:33 UTC

We are aware of issues within Log Analytics and are actively investigating. Some customers may experience data access issues and delayed or missed Log Search Alerts in the West US region.
  • Work Around: None
  • Next Update: Before 04/28 16:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Soumyajeet

How to query data located in Azure Blob Storage, Azure Data Lake Store Gen2/1 with ADX


An external table is a schema entity that references data stored outside the Azure Data Explorer database. The Azure Data Explorer Web UI can create external tables by taking sample files from a storage container and creating the schema based on these samples. You can then analyze and query data in external tables without ingesting it into Azure Data Explorer. For information about different ways to create external tables, see create and alter external tables in Azure Storage or Azure Data Lake.


One of the most common scenarios for external tables is historical data that needs to be queried only rarely (e.g., data that must be retained for legal requirements, or log records kept for a longer retention period).




 


Please read the create an external table document for a detailed explanation; here are some highlighted points.


 


1. At the Source page, in Link to source, enter the SAS URL of your source container. You can add up to 10 sources (you can remove the 10-source limitation by using the create external table command on the Query page). The first source container will display files below the File filters. In a later step, you will use one of these files to generate the table schema.


2. At the Schema page, on the right-hand side of the tab, you can preview your data. On the left-hand side, you can add partitions to your table definition to access the source data more quickly and achieve better performance.


3. At the Summary page, you can query this table using the query buttons or with the external_table() function, as shown below. For more information on how to query external tables, see Querying an external table.
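For example, a quick KQL sanity check against a newly created external table might look like this (the table name is illustrative):

// External tables are addressed through the external_table() function rather than by name.
external_table("LogsArchive")
| take 10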



Announcing General Availability of the new Exchange admin center


The new Exchange admin center (EAC) is a modern, web-based management portal for Exchange Online, based on the Microsoft 365 admin center experience. It is simple and accessible, and it enables you to perform tasks like restoring mailboxes, migrating data, and much more.



Since entering Public Preview in June 2020, over half a million admins around the world have used it. We have steadily improved the new EAC thanks to the feedback of a great community of early adopters.



Today, we are excited to announce that the new EAC is now generally available for customers (including GCC customers) in 10 languages. With this announcement, we are also releasing a new dashboard, new usability features, and several intelligent reports to help admins be more productive in their work. The new EAC is expected to be available to customers in GCC High at the end of May 2021, and to customers in DoD at the end of June 2021.



Here are some highlights:


 



  1. Personalized Dashboard, Reports, Insights – The new EAC offers actionable insights and includes reports for mail flow, migration, and priority monitoring.



  2. Azure Cloud Shell – Cloud Shell is a browser-accessible shell that provides a command-line experience built with Azure management tasks in mind. It enables admins to choose a shell experience that best suits their workstyle.



  3. Mailbox management and recovery of deleted items – Recipient management is one of the most crucial tasks that admins perform. The new EAC now makes mailbox management easier.



  4. Modern, simplified management of Groups – The new EAC also enables you to create and manage four types of groups: Microsoft 365 Groups, distribution lists, mail-enabled security groups, and dynamic distribution lists.



  5. Migration – The new EAC supports various kinds of migrations, including cross-tenant migrations for M&A scenarios and automated Google Workspace/G Suite migrations.



  6. Left navigation panel – The new EAC also includes a new left navigation panel to make it easier to find features.




You can access the new EAC today at https://admin.exchange.microsoft.com.


 


To learn more, check out https://docs.microsoft.com/en-us/exchange/exchange-admin-center.


 


Take a tour of the new EAC at https://www.microsoft.com/en-us/videoplayer/embed/RE4FqDa.