AwesomeTTS Anki add-on: Use Amazon Polly

December 26, 2022

anki

As its name implies, the AwesomeTTS Anki add-on is awesome. It’s nearly indispensable for language learners.

You can use it in one of two ways:

Subscribe on your own to the text-to-speech services that you plan to use and add those credentials to AwesomeTTS. (à la carte)
Subscribe to the AwesomeTTS+ service and gain access to these services. (prix fixe)

Because I had already subscribed to Google and Azure TTS before AwesomeTTS+ came on the scene, there was no reason for me to pay for the comprehensive prix fixe option. Furthermore, since I’ve never gone above the free tier on any of these services, it makes no sense for me to pay for something I’m already getting for free. For others, the convenience of a one-stop-shopping experience probably makes the AwesomeTTS+ service worthwhile.

But the developers have chosen to lock Amazon Polly behind their prix fixe service. Since I’m already an Amazon Web Services customer, this makes no sense for me. AWS already knows how to bill me for services, so as with the Google and Azure services I mentioned previously, I have no intention of paying twice. But unlike Google and Azure TTS, Amazon Polly has been off limits to those of us who aren’t AwesomeTTS+ subscribers.

Until now.

The rest of the post is a description of how I bypassed this limitation.

Prerequisites

Before modifying the AwesomeTTS code, you need to get a couple of things out of the way.

AWS user account

First, you will need to be an AWS user. I’m not going to go into depth with this. Start here.

Install the AWS CLI tools

For simplicity, we are going to access the Amazon Polly TTS via the command line toolset provided by AWS. To install, start here. After installing the AWS CLI tools, you will need to add your credentials as described here.
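
Since everything that follows depends on the `aws` executable being reachable, it may be worth checking that first. Here is a minimal sketch, assuming the CLI is installed under the name `aws`; the helper name `aws_cli_available` is my own and not part of AwesomeTTS or the AWS tools:

```python
import shutil


def aws_cli_available(executable="aws"):
    """Return True if the named executable can be found on the current PATH."""
    # shutil.which returns the full path to the executable, or None if absent
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if aws_cli_available():
        print("aws CLI found")
    else:
        print("aws CLI not found on PATH")
```

If this reports that the CLI is missing when run from inside Anki but the command works in a terminal, the PATH that Anki sees probably differs from the one in your shell.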

Modify the AwesomeTTS add-on code

On my system, the add-on path is ~/Library/Application Support/Anki2/addons21/1436550454. The two files that you need to modify are in the service directory, inside the awesometts directory under that path.

Modifications to `languages.py`

Find the class definition for StandardVoice. Change this:

class StandardVoice(Voice):
    def __init__(self, voice_data):
        self.language_code = voice_data['language_code']
        self.voice_key = voice_data['voice_key']
        self.voice_description = voice_data['voice_description']

to this:

class StandardVoice(Voice):
    def __init__(self, voice_data):
        self.language_code = voice_data['language_code']

        # we need the audio_language_code for Amazon Polly service
        self.audio_language_code = voice_data['audio_language_code']
        self.voice_key = voice_data['voice_key']
        self.voice_description = voice_data['voice_description']

The Amazon service needs this change because the AWS CLI call takes its language code from the audio_language_code key in the voice info, with the underscore swapped for a hyphen.
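
To make the format difference concrete, here is a small sketch of the conversion that the `amazon.py` code relies on. The sample value `ru_RU` and the helper name `to_polly_language_code` are assumptions for illustration only; check the voice data in your own copy of the add-on for the actual values:

```python
def to_polly_language_code(audio_language_code):
    """Convert an underscore-separated code (e.g. 'ru_RU') to the hyphenated form."""
    return audio_language_code.replace('_', '-')


if __name__ == "__main__":
    # 'ru_RU' is a hypothetical sample value, not copied from the add-on
    print(to_polly_language_code('ru_RU'))
```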

Modifications to `amazon.py`

In the original code, they throw an exception when you aren’t an AwesomeTTS+ subscriber. We need to reverse the logic and formulate our own call. To do this, change the original code here:

def run(self, text, options, path):

    if not self.languagetools.use_plus_mode():
        raise ValueError(f'Amazon is only available on AwesomeTTS Plus')

    voice_key = options['voice']
    voice = self.get_voice_for_key(voice_key)

    rate = options['rate']
    pitch = options['pitch']

    self._logger.info(f'using language tools API')
    service = 'Amazon'
    voice_key = voice.get_voice_key()
    language = voice.get_language_code()
    options = {
        'pitch': pitch,
        'rate': rate
    }
    self.languagetools.generate_audio_v2(text, service, 'batch', language, 'n/a', voice_key, options, path)

to:

def run(self, text, options, path):
    # Nope ↓
    # raise ValueError(f'Amazon is only available on AwesomeTTS Plus')
    rate = options['rate']
    pitch = options['pitch']
    voice_key = options['voice']
    voice = self.get_voice_for_key(voice_key)
    if self.languagetools.use_plus_mode():
        self._logger.info(f'using language tools API')
        service = 'Amazon'
        voice_key = voice.get_voice_key()
        language = voice.get_language_code()
        options = {
            'pitch': pitch,
            'rate': rate
        }
        self.languagetools.generate_audio_v2(text, service, 'batch', language, 'n/a', voice_key, options, path)
    else:
        # roll your own, baby; needs AWS CLI installed
        # along with credentials therewith
        lang_code = voice.audio_language_code.replace('_', '-')
        voice_name = voice.get_key()
        (engine, voice_id) = (voice_name['engine'], voice_name['voice_id'])
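        # quote text and path so that spaces or quotes in them survive shlex.split
        cmd = (
            f'aws polly synthesize-speech --engine {engine} --language-code {lang_code} '
            f'--output-format mp3 --text {shlex.quote(text)} --voice-id {voice_id} {shlex.quote(path)}'
        )
        cmd_list = shlex.split(cmd)
        resp = subprocess.run(cmd_list, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        # surface failures instead of silently leaving no audio file behind
        if resp.returncode != 0:
            raise ValueError(f'aws polly synthesize-speech failed with exit code {resp.returncode}')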

We also need to import shlex and subprocess, which are used to set up and execute the process that communicates with AWS:

import subprocess
import shlex

After you’ve made these changes, you should now have access to Amazon Polly via the AWS CLI call.
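
As an alternative to assembling a command string and splitting it, the argument list can be built directly, which sidesteps quoting problems when the card text contains quotes or spaces. This is only a sketch of that variation, not the code used in the add-on: the function names `build_polly_command` and `synthesize` are hypothetical, and the `neural` engine and `Joanna` voice in the usage lines are example values.

```python
import subprocess


def build_polly_command(text, engine, lang_code, voice_id, path):
    """Build the argument list for `aws polly synthesize-speech`."""
    return [
        'aws', 'polly', 'synthesize-speech',
        '--engine', engine,
        '--language-code', lang_code,
        '--output-format', 'mp3',
        '--text', text,          # passed as one argument, so no quoting is needed
        '--voice-id', voice_id,
        path,                    # output file that receives the MP3 audio
    ]


def synthesize(text, engine, lang_code, voice_id, path):
    """Run the AWS CLI and raise if it exits with a non-zero status."""
    cmd = build_polly_command(text, engine, lang_code, voice_id, path)
    resp = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.PIPE, text=True)
    if resp.returncode != 0:
        # include the CLI's own error output to make debugging easier
        raise RuntimeError(f'aws polly failed: {resp.stderr.strip()}')


if __name__ == "__main__":
    # Example values only; this prints the command without contacting AWS
    print(build_polly_command('He said "hi"', 'neural', 'en-US', 'Joanna', 'out.mp3'))
```

Capturing stderr rather than discarding it means that a credentials or region problem shows up in the error message instead of as a silently missing audio file.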