Extracting mp3 file from web page with Python and AppleScript

As I’ve mentioned before, I use Anki extensively to memorize and practice Russian vocabulary. With language learning in particular, adding spoken pronunciations to the cards makes an enormous difference. Since I use Open Russian extensively to provide information to build my Anki cards, it’s a natural source of audio data, too. To optimize my learning time, I built two small scripts to grab and rename the audio files from the Open Russian site. First, I’ll describe my workflow.

My vocabulary workflow

Each morning, I pull 6 words from a Russian word frequency list to add to my Anki deck. With each word, I use Open Russian to look up the complete definition, example sentences, syllabic stress, and other pieces of information that go on the flashcard. To have OpenRussian.org open in its own dedicated browser window, I built a Fluid application out of it. Having common workflow-related sites like this in their own dedicated applications makes a lot of sense for task isolation.

Finally, for many words, I like to extract the audio from the site and add it to the card that I’m building. It turns out to be a cumbersome step because the audio doesn’t play in a QuickTime or other player that allows me to save the file. The source sound files can be downloaded from Shtooka but this is yet another step. This is where my enhanced workflow comes in.

What should the enhanced workflow do?

Optimally, I should be able to grab the URL that is displayed in the Open Russian Fluid application. Using the content of that page, I should be able to obtain the URL of the mp3 file for that word and save it to the desktop using the Russian word as the filename.

The solution

First is a Python application that grabs the URL from the Fluid app, extracts the audio file URL, and downloads it to the desktop.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import re
import urllib2
import urlparse
from os.path import expanduser, normpath, basename, join

def getOpenRussianURL():
	""" Obtain the URL from the OpenRussian application,
	which is just a Fluid browser application.
	If obtaining URL from Safari:
		scpt = '''
		tell application "Safari"
			set theURL to URL of current tab of window 1
		end tell'''
	"""
	from subprocess import Popen, PIPE

	scpt = '''
		tell application "OpenRussian"
			set theURL to URL of browser window 1
		end tell'''

	# osascript reads the script from stdin when no file is given
	p = Popen(['osascript'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
	stdout, stderr = p.communicate(scpt)
	# osascript ends its output with a newline; strip it from the URL
	return stdout.strip()

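def audioURL(html):
	""" Extract the audio file mp3 URL from
	the content of the OpenRussian.org page.
	"""
	m = re.search("<audio.+(http.+mp3)", html)
	if m is None:
		# Fail with a clear message if the page has no audio
		raise ValueError("No mp3 URL found in page content")
	return m.group(1)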

def saveMP3(url,path):
	mp3file = urllib2.urlopen(url)
	with open(path,'wb') as output:
		output.write(mp3file.read())

def fetchMP3(aURL):
	""" Fetch mp3 to which aURL points and save
	it to the Desktop using the word as the filename
	"""
	response = urllib2.urlopen(aURL)
	content = response.read()

	url = audioURL(content)
	path = join(expanduser("~"),'Desktop',basename(normpath(url)))
	saveMP3(url, path)

url = getOpenRussianURL()
fetchMP3(url)
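
For anyone running Python 3, where urllib2 no longer exists, the extraction and naming steps might look something like the sketch below. It is not the script I use, just an illustration under some assumptions: the sample markup, the example.org URL, and the function names audio_url and desktop_path are all made up for the example, and the real Open Russian page may be structured differently. It also assumes the word in the URL is percent-encoded and decodes it for the filename.

```python
import re
from os.path import basename, expanduser, join
from urllib.parse import unquote, urlparse

# Hypothetical page fragment; the real OpenRussian markup may differ.
SAMPLE_HTML = (
    '<audio preload="none">'
    '<source src="https://example.org/audio/%D0%B4%D0%BE%D0%BC.mp3">'
    '</audio>'
)

def audio_url(html):
    """Return the first mp3 URL that follows an <audio> tag, or None."""
    m = re.search(r'<audio.+?(https?://[^"\s>]+\.mp3)', html, re.DOTALL)
    return m.group(1) if m else None

def desktop_path(url):
    """Build a Desktop path whose filename is the decoded last URL segment."""
    filename = unquote(basename(urlparse(url).path))
    return join(expanduser("~"), "Desktop", filename)
```

Calling desktop_path(audio_url(SAMPLE_HTML)) gives a path on the Desktop ending in the Cyrillic word plus .mp3. The download itself would then be a matter of urllib.request.urlopen and writing the bytes to that path.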

To make this even faster, I assigned the script to a Quicksilver keystroke trigger. It’s that simple. One little twist that I discovered was the difficulty of launching a Python application from a Quicksilver trigger. Although there must be an easier way, I haven’t found it. Instead, I just wrote an AppleScript that runs the application in question and I used that as the triggered script in Quicksilver:

--
--	Created by: Alan Duncan
--	Created on: 2016-11-05
--
--	Copyright (c) 2016 Ojisan Seiuchi
--	All Rights Reserved
--

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

do shell script "/Users/alan/Documents/dev/scripts+tools/fetchOpenRussianMP3.py"

There may be a way to finish the process and add this to the Anki card in one step. I’ll have to work on that.


How to tell if you're being pandered to

You might be the subject of political pandering if:

1. Fear, uncertainty, and doubt are the main tricks in the politician’s kit.

A politician who never tires of scapegoating a feared group or a feared outcome is undoubtedly pandering. Or a demagogue. Or both. Whether it’s Mexicans, or Jews, or Muslims, or gay people, they never seem to stop talking about why you should be afraid of someone or something.

Or they intentionally raise doubts around the edges of established facts. Donald Trump, for example, continues to plant seeds of doubt about President Obama’s birthplace, years after proof has been established.

The cure, of course, is to turn the doubt around 180° and ask for data and context. You aren’t likely to be mowed down by an Islamic extremist. You’re more likely to succumb to cardiovascular disease because you’re inactive, smoke, and don’t eat a proper diet. If a politician makes a claim without citing data, you should disregard what they say and look it up. Recently a jetliner crashed in the Mediterranean Sea en route from Paris to Cairo. Within 24 hours, Trump was on record as saying the crash was an act of terror. Anyone with an ounce of sense knows that accident investigations are lengthy fact-gathering, hypothesis-testing, and data-analyzing procedures. Short-circuiting these fact-checking exercises is a timely and convenient tool of the panderer.

The vaccine against pandering is readily available. It’s a vaccine of the mind. Read opposing points of view. Seek primary data. Understand how the branches of government work. Look for sources of bias. Be skeptical about everything. Every. Single. Thing.

2. The politician appeals to commonality of religious belief

Indisputably, one of the foundational principles of the U.S. is religious liberty. The First Amendment explicitly protects free exercise of religious belief. But it also protects the integrity of government by ensuring that the government is not a tool of religion. The Establishment Clause is widely understood to prevent the use of government authority to promote religious belief and practice.

Religious belief is a private affair. Most religious groups gather in a communal practice but in an official sense, they are private, not public groups. Politicians who go out of their way to emphasize their religious affiliations are almost certainly pandering. Adherence to many different creeds brings people to do good things. We’re better off demanding to know exactly what a politician has done in the public sphere and what he or she intends to do in the future than asking what church they attend.

The panderer can also turn this sort of pandering around in the ugliest sort of way by scapegoating and denigrating particular religious groups. Sometimes, though, this isn’t pandering but honest hateful demagoguery.

We should demand that politicians base their arguments in the broadest, most foundational terms. The proscription against baseless harm to others, for example, is common to practically all cultures. Let’s make sure that our appeals to goodness, fairness, and justice appeal to those ideals in human terms.

3. There is an incoherence between stated positions and documented actions

Honest people exhibit a coherence between what they say and what they do. They don’t go out of their way to create an image, let alone one that differs wildly from their easily observed actions. But panderers are crafty. With some, the gulf between their public works and their language is vast. Often there is even an incoherence between statements they make. Most of us in day-to-day conversation use language in such a way that our stated principles pervade our speech. Not so with the panderer, for whom inconsistencies of all sorts abound.

4. The politician claims to be misunderstood

Occasionally the panderer is caught in an inconsistency. That’s the way deception works. The internet has a long memory. A common escape is to claim that he was misunderstood. It is more likely, however, that the message was simply shifting to suit the audience.

5. If you really, really like a candidate

There are some political candidates with whom we identify because of ideological similarities, or some other factor. Before committing to a candidate we should look again for opposing data-driven viewpoints, sources of bias, and other ways in which we might be influenced through personal appeal.

Pandering is a pervasive tool of the political trade. There’s also a fine line separating the genuine effort to put language into the right context for the audience from the purposeful manipulation of a group through deception. With a few heuristics, panderers aren’t hard to uncover.

Well that has a familiar ring to it

The U.S. has become well-rehearsed in its response to mass shootings. An event. The pondering over terrorism vs. generalized craziness. The outpouring of prayers and support. Then the internet outrage. And more internet outrage. More meme pictures about guns and love. More color-your-profile picture trends. Empty scripted responses from pious politicians. A week or two, then back to our regularly scheduled programming.

News flash: this isn’t getting better. It’s not going to get better.

EC: An Environment Canada data plugin for Indigo

Environment Canada

Indigo is a well-known home automation controller software package for Mac OS X. I’ve written a plugin for Indigo 6 that allows you to create a virtual weather station from Environment Canada data. If you live in Canada, this will be a useful way of using weather data in your Indigo rules. For example, you could use wind and temperature data to adjust your irrigation schedule.

You can download the plugin from its git repo. After downloading the files, you’ll just need to configure them as a plugin. To do this, create a new folder and rename it EC.indigoPlugin. Copy the Contents folder that you just downloaded. Right-click on the EC.indigoPlugin bundle and Show Package Contents. Paste the Contents folder here. To install in Indigo, double-click the bundle file.

Using Python and AppleScript to get notified if a site is down

I manage a handful of websites, like this one. Having built a few on other platforms, such as Drupal, I’m familiar with the dreaded error “The website encountered an unexpected error. Please try again later.” On sites that I don’t check on frequently, it can be an embarrassment when people begin emailing you with questions about the site being down.

I wrote the following Python script to deal with the problem:
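
The script itself isn’t reproduced in this excerpt, so what follows is not it. It is only a minimal Python 3 sketch of the general idea, under stated assumptions: a site counts as down if it is unreachable, returns a status other than 200, or serves the Drupal error text quoted above, and the alert is a macOS notification posted through AppleScript’s display notification command. The function names (is_healthy, fetch, notify, check) and the 10-second timeout are made up for illustration.

```python
import subprocess
import urllib.error
import urllib.request

# Text of the Drupal error page mentioned above.
ERROR_MARKER = "The website encountered an unexpected error"

def is_healthy(status, body, marker=ERROR_MARKER):
    """A page counts as healthy if it returned 200 and lacks the error text."""
    return status == 200 and marker not in body

def fetch(url, timeout=10):
    """Return (status, body); status is None when the site is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code, err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and so on.
        return None, ""

def notify(message, title="Site check"):
    """Post a macOS notification via osascript (assumes no quotes in message)."""
    script = 'display notification "{}" with title "{}"'.format(message, title)
    subprocess.run(["osascript", "-e", script], check=False)

def check(url):
    """Fetch the URL and raise a notification if it looks unhealthy."""
    status, body = fetch(url)
    if not is_healthy(status, body):
        notify("{} appears to be down (status {})".format(url, status))
```

Calling check() on each site from a scheduled job (launchd or cron, for instance) would be one way to run it unattended.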

Dynamic UI lists in Indigo 6

Indigo 6 is a popular home automation controller software package on the Mac. Extensibility is one of its main features and it allows users to add a range of features to suit their needs.

Using Python scripting, users can create plugins that provide extended functionality. These plugins can provide a custom configuration UI to the user. Since the documentation for one particular feature, dynamic lists, was lacking, I’ve written up my approach here.
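
To give a flavor of the idea: as I understand the Indigo plugin API, a menu or list field in the configuration XML can name a method on the plugin, and that method returns the choices as (value, label) tuples each time the UI is built. The sketch below is only an illustration of that shape, not code from a working plugin. DemoPlugin is a plain stand-in class rather than a real indigo.PluginBase subclass so that it runs outside Indigo, and the method name and station data are hypothetical.

```python
# In the plugin's XML, a menu field might point at the generator method:
#   <Field id="station" type="menu">
#       <Label>Station:</Label>
#       <List class="self" filter="" method="stationListGenerator"/>
#   </Field>

class DemoPlugin(object):
    """Stand-in for a plugin class; a real one would subclass indigo.PluginBase."""

    def __init__(self, stations):
        # Hypothetical data: maps a stored value to the label shown in the UI.
        self.stations = stations

    def stationListGenerator(self, filter="", valuesDict=None, typeId="", targetId=0):
        """Return the menu choices as (value, label) tuples, sorted by label."""
        return sorted(self.stations.items(), key=lambda pair: pair[1])
```

Because the list is computed when the dialog is shown, the choices can depend on whatever the plugin knows at that moment, which is the whole point of making the list dynamic.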

Import and tag with Hazel and DEVONthink Pro Office

Hazel and DEVONthink make a great pair, as I’ve written before. Using AppleScript, it’s possible to take the import workflow even further by tagging incoming files automatically.

Use case

I download a lot of mp3 files containing pronunciation of words in a language I’ve been learning. I keep a record of these words and tag them appropriately using my hierarchical tagging system.

I’d like to download the files to a directory on the desktop, keep them there for a few minutes until I’m done working with them, then import them into DEVONthink Pro Office, tag them there, and delete the originals.

Using AppleScript with MailTags

I’m a fan of using metadata to classify and file things rather than declarative systems of nested folders. Most of the documents and data that I store for personal use are in DEVONthink, which has robust support for metadata. On the email side, there’s MailTags, which lets you apply metadata to emails. Since MailTags also supports AppleScript, I began to wonder whether it might be possible to script workflows around email processing. Indeed it is, once you discover the trick of which dictionary to use.

Using AdBlock Plus to block YouTube comments

Web

YouTube comments are some of the most offensive on the web. Even serious videos attract trolls bent on inscribing their offensiveness and cruelty on the web.

Here’s one method of dealing with YouTube comments. Treat the comments block as an advertisement and block it.^[There are other ways of avoiding YouTube comments. I’ve used ViewPure but it’s hard to find content that way even though they seem to be working on making it more seamless to get from YouTube to ViewPure.]

Introducing AnkiStats & AnkiStatsServer

The spaced repetition software system Anki is the de facto standard for foreign language vocabulary learning. Its algorithm requires lots of performance data to schedule flashcards in the most efficient way. Anki displays these statistics in a group of thorough and informative statistical graphs and descriptive text.

However, these statistics aren’t easily available for the end-user to export. That is the reason behind the companion projects AnkiStats and AnkiStatsServer.

The premise is that you can run your own more extensive experiments and statistical tests on the data once you have it in hand. A bit of technical expertise is needed to get it operational, but if you are up to it, clone the GitHub repos above and go for it.
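
As a taste of the kind of analysis this opens up, here is a minimal Python 3 sketch that counts reviews per day. It makes no claim about how AnkiStats or AnkiStatsServer store or deliver their data. It simply assumes a SQLite table named revlog whose id column is the review time in epoch milliseconds, as in Anki’s own collection database, and it uses an in-memory stand-in table with made-up rows so that it runs without a real collection file. The function name reviews_per_day is hypothetical.

```python
import sqlite3
from collections import Counter
from datetime import datetime, timezone

def reviews_per_day(conn):
    """Count review-log rows per UTC calendar day.

    Assumes a `revlog` table whose `id` column holds the review time
    in epoch milliseconds.
    """
    counts = Counter()
    for (review_id,) in conn.execute("SELECT id FROM revlog"):
        day = datetime.fromtimestamp(review_id / 1000, tz=timezone.utc).date()
        counts[day.isoformat()] += 1
    return dict(counts)

# Stand-in data (made up) so the sketch runs without a real collection.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE revlog (id INTEGER PRIMARY KEY, cid INTEGER, ease INTEGER)")
demo.executemany(
    "INSERT INTO revlog VALUES (?, ?, ?)",
    [(1478304000000, 1, 3), (1478307600000, 2, 2), (1478390400000, 1, 4)],
)
```

With the stand-in rows, reviews_per_day(demo) reports two reviews on 2016-11-05 and one on 2016-11-06. Pointing the same function at a copy of a real collection would be the next step; working on a copy avoids touching the file while Anki has it open.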