Getting plaintext into Anki fields on macOS: An update

A few years ago, I wrote about my problems with HTML in Anki fields. If you check out that previous post you’ll get the backstory about my objection.

The gist is this: If you copy something from the web, Anki tries to maintain the formatting. Basically it just pastes the HTML off the clipboard. Supposedly, Anki offers to strip the formatting with Shift-paste, but I’ve pointed out specific examples to the developer where this fails. Basically, I only want plain text. Ever. I will take care of any and all formatting needs via the card templates. Period.

In the previous solution, I used an AppleScript triggered from Quicksilver. I’ve migrated from Quicksilver to Keyboard Maestro since then, so it was time for an update. And the good news is that it’s simpler; it’s literally a Bash one-liner:

#!/bin/bash
#
# Convert contents of clipboard to plain text.

pbpaste | textutil -convert txt -stdin -stdout -encoding 4 | pbcopy

exit 0

In Keyboard Maestro, I just built a macro around this command-line script. It’s still triggered by ⇧⌘W and also includes a paste command to Anki, so it’s more seamless than the previous solution.

A brief note about textutil[1]: it’s a built-in text conversion utility on macOS. The script pipes the clipboard contents to textutil, which converts them to plain text. The -encoding option with a value of 4 selects NSUTF8StringEncoding, i.e. UTF-8.[2] Finally, the result is piped back to the clipboard.
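For use outside Keyboard Maestro, the same pipeline can be wrapped in a small shell function. This is only a minimal sketch: the function name clip_to_plain and the empty-clipboard guard are hypothetical additions for illustration, and it assumes textutil accepts the IANA name UTF-8 in place of the numeric value 4.

```shell
#!/bin/bash
# Sketch: replace the clipboard contents with a plain-text version.
# clip_to_plain is a hypothetical helper name, not part of the original macro.

clip_to_plain() {
    local text
    # Same pipeline as the one-liner; "UTF-8" is assumed to be equivalent
    # to the numeric NSUTF8StringEncoding value (4).
    text=$(pbpaste | textutil -convert txt -stdin -stdout -encoding UTF-8) || return 1
    # Leave the clipboard untouched if nothing came back.
    [ -n "$text" ] || return 1
    # Command substitution drops trailing newlines, which suits a single Anki field.
    printf '%s' "$text" | pbcopy
}
```

Calling clip_to_plain and then pasting into Anki should behave like the macro; a non-zero return means the clipboard was left as it was.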


  1. textutil - manipulate text files in various ways. ↩︎

  2. NSStringEncoding enumeration - constants are provided by NSString as possible string encodings ↩︎

Thursday, May 26, 2022

I would like to propose a constitutional amendment that prohibits Sen. Ted Cruz (F-TX)[1] from speaking or tweeting for seven days after a national tragedy. I’d also be fine with an amendment that prohibits him from speaking ever.


  1. The “F” designation stands for Fascist. The party to which Cruz nominally belongs is more aligned with WW2-era Axis dictatorships than those of a legitimate free civil democracy. ↩︎

Extracting the title of a web page from the command line

I was using a REST API at https://textance.herokuapp.com/title but it seems awfully fragile. Sure enough, this morning the entire application is down. It’s also not open-source and I have no idea who actually runs this thing. Here’s the solution:

#!/bin/bash

url=$(pbpaste)
curl "$url" -so - | pup 'meta[property=og:title] attr{content}'

It does require pup. On macOS, you can install it via brew install pup. There are other ways that use regular expressions and avoid the dependency on pup, but parsing HTML with regex is not such a good idea.
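Not every page carries an og:title tag, so a slightly more defensive variant might fall back to the regular title element. The following is a sketch under assumptions rather than a tested replacement: page_title is a hypothetical function name, and the fallback selector and the extra curl flags are additions that the script above does not use.

```shell
#!/bin/bash
# Sketch: print a page's title, preferring og:title and falling back to <title>.
# Requires curl and pup; page_title is a hypothetical helper name.

page_title() {
    local url=$1 html title
    # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
    html=$(curl -fsSL "$url") || return 1
    # First choice: the Open Graph title, as in the original script.
    title=$(printf '%s' "$html" | pup 'meta[property="og:title"] attr{content}')
    # Assumed fallback: the text of the <title> element.
    if [ -z "$title" ]; then
        title=$(printf '%s' "$html" | pup 'title text{}')
    fi
    [ -n "$title" ] || return 1
    printf '%s\n' "$title"
}
```

To keep the clipboard workflow, it could be called as page_title "$(pbpaste)".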

Friday, May 20, 2022

“Enlightenment is the absolute cooperation with the inevitable.” - Anthony De Mello. Although he writes like a Buddhist, apparently he’s a Jesuit.

Three-line (though non-standard) interlinear glossing

Still thinking about interlinear glossing for my language learning project. The leipzig.js library is great but my use case isn’t really what the author had in mind. I really just need to display a unit consisting of the word as it appears in the text, the lemma for that word form, and (possibly) the part of speech. For academic linguistics purposes, what I have in mind is completely non-standard. The other issue with leipzig.