Using AppleScript with MailTags

I’m a fan of using metadata to classify and file things rather than rigid hierarchies of nested folders. Most of the documents and data that I store for personal use are in DEVONthink, which has robust support for metadata. On the email side, there’s MailTags, which lets you apply metadata to emails. Since MailTags also supports AppleScript, I began to wonder whether it might be possible to script workflows around email processing. Indeed it is, once you discover the trick of which dictionary to use.

The key is to use MailTagsHelper for the dictionary. To access the terms from that dictionary, you need to embed the code in the following block:

using terms from application "MailTagsHelper"
	-- access MailTags properties here
end using terms from
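
As a slightly fuller sketch, here’s the shape of a script that tags the currently selected messages. Note that the keywords property name is an assumption on my part; browse the MailTagsHelper dictionary in Script Editor for the exact terms it defines.

using terms from application "MailTagsHelper"
	tell application "Mail"
		-- "keywords" is an assumed MailTags property name; verify it
		-- against the MailTagsHelper dictionary before relying on it
		repeat with theMessage in (get selection)
			set keywords of theMessage to {"receipts", "2016"}
		end repeat
	end tell
end using terms from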


Using AdBlock Plus to block YouTube comments

YouTube comments are some of the most offensive on the web. Even serious videos attract trolls bent on inscribing their offensiveness and cruelty on the web.

Here’s one method of dealing with YouTube comments. Treat the comments block as an advertisement and block it.[1]

1. Download AdBlock Plus

Download the AdBlock Plus extension for the browser you use and install it.

2. Create a custom ad filter

In this step you will create a filter that treats the entire comments section of a YouTube page as an advertisement.

  • Navigate to YouTube and load any video page.
  • Click on the AdBlock Plus icon in the toolbar to bring up its contextual menu.
[Figure: AdBlock Plus contextual menu]
  • Choose “Block an ad on this page”.
  • Move the cursor to the area of the page just above the “COMMENTS” header. Once the entire comments area of the page is highlighted, click there.
[Figure: blocking the YouTube comments block]
  • AdBlock Plus will ask you to confirm the block. If it looks right to you, agree.[2]

If all goes well, you’ll have comment-free YouTube pages now.
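
If you’d rather skip the point-and-click steps, AdBlock Plus also accepts hand-written element-hiding rules. Based on the div identified in footnote 2 below, adding this one-line rule to your custom filter list should have the same effect:

youtube.com###watch-discussion

The ### looks odd but is correct: ## introduces an element-hiding rule, and the trailing #watch-discussion is an ordinary CSS id selector.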


  1. There are other ways of avoiding YouTube comments. I've used ViewPure, but it's hard to find content that way, even though they seem to be working on making the jump from YouTube to ViewPure more seamless.

  2. The div to be blocked is <DIV id="watch-discussion" class="branded-page-box yt-card scrolldetect">. Don't be surprised if you need to update your filter as YouTube changes its page structure.

Introducing AnkiStats & AnkiStatsServer

The spaced repetition software system Anki is the de facto standard for foreign language vocabulary learning. Its algorithm requires lots of performance data to schedule flashcards in the most efficient way. Anki displays these statistics in a group of thorough and informative graphs accompanied by descriptive text.

However, the underlying data aren’t easily available for the end user to export. Hence the companion projects AnkiStats and AnkiStatsServer.

The premise is that you can run your own more extensive experiments and statistical tests on the data once you have it in hand. A bit of technical expertise is needed to get it operational, but if you’re up to it, clone the GitHub repos above and go for it.
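
To give a flavor of what’s possible once the data are in hand, here’s a minimal sketch that tallies reviews per day straight from an Anki collection database. It’s a sketch only: the path is a placeholder, and the revlog schema noted in the comments should be verified against your Anki version.

#!/usr/bin/python
# Minimal sketch: count Anki reviews per day from the collection database.
# Assumes the standard Anki schema in which revlog.id is the review
# timestamp in epoch milliseconds; verify against your Anki version.
import sqlite3
import datetime

conn = sqlite3.connect('/path/to/collection.anki2')  # placeholder path
per_day = {}
for (ms,) in conn.execute('SELECT id FROM revlog'):
    day = datetime.date.fromtimestamp(ms / 1000)
    per_day[day] = per_day.get(day, 0) + 1

for day in sorted(per_day):
    print('%s %d' % (day, per_day[day]))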

Waking the computer to allow AppleScript to run

I have a number of AppleScript applications that need to run at odd times. These maintenance tasks often attempt to run while the computer is sleeping, and those that rely on UI scripting in particular do not function during this period.

The most flexible way of dealing with this is to manipulate the power management settings directly via the pmset(1) command.

The variety of options available using pmset is staggering and beyond the scope of this post. Here’s what I do to wake the computer up at specific times so that scheduled AppleScripts can run:

~|⇒ sudo pmset repeat wakeorpoweron MTWRFSU 12:29:00
~|⇒ sudo pmset repeat wakeorpoweron MTWRFSU 23:49:00

Now my 12:30 PM and 11:50 PM scripts will run just fine.

Edit 2016-04-18: Actually my scripts don’t run just fine, because pmset doesn’t allow setting multiple wakeorpoweron events like this; only the last one set is retained. However, you can use root’s crontab to do it, as long as you schedule the cron jobs to deliver the pmset schedule before it’s needed. Here’s the idea:

@reboot pmset repeat wakeorpoweron MTWRFSU 23:49:00
00 12 * * * pmset repeat wakeorpoweron MTWRFSU 12:01:00
02 12 * * * pmset repeat wakeorpoweron MTWRFSU 23:49:00
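
Either way, you can confirm which wake events are actually on the books with pmset’s query mode:

~|⇒ pmset -g sched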

An easier way to automate synchronization of Anki profiles with AppleScript

After waking up this morning with my mouse locked onto the Anki icon in the Dock, and after trying to figure out how to get Activity Monitor up and running so I could force quit the Automator application that I described yesterday, I figured it was back to the drawing board.

I’d have liked to use the Accessibility Inspector to manipulate the PyQt objects in Anki’s windows, but they aren’t exposed in a way that you can script them. System Events, though, rules all.

When Anki launches, it offers a dialog box with profiles to sync (assuming you have multiple profiles). Using AppleScript and System Events scripting, you can drive the keyboard as it manipulates the PyQt interface. Here’s my solution. Yours may vary depending on where the profile in question lies in the list.

tell application "Anki" to launch
delay 2.0

tell application "System Events"
	key code 125 -- down arrow to point at the profile to sync
	key code 36 -- Enter key
	delay 10.0 -- time to sync
	key code 12 using {command down} -- ⌘Q to quit
end tell

Much less painful than Automator.

Scheduling synchronization of Anki databases on OS X

While working on a project to automatically collect statistics on my Anki databases (stay tuned…), I worked out a system for scheduling synchronization from my desktop OS X machine.

Prerequisites

  • LaunchControl is a GUI application that lets you create and manage user services on OS X
  • Anki is a spaced repetition memorization software system

The solution relies on Automator. Normally, I don’t care much for Automator. It has too many limits on what tasks I can accomplish, and workflows created with it are often fragile. However, in this case, we take advantage of its workflow recording feature. We’re going to record the process of opening Anki, selecting the profile to sync, then quitting Anki. This sequence of events ensures that the database on the local system is synchronized with the remote version.
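
Since LaunchControl is essentially a front end for launchd, the scheduled job it manages boils down to a property list along these lines. This is only a sketch: the label, the schedule, and the path to the saved Automator application are hypothetical placeholders.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>com.example.anki-sync</string>
	<!-- Launch the recorded Automator workflow, saved as an application -->
	<key>ProgramArguments</key>
	<array>
		<string>/usr/bin/open</string>
		<string>/Users/yourname/Applications/SyncAnki.app</string>
	</array>
	<!-- Run every night at 11:50 PM -->
	<key>StartCalendarInterval</key>
	<dict>
		<key>Hour</key>
		<integer>23</integer>
		<key>Minute</key>
		<integer>50</integer>
	</dict>
</dict>
</plist>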


Resizing of images for Anki with Hazel and ImageMagick

I use Anki to study foreign language vocabulary. It’s the de facto spaced repetition software for memorization.[1] When making flashcards for language learning, I try to use imagery as much as possible. So a card may have a Russian word on one side and just an image on the opposite side. (Since I already know the English word that the image represents, why not try to engage a different part of the brain to help with memorization?)

If you use Anki on multiple devices, then synchronization is a key step. However, image size becomes a limiting factor for sync speed. Since a small image is often all that’s needed to convey the intended meaning, we can improve sync efficiency by using smaller images without sacrificing meaning. Bulk, efficient resizing of images for Anki cards is therefore an important part of the process for me.

Here I’ll describe a way of automatically processing images for use on Anki cards using Hazel and ImageMagick. Sorry, PC and Linux users: this is OS X only.
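
The full post walks through the Hazel rule, but as a sketch of the ImageMagick half of the pipeline, a command like this is the kind of thing Hazel can run on each new image. The filename and target size are just illustrations; the trailing > in the geometry tells ImageMagick to only shrink images larger than the target, never enlarge them:

$ mogrify -resize '400x400>' -quality 80 card-image.jpg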


Writing Hexo filters

Hexo, the static blogging system that I use, is very extensible and provides numerous hooks into its generation pipeline.

While working on a Russian language blog that’s coming online soon, I had the opportunity to write a filter to render Cyrillic text in a different font than the rest of the body text.

Markup filter use case

I wanted to set the Cyrillic text apart in color, typeface, and font weight. Although I could have extended Hexo with a new tag, I decided to use a filter so that, after HTML rendering anywhere on the blog, items demarcated by double pipes || would be replaced by a new <span>.

I packaged the filter as an npm module. You can find it on npm and at its GitHub repo.

Here’s the very short code for the filter itself:

hexo.extend.filter.register('after_render:html', function(str, data) {
    var re = /(\|{2}?)((.|\n)+?)(\|{2}?)/gm;
    var result = str.replace(re, '<span class="rsb">$2</span>');
    return result;
});

The regex in the second line just identifies a block of text fenced by double pipes and replaces it with a span with the class that specifies the styling to be applied. In the future, I’d like to identify Cyrillic text with a regex and not have to use a fence at all.
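
As a sketch of what that future version might look like: the basic Cyrillic block occupies U+0400 through U+04FF, so a character-class regex can find the runs directly. Note that this naive version would also match Cyrillic inside tag attributes, so it’s a starting point rather than a drop-in replacement.

hexo.extend.filter.register('after_render:html', function(str, data) {
    // Match runs of Cyrillic words, including internal whitespace,
    // without requiring the || fence. U+0400 through U+04FF is the
    // basic Cyrillic block.
    var re = /[\u0400-\u04FF]+(?:\s+[\u0400-\u04FF]+)*/g;
    return str.replace(re, '<span class="rsb">$&</span>');
});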

Fine-tuning caching for S3-hosted static blogs using AWS CLI

Because the blogging system that I use doesn’t apply fine-grained object-level caching rules, I end up with objects such as images that cache appropriately but an index.html page that does not. I don’t want client browsers to hang on to the main index.html page for more than an hour or so, because it should update much more frequently than that as its content changes.

It’s possible that I could dig around under the hood of Hexo and create a version that applies customized caching rules. Instead, I make a second pass over the content, adjusting the Cache-Control and other metadata according to my needs. For this task I use the Amazon Web Services command line interface, AWS CLI.

Installation

Installing the AWS CLI is straightforward. On the platform I use (OS X), it’s just:

$ curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
$ unzip awscli-bundle.zip
$ sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws

After installation, you will want to configure the AWS CLI. Installing the credentials for AWS is an important step, which you can do via the aws configure command:

$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: ENTER

Once configured, you can use the AWS CLI to perform a variety of operations on your S3 buckets. It’s worth reading the documentation, which is very detailed, to get familiar with the command structure.

Using AWS CLI to adjust image caching

To compute new Cache-Control header dates for the aws command, I wrote a little Python script. For images, I want to maximize caching in the request/reply chain. Since images are the heaviest objects traveling over the wire, I want to minimize how many of them need to be reloaded, so I set a long cache time for these objects. Here’s how I compute new dates and build up the aws command:

#!/usr/bin/python

import datetime
from dateutil.relativedelta import relativedelta
import subprocess

weeks = 2
seconds = weeks * 7 * 24 * 60 * 60

today = datetime.datetime.now()
new_date = today + relativedelta(weeks=weeks)

command = '''aws s3 cp s3://ojisanseiuchi.com/ s3://ojisanseiuchi.com/ --exclude "*" '''
command += '''--include "*.jpg" '''
command += '''--include "*.png" '''
command += '''--recursive '''
command += '''--metadata-directive REPLACE '''
command += '''--expires {0} '''.format(new_date.isoformat())
command += '''--acl public-read '''
command += '''--content-encoding "gzip" '''
command += '''--cache-control "public, max-age={0}"'''.format(seconds)

subprocess.call(command, shell=True)

This will build and execute the following command:

aws s3 cp s3://ojisanseiuchi.com/ s3://ojisanseiuchi.com/ --exclude "*" --include "*.jpg" --include "*.png" --recursive --metadata-directive REPLACE --expires 2016-04-05T11:37:16.181141 --acl public-read --content-encoding "gzip" --cache-control "public, max-age=1209600"

This will recursively manipulate the metadata for all jpg and png files in the bucket. The weeks parameter can be adjusted to any duration you would like.

Using AWS CLI to adjust the index.html caching

The main index page should get reloaded frequently; otherwise users have no idea that the page has changed. For this part, I’ll drop down to the lower-level s3api command for illustration. Here’s the continuation of the Python script that makes this work:

# Continues the script above; the datetime/relativedelta/subprocess
# imports and `today` are already defined there.
hours = 1
seconds = hours * 60 * 60  # seconds in an hour
new_date = today + relativedelta(hours=hours)

command = '''aws s3api copy-object --copy-source ojisanseiuchi.com/index.html --key index.html --bucket ojisanseiuchi.com '''
command += '''--metadata-directive "REPLACE" '''
command += '''--expires {0} '''.format(new_date.isoformat())
command += '''--acl public-read '''
command += '''--content-type "text/html; charset=UTF-8" '''
command += '''--content-encoding "gzip" '''
command += '''--cache-control "public, max-age={0}"'''.format(seconds)

subprocess.call(command, shell=True)

When run, this will build and execute the following command:

aws s3api copy-object --copy-source ojisanseiuchi.com/index.html --key index.html --bucket ojisanseiuchi.com --metadata-directive "REPLACE" --expires 2016-03-22T12:42:44.706536 --acl public-read --content-type "text/html; charset=UTF-8" --content-encoding "gzip" --cache-control "public, max-age=3600"

This ensures that the page is cached for only one hour.

Automating the post-processing

As I’ve written before, I use Grunt to automate blogging tasks. To run the post-processing I’ve described above, I simply add it as a task in the Gruntfile.js.

To initialize the post-processing task:

grunt.initConfig({
    shell: {
        fixImageCacheHeaders: {
            options: {
                stdout: true,
                execOptions: {
                    cwd: '.'
                }
            },
            command: 'python fixCacheHeaders.py'
        }
    }
    // etc...
});

To register the task:

grunt.registerTask('deploy', ['shell:clean', 'shell:generate', 'sitemap:production', 'robotstxt:production', 's3']);
grunt.registerTask('logpre', function() {
    grunt.log.writeln('*** Fix metadata ***');
});
grunt.registerTask('logpost', function() {
    grunt.log.writeln('*** Fixed metadata ***');
});
grunt.registerTask('deployf', function() {
    grunt.task.run(['shell:clean', 'shell:generate', 'sitemap:production', 'robotstxt:production', 's3']);
    grunt.task.run('logpre');
    grunt.task.run('shell:fixImageCacheHeaders');
    grunt.task.run('logpost');
});

Now I can deploy the blog and run the post-processing using grunt deployf.

The entire metadata post-processing script is available as a gist. My updated Gruntfile.js is too.

Modern textbook design: an architecture for distraction

The design of textbooks in common use at all levels from elementary school through high school is appallingly bad. I’ve come to this conclusion after several years of carefully looking at my sons’ books as they went through public middle and high school. What follows is a critique of very common design “features” in these books with reference to visual information design principles. Since I’m not a subject expert in the content of the disciplines presented, I’ll confine myself to visual design, typography, and information design principles in general.

I’ll start with a mathematics textbook used in Canada in Grade 6, the Pearson “Math Makes Sense” text. A sample page is depicted below.

[Figure: sample page]

The most obvious design abuse is the heavy graphical “fluff” on the page. When the margin is included, the top banner takes up 19% of the vertical extent of the page, and its sole purpose is to identify the page as the beginning of the third lesson, which is about multiples.

[Figure: photograph of a radio announcer]

Within the body of the page, the most egregious offense is an enormous photograph of a radio announcer with a speech bubble that says nothing about the mathematical concept being presented. This gratuitous figure takes up about 18% of the content area of the page. It would be a minor offense if it wasted only paper, but it wastes a far scarcer resource: the student’s attention. The figure adds nothing to the concept that the authors are trying to present, so it should be removed. This is a textbook for 6th graders, who are in no need of infantilization. Unnecessary silly graphics degrade the importance of the content and invariably lead students to conclude that the content is as unimportant as a radio call-in contest.

The sections of each lesson are demarcated by heavy section graphics connected to bold, garish leader lines. The cheap three-dimensional effects, garish colors, and unnecessary boldness are distracting. This is a telling example of the structure of the content overwhelming the content itself. The section header graphics are probably meant to resemble buttons on a web page circa 1994, but printed material has a mode of consumption different from that of the web, and its format should respect the difference.

The page footer is less intrusive but unnecessarily complex. There is no need for the strangely fading, amateurish blue lozenge behind the “Lesson focus.” If this is the focus of the lesson, shouldn’t the student be aware of it first? Placing the goal at the bottom of the page hides the purpose from the reader and leads him to assume that it’s busy work. The goal should be obvious before the student begins the section.

Other examples of violations of good taste in typography and color can easily be cleaned up. A more serious issue is how the authors chose to present the identification of common multiples. Look closely at the following chart:

[Figure: the original table]

It depicts a “one hundred board”, a graphic that will be familiar to most students. But no key is provided. It takes a bit of detective work to figure out that the multiples of 6 are circled and the multiples of 4 are set on a green background. But what about the numbers with a yellow background? They mean nothing. The yellow was used gratuitously to give a splash of color to the page. But again, it’s worse than gratuitous: it could be misleading, or it could simply slow the student down in understanding the concept. We can easily reformat the table to remove extraneous signals and to apply the principle of minimum necessary difference:

[Figure: improved table]

The new table is not alarmingly large. Its grid is just distinct enough to see the structure without making the figures appear to be imprisoned at Alcatraz. And the typeface is the more legible Gill Sans. Instead of using mixed signals for cell color, I’ve used a consistent white background with only the multiples of 4 colored medium red. I don’t have a problem with the ragged edge border, although a more authentic representation would show the last row of the standard table with figures to 100. But this redesign will suffice.

By cleaning up the typography, removing gratuitous graphics, simplifying the table and linking it more logically to the text, the page has a less distracting appearance.

[Figure: improved page]