What are upload filters?

Sven Krumrey

The European Parliament recently voted on a major reform of EU copyright law. The new rules are meant to give copyright owners better protection for their intellectual property. Upload filters are particularly controversial: they analyze, and potentially block, videos, songs and images during the upload process if they deem them in violation of intellectual property rights. All over the world, copyright owners - movie makers, musicians and authors - have been following the discussion and now feel their finest hour has come. But what are upload filters, how do they work - and why are they so controversial?

This is what the internet will be like

First of all: politicians generally avoid the term itself because it is so unpopular. It raises Orwellian fears of a surveillance society, and no party wants any part of that. So they prefer "content recognition technologies", implemented, you guessed it, through upload filters. Avoiding the bad word keeps citizens/voters calm, even though the technical implementation is virtually identical. Makes sense! From now on, every upload (to Facebook, Instagram, Twitter, YouTube etc.) is to be scrutinized for potential copyright infringement. Profit-oriented websites older than three years must comply or face fines.

But effectively preventing uploads of copyrighted material requires comparing each upload against a database to determine cause for suspicion and, if there is one, having a capable human review the data. Currently, that's an impossible task, simply because of the sheer volume of uploads (YouTube alone receives hundreds of hours of video every minute). First, a database containing samples or, for practicality, hash values of all copyrighted material would have to be created. A hash value is the "essence" of a file distilled into a short, practically unique piece of information. Once the database is in place, every uploaded file would likewise be distilled into a hash value and pattern-matched against it. A positive match would result in the affected upload, e.g. a video, being blocked.
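To make the matching step concrete, here is a minimal sketch in Python of how such a filter could work in principle. The byte strings and the "database" are illustrative assumptions, and a plain cryptographic hash stands in for the perceptual fingerprints real content-recognition systems would use:

```python
import hashlib

# Hypothetical database of hash values of known copyrighted files.
# (Real systems use perceptual fingerprints, not plain cryptographic hashes.)
known_hashes = {
    hashlib.sha256(b"blockbuster-movie-data").hexdigest(),
    hashlib.sha256(b"hit-song-data").hexdigest(),
}

def is_blocked(upload_bytes: bytes) -> bool:
    """Distill the upload into a hash value and match it against the database."""
    fingerprint = hashlib.sha256(upload_bytes).hexdigest()
    return fingerprint in known_hashes

print(is_blocked(b"blockbuster-movie-data"))  # True: exact copy, upload blocked
print(is_blocked(b"my-vacation-video"))       # False: no match, upload allowed
```

The lookup itself is cheap; the hard part, as described above, is building and maintaining a database that actually covers all copyrighted material.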

Where things get tricky is when uploaded files aren't 1:1 copies but slightly modified versions. Let's assume I decided to upload a recent Hollywood blockbuster to YouTube. It should be fairly obvious to anyone that I'd be infringing copyright and would face the owner's resolute veto. But what about a parody? What if I took only portions of the video, re-edited them and added effects, commentary and music to create new and, hopefully, humorous content? The hash value of my work would differ from that of the source material, so pattern-matching would come up empty. Parodies, allusions and fan edits make up a large part of the cultural backbone of the web. In these cases, algorithms would have to detect, and understand, the fine line between IP theft and satire. That's where the system falls apart. Artificial intelligence, however intriguing and versatile it may be, is devoid of the humor and higher cognitive functions that enable us humans to grasp subtle nuances.
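The fragility of exact matching is easy to demonstrate: with a cryptographic hash, even a small edit produces a completely different value, so a re-edited parody no longer matches the original. A toy sketch, where the byte strings merely stand in for real video files:

```python
import hashlib

original = b"blockbuster-movie-data"
parody = original + b" plus my commentary, effects and music"  # slightly modified re-edit

h_original = hashlib.sha256(original).hexdigest()
h_parody = hashlib.sha256(parody).hexdigest()

# The two hashes bear no resemblance to each other,
# so a database lookup on the parody finds no match.
print(h_original == h_parody)  # False
```

This is why real systems must fall back on fuzzier perceptual matching - which is exactly where the false positives and missed satire discussed above come from.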

Artificial intelligence doesn't really get us

Let's look at another example: imagine you bought a neat LEGO model but have lost interest and intend to sell it on eBay. You take a photo of the box and try to upload it. eBay's upload filter compares your photo to whatever data LEGO provided - and blocks it due to similarities. Nobody will ever see your listing. What a bother! In the future, uploads to Facebook involving a screenshot of a newspaper excerpt, a caricature or a GIF based on a movie may never make it past automated filters. Whether your intent is to spread humor, convey information or educate, filters will be a major pain and block what shouldn't be blocked. And since portals will be held directly accountable, they'll likely pull out the big guns and overshoot the mark by a mile. Anything that remotely resembles illegal content will be blocked - after all, the next lawsuit from a copyright owner is always just a few clicks away. Critics therefore expect large-scale overblocking to be the rule rather than the exception.

Normally, newspaper articles fall under copyright protection. But how would Facebook even recognize the excerpt from a current article in a provincial paper you're trying to upload? Nothing is older than yesterday's newspaper, which is why only current news gets shared online. Does that mean all publishers would have to constantly upload their latest editions to keep the databases up to date? I guess so. How else would online portals be able to detect violations? Excerpts are an even tougher nut to crack and would likely require automated, up-to-the-minute optical character recognition. The alternative would be algorithms that block anything bearing even the slightest resemblance to actual articles, shooting down your local club newsletter, historical editions and your neighbors' wedding newspaper in the process. Educational sites like Wikipedia might also be in serious trouble, since their articles are chock-full of quotes, images and texts that usually fall under fair use provisions but might now have to be reevaluated to steer clear of major lawsuits. Skeptics are already predicting the intellectual impoverishment of the internet.

Live streams are another tricky issue. Will gamers be allowed to keep streaming their gameplay live? The games themselves are copyright-protected, but, from a technical standpoint, the question of how to monitor live streams effectively has so far not been definitively answered. And AI isn't ready to come to the rescue just yet. YouTube has been working for years on automated object and face recognition, but its filters still operate on a rudimentary level, with high error rates and plenty of false positives. Humans are falsely flagged as nude, classic Renaissance works of art are deemed pornographic and, at times, dark-skinned people are identified as apes. And these filters are supposed to reliably police online uploads from now on? That seems a bit of a stretch.

A license to silence detractors

Another aspect, especially criticized by privacy groups, is that upload filters are the perfect tool for full-scale online censorship. Sure, the laws don't mention it, but the technical means will be available as soon as the reform is implemented. How hard will it be for governments to resist the temptation to keep close tabs on their citizens and dispose of unwelcome opinions straightaway? It'll certainly be very tempting to silence political adversaries once access to every uploaded file is guaranteed. An image with government-critical content is gaining popularity? A slight tweak to the upload filter will take care of it! A nightmare scenario, indeed!

At this time, readers outside the EU might think themselves blissfully safe from this latest EU-driven attack on the unregulated internet. Good for them! But don't celebrate just yet. US and other international lobbyists have been pushing for similar laws not just in Europe but across the globe. There's no denying that copyright law is out of sync with the internet and in need of reform. However, the proposed solutions differ greatly from country to country. I hope you'll be able to enjoy a mostly free and unregulated internet for years to come, wherever you may live!

What I would like to know: Should politicians have a say on matters they obviously know nothing about?
