Recently, Microsoft's Build 2017, an annual developer conference, took place in Seattle and the company once again displayed their crown jewels along with big plans for the future (as always). With a Colgate smile, attendees were shown what smart services can do with cloud data - and data protection officials were probably reaching for their suicide pills right away. Video Indexer is a great example of how to leverage these new technologies and you can give it a try - if you dare. After all, Microsoft didn't lie when they spoke of a "democratization of surveillance tools".
A bright outlook into the future at Build 2017
Artificial intelligence is slowly but noticeably making its way into our everyday lives. With Video Indexer, Microsoft gives the public a tool that analyzes videos on multiple levels, and they didn't cut corners. Aside from processing the usual attributes such as length, file format and file name, the tool also scans for speech and emotions. Will the Azure server know when I am happy? Reason enough to upload a few videos of my own and let Video Indexer have a go at them.
Since Microsoft Video Indexer is an online service, you'll first need to register at https://www.videoindexer.ai to upload your movies. I went with a colorful mix of different videos (including some related to Ashampoo) and transferred them into the cloud. After just a few minutes I was stunned. While the nature video yielded no results (the tool probably knows nothing about manatees!), the analysis of the interview I uploaded went into considerable depth. Central issues were recognized and appeared as keywords in the list to the right of the video. Clicking a keyword instantly moved playback to the corresponding time index. It's also possible to filter entire video collections by keyword. That's impressive and, once again, likely a wet dream of all secret service agencies.
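For the curious, the idea behind clickable keywords and collection-wide filtering can be sketched in a few lines of Python. This is only an illustration of the general technique (an index from keyword to video and timestamp), not Microsoft's implementation; the sample data and the function names `build_keyword_index`, `find_videos` and `jump_targets` are made up for this example.

```python
from collections import defaultdict

# Hypothetical data: (video name, keyword, start time in seconds).
# These rows are invented for illustration and are not Video Indexer output.
detections = [
    ("interview.mp4", "data security", 42.0),
    ("interview.mp4", "cloud", 95.5),
    ("panel.mp4", "cloud", 12.0),
    ("interview.mp4", "data security", 310.0),
]


def build_keyword_index(rows):
    """Map each lower-cased keyword to the videos and timestamps where it occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for video, keyword, seconds in rows:
        index[keyword.lower()][video].append(seconds)
    return index


def find_videos(index, keyword):
    """Return the sorted names of all videos that mention the keyword."""
    return sorted(index.get(keyword.lower(), {}))


def jump_targets(index, keyword, video):
    """Return the sorted timestamps a player could seek to for one video."""
    return sorted(index.get(keyword.lower(), {}).get(video, []))


index = build_keyword_index(detections)
print(find_videos(index, "cloud"))
print(jump_targets(index, "Data Security", "interview.mp4"))
```

Filtering a collection then amounts to a single lookup per keyword, which is why it feels instant once the analysis itself is done.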
Does everything it's supposed to do - and maybe more
Transcripts are also worth mentioning. Video Indexer analyzes and transcribes all spoken words. Even instant translations are possible, though the results naturally depend on the audio quality of the source material. Amateur recordings yield fragmented texts, but professional interviews recorded through a good microphone result in almost flawless transcriptions, though a few funny bloopers are bound to pop up here and there. My boss certainly didn't reply with "Kidney. Life, never" when asked about data security. The technology is still in development (engineers are currently working on body language recognition) but the current state is already astonishing. Text that appears in the picture is recognized as well. If you ever wanted to see how states manage to capture and analyze license plates, here's your chance.
Naturally, face detection is included as well. Celebrities are instantly recognized through Bing, while identifying private individuals is left to the user, and any name provided is stored permanently. Again, video quality matters. Blurry or badly lit source material will quickly turn your colleagues into Hollywood stars or politicians in the eyes of the tool. This may be flattering but it's still an error. Video Indexer furthermore tries to determine the situation and relevance of each character to a scene. For my panel show, the tool concluded the loudmouth had about 40% of the on-screen time and beat all other participants in that respect. That's a life lesson learned!
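How might a figure like that 40% come about? A minimal sketch, assuming the share is simply each person's detected appearance time divided by the total across all participants (Microsoft may well calculate it differently). The intervals below and the function name `screen_time_shares` are invented for illustration.

```python
def screen_time_shares(appearances):
    """Return each person's fraction of the total detected appearance time.

    appearances: iterable of (person, start_seconds, end_seconds) tuples.
    """
    totals = {}
    for person, start, end in appearances:
        totals[person] = totals.get(person, 0.0) + (end - start)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {person: seconds / grand_total for person, seconds in totals.items()}


# Hypothetical appearance intervals for a five-minute panel show.
appearances = [
    ("loudmouth", 0, 80),
    ("guest_a", 80, 170),
    ("guest_b", 170, 200),
    ("loudmouth", 200, 240),
    ("host", 240, 300),
]

shares = screen_time_shares(appearances)
for person, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{person}: {share:.0%}")
```

With these made-up numbers, the loudmouth is on screen for 120 of 300 seconds, which works out to the same 40%.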
Speech sentiment is where it got really interesting. Video Indexer analyzes a speaker's genuine feelings about their words and classifies them as either neutral, positive or negative on a neatly colored timeline. The hit ratio is high. A video with a slightly tipsy friend of mine talking about her vacation trip received a deep green (positive) while a scientific lecture was mainly rated gray (neutral). When an elderly person started talking about the government, everything turned red. Microsoft definitely detected the anger! Roaring cheers, however, were misinterpreted as aggression. There's also an option to scan for "explicit content". Since I had no such videos, I skipped this feature. I swear!
Advanced object and gesture detection, among other things, are also still in the works. Though the current state is labeled "preview" by Microsoft, it is easy to see where this already powerful tool is going. If you own a ton of videos (that contain people and lots of talk), you'll save hours processing them with Video Indexer, as long as you can warm up to the whole cloud concept. That's what it'll ultimately come down to since the product won't be available offline.
It's difficult to draw a final conclusion because, aside from the technical brilliance, the product raises various questions. How will private consumers, employers or government institutions use these new abilities? Microsoft provides a freely accessible API (application programming interface) so other developers can integrate the technology into their own products. Where will they draw the line, and who'll enforce their usage policy, especially in view of potentially unethical use? Once widely available, third parties may use the technology to foster surveillance, censorship and tracking. It's the old story: anything that can be done will be done. So is it worth it?
Pictures: Microsoft (Azure)
Someone says that using this device and others like it is insanity because we would give up privacy for the sake of a small amount of convenience.
I would say that they have forgotten that their own ancestors deliberately compromised their own privacy for good and all when they chose to live within a system rather than be dependent on their own resources to survive. Resorting to this flawed policy could be a true explanation of the real reason why the Neanderthals became extinct over time!
Civilisation is not civilised ... we have no privacy worthy of the name. That kind of privacy comes with being born with huge degrees of power and wealth, even though the most prominent of these powerful people live in what is claimed to be ... the public sphere ... which is their property ... not ours!
Sven first I want to thank you and everyone at Ashampoo for helping me take ownership of my new laptop. Ashampoo's AntiSpy is a brilliant piece of software engineering.
You are drawing out a clear question about privacy. What happens when a piece of video/audio is deemed "suspicious" and handed over for a DMCA claim? There goes another right to own a video one has produced.
Have a great weekend!
Is Source Intentional Benevolence Real in the Paradox of Limitless Cosmos beyond human understanding's Evidence? There IS Flow! -NJA Thanks !!!
Is Cyber-space an exercise in more unaware goofy Narcissism?
Insanity. People willing to give up privacy for a tiny amount of convenience.
"Hey, just upload your personal videos here, and we'll make sure your face and the face of everyone in them gets added to every facial recognition database in the universe. If you haven't done anything wrong, you have no reason to worry. Rumors of our being fully complicit with the NSA, FBI, and CIA are not (totally) true."
Are people really that stupid? I'm afraid they are.
" Video Indexer analyzes a speaker's genuine feelings about their words and classifies them as either neutral, positive or negative on a neatly colored timeline. The hit ratio is high."
Wow - I must make certain that I never upload a video of myself saying to my wife, "no dear, of COURSE I don't fancy Peter's wife"!
I’d welcome it, if a buzzer sound effect was played with every lie. :)
It seems they hoped we would overlook the "demon" in "democratization of surveillance tools".