Debuting Truth Teller, a prototype from the Washington Post
Truth Teller is a news application built by the Washington Post with funding from a Knight News Prototype grant. The goal of Truth Teller is to fact check speeches in as close to real time as possible. The three-month prototype built by the Post is an enormous step in that direction.
The genesis of Truth Teller was fairly well captured in this Poynter piece, which came out around the time the funding was announced. (One note: the politician mentioned was Michele Bachmann — a she, not a he.)
Steven Ginsberg saw the future of fact-checking while listening to a politician tell lies in Iowa last summer. “It was one of those small parking lot affairs outside a sports bar and the candidate was there speaking to about 30 people,” said Ginsberg, The Washington Post’s national political editor. “For about 45 minutes he said a lot of things that I knew to not be true, and nobody else there knew that.” Ginsberg thought there must be a way to offer people in the crowd a real-time accounting of the politician’s misstatements. He called Cory Haik, the Post’s executive producer for digital news, and outlined the issue.
For the prototype, we focused on the looming debate over tax reform, both because of its timing and its import for the country. The tax debate will play out over several months and naturally lends itself to deceit and deception, even more so than many policy discussions. We hope that our application will help direct the conversation toward the truth as it is happening so that Americans get a fair shot at deciding this critical issue.
The Truth Teller prototype was built and runs with a combination of several technologies — some new, some very familiar. We’ve combined video and audio extraction with a speech-to-text technology to search a database of facts and fact checks. We are effectively taking in video, converting the audio to text (the rough transcript below the video), matching that text to our database, and then displaying, in real time, what’s true and what’s false.
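To make the flow above concrete, here is a minimal sketch of the last two stages: matching transcript text against a fact database and reporting a verdict with a timestamp. It is an illustration under stated assumptions, not the prototype's code. The names `Segment`, `FactCheck`, `normalize`, and `check_segments` are hypothetical, the transcript is assumed to arrive as timestamped text segments, and exact phrase containment stands in for the real matching step.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Segment:
    """A piece of rough transcript (hypothetical shape)."""
    start_seconds: float  # where in the video the segment begins
    text: str             # speech-to-text output for that segment


@dataclass
class FactCheck:
    """One entry in the fact database (hypothetical shape)."""
    claim: str    # wording of the claim we have already checked
    verdict: str  # for example "true" or "false"


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def check_segments(
    segments: Iterable[Segment], facts: List[FactCheck]
) -> Iterator[Tuple[float, FactCheck]]:
    """Yield (timestamp, fact check) for each claim found in a segment.

    Assumption: plain substring containment is used here only as a
    placeholder for approximate matching.
    """
    for segment in segments:
        spoken = normalize(segment.text)
        for fact in facts:
            if normalize(fact.claim) in spoken:
                yield segment.start_seconds, fact
```

Because `check_segments` is a generator, it can consume segments as they arrive and emit each verdict as soon as a match is found, which is the behavior a real-time display needs.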
We are transcribing videos using Microsoft Audio Video Indexing Service (MAVIS) technology. MAVIS is a Windows Azure application that uses state-of-the-art deep neural net (DNN) based speech recognition to convert audio signals into words. Using this service, we extract the audio from videos and save the resulting transcript in our Lucene search index. We then look for the facts in the transcript. Finding distinct phrases to match is difficult, which is why we are focusing on patterns instead.
We are using approximate string matching, also known as fuzzy string searching. Our first implementation is a modified version of the Rabin-Karp algorithm that uses Levenshtein distance. In the future, it will be extended to recognize paraphrasing and negative connotations.
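As a rough illustration of approximate matching with Levenshtein distance, the sketch below slides a window of words across a transcript and keeps the windows that fall within a small edit distance of a known claim. It is not our modified Rabin-Karp implementation: there is no rolling hash, and the function names and the `max_ratio` threshold are assumptions made for the example.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def find_approximate(
    claim: str, transcript: str, max_ratio: float = 0.2
) -> List[Tuple[int, str, int]]:
    """Return (word index, window text, distance) for near matches.

    A window matches when its edit distance to the claim is at most
    max_ratio times the claim's length in characters (an assumed rule).
    """
    claim_words = claim.lower().split()
    words = transcript.lower().split()
    if not claim_words:
        return []
    size = len(claim_words)
    target = " ".join(claim_words)
    limit = int(len(target) * max_ratio)
    hits = []
    for start in range(len(words) - size + 1):
        window = " ".join(words[start:start + size])
        distance = levenshtein(target, window)
        if distance <= limit:
            hits.append((start, window, distance))
    return hits
```

For example, searching for the claim "taxes went up last year" in a transcript where the recognizer heard "lass" instead of "last" still finds the passage, because the two phrases differ by a single character substitution. A fixed-size word window will miss paraphrases that add or drop words, which is one reason a sketch like this is only a starting point.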
What you see in the prototype is actual live fact checking: each time the video is played, the fact checking starts anew. It needs more technical work, and we need more facts. For instance, if you stumble across what you think is a false positive, let us know (haikc [at] washpost [dot] com); we’re tuning the algorithm as we go. It’s a proof of concept, a prototype in the truest sense. But do we think this can be applied to streaming video in the future? Yes. Can this work if someone is holding up a phone to record a politician in the middle of a field in Iowa? Presenting the truth is without dispute one of the most important missions of journalism. So yes, we believe it can.
– Cory Haik, Executive Producer for Digital News
The Washington Post Truth Teller team:
Cory Haik, Executive Producer for Digital News
Steven Ginsberg, National Political Editor
Joey Marburger, Mobile Design Director
Yuri Victor, UX Director
Siva Ghatti, Director, Application Development
Ravi Bhaskar, Principal Software Engineer
Gaurang Sathaye, Principal Software Engineer
Julia Beizer, Mobile Projects Editor
Sara Carothers, Producer