Sunday, January 22, 2017

Wordplay in Information Manipulation

There is a very interesting scene in the movie The Dark Knight. The Joker (the bad guy) is dangling Rachel (the female lead) out of a high-rise window as Batman confronts him. The short conversation goes something like this:

Batman: Let her go!
Joker: Ohh, very poor choice of words

Indeed, maybe the Batman was under a lot of stress. If this poor^ information representation example serves as one end of the spectrum, then researchers might be on the other end.

The way (good) researchers choose their words seems remarkably careful^^. They would love to say something like, "X is associated with increased risk of Y with a p-value of blah-blah" (possibly with extra stress on the word associated).

^Poor = unintentional or careless
^^Careful = intentional and thoughtful

Interestingly, these are not necessarily the people who unfairly manipulate information. A poor or careful choice of words has no definitive relation to information manipulation, though a poor choice of words will lead to ambiguity or misrepresentation. On the other hand, I believe information manipulation can be traced back to either a poor or a careful choice of words (or other means of representation). So there is no easy way to spot it.

The Digital Trends website published an article two months ago with the title "Stanford study *concludes* next generation of robots won’t try to kill us".[1] This title is so far-fetched from the actual content of the report that it would be very hard to qualify it as the truth. Nowhere in the report can we find the conclusion stated by the article.[2] The funny thing is that this article cites another article, written by Fast Company, as the source for the catchy headline. So the title is basically based on Digital Trends' interpretation of Fast Company's interpretation of the study. A poor choice of words to create clickbait.

In the Indian epic of Mahabharata, Guru Dronacharya was invincible while holding a weapon. However, as long as he was alive, the Pandavas could NOT win the Dharm Yuddha. So an ingenious plan was created by Lord Krishna to weaken Dronacharya by spreading a rumor of the death of his son Ashwatthama. Accordingly, Bhima killed an elephant named Ashwatthama, and the message was spread that Ashwatthama had been killed. Guru Dronacharya found it hard to believe. There was one way to confirm it: ask the man who had never lied, Yudhishthira. Yudhishthira, being a virtuous man, refused to tell any lies. However, Lord Krishna convinced him to say 'Ashwatthama Hatahath, Naro Va Kunjaro Va', which means 'Ashwatthama is dead (said in a clear, loud voice, then continuing in a low pitch), but it is not certain whether it was Drona's son or an elephant'. Hearing this, Guru Dronacharya lost heart, laid down his weapons, and was killed. Eventually, the Pandavas won the war. A very careful use of words to manipulate information.

Here is another very interesting example, from the book Bad Science:

"The reports were based on a study that had observed participants over four years, and the results suggested, using natural frequencies, that you would expect one extra heart attack for every 1005 people taking ibuprofen. Or as the Daily Mail, in an article titled "How Pills for Your Headache Could Kill" reported: "British research revealed that patients taking ibuprofen to treat arthritis face a 24 percent increased risk of suffering a heart attack." Feed the fear.

Almost everyone reported the relative risk increases: diclofenac increases the risk of heart attack by 55 percent; ibuprofen, by 24 percent. The Boston Globe was clever enough to report the natural frequency: 1 extra heart attack in 1005 people on ibuprofen. The UK's Daily Mirror, meanwhile, tried and failed, reporting that 1 in 1005 people on ibuprofen "will suffer heart failure over the following year." No. It's heart attack, not heart failure, and it's 1 extra person in 1005, on the top of the heart attacks you'd get anyway. Several other papers repeated the same mistake."
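
To see why the two framings feel so different, here is a small sketch of the arithmetic. It assumes that the 24 percent figure and the 1-in-1005 figure describe the same group of people over the same period, which the quote does not state explicitly. The function names are my own and do not come from the book.

```python
def implied_baseline_risk(relative_increase, people_per_extra_case):
    """Baseline risk implied by a relative increase and a natural frequency.

    Assumption: both figures describe the same population and time window.
    """
    # "1 extra heart attack in 1005 people" is the absolute risk increase.
    absolute_increase = 1.0 / people_per_extra_case
    # baseline * relative_increase = absolute_increase, so solve for baseline.
    return absolute_increase / relative_increase


def risk_with_drug(relative_increase, people_per_extra_case):
    """Risk for people on the drug, under the same assumption."""
    baseline = implied_baseline_risk(relative_increase, people_per_extra_case)
    return baseline * (1.0 + relative_increase)


if __name__ == "__main__":
    baseline = implied_baseline_risk(0.24, 1005)
    treated = risk_with_drug(0.24, 1005)
    print(f"without ibuprofen: about {baseline * 1000:.1f} heart attacks per 1000 people")
    print(f"with ibuprofen:    about {treated * 1000:.1f} heart attacks per 1000 people")
```

Under that assumption, the "24 percent increased risk" headline and the "1 extra in 1005" statement are the same fact: a move from roughly 4 to roughly 5 heart attacks per 1000 people. Only the choice of words changes how alarming it sounds.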

Creating catchy (possibly misleading) headlines translates directly into revenue in this age of click rates. To be fair, these reporters are generous enough to sneak the caveats that qualify the title somewhere deep into the article. Unfortunately, this is not limited to science reporting. In 2011, when the anti-corruption movement against the UPA-2 government in India was at its peak, many news outlets used to publish similar articles. In a show called Devil’s Advocate on CNN-IBN, Mr. Kejriwal told Karan Thapar, “Citizens are more important than Parliament. It is in the Constitution. Anna Hazare and every citizen is supreme. I think the Constitution says so”.[3] Irrespective of your views on Mr. Kejriwal, I think you can see through the memorable headline the Times of India created out of it: "Anna Hazare is above parliament: Arvind Kejriwal" [4]

Part of the problem is that a lot of us don't have the time or interest to see beyond the wordplay and verify the information from multiple sources (which reduces the possibility of selective reporting) or from the original source (which reveals the ground truth). Another aspect is that a lot of us crave flashy headlines. Who wants to read Nature News when BuzzFeed is writing about science?

In some cases, we need a special qualification in order to interpret the words, as with legal systems. We can observe manipulation based on "poorly" worded laws and their "careful" interpretation. A specific example would be a colonial-era hate-speech law in India called Section 295A. It is often used to target rationalists in India who debunk godmen. Sanal Edamaruku, founder and president of Rationalist International, debunked an event at a church in Mumbai that had been perceived as magic. Cases were registered against him at three separate police stations. [5]

I think technology can offer a solution to the reporting problem to some extent. Perhaps publishing a white-box algorithm in a peer-reviewed open-access journal, one that periodically ranks reporters/anchors and newspapers/TV channels on selection bias, exaggeration factor, etc., might be a good start. I am not sure how feasible that is, given the many constraints. And even if someone does come up with the algorithm, industry adherence is another uphill battle. Meanwhile, let's watch out for words.

