Thursday, December 8, 2016

Selection Bias

Barack Obama's article in Wired. [1]
Stephen Hawking's article in The Guardian. [2]
Peter Thiel's speech at the RNC. [3]

In the last two months, three renowned people have shared their thoughts about the time we live in.

All of them are highly successful and revered figures in their fields. They are all data-driven; you will find them quoting facts and figures all the time. Yet their central messages differ starkly.

Case 1: Barack Obama's article in Wired

Barack Obama wrote an article titled "Now is the greatest time to be alive". His argument is that we have achieved great breakthroughs. Though it's not utopia, considering history, the present is the best time to live in.

"Just since 1983, when I finished college, things like crime rates, teen pregnancy rates, and poverty rates are all down. Life expectancy is up. The share of Americans with a college education is up too. Tens of millions of Americans recently gained the security of health insurance. Blacks and Latinos have risen up the ranks to lead our businesses and communities. Women are a larger part of our workforce and are earning more money. Once-quiet factories are alive again, with assembly lines churning out the components of a clean-energy age.


And just as America has gotten better, so has the world. More countries know democracy. More kids are going to school. A smaller share of humans know chronic hunger or live in extreme poverty. In nearly two dozen countries—including our own—people now have the freedom to marry whomever they love. And last year the nations of the world joined together to forge the most comprehensive agreement to battle climate change in human history."

Indeed, these are facts. So that does seem like a step towards utopia, doesn't it? As an admirer of his, I assumed the same.

Case 2: Stephen Hawking's article in The Guardian

Stephen Hawking published an article this week, "This is the most dangerous time for our planet". As the title suggests, its central theme is nearly the opposite of the first case.

"The concerns underlying these votes about the economic consequences of globalization and accelerating technological change are absolutely understandable. The automation of factories has already decimated jobs in traditional manufacturing, and the rise of artificial intelligence is likely to extend this job destruction deep into the middle classes, with only the most caring, creative or supervisory roles remaining. This in turn, will accelerate the already widening economic inequality around the world.
The consequences of this are plain to see: the rural poor flock to cities, to shanty towns, driven by hope. And then often, finding that the Instagram nirvana is not available there, they seek it overseas, joining the ever greater numbers of economic migrants in search of a better life. These migrants in turn place new demands on the infrastructures and economies of the countries in which they arrive, undermining tolerance and further fuelling political populism."

I think a lot of us can relate to what he states above. Sadly, it does appear to be the bigger picture at a global scale. We could face some serious issues in the near future.

Case 3: Peter Thiel's speech at RNC

Peter Thiel gave a speech at the RNC highlighting the poor state of the country. Basically, his position was that the US has fallen off its expected trajectory and that things are already bad.

"... our government is broken. Our nuclear bases still use floppy disks. Our newest fighter jets can’t even fly in the rain. And it would be kind to say the government’s software works poorly, because much of the time it doesn’t even work at all. That is a staggering decline for the country that completed the Manhattan project. We don’t accept such incompetence in Silicon Valley, and we must not accept it from our government. Americans get paid less today than ten years ago. But healthcare and college tuition cost more every year. Meanwhile, Wall Street bankers inflate bubbles in everything."

If you think about it, he did mention some facts. The average healthcare cost per capita in the US has reached about $10,000 per year. Medical debt appears to be the leading cause of personal bankruptcy in the US. Education is getting so expensive that people can spend a decade or more repaying student loans.


If you look at the three cases, you will realize how "convenient" data selection can be used to support almost any argument. The difference in the arguments above could also come from different views on how to measure things. Measuring things in the real world is an extremely hard problem. In research, "double-blinded + randomized + controlled" trials are considered the gold standard of evidence (though not the highest level). Even with this gold standard and billions of dollars, an experiment can fail miserably at measuring things. For example, according to a paper published in the Journal of the American Medical Association, cancer drugs in the real world do not always meet the expectations set by clinical trials of the same drugs. The average increase in survival time for patients on these drugs could be a lot less than the trial results suggest. Sometimes the survival time of real-world patients taking these drugs is even less than that of the patients on placebo (sugar pills) in the trial. [4]

That might look like an unnecessary example here. However, the point is that even a ton of money and the brightest minds working together cannot guarantee a good judgment of reality. So the least we can do is take things with a grain of salt rather than as absolute reality.

I think it's hard to eliminate selection bias completely, but it can be reduced. The examples above exhibit a comparatively mild level of selection bias. It can get really ugly and dangerous. Irrespective of its nature, selection bias will, by definition, contribute to twisting the perception of reality and possibly to spreading misinformation.
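
To make the idea concrete, here is a minimal Python sketch of how "convenient" selection works. Everything in it is an assumption for illustration: the indicators are hypothetical, their percentage changes are random made-up numbers centered on zero, and the function names are my own. It is not based on the figures quoted by Obama, Hawking or Thiel.

```python
import random
import statistics


def make_indicators(count=20, seed=42):
    """Return hypothetical indicators mapped to made-up percentage changes."""
    rng = random.Random(seed)
    # Synthetic data: changes are drawn around zero, so there is no built-in trend.
    return {f"indicator_{i}": rng.gauss(0.0, 10.0) for i in range(count)}


def cherry_pick(indicators, k, best=True):
    """Keep only the k indicators that look best (or worst) and drop the rest."""
    ranked = sorted(indicators.items(), key=lambda item: item[1], reverse=best)
    return dict(ranked[:k])


def average_change(indicators):
    """Average percentage change across the given indicators."""
    return statistics.mean(list(indicators.values()))


indicators = make_indicators()
optimist = cherry_pick(indicators, k=5, best=True)    # quotes only the good news
pessimist = cherry_pick(indicators, k=5, best=False)  # quotes only the bad news

print(f"All 20 indicators:  {average_change(indicators):+.1f}%")
print(f"Optimist's 5 picks: {average_change(optimist):+.1f}%")
print(f"Pessimist's 5 picks: {average_change(pessimist):+.1f}%")
```

Both speakers in this sketch quote only true numbers from the same data set, yet they reach opposite conclusions, and neither matches the average over all indicators.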

In some settings, selection bias can twist society's perception significantly, for example:
- TV debates
- News articles on websites whose sole aim could be click rate
- Biographies of highly successful or controversial people
- Speeches by politicians or celebrities
- Public surveys and opinion polls (especially by political parties and related organizations)

Let's make a genuine attempt to check whether what we are shown is the entire picture or just a "convenient" part of it.


