1662, London
John Graunt performed a detailed analysis of the bills of mortality, looking for hidden patterns. He built a model intended to predict the next outbreak of bubonic plague in the city.
1880, USA
According to this link, the first big data problem was solved during the census of 1880. A Hollerith tabulating system was used to handle the huge data of 50 million people, helping complete a task of almost 7+ years in just 6 weeks. This was a step forward in the infrastructure required to deal with big data.
Sometime in 1999, at a gelato shop near the Silicon Valley research labs, CA
Jeff Dean: I want to do something Big.
Sanjay Ghemawat: Hmmm, do you mean related to Big Data?
Jeff Dean: Exactly, related to Big Data.
Sanjay Ghemawat: Then let's do it!
The conversation might have been a little different, but both loved that gelato shop. True story! Later these legends joined Google and set benchmarks in the history of Big Data.
October 2003, ACM SOSP Symposium, Lake George, NY
Sanjay Ghemawat, with two coauthors, presented a paper on The Google File System, a distributed file system that supported Google's large-scale infrastructure. In simple words, it let Google use huge clusters of commodity machines instead of supercomputers to deal with huge amounts of data.
December 2004, OSDI Symposium, San Francisco, CA
Jeff Dean and Sanjay Ghemawat presented a paper on MapReduce, a programming model for processing huge datasets. It was based on the basic functions map and fold (reduce) from functional programming. This helped Google run different functions (jobs) across clusters of machines.
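To get a feel for the model, here is a toy word-count sketch in plain Python: the map step emits (word, 1) pairs and the reduce (fold) step sums them per word. This is only a single-machine illustration of the idea, not Google's actual API; the real system runs the map and reduce phases in parallel across a cluster.

```python
# Minimal, single-machine sketch of the MapReduce idea (word count).
from functools import reduce
from itertools import groupby

documents = ["big data is big", "data about data"]

# Map phase: emit (word, 1) pairs from every document.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle phase: group the intermediate pairs by key (the word).
mapped.sort(key=lambda pair: pair[0])
grouped = {key: [count for _, count in pairs]
           for key, pairs in groupby(mapped, key=lambda pair: pair[0])}

# Reduce phase: fold each group of counts into a single total.
word_counts = {word: reduce(lambda a, b: a + b, counts)
               for word, counts in grouped.items()}

print(word_counts)  # {'about': 1, 'big': 2, 'data': 3, 'is': 1}
```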
2005
Two researchers, Doug Cutting (from Yahoo) and Mike Cafarella, built an open-source framework based on GFS and MapReduce, called Hadoop.
November 2006, OSDI Symposium, Seattle, WA
Once again Jeff Dean, Sanjay Ghemawat, and a few other coauthors presented a paper, this time on Bigtable, a distributed multidimensional sorted map for structured data (not to be confused with a distributed relational database), built on top of the Google File System.
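The data model can be pictured as a sorted map keyed by (row key, column, timestamp); the row and column names below are the webtable example from the Bigtable paper. This is just a toy in-memory analogy in Python, nothing like the real distributed implementation.

```python
# Toy illustration of the Bigtable data model: a map from
# (row key, column family:qualifier, timestamp) to an uninterpreted value.
table = {}

def put(row, column, timestamp, value):
    table[(row, column, timestamp)] = value

def get_latest(row, column):
    # Return the value with the highest timestamp for this row/column cell.
    versions = [(ts, val) for (r, c, ts), val in table.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None

put("com.cnn.www", "contents:", 1, "<html>...v1...</html>")
put("com.cnn.www", "contents:", 2, "<html>...v2...</html>")
put("com.cnn.www", "anchor:cnnsi.com", 1, "CNN")

print(get_latest("com.cnn.www", "contents:"))  # <html>...v2...</html>
```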
to be continued...