Monday, August 23, 2010

Big Data - Now and the future

Data, data and more data!! The era of big data is upon us. Terabyte data sets are slowly becoming commonplace, and petabyte and exabyte data sets are expected soon.

What are the underlying trends that caused the explosion of big data, or more aptly, semi-structured big data? On the web, the first is the rise of web search and the second is the rise of social networking.

Search companies like Google needed a way to index the entire web on their machines. Google came up with the concept of MapReduce, a data processing framework that runs on commodity machines, to do this cost effectively. Open source implementations of MapReduce, most notably Hadoop, soon followed to solve these data processing issues.

Social networking also required that the Facebooks and LinkedIns of the world store huge amounts of user-generated data coming in at a very high rate. They then had to index it, analyze it and generate insights from it to drive further user adoption and virality. A lot of this data was semi-structured (it did not fit neatly in a database) and required a lot more computation to generate insights than the traditional BI model.

This is leading to the rise of the so-called Big Data Stack at consumer internet companies, and it has seven major components:

Big Data Storage: NoSQL databases - Cassandra/Voldemort, HDFS, HBase
Big Data Indexing and index storage: Lucene, Katta or NoSQL stores like the above; Zoie (real-time indexing from LinkedIn); Bobo for faceted search
Big Data Processing and Analytics: Hadoop, Hive, Pig
Big Data Workflows: Oozie (Yahoo), Azkaban (LinkedIn), Cascading (Chris Wensel)
Big Data and Big Log transportation: Chukwa, Flume, Scribe etc.
Big Data Intelligence: Mahout (a machine learning framework that can run on top of Hadoop)
Big Data Sharding: Gizzard (a middleware sharding framework developed by Twitter)

(The exact use cases of the above stack and its variations at various internet companies merit their own discussion and are outside the scope of this article; I will address them in another post.)
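
To make the MapReduce idea mentioned earlier a little more concrete, here is a minimal single-process sketch in Python that builds a tiny inverted index, the kind of structure a search engine needs. It is only an illustration of the map, shuffle and reduce steps, not how Google or Hadoop implement them; the function names and the two-page toy corpus are made up for this example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (word, doc_id) pair per word in a document."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this across machines;
    here it is just an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, doc_ids):
    """Reduce step: collapse the doc ids for one word into a sorted, unique list."""
    return word, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle and reduce in sequence over a dict of documents."""
    pairs = [
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    ]
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


# Hypothetical toy corpus, purely for illustration.
docs = {
    "page1": "big data is big",
    "page2": "data about the web",
}
index = build_index(docs)
print(index["data"])
```

The point of the model is that the map and reduce functions never share state, so a framework can run many copies of each on different commodity machines and only has to coordinate the grouping step in between.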

Traditional Fortune 500 enterprises have long relied on an enterprise architecture stack consisting of RDBMS and BI software running on high-end servers. However, there was no good way to handle unstructured and semi-structured data until recently. As more ideas like user-generated data percolate from the consumer internet into the enterprise, enterprises are beginning to see the same big data issues that were first experienced in the consumer internet space. There is also a growing realization that data can now be processed cost effectively to generate hidden insights and drive competitive advantage.

However, today's CIOs lack the tools needed to manage this data. Even though this new stack and its frameworks are maturing, the skill level currently required of IT staff to handle them is very high. Every CIO is also pressed on budget and under pressure to deliver value to the business with minimal staff. I think we will see a lot of tools and processes develop around big data to ease the transition to the enterprise.

It should be an interesting space to watch!!

