Radha Popuri's blog: Big Data - Now and the future

Monday, August 23, 2010

Big Data - Now and the future

Data, data and more data!! The era of big data is upon us. Terabyte data sets are slowly becoming commonplace, and petabyte and exabyte data sets are expected soon.

What are the underlying trends that caused the explosion of big data - or, more aptly, semi-structured big data? On the web, the first is the rise of web search and the second is the rise of social networking.

Search companies like Google needed a way to index the entire web on their own machines. Google came up with MapReduce - a data processing framework that runs on commodity machines - to do this cost-effectively. Hadoop, an open source implementation of MapReduce, soon followed to solve the same data processing problems. Social networking, in turn, required the Facebooks and LinkedIns of the world to store huge amounts of user-generated data arriving at a very high rate. They then had to index it, analyze it and generate insights from it to drive further user adoption and virality. A lot of this data was semi-structured (it did not fit neatly into a database) and required a lot more computation to generate insights than the traditional BI model.

This is leading to the rise of the so-called Big Data Stack at consumer internet companies, and it has seven major components:

Big Data Storage: NoSQL databases (Cassandra, Voldemort, HBase) and HDFS
Big Data Indexing and Index Storage: Lucene, Katta, or NoSQL stores like those above; Zoie (real-time indexing from LinkedIn); Bobo for faceted search
Big Data Processing and Analytics: Hadoop, Hive, Pig
Big Data Workflows: Oozie (Yahoo), Azkaban (LinkedIn), Cascading (Chris Wensel)
Big Data and Big Log Transportation: Chukwa, Flume, Scribe, etc.
Big Data Intelligence: Mahout (a machine learning framework that can run on top of Hadoop)
Big Data Sharding: Gizzard (a middleware sharding framework developed by Twitter)
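
To make the processing and indexing layers a little more concrete, here is a minimal, single-machine sketch of the MapReduce idea applied to building a search index. It is only an illustration in plain Python, not Hadoop's or Google's actual API: the function names (map_phase, shuffle, reduce_phase, build_index) and the sample documents are made up for this example, and a real framework would run the map and reduce steps in parallel across many commodity machines.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, doc_ids):
    """Reduce step: collapse grouped values into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle and reduce in sequence to build an inverted index."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


# Hypothetical sample documents, used only for illustration.
docs = {
    "page1": "big data is big",
    "page2": "data processing on commodity machines",
}
index = build_index(docs)
print(index["data"])  # ['page1', 'page2']
print(index["big"])   # ['page1']
```

Because each map call looks at only one document and each reduce call at only one word, the work can be split across machines without the steps needing to coordinate, which is what makes the model a good fit for cheap commodity hardware.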

(The exact use cases of the above stack and its variations at various internet companies merit their own discussion and are outside the scope of this article; I will address them in another post.)

Traditional Fortune 500 enterprises have long relied on an enterprise architecture stack consisting of RDBMS and BI software running on high-end servers. However, there was no good way to handle unstructured and semi-structured data until recently. As more ideas like user-generated data percolate from the consumer internet into the enterprise, enterprises are beginning to see the same big data issues that were first experienced in the consumer internet space. There is also a growing realization that data can now be processed cost-effectively to generate hidden insights and drive competitive advantage.

However, today's CIOs lack the tools needed to manage this data. Even though this new stack and its frameworks are maturing, the skill level currently required of IT staff to handle them is very high. And every CIO is pressed on budget and under pressure to deliver value to the business with minimal staff. I think we will see a lot of tools and processes develop around big data to ease its transition into the enterprise.

It should be an interesting space to watch!!

