
Tapping into its inner geek, Yahoo! has been working on an open-source grid project with Doug Cutting of Nutch and Lucene (Java-based search library and applications, etc.) fame. Distributed computing projects are all the rage these days, and Yahoo! now has a chit in the "we can crank through terabytes too" grid computing game vs. Google.
Yahoo! has invested in Hadoop, an open-source grid computing project
Building on the success of open-source efforts like LAMP (Linux, Apache, MySQL, Perl/PHP/Python), which have powered much of Web 2.0 and blogging, the Hadoop project hopes to connect processing cycles and users to help create the next big web-based platform innovation.
"So what?"
Many of you are wondering what the hubbub is about. OK, so what if you have Google and Yahoo! setting up big clusters of servers and, in some cases, letting users help them solve problems and improve their products, right?
Google File System (GFS) and Hadoop have similar objectives -- to realize economies on the software used to index, store, append and otherwise process huge chunks of data. This data is rarely eliminated from the system and is typically only edited or added to, much like a Wikipedia entry.
What's interesting in all of this is that, by all appearances, Hadoop is at least roughly based on Google's GFS, a competitor's design. Google doesn't release its GFS software to the world, but it does publish detailed white papers on how it works.
Projects like Hadoop could mean faster and more in-depth searches across petabytes of data and could ultimately be extended to other uses in the Web 3.0 world -- deep vertical search, or any other application that needs to crunch through large volumes of data quickly and routinely to build indexes, spot anomalies or identify patterns.
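For the curious, here is a rough sketch of the "split the data up, crunch the pieces in parallel, merge the results" idea behind this kind of grid indexing. It is not Hadoop's API or Google's code: the function names (index_chunk, merge_indexes, build_index) and the sample documents are made up for illustration, and a thread pool on one machine stands in for a cluster of servers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def index_chunk(chunk):
    """Build a partial index (word -> set of doc ids) for one chunk of documents."""
    partial = defaultdict(set)
    for doc_id, text in chunk:
        for word in text.lower().split():
            partial[word].add(doc_id)
    return partial


def merge_indexes(partials):
    """Combine the partial indexes from every chunk into a single index."""
    merged = defaultdict(set)
    for partial in partials:
        for word, doc_ids in partial.items():
            merged[word] |= doc_ids
    return merged


def build_index(docs, chunk_size=2, workers=4):
    """Split docs into chunks, index each chunk in parallel, then merge."""
    chunks = [docs[i:i + chunk_size] for i in range(0, len(docs), chunk_size)]
    # Hypothetical stand-in: each worker thread plays the role of a grid node.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(index_chunk, chunks))
    return merge_indexes(partials)


# Made-up sample data: (doc id, text) pairs.
docs = [
    (1, "yahoo invests in grid computing"),
    (2, "google publishes grid papers"),
    (3, "open source search library"),
]
index = build_index(docs)
print(sorted(index["grid"]))  # prints [1, 2]
```

The point of the design is that no single worker ever needs to see the whole data set; each one handles its own chunk, and only the small partial results get combined at the end.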
NOTE: Someone needs to tell Hadoop that they need to switch site search providers -- they're currently using Google!
Also see:
Web 3.0: Semantic Web, Scraping & Human Intelligence
With Web 2.0 -- which some have termed "web for the people" -- maturing or even (gasp!) waning, there is a good bit of discussion on what Web 3.0 might look like. One can argue whether the web in fact should have version numbers but the reality is that an "upgrade" to Web 2.0 is in the works.
posted by D.J. on 08/21/07 | Permalink »