一個值得研究的領域 - Hadoop
In Open Source, Hadoop
.Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data.
.Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS) MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.
.Hadoop is a Lucene sub-project that contains the distributed computing platform that was formerly a part of Nutch.
Hadoop相關資源
.Getting Started with Hadoop, Part 1
.Open Source Distributed Computing: Yahoo's Hadoop Support
.Yahoo! Launches World's Largest Hadoop Production Application
.Google Code for Distributed Systems
.Running Hadoop MapReduce on Amazon EC2 and Amazon S3
.Building an Inverted Index for an Online E-Book Store
.Running Hadoop On Ubuntu Linux (Single-Node Cluster)
.Running Hadoop On Ubuntu Linux (Multi-Node Cluster)
MapReduce相關資源