HADOOP
Apache Hadoop is a software framework, released under a free license, that supports data-intensive distributed applications. With it, applications can work across thousands of nodes and petabytes of data. Google's MapReduce and Google File System (GFS) papers inspired the development of Hadoop. It is a top-level Apache project with contributors from all over the world, and it is developed in the Java programming language.
Hadoop History
- Dec 2004: Google paper published
- July 2005: Nutch uses new MapReduce implementation
- Jan 2006: Doug Cutting joins Yahoo!
- Feb 2006: Hadoop becomes a new Lucene subproject
- Apr 2007: Yahoo! running Hadoop on 1000-node cluster
- Jan 2008: An Apache Top Level Project
- Feb 2008: Yahoo! production search index with Hadoop
- July 2008: First reports of a 4000-node cluster

Architecture
Hadoop is designed to process large volumes of information efficiently by connecting many commodity computers so that they work in parallel. Together these machines form a single, cost-effective compute cluster.
The supported filesystems include:
- HDFS: Hadoop's own rack-aware filesystem.
- CloudStore, which is also rack-aware.
- Amazon S3 filesystem.
- FTP filesystem, where the data is stored on FTP servers.
- HTTP and HTTPS filesystems, which are read-only.

Hadoop Distributed File System

HDFS stores large files across multiple machines and replicates the data across multiple hosts. By default, data is stored on three nodes: two on the same rack and one on a different rack.
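
The placement rule above (two copies on one rack, one copy on another) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not Hadoop's actual placement code; the function name place_replicas and the rack/node names are made up for the example.

```python
import random


def place_replicas(nodes_by_rack, rng=None):
    """Pick three nodes for one block: two on one rack, one on another.

    nodes_by_rack is a hypothetical mapping {rack_name: [node_name, ...]}.
    Returns a list of (rack, node) pairs.
    """
    rng = rng or random.Random()

    # Only a rack with at least two nodes can hold the pair of copies.
    pair_racks = [r for r, nodes in nodes_by_rack.items() if len(nodes) >= 2]
    if not pair_racks:
        raise ValueError("no rack has two or more nodes")
    pair_rack = rng.choice(pair_racks)

    # The third copy must land on a different, non-empty rack.
    other_racks = [r for r, nodes in nodes_by_rack.items()
                   if r != pair_rack and nodes]
    if not other_racks:
        raise ValueError("need a second rack for the third copy")
    other_rack = rng.choice(other_racks)

    pair = rng.sample(nodes_by_rack[pair_rack], 2)
    single = rng.choice(nodes_by_rack[other_rack])
    return [(pair_rack, pair[0]), (pair_rack, pair[1]), (other_rack, single)]


# Example cluster (assumed names): two racks, five nodes.
cluster = {"rack1": ["n1", "n2", "n3"], "rack2": ["n4", "n5"]}
print(place_replicas(cluster))
```

With this layout, losing a single node or even a whole rack still leaves at least one copy of the data reachable.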

Job Tracker and Task Tracker: the MapReduce engine
The MapReduce engine sits on top of the file system. It consists of:
- Job Tracker: client applications submit MapReduce jobs to it.
- Task Tracker: the Job Tracker distributes work to the available Task Tracker nodes in the cluster. The Job Tracker knows which node contains the data and which machines are nearby, so it can place work close to the data.
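
How the Job Tracker might use that knowledge can be shown with a small Python sketch. It assumes a simple preference order (a node that holds the data, then a node on the same rack, then any free node); the function assign_task and all node and rack names are hypothetical, and the real scheduler is more involved.

```python
def assign_task(block_replicas, free_trackers, rack_of):
    """Choose a Task Tracker for one task, preferring nodes near the data.

    block_replicas: nodes that hold a copy of the task's input data
    free_trackers:  nodes that currently have a free task slot
    rack_of:        hypothetical mapping {node: rack}
    Returns (node, locality), where node is None if nothing is free.
    """
    # 1. Best case: a free tracker that already stores the data.
    for node in free_trackers:
        if node in block_replicas:
            return node, "node-local"

    # 2. Next best: a free tracker on the same rack as one of the copies.
    replica_racks = {rack_of[n] for n in block_replicas}
    for node in free_trackers:
        if rack_of[node] in replica_racks:
            return node, "rack-local"

    # 3. Otherwise take any free tracker, or report that none is free.
    if free_trackers:
        return free_trackers[0], "off-rack"
    return None, "no free tracker"


# Example (assumed names): the data lives on n1 and n3.
rack_of = {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack3"}
print(assign_task(["n1", "n3"], ["n2", "n3"], rack_of))
```

Running the work where the data already sits avoids moving large blocks across the network, which is the reason the Job Tracker keeps track of data location.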

For further details, refer to this link:
http://en.wikipediawiki/Hadoop

PDFs are available at these links:
http://scribddoc/12021062/Hadoop
http://scribddoc/7136281/Hadoop-Primer
http://scribddoc/21962428/Hadoop-Infrastucture
http://scribddoc/23227066/Apache-Hadoop
Messages In This Thread
HADOOP - by pri_niture - 12-03-2010, 03:53 PM
RE: HADOOP - by computer science topics - 16-06-2010, 09:42 PM
RE: HADOOP - by projects wizhard - 27-07-2010, 06:21 PM
RE: HADOOP - by projectsofme - 09-10-2010, 09:47 AM
RE: HADOOP - by seminar class - 06-05-2011, 04:55 PM
RE: HADOOP - by athare - 16-07-2012, 09:28 PM
RE: HADOOP - by seminar details - 17-07-2012, 10:50 AM