Thursday, November 8, 2012

Hadoop Releases, Projects and features in a nutshell

Hadoop Features in HDFS and MR across releases



News on releases: October 2012

9 October, 2012: Release 2.0.2-alpha available

This is the second (alpha) version in the hadoop-2.x series.
This release delivers significant enhancements to HDFS HA. It also includes a significantly more stable version of YARN which, at the time of release, had already been deployed on a 2000-node cluster.
Please see the Hadoop 2.0.2-alpha Release Notes for details.
Latest release (2.0.2): http://hadoop.apache.org/docs/current/
Latest stable release (1.0.4): http://hadoop.apache.org/docs/stable/

Common
A set of components and interfaces for distributed filesystems and general I/O
(serialization, Java RPC, persistent data structures).


Avro
A serialization system for efficient, cross-language RPC and persistent data
storage.

MapReduce
A distributed data processing model and execution environment that runs on large
clusters of commodity machines.
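To make the processing model concrete, here is a minimal single-process sketch of a word count in the map, shuffle, and reduce style. It uses plain Python rather than the Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for illustration. On a real cluster the framework distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop runs mapreduce", "mapreduce runs on hadoop"]))
```

The mapper and reducer never share state; they only exchange key/value pairs. That independence is what allows the work to be spread over many machines.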

HDFS
A distributed filesystem that runs on large clusters of commodity machines.

Pig
A data flow language and execution environment for exploring very large datasets.
Pig runs on HDFS and MapReduce clusters.

Hive
A distributed data warehouse. Hive manages data stored in HDFS and provides a
query language based on SQL for querying the data. The runtime engine translates
these queries into MapReduce jobs.

HBase
A distributed, column-oriented database. HBase uses HDFS for its underlying
storage, and supports both batch-style computations using MapReduce and point
queries (random reads).

ZooKeeper
A distributed, highly available coordination service. ZooKeeper provides primitives
such as distributed locks that can be used for building distributed applications.
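As a rough illustration of how a distributed lock can be built from simple primitives, the sketch below imitates the commonly described recipe in which each client creates a sequentially numbered node and the lowest number holds the lock. It is an in-memory toy, not the ZooKeeper client API; the class ToyLockQueue and its methods are hypothetical names, and it assumes a single process with no failures or network.

```python
import itertools


class ToyLockQueue:
    """In-memory stand-in for a lock directory (hypothetical, not the real API)."""

    def __init__(self):
        self._sequence = itertools.count()  # mimics sequential node numbering
        self._waiting = {}                  # sequence number -> client name

    def join(self, client):
        """Register a client and return its sequence number (its 'node')."""
        seq = next(self._sequence)
        self._waiting[seq] = client
        return seq

    def holder(self):
        """The client with the lowest sequence number holds the lock."""
        if not self._waiting:
            return None
        return self._waiting[min(self._waiting)]

    def leave(self, seq):
        """Release the lock or abandon the wait (mimics node deletion)."""
        self._waiting.pop(seq, None)


if __name__ == "__main__":
    queue = ToyLockQueue()
    a = queue.join("client-a")
    b = queue.join("client-b")
    print(queue.holder())   # client-a joined first, so it holds the lock
    queue.leave(a)
    print(queue.holder())   # the lock passes to client-b
```

In a real deployment the coordination service also removes the node of a client whose session dies, so a crashed lock holder does not block the others; the toy above leaves that out.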


Sqoop
A tool for efficient bulk transfer of data between structured data stores (such as
relational databases) and HDFS.

Oozie
A service for running and scheduling workflows of Hadoop jobs (including
MapReduce, Pig, Hive, and Sqoop jobs).

Reference note: Hadoop: The Definitive Guide (3rd Edition) & Hadoop
