Azure HDInsight is Microsoft's distribution of Hadoop. The Azure HDInsight ecosystem includes the following features/components: Pig, Hive, HBase, Sqoop, Oozie, Ambari, Microsoft Avro Library, YARN, Cluster Dashboard, and Tez.
Apart from the features/components listed above, a few other components enable reporting and analytics on top of the data stored in Azure HDInsight. These components include the following:
More information: http://azure.microsoft.com/en-us/documentation/articles/hdinsight-introduction
Here are a few highlights of Azure HDInsight:
Links & Additional Information
Getting Started
Cloudera was the first company formed to build enterprise solutions based on Hadoop. Its Hadoop distribution is known as Cloudera's Distribution for Hadoop (CDH). Here is a simplified representation of Cloudera's Hadoop ecosystem.
Source: http://www.cloudera.com/content/cloudera/en/products-and-services/cdh.html
Cloudera's Hadoop ecosystem includes the following features/components: Apache Avro, Apache Crunch, Apache DataFu, Apache Flume, Apache Hadoop, Apache HBase, Apache Hive, Hue, Cloudera Impala, Kite SDK (formerly CDK), LLAMA, Apache Mahout, Apache Oozie, Parquet, Apache Pig, Cloudera Search, Apache Sentry, Apache Spark, Apache Sqoop, and Apache ZooKeeper.
More Information: http://www.cloudera.com/content/dev-center/en/home/developer-admin-resources/cdh-components.html
Here are a few highlights of CDH:
Links & Additional Information
Getting Started
Hortonworks has a Hadoop distribution known as the Hortonworks Data Platform (HDP). Here is a simplified representation of the Hortonworks Data Platform.
Source: http://hortonworks.com/hdp/
The Hortonworks Data Platform includes the following features/components: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache ZooKeeper, Apache Oozie, Apache Sqoop, Apache Flume, Apache Ambari, Hue, Apache Mahout, Apache Knox, Apache Storm, Apache Tez, Apache Phoenix, Apache Accumulo, and Apache Falcon.
More Information: http://hortonworks.com/hadoop/
Here are a few highlights of the Hortonworks Data Platform:
Links & Additional Information
Getting Started
Amazon Web Services (AWS) Elastic MapReduce (EMR) was among the first Hadoop offerings available in the market. Here is a high-level architecture/job flow of Amazon EMR.
Source: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html
Amazon EMR integrates most of the popular features/components, such as Hive, Pig, HBase, DistCp, and Ganglia.
Here are a few highlights of Amazon EMR:
Links & Additional Information
Getting Started
MapR offers another major Hadoop distribution available in the market. Below is a simplified architecture of the MapR Data Platform.
Source: http://www.mapr.com/products/product-overview/overview
Here are a few highlights of MapR:
Links & Additional Information
Getting Started
Apart from the distributions listed above, various other distributions are available in the market from leading providers such as Intel, Oracle, HP, and many others.
Original article: http://www.cnblogs.com/rouge/p/3918773.html