Tuesday, December 2, 2014

Bigdata Ecosystem



I have had a lot of discussions on big data with my clients and prospects. During these discussions, one question about Hadoop keeps coming up: what are the different components of a Hadoop ecosystem?

In my view, the question should be: what are the different components of a Bigdata ecosystem?

Everyone seems to have a different answer to it. I have tried to consolidate the answers. The result is this picture.


2 comments:

  1. Hi,
    I love the way you laid out the different components in that image. Under Data Access and Processing, why did you put MapReduce, Giraph, and Mahout in one container, Spark & Storm in another, and HBase, Cassandra & Impala in a third? Aren't they all the same kind of tool for accessing HDFS?

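  2. MapReduce, Giraph, and Mahout are for batch processing (Mahout may not fit neatly into this category), while Spark & Storm are for streaming, although Spark also runs batch jobs. HBase and Cassandra are NoSQL databases, so they are shown in a different box; Impala, which is strictly a SQL query engine rather than a NoSQL database, is grouped with them. This separation is arbitrary and can be done in multiple ways.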
