I have had a lot of discussions on big data with my clients
and prospects. During these discussions,
some questions come up about Hadoop, for example: What are the different components
of a Hadoop ecosystem?
In my view, the question should be: What are
the different components of a big data ecosystem?
Everyone seems to have a different answer. I have tried to consolidate the answers, and the
result is this picture.
Hi,
I love the way you laid out the different components in that image. Under Data Access and Processing, why did you put MapReduce, Giraph, and Mahout in one container, Spark and Storm in another, and HBase, Cassandra, and Impala in a third? Aren't they all the same kind of tool for accessing HDFS?
MapReduce, Giraph, and Mahout are for batch processing (Mahout may not fit neatly into this category), while Spark and Storm are for streaming. HBase and Cassandra are NoSQL databases, and Impala is a SQL query engine, so these data stores and query layers are shown in a different box. This separation is arbitrary and can be done in multiple ways.