Wednesday, October 15, 2014

Book Review: Learning Spark: Lightning-Fast Big Data Analytics



Book Review: Learning Spark: Lightning-Fast Big Data Analytics by Holden Karau, Andy Konwinski, and Matei Zaharia. Publisher: O'Reilly. ISBN-13: 978-1449358624
 

Learning Spark: Lightning-Fast Big Data Analytics is still in the Early Release phase and is expected to be available in February 2015. I have reviewed the first seven chapters, which are still raw but are coming along neat and clean.

This book is a very good introduction for newcomers to Spark, which is all the rage in the Big Data domain. The book presents almost all of its samples in three languages (Java, Scala, and Python), which makes it easier for many people to try them out and learn Spark.

The first chapter just gives an introduction; the real fun starts from the second chapter onward. Chapter 2 walks you through installing Spark on your laptop. Chapters 3 to 6 cover the programming aspects of Spark. Chapter 7 is about running Spark on a cluster.

I expect the finished book will be a good one.

Disclaimer: I did not get paid to review this book, and I do not stand to gain anything if you buy the book. I have no relationship with the publisher or the authors. I received an electronic copy of the book from the publisher for review.


You can get more information about the book and related topics from:

  1. Amazon: http://www.amazon.com/Learning-Spark-Lightning-Fast-Data-Analytics/dp/1449358624
  2. Publisher (O'Reilly): http://shop.oreilly.com/product/0636920028512.do



For my High Schooler - Inheritance and Package



At the dinner table.

Yash: Today in Java class, we were discussing packages. Can you please explain?
Me: Sure. You can think of a package as a house in a neighborhood. Each house has its residents: humans, animals, and non-living things. If you scale down by one level, some members of the household have their own rooms while others share rooms. You can compare those living in the house to Java classes, interfaces, and/or enums. Just as a house may have rooms, a package may have sub-packages, that is, packages within a package.
Yash: I got this. But we were also talking about inheritance.
Me: It is pretty simple.
Yash: How?
Me: Let’s create a scenario.  You have parents – one biological mother and one biological father. Correct?
Yash: Correct.
Me: Let’s assume your mother and I separate. I decide to marry again. Now you have two mothers: one biological, and the other one will be your stepmom. It is possible that your stepmom may decide to separate after a while and I marry someone else again. Now you have three mothers.
Yash: This is complex and crazy and I hope it never happens.
Me: Let’s make it a bit more complex. Your biological mother may also decide to marry someone else. It is also possible that one of your stepmoms and I may decide to have a baby. To make things worse, you may decide to have a baby with one of my spouse’s kids at some point in your life.
Yash: Things may really get ugly here.
Me: Yep. This complexity exists because a baby has two parents. Now assume a baby has only one parent. This type of complexity cannot arise. A single parent can still have as many babies as it likes.
Yash: True.
Me: To avoid this complexity, Java does not allow a class to have multiple parents. Only one parent is allowed.
Yash: Ok. Who is the baby and who is the parent in Java?
Me: In Java, the class is the main player. To define the relationship between a parent and a baby, we use the keyword extends.
Yash: How?
Me: It is simple.
class B extends A {
}
Here class A is the parent while class B is the baby.
In Java terminology, class A is the “superclass” while class B is the “subclass”.

Yash: Hmm… I suppose if I want to say that class C is a subclass of B, then I should write:
                class C extends B {
                }
Me: Fantastic!!!
Yash: And if class D is also a subclass of B, then I can write:
                class D extends B {
                }
Me: Super!!!
Yash: Does it mean we are creating a hierarchy where class A is the superclass of class B, and classes C and D are subclasses of class B?
 
 


Me: Perfect. This whole concept is called inheritance.
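The hierarchy from the conversation can be sketched as one small runnable program. This is only an illustration: the describe() method and the FamilyTreeDemo class are hypothetical names added here to make the relationships visible; they were not part of the class discussion.

```java
// A minimal sketch: A is the parent of B, and B is the parent of both C and D.
// FamilyTreeDemo and describe() are hypothetical names used for illustration.
class FamilyTreeDemo {
    public static void main(String[] args) {
        System.out.println(new C().describe()); // prints "C -> B -> A"
        System.out.println(new D().describe()); // prints "D -> B -> A"
        // Every class knows its single parent (superclass).
        System.out.println(C.class.getSuperclass().getSimpleName()); // prints "B"
    }
}

class A {
    // Each class reports itself, followed by whatever its parent reports.
    String describe() {
        return "A";
    }
}

class B extends A {
    @Override
    String describe() {
        return "B -> " + super.describe();
    }
}

class C extends B {
    @Override
    String describe() {
        return "C -> " + super.describe();
    }
}

class D extends B {
    @Override
    String describe() {
        return "D -> " + super.describe();
    }
}

// class E extends C, D { }  // would not compile: a class may extend only one class
```

Uncommenting the last line produces a compile error, because a Java class can name only one class after extends.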
Me: Let’s mix packages and inheritance. Just as your grandparents may live in a different house while you and I live in one house, classes of the same hierarchy may belong to different packages. Let’s modify the picture a little bit.
 



Yash: So a class can live in any package, irrespective of the hierarchy?
Me: Yes.
Yash: I have a few more questions.
Me: I think this is good enough for today.
Yash: Okay! Good night.
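
The last point, that classes of one hierarchy can live in different packages, can be seen in the Java standard library itself. The sketch below is not from the conversation: PackageHierarchyDemo and lineage() are hypothetical names, and java.util.ArrayList is picked only as a convenient example. It walks up the parent chain and prints each class with its package-qualified name.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch; PackageHierarchyDemo and lineage() are hypothetical names.
class PackageHierarchyDemo {

    // Walks from a class up through its parents and collects the fully
    // qualified name (package + class name) of each one along the way.
    static List<String> lineage(Class<?> start) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = start; c != null; c = c.getSuperclass()) {
            names.add(c.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The first three classes live in the java.util package,
        // while the topmost parent, Object, lives in java.lang.
        for (String name : lineage(ArrayList.class)) {
            System.out.println(name);
        }
    }
}
```

The chain for ArrayList ends at java.lang.Object, the parent of every Java class, so the family spans two houses: java.util and java.lang.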


Thursday, October 2, 2014

Book Review: Using Flume: Stream Data into HDFS and HBase


Book Review: Using Flume: Stream Data into HDFS and HBase by Hari Shreedharan. Publisher: O'Reilly. ISBN-13: 978-1449368302



Using Flume: Stream Data into HDFS and HBase is for developers as well as administrators of Hadoop clusters. In its first chapter the book discusses HBase, which is a little puzzling, but as the book progresses it takes you on a deep dive into various aspects of Flume. The book covers streaming of data, various sources, channels, sinks, interceptors, and other components of Flume.

The last chapter, on the administration of Flume, is very short. It could have gone into a little more depth on capacity planning, deployment options, and so on.

Nevertheless, the book is a good reference for anyone playing in the Hadoop playground.


Disclaimer: I did not get paid to review this book, and I do not stand to gain anything if you buy the book. I have no relationship with the publisher or the author. I received an electronic copy of the book from the publisher for review.

Further reading: Apache Flume: Distributed Log Collection for Hadoop (http://www.amazon.com/Apache-Flume-Distributed-Collection-Hadoop/dp/1782167919)


You can get more information about the book and related topics from:

  1. Amazon: http://www.amazon.com/Using-Flume-Stream-Data-HBase/dp/1449368301
  2. Publisher (O'Reilly): http://shop.oreilly.com/product/0636920030348.do