Different file formats used in Hadoop and HBase

I have been investigating the different file formats used in Hadoop and HBase to understand how they contribute to the speedups we have all witnessed in the Hadoop big data world. I also recommend that all Java developers dig into the Hadoop and HBase source code; you will definitely learn a lot and improve your Java skills.

The file formats used in Hadoop include SequenceFile, TFile, and Avro files, whereas HFile is used exclusively in HBase.

I found an interesting and detailed explanation of HFile's internal structure in the following blog post:

http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html

Enjoy

BigDataExplorer
