There is so much hype around big data. People everywhere talk about Hadoop, and the word has spread around the world very quickly. Organizations are putting a lot of effort, time and money into building big data capabilities.

The question is: can everyone fit into a big data role, or only a chosen elite group?

There has been a lot of confusion about who is eligible to develop big data skills. Since the entire Hadoop framework is built on top of Java, there is still a strong perception that only Java developers can pick up this technology easily and quickly. In fact, Java knowledge is not mandatory for Hadoop-related skills unless one wants to write MapReduce code for various applications.

Let me confine myself to Business Intelligence, the technology I am involved with, to answer the above question. My answer is a big YES. Every BI resource can fit in. I request folks from other technologies to extend this thought process to their own areas.

I may be questioned on the "how" part of it. Let me explain.

At the end of the day, Hadoop is a data storage and processing environment. Hence, all DW/BI resources who work with data naturally fit in. Many top BI tools provide connectivity to Hadoop. So it is the additional knowledge of how storage works inside Hadoop that will make a BI resource fit a big data role. And of course, the basic operations a BI resource performs on relational databases have to be done in Hadoop as well.

Big data is about many Vs (Volume, Velocity, Variety, Veracity... and the list of Vs keeps growing). Hadoop is the framework to store and process big data. The storage part is taken care of by HDFS (a distributed file system that can hold any kind of file, including unstructured data), Hive (similar to a data warehouse), HBase (for both structured and unstructured data) and other NoSQL data stores. The processing part is taken care of by MapReduce: in classic Hadoop, a query issued through a higher-level tool such as Hive is converted into one or more MapReduce jobs, whether it is a simple aggregation or a complex calculation.
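
For a BI resource, the idea may be easier to see with a small sketch. A query such as SELECT region, SUM(amount) FROM sales GROUP BY region can be thought of as a map step, a shuffle step and a reduce step. The Python below only simulates those three steps in a single process. It is not Hadoop code, and the sales rows and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows (region, amount); illustrative data only.
SALES = [
    ("north", 100),
    ("south", 250),
    ("north", 50),
    ("east", 75),
    ("south", 25),
]


def map_phase(rows):
    """Emit one (key, value) pair per input row, as a mapper would."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Group values by key; a real framework does this between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values collected for each key, here by summing them."""
    return {key: sum(values) for key, values in grouped.items()}


def total_by_region(rows):
    """Single-process stand-in for SUM(amount) ... GROUP BY region."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(total_by_region(SALES))
```

On a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data between them, but the shape of the computation is the same as the GROUP BY a BI resource already writes every day.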

I hope my answer is convincing.

Here are a few roles a BI resource can aim for, along with the skills to build expertise in. For all of the following, knowledge of big data distributions, NoSQL stores and basic Hadoop operations is essential.

  • Big data solution architect (knowledge of big data tools and of BI tools which can talk to Hadoop)
  • Big data ETL developer (knowledge of ETL tools with Hadoop connectivity)
  • Big data report developer (knowledge of reporting tools with Hadoop connectivity)
  • Data scientist (in-depth knowledge of data mining, predictive analysis and text mining, plus domain knowledge)
  • Big data sales professional (exposure to all the above-mentioned skills)

Where do you think you fit in among the big data roles?

I would be very happy to see your feedback, comments or a different point of view.

Posted by Soundar Pandian
April 17th, 2014

Comments (3)

Narendra Chaudhari - November 28th, 2014

Will fresher candidates be considered for big data jobs? And what is the career path in big data for freshers?

Jaganmohan - April 18th, 2014

Is there a role for a Big Data admin for Infrastructure and Operations professionals? Especially in cluster management through Hadoop 2.0 tools or products such as Cloudera, etc.

Pandian Muneeswara C - April 18th, 2014

Unstructured/text data handling is closely associated with Big Data / Hadoop. It requires extracting relevant, interesting information from large textual content. This task will be performed by a Big Data ETL Developer, a Report Developer or a Data Scientist.
