Analysing Cascading over MapReduce
- Big Data Analyst and Independent Researcher, Delhi, India
Res. J. Computer & IT Sci., Volume 4, Issue (9), Pages 1-4, September 20 (2016)
In recent years, Big Data has grown significantly. Hadoop has become the de facto Big Data technology, and MapReduce its de facto processing framework. Hadoop with MapReduce performs distributed processing of large data sets in a fault-tolerant and cost-effective manner. Cascading is an abstraction layer over MapReduce that allows developers to think in terms of tuples and fields. Many business problems can be expressed more conveniently with tuples than with MapReduce key-value pairs. This paper advocates Cascading over MapReduce and, supported by a case study, illustrates how tasks that are lengthy in MapReduce are easily done in Cascading.