Analysing Cascading over MapReduce



Kaustuv Kunal

Author Affiliations

¹Big Data Analyst and Independent Researcher, Delhi, India

Res. J. Computer & IT Sci., Volume 4, Issue (9), Pages 1-4, September 20 (2016)

Abstract

In recent years, Big Data has grown significantly. Hadoop has become the de facto Big Data technology, and MapReduce its de facto processing framework. Hadoop with MapReduce performs distributed processing of large data sets in a fault-tolerant and cost-effective manner. Cascading is an abstraction layer on top of MapReduce that lets developers think in terms of tuples and fields. Many business problems can be solved more conveniently with tuples than with MapReduce key-value pairs. This paper advocates Cascading over MapReduce and illustrates, with a case study, how tasks that are lengthy in MapReduce are easily done in Cascading.
