Acta Scientific Computer Sciences

Short Communication Volume 4 Issue 2

How Apache MapReduce Handles Big Data Query?

Radhya Sahal*

Faculty of Computer Science and Engineering, Hodeidah University, Al Hudaydah, Yemen

*Corresponding Author: Radhya Sahal, Faculty of Computer Science and Engineering, Hodeidah University, Al Hudaydah, Yemen.

Received: December 20, 2021; Published: January 18, 2022

Abstract

Apache MapReduce is one of the most popular frameworks for batch data processing. Despite its merits, its critical challenge is answering queries over large-scale data quickly. This review presents the state of the art of Apache Hive, a widely used data warehouse system whose SQL-like language, HiveQL, handles big data queries on Apache MapReduce.


Keywords: Query Processing; Apache MapReduce; Hive; HiveQL


Citation

Citation: Radhya Sahal. “How Apache MapReduce Handles Big Data Query?”. Acta Scientific Computer Sciences 4.2 (2022): 34-36.

Copyright

Copyright: © 2022 Radhya Sahal. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



