Multi-Query Optimization in Spark
Please read the attached document to understand what the work is about. The paper proposes heuristic algorithms for multi-query optimization.
The research extends the paper by Michiardi, P., Carra, D., & Migliorini, S. (2019a), “In-memory Caching for Multi-query Optimization of Data-intensive Scalable Computing Workloads”, to support join operators.
The focus of this write-up is Chapter 3 (Methodology), which should include:
– An in-depth introduction to the chapter with references
– An in-depth description of the methodology, covering its two stages:
  – Search space initialization: recognizing opportunities for shared computation by identifying common sub-expressions across queries, which sets up the search space
  – Optimization stage: modifying the optimizer's search strategy to explicitly account for shared computation and improve overall performance
– An in-depth description of the proposed algorithms, with references
– A conclusion to the chapter
– A better organization of the chapter and sections
The write-up must be precise and accurate, so it requires a thorough understanding of the underlying concepts and careful, meticulous writing.
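To make the search space initialization stage concrete, here is a minimal sketch of how common sub-expressions across queries might be detected, including a join whose inputs appear in a different order. It is an illustration only, not the paper's algorithm and not Spark's Catalyst API: the tuple-based plan encoding and the names `fingerprint`, `subtrees`, and `common_subexpressions` are hypothetical, and treating a join as commutative is an assumption that holds for inner joins only.

```python
from collections import defaultdict

# Hypothetical plan encoding: (operator, argument, *children), all strings.
# This is NOT Spark's Catalyst API; it only mimics the shape of a logical plan.


def fingerprint(plan):
    """Return a canonical, hashable signature for a plan subtree."""
    op, arg, *children = plan
    child_sigs = [fingerprint(child) for child in children]
    if op == "join":
        # Assumption: inner joins are commutative, so the inputs are ordered
        # to give A JOIN B and B JOIN A the same signature.
        child_sigs.sort()
    return (op, arg, tuple(child_sigs))


def subtrees(plan):
    """Yield every subtree of a plan, root first."""
    yield plan
    for child in plan[2:]:
        yield from subtrees(child)


def common_subexpressions(queries):
    """Map each signature found in two or more queries to those query names."""
    seen = defaultdict(set)
    for name, plan in queries.items():
        for sub in subtrees(plan):
            seen[fingerprint(sub)].add(name)
    return {sig: names for sig, names in seen.items() if len(names) > 1}


# Two toy queries that share a filtered scan and a join (inputs swapped).
scan_orders = ("scan", "orders")
scan_users = ("scan", "users")
filtered = ("filter", "amount > 100", scan_orders)
q1 = ("project", "user_id", ("join", "user_id", filtered, scan_users))
q2 = ("aggregate", "count", ("join", "user_id", scan_users, filtered))

shared = common_subexpressions({"q1": q1, "q2": q2})
for sig in shared:
    print(sig[0], sig[1])
```

Each shared signature is a candidate for caching in the optimization stage, where a cost model would decide which candidates are actually worth materializing in memory. That selection step is deliberately left out of this sketch.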