Affiliations:
1. Applied Research Institute, Polytechnic Institute of Coimbra, 3045-093 Coimbra, Portugal
2. Instituto de Telecomunicações, 6201-001 Covilhã, Portugal
3. Department of Informatics, Polytechnic of Viseu, 3504-510 Viseu, Portugal
Abstract
Although indexing strategies are widely recognized as central to query performance in database systems, the impact of rapidly evolving hardware architectures on indexing techniques remains underexplored. As modern computing systems increasingly rely on parallel processing, multi-core CPUs, and specialized hardware accelerators, traditional indexing approaches may not fully capitalize on these advances. This experimental study investigates hardware-conscious indexing strategies tailored for contemporary and emerging hardware platforms. Through experiments in a real-world database environment using the industry-standard TPC-H benchmark, it evaluates the performance implications of indexing techniques designed to exploit parallelism, vectorization, and hardware-accelerated operations. By examining approaches such as cache-conscious B-Tree variants, SIMD-optimized hash indexes, and GPU-accelerated spatial indexing, the study provides insight into the performance gains and trade-offs associated with these hardware-aware methods. The findings reveal that hardware-conscious indexing strategies can significantly outperform their traditional counterparts, particularly in data-intensive workloads and large-scale database deployments. Our experiments show improvements ranging from 32.4% to 48.6% in query execution time, depending on the specific technique and hardware configuration. However, the study also highlights the complexity of implementing and tuning these techniques, which often require intricate code optimizations and a deep understanding of the underlying hardware architecture. Additionally, this research explores machine learning-based indexing approaches, including reinforcement learning for index selection and neural network-based index advisors. While these techniques show promise, with performance improvements of up to 48.6% in certain scenarios, their effectiveness varies across query types and data distributions. By offering a comprehensive analysis and practical recommendations, this research contributes to database performance optimization in the era of heterogeneous computing. The findings inform database administrators, developers, and system architects about effective indexing practices for modern hardware, and point toward future research on adaptive indexing techniques that dynamically leverage hardware capabilities based on workload characteristics and resource availability.
Funder
Polytechnic Institute of Viseu