OptiFlexSort: A Hybrid Sorting Algorithm for Efficient Large-Scale Data Processing

Abuba, Nelson Seidu and Baagyere, Edward Yellakuor and Nakpih, Callistus Ireneous and Wiredu, Japheth Kodua (2025) OptiFlexSort: A Hybrid Sorting Algorithm for Efficient Large-Scale Data Processing. Journal of Advances in Mathematics and Computer Science, 40 (2). pp. 67-81. ISSN 2456-9968

Full text not available from this repository.

Abstract

Efficient sorting of massive datasets is a cornerstone of data-intensive applications, yet traditional sorting algorithms often face scalability challenges as data volumes grow exponentially. This study introduces OptiFlexSort, a novel hybrid sorting algorithm designed to enhance scalability while maintaining the inherent efficiency of Quicksort. OptiFlexSort combines two techniques: an optimized last-element pivot selection strategy that uses median-of-three considerations to improve pivot quality, and an adaptive partitioning mechanism that adjusts partition sizes according to data distribution characteristics, using a threshold-based approach to balance partition efficiency.
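The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general family of techniques it names, not the authors' implementation. It shows a Quicksort that moves the median of three sampled elements into the last position before a last-element (Lomuto) partition, and that hands partitions below a size threshold to insertion sort. The names `hybrid_quicksort` and `SMALL_PARTITION_THRESHOLD`, the cutoff value of 16, and the choice of insertion sort for small partitions are all assumptions made for illustration.

```python
# Illustrative sketch only: NOT the published OptiFlexSort algorithm.
# The threshold value and the insertion-sort fallback are assumptions.
SMALL_PARTITION_THRESHOLD = 16  # hypothetical cutoff; the abstract gives no value


def _insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place; cheap for short ranges."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def _median_of_three_to_last(a, lo, hi):
    """Order a[lo], a[mid], a[hi], then move the median into a[hi]."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    # Now a[lo] <= a[mid] <= a[hi]; place the median at the last index.
    a[mid], a[hi] = a[hi], a[mid]


def _partition(a, lo, hi):
    """Lomuto partition around the last element; returns the pivot's final index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def hybrid_quicksort(a):
    """Sort the list a in place and return it."""
    def sort(lo, hi):
        # Loop on the larger side and recurse on the smaller one,
        # which keeps the recursion depth logarithmic in len(a).
        while hi - lo + 1 > SMALL_PARTITION_THRESHOLD:
            _median_of_three_to_last(a, lo, hi)
            p = _partition(a, lo, hi)
            if p - lo < hi - p:
                sort(lo, p - 1)
                lo = p + 1
            else:
                sort(p + 1, hi)
                hi = p - 1
        _insertion_sort(a, lo, hi)

    sort(0, len(a) - 1)
    return a
```

Calling `hybrid_quicksort(values)` sorts `values` in place and returns the same list. A fixed cutoff is the simplest form of threshold-based behaviour; the paper's mechanism is described as adapting to data distribution characteristics, which this sketch does not attempt.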

To evaluate its performance, comprehensive experiments were conducted on randomly generated integer datasets ranging from 1,000 to 1,000,000 elements. OptiFlexSort was implemented in Python and benchmarked against established algorithms, including Merge Sort, Heapsort, Radix Sort, and external merge sort implementations (STXXL and TPIE). Each test was repeated twenty times to ensure statistical consistency. The results show that OptiFlexSort achieves a 10–15% improvement in execution time over Merge Sort and Heapsort across all dataset sizes. For datasets of 50,000–100,000 elements, its performance was statistically indistinguishable from that of Radix Sort, with differences of less than 2%. For datasets exceeding 200,000 elements, OptiFlexSort achieved a 5–8% reduction in execution time. Notably, for datasets of several hundred thousand elements or more, it outperformed advanced external merge sort implementations, highlighting its robustness and scalability.
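A timing protocol of the kind described above (random integer inputs, twenty repetitions per configuration, percentage comparison against a baseline) might be sketched as follows. This is not the authors' benchmark code: the function names `benchmark` and `relative_improvement`, the integer value range, the fixed seed, and the use of mean run time as the summary statistic are assumptions.

```python
import random
import statistics
import time


def benchmark(sort_fn, n, repeats=20, seed=0):
    """Time sort_fn on `repeats` fresh random integer lists of length n.

    sort_fn is expected to return the sorted list. The value range and
    the seed are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    timings = []
    for _ in range(repeats):
        data = [rng.randint(0, 10**6) for _ in range(n)]
        expected = sorted(data)           # reference result, not timed
        work = list(data)                 # copy so in-place sorts are safe
        start = time.perf_counter()
        result = sort_fn(work)
        timings.append(time.perf_counter() - start)
        assert result == expected, "sort_fn returned an incorrect ordering"
    return timings


def relative_improvement(baseline_times, candidate_times):
    """Percentage reduction in mean run time of the candidate vs. the baseline."""
    baseline = statistics.mean(baseline_times)
    candidate = statistics.mean(candidate_times)
    return 100.0 * (baseline - candidate) / baseline
```

For example, `relative_improvement(benchmark(merge_sort, 100_000), benchmark(candidate_sort, 100_000))` would report the percentage gain of a hypothetical `candidate_sort` over a hypothetical `merge_sort`. Because both calls use the same seed, the two functions are timed on identical inputs.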

This study contributes to the field of sorting algorithm design by presenting a highly efficient, scalable, and adaptive solution tailored to the demands of modern big data applications. OptiFlexSort represents a significant step forward in addressing the challenges posed by exponentially growing datasets, offering a practical option for large-scale data processing. While the algorithm excels on uniformly distributed integer datasets, future work will explore its adaptability to other data types and distributions to broaden its applicability.

Item Type: Article
Subjects: Open Asian Library > Mathematical Science
Depositing User: Unnamed user with email support@openasianlibrary.com
Date Deposited: 07 Mar 2025 04:20
Last Modified: 07 Mar 2025 04:20
URI: http://conference.peerreviewarticle.com/id/eprint/2094
