Data-driven Machine Learning and AI-based Data Transfer Strategies for High Energy Physics Experiments (HEP) on Open Science Grid (OSG)
Award Number: DE-SC0024648
Funding Agency: U.S. Department of Energy (DOE)
Award Amount: Not specified
Project Duration: August 15, 2023 - August 14, 2026
Dr. Byrav Ramamurthy, in collaboration with Dr. Derek Weitzel, has secured a significant grant from the U.S. Department of Energy for the project, “Data-driven Machine Learning and AI-based Data Transfer Strategies for High Energy Physics Experiments (HEP) on Open Science Grid (OSG).”
This project addresses the rapidly growing data demands of High Energy Physics (HEP) experiments, which require high-rate data transfer capabilities and efficient storage and networking solutions. As the volume of data from HEP experiments is expected to grow exponentially, the existing infrastructure risks being overwhelmed, creating bottlenecks that could impede the seamless execution of these workflows.
Key objectives of the project include:
Machine Learning (ML) and AI-based Strategies: Designing online and offline ML/AI-based strategies to optimize HEP data transfers, enhancing the speed and efficiency of data distribution among computing facilities.
Data Log Analysis: Conducting post-hoc and real-time analysis of cache transfer logs and network/storage resource data at HEP experiment endpoints to identify and resolve bottlenecks.
Data Storage Mechanisms: Developing new data formats and storage mechanisms for faster querying and analysis using the University of Nebraska’s Holland Computing Center (HCC) and the Open Science Grid (OSG) infrastructure.
This research will leverage the high-performance computing resources at HCC, and the findings will be deployed on the OSG endpoint, benefiting numerous HEP experiments. By implementing intelligent data transfer strategies, the project aims to alleviate the bottlenecks in current storage, compute, and network infrastructures.
This project is supervised by Dr. Ramamurthy, a professor at the University of Nebraska-Lincoln’s School of Computing, with Dr. Derek Weitzel, a Research Assistant Professor at HCC and UNL, serving as Co-PI. The project is an important contribution to improving the data management capabilities of HEP experiments, enhancing both infrastructure performance and scientific productivity.