Faculty member & Ph.D. candidate receive Best Paper Award at IEEE 24th IRI Conference

Dr. Yan Huang (left) and Somayeh Ghanbarzadeh (right) received the Best Paper Award at the IEEE 24th International Conference on Information Reuse and Integration for Data Science. Somayeh is pictured at the IEEE award acceptance.

Dr. Yan Huang, CSE faculty member, and Somayeh Ghanbarzadeh, CSE Ph.D. candidate, along with their co-authors, recently received the Best Paper Award for their collaborative research paper, "Improving the Reusability of Pre-trained Language Models in Real-world Applications." Somayeh accepted the award at the 24th IRI Conference, held August 4-6 and hosted by the Institute of Electrical and Electronics Engineers (IEEE).

The conference serves as a forum for researchers and practitioners from academia, industry, and government to present, discuss, and exchange ideas that address real-world problems with real-world solutions. Theoretical and applied papers are both included. 

The conference covers three major tracks: information reuse, information integration, and reusable systems. Information reuse explores the theory and practice of optimizing representations; information integration focuses on innovative strategies and algorithms for unifying diverse information in novel domains; and reusable systems focuses on developing and deploying models and corresponding processes that enable information reuse and integration to play a pivotal role in enhancing decision-making processes in various application domains.

The abstract of the award-winning paper is as follows: The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.

The contributing team consists of Somayeh Ghanbarzadeh, Yan Huang, Hamid Palangi, Radames Cruz Moreno, and Hamed Khanpour. The UNT Department of Computer Science & Engineering extends its congratulations to everyone on the team for their hard work and endeavors.