An AI Framework for Exploring Algorithm and Data Combinations with Efficiency-Accuracy Trade-offs

Mat-Design-Yu/DSMR

DSMR

An AI Framework for Exploring Algorithm and Data Combinations with Efficiency-Accuracy Trade-offs

The primary goal of DSMR is to assess the importance of each row in the data matrix (i.e., each sample) and to provide materials scientists with the optimal combination of data subsets and models.
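As a rough illustration of what "importance of each row" can mean, the sketch below scores every training sample by how much a held-out error changes when that sample is left out. It is a minimal, standard-library-only example and not DSMR's actual implementation: the one-feature linear model, the function names (`fit_line`, `mse`, `row_importance`), and the toy data are all assumptions made for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on 1-D inputs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def mse(model, xs, ys):
    """Mean squared error of a fitted line on (xs, ys)."""
    a, b = model
    return mean((a * x + b - y) ** 2 for x, y in zip(xs, ys))

def row_importance(train, valid):
    """Score each training row by the change in validation MSE when it is left out.

    Positive score: the row helps (removing it hurts).
    Negative score: the row hurts (removing it improves the model).
    """
    xv, yv = zip(*valid)
    base = mse(fit_line(*zip(*train)), xv, yv)
    scores = []
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]  # leave row i out
        scores.append(mse(fit_line(*zip(*rest)), xv, yv) - base)
    return scores

# Hypothetical toy data: y = 2x, with one corrupted row at index 3.
train = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 20.0), (4.0, 8.0)]
valid = [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0), (3.5, 7.0)]

scores = row_importance(train, valid)
worst = min(range(len(scores)), key=scores.__getitem__)
print("most harmful row:", worst)
```

In this toy setup the corrupted row receives the most negative score, so it would be the first candidate for elimination in a screening step.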

Abstract: Machine learning models demonstrate remarkable capabilities in predicting the properties of novel materials. In theory, the optimal model can be obtained through an exhaustive search over data subsets, algorithms, and hyperparameters; the fundamental challenge is finding an efficient path through this immense search space. In this paper, we address this challenge by proposing an active learning-based exploration framework (DSMR). Its Data Screening (DS) module employs a "leave-one-out" strategy to dynamically refine the dataset through iterative elimination and addition of samples, while its Model Retrieval (MR) module integrates Bayesian optimization with hyperparameter exploration. The framework balances computational efficiency against predictive accuracy, allowing deeper exploration of the information in the data. Systematic validation studies were conducted on two classification datasets and two regression datasets. Superior models were obtained within 10 iterative cycles in all cases, achieving a 5-10% improvement over state-of-the-art results in the current literature. Furthermore, the framework makes effective use of newly added data to obtain enhanced models, and its low-code, user-friendly implementation makes it a promising tool for materials design.

Keywords: Active learning, Data enhancement, Model retrieval
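The Model Retrieval idea from the abstract, a joint search over algorithms and their hyperparameters, can be sketched as a small loop. The example below deliberately uses plain random search as a stand-in for Bayesian optimization, so it shows only the shape of the search, not the method DSMR uses. The two candidate algorithms, the hyperparameter ranges, the function names, and the data are hypothetical.

```python
import random
from statistics import mean

def knn_predict(train, x, k):
    """Average the targets of the k nearest training points (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def ridge_predict(train, x, lam):
    """1-D ridge regression on centred data; lam shrinks the slope."""
    xs, ys = zip(*train)
    mx, my = mean(xs), mean(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in train)
    return my + sxy / (sxx + lam) * (x - mx)

def loo_error(data, predict, param):
    """Leave-one-out mean squared error of one (algorithm, hyperparameter) pair."""
    errs = []
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        errs.append((predict(rest, x, param) - y) ** 2)
    return mean(errs)

def retrieve_model(data, n_trials=30, seed=0):
    """Random search over algorithm and hyperparameter (stand-in for Bayesian optimization)."""
    rng = random.Random(seed)
    space = {
        "knn": (knn_predict, lambda: rng.randint(1, 4)),             # k
        "ridge": (ridge_predict, lambda: 10 ** rng.uniform(-3, 2)),  # lam, log-uniform
    }
    trials = []
    for _ in range(n_trials):
        name = rng.choice(sorted(space))
        predict, sample = space[name]
        param = sample()
        trials.append((loo_error(data, predict, param), name, param))
    return min(trials), trials  # best trial has the lowest leave-one-out error

# Hypothetical data, roughly y = 2x + 1 with small noise.
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8), (4.0, 9.1), (5.0, 11.0)]
best, trials = retrieve_model(data)
print("best (error, algorithm, hyperparameter):", best)
```

A Bayesian optimizer would replace the random draws with proposals guided by the errors of earlier trials; the evaluation step and the choice of the lowest-error trial would stay the same.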
