Intelligent Grouping Method of Science and Technology Projects Based on Data Augmentation and SMOTE

Zhou, Can and Li, Mengting and Yu, Sha (2022) Intelligent Grouping Method of Science and Technology Projects Based on Data Augmentation and SMOTE. Applied Artificial Intelligence, 36 (1). ISSN 0883-9514

Abstract

Science and technology projects are currently evaluated mainly through peer review, and dividing the projects into groups is a crucial step in that process. Project grouping is challenging because of the small amount of data, the sparsity of features, the broad range of subject areas, and the severely uneven distribution of categories. In this paper, we propose an intelligent automatic grouping method for science and technology projects based on keywords. We expanded the small dataset with samples generated by Paraphrasing, Mixup, and the GPT-3 model. The text feature extraction techniques TF-IDF, Word2Vec, and TF-IDF weighted Word2Vec were used to pre-process the project keywords, and SVM and XGBoost were used as the classifiers. In addition, we applied SMOTE to the imbalanced data to alleviate the model's bias toward majority classes. Experiments show that project grouping accuracy improved substantially after introducing the data augmentation methods and SMOTE. The combination of Paraphrasing, TF-IDF, SVM, and SMOTE achieved the best performance, with an F1 score of 96.78%, which demonstrates the feasibility of the proposed method.
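The oversampling step mentioned in the abstract can be illustrated with a minimal, standard-library sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the toy feature vectors (standing in for TF-IDF rows) are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`
        # (Euclidean distance, excluding the sample itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class: three feature vectors (assumed values, e.g. TF-IDF weights).
minority = [[0.0, 1.0, 0.0], [0.2, 0.8, 0.0], [0.1, 0.9, 0.1]]
new_points = smote(minority, n_new=4, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region the minority class already occupies. In practice, oversampling is normally applied to the training split only, so that evaluation data contains no synthetic samples.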

Item Type: Article
Subjects: Research Scholar Guardian > Computer Science
Depositing User: Unnamed user with email support@scholarguardian.com
Date Deposited: 16 Jun 2023 09:52
Last Modified: 20 Jan 2024 10:19
URI: http://science.sdpublishers.org/id/eprint/1140
