Congratulations to the HTTT2021 student group and the Information Systems (HTTT) master's student on their paper at the International Conference on Intelligent Systems and Data Science (ISDS 2025)
Following the success of the first ISDS 2023, organized at Can Tho University, and the second ISDS 2024, organized at Nha Trang University, ISDS 2025 will be held this year at CICT, Can Tho University. The objective of this international conference is to attract domestic and foreign researchers to participate and present outstanding, recent research in the field of ICT. It is an opportunity for scientists to meet, exchange ideas, and cooperate. ISDS 2025 is also a place for students to report and learn about new results in the field of ICT. The conference focuses on state-of-the-art, original research in intelligent systems and data science.
- Topics of the conference include (but are not limited to):
  - Track 1: Intelligent Systems & Recommender Systems
  - Track 2: Data Science & Machine Learning
  - Track 3: Image Processing & Pattern Recognition
  - Track 4: Natural Language Processing
- ISDS 2025 will be held at the College of ICT, Can Tho University, on 18-19 October 2025.
Conference link: https://isds.ctu.edu.vn/2025/
Paper title: “A Machine Learning-Based Tool for Autism Screening Using Psychological Medical Records”
Students and graduate students who carried out the work:
- 21522049, Nguyễn Thị Bích Hảo, HTTT2021
- 21521447, Nguyễn Văn Quốc Thanh, HTTT2021
- 220104018, Nguyễn Minh Nhựt, master's student, HTTT 2022
Supervisor: Assoc. Prof. Dr. Nguyễn Đình Thuân
Abstract: Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that impacts social interaction and behavior, where early detection is crucial for optimizing intervention outcomes. This study proposes a machine learning-based screening tool that utilizes psychological medical records (EMR) to identify ASD in children aged 1 to 8 years. We conducted an experiment using two datasets: a publicly available dataset from a prior study, consisting of approximately 300 records with information on ASD status and responses to 10 evaluation questions; and an EMR dataset from a pediatric psychology clinic, comprising about 600 records that include assessments from parents through interviews and from clinicians via direct examinations. Both datasets were labeled by psychology experts based on evaluation criteria such as eye contact, pointing, imitation, and others. The EMR dataset from the clinic includes 18 labeled features corresponding to evaluation criteria. For the public dataset, we analyzed the experimental questions and mapped them to the 18 features of the EMR dataset, selecting 10 statistically significant features. Both datasets underwent preprocessing to handle missing data, encode features, and select features with significant impact on the outcome label. This resulted in 10 key features used for training machine learning classification models. Four machine learning models were implemented: Decision Tree, XGBoost, CatBoost, and Overall Local Accuracy (OLA). Results showed that OLA exhibited superior adaptability across both standard and potentially noisy datasets. However, initial experiments revealed two main challenges: class imbalance (90% ASD vs. 10% non-ASD) and feature bias due to the high accuracy of expert-labeled EMR data, causing models to over-rely on certain features. These limitations reduced the models’ reliability in real-world applications.
To address these issues, we applied the SMOTE-IPF technique to balance the ASD and non-ASD class ratios and enhanced the OLA framework by incorporating dynamic weighting and regularization to reduce dependency on specific features. The proposed screening tool offers a cost-effective and scalable solution, particularly suitable for resource-constrained settings.
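For readers unfamiliar with the OLA idea named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It covers only basic OLA: for each new record, find the k nearest labeled validation records, pick the classifier that is most accurate on them, and use its prediction. The dynamic weighting, regularization, and SMOTE-IPF steps described in the abstract are not shown. The function name `ola_predict`, the two threshold rules, the toy data, and `k=3` are all assumptions made for this example.

```python
from math import dist
from typing import Callable, Sequence

Sample = Sequence[float]
Classifier = Callable[[Sample], int]


def ola_predict(
    x: Sample,
    classifiers: Sequence[Classifier],
    val_X: Sequence[Sample],
    val_y: Sequence[int],
    k: int = 3,
) -> int:
    """Predict x with the classifier that is most accurate on the
    k validation samples closest to x (its region of competence)."""
    # Indices of the k nearest validation samples (Euclidean distance).
    neighbors = sorted(range(len(val_X)), key=lambda i: dist(x, val_X[i]))[:k]

    def local_accuracy(clf: Classifier) -> float:
        hits = sum(clf(val_X[i]) == val_y[i] for i in neighbors)
        return hits / len(neighbors)

    # On ties, max() keeps the first classifier in the list.
    best = max(classifiers, key=local_accuracy)
    return best(x)


# Hypothetical base classifiers: simple threshold rules on one feature each.
def rule_a(s: Sample) -> int:
    return int(s[0] > 0.5)


def rule_b(s: Sample) -> int:
    return int(s[1] > 0.5)


# Made-up validation data; 1 and 0 are arbitrary class labels.
# rule_b is right on the first three records, rule_a on the last three.
val_X = [(0.1, 0.9), (0.2, 0.2), (0.3, 0.8), (0.8, 0.1), (0.9, 0.9), (0.7, 0.2)]
val_y = [1, 0, 1, 1, 1, 1]
rules = [rule_a, rule_b]

print(ola_predict((0.2, 0.7), rules, val_X, val_y))  # 1 (rule_b is selected)
print(ola_predict((0.8, 0.2), rules, val_X, val_y))  # 1 (rule_a is selected)
```

In this toy data neither rule is correct everywhere, but selecting per record by local accuracy gives the expected label for both queries. In practice the base classifiers would be trained models rather than hand-written rules.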











