Identifying Fraud in the Wages of Social Security Insureds Using Machine Learning Algorithms

Rabeiee, Amir Salar; Rezaei, Gholamreza; Gharakhani, Davood

doi:10.22034/qjo.2026.555073.1448

Article type: Research article

Authors

Amir Salar Rabeiee ¹

Gholamreza Rezaei ²

Davood Gharakhani ³

¹ PhD student in Industrial Management, Islamic Azad University, Qazvin Province.

² PhD in Economics, Assistant Professor, Islamic Azad University, Qazvin Province.

³ PhD in Management, Assistant Professor, Department of Management, Islamic Azad University, Qazvin Province.

Abstract

Purpose: Wage-reporting fraud is one of the major challenges facing insurance systems, and the growth in data volume and the spread of remote (non-face-to-face) systems have made it harder to detect with traditional methods. This study evaluates the ability of unsupervised machine learning algorithms to detect anomalies associated with wage fraud and proposes an automated approach for strengthening the supervisory processes of the Social Security Organization.
Method: This study is applied in terms of purpose and descriptive-analytical in terms of the nature of its data. Three algorithms, Isolation Forest, density-based clustering (DBSCAN), and the one-class support vector machine (One-Class SVM), were applied to a dataset of 26,258 monthly wage records for 470 insured individuals over the years 1398 to 1402 (Solar Hijri calendar) to identify abnormal wages that may indicate fraud. The analyses focused on anomalous patterns at the individual and cross-individual levels, and the methods were evaluated on their detection logic and their consistency with the actual behavior of the data.
Findings: The results showed that Isolation Forest performed most accurately and most stably, identifying anomalies with a more reasonable distribution than the other methods. DBSCAN discarded too many observations in sparse data, and One-Class SVM showed high sensitivity together with a higher false-alarm rate.
Conclusion: Unsupervised machine learning is effective at automatically identifying suspicious wage behavior, and Isolation Forest can provide a scalable, reliable way to reduce fraud risk in insurance systems. It is recommended that this algorithm serve as the core of an intelligent alert system within supervisory structures.

Keywords

Insurance fraud

Isolation Forest

Density-based clustering

One-class support vector machine

Social Security Organization

Article Title (English)

Identifying fraud in the wages of social security insureds using machine learning algorithms

Authors (English)

Amir Salar Rabeiee ¹

Gholamreza Rezaei ²

Davood Gharakhani ³

¹ PhD student in Industrial Management, Islamic Azad University, Qazvin Province, Iran.

² PhD in Economics, Assistant Professor, Islamic Azad University, Qazvin Province, Iran.

³ PhD in Management, Assistant Professor, Department of Management, Islamic Azad University, Qazvin Province, Iran.

Abstract (English)

Purpose: Wage-reporting fraud is a major challenge in social insurance systems. With the growing volume of data and the expansion of online premium‑submission systems, detecting such fraud through traditional methods has become increasingly difficult. This study aims to evaluate the effectiveness of unsupervised machine learning algorithms in identifying anomalies associated with wage-reporting fraud and to propose an automated approach for strengthening supervisory processes within the Social Security Organization.
Method: This applied research adopts a descriptive–analytical approach. Three unsupervised machine learning algorithms—Isolation Forest, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and One-Class Support Vector Machine (One-Class SVM)—were employed to detect anomalous wage records potentially indicative of fraud. The dataset consisted of 26,258 monthly wage records from 470 insured individuals covering the period from 2019 to 2023. Analyses focused on identifying abnormal patterns at both individual and cross-individual levels, and the performance of the algorithms was evaluated based on their detection logic and consistency with actual data behavior.
Findings: Isolation Forest demonstrated the most accurate and stable performance, identifying anomalies with a more reasonable and interpretable distribution compared with the other methods. DBSCAN exhibited excessive exclusion of observations in sparse data environments, while One-Class SVM showed high sensitivity accompanied by a higher rate of false alarms.
Conclusion: Unsupervised machine learning techniques provide an effective means for the automated detection of suspicious wage-reporting behaviors. The findings suggest that Isolation Forest can serve as a scalable and reliable solution for mitigating fraud risks in social insurance systems. It is recommended that this algorithm be utilized as the core component of an intelligent early-warning system within supervisory and monitoring frameworks.
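
The abstract does not give implementation details, so the following is only a minimal sketch of how Isolation Forest scoring might be applied to wage records. Everything in it is an assumption made for illustration: the synthetic wages, the injected jump, the two ratio features (a worker's wage relative to their own median for the individual level, and relative to the monthly peer median for the cross-individual level), and the helper names `toy_wage_features` and `anomaly_scores`. It is a simplified pure-Python variant that builds every tree on the full toy dataset rather than on random subsamples, and it is not the authors' pipeline.

```python
import math
import random
from statistics import median

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop if the chosen feature is constant here
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row lands in a leaf, adjusted for unsplit leaf size."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(rows, n_trees=100, seed=0):
    """Score in (0, 1); values closer to 1 mean the row is isolated sooner."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(rows)))
    trees = [build_tree(rows, 0, max_depth, rng) for _ in range(n_trees)]
    norm = c_factor(len(rows))
    return [
        2.0 ** (-sum(path_length(r, t) for t in trees) / (n_trees * norm))
        for r in rows
    ]


def toy_wage_features():
    """Synthetic wages for 5 workers over 12 months, with one injected jump."""
    bases = [100.0, 110.0, 120.0, 130.0, 140.0]  # arbitrary units, not real data
    wages = {
        (w, m): base * (1 + 0.02 * ((7 * m + 3 * w) % 5 - 2))
        for w, base in enumerate(bases)
        for m in range(12)
    }
    wages[(2, 9)] *= 3.0  # hypothetical suspicious record
    keys = sorted(wages)
    own = {w: median(wages[(w, m)] for m in range(12)) for w in range(len(bases))}
    peer = {m: median(wages[(w, m)] for w in range(len(bases))) for m in range(12)}
    # Feature 1: individual level; feature 2: cross-individual level.
    rows = [(wages[k] / own[k[0]], wages[k] / peer[k[1]]) for k in keys]
    return keys, rows


keys, rows = toy_wage_features()
scores = anomaly_scores(rows)
ranked = sorted(zip(scores, keys), reverse=True)
print("Most anomalous (worker, month):", ranked[0][1])
```

In practice a flagged record would only be a candidate for manual review, since a high anomaly score indicates unusual behavior rather than confirmed fraud.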

Keywords (English)

Insurance Fraud

Isolation Forest

DBSCAN

One-Class Support Vector Machine (One-Class SVM)

Social Security Organization