The population of people aged 65 and older has increased since the 1960s,
and current estimates indicate it will double by 2060. Medicare is a federal health
insurance program for people 65 or older in the United States. Medicare claims
fraud and abuse are ongoing problems that waste a large amount of money every year,
resulting in higher health care costs and taxes for everyone. In this study, we
empirically evaluate several unsupervised machine learning approaches, which
achieve reasonable fraud detection results. We employ two unsupervised machine
learning algorithms, Isolation Forest and Unsupervised Random Forest, which have
not been previously used for the detection of fraud and abuse on Medicare data.
Additionally, we implement three other machine learning methods previously applied
to Medicare data: Local Outlier Factor, Autoencoder, and k-Nearest
Neighbor. For our dataset, we combine the 2012 to 2015 Medicare provider utilization
and payment data and add fraud labels from the List of Excluded Individuals/Entities
(LEIE) database. Results show that Local Outlier Factor performs best among the
methods evaluated for Medicare fraud detection.
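
As a minimal illustration of how Local Outlier Factor assigns an anomaly score to a record, the sketch below computes the standard LOF quantities (k-distance, reachability distance, local reachability density) in plain Python. It is not the implementation used in this study. The function name local_outlier_factor, the choice k=3, and the two-feature toy records are assumptions made purely for illustration; they are not drawn from the Medicare data.

```python
import math


def local_outlier_factor(points, k=3):
    """Return one LOF score per point; scores well above 1 suggest outliers.

    Illustrative sketch only: ties at the k-th neighbor distance are cut off
    rather than all kept, and duplicate points are not handled specially.
    """
    n = len(points)

    # k nearest neighbors of each point as (distance, index), nearest first.
    neighbors = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append(dists[:k])

    # k-distance: distance from each point to its k-th nearest neighbor.
    k_distance = [nbrs[-1][0] for nbrs in neighbors]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbors[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)

    # LOF: mean density of the neighbors divided by the point's own density.
    scores = []
    for i in range(n):
        neighbor_lrd = sum(lrd[j] for _, j in neighbors[i]) / len(neighbors[i])
        scores.append(neighbor_lrd / lrd[i])
    return scores


# Hypothetical, already-scaled feature pairs for eight records.
# The last record is deliberately placed far from the others.
points = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
    (1.0, 1.3), (0.8, 0.9), (1.3, 0.8), (5.0, 5.0),
]
scores = local_outlier_factor(points, k=3)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
print("most anomalous record index:", most_anomalous)
```

Records inside the dense cluster receive scores near 1, while the isolated record receives a much larger score. In an unsupervised setting such as the one described above, labels like those from the LEIE would be used only to evaluate the ranking, not to fit the model.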