Multiple Outliers Detection Procedures in Linear Regression

Authors

  • Robiah Adnan
  • Mohd Nor Mohamad
  • Halim Setan

DOI:

https://doi.org/10.11113/matematika.v19.n.502

Abstract

This paper describes a procedure for identifying multiple outliers in linear regression. The procedure uses a robust fit, least trimmed squares (LTS), together with the single linkage clustering method to obtain the potential outliers. Multiple-case diagnostics are then used to identify the outliers among these potential outliers. The performance of this procedure is compared with Serbert's method. Monte Carlo simulations are used to determine which procedure performs best across all of the linear regression scenarios.

Keywords: multiple outliers; linear regression; robust fit; least trimmed squares; single linkage.
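The first stage of the procedure (a robust LTS fit followed by single linkage clustering to flag potential outliers) can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the details, so several things here are assumptions: a single predictor, an approximate LTS fit found by random starts with concentration steps, clustering of the standardized (fitted value, residual) pairs, a cut height of 0.5, and treating the largest cluster as the clean data. All function names are hypothetical. The multiple-case diagnostics stage is not shown.

```python
import random


def ols(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def lts_fit(xs, ys, h=None, n_starts=50, n_csteps=20, seed=0):
    """Approximate LTS: minimise the sum of the h smallest squared residuals."""
    n = len(xs)
    h = h or (n + 3) // 2  # (n + p + 1) // 2 with p = 2 coefficients
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        subset = rng.sample(range(n), 2)  # random two-point start
        if xs[subset[0]] == xs[subset[1]]:
            continue  # a vertical line cannot be fitted
        for _ in range(n_csteps):
            # Concentration step: refit, then keep the h best-fitting points.
            a, b = ols([xs[i] for i in subset], [ys[i] for i in subset])
            ranked = sorted(range(n), key=lambda i: (ys[i] - a - b * xs[i]) ** 2)
            new_subset = sorted(ranked[:h])
            if new_subset == subset:
                break
            subset = new_subset
        a, b = ols([xs[i] for i in subset], [ys[i] for i in subset])
        objective = sum(sorted((y - a - b * x) ** 2 for x, y in zip(xs, ys))[:h])
        if best is None or objective < best[0]:
            best = (objective, a, b)
    return best[1], best[2]


def single_linkage_clusters(points, cut):
    """Cluster labels from cutting a single-linkage tree at height `cut`.

    This equals the connected components of the graph that joins every
    pair of points whose Euclidean distance is at most `cut`.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = sum((p - q) ** 2 for p, q in zip(points[i], points[j])) ** 0.5
            if d <= cut:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def potential_outliers(xs, ys, cut=0.5):
    """Indices outside the largest cluster of standardized (fitted, residual)."""
    a, b = lts_fit(xs, ys)
    fitted = [a + b * x for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]

    def standardize(v):
        m = sum(v) / len(v)
        s = (sum((u - m) ** 2 for u in v) / len(v)) ** 0.5 or 1.0
        return [(u - m) / s for u in v]

    points = list(zip(standardize(fitted), standardize(resid)))
    labels = single_linkage_clusters(points, cut)
    main = max(set(labels), key=labels.count)  # assumed to be the clean data
    return [i for i, lab in enumerate(labels) if lab != main]


# Synthetic example: a straight line with small noise and three shifted points.
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + 0.05 * ((7 * i) % 5 - 2) for i, x in enumerate(xs)]
for i in (3, 10, 16):
    ys[i] += 8.0
flagged = potential_outliers(xs, ys)
print(flagged)
```

The cut height and the choice of clustering variables strongly affect which points are flagged, so in practice they would need to be set according to the rule given in the paper itself.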

Published

2003-06-01

Section

Mathematics