arxiv:1904.06546

Self-Paced Probabilistic Principal Component Analysis for Data with Outliers

Published on Apr 13, 2019

Authors:

Wanpeng Zhang ,

Abstract

A novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA) is introduced to improve robustness against outliers in dimensionality reduction and feature extraction.

AI-generated summary

Principal Component Analysis (PCA) is a popular tool for dimensionality reduction and feature extraction in data analysis. There is a probabilistic version of PCA, known as Probabilistic PCA (PPCA). However, standard PCA and PPCA are not robust, as they are sensitive to outliers. To alleviate this problem, this paper introduces the Self-Paced Learning mechanism into PPCA, and proposes a novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA). Furthermore, we design the corresponding optimization algorithm based on the alternative search strategy and the expectation-maximization algorithm. SP-PPCA looks for optimal projection vectors and filters out outliers iteratively. Experiments on both synthetic problems and real-world datasets clearly demonstrate that SP-PPCA is able to reduce or eliminate the impact of outliers.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1904.06546 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/1904.06546 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1904.06546 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.