

PRIM-cipal components analysis

stat.ML updates on arXiv.org
Tianhao Liu, Daniel Andrés Díaz-Pachón, J. Sunil Rao

arXiv:2604.15538v1 Announce Type: new

Abstract: Supervised No Free Lunch Theorems (NFLTs) are well studied, yet unsupervised NFLTs remain underexplored. For elliptical distributions, we prove that there exist two equally optimal, scientifically meaningful bump-hunting strategies that are exact opposites, with no universal winner. Specifically, peeling $k$ orthogonal dimensions from $\mathbb{R}^d$ ($d \ge k$), retaining an inter-quantile region of probability $1-\alpha$ per peeled dimension, maximizes total variance and Frobenius norm when the $k$ smallest principal components (called pettiest components) are selected, and minimizes them when the selected dimensions are the $k$ leading principal components. These optima inspire PRIM-based bump-hunting algorithms that minimize either variance or volume, thereby motivating an NFLT. We test our results on the Fashion-MNIST dataset, showing that peeling the largest principal components captures multiplicity, while peeling the smallest principal components isolates popular styles.
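The peeling step the abstract describes can be sketched in code: project the data onto principal components and retain only the points inside the central $1-\alpha$ inter-quantile box along the $k$ selected components. The sketch below is a minimal illustration of that idea, not the paper's actual PRIM algorithm; the function name `peel_components` and its signature are hypothetical, and the single-pass quantile box stands in for PRIM's iterative peeling.

```python
import numpy as np

def peel_components(X, k, alpha=0.05, smallest=True):
    """Keep rows of X lying inside the central 1 - alpha inter-quantile
    region along each of k selected principal components.

    smallest=True selects the k pettiest (smallest-variance) components;
    smallest=False selects the k leading components.
    (Hypothetical helper illustrating the abstract's peeling step.)
    """
    Xc = X - X.mean(axis=0)
    # Eigendecomposition of the sample covariance; eigh returns
    # eigenvalues in ascending order, so the first columns of V are
    # the pettiest components and the last are the leading ones.
    _, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
    comps = V[:, :k] if smallest else V[:, -k:]
    Z = Xc @ comps  # scores on the selected components, shape (n, k)
    lo = np.quantile(Z, alpha / 2, axis=0)
    hi = np.quantile(Z, 1 - alpha / 2, axis=0)
    # A point survives only if it is inside the box on every component.
    mask = np.all((Z >= lo) & (Z <= hi), axis=1)
    return X[mask]
```

For roughly elliptical data, the selected component scores are close to uncorrelated, so the retained fraction is near $(1-\alpha)^k$; the paper's result concerns which choice of components (pettiest vs. leading) extremizes the variance of what remains.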