%A P. S. Bradley
%A O. L. Mangasarian
%A W. N. Street
%T Clustering via Concave Minimization
%D May 1996
%R 96-03
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X The problem of assigning $m$ points in the
$n$-dimensional real space $R^n$ to $k$ clusters
is formulated as that
of determining $k$ centers in $R^n$ such that the sum
of distances of each point to the nearest center is minimized.
If a polyhedral distance is used, the problem can be formulated
as that of minimizing a piecewise-linear concave function
on a polyhedral set which is shown to be equivalent to a bilinear
program: minimizing
a bilinear function on a polyhedral set.
A fast finite k-Median Algorithm consisting
of solving few linear programs in closed form
leads to a stationary point of the bilinear program.
Computational testing on a number of real-world databases
was carried out. On the Wisconsin Diagnostic Breast
Cancer (WDBC) database, k-Median training set correctness was
comparable to that of the k-Mean Algorithm, however
its testing set correctness was better. Additionally,
on the Wisconsin Prognostic Breast Cancer (WPBC) database,
distinct and clinically important survival curves were extracted
from the database by the k-Median Algorithm, whereas the
k-Mean Algorithm failed to obtain such distinct survival
curves for the same database.