%A W. Nick Street %T Cancer Diagnosis and Prognosis via Linear-Programming-Based Machine Learning %D August 1994 %R 94-14 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X The purpose of this research is twofold. The first purpose is the development of machine learning methods based on linear programming. These advances come in the form of both novel learning algorithms and generalization-improvement approaches that are applicable to a wide range of learning algorithms. The second aim is the application of these algorithms, along with ideas from the fields of statistics and image processing, to problems arising in the diagnosis and treatment of breast cancer. The first chapter provides background into the relevant areas of research, most importantly machine learning and breast cancer diagnosis and prognosis, and describes previous work that has united the two fields. The first major contribution of this research is an automated cytological analysis system that we call Xcyt. This system includes a partially automatic image segmentation facility for isolating cell nuclei from a digital image. {\bf Xcyt} allows computation of various size, shape and texture features of these nuclei. A linear separator distinguishes benign cases from malignant cases based on 569 training examples. Xcyt also provides an estimate of the probability of malignancy for each case. The Xcyt system is currently in use at the University of Wisconsin Hospitals. We also address the problem of cancer prognosis, that is, determining when a cancer is likely to recur. A new learning method, the recurrence surface approximation (RSA), is introduced for predicting the time of recurrence. This procedure uses linear programming to predict recurrence from right-censored input data. Various extensions and variations of RSA are also explored, including a separation-based procedure that we call implicit RSA (IRSA). The topic of improving the generalization of a learning system is first addressed in the context of RSA, where we introduce a procedure for choosing the most relevant input features for use the the predictive model. The final two chapters also describe generalization-enhancement methods. The first is mock generalization, which incorporates a tuning set into the optimization process. The second is banded approximation, which explicitly defines a tolerance band around the predictive surface in order to avoid overfitting the training data.