Description
Taylor & Francis Robust Cluster Analysis and Variable Selection 2014 Edition by Gunter Ritter
Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of both applications, describing scenarios in which accuracy and speed are the primary goals.

Robust Cluster Analysis and Variable Selection includes all of the important theoretical details, and covers the key probabilistic models, robustness issues, optimization algorithms, validation techniques, and variable selection methods. The book illustrates the different methods with simulated data and applies them to real-world data sets that can be easily downloaded from the web. This provides you with guidance in how to use clustering methods as well as applicable procedures and algorithms without having to understand their probabilistic fundamentals.
Table of contents:

Introduction

Mixture and classification models and their likelihood estimators
  General consistency and asymptotic normality
    Local likelihood estimates
    Maximum likelihood estimates
    Notes
  Mixture models and their likelihood estimators
    Latent distributions
    Finite mixture models
    Identifiable mixture models
    Asymptotic properties of local likelihood maxima
    Asymptotic properties of the MLE: constrained nonparametric mixture models
    Asymptotic properties of the MLE: constrained parametric mixture models
    Notes
  Classification models and their criteria
    Probabilistic criteria for general populations
    Admissibility and size constraints
    Steady partitions
    Elliptical models
    Normal models
    Geometric considerations
    Consistency of the MAP criterion
    Notes

Robustification by trimming
  Outliers and measures of robustness
    Outliers
    The sensitivities
    Sensitivity of ML estimates of mixture models
    Breakdown points
  Trimming the mixture model
    Trimmed likelihood function of the mixture model
    Normal components
    Universal breakdown points of covariance matrices, mixing rates, and means
    Restricted breakdown point of mixing rates and means
    Notes
  Trimming the classification model - the TDC
    Trimmed MAP classification model
    Normal case - the Trimmed Determinant Criterion, TDC
    Breakdown robustness of the constrained TDC
    Universal breakdown point of covariance matrices and means
    Restricted breakdown point of the means
    Notes

Algorithms
  EM algorithm for mixtures
    General mixtures
    Normal mixtures
    Mixtures of multivariate t-distributions
    Trimming - the EMT algorithm
    Order of convergence
    Acceleration of the mixture EM
    Notes
  k-Parameters algorithms
    General and elliptically symmetric models
    Steady solutions and trimming
    Using combinatorial optimization
    Overall algorithms
    Notes
  Hierarchical methods for initial solutions

Favorite solutions and cluster validation
  Scale balance and Pareto solutions
  Number of components of uncontaminated data
    Likelihood-ratio tests
    Using cluster criteria as test statistics
    Model selection criteria
    Ridgeline manifold
  Number of components and outliers
    Classification trimmed likelihood curves
    Trimmed BIC
    Adjusted BIC
  Cluster validation
    Separation indices
    Normality and related tests
    Visualization
    Measures of agreement of partitions
    Stability
  Notes

Variable selection in clustering
  Irrelevance
    Definition and general properties
    The normal case
  Filters
    Univariate filters
    Multivariate filters
  Wrappers
    Using the likelihood ratio test
    Using Bayes factors and their BIC approximations
    Maximum likelihood subset selection
    Consistency of the MAP cluster criterion with variable selection
  Practical guidelines
  Notes

Applications
  Miscellaneous data sets
    IRIS data
    SWISS BILLS
    STONE FLAKES
  Gene expression data
    Supervised and unsupervised methods
    Combining gene selection and profile clustering
    Application to the LEUKEMIA data
  Notes

Appendix A: Geometry and linear algebra
Appendix B: Topology
Appendix C: Analysis
Appendix D: Measures and probabilities
Appendix E: Probability
Appendix F: Statistics
Appendix G: Optimization