Multivariate analysis & classification

Normal mixture models
Several codes are available that characterize multivariate datasets as mixtures of Gaussian populations via likelihood methods, often using Bayesian principles. They include EMMIX by G. McLachlan (P), MCLUST by C. Fraley and A. Raftery (P), AutoClass C by P. Cheeseman (P), and Snob by D. Dowe (P).
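As a rough illustration of the likelihood approach to normal mixtures, the sketch below fits a two-component Gaussian mixture by expectation-maximization (EM). It is deliberately simplified to one dimension and is not taken from any of the packages named above, which may use different algorithms and priors. The function names, the starting means supplied by the caller, and the variance floor `min_var` are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(data, mean_a, mean_b, n_iter=50, min_var=1e-6):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch only).

    mean_a and mean_b are caller-supplied starting guesses; min_var is an
    assumed floor that keeps a component from collapsing onto one point.
    """
    n = len(data)
    grand_mean = sum(data) / n
    # Start both components with the overall sample variance.
    var_a = var_b = max(sum((x - grand_mean) ** 2 for x in data) / n, min_var)
    weight_a = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component a.
        resp = []
        for x in data:
            pa = weight_a * normal_pdf(x, mean_a, var_a)
            pb = (1.0 - weight_a) * normal_pdf(x, mean_b, var_b)
            resp.append(pa / (pa + pb))
        # M-step: responsibility-weighted means, variances and mixing weight.
        n_a = sum(resp)
        n_b = n - n_a
        mean_a = sum(r * x for r, x in zip(resp, data)) / n_a
        mean_b = sum((1.0 - r) * x for r, x in zip(resp, data)) / n_b
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp, data)) / n_a, min_var)
        var_b = max(sum((1.0 - r) * (x - mean_b) ** 2 for r, x in zip(resp, data)) / n_b, min_var)
        weight_a = n_a / n
    return {"weight_a": weight_a, "mean_a": mean_a, "var_a": var_a,
            "mean_b": mean_b, "var_b": var_b}
```

For example, `fit_two_gaussians([0.0, 0.5, 1.0, 10.0, 10.5, 11.0], 0.0, 11.0)` separates the two groups of three points. A multivariate version would replace the scalar variances with covariance matrices.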
Multivariate data analysis software
Collection of subroutines for principal components analysis, partitioning, hierarchical clustering, discriminant analyses (linear, multiple, k-nearest neighbors), correspondence analysis, multidimensional scaling, Sammon mapping, and the Kohonen self-organizing feature map.
Classification Society of North America (CSNA)
Metasite with many links to classification meetings, journals, discussion groups, commercial and on-line software.
Software for clustering and multivariate analysis
Metasite with descriptions of on-line programs and packages.
Machine Learning Library in C++ (MLC++)
Data mining and multivariate classification package including data manipulation, a variety of categorizers (on attributes, thresholds, nearest neighbor, perceptron, decision tree), induction algorithms, and visualization tools for data and trees. (P)
R Package
Package in Pascal developed for ecological spatio-temporal multivariate datasets, based on the monograph by L. & P. Legendre (1983). Functionalities include autocorrelation using correlograms (Moran's I and Geary's c indices), hierarchical agglomerative clustering, k-means clustering, chronological clustering for multivariate time series, analysis of variance, geometrical connectors (nearest neighbor, Gabriel's connection, Delaunay triangulation), Mantel's two-sample statistic, multidimensional scaling by principal coordinates analysis, and univariate periodogram. (P)
Multivariate analysis and graphical display package for Macintosh and Windows 95. Also provided is NetMul, a Web interface to ADE-4 for on-line principal components analysis, co-inertial analysis and discriminant analysis. (P)
Feasible solution algorithms
Algorithms for the common high breakdown estimation criteria, and to find the minimum volume ellipsoid in multivariate datasets. (P)
Interactive Projection Pursuit, providing 1- and 2-dimensional projections of multivariate data for interactive discovery of structure. The user chooses and graphically investigates interesting projections. From Case Western Reserve University. C and Fortran algorithms installed as a library for S-Plus. (P)
Oblique decision trees
Hyperplane partitioning of multivariate datasets. (P)
Clustering algorithm based on dynamic altering of hierarchies. (P)
Fast Algorithm for Classification Trees
Tree-structured classification similar to CART. (P)
Library of several dozen subroutines from NIST for multivariate clustering algorithms from the 1975 monograph by J. A. Hartigan.
Cluster analysis
Six programs computing dissimilarities, partitioning using medoids, k-medoid clustering, fuzzy clustering, agglomerative and divisive hierarchical clustering, clustering of binary data.
Average-linkage hierarchical clustering.
Hierarchical clustering
Algorithm for agglomerative clustering using various criteria (Ward's minimum variance, single linkage, average linkage, complete linkage, McQuitty's method, median method, centroid method).
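To make the linkage criteria concrete, here is a minimal sketch of agglomerative clustering covering only three of the criteria listed above (single, complete, and average linkage). It is a naive cubic-time illustration, not the algorithm distributed at this entry. The function names and the `n_clusters` stopping rule are assumptions made for this example.

```python
import math


def linkage_distance(cluster_a, cluster_b, dist, method):
    """Distance between two clusters of point indices under a linkage rule."""
    pairwise = [dist[i][j] for i in cluster_a for j in cluster_b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage method: " + method)


def agglomerate(points, n_clusters, method="single"):
    """Merge the two closest clusters until n_clusters remain."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    # Precompute the Euclidean distance matrix once.
    dist = [[math.dist(p, q) for q in points] for p in points]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b], dist, method)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Calling `agglomerate(points, 3)` on a list of coordinate tuples returns three lists of point indices. Ward's, McQuitty's, median, and centroid criteria require different update rules and are not shown.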
AS 15
Algorithm for single-linkage and minimum intra-cluster variance clustering.
AS 58
Algorithm for single-linkage and minimum intra-cluster variance clustering.
k-means clustering
k-means clustering minimizing intra-cluster variance.
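The idea of minimizing intra-cluster variance can be sketched with the standard Lloyd iteration: assign each point to its nearest center, then move each center to the mean of its members. This is an illustrative sketch rather than the code listed here. The function names and the requirement that the caller supply `initial_centers` are assumptions chosen to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Lloyd's k-means iteration (illustrative sketch).

    Returns (centers, labels, wss), where wss is the total within-cluster
    sum of squares that the iteration tries to reduce.
    """
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no change in assignments: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, wss
```

The result depends on the starting centers, so in practice the iteration is usually repeated from several starts and the solution with the smallest `wss` is kept.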
Multivariate linear regression by least median of squares.
Minimum volume ellipsoid estimator
Robust estimator of multivariate location and dispersion.
Hypothesis testing for means and spreads for multivariate Gaussian data.
Projection pursuit
Two-dimensional exploratory projection pursuit.
Multivariate skewness and kurtosis
Probabilities of R^2
Distribution function of the squared multiple correlation coefficient.
Linear dependency analysis for multivariate data.
Principal components analysis
