Algorithm::KMeans is a Perl 5 module for clustering numerical data in multidimensional spaces. Since the module is written entirely in Perl (that is, it is not a Perl wrapper around a C library that does the actual clustering), its code can easily be modified to experiment with several aspects of automatic clustering. For example, you can change the criterion used to measure the "distance" between two data points, the stopping condition for accepting the final clusters, the criterion used to measure the quality of the clustering achieved, and so on.

Please note that this clustering module is not meant for very large data files. Being an all-Perl implementation, it does not aim for speed of execution. Instead, the goal is to make it easy to experiment with the different facets of K-Means clustering. If you need to process a large data file, you would be better off with a module like Algorithm::Cluster. But note that when you use a wrapper module in which a C library does the clustering for you, it is more difficult to experiment with the various aspects of clustering.

This module requires the following three modules:

    Math::Random
    Graphics::GnuplotIF
    Math::GSL

The first is needed for generating the multivariate random numbers, the second for the visualization of the clusters, and the third for access to the Perl wrappers for the GNU Scientific Library. The last, Math::GSL, is needed for the 'smart' option for "cluster_seeding" in the constructor.

For installation, do the usual

    perl Makefile.PL
    make
    make test
    make install

if you have root access. If not,

    perl Makefile.PL prefix=/some/other/directory/
    make
    make test
    make install

Contact:

Avinash Kak
email: kak@purdue.edu

Please place the string "KMeans" in the subject line if you wish to write to the author. Any feedback regarding this module would be highly appreciated.
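
As an illustration of the kind of experiment described at the top of this README (changing the "distance" criterion), here is a minimal standalone sketch. The names euclidean_distance, manhattan_distance, and nearest_center are hypothetical and chosen only for this example. They are not taken from the Algorithm::KMeans source, and the module's own internals may be organized differently.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not part of Algorithm::KMeans.

# Euclidean distance between two points given as array refs of equal length.
sub euclidean_distance {
    my ($p, $q) = @_;
    my $sum = 0;
    $sum += ($p->[$_] - $q->[$_]) ** 2 for 0 .. $#$p;
    return sqrt($sum);
}

# Manhattan (city-block) distance, one possible alternative criterion.
sub manhattan_distance {
    my ($p, $q) = @_;
    my $sum = 0;
    $sum += abs($p->[$_] - $q->[$_]) for 0 .. $#$p;
    return $sum;
}

# Index of the center closest to $point under the supplied distance function.
sub nearest_center {
    my ($point, $centers, $dist) = @_;
    my ($best_index, $best_dist);
    for my $i (0 .. $#$centers) {
        my $d = $dist->($point, $centers->[$i]);
        ($best_index, $best_dist) = ($i, $d)
            if !defined $best_dist || $d < $best_dist;
    }
    return $best_index;
}

my @centers = ([0, 0], [5, 2]);
my $point   = [2, 2];

printf "Euclidean criterion: center %d\n",
    nearest_center($point, \@centers, \&euclidean_distance);
printf "Manhattan criterion: center %d\n",
    nearest_center($point, \@centers, \&manhattan_distance);
```

With these sample values the two criteria assign the point to different centers (index 0 under the Euclidean distance, index 1 under the Manhattan distance), which is the sort of effect one might want to study. Passing the distance function as a code reference keeps the assignment step unchanged when the criterion is swapped.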