InfoVis.net Magazine, message nº 39. Published 2001-04-30
The digital magazine of InfoVis.net
Kohonen maps, also known as Self-Organising Maps (SOM), were developed by Teuvo Kohonen from the early 1980s on. Applied to text collections, they use neural nets* to automatically analyse and categorise the semantic content of documents. The graphical output of this analysis is a 2D map of categories in which each category occupies an area roughly proportional to the frequency of its components: the more frequent patterns occupy a greater area at the expense of the less frequent ones.
Kohonen was motivated by the idea that "the representation of knowledge in a particular category of things in general might assume the form of a feature map that is geometrically organised over the corresponding piece of the brain".** The algorithm takes a set of N-dimensional objects as input and trains a neural network that finally converges to produce a 2D map. Moreover, SOMs are often regarded as among the more realistic models of brain function.
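To make the idea of "N-dimensional input, 2D map output" more concrete, here is a minimal sketch of SOM training in plain Python. It is an illustration under stated assumptions, not Kohonen's reference implementation: the function names, the grid size, the linearly decaying learning rate and the Gaussian neighbourhood are choices made for this example only.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=6, cols=6, epochs=200, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rows x cols map on a list of equal-length numeric vectors.

    All default parameter values are assumptions chosen for this sketch.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised at random in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            progress = step / total_steps
            lr = lr0 * (1.0 - progress)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - progress), 0.5)  # neighbourhood shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Squared distance on the 2D grid, not in input space.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # pull node towards the input
            step += 1
    return weights


# Toy data (invented): one pattern appears nine times as often as the other.
frequent = [0.1, 0.1, 0.1]
rare = [0.9, 0.9, 0.9]
som = train_som([frequent] * 9 + [rare])

near_frequent = sum(
    1
    for w in som.values()
    if sum((a - b) ** 2 for a, b in zip(w, frequent))
    < sum((a - b) ** 2 for a, b in zip(w, rare))
)
print(near_frequent, "of", len(som), "nodes lie closer to the frequent pattern")
```

With data this skewed one would typically expect the frequent pattern to claim the larger share of nodes, which is the "area proportional to frequency" effect described above, although the exact split depends on the parameters chosen.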
But what are Kohonen maps used for? One example is the scheme, and the associated article by Martin Dodge, that the MappaMundi journal presents on the ET-Map application by Prof. Hsinchun Chen of the University of Arizona.
In this suggestive scheme, the top level resembles a set of tiles in which the different domains take polygonal shapes with parallel sides. Each domain has an associated word or phrase that defines the category. Clicking on a particular domain opens a second screen containing a similar map, now restricted to the sub-domains of that domain. We can repeat the process until we reach a level where the individual documents that belong to that specific sub-domain appear as a traditional listing.
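The drill-down navigation just described can be pictured as a tree of labelled regions with document listings at the leaves. The sketch below models only that navigation, not how ET-Map computes its hierarchy; the class, the function names and every label and document in it are hypothetical, invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One labelled area of a category map."""
    label: str
    subregions: list = field(default_factory=list)  # child Region objects
    documents: list = field(default_factory=list)   # entries shown at a leaf


def drill_down(root, clicks):
    """Follow a sequence of clicked labels starting from the top-level map."""
    current = root
    for label in clicks:
        matches = [r for r in current.subregions if r.label == label]
        if not matches:
            raise KeyError(f"no sub-domain named {label!r} under {current.label!r}")
        current = matches[0]
    return current


def view(region):
    """Return what a screen would show: a map of sub-domains or a listing."""
    if region.subregions:
        return ("map", [r.label for r in region.subregions])
    return ("listing", list(region.documents))


# Hypothetical hierarchy, invented for illustration only.
top = Region("Top level", subregions=[
    Region("Computing", subregions=[
        Region("Neural networks", documents=["doc-a", "doc-b"]),
        Region("Databases", documents=["doc-c"]),
    ]),
    Region("Travel", documents=["doc-d"]),
])

print(view(top))
print(view(drill_down(top, ["Computing"])))
print(view(drill_down(top, ["Computing", "Neural networks"])))
```

Each click narrows the view by one level, and the screen switches from a map to a plain listing once a region has no further sub-domains.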
For those who prefer to get straight to the action, Prof. Hsinchun Chen's group offers three different "Spider" programs that you can download free of charge (60-day trial version). All three have a similar structure. You enter a search query, and the program returns a list of URLs that match it, which you can browse and keep or discard. The spider then crawls those pages to obtain more specific results, which it finally distils (or filters) into a list of nouns with their associated frequencies. You keep the nouns you are interested in and discard the others. With all this information the program finally builds a single-level Kohonen map that you can interact with.
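The middle of that workflow (pages in, a term list with frequencies out, then a user-filtered set of terms) can be sketched roughly as follows. This is not the Spider programs' actual code: a real system would identify nouns with a part-of-speech tagger or noun phraser, whereas this example substitutes a small stop-word list, and the function names and sample texts are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny stop-word list standing in for real noun extraction (an assumption).
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are", "from"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(pages, min_length=3):
    """Count candidate terms across all fetched page texts."""
    counts = Counter()
    for text in pages:
        for word in tokenize(text):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts


def keep_terms(counts, discarded):
    """Drop the terms the user is not interested in."""
    return {term: n for term, n in counts.items() if term not in discarded}


def page_vectors(pages, terms):
    """Describe each page by how often it uses each kept term."""
    ordered = sorted(terms)
    vectors = []
    for text in pages:
        words = Counter(tokenize(text))
        vectors.append([words[term] for term in ordered])
    return ordered, vectors


# Invented sample texts standing in for downloaded pages.
pages = [
    "Neural networks map documents. Networks learn.",
    "The map shows documents",
]
counts = term_frequencies(pages)
kept = keep_terms(counts, discarded={"shows", "learn"})
terms, vectors = page_vectors(pages, kept)
print(terms)
print(vectors)
```

Vectors of this kind, one per page, are the sort of N-dimensional input from which a single-level map could then be trained.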
The three applications differ in their objective: CI-Spider is devoted to Competitive Intelligence (what my competitors are working on); Meta-Spider searches several search engines simultaneously; and Cancer Spider specialises in searching online cancer databases. All three require a certain amount of patience, since they search the Internet.
To get an idea of what these maps look like without too much complication, you can look at Map.net, a similar example that lets you browse the whole Internet using a category map. Map.net is the showcase for the VisualNet technology marketed by Antarcti.ca.
Are Kohonen maps useful? The (scarce and limited) usability tests indicate that when you know precisely what you are looking for, traditional systems perform better. Nevertheless, when you want the big picture of a web site or a set of documents, category maps can be very useful.
Kohonen maps are one more alternative among the many now emerging that aim to make large sets of textual information more digestible.
*) For those interested in neural networks, see for example the tutorial of the Polytechnic University of Madrid.
**) Kohonen, T. Self-Organization and Associative Memory. Springer-Verlag, 1989.