Diffstat (limited to 'textproc/p5-Search-VectorSpace/pkg-descr')
-rw-r--r-- | textproc/p5-Search-VectorSpace/pkg-descr | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/textproc/p5-Search-VectorSpace/pkg-descr b/textproc/p5-Search-VectorSpace/pkg-descr
new file mode 100644
index 000000000000..0d30dae3d834
--- /dev/null
+++ b/textproc/p5-Search-VectorSpace/pkg-descr
@@ -0,0 +1,12 @@
+This module takes a list of documents (in English) and
+builds a simple in-memory search engine using a vector
+space model. Documents are stored as PDL objects, and
+after the initial indexing phase, the search should be
+very fast. This implementation applies a rudimentary
+stop list to filter out very common words, and uses a
+cosine measure to calculate document similarity.
+All documents above a user-configurable similarity
+threshold are returned.
+
+Author: Maciej Ceglowski <maciej AT ceglowski.com>
+WWW: http://search.cpan.org/dist/Search-VectorSpace/
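
For readers unfamiliar with the technique the description names, the following is a minimal Python sketch of a vector space search: stop-word filtering, term-count vectors, cosine similarity, and a similarity threshold. It is not the module's code or API. The module itself is Perl and stores documents as PDL objects, whereas this sketch uses plain dictionaries. The names `STOP_WORDS`, `tokenize`, `to_vector`, `cosine`, and `search`, the contents of the stop list, and the default threshold of 0.2 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop list; the module's real list is not shown in the description.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic words, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def to_vector(text):
    """Represent a text as a sparse term-count vector."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)


def search(query, docs, threshold=0.2):
    """Return (score, document) pairs scoring above the threshold, best first.

    The threshold default is an assumed value for this sketch.
    """
    query_vec = to_vector(query)
    # Indexing phase: one vector per document.
    index = [(doc, to_vector(doc)) for doc in docs]
    scored = [(cosine(query_vec, vec), doc) for doc, vec in index]
    hits = [(score, doc) for score, doc in scored if score > threshold]
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


if __name__ == "__main__":
    documents = [
        "The cat sat on the mat",
        "Dogs chase cats in the park",
        "Stock markets fell sharply",
    ]
    for score, doc in search("cat mat", documents):
        print(f"{score:.3f}  {doc}")
```

The sketch rebuilds the index on every call to keep it short. A real engine would build the index once and reuse it, which is what the description's "initial indexing phase" suggests. The sketch also does no stemming, so "cats" does not match "cat".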