path: root/textproc/p5-Search-VectorSpace/pkg-descr
This module takes a list of English-language documents
and builds a simple in-memory search engine using a
vector space model. Documents are stored as PDL (Perl
Data Language) objects, so searches should be very fast
once the initial indexing phase is complete. The
implementation applies a rudimentary stop list to
filter out very common words and uses a cosine measure
to score the similarity between the query and each
document. All documents scoring above a
user-configurable similarity threshold are returned.
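
The approach described above can be illustrated with a short
sketch. It is written in Python for brevity and is not the
module's own code or API: the stop list, the function names
(tokenize, cosine, build_index, search) and the default
threshold are all assumptions made for illustration, and plain
term counts stand in for the PDL vectors.

```python
import math
from collections import Counter

# Hypothetical stop list; the module's real list is not reproduced here.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "on", "is", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def cosine(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(docs):
    """Indexing phase: one term-count vector per document."""
    return [Counter(tokenize(d)) for d in docs]


def search(index, docs, query, threshold=0.1):
    """Return (score, document) pairs scoring above the threshold, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, vec), doc) for vec, doc in zip(index, docs)]
    hits = [(s, d) for s, d in scored if s > threshold]
    return sorted(hits, key=lambda h: h[0], reverse=True)
```

Under these assumptions, build_index is called once per
collection, and each later call to search only has to compare
the query vector against the stored document vectors.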

WWW: https://metacpan.org/release/Search-VectorSpace