author | Martin Wilke <miwi@FreeBSD.org> | 2009-03-16 21:47:30 +0000 |
---|---|---|
committer | Martin Wilke <miwi@FreeBSD.org> | 2009-03-16 21:47:30 +0000 |
commit | abb5037267db681350863eb763f240cca4d3c66f (patch) | |
tree | 4d8d6af944e88d473a94cb4e9b8e81ba673001aa /textproc/pystemmer/pkg-descr | |
parent | cae53ca0e18e63b4742b971d53dc10395735c0a1 (diff) |
PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word. This is a form with most of the common
morphological endings removed, hopefully representing a common
linguistic base form. This is most useful in building search engines
and information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".
PyStemmer provides algorithms for several (mainly European) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module. It also provides access to the classic Porter stemming algorithm
for English: although this has been superseded by an improved algorithm,
the original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.
WWW: http://pypi.python.org/pypi/PyStemmer/
PR: ports/132695
Submitted by: Wen Heping <wenheping at gmail.com>
Notes:
svn path=/head/; revision=230266
Diffstat (limited to 'textproc/pystemmer/pkg-descr')
-rw-r--r-- | textproc/pystemmer/pkg-descr | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/textproc/pystemmer/pkg-descr b/textproc/pystemmer/pkg-descr
new file mode 100644
index 000000000000..a505e0793612
--- /dev/null
+++ b/textproc/pystemmer/pkg-descr
@@ -0,0 +1,16 @@
+PyStemmer provides access to efficient algorithms for calculating a
+"stemmed" form of a word. This is a form with most of the common
+morphological endings removed; hopefully representing a common
+linguistic base form. This is most useful in building search engines
+and information retrieval software; for example, a search with stemming
+enabled should be able to find a document containing "cycling" given the
+query "cycles".
+
+PyStemmer provides algorithms for several (mainly european) languages,
+by wrapping the libstemmer library from the Snowball project in a Python
+module. It also provides access to the classic Porter stemming algorithm
+for english: although this has been superceded by an improved algorithm,
+the original algorithm may be of interest to information retrieval
+researchers wishing to reproduce results of earlier experiments.
+
+WWW: http://pypi.python.org/pypi/PyStemmer/
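
The search scenario in the description (finding a document that contains "cycling" from the query "cycles") can be sketched roughly as below. This is an illustration only, not part of the port: `build_index`, `search`, and the sample `documents` are hypothetical names introduced here. The sketch uses PyStemmer's `Stemmer.Stemmer("english")` and its `stemWord` method when the module is installed; otherwise it falls back to a crude suffix stripper that is explicitly not the Snowball algorithm and exists only so the example runs anywhere.

```python
# Sketch of stem-based matching as described in pkg-descr (illustrative only).
try:
    import Stemmer  # PyStemmer, which wraps Snowball's libstemmer

    _snowball = Stemmer.Stemmer("english")
    stem = _snowball.stemWord
except ImportError:
    # Hypothetical stand-in so the sketch runs without PyStemmer.
    # This is NOT the Snowball algorithm, only a crude suffix stripper.
    def stem(word):
        for suffix in ("ing", "es", "s"):
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word


def build_index(documents):
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index.setdefault(stem(token), set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that share a stem with every query term."""
    results = None
    for term in query.lower().split():
        matches = index.get(stem(term), set())
        results = matches if results is None else results & matches
    return results or set()


# Hypothetical sample data for the demonstration.
documents = {
    "a": "cycling through the hills",
    "b": "a stemming library",
}
index = build_index(documents)
hits = search(index, "cycles")
print(hits)
```

Stemming both the indexed tokens and the query terms with the same function is what lets "cycles" reach the document containing "cycling". With PyStemmer installed, `Stemmer.algorithms()` lists the available algorithm names, and passing a different name to `Stemmer.Stemmer` (for instance `"porter"` for the classic Porter stemmer mentioned above) should select that algorithm instead.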