com.knowledgebooks.nlp_utils
Class Stemmer

java.lang.Object
  extended by com.knowledgebooks.nlp_utils.Stemmer

public class Stemmer
extends java.lang.Object

Copyright 2002-2008 by Mark Watson. All rights reserved.

This software is not public domain. It can be legally used under either of the following licenses:

1. KnowledgeBooks.com Non Commercial Royalty Free License
2. KnowledgeBooks.com Commercial Use License

See www.knowledgebooks.com for details.


Constructor Summary
Stemmer()
           
 
Method Summary
 void add(char ch)
          Add a character to the word being stemmed.
 void add(char[] w, int wLen)
          Adds wLen characters, taken from the char[] array w, to the word being stemmed.
 char[] getResultBuffer()
          Returns a reference to a character buffer containing the results of the stemming process.
 int getResultLength()
          Returns the length of the word resulting from the stemming process.
 void stem()
          Stem the word placed into the Stemmer buffer through calls to add().
 java.lang.String stemOneWord(java.lang.String word)
           
 java.util.List<java.lang.String> stemString(java.lang.String str)
           
 java.lang.String toString()
          After a word has been stemmed, it can be retrieved with toString(). Alternatively, getResultBuffer() returns a reference to the internal buffer and getResultLength() gives the length of the result, which is generally more efficient.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Stemmer

public Stemmer()

Method Detail

add

public void add(char ch)
Add a character to the word being stemmed. When you have finished adding characters, call stem() to stem the word.


add

public void add(char[] w,
                int wLen)
Adds wLen characters, taken from the char[] array w, to the word being stemmed. This is equivalent to repeated calls to add(char ch), but faster.


toString

public java.lang.String toString()
After a word has been stemmed, it can be retrieved with toString(). Alternatively, getResultBuffer() returns a reference to the internal buffer and getResultLength() gives the length of the result, which is generally more efficient.

Overrides:
toString in class java.lang.Object

getResultLength

public int getResultLength()
Returns the length of the word resulting from the stemming process.


getResultBuffer

public char[] getResultBuffer()
Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
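
The buffer-plus-length contract described above can be sketched as follows. This is a minimal illustration, not code from the library: the class name ResultBufferSketch and the helper readResult are hypothetical, and the literal buffer and length stand in for whatever getResultBuffer() and getResultLength() would return. It assumes the buffer may hold more characters than the result, which is why the length must be consulted.

```java
class ResultBufferSketch {
    // Hypothetical helper: copies only the valid prefix of the buffer into a String.
    static String readResult(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        // Stand-in values (assumption): a buffer that still holds the original
        // input, with only the first 3 characters forming the result.
        char[] buffer = {'r', 'u', 'n', 'n', 'i', 'n', 'g'};
        int length = 3;
        System.out.println(readResult(buffer, length)); // prints "run"
    }
}
```

Converting the whole array with new String(buffer) would include characters past the result, so callers working with the raw buffer should always pair it with the reported length.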


stem

public void stem()
Stem the word placed into the Stemmer buffer through calls to add(). The method is declared void and returns no value; retrieve the result with getResultLength()/getResultBuffer() or toString().
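
The call sequence (add the characters, call stem(), then read the result) might look like the sketch below. To keep the example runnable without the library, it uses ToyStemmer, a hypothetical stand-in that mirrors the documented method shapes but only strips a single trailing 's'. It is not the real stemming logic, and stemWord is likewise an illustrative helper rather than part of the documented API.

```java
import java.util.Arrays;

// Hypothetical stand-in with the same method shapes as the documented Stemmer.
// Assumption: it only strips one trailing 's'; the real class does actual stemming.
class ToyStemmer {
    private char[] buffer = new char[16];
    private int inputLength = 0;   // characters added since the last stem()
    private int resultLength = 0;  // length of the most recent result

    void add(char ch) {
        if (inputLength == buffer.length) {
            buffer = Arrays.copyOf(buffer, buffer.length * 2);
        }
        buffer[inputLength++] = ch;
    }

    void add(char[] w, int wLen) {
        for (int i = 0; i < wLen; i++) {
            add(w[i]);
        }
    }

    void stem() {
        boolean endsWithS = inputLength > 1 && buffer[inputLength - 1] == 's';
        resultLength = endsWithS ? inputLength - 1 : inputLength;
        inputLength = 0; // start fresh for the next word
    }

    char[] getResultBuffer() { return buffer; }

    int getResultLength() { return resultLength; }

    @Override
    public String toString() { return new String(buffer, 0, resultLength); }
}

class StemmerWorkflowSketch {
    // Feeds a word in with add(char[], int), stems it, and reads the result back.
    static String stemWord(String word) {
        ToyStemmer stemmer = new ToyStemmer();
        char[] chars = word.toCharArray();
        stemmer.add(chars, chars.length);
        stemmer.stem();
        return stemmer.toString();
    }

    public static void main(String[] args) {
        System.out.println(stemWord("cats")); // prints "cat" under the toy rule
    }
}
```

With the real class, the same three steps would apply to a com.knowledgebooks.nlp_utils.Stemmer instance, and the output would depend on its actual stemming rules.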


stemString

public java.util.List<java.lang.String> stemString(java.lang.String str)

stemOneWord

public java.lang.String stemOneWord(java.lang.String word)