com.knowledgebooks.nlp_utils
Class Tokenizer

java.lang.Object
  extended by com.knowledgebooks.nlp_utils.Tokenizer

public class Tokenizer
extends java.lang.Object

Copyright 2002-2008 by Mark Watson. All rights reserved.

This software is not public domain. It can be legally used under either of the following licenses:

1. KnowledgeBooks.com Non Commercial Royalty Free License
2. KnowledgeBooks.com Commercial Use License

See www.knowledgebooks.com for details.


Constructor Summary
Tokenizer()
           
 
Method Summary
static java.util.List<java.lang.String> wordsToList(java.lang.String s2)
          Utility to tokenize an input string into a List of Strings.
static java.util.List<java.lang.String> wordsToList(java.lang.String s2, int maxR)
          Utility to tokenize an input string into a List of Strings, with a maximum number of returned tokens.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Tokenizer

public Tokenizer()

Method Detail

wordsToList

public static java.util.List<java.lang.String> wordsToList(java.lang.String s2)
Utility to tokenize an input string into a List of Strings.

Parameters:
s2 - string containing words to tokenize
Returns:
a List of parsed tokens
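
The signature above can be illustrated with a minimal sketch. The page does not say how tokens are delimited, so TokenizerSketch below is a hypothetical stand-in that assumes whitespace-separated words. It is not the implementation in com.knowledgebooks.nlp_utils.Tokenizer, which may, for example, treat punctuation differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for com.knowledgebooks.nlp_utils.Tokenizer.
// ASSUMPTION: tokens are runs of non-whitespace characters; the real
// class may split on punctuation or apply other rules.
class TokenizerSketch {

    // Mirrors the documented signature: a String in, a List<String> out.
    static List<String> wordsToList(String s2) {
        List<String> tokens = new ArrayList<>();
        for (String word : s2.trim().split("\\s+")) {
            // split() yields one empty string for blank input; skip it.
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = wordsToList("The quick brown fox");
        System.out.println(tokens); // prints [The, quick, brown, fox]
    }
}
```

Because the method is static, callers do not need to construct a Tokenizer instance. With the real class on the classpath, the call would be written as Tokenizer.wordsToList(text).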

wordsToList

public static java.util.List<java.lang.String> wordsToList(java.lang.String s2,
                                                           int maxR)
Utility to tokenize an input string into a List of Strings, with a maximum number of returned tokens.

Parameters:
s2 - string containing words to tokenize
maxR - maximum number of tokens to return
Returns:
a List of parsed tokens
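
The effect of maxR can be sketched in the same spirit. CappedTokenizerSketch is a hypothetical stand-in, not the library's code. It assumes whitespace-separated tokens and assumes that the cap keeps the first maxR tokens and discards the rest; the page itself only states that maxR is the maximum number of tokens to return.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two-argument Tokenizer.wordsToList.
// ASSUMPTIONS: whitespace delimits tokens, and the first maxR tokens
// are kept. The real class may behave differently on both points.
class CappedTokenizerSketch {

    static List<String> wordsToList(String s2, int maxR) {
        List<String> tokens = new ArrayList<>();
        for (String word : s2.trim().split("\\s+")) {
            // Stop as soon as the cap is reached.
            if (tokens.size() >= maxR) {
                break;
            }
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(wordsToList("one two three four five", 3)); // prints [one, two, three]
        System.out.println(wordsToList("one two", 10));                // prints [one, two]
    }
}
```

In this sketch a cap larger than the number of available tokens simply returns every token, and a cap of zero or less returns an empty list.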