idl.tmt.representation
Class BagOfWordsRepresentationBuilder
java.lang.Object
|
+--idl.tmt.representation.BagOfWordsRepresentationBuilder
- All Implemented Interfaces:
- RepresentationBuilder
- Direct Known Subclasses:
- BodyTextRepresentationBuilder, H1TextRepresentationBuilder, LinkTextRepresentationBuilder, MetaTextRepresentationBuilder, TitleTextRepresentationBuilder
- public abstract class BagOfWordsRepresentationBuilder
- extends java.lang.Object
- implements RepresentationBuilder
Abstract representation builder that keeps track of the in-process
document representations and finalizes the TmtMatrix object.
Created on Apr 7, 2004
- Author:
- jelsas
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, wait, wait, wait
myMatrix
private TmtMatrix myMatrix
numDocs
private int numDocs
binarize
private boolean binarize
shareTermlist
private boolean shareTermlist
debug
private boolean debug
weight
private double weight
termList
private TermList termList
textParser
protected TextDocumentParser textParser
rep
private java.util.HashMap rep
BagOfWordsRepresentationBuilder
public BagOfWordsRepresentationBuilder()
setNumDocuments
public void setNumDocuments(int numDocs)
- Specified by:
setNumDocuments in interface RepresentationBuilder
addTermToDocRepresentation
protected void addTermToDocRepresentation(int docID, java.lang.String term)
buildRepresentation
protected void buildRepresentation()
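The page gives no bodies for addTermToDocRepresentation and buildRepresentation, so the following is only a minimal, self-contained sketch of the general bag-of-words pattern their names suggest: accumulate per-document term counts, then finalize them into a matrix. The class BagOfWordsSketch, its methods addTerm, build and columnOf, and the plain double[][] result are all hypothetical stand-ins. They are not part of idl.tmt, and the real TmtMatrix and TermList types may behave differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the idl.tmt implementation.
final class BagOfWordsSketch {
    private final int numDocs;
    // docID -> (term -> raw count): the in-process document representations.
    private final Map<Integer, Map<String, Integer>> rep = new HashMap<>();
    // Terms in first-seen order; a term's position is its matrix column.
    private final List<String> terms = new ArrayList<>();
    private final Map<String, Integer> termIndex = new HashMap<>();

    BagOfWordsSketch(int numDocs) {
        this.numDocs = numDocs;
    }

    // Assumed analogue of addTermToDocRepresentation(int docID, String term).
    void addTerm(int docID, String term) {
        if (docID < 0 || docID >= numDocs) {
            throw new IllegalArgumentException("docID out of range: " + docID);
        }
        if (!termIndex.containsKey(term)) {
            termIndex.put(term, terms.size());
            terms.add(term);
        }
        rep.computeIfAbsent(docID, id -> new HashMap<>()).merge(term, 1, Integer::sum);
    }

    // Assumed analogue of buildRepresentation(): a dense documents-by-terms matrix.
    double[][] build() {
        double[][] matrix = new double[numDocs][terms.size()];
        for (Map.Entry<Integer, Map<String, Integer>> doc : rep.entrySet()) {
            for (Map.Entry<String, Integer> e : doc.getValue().entrySet()) {
                matrix[doc.getKey()][termIndex.get(e.getKey())] = e.getValue();
            }
        }
        return matrix;
    }

    // Column of a term, or -1 if the term was never added.
    int columnOf(String term) {
        Integer i = termIndex.get(term);
        return i == null ? -1 : i;
    }

    public static void main(String[] args) {
        BagOfWordsSketch builder = new BagOfWordsSketch(2);
        for (String t : "the cat saw the dog".split(" ")) {
            builder.addTerm(0, t);
        }
        builder.addTerm(1, "dog");
        double[][] m = builder.build();
        System.out.println(m[0][builder.columnOf("the")]); // prints 2.0
        System.out.println(m[1][builder.columnOf("dog")]); // prints 1.0
    }
}
```

In this sketch a subclass-like caller would invoke addTerm once per parsed token and call build once at the end, which mirrors the "in-process, then finalize" split described in the class summary.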
getRepresentation
public TmtMatrix getRepresentation()
- Specified by:
getRepresentation in interface RepresentationBuilder
setTermList
public void setTermList(TermList termList)
getTermList
public TermList getTermList()
- Specified by:
getTermList in interface RepresentationBuilder
setBinarize
public void setBinarize(boolean binarize)
isBinarize
public boolean isBinarize()
setWeight
public void setWeight(double weight)
- Specified by:
setWeight in interface RepresentationBuilder
getWeight
public double getWeight()
- Specified by:
getWeight in interface RepresentationBuilder
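The page does not say how the binarize flag and the weight value affect the finished matrix. One plausible reading, shown here purely as an assumption, is that binarize collapses raw term counts to 0 or 1 and weight scales each resulting entry. CellValueSketch and cellValue are hypothetical names that do not appear in idl.tmt.

```java
// Hypothetical illustration; the real class may combine these settings differently.
final class CellValueSketch {
    // Computes one matrix entry from a raw term count under the assumed semantics.
    static double cellValue(int rawCount, boolean binarize, double weight) {
        if (rawCount < 0) {
            throw new IllegalArgumentException("rawCount must be non-negative");
        }
        // Assumption: binarize records presence/absence instead of frequency.
        double base = binarize ? (rawCount > 0 ? 1.0 : 0.0) : rawCount;
        // Assumption: weight is a simple multiplicative factor.
        return weight * base;
    }

    public static void main(String[] args) {
        System.out.println(cellValue(3, false, 0.5)); // prints 1.5
        System.out.println(cellValue(3, true, 0.5));  // prints 0.5
    }
}
```

Under this reading, a weight below 1.0 on one builder (for example link text) would let its terms count for less than those from another builder (for example title text) when several representations are combined.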
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
setShareTermlist
public void setShareTermlist(boolean shareTermlist)
isShareTermlist
public boolean isShareTermlist()
setTextParser
public void setTextParser(TextDocumentParser textParser)
cleanup
public void cleanup()
- Specified by:
cleanup in interface RepresentationBuilder
setDebug
public void setDebug(boolean debug)
isDebug
public boolean isDebug()