idl.tmt.representation
Class LinkTextRepresentationBuilder
java.lang.Object
  |
  +--idl.tmt.representation.BagOfWordsRepresentationBuilder
        |
        +--idl.tmt.representation.LinkTextRepresentationBuilder
- All Implemented Interfaces:
- HypertextParsingListener, ParsingListener, RepresentationBuilder, WordParsingListener
- public class LinkTextRepresentationBuilder
- extends BagOfWordsRepresentationBuilder
- implements WordParsingListener, HypertextParsingListener
Created on Mar 22, 2004
- Author:
- jelsas
Method Summary
void documentCollectionComplete()
Indicates that parsing of the collection is done, and this object
builds the representation matrix.
void documentComplete()
Ignored in this representation builder because we're just interested
in the documents linked to, not the current document.
void endLink()
Indicates that the end of a link has been reached.
void newDocument(int docID)
Ignored in this representation builder because we're just interested
in the documents linked to, not the current document.
void startLink(int linkDocID)
Indicates that a link has been started.
void word(java.lang.String word, int pos)
Indicates that a new word has been encountered.
Methods inherited from class idl.tmt.representation.BagOfWordsRepresentationBuilder
addTermToDocRepresentation, buildRepresentation, cleanup, getRepresentation, getTermList, getWeight, isBinarize, isDebug, isShareTermlist, setBinarize, setDebug, setNumDocuments, setShareTermlist, setTermList, setTextParser, setWeight, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, wait, wait, wait
inLink
private boolean inLink
linkDocID
private int linkDocID
LinkTextRepresentationBuilder
public LinkTextRepresentationBuilder(int numDocs)
- Creates a new LinkTextRepresentationBuilder. This class builds
each document's representation from the link text of the documents
that link to it.
LinkTextRepresentationBuilder
public LinkTextRepresentationBuilder()
LinkTextRepresentationBuilder
public LinkTextRepresentationBuilder(int numDocs,
TermList termList,
boolean binarize)
- Creates a LinkTextRepresentationBuilder with the specified term
list.
- Parameters:
termList - shared term list to use
numDocs - number of documents in this collection
binarize - indicates whether the representation should be binary
instead of term counts
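To make the binarize parameter concrete, the following is a minimal sketch of the difference between term counts and a binary representation. BinarizeSketch and its weights method are hypothetical names invented for illustration; they are not part of idl.tmt, and the real builder may compute weights differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of idl.tmt: contrasts term counts with binary weights.
class BinarizeSketch {
    /** Returns term -> weight. With binarize set, every term that occurs gets weight 1. */
    static Map<String, Integer> weights(String[] terms, boolean binarize) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (String t : terms) {
            if (binarize) {
                out.put(t, 1);                 // presence only
            } else {
                out.merge(t, 1, Integer::sum); // accumulate occurrences
            }
        }
        return out;
    }
}
```

With the link text "java docs java", the count form would hold 2 for "java", while the binary form would hold 1.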
LinkTextRepresentationBuilder
public LinkTextRepresentationBuilder(boolean binarize,
TermList termList)
word
public void word(java.lang.String word,
int pos)
- Indicates that a new word has been encountered. Only words within
links are stored.
- Specified by:
word in interface WordParsingListener
- See Also:
WordParsingListener.word(java.lang.String, int)
startLink
public void startLink(int linkDocID)
- Indicates that a link has been started.
- Specified by:
startLink in interface HypertextParsingListener
- See Also:
HypertextParsingListener.startLink(int)
endLink
public void endLink()
- Indicates that the end of a link has been reached.
- Specified by:
endLink in interface HypertextParsingListener
- See Also:
HypertextParsingListener.endLink()
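The interplay of startLink, word, and endLink can be sketched as a small state machine. The class below is a hypothetical stand-in: the method names and the inLink and linkDocID fields mirror this page, but the method bodies and the map used for storage are assumptions, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the builder; bodies and storage are assumed for illustration.
class LinkTextSketch {
    private boolean inLink = false; // true between startLink and endLink
    private int linkDocID = -1;     // ID of the document the open link points to

    // Assumed storage: target doc ID -> (term -> count).
    final Map<Integer, Map<String, Integer>> counts = new HashMap<>();

    void startLink(int linkDocID) {
        this.inLink = true;
        this.linkDocID = linkDocID;
    }

    void word(String word, int pos) {
        if (!inLink) {
            return; // words outside links are not stored
        }
        // Credit the word to the linked-to document, not the current one.
        counts.computeIfAbsent(linkDocID, k -> new HashMap<>())
              .merge(word, 1, Integer::sum);
    }

    void endLink() {
        inLink = false;
    }
}
```

In this sketch, a parser that calls startLink(7), then word twice with "parser", then endLink leaves a count of 2 for "parser" under document 7. Words seen before startLink or after endLink are dropped.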
newDocument
public void newDocument(int docID)
- Ignored in this representation builder because we're just interested
in the documents linked to, not the current document.
- Specified by:
newDocument in interface ParsingListener
- See Also:
ParsingListener.newDocument(int)
documentComplete
public void documentComplete()
- Ignored in this representation builder because we're just interested
in the documents linked to, not the current document.
- Specified by:
documentComplete in interface ParsingListener
- See Also:
ParsingListener.documentComplete()
documentCollectionComplete
public void documentCollectionComplete()
- Indicates that parsing of the collection is done, and this object
builds the representation matrix.
- Specified by:
documentCollectionComplete in interface ParsingListener
- See Also:
ParsingListener.documentCollectionComplete()
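As an illustration of what building the representation matrix could involve once parsing is done, here is a minimal sketch that turns per-document term counts into a dense document-by-term array. MatrixSketch, its build method, and the dense int[][] layout are assumptions; the page does not specify the matrix type used by idl.tmt.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: lays accumulated link-text counts out as a docs x terms matrix.
class MatrixSketch {
    static int[][] build(Map<Integer, Map<String, Integer>> counts,
                         List<String> termList,
                         int numDocs) {
        int[][] matrix = new int[numDocs][termList.size()];
        for (Map.Entry<Integer, Map<String, Integer>> doc : counts.entrySet()) {
            int row = doc.getKey();
            if (row < 0 || row >= numDocs) {
                continue; // skip links to documents outside the collection
            }
            for (Map.Entry<String, Integer> term : doc.getValue().entrySet()) {
                int col = termList.indexOf(term.getKey());
                if (col >= 0) {
                    matrix[row][col] = term.getValue();
                }
            }
        }
        return matrix;
    }
}
```

Rows are indexed by the linked-to document ID and columns by position in the term list, so a document that no one links to ends up with an all-zero row.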