public abstract class DocIndexerAbstract extends DocIndexer
Modifier and Type | Field and Description |
---|---|
protected int | nDocumentsSkipped |
protected CountingReader | reader |
protected boolean | skippingCurrentDocument |
protected int | wordsDone - Total words processed by this indexer. |
Fields inherited from class DocIndexer: currentLuceneDoc, documentName, docWriter, logger, MAX_DOCVALUES_LENGTH, metadataFieldValues, omitNorms, parameters
Constructor and Description |
---|
DocIndexerAbstract() - NOTE: newer DocIndexers should only have a default constructor, and provide methods to set the Indexer object and the document being indexed (which are called by the Indexer). |
DocIndexerAbstract(DocWriter indexer, String fileName, Reader reader) |
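The constructor note above describes a two-step setup: create the indexer with a no-argument constructor, then have the owning Indexer supply the document through setters. A minimal sketch of that pattern follows. SketchDocIndexer is a hypothetical stand-in, not a BlackLab class; only the method names setDocumentName, setDocument and index are taken from this page, and their bodies here are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a DocIndexer subclass; not part of BlackLab.
class SketchDocIndexer {
    private String documentName;
    private Reader reader;

    // Default constructor only: nothing is known about the document yet.
    SketchDocIndexer() {
    }

    // Called by the owning indexer before indexing starts.
    void setDocumentName(String documentName) {
        this.documentName = documentName;
    }

    // Called by the owning indexer to supply the document content.
    void setDocument(Reader reader) {
        this.reader = reader;
    }

    // Reads the whole document and returns the number of characters seen.
    // A real indexer would tokenize and store content instead of counting.
    int index() {
        if (reader == null)
            throw new IllegalStateException("setDocument() was not called for " + documentName);
        try {
            int count = 0;
            while (reader.read() != -1)
                count++;
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class SketchDocIndexerDemo {
    public static void main(String[] args) {
        SketchDocIndexer indexer = new SketchDocIndexer();
        indexer.setDocumentName("example.xml");
        indexer.setDocument(new StringReader("<doc>hello</doc>"));
        System.out.println("characters read: " + indexer.index());
    }
}
```

Keeping the constructor empty lets a framework instantiate any indexer class reflectively and configure it afterwards, which is presumably why the note recommends it.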
Modifier and Type | Method and Description |
---|---|
void | appendContent(char[] buffer, int start, int length) |
void | appendContent(String str) |
void | close() |
protected int | getCharacterPosition() - Returns the current position in the original XML content, in characters. |
static String | getDescription(Class<? extends DocIndexer> docIndexerClass) - If the supplied class has a static getDescription() method, call it. |
static String | getDisplayName(Class<? extends DocIndexer> docIndexerClass) - If the supplied class has a static getDisplayName() method, call it. |
static boolean | isVisible(Class<? extends DocIndexer> docIndexerClass) - Should this DocIndexer implementation be listed? A DocIndexer can be hidden by implementing a static function named isVisible that returns false. |
void | processContent(char[] buffer, int start, int length) |
void | processContent(String contentToProcess) |
void | reportCharsProcessed() - Report the number of new characters processed since the last call. |
void | reportTokensProcessed() - Report the change in wordsDone since the last report. |
void | setDocument(Reader reader) - Set the document to index. |
void | startCaptureContent(String fieldName) |
int | storeCapturedContent() |
void | storePartCapturedContent() |
Methods inherited from class DocIndexer: addMetadataField, addMetadataFieldsFromParameters, addMetadataToDocument, addNumericFields, addToForwardIndex, getCurrentLuceneDoc, getDocWriter, getMetadataField, getMetadataFieldTypeFromIndexerProperties, getParameter, getParameter, getParameter, getParameter, getSensitivitySetting, hasParameter, index, luceneTypeFromIndexMetadataType, optTranslateFieldName, setDocument, setDocument, setDocument, setDocumentName, setDocWriter, setOmitNorms, setParameter, setParameters, tokenizeField, warn
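getDisplayName, getDescription and isVisible each look for an optional static method on the supplied class. This page does not show how that lookup is implemented; the sketch below is one way to do it with standard Java reflection. StaticLookupSketch, NamedIndexer and PlainIndexer are hypothetical names, and the fallback value returned when the method is missing is an assumption of this sketch, not documented behaviour.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class StaticLookupSketch {

    // Calls a public static no-argument method by name if the class has one;
    // otherwise returns the supplied fallback. The fallback behaviour is an
    // assumption of this sketch.
    static Object callStaticIfPresent(Class<?> cls, String methodName, Object fallback) {
        try {
            Method m = cls.getMethod(methodName);
            if (!Modifier.isStatic(m.getModifiers()))
                return fallback; // an instance method cannot be called without an object
            return m.invoke(null);
        } catch (ReflectiveOperationException e) {
            // Method missing, inaccessible, or it threw: use the fallback.
            return fallback;
        }
    }

    // Hypothetical indexer class that opts in to a display name and hides itself.
    public static class NamedIndexer {
        public static String getDisplayName() {
            return "Named format";
        }

        public static boolean isVisible() {
            return false;
        }
    }

    // Hypothetical indexer class that declares none of the optional methods.
    public static class PlainIndexer {
    }

    public static void main(String[] args) {
        System.out.println(callStaticIfPresent(NamedIndexer.class, "getDisplayName", "NamedIndexer"));
        System.out.println(callStaticIfPresent(PlainIndexer.class, "getDisplayName", "PlainIndexer"));
        System.out.println(callStaticIfPresent(NamedIndexer.class, "isVisible", true));
        System.out.println(callStaticIfPresent(PlainIndexer.class, "isVisible", true));
    }
}
```

Because static methods cannot be declared in an interface contract for subclasses, a reflective lookup like this is a common way to let each implementation optionally describe itself without instantiating it.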
protected boolean skippingCurrentDocument
protected CountingReader reader
protected int wordsDone
protected int nDocumentsSkipped
public DocIndexerAbstract()
public void startCaptureContent(String fieldName)
public int storeCapturedContent()
public void storePartCapturedContent()
public void appendContent(String str)
public void appendContent(char[] buffer, int start, int length)
public void processContent(char[] buffer, int start, int length)
public void processContent(String contentToProcess)
protected int getCharacterPosition()
getCharacterPosition in class DocIndexer
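The reader field is a CountingReader, and getCharacterPosition() reports a position in characters. The implementation of CountingReader is not shown on this page; the following is only a sketch of how a Reader wrapper can keep such a count, built on java.io.FilterReader. CharCountingReader and getCharsRead are hypothetical names introduced for illustration.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical wrapper; not BlackLab's CountingReader.
class CharCountingReader extends FilterReader {
    private long charsRead = 0;

    CharCountingReader(Reader in) {
        super(in);
    }

    // Single-character read: count it unless end of stream was reached.
    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1)
            charsRead++;
        return c;
    }

    // Bulk read: count however many characters were actually delivered.
    @Override
    public int read(char[] buffer, int start, int length) throws IOException {
        int n = super.read(buffer, start, length);
        if (n > 0)
            charsRead += n;
        return n;
    }

    // Skipped characters also advance the position in the underlying content.
    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        charsRead += skipped;
        return skipped;
    }

    // Number of characters consumed from the wrapped reader so far.
    long getCharsRead() {
        return charsRead;
    }
}

class CharCountingReaderDemo {
    public static void main(String[] args) throws IOException {
        CharCountingReader reader = new CharCountingReader(new StringReader("abcdef"));
        reader.read();                    // consumes one character
        reader.read(new char[3], 0, 3);   // consumes up to three more
        System.out.println("position: " + reader.getCharsRead());
    }
}
```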
public void setDocument(Reader reader)
setDocument in class DocIndexer
Parameters: reader - document
public void close() throws BlackLabRuntimeException
close in interface AutoCloseable
close in class DocIndexer
Throws: BlackLabRuntimeException
public final void reportCharsProcessed()
reportCharsProcessed in class DocIndexer
public final void reportTokensProcessed()
reportTokensProcessed in class DocIndexer
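Both reportCharsProcessed() and reportTokensProcessed() are described as reporting only what changed since the previous report. A minimal sketch of that bookkeeping is shown below. DeltaReporter is a hypothetical helper, and passing the delta to a LongConsumer listener is an assumption of this sketch; this page does not say where the real methods send their reports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Hypothetical helper illustrating "report only what is new since the last call".
class DeltaReporter {
    private long total = 0;          // running total, e.g. words or characters done
    private long lastReported = 0;   // value of total at the previous report
    private final LongConsumer listener;

    DeltaReporter(LongConsumer listener) {
        this.listener = listener;
    }

    // Records progress without reporting it yet.
    void add(long amount) {
        total += amount;
    }

    // Sends the difference between the running total and the previously
    // reported total, then stores the new baseline.
    void report() {
        long delta = total - lastReported;
        lastReported = total;
        listener.accept(delta);
    }
}

class DeltaReporterDemo {
    public static void main(String[] args) {
        List<Long> reports = new ArrayList<>();
        DeltaReporter reporter = new DeltaReporter(delta -> reports.add(delta));
        reporter.add(10);
        reporter.report();
        reporter.add(5);
        reporter.add(2);
        reporter.report();
        System.out.println(reports);
    }
}
```

Reporting deltas rather than totals lets a listener sum progress across many indexers without tracking each one's previous value.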
public static String getDisplayName(Class<? extends DocIndexer> docIndexerClass)
Parameters: docIndexerClass - class to get the display name for
public static String getDescription(Class<? extends DocIndexer> docIndexerClass)
Parameters: docIndexerClass - class to get the description for
public static boolean isVisible(Class<? extends DocIndexer> docIndexerClass)
Parameters: docIndexerClass
Copyright © 2020 Instituut voor Nederlandse Taal (INT). All rights reserved.