public class DocIndexerFactoryConfig extends Object implements DocIndexerFactory
Nested classes/interfaces inherited from interface DocIndexerFactory: DocIndexerFactory.Format
Modifier and Type | Field and Description |
---|---|
protected Function<String,Optional<ConfigInputFormat>> | finder: Return a config from the supported list, or load it if it's in the unloaded list. |
protected Map<String,String> | formatErrors |
protected boolean | isInitialized |
protected Map<String,ConfigInputFormat> | supported |
protected Map<String,File> | unloaded |
Constructor and Description |
---|
DocIndexerFactoryConfig() |
Modifier and Type | Method and Description |
---|---|
protected void | addFormat(ConfigInputFormat format) |
void | addFormatsInDirectories(List<File> dirs): Locate all config files (files ending with .blf.yaml, .blf.yml, .blf.json) within the list of directories and load them. |
String | formatError(String formatIdentifier): If this format exists but has an error, return the error. |
DocIndexerConfig | get(String formatIdentifier, DocWriter indexer, String documentName, byte[] b, Charset cs): Instantiate a DocIndexer from a byte array. |
DocIndexerConfig | get(String formatIdentifier, DocWriter indexer, String documentName, File f, Charset cs): Instantiate a DocIndexer from a file. |
DocIndexerConfig | get(String formatIdentifier, DocWriter indexer, String documentName, InputStream is, Charset cs): Instantiate a DocIndexer from an input stream. |
DocIndexerConfig | get(String formatIdentifier, DocWriter indexer, String documentName, Reader reader): Instantiate a DocIndexer from a reader. |
DocIndexerFactory.Format | getFormat(String formatIdentifier): Get the full format from its identifier. |
List<DocIndexerFactory.Format> | getFormats(): Return all formats supported by this factory. |
void | init(): Do not call this manually; it is called when this factory is added to the DocumentFormats registry (DocumentFormats.registerFactory(DocIndexerFactory)). |
boolean | isSupported(String formatIdentifier): Whether this factory can instantiate a DocIndexer for this type of format. |
protected Optional<ConfigInputFormat> | load(String formatIdentifier, File f) |
protected void | loadUnloaded() |
protected boolean isInitialized
protected Map<String,ConfigInputFormat> supported
protected Function<String,Optional<ConfigInputFormat>> finder
Why this function? Formats can depend on each other, and this is a mechanism
to allow formats to get/lazy-load other formats registered with this factory.
Why lazy-loading? Formats only refer to other formats by their name, not
their file location, so we need to find them all before we actually load
them. The config files that were found are kept in the unloaded map until they are loaded.
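The lazy-loading idea described here can be sketched in a few lines. This is a minimal illustration, not the BlackLab implementation: the class `LazyFormatRegistry`, its nested `Format` type, and the `discover` and `isLoaded` methods are hypothetical stand-ins, and the unloaded map stores a base format name instead of a `File` so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for the factory's lazy-loading mechanism (not BlackLab code).
public class LazyFormatRegistry {

    // Stand-in for ConfigInputFormat: a name plus an optional base format it depends on.
    public static final class Format {
        public final String name;
        public final String baseFormat; // null if the format has no dependency

        Format(String name, String baseFormat) {
            this.name = name;
            this.baseFormat = baseFormat;
        }
    }

    // Formats that have been loaded, by name (plays the role of "supported").
    private final Map<String, Format> supported = new HashMap<>();

    // Formats that were found but not loaded yet (plays the role of "unloaded").
    // Assumption: the value is the base format name rather than a config file.
    private final Map<String, String> unloaded = new HashMap<>();

    // Handed to formats so they can look up, and if needed load, other formats by name.
    public final Function<String, Optional<Format>> finder = this::find;

    // Record a format as found; nothing is loaded at this point.
    public void discover(String name, String baseFormatOrNull) {
        unloaded.put(name, baseFormatOrNull);
    }

    public boolean isLoaded(String name) {
        return supported.containsKey(name);
    }

    private Optional<Format> find(String name) {
        Format loaded = supported.get(name);
        if (loaded != null) {
            return Optional.of(loaded);
        }
        if (unloaded.containsKey(name)) {
            return load(name);
        }
        return Optional.empty();
    }

    private Optional<Format> load(String name) {
        // Remove first, so a circular dependency cannot recurse forever.
        String base = unloaded.remove(name);
        if (base != null && !finder.apply(base).isPresent()) {
            return Optional.empty(); // the format it depends on could not be found
        }
        Format format = new Format(name, base);
        supported.put(name, format);
        return Optional.of(format);
    }

    public static void main(String[] args) {
        LazyFormatRegistry registry = new LazyFormatRegistry();
        // Both formats are discovered before either is loaded, in any order.
        registry.discover("tei-custom", "tei");
        registry.discover("tei", null);
        // Loading "tei-custom" pulls in "tei" through the finder.
        System.out.println(registry.finder.apply("tei-custom").isPresent());
        System.out.println(registry.isLoaded("tei"));
    }
}
```

Because every format is recorded before any is loaded, the order in which config files are discovered does not matter: a dependency is resolved by name at the moment it is first requested.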
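public void init() throws InvalidInputFormatConfig
Description copied from interface: DocIndexerFactory
Do not call this manually; it is called when this factory is added to the DocumentFormats registry (DocumentFormats.registerFactory(DocIndexerFactory)).
Specified by: init in interface DocIndexerFactory
Throws: InvalidInputFormatConfig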
public boolean isSupported(String formatIdentifier)
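Description copied from interface: DocIndexerFactory
Whether this factory can instantiate a DocIndexer for this type of format.
Specified by: isSupported in interface DocIndexerFactory
Parameters:
formatIdentifier - lowercased and never null or empty string
public List<DocIndexerFactory.Format> getFormats()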
Description copied from interface: DocIndexerFactory
Return all formats supported by this factory.
Specified by: getFormats in interface DocIndexerFactory
public DocIndexerFactory.Format getFormat(String formatIdentifier)
Description copied from interface: DocIndexerFactory
Get the full format from its identifier.
Specified by: getFormat in interface DocIndexerFactory
protected void addFormat(ConfigInputFormat format) throws InvalidInputFormatConfig
Throws: InvalidInputFormatConfig
public void addFormatsInDirectories(List<File> dirs) throws InvalidInputFormatConfig
Locate all config files (files ending with .blf.yaml, .blf.yml, .blf.json) within the list of directories and load them.
DocumentFormats, if not already done.
This is a one-time scan, so configs placed in these directories after this scan will not be picked up.
Parameters:
dirs
Throws: InvalidInputFormatConfig - when one of the formats could not be loaded
protected Optional<ConfigInputFormat> load(String formatIdentifier, File f) throws IOException
Throws: IOException
protected void loadUnloaded()
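The one-time scan performed by addFormatsInDirectories can be pictured with the sketch below, which collects files ending in .blf.yaml, .blf.yml, or .blf.json into a name-to-file map like the unloaded field. It is an illustration under stated assumptions, not the actual implementation: the class `ConfigFileScanner` and its methods are hypothetical, and it assumes that the format identifier is the file name without its extension, that directories are not searched recursively, and that the first file found for an identifier wins.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a one-time config file scan (not BlackLab code).
public class ConfigFileScanner {

    // The config file extensions named in the documentation.
    static final List<String> EXTENSIONS = Arrays.asList(".blf.yaml", ".blf.yml", ".blf.json");

    // Assumption: the format identifier is the file name minus the config extension.
    // Returns null if the file name is not a config file name.
    static String formatIdentifierFor(String fileName) {
        for (String ext : EXTENSIONS) {
            if (fileName.endsWith(ext) && fileName.length() > ext.length()) {
                return fileName.substring(0, fileName.length() - ext.length());
            }
        }
        return null;
    }

    // Scan each directory once (non-recursively) and map identifiers to files.
    static Map<String, File> scan(List<File> dirs) {
        Map<String, File> found = new LinkedHashMap<>();
        for (File dir : dirs) {
            File[] files = dir.listFiles();
            if (files == null) {
                continue; // not a directory, or not readable
            }
            Arrays.sort(files); // deterministic order within a directory
            for (File file : files) {
                if (!file.isFile()) {
                    continue;
                }
                String id = formatIdentifierFor(file.getName());
                if (id != null) {
                    found.putIfAbsent(id, file); // assumption: first match wins
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Pass one or more directories on the command line.
        List<File> dirs = new ArrayList<>();
        for (String arg : args) {
            dirs.add(new File(arg));
        }
        scan(dirs).forEach((id, file) -> System.out.println(id + " -> " + file));
    }
}
```

Since the map is built in a single pass, a config file copied into one of the directories afterwards stays invisible until the scan is run again, which matches the one-time behaviour the documentation describes.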
public DocIndexerConfig get(String formatIdentifier, DocWriter indexer, String documentName, Reader reader)
Description copied from interface: DocIndexerFactory
Instantiate a DocIndexer from a reader.
Specified by: get in interface DocIndexerFactory
Parameters:
formatIdentifier - the formatIdentifier for the document
indexer - indexer object
documentName - name of the unit we're indexing
reader - text to index
public DocIndexerConfig get(String formatIdentifier, DocWriter indexer, String documentName, InputStream is, Charset cs)
Description copied from interface: DocIndexerFactory
Instantiate a DocIndexer from an input stream.
Specified by: get in interface DocIndexerFactory
Parameters:
formatIdentifier - the formatIdentifier for the document
indexer - indexer object
documentName - name of the unit we're indexing
is - data to index
cs - default character set if not defined
public DocIndexerConfig get(String formatIdentifier, DocWriter indexer, String documentName, File f, Charset cs) throws FileNotFoundException
Description copied from interface: DocIndexerFactory
Instantiate a DocIndexer from a file.
Specified by: get in interface DocIndexerFactory
Parameters:
formatIdentifier - the formatIdentifier for the document
indexer - indexer object
documentName - name of the unit we're indexing
f - file to index
cs - default character set if not defined
Throws: FileNotFoundException - if file doesn't exist
public DocIndexerConfig get(String formatIdentifier, DocWriter indexer, String documentName, byte[] b, Charset cs)
Description copied from interface: DocIndexerFactory
Instantiate a DocIndexer from a byte array.
Specified by: get in interface DocIndexerFactory
Parameters:
formatIdentifier - the formatIdentifier for the document
indexer - indexer object
documentName - name of the unit we're indexing
b - data to index
cs - default character set if not defined
public String formatError(String formatIdentifier)
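Description copied from interface: DocIndexerFactory
If this format exists but has an error, return the error.
Specified by: formatError in interface DocIndexerFactory
Parameters:
formatIdentifier - format to check for errors
Copyright © 2020 Instituut voor Nederlandse Taal (INT). All rights reserved.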