A simple analyzer that isn't limited to Latin.
The token filter to accompany BLDutchTokenizer.
A simple tokenizer for Dutch texts.
A tokenizer that doesn't tokenize (returns the whole field value as one token).
An analyzer that doesn't tokenize but returns the whole field value as a single token.
A simple analyzer based on StandardTokenizer that isn't limited to Latin.
Simple whitespace analyzer.
Lowercases and/or removes any accents from the input.
Removes any accents from the input.
Replaces punctuation with space.
Analyzer implementations, including Tokenizers and Filters.
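The transformations summarized above (returning a whole field value as one token, lowercasing, removing accents, replacing punctuation with spaces) can be illustrated with a minimal sketch that uses only the JDK. It is not the implementation of the classes in this package: the class name `AnalysisSketch` and all of its method names are hypothetical, and the use of Unicode NFD decomposition for accent removal is an assumption about one common way to do it.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of the package described above.
public class AnalysisSketch {

    // "Doesn't tokenize": the whole field value comes back as a single token.
    static List<String> singleToken(String fieldValue) {
        return List.of(fieldValue);
    }

    // Lowercase with a fixed locale so results don't depend on the default locale.
    static String lowercase(String input) {
        return input.toLowerCase(Locale.ROOT);
    }

    // Assumed approach: decompose to NFD, then drop combining marks (\p{M}).
    static String removeAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Replace every Unicode punctuation character (\p{P}) with one space.
    static String replacePunctuation(String input) {
        return input.replaceAll("\\p{P}", " ");
    }

    public static void main(String[] args) {
        System.out.println(singleToken("de grote kat"));
        System.out.println(removeAccents(lowercase("Één Café")));
        System.out.println(replacePunctuation("hond, kat."));
    }
}
```

Because accent removal here works on Unicode combining marks rather than a fixed Latin table, the same method also strips diacritics from non-Latin scripts that decompose under NFD; whether that matches the behaviour of the filters in this package is not stated in the summary.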
Copyright © 2020 Instituut voor Nederlandse Taal (INT). All rights reserved.