Introduction

What is BlackLab?

BlackLab is an open-source corpus search engine built on top of Apache Lucene. It allows fast, complex searches with accurate hit highlighting on large bodies of tagged and annotated text.

Who is it for?

BlackLab was designed primarily for linguists who want to search for (potentially complex) patterns in large bodies of text annotated with linguistic properties (headword, part-of-speech, paragraphs, sentences, named entities, etc.). BlackLab is also used for other purposes, including historical research and artificial intelligence.

How do I use it?

If you want to use BlackLab in your own projects, it’s available both as a web service (BlackLab Server) and as a Java library (BlackLab Core). Because BlackLab Server is a web service, you can use it from any programming language that can make HTTP requests.
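
As a rough illustration of calling BlackLab Server from another language, the Python sketch below only builds the URL for a search request and sends nothing. The base URL, the corpus name my-corpus, the helper build_hits_url, and the endpoint and parameter names (hits, patt, number, outputformat) are assumptions made for illustration; check them against the documentation for your BlackLab Server version.

```python
from urllib.parse import quote, urlencode


def build_hits_url(base_url: str, corpus: str, pattern: str, number: int = 20) -> str:
    """Build a search URL for a (hypothetical) BlackLab Server instance.

    Nothing is sent over the network; the function only assembles the URL.
    The path segment "hits" and the parameter names are assumptions.
    """
    query = urlencode(
        {
            "patt": pattern,          # the search pattern, percent-encoded by urlencode
            "number": number,         # maximum number of hits to request
            "outputformat": "json",   # ask for a JSON response
        }
    )
    return f"{base_url.rstrip('/')}/{quote(corpus)}/hits?{query}"


# Hypothetical local server and corpus name, used only for illustration.
url = build_hits_url(
    "http://localhost:8080/blacklab-server",
    "my-corpus",
    '[lemma="search"]',
)
print(url)
```

The resulting URL can be fetched with any HTTP client. Keeping URL construction separate from the request itself makes the encoding of the pattern easy to inspect before anything is sent.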

Who made BlackLab?

BlackLab was developed at the Dutch Language Institute (INT) to provide a fast and feature-rich search interface on our historical and contemporary text corpora. It was released as open source (Apache License 2.0) in 2012 and has since gathered a number of users and contributors. It is still in active development.