Who is using BlackLab?
A number of institutes and companies have chosen BlackLab for their corpus or text analysis projects.
If you're using BlackLab, let us know; we'd like to mention your project here!
- The Dutch Language Institute (INT) leads BlackLab development and uses it for its corpora, such as its internal Corpus of Contemporary Dutch (~2.2 billion tokens and growing), Letters as loot, Corpus Gysseling and the Corpus Hedendaags Nederlands, a public Corpus of Contemporary Dutch with over 1 billion tokens.
- Lexion AI, who develop a contract management system. Lexion has an interestingly different use case from INT, with high query volume on many smaller indexes. We are grateful for their valuable contributions to the project.
- In Search of the Drowned: Testimonies and Testimonial Fragments of the Holocaust, edited and built by Gabor M. Toth in collaboration with the Yale Digital Humanities Laboratory.
- OpenSonar, a Dutch corpus, by the University of Tilburg in collaboration with the Dutch Language Institute. The look and feel of our corpus frontend was developed as part of this collaboration as well, for which we'd like to thank Martin Reynaert and Matje van de Camp.
- Frisian Corpora, a collection of Frisian corpora ranging from runes to modern Frisian. The material is mostly unrestricted, and the Mid Frisian material contains detailed linguistic annotations. Eduard Drenth of the Fryske Akademy has made some valuable contributions as well, such as initial support for the Saxon parser.
- Cosycat, a corpus query and annotation interface used for the Mind-Bending Grammars project at the University of Antwerp.
- IKE, a knowledge extraction tool, developed at the Allen Institute for Artificial Intelligence. Allen AI has made very useful contributions as well, improving performance and fixing bugs.
- VIVA Korpusportaal, Virtuele Instituut Vir Afrikaans (South Africa)
- SADiLaR corpus portal, South African Centre for Digital Language Resources
- EarlyPrint, early English print record, by Martin Mueller and Philip Burns at Northwestern University.
- Arabic Digital Humanities. The right-to-left support in our corpus frontend came about thanks to this project.
- Latin lemmatized texts (Alpheios)
- Corpus of Spoken Hindi (Japan)