Who is using BlackLab?

A number of institutes and companies have chosen BlackLab for their corpus or text analysis projects:

  • The Dutch Language Institute (INT) develops BlackLab and uses it for its corpora, such as the internal Corpus of Contemporary Dutch (~2.2 billion tokens and growing), Letters as loot, Corpus Gysseling, and the Corpus Hedendaags Nederlands (CLARIN login required), a public Corpus of Contemporary Dutch of ~1 billion tokens.
  • Lexion AI, which develops a contract management system. Lexion's use case differs from INT's in an interesting way: high query volume on many smaller indexes. We are grateful for their valuable contributions to the project.
  • In Search of the Drowned: Testimonies and Testimonial Fragments of the Holocaust, edited and built by Gabor M. Toth in collaboration with the Yale Digital Humanities Laboratory.
  • OpenSonar, a Dutch corpus, by the University of Tilburg in collaboration with the Dutch Language Institute. The look and feel of our corpus frontend was developed as part of this collaboration as well, for which we'd like to thank Martin Reynaert and Matje van de Camp.
  • Cosycat, a corpus query and annotation interface used for the Mind-Bending Grammars project, University of Antwerp.
  • IKE, a knowledge extraction tool, developed at the Allen Institute for Artificial Intelligence. Allen AI has made very useful contributions as well, improving performance and fixing bugs.
  • VIVA Korpusportaal, Virtuele Instituut Vir Afrikaans (South Africa)
  • SADiLaR corpus portal, South African Centre for Digital Language Resources
  • EarlyPrint, early English print record, by Martin Mueller and Philip Burns at Northwestern University.
  • Arabic Digital Humanities. The right-to-left support in our corpus frontend came about thanks to this project.
  • Fryske Akademy, NL (restricted corpus). Eduard Drenth has made some valuable contributions as well, such as the option to use the Saxon parser.
  • cosycat (Collaborative Synchronized Corpus Annotation Tool)
  • Latin lemmatized texts (Alpheios)
  • Corpus of Spoken Hindi (Japan)
  • If you're using BlackLab, let us know; we'd like to mention your project here!