# Using BlackLab Server

Work in progress

This will become a guided introduction to BlackLab Server. See the REST API reference for details about each endpoint.

# Overview

# JSON, XML or CSV?

The web service can answer in JSON, XML or CSV. You can select the output format in two ways:

  • by passing the HTTP header Accept with the value application/json, application/xml or text/csv
  • by passing an extra parameter outputformat with the value json, xml or csv

If both are specified, the parameter takes precedence.

We'll usually use JSON in our examples.
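The two ways of selecting a format described above can be sketched in Python as follows. This is a minimal illustration that only builds the request and does not send it; the helper name `build_request` and its arguments are hypothetical, and the host is simply the one used in the examples of this guide.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host taken from the example URLs in this guide; replace with your own server.
BASE_URL = "http://blacklab.ivdnt.org/blacklab-server"

# MIME types for the Accept header, as listed above.
MIME_TYPES = {"json": "application/json", "xml": "application/xml", "csv": "text/csv"}


def build_request(path, params=None, fmt="json", use_header=False):
    """Build (but do not send) a request that selects the output format.

    Hypothetical helper: with use_header=True the format goes into the
    Accept header, otherwise into the outputformat parameter.
    """
    query = dict(params or {})
    headers = {}
    if use_header:
        headers["Accept"] = MIME_TYPES[fmt]
    else:
        query["outputformat"] = fmt
    url = f"{BASE_URL}/{path}"
    if query:
        # urlencode also takes care of URL-encoding the double quotes.
        url += "?" + urlencode(query)
    return Request(url, headers=headers)


req = build_request("opensonar/hits", {"patt": '"test"'}, fmt="xml")
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would then perform the actual call.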

# Running results count

BlackLab Server is mostly stateless: a particular URL will always result in the same response. An exception to this is the running result count. When you're requesting a page of results, and there are more results to the query, BlackLab Server will retrieve these results in the background. It will report how many results it has retrieved and whether it has finished or is still retrieving.

A note about retrieving versus counting: BlackLab Server (BLS) has two limits for processing results, a maximum number of hits to retrieve/process and a maximum number of hits to count. Retrieving or processing a hit means that the hit is stored and will appear on the results page, is sorted, grouped, faceted, etc. If the retrieval limit is reached, BLS will still keep counting hits but will no longer store them.
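A client that wants the final count can repeat the same request until the server reports that it has finished. The sketch below shows one way to do that. It assumes the response summary contains a boolean `stillCounting` and a running total `numberOfHits`; these field names are assumptions here and should be checked against the REST API reference. The function `wait_for_count` and the `fetch_summary` callable are hypothetical.

```python
import time


def wait_for_count(fetch_summary, poll_seconds=1.0, max_polls=30):
    """Poll until the server reports that counting has finished.

    fetch_summary: a callable that repeats the request and returns the
    summary part of the response as a dict (hypothetical; supply your own).
    Assumes the summary has a boolean 'stillCounting' field.
    """
    summary = fetch_summary()
    for _ in range(max_polls):
        if not summary.get("stillCounting", False):
            break  # the count is final
        time.sleep(poll_seconds)
        summary = fetch_summary()
    return summary


# Usage with canned responses instead of a live server:
responses = iter([
    {"stillCounting": True, "numberOfHits": 10},
    {"stillCounting": True, "numberOfHits": 50},
    {"stillCounting": False, "numberOfHits": 73},
])
final = wait_for_count(lambda: next(responses), poll_seconds=0)
print(final["numberOfHits"])  # prints 73
```

Capping the number of polls keeps the client from waiting indefinitely on a very large result set; in that case the last summary returned still carries the running count.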

# Examples

There are code examples showing how to use BlackLab Server from a number of different programming languages.

Below are examples of individual requests to BlackLab Server.

NOTE: for clarity, double quotes have not been URL-encoded.
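In real requests the double quotes (and characters such as colons and commas) do need to be URL-encoded. As a small sketch of how a client might do this, Python's standard `urllib.parse.urlencode` handles the encoding; the helper `search_url` is a hypothetical name, and the host and corpus are the ones used in the examples on this page.

```python
from urllib.parse import urlencode

# Host taken from the example URLs in this guide.
BASE_URL = "http://blacklab.ivdnt.org/blacklab-server"


def search_url(corpus, operation, **params):
    """Build a URL-encoded search URL (hypothetical helper)."""
    return f"{BASE_URL}/{corpus}/{operation}?{urlencode(params)}"


print(search_url("opensonar", "hits", patt='"test"'))
# http://blacklab.ivdnt.org/blacklab-server/opensonar/hits?patt=%22test%22

print(search_url("opensonar", "docs", filter="title:guide", patt='"test"',
                 sort="field:author,field:date", first=61, number=30))
# http://blacklab.ivdnt.org/blacklab-server/opensonar/docs?filter=title%3Aguide&patt=%22test%22&sort=field%3Aauthor%2Cfield%3Adate&first=61&number=30
```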

# Searches

All occurrences of “test” in the “opensonar” corpus (CorpusQL query)

http://blacklab.ivdnt.org/blacklab-server/opensonar/hits?patt="test"

All documents having “guide” in the title and “test” in the contents, sorted by author and date, results 61-90

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs?filter=title:guide&patt="test"&sort=field:author,field:date&first=61&number=30

Occurrences of “test”, grouped by the word left of each hit

http://blacklab.ivdnt.org/blacklab-server/opensonar/hits?patt="test"&group=wordleft

Documents containing “test”, grouped by author

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs?patt="test"&group=field:author

Larger snippet around a hit:

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs/0345391802/snippet?hitstart=120&hitend=121&context=50

# Information about a document

Metadata of document with specific PID

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs/0345391802

The entire original document

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs/0345391802/contents

The entire document, with occurrences of “test” highlighted (with <hl/> tags)

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs/0345391802/contents?patt="test"

Part of the document (embedded in a <blacklabResponse> root element; BlackLab makes sure the resulting XML is well-formed)

http://blacklab.ivdnt.org/blacklab-server/opensonar/docs/0345391802/contents?wordstart=1000&wordend=2000
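For a long document, a client could request the contents one window at a time using wordstart and wordend. The following is a rough sketch that only generates the URLs. The function `content_window_urls` is hypothetical, the document length must be known to the caller, and the exact boundary semantics (whether wordend is inclusive or exclusive) are an assumption to verify against the REST API reference.

```python
# Host taken from the example URLs in this guide.
BASE_URL = "http://blacklab.ivdnt.org/blacklab-server"


def content_window_urls(pid, total_words, window=1000, corpus="opensonar"):
    """Yield one contents URL per window of `window` words (hypothetical helper).

    Assumes consecutive windows may share a boundary value; adjust if the
    server treats wordend as inclusive.
    """
    for start in range(0, total_words, window):
        end = min(start + window, total_words)
        yield f"{BASE_URL}/{corpus}/docs/{pid}/contents?wordstart={start}&wordend={end}"


for url in content_window_urls("0345391802", total_words=2500):
    print(url)
```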

# Information about indices

Information about the webservice; list of available indices

http://blacklab.ivdnt.org/blacklab-server/ (trailing slash optional)

Information about the “opensonar” corpus (structure, fields, (sub)annotations, human-readable names)

http://blacklab.ivdnt.org/blacklab-server/opensonar/ (trailing slash optional)

Information about the “opensonar” corpus, including all values for the “pos” annotation (listvalues is a comma-separated list of annotation names):

http://blacklab.ivdnt.org/blacklab-server/opensonar/?listvalues=pos

Information about the “opensonar” corpus, including all values for the “pos” annotation and any subannotations (listvalues may contain regexes):

http://blacklab.ivdnt.org/blacklab-server/opensonar/?listvalues=pos.*

Autogenerated XSLT stylesheet for transforming whole documents (only available for configfile-based XML formats):

http://blacklab.ivdnt.org/blacklab-server/input-formats/folia/xslt