# API versions
We're evolving the BlackLab web service API over time, using versioning.
There's currently three major versions:
3.0
: legacy API version, used by BlackLab up to 3.x4.0
: newer, mostly compatible API version, used by BlackLab 4.x. Adds some keys and endpoints and makes slight changes, mostly to improve consistency and reduce redundancy. The additions are still experimental, just like v5 of the API.5.0
: STILL EXPERIMENTAL. stricter, cleaner version of the4.0
API, removing parameters and response keys from3.0
. Because this is still experimental, we may make breaking changes. Useful for testing.
To know which API version your BLS defaults to, check the server info page (/
). Look for the key blacklabResponse.apiVersion
. If this doesn't exist, BLS is using API v3 or lower. If it exists and has the value 4.0
or 5.0
, BLS is using that API version.
On any BLS request, include the api
parameter to specify which API version to use. So if you want to keep using the older API for now, add api=3
. If you don't specify an API version, the default version will be used. If you specify an unsupported API version, e.g. api=2
, you will get an error.
To set the default API version, configure parameters.api
in your blacklab-server.yaml
(or blacklab-webservice.yaml
for Solr).
Valid values for API version include:
- Exact version, e.g.
4.0
- Major version, e.g.
4
(most recent "regular" release in the 4.x series, i.e. without a-..
suffix) cur
orcurrent
, the current stable version (currently4.0
)exp
orexperimental
, always points to the highest version available (currently5.0
)
New versions may be added in the future, and support for older versions will eventually be dropped (see the API support roadmap below).
API version numbers roughly match the BlackLab version where they became (or will become) the default API version.
Note that until BlackLab 4.0 is released, any additions to API v4 should be considered experimental. So where it states "you should use this for future compatibility", that really only applies after 4.0 is released.
# API changes from 3.0 to 4.0
API v4.0 is the default API on the development branch and starting from BlackLab 4.0.
Below are all the differences between API version 3.0 and 4.0.
To prepare for API version 5.0 (which will likely be the default in BlackLab 5.0), you should stop using the deprecated features.
# Changed
- Corpus info page (
/CORPUSNAME
):- Two keys were renamed to be more consistent: in the
versionInfo
block,blacklabVersion
andblacklabBuildTime
are now spelled with a lowercasel
, just like on the server info page. This is unlikely to break any clients. - A key was added:
blacklabScmRevision
, giving the short Git commit hash for this version of BlackLab.
- Two keys were renamed to be more consistent: in the
- Annotated fields:
displayOrder
will no longer include internal annotations (e.g.punct
and_relation
, previously calledstarttag
), as these are generally not meant to be displayed as search fields.
# Added
- Server info page:
- New key added (
/
):apiVersion
(valid values:3.0
,4.0
,5.0
; assume3.0
if missing) - In addition to
indices
, the newcorpora
key was added that provides the same information in a slightly different format. You should usecorpora
instead ofindices
for future compatibility. - In addition to being reported under
fieldInfo
,pidField
is now also a top-level key. You should use this version of the key for future compatibility. (the other special fields infieldInfo
will be moved tocustom
in v5)
- New key added (
- Corpus info page:
mainAnnotatedField
indicates which of the annotated fields is the main search field. This is useful if there's more than one annotated field, e.g. in case of parallel corpora.
- Search (hits) operations:
- The
patt
parameter may also be specified as a JSON query structure. This will be detected automatically, or you can setpattlang
tojson
to make it explicit. - In the JSON response,
summary
will now include apattern
object containing: ajson
key that giving the JSON query structure; abcql
key giving the (re-)serialized pattern in BlackLab Corpus Query Language; afieldName
key giving the search field, e.g."contents"
; and amatchInfos
key giving the match info groups from the query with their types (span, tag, relation, list). You can use this to convert between the two representations, e.g. for query builders (or use the new/parse-pattern
endpoint, see below). The XML response does not contain thejson
object. - In addition to
captureGroups
,matchInfos
will be reported that includes the same information as well as any inline tags and relations matched. You should use this instead ofcaptureGroups
for future compatibility. before
/after
are the new, preferred alternatives toleft
/right
,e.g. when sorting/grouping on context. Not all languages are LTR, so this makes more sense. Existing endpoints still useleft
/right
in the response for compatibility, but new endpoints have been updated as well. These properties can now get a number of tokens as an extra parameter, e.g.before:lemma:i:2
.- For grouping on context,
wordleft
/wordright
have been deprecated. Usebefore
/after
with 1 token instead. - Grouped results also include a
properties
key that gives the values for the individual properties used for grouping. This is an alternative toidentityDisplay
, which clients would sometimes have to parse to display it in their preferred way. context
is the new name for thewordsaroundhit
parameter and supports more options (separate before/after, whole sentence, etc.)
- The
- Pages that list values for fields, tags, etc. now support the
limitvalues
parameter. This parameter defaults to200
, but can be set higher if you need really long value lists. - New endpoints were added for all operations on corpora, at
/corpora/CORPUSNAME/...
(for now alongside existing endpoints/CORPUSNAME
). These endpoints are available in BlackLab v4 but only "speak" API v5 (see below). You should move to these endpoints for future compatibility. - A new endpoint
/parse-pattern
was added that allows you to parse a CorpusQL or JSON query structure pattern without actually executing the search. - A new endpoint
.../CORPUSNAME/relations
that will return all the spans ("inline tags") and relations indexed in the corpus. - Doc info on results pages and document info page: the new
tokenCounts
array gives token counts for all annotated fields. The first annotated field is the main one, which has the same value aslengthInTokens
. - When using
usecontent=orig
, you can now specifyescapexmlfragment
to control whether XML fragments are escaped as CDATA or not. This defaults tofalse
for API v4 and older, but will betrue
for API v5+.
# Deprecated
These features still work for now, but will be removed in the future.
- Server info page:
- The
indices
object. Usecorpora
instead. - The top-level
fieldInfo
object. Instead, use the top-levelpidField
to find the persistent identifier field. For other special fields, you still have to usefieldInfo
, but that will move tocustom
in API v5.
- The
- Document info page (
/docs/DOC_PID
) and results pages:metadataFieldDisplayNames
,metadataFieldGroups
anddocFields
are deprecated. This information can be found on the corpus info page, so it should be retrieved from there once. - Doc info on results pages and document info page:
lengthInTokens
is deprecated; use the first object in thetokenCounts
array instead, which gives thetokenCount
for the main annotated field. - Search operations:
- the
wordsaroundhit
parameter (usecontext
instead) - sort/group properties
wordleft
,wordright
. Usebefore
/after
instead with1
as the number of tokens, e.g.before:lemma:i:1
instead ofwordleft:lemma:i
. - The
captureGroups
response key. UsematchInfos
instead.
- the
- The old
/CORPUSNAME/...
endpoints. Use the new/corpora/CORPUSNAME/...
endpoints (that respond with API v5 responses) instead.
# API changes from 4.0 to 5.0
API v5.0 will become the default in BlackLab 5.0. Right now it's experimental and can be used for testing. Use api=exp
to test that your client works with this API version.
Note that the new endpoints (like /corpora/CORPUSNAME/...
) in BlackLab 4.0 always "speak" API v5, so those won't change when going from API v4 to v5. So where we say "changed" or "removed" below, we usually mean that compared to the equivalent old endpoints.
# Removed
All these were deprecated in API v4.0, and v5.0 removes them:
- Old endpoints related to corpora have been removed. Use the new
/corpora/...
endpoints introduced in API v4 instead. - Server info page removed
indices
. Usecorpora
instead. - Document info page (
/docs/DOC_PID
) and results pages no longer includemetadataFieldDisplayNames
,metadataFieldGroups
ordocFields
. This information can be found on the corpus info page and need not be sent with each document info request. - Results pages don't include
captureGroups
anymore. UsematchInfos
instead.
# Changed
These are breaking changes compared to v4.0. Make sure you update your client accordingly.
- All XML responses:
- XML responses don't use dynamic values as element names anymore, but instead adopt a
<entry><key>...</key><value>...</value></entry>
structure. This avoids problems with and simplifies maintaining the (less-used) XML format. Mainly affectssummary/searchParams
,docInfos
,users[]
in/sharing
response. - Wherever the XML used attributes for map entries, e.g.
<entry key="key">value</entry>
, this was changed to<entry><key>...</key><value>...</value></entry>
as well.
- XML responses don't use dynamic values as element names anymore, but instead adopt a
- server, corpus, field info:
indexName
in responses has been replaced withcorpusName
.- Corpora, metadata and annotated field and annotations report certain properties (such as
displayName
,description
) inside acustom
block now etc. These are all ignored by BlackLab but may be useful for client applications such as BlackLab Frontend. They are only included in responses if you specifycustom=true
.
- Results pages:
summary
has been restructured to group related values together. Keys have been renamed for clarity.- response keys
left
/right
have been replaced withbefore
/after
in the/hits
response. docInfos
now have ametadata
subobject instead of mixing metadata withmayView
andlengthInTokens
.- When using
usecontent=orig
, the value ofescapexmlfragment
now defaults totrue
, so XML fragments from the document will be escaped as CDATA. Set it tofalse
to include them as part of the XML structure instead (the old default).
# API support roadmap
This is how we intend to evolve BlackLab Server and Frontend with respect to API version support:
- [x] BlackLab v4.0 alpha/beta
- [ ] deprecate API v3 (responses include a "deprecation warning key")
- [x] add API v4 (adds endpoints, transitional; default API)
- [x] add API v5 experimental (enforces exclusive use of new endpoints)
- [ ] Frontend v4.0: start using API v4: use new endpoints if available, fall back to v3 endpoints if not (for now). So if the BLS you're talking to supports v4 (check server info page if blacklabResponse.apiVersion exists and is >= 4):
- [ ] use
/relations
to get list of tags - [ ] from server info page, use
corpora
key instead of deprecatedindices
key. - [ ] use
/corpora/CORPUSNAME/...
endpoints
- [ ] use
- [ ] BlackLab v4.0 release
- [ ] API v4/5 finalized
- [ ] BlackLab v5.0:
- [ ] switch default to API v5.
- [ ] deprecate API v4.
- [ ] remove deprecated API v3.
- [ ] Frontend v5.0: drop support for API v3 and v4 (use new endpoints exclusively). Test this by passing
api=5
to BLS (enforces only new endpoints). Frontend should still work with BlackLab 4.0 at this point (because that already supported API v5; it just wasn't the default). - [ ] BlackLab v6.0:
- [ ] remove deprecated API v4