bastienbot/spacy-api-docker

 
 


spaCy API Docker

Ready-to-use Docker images for the spaCy NLP library.

Try the demo for spaCy 1.9.0 or spaCy 2.0.0 (alpha)!

Features

  • Use the awesome spaCy NLP framework with other programming languages.
  • Better scaling: One NLP - multiple services.
  • Built using the official spaCy REST services.
  • Dependency parsing visualisation with displaCy.
  • Docker images for English, German, Spanish and French.
  • Automated builds to stay up to date with spaCy.
  • spaCy versions used: 1.9.0 / 2.0.0 alpha.

Please note that this is a completely new API and is incompatible with the previous one. If you still need the old API, use jgontrum/spacyapi:en-legacy or jgontrum/spacyapi:de-legacy.

The documentation, API code and frontend code are based on spaCy REST services by Explosion AI.


Images

| Image | Description |
| --- | --- |
| jgontrum/spacyapi:base | Base image, containing no language model |
| jgontrum/spacyapi:latest | English language model |
| jgontrum/spacyapi:en | English language model |
| jgontrum/spacyapi:de | German language model |
| jgontrum/spacyapi:es | Spanish language model |
| jgontrum/spacyapi:fr | French language model |
| jgontrum/spacyapi:all | Contains EN, DE, ES and FR language models |
| jgontrum/spacyapi:base_v2 | Base image for spaCy 2.0 |
| jgontrum/spacyapi:en_v2 | English language model for spaCy 2.0 |
| jgontrum/spacyapi:de_v2 | German language model for spaCy 2.0 |
| jgontrum/spacyapi:es_v2 | Spanish language model for spaCy 2.0 |
| jgontrum/spacyapi:all_v2 | Contains EN, DE and FR language models for spaCy 2.0 |
| jgontrum/spacyapi:en-legacy | Old API with English model (legacy) |
| jgontrum/spacyapi:de-legacy | Old API with German model (legacy) |

Usage

docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en

All models are loaded at startup. Depending on model size and server performance, this can take a few minutes.

The displaCy frontend is available at /ui.
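
Because loading the models can take a few minutes, a client may want to wait until the API answers before sending work. The following is a minimal sketch, assuming the container is published on port 8080 as in the command above and that GET /version only succeeds once the server is up. The helper names `version_probe` and `wait_until_ready` are hypothetical, and the standard library `urllib` is used only to keep the example dependency-free.

```python
import json
import time
import urllib.error
import urllib.request


def version_probe(base_url="http://localhost:8080"):
    """Return the reported spaCy version, or None if the server is not answering yet."""
    try:
        with urllib.request.urlopen(base_url + "/version", timeout=5) as resp:
            return json.load(resp).get("spacy")
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, HTTP error or invalid JSON: treat as "not ready".
        return None


def wait_until_ready(probe=version_probe, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Call probe() until it returns a non-None value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("spaCy API did not become ready in time")
        sleep(interval)
```

Calling `wait_until_ready()` right after `docker run` blocks until the version is reported or raises `TimeoutError` after five minutes; both limits are arbitrary defaults chosen for illustration.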

Docker Compose

version: '2'

services:
  spacyapi:
    image: jgontrum/spacyapi:en
    ports:
      - "127.0.0.1:8080:80"
    restart: always

REST API Documentation

GET /ui/

displaCy frontend is available here.


POST /dep/

Example request:

{
    "text": "They ate the pizza with anchovies",
    "model":"en",
    "collapse_punctuation": 0,
    "collapse_phrases": 1
}

| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |
| collapse_punctuation | boolean | Merge punctuation onto the preceding token? |
| collapse_phrases | boolean | Merge noun chunks and named entities into single tokens? |

Example request using the Python Requests library:

import json
import requests

# Port 8080 matches the port mapping used in the "docker run" command under Usage.
url = "http://localhost:8080/dep"
message_text = "They ate the pizza with anchovies"
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

{
    "arcs": [
        { "dir": "left", "start": 0, "end": 1, "label": "nsubj" },
        { "dir": "right", "start": 1, "end": 2, "label": "dobj" },
        { "dir": "right", "start": 1, "end": 3, "label": "prep" },
        { "dir": "right", "start": 3, "end": 4, "label": "pobj" },
        { "dir": "left", "start": 2, "end": 3, "label": "prep" }
    ],
    "words": [
        { "tag": "PRP", "text": "They" },
        { "tag": "VBD", "text": "ate" },
        { "tag": "NN", "text": "the pizza" },
        { "tag": "IN", "text": "with" },
        { "tag": "NNS", "text": "anchovies" }
    ]
}

| Name | Type | Description |
| --- | --- | --- |
| arcs | array | data to generate the arrows |
| dir | string | direction of arrow ("left" or "right") |
| start | integer | offset of word the arrow starts on |
| end | integer | offset of word the arrow ends on |
| label | string | dependency label |
| words | array | data to generate the words |
| tag | string | part-of-speech tag |
| text | string | token |
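
To show how the `arcs` and `words` arrays fit together, here is a small sketch that converts a /dep response into (head, label, dependent) triples. It assumes the displaCy convention that the arrowhead marks the dependent, so a "left" arrow has its dependent at `start` and its head at `end`; the function name `arcs_to_edges` is hypothetical.

```python
def arcs_to_edges(parse):
    """Turn a /dep response into (head, label, dependent) triples."""
    words = [w["text"] for w in parse["words"]]
    edges = []
    for arc in parse["arcs"]:
        # Assumption: the arrowhead marks the dependent, so a "left" arrow
        # has its dependent at `start` and its head at `end`.
        if arc["dir"] == "left":
            head, dep = arc["end"], arc["start"]
        else:
            head, dep = arc["start"], arc["end"]
        edges.append((words[head], arc["label"], words[dep]))
    return edges


# First two arcs and first three words of the example response above.
sample = {
    "arcs": [
        {"dir": "left", "start": 0, "end": 1, "label": "nsubj"},
        {"dir": "right", "start": 1, "end": 2, "label": "dobj"},
    ],
    "words": [
        {"tag": "PRP", "text": "They"},
        {"tag": "VBD", "text": "ate"},
        {"tag": "NN", "text": "the pizza"},
    ],
}
edges = arcs_to_edges(sample)
# edges == [("ate", "nsubj", "They"), ("ate", "dobj", "the pizza")]
```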

Curl command:

curl -s localhost:8080/dep -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
{
  "arcs": [
    {
      "dir": "left",
      "end": 1,
      "label": "nsubj",
      "start": 0
    },
    {
      "dir": "right",
      "end": 2,
      "label": "acomp",
      "start": 1
    },
    {
      "dir": "right",
      "end": 3,
      "label": "prep",
      "start": 2
    },
    {
      "dir": "right",
      "end": 4,
      "label": "pobj",
      "start": 3
    },
    {
      "dir": "right",
      "end": 5,
      "label": "prep",
      "start": 4
    },
    {
      "dir": "right",
      "end": 6,
      "label": "pobj",
      "start": 5
    }
  ],
  "words": [
    {
      "tag": "NNPS",
      "text": "Pastafarians"
    },
    {
      "tag": "VBP",
      "text": "are"
    },
    {
      "tag": "JJR",
      "text": "smarter"
    },
    {
      "tag": "IN",
      "text": "than"
    },
    {
      "tag": "NNS",
      "text": "people"
    },
    {
      "tag": "IN",
      "text": "with"
    },
    {
      "tag": "NNS",
      "text": "Coca Cola bottles."
    }
  ]
}

POST /ent/

Example request:

{
    "text": "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously.",
    "model": "en"
}

| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

# Port 8080 matches the port mapping used in the "docker run" command under Usage.
url = "http://localhost:8080/ent"
message_text = "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

[
    { "end": 20, "start": 5,  "type": "PERSON" },
    { "end": 67, "start": 61, "type": "ORG" },
    { "end": 75, "start": 71, "type": "DATE" }
]

| Name | Type | Description |
| --- | --- | --- |
| end | integer | character offset the entity ends after |
| start | integer | character offset the entity starts on |
| type | string | entity type |

Curl command:

curl -s localhost:8080/ent -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
[
  {
    "end": 12,
    "start": 0,
    "type": "NORP"
  },
  {
    "end": 51,
    "start": 42,
    "type": "ORG"
  }
]
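
The /ent response carries only character offsets, so the entity text has to be recovered from the original input. A minimal sketch, assuming `end` is exclusive (which is consistent with the example response above); the helper name `entity_spans` is hypothetical.

```python
def entity_spans(text, entities):
    """Pair each entity's surface text with its type, using the character offsets."""
    return [(text[ent["start"]:ent["end"]], ent["type"]) for ent in entities]


text = (
    "When Sebastian Thrun started working on self-driving cars at Google "
    "in 2007, few people outside of the company took him seriously."
)
entities = [
    {"end": 20, "start": 5, "type": "PERSON"},
    {"end": 67, "start": 61, "type": "ORG"},
    {"end": 75, "start": 71, "type": "DATE"},
]
spans = entity_spans(text, entities)
# spans == [("Sebastian Thrun", "PERSON"), ("Google", "ORG"), ("2007", "DATE")]
```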

GET /models

List the names of models installed on the server.

Example request:

GET /models

Example response:

["en", "de"]
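
One possible use of this endpoint is to check that a model is installed before sending parse requests. The sketch below assumes the server is reachable at http://localhost:8080 and that the response is a JSON list of model names as shown above; `installed_models` and `require_model` are hypothetical helper names.

```python
import json
import urllib.request


def installed_models(base_url="http://localhost:8080"):
    """Fetch the list of model identifiers from GET /models."""
    with urllib.request.urlopen(base_url + "/models", timeout=5) as resp:
        return json.load(resp)


def require_model(model, installed):
    """Return `model` if it is installed, otherwise raise ValueError."""
    if model not in installed:
        raise ValueError(
            f"model {model!r} is not installed; available: {sorted(installed)}"
        )
    return model
```

For example, `require_model("en", installed_models())` returns "en" against the example response above, while asking for "fr" raises `ValueError`.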

GET /{model}/schema/

Example request:

GET /en/schema

| Name | Type | Description |
| --- | --- | --- |
| model | string | identifier string for a model installed on the server |

Example response:

{
  "dep_types": ["ROOT", "nsubj"],
  "ent_types": ["PERSON", "LOC", "ORG"],
  "pos_types": ["NN", "VBZ", "SP"]
}
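
The schema can serve as a sanity check on parse results. The following sketch lists dependency labels that occur in a /dep response but are missing from the schema's `dep_types`; the function name `unknown_labels` is hypothetical, and the schema used here is the abbreviated example above, not a complete label set.

```python
def unknown_labels(parse, schema):
    """Return the sorted dependency labels in `parse` that `schema` does not list."""
    known = set(schema["dep_types"])
    return sorted({arc["label"] for arc in parse["arcs"]} - known)


schema = {
    "dep_types": ["ROOT", "nsubj"],
    "ent_types": ["PERSON", "LOC", "ORG"],
    "pos_types": ["NN", "VBZ", "SP"],
}
parse = {
    "arcs": [
        {"dir": "left", "start": 0, "end": 1, "label": "nsubj"},
        {"dir": "right", "start": 1, "end": 2, "label": "dobj"},
    ],
    "words": [],
}
missing = unknown_labels(parse, schema)
# missing == ["dobj"]
```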

GET /version

Show the spaCy version in use.

Example request:

GET /version

Example response:

{
  "spacy": "1.9.0"
}

About

spaCy REST API, wrapped in a Docker container.
