Haystack — Natural Language Question Answering At Scale

Introduction

Haystack is a system built on top of the FARM framework for question answering over a collection of large documents.

Inference with current state-of-the-art QA models is computationally expensive. To make question answering over many documents practical, document retrieval techniques first narrow the search to a small subset of paragraphs that are likely to contain the answer.

The system is designed to be modular: individual components can be customized, or new ones added, with minimal effort.

Components

There are three major components for the question answering pipeline:

Reader runs inference on FARM Adaptive Models trained on SQuAD-like tasks to perform question answering. It takes paragraphs of text as input and returns answers with corresponding confidence scores.
Retriever implements the term frequency–inverse document frequency (tf-idf) statistic, similar to the query scoring functions used in popular search systems such as Elasticsearch. It narrows the scope for the Reader to smaller units of text in which a given question could be answered.
Finder is a pipeline that glues together an instance of a Reader and a Retriever to provide an easy-to-use question answering interface.
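
To make the retrieve-then-read flow concrete, here is a minimal, standard-library sketch. It is not Haystack's actual implementation: the names tokenize, rank_paragraphs and find_answers are hypothetical, the smoothed idf formula is one common variant chosen for illustration, and the reader argument is a placeholder callable standing in for a real FARM model.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    tokens = (raw.strip(PUNCTUATION).lower() for raw in text.split())
    return [token for token in tokens if token]


def rank_paragraphs(question, paragraphs, top_k=2):
    """Return the top_k (index, score) pairs, best first, by tf-idf overlap."""
    tokenized = [tokenize(paragraph) for paragraph in paragraphs]
    n_docs = len(tokenized)

    # Document frequency: in how many paragraphs does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    question_terms = set(tokenize(question))
    scored = []
    for index, tokens in enumerate(tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term in question_terms:
            if term in term_freq:
                # Smoothed idf (an assumption; other weightings are possible).
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
                score += (term_freq[term] / len(tokens)) * idf
        scored.append((index, score))

    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def find_answers(question, paragraphs, reader, top_k=2):
    """Finder-style glue: retrieve first, then run the costly reader on the shortlist."""
    shortlist = [paragraphs[i] for i, _ in rank_paragraphs(question, paragraphs, top_k)]
    return [reader(question, paragraph) for paragraph in shortlist]


paragraphs = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "The Rhine flows through Germany and the Netherlands.",
]
ranked = rank_paragraphs("What is the capital of Germany?", paragraphs)
```

With these three paragraphs, the Berlin paragraph ranks first and the Rhine paragraph is dropped, so a reader passed to find_answers would only run on two of the three texts.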

Quickstart

Installation

There are two ways to install:

(recommended) From source: git clone <url>, then run pip install [--editable] . from the root of the repository.
From PyPI: run pip install farm_haystack.

Configuration

The configuration can be supplied in a qa_config.py file placed on the PYTHONPATH. Alternatively, DATABASE_URL can be set as an environment variable.
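
As an illustration of how these two sources could be combined, the sketch below resolves the database URL from the environment first and falls back to qa_config.py. The precedence order, the resolve_database_url helper and the default URL are assumptions made for this example; the README does not specify how the package itself resolves them.

```python
import importlib
import os

# Hypothetical fallback; the package's real default is not stated in this README.
DEFAULT_DATABASE_URL = "sqlite:///qa.db"


def resolve_database_url(environ=None):
    """Pick DATABASE_URL from the environment, else qa_config.py, else a default."""
    environ = os.environ if environ is None else environ

    # 1. An environment variable wins if it is set and non-empty (assumed precedence).
    if environ.get("DATABASE_URL"):
        return environ["DATABASE_URL"]

    # 2. Otherwise look for a qa_config module on the PYTHONPATH.
    try:
        config = importlib.import_module("qa_config")
    except ImportError:
        return DEFAULT_DATABASE_URL

    # 3. Use its DATABASE_URL attribute if present.
    return getattr(config, "DATABASE_URL", DEFAULT_DATABASE_URL)
```

For example, resolve_database_url({"DATABASE_URL": "postgresql://user:secret@localhost/qa"}) returns the PostgreSQL URL unchanged, without looking at qa_config.py.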

Deployment

SQL Backend

The database ORM layer is implemented with the SQLAlchemy library. By default, it uses a file-based SQLite database. For large-scale deployments, the configuration can be changed to use other compatible databases such as PostgreSQL or MySQL.

REST API

A Flask-based HTTP REST API is included for using the QA framework from a UI or integrating it with other systems. To serve the API, run FLASK_APP=farm_haystack.api.inference flask run.

About

🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
