Version: 0.8.1

Quickstart

In this quick start guide, we will install Quickwit, create an index, add documents and finally execute search queries. All the Quickwit commands used in this guide are documented in the CLI reference documentation.

Install Quickwit using Quickwit installer

The Quickwit installer automatically picks the correct binary archive for your environment and then downloads and unpacks it in your working directory. This method works only for some OS/architectures, and you will also need to install some external dependencies.

curl -L https://install.quickwit.io | sh
cd ./quickwit-v*/
./quickwit --version

You can now move this directory, which contains the executable, to wherever makes sense in your environment, and optionally add it to your PATH environment variable.
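As a minimal sketch of the PATH step, the helper below prepends a directory to PATH for the current shell session only. The function name add_to_path and the install location in the usage comment are hypothetical; substitute the directory you actually moved Quickwit to.

```shell
# Prepend a directory to PATH unless it is already present.
# add_to_path is an illustrative helper, not part of Quickwit.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                        # already on PATH, nothing to do
    *) PATH="$1:$PATH"; export PATH ;;  # otherwise put it first
  esac
}

# Usage (hypothetical install location):
#   add_to_path "$HOME/quickwit"
#   quickwit --version
```

To make the change permanent, the same call would typically go in your shell's startup file.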

Use Quickwit's Docker image

You can also pull and run the Quickwit binary in an isolated Docker container.

# First, create the data directory.
mkdir qwdata
docker run --rm quickwit/quickwit --version

If you are using an Apple silicon Mac, you might need to specify the platform. You can also safely ignore jemalloc warnings.

docker run --rm --platform linux/amd64 quickwit/quickwit --version

Start Quickwit server

./quickwit run

Tip: you can use the RUST_LOG environment variable to control Quickwit's log verbosity.

Check that it's working by browsing the UI at http://localhost:7280 or by sending a simple GET request with cURL:

curl http://localhost:7280/api/v1/version
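If you start the server from a script, the version check above can be turned into a small readiness loop. This is only a sketch: the function name wait_for_quickwit, the default of 30 attempts, and the one-second pause are assumptions, and it presumes the server listens over plain HTTP on port 7280.

```shell
# Poll the version endpoint until the server answers or we give up.
# wait_for_quickwit and its defaults are illustrative assumptions.
wait_for_quickwit() {
  url="${1:-http://localhost:7280/api/v1/version}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: exit non-zero on HTTP error responses
    if curl -sf "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Quickwit did not answer at $url after $attempts attempts" >&2
  return 1
}

# Usage:
#   ./quickwit run &
#   wait_for_quickwit && echo "server is up"
```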

Create your first index

Before adding documents to Quickwit, you need to create an index configured with a YAML config file. This config file notably lets you define how to map your input documents to your index fields and whether these fields should be stored and indexed. See the index config documentation.

Let's create an index configured to receive Stackoverflow posts (questions and answers).

# First, download the stackoverflow dataset config from Quickwit repository.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml

The index config defines three fields: title, body and creationDate. title and body are indexed and tokenized, and they are also used as default search fields, which means they will be used for search if you do not target a specific field in your query. creationDate serves as the timestamp for each record. No other fields need an explicit definition because we rely on the default dynamic mode: undeclared fields are still indexed, fast fields are enabled by default to allow aggregation queries, and the raw tokenizer is used for text.

And here is the complete config:

stackoverflow-index-config.yaml
#
# Index config file for stackoverflow dataset.
#
version: 0.7

index_id: stackoverflow

doc_mapping:
  field_mappings:
    - name: title
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: body
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: creationDate
      type: datetime
      fast: true
      input_formats:
        - rfc3339
      fast_precision: seconds
  timestamp_field: creationDate

search_settings:
  default_search_fields: [title, body]

indexing_settings:
  commit_timeout_secs: 30

Now we can create the index with the command:

./quickwit index create --index-config ./stackoverflow-index-config.yaml

Check that a directory ./qwdata/indexes/stackoverflow has been created. Quickwit will write the index files there, along with a metastore.json file that contains the index metadata. You're now ready to fill the index.
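The directory name and the --index argument used in the following commands both come from the index_id key in the config file. As a rough sketch, the hypothetical helper below reads that key back with sed. It assumes index_id sits unindented at the top level with an unquoted value and no trailing comment, as in the config shown above.

```shell
# Print the value of the top-level index_id key of an index config file.
# index_id_from_config is an illustrative helper, not a Quickwit command.
index_id_from_config() {
  sed -n 's/^index_id:[[:space:]]*//p' "$1" | head -n 1
}

# Usage:
#   id=$(index_id_from_config ./stackoverflow-index-config.yaml)
#   ls "./qwdata/indexes/$id"
```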

Let's add some documents

Quickwit can index data from many sources. We will use a newline-delimited JSON (NDJSON) dataset as our data source. Let's download a batch of Stackoverflow posts (10,000) in NDJSON format and index them.

# Download the first 10,000 Stackoverflow posts.
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
# Index our 10k documents.
./quickwit index ingest --index stackoverflow --input-path stackoverflow.posts.transformed-10000.json --force
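Since NDJSON expects one JSON document per line, a quick look at the file can catch a truncated download before or after ingesting. The sketch below is a heuristic and does not validate JSON: it only counts lines that start with { and end with }, blank lines, and everything else. The helper name ndjson_summary is hypothetical.

```shell
# Summarize an NDJSON file: lines that look like JSON objects,
# blank lines, and anything else. Heuristic only, no JSON parsing.
ndjson_summary() {
  awk '
    /^[ \t\r]*$/        { blank++; next }  # blank lines carry no document
    /^[{].*[}][ \t\r]*$/ { docs++; next }  # starts with { and ends with }
                        { other++ }        # anything else is suspicious
    END { printf "docs=%d blank=%d other=%d\n", docs, blank, other }
  ' "$1"
}

# Usage:
#   ndjson_summary stackoverflow.posts.transformed-10000.json
```

A non-zero "other" count suggests the file is worth inspecting, for example with head -n 1.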

As soon as the ingest command finishes you can start querying data by using the following search command:

./quickwit index search --index stackoverflow --query "search AND engine"

It should return 10 hits. Now you're ready to play with the search API.

Execute search queries

Let's start with a query on the field title: title:search AND engine:

curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title:search+AND+engine"

The same request can be expressed as a JSON query:

curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
    "query": "title:search AND engine"
}'

This format is more verbose, but it allows you to use more advanced features such as aggregations. The following query finds the most popular tags used on the questions in this dataset:

curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
    "query": "type:question",
    "max_hits": 0,
    "aggs": {
        "foo": {
            "terms": {
                "field": "tags",
                "size": 10
            }
        }
    }
}'
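In the request above, "foo" is only a label chosen for the aggregation; the field and size values are what determine the result. To make that structure explicit, here is a sketch of a hypothetical helper, terms_agg_body, that assembles the same body from its parts. It does no JSON escaping, so it assumes the arguments contain no double quotes or backslashes.

```shell
# Build a terms-aggregation request body from its parts.
# Arguments: query, aggregation label, field name, number of buckets.
# terms_agg_body is an illustrative helper; inputs are not JSON-escaped.
terms_agg_body() {
  printf '{"query": "%s", "max_hits": 0, "aggs": {"%s": {"terms": {"field": "%s", "size": %d}}}}\n' \
    "$1" "$2" "$3" "$4"
}

# Usage (curl reads the body from stdin with -d @-):
#   terms_agg_body 'type:question' top_tags tags 10 |
#     curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" \
#       -H 'Content-Type: application/json' -d @-
```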

As you experiment with different queries, check the server logs to see what's happening.

Note: remember to URL-encode the query parameters correctly to avoid a bad request error (status 400).
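One way to get the encoding right is to percent-encode the query string before placing it in the URL. The function below is a minimal sketch of such an encoder in portable shell. The name urlencode is our own, and it assumes ASCII input; non-ASCII characters may not be encoded correctly.

```shell
# Percent-encode a string, leaving only RFC 3986 unreserved characters as-is.
# Runs in a subshell so the locale change does not leak to the caller.
# Assumes ASCII input.
urlencode() (
  LC_ALL=C
  s=$1
  out=""
  while [ -n "$s" ]; do
    c=${s%"${s#?}"}   # first character of the remaining string
    s=${s#?}          # drop that character
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;  # e.g. ':' becomes %3A
    esac
  done
  printf '%s\n' "$out"
)

# Usage:
#   q=$(urlencode 'title:search AND engine')
#   curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=$q"
```

Alternatively, curl can do the encoding itself when called with -G together with --data-urlencode 'query=title:search AND engine'.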

Clean

Let's do some cleanup by deleting the index:

./quickwit index delete --index stackoverflow

Congrats! You can level up with the following tutorials to discover all Quickwit features.

TLDR

Run the following commands from within Quickwit's installation directory.

curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml
./quickwit index create --index-config ./stackoverflow-index-config.yaml
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
./quickwit index ingest --index stackoverflow --input-path ./stackoverflow.posts.transformed-10000.json --force
./quickwit index search --index stackoverflow --query "search AND engine"
./quickwit index delete --index stackoverflow

Next tutorials