Quickstart
In this quick start guide, we will install Quickwit, create an index, add documents and finally execute search queries. All the Quickwit commands used in this guide are documented in the CLI reference documentation.
Install Quickwit using Quickwit installer
The Quickwit installer automatically picks the correct binary archive for your environment and then downloads and unpacks it in your working directory. This method works only for some OS/architectures, and you will also need to install some external dependencies.
curl -L https://install.quickwit.io | sh
cd ./quickwit-v*/
./quickwit --version
You can now move this directory wherever makes sense for your environment and, optionally, add it to your PATH environment variable.
Use Quickwit's Docker image
You can also pull and run the Quickwit binary in an isolated Docker container.
# First, create the data directory.
mkdir qwdata
docker run --rm quickwit/quickwit --version
If you are using an Apple silicon based macOS system, you might need to specify the platform. You can also safely ignore jemalloc warnings.
docker run --rm --platform linux/amd64 quickwit/quickwit --version
Start Quickwit server
- CLI
./quickwit run
- Docker
docker run --rm -v $(pwd)/qwdata:/quickwit/qwdata -p 127.0.0.1:7280:7280 quickwit/quickwit run
Tip: you can use the environment variable RUST_LOG to control Quickwit's verbosity.
Check that it's working by browsing the UI at http://localhost:7280 or by doing a simple GET with cURL:
curl http://localhost:7280/api/v1/version
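When the server is started from a script, it can take a moment before it accepts connections. The following is a minimal Python sketch of polling the version endpoint until it answers. It assumes the server listens on plain HTTP at localhost:7280; the helper names `fetch_status` and `wait_until_ready` are hypothetical and not part of Quickwit.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code of a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=2) as response:
        return response.status


def wait_until_ready(url, attempts=10, delay_s=1.0, fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it answers with HTTP 200.

    Returns the number of the successful attempt, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url) == 200:
                return attempt
        except (urllib.error.URLError, OSError):
            pass  # The server is not accepting connections yet.
        if attempt < attempts:
            sleep(delay_s)
    return None


# Usage, once `quickwit run` has been launched (assumed address):
# wait_until_ready("http://localhost:7280/api/v1/version")
```

The `fetch` and `sleep` parameters exist only so the polling loop can be exercised without a running server.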
Create your first index
Before adding documents to Quickwit, you need to create an index configured with a YAML config file. This config file notably lets you define how to map your input documents to your index fields and whether these fields should be stored and indexed. See the index config documentation.
Let's create an index configured to receive Stackoverflow posts (questions and answers).
# First, download the stackoverflow dataset config from the Quickwit repository.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml
The index config defines three fields: title, body and creationDate. title and body are indexed and tokenized, and they are also used as default search fields, which means they will be used for search if you do not target a specific field in your query. creationDate serves as the timestamp for each record. There are no other explicit field definitions because we rely on the default dynamic mode: undeclared fields are still indexed, fast fields are enabled by default to allow aggregation queries, and the raw tokenizer is used for text.
And here is the complete config:
#
# Index config file for stackoverflow dataset.
#
version: 0.7

index_id: stackoverflow

doc_mapping:
  field_mappings:
    - name: title
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: body
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: creationDate
      type: datetime
      fast: true
      input_formats:
        - rfc3339
      fast_precision: seconds
  timestamp_field: creationDate

search_settings:
  default_search_fields: [title, body]

indexing_settings:
  commit_timeout_secs: 30
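To make the split between declared fields and dynamic mode more concrete, here is a small Python sketch that separates a document's keys into those named in the config and those left to dynamic mode. It runs locally and does not call Quickwit; the sample document and the helper `split_fields` are hypothetical, with invented values.

```python
# Fields explicitly declared in the index config shown above.
DECLARED_FIELDS = {"title", "body", "creationDate"}


def split_fields(doc):
    """Separate declared fields from those that would fall under dynamic mode."""
    declared = {k: v for k, v in doc.items() if k in DECLARED_FIELDS}
    dynamic = {k: v for k, v in doc.items() if k not in DECLARED_FIELDS}
    return declared, dynamic


# Hypothetical document: the values are invented for illustration only.
sample_doc = {
    "title": "How do I build a search engine?",
    "body": "I would like to index my logs...",
    "creationDate": "2020-01-01T12:00:00Z",
    "type": "question",
    "tags": ["search", "indexing"],
}

declared, dynamic = split_fields(sample_doc)
print(sorted(declared))  # ['body', 'creationDate', 'title']
print(sorted(dynamic))   # ['tags', 'type']
```

Fields such as type and tags, which the search examples later in this guide query, are not declared in the config and would therefore be handled by dynamic mode.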
Now we can create the index with the command:
- CLI
./quickwit index create --index-config ./stackoverflow-index-config.yaml
- CURL
curl -XPOST http://127.0.0.1:7280/api/v1/indexes --header "content-type: application/yaml" --data-binary @./stackoverflow-index-config.yaml
Check that a directory ./qwdata/indexes/stackoverflow has been created. Quickwit will write the index files there, along with a metastore.json file that contains the index metadata.
You're now ready to fill the index.
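The REST call that creates the index can also be issued from a script. The sketch below builds the same POST request with Python's standard library, assuming the server listens on plain HTTP at 127.0.0.1:7280. The function name `build_create_index_request` is hypothetical; only the endpoint and content type come from the cURL command above.

```python
import urllib.request


def build_create_index_request(config_yaml, base_url="http://127.0.0.1:7280"):
    """Build (but do not send) the POST request that creates an index."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/indexes",
        data=config_yaml,  # Raw bytes of the YAML index config.
        headers={"Content-Type": "application/yaml"},
        method="POST",
    )


# Usage, with a server running and the config file downloaded:
# with open("stackoverflow-index-config.yaml", "rb") as f:
#     request = build_create_index_request(f.read())
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before anything reaches the server.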
Let's add some documents
Quickwit can index data from many sources. We will use a newline-delimited JSON (NDJSON) dataset as our data source. Let's download a batch of Stackoverflow posts (10,000) in NDJSON format and index them.
# Download the first 10,000 Stackoverflow posts.
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
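Before ingesting, it can be useful to sanity-check that the file really is NDJSON, with one JSON object per line. The following Python sketch does that; treating creationDate as a required field is an assumption of this sketch (based on its role as the timestamp field), and `check_ndjson` is a hypothetical helper, not a Quickwit tool.

```python
import json


def check_ndjson(lines, required_field="creationDate"):
    """Return (number of valid documents, list of (line number, problem))."""
    valid, problems = 0, []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # Skip blank lines.
        try:
            doc = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(doc, dict):
            problems.append((number, "not a JSON object"))
        elif required_field not in doc:
            problems.append((number, f"missing {required_field}"))
        else:
            valid += 1
    return valid, problems


# Usage on the downloaded file:
# with open("stackoverflow.posts.transformed-10000.json", encoding="utf-8") as f:
#     print(check_ndjson(f))
```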
- CLI
# Index our 10k documents.
./quickwit index ingest --index stackoverflow --input-path stackoverflow.posts.transformed-10000.json --force
- CURL
# Index our 10k documents.
curl -XPOST "http://127.0.0.1:7280/api/v1/stackoverflow/ingest?commit=force" --data-binary @stackoverflow.posts.transformed-10000.json
As soon as the ingest command finishes, you can start querying data with the following search command:
- CLI
./quickwit index search --index stackoverflow --query "search AND engine"
- CURL
curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=search+AND+engine"
It should return 10 hits. Now you're ready to play with the search API.
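If you consume the search response from a script, a small helper can pull out the hit count and one field per hit. This Python sketch assumes the JSON response exposes `num_hits` and `hits` keys; both names are assumptions here and should be checked against the actual response. The sample payload is invented and heavily shortened, and `summarize` is a hypothetical helper.

```python
import json


def summarize(response_text, field="title"):
    """Return (total matches, values of `field` for the returned hits)."""
    body = json.loads(response_text)
    values = [hit.get(field) for hit in body.get("hits", [])]
    return body.get("num_hits", 0), values


# Hypothetical, shortened response used only to exercise the function.
sample = '{"num_hits": 2, "hits": [{"title": "first"}, {"title": "second"}]}'
print(summarize(sample))  # (2, ['first', 'second'])
```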
Execute search queries
Let's start with a query on the field title, title:search AND engine:
curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title:search+AND+engine"
The same request can be expressed as a JSON query:
curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
    "query": "title:search AND engine"
}'
This format is more verbose, but it allows you to use more advanced features such as aggregations. The following query finds the most popular tags used on the questions in this dataset:
curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
    "query": "type:question",
    "max_hits": 0,
    "aggs": {
        "foo": {
            "terms": {
                "field": "tags",
                "size": 10
            }
        }
    }
}'
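The aggregation request above can be built and its result read from a script. This Python sketch constructs the same JSON body and extracts the term buckets from a response. The response layout (`aggregations`, `buckets`, `key`, `doc_count`) is an assumption modeled on Elasticsearch-style terms aggregations, the counts in the sample are invented, and both helper names are hypothetical.

```python
import json


def tag_aggregation_body(size=10):
    """Build the JSON body of the aggregation request shown above."""
    return {
        "query": "type:question",
        "max_hits": 0,
        "aggs": {"foo": {"terms": {"field": "tags", "size": size}}},
    }


def top_terms(response, agg_name="foo"):
    """Extract (term, document count) pairs from a terms aggregation result."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Hypothetical response: the layout is assumed and the counts are invented.
sample = {"aggregations": {"foo": {"buckets": [
    {"key": "python", "doc_count": 3},
    {"key": "rust", "doc_count": 1},
]}}}

print(json.dumps(tag_aggregation_body()))
print(top_terms(sample))  # [('python', 3), ('rust', 1)]
```

Setting max_hits to 0, as in the original request, returns only the aggregation result and no documents.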
As you experiment with different queries, check the server logs to see what's happening.
Don't forget to URL-encode the query parameters correctly to avoid a bad request error (status 400).
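One way to avoid encoding mistakes is to let a library build the query string. The Python sketch below uses the standard `urllib.parse.urlencode`; the helper `search_url` and the base address are assumptions for illustration.

```python
from urllib.parse import urlencode


def search_url(index_id, query, base_url="http://127.0.0.1:7280"):
    """Return a search URL whose `query` parameter is percent-encoded."""
    return f"{base_url}/api/v1/{index_id}/search?{urlencode({'query': query})}"


print(search_url("stackoverflow", "title:search AND engine"))
# http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title%3Asearch+AND+engine
```

Spaces become `+` and reserved characters such as `:` or `"` are percent-encoded, so phrase queries and field-targeted queries survive the trip intact.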
Clean
Let's do some cleanup by deleting the index:
- CLI
./quickwit index delete --index stackoverflow
- REST
curl -XDELETE http://127.0.0.1:7280/api/v1/indexes/stackoverflow
Congrats! You can level up with the following tutorials to discover all Quickwit features.
TLDR
Run the following commands from within Quickwit's installation directory.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml
./quickwit index create --index-config ./stackoverflow-index-config.yaml
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
./quickwit index ingest --index stackoverflow --input-path ./stackoverflow.posts.transformed-10000.json --force
./quickwit index search --index stackoverflow --query "search AND engine"
./quickwit index delete --index stackoverflow