OpenSearch output plugin for Embulk

Overview

  • Plugin type: output
  • Rollback supported: no
  • Resume supported: no
  • Cleanup supported: no

Installation

Install via Gemfile:

gem 'embulk-output-opensearch'

Configuration

  • mode: "insert" or "replace". See Modes below (string, optional, default is insert)
  • nodes: List of nodes. Each node is a pair of host and port (list, required)
  • use_ssl: Use SSL encryption (boolean, default is false)
  • auth_method: 'none' or 'basic'. See also Authentication (string, default is 'none')
  • user: Username for basic authentication (string, default is null)
  • password: Password for the above user (string, default is null)
  • index: Index name (string, required)
  • id: Document ID column (string, default is null)
  • bulk_actions: Flush a new bulk request when the number of actions added reaches this value (int, default is 1000)
  • bulk_size: Flush a new bulk request when the size of actions added reaches this value (long, default is 5242880)
  • fill_null_for_empty_column: Fill in a null value when a column value is empty (boolean, optional, default is false)
  • maximum_retries: Maximum number of retries (int, optional, default is 7)
  • initial_retry_interval_millis: Initial interval between retries in milliseconds (int, optional, default is 1000)
  • maximum_retry_interval_millis: Maximum interval between retries in milliseconds (int, optional, default is 120000)
  • timeout_millis: Timeout in milliseconds for each HTTP request (int, optional, default is 60000)
  • connect_timeout_millis: Connection timeout in milliseconds for the HTTP client (int, optional, default is 60000)
  • max_snapshot_waiting_secs: Maximum time in seconds to wait while a snapshot is being created before the index is deleted. Applies only when mode is replace (int, optional, default is 1800)
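
As a rough illustration of how the options above combine, the following sketch spells out the bulk, retry, and timeout settings explicitly, using their documented defaults. The host, port, index name, and the id column name (doc_id) are placeholders, not values taken from this plugin.

```yaml
out:
  type: opensearch
  nodes:
  - {host: localhost, port: 9200}        # placeholder node
  index: my_index                        # placeholder index name
  id: doc_id                             # hypothetical column used as the document id
  bulk_actions: 1000                     # flush after this many actions (default)
  bulk_size: 5242880                     # flush after this size of actions (default)
  maximum_retries: 7                     # default
  initial_retry_interval_millis: 1000    # default
  maximum_retry_interval_millis: 120000  # default
  timeout_millis: 60000                  # default
  connect_timeout_millis: 60000          # default
```

Since a flush is triggered by either threshold, lowering bulk_actions or bulk_size should lead to smaller, more frequent bulk requests.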

Modes

insert:

Default. This mode writes data to an existing index.

replace:

  1. Create a new temporary index
  2. Insert data into the new index
  3. Replace the alias with the new index. If the alias doesn't exist, the plugin creates a new alias.
  4. Delete the existing (old) index if it exists

An index with the same name as the alias must not exist.

out:
  type: opensearch
  mode: replace
  nodes:
  - {host: localhost, port: 9200}
  index: <alias name> # plugin generates index name like <index>_%Y%m%d-%H%M%S
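
For a concrete picture of the naming scheme, here is a sketch assuming a hypothetical alias called my_alias. The timestamp in the comment is only an example of what the <index>_%Y%m%d-%H%M%S pattern would produce for a run started at 2024-01-01 12:00:00, and max_snapshot_waiting_secs is shown at its default.

```yaml
out:
  type: opensearch
  mode: replace
  nodes:
  - {host: localhost, port: 9200}
  index: my_alias                  # hypothetical alias name
  # the generated index would be named like my_alias_20240101-120000
  max_snapshot_waiting_secs: 1800  # default
```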

Authentication

This plugin supports Basic authentication. 'Security' also supports LDAP and Active Directory, but this plugin does not support those authentication methods.

use_ssl: true
auth_method: basic
user: <username>
password: <password>
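
The options above are set under out: alongside the other settings. A minimal sketch of a complete configuration with basic authentication might look like the following; the host, port, and index name are placeholders.

```yaml
out:
  type: opensearch
  mode: insert
  nodes:
  - {host: localhost, port: 9200}  # placeholder node
  index: my_index                  # placeholder index name
  use_ssl: true
  auth_method: basic
  user: <username>
  password: <password>
```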

Example

out:
  type: opensearch
  mode: insert
  nodes:
  - {host: vpc-domain-name-identifier.region.es.amazonaws.com, port: 443}
  index: <index name>
  use_ssl: true

Benchmark

plugin                       total  sec   speed    records    records/s
embulk-output-opensearch     210mb  42.3  5.0mb/s  5,000,000  118,301/s
embulk-output-elasticsearch  210mb  53.0  4.0mb/s  5,000,000  94,279/s

Test

First install Docker and Docker Compose, then run docker compose up opensearch to launch an OpenSearch server locally. Once it is up, you can run the tests with docker compose run --rm java ./gradlew test.

docker compose up opensearch
docker compose run --rm java ./gradlew test  # -t to watch change of files and rebuild continuously

For Maintainers

Release

Modify version in build.gradle at a detached commit, and then tag the commit with an annotation.

git checkout --detach main
# (Edit: Remove "-SNAPSHOT" in "version" in build.gradle.)
git add build.gradle
git commit -m "Release vX.Y.Z"
git tag -a vX.Y.Z --cleanup=whitespace
# (Edit: Write a tag annotation in the changelog format.)

See Keep a Changelog for the changelog format. We adopt part of it for Git tag annotations, as shown below.

## [X.Y.Z] - YYYY-MM-DD
### Added
- Added a feature.
### Changed
- Changed something.
### Fixed
- Fixed a bug.

Then push the annotated tag. This triggers a release operation on GitHub Actions after approval.

git push -u origin vX.Y.Z
