
An ongoing, curated collection of awesome software, libraries, tutorials, tools, resources, and cool stuff about Elasticsearch


exajobs/elasticsearch-collection


Elasticsearch Tools

Welcome to the world of the Elasticsearch Collection®: a collection of awesome software, libraries, documents, books, resources, and cool stuff about the ELK Stack. Thanks to our daily readers and contributors. The goal is to build a categorized, community-driven collection of well-known resources. Sharing, suggestions, and contributions are always welcome!

What is Elasticsearch?

When people ask, “what is Elasticsearch?”, some may answer that it is:

  • “an index”,
  • a “search engine”,
  • an “analytics database”,
  • a “big data solution”,
  • “fast and scalable”,
  • or “kind of like Google”.

Depending on your level of familiarity with this technology, these answers may either bring you closer to an ah-ha moment or further confuse you. But the truth is, all of these answers are correct, and that’s part of the appeal of Elasticsearch. It is simple to configure, has incredible flexibility, and is an excellent tool for complex searches. Let's take a closer look.


  • Elasticsearch is a distributed, open-source search and analytics engine built on Apache Lucene and developed in Java. It can index and search documents in diverse formats, and it was designed for distributed environments, where it provides flexibility and scalability. Today, Elasticsearch is a widely popular enterprise search engine. It allows you to store, search, and analyze huge volumes of data in near real time and returns answers in milliseconds.

How does it work?

To help understand how Elasticsearch handles data, we can make an analogy to a database.

  • Elasticsearch stores data using a "schema-less" approach. This means that the structure of the data does not have to be defined before the data is entered, as it does with well-known relational databases such as Oracle, MySQL, and SQL Server.

In our analogy of traditional relational databases, the structure of the data used by Elasticsearch would be:


  • Index: Indices, the largest unit of data in Elasticsearch, are logical partitions of documents and can be compared to a database in the world of relational databases.

  • Type: A type in Elasticsearch represents a class of similar documents. A type consists of a name (such as user or blog post) and a mapping. Types apply to older versions only: mapping types were deprecated in Elasticsearch 7.0 and removed in 8.0.

  • Document: A document in Lucene consists of a simple list of field-value pairs. A field must have at least one value, but any field can contain multiple values.

  • Field: Fields are the Elasticsearch counterpart of columns in a relational table.
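
To make the document and field terminology above more concrete, here is a minimal Python sketch. The index name `products`, the document contents, and the helper `field_value_pairs` are all hypothetical and chosen only for illustration. It shows a JSON document viewed as a list of field-value pairs, where one field (`tags`) carries several values.

```python
import json

# Hypothetical document for an index named "products" (illustrative only).
document = {
    "name": "Trail running shoe",
    "price": 89.5,
    "tags": ["shoes", "outdoor"],  # one field holding multiple values
}


def field_value_pairs(doc):
    """Flatten a document into (field, value) pairs, one pair per value."""
    pairs = []
    for field, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            pairs.append((field, item))
    return pairs


pairs = field_value_pairs(document)
print(json.dumps(document, indent=2))
for field, value in pairs:
    print(f"{field} -> {value}")
```

Note that no structure was declared before the document was built, which is the "schema-less" idea described earlier; this sketch does not model how Elasticsearch infers field types.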


  • Cluster: A cluster is a collection of one or more servers that together hold all of the data and provide federated indexing and search capabilities across all servers. In the relational database analogy, a node corresponds to a database instance. Any number of nodes can share the same cluster name.
  • Node: A node is a single server that holds some data and participates in the cluster’s indexing and querying. A node is simply one Elasticsearch instance, and it joins a specific cluster by being configured with that cluster's name. A single cluster can have as many nodes as we want.
  • Shard: A shard is a subset of the documents of an index. An index can be divided into many shards.
  • Replica shard: A replica shard is a copy of a primary shard. Its main purpose is failover: if the node holding a primary shard dies, a replica is promoted to primary, which prevents data loss in case of hardware failure.
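
The failover behaviour of replica shards can be pictured with a toy model. The Python sketch below is not how Elasticsearch allocates shards; the classes `ShardCopy` and `ToyCluster` and the node names are hypothetical, and the sketch only demonstrates the rule stated above: when the node holding a primary dies, a surviving replica of the same shard is promoted.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    shard_id: int
    node: str
    primary: bool


@dataclass
class ToyCluster:
    """Toy model only; real Elasticsearch allocation is far more involved."""
    copies: list = field(default_factory=list)

    def fail_node(self, node):
        """Drop every copy on the failed node, then promote replicas as needed."""
        lost = [c for c in self.copies if c.node == node]
        self.copies = [c for c in self.copies if c.node != node]
        for gone in lost:
            if not gone.primary:
                continue
            # Promote the first surviving replica of the same shard, if any.
            for copy in self.copies:
                if copy.shard_id == gone.shard_id:
                    copy.primary = True
                    break

    def primary_node(self, shard_id):
        """Return the node holding the primary copy of a shard, or None."""
        for copy in self.copies:
            if copy.shard_id == shard_id and copy.primary:
                return copy.node
        return None


# Two shards, each with one primary and one replica on different nodes.
cluster = ToyCluster([
    ShardCopy(0, "node-a", primary=True),
    ShardCopy(0, "node-b", primary=False),
    ShardCopy(1, "node-b", primary=True),
    ShardCopy(1, "node-a", primary=False),
])
cluster.fail_node("node-a")
print(cluster.primary_node(0))  # node-b (replica promoted)
print(cluster.primary_node(1))  # node-b (primary was never on node-a)
```

Placing the primary and its replica on different nodes is what makes the promotion possible; a replica on the same node as its primary would be lost together with it.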

Table of contents

Elastic Architecture

Indices

Indices, the largest unit of data in Elasticsearch, are logical partitions of documents and can be compared to a database in the world of relational databases.

In an e-commerce app, for example, you could have one index containing all of the data related to the products and another with all of the data related to the customers. You can define as many indices in Elasticsearch as you want. These in turn hold documents that are unique to each index. Indices are identified by lowercase names, and these names are used when performing actions (such as searching and deleting) on the documents inside each index. For a list of best practices in handling indices, check out the blog Managing an Elasticsearch Index. Another key element in understanding how Elasticsearch’s indices work is to get a handle on shards.
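
As a rough illustration of how an index name is used when acting on documents, the Python sketch below builds the HTTP method and path for a search and for a delete. It sends nothing over the network. The helper names are hypothetical, the name check enforces only the lowercase rule mentioned above (Elasticsearch applies further restrictions), and the `/<index>/_search` and `/<index>/_doc/<id>` paths assume a recent (7.x or later) version of the REST API.

```python
def check_index_name(name):
    """Partial check: only the lowercase rule mentioned above is enforced."""
    if not name or name != name.lower():
        raise ValueError(f"index name must be non-empty and lowercase: {name!r}")
    return name


def search_request(index):
    """Return the (method, path) pair for searching one index."""
    return ("POST", f"/{check_index_name(index)}/_search")


def delete_document_request(index, doc_id):
    """Return the (method, path) pair for deleting one document by id."""
    return ("DELETE", f"/{check_index_name(index)}/_doc/{doc_id}")


print(search_request("products"))
print(delete_document_request("customers", "42"))
```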

APIs

Elasticsearch Queries

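Elasticsearch provides a full Query DSL (Domain Specific Language) based on JSON to define queries. Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses: leaf query clauses, which look for a particular value in a particular field (such as match, term, or range), and compound query clauses, which wrap other leaf or compound clauses to combine them (such as bool).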
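
A small example may help picture the query tree. The Python sketch below builds a Query DSL body as plain dictionaries and serializes it to JSON. The field names `name` and `price` and their values are hypothetical; `match` and `range` are leaf query clauses, and `bool` is a compound query clause that combines them.

```python
import json

# Leaf clause: full-text match on a single field (field and value are hypothetical).
match_clause = {"match": {"name": "running shoe"}}

# Leaf clause: range filter on a numeric field.
range_clause = {"range": {"price": {"lte": 100}}}

# Compound clause: bool combines other clauses into one query tree.
query = {
    "query": {
        "bool": {
            "must": [match_clause],
            "filter": [range_clause],
        }
    }
}

body = json.dumps(query, indent=2)
print(body)
```

The resulting JSON is the kind of request body that would be sent to the `_search` endpoint of an index; the nesting of `bool` around the two leaf clauses is the tree structure the AST comparison refers to.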


Elastic Stack

  • Elasticsearch official website
  • Logstash is a data pipeline that helps you process logs and other event data from a variety of systems
  • Kibana is a data analysis tool that helps to visualize your data; Kibana Manual docs
  • Beats is the platform for building lightweight, open source data shippers for many types of data that you want to enrich with Logstash, search and analyze in Elasticsearch, and visualize in Kibana.

Books

Certifications

Elastic Certified Engineer

Related (awesome) lists

Open-source and free products, based on Elasticsearch

  • Fess is an open source full featured Enterprise Search, with a web-crawler
  • Yelp/elastalert is a modular flexible rules based alerting system written in Python
  • etsy/411 - an Alert Management Web Application https://demo.fouroneone.io (credentials: user/user)
  • appbaseio/mirage is a 🔎 GUI for composing Elasticsearch queries
  • exceptionless/Exceptionless is an error (exception) collecting and reporting server with client bindings for various programming languages
  • searchkit/searchkit is a UI framework based on React to build awesome search experiences with Elasticsearch
  • appbaseio/reactivemaps is a React based UI components library for building Airbnb / Foursquare like Maps
  • appbaseio/reactivesearch is a library of beautiful React UI components for Elasticsearch
  • appbaseio/dejavu The missing UI for Elasticsearch; landing page
  • Simple File Server is an Openstack Swift compatible distributed object store that can serve and securely store billions of large and small files using minimal resources.
  • logagent a log shipper to parse and ship logs to Elasticsearch including bulk indexing, disk buffers and log format detection.
  • ItemsAPI simplified search API for web and mobile (based on Elasticsearch and Express.js)
  • Kuzzle - An open-source backend with advanced real-time features for Web, Mobile and IoT that uses ElasticSearch as a database. (Website)
  • SIAC - SIAC is an enterprise SIEM built on the ELK stack and other open-source components.
  • Sentinl - Sentinl is a Kibana alerting and reporting app.
  • Praeco - Elasticsearch alerting made simple
  • DataStation - Easily query, script, and visualize data from every database, file, and API.

Elasticsearch developer tools and utilities

Development and debugging

  • Sense (from Elastic) A JSON aware developer console to Elasticsearch; official and very powerful
  • ES-mode An Emacs major mode for interacting with Elasticsearch (similar to Sense)
  • Elasticsearch Cheatsheet Examples for the most used queries, APIs and settings for all major versions of Elasticsearch
  • Elasticstat CLI tool displaying monitoring information like htop
  • Elastic for Visual Studio Code An extension for developing Elasticsearch queries in Visual Studio Code, similar to the Kibana and Sense extensions
  • Elastic Builder A Node.js implementation of the Elasticsearch DSL
  • Bodybuilder A Node.js elasticsearch query body builder
  • enju A Node.js elasticsearch ORM
  • Peek An interactive CLI in Python that works like Kibana Console with additional features

Import and Export

  • Knapsack plugin is a "Swiss Army knife" export/import plugin for Elasticsearch
  • Elasticsearch-Exporter is a command line script to import/export data from Elasticsearch to various other storage systems
  • esbulk Parallel elasticsearch bulk indexing utility for the command line.
  • elasticdump - tools for moving and saving indices
  • elasticsearch-loader - Tool for loading common file types to elasticsearch including csv, json, and parquet

Management

  • Esctl - High-level command line interface to manage Elasticsearch clusters.
  • Vulcanizer - GitHub's open-sourced cluster management library based on Elasticsearch's REST API. Comes with a high-level CLI tool

Elasticsearch plugins

Cluster

  • sscarduzio/elasticsearch-readonlyrest-plugin Safely expose Elasticsearch REST API directly to the public
  • mobz/elasticsearch-head is a powerful and essential plugin for managing your cluster, indices and mapping
  • Bigdesk - Live charts and statistics for elasticsearch cluster
  • Elastic HQ - Elasticsearch cluster management console with live monitoring and beautiful UI
  • Cerebro is an open source (MIT License) Elasticsearch web admin tool. Supports ES 5.x
  • Kopf - Another management plugin that has a REST console and manual shard allocation
  • Search Guard - Elasticsearch and elastic stack security and alerting for free
  • ee-outliers - ee-outliers is a framework to detect outliers in events stored in an Elasticsearch cluster.
  • Elasticsearch Comrade - Elasticsearch admin panel built for ops and monitoring
  • elasticsearch-admin - Web administration for Elasticsearch

Other

  • SIREn Join Plugin for Elasticsearch This plugin extends Elasticsearch with new search actions and a filter query parser that make it possible to perform a "Filter Join" between two sets of documents (in the same index or in different indices).

Integrations and SQL support

You know, for search

Kibana plugins and applications

Kibana Visualization plugins

  • nbs-system/mapster - a visualization for creating live 3D event maps in Kibana
  • Kibana Tag Cloud Plugin - tag cloud visualization plugin based on d3-cloud plugin
  • LogTrail - a plugin for Kibana to view, analyze, search and tail log events from multiple hosts in realtime with devops friendly interface inspired by Papertrail
  • Analyze API - Kibana 6 application to manipulate the _analyze API graphically
  • kbn_network - a Kibana plugin that displays a network of nodes linking two previously selected fields.

Discussions and social media

Tutorials

Articles

System configuration

Docker and Elasticsearch

Java tuning

How to start using G1

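# Default (empty) setting, left commented out:
#ES_JAVA_OPTS=""
# Turn off the ParNew and CMS collectors and turn on G1 (these flags target older JVMs that still ship CMS):
ES_JAVA_OPTS="-XX:-UseParNewGC -XX:-UseConcMarkSweepGC -XX:+UseG1GC"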

Scalable Infrastructure and performance

Integrations

Logging

Alerts

Time series

Machine Learning

Use cases for Elasticsearch

Other

Videos

Overviews

Advanced

Code, configuration file samples and other gists

Who is using elasticsearch?

Yelp, IFTTT, StackExchange, Raygun, Mozilla, Spotify, CERN, NASA, Zalando

License

MIT License and Creative Commons license

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
