📅 Community Meeting |
---|
The Fluid project holds a bi-weekly online community meeting. To join, or to view notes and recordings of previous meetings, please see the meeting schedule and meeting minutes. |
Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. It is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox project.
For more information, please refer to our papers:
- Rong Gu, Kai Zhang, Zhihao Xu, et al. Fluid: Dataset Abstraction and Elastic Acceleration for Cloud-native Deep Learning Training Jobs. IEEE ICDE, pp. 2183-2196, May, 2022. (Conference Version)
- Rong Gu, Zhihao Xu, Yang Che, et al. High-level Data Abstraction and Elastic Data Caching for Data-intensive AI Applications on Cloud-native Platforms. IEEE TPDS, pp. 2946-2964, Vol 34(11), 2023. (Journal Version)
English | 简体中文
What is NEW! |
---|
May 26th, 2023. Fluid v0.9.0 is RELEASED! It provides various new features, such as thinRuntime to simplify integration with third-party storage systems, data access across namespaces, subDataset support, new data operations like dataMigrate, the native acceleration system EFCRuntime for distributed file systems, and so on. Please check the CHANGELOG for details. |
Sep. 3rd, 2022. Fluid v0.8.0 is RELEASED! It provides various new features, such as lifecycle management of serverless Jobs with Fluid sidecar support, on-demand enabling of runtime controllers, an automatic CRD upgrader, restricting pod scheduling to dataset cache nodes, Arm64 support with JuiceFSRuntime, GCS support for AlluxioRuntime, and so on. Please check the CHANGELOG for details. |
Mar. 2nd, 2022. Fluid v0.7.0 is RELEASED! It provides various new features, such as FUSE sidecar auto-injection for all runtimes (suitable for serverless environments), FUSE auto-recovery and upgrade, lazy FUSE mount mode, support for the JuiceFS cache runtime, and so on. Please check the CHANGELOG for details. |
Aug. 11th, 2021. Fluid v0.6.0 is RELEASED! It provides various new features, such as dataset cache autoscaling and cron scaling, dataset-cache-aware Pod scheduling, and HA support for the cache Runtime. Please check the CHANGELOG for details. |
Apr. 27th, 2021. Fluid accepted by CNCF! The Fluid project was accepted as an official CNCF Sandbox Project by the CNCF Technical Oversight Committee (TOC) with a majority vote after the review process. A new beginning for Fluid! |
- Dataset Abstraction:
  Implements the unified abstraction for datasets from multiple storage sources, with observability features to help users evaluate the need for scaling the cache system.
- Scalable Cache Runtime:
  Offers a unified access interface for data operations with different runtimes, enabling access to third-party storage systems.
- Automated Data Operations:
  Provides various automated data operation modes to facilitate integration with automated operations systems.
- Elasticity and Scheduling:
  Enhances data access performance by combining data caching technology with elastic scaling, portability, observability, and data affinity-scheduling capabilities.
- Runtime Platform Agnostic:
  Supports a variety of environments and can run different storage clients based on the environment, including native, edge, Serverless Kubernetes clusters, and Kubernetes multi-cluster environments.
Dataset: A Dataset is a set of logically related data that can be used by computing engines, such as Spark for big data analytics and TensorFlow for AI applications. Leveraging data intelligently often creates core industry value. Managing Datasets may require features along several dimensions, such as security, version management, and data acceleration. We start with data acceleration as the first step toward supporting dataset management.
Runtime: The Runtime enforces dataset isolation and sharing, provides version management, and enables data acceleration. It does so by defining a set of interfaces that handle Datasets throughout their lifecycle, so that management and acceleration functionality can be implemented behind these interfaces.
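To make the relationship between the two concepts more concrete, here is a minimal Go sketch of a lifecycle interface with one toy implementation behind it. Everything in it is an assumption made for illustration: the `Dataset` struct, the `Runtime` interface, its `Setup`/`Status`/`Teardown` methods, the `memoryRuntime` type, and the `s3://example-bucket/demo` location are hypothetical and are not Fluid's actual API or CRD schema.

```go
package main

import (
	"errors"
	"fmt"
)

// Dataset is a hypothetical, simplified stand-in for the Dataset concept:
// a named set of logically related data backed by some storage location.
type Dataset struct {
	Name       string
	MountPoint string // illustrative storage URI, not a real endpoint
}

// Runtime is a hypothetical lifecycle interface. Callers depend only on
// these methods, so different cache engines could sit behind them.
type Runtime interface {
	Setup(ds Dataset) error
	Status(name string) (string, error)
	Teardown(name string) error
}

// memoryRuntime is a toy implementation that only tracks state in memory.
type memoryRuntime struct {
	states map[string]string
}

func newMemoryRuntime() *memoryRuntime {
	return &memoryRuntime{states: make(map[string]string)}
}

// Setup registers the dataset and marks it "Ready".
func (r *memoryRuntime) Setup(ds Dataset) error {
	if ds.Name == "" {
		return errors.New("dataset name must not be empty")
	}
	if _, exists := r.states[ds.Name]; exists {
		return fmt.Errorf("dataset %q is already set up", ds.Name)
	}
	r.states[ds.Name] = "Ready"
	return nil
}

// Status reports the recorded state of a dataset.
func (r *memoryRuntime) Status(name string) (string, error) {
	state, ok := r.states[name]
	if !ok {
		return "", fmt.Errorf("dataset %q not found", name)
	}
	return state, nil
}

// Teardown removes the dataset from the runtime.
func (r *memoryRuntime) Teardown(name string) error {
	if _, ok := r.states[name]; !ok {
		return fmt.Errorf("dataset %q not found", name)
	}
	delete(r.states, name)
	return nil
}

func main() {
	var rt Runtime = newMemoryRuntime()
	ds := Dataset{Name: "demo", MountPoint: "s3://example-bucket/demo"}
	if err := rt.Setup(ds); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	state, _ := rt.Status(ds.Name)
	fmt.Println(ds.Name, "is", state) // prints "demo is Ready"
	_ = rt.Teardown(ds.Name)
}
```

The point of the sketch is only the shape: code that uses a dataset talks to the interface, and a different engine could be swapped in behind it without changing the caller.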
- Kubernetes version > 1.16, with CSI support
- Golang 1.18+
- Helm 3
You can follow our Get Started guide to quickly set up a test Kubernetes cluster.
For more in-depth installation and production instructions, see our documentation at docs.
You can also visit Fluid Homepage to get relevant documents.
See ROADMAP.md for the roadmap details. It may be updated from time to time.
Feel free to reach out if you have any questions. The maintainers of this project are reachable via:
DingTalk:
WeChat Official Account:
Slack:
- Join the CNCF Slack and navigate to the #fluid channel for discussion.
Contributions are highly welcomed and greatly appreciated. See CONTRIBUTING.md for details on submitting patches and the contribution workflow.
If you are interested in Fluid and would like to share your experiences with others, you are warmly welcome to add your information to the ADOPTERS.md page. We will continue to discuss new requirements and feature designs with you in advance.
Fluid is under the Apache 2.0 license. See the LICENSE file for details. It is vendor-neutral.
Security is a top priority for us at Fluid. If you come across a security-related issue, please send an email to [email protected] .
Fluid adopts the CNCF Code of Conduct.