feat(pipeline): transform support on_failure #4123

Merged · 61 commits · Jun 17, 2024

Changes from 1 commit
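The feature itself lets a transform step declare what should happen when a field fails to convert. As a rough, hypothetical sketch of a pipeline definition using `on_failure` (the key names and the `ignore` value are assumptions based on the feature description, not copied from this PR):

```yaml
# Hypothetical pipeline definition; field names are illustrative assumptions.
processors:
  - date:
      fields:
        - ts
      formats:
        - "%Y-%m-%d %H:%M:%S"

transform:
  - fields:
      - status_code
    type: int32
    # If the cast fails, skip this field instead of failing the whole row
    # (assumed semantics of the on_failure option added in this PR).
    on_failure: ignore
  - fields:
      - ts
    type: time
    index: timestamp
```

Under the assumed `ignore` semantics, a row whose `status_code` cannot be cast to `int32` would be kept (minus that field) rather than rejected.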
Commits (61):

63491d9  chore: add log http ingester scaffold (paomian, May 22, 2024)
4d2ec3b  chore: add some example code (paomian, May 22, 2024)
2e51c16  chore: add log inserter (paomian, May 27, 2024)
fbc66ec  chore: add log handler file (paomian, May 27, 2024)
cd4d83d  chore: add pipeline lib (paomian, May 27, 2024)
2bc1937  chore: import log handler (paomian, May 29, 2024)
2b16ef9  chore: add pipeline http handler (paomian, May 29, 2024)
f1350cd  chore: add pipeline private table (paomian, May 30, 2024)
1d52cad  chore: add pipeline API (paomian, May 31, 2024)
8c69abb  chore: improve error handling (paomian, May 31, 2024)
7e0a9ad  Merge branch 'main' into feat/log-handler (shuiyisong, Jun 3, 2024)
73432dc  chore: merge main (shuiyisong, Jun 3, 2024)
9d7284c  Merge pull request #6 from shuiyisong/chore/merge_main (paomian, Jun 3, 2024)
1a03b7e  chore: add multi content type support for log handler (paomian, Jun 3, 2024)
a2f1230  Merge branch 'main' into feat/log-handler (shuiyisong, Jun 4, 2024)
6a0998d  refactor: remove servers dep on pipeline (shuiyisong, Jun 3, 2024)
443eaf9  refactor: move define_into_tonic_status to common-error (shuiyisong, Jun 3, 2024)
c8ce4ee  refactor: bring in pipeline 3eb890c551b8d7f60c4491fcfec18966e2b210a4 (shuiyisong, Jun 4, 2024)
eb9cd22  chore: fix typo (shuiyisong, Jun 4, 2024)
8d0595c  refactor: bring in pipeline a95c9767d7056ab01dd8ca5fa1214456c6ffc72c (shuiyisong, Jun 4, 2024)
061b14e  chore: fix typo and license header (shuiyisong, Jun 4, 2024)
c152472  refactor: move http event handler to a separate file (shuiyisong, Jun 4, 2024)
ddea3c1  chore: add test for pipeline (paomian, Jun 4, 2024)
162e92f  Merge branch 'main' into feat/log-handler (shuiyisong, Jun 4, 2024)
5a7a5be  chore: update (shuiyisong, Jun 4, 2024)
423e51e  chore: fmt (shuiyisong, Jun 4, 2024)
51df233  Merge pull request #7 from shuiyisong/refactor/log_handler (paomian, Jun 4, 2024)
8066eb3  refactor: bring in pipeline 7d2402701877901871dd1294a65ac937605a6a93 (shuiyisong, Jun 4, 2024)
e2a2e50  refactor: move `pipeline_operator` to `pipeline` crate (shuiyisong, Jun 4, 2024)
209a1a3  chore: minor update (shuiyisong, Jun 4, 2024)
c110adb  refactor: bring in pipeline 1711f4d46687bada72426d88cda417899e0ae3a4 (shuiyisong, Jun 5, 2024)
1047dd7  chore: add log (shuiyisong, Jun 5, 2024)
2ff2fda  chore: add log (shuiyisong, Jun 5, 2024)
8b6a652  chore: remove open hook (shuiyisong, Jun 5, 2024)
6ca15ad  Merge pull request #8 from shuiyisong/refactor/log (paomian, Jun 5, 2024)
1298b0a  chore: minor update (shuiyisong, Jun 5, 2024)
ea548b0  chore: fix fmt (shuiyisong, Jun 5, 2024)
fb13278  Merge pull request #9 from shuiyisong/refactor/log (paomian, Jun 5, 2024)
6c88b89  chore: minor update (shuiyisong, Jun 5, 2024)
eeed85e  chore: rename desc for pipeline table (shuiyisong, Jun 5, 2024)
f77d20b  refactor: remove updated_at in pipelines (shuiyisong, Jun 5, 2024)
38ed6bb  Merge pull request #10 from shuiyisong/chore/polish_code (paomian, Jun 5, 2024)
5815675  chore: add more content type support for log inserter api (paomian, Jun 5, 2024)
c84ef0e  Merge pull request #11 from paomian/feat/log-handler-v2 (paomian, Jun 5, 2024)
2e69655  chore: introduce pipeline crate (shuiyisong, Jun 5, 2024)
ca9525d  Merge branch 'chore/introduce_pipeline' into feat/log-handler (shuiyisong, Jun 5, 2024)
85a4c32  Merge branch 'main' into feat/log-handler (shuiyisong, Jun 6, 2024)
77ef015  chore: update upload pipeline api (paomian, Jun 6, 2024)
43a57a7  chore: fix by pr commit (paomian, Jun 6, 2024)
3560285  chore: add some doc for pub fn/struct (paomian, Jun 6, 2024)
4872c8a  chore: some minor fix (paomian, Jun 6, 2024)
11933b0  chore: add pipeline version support (paomian, Jun 6, 2024)
92a2bda  chore: impl log pipeline version (paomian, Jun 7, 2024)
29eb2db  transform on_failure (yuanbohan, Jun 8, 2024)
3f8b9ce  chore: merge main (shuiyisong, Jun 11, 2024)
e764564  chore: merge main (shuiyisong, Jun 12, 2024)
6bec090  chore: merge log-handler (shuiyisong, Jun 12, 2024)
4236f87  chore: add test (shuiyisong, Jun 12, 2024)
cbb2337  chore: move test to a separate file (shuiyisong, Jun 12, 2024)
6968150  chore: merge log-handler (shuiyisong, Jun 17, 2024)
83e59b1  chore: add comment (shuiyisong, Jun 17, 2024)
chore: add some doc for pub fn/struct
paomian committed Jun 6, 2024
commit 35602851788b55bb9e76c0c3fccfb376fae1a54d
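This commit's whole purpose is adding `///` doc comments to public items, as both file diffs below show. As a quick, generic illustration (not taken from the diff), rustdoc attaches such comments to the item that follows and renders them via `cargo doc`:

```rust
/// Returns the sum of `a` and `b`.
///
/// Comments written with `///` become the item's API documentation;
/// this snippet is illustrative only, not code from the PR.
pub fn add(a: i64, b: i64) -> i64 {
    a + b
}
```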
src/pipeline/src/manager/pipeline_operator.rs (26 changes: 19 additions & 7 deletions)

```diff
@@ -32,6 +32,14 @@ use crate::{GreptimeTransformer, Pipeline};
 
 pub const PIPELINE_TABLE_NAME: &str = "pipelines";
 
+/// PipelineOperator is responsible for managing pipelines.
+/// It provides the ability to:
+/// - Create a pipeline table if it does not exist
+/// - Get a pipeline from the pipeline table
+/// - Insert a pipeline into the pipeline table
+/// - Compile a pipeline
+/// - Add a pipeline table to the cache
+/// - Get a pipeline table from the cache
 pub struct PipelineOperator {
     inserter: InserterRef,
     statement_executor: StatementExecutorRef,
@@ -41,6 +49,7 @@ pub struct PipelineOperator {
 }
 
 impl PipelineOperator {
+    /// Create a table request for the pipeline table.
     pub fn create_table_request(&self, catalog: &str) -> RegisterSystemTableRequest {
         let (time_index, primary_keys, column_defs) = PipelineTable::build_pipeline_schema();
 
@@ -81,8 +90,7 @@ impl PipelineOperator {
     }
 
     async fn create_pipeline_table_if_not_exists(&self, ctx: QueryContextRef) -> Result<()> {
-        let catalog_str = ctx.current_catalog().to_owned();
-        let catalog = catalog_str.as_str();
+        let catalog = ctx.current_catalog();
 
         // exist in cache
         if self.get_pipeline_table_from_cache(catalog).is_some() {
@@ -105,19 +113,19 @@
             return Ok(());
         }
 
-        let schema = expr.schema_name.clone();
-        let table_name = expr.table_name.clone();
-
         // create table
         self.statement_executor
-            .create_table_inner(&mut expr, None, ctx)
+            .create_table_inner(&mut expr, None, ctx.clone())
             .await
             .context(CreateTableSnafu)?;
 
+        let schema = &expr.schema_name;
+        let table_name = &expr.table_name;
+
         // get from catalog
         let table = self
             .catalog_manager
-            .table(catalog, &schema, &table_name)
+            .table(catalog, schema, table_name)
             .await
             .context(CatalogSnafu)?
             .context(PipelineTableNotFoundSnafu)?;
@@ -133,6 +141,7 @@ impl PipelineOperator {
         Ok(())
     }
 
+    /// Get a pipeline table from the cache.
     pub fn get_pipeline_table_from_cache(&self, catalog: &str) -> Option<PipelineTableRef> {
         self.tables.read().unwrap().get(catalog).cloned()
     }
@@ -152,6 +161,7 @@ impl PipelineOperator {
 }
 
 impl PipelineOperator {
+    /// Create a new PipelineOperator.
     pub fn new(
         inserter: InserterRef,
         statement_executor: StatementExecutorRef,
@@ -167,6 +177,7 @@ impl PipelineOperator {
         }
     }
 
+    /// Get a pipeline from the pipeline table.
     pub async fn get_pipeline(
         &self,
         query_ctx: QueryContextRef,
@@ -180,6 +191,7 @@ impl PipelineOperator {
             .await
     }
 
+    /// Insert a pipeline into the pipeline table.
     pub async fn insert_pipeline(
         &self,
         name: &str,
```
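As the `get_pipeline_table_from_cache` hunk above shows, the `tables` map in `PipelineOperator` is a per-catalog cache guarded by an `RwLock`. Here is a self-contained sketch of that pattern with simplified stand-in types (the real code uses `PipelineTableRef` and creates the underlying table via `statement_executor` before caching):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Simplified stand-in for the real PipelineTableRef; illustrative only.
type PipelineTableRef = Arc<String>;

/// A minimal sketch of the per-catalog cache used by PipelineOperator:
/// readers take a shared lock and clone the Arc, writers insert on first use.
struct TableCache {
    tables: RwLock<HashMap<String, PipelineTableRef>>,
}

impl TableCache {
    fn new() -> Self {
        Self { tables: RwLock::new(HashMap::new()) }
    }

    /// Mirrors `get_pipeline_table_from_cache`: cloning the Arc keeps the
    /// read lock short and hands the caller an independent handle.
    fn get(&self, catalog: &str) -> Option<PipelineTableRef> {
        self.tables.read().unwrap().get(catalog).cloned()
    }

    /// Mirrors the "check cache, create if missing, then cache" flow of
    /// `create_pipeline_table_if_not_exists`.
    fn get_or_insert_with(
        &self,
        catalog: &str,
        create: impl FnOnce() -> PipelineTableRef,
    ) -> PipelineTableRef {
        if let Some(table) = self.get(catalog) {
            return table; // fast path: already cached
        }
        let mut guard = self.tables.write().unwrap();
        // `entry` makes the insert race-safe if two writers got past the fast path.
        guard.entry(catalog.to_string()).or_insert_with(create).clone()
    }
}

fn main() {
    let cache = TableCache::new();
    let t = cache.get_or_insert_with("greptime", || Arc::new("pipelines".to_string()));
    assert!(cache.get("greptime").is_some());
    println!("cached table: {t}");
}
```

Cloning the `Arc` out of the map keeps lock hold times short, which is why the real method ends in `.cloned()` rather than returning a reference into the guarded map.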
src/pipeline/src/manager/table.rs (16 changes: 14 additions & 2 deletions)

```diff
@@ -41,7 +41,7 @@ use table::table::adapter::DfTableProviderAdapter;
 use table::TableRef;
 
 use crate::error::{
-    BuildDfLogicalPlanSnafu, CastTypeSnafu, CollectRecordsSnafu, CompilePipeline,
+    BuildDfLogicalPlanSnafu, CastTypeSnafu, CollectRecordsSnafu, CompilePipelineSnafu,
     ExecuteInternalStatementSnafu, InsertPipelineSnafu, PipelineNotFoundSnafu, Result,
 };
 use crate::etl::transform::GreptimeTransformer;
@@ -56,6 +56,9 @@ pub const PIPELINE_TABLE_PIPELINE_SCHEMA_COLUMN_NAME: &str = "schema";
 pub const PIPELINE_TABLE_PIPELINE_CONTENT_TYPE_COLUMN_NAME: &str = "content_type";
 pub const PIPELINE_TABLE_PIPELINE_CONTENT_COLUMN_NAME: &str = "pipeline";
 pub const PIPELINE_TABLE_CREATED_AT_COLUMN_NAME: &str = "created_at";
+
+/// PipelineTable is a table that stores the pipeline schema and content.
+/// Every catalog has its own pipeline table.
 pub struct PipelineTable {
     inserter: InserterRef,
     statement_executor: StatementExecutorRef,
@@ -65,6 +68,7 @@ pub struct PipelineTable {
 }
 
 impl PipelineTable {
+    /// Create a new PipelineTable.
     pub fn new(
         inserter: InserterRef,
         statement_executor: StatementExecutorRef,
@@ -80,6 +84,7 @@ impl PipelineTable {
         }
     }
 
+    /// Build the schema for the pipeline table.
     pub fn build_pipeline_schema() -> (String, Vec<String>, Vec<ColumnDef>) {
         (
             PIPELINE_TABLE_CREATED_AT_COLUMN_NAME.to_string(),
@@ -138,6 +143,7 @@ impl PipelineTable {
         )
     }
 
+    /// Build the column schemas for inserting a row into the pipeline table.
     fn build_insert_column_schemas() -> Vec<PbColumnSchema> {
         vec![
             PbColumnSchema {
@@ -181,10 +187,11 @@ impl PipelineTable {
             .into()
     }
 
+    /// Compile a pipeline from a string.
     pub fn compile_pipeline(pipeline: &str) -> Result<Pipeline<GreptimeTransformer>> {
         let yaml_content = Content::Yaml(pipeline.into());
         parse::<GreptimeTransformer>(&yaml_content)
-            .map_err(|e| CompilePipeline { reason: e }.build())
+            .map_err(|e| CompilePipelineSnafu { reason: e }.build())
     }
 
     fn generate_pipeline_cache_key(schema: &str, name: &str) -> String {
@@ -203,6 +210,7 @@ impl PipelineTable {
             .cloned()
     }
 
+    /// Insert a pipeline into the pipeline table.
     async fn insert_pipeline_to_pipeline_table(
         &self,
         schema: &str,
@@ -255,6 +263,8 @@ impl PipelineTable {
         Ok(())
     }
 
+    /// Get a pipeline by name.
+    /// If the pipeline is not in the cache, it is fetched from the table, compiled, and inserted into the cache.
     pub async fn get_pipeline(
         &self,
         schema: &str,
@@ -273,6 +283,8 @@ impl PipelineTable {
         Ok(compiled_pipeline)
     }
 
+    /// Insert a pipeline into the pipeline table and compile it.
+    /// The compiled pipeline will be inserted into the cache.
     pub async fn insert_and_compile(
         &self,
         schema: &str,
```
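Beyond the doc comments, this file's diff also fixes a naming bug: the error-context selector is `CompilePipelineSnafu`, not `CompilePipeline`. With the snafu crate, `#[derive(Snafu)]` generates one context selector struct per enum variant, suffixed with `Snafu` by default, and `.build()` turns a selector plus its fields into the error value. A minimal sketch with a stand-in enum (not GreptimeDB's actual `Error` type):

```rust
use snafu::Snafu;

// Stand-in error enum; the variant is named CompilePipeline, but the
// generated context selector is CompilePipelineSnafu.
#[derive(Debug, Snafu)]
enum Error {
    #[snafu(display("failed to compile pipeline: {reason}"))]
    CompilePipeline { reason: String },
}

type Result<T> = std::result::Result<T, Error>;

// Stand-in for the real YAML parser; returns a plain String as its error.
fn parse_yaml(input: &str) -> std::result::Result<(), String> {
    if input.is_empty() {
        Err("empty pipeline".to_string())
    } else {
        Ok(())
    }
}

fn compile_pipeline(pipeline: &str) -> Result<()> {
    // Matches the pattern in the hunk above: the selector's `.build()`
    // constructs Error::CompilePipeline from the supplied context fields.
    parse_yaml(pipeline).map_err(|e| CompilePipelineSnafu { reason: e }.build())
}

fn main() {
    println!("{:?}", compile_pipeline(""));
}
```

This is also why the `use crate::error::{...}` import changes in the first hunk: call sites name the generated selector, so that is what the module must export.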