diff --git a/chart/values.yaml b/chart/values.yaml
index d62746bf84..570db7d78b 100644
--- a/chart/values.yaml
+++ b/chart/values.yaml
@@ -107,7 +107,7 @@ postgresql:
   ## @param image.tag PostgreSQL image tag (immutable tags are recommended)
   ##
   image:
-    tag: 0.25.0
+    tag: 12.1.0
   ## Authentication parameters
   ## ref: https://github.com/bitnami/bitnami-docker-postgresql/blob/master/README.md#setting-the-root-password-on-first-run
   ## ref: https://github.com/bitnami/bitnami-docker-postgresql/blob/master/README.md#creating-a-database-on-first-run
diff --git a/docs/openapi.html b/docs/openapi.html
index 6958d1dc14..23976c22ea 100644
--- a/docs/openapi.html
+++ b/docs/openapi.html
@@ -13,21 +13,21 @@
 }
-

Marquez (0.25.0)

Download OpenAPI specification:Download

License: Apache 2.0

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.

-

Namespaces

Create a namespace

Creates a new namespace object. A namespace enables the contextual grouping of related jobs and datasets. Namespaces must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), dashes (-), colons (:), slashes (/), or dots (.). A namespace is case-insensitive with a maximum length of 1024 characters. Note that jobs and datasets are unique within a namespace, but not across namespaces.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
Request Body schema: application/json
ownerName
required
string

The owner of the namespace.

-
description
string

The description of the namespace.

-

Responses

Request samples

Content type
application/json
{
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Retrieve a namespace

Returns a namespace.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

List all namespaces

Returns a list of namespaces.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "namespaces": [
    ]
}

Sources

Create a source Deprecated

Creates a new source object. A source is the physical location of a dataset, such as a table in PostgreSQL or a topic in Kafka. A source enables the grouping of physical datasets by their physical source.

-
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

-
Request Body schema: application/json
type
required
string

The type of the source.

-
connectionUrl
required
string <URL>

The URL to the location of the source.

-
description
string

The description of the source.

-

Responses

Request samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Retrieve a source

Returns a source.

-
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

-

Responses

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

List all sources

Returns a list of sources.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "sources": [
    ]
}

Datasets

Create a dataset Deprecated

Creates a new dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
Request Body schema: application/json
One of
type
required
string
Value: "DB_TABLE"

The type of the dataset.

-
physicalName
required
string

The physical name of the table.

-
sourceName
required
string

The name of the source associated with the table.

-
required
Array of objects[ items ]

The fields of the table.

-
tags
Array of strings

List of tags.

-
description
string

The description of the table.

-
runId
string

The ID associated with the run modifying the table.

-

Responses

Request samples

Content type
application/json
Example
{
  • "type": "DB_TABLE",
  • "physicalName": "public.mytable",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "description": "My first dataset!"
}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a dataset

Returns a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a version for a dataset

Returns a version for a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "version": "d224dac0-35d7-4d9b-bbbe-6fff1a8485ad",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "description": "My first dataset!",
  • "createdByRun": {
    }
}

List all versions for a dataset

Returns a list of versions for a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

List all datasets

Returns a list of datasets.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "datasets": [
    ],
  • "totalCount": 0
}

Tag a dataset

Tag an existing dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
tag
required
string
Example: SENSITIVE

The name of the tag.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Tag a field

Tag an existing field of a dataset.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

-
field
required
string
Example: my_field

The name of the field.

-
tag
required
string
Example: SENSITIVE

The name of the tag.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Jobs

Create a job Deprecated

Creates a new job object. All job objects are immutable and are uniquely identified by a generated ID. Marquez will create a version of a job each time the contents of the object are modified. For example, the location of a job may change over time, resulting in new versions. The accumulated versions can be listed, used to rerun a specific job version, or used to help debug a failed job run.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
Request Body schema: application/json
object

The ID of the job.

-
type
required
enum (JobType)
Enum: "BATCH" "STREAM" "SERVICE"

The type of the job.

-
required
Array of objects (DatasetId) unique [ items ]

The set of input datasets.

-
required
Array of objects (DatasetId) unique [ items ]

The set of output datasets.

-
location
string <URL>

The URL of the job source code or artifact.

-
context
object
Deprecated

A key/value pair that must be of type string. A context can be used for getting additional details about the job.

-
description
string

The description of the job.

-
runId
string

An optional run ID used to associate a job version to an existing job run.

-

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a job

Retrieve a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

List all jobs

Returns a list of jobs.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "jobs": [
    ],
  • "totalCount": 0
}

Retrieve a version for a job

Returns a version for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

-

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "version": "56472c57-a2ef-4218-b7b7-d2af02a343fd",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "facets": { }
}

List all versions for a job

Returns a list of versions for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

Create a run Deprecated

Creates a new run object for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
Request Body schema: application/json
id
string <uuid>

An optional user-provided unique ID of the run. A run ID must be a UUID. If an ID for the run is not provided, a random UUID will be generated for the given run.

-
nominalStartTime
string <date-time>

An ISO-8601 timestamp representing the nominal start time of the run.

-
nominalEndTime
string <date-time>

An ISO-8601 timestamp representing the nominal end time of the run.

-
args
object

The arguments of the run.
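
A minimal sketch of assembling the request body described above, assuming the caller passes timezone-aware datetimes. The helper name `run_request_body` is hypothetical; only the field names (`id`, `nominalStartTime`, `nominalEndTime`, `args`) come from the schema. The `id` key is left out when no run ID is supplied, so that a random UUID can be generated as the description states.

```python
import json
import uuid
from datetime import datetime
from typing import Optional


def run_request_body(
    nominal_start: datetime,
    nominal_end: datetime,
    args: dict,
    run_id: Optional[uuid.UUID] = None,
) -> str:
    """Serialize a run request body; hypothetical helper, not part of Marquez."""
    body = {
        # isoformat() on an aware datetime yields an ISO-8601 timestamp with offset.
        "nominalStartTime": nominal_start.isoformat(),
        "nominalEndTime": nominal_end.isoformat(),
        "args": args,
    }
    if run_id is not None:
        # Only send an ID when the caller chose one.
        body["id"] = str(run_id)
    return json.dumps(body)
```

Omitting `id` rather than sending `null` is a design choice of this sketch; the description only says an ID is generated when none is provided.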

-

Responses

Request samples

Content type
application/json
{
  • "args": {
    }
}

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "COMPLETED",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": "2019-05-09T20:05:46.815920Z",
  • "durationMs": 4250894125,
  • "args": {
    },
  • "context": {
    },
  • "facets": { }
}

List all runs

Returns a list of runs for a job.

-
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

-
job
required
string <= 1024 characters
Example: my-job

The name of the job.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Retrieve a run

Retrieve a run.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Start a run Deprecated

Marks the run as RUNNING.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Complete a run Deprecated

Marks the run as COMPLETED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "COMPLETED",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": "2019-05-09T20:05:46.815920Z",
  • "durationMs": 4250894125,
  • "args": {
    },
  • "context": {
    },
  • "facets": { }
}

Fail a run Deprecated

Marks the run as FAILED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Abort a run Deprecated

Marks the run as ABORTED.

-
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

-
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

-

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Lineage

Record a single lineage event

Receive, process, and store lineage metadata using the OpenLineage standard.

-
Request Body schema: application/json
any (LineageEvent)

Responses

Request samples

Content type
application/json
{}

Get a lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node.

-
depth
integer
Default: 20

Depth of lineage graph to create.
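
A small sketch of building the query string for the two parameters above. The `type:namespace:name` layout of `nodeId` is inferred from the single example value and is an assumption, as are the helper names `node_id` and `lineage_query`.

```python
from urllib.parse import urlencode


def node_id(kind: str, namespace: str, name: str) -> str:
    """Join the parts with colons, mirroring the documented example value."""
    return f"{kind}:{namespace}:{name}"


def lineage_query(kind: str, namespace: str, name: str, depth: int = 20) -> str:
    """Return a URL-encoded query string; depth defaults to the documented 20."""
    return urlencode({"nodeId": node_id(kind, namespace, name), "depth": depth})
```

`urlencode` percent-encodes the colons in the node ID, so the value survives being placed in a URL.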

-

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}

Tags

Create a tag

Creates a new tag object.

-
path Parameters
tag
required
string
Example: SENSITIVE

The name of the tag.

-
Request Body schema: application/json
description
string

The description of the tag.

-

Responses

Request samples

Content type
application/json
{
  • "description": "My first tag!"
}

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

List all tags

Returns a list of tags.

-
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

-
offset
integer
Default: 0

The initial position from which to return results

-

Responses

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

Search

Query all datasets and jobs

Returns the datasets and jobs that match your query.

-
query Parameters
q
required
string
Example: q=my-dataset

The pattern to match. Matching of datasets and jobs is string-based and case-insensitive. Use a percent sign (%) to match any string of zero or more characters (my-job%), or an underscore (_) to match a single character (_job_).

-
filter
string
Example: filter=dataset

Filters the results of your query by dataset or job.

-
sort
string
Example: sort=name

Sorts the results of your query by name or updated_at.

-
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset
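
To illustrate the wildcard rules given for `q`, here is a client-side sketch that translates such a pattern into a regular expression. It is not how the server evaluates queries. Whether the server anchors the pattern to the whole name is not stated here, so the whole-string match below is an assumption, and `like_to_regex` and `matches` are hypothetical names.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate % and _ wildcards into an equivalent case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # zero or more characters
        elif ch == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def matches(pattern: str, name: str) -> bool:
    """Return True if the whole of `name` matches `pattern`."""
    return like_to_regex(pattern).fullmatch(name) is not None
```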

-

Responses

Response samples

Content type
application/json
{
  • "totalCount": 1,
  • "results": [
    ]
}
+

Marquez (0.26.0-SNAPSHOT)

Download OpenAPI specification:Download

License: Apache 2.0

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.

+

Namespaces

Create a namespace

Creates a new namespace object. A namespace enables the contextual grouping of related jobs and datasets. Namespaces must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), dashes (-), colons (:), slashes (/), or dots (.). A namespace is case-insensitive with a maximum length of 1024 characters. Note that jobs and datasets are unique within a namespace, but not across namespaces.
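
The naming rules above can be checked client-side before a request is sent. The following is a minimal sketch: the character set and the 1024-character limit come from the description, while the one-character minimum and the function names are assumptions.

```python
import re

# Allowed characters and the 1024-character cap come from the description;
# requiring at least one character is an assumption of this sketch.
_NAMESPACE_RE = re.compile(r"[A-Za-z0-9_\-:/.]{1,1024}")


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` uses only permitted characters and fits the limit."""
    return _NAMESPACE_RE.fullmatch(name) is not None


def same_namespace(a: str, b: str) -> bool:
    """Compare two names case-insensitively, as the description states."""
    return a.casefold() == b.casefold()
```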

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
Request Body schema: application/json
ownerName
required
string

The owner of the namespace.

+
description
string

The description of the namespace.

+

Responses

Request samples

Content type
application/json
{
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

Retrieve a namespace

Returns a namespace.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+

Responses

Response samples

Content type
application/json
{
  • "name": "my-namespace",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "ownerName": "me",
  • "description": "My first namespace!"
}

List all namespaces

Returns a list of namespaces.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

+
offset
integer
Default: 0

The initial position from which to return results

+

Responses

Response samples

Content type
application/json
{
  • "namespaces": [
    ]
}
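
The `limit` and `offset` parameters on the list endpoints lend themselves to a simple paging loop. The sketch below assumes that a page shorter than `limit` marks the end of the results. `fetch_page` is a hypothetical callable standing in for whatever HTTP client issues the request.

```python
from typing import Callable, Iterator, List


def iter_all(
    fetch_page: Callable[[int, int], List[dict]], limit: int = 100
) -> Iterator[dict]:
    """Yield every item by advancing `offset` in steps of `limit`.

    `fetch_page(limit, offset)` is hypothetical and must return one page of items.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short page means nothing is left
            return
        offset += limit
```

When the total is an exact multiple of `limit`, the loop issues one extra request that returns an empty page. That keeps the stopping rule independent of any total count field, which not every list response sample includes.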

Sources

Create a source Deprecated

Creates a new source object. A source is the physical location of a dataset, such as a table in PostgreSQL or a topic in Kafka. A source enables the grouping of physical datasets by their physical source.

+
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

+
Request Body schema: application/json
type
required
string

The type of the source.

+
connectionUrl
required
string <URL>

The URL to the location of the source.

+
description
string

The description of the source.

+

Responses

Request samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

Retrieve a source

Returns a source.

+
path Parameters
source
required
string <= 1024 characters
Example: my-source

The name of the source.

+

Responses

Response samples

Content type
application/json
{
  • "type": "POSTGRESQL",
  • "name": "my-source",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "connectionUrl": "jdbc:postgresql://db.example.com/mydb",
  • "description": "My first source!"
}

List all sources

Returns a list of sources.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

+
offset
integer
Default: 0

The initial position from which to return results

+

Responses

Response samples

Content type
application/json
{
  • "sources": [
    ]
}

Datasets

Create a dataset Deprecated

Creates a new dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
Request Body schema: application/json
One of
type
required
string
Value: "DB_TABLE"

The type of the dataset.

+
physicalName
required
string

The physical name of the table.

+
sourceName
required
string

The name of the source associated with the table.

+
required
Array of objects[ items ]

The fields of the table.

+
tags
Array of strings

List of tags.

+
description
string

The description of the table.

+
runId
string

The ID associated with the run modifying the table.
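
As a rough sketch, the required properties listed above can be checked before sending a request body. The schema row for the required array omits its property name, so `fields` is taken from the request sample and should be treated as an assumption. `missing_fields` is a hypothetical helper.

```python
from typing import List

# Required keys for a DB_TABLE body; "fields" is inferred from the request sample.
REQUIRED_KEYS = ("type", "physicalName", "sourceName", "fields")


def missing_fields(body: dict) -> List[str]:
    """Return the required keys that are absent from `body`, in schema order."""
    return [key for key in REQUIRED_KEYS if key not in body]
```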

+

Responses

Request samples

Content type
application/json
Example
{
  • "type": "DB_TABLE",
  • "physicalName": "public.mytable",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "description": "My first dataset!"
}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a dataset

Returns a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Retrieve a version for a dataset

Returns a version for a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "version": "d224dac0-35d7-4d9b-bbbe-6fff1a8485ad",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "description": "My first dataset!",
  • "createdByRun": {
    }
}

List all versions for a dataset

Returns a list of versions for a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

List all datasets

Returns a list of datasets.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return from offset

+
offset
integer
Default: 0

The initial position from which to return results

+

Responses

Response samples

Content type
application/json
{
  • "datasets": [
    ],
  • "totalCount": 0
}

Tag a dataset

Tag an existing dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
tag
required
string
Example: SENSITIVE

The name of the tag.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Tag a field

Tag an existing field of a dataset.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
dataset
required
string <= 1024 characters
Example: my-dataset

The name of the dataset.

+
field
required
string
Example: my_field

The name of the field.

+
tag
required
string
Example: SENSITIVE

The name of the tag.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "DB_TABLE",
  • "name": "my-dataset",
  • "physicalName": "public.mytable",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "sourceName": "my-source",
  • "fields": [
    ],
  • "tags": [ ],
  • "lastModifiedAt": null,
  • "description": "My first dataset!",
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

Jobs

Create a job Deprecated

Creates a new job object. All job objects are immutable and are uniquely identified by a generated ID. Marquez will create a version of a job each time the contents of the object are modified. For example, the location of a job may change over time, resulting in new versions. The accumulated versions can be listed, used to rerun a specific job version, or used to help debug a failed job run.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
Request Body schema: application/json
id
object

The ID of the job.

+
type
required
string (JobType)
Enum: "BATCH" "STREAM" "SERVICE"

The type of the job.

+
inputs
required
Array of objects (DatasetId) unique [ items ]

The set of input datasets.

+
outputs
required
Array of objects (DatasetId) unique [ items ]

The set of output datasets.

+
location
string <URL>

The URL of the job source code or artifact.

+
context
object
Deprecated

A map of key/value pairs in which every value must be of type string. A context can be used to provide additional details about the job.

+
description
string

The description of the job.

+
runId
string

An optional run ID used to associate a job version to an existing job run.

+

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}
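
The request body fields described above can be assembled as in the sketch below. It assumes that the request keys `inputs` and `outputs` mirror the response sample and that a DatasetId is an object with `namespace` and `name`. This section does not confirm either point. The helper names `dataset_id` and `job_body` are hypothetical.

```python
import json

JOB_TYPES = {"BATCH", "STREAM", "SERVICE"}  # enum values listed for JobType


def dataset_id(namespace: str, name: str) -> dict:
    # Assumed DatasetId shape: a namespace plus a dataset name.
    return {"namespace": namespace, "name": name}


def job_body(job_type, inputs, outputs, location=None, description=None, run_id=None):
    """Assemble a request body from the documented fields."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type!r}")
    body = {"type": job_type, "inputs": list(inputs), "outputs": list(outputs)}
    # Optional fields are left out entirely when not provided.
    optional = {"location": location, "description": description, "runId": run_id}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = job_body(
    "BATCH",
    inputs=[dataset_id("my-namespace", "my-dataset")],
    outputs=[],
    description="My first job!",
)
print(json.dumps(body, indent=2))
```

Checking the type against the enum locally surfaces a typo before the request is sent.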

Retrieve a job

Retrieves a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "latestRun": null,
  • "facets": { },
  • "currentVersion": "b1d626a2-6d3a-475e-9ecf-943176d4a8c6"
}

List all jobs

Returns a list of jobs.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return, starting from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "jobs": [
    ],
  • "totalCount": 0
}
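
The `limit` and `offset` parameters, together with `totalCount` in the response, allow a client to page through all jobs. The following is a minimal sketch. `fetch_page` is a hypothetical callable standing in for the HTTP request, and `fake_fetch` exists only so the sketch runs without a server.

```python
from typing import Callable, Dict, Iterator


def iter_jobs(fetch_page: Callable[[int, int], Dict], limit: int = 100) -> Iterator[dict]:
    """Yield every job by advancing offset until totalCount is reached.

    fetch_page(limit, offset) is a hypothetical callable that performs the
    request and returns the decoded JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        jobs = page["jobs"]
        yield from jobs
        offset += len(jobs)
        # Stop on an empty page as well, in case totalCount is stale.
        if not jobs or offset >= page["totalCount"]:
            break


# Stand-in for the HTTP call so the sketch runs without a server.
_ALL = [{"name": f"job-{i}"} for i in range(5)]


def fake_fetch(limit: int, offset: int) -> dict:
    return {"jobs": _ALL[offset:offset + limit], "totalCount": len(_ALL)}


names = [job["name"] for job in iter_jobs(fake_fetch, limit=2)]
print(names)
```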

Retrieve a version for a job

Returns a version for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
version
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the job or dataset version.

+

Responses

Response samples

Content type
application/json
{
  • "id": {
    },
  • "type": "BATCH",
  • "name": "my-job",
  • "version": "56472c57-a2ef-4218-b7b7-d2af02a343fd",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "namespace": "my-namespace",
  • "inputs": [
    ],
  • "outputs": [ ],
  • "context": {
    },
  • "description": "My first job!",
  • "facets": { }
}

List all versions for a job

Returns a list of versions for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+

Responses

Response samples

Content type
application/json
{
  • "versions": [
    ]
}

Create a run Deprecated

Creates a new run object for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
Request Body schema: application/json
id
string <uuid>

An optional user-provided unique ID of the run. A run ID must be a UUID. If an ID for the run is not provided, a random UUID will be generated for the given run.

+
nominalStartTime
string <date-time>

An ISO-8601 timestamp representing the nominal start time of the run.

+
nominalEndTime
string <date-time>

An ISO-8601 timestamp representing the nominal end time of the run.

+
args
object

The arguments of the run.

+

Responses

Request samples

Content type
application/json
{
  • "args": {
    }
}

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "COMPLETED",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": "2019-05-09T20:05:46.815920Z",
  • "durationMs": 4250894125,
  • "args": {
    },
  • "context": {
    },
  • "facets": { }
}
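
A run request body combines a UUID with ISO-8601 timestamps. The sketch below shows one way to produce both with the Python standard library. The helper name `run_body` and the example `args` values are hypothetical, and the datetimes are assumed to be timezone-aware UTC values.

```python
import uuid
from datetime import datetime, timedelta, timezone


def run_body(nominal_start, nominal_end, args=None, run_id=None):
    """Build a run request body with a UUID and ISO-8601 timestamps."""
    # uuid.UUID raises ValueError if a supplied ID is not a valid UUID.
    rid = uuid.UUID(str(run_id)) if run_id is not None else uuid.uuid4()
    return {
        "id": str(rid),
        # Assumes UTC-aware datetimes; "+00:00" is shortened to "Z".
        "nominalStartTime": nominal_start.isoformat().replace("+00:00", "Z"),
        "nominalEndTime": nominal_end.isoformat().replace("+00:00", "Z"),
        "args": dict(args or {}),
    }


start = datetime(2019, 5, 9, 19, 0, tzinfo=timezone.utc)
body = run_body(start, start + timedelta(hours=1), args={"email": "me@example.com"})
print(body)
```

Generating the ID on the client means the caller knows the run ID before the request is sent.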

List all runs

Returns a list of runs for a job.

+
path Parameters
namespace
required
string <= 1024 characters
Example: my-namespace

The name of the namespace.

+
job
required
string <= 1024 characters
Example: my-job

The name of the job.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return, starting from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "runs": [
    ]
}

Retrieve a run

Retrieves a run.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Start a run Deprecated

Marks the run as RUNNING.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Complete a run Deprecated

Marks the run as COMPLETED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "COMPLETED",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": "2019-05-09T20:05:46.815920Z",
  • "durationMs": 4250894125,
  • "args": {
    },
  • "context": {
    },
  • "facets": { }
}

Fail a run Deprecated

Marks the run as FAILED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}

Abort a run Deprecated

Marks the run as ABORTED.

+
path Parameters
id
required
string <uuid>
Example: ea9badc5-7cb2-49af-9a9f-155771d3a797

The ID of the run.

+
query Parameters
at
string <date-time>

An ISO-8601 timestamp representing the time when the run transitioned.

+

Responses

Response samples

Content type
application/json
{
  • "id": "870492da-ecfb-4be0-91b9-9a89ddd3db90",
  • "createdAt": "2019-05-09T19:49:24.201361Z",
  • "updatedAt": "2019-05-09T19:49:24.201361Z",
  • "nominalStartTime": null,
  • "nominalEndTime": null,
  • "state": "RUNNING",
  • "startedAt": "2019-05-09T15:17:32.690346",
  • "endedAt": null,
  • "durationMs": null,
  • "args": {
    },
  • "facets": { }
}
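
The four run transitions share the same shape: a run ID in the path and an optional `at` timestamp in the query. The sketch below builds such a request target. The route layout `/api/v1/jobs/runs/{id}/{action}` and the action names are assumptions that do not appear in this section, and `transition_url` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

# Assumed action name -> state the run is marked with, per the operations above.
TRANSITIONS = {
    "start": "RUNNING",
    "complete": "COMPLETED",
    "fail": "FAILED",
    "abort": "ABORTED",
}


def transition_url(run_id: str, action: str, at: Optional[datetime] = None) -> str:
    """Build the (assumed) path and query string for a run state transition."""
    if action not in TRANSITIONS:
        raise ValueError(f"unknown transition: {action!r}")
    path = f"/api/v1/jobs/runs/{run_id}/{action}"  # assumed route layout
    if at is None:
        return path
    # urlencode percent-encodes ":" and "+" in the timestamp.
    return path + "?" + urlencode({"at": at.isoformat()})


when = datetime(2019, 5, 9, 20, 5, 46, tzinfo=timezone.utc)
print(transition_url("870492da-ecfb-4be0-91b9-9a89ddd3db90", "complete", when))
```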

Lineage

Record a single lineage event

Receive, process, and store lineage metadata using the OpenLineage standard.

+
Request Body schema: application/json
any (LineageEvent)

Responses

Request samples

Content type
application/json
{}
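
The request sample above is collapsed, so the sketch below assembles a minimal event for illustration. The field names (`eventType`, `eventTime`, `run.runId`, `job.namespace`, `job.name`, `inputs`, `outputs`, `producer`) are assumed from the OpenLineage run event layout and are not confirmed by this section. The producer URI and the helper name `run_event` are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone


def run_event(event_type, job_namespace, job_name, run_id=None, event_time=None):
    """Assemble a minimal run event (field names assumed from OpenLineage)."""
    event_time = event_time or datetime.now(timezone.utc)
    return {
        "eventType": event_type,
        "eventTime": event_time.isoformat(),
        "run": {"runId": str(run_id or uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [],
        "outputs": [],
        "producer": "https://example.com/my-producer",  # hypothetical producer URI
    }


event = run_event(
    "START",
    "my-namespace",
    "my-job",
    run_id="870492da-ecfb-4be0-91b9-9a89ddd3db90",
    event_time=datetime(2019, 5, 9, 19, 49, 24, tzinfo=timezone.utc),
)
print(json.dumps(event, indent=2))
```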

Get a lineage graph

query Parameters
nodeId
required
string
Example: nodeId=dataset:food_delivery:public.delivery_7_days

The ID of the node. A node is either a dataset node or a job node. For a dataset, the nodeId format is dataset:<namespace_of_dataset>:<name_of_the_dataset>; for a job, it is job:<namespace_of_the_job>:<name_of_the_job>.

+
depth
integer
Default: 20

Depth of lineage graph to create.

+

Responses

Response samples

Content type
application/json
{
  • "graph": [
    ]
}
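
The nodeId format described above can be produced and checked with a small helper. This is a sketch with hypothetical function names. Only the type prefix is parsed back out, because a namespace may itself contain colons, which makes splitting the namespace from the name ambiguous.

```python
def node_id(kind: str, namespace: str, name: str) -> str:
    """Format a nodeId as <kind>:<namespace>:<name>."""
    if kind not in ("dataset", "job"):
        raise ValueError("kind must be 'dataset' or 'job'")
    return f"{kind}:{namespace}:{name}"


def node_kind(value: str) -> str:
    """Return the node type prefix ('dataset' or 'job') of a nodeId."""
    kind, sep, rest = value.partition(":")
    # A valid nodeId needs a known prefix and at least one more ":" separator.
    if kind not in ("dataset", "job") or not sep or ":" not in rest:
        raise ValueError(f"not a valid nodeId: {value!r}")
    return kind


print(node_id("dataset", "food_delivery", "public.delivery_7_days"))
print(node_kind("job:my-namespace:my-job"))
```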

Tags

Create a tag

Creates a new tag object.

+
path Parameters
tag
required
string
Example: SENSITIVE

The name of the tag.

+
Request Body schema: application/json
description
string

The description of the tag.

+

Responses

Request samples

Content type
application/json
{
  • "description": "My first tag!"
}

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

List all tags

Returns a list of tags.

+
query Parameters
limit
integer
Default: 100
Example: limit=25

The number of results to return, starting from offset.

+
offset
integer
Default: 0

The initial position from which to return results.

+

Responses

Response samples

Content type
application/json
{
  • "tags": [
    ]
}

Search

Query all datasets and jobs

Returns the datasets and jobs that match your query.

+
query Parameters
q
required
string
Example: q=my-dataset

A query containing the pattern to match. Pattern matching for datasets and jobs is string-based and case-insensitive. Use a percent sign (%) to match any string of zero or more characters (my-job%), or an underscore (_) to match a single character (_job_).

+
filter
string
Example: filter=dataset

Filters the results of your query by dataset or job.

+
sort
string
Example: sort=name

Sorts the results of your query by name or updated_at.

+
limit
integer
Default: 100
Example: limit=25

The number of results to return, starting from offset.

+

Responses

Response samples

Content type
application/json
{
  • "totalCount": 1,
  • "results": [
    ]
}
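
Two details of the search parameters are easy to get wrong: the `%` wildcard must itself be percent-encoded in the query string, and the wildcards follow SQL-style semantics. The sketch below illustrates both with hypothetical helpers. The local matcher only demonstrates what `%` and `_` mean. It assumes the pattern is matched against the whole name, which this section does not state.

```python
import re
from urllib.parse import urlencode


def search_query(q, filter_by=None, sort=None, limit=None):
    """Encode the documented search parameters as a query string."""
    params = {"q": q, "filter": filter_by, "sort": sort, "limit": limit}
    # urlencode turns the "%" wildcard into "%25" so it survives transport.
    return urlencode({k: v for k, v in params.items() if v is not None})


def like_to_regex(pattern):
    """Translate % and _ wildcards into a case-insensitive regex."""
    parts = []
    for ch in pattern:
        parts.append(".*" if ch == "%" else "." if ch == "_" else re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


print(search_query("my-job%", filter_by="job", limit=25))
print(bool(like_to_regex("my-job%").fullmatch("MY-JOB-nightly")))
```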