Adds changelog notes and migration guide for 0.10.0.
Summary: see title

Test Plan: bk

Reviewers: prha

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D6019
Bob Chen committed Jan 14, 2021
1 parent d5f4588 commit 271f4d7
Showing 2 changed files with 286 additions and 183 deletions.
121 changes: 103 additions & 18 deletions CHANGES.md
@@ -1,27 +1,112 @@
# Changelog

## Upcoming 0.10.0
## 0.10.0

**Bugfixes**
### Major Changes

- Noneable config fields and no longer required, default to `None`.
* A **native scheduler** with support for exactly-once, fault-tolerant, timezone-aware scheduling. A new Dagster daemon process has been added to manage your schedules and sensors with a reconciliation loop, ensuring that all runs are executed exactly once, even if the Dagster daemon experiences occasional failures. See the [Migration Guide](https://github.com/dagster-io/dagster/blob/master/MIGRATION.md) for instructions on moving from `SystemCronScheduler` or `K8sScheduler` to the new scheduler.
* **First-class sensors**, built on the new Dagster daemon, allow you to instigate runs based on changes in external state - for example, files on S3 or assets materialized by other Dagster pipelines. See the [Sensors Overview](https://docs.dagster.io/overview/schedules-sensors/sensors) for more information.
* Dagster now supports **pipeline run queueing**. You can apply instance-level run concurrency limits and prioritization rules by adding the `QueuedRunCoordinator` to your Dagster instance. See the [Run Concurrency Overview](https://docs.dagster.io/overview/pipeline-runs/limiting-run-concurrency) for more information.
* The `IOManager` abstraction provides a new, streamlined primitive for granular control over where and how solid outputs are stored and loaded. This is intended to replace the (deprecated) intermediate/system storage abstractions. See the [IO Manager Overview](https://docs.dagster.io/overview/io-managers/io-managers) for more information.
* A new **Partitions page** in Dagit lets you view your pipeline runs organized by partition. You can also **launch backfills from Dagit** and monitor them from this page.
* A new **Instance Status page** in Dagit lets you monitor the health of your Dagster instance, with repository location information, daemon statuses, instance-level schedule and sensor information, and linkable instance configuration.
* **Resources can now declare their dependencies on other resources** via the `required_resource_keys` parameter on `@resource`.
* Our support for deploying on **Kubernetes** is now mature and battle-tested. Our Helm chart is now easier to configure and deploy, and we’ve made big investments in observability and reliability. You can view Kubernetes interactions in the structured event log and use Dagit to help you understand what’s happening in your deployment. The defaults in the Helm chart will give you graceful degradation and failure recovery right out of the box.
* Experimental support for **dynamic orchestration** with the new `DynamicOutputDefinition` API. Dagster can now map the downstream dependencies over a dynamic output at runtime.
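
The exactly-once guarantee described for the native scheduler can be pictured with a toy reconciliation loop. The sketch below is not Dagster's implementation: `due_ticks`, `reconcile`, and the in-memory `launched_keys` set are hypothetical names, and the set stands in for whatever durable storage a real daemon would use. It only illustrates the idea that each scheduled tick maps to a deterministic key, so rescanning the same time window after a restart does not launch a run twice.

```python
from datetime import datetime, timedelta


def due_ticks(last_checked, now, interval):
    """Yield every scheduled time in the window (last_checked, now]."""
    tick = last_checked + interval
    while tick <= now:
        yield tick
        tick += interval


def reconcile(schedule_name, last_checked, now, interval, launched_keys, launch):
    """Launch one run per due tick, skipping ticks that were already launched.

    `launched_keys` is a hypothetical stand-in for durable storage; the key
    is derived only from the schedule name and the tick time, so repeating
    this call over the same window is idempotent.
    """
    for tick in due_ticks(last_checked, now, interval):
        run_key = f"{schedule_name}:{tick.isoformat()}"
        if run_key in launched_keys:
            continue  # this tick was handled before the restart
        launch(run_key)
        launched_keys.add(run_key)
    return now


launched = set()
runs = []
start = datetime(2021, 1, 14, 0, 0)
hourly = timedelta(hours=1)

# First pass covers the 01:00, 02:00, and 03:00 ticks.
reconcile("hourly", start, start + 3 * hourly, hourly, launched, runs.append)

# Simulate a daemon restart that rescans from the beginning up to 05:00:
# only the 04:00 and 05:00 ticks produce new runs.
reconcile("hourly", start, start + 5 * hourly, hourly, launched, runs.append)

print(len(runs))  # 5
```

A real scheduler also has to make the launch and the bookkeeping atomic; this sketch ignores that and only shows why deterministic keys make rescans safe.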

**Breaking Changes**
### Breaking Changes

**Dropping Python 2 support**

* We’ve dropped support for Python 2.7, based on community usage and enthusiasm for Python 3-native public APIs.

**Removal of deprecated APIs**

These APIs were marked for deprecation with warnings in the 0.9.0 release, and have been removed in the 0.10.0 release.

* The decorator `input_hydration_config` has been removed. Use the `dagster_type_loader` decorator instead.
* The decorator `output_materialization_config` has been removed. Use `dagster_type_materializer` instead.
* The system storage subsystem has been removed. This includes `SystemStorageDefinition`, `@system_storage`, and `default_system_storage_defs`. Use the new `IOManager` API instead. See the [IO Manager Overview](https://docs.dagster.io/overview/io-managers/io-managers) for more information.
* The `config_field` argument on decorators and definitions classes has been removed and replaced with `config_schema`. This is a drop-in rename.
* The argument `step_keys_to_execute` to the functions `reexecute_pipeline` and `reexecute_pipeline_iterator` has been removed. Use the `step_selection` argument to select subsets for execution instead.
* Repositories can no longer be loaded using the legacy `repository` key in your `workspace.yaml`; use `load_from` instead. See the [Workspaces Overview](https://docs.dagster.io/overview/repositories-workspaces/workspaces) for documentation about how to define a workspace.

**Breaking API Changes**

* `SolidExecutionResult.compute_output_event_dict` has been renamed to `SolidExecutionResult.compute_output_events_dict`. A solid execution result is returned from methods such as `result_for_solid`. Any call sites will need to be updated.
* The `.compute` suffix is no longer applied to step keys. Step keys that were previously named `step_key.compute` will now be named `step_key`. If you are using any API method that takes a `step_selection` argument, you will need to update the step keys accordingly.
* The `pipeline_def` property has been removed from the `InitResourceContext` passed to functions decorated with `@resource`.
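
If you keep step selections in your own scripts or config, the `.compute` rename can be applied mechanically. The following is a minimal sketch, not part of Dagster: `strip_compute_suffix` is a hypothetical helper, and the step names in the example are made up for illustration.

```python
COMPUTE_SUFFIX = ".compute"


def strip_compute_suffix(step_keys):
    """Return the step keys with a trailing '.compute' removed.

    Keys that do not end in '.compute' are passed through unchanged.
    """
    return [
        key[: -len(COMPUTE_SUFFIX)] if key.endswith(COMPUTE_SUFFIX) else key
        for key in step_keys
    ]


# Hypothetical pre-0.10.0 step keys.
old_selection = ["load_data.compute", "train_model.compute", "report"]
new_selection = strip_compute_suffix(old_selection)
print(new_selection)  # ['load_data', 'train_model', 'report']
```

The resulting list is what you would then pass as the `step_selection` argument.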

**Helm Chart**

* The schema for the `scheduler` values in the helm chart has changed. Instead of a simple toggle on/off, we now require an explicit `scheduler.type` to specify usage of the `DagsterDaemonScheduler`, `K8sScheduler`, or otherwise. If your specified `scheduler.type` has required config, these fields must be specified under `scheduler.config`.
* `snake_case` fields have been changed to `camelCase`. Please update your `values.yaml` as follows:
  * `pipeline_run` → `pipelineRun`
  * `dagster_home` → `dagsterHome`
  * `env_secrets` → `envSecrets`
  * `env_config_maps` → `envConfigMaps`
* The Helm values `celery` and `k8sRunLauncher` have now been consolidated under the Helm value `runLauncher` for simplicity. Use the field `runLauncher.type` to specify usage of the `K8sRunLauncher`, `CeleryK8sRunLauncher`, or otherwise. By default, the `K8sRunLauncher` is enabled.
* All Celery message brokers (i.e. RabbitMQ and Redis) are disabled by default. If you are using the `CeleryK8sRunLauncher`, you should explicitly enable your message broker of choice.
* `userDeployments` are now enabled by default.
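
For a large `values.yaml`, the four key renames listed above could be applied with a small script once the file has been parsed into Python data. This is a sketch under assumptions: `rename_helm_keys` is a hypothetical helper, the nesting in `old_values` is invented for illustration and may not match where these keys sit in your chart values, and the other Helm changes (such as `scheduler.type` and `runLauncher`) still need to be handled by hand.

```python
# The four renames listed in the changelog.
RENAMED_KEYS = {
    "pipeline_run": "pipelineRun",
    "dagster_home": "dagsterHome",
    "env_secrets": "envSecrets",
    "env_config_maps": "envConfigMaps",
}


def rename_helm_keys(node):
    """Recursively rename the known snake_case keys; leave all values as-is."""
    if isinstance(node, dict):
        return {RENAMED_KEYS.get(key, key): rename_helm_keys(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_helm_keys(item) for item in node]
    return node


# Hypothetical parsed values; the structure is illustrative only.
old_values = {
    "dagster_home": "/opt/dagster/dagster_home",
    "pipeline_run": {
        "env_secrets": [{"name": "my-secret"}],
        "env_config_maps": [],
    },
}
new_values = rename_helm_keys(old_values)
print(sorted(new_values))  # ['dagsterHome', 'pipelineRun']
```

Only dictionary keys are rewritten, so a string value that happens to contain `pipeline_run` is left untouched.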

### Core

* Event log messages streamed to `stdout` and `stderr` have been streamlined to be a single line per event.
* Experimental support for memoization and versioning lets you execute pipelines incrementally, selecting which solids need to be rerun based on runtime criteria and versioning their outputs with configurable identifiers that capture their upstream dependencies.
  * Use the `version` parameter of Dagster entities to tag your steps with a specific version, e.g. `@solid(version=...)`.
  * Add the `IS_MEMOIZED_RUN` tag with value `"true"` in order to run a pipeline with memoization enabled.
  * View which steps will be re-executed using the [`list_versions` CLI](https://docs.dagster.io/_apidocs/cli#dagster-pipeline-list-versions).
* Schedules that are executed using the new `DagsterDaemonScheduler` can now execute in any timezone by adding an `execution_timezone` parameter to the schedule. Daylight Saving Time transitions are also supported. See the [Schedules Overview](https://docs.dagster.io/overview/schedules-sensors/schedules#timezones) for more information and examples.

### Dagit

* Countdown and refresh buttons have been added for pages with regular polling queries (e.g. Runs, Schedules).
* Confirmation and progress dialogs are now presented when performing run terminations and deletions. Additionally, hanging/orphaned runs can now be forced to terminate, by selecting "Force termination immediately" in the run termination dialog.
* The Runs page now shows counts for "Queued" and "In progress" tabs, and individual run pages show timing, tags, and configuration metadata.
* The backfill experience has been improved: you can now view progress and terminate an entire backfill from the partition set page. Additionally, errors related to backfills are surfaced more clearly.
* Shortcut hints are no longer displayed when attempting to use the screen capture command.
* The asset page has been revamped to include a table of events and enable organizing events by partition. Asset key escaping issues in other views have been fixed as well.
* Miscellaneous bug fixes, frontend performance tweaks, and other improvements are also included.

### Kubernetes/Helm

* The [Dagster Kubernetes documentation](https://docs.dagster.io/deploying/k8s) has been refreshed.

**Helm**

* We've added schema validation to our Helm chart. You can now check that your values YAML file is correct by running:

```bash
helm lint helm/dagster -f helm/dagster/values.yaml
```

* Added support for resource annotations throughout our Helm chart.
* Added Helm deployment of the dagster daemon & daemon scheduler.
* Added Helm support for configuring a compute log manager in your dagster instance.
* User code deployments now include a user `ConfigMap` by default.
* Changed the default liveness probe for Dagit to use `httpGet "/dagit_info"` instead of `tcpSocket:80`.

**Dagster-K8s [Kubernetes]**

* Added support for user code deployments on Kubernetes.
* Added support for tagging pipeline executions.
* Fixes to support version 12.0.0 of the Python Kubernetes client.
* Improved implementation of Kubernetes+Dagster retries.
* Many logging improvements to surface debugging information and failures in the structured event log.

**Dagster-Celery-K8s**

* Improved interrupt/termination handling in Celery workers.

### Integrations & Libraries

* Added a new `dagster-docker` library with a `DockerRunLauncher` that launches each run in its own Docker container. See our [Deploying with Docker docs](https://docs.dagster.io/examples/deploy_docker) for an example.
* Added support for AWS Athena. (Thanks @jmsanders!)
* Added mocks for AWS S3, Athena, and Cloudwatch in tests. (Thanks @jmsanders!)
* Allow setting of S3 endpoint through env variables. (Thanks @marksteve!)
* Various bug fixes and new features for the Azure, Databricks, and Dask integrations.
* Added a `create_databricks_job_solid` for creating solids that launch Databricks jobs.

- Eliminated public field `config_field` on definition classes.
- Eliminated `input_hydration_config` and `output_materialization_config` top-level includes.
- Eliminated `input_hydration_config` and `output_materialization_config` arguments on type constructors.
- Changed `ExecuteStepArgs` parameters. Replaced `execute_step_with_structured_logs` CLI command
with `execute_step`, and `execute_run_with_structured_logs` with `execute_run`.
- The `.compute` suffix is no longer applied to all step keys.
- `SolidExecutionResult.compute_output_event_dict` changed to `SolidExecutionResult.compute_output_events_dict`
- Eliminated `pipeline_def` property from `InitResourceContext`.
- Removed the argument `step_keys_to_execute` to the `reexecute_pipeline` and `reexecute_pipeline_iterator`.
- [Helm/K8s]
- The schema for the `scheduler` values in the helm chart has been rewritten.
- `celery` and `k8sRunLauncher` have now been consolidated under `runLauncher`.
- The `K8sRunLauncher` is now enabled by default.
- The `userDeployments` is now enabled by default.

## 0.9.22.post0
