fix(ingest/datahub): Support postgres; build(postgres): Modernize postgres docker setup #8762
…tgres docker compose
@@ -1,25 +1,7 @@
DATAHUB_UPGRADE_HISTORY_KAFKA_CONSUMER_GROUP_ID=generic-duhe-consumer-job-client-gms
This file is now meant to be used in conjunction with docker.env, rather than instead of it
@@ -66,8 +66,6 @@ services:
dockerfile: docker/datahub-upgrade/Dockerfile
env_file: datahub-upgrade/env/docker-without-neo4j.env
depends_on:
mysql-setup:
We only mention mysql in the `.override` files. This `depends_on` gets added by `docker-compose-without-neo4j.override.yml`.
@@ -10,10 +10,10 @@ WORKDIR /go/src/github.com/jwilder/dockerize
RUN go install github.com/jwilder/dockerize@$DOCKERIZE_VERSION

FROM alpine:3
COPY --from=binary /go/bin/dockerize /usr/local/bin
Doesn't change anything, just making this look more like the mysql-setup Dockerfile
@@ -1,2 +1,3 @@
POSTGRES_USER: datahub
POSTGRES_PASSWORD: datahub
PGUSER: datahub
For the healthcheck; this might be needed to make `datahub` the default psql user
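To illustrate why `PGUSER` can matter here: libpq-based clients such as psql fall back to the `PGUSER` environment variable when no user is given explicitly, and to the operating system user name when `PGUSER` is unset. The sketch below is a simplified model of that lookup, not libpq's actual code; the function name `effective_pg_user` and its parameters are made up for illustration.

```python
from typing import Mapping, Optional


def effective_pg_user(
    env: Mapping[str, str],
    os_user: str,
    explicit: Optional[str] = None,
) -> str:
    """Simplified model of how a libpq client picks its default user.

    Hypothetical helper for illustration only:
    1. an explicitly passed user (e.g. `psql -U name`) wins,
    2. otherwise the PGUSER environment variable is used,
    3. otherwise the OS user running the client is used.
    """
    if explicit is not None:
        return explicit
    if "PGUSER" in env:
        return env["PGUSER"]
    return os_user


# Without PGUSER, a healthcheck run as root would try to connect as "root".
print(effective_pg_user({}, os_user="root"))
# With PGUSER set in the env file, the same command connects as "datahub".
print(effective_pg_user({"PGUSER": "datahub"}, os_user="root"))
```

Whether the healthcheck in this PR actually relies on that fallback depends on how its command is written, which is not shown in this thread.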
# Ensures stable order, chronological per (urn, aspect)
# Version 0 last, only when createdon is the same. Otherwise relies on createdon order
Postgres comments use a different syntax lol
just for quoting right?
Postgres comments use `--` or `/* */`, while mysql supports those plus `#`. I moved the comments completely out of the query string so that we don't have to worry about it
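A minimal sketch of that approach, assuming a made-up table named `aspects` and using SQLite from the standard library as a stand-in database (the real query and schema are not shown in this thread): the notes about ordering live in Python comments, so the SQL text carries no dialect-specific comment syntax. The `ORDER BY` clause is one possible way to express "version 0 last, only when createdon is the same", not necessarily the one used in the PR.

```python
import sqlite3

# Ensures stable order, chronological per (urn, aspect).
# Version 0 goes last only when createdon is the same;
# otherwise the createdon order decides.
# These notes are Python comments, so the SQL string below contains
# no `#`, `--`, or `/* */` comment markers at all.
ASPECT_QUERY = """
SELECT urn, aspect, version, createdon
FROM aspects
ORDER BY createdon, urn, aspect,
    CASE WHEN version = 0 THEN 1 ELSE 0 END,
    version
"""


def demo_versions_in_order() -> list:
    """Run the query on a tiny in-memory table and return the versions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aspects "
        "(urn TEXT, aspect TEXT, version INTEGER, createdon TEXT)"
    )
    conn.executemany(
        "INSERT INTO aspects VALUES (?, ?, ?, ?)",
        [
            ("urn:a", "status", 0, "2023-01-02"),
            ("urn:a", "status", 1, "2023-01-02"),
            ("urn:a", "status", 2, "2023-01-01"),
        ],
    )
    rows = conn.execute(ASPECT_QUERY).fetchall()
    conn.close()
    return [row[2] for row in rows]


print(demo_versions_in_order())  # [2, 1, 0]
```

Version 2 comes first because it has the earliest `createdon`; versions 1 and 0 share a timestamp, so version 0 is pushed to the end.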
these file names are getting out of hand lol
Yeahh, this one is perhaps overkill but I wanted to make it clear it takes the place of an "override" file. Ideally, I think we want something like:
docker-compose.yaml
docker-compose.with-neo4j.yaml
docker-compose.postgres.yaml
docker-compose.mysql.yaml
docker-compose.mysql-m1.yaml
and you can just compose them for which combination you want. I didn't want to deal with that in this PR though, so went with clarity over conciseness.
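As a rough sketch of what "compose them" could look like: `docker compose` accepts multiple `-f` flags and merges the files in order, with later files overriding or extending earlier ones. The helper `compose_command` below is hypothetical, and the file names are the proposed ones from the list above, which do not exist in the repo yet.

```python
from typing import List


def compose_command(files: List[str], action: str = "up") -> List[str]:
    """Build a `docker compose` argv that layers the given files in order.

    Hypothetical helper for illustration; it only builds the argument list
    and does not run anything.
    """
    cmd = ["docker", "compose"]
    for path in files:
        # Each file is passed with its own -f flag; order matters because
        # later files are merged on top of earlier ones.
        cmd += ["-f", path]
    cmd.append(action)
    return cmd


# Example: base file plus the (proposed) postgres variant.
print(compose_command(["docker-compose.yaml", "docker-compose.postgres.yaml"]))
```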
I really dislike all these compose files as well. This is the same problem that developed with Kubernetes manifest files. The solution there is to use templating to render the output files from some configuration files, e.g. Helm. I've also seen jsonnet used to programmatically render templates into YAML/JSON/etc. Out of scope, but I'd like to see some thoughts on a method to specify options (perhaps similar to Helm) so that the docker compose file is rendered as a single output for that configuration.
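A very small sketch of the "options in, single compose file out" idea, written in plain Python rather than Helm or jsonnet. Everything here is an assumption for illustration: the function `render_compose`, the option names, and the service layout are made up and far simpler than the real compose files.

```python
import json
from typing import Any, Dict


def render_compose(database: str = "mysql", neo4j: bool = False) -> Dict[str, Any]:
    """Render one compose-like document from a couple of options.

    Hypothetical sketch: service names and structure are illustrative only.
    """
    if database not in ("mysql", "postgres"):
        raise ValueError(f"unsupported database: {database}")

    setup_service = f"{database}-setup"
    services: Dict[str, Any] = {
        database: {},
        setup_service: {"depends_on": [database]},
        "datahub-gms": {"depends_on": [setup_service]},
    }
    if neo4j:
        # Optional graph backend, only included when requested.
        services["neo4j"] = {}
        services["datahub-gms"]["depends_on"].append("neo4j")
    return {"services": services}


# Print the rendered structure as JSON for inspection.
print(json.dumps(render_compose(database="postgres"), indent=2))
```

The point of the design is that each combination (mysql vs. postgres, with vs. without neo4j) becomes an argument rather than a separate hand-maintained file.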
environment:
- DATAHUB_SERVER_TYPE=${DATAHUB_SERVER_TYPE:-quickstart}
- DATAHUB_TELEMETRY_ENABLED=${DATAHUB_TELEMETRY_ENABLED:-true}
depends_on:
some of these depends on and env rules can go in the base docker-compose file right?
Probably, but I didn't want to make any unnecessary changes, especially here where idk what these environment variables are doing / how to test if they work. This is just from copying what's in docker-compose-without-neo4j.override.yaml since I want the postgres version to function like the mysql version, as much as possible
* tag 'v0.11.0': (188 commits)
  fix(spark-test): upgrade gradle and fix spark smoke test (datahub-project#8777)
  fix(gms): Fixed Recently Viewed section for users with '@' in the URN. (datahub-project#8754)
  feat: add feedback widget (datahub-project#8732)
  fix(custom-search): fix custom search to be able to use unquoted query (datahub-project#8805)
  docs(db-retention): update with default setting (datahub-project#8797)
  feat(openapi): entity endpoints & analytics raw (datahub-project#8537)
  feat(search): Also de-duplicate the field queries based on field names (datahub-project#8788)
  fix(ingest): drop `wrap_aspect_as_workunit` method (datahub-project#8766)
  feat(ingest): drop sql_metadata parser (datahub-project#8765)
  docs: minor fix on versioning navbar and dropdown (datahub-project#8790)
  chore(ingest): upgrade sqlglot fork (datahub-project#8775)
  docs: add datahub source to integrations page (datahub-project#8787)
  fix(ingest/bigquery): fix partition and median queries for profiling (datahub-project#8778)
  fix(ingest/tableau): fix tableau native CLL for snowflake, add type annotations (datahub-project#8779)
  refactor(ingest): Add support for group-owners in dataflow entities (datahub-project#8154)
  feat(systemMetadata): Adding a lastRunId field system metadata (datahub-project#8672)
  feat(airflow-plugin): add package type information (datahub-project#8795)
  fix(ingest/datahub): Support postgres; build(postgres): Modernize postgres docker setup (datahub-project#8762)
  docs(session): add documentation for session token duration and fix default (datahub-project#8791)
  chore(analytics): bump version (datahub-project#8786)
  ...
Ran postgres locally via:
I tried to model `docker-compose.postgres.override.yml` after `docker-compose-without-neo4j.override.yml`. Ideally we'd have compose files for with vs. without neo4j, mysql vs. postgres, m1 vs. not, etc. Not trying to do that right now, so for now you can only run postgres without neo4j... but I think that's ok. I also moved it to the main directory, to make it more clear it's meant to replace one of the "override" files, and deleted it from the `postgres/` directory since I'm not sure what that file is doing otherwise.