It stopped updating on Mar 20th. #14

Closed
alexey-milovidov opened this issue Apr 20, 2024 · 4 comments

@alexey-milovidov
Member

Release assets are now split across multiple files, e.g. v2024.04.16-planes-readsb-staging-0.tar.aa, and the import scripts need to support them.
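For context, a split asset such as `.tar.aa`, `.tar.ab`, ... is normally just a byte-wise slice of one archive, so concatenating the parts in order should give back the original tar. The following is a minimal sketch of that idea, assuming the parts were produced by something equivalent to `split`; the archive name and sample files are made up for the demonstration and are not real release assets.

```shell
#!/bin/bash
# Demonstrates that a tar archive split into .tar.aa, .tar.ab, ... parts
# can be restored by concatenating the parts in sorted order.
set -eu

WORK=$(mktemp -d)
trap 'rm -rf "$WORK"' EXIT
NAME="demo-archive"   # hypothetical name standing in for a real release asset

# Build a small archive with some sample content.
mkdir -p "$WORK/src/traces"
for i in 1 2 3; do
    head -c 20000 /dev/urandom > "$WORK/src/traces/$i.json"
done
tar -C "$WORK/src" -cf "$WORK/$NAME.tar" traces

# Split it into 16 KiB parts named $NAME.tar.aa, $NAME.tar.ab, ...
# and move the unsplit archive aside so only the parts keep the .tar prefix.
(cd "$WORK" && split -b 16384 "$NAME.tar" "$NAME.tar." && mv "$NAME.tar" original.tar)

# Reassemble: the glob expands in sorted order, so the parts line up.
cat "$WORK/$NAME.tar."* > "$WORK/$NAME.all.tar"

# Check that the result is byte-identical to the original and extracts cleanly.
if cmp -s "$WORK/original.tar" "$WORK/$NAME.all.tar"; then
    RESULT="identical"
else
    RESULT="different"
fi
mkdir "$WORK/out"
tar -C "$WORK/out" -xf "$WORK/$NAME.all.tar"

PARTS=$(ls "$WORK/$NAME.tar."* | wc -l)
echo "parts=$PARTS result=$RESULT"
```

The output name `$NAME.all.tar` is chosen so that it does not match the `$NAME.tar.*` glob, which keeps the reassembled file from being picked up as one of its own parts.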

@alexey-milovidov
Member Author

$ cat adsblol.sh 
#!/bin/bash

source config

mkdir -p adsblol
pushd adsblol

mkdir lock || exit
trap 'rmdir lock' EXIT

# Download the release files and upload them to S3, then process them, remove the local copies, and advance the last date.

NEXT=$(clickhouse-local --query "SELECT '$(cat last)'::Date + 1")
YEAR=$(clickhouse-local --query "SELECT toYear('$NEXT'::Date)")
DATE_FORMATTED=$(clickhouse-local --query "SELECT formatDateTime('$NEXT'::Date, '%Y.%m.%d')")
PATCH=0
NAME="v${DATE_FORMATTED}-planes-readsb-prod-${PATCH}"

export CLICKHOUSE_PLANES_HOST
export CLICKHOUSE_PLANES_USER
export CLICKHOUSE_PLANES_PASSWORD

export TABLE=default.planes_adsblol_loading

for SUFFIX in '' .a{a..z}
do
    URL="https://github.com/adsblol/globe_history_${YEAR}/releases/download/${NAME}/${NAME}.tar${SUFFIX}"
    wget --no-verbose --continue "$URL" && aws s3 cp --no-progress "${NAME}.tar${SUFFIX}" "s3://clickhouse-public-datasets/adsblol/original/${NAME}.tar${SUFFIX}"
done

cat "${NAME}.tar"* > "${NAME}.all.tar" &&
mkdir -p "$NAME" && (cd "$NAME" && tar xf "../${NAME}.all.tar") && rm "${NAME}.all.tar" "${NAME}.tar"* &&
clickhouse-client ${CLICKHOUSE_PLANES_PARAMS} --query "CREATE OR REPLACE TABLE ${TABLE} AS planes_mercator" &&
find *readsb*/traces -name '*.json' | xargs -P 100 -L1 ../adsblol-process-file.sh &&
clickhouse-client ${CLICKHOUSE_PLANES_PARAMS} --query "INSERT INTO planes_mercator SELECT * FROM ${TABLE}" &&
rm -rf *readsb* && mv last prev && echo ${NEXT} > last
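To see which asset names the loop above will try for a given day, the naming logic can be reproduced on its own. This is only a sketch: it swaps `clickhouse-local` for GNU `date` (an assumption about the environment), and `candidate_assets` is a hypothetical helper that is not part of the original script. It keeps the `prod` name pattern and the default patch number `0` from the script.

```shell
#!/bin/bash
# Sketch: list the asset file names the import would try for the day after
# a given "last" date. Uses GNU date instead of clickhouse-local (assumption).

# Hypothetical helper, not part of the original script.
# Usage: candidate_assets LAST_DATE [PATCH]
candidate_assets() {
    local last="$1" patch="${2:-0}"
    local formatted name suffix
    # GNU date: add one day and format as YYYY.MM.DD (UTC to avoid DST effects).
    formatted=$(date -u -d "${last} + 1 day" '+%Y.%m.%d')
    name="v${formatted}-planes-readsb-prod-${patch}"
    # '' covers a single-file release; .aa through .az cover split releases.
    for suffix in '' .a{a..z}; do
        echo "${name}.tar${suffix}"
    done
}

# Show the first few candidates for the day after 2024-04-15.
candidate_assets 2024-04-15 | head -n 3
```

Most of these names will not exist for any particular release, which is why the script simply lets `wget` fail for missing parts and then concatenates whatever was downloaded.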

@alexey-milovidov
Member Author

It should catch up in a few hours...

@alexey-milovidov
Member Author

It finished loading, and I also reloaded the problematic day 2024-03-22 manually.
