This document outlines the steps necessary to finish initializing this CDP Instance.
Install the command line tools that will help shorten the setup process:
- Install gcloud
- Install gsutil
- Install firebase-tools
- Install just
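Before continuing, it can help to confirm the tools are actually on your `PATH`. The following is a small sketch (not part of the official setup); it only checks that each executable can be found, not that it is correctly configured. Note that the `firebase-tools` package installs a binary named `firebase`:

```python
import shutil

def missing_tools(names):
    """Return the subset of command line tools that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# The four tools this guide relies on.
required = ["gcloud", "gsutil", "firebase", "just"]
missing = missing_tools(required)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools are installed.")
```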
There are additional tasks required after generating this repository.
1. Create the GitHub repository for this deployment to live in.

   Create a new repository with the following parameters:
   - Set the repo name to: `long-beach`
   - Set the repo owner to: `CouncilDataProject`
   - Set the repo visibility to: "Public"
   - Do not initialize with any of the extra options.
   - Click "Create repository".
2. Install `cdp-backend`.

   This step should be run while within the `SETUP` directory (`cd SETUP`).

   ```bash
   pip install ../python/
   ```
3. Get the infrastructure files.

   This step should be run while within the `SETUP` directory (`cd SETUP`).

   ```bash
   get_cdp_infrastructure_stack .
   ```
4. Log in to Google Cloud.

   This step should be run while within the `SETUP` directory (`cd SETUP`). Run:

   ```bash
   just login
   ```
5. Initialize the basic project infrastructure.

   This step should be run while within the `SETUP` directory (`cd SETUP`). Run:

   ```bash
   just init cdp-long-beach-49323fe9
   ```

   This step will also generate a Google Service Account JSON file and store it in a directory called `.keys` in the root of this repository.
6. Set or update the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of the key that was just generated.

   ```bash
   export GOOGLE_APPLICATION_CREDENTIALS="INSERT/PATH/HERE"
   ```
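To sanity-check that the variable points at a readable service account key, a quick check like the one below can be run. This is only a sketch: the temp-file setup simulates a generated key for demonstration, and in practice you would point `check_credentials` at your real `.keys/*.json` file instead:

```python
import json
import os
import tempfile

def check_credentials(path):
    """Return True if path exists and contains service-account-style JSON."""
    if not path or not os.path.isfile(path):
        return False
    with open(path) as f:
        try:
            data = json.load(f)
        except json.JSONDecodeError:
            return False
    # Google service account keys always carry these fields.
    return {"type", "project_id", "private_key"} <= data.keys()

# Simulate a generated key file for demonstration purposes only.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"type": "service_account",
               "project_id": "cdp-long-beach-49323fe9",
               "private_key": "..."}, f)
    key_path = f.name

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
print(check_credentials(os.environ["GOOGLE_APPLICATION_CREDENTIALS"]))  # True
```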
7. Create (or re-use) a Google Cloud billing account and attach it to the newly created project (`cdp-long-beach-49323fe9`).

   For more details on the cost of maintaining a CDP Instance, see our estimated cost breakdown.
8. Generate a Firebase CI token.

   ```bash
   firebase login:ci
   ```

   Save the created token for a following step!
9. Create a GitHub Personal Access Token.

   Create a new (classic) GitHub Personal Access Token by navigating to https://github.com/settings/tokens/new.
   - Click the "Generate new token" dropdown.
   - Select "Generate new token (classic)".
   - Give the token a descriptive name / note. We recommend: `cdp-long-beach-49323fe9`
   - Set the expiration to "No expiration".
     - You can set an expiration if you would like; you will simply have to update this token later.
   - Select the `repo` checkbox to give this token access to the repo.
   - Click the "Generate token" button.

   Save the created token for a following step.

   For more documentation and assistance, see GitHub's documentation.
10. Attach the Google Service Account JSON, the Firebase CI token, and the GitHub Personal Access Token as GitHub repository secrets.

    - Create a new secret:
      - Set the name to: `GOOGLE_CREDENTIALS`
      - Set the value to: the contents of the file `.keys/cdp-long-beach-49323fe9.json`
      - Click "Add secret"
    - Create a new secret:
      - Set the name to: `FIREBASE_TOKEN`
      - Set the value to: the value of the Firebase CI token you created in a prior step.
      - Click "Add secret"
    - Create a new secret:
      - Set the name to: `PERSONAL_ACCESS_TOKEN`
      - Set the value to: the value of the GitHub Personal Access Token you created in a prior step.
      - Click "Add secret"
11. Build the basic project infrastructure.

    This step should be run while within the `SETUP` directory (`cd SETUP`). Run:

    ```bash
    just setup cdp-long-beach-49323fe9 us-central
    ```
12. Initialize Firebase Storage.

    The default settings ("Start in Production Mode" and the default region) for setting up storage are fine.
13. Initialize and push the local repository to GitHub.

    This step should be run while within the base directory of the repository (`cd ..`).

    To initialize the repo locally, run:

    ```bash
    git init
    git add -A
    git commit -m "Initial commit"
    git branch -M main
    ```

    To set up a connection to the GitHub repo, run either:

    ```bash
    git remote add origin https://github.com/CouncilDataProject/long-beach.git
    ```

    Or (with SSH):

    ```bash
    git remote add origin [email protected]:CouncilDataProject/long-beach.git
    ```

    Finally, to push this repo to GitHub, run:

    ```bash
    git push -u origin main
    ```

    Now refresh your repository's dashboard to ensure that all files were pushed.
14. Once the "Web App" GitHub Action successfully completes, configure GitHub Pages.

    Go to your repository's GitHub Pages configuration.
    - Set the source to: "gh-pages"
    - Set the folder to: `/ (root)`
    - Click "Save"
15. Once the "Infrastructure" GitHub Action successfully completes, request a quota increase for `compute.googleapis.com/gpus_all_regions`.

    - Click the checkbox for "GPUs (all regions)".
    - Click the "EDIT QUOTAS" button.
    - In the "New limit" text field, enter a value of: `2`
      - You can request more or fewer than `2` GPUs; however, we have noticed that a request of `2` is generally automatically accepted.
    - In the "Request description" text field, enter a value of: speech-to-text model application and downstream text tasks
    - Click the "NEXT" button.
    - Enter your name and phone number into the contact fields.
    - Click the "SUBMIT REQUEST" button.

    If the above direct link doesn't work, follow the instructions from Google's documentation.

    You will need to wait until the quota increase has been approved before running any event processing. From our experience, the quota is approved within 15 minutes.
If all steps complete successfully, your web application will be viewable at: https://councildataproject.github.io/long-beach
Once your repository, infrastructure, and web application have been set up, you will need to write an event data gathering function.
Navigate to and follow the instructions in the file: `python/cdp_long_beach_backend/scraper.py`.
As soon as you push your updates to your event gather function (`get_events`) to your GitHub repository, everything will be tested and configured for the next pipeline run. Events are gathered by this function every 6 hours from the default branch via a GitHub Actions cron job. If you'd like to manually run event gathering, you can do so from within the Actions tab of your repo -> Event Gather -> Run workflow.
It is expected that the Event Index workflow will fail to start, as your database will not yet be populated with events to index.
There are some optional configurations for the data gathering pipeline which can be added to `python/event-gather-config.json`. No action is needed for a barebones pipeline run, but the optional parameters can be checked in the CDP pipeline config documentation. Note that `google_credentials_file` and `get_events_function_path` should not be modified and will be populated automatically if you have followed the steps above.
Be sure to review the CDP Ingestion Model documentation for the object definition to return from your `get_events` function.
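As a rough illustration of the shape `get_events` returns, the sketch below uses simplified stand-in dataclasses defined locally for demonstration. The real classes (`EventIngestionModel`, `Body`, `Session`) live in `cdp_backend.pipeline.ingestion_models` and carry many more fields, so treat this purely as a shape reference against the Ingestion Model documentation, not as the actual API:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Simplified, hypothetical stand-ins for the real cdp-backend ingestion models.
@dataclass
class Body:
    name: str

@dataclass
class Session:
    session_datetime: datetime
    video_uri: str
    session_index: int = 0

@dataclass
class EventIngestionModel:
    body: Body
    sessions: List[Session]

def get_events(from_dt: Optional[datetime] = None,
               to_dt: Optional[datetime] = None) -> List[EventIngestionModel]:
    """Return one hard-coded event; a real scraper would query the city's
    meeting/agenda system and build one model per event in the date range."""
    return [
        EventIngestionModel(
            body=Body(name="City Council"),
            sessions=[
                Session(
                    session_datetime=datetime(2023, 1, 10, 18, 0),
                    video_uri="https://example.com/city-council-2023-01-10.mp4",
                    session_index=0,
                ),
            ],
        )
    ]
```

Each event must reference the body (committee) that held it and at least one session with a public video URI; the pipeline downloads that video for transcription.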
Once your function is complete and pushed to the `main` branch, feel free to delete this SETUP directory.
For more documentation on adding data to your new CDP instance, and on maintaining or customizing your instance, please see the "admin-docs" directory.