This repository has been archived by the owner on Jun 10, 2020. It is now read-only.
How the federal .gov domain space is doing at best practices and policies.

18F/pulse

The pulse of the federal .gov webspace

How the .gov domain space is doing at best practices and federal requirements.

Setup

Pulse is a Flask app written in Python 3. We recommend pyenv for easy Python version management.

  • Install dependencies:
pip install -r requirements.txt
gem install sass bourbon neat bitters
  • Now you can run the app:
make run
  • If editing styles during development, keep the Sass auto-compiling with:
make watch

Deploying the site

The site can be easily deployed (by someone with credentials to the right server) through Fabric, which requires Python 2.

The Fabric script expects an SSH host alias named pulse, defined in your SSH configuration with the right hostname and key.
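Before deploying, it can help to confirm that the pulse alias exists. The following is a minimal sketch, not part of this repository: `has_ssh_host` is a hypothetical helper, and it only recognizes the common `Host name [name ...]` form of an SSH config entry (it ignores wildcards, `Match` blocks, and `Include` files).

```python
def has_ssh_host(config_text, alias):
    """Return True if any `Host` line in the SSH config text lists `alias` exactly."""
    for line in config_text.splitlines():
        parts = line.strip().split()
        # A Host line may list several names, e.g. "Host pulse pulse-staging".
        if len(parts) >= 2 and parts[0].lower() == "host" and alias in parts[1:]:
            return True
    return False


# Sample config text; the hostname and key path here are placeholders.
sample_config = """
Host pulse
    HostName example.gov
    IdentityFile ~/.ssh/example_key
"""
print("pulse host defined:", has_ssh_host(sample_config, "pulse"))
```

To check a real configuration, pass the contents of your own SSH config file (commonly `~/.ssh/config`) instead of the sample string.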

To deploy to staging, switch to a Python 2 virtualenv with fabric installed, and run:

make staging

This will cd into deploy/ and run fab deploy.

To deploy to production, switch to a Python 2 virtualenv with fabric installed, and run:

make production

This will run the fabric command to deploy to production.

Updating the data in Pulse

Updating Pulse is a multi-step process that combines data published by government offices with data scanned from the public internet.

Step 1: Get official data

The official .gov domain list is published quarterly in this directory. Download the federal CSV for the most recent date. This will be referred to below as domains.csv.

Step 2: Scan domains

Use domain-scan to scan the .gov domain list, using the DAP list as a reference.

  • Download and set up domain-scan from GitHub. For now, this requires site-inspector 1.0.2 (not 2.0) and ssllabs-scan.

  • Tell domain-scan to run the inspect, tls, and analytics scanners over the list of .gov domains, referencing the DAP participation list. Use --force to tell it to ignore any disk cache and to tell SSL Labs to ignore its server-side cache. Use --sort to sort the resulting CSV so that domains are in a consistent order.

The command for this might look like:

./scan domains.csv --scan=inspect,tls,analytics --analytics=https://analytics.usa.gov/data/live/second-level-domains.csv --output=domain-report --debug --force --sort

This will output a CSV report for each scanner to domain-report/results/.
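If the scan is run from a script rather than typed by hand, the same command can be assembled as an argument list. This is a sketch only: `build_scan_command` and `DAP_LIST_URL` are hypothetical names, and the flags are copied from the example command above rather than from domain-scan's own documentation.

```python
import shlex

# DAP participation list URL, as used in the example command above.
DAP_LIST_URL = "https://analytics.usa.gov/data/live/second-level-domains.csv"


def build_scan_command(domains_csv, output_dir="domain-report", force=True):
    """Assemble the domain-scan invocation as a list of arguments."""
    command = [
        "./scan",
        domains_csv,
        "--scan=inspect,tls,analytics",
        f"--analytics={DAP_LIST_URL}",
        f"--output={output_dir}",
        "--debug",
    ]
    if force:
        # Ignore the disk cache and ask SSL Labs to ignore its server-side cache.
        command.append("--force")
    # Keep domains in a consistent order in the resulting CSVs.
    command.append("--sort")
    return command


command = build_scan_command("domains.csv")
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` from inside the domain-scan checkout, which avoids shell quoting problems if the CSV path contains spaces.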

Step 3: Update Pulse

Move the report CSVs into this repo, run a script to update Pulse's data, and mark the new date(s) in _config.yml.

  • Copy inspect.csv, tls.csv, analytics.csv and meta.json into the data/ directory of this repository.

  • Update _config.yml to reflect the latest dates:

data:
  domains: 2015-03-15
  dap: 2015-05-29
  scan: 2015-06-07

domains: The date the .gov domain list was generated by the .gov registry.

dap: The date the DAP participation list was generated by the Digital Analytics Program.

scan: The date domain-scan was run to create inspect.csv and tls.csv.

  • Update Pulse's data from the data/ directory:
./update

This will use the scanned data to create the high-level conclusions Pulse displays to users and makes available for download.

  • Review the changes, rebuild the site, and if all looks good, commit them to source control.
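A quick sanity check before committing can catch a missing report file or a mistyped date. The sketch below is not part of this repository: `missing_report_files` and `check_dates` are hypothetical helpers, the dates are passed in as ISO strings, and the rule that the scan date should not precede the domains or dap dates is an assumption about the workflow rather than something this README states.

```python
from datetime import date
from pathlib import Path

# The files Step 3 copies into data/.
EXPECTED_FILES = ("inspect.csv", "tls.csv", "analytics.csv", "meta.json")


def missing_report_files(data_dir):
    """Return the expected scan outputs that are absent from data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


def check_dates(dates):
    """Parse the three dates and report a scan date that predates its inputs.

    Assumption: a scan runs on or after the day the domain list and the
    DAP list were generated. Raises ValueError for a malformed date.
    """
    parsed = {key: date.fromisoformat(value) for key, value in dates.items()}
    problems = []
    for source in ("domains", "dap"):
        if parsed["scan"] < parsed[source]:
            problems.append(
                f"scan date {parsed['scan']} is earlier than "
                f"{source} date {parsed[source]}"
            )
    return problems


# The example dates from the _config.yml snippet above.
config_dates = {"domains": "2015-03-15", "dap": "2015-05-29", "scan": "2015-06-07"}
print("missing files:", missing_report_files("data"))
print("date problems:", check_dates(config_dates))
```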

Ideas for later versions

This project is an initial pass; there is much more information that can be represented in dashboards to great effect. Below are some further ideas for future work on this project. Feel free to add your ideas here, too.

  • For the DAP Dashboard
    • Number of pages from a domain reporting into DAP
    • Number or list of subdomains from a domain reporting into DAP
    • Test the deeper config options that the DAP snippet should be employing, such as IP anonymization, Event tracking, Demographics turned off, and ?????. (Possibly using headless browser)
  • Does the site require “www”? Does it require not using “www”?
  • Load time (server-side)
  • Mobile friendliness (poss. using Google's Mobile Friendly Test)
  • Mixed content detection (linking to insecure resources)
  • Use of third party services
  • 508 compliance (poss. with https://pa11y.org/)
  • Any other items listed in the OMB letter to OGP passing along .gov domain issuance
  • Lighter or fun things - like how many domains start with each letter of the alphabet, what the last 10 that came out were, etc.
  • 2FA or Connect.gov ? - Not sure how it would work but note Section 3's requirement in this EO
  • Anything from/with itdashboard.gov
  • open source
  • Look at what Ben tracked
  • IPv6
  • DNSSEC
  • What else can we get from Verisign?
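As an illustration of how small some of these ideas are, the "domains by first letter" tally could be sketched as follows. `count_by_first_letter` is a hypothetical helper that takes a plain list of domain names; reading those names out of domains.csv is left out because the file's column layout is not described here.

```python
from collections import Counter


def count_by_first_letter(domains):
    """Tally domain names by their first character, case-insensitively."""
    counts = Counter()
    for domain in domains:
        name = domain.strip().lower()
        if name:  # skip blank entries
            counts[name[0]] += 1
    return counts


# Illustrative input only; a real run would use the full .gov domain list.
sample = ["usa.gov", "USDA.gov", "nasa.gov"]
for letter, total in sorted(count_by_first_letter(sample).items()):
    print(letter, total)
```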

Public domain

This project is in the worldwide public domain. As stated in CONTRIBUTING:

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.

All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.
