Merge pull request #2 from mrcnc/develop
improving data import process
mrcnc committed Oct 29, 2016
2 parents 272e2ce + a916183 commit d40397c
Showing 4 changed files with 84 additions and 99 deletions.
73 changes: 33 additions & 40 deletions README.md
@@ -3,59 +3,52 @@
We want to build something better than the default Socrata 311 site:
https://311explorer.nola.gov/main/category/

Keep useful features and enhance the user experience:
Here are some vague user stories explaining the main features:

As a user,
I want to lookup info about my request (by entering a reference # received from 311).
I want to visualize ticket types with bar charts (counts) and pie graphs (percentage).
I want to visualize the data on a map around me and filter by ticket type, open/closed, date range.
I want to browse curated datasets before exploring the data myself (maybe showing less data that's
As a citizen,
* I want to lookup info about my 311 request (by entering a reference # received from 311 or searching my previous history).
* I want to visualize ticket types with bar charts (counts) and pie graphs (percentage).
* I want to visualize the data on a map around me and filter and sort by ticket type, open/closed, date range.
* I want to browse curated datasets before exploring the data myself (maybe showing less data that's
more recent data will be useful; maybe by sharing my location, I can see more relevant data on a map
zoomed to my address).
* I want to see open requests near me.
* I want to submit issues that integrate with the City's system (the city 311 system can notify the user).
* I want the ability to choose the amount of information to share about myself (email required to submit ticket?)

Other nice to have features:
* Commenting and upvoting on issues nearby me
* Get notified about issues created by others (star/follow)
* See filter of all issues a user has submitted (email required)
* Map feature: Request per district (styled where color gets darker for more requests)
* Frequency: analyze the frequency of 311 incidents (median time, types
that stay open the longest, etc)

As a developer,
I want to store the 311 data in a database so we can query it more efficiently.

## prerequisites
## database setup

We recommend using Homebrew to install components on the Mac.

* Postgres database.
* Postgis.

If you do not already have these components installed, ask someone on the project to help you get it installed on your machine.

## get the data
First you need to install PostgreSQL and PostGIS. Then you can run the
commands below to get the 311 data into your database.

```
# get bulk call data data.nola.gov
wget -O 311-calls.csv 'https://data.nola.gov/api/views/3iz8-nghx/rows.csv?accessType=DOWNLOAD'
```
# create the db
createuser nola311
createdb nola311 -O nola311
## create the db
# create the table and import the data from the csv
psql -U postgres -d nola311 -f schema_and_csv_import.sql
```
brew install postgres ## if not already installed
brew install postgis ## if not already installed
createuser three11
createdb three11 -O three11
psql -d three11 -c "create extension postgis;"
# sanitize the table
psql -U postgres -d nola311 -f sanitize.sql
```

## load data into db

```
# ogr2ogr is a useful tool for working with geospatial data
brew install gdal --with-postgres ## if not already installed
ogr2ogr -f PostgreSQL PG:"host='localhost' dbname='three11' user='three11'" 311-calls.csv -nln calls
# add location column
psql -U three11 -c "ALTER TABLE calls ADD COLUMN the_geom geometry(POINT, 4326);"
psql -U three11 -c "UPDATE calls SET the_geom = ST_PointFromText(geom, 4326) WHERE geom != '';"
```
### some sample queries

## useful data to know about
```sql
-- what are the top issues that people call about?
select issue_type, count(*) as num_calls from nola311.calls group by issue_type order by num_calls desc;

Show on map: Request per district (legend gets darker for more requests)
Frequency: analyze the frequency of 311 incidents
-- which council district has the most calls?
select council_district, count(*) as num_calls from nola311.calls group by council_district order by num_calls desc;
```
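
The feature list above mentions median time to close and open requests near the user. A minimal sketch of both, run against the `nola311.calls` table that `sanitize.sql` builds, might look like the following. The `'Open'` status value, the sample coordinates, and the 500 m radius are assumptions for illustration, not values confirmed from the dataset; `percentile_cont` needs PostgreSQL 9.4 or later.

```sql
-- which issue types take the longest to close?
-- (median time between ticket creation and closure, closed tickets only)
select issue_type,
       count(*) as num_closed,
       percentile_cont(0.5) within group (
           order by ticket_closed_date_time - ticket_created_date_time
       ) as median_time_to_close
from nola311.calls
where ticket_closed_date_time is not null
group by issue_type
order by median_time_to_close desc;

-- which open requests are within 500 meters of a given point?
-- ASSUMPTION: 'Open' is how the data labels unresolved tickets
-- ASSUMPTION: (-90.0715, 29.9511) is a placeholder longitude/latitude
select ticket_id, issue_type, street_address
from nola311.calls
where ticket_status = 'Open'
  and geom is not null
  and st_dwithin(
        geom::geography,
        st_setsrid(st_makepoint(-90.0715, 29.9511), 4326)::geography,
        500  -- distance in meters, because both sides are cast to geography
      );
```

Casting to `geography` makes `st_dwithin` measure in meters; comparing the raw SRID 4326 geometries would measure in degrees instead.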
59 changes: 0 additions & 59 deletions location_to_geom.py

This file was deleted.

26 changes: 26 additions & 0 deletions sanitize.sql
@@ -0,0 +1,26 @@
alter role nola311 set search_path to nola311, public;
create extension if not exists postgis;

create table nola311.calls as (
select id,
ticket_id,
issue_type,
to_timestamp(ticket_created_date_time,'MM/DD/YYYY HH12:MI:SS AM') as ticket_created_date_time,
to_timestamp(ticket_closed_date_time,'MM/DD/YYYY HH12:MI:SS AM') as ticket_closed_date_time,
ticket_status,
issue_description,
street_address,
neighborhood_district,
council_district,
city,
state,
zip_code,
location,
st_pointfromtext('POINT(' || longitude || ' ' || latitude || ')', 4326) as geom
from nola311.calls_tmp
);

comment on table nola311.calls is 'This dataset represents calls to the City of New Orleans'' 311 Call Center';

grant all on schema nola311 to nola311;
grant all on all tables in schema nola311 to nola311;
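
After `sanitize.sql` has run, a few follow-up statements can confirm that the conversion worked and prepare the table for map queries. This is a sketch, not part of the commit; the index name `calls_geom_idx` is hypothetical, and the `filter` clause needs PostgreSQL 9.4 or later.

```sql
-- spatial index so that geometry filters on geom do not scan the whole table
-- ASSUMPTION: calls_geom_idx is an illustrative name
create index calls_geom_idx on nola311.calls using gist (geom);

-- how many rows have no point (latitude or longitude was empty in the CSV)?
select count(*) filter (where geom is null) as missing_geom,
       count(*) as total_rows
from nola311.calls;

-- do the parsed timestamps cover a plausible date range?
select min(ticket_created_date_time) as earliest_ticket,
       max(ticket_created_date_time) as latest_ticket
from nola311.calls;
```

Rows with a null `latitude` or `longitude` produce a null `geom`, because concatenating null into the `POINT(...)` text yields null.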
25 changes: 25 additions & 0 deletions schema_and_csv_import.sql
@@ -0,0 +1,25 @@
create schema if not exists nola311;

create table if not exists nola311.calls_tmp (
id serial primary key,
ticket_id numeric,
issue_type text,
ticket_created_date_time text,
ticket_closed_date_time text,
ticket_status text,
issue_description text,
street_address text,
neighborhood_district text,
council_district text,
city text,
state text,
zip_code numeric,
location text,
geom text,
latitude numeric,
longitude numeric
);

copy nola311.calls_tmp (ticket_id,issue_type,ticket_created_date_time,ticket_closed_date_time,ticket_status,issue_description,street_address,neighborhood_district,council_district,city,state,zip_code,location,geom,latitude,longitude)
from program 'wget -q -O - "$@" "https://data.nola.gov/api/views/3iz8-nghx/rows.csv?accessType=DOWNLOAD"'
with csv header NULL as '';
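
`copy ... from program` runs `wget` on the database server and requires superuser-level privileges, which is why the README connects as `postgres`. Where that is not available, a client-side load with psql's `\copy` is one possible alternative. The sketch below assumes the CSV was already downloaded to `311-calls.csv` in the current directory (the file name used by the earlier README); note that `\copy` must stay on a single line.

```sql
-- run inside psql; \copy reads the file on the client, so no server-side program is needed
-- ASSUMPTION: 311-calls.csv exists in the directory psql was started from
\copy nola311.calls_tmp (ticket_id,issue_type,ticket_created_date_time,ticket_closed_date_time,ticket_status,issue_description,street_address,neighborhood_district,council_district,city,state,zip_code,location,geom,latitude,longitude) from '311-calls.csv' with csv header null as ''

-- quick check that rows actually arrived
select count(*) as rows_loaded from nola311.calls_tmp;
```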
