Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge trees to database #48

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 67 additions & 0 deletions migrations/2023-01-31-migration.treedata.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
/*
Insert current trees into new tree table
*/
insert into public.tree (
id_reference,
geom,
common,
scientific,
genus,
lng,
lat,
date_planted,
address,
city,
state,
zip,
country,
neighborhood,
health,
dbh_raw,
height,
owner,
url,
urlimage,
status,
notes,
email,
volunteer,
side_type,
location_tree_count,
legal_status,
irrigation,
water_freq,
sourceid
)
SELECT id_reference,
geom,
common,
scientific,
genus,
lng,
lat,
date_planted,
address,
city,
state,
zip,
country,
neighborhood,
health,
dbh as dbh_raw,
height,
owner,
url,
urlimage,
status,
notes,
email,
volunteer,
side_type,
location_tree_count,
legal_status,
irrigation,
water_freq,
source_id as sourceid
FROM public.treedata
;
114 changes: 114 additions & 0 deletions stored_procedures/public.merge_treedata.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@

create or replace procedure public.merge_treedata (
c_sourceid character varying(255)
)
language plpgsql
as $$
begin

drop table if exists _tree;

create temporary table _tree as
SELECT t.id as id_tree,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id and id_tree are different. id is the id that we create with the tree-id repo, id_tree is the db serial

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id_tree for the purposes of this stored procedure just let me know we are using the id that connects to the tree table, as opposed to the staging table. The difference is mainly used below for determining missing / matching data.

ts.id as id_tree_staging,
ts.ref as id_reference,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id_reference is what we were using and then we moved it to ref because of the open data standards. Unfortunately it's special in react so ref kind of is confusing as an open data standard, imo.

ts.geom,
ts.common,
ts.scientific,
ts.genus,
ts.lng,
ts.lat,
ts.planted as date_planted,
ts.address,
ts.city,
-- state,
-- zip,
ts.country,
-- neighborhood,
ts.health,
case when ts.dbh ~ '^[0-9\.]+$' then ts.dbh::double precision else null end as dbh,
ts.dbh as dbh_raw,
ts.height,
ts.owner,
-- status,
ts.note as notes,
ts.sourceid
FROM public.tree_staging ts
FULL OUTER JOIN public.tree t on (t.id_reference = ts.ref and ts.sourceid = t.sourceid)
WHERE 1=1
AND ts.sourceid = c_sourceid;

insert into public.tree (
id_reference,
geom,
common,
scientific,
genus,
lng,
lat,
date_planted,
address,
city,
country,
health,
dbh,
dbh_raw,
height,
owner,
notes,
-- email,
sourceid
)
select id_reference,
geom,
common,
scientific,
genus,
lng,
lat,
date_planted,
address,
city,
country,
health,
dbh,
dbh_raw,
height,
owner,
notes,
-- email,
sourceid
FROM _tree
WHERE 1=1
AND _tree.id_tree is null;

-- update anything that is in both and has fields changed
update tree
SET dbh_raw = t.dbh_raw,
dbh = case when t.dbh_raw ~ '^[0-9\.]+$' then t.dbh_raw::double precision else null end
FROM (
select t.id,
ts.dbh_raw
from _tree as ts
JOIN tree as t on ts.id_tree = t.id
where 1=1
AND coalesce(ts.dbh_raw, '') != coalesce(t.dbh_raw, '')
) t
where 1=1
and t.id = tree.id;

-- set the removed field for any that are not in staging
-- but in the regular table
/*
What do we want to do about trees that were
deleted from the source?
*/


-- DELETE
-- FROM public.tree_staging
-- WHERE 1=1
-- AND sourceid = c_sourceid;

end; $$
;
86 changes: 86 additions & 0 deletions tables/public.tree.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
-- drop table public.tree
CREATE TABLE public.tree (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

treedata_staging

id bigserial NOT NULL primary key,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id_treedata_staging

Copy link
Contributor Author

@tzinckgraf tzinckgraf Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an id based on one we create

id_reference character varying(255),
geom geometry(Point,4326),
-- who character varying(255),
common character varying(255),
scientific character varying(255),
genus character varying(255),
lng double precision,
lat double precision,
date_planted timestamp without time zone,
address character varying(255),
city character varying(255),
state character varying(255),
zip character varying(255),
country character varying(255),
neighborhood character varying(255),
health character varying(255),
dbh double precision,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dbh_min and dbh_max

dbh_raw character varying(255),
height character varying(255),
owner character varying(255),
url character varying(255),
urlimage character varying(255),
status character varying(255),
notes text,
email character varying(255),
volunteer character varying(255),
-- planting_opt1 character varying(255),
-- planting_opt1_com character varying(255),
-- planting_opt2 character varying(255),
-- planting_opt3 character varying(255),
side_type character varying(50),
location_tree_count character varying(10),
legal_status character varying,
irrigation boolean,
water_freq integer DEFAULT 14,
-- watering_end_date timestamp without time zone,
-- last_watered timestamp without time zone,
sourceid character varying(255),
created timestamp without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL,
modified timestamp without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL
);


ALTER TABLE public.tree OWNER TO trees;


-- ALTER TABLE public.treedata ALTER COLUMN id_tree ADD GENERATED ALWAYS AS IDENTITY (
-- SEQUENCE NAME public.treedata_id_tree_seq
-- START WITH 3000
-- INCREMENT BY 1
-- NO MINVALUE
-- NO MAXVALUE
-- CACHE 1
-- );



-- CREATE UNIQUE INDEX id_geohash ON public.treedata USING btree (id);

CREATE INDEX tree_geom_idx ON public.tree USING gist (geom);

CREATE INDEX tree_sourceid on public.tree(sourceid)

-- CREATE INDEX treedata_id_idx ON public.treedata USING btree (id);

-- CREATE INDEX treedata_id_tree_idx ON public.treedata USING btree (id_tree);



-- CREATE TRIGGER geom_update AFTER INSERT ON public.treedata FOR EACH ROW EXECUTE FUNCTION public.add_geom();



-- ALTER TABLE ONLY public.treedata
-- ADD CONSTRAINT treedata_planted_by_email_fkey FOREIGN KEY (email) REFERENCES public.users(email);



-- ALTER TABLE ONLY public.treedata
-- ADD CONSTRAINT treedata_planted_by_nickname_fkey FOREIGN KEY (volunteer) REFERENCES public.users(nickname);



38 changes: 38 additions & 0 deletions tables/public.tree_staging.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
SELECT 1;

-- drop table public.tree_staging;
CREATE UNLOGGED TABLE public.tree_staging (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tree_sources_staging

id bigserial not null primary key,
geom geometry(Point,4326),
ref character varying (255),
dbh character varying (255),
scientific character varying(255),
common character varying(255),
health character varying(255),
sourceid character varying(255),
city character varying(255),
country character varying(255),
email character varying(255),
download character varying(255),
info character varying(255),
lat double precision,
lng double precision,
count integer,
address character varying(255),
family character varying(255),
variety character varying(255),
species character varying(255),
genus character varying(255),
owner character varying(255),
planted timestamp without time zone,
note character varying(255),
height character varying(255),
crown character varying(255),
spread character varying(255),
ule character varying(255),
updated timestamp without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL,
created timestamp without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL
);


ALTER TABLE public.tree_staging OWNER TO trees;