Geocoding 2.0 #304

Merged on Jan 29, 2021 (79 commits)
Commits
173d618
Put parents into regions dataframe and geodataframe
IKupriyanov-HORIS Sep 11, 2020
c8f8d1b
Implementing parents in regions_builder.
IKupriyanov-HORIS Sep 14, 2020
1fccca9
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Sep 14, 2020
6fef609
WIP: parents
IKupriyanov-HORIS Sep 15, 2020
9b5ae98
Fix tests, sync protocol with server
IKupriyanov-HORIS Sep 15, 2020
38f1db7
Fix more tests
IKupriyanov-HORIS Sep 16, 2020
54b4457
Fix more tests
IKupriyanov-HORIS Sep 25, 2020
299b6aa
New server IP in demo
IKupriyanov-HORIS Sep 25, 2020
701f78c
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Oct 5, 2020
a37a3f5
WIP: answers in protocol
IKupriyanov-HORIS Oct 14, 2020
f6ca8a3
WIP: moving to Answers instead of flat list of GeocodedFeature
IKupriyanov-HORIS Oct 14, 2020
52e7c1f
WIP: link between request and response
IKupriyanov-HORIS Oct 15, 2020
be18cc5
Getting rid of a query string from Answer and GeocodedFeature
IKupriyanov-HORIS Oct 17, 2020
e8219a8
Remove query from Answer and GeocodedFeature
IKupriyanov-HORIS Oct 19, 2020
326550b
Fix assertion condition
IKupriyanov-HORIS Oct 20, 2020
f6fc7c8
Remove chunked requests, better docs for new API
IKupriyanov-HORIS Oct 20, 2020
cbc1745
WIP: scope
IKupriyanov-HORIS Oct 21, 2020
02aa0ba
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Oct 21, 2020
cca501d
Update example with counties
IKupriyanov-HORIS Oct 22, 2020
91b027a
Update example with counties
IKupriyanov-HORIS Oct 22, 2020
21ec386
Basic support for scope in the new geocoding API
IKupriyanov-HORIS Oct 22, 2020
6f99bbe
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Oct 23, 2020
60968ca
Default scope value to None
IKupriyanov-HORIS Oct 26, 2020
f44d979
Unit tests for new geocoding API
IKupriyanov-HORIS Oct 28, 2020
9ed17b0
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Oct 28, 2020
539ef98
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Oct 28, 2020
4345821
Request validation, tests
IKupriyanov-HORIS Oct 28, 2020
648eef7
Experiment with map and nbviewer
IKupriyanov-HORIS Oct 29, 2020
d492167
Experiment with map and nbviewer and stamen tiles
IKupriyanov-HORIS Oct 29, 2020
bc846fa
Better parents/scope support, more tests
IKupriyanov-HORIS Oct 30, 2020
dcfd17b
where(..., near=..., within=...)
IKupriyanov-HORIS Nov 2, 2020
41e0ce1
Better error handling
IKupriyanov-HORIS Nov 3, 2020
26804b4
Tests for not available features in new API
IKupriyanov-HORIS Nov 4, 2020
fbd7c50
Better error handling for where function (missing key, scope and othe…
IKupriyanov-HORIS Nov 5, 2020
c444036
map_join with multikeys
IKupriyanov-HORIS Nov 14, 2020
0e213d9
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Nov 14, 2020
f7f5df8
Single key data_join_on and map_join_on.
IKupriyanov-HORIS Nov 15, 2020
a31ca74
Multi key data_join_on and map_join_on.
IKupriyanov-HORIS Nov 16, 2020
1f69608
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Nov 23, 2020
08f250f
map_join with dups, but livemap fails with null values in data
IKupriyanov-HORIS Nov 24, 2020
71060c2
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Nov 25, 2020
6c5f57b
Handle null values in pie/bar
IKupriyanov-HORIS Nov 26, 2020
cec1bdc
LP-62
IKupriyanov-HORIS Nov 27, 2020
c5375fb
Better error message on using scope with parents
IKupriyanov-HORIS Dec 4, 2020
5a2569d
Better error message for invalid scope type, countries request suppor…
IKupriyanov-HORIS Dec 7, 2020
4926246
More tests for us-48
IKupriyanov-HORIS Dec 8, 2020
c7c72ba
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Dec 8, 2020
dc5a21f
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Dec 14, 2020
b0ba74f
WIP: changing public API
IKupriyanov-HORIS Dec 15, 2020
536ed25
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Dec 18, 2020
fe750f7
Support Geocoder as geom_xxx(map=...) parameter
IKupriyanov-HORIS Dec 21, 2020
c317a16
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Dec 21, 2020
7531be1
Code cleanup, more tests
IKupriyanov-HORIS Dec 22, 2020
425fef3
Copy not matched dups from map on map_join (fix for last cell in geop…
IKupriyanov-HORIS Dec 23, 2020
ed5f0ea
Use single entry parent for all names
IKupriyanov-HORIS Dec 23, 2020
371525a
Fix empty result for select all kind of request
IKupriyanov-HORIS Dec 24, 2020
481b90d
where -> scope, near -> closest_to, single entry scope, remove auto-c…
IKupriyanov-HORIS Jan 11, 2021
4f3e114
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 11, 2021
5002f32
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 11, 2021
d316cbe
Remove MapRegion from 'scope' type error message
IKupriyanov-HORIS Jan 13, 2021
bfe2d0b
scope can now be used with county and state. scope and country will c…
IKupriyanov-HORIS Jan 14, 2021
f1a3014
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 18, 2021
e747a7e
Handle any Iterable types in parents functions (counties/states/count…
IKupriyanov-HORIS Jan 20, 2021
3915dba
Enable gzip for geocoding responses
IKupriyanov-HORIS Jan 21, 2021
d8b1fac
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 21, 2021
93c7aaa
Update docs
IKupriyanov-HORIS Jan 21, 2021
f1e98cd
Update docs
IKupriyanov-HORIS Jan 21, 2021
4b4a9ba
Keep original query name for ambiguous result with allow_ambiguous flag
IKupriyanov-HORIS Jan 22, 2021
ffaaab8
Fix error message
IKupriyanov-HORIS Jan 22, 2021
50156ed
Remove method to_data_frame and interface CanToDataFrame
IKupriyanov-HORIS Jan 25, 2021
8131764
WIP: update geocoding.md
IKupriyanov-HORIS Jan 26, 2021
1917b39
WIP: update geocoding.md
IKupriyanov-HORIS Jan 26, 2021
a908666
Revert changes to builder.ipynb
IKupriyanov-HORIS Jan 26, 2021
ff331eb
WIP: update geocoding.md
IKupriyanov-HORIS Jan 26, 2021
b6fa9b4
Use geocoding level instead of 'request' for a column name, remove da…
IKupriyanov-HORIS Jan 28, 2021
851fa40
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 28, 2021
83002eb
Support Geocoder in ggplot(data=...)
IKupriyanov-HORIS Jan 29, 2021
99bd3c3
Switch to the new geocoding server
IKupriyanov-HORIS Jan 29, 2021
a203c3b
Merge branch 'master' into geoservices-1.1.0
IKupriyanov-HORIS Jan 29, 2021
WIP: parents
IKupriyanov-HORIS committed Sep 15, 2020
commit 6fef6091600c69b62577a09b6047af5cc5b8f32b
6 changes: 3 additions & 3 deletions python-package/lets_plot/geo_data/gis/json_request.py
Expand Up @@ -101,9 +101,9 @@ def _format_region_queries(region_queires: List[RegionQuery]) -> List[Dict]:
result.append(
FluentDict()
.put(Field.region_query_names, [] if query.request is None else [query.request])
.put(Field.region_query_countries, query.country)
.put(Field.region_query_states, query.state)
.put(Field.region_query_counties, query.county)
.put(Field.region_query_countries, RequestFormatter._format_map_region(query.country))
.put(Field.region_query_states, RequestFormatter._format_map_region(query.state))
.put(Field.region_query_counties, RequestFormatter._format_map_region(query.county))
.put(Field.ambiguity_resolver, None if query.ambiguity_resolver is None else FluentDict()
.put(Field.ambiguity_ignoring_strategy, query.ambiguity_resolver.ignoring_strategy)
.put(Field.ambiguity_box, RequestFormatter._format_box(query.ambiguity_resolver.box))
Expand Down
65 changes: 50 additions & 15 deletions python-package/lets_plot/geo_data/gis/request.py
Expand Up @@ -45,42 +45,77 @@ class LevelKind(enum.Enum):
class MapRegionKind(enum.Enum):
id = True
name = False
place = 'place'


class MapRegion:
'''
Represents three different entities:
scope - ids of already geocoded objects. The only kind of MapRegion allowed to store multiple objects
place - already geocoded single place. In addition to id it holds administrative level and requested name.
Used mostly as parent object for geocoding other objects.
with_name - single name, not yet geocoded.
'''
@staticmethod
def with_single_id(parent_ids: List[str]):
assert_list_type(parent_ids, str)
assert len(parent_ids) == 1, 'Single id MapRegion expected. Actual number of ids: ' + len(parent_ids)
return MapRegion(MapRegionKind.id, parent_ids, ids_limit)
def request_or_none(place: Optional['MapRegion']):
if place is None:
return None

assert place.kind == MapRegionKind.place, 'Only place MapRegion contains request'
return place._request


@staticmethod
def place(id: str, request: str, level_kind: LevelKind):
assert_type(id, str)
assert_type(request, str)
assert_type(level_kind, LevelKind)
return MapRegion(MapRegionKind.place, [id], request, level_kind)

def with_ids(parent_ids: List[str]):
@staticmethod
def scope(parent_ids: List[str]):
assert_list_type(parent_ids, str)
return MapRegion(MapRegionKind.id, parent_ids, ids_limit)
return MapRegion(MapRegionKind.id, parent_ids)

@staticmethod
def with_name(name: str):
assert_type(name, str)
return MapRegion(MapRegionKind.name, [name])

def __init__(self, kind: MapRegionKind, values: List[str]):
def __init__(self, kind: MapRegionKind, values: List[str], request: Optional[str] = None, level_kind: Optional[LevelKind] = None):
assert_type(kind, MapRegionKind)
assert_list_type(values, str)
assert_optional_type(request, str)
assert_optional_type(level_kind, LevelKind)

self.kind: MapRegionKind = kind
self.values: Tuple[str] = tuple(values, )
self._ids_limit: Optional[int] = ids_limit
self._request:Optional[str] = request
self._level_kind: Optional[LevelKind] = level_kind
self._hash = hash((self.values, self.kind))

def request(self) -> Optional[str]:
assert self.kind == MapRegionKind.place, 'Invalid MapRegion kind: only place contains request'
return self._request

def level_kind(self) -> Optional[LevelKind]:
assert self.kind == MapRegionKind.place, 'Invalid MapRegion kind: only place contains level_kind'
return self._level_kind

def __eq__(self, other: 'MapRegion'):
return isinstance(other, MapRegion) \
and self.kind == other.kind \
and self.values == other.values
and self.values == other.values \
and self._request == other._request \
and self._level_kind == other._level_kind

def __ne__(self, o: object) -> bool:
return not self == o

def __str__(self):
if self.kind == MapRegionKind.place:
return '{} {} {}'.format(str(self.values), self._request, self._level_kind)

return str(self.values)

def __hash__(self):
Expand Down Expand Up @@ -134,9 +169,9 @@ def __init__(self,
self.request: Optional[str] = request
self.scope: Optional[MapRegion] = scope
self.ambiguity_resolver: AmbiguityResolver = ambiguity_resolver
self.country = country
self.state = state
self.county = county
self.country: Optional[MapRegion] = country
self.state: Optional[MapRegion] = state
self.county: Optional[MapRegion] = county

def __eq__(self, o: object) -> bool:
return isinstance(o, RegionQuery) \
Expand Down Expand Up @@ -284,7 +319,7 @@ def __ne__(self, o: object) -> bool:

class RequestBuilder:
def __init__(self):
self.request_kind: RequestKind = None
self.request_kind: Optional[RequestKind] = None
self.requested_payload: List[PayloadKind] = []
self.resolution: Optional[int] = None
self.ids: List[str] = []
Expand All @@ -294,7 +329,7 @@ def __init__(self):
self.allow_ambiguous: bool = False

# reverse
self.reverse_coordinates: List[GeoPoint] = None
self.reverse_coordinates: Optional[List[GeoPoint]] = None
self.reverse_scope: Optional[MapRegion] = None

def set_reverse_coordinates(self, coordinates: List[GeoPoint]) -> 'RequestBuilder':
Expand Down Expand Up @@ -387,7 +422,7 @@ def build(self) -> Optional[MapRegion]:

class RegionQueryBuilder:
def __init__(self):
self.request: Optional[str] = []
self.request: Optional[str] = None
self.scope: Optional[MapRegion] = None
self.ignoring_strategy: Optional[IgnoringStrategyKind] = None
self.closest_coord: Optional[GeoPoint] = None
Expand Down
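The three roles described in the MapRegion docstring above (scope, place, with_name) can be illustrated with a small standalone sketch. This is not the library's code: `Kind` and `Region` are hypothetical stand-ins that drop the type assertions, hashing and equality of the real class, the level is a plain string instead of `LevelKind`, and the ids are made up.

```python
import enum
from typing import List, Optional


class Kind(enum.Enum):
    # Hypothetical stand-in for MapRegionKind; member names chosen for this sketch.
    SCOPE = 'scope'
    PLACE = 'place'
    NAME = 'name'


class Region:
    """Simplified stand-in for MapRegion (no type assertions, no hashing)."""

    def __init__(self, kind: Kind, values: List[str],
                 request: Optional[str] = None, level: Optional[str] = None):
        self.kind = kind
        self.values = tuple(values)
        self._request = request
        self._level = level

    @staticmethod
    def scope(ids: List[str]) -> 'Region':
        # Only a scope may carry several ids.
        return Region(Kind.SCOPE, ids)

    @staticmethod
    def place(id_: str, request: str, level: str) -> 'Region':
        # A place is one geocoded object plus the name it was requested by.
        return Region(Kind.PLACE, [id_], request, level)

    @staticmethod
    def with_name(name: str) -> 'Region':
        # A bare name that has not been geocoded yet.
        return Region(Kind.NAME, [name])

    @staticmethod
    def request_or_none(region: Optional['Region']) -> Optional[str]:
        # Mirrors the helper in the diff: None passes through, a place yields its request.
        if region is None:
            return None
        assert region.kind == Kind.PLACE, 'Only a place region contains a request'
        return region._request


texas = Region.place('id-1', 'Texas', 'state')  # 'id-1' is a made-up id
both = Region.scope(['id-1', 'id-2'])
unresolved = Region.with_name('Texas')
parent_column_value = Region.request_or_none(texas)
```

In this sketch, `parent_column_value` is the requested name that would end up in a parent column of the resulting data frame, while a missing parent stays `None`.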
65 changes: 27 additions & 38 deletions python-package/lets_plot/geo_data/regions.py
Expand Up @@ -48,42 +48,28 @@ def contains_values(column):
def select_not_empty_name(feature: GeocodedFeature) -> str:
return feature.name if feature.query is None or feature.query == '' else feature.query

def select_parents(queries: List[RegionQuery] = None) -> Dict:
if queries is None:
return {}

data = {}

counties = [query.county for query in queries]
if contains_values(counties):
data[DF_PARENT_COUNTY] = counties

states = [query.state for query in queries]
if contains_values(states):
data[DF_PARENT_STATE] = states

countries = [query.country for query in queries]
if contains_values(countries):
data[DF_PARENT_COUNTRY] = countries

return data


class PlacesDataFrameBuilder:
def __init__(self):
self._request: List[str] = []
self._found_name: List[str] = []
self._county: List[str] = []
self._state: List[str] = []
self._country: List[str] = []
self._county: List[Optional[str]] = []
self._state: List[Optional[str]] = []
self._country: List[Optional[str]] = []

def append_row(self, request: str, found_name: str, parents: Dict, parent_row: int):
def append_row(self, request: str, found_name: str, queries: Optional[List[RegionQuery]], parent_row: int):
self._request.append(request)
self._found_name.append(found_name)

self._county.append(parents[DF_PARENT_COUNTY][parent_row] if DF_PARENT_COUNTY in parents else None)
self._state.append(parents[DF_PARENT_STATE][parent_row] if DF_PARENT_STATE in parents else None)
self._country.append(parents[DF_PARENT_COUNTRY][parent_row] if DF_PARENT_COUNTRY in parents else None)
if queries is None or len(queries) == 0:
self._county.append(None)
self._state.append(None)
self._country.append(None)
else:
query: RegionQuery = queries[parent_row]
self._county.append(MapRegion.request_or_none(query.county))
self._state.append(MapRegion.request_or_none(query.state))
self._country.append(MapRegion.request_or_none(query.country))


def build_dict(self):
Expand All @@ -104,12 +90,12 @@ def build_dict(self):


@abstractmethod
def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = None) -> DataFrame:
def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = []) -> DataFrame:
raise ValueError('Not implemented')


class Regions(CanToDataFrame):
def __init__(self, level_kind: LevelKind, features: List[GeocodedFeature], highlights: bool = False, queries: List[RegionQuery] = None):
def __init__(self, level_kind: LevelKind, features: List[GeocodedFeature], highlights: bool = False, queries: List[RegionQuery] = []):
try:
import geopandas
except:
Expand All @@ -126,6 +112,9 @@ def __repr__(self):
def __len__(self):
return len(self._geocoded_features)

def to_map_regions(self):
return [MapRegion.place(feature.id, feature.query, self._level_kind) for feature in self._geocoded_features]

def as_list(self) -> List['Regions']:
return [Regions(self._level_kind, [feature], self._highlights) for feature in self._geocoded_features]

Expand Down Expand Up @@ -257,15 +246,17 @@ def centroids(self):

# implements abstract in CanToDataFrame
def to_data_frame(self) -> DataFrame:
parents = select_parents(self._queries)
places = PlacesDataFrameBuilder()

data = {}
data[DF_ID] = [feature.id for feature in self._geocoded_features]

# for us-48, queries don't count
queries = self._queries if len(self._queries) == len(self._geocoded_features) else None

for i in range(len(self._geocoded_features)):
feature = self._geocoded_features[i]
places.append_row(select_not_empty_name(feature), feature.name, parents, i)
places.append_row(select_not_empty_name(feature), feature.name, queries, i)

data = {**data, **places.build_dict()}

Expand Down Expand Up @@ -416,7 +407,8 @@ def _make_parent_region(place: parent_types) -> Optional[MapRegion]:
return MapRegion.with_name(place)

if isinstance(place, Regions):
return MapRegion.with_single_id(place.unique_ids())
assert len(place.to_map_regions()) == 1, 'Region object used as parent should contain only single record'
return place.to_map_regions()[0]

raise ValueError('Unsupported parent type: ' + str(type(place)))

Expand All @@ -427,7 +419,7 @@ def _to_scope(location: scope_types) -> Optional[Union[List[MapRegion], MapRegio

def _make_region(obj: Union[str, Regions]) -> Optional[MapRegion]:
if isinstance(obj, Regions):
return MapRegion.with_ids(obj.unique_ids())
return MapRegion.scope(obj.unique_ids())

if isinstance(obj, str):
return MapRegion.with_name(obj)
Expand All @@ -440,20 +432,17 @@ def _make_region(obj: Union[str, Regions]) -> Optional[MapRegion]:
return _make_region(location)


def _ensure_is_list(obj: request_types) -> Optional[List[str]]:
def _ensure_is_list(obj) -> Optional[List[str]]:
if obj is None:
return None

if isinstance(obj, list):
return obj

if isinstance(obj, str):
return [obj]

if isinstance(obj, Series):
return obj.tolist()

raise ValueError("Wrong type")
return [obj]


def _coerce_resolution(res: int) -> int:
Expand Down
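Reading the last hunk above, `_ensure_is_list` appears to drop the explicit `str` branch and the final `raise ValueError`, and instead wraps any other object in a one-element list. A dependency-free sketch of that behavior follows, under the assumption that this reading of the hunk is right. `ensure_is_list` is a hypothetical stand-in, and the pandas `Series` branch of the original is left out so the snippet runs without pandas.

```python
from typing import Any, List, Optional


def ensure_is_list(obj: Any) -> Optional[List[Any]]:
    # None means "not provided" and is passed through unchanged.
    if obj is None:
        return None

    # A list is returned as-is (same object, not a copy).
    if isinstance(obj, list):
        return obj

    # The original also converts a pandas Series with tolist(); omitted here.

    # Anything else (a str, or a single parent object) becomes a one-element list.
    return [obj]


names = ensure_is_list('Texas')
many = ensure_is_list(['Texas', 'Nevada'])
missing = ensure_is_list(None)
```

One consequence of the fallback in this sketch is that a tuple is wrapped as a single element rather than expanded.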
10 changes: 5 additions & 5 deletions python-package/lets_plot/geo_data/regions_builder.py
Expand Up @@ -5,8 +5,8 @@
from .gis.request import MapRegion, RegionQuery, RequestBuilder, RequestKind, PayloadKind, AmbiguityResolver, \
IgnoringStrategyKind
from .gis.response import LevelKind, Response, SuccessResponse, GeoRect
from .regions import _to_level_kind, request_types, scope_types, Regions, _raise_exception, \
_ensure_is_list, _to_scope
from .regions import _to_level_kind, request_types, parent_types, scope_types, Regions, _raise_exception, \
_ensure_is_list, _make_parent_region, _to_scope

NAMESAKE_MAX_COUNT = 10

Expand Down Expand Up @@ -86,9 +86,9 @@ def _create_queries(request: request_types, scope: scope_types, ambiguity_resovl
queries = []
for i in range(len(requests)):
name = requests[i] if requests is not None else None
country = countries[i] if countries is not None else None
state = states[i] if states is not None else None
county = counties[i] if counties is not None else None
country = _make_parent_region(countries[i]) if countries is not None else None
state = _make_parent_region(states[i]) if states is not None else None
county = _make_parent_region(counties[i]) if counties is not None else None

query = RegionQuery(request=name, scope=scope, ambiguity_resolver=ambiguity_resovler,
country=country, state=state, county=county)
Expand Down
19 changes: 8 additions & 11 deletions python-package/lets_plot/geo_data/to_geo_data_frame.py
Expand Up @@ -5,7 +5,7 @@
from pandas import DataFrame
from shapely.geometry import box

from lets_plot.geo_data import PlacesDataFrameBuilder, select_not_empty_name, select_parents, DF_REQUEST, DF_FOUND_NAME, abstractmethod
from lets_plot.geo_data import PlacesDataFrameBuilder, select_not_empty_name, DF_REQUEST, DF_FOUND_NAME, abstractmethod
from lets_plot.geo_data.gis.response import GeocodedFeature, GeoRect, Boundary, Multipolygon, Polygon, GeoPoint
from lets_plot.geo_data.gis.request import RegionQuery

Expand Down Expand Up @@ -40,15 +40,14 @@ def __init__(self):
self._lonmax: List[float] = []
self._latmax: List[float] = []

def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = None) -> DataFrame:
def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = []) -> DataFrame:
places = PlacesDataFrameBuilder()

parents = select_parents(queries)
for i in range(len(features)):
feature = features[i]
rects: GeoRect = self._read_rect(feature)
rects: List[GeoRect] = self._read_rect(feature)
for rect in rects:
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, parents=parents, parent_row=i)
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, queries=queries, parent_row=i)
self._lonmin.append(rect.min_lon)
self._latmin.append(rect.min_lat)
self._lonmax.append(rect.max_lon)
Expand Down Expand Up @@ -79,13 +78,12 @@ def __init__(self):
self._lons: List[float] = []
self._lats: List[float] = []

def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = None) -> DataFrame:
def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = []) -> DataFrame:
places = PlacesDataFrameBuilder()

parents = select_parents(queries)
for i in range(len(features)):
feature = features[i]
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, parents=parents, parent_row=i)
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, queries=queries, parent_row=i)
self._lons.append(feature.centroid.lon)
self._lats.append(feature.centroid.lat)

Expand All @@ -97,14 +95,13 @@ class BoundariesGeoDataFrame:
def __init__(self):
super().__init__()

def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = None) -> DataFrame:
def to_data_frame(self, features: List[GeocodedFeature], queries: List[RegionQuery] = []) -> DataFrame:
places = PlacesDataFrameBuilder()

geometry = []
parents = select_parents(queries)
for i in range(len(features)):
feature = features[i]
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, parents=parents, parent_row=i)
places.append_row(request=select_not_empty_name(feature), found_name=feature.name, queries=queries, parent_row=i)
geometry.append(self._geo_parse_geometry(feature.boundary))

return _create_geo_data_frame(places.build_dict(), geometry=geometry)
Expand Down
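Several signatures in this commit change the default from `queries = None` to `queries = []`. The functions shown here only read `queries`, so the shared default looks harmless in this diff. As a general Python note, a default list is created once at definition time and reused across calls. The sketch below uses hypothetical helper names (`collect_shared`, `collect_fresh`) to show what would happen if such a default were ever mutated, and the usual `None`-based alternative.

```python
from typing import List, Optional


def collect_shared(item: str, bucket: List[str] = []) -> List[str]:
    # The default list is created once and reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def collect_fresh(item: str, bucket: Optional[List[str]] = None) -> List[str]:
    # A None default lets each call create its own list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = collect_shared('a')
second = collect_shared('b')   # same list object as `first`
fresh_a = collect_fresh('a')
fresh_b = collect_fresh('b')   # independent of `fresh_a`
```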