Improve plot data handling #54

Merged
merged 77 commits into from
Aug 20, 2021
Commits
e7906b1
Prototype version of PlotData class for more consistent and less erro…
janssenhenning Jun 6, 2021
2d10655
Some fixes and add a first usage example in single_scatterplot
janssenhenning Jun 6, 2021
2246f5e
Add data argument to multiple_scatterplot
janssenhenning Jun 7, 2021
37b09f8
Add plot_data to multi_scatter_plot
janssenhenning Jun 7, 2021
fe7f285
Add plot_data to colormesh_plot
janssenhenning Jun 7, 2021
f551c5f
Add min max function to PlotData to easily determine bounds and finis…
janssenhenning Jun 7, 2021
8fd48ec
Use min/max function in colormesh_plot
janssenhenning Jun 7, 2021
921cc6b
Add plot_data to waterfall_plot and surface_plot
janssenhenning Jun 7, 2021
ee1810c
Use plot_data in multiplot_moved
janssenhenning Jun 7, 2021
e573e38
Add apply method to mutate data for plotting using lambda functions (…
janssenhenning Jun 10, 2021
e2b979d
Use plot_data in dos plots in plot_methods
janssenhenning Jun 10, 2021
6b063a9
Use plot_data in plot_bands in plot_methods
janssenhenning Jun 10, 2021
18f3853
Move plot_fleur_bands and plot_fleur_dos to use the data argument
janssenhenning Jun 10, 2021
a32492b
Fix numpy warning in tests
janssenhenning Jun 10, 2021
247b19c
Fix for ``legend_show_data_labels`` option with new list_of_dicts switch
janssenhenning Jul 12, 2021
ab110c2
Add documentation to `vis.data` module
janssenhenning Jul 16, 2021
e0e0cfd
Add some first unit tests of the data module
janssenhenning Jul 16, 2021
71d3a7f
Add method to shift data entries in PlotData
janssenhenning Jul 17, 2021
95f54cd
Add PlotData to histogram and barchart plot function
janssenhenning Jul 17, 2021
52eaba1
pre-commit fixes and add missing files
janssenhenning Jul 17, 2021
aeef301
Started introduction of PlotData class to bokeh plots
janssenhenning Jul 17, 2021
43a0cb2
Improve normalization of the bokeh json for better regression tests
janssenhenning Jul 18, 2021
e817d9e
More improvement to bokeh_plots. Properly deprecated old signature fo…
janssenhenning Jul 18, 2021
cbed2be
Add tests for old signature in bokeh_plots
janssenhenning Jul 18, 2021
ed76bc9
Bugfixes in normalize_list_or_array
janssenhenning Jul 18, 2021
41e99b6
Fix normalize_list_or_array test
janssenhenning Jul 18, 2021
3affa81
Allow function defaults to be officially defined for multiple datasets
janssenhenning Jul 18, 2021
8a8f978
Improve bokeh regression again, with decoding numpy arrays and introd…
janssenhenning Jul 18, 2021
0698468
Convert bokeh_line to use PlotData
janssenhenning Jul 18, 2021
4a1383b
pre-commit fixes
janssenhenning Jul 18, 2021
8f9396f
Some cleanup
janssenhenning Jul 18, 2021
0a4c24b
Change default dpi to 100 to avoid problems with using plt.show
janssenhenning Jul 18, 2021
f256957
Add basic tests for fleur DOS/bandstructure plots with bokeh
janssenhenning Jul 18, 2021
d288c73
Convert bokeh_dos/spinpol_dos functions
janssenhenning Jul 18, 2021
d39899a
Convert plot_fleur_dos to use new signature of bokeh_dos
janssenhenning Jul 18, 2021
6aa80fb
Move bokeh_bands and bokeh_dos to PlotData
janssenhenning Jul 18, 2021
8057f7a
Move plot_fleur_bands to adjust to new signature in bokeh_routines
janssenhenning Jul 18, 2021
1d106b2
Activate plot_fleur_bands_characterize plot test for bokeh
janssenhenning Jul 18, 2021
8e81f6e
Forgot to adjust test
janssenhenning Jul 18, 2021
1d105e9
Reencode numpy arrays in bokeh tests to keep files smaller
janssenhenning Jul 19, 2021
5b7b492
Add data_keys used in copy_data automatically if needed
janssenhenning Jul 19, 2021
44bbfa9
Turn on more interactive tools in bokeh plots by default.
janssenhenning Jul 20, 2021
dbd8ff2
Add switch to disable usage of formatted strings in tooltips. Needed …
janssenhenning Jul 20, 2021
e4f5b1c
Add option to matplotlib plotter to remove duplicate legend labels
janssenhenning Jul 20, 2021
7b19d17
Add band_index to dataframe for future improvements of bandstructure …
janssenhenning Jul 20, 2021
bdae74d
Also encode integer data in bokeh tests to make files smaller for bet…
janssenhenning Jul 20, 2021
add2ff1
Add function to expand the number of plot parameters to a multiple of…
janssenhenning Jul 20, 2021
64bab30
Add functions to either sort the data or group the data by data_keys
janssenhenning Jul 20, 2021
4738b02
Add option to plot bandstructure with lines and updated docstrings
janssenhenning Jul 20, 2021
ae49bb2
Add functions for loading/saving defaults for matplotlib/bokeh modules
janssenhenning Jul 21, 2021
a6f2e35
Add sections to the Developers Guide/User Guide about the PlotData class
janssenhenning Jul 21, 2021
2c22791
pre-commit fix
janssenhenning Jul 21, 2021
7233843
Fixes after rebase
janssenhenning Jul 21, 2021
4d8542e
Bugfix for setting function defaults on added parameters
janssenhenning Jul 21, 2021
e1ba7e1
Add argument to copy the data argument in the beginning of the plot f…
janssenhenning Jul 21, 2021
1a7bd06
Add tests for separate_bands with providing parameters for individual…
janssenhenning Jul 21, 2021
ad28ab5
Update bokeh tests
janssenhenning Jul 21, 2021
188bc5d
Add a Wrapper for ColumnDataSources to be able to use it without spec…
janssenhenning Jul 22, 2021
c40ff09
Move max and min method to use np.nanmax/min to handle nan values
janssenhenning Jul 22, 2021
226ffd0
First test for PlotData
janssenhenning Jul 23, 2021
8ad5074
Make sure that PlotData does not need bokeh installed to function
janssenhenning Jul 23, 2021
34b5458
Some fixes for using ColumnDataSources in PlotData
janssenhenning Jul 23, 2021
1430fb6
Add tests for min function
janssenhenning Jul 23, 2021
73f4269
Allow the mask argument in min/max to be a function to be parametrize…
janssenhenning Jul 23, 2021
c20f656
Suppress warnings for numpy.bool in masks. The datatype is deprecated…
janssenhenning Aug 3, 2021
24b2ac6
Rework parametrization of PlotData tests to allow for list data sources
janssenhenning Aug 3, 2021
8bde4a8
Add tests for max function of PlotData
janssenhenning Aug 3, 2021
9c62c97
Add __contains__ method to ColumnDataSourceWrapper to be able to use …
janssenhenning Aug 3, 2021
9309e0a
Add test for get_keys and get_values
janssenhenning Aug 3, 2021
de984b4
Fix copy_data for force=True for pandas DataFrames
janssenhenning Aug 3, 2021
4329535
Fix copy_data argument to PlotData for ColumnDataSource.
janssenhenning Aug 3, 2021
a65c859
Add tests for shift_data
janssenhenning Aug 3, 2021
4ff4e3a
Add tests for distinct_datasets and apply function of PlotData
janssenhenning Aug 3, 2021
4cfe504
Add tests for get_function_result and copy_data
janssenhenning Aug 3, 2021
e483c1e
Add test of default iteration behaviour
janssenhenning Aug 3, 2021
84354df
reraise exception in dict_of_list_to_list_of_dicts
janssenhenning Aug 3, 2021
d722bc9
Implement saving of bokeh plots
janssenhenning Aug 13, 2021
Allow the mask argument in min/max to be a function to be parametrized by data from the PlotData instance
janssenhenning committed Aug 19, 2021
commit 73f42694d3d6dc9f4f40e4eeabcc1fb1e3d55ad5
52 changes: 38 additions & 14 deletions masci_tools/vis/data.py
@@ -20,6 +20,7 @@
 import numpy as np
 import pandas as pd
 import copy
+import warnings
 
 try:
     from bokeh.models import ColumnDataSource
@@ -245,30 +246,28 @@ def get_values(self, data_key):
 
         return values
 
-    def min(self, data_key, separate=False, mask=None):
+    def min(self, data_key, separate=False, mask=None, mask_data_key=None):
         """
         Get the minimum value for a given data column for all entries
 
         :param data_key: name of the data key to determine the minimum
         :param separate: bool if True the minimum will be determined and returned
                          for all entries separately
         :param mask: optional mask to select specific rows from the data entries
+        :param mask_data_key: optional data key to be used when ``mask`` is a function
 
         :returns: minimum value for all entries either combined or as a list
         """
         if data_key not in self._column_spec._fields:
             raise ValueError(f'Field {data_key} does not exist')
 
         if mask is not None:
-            if len(mask) == len(self):
-                mask_gen = (mask_indx for mask_indx in mask)
-            else:
-                mask_gen = (mask for i in self)
+            mask = self.get_mask(mask, data_key=mask_data_key if mask_data_key is not None else data_key)
         else:
-            mask_gen = (None for i in self)
+            mask = [None] * len(self)
 
         min_val = []
-        for (entry, source), mask_entry in zip(self.items(mappable=True), mask_gen):
+        for (entry, source), mask_entry in zip(self.items(mappable=True), mask):
 
             key = entry._asdict()[data_key]
 
@@ -287,30 +286,28 @@ def min(self, data_key, separate=False, mask=None):
         else:
             return min(min_val)
 
-    def max(self, data_key, separate=False, mask=None):
+    def max(self, data_key, separate=False, mask=None, mask_data_key=None):
         """
         Get the maximum value for a given data column for all entries
 
         :param data_key: name of the data key to determine the maximum
         :param separate: bool if True the maximum will be determined and returned
                          for all entries separately
         :param mask: optional mask to select specific rows from the data entries
+        :param mask_data_key: optional data key to be used when ``mask`` is a function
 
         :returns: maximum value for all entries either combined or as a list
         """
         if data_key not in self._column_spec._fields:
             raise ValueError(f'Field {data_key} does not exist')
 
         if mask is not None:
-            if len(mask) == len(self):
-                mask_gen = (mask_indx for mask_indx in mask)
-            else:
-                mask_gen = (mask for i in range(len(self)))
+            mask = self.get_mask(mask, data_key=mask_data_key if mask_data_key is not None else data_key)
         else:
-            mask_gen = (None for i in range(len(self)))
+            mask = [None] * len(self)
 
         max_val = []
-        for (entry, source), mask_entry in zip(self.items(mappable=True), mask_gen):
+        for (entry, source), mask_entry in zip(self.items(mappable=True), mask):
 
             key = entry._asdict()[data_key]
 
@@ -390,6 +387,33 @@ def get_function_result(self, data_key, func, list_return=False, **kwargs):
         else:
             return result
 
+    def get_mask(self, mask, data_key=None):
+        """
+        Get mask list for use with the data in this instance
+
+        :param mask: either a list or a callable; if it is callable it is used in
+                     :py:meth:`get_function_result()` together with the ``data_key``
+                     argument
+        :param data_key: str to be used for the data key if mask is a callable
+
+        :returns: list of masks, one for each data entry
+        """
+        from collections.abc import Callable
+
+        if isinstance(mask, Callable):
+            if data_key is None:
+                raise ValueError('If mask is a function the data_key argument has to be given')
+            mask = self.get_function_result(data_key, mask, list_return=True)
+        else:
+            if len(mask) != len(self):
+                mask = [mask for i in range(len(self))]
+
+        for mask_entry in mask:
+            if not all(isinstance(val, bool) for val in mask_entry):
+                warnings.warn('Not all entries in the mask are booleans')
+
+        return mask
+
     def sort_data(self, by_data_keys, **kwargs):
         """
         Sort the data by the given data_key(s)
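The mask handling added in `get_mask` can be illustrated outside of the class. The following is a minimal sketch, assuming each data entry is simply a NumPy array; `normalize_mask` and `entries` are hypothetical names used for illustration and are not part of `masci_tools`.

```python
from collections.abc import Callable

import numpy as np


def normalize_mask(mask, entries):
    """Return one mask per entry (hypothetical stand-in for PlotData.get_mask)."""
    if isinstance(mask, Callable):
        # A callable is evaluated on the data of each entry to build its mask
        return [mask(values) for values in entries]
    if len(mask) != len(entries):
        # A single mask is shared by all entries
        return [mask for _ in entries]
    # The mask is taken as already holding one mask per entry
    return list(mask)


# Made-up data: two entries with three values each
entries = [np.array([-2.0, 0.5, 3.0]), np.array([1.0, 9.0, -0.5])]

# The callable is applied to every entry separately
masks = normalize_mask(lambda values: np.logical_and(values > -1, values < 2), entries)
```

As in the diff, a plain mask is treated as a per-entry list whenever its length happens to equal the number of entries, so a single shared mask of exactly that length would be interpreted differently. Passing a callable avoids this ambiguity.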
12 changes: 5 additions & 7 deletions masci_tools/vis/plot_methods.py
@@ -1926,10 +1926,8 @@ def plot_bands(kpath,
             if 'y' in kwargs['limits']:
                 ylimits = kwargs['limits']['y']
 
-        data = plot_data.values(first=True)
-        mask = np.logical_and(data.bands > ylimits[0], data.bands < ylimits[1])
-
-        weight_max = plot_data.max('size', mask=mask)
+        mask = lambda bands, ylimits=tuple(ylimits): np.logical_and(bands > ylimits[0], bands < ylimits[1])
+        weight_max = plot_data.max('size', mask=mask, mask_data_key='bands')
         if 'vmax' not in kwargs:
             kwargs['vmax'] = weight_max
 
@@ -2081,9 +2079,9 @@ def plot_spinpol_bands(kpath,
             if 'y' in kwargs['limits']:
                 ylimits = kwargs['limits']['y']
 
-        data = plot_data.values()
-        mask = [np.logical_and(col.bands > ylimits[0], col.bands < ylimits[1]) for col in data]
-        weight_max = plot_data.max('size', mask=mask)
+
+        mask = lambda bands, ylimits=tuple(ylimits): np.logical_and(bands > ylimits[0], bands < ylimits[1])
+        weight_max = plot_data.max('size', mask=mask, mask_data_key='bands')
 
         if 'vmax' not in kwargs:
             kwargs['vmax'] = weight_max
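The effect of the callable mask in `plot_bands` and `plot_spinpol_bands` can be reproduced with plain NumPy arrays. This is a sketch under stated assumptions: the `bands` and `weights` values are made up, and the masked maximum is computed directly with `np.nanmax` instead of going through `PlotData.max`.

```python
import numpy as np

# Made-up example data (assumption): band energies and the matching weights
bands = np.array([-12.0, -3.0, 0.5, 4.0, 20.0])
weights = np.array([0.9, 0.2, 0.7, 0.4, 5.0])
ylimits = (-5.0, 5.0)

# Same form as in the diff: the default argument binds the current limits
# at definition time, so later changes to ylimits do not affect the mask
mask = lambda bands, ylimits=tuple(ylimits): np.logical_and(bands > ylimits[0], bands < ylimits[1])

# Only weights of points inside the visible energy window are considered
weight_max = np.nanmax(weights[mask(bands)])
```

Here the largest weight (5.0) lies outside the y limits and is ignored, so the color scale is set by the largest visible weight (0.7). Because the mask is a function of the `bands` column, the same lambda works for one dataset or for several, which is why the spin-polarized variant no longer needs a list comprehension.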