Support datetime data types as input #242

Closed
seisman opened this issue Nov 16, 2018 · 29 comments · Fixed by #464
Labels
feature request New feature wanted help wanted Helping hands are appreciated upstream Bug or missing feature of upstream core GMT
Milestone

Comments

@seisman
Member

seisman commented Nov 16, 2018

Description of the problem
I want to plot some datetime data on a map. However, gmt-python doesn't accept strings as input.

Full code that generated the error

import pygmt

x = ['2008-01-01', '2012-01-01']
y = [5, 5]

fig = pygmt.Figure()
fig.plot(x, y, J='X10c/5c', R='2005-01-01/2015-01-01/0/10', B=True, W='2p')
fig.show(method='external')

Full error message

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    fig.plot(x, y, J='X10c/5c', R='2005-01-01/2015/01-01/0/10', B=True, W='2p')
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/helpers/decorators.py", line 199, in new_module
    return module_func(*args, **kwargs)
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/helpers/decorators.py", line 294, in new_module
    return module_func(*args, **kwargs)
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/base_plotting.py", line 344, in plot
    with file_context as fname:
  File "/Users/seisman/.anaconda/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/clib/session.py", line 1067, in virtualfile_from_vectors
    self.put_vector(dataset, column=col, vector=array)
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/clib/session.py", line 754, in put_vector
    gmt_type = self._check_dtype_and_dim(vector, ndim=1)
  File "/Users/seisman/Gits/gmt/gmt-python/gmt/clib/session.py", line 703, in _check_dtype_and_dim
    "Unsupported numpy data type '{}'.".format(array.dtype.name)
gmt.exceptions.GMTInvalidInput: Unsupported numpy data type 'str320'.

System information

  • Operating system: macOS 10.14
  • Python installation (Anaconda, system, ETS): Anaconda
  • Version of GMT: 6.0.0_15a9160
  • Version of Python: 3.7.0
  • Version of this package: latest
  • If using conda, paste the output of conda list below:
output of conda list
# packages in environment at /Users/seisman/.anaconda:
#
# Name                    Version                   Build  Channel
_ipyw_jlab_nb_ext_conf    0.1.0                    py37_0
alabaster                 0.7.11                   py37_0
anaconda                  5.3.0                    py37_0
anaconda-client           1.7.2                    py37_0
anaconda-navigator        1.9.2                    py37_0
anaconda-project          0.8.2                    py37_0
appdirs                   1.4.3            py37h28b3542_0
appnope                   0.1.0                    py37_0
appscript                 1.0.1            py37h1de35cc_1
argh                      0.26.2                    
asn1crypto                0.24.0                   py37_0
astroid                   2.0.4                    py37_0
astropy                   3.0.4            py37h1de35cc_0
atomicwrites              1.2.1                    py37_0
attrs                     18.2.0           py37h28b3542_0
automat                   0.7.0                    py37_0
babel                     2.6.0                    py37_0
backcall                  0.1.0                    py37_0
backports                 1.0                      py37_1
backports.shutil_get_terminal_size 1.0.0                    py37_2
basemap                   1.2.0            py37h0acbc05_0
beautifulsoup4            4.6.3                    py37_0
bitarray                  0.8.3            py37h1de35cc_0
bkcharts                  0.2                      py37_0
black                     18.9b0                    
blas                      1.0                         mkl
blaze                     0.11.3                   py37_0
bleach                    2.1.4                    py37_0
blosc                     1.14.4               hd9629dc_0
bokeh                     0.13.0                   py37_0
boto                      2.49.0                   py37_0
bottleneck                1.2.1            py37h1d22016_1
bzip2                     1.0.6                h1de35cc_5
ca-certificates           2018.03.07                    0
cartopy                   0.16.0           py37h9263bd1_0
certifi                   2018.8.24                py37_1
cffi                      1.11.5           py37h6174b99_1
cftime                    1.0.0b1          py37h1d22016_0
chardet                   3.0.4                    py37_1
click                     6.7                      py37_0
cloud-sptheme             1.9.4                     
cloudpickle               0.5.5                    py37_0
clyent                    1.2.2                    py37_1
cmarkgfm                  0.4.2                     
colorama                  0.3.9                    py37_0
conda                     4.5.11                py37_1000    conda-forge
conda-build               3.15.1                   py37_0
conda-env                 2.6.0                         1
constantly                15.1.0           py37h28b3542_0
contextlib2               0.5.5                    py37_0
cryptography              2.3.1            py37hdbc3d79_0
curl                      7.61.0               ha441bb4_0
cycler                    0.10.0                   py37_0
cython                    0.28.5           py37h0a44026_0
cytoolz                   0.9.0.1          py37h1de35cc_1
dask                      0.19.1                   py37_0
dask-core                 0.19.1                   py37_0
datashape                 0.5.4                    py37_1
dbus                      1.13.2               h760590f_1
decorator                 4.3.0                    py37_0
defusedxml                0.5.0                    py37_1
distributed               1.23.1                   py37_0
docutils                  0.14                     py37_0
entrypoints               0.2.3                    py37_2
et_xmlfile                1.0.1                    py37_0
expat                     2.2.6                h0a44026_0
fastcache                 1.0.2            py37h1de35cc_2
filelock                  3.0.8                    py37_0
flask                     1.0.2                    py37_1
flask-cors                3.0.6                    py37_0
freetype                  2.9.1                hb4e5f40_0
future                    0.16.0                    
future                    0.17.0                py37_1000    conda-forge
geographiclib             1.49                      
geos                      3.6.2                h5470d99_2
get_terminal_size         1.0.0                h7520d66_0
gettext                   0.19.8.1             h15daf44_3
gevent                    1.3.6            py37h1de35cc_0
glib                      2.56.2               hd9629dc_0
glob2                     0.6                      py37_0
gmp                       6.1.2                hb37e062_1
gmpy2                     2.0.8            py37h6ef4df4_2
gmt-python                0.1a3+126.g4617492.dirty           
greenlet                  0.4.15           py37h1de35cc_0
guzzle-sphinx-theme       0.7.11                    
h5py                      2.8.0            py37h878fce3_3
hdf4                      4.2.13               h39711bb_2
hdf5                      1.10.2               hfa1e0ec_1
heapdict                  1.0.0                    py37_2
html5lib                  1.0.1                    py37_0
hyperlink                 18.0.0                   py37_0
icu                       58.2                 h4b95b61_1
idna                      2.7                      py37_0
imageio                   2.4.1                    py37_0
imagesize                 1.1.0                    py37_0
incremental               17.5.0                   py37_0
intel-openmp              2019.0                      118
ipykernel                 4.9.0                    py37_1
ipython                   6.5.0                    py37_0
ipython_genutils          0.2.0                    py37_0
ipywidgets                7.4.1                    py37_0
isort                     4.3.4                    py37_0
itsdangerous              0.24                     py37_1
jbig                      2.1                  h4d881f8_0
jdcal                     1.4                      py37_0
jedi                      0.12.1                   py37_0
jinja2                    2.10                     py37_0
jpeg                      9b                   he5867d9_2
jsonschema                2.6.0                    py37_0
jupyter                   1.0.0                    py37_7
jupyter_client            5.2.3                    py37_0
jupyter_console           5.2.0                    py37_1
jupyter_core              4.4.0                    py37_0
jupyterlab                0.34.9                   py37_0
jupyterlab_launcher       0.13.1                   py37_0
keyring                   13.2.1                   py37_0
kiwisolver                1.0.1            py37h0a44026_0
lazy-object-proxy         1.3.1            py37h1de35cc_2
libcurl                   7.61.0               hf30b1f0_0
libcxx                    4.0.1                h579ed51_0
libcxxabi                 4.0.1                hebd6815_0
libedit                   3.1.20170329         hb402a30_2
libffi                    3.2.1                h475c297_4
libgfortran               3.0.1                h93005f0_2
libiconv                  1.15                 hdd342a3_7
libnetcdf                 4.6.1                h4e6abe9_1
libpng                    1.6.34               he12f830_0
libsodium                 1.0.16               h3efe00b_0
libssh2                   1.8.0                h322a93b_4
libtiff                   4.0.9                hcb84e12_2
libxml2                   2.9.8                hab757c2_1
libxslt                   1.1.32               hb819dd2_0
livereload                2.5.2                     
llvmlite                  0.24.0           py37hc454e04_0
locket                    0.2.0                    py37_1
lxml                      4.2.5            py37hef8c89e_0
lzo                       2.10                 h362108e_2
markupsafe                1.0              py37h1de35cc_1
matplotlib                2.2.3            py37h54f8f79_0
mccabe                    0.6.1                    py37_1
mistune                   0.8.3            py37h1de35cc_1
mkl                       2019.0                      118
mkl-service               1.1.2            py37h6b9c3cc_5
mkl_fft                   1.0.4            py37h5d10147_1
mkl_random                1.0.1            py37h5d10147_1
more-itertools            4.3.0                    py37_0
mpc                       1.1.0                h6ef4df4_1
mpfr                      4.0.1                h3018a27_3
mpmath                    1.0.0                    py37_2
msgpack-python            0.5.6            py37h04f5b5a_1
multipledispatch          0.6.0                    py37_0
navigator-updater         0.2.1                    py37_0
nbconvert                 5.4.0                    py37_1
nbformat                  4.4.0                    py37_0
ncurses                   6.1                  h0a44026_0
netcdf4                   1.4.1            py37h08833f9_0
networkx                  2.1                      py37_0
nltk                      3.3.0                    py37_0
nose                      1.3.7                    py37_2
notebook                  5.6.0                    py37_0
numba                     0.39.0           py37h6440ff4_0
numexpr                   2.6.8            py37h1dc9127_0
numpy                     1.15.1           py37h6a91979_0
numpy-base                1.15.1           py37h8a80b8c_0
numpydoc                  0.8.0                    py37_0
obspy                     1.1.0            py37h39e3cac_2    conda-forge
obspy                     1.1.0                     
odo                       0.5.1                    py37_0
olefile                   0.46                     py37_0
openpyxl                  2.5.6                    py37_0
openssl                   1.0.2p               h1de35cc_0
owslib                    0.17.0                   py37_0
packaging                 17.1                     py37_0
pandas                    0.23.4           py37h6440ff4_0
pandoc                    1.19.2.1             ha5e8f32_1
pandocfilters             1.4.2                    py37_1
parso                     0.3.1                    py37_0
partd                     0.3.8                    py37_0
path.py                   11.1.0                   py37_0
pathlib2                  2.3.2                    py37_0
pathtools                 0.1.2                     
patsy                     0.5.0                    py37_0
pcre                      8.42                 h378b8a2_0
pep8                      1.7.1                    py37_0
pexpect                   4.6.0                    py37_0
pickleshare               0.7.4                    py37_0
pillow                    5.2.0            py37hb68e598_0
pip                       10.0.1                   py37_0
pkginfo                   1.4.2                    py37_1
pluggy                    0.7.1            py37h28b3542_0
ply                       3.11                     py37_0
port-for                  0.3.1                     
proj4                     5.0.1                h1de35cc_0
prometheus_client         0.3.1            py37h28b3542_0
prompt_toolkit            1.0.15                   py37_0
psutil                    5.4.7            py37h1de35cc_0
ptyprocess                0.6.0                    py37_0
py                        1.6.0                    py37_0
pyasn1                    0.4.4            py37h28b3542_0
pyasn1-modules            0.2.2                    py37_0
pycodestyle               2.4.0                    py37_0
pycosat                   0.6.3            py37h1de35cc_0
pycparser                 2.18                     py37_1
pycrypto                  2.6.1            py37h1de35cc_9
pycurl                    7.43.0.2         py37hdbc3d79_0
pyepsg                    0.3.2                    py37_0
pyflakes                  2.0.0                    py37_0
pygments                  2.2.0                    py37_0
pylint                    2.1.1                    py37_0
pyodbc                    4.0.24           py37h0a44026_0
pyopenssl                 18.0.0                   py37_0
pyparsing                 2.2.0                    py37_1
pyproj                    1.9.5.1          py37h833a5d7_1
pyqt                      5.9.2            py37h655552a_2
pyshp                     1.2.12                   py37_0
pysocks                   1.6.8                    py37_0
pytables                  3.4.4            py37h13cba08_0
pytest                    3.8.0                    py37_0
pytest-arraydiff          0.2              py37h39e3cac_0
pytest-astropy            0.4.0                    py37_0
pytest-doctestplus        0.1.3                    py37_0
pytest-mpl                0.10                      
pytest-openfiles          0.3.0                    py37_0
pytest-remotedata         0.3.0                    py37_0
python                    3.7.0                hc167b69_0
python-dateutil           2.7.3                    py37_0
python.app                2                        py37_8
pytz                      2018.5                   py37_0
pywavelets                1.0.0            py37h1d22016_0
pyyaml                    3.13             py37h1de35cc_0
pyzmq                     17.1.2           py37h1de35cc_0
qt                        5.9.6                h45cd832_2
qtawesome                 0.4.4                    py37_0
qtconsole                 4.4.1                    py37_0
qtpy                      1.5.0                    py37_0
readline                  7.0                  h1de35cc_5
readme-renderer           22.0                      
requests                  2.19.1                   py37_0
requests-toolbelt         0.8.0                     
rope                      0.11.0                   py37_0
ruamel_yaml               0.15.46          py37h1de35cc_0
scikit-image              0.14.0           py37h0a44026_1
scikit-learn              0.19.2           py37h4f467ca_0
scipy                     1.1.0            py37h28f7352_1
seaborn                   0.9.0                    py37_0
send2trash                1.5.0                    py37_0
service_identity          17.0.0           py37h28b3542_0
setuptools                40.2.0                   py37_0
shapely                   1.6.4            py37h20de77a_0
simplegeneric             0.8.1                    py37_2
singledispatch            3.4.0.3                  py37_0
sip                       4.19.8           py37h0a44026_0
six                       1.11.0                   py37_1
snappy                    1.1.7                he62c110_3
snowballstemmer           1.2.1                    py37_0
sortedcollections         1.0.1                    py37_0
sortedcontainers          2.0.5                    py37_0
sphinx                    1.7.9                    py37_0
Sphinx                    1.8.1                     
sphinx-autobuild          0.7.1                     
sphinx-cjkspace           0.1.2                     
sphinx-intl               0.9.11                    
sphinx-rtd-theme          0.4.1                     
sphinxcontrib             1.0                      py37_1
sphinxcontrib-websupport  1.1.0                    py37_1
spyder                    3.3.1                    py37_1
spyder-kernels            0.2.6                    py37_0
sqlalchemy                1.2.11           py37h1de35cc_0
sqlite                    3.24.0               ha441bb4_0
statsmodels               0.9.0            py37h1d22016_0
sympy                     1.2                      py37_0
tblib                     1.3.2                    py37_0
terminado                 0.8.1                    py37_1
testpath                  0.3.1                    py37_0
tk                        8.6.8                ha441bb4_0
toml                      0.10.0                    
toolz                     0.9.0                    py37_0
tornado                   5.1              py37h1de35cc_0
tqdm                      4.26.0           py37h28b3542_0
traitlets                 4.3.2                    py37_0
twine                     1.12.1                    
twisted                   18.7.0           py37h1de35cc_1
unicodecsv                0.14.1                   py37_0
unixodbc                  2.3.7                h1de35cc_0
urllib3                   1.23                     py37_0
watchdog                  0.9.0                     
wcwidth                   0.1.7                    py37_0
webencodings              0.5.1                    py37_1
werkzeug                  0.14.1                   py37_0
wheel                     0.31.1                   py37_0
widgetsnbextension        3.4.1                    py37_0
wrapt                     1.10.11          py37h1de35cc_2
xarray                    0.10.8                   py37_0
xlrd                      1.1.0                    py37_1
xlsxwriter                1.1.0                    py37_0
xlwings                   0.11.8                   py37_0
xlwt                      1.3.0                    py37_0
xz                        5.2.4                h1de35cc_4
yaml                      0.1.7                hc338f04_2
zeromq                    4.2.5                h0a44026_1
zhon                      1.1.5                     
zict                      0.1.3                    py37_0
zlib                      1.2.11               hf3cbc9b_2
zope                      1.0                      py37_1
zope.interface            4.5.0            py37h1de35cc_0
@seisman
Member Author

seisman commented Nov 29, 2018

We should also support Python's datetime as input for datetime plots.

@leouieda leouieda changed the title Accept string as input Support string input types for Figure.plot Jan 16, 2019
@leouieda
Member

Agreed. Pandas can already handle this type of data. I confess that I've never had this use case so I'd love to hear what people need from this.

@leouieda leouieda added the feature request New feature wanted label Jan 22, 2019
@eelcodoornbos

I would love to see this feature as well in pygmt. GMT has the best support for easy-to-read and beautiful time series plots. I use that feature even more than plotting maps. I especially like the interval annotation (where, for example, the name of the month is shown in between two tick marks, instead of underneath the first day of the month). Would love to be able to do all of this from pygmt.

The basemap features with the nice annotation already work, but plot (psxy) does not yet accept datetime data.

There are many date/time representations in Python. Plain-old command-line GMT uses ISO strings "2019-03-18T17:48:00.000", which would be a good place to start, but native datetime module objects and numpy.datetime64 would be nice as well. Otherwise pandas can easily help to do any conversions.
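
As a rough illustration of the conversions mentioned above, the sketch below normalizes a few Python date/time representations to ISO strings of the kind command-line GMT reads. It uses only the standard library. The helper name `to_gmt_iso` is hypothetical (it is not part of pygmt), and string inputs are assumed to be ISO-formatted already.

```python
from datetime import date, datetime


def to_gmt_iso(value):
    """Return an ISO 8601 string for a str, date, or datetime value.

    Hypothetical helper for illustration only; not a pygmt function.
    """
    if isinstance(value, str):
        # Assumption: the caller already supplies an ISO-formatted string.
        return value
    if isinstance(value, datetime):
        # Check datetime before date, because datetime subclasses date.
        return value.isoformat(timespec="milliseconds")
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Unsupported date/time type: {type(value).__name__}")


mixed = ["2008-01-01", date(2012, 1, 1), datetime(2019, 3, 18, 17, 48)]
iso_strings = [to_gmt_iso(v) for v in mixed]
print(iso_strings)
```

numpy.datetime64 and pandas objects are left out here to keep the sketch dependency-free; they would need their own branches.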

@seisman
Member Author

seisman commented May 26, 2020

@PaulWessel What's the correct way to pass string vectors to GMT? Like this one:

2008-01-01T00:00  5.0
2008-01-01T00:01  5.0

The GMT_Put_Vector function can only pass numeric vectors. It seems GMT_Put_Strings can pass string vectors, but it's not clear to me how to specify the column number for string vectors.

@PaulWessel
Member

Did not anticipate this. One way would be to convert those to UNIX seconds since 1970 and pass that as a double. However, if you want to pass datetime strings then we would need to pass type = GMT_DATETIME and that would need to trigger a conversion from those strings to internal time. Would you also need to reverse, i.e., calling GMT_Get_Vector and if that column is abs time then return string array? Or perhaps we could make a GMT_DateTime () function that takes your string and returns a double for internal floating point time and then you pass that? What would be best from your perspective?
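
The first option (UNIX seconds passed as doubles) could look roughly like this on the Python side. This is only a sketch: `iso_to_unix_seconds` is a hypothetical name, and the strings are assumed to carry no UTC offset and to be interpreted as UTC.

```python
from datetime import datetime, timezone


def iso_to_unix_seconds(text):
    """Convert an ISO datetime string to seconds since 1970-01-01 (UTC)."""
    # Assumption: the input has no UTC offset, so we label it as UTC ourselves.
    moment = datetime.fromisoformat(text).replace(tzinfo=timezone.utc)
    return moment.timestamp()


# The two records from the question above, as floating-point seconds.
times = [iso_to_unix_seconds(t) for t in ("2008-01-01T00:00", "2008-01-01T00:01")]
print(times)
```

The resulting list of floats is an ordinary numeric column, so it could go through the existing numeric path; GMT would still need to be told separately that the column holds absolute time.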

@PaulWessel
Member

Since telling GMT that you have abstime etc is done via -J or -f, that is separate from loading up an array in a column. So maybe for me to extend Get/Put to do the datestring to time and back is the simplest?

@PaulWessel
Member

Before we do, what is the equivalent issue in Julia and MATLAB, @joa-quim ? Regarding time representations.

@seisman
Member Author

seisman commented May 27, 2020

It's not only about datetimes. See this command-line example:

gmt begin map png,pdf
gmt basemap -R0/10/0/6 -JX10c/6c -Bafg1 -BWSen
gmt text -F+f+a+j -W1p -Glightblue << EOF
5 1 12p,0,red       0   TL GMT TEXT1
5 3 15p,1,blue      30  MC GMT TEXT2
5 5 18p,2,yellow    180 TL GMT TEXT3
EOF
gmt end show

It seems there are no API functions to pass the third column (varying fonts) to GMT.

@PaulWessel
Member

Sure? All that stuff is part of trailing text, i.e.,

5 1 is the two numerical columns and "12p,0,red 0 TL GMT TEXT1" is the trailing text. That is what GMT_Put_Strings is meant to do. Trailing text is its own special "column"
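
To make the split concrete, here is a small Python sketch of how a wrapper might separate one record into numeric columns and trailing text before handing them to the API. The function `split_record` is hypothetical and only illustrates the data model described above; it is not GMT or pygmt code.

```python
def split_record(line, n_numeric=2):
    """Split a text record into leading numeric columns and trailing text.

    Hypothetical helper: the first n_numeric whitespace-separated fields are
    parsed as floats, and everything after them is kept as one string.
    """
    parts = line.split(None, n_numeric)
    numeric = [float(p) for p in parts[:n_numeric]]
    trailing = parts[n_numeric] if len(parts) > n_numeric else ""
    return numeric, trailing


numeric, trailing = split_record("5 1 12p,0,red 0 TL GMT TEXT1")
print(numeric)   # the two numerical columns
print(trailing)  # the trailing text "column"
```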

@PaulWessel
Member

In contrast, your datetime string is meant to be a numerical column but it is given as a string that needs conversion.

@PaulWessel
Member

I like the GMT_Put/Get_Vector extension to GMT_DATETIME. With no interruptions (...) I would make a branch for that today.

@seisman
Member Author

seisman commented May 27, 2020

5 1 is the two numerical columns and "12p,0,red 0 TL GMT TEXT1" is the trailing text. That is what GMT_Put_Strings is meant to do. Trailing text is its own special "column"

I thought the 4th column (the text angles) should be passed as a numerical column.

@seisman
Member Author

seisman commented May 27, 2020

I like the GMT_Put/Get_Vector extension to GMT_DATETIME. With no interruptions (...) I would make a branch for that today.

Yes, I think that's a good and useful extension.

@joa-quim
Member

Before we do, what is the equivalent issue in Julia and MATLAB, @joa-quim ? Regarding time representations.

Don't know. Never tried to plot with time

@PaulWessel
Member

pstext goes way back. The order of things was fixed and it mixed up text and numbers. Then we added the -F+j+a+f modifiers to specify the order, and even pull some of them out (+a45+jCB). It is one of those things where backwards compatibility is a problem. Since angle is a number then it should always be part of the numerics (if it is given). I think from the externals you need to place Put_Strings with whatever order you have and let -F tell us.

@PaulWessel
Member

@joa-quim : I am sure Julia must have a way to deal with time, but there really are only a few: a string like @seisman has, a floating point number (time-units since some epoch), or some complicated structure with hours, day, month, etc.
I will work up a string-solution.

@seisman seisman added the upstream Bug or missing feature of upstream core GMT label May 27, 2020
@seisman seisman changed the title Support string input types for Figure.plot Support datetime input types for Figure.plot May 29, 2020
@seisman seisman changed the title Support datetime input types for Figure.plot Support datetime data types as input May 29, 2020
@seisman seisman added the help wanted Helping hands are appreciated label May 29, 2020
@seisman
Member Author

seisman commented May 30, 2020

In PR GenericMappingTools/gmt#3396, the GMT API function GMT_Put_Vector added support for vectors of GMT_DATETIME type. The updated function will be available in the upcoming GMT 6.1.0 release.

With that PR merged, now it's possible for us to pass datetime vectors to GMT. See #464 for the possible implementation.

Comments and suggestions are welcomed.

@weiji14
Member

weiji14 commented May 30, 2020

Just had a quick look at #464 and it looks really promising! Been working with some time series data recently so I'd be keen to test it out once GMT 6.1.0 is out (looking at the calendar team!).

One quick question though, how will Not-a-Time (i.e. NaT) values be handled? Does GMT just ignore plotting them (as it should) or would it convert it to some big number like numpy seems to currently do at numpy/numpy#16391? Or would it just fall back on the user to properly drop those NaT values first before plotting them with PyGMT/GMT.

@seisman
Member Author

seisman commented May 30, 2020

Any datetime types (raw strings, datetime, np.datetime64 and pandas.DatetimeIndex) are converted to strings (char ** in C) before passing to the GMT API.
NaT is converted to the string NaT. GMT can't handle NaT, so it just gives an error and skips the NaT data points. (Perhaps it should be a warning instead of an error).

pygmt-session [ERROR]: Unable to parse 1 datetime strings (ISO datetime format required)
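
Until GMT handles such values itself, a caller could filter them out first. The sketch below is one possible way, using only the standard library: it keeps strings that `datetime.fromisoformat` accepts and counts the rest (such as "NaT"). The name `drop_unparseable` is hypothetical, and Python's ISO parser is not guaranteed to accept exactly the same strings as GMT.

```python
from datetime import datetime


def drop_unparseable(strings):
    """Return (kept, skipped_count) for a list of datetime strings.

    Hypothetical helper: a string is kept if Python can parse it as ISO 8601.
    """
    kept, skipped = [], 0
    for text in strings:
        try:
            datetime.fromisoformat(text)
        except ValueError:
            # "NaT" and other non-ISO strings end up here.
            skipped += 1
        else:
            kept.append(text)
    return kept, skipped


kept, skipped = drop_unparseable(["2008-01-01", "NaT", "2012-01-01"])
print(kept, skipped)
```

In a real plot the matching y values would have to be dropped alongside the skipped x values.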

@PaulWessel
Member

What is the definition of a NaT?

@seisman
Member Author

seisman commented May 30, 2020

NaT is also new to me. There is some documentation from MATLAB and from Python's NumPy.

@weiji14
Member

weiji14 commented May 30, 2020

NaT is basically like NaN, but for time. To be honest, I'm not sure why they don't just use NaN, but yeah, it exists...

NaT is converted to the string NaT. GMT can't handle NaT, so it just gives an error and skips the NaT data points. (Perhaps it should be a warning instead of an error).

pygmt-session [ERROR]: Unable to parse 1 datetime strings (ISO datetime format required)

Yeah, a warning might be better here. Not sure if there is an ISO datetime for NaT.

@weiji14
Member

weiji14 commented Dec 3, 2023

Revisiting this thread a little as I'm trying to implement support for Apache Arrow's date32 and date64 dtypes in PyGMT at #2845, and would like some advice on the implementation.

Currently at 5d16103, I've converted data stored in PyArrow's date32 dtype (32-bit) to NumPy's datetime64 dtype (64-bit), which might increase memory usage going from 32-bit to 64-bit. But @seisman mentioned at #242 (comment) that GMT_Put_Vector converts the date to a string/char representation? Would it be better to just convert the PyArrow date32 data directly to a string/char representation then, instead of going through an intermediate np.datetime64 format? Or does it not matter since string/char is 64-bit anyway?
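
For reference, the direct route asked about here could be sketched as follows. Arrow's date32 stores a date as a 32-bit count of days since the UNIX epoch, so plain integers stand in for the Arrow values below. The helper `date32_to_iso` is hypothetical and is not what #2845 implements.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def date32_to_iso(days):
    """Convert a days-since-1970 integer (the date32 layout) to an ISO date."""
    return (UNIX_EPOCH + timedelta(days=days)).isoformat()


# Plain ints used as stand-ins for values read from a date32 array.
iso_dates = [date32_to_iso(d) for d in (0, 13879, 15340)]
print(iso_dates)
```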

Also, what's the temporal resolution of GMT_DATETIME, or the smallest unit that can be handled? Asking because np.datetime64 supports different units (https://numpy.org/doc/1.26/reference/arrays.datetime.html#datetime-units) such as nanosecond, millisecond, day, etc, and it'd be good to know how fine or coarse of a resolution we can handle.

@PaulWessel
Member

Internally in GMT, absolute (and relative) time is stored in doubles. Wikipedia says the smallest increment between two doubles is 2.22e-16, and since the default unit is seconds since some epoch (e.g. UNIX time 1970), that would seem to be pretty small. We never really used high-precision time when we wrote GMT, so I would think 2.22e-16 seconds is small enough. But I'm not sure that is always the smallest increment. It should cover nanoseconds.

@joa-quim
Member

joa-quim commented Dec 3, 2023

It's a little more elaborate. The 2.22e-16 is the eps for doubles around 1

julia> eps(1.)
2.220446049250313e-16

but since we are already 53 years away from 1970, the eps now is much larger

julia> eps(53*365*24*60*60.0)
2.384185791015625e-7

If the time reference is the year 0 (not uncommon), with doubles one cannot have better resolution than ~10 microseconds

julia> eps(2023*365*24*60*60.0)
7.62939453125e-6

And definitely doubles must be used. The panorama with singles (float32) is dramatic

julia> eps(Float32(53*365*24*60*60.0))
128.0f0

Ideally time should be stored in 64-bit ints.
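
For readers following along in Python, the same numbers can be reproduced with the standard library. This is a sketch: `math.ulp` requires Python 3.9 or later, and `ulp_float32` is a hypothetical helper that assumes a positive, finite input.

```python
import math
import struct


def ulp_float32(x):
    """Spacing between x (rounded to float32) and the next larger float32.

    Hypothetical helper; assumes x is positive and finite.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - current


since_1970 = 53 * 365 * 24 * 60 * 60.0      # ~53 years of seconds
since_year0 = 2023 * 365 * 24 * 60 * 60.0   # ~2023 years of seconds

print(math.ulp(1.0))            # eps for doubles around 1
print(math.ulp(since_1970))     # spacing of doubles near "now" since 1970
print(math.ulp(since_year0))    # spacing when counting from year 0
print(ulp_float32(since_1970))  # spacing if float32 were used instead
```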

@weiji14
Member

weiji14 commented Dec 4, 2023

Ok, so if I understand correctly, the default TIME_UNIT is 1 second, but because GMT stores time as double (float64), the smallest resolution would vary away from the set TIME_EPOCH, and this means the time-steps could range in the order of 10e-16 second to 10e-6 second or so?

To be safe and on the conservative side, it sounds like 1 microsecond is ok (and I just remembered @seisman mentioning this at #464 (comment))...
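
A quick way to sanity-check the 1 microsecond figure is to round-trip a timestamp through a double holding seconds since 1970. This is only a sketch with hypothetical helper names, and it checks a single recent date rather than proving the bound in general.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_seconds(moment):
    """Seconds since the UNIX epoch as a Python float (a C double)."""
    return (moment - EPOCH) / timedelta(seconds=1)


def from_seconds(seconds):
    """Inverse of to_seconds; timedelta rounds to the nearest microsecond."""
    return EPOCH + timedelta(seconds=seconds)


# One microsecond past midnight on the day of this comment.
stamp = datetime(2023, 12, 4, 0, 0, 0, 1, tzinfo=timezone.utc)
round_tripped = from_seconds(to_seconds(stamp))
print(stamp.isoformat(), round_tripped.isoformat())
```

With the epoch at 1970 the spacing between doubles near this date is well below 1 microsecond, so the microsecond field survives the round trip.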

@joa-quim
Member

joa-quim commented Dec 4, 2023

Yes, but when reference is year zero 7.62939453125e-6 ~= 1e-5, which gives ~10 microseconds

@seisman
Member Author

seisman commented Dec 4, 2023

GMT_Put_Vector converts the date to a string/char representation? Would it be better to just convert the PyArrow date32 data directly to a string/char representation then, instead of going through an intermediate np.datetime64 format? Or does it not matter since string/char is 64-bit anyway?

Just read the GMT_Put_Vector source code again. It seems we can convert datetimes to double and pass the double vector to the GMT C API, but we can't tell GMT the column is GMT_DATETIME type.

@PaulWessel
Member

Yes, but when reference is year zero 7.62939453125e-6 ~= 1e-5, which gives ~10 microseconds

Well, who would use year zero and look for high precision in recent years? You can decide your internal epoch if you really need nanoseconds around some epoch. Presumably one does not need nanoseconds from 0-2023?


6 participants