Commit

Merge pull request nomic-ai#110 from nomic-ai/bugfixes-new-arrow-upload
Updated documentation around timestamps
AndriyMulyar authored Mar 23, 2023
2 parents dc4bb5f + bdbeed6 commit 786345c
Showing 3 changed files with 6 additions and 17 deletions.
2 changes: 1 addition & 1 deletion docs/how_does_atlas_work.md
@@ -66,7 +66,7 @@ All information and operations that are visually presented on an Atlas map have
Atlas visualizes your embeddings in two-dimensions using a non-linear dimensionality reduction algorithm. Atlas' dimensionality reduction algorithm is custom-built for scale, speed and dynamic updates.
Nomic cannot share the technical details of the algorithm at this time.

-#### Client upload
+#### Data Formats and Integrity

Atlas stores and transfers data using a subset of the [Apache Arrow](arrow.apache.org) standard.

20 changes: 5 additions & 15 deletions docs/mapping_faq.md
@@ -48,21 +48,11 @@ If you are added to a Nomic organization by someone (such as your employer), you
by specifying an `organization_name` in the `map_embedding` method of the AtlasClient. By default, projects are
made under your own account.

-## Working with dates and timestamps
-Atlas will consider any metadata field as a timestamp if and only if it matches the ISO8601 timestamp format.
-You can convert a Python `datetime` object into the ISO8601 timestamp format as follows:
-```py
-import datetime
-now=datetime.datetime.now()
-now.isoformat()
-```
-If you are working with dates which are in a non-uniform format, parsing into datetime objects may be difficult. Nomic recommends
-using the `python-dateutil` package in this situation. It will intelligently parse a string into a Python datetime object at the cost of some compute cycles.
-```python
-from dateutil import parser
-date = parser.parse("Apr 15 1999 12:00AM") # datetime.datetime(1999, 4, 15, 0, 0)
-date.isoformat()
-```
+## Working with Dates and Timestamps
+Atlas will consider metadata as timestamps when they are passed as Python `date` or `datetime` objects. Under the hood,
+these are converted into timestamps compatible with the Apache Arrow standard. Remember, you can directly pass
+through pandas Dataframe objects and Arrow tables to the `add_*` endpoints.
+
## How do I make maps of a dataset I have already uploaded?
You need to make a new index on the project you have uploaded your data to.
See [How does Atlas work?](how_does_atlas_work.md) for details.
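
Reviewer note on the revised "Working with Dates and Timestamps" text in this hunk: since timestamps are now recognized from Python `date` or `datetime` objects rather than from ISO8601 strings, data that still stores dates as strings would need converting before upload. The following is a minimal sketch of that conversion using only the standard library. The helper `parse_timestamp_fields` and the field name `created_at` are hypothetical and not part of the nomic client, and the sketch stops short of calling any Atlas endpoint.

```python
from datetime import datetime


def parse_timestamp_fields(records, fields):
    """Return copies of `records` with the named ISO 8601 string fields
    parsed into `datetime` objects.

    Hypothetical helper for illustration; not part of the nomic client.
    """
    parsed = []
    for record in records:
        converted = dict(record)  # shallow copy so the input is left untouched
        for field in fields:
            value = converted.get(field)
            if isinstance(value, str):
                converted[field] = datetime.fromisoformat(value)
        parsed.append(converted)
    return parsed


# Example metadata; "created_at" is an assumed field name.
records = [
    {"id": 1, "text": "first post", "created_at": "2023-03-23T14:05:00"},
    {"id": 2, "text": "second post", "created_at": "2023-03-24T09:30:00"},
]

prepared = parse_timestamp_fields(records, ["created_at"])
```

After this step, `prepared` holds `datetime` values in `created_at`, which is the form the updated documentation says Atlas treats as timestamps. Strings in a non-uniform format would need a more lenient parser, such as the `python-dateutil` approach shown in the removed text.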
1 change: 0 additions & 1 deletion nomic/project.py
@@ -1503,7 +1503,6 @@ def send_request(i):
 # if this method is being called internally, we pass a global progress bar
 close_pbar = False
 if pbar is None:
-    logger.info("Uploading text to Atlas.")
     close_pbar = True
     pbar = tqdm(total=int(len(data)) // shard_size)
 failed = 0