This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Adds a dataset that can be read and written lazily #5344

Merged
merged 21 commits into from
Aug 23, 2021

Conversation

Member

dirkgr commented Aug 7, 2021

No description provided.

This does not work yet. I'm still working on supporting classes.
@dirkgr
Member Author

dirkgr commented Aug 7, 2021

I think I can simplify this a lot. SqliteSparseSequence will stick around though.

dirkgr marked this pull request as ready for review August 20, 2021 00:50
dirkgr requested a review from epwalsh August 20, 2021 00:50
        else:
            return None
    elif isinstance(i, slice):
        from allennlp.tango.dataloader import ShuffledSequence
Member

Is it necessary to put this import here? I'm afraid this would add non-negligible runtime overhead each time this path is reached.

Member Author

I made it a global import.
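
For illustration, here is a minimal sketch of the two import placements discussed above. It uses the standard-library `random` module as a stand-in for `allennlp.tango.dataloader`, and the function names are hypothetical. A function-local `import` statement runs every time the branch is reached, although after the first call it resolves through the `sys.modules` cache rather than loading the module again. A module-level import is resolved once, when the file is loaded. The thread does not say why the import was local in the first place.

```python
import sys
import timeit
from random import Random  # module-level import: resolved once at load time


def shuffle_with_local_import(items):
    # The import statement executes on every call. After the first call it is
    # a cache lookup in sys.modules, not a fresh module load.
    from random import Random as LocalRandom

    return LocalRandom(0).sample(items, len(items))


def shuffle_with_global_import(items):
    # Uses the name bound by the module-level import above.
    return Random(0).sample(items, len(items))


if __name__ == "__main__":
    data = list(range(10))
    for fn in (shuffle_with_local_import, shuffle_with_global_import):
        seconds = timeit.timeit(lambda: fn(data), number=10_000)
        print(f"{fn.__name__}: {seconds:.4f}s for 10,000 calls")
```

Both functions return the same result. Running the file prints timings for each placement, so the actual overhead can be measured rather than assumed.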

Comment on lines +36 to +44
        if isinstance(i, int):
            current_length = len(self)
            if i < 0:
                i %= current_length
            self.table[str(i)] = value
            self.table["_len"] = max(i, current_length)
            self.table.commit()
        else:
            raise TypeError(f"list indices must be integers, not {i.__class__.__name__}")
Member

Why can't we handle slices here?

Member Author

You can't set slices. You can in numpy, but not in Python list.

Member

But can't you just go through one-by-one?

Member Author

I guess you can? I tried to do whatever list does. Numpy can do many things that list can't, and I didn't implement those either.

Member

Ok, I don't see a huge need for it, so 🤷‍♂️
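
For reference, a built-in `list` does accept slice assignment (for example `xs[1:3] = ["a", "b"]`), so the "one-by-one" idea raised above could be sketched roughly as follows. This is not the code from the PR. `DictBackedSequence` is a hypothetical stand-in that uses a plain `dict` in place of the SQLite-backed table, so there is no `commit()` call. It only supports slice assignment where the number of values matches the number of indices in the slice. It records the length as `i + 1` when writing past the current end, which is this sketch's own choice and differs from the quoted diff.

```python
from typing import Any, Dict, Union


class DictBackedSequence:
    """Hypothetical stand-in: a plain dict replaces the SQLite-backed table."""

    def __init__(self) -> None:
        self.table: Dict[str, Any] = {"_len": 0}

    def __len__(self) -> int:
        return self.table["_len"]

    def _set_one(self, i: int, value: Any) -> None:
        current_length = len(self)
        if i < 0:
            if -i > current_length:
                raise IndexError("assignment index out of range")
            i += current_length
        self.table[str(i)] = value
        # Assumption of this sketch: writing at index i makes the length at least i + 1.
        self.table["_len"] = max(i + 1, current_length)

    def __setitem__(self, i: Union[int, slice], value: Any) -> None:
        if isinstance(i, int):
            self._set_one(i, value)
        elif isinstance(i, slice):
            # Resolve the slice against the current length, then assign item by item.
            indices = range(*i.indices(len(self)))
            values = list(value)
            if len(values) != len(indices):
                raise ValueError("this sketch only supports same-length slice assignment")
            for index, item in zip(indices, values):
                self._set_one(index, item)
        else:
            raise TypeError(f"indices must be integers or slices, not {i.__class__.__name__}")

    def __getitem__(self, i: int) -> Any:
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        # Positions that were never written read back as None (sparse behaviour).
        return self.table.get(str(i))
```

Because `slice.indices` clamps to the current length, a slice assignment in this sketch can overwrite existing positions but cannot grow the sequence. Growing still goes through integer indices.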

dirkgr mentioned this pull request Aug 21, 2021
dirkgr enabled auto-merge (squash) August 23, 2021 18:18
Member

epwalsh left a comment

LGTM!

dirkgr merged commit 5dc80a6 into main Aug 23, 2021
dirkgr deleted the TangoBigData branch August 23, 2021 19:24

2 participants