sqlite schema #379

Open
edsu opened this issue Feb 16, 2021 · 1 comment
@edsu
Member

edsu commented Feb 16, 2021

Part of the proposed functionality for twarc's utilities is to let users convert collected JSON and store it in a SQLite database. The goal would be to allow v1 and v2 JSON to be written to a single representation and then to have various utilities use that rather than the raw JSON.

If we are going to go down this route it might be useful to sketch out what this schema might look like. We'll want to balance the benefits of SQL (being able to query tweet properties) while storing some data which doesn't need to be queried directly as a JSON blob? It might be useful to look at @simonw's twitter-to-sqlite package as a reference and a possible integration point.
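As a rough illustration of that balance, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `tweets`, the choice of promoted columns, and the `store_tweet` helper are all assumptions made for this example, not a proposed twarc schema: a few properties become real columns, and the full tweet is kept alongside them as a JSON blob.

```python
import json
import sqlite3

# Hypothetical schema: a few queryable columns plus the whole tweet as JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    id         TEXT PRIMARY KEY,
    created_at TEXT,
    author_id  TEXT,
    text       TEXT,
    raw_json   TEXT NOT NULL
);
"""


def store_tweet(conn, tweet):
    """Insert one tweet dict, promoting a few fields to columns."""
    conn.execute(
        "INSERT OR REPLACE INTO tweets (id, created_at, author_id, text, raw_json) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            tweet["id"],
            tweet.get("created_at"),
            tweet.get("author_id"),
            tweet.get("text"),
            json.dumps(tweet),
        ),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample record, shaped like a v2 tweet object.
sample = {
    "id": "1",
    "created_at": "2021-02-16T00:00:00.000Z",
    "author_id": "42",
    "text": "hello",
    "lang": "en",
}
store_tweet(conn, sample)
conn.commit()

# Query on a promoted column; anything else is still available in raw_json.
row = conn.execute(
    "SELECT author_id, raw_json FROM tweets WHERE id = ?", ("1",)
).fetchone()
```

Fields that turn out to need querying later (here `lang`) could be promoted to columns by re-reading `raw_json`, without going back to the original files.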

@igorbrigadir
Contributor

Thinking out loud:

The goal would be to allow v1 and v2 JSON to be written to a single representation

This should be possible - I always wanted to do a v2 -> v1.1 and v1.1 -> v2 converter. A bunch of old tools would suddenly work again.
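A full converter is a bigger job, but the core idea can be sketched by mapping both formats onto a handful of shared fields. The `normalize` function and the four-field target shape below are assumptions for illustration only. It relies on v1.1 tweets carrying `id_str`, `user.id_str`, and `full_text` (or `text`), and on v2 tweet objects carrying `id`, `author_id`, `text`, and an ISO 8601 `created_at`.

```python
from datetime import datetime, timezone

# v1.1 timestamps look like "Tue Feb 16 12:00:00 +0000 2021".
V1_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def normalize(tweet):
    """Map a single v1.1 or v2 tweet dict onto one small common shape (hypothetical)."""
    if "id_str" in tweet:
        # v1.1: author is nested under "user", text may be "full_text" or "text".
        created = datetime.strptime(tweet["created_at"], V1_DATE_FORMAT)
        author_id = tweet["user"]["id_str"]
        tweet_id = tweet["id_str"]
        text = tweet.get("full_text", tweet.get("text"))
    else:
        # v2: flat object; "author_id" is only present if that field was requested.
        created = datetime.fromisoformat(tweet["created_at"].replace("Z", "+00:00"))
        author_id = tweet.get("author_id")
        tweet_id = tweet["id"]
        text = tweet.get("text")
    return {
        "id": tweet_id,
        "created_at": created.astimezone(timezone.utc).isoformat(),
        "author_id": author_id,
        "text": text,
    }


# Made-up sample records describing the same tweet in both formats.
v1_tweet = {
    "id_str": "1",
    "created_at": "Tue Feb 16 12:00:00 +0000 2021",
    "full_text": "hello",
    "user": {"id_str": "42"},
}
v2_tweet = {
    "id": "1",
    "created_at": "2021-02-16T12:00:00.000Z",
    "author_id": "42",
    "text": "hello",
}
```

With these samples, `normalize(v1_tweet)` and `normalize(v2_tweet)` produce the same dict, which is the property a single representation would need.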

We'll want to balance the benefits of SQL (being able to query tweet properties) while storing some data which doesn't need to be queried directly as a JSON blob?

I have some unfinished thoughts on this: I'm very much in favor of having a structured set of SQL tables for querying data. However, I also think we should store the original, timestamped API responses as an "event log". This log of requests can be parsed and re-parsed into various formats later (this comes from my own past experience, so I don't know how useful it is for others).
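To make the event log idea concrete, here is a minimal sketch, again with Python's `sqlite3`. The `responses` table, its columns, the `log_response` and `replay` helpers, and the endpoint labels are all hypothetical names chosen for this example. Each raw response is appended with a timestamp and never modified, so structured tables can be rebuilt from it at any time.

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE responses (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        fetched_at TEXT NOT NULL,  -- when the response was received (UTC, ISO 8601)
        endpoint   TEXT NOT NULL,  -- free-form label for the API call (assumed)
        body       TEXT NOT NULL   -- the response exactly as parsed, stored as JSON text
    )
    """
)


def log_response(conn, endpoint, body):
    """Append one API response to the log."""
    conn.execute(
        "INSERT INTO responses (fetched_at, endpoint, body) VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), endpoint, json.dumps(body)),
    )


def replay(conn):
    """Yield logged responses in the order they were recorded."""
    query = "SELECT fetched_at, endpoint, body FROM responses ORDER BY id"
    for fetched_at, endpoint, body in conn.execute(query):
        yield fetched_at, endpoint, json.loads(body)


# Made-up responses standing in for real API output.
log_response(conn, "search", {"data": [{"id": "1", "text": "hello"}]})
log_response(conn, "search", {"data": [{"id": "2", "text": "world"}]})
conn.commit()

# Re-parse the log into whatever shape is needed later.
tweet_ids = [t["id"] for _, _, body in replay(conn) for t in body["data"]]
```

The trade-off is storage: the log duplicates everything the structured tables hold, in exchange for being able to change the schema without re-collecting data.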
