An implementation of the ERC-time-ordered-distributable-database (TODD) as a generic library. It can be used to make data TODD-compliant to facilitate peer-to-peer distribution.
Status: prototype
To test out a new database design, where user participation makes the entire database more available.
Questions for you:
- Do you have data that grows over time and that you would like users to host?
- Are you providing data as a public good and wondering how to transition hosting to the community?
Min-know makes data into an append-only structure that anyone can publish to. Distribution happens like a print publication where users obtain Volumes as they are released. A user becomes a distributor too.
Volumes contain Chapters that can be obtained separately. This effectively divides the database, making large databases manageable for resource-constrained users.
📘🔍🐟
To make any database TODD-compliant so that data-users become data-providers.
TODD-compliance is about:
- Delivering a user the minimum knowledge that is useful to them.
- Delivering a user some extra data.
- Making it easy for a user to become a data provider for the next user.
A minnow is a small fish 🐟 that can be part of a larger collective.
Data is published in Volumes.
📘 - A Volume
Volumes are added over time:
📘 📘 📘 📘 📘 ... 📘 <--- 📘 - All Volumes (published so far).
Volumes have Chapters for specific content. Chapters can be obtained individually.
- 📘 An example volume with 256 Chapters
  - 📕 0x00 First Chapter (1st)
  - ...
  - ...
  - 📙 0xff Last Chapter (256th)
A Manifest 📜 exists that lists all Chapters for all Volumes. A manifest simply contains IPFS hashes for the data (see example manifests).
A user can check the manifest and find which Chapter is right for them. They can ignore the IPFS hashes that don't match their needs.
📜🔍🐟
The user starts with something they know (a key), for example, an address. For every key, only one Chapter will be important.
- User (🐟) key is an address: 0xf154...f00d.
- Data is divided into chapters using the first two hexadecimal characters of the address (Chapter = 0xf1).
Visually:
- 📕 0x00
- ...
- ...
- 📗 0xf1 <--- 🐟 0xf154...f00d (user only needs this Chapter)
- ...
- ...
- 📙 0xff
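The mapping from a key to its Chapter can be sketched in a few lines of Rust. This is only an illustration of the "first two hexadecimal characters" rule described above; the helper names `chapter_id` and `chapter_index` are hypothetical and are not taken from the Min-know API.

```rust
/// Returns the Chapter identifier (e.g. "0xf1") for a hex-encoded address.
/// Hypothetical helper for illustration; not part of the Min-know API.
fn chapter_id(address: &str) -> Option<String> {
    // Accept addresses with or without the "0x" prefix.
    let hex = address.strip_prefix("0x").unwrap_or(address);
    // The Chapter is chosen by the first two characters after the prefix.
    let prefix = hex.get(..2)?;
    if !prefix.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!("0x{}", prefix.to_ascii_lowercase()))
}

/// Returns the Chapter as a number in 0..=255 (two hex characters give 256 Chapters).
fn chapter_index(address: &str) -> Option<u8> {
    let id = chapter_id(address)?;
    // Skip the "0x" prefix and parse the remaining two hex digits.
    u8::from_str_radix(&id[2..], 16).ok()
}
```

Under these assumptions, any address that starts with 0xf1 maps to Chapter 0xf1 (index 241), and an input with fewer than two hexadecimal characters yields `None`.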
For every published Volume, the user only downloads the right Chapter for their needs.
The Min-know library automates this by using the CIDs in the manifest to find files on IPFS.
This means obtaining one Chapter from every Volume that has ever been published.
Hence, the user 🐟 only needs 1/256th of the entire database.
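As a rough sketch of that selection step, the code below walks a simplified manifest and keeps one CID per Volume for a single Chapter. The types `Manifest`, `VolumeEntry` and `ChapterEntry`, the function names, and the placeholder CID strings are all assumptions made for illustration; the real manifest format is shown in the example manifests, and the library performs this step itself.

```rust
/// Hypothetical, simplified manifest types (illustration only).
#[derive(Debug)]
struct ChapterEntry {
    chapter_id: String, // e.g. "0xf1"
    cid: String,        // IPFS content identifier (placeholder values below)
}

#[derive(Debug)]
struct VolumeEntry {
    volume_id: String,
    chapters: Vec<ChapterEntry>,
}

#[derive(Debug)]
struct Manifest {
    volumes: Vec<VolumeEntry>,
}

/// Collects (volume_id, cid) pairs for one Chapter across every Volume.
fn cids_for_chapter<'a>(manifest: &'a Manifest, chapter_id: &str) -> Vec<(&'a str, &'a str)> {
    let mut wanted = Vec::new();
    for volume in &manifest.volumes {
        for chapter in &volume.chapters {
            if chapter.chapter_id == chapter_id {
                wanted.push((volume.volume_id.as_str(), chapter.cid.as_str()));
            }
        }
    }
    wanted
}

/// Builds a fake manifest with 256 Chapters per Volume and made-up CIDs.
fn sample_manifest(volume_count: usize) -> Manifest {
    let volumes = (0..volume_count)
        .map(|v| VolumeEntry {
            volume_id: format!("volume_{v}"),
            chapters: (0..=255u8)
                .map(|c| ChapterEntry {
                    chapter_id: format!("0x{c:02x}"),
                    cid: format!("placeholder-cid-{v}-{c:02x}"),
                })
                .collect(),
        })
        .collect();
    Manifest { volumes }
}
```

With three sample Volumes the manifest lists 768 Chapter files, and `cids_for_chapter(&manifest, "0xf1")` returns three of them, one per Volume, which is the 1/256 share described above.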
Once downloaded, the Chapters can be queried for useful information that the database contains.
Optionally, they can also pin their Chapters to IPFS, which makes the data available from more sources.
Interaction with the library occurs via the Todd struct ([database::types::Todd]) through the methods:
- For users:
  - obtain_relevant_data()
  - check_completeness()
  - find()
- For maintainers:
  - full_transformation()
  - extend()
  - repair_from_raw()
  - generate_manifest()
  - manifest()
See ./ARCHITECTURE.md for how this library is structured.
All examples can be seen with the following command:
cargo run --example
See ./examples/README.md for more information.
See ./DATABASES.md for different databases that have been implemented in this library.
The maintainer methods in the examples are used to create and extend a TODD-compliant database.
This requires having a local "raw" source, which will be different for every data type. The library will use the methods in the ./extraction module to convert the data.
For example:
- The address-appearance-index is created and maintained by having locally available Unchained Index chunk files (produced by trueblocks-core: https://github.com/TrueBlocks/trueblocks-core). They are parsed and reorganised to form the TODD-compliant format.
- The nametags database is created and maintained by having individual files (one per address) that contain JSON-encoded names and tags.
Other raw formats might be flat files containing data of various kinds.
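To give a feel for what such a conversion involves, here is a minimal sketch that buckets raw (address, value) records into Chapters by address prefix. It is not the code in the ./extraction module: the function `group_by_chapter`, the `Chapter` type alias and the record shape are assumptions for illustration, and the sketch ignores how records are assigned to Volumes over time.

```rust
use std::collections::BTreeMap;

/// One Chapter's worth of records as (address, value) pairs (hypothetical shape).
type Chapter = Vec<(String, String)>;

/// Groups raw records into Chapters keyed by "0x" plus the first two hex characters.
fn group_by_chapter(raw: &[(&str, &str)]) -> BTreeMap<String, Chapter> {
    let mut chapters: BTreeMap<String, Chapter> = BTreeMap::new();
    for &(address, value) in raw {
        let hex = address.strip_prefix("0x").unwrap_or(address);
        // Skip records whose key is too short or is not hexadecimal.
        let prefix = match hex.get(..2) {
            Some(p) if p.chars().all(|c| c.is_ascii_hexdigit()) => p.to_ascii_lowercase(),
            _ => continue,
        };
        chapters
            .entry(format!("0x{prefix}"))
            .or_default()
            .push((address.to_string(), value.to_string()));
    }
    chapters
}
```

A `BTreeMap` is used here so that Chapters come out in sorted order (0x00 first), which keeps the output deterministic between runs; that is a design choice of this sketch rather than a requirement stated by the library.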