type-id v3 #1

Merged
merged 3 commits into from
Apr 12, 2024
add new tests
conradludgate committed Apr 12, 2024
commit 954a94bca86d5bbc07d0ef870ae3b726a79088c2
53 changes: 33 additions & 20 deletions tests/spec/README.md
@@ -1,6 +1,7 @@
# TypeID Specification (Version 0.2.0)
# TypeID Specification (Version 0.3.0)

## Overview

TypeIDs are a type-safe extension of UUIDv7: they encode UUIDs in base32 and add a type prefix.

Here's an example of a TypeID of type `user`:
@@ -16,39 +17,49 @@ This document formalizes the specification for TypeIDs.
## Specification

A typeid consists of three parts:

1. A **type prefix**: a string denoting the type of the ID. The prefix should be
at most 63 characters in all lowercase ASCII [a-z].
1. A **separator**: an underscore '_' character.
at most 63 characters in all lowercase snake_case ASCII `[a-z_]`.
1. A **separator**: an underscore `_` character. The separator is omitted if the prefix is empty.
1. A **UUID suffix**: a 128-bit UUIDv7 encoded as a 26-character string in base32.

### Type Prefix
A type prefix is a string denoting the type of the ID. The prefix should be at most
63 characters in all lowercase ASCII [a-z]. Valid prefixes should match the following
regex: `[a-z]{0,63}`.

The empty string is a valid prefix, it's there for very specific use cases in which
applications need to encode a typeid but elide the type information. In general though,
applications should use a prefix that is at least 3 characters long.
A type prefix is a string denoting the type of the ID.
The prefix:

- Must contain at most 63 characters.
- May be empty.
- If not empty:
  * Must contain only lowercase alphabetic ASCII characters `[a-z]`, or an underscore `_`.
  * Must start and end with an alphabetic character `[a-z]`. Underscores are not allowed at the beginning or end of the string.

> Note: [There's a proposal](https://github.com/jetpack-io/typeid/issues/7) to add `_` as
> an allowed separator within type prefixes.
Valid prefixes match the following
regex: `^([a-z]([a-z_]*[a-z])?)?$`.

The empty string is a valid prefix; it exists for use cases in which
applications need to encode a typeid but elide the type information. In general, though,
applications SHOULD use a prefix that is at least 3 characters long.
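
The prefix rules above can be checked with a few lines of code. The following is a minimal Python sketch, not part of any reference implementation: the names `PREFIX_RE`, `MAX_PREFIX_LEN`, and `is_valid_prefix` are hypothetical, and the regex is the one quoted in this section.

```python
import re

# Regex copied from the spec text. With fullmatch() the ^ and $ anchors are
# redundant, but they are kept so the pattern matches the spec verbatim.
PREFIX_RE = re.compile(r"^([a-z]([a-z_]*[a-z])?)?$")

# Length limit stated in the spec.
MAX_PREFIX_LEN = 63


def is_valid_prefix(prefix: str) -> bool:
    """Return True if `prefix` satisfies the v0.3.0 prefix rules."""
    if len(prefix) > MAX_PREFIX_LEN:
        return False
    return PREFIX_RE.fullmatch(prefix) is not None
```

Note that the regex alone does not enforce the 63-character limit, so the length check is done separately.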

### Separator

The separator is a single underscore character `_`. If the prefix is empty, the separator
is omitted.
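
Because prefixes may now contain underscores, a parser cannot split on the first `_`. One way to handle this, sketched below in Python, is to split on the last underscore. The function name `split_typeid` is hypothetical, and rejecting a leading separator with an empty prefix is this sketch's reading of "the separator is omitted if the prefix is empty".

```python
from typing import Tuple


def split_typeid(typeid: str) -> Tuple[str, str]:
    """Split a TypeID string into (prefix, suffix) without validating either part."""
    # rpartition splits on the LAST underscore, so "pre_fix_<suffix>" yields
    # the prefix "pre_fix". With no underscore at all, the prefix is "".
    prefix, sep, suffix = typeid.rpartition("_")
    if sep and not prefix:
        # A separator is present but nothing precedes it.
        raise ValueError("separator must be omitted when the prefix is empty")
    return prefix, suffix
```

The prefix and suffix returned here still need to be validated separately; for example, `prefix__<suffix>` splits into the prefix `prefix_`, which the prefix rules then reject.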

### UUID Suffix

The UUID suffix encodes exactly 128-bits of data in 26 characters. It uses the base32
encoding described below.

#### Base32 Encoding

Bytes from the UUID are encoded from left to right. Two zeroed bits are prepended
to the 128 bits of the UUID, resulting in 130 bits of data. The 130 bits are then
split into 5-bit chunks, and each chunk is encoded as a single character in the
base32 alphabet, resulting in a total of 26 characters.

In practice this is most often done by using bit-shifting and a lookup table. See
the [reference implementation encoding](https://github.com/jetpack-io/typeid-go/blob/main/base32/base32.go)
the [reference implementation encoding](https://github.com/jetify-com/typeid-go/blob/main/base32/base32.go)
for an example.
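
As an illustration of the bit layout described above, here is a minimal Python sketch. It is not the reference implementation: the name `encode_suffix` is hypothetical, and it works on the UUID as a single 128-bit integer rather than byte by byte, which produces the same 5-bit chunks.

```python
import uuid

# Alphabet from the spec: digits, then lowercase letters without i, l, o, u.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def encode_suffix(u: uuid.UUID) -> str:
    """Encode a UUID as a 26-character TypeID suffix."""
    value = u.int  # 128-bit integer; the two prepended zero bits are implicit
    chars = []
    # 26 chunks of 5 bits cover 130 bits. The first shift (125) leaves only
    # the top 3 bits of the UUID, so the first symbol is always 0 through 7.
    for shift in range(125, -1, -5):
        chars.append(ALPHABET[(value >> shift) & 0x1F])
    return "".join(chars)
```

For the UUID `01890a5d-ac96-774b-bcce-b302099a8057` from `valid.yml`, this yields `01h455vb4pex5vsknk084sn02q`.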

Note that this is different from the standard base32 encoding which encodes in
@@ -58,7 +69,7 @@ The encoding uses the following alphabet `0123456789abcdefghjkmnpqrstvwxyz` as
specified by the following table:

| Value | Symbol | Value | Symbol | Value | Symbol | Value | Symbol |
|-------|--------|-------|--------|-------|--------|-------|--------|
| ----- | ------ | ----- | ------ | ----- | ------ | ----- | ------ |
| 0 | 0 | 8 | 8 | 16 | g | 24 | r |
| 1 | 1 | 9 | 9 | 17 | h | 25 | s |
| 2 | 2 | 10 | a | 18 | j | 26 | t |
@@ -78,28 +89,30 @@ is `7zzzzzzzzzzzzzzzzzzzzzzzzz`. Implementations should reject any suffix greater than
that value, by checking that the first character is a `7` or less.
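
The overflow rule above fits naturally into a decoder. The following Python sketch is one possible shape, with the hypothetical names `DECODE` and `decode_suffix`; it assumes the 26-character length and the lowercase alphabet given earlier in this spec.

```python
import uuid

ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"
# Reverse lookup table: symbol -> 5-bit value.
DECODE = {symbol: value for value, symbol in enumerate(ALPHABET)}


def decode_suffix(suffix: str) -> uuid.UUID:
    """Decode a 26-character TypeID suffix into a UUID, or raise ValueError."""
    if len(suffix) != 26:
        raise ValueError("suffix must be exactly 26 characters")
    if any(symbol not in DECODE for symbol in suffix):
        raise ValueError("suffix contains a character outside the alphabet")
    # 26 symbols carry 130 bits; only 128 fit in a UUID, so the first symbol
    # must have its two high bits clear, i.e. be '7' or less.
    if DECODE[suffix[0]] > 7:
        raise ValueError("suffix overflows 128 bits")
    value = 0
    for symbol in suffix:
        value = (value << 5) | DECODE[symbol]
    return uuid.UUID(int=value)
```

Checking the first character before accumulating bits means the integer passed to `uuid.UUID` is always within range.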

#### Compatibility with UUID

When generating a new TypeID, the generated UUID suffix MUST decode to a valid UUIDv7.

Implementations MAY allow encoding/decoding of other UUID variants when the
Implementations SHOULD allow encoding/decoding of other UUID variants when the
bits are provided by end users. This makes it possible for applications to encode
other UUID variants like UUIDv1 or UUIDv4 at their discretion.
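
To make the distinction between generated and user-supplied UUIDs concrete, a library could check the version bits only on the generation path. The sketch below is an assumption about how such a check might look in Python; `is_uuid_v7` is a hypothetical helper built on the standard `uuid` module.

```python
import uuid


def is_uuid_v7(u: uuid.UUID) -> bool:
    """Return True if the variant and version bits mark `u` as a UUIDv7."""
    # UUID.version is only meaningful for the RFC 4122 variant; for other
    # variants Python reports None, which the comparison below rejects.
    return u.variant == uuid.RFC_4122 and u.version == 7
```

A decoder would not call this check, since user-provided UUIDv1 or UUIDv4 values are still expected to round-trip.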

## Versioning

This spec uses semantic versioning: `MAJOR.MINOR.PATCH`. The version is incremented
when the spec changes in a way that is not backwards compatible.

Libraries that implement this spec should also use semantic versioning, and their
MAJOR and MINOR versions should match the version of the spec they implement.
The PATCH version is up to the discretion of the library author.
Libraries that implement this spec should also use semantic versioning.

## Validating Implementations

To assist library authors in validating their implementations, we provide:
+ A reference implementation in [Go](https://github.com/jetpack-io/typeid-go)

- A [reference implementation in Go](https://github.com/jetify-com/typeid-go)
with extensive testing.
+ A [valid.yml](valid.yml) file containing a list of valid typeids along
- A [valid.yml](valid.yml) file containing a list of valid typeids along
with their corresponding decoded UUIDs. For convenience, we also provide
a [valid.json](valid.json) file containing the same data in JSON format.
+ An [invalid.yml](invalid.yml) file containing a list of strings that are
- An [invalid.yml](invalid.yml) file containing a list of strings that are
invalid typeids and should fail to parse/decode. For convenience, we also
provide an [invalid.json](invalid.json) file containing the same data in
JSON format.
21 changes: 16 additions & 5 deletions tests/spec/invalid.yml
@@ -4,7 +4,7 @@
# Each example contains an invalid TypeID string. Implementations are expected
# to throw an error when attempting to parse/validate these strings.
#
# Last updated: 2023-07-05
# Last updated: 2024-04-10 (for version 0.3.0 of the spec)

- name: prefix-uppercase
  typeid: "PREFIX_00000000000000000000000000"
@@ -18,9 +18,10 @@
  typeid: "pre.fix_00000000000000000000000000"
  description: "The prefix can't have symbols, it needs to be alphabetic"

- name: prefix-underscore
  typeid: "pre_fix_00000000000000000000000000"
  description: "The prefix can't have symbols, it needs to be alphabetic"
# Test removed in v0.3.0 – we now allow underscores in the prefix
# - name: prefix-underscore
# typeid: "pre_fix_00000000000000000000000000"
# description: "The prefix can't have symbols, it needs to be alphabetic"

- name: prefix-non-ascii
  typeid: "préfix_00000000000000000000000000"
@@ -85,4 +86,14 @@
- name: suffix-overflow
  # This is the first suffix that overflows into 129 bits
  typeid: "prefix_8zzzzzzzzzzzzzzzzzzzzzzzzz"
  description: "The should encode at most 128-bits"
  description: "The suffix should encode at most 128-bits"

# Tests below were added in v0.3.0 when we started allowing '_' within the
# type prefix.
- name: prefix-underscore-start
  typeid: "_prefix_00000000000000000000000000"
  description: "The prefix can't start with an underscore"

- name: prefix-underscore-end
  typeid: "prefix__00000000000000000000000000"
  description: "The prefix can't end with an underscore"
9 changes: 8 additions & 1 deletion tests/spec/valid.yml
@@ -23,7 +23,7 @@
# note that not all of them are UUIDv7s. When *generating* new random typeids,
# implementations should always use UUIDv7s.
#
# Last updated: 2023-07-05
# Last updated: 2024-04-10 (for version 0.3.0 of the spec)

- name: nil
  typeid: "00000000000000000000000000"
@@ -64,3 +64,10 @@
  typeid: "prefix_01h455vb4pex5vsknk084sn02q"
  prefix: "prefix"
  uuid: "01890a5d-ac96-774b-bcce-b302099a8057"

# Tests below were added in v0.3.0 when we started allowing '_' within the
# type prefix.
- name: prefix-underscore
  typeid: "pre_fix_00000000000000000000000000"
  prefix: "pre_fix"
  uuid: "00000000-0000-0000-0000-000000000000"
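
A library author could drive these fixtures through a small harness. The Python sketch below is illustrative only: `parse_typeid` and `check_cases` are hypothetical names, the parser is a compact stand-in for a real implementation, and the cases are a handful copied inline from the hunks shown in this diff rather than loaded from `valid.yml` and `invalid.yml`.

```python
import re
import uuid
from typing import List, Tuple

ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"
PREFIX_RE = re.compile(r"([a-z]([a-z_]*[a-z])?)?")


def parse_typeid(typeid: str) -> Tuple[str, uuid.UUID]:
    """Stand-in parser: return (prefix, UUID) or raise ValueError."""
    prefix, sep, suffix = typeid.rpartition("_")  # split on the last underscore
    if sep and not prefix:
        raise ValueError("separator present but prefix is empty")
    if len(prefix) > 63 or PREFIX_RE.fullmatch(prefix) is None:
        raise ValueError("invalid prefix")
    if len(suffix) != 26 or any(c not in ALPHABET for c in suffix):
        raise ValueError("invalid suffix")
    if suffix[0] > "7":
        raise ValueError("suffix overflows 128 bits")
    value = 0
    for c in suffix:
        value = (value << 5) | ALPHABET.index(c)
    return prefix, uuid.UUID(int=value)


ZEROS = "0" * 26  # the all-zero suffix used by several fixtures

VALID = [
    {"typeid": "prefix_01h455vb4pex5vsknk084sn02q", "prefix": "prefix",
     "uuid": "01890a5d-ac96-774b-bcce-b302099a8057"},
    {"typeid": "pre_fix_" + ZEROS, "prefix": "pre_fix",
     "uuid": "00000000-0000-0000-0000-000000000000"},
]

INVALID = [
    "PREFIX_" + ZEROS,
    "pre.fix_" + ZEROS,
    "préfix_" + ZEROS,
    "prefix_8" + "z" * 25,
    "_prefix_" + ZEROS,
    "prefix__" + ZEROS,
]


def check_cases(valid: list, invalid: list) -> List[str]:
    """Return the typeid strings whose behaviour does not match the fixtures."""
    failures = []
    for case in valid:
        try:
            prefix, parsed = parse_typeid(case["typeid"])
        except ValueError:
            failures.append(case["typeid"])
            continue
        if prefix != case["prefix"] or parsed != uuid.UUID(case["uuid"]):
            failures.append(case["typeid"])
    for typeid in invalid:
        try:
            parse_typeid(typeid)
        except ValueError:
            continue  # rejected, as the fixture expects
        failures.append(typeid)
    return failures
```

An empty list from `check_cases(VALID, INVALID)` means every listed case behaved as the fixtures expect. In a real test suite the two lists would come from the full fixture files.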