strizhechenko/uuid05


UUID05 - compact, human-readable, almost unique identifiers for temporary objects in small non-synchronizing distributed systems. Inspired by nanoid and the mktemp utility.

Well, it's not really unique (that's why 0.5): collisions are possible, but their probability is low, which is probably acceptable for temporary objects.

The library has zero dependencies (well, you need python3 and pip) and provides a UUID05 class based on int with two additional methods:

  • make() - sort of constructor.
  • as_b64() -> str - alternative compact representation.
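To illustrate what a compact base64 representation of an integer can look like, here is a minimal sketch. It assumes the value is packed into the fewest big-endian bytes and encoded with the URL-safe alphabet without padding; `int_to_compact_b64` is a hypothetical helper, not part of the library, and the library's own implementation may differ. Under these assumptions it reproduces several values shown in this README.

```python
import base64


def int_to_compact_b64(value: int) -> str:
    """Encode a non-negative int as unpadded URL-safe base64 (hypothetical helper)."""
    if value < 0:
        raise ValueError('value must be non-negative')
    # Use the fewest whole bytes that can hold the value (at least one byte for 0).
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, 'big')
    # Strip '=' padding: it is redundant for an identifier suffix.
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')


print(int_to_compact_b64(123123123123))  # HKq1w7M
print(int_to_compact_b64(132400))        # AgUw
```

A 13-digit decimal value fits into 8 characters this way, which is where the length savings come from.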

Examples below explain how it works:

| TTL (seconds) | Workers | Worker ID | Example value | Max value | .as_b64(max_value) |
|---|---|---|---|---|---|
| 3600 (1 hour) | 2 | 1 | 9589 | 132400 | AgUw |
| 3600 | 4 | 2 | 29589 | 332400 | BRJw |
| 3600 | 10 | 4 | 49589 | 932400 | Djow |
| 3600 | 16 | 1 | 195893 | 15356400 | 6lHw |
| 3600 | 32 | 11 | 1195893 | 31356400 | Ad518A |
| 3600 | 256 | 53 | 539589397 | 25535996400 | BfIQYfA |
| 172800 (2 days) | 2 | 1 | 1449589 | 11555200 | sFGA |
| 172800 | 4 | 2 | 21449589 | 31555200 | AeF-gA |
| 172800 | 10 | 4 | 41449589 | 91555200 | BXUFgA |
| 172800 | 16 | 4 | 414495893 | 1517107200 | Wm04AA |
| 172800 | 32 | 14 | 1414495893 | 3117107200 | uctIAA |
| 172800 | 256 | 179 | 1791449589397 | 2551727827200 | AlIe1KkA |

Installing

pip3 install uuid05

Using

from uuid05 import UUID05

# basic usage
uid = UUID05.make()
suffix: str = uid.as_b64()
object_name: str = f'autotest_object_{suffix}'

# .make() may be parametrized by workers: int, ttl: int, precision: int
# defaults are: workers=10, ttl=2 days, precision=1
uid: int = UUID05.make(workers=16, ttl=86400, precision=6)  # 1419554951415
uid.as_b64()  # 'AUqEXzNy'

# you may also want to just shorten your existing integer identifiers.
suffix: str = UUID05(123123123123).as_b64()
assert suffix == 'HKq1w7M'

# How to get maximum UUID value/length with given params?
max_value = UUID05.max_value(machines=16, ttl=86400, precision=6)  # 1586399913600
assert len(str(max_value)) == 13
assert len(max_value.as_b64()) == 8  # AXFczaaA
assert len(f'autotest_object_{max_value.as_b64()}') == 24  # autotest_object_AXFczaaA

It can also be used as a command-line utility:

$ uuid05
61503153
$ uuid05 -w 2
1503125
$ uuid05 -t 3600 -w 2
27091
$ uuid05 -b -t 3600 -w 2
aZ8
$ uuid05 -b -w 2
FvN2
$ uuid05 -b
AxHktA
$ uuid05 -b -a '_-' -w 64
eB_5Yg
$ uuid05 --help

Where UUID05 is useful

In E2E/UI testing. It's slow, and sometimes you need to inspect the data created by tests after a run.

Or, more generally, in staging environments where you aren't sure that your testing system will delete its data after runs, but the tested system is aware of such data and deletes it after some time. There may also be multiple testing system instances running simultaneously, and you don't want them to affect each other.

You may also want identifiers to be memorable for at least 10-15 seconds while you switch tabs.

Oh, and you don't want to synchronize workers via network. Otherwise Redis, Memcached or another database with a single INCRementing counter would do the trick.

When UUID05 is useless

  • If your system isn't distributed, a local counter in memory or in a file will work better.
  • If your objects are persistent, you'd better use py-nanoid.
  • If you need to generate multiple UIDs for multiple objects really quickly:
    • generate one and reuse it, using a semantic or loop variable as a suffix;
    • well, there's a simple collision-prevention mechanism that allows small "bursts" of loops;
      • If you are after performance, disable it (uuid05.disable_cache = True) to save ~32% of the time.
    • pass the precision argument to uuid05(). It scales automatically with the worker count, but with fewer than 16 workers the default is 1, which means one UUID per 0.1 second; that is usually enough.
      • precision=3 will use milliseconds.
      • precision=6 will use microseconds.
    • if precision=6 is not enough, stop trying to make your identifier compact.
  • If you believe that semi-persistent data is a testing antipattern and that it should be cleared by the testing system before or after each run.
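The "generate one and reuse it" advice above can be sketched as follows. `batch_names` and its `prefix` parameter are hypothetical names introduced for illustration; only `UUID05.make()` and `.as_b64()` come from the library. The fallback value is a placeholder so the sketch also runs without the package installed.

```python
from typing import Iterable, List


def batch_names(base: str, labels: Iterable[object],
                prefix: str = 'autotest_object') -> List[str]:
    """Build one name per label, all sharing a single identifier `base`."""
    return [f'{prefix}_{base}_{label}' for label in labels]


if __name__ == '__main__':
    try:
        from uuid05 import UUID05  # the library described in this README
        base = UUID05.make().as_b64()
    except ImportError:
        base = 'AxHktA'  # placeholder so the sketch runs without the package
    # One generated identifier, many objects: the loop variable is the suffix.
    for name in batch_names(base, range(3)):
        print(name)
```

Only one identifier is generated per batch, so the burst of objects cannot collide with itself, and the shared base makes related objects easy to spot.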

Benchmarks

| Function | CPU Frequency (GHz) | Collision preventing mechanism enabled | Mean time, ns |
|---|---|---|---|
| UUID05.make() | 3.8 | True | 868 |
| UUID05.make() | 3.8 | False | 1300 |
| uuid.uuid4() | 3.8 | n/a | 1500 |
| UUID05.make() | 4.7 | True | 779 |
| UUID05.make() | 4.7 | False | 1070 |
| UUID05(1333).as_b64() | 3.8 | n/a | 596 |
| str(uuid.UUID) | 3.8 | n/a | 570 |

Development

  • Tests - doctests, may be run with pytest.
  • Benchmarks - use %timeit in IPython.
  • Documentation - look at the code, it's just two files.
  • Build - python3 -m build --sdist --wheel.
  • Release - twine check dist/uuid05-<version>* && twine upload dist/uuid05-<version>*.
  • Design - the UUID05 class doesn't try to copy the uuid module's methods, properties and behavior. Speed also isn't a goal.