Implement functions for working with binary keyvalues #3

rossengeorgiev · 2015-11-23T23:19:25Z

Those are used in richpresence and other places, would be useful for the steam module eventually.

implement binary dump/load functions
verify against real data from steam
document usage and type hinting in README

The text was updated successfully, but these errors were encountered:

SleepProgger · 2016-03-11T23:44:48Z

Also used in the PICSProductInfoResponse for the package data.
I'll port the version pysteamkit uses and submit a pull request in a bit.

SleepProgger · 2016-03-12T04:24:14Z

I finished the first version of the parse_binary function.
826a5d2
It works so far for every package data i tested it with.
Some test data (base64 encoded) at http:https://pastebin.com/mf16sHQh .
Let me know what you think.

Some notes:

node-steam.js uses an int32 for the Color and Pointer types. I guess we should do the same. https://github.com/seishun/node-steam/blob/master/lib/VDF.js At the moment i raise an exception on those types.
There seem to be an Widestring type but no implementation i saw used it, so i guess we can assume it isn't used widely in steam.
pysteamkit (and it seems node-steam.js, too) is using the native byte order. I am not sure if that is correct and steam knows about the byte order of the client(?). I used explicit little endian, that looks correct, but we should check if it is.
_read_till_seperator is a bit inefficient and we could improve the performance by caching all separator indices we find when reading a chunk, but i am not sure the increased complexity is really worth the performance benefit.

Additionally, i am not sure if creating the load_binary and loads_binary functions or just supplying an optional binary parameter to load/s would be better..

Edit:
Added experimental version of dump_binary.
b1f96ab
It needs the scalars in the form (data_type, value).
IMHO, that is a bit strange, but we NEED to know as which type to save the data (esp numeric values).
Additionally there is en/decoding missing, plus the same types as in the parse_binary function are missing (BIN_POINTER, BIN_WIDESTRING, BIN_COLOR).

rossengeorgiev · 2016-03-12T20:06:48Z

Let's just deal with bytes strings. Then we can use indexing and slicing to parse it. I suggest find method for slicing out bin_string.

>>> x = b'bbb\x00aaaa\x00'
>>> x[0: x.find(b'\x00', 0)]
b'bbb'
>>> x[4: x.find(b'\x00', 4)]
b'aaaa'

If we want to make load/dump symmetric then the only way is to use objects for point, widestring, and color.

rossengeorgiev · 2016-03-13T23:43:00Z

I think that covers everything. I haven't tested it on any real data yet.

SleepProgger · 2016-03-14T01:23:57Z

I am almost finished with my version, too. IMHO it is a bit cleaner.
https://github.com/SleepProgger/vdf/blob/parse_binary_impl/vdf/vdf_binary.py

Changes:

I am using some type wrapper now to contain the types (int, uint, ...). This allows us to reserialize data pretty nicely.
There was a bug in _dump_gen_binary that is now fixed.
I changed Pointer and Color to an UINT32, because that is the most reasonable format for them i guess.
Everything works now with python 2.7 and 3 (at least the two versions i tested)
- I moved all the stuff to another file, to not clutter the vdf namespace

So far i tested it with all the packages i could get from steam, and deserializeing them and serializing them again returned the correct results.
The problem is, so far all the data i tested was of the types String and int32. I am looking for other testdata (especially with widestrings in it).

If you want i could merge your changes with mine ?

rossengeorgiev · 2016-03-14T02:04:34Z

Did you see my implementation? It's feature complete and fully tested.

Here is quick perf comparison.

In [1]: import vdf, vdf_binary
In [2]: raw = vdf.binary_dumps(dict(map(lambda x: (str(x), x),
range(5000))))
In [3]: %timeit vdf.binary_loads(raw)
100 loops, best of 3: 15.1 ms per loop
In [4]: %timeit vdf_binary.parse(raw)
10 loops, best of 3: 54.1 ms per loop
In [5]: raw = vdf.binary_dumps(dict(map(lambda x: (str(x), str(x)*100),
range(5000))))
In [6]: %timeit vdf.binary_loads(raw)
100 loops, best of 3: 15 ms per loop
In [7]: %timeit vdf_binary.parse(raw)
10 loops, best of 3: 103 ms per loop

SleepProgger · 2016-03-14T02:34:54Z

Well, your parsing version doesn't support streams (and therefore is ofc faster with memory data ;)), but i can live with that.
After rereading your code, they are pretty similar in most cases.

Some stuff i noticed:

You declared the struct instances, but aren't using them in _binary_dump_gen (~ int32.pack(value))
I guess POINTER and COLOR should be unsigned ints (at least for pointer that sounds pretty right)
- IMHO it would be wise to conserve if the value was a WIDESTRING, too just in case.

SleepProgger · 2016-03-14T09:29:43Z

Good news everybody,
I tested your implementation against all current existing productInfo i could get (about 3000 objects), and everything went well, and all data is correctly parsed and dumped ( except the minor issue mentioned in #5 ).
All the test data can be found here. I will update that repo with other files (from richpresence and other sources which may exist)

Every single binary vdf blob i tested only contained int32 and string types btw.

rossengeorgiev added the enhancement label Nov 23, 2015

rossengeorgiev added a commit that referenced this issue Mar 13, 2016

#3; complete implementation for binary vdf

483a6a9

rossengeorgiev closed this as completed Apr 10, 2016

Cohedrin mentioned this issue Feb 13, 2017

fix(parsing): handle edge case of blank strings as keys DoctorMcKay/node-binarykvparser#1

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement functions for working with binary keyvalues #3

Implement functions for working with binary keyvalues #3

rossengeorgiev commented Nov 23, 2015

SleepProgger commented Mar 11, 2016

SleepProgger commented Mar 12, 2016

rossengeorgiev commented Mar 12, 2016

rossengeorgiev commented Mar 13, 2016

SleepProgger commented Mar 14, 2016

rossengeorgiev commented Mar 14, 2016

SleepProgger commented Mar 14, 2016

SleepProgger commented Mar 14, 2016

Implement functions for working with binary keyvalues #3

Implement functions for working with binary keyvalues #3

Comments

rossengeorgiev commented Nov 23, 2015

SleepProgger commented Mar 11, 2016

SleepProgger commented Mar 12, 2016

rossengeorgiev commented Mar 12, 2016

rossengeorgiev commented Mar 13, 2016

SleepProgger commented Mar 14, 2016

rossengeorgiev commented Mar 14, 2016

SleepProgger commented Mar 14, 2016

SleepProgger commented Mar 14, 2016