Speedup channel commits using recursive SQL queries #5469
This is a known downside of the current implementation; it has been discussed in #3615 (comment) |
With recursive SQL queries, the select + sort time of 700K torrents takes about 25 seconds. https://www.sqlite.org/lang_with.html The problem is, the collection node queries a big dataset, does pythonic recursive sorts (factorial penalty!), puts them in temp data structures (messes up the RAM), and sorts them several times (CPU penalty); this is wrong in so many dimensions, and i am not even talking about the ORM overhead. |
Also data should be traversed over generators |
I did a recursive SQL implementation of commits half a year ago. Indeed, the performance becomes radically better. ORM does not hurt the recursive SQL implementation, as it is just that - a single custom SQL query. Also, I got an alternative, pure SQLite->SQLite (SQLite Hopefully, I'll be able to finish the stuff for 7.6. |
@hbiyik , the metadata DB cannot be used with unmodified Tribler source to re-commit the private channel w/o the corresponding private key, BTW. |
Please dont close this issue, mark it as an enhancement or something, but issue #3615 is discussing the difference between good and perfect. This issue is about the difference between terrible and OK. Currently it is not usable at all, and you dont even need to go to 100K torrents to see this. I am working on a way to improve that; below are some sql snippets to improve it. Recursive Selection SQL: 23 seconds, on an indexed id_ column (the index is currently missing).
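The snippet referred to here is not preserved in this thread. As a rough illustration of what a recursive selection over a channel tree can look like, here is a minimal sketch; the table and column names (ChannelNode, id_, origin_id) are assumptions and not the original query:

```python
import sqlite3

# Hypothetical recursive selection: walk the channel tree in SQLite with a
# recursive CTE instead of recursing in Python. Schema names are assumed.
RECURSIVE_SELECT = """
WITH RECURSIVE tree(row_id, id_, depth) AS (
    SELECT rowid, id_, 0 FROM ChannelNode WHERE origin_id = :channel_id
    UNION ALL
    SELECT c.rowid, c.id_, t.depth + 1
    FROM ChannelNode c JOIN tree t ON c.origin_id = t.id_
)
SELECT row_id FROM tree ORDER BY depth, id_
"""

def select_channel_tree(conn, channel_id):
    # An index on id_ (the one noted as missing above) is what keeps the JOIN cheap
    conn.execute("CREATE INDEX IF NOT EXISTS idx_channelnode_id ON ChannelNode(id_)")
    return conn.execute(RECURSIVE_SELECT, {"channel_id": channel_id}).fetchall()
```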
Orphan Clean Up Recursive SQL: 3 seconds
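The clean-up snippet is likewise missing from the thread; a sketch under the same assumed schema, which collects every node still reachable from the channel root and deletes the rest:

```python
# Hypothetical orphan clean-up. A real query would also restrict the DELETE to
# the channel's own public_key; that is omitted here for brevity.
ORPHAN_CLEANUP = """
WITH RECURSIVE reachable(id_) AS (
    SELECT id_ FROM ChannelNode WHERE origin_id = :channel_id
    UNION ALL
    SELECT c.id_ FROM ChannelNode c JOIN reachable r ON c.origin_id = r.id_
)
DELETE FROM ChannelNode
WHERE id_ != :channel_id
  AND id_ NOT IN (SELECT id_ FROM reachable)
"""

def cleanup_orphans(conn, channel_id):
    with conn:  # run the whole clean-up in one transaction
        conn.execute(ORPHAN_CLEANUP, {"channel_id": channel_id})
```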
Currently i am working on it and I may provide irregular feedback, due to the fact that i am not very available on weekdays. |
Unless it is something completely fool-proof and really simple, like 1-to-1 replacing of ORM queries with recursive SQL, this is going to go into 7.6. |
btw @arvidn @hbiyik 3-5 November or 17-19 November we would like to organise "TriblerCon": a small-scale physical meeting in the countryside with high-end Github access. A gathering of you and the 12 Delft people working on Tribler for detailed alignment, nerd talk, 🍺 and the roadmap for the coming year. My dream is to get the VLC founder and Kodi developers to attend future editions and merge it with our scientific workshop, https://dicg2020.github.io
Because of your high-quality work we would like to refund your airfare if you could possibly attend TriblerCon during these Corona times.
Looking forward to seeing "Recursive SQL" that reduces insert time. Note that the channel format will change at some point in the future. It needs to become easier to parse in real-time. |
@hbiyik , we will be really glad to see you at Delft in person! 👍 |
@synctext it is a pleasure for me to join such an event and get to know the members closely as well. It seems like KLM is already OK to deliver me from Germany despite the COVID crisis, and a free lecture from Prof. Pouwelse is a nice addition (even though i lose track after the 20th or 30th minute or so :) when he starts to talk about Ethics, Freedom and high moral grounds; i hope several beers can increase my mileage in this context), so count me in! @ichorid i am sure that replacing SQL queries wont help by itself, there will be blood for sure. All the database iteration should be synchronous with the rest of the operations, with generator iterators, to prevent ram torture. Thus i am working on a fork in my repo; when i build the confidence that this works without any side effects or regressions, we can discuss which branch to merge, meanwhile you can keep working on an actual solution with the attach addition and multiple DBs. You never know what you will get with the lifecycle management of a software anyways :). But I will need some support, especially since i am not very confident about some legacy definitions and old database formats. |
@synctext forgot to mention, I was also going to bring up that metadata possibility for each channel content and the channel itself, like the one there is in Kodi. So i am glad that it is somewhere in the back of your head. I dont want to sidetrack, but the only critical design clue could be providing a media id for the content, where the metadata is maintained by a thirdparty. For further details i would prefer to discuss offline, for the reasons you might guess :) |
count me in! (assuming the netherlands will accept a swede :) ) |
@hbiyik , I should warn you about the biggest problem with Channels stuff so far: reactor hogging. To circumvent the problem, I had to come up with a kind of a congestion control mechanism. It is not enough to just make the code fast. You need to make it asynchronous as well. |
hello again after a short silence. @ichorid, @qstokkink https://github.com/hbiyik/tribler/commit/47ac1b8a895760a79b4998d2849ad8b9e97bd225 You can find the proof of concept work in the above commit. Please note that it is not a working example at all, but an experiment to see the boundaries in terms of performance. I chose the consolidate_channel_torrent method to work with, to trigger cpu/mem heavy operations. This is not complete because i am not using orm entities; and yes, i have tried to make it work with pony, but i could not find a way to implement my design choices. If there is a way to do it in pony as well, that is also ok. Ideas under test:
TLDR: the above commit shows a PoC where it is possible to dump 300K recursive nodes to a single 65MB compressed and signed mdblob in 1.5 minutes, where 70% of this process can be quite async |
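To make the shape of that PoC concrete, here is a simplified sketch, not the code from the linked commit; serialize_row and sign are placeholders for the real metadata serializer and channel key:

```python
import lzma

def iter_rows(cursor, batch_size=1000):
    # Pull the recursive-query result lazily instead of fetchall(),
    # so 300K rows never sit in RAM at the same time
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            return
        yield from rows

def dump_mdblob(cursor, serialize_row, sign, path):
    # serialize_row(row) -> bytes and sign(payload) -> bytes are hypothetical
    # stand-ins for the real per-entry serialization and signing
    with lzma.open(path, "wb") as blob:
        for row in iter_rows(cursor):
            blob.write(sign(serialize_row(row)))
```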
Impressive job, @hbiyik ! We can almost put it into our codebase as it is. However, your implementation lacks two important details:
EDIT:
You don't have to rewrite the class (if you mean the ORM class) at all. You just need to be sure the SQL <-> Pony objects mapping/cache is flushed before running your direct SQL stuff. Then Pony can proceed to work as normal. |
Also, to test the correctness of your code you can try replacing the code referenced in tribler/src/tribler-core/tribler_core/modules/metadata_store/tests/test_channel_metadata.py, line 374 at 43a4cb0, and line 298 at 43a4cb0. These two tests thoroughly check all possible situations and content types that can be put into a channel. To visually control the results you can use the code at line 491 at 43a4cb0 in the same file. |
So the only way to do it is to instantiate the entity from_dict, with the row, directly from the result of cursor.fetchone(), because under the hood pony always uses fetchall() or fetchmany(). This will introduce a little bit of overhead (cpu+mem) and hopefully should be ok, or not, gotta test it. But if i understand you correctly on flushing the orm cache, on each iteration we need to run an ORM cache flush and maybe a manual GC collection after we are done with the entity instance. The concern i have is that GC in CPython only works when there is no further reference to this particular object. If the instance is somehow referenced somewhere in pony (through some caching mechanism) the RAM will still keep stacking up.
Yes, that part is rather easy i think, and the method here is similar to the selection but simpler: create a pivot table in the sqlite memory space, and apply this pivot table onto the actual table. Thus all the dirty stuff is handled on the sqlite side again. I expect this would not require more resources than the selection.
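A rough sketch of that idea, purely illustrative and with assumed table/column names: stage the changed rows in a temp table, then apply them to the main table inside one transaction.

```python
def apply_updates(conn, dirty_rows):
    # dirty_rows: iterable of (rowid, new_status) pairs produced elsewhere;
    # ChannelNode/status are hypothetical names for this sketch
    with conn:  # one transaction for the whole update
        conn.execute("CREATE TEMP TABLE IF NOT EXISTS pivot(rowid_ INTEGER PRIMARY KEY, status INTEGER)")
        conn.execute("DELETE FROM pivot")
        conn.executemany("INSERT INTO pivot VALUES (?, ?)", dirty_rows)
        # Correlated-subquery form works on any SQLite version
        conn.execute("""
            UPDATE ChannelNode
               SET status = (SELECT status FROM pivot WHERE pivot.rowid_ = ChannelNode.rowid)
             WHERE rowid IN (SELECT rowid_ FROM pivot)
        """)
```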
Yes, that part i think is a misconception on my end; i need to check in detail how this timestamping works in both the writer and reader directions, and the tests you mentioned would also help a lot. |
No, what I meant is you should not rely on Pony to fetch stuff at all, but instead access SQLite directly as you do. It should work like this (in pseudocode):
You don't need to flush the ORM cache on every iteration since you do not use Pony objects in your code. Instead, you must ensure that all ORM-induced changes are flushed to the DB before you start manipulating it directly.
Hi @hbiyik! I'm the author of PonyORM. @ichorid asked me to look at the code that you suggested regarding its compatibility with PonyORM. I don't see any problems regarding interaction with PonyORM. It is a good practice to rewrite heavy queries in raw SQL to avoid ORM overhead. Some small comments:
So, with the correct integration with PonyORM, the code should look like this:
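The snippet itself is not preserved in this thread; a minimal sketch of the pattern being described, assuming a Pony `db = Database()` bound to the metadata store (the query text and the ChannelNode/origin_id names are placeholders, not the actual Tribler query):

```python
from pony import orm

@orm.db_session
def commit_channel(db, channel_id):
    # 1. Make sure everything Pony has cached is written to SQLite
    #    before touching the database directly
    orm.flush()
    # 2. Run the heavy recursive query as raw SQL, bypassing ORM overhead
    rows = db.execute(
        "SELECT rowid FROM ChannelNode WHERE origin_id = $channel_id"
    ).fetchall()
    # 3. Normal ORM usage can resume afterwards within the same db_session
    return rows
```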
To me, the current code looks correct regarding SQL <-> ORM interaction. It already does what I have described above. |
@ichorid thanks for the tips, @kozlovsky thanks for taking the time to review and put the suggestions in a composed way. I think i now have an overall idea of how to do it without hurting the ORM.
Only, in the above statement, serialization/deserialization is a member of the entity classes; however, i think reimplementing this as an alternate path based on row tuples/dicts is the only solution. That should not be a very difficult task, but it needs to be tested thoroughly against weird pitfalls, i.e. in the poc commit there is no delete serialization because it is just a poc, and it is based on tuples; a dict-like object is a cleaner option, and the sqlite.Row factory has to be defined in the connection statement, since this is a connection-level configuration in _sqlite. https://github.com/python/cpython/blob/b146568dfcbcd7409c724f8917e4f77433dd56e4/Modules/_sqlite/cursor.c#L122 |
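For reference, setting the row factory on the connection (standard sqlite3 usage; the ChannelNode/title names are just the assumed schema from this thread) gives dict-like access to every row the cursors yield:

```python
import sqlite3

conn = sqlite3.connect("metadata.db")
conn.row_factory = sqlite3.Row  # connection-level setting, inherited by cursors

cursor = conn.execute("SELECT rowid, title FROM ChannelNode LIMIT 1")
row = cursor.fetchone()
if row is not None:
    print(row["title"])   # columns accessible by name
    print(dict(row))      # or converted to a plain dict
```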
Since working with direct rows is the direction, I designed an MDBlob class that serializes, signs, compresses and writes the mdblobs from rows, and also reads, deserializes and generates rows back from the mdblobs. Benefits:
CONS:
You can check the implementation in https://github.com/hbiyik/tribler/commit/67d93f303b954d6c6e0344646130518038d6d20b and run some tests at: Here are some results on my end; the numbers depend on the host cpu and your disk type. My case: Perf table
|
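To make the read direction concrete, here is a hypothetical sketch of the interface described above, not the code from the linked commit:

```python
import lzma

class MDBlobReader:
    """Streams deserialized rows back out of a compressed mdblob file."""

    def __init__(self, path, deserialize_entry):
        # deserialize_entry(stream) -> row dict, or None at end of stream;
        # it stands in for the real metadata payload parser
        self.path = path
        self.deserialize_entry = deserialize_entry

    def rows(self):
        # Generator: the caller pulls one row at a time, so a 65MB blob never
        # has to be expanded into a list of 300K objects
        with lzma.open(self.path, "rb") as blob:
            while True:
                row = self.deserialize_entry(blob)
                if row is None:
                    return
                yield row
```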
First of all, bravo for the performance 👏 Now, some food for thought: What you did is, you optimized the file part of mdblob processing for memory usage (with generators). However, the biggest bottleneck is not the file reading performance, but entry insertion into the main DB.
You optimized the Anyways, it would be great if, using your new routines, you would measure processing/committing some big channel that is already in the system. |
That is most definitely the best way to do this, but it also requires a redesign of the database, maybe multiple tables, maybe even multiple databases, and an almost complete rewrite of the channels. So i would be watching this approach closely as well. But again, my target here is to make things "good", not "perfect". And generally "perfect" is the best enemy of the "good", which is quite "bad". Let me elaborate the discussion a little bit.
The bottleneck when interfacing with the mdblob is on the CPU side, so making the IO calls to the disk async would have some merit, but not that much. The idea here is slicing the overall routine (with routine i mean DB -> some stuff -> FILE, or vice versa, as a whole) into sub-coroutines, so the long operation would not block the reactor and could be cancelled on demand, while also balancing the resources in the meanwhile (CPU - RAM - IO). Now the next and final step is updating the DB; for this I have several ideas, and it should be fine as long as I can access the correct locks/semaphores in Pony. Lets see how that will work; i am expecting similar performance to the recursive selection.
If i had such a thing i would put a PR already :) |
Generators in Python always run on the same thread where they were called. They do not use threading implicitly. Ergo, stuff in a generator will still block the reactor. Essentially, generators are just syntactic sugar to make up for Python's lack of pointers. Even more important, the reactor does not know when it can switch to another coroutine. The programmer must tell it when to do that - by using the await keyword.
Correct me if I'm wrong, but I see neither |
There is a simple way to test if the reactor is hogged: create a |
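As a small illustration of the point about generators versus coroutines (illustrative only, not Tribler code; process() is a placeholder for the CPU-heavy per-chunk step): iterating a plain generator never hands control back to the event loop, while an async variant can yield between chunks.

```python
import asyncio

def process(chunk):
    # Placeholder for the CPU-heavy per-chunk work (serialization, signing, ...)
    return sum(chunk)

def blocking_work(chunks):
    # Plain generator: consuming it never hands control back to the event loop
    for chunk in chunks:
        yield process(chunk)

async def cooperative_work(chunks):
    # The await lets the reactor run other tasks between chunks
    for chunk in chunks:
        result = process(chunk)
        await asyncio.sleep(0)  # explicit cooperative yield point
        yield result
```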
Yep, i wasnt very clear on that, I shouldnt write messages at midnight. The reason is that i did not make them async yet; my point about generators is that, in terms of synchronous programming, generators are the closest thing to coroutines in async, because they are kind of semi-coros running in sync. The reason i did not do it yet is the db execution and cursor movement: those calls should, at the end of the day, run in a thread (at least the selection part, db.execute), because i dont want to block the spinning reactor with those costly db calls (around 20/30 sec). But the real problem here is that the Python _sqlite module is not thread safe at all (even though c-level sqlite3 is). possible solutions:
Since I am using the fetchone method, in either case (method 1: use a mutex to serialize the access, method 2: queue the sql statements and run them sequentially), the cursor will be blocked, because sqlite to my knowledge has no concept of cell/row/column/table locks https://sqlite.org/threadsafe.html, and even if there is, no python library exposes those (maybe option 2 does, but i am not sure; option 3 is somehow the same concept as above but in the sqlite library itself). So bottom line, when the cursor is moved across the db, or a costly statement is executed in sqlite, the rest of the accesses must be blocked from the app, because the rationale here is, again, that the costly operations we are doing in sqlite itself can never be done faster in python, and doing so also introduces a bunch of memory issues. What I am targeting is to leave the reactor free at that point in time so it can handle the rest of the tasks related to ipv8, or things not related to the DB in general. At the end of the day I target a design as below.
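The pseudocode itself did not survive in this thread; a rough sketch of the kind of design being described, with hypothetical names, a single connection guarded by a lock, and the blocking call pushed to a worker thread:

```python
import asyncio
import sqlite3
import threading

# check_same_thread=False lets a worker thread use the connection;
# the lock below is what keeps that access serialized and safe
conn = sqlite3.connect("metadata.db", check_same_thread=False)
db_lock = threading.Lock()

def run_query_blocking(sql, params=()):
    # Runs in a worker thread: the costly execute/fetch stays off the reactor
    with db_lock:
        return conn.execute(sql, params).fetchall()

async def run_query(sql, params=()):
    loop = asyncio.get_event_loop()
    # The reactor stays free for ipv8 and other tasks while the thread works
    return await loop.run_in_executor(None, run_query_blocking, sql, params)
```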
And currently i am focusing on slicing the iteration at the minimal and most effective points; thus it is currently sync with generators, and when i deal with the SQLite issues, i can easily convert them to coros and tasks. |
by the way, the above is just pseudo code; with context managers that stuff can be made prettier without exposing cursors and connections all around. Also, in the case of the new design with multiple DBs and attach methods, there will still be an option to receive old mdblobs from the older DBs, right, so if that works it will be useful for the future as well, regardless of whatever the next redesign is. |
to be practical: something like that https://github.com/hbiyik/tribler/commit/ea449281e1ddbc55e9c8d0c18abe6f31085ba4e0, + transaction control on updating |
Well, the
It should work similarly to the thing you developed: you put the transaction in, you get Pony objects out, asynchronously. If we want to be able to do direct SQL transactions, we want them to use the same two queues. Otherwise, there will be deadlocks. One of the reasons why I am not doing the thing at the moment is that I wanted to wait for @kozlovsky to get here in September and do it together. Nonetheless, your work on recursive commits is very appreciated because it does not touch this topic directly and can later be integrated easily into the queue-based framework. |
@hbiyik impressive work! It has been a month since there was activity in this thread. @kozlovsky also joined the team now. Can this work be turned into production code which does not block the reactor? |
@synctext I was on vacation for the last month so i couldnt really put focus on this one, because of sand, summer sun, and beaches. :) However, from my POV & understanding i can provide an executive summary as below: The best way to manage this is to use multiple dbs/tables plus attached dbs; i think this is the consensus. Nevertheless, the old db design will still be usable in the wild, so this redesign would bring benefit when committing or merging legacy mdblobs from/to the torrent transport from/to the sqlite db. To answer the question precisely: the async part discussed here is better implemented in pony, with a proper queue or locking methodology. An iterative way of traversing the cursor should also be provided by pony, to prevent memory issues; i think this part is as vital as ever. So the snippets here are not close to PR quality, due to the fact that this somehow introduces a redesign of several components (even pony itself). But the recursive sql code is quite solid at production level, even though this sql snippet is just the tip of the iceberg. |
There are some great insights in this issue, like deep understanding of IPv8 perfection:
Even though I would like the IPv8 appreciation going, I believe this issue is now mostly out-of-date and should be closed:
|
Hello
I have added around 300/400k torrents to my channel, and when I issue a commit to the channel, the commit takes too long, and after some time, i received a crash. Too bad that i did not get any logs of it.
Now when I try to reopen tribler again, it tries to regenerate the channel from the metadata db, but the process takes literally days just to commit. Funny enough, when the restoration is successful and you are patient enough, the core is stuck for some reason, but can close gracefully. After a close/open of tribler, the restoration starts again, and it takes a lot of time again.
You can find the metadata.db in the below link. Drop it into your tribler dir, and you will see the problem.
https://ufile.io/7gqv5ump
Tested with an AMD 8320 CPU and 12GB of RAM with an NVMe disk, so the setup is quite fast.