Couldn't generate valid v2/hybrid torrent file from torrent_handle::torrent_file() #6283

Closed
glassez opened this issue Jun 20, 2021 · 54 comments

@glassez
Contributor

glassez commented Jun 20, 2021

libtorrent version (or branch): RC_2_0 latest

When I try to generate a torrent file from the torrent_info obtained from a torrent_handle (i.e. torrent_handle::torrent_file()), it either fails in the case of a "pure" v2 torrent or produces an invalid file (with the piece layers field missing) in the case of a hybrid torrent. The issue is caused by the following line:

m_torrent_file->free_piece_layers();

@arvidn
Owner

arvidn commented Jun 20, 2021

yes, this is a complicated aspect of v2 torrents and the current split between torrent_info and add_torrent_params

The issue boils down to piece layers not being part of the info dictionary. They can also be quite large, so I free them after they've been "ingested" into the internal merkle tree representation, to save memory.

so when you ask for the torrent_file() from a torrent_handle, you get an immutable reference to the internal object. This is very efficient. however, if you need to make a new .torrent file from it, you can call torrent_file_with_hashes().

https://libtorrent.org/reference-Torrent_Handle.html#torrent-file-torrent-file-with-hashes

I would like to work towards a clearer separation of the immutable parts of a torrent (i.e. the info dictionary) and the additional stuff (trackers, DHT nodes, comments, piece layers, web seeds etc.). I would like torrent_info to represent the info dict, and add_torrent_params to be the whole package.

That way it would be a lot clearer what belongs where and what's mutable and immutable.

I think the first step towards this goal is to add a function that loads a .torrent file and returns an add_torrent_params object. then deprecate the web seeds, trackers, DHT nodes etc. from torrent_info.

@arvidn
Owner

arvidn commented Jun 20, 2021

this should be mentioned in the upgrade_to_2.0 document.

@arvidn arvidn added this to the 2.0.5 milestone Jun 20, 2021
@glassez
Contributor Author

glassez commented Jun 21, 2021

if you need to make a new .torrent file from it, you can call torrent_file_with_hashes().

https://libtorrent.org/reference-Torrent_Handle.html#torrent-file-torrent-file-with-hashes

OK, I see.
The problem is that we have a feature that allows the user to get and save a .torrent file as soon as the torrent (added from the magnet link) finishes downloading the metadata. From the documentation above, I realized that this is now not possible until such a torrent finishes downloading files, since the "piece layers" are not part of the metadata. Am I right?

@arvidn
Owner

arvidn commented Jun 21, 2021

Piece layers are mandatory in .torrent files, and are sufficient to start downloading. The fact that the internal torrent_info object has the piece layers stripped is just because the hashes are already using quite a lot of memory, so it can't really be left in there as a copy. It's unfortunate that this fact is exposed to clients, I wish it wasn't.

However, if you just change the call to torrent_file_with_hashes(), it should work.

@glassez
Contributor Author

glassez commented Jun 21, 2021

However, if you just change the call to torrent_file_with_hashes(), it should work.

But the documentation says that this may not be enough:

Note that a torrent added from a magnet link may not have the full merkle trees for all files, and hence not have the complete piece layers. In that state, you cannot create a .torrent file even from the torrent_info returned from torrent_file_with_hashes(). Once the torrent completes downloading all files, becoming a seed, you can make a .torrent file from it.

@arvidn
Owner

arvidn commented Jun 21, 2021

right. it has always been the case that a magnet link may not have the metadata, and not be able to create a .torrent file. There isn't a huge difference there, except that the piece hashes are downloaded lazily for the most part.

@AllSeeingEyeTolledEweSew
Contributor

So, do I have it right that

  • before metadata is downloaded, torrent_file() and torrent_file_with_hashes() both return NULL
  • after metadata is downloaded:
    • torrent_file_with_hashes() returns a "complete" torrent_info
    • torrent_file()'s return value contains v1 hashes
    • torrent_file()'s return value does not contain v2 hashes

The v1/v2 difference is surprising, I didn't realize it until this thread as I've only worked with v1 so far

@arvidn
Owner

arvidn commented Jun 21, 2021

that's right, except that the merkle hash trees in v2 are downloaded on-demand, so the piece layers may not be available immediately after the metadata is received.

@glassez
Contributor Author

glassez commented Jun 21, 2021

right. it has always been the case that a magnet link may not have the metadata, and not be able to create a .torrent file.

I'm talking about the case when the metadata has been received. Previously, we gave the user the ability to save a .torrent file at that point.

There isn't a huge difference there, except that the piece hashes are downloaded lazily for the most part.

Sorry, I'm confused.
It turns out that when you add a torrent using a .torrent file, the full "piece layers" data is required to start downloading (otherwise, the torrent will be discarded, right?). But when you add a torrent using a magnet link, you don't need full "piece layers" data to start downloading. It sounds inconsistent.

One more problem is downloading a torrent added via magnet link across several sessions. Previously we just stored the metadata in a file once it was received and restored the torrent next time using this metadata. It looks like it won't work the same way now, because if the metadata may be saved without "piece layers", then when the torrent is restored (added with this metadata and the stored "resume data"), we won't be able to parse the metadata successfully, will we?

@glassez
Contributor Author

glassez commented Jun 22, 2021

I would like to work towards a clearer separation of the immutable parts of a torrent (i.e. the info dictionary) and the additional stuff (trackers, DHT nodes, comments, piece layers, web seeds etc.). I would like torrent_info to represent the info dict, and add_torrent_params to be the whole package.

👍
Please keep us posted. Feel free to ping me for reviewing of related changes.

yes, this is a complicated aspect of v2 torrents and the current split between torrent_info and add_torrent_params

The issue boils down to piece layers not being part of the info dictionary. They can also be quite large, so I free them after they've been "ingested" into the internal merkle tree representation, to save memory.

so when you ask for the torrent_file() from a torrent_handle, you get an immutable reference to the internal object. This is very efficient. however, if you need to make a new .torrent file from it, you can call torrent_file_with_hashes().

https://libtorrent.org/reference-Torrent_Handle.html#torrent-file-torrent-file-with-hashes

I would like to work towards a clearer separation of the immutable parts of a torrent (i.e. the info dictionary) and the additional stuff (trackers, DHT nodes, comments, piece layers, web seeds etc.). I would like torrent_info to represent the info dict, and add_torrent_params to be the whole package.

That way it would be a lot clearer what belongs where and what's mutable and immutable.

It would be best if torrent_handle stored a const torrent_info and provided the changed file_storage separately.

I think the first step towards this goal is to add a function that loads a .torrent file and returns an add_torrent_params object. then deprecate the web seeds, trackers, DHT nodes etc. from torrent_info.

👍

@glassez
Contributor Author

glassez commented Jun 28, 2021

@arvidn
So what about #6283 (comment)?
This looks like a clear inconsistency. And unfortunately, it clearly spoils our lives.

@arvidn
Owner

arvidn commented Jun 28, 2021

yes, I agree. I will look into addressing that. I overlooked this use case and just focused on saving metadata along with the resume data.

@arvidn
Owner

arvidn commented Jun 28, 2021

I've just been a bit busy with my day job lately.

@arvidn
Owner

arvidn commented Jul 14, 2021

@ssiloti Do you have any comments on this? It seems unnecessary to require that .torrent files have piece layers, since they can (typically) be downloaded from the swarm anyway. I agree that it seems inconsistent with magnet links.

To quote glassez:

It turns out that when you add a torrent using a .torrent file, the full "piece layers" data is required to start downloading (otherwise, the torrent will be discarded, right?). But when you add a torrent using a magnet link, you don't need full "piece layers" data to start downloading. It sounds inconsistent.

@ssiloti
Collaborator

ssiloti commented Jul 15, 2021

Libtorrent can handle such torrents just fine. The8472 argued in favor of requiring them:

A .torrent is only in a fully valid state once the piece layers have been included. Those are necessary for partial resumes and stateless torrent clients. So I want to make sure that people don't just go around omitting them because they don't see the use-cases.

bittorrent/bittorrent.org#59 (comment)

@glassez
Contributor Author

glassez commented Jul 15, 2021

Requiring piece layers in the torrent file also drops the advantages described in https://www.bittorrent.org/beps/bep_0030.html

@glassez
Contributor Author

glassez commented Jul 15, 2021

Anyway, we need a convenient way to store immutable torrent data (e.g. info section, author etc.) separately from other "resume" data. Otherwise we have to use some sort of workaround like qbittorrent/qBittorrent#15191.

@arvidn
Owner

arvidn commented Jul 21, 2021

#6326

@the8472

the8472 commented Sep 12, 2021

You might want to add an option to eagerly download the piece layers so a full torrent can be created.

@arvidn arvidn modified the milestones: 2.0.5, 2.0.6 Dec 27, 2021
@arvidn arvidn modified the milestones: 2.0.6, 2.0.7 Apr 17, 2022
@blightzero

I ran into this issue a couple of days ago, where I noticed that anytime I'd try to download a v2 only torrent file from a magnet URL I could not generate a torrent file.
I use the python bindings and they do not expose the torrent_file_with_hashes() function on the torrent_handle object.

Moreover, since the torrent info hash is only ever a sha1 or sha256 on the info section and never includes the piece layers or anything outside the info section, all information outside of that can be changed or omitted. Due to this limitation and the way this should be implemented via BEP9 all other information is missing from a magnet torrent anyway.

Wouldn't it make sense to just use the info section then anyway? At least for now that is my workaround, where I just use the info_section() function on the torrent_info object to get a bencoded representation of the info section and then add a d4:info prefix and e postfix to get a torrent file.

libtorrent is absolutely fine with parsing v2 torrents that are missing the piece layers. Integrity of the files is ensured through the pieces root anyway.
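
A minimal sketch of the wrapping step described in this comment, in plain Python. It assumes you already hold the bencoded info dictionary as bytes (for example, what info_section() gives you); the helper name wrap_info_section is hypothetical and not part of libtorrent. The result contains only the info key, so it carries no piece layers, trackers, or other top-level fields.

```python
def wrap_info_section(info_section: bytes) -> bytes:
    """Wrap a bencoded info dictionary in a top-level dict with one 'info' key.

    Hypothetical helper: it only concatenates bytes and does not validate
    the contents of the info dictionary beyond a basic shape check.
    """
    # A bencoded dictionary starts with b"d" and ends with b"e".
    if not (info_section.startswith(b"d") and info_section.endswith(b"e")):
        raise ValueError("expected a bencoded dictionary")
    # b"d" opens the outer dict, b"4:info" is the key, b"e" closes the dict.
    return b"d4:info" + info_section + b"e"
```

For instance, wrapping b"d4:name1:ae" yields b"d4:infod4:name1:aee".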

@arvidn
Owner

arvidn commented Apr 29, 2022

if the reason you're saving a .torrent file is to be able to re-add it if you restart, the resume data is much more practical for that. Then you'll get the partial merkle trees included as well.

libtorrent is absolutely fine with parsing v2 torrents that are missing the piece layers. Integrity of the files is ensured through the pieces root anyway.

Yes, iirc that was a recent change to mitigate issues around this.

@arvidn
Owner

arvidn commented May 9, 2022

I see. I may not have reproduced the same issue then. The exception I get from a v2-only torrent is in the call to generate() on the create_torrent object, you get the assert in the constructor.

@arvidn
Owner

arvidn commented May 9, 2022

I can't actually reproduce this. For me, it either works or torrent_file_with_hashes() returns nullptr (which it does when we don't have all the hashes yet).

@arvidn
Owner

arvidn commented May 9, 2022

I suspect it's related to v2 torrents that are so small that not all files have a merkle tree. i.e. when a file is smaller than one piece.
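
To make the small-file case concrete, here is a rough sketch based on the BEP 52 rule that only files larger than the piece length get an entry in "piece layers". The function names are hypothetical and this is not libtorrent code; it only does the arithmetic.

```python
HASH_SIZE = 32  # SHA-256 digest length in bytes


def piece_layer_hash_count(file_size: int, piece_length: int) -> int:
    """Number of hashes in a file's "piece layers" entry (0 means no entry)."""
    if file_size <= piece_length:
        # Files that fit in a single piece (including empty files)
        # have no "piece layers" entry at all.
        return 0
    return -(-file_size // piece_length)  # ceiling division


def piece_layer_bytes(file_size: int, piece_length: int) -> int:
    """Size in bytes of the concatenated hashes stored for one file."""
    return HASH_SIZE * piece_layer_hash_count(file_size, piece_length)
```

Under these assumptions, a 4 GiB file with 1 MiB pieces has 4096 piece hashes, i.e. 128 KiB of piece layer data, while a zero-byte file has none.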

@hekkr000

hekkr000 commented May 9, 2022

Exactly! I've used this hash (AABB9549B326A811616DD0D6617EAC3353A2A8AE) to test to see if there is anything that could tell me if this assertion would fail, but so far I couldn't find anything. My only solution currently is to manually add these hashes to a blacklist so they won't crash the program... As I said I'm sadly not primarily a C++ developer, so this is the best I was able to come up with so far, but if I can get any more information about this issue then I'm happy to help!

This is my current list of hashes that are causing this assertion to fail:
04445AAD7D263D30C31418BE21C8F08869C03474
1770C2313EB11ADF71A55821B986FD5B5E39F44F
1DC41DE15C813D677FCEB74B47B022A20C2BD52B
40F3F8CEC4F10615A390EC13DA0A7163F1F45AD3
4EECAAEC3ED1C87381D6D59B2FC3B2C49E029AAA
62B924AFD013F41628F8AA7BD607245408014AAA
8CFDE7139BD5D6E2DF64772F59F8750A2E4F4552
A4E7792E383302BFD1D62677E81C11E9AE02579D
A8F8295A2628609A1C35BE64841A5FC9BE4860DB
AABB9549B326A811616DD0D6617EAC3353A2A8AE
BEC5E08FB13F5D58D6F1C52A95BD71AF4DFE4B55
C6CC63FB3570A2C63CC32FEE5E42A3CB327D7022
440BE32141E7A5D42818BACC810923B0DAF8B05E
00E681BAEA32ADAED8B283110E4A738B45208B6F
3EA21F6D64C21C0B4C123FAE8E85FD216908E822

@hekkr000

hekkr000 commented May 9, 2022

I suspect it's related to v2 torrents that are so small that not all files have a merkle tree. i.e. when a file is smaller than one piece.

Interesting, I will look into that with my examples!

@hekkr000

hekkr000 commented May 9, 2022

I've run through them all, and for sure, all of them seem to have at least one file with the size of 0 bytes, I guess that would be smaller than one piece :) I will make some changes to my program so that it logs the hashes that are like this, to see if I can find any that do not fail in this case!

@hekkr000

hekkr000 commented May 9, 2022

I can't actually reproduce this. For me, it either works or torrent_file_with_hashes() returns nullptr (which it does when we don't have all the hashes yet).

Really? Interesting, maybe it has something to do with GCC then?
I've created and attached a sample, that definitely crashes for me. Sorry in advance for the code quality...
test.zip

@arvidn
Owner

arvidn commented May 9, 2022

I wrote this to try to reproduce it: #6856

@glassez
Contributor Author

glassez commented May 9, 2022

v2 torrents aren't considered valid without having the piece layer hashes in the merkle trees.

This part of BEP52 contradicts common sense, IMO. Why require something that is inherently optional?..

@hekkr000

hekkr000 commented May 9, 2022

I wrote this to try to reproduce it: #6856

Built your sample and ran it, same result:
image

Running on Arch (linux 5.11.10-arch1-1)
Lib: extra/libtorrent-rasterbar 1:2.0.5-1
GCC: core/gcc 11.2.0-4
Build params: g++ -g -fPIC Main.cpp -ltorrent-rasterbar -o test

Also, per your recommendation, this workaround seems to be able to detect which torrents would cause a crash:
image

It also wrote down all the hashes that I collected over the past couple weeks, and they seem to align. So you were right, there should be a problem with small files (although I would argue that it has to be 0 bytes exactly).

@hekkr000

hekkr000 commented May 9, 2022

v2 torrents aren't considered valid without having the piece layer hashes in the merkle trees.

This part of BEP52 contradicts common sense, IMO. Why require something that is inherently optional?..

I don't think that it's "optional". As I understand it, we only take the merkle tree out of the info section for compatibility/performance reasons. The whole idea behind this is that we should be able to identify the same files between different torrents, without downloading them and checking the file hashes manually.

Although, I have to agree, that it is kind of contradictory that we have to download the files in order to calculate the tree itself, which defeats the purpose.

I guess this is the reason why @arvidn is reluctant to make it optional, because then nobody would write them down to the torrent files, and we would eventually lose the data. I prefer the idea of making a function that forces the client to download the whole merkle tree without having to download the files, and failing to save it until it is downloaded.

@arvidn
Owner

arvidn commented May 9, 2022

@hekkr000 that output you took a screenshot of suggests it's not the create_torrent constructor that's triggering the assertion failure.

@hekkr000

hekkr000 commented May 9, 2022

@hekkr000 that output you took a screenshot of suggests it's not the create_torrent constructor that's triggering the assertion failure.

It is though. Here to be precise:
image

image

@hekkr000

hekkr000 commented May 9, 2022

Alright, nevermind. I tried to build the project from source so I could do a bit more investigation, but it looks like the official arch repository was a bit out of date not so long ago. It looks like you have already fixed this issue in 2.0.6. I bet this is the one:

fix issue creating a v2 torrent from torrent_info containing an empty file

Apparently they only uploaded 2.0.6 last Wednesday and I haven't updated since posting this issue last Monday.
Sorry for this. Problem solved.

@arvidn
Owner

arvidn commented May 9, 2022

heh, right. I should have remembered that :)

@arvidn arvidn closed this as completed May 14, 2022
@markmdscott

v2 torrents aren't considered valid without having the piece layer hashes in the merkle trees.

This part of BEP52 contradicts common sense, IMO. Why require something that is inherently optional?..

Piece layers are not optional in v2 torrents, so where's the contradiction @glassez ?

@glassez
Contributor Author

glassez commented Jan 9, 2023

Piece layers are not optional in v2 torrents, so where's the contradiction @glassez ?

Piece layers are inherently optional but they are not optional in v2 torrents (BEP52). That's what the contradiction with common sense is.

@markmdscott

Piece layers are not optional in v2 torrents, so where's the contradiction @glassez ?

Piece layers are inherently optional but they are not optional in v2 torrents (BEP52). That's what the contradiction with common sense is.

Hmm yes, I have not found @the8472's reason for treating a v2 torrent as invalid unless it includes piece layers to be valid itself:

A .torrent is only in a fully valid state once the piece layers have been included. Those are necessary for partial resumes and stateless torrent clients. So I want to make sure that people don't just go around omitting them because they don't see the use-cases.

What is necessary (and sufficient) for partial resumes is the root hash. I am not quite sure what is meant by stateless torrent clients, though up/downloading data requires state.

On the other hand Vladimir, aren't the file hashes in a v1 torrent also optional given your reasoning? Only the infohash would be necessary for a torrent file.

@arvidn
Owner

arvidn commented Jan 9, 2023

I think he means that anything that isn't part of the info-dictionary is (inherently) optional. It's not committed to by the info-hash. In that sense, the piece hashes in a v1 torrent are not optional, but the piece layers in a v2 torrent are.

The rationale for this decision was to allow shorter start-up times when downloading a v2 magnet link, where the info-dict is very small and you download portions of the merkle tree as you go.

@markmdscott

Oh yes I forgot that the infohash in a v1 torrent only commits to the piece hash data (and other metadata) and not the file itself. This is unlike with a v2 torrent where the root hash commits to the file data, hence the piece layer hashes are not necessary.

But that means the reason for requiring them in a v2 torrent needs reexamination. Resumption of downloading only needs the root hash...

(Maybe the piece layers are also there and can be used as an integrity check on the root hash, to ensure it hasn't experienced a bit flip? If so then there could be a much cheaper integrity check on the root.)

@the8472

the8472 commented Jan 9, 2023

What is necessary (and sufficient) for partial resumes is the root hash.

Partial resume here means partially downloaded files. You can only verify which pieces in a file are already complete if you have the piece hashes. The root hash can only verify a complete file, not a partial one.

I am not quite sure what is meant by stateless torrent clients, though up/downloading data requires state.

Clients that can be pointed at a bunch of .torrent files and a filesystem and figure out the rest by looking for partial or complete matches in the filesystem. This way no client-specific state is needed.

@markmdscott

What is necessary (and sufficient) for partial resumes is the root hash.

Partial resume here means partially downloaded files. You can only verify which pieces in a file are already complete if you have the piece hashes. The root hash can only verify a complete file, not a partial one.

But through the Merkle root hash you can get the piece hash, and through the piece hash you can verify the piece of a partial file you have received correct? So you can still incrementally verify the pieces of a file as it is downloaded, you don't need the complete file to start verifying it.

I am not quite sure what is meant by stateless torrent clients, though up/downloading data requires state.

Clients that can be pointed at a bunch of .torrent files and a filesystem and figure out the rest by looking for partial or complete matches in the filesystem. This way no client-specific state is needed.

In this scenario of the filesystem for complete files the client can verify files simply from the Merkle root hashes (and can always recreate the piece layers.) For partially downloaded files they would likely already be augmented with some saved resume data anyway (corresponding to the partially downloaded piece layer hashes) for the client to "figure out the rest." Otherwise you'd be groping around in the dark since "stateless" partial files are what people call corrupt files.

@the8472

the8472 commented Jan 9, 2023

But through the Merkle root hash you can get the piece hash, and through the piece hash you can verify the piece of a partial file you have received correct?

Get? Over the network, if there's another peer, yes. But you wouldn't know which ones to get, so you'd have to get all of them if a file doesn't verify as a whole, which means you need all piece hashes to be available.

Having the piece layers avoids that dependency.

So you can still incrementally verify the pieces of a file as it is downloaded, you don't need the complete file to start verifying it.

I'm not talking about verifying-while-downloading. I'm talking about having a partially downloaded file on disk and wanting to continue downloading it by importing it to a client. But to do so you need to determine how much you already have.

For partially downloaded files they would likely already be augmented with some saved resume data anyway

That "resume data" is part of the client-specific state that's excluded by being stateless.

Otherwise you'd be groping around in the dark since "stateless" partial files are what people call corrupt files.

Not at all. The piece layers in the torrent file let you determine which pieces of a partial file you already have. And you can in turn verify the piece layers by checking them against the root hash.
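
A rough Python sketch of both checks mentioned here, assuming the BEP 52 tree layout: SHA-256 over 16 KiB leaf blocks, pairwise hashing upward, and all-zero hashes as padding for leaves beyond the end of the file. All function names are hypothetical, and this is an illustration rather than libtorrent's implementation.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # leaf block size of the v2 merkle tree
ZERO_HASH = bytes(32)   # padding value for leaves beyond the end of the file


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _reduce(nodes: list, pad: bytes) -> bytes:
    """Pad nodes to a power of two with pad, then hash pairs up to one root."""
    width = 1
    while width < len(nodes):
        width *= 2
    nodes = nodes + [pad] * (width - len(nodes))
    while len(nodes) > 1:
        nodes = [_sha256(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def piece_hash(piece: bytes, piece_length: int) -> bytes:
    """Merkle root of one piece: block hashes, zero-padded to a full piece."""
    leaves = [_sha256(piece[i:i + BLOCK_SIZE]) for i in range(0, len(piece), BLOCK_SIZE)]
    leaves += [ZERO_HASH] * (piece_length // BLOCK_SIZE - len(leaves))
    return _reduce(leaves, ZERO_HASH)


def root_from_piece_layer(layer: list, piece_length: int) -> bytes:
    """Recompute a file's pieces root from its piece layer hashes."""
    # Padding node at piece-layer height: root of a subtree of zero leaves.
    pad = _reduce([ZERO_HASH] * (piece_length // BLOCK_SIZE), ZERO_HASH)
    return _reduce(layer, pad)


def pieces_on_disk(data: bytes, layer: list, piece_length: int) -> list:
    """For a partial file, report which pieces already match the piece layer."""
    return [
        piece_hash(data[i * piece_length:(i + 1) * piece_length], piece_length) == expected
        for i, expected in enumerate(layer)
    ]
```

A client would first compare root_from_piece_layer() against the pieces root from the info dictionary, and only then trust the layer when calling pieces_on_disk() on a partially downloaded file. Neither step needs network access or client-specific resume data.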

@markmdscott

But through the Merkle root hash you can get the piece hash, and through the piece hash you can verify the piece of a partial file you have received correct?

Get? Over the network, if there's another peer, yes. But you wouldn't know which ones to get, so you'd have to get all of them if a file doesn't verify as a whole, which means you need all piece hashes to be available.

For incrementally downloading a file through the network you would know which further piece hashes to get since you're the one downloading it, as you already have the partial piece layer hashes (this is not your stateless client scenario.)

I gather the scenario you have in mind for stateless clients is that you're handed a filesystem with (complete or otherwise) files (it's a mystery as there is no saved resume data) as well as a bunch of torrent files, and no network access. So for every file you have to hash check every piece of it to figure out if it is complete or not, and if it is not which pieces are missing. So for the incomplete files you'd need all the piece layer data to be there already in the torrent files to know this.

That is a rather specific circumstance. How common is this? Like how often would one run into this stateless client scenario?

So you can still incrementally verify the pieces of a file as it is downloaded, you don't need the complete file to start verifying it.

I'm not talking about verifying-while-downloading. I'm talking about having a partially downloaded file on disk and wanting to continue downloading it by importing it to a client. But to do so you need to determine how much you already have.

Would that not be partially conflating the partial resumes scenario with the stateless client scenario? (Unless you already meant to equate them in your original remark years ago?) Regardless, if you have network access you can always ask for the piece layers from peers through the root hash and then determine how much of the incomplete file's data you already have.

See that's the thing: network access is key, and bittorrenting inherently assumes a network. Without it you can indeed determine (with the full piece layers) which pieces of an incomplete file are there but you can't do anything about it; you can't download and then use the file. And after all that's what you want to do in the end, not merely to feel satisfied that you know which parts of an incomplete file you have.

@the8472

the8472 commented Jan 9, 2023

That is a rather specific circumstance. How common is this? Like how often would one run into this stateless client scenario?

It's not really one single scenario. It's multiple scenarios that are rather similar. It's moving terabytes of storage between machines. It's renaming files. It's switching clients.

Currently clients keep implementation-specific state. The file renames. The progress information. With bittorrent v2 (including piece layers) this becomes very low complexity. The torrents and a filesystem is all you need. For terabytes of data. You just scan for files of the right size, calculate a very small amount of possible piece hashes (16KB * powers of two) and then compare against your torrent database. This is a single pass over the filesystem that finds everything, complete and incomplete torrents, renames or not.

Yes, sure, you could try to recover without that information. But the point is to make it dead-simple so one doesn't have to worry about all that junk anymore.

Another minor concern was just creating another availability-bottleneck. In addition to pieces dropping below availability of 1 you'd also have to worry about hash information dropping below 1 in some edge-cases. Clients cannot share proof-hashes unless they either have the complete file or store them in yet another client-specific format. Having the piece hashes in the torrent file provides a portable format for information you need to store anyway.

Sure, for a magnet transfer you only need the infodictionary. But eventually you have to transfer the piece hashes in one form or another anyway. So a .torrent provides a more complete version of that. The spec just makes it a guarantee so that clients don't store it in a non-portable format and other clients can rely on it.

@markmdscott

Yeah I see your point of view here. Having the piece layer requirement in the torrent file guarantees they will be stored somewhere, so that other clients can readily transmit it to you. Without it clients might skip it to save space when finished then you'll not have it easily available, or take a long time to get it. This data should still be stored somewhere after a file is complete so this guarantees it'll be there, and as you said, in a portable format to boot.

@glassez
Contributor Author

glassez commented Jan 11, 2023

In any case, this does not change the fact that this requirement is "unnatural". It looks more like a way to achieve some unification of .torrent files.
In fact, there is no problem with this requirement when creating a new torrent. Problems arose due to the fact that BitTorrent v2 changed the layout of the .torrent file, thereby violating the logic of existing applications, because previously the .torrent file containing only an info section was valid, so this gave us a reason and an opportunity to do the following things:

  1. save the .torrent file for magnet torrents as soon as the metadata is received,
  2. store metadata (for internal use) in .torrent files, instead of resume data, since it is immutable and it makes no sense to rewrite it every time.

Of course, all these problems should go away with improved support for everything that BitTorrent v2 has brought us. It also revealed a number of shortcomings in libtorrent, which should also be eliminated (for example, the need for a clearer separation of immutable and mutable torrent data).

@the8472

the8472 commented Jan 11, 2023

thereby violating the logic of existing applications

Existing applications only know the v1 spec. From their perspective a hybrid torrent without the piece layers is valid. To be affected by the validation logic you must make changes to add v2 logic to comply with the v2 spec. This is one part of that.

save the .torrent file for magnet torrents as soon as the metadata is received,

You can still do that, it just doesn't form a proper torrent file. I'd recommend naming it differently though. Maybe .torrent-part or .info or something.
You can do anything for internal use. The spec is only about the public API. When giving a user a torrent or accepting one from a user that's a constraint that should be checked.

Another option is to fetch the piece layers eagerly after downloading the metadata.

It also revealed a number of shortcomings in libtorrent, which should also be eliminated (for example, the need for a clearer separation of immutable and mutable torrent data).

Agreed

@markmdscott

In any case, this does not change the fact that this requirement is "unnatural".

I mean whether it is natural or not is subjective. I agree that piece layers are strictly unnecessary if you just have the root hash. It is just that as a practical matter, having the piece layer requirement guarantees the benefits that were discussed, which arise from common scenarios of people torrenting in the wild.

8 participants