arvidn / libtorrent

an efficient feature complete C++ bittorrent implementation

Home Page: http://libtorrent.org


Inconsistent torrent trackers handling

glassez opened this issue

The torrent's tracker list is handled inconsistently: in some places duplicate URLs are disallowed, while in others there is no such check, so duplicate trackers can be added.

Loading trackers from resume data (disallowed):

libtorrent/src/torrent.cpp

Lines 325 to 328 in beb9dbc

if (!find_tracker(e.url))
{
	if (e.url.empty()) continue;
	m_trackers.push_back(e);

Adding single tracker to torrent (disallowed):

libtorrent/src/torrent.cpp

Lines 5743 to 5747 in beb9dbc

if (auto k = find_tracker(url.url))
{
	k->source |= url.source;
	return false;
}

Adding single tracker to torrent info (disallowed):

auto const i = std::find_if(m_urls.begin(), m_urls.end()
, [&url](announce_entry const& ae) { return ae.url == url; });
if (i != m_urls.end()) return;

Replace all the trackers from torrent (allowed):

libtorrent/src/torrent.cpp

Lines 5689 to 5693 in beb9dbc

for (auto const& t : urls)
{
	if (t.url.empty()) continue;
	m_trackers.emplace_back(t);
}

Parsing magnet link (allowed):

else if (string_equal_no_case(name, "tr"_sv)) // tracker
{
	// since we're about to assign tiers to the trackers, make sure the two
	// vectors are aligned
	if (p.tracker_tiers.size() != p.trackers.size())
		p.tracker_tiers.resize(p.trackers.size(), 0);
	error_code e;
	std::string tracker = unescape_string(value, e);
	if (!e && !tracker.empty())
	{
		p.trackers.push_back(std::move(tracker));
		p.tracker_tiers.push_back(tier++);
	}
}

Parsing .torrent file (allowed):

m_urls.reserve(announce_node.list_size());
for (int j = 0, end(announce_node.list_size()); j < end; ++j)
{
	bdecode_node const tier = announce_node.list_at(j);
	if (tier.type() != bdecode_node::list_t) continue;
	for (int k = 0, end2(tier.list_size()); k < end2; ++k)
	{
		announce_entry e(tier.list_string_value_at(k).to_string());
		ltrim(e.url);
		if (e.url.empty()) continue;
		e.tier = std::uint8_t(j);
		e.fail_limit = 0;
		e.source = announce_entry::source_torrent;
#if TORRENT_USE_I2P
		if (is_i2p_url(e.url)) m_flags |= i2p;
#endif
		m_urls.push_back(e);
	}

Therefore, first of all, I have the following question: what behavior is actually intended?
Obviously there is no point in having exact duplicate trackers (or is there?). But, AFAIK, some people think it might be useful to have duplicate trackers in different tiers (for the purposes of some very advanced tracker management).
In any case, the current behavior looks inconsistent/incorrect, so it should be fixed one way or another.

@arvidn
Would you mind fixing it?
I can't do it myself, not least because I don't know what your intention is regarding duplicate trackers.

I can't think of a reason to allow duplicates in the tracker list, not even in different tiers. The de-facto behavior of tiers is to announce to all tiers in parallel, and (afaik) the de-facto behavior when creating torrents is to put each tracker in its own tier, i.e. all trackers are typically announced to in parallel.

On the other hand, it doesn't seem core to libtorrent to enforce uniqueness. I suppose it would be a matter of robustness when loading poorly made .torrent files. I take it people create torrents with duplicate trackers in the wild then, is that right?

I can't think of a reason to allow duplicates in the tracker list. not even in different tiers

👍
It doesn't seem like it can do any good. Either the same tracker gets announced to several times at once, or it gets retried after the previous announce failed, which makes no sense since another tracker should be tried instead.

I take it people create torrents with duplicate trackers in the wild then, is that right?

Don't know. I haven't paid attention to it. I'm only here because I noticed inconsistencies in libtorrent's behavior.

On the other hand, it doesn't seem core to libtorrent to enforce uniqueness.

But it is more convenient to do it on the libtorrent side; besides, part of it is already done there.
In any case, the inconsistent behavior should be fixed one way or the other (either enforce tracker uniqueness everywhere, or don't do it at all).

On the other hand, it doesn't seem core to libtorrent to enforce uniqueness.

As I said in the related discussion on the qBittorrent side (qbittorrent/qBittorrent#17017 (comment)), at the very least there are problems mapping tracker responses back to tracker entries (aka announce entries in libtorrent) unless they are unique, aren't there?

one challenge here is that I don't think it's acceptable to have O(n^2) complexity when loading tracker URLs from a torrent. So there would need to be some more sophisticated data structure, which in turn would make everyone pay a (small) cost for this feature.

Presumably adding a single tracker URL to a torrent shouldn't be O(n) either.
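For illustration, here is a minimal sketch (hypothetical, not libtorrent's actual code) of how a side hash set brings de-duplication down to O(n) expected time, compared with calling a linear `find_tracker()` per URL:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// De-duplicate a tracker URL list while preserving insertion order.
// The unordered_set gives O(1) expected membership tests, so the whole
// pass is O(n) expected rather than O(n^2).
std::vector<std::string> dedupe_trackers(std::vector<std::string> const& urls)
{
	std::unordered_set<std::string> seen;
	std::vector<std::string> out;
	out.reserve(urls.size());
	for (auto const& u : urls)
	{
		if (u.empty()) continue;
		// insert() returns {iterator, bool}; the bool is false for duplicates
		if (seen.insert(u).second) out.push_back(u);
	}
	return out;
}
```

The trade-off mentioned above is visible here: the hash set is extra memory and hashing work that every caller pays, even for torrents with only one or two trackers.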

one challenge here is that I don't think it's acceptable to have O(n^2) complexity of loading tracker URLs from a torrent.

It's already the case when loading resume data. Don't forget that this is exactly the action performed in a batch over many torrents when resuming a session. So doing the same when loading a .torrent file or a magnet link won't make the situation much worse, since those actions are comparatively rare over the application's lifetime.
At the very least, it would be better to fix the problem this way first, and only then think about how to optimize it.

So there would need to be some more sophisticated data structure, which in turn would make everyone pay (a small) cost for this feature.

You could use std::set, but I don't think it would bring any performance improvement with so few trackers per torrent (it seems to me that most users don't have many trackers on their torrents). Besides, std::set can't be re-sorted freely the way the tracker list needs to be.
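To illustrate the point (a hypothetical sketch, not libtorrent's actual code): a std::set's ordering is fixed by its comparator, whereas libtorrent keeps trackers in a vector ordered by tier that it can re-sort at will. Uniqueness can instead be enforced with a side unordered_set of URLs:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Simplified stand-in for libtorrent's announce_entry.
struct entry { std::string url; int tier; };

// Insert an entry unless its URL was already seen, keeping the vector
// sorted by tier. A std::set could not support this: its iteration
// order is dictated by the comparator and cannot be changed afterwards.
void add_unique(std::vector<entry>& trackers,
	std::unordered_set<std::string>& seen, entry e)
{
	if (!seen.insert(e.url).second) return; // duplicate URL, skip
	trackers.push_back(std::move(e));
	// stable sort preserves insertion order within each tier
	std::stable_sort(trackers.begin(), trackers.end(),
		[](entry const& a, entry const& b) { return a.tier < b.tier; });
}
```

Re-sorting on every insert is shown only for brevity; a real implementation would insert at the right position instead.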

@arvidn
Or do you have any ideas for how to make the behavior reliable and predictable while still allowing duplicate trackers?

I've started work on a patch that will use a bloom filter to detect duplicates

#6873
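The general idea behind a bloom-filter approach (a hypothetical sketch under assumed design, not the actual code in the patch) is that the filter answers "definitely not seen before" in O(1), so the exact linear check only runs on a possible hit:

```cpp
#include <bitset>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Detect duplicate tracker URLs with a small bloom filter as a fast
// negative check. False positives from the filter are resolved by an
// exact scan, so the result is always correct; the common case (a new,
// unique URL) stays cheap.
struct tracker_dedup
{
	std::bitset<1024> filter;
	std::vector<std::string> urls;

	// two bit positions derived from std::hash with different salts
	std::size_t h1(std::string const& s) const
	{ return std::hash<std::string>{}(s) % filter.size(); }
	std::size_t h2(std::string const& s) const
	{ return (std::hash<std::string>{}(s + "#") * 31) % filter.size(); }

	// returns true if the URL was new and was added
	bool add(std::string const& u)
	{
		if (filter.test(h1(u)) && filter.test(h2(u)))
		{
			// possible duplicate: confirm with an exact scan
			for (auto const& existing : urls)
				if (existing == u) return false;
		}
		filter.set(h1(u));
		filter.set(h2(u));
		urls.push_back(u);
		return true;
	}
};
```

This keeps the per-torrent memory cost to a fixed bitset rather than a full hash set, at the price of an occasional fallback scan.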

Parsing .torrent file or magnet link still doesn't filter out tracker duplicates.

was this resolved? if yes, it should be closed.

it was fixed in the libtorrent master branch (so, not released yet). duplicate trackers are not filtered when parsing a torrent file; they are filtered when adding it to the session.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.