webtorrent / webtorrent

⚡️ Streaming torrent client for the web

Home Page:https://webtorrent.io

Webtorrent downloads deselected pieces after re-adding a torrent

detarkende opened this issue · comments

What version of this package are you using?
2.1.36

What operating system, Node.js, and npm version?
macOS Sonoma 14.2.1 (23C71)
Node.js v20.11.1

What happened?
I'm using the free Sintel torrent (downloaded from webtorrent.io) to test partial downloading and seeding.
Here's what I do:

/** @type {WebTorrent.Torrent} */
const torrent = await new Promise((resolve, reject) => {
    const torrent = client.add(
        TORRENT_PATH,
        {
            path: DOWNLOADS_PATH,
        },
        (torrent) => {
            console.log(
                `Torrent ${torrent.name} - ${torrent.infoHash} verified and added.`,
            );
            resolve(torrent);
        },
    );
    torrent.on('metadata', () => {
        console.log(
            `Deselecting all files and pieces for ${torrent.name} - ${torrent.infoHash}`,
        );
        torrent.files.forEach((file) => file.deselect());
        torrent.deselect(0, torrent.pieces.length - 1, 0);
    });
    torrent.on('error', reject);
});

I add the torrent, deselect all files and pieces, then wait for the file verification to finish.
After this, I have a server endpoint that serves partial content. I take the start and end byte offsets from the Range header, then use those to create the readable stream:

/** @type {ReadableStream} */
const fileStream = file.stream({start, end});
const nodeStream = Readable.fromWeb(fileStream);
nodeStream.pipe(res);

This means that only the content the user requests is actually downloaded; the rest of the files should not be downloaded. Meanwhile, I should act as a partial seeder where possible, seeding whatever I already have.
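
As a rough illustration of the range handling described above, the clamping and header arithmetic can be kept in a small pure function. This is only a sketch: `clampRange` and its parameter names are hypothetical, not part of WebTorrent or Express, and it assumes `start` and `end` are inclusive byte offsets, as in an HTTP Range header.

```javascript
// Hypothetical helper (not part of WebTorrent or Express).
// Clamps an inclusive byte range to at most `maxChunk` bytes and to the file size,
// then derives the headers for a 206 Partial Content response.
function clampRange(start, end, fileLength, maxChunk) {
    const lastByte = fileLength - 1;
    const clampedEnd = Math.min(end, start + maxChunk - 1, lastByte);
    return {
        start,
        end: clampedEnd,
        headers: {
            'Accept-Ranges': 'bytes',
            'Content-Range': `bytes ${start}-${clampedEnd}/${fileLength}`,
            // The range is inclusive on both ends, hence the +1.
            'Content-Length': clampedEnd - start + 1,
        },
    };
}

// Example: a request for far more than one chunk of a 50 MB file.
const clamped = clampRange(0, 99_999_999, 50_000_000, 20_000_000);
console.log(clamped.start, clamped.end, clamped.headers);
```

The returned `start` and `end` could then be passed to `file.stream({start, end})`, so that only the pieces covering that range need to be fetched.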

Everything up until this point works correctly.

The issue happens when I restart my script:
Webtorrent verifies the existing pieces (I log the progress on each download event), but then proceeds to download the rest of the torrent, even though I have deselected everything and haven't created a ReadableStream for any file yet.
Straight after adding the torrent, it starts downloading the full content.

What did you expect to happen?
I would expect Webtorrent not to start the download after the verification, since everything should be deselected.

Are you willing to submit a pull request to fix this bug?
Yes, although I am not very familiar with the BitTorrent protocol, so I might need some guidance.

Here's the complete script, in case it is useful to you:
import WebTorrent from 'webtorrent';
import Express from 'express';
import mime from 'mime';
import { Readable } from 'stream';

const TORRENT_PATH = '<absolute-path-to-torrent>/sintel.torrent';
const DOWNLOADS_PATH = '<absolute-path-to-downloads>/downloads';
const MAX_CHUNK_SIZE = 20_000_000;

const app = Express();
app.use(Express.urlencoded({extended: true}));

const client = new WebTorrent();

/** @type {WebTorrent.Torrent} */
const torrent = await new Promise((resolve, reject) => {
    const torrent = client.add(
        TORRENT_PATH,
        {
            path: DOWNLOADS_PATH,
        },
        (torrent) => {
            console.log(
                `Torrent ${torrent.name} - ${torrent.infoHash} verified and added.`,
            );
            resolve(torrent);
        },
    );
    torrent.on('metadata', () => {
        console.log(
            `Deselecting all files and pieces for ${torrent.name} - ${torrent.infoHash}`,
        );
        torrent.files.forEach((file) => file.deselect());
        torrent.deselect(0, torrent.pieces.length - 1, 0);
    });
    torrent.on('error', reject);
});


torrent.on('download', (piece) => {
    console.log(`downloaded piece: ${piece} - ${(torrent.progress * 100).toFixed(2)}%`);
});

torrent.on('upload', (piece) => {
    console.log('uploading piece:', piece);
})

app.get('/', (req, res) => {
    res.send(`<html>
    <head><title>${torrent.name}</title></head>
    <body>
    ${torrent.files.map((file) => {
        return `<a href="/${encodeURIComponent(file.path)}">${file.path}</a> <br>`;
    }).join('')}
    </body>
    </html>`)
});

app.get('/:path', (req, res) => {
    console.log(req.params.path)
    const file = torrent.files.find((file) => file.path === req.params.path);
    if (!file) {
        return res.status(404).send('File not found');
    }
    const range = req.range(file.length, {
        combine: true,
    });

    if (range === undefined) {
        // No Range header: serve the first chunk as partial content.
        const end = Math.min(MAX_CHUNK_SIZE, file.length - 1);
        res.statusCode = 206;
        res.setHeader('Content-Type', mime.getType(file.path));
        res.setHeader('Accept-Ranges', 'bytes');
        res.setHeader('Content-Range', `bytes 0-${end}/${file.length}`);
        // Content-Length is the size of the returned range (end is inclusive).
        res.setHeader('Content-Length', end + 1);
        /** @type {ReadableStream} */
        const fileStream = file.stream({start: 0, end });
        const nodeStream = Readable.fromWeb(fileStream);
        nodeStream.pipe(res);
        // Stop here; otherwise the checks below dereference an undefined range.
        return;
    }

    if (range === -1) {
        return res.status(416).send('Requested range not satisfiable');
    }
    if (range === -2) {
        return res.status(416).send('Invalid range');
    }
    if (range.type !== 'bytes') {
        return res.status(416).send('Invalid range');
    }
    let [{start, end}] = range;

    if (end - start > MAX_CHUNK_SIZE) {
        end = start + MAX_CHUNK_SIZE;
    }

    res.statusCode = 206;
    res.setHeader('Content-Type', mime.getType(file.path));
    res.setHeader('Accept-Ranges', 'bytes');
    res.setHeader('Content-Range', `bytes ${start}-${end}/${file.length}`);
    // Content-Length is the size of the returned range, not of the whole file.
    res.setHeader('Content-Length', end - start + 1);

    try {
        /** @type {ReadableStream} */
        const fileStream = file.stream({start, end});
        const nodeStream = Readable.fromWeb(fileStream);
        nodeStream.pipe(res);
    } catch (error) {
        console.error(error);
        res.status(500).send('An error occurred');
    }
})

app.listen(3000, () => {
    console.log('Server is running at http://localhost:3000/');
});

Interestingly, if I manually empty out the _selections array after the torrent is added, then everything works fine.
However, the deselect method doesn't have the same effect.

If I do this:

/** @type {WebTorrent.Torrent} */
const torrent = await new Promise((resolve, reject) => {
    // ...
});
torrent._selections.length = 0;

then the progress stays the same: if 46% of the torrent was already downloaded, no new chunks are downloaded after the script starts unless a stream is opened. That is the behavior I would expect.

However, if I try to deselect everything the correct way (without using internal properties)...

/** @type {WebTorrent.Torrent} */
const torrent = await new Promise((resolve, reject) => {
    // ...
});
torrent.deselect(0, torrent.pieces.length - 1, 0)

then it doesn't work.

It's weird... I can use this workaround for now, but I would rather not use internal properties.
I'll comment further details if I find the root cause of this issue.

Here's what I've found (these line numbers are all in lib/torrent.js):

On lines 685-695, in the _markUnverified method, all unverified pieces are selected (so one item per piece is added to _selections):

_markUnverified (index) {
  const len = (index === this.pieces.length - 1)
    ? this.lastPieceLength
    : this.pieceLength
  this.pieces[index] = new Piece(len)
  this.bitfield.set(index, false)
  this.select(index, index, 1)
  this.files.forEach(file => {
    if (file.done && file.includes(index)) file.done = false
  })
}

This is fine, but it's important later.

On lines 1052-1057, in the deselect method, in the for loop, we're only looking for selection items that are exact matches. I believe this is the culprit:

// current implementation:
deselect (start, end, priority) {
  if (this.destroyed) throw new Error('torrent is destroyed')

  priority = Number(priority) || 0
  this._debug('deselect %s-%s (priority %s)', start, end, priority)

  for (let i = 0; i < this._selections.length; ++i) {
    const s = this._selections[i]
    if (s.from === start && s.to === end && s.priority === priority) {
      this._selections.splice(i, 1)
      break
    }
  }

  this._updateSelections()
}
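
To make the mismatch concrete, here is a small standalone simulation of the selection bookkeeping. It is a sketch only: `selections`, `select`, and `deselectExact` are plain stand-ins that mirror the quoted logic, not the real Torrent internals, and the piece count of 4 is arbitrary.

```javascript
// Standalone simulation; these are stand-ins, not WebTorrent's real internals.
const selections = [];

// Mirrors the effect of select(): record one selection item.
function select(from, to, priority) {
    selections.push({ from, to, priority });
}

// Mirrors the exact-match loop quoted above; returns true if an item was removed.
function deselectExact(start, end, priority) {
    for (let i = 0; i < selections.length; ++i) {
        const s = selections[i];
        if (s.from === start && s.to === end && s.priority === priority) {
            selections.splice(i, 1);
            return true;
        }
    }
    return false;
}

const pieceCount = 4; // arbitrary
// What _markUnverified does for each unverified piece: select(index, index, 1).
for (let index = 0; index < pieceCount; index++) select(index, index, 1);

// One call covering the whole range matches no item, so every piece stays selected.
const removedWholeRange = deselectExact(0, pieceCount - 1, 0);
console.log(removedWholeRange, selections.length);

// Only a call that repeats the exact from/to/priority triple removes an item.
const removedSinglePiece = deselectExact(0, 0, 1);
console.log(removedSinglePiece, selections.length);
```

In this model, a single deselect call over the whole piece range with priority 0 cannot remove the per-piece items added with priority 1, which is consistent with the behavior described in this issue.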

Is this intentional? I have a couple of questions about this method:

  1. Why is the priority argument necessary here? Don't we just want to remove the given selections? I don't see why knowing its current priority is necessary.
  2. Why are we only matching exact selections? Wouldn't it make more sense to remove every selection item that falls into the given from-to interval?

I believe this could be the partial cause of the "file.deselect supposedly not working" issue (#164)
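
The second question could be sketched roughly as follows. This is not the fix from #2115 or from any PR, only an illustration under simple assumptions: `deselectInterval` is a hypothetical standalone function, selections are plain `{ from, to, priority }` objects, priority is ignored, and items that only partially overlap the interval are left untouched.

```javascript
// Hypothetical sketch, not WebTorrent's implementation.
// Returns a new array without the selections that lie entirely inside [start, end].
function deselectInterval(selections, start, end) {
    return selections.filter((s) => !(s.from >= start && s.to <= end));
}

const before = [
    { from: 0, to: 0, priority: 1 },
    { from: 1, to: 1, priority: 1 },
    { from: 2, to: 5, priority: 0 }, // extends past the interval below, so it is kept
];
const after = deselectInterval(before, 0, 3);
console.log(after);
```

A fuller version would still have to decide how to trim or split items that only partially overlap the interval, and whether priority should play any role.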

commented

This was meant to be fixed in #2115, but Alex has since abandoned it and no one has picked up the PR. Feel free to fix it.

@ThaUnknown Thanks for pointing me in the right direction.

I opened a draft PR with a solution that I came up with. Could you or any other maintainers/contributors check it out?

I'm mainly looking for feedback on it at the moment. Since I'm a first-time contributor, I'd like to find any holes in my implementation.

Your feedback would be greatly appreciated 🙂