lucaswerkmeister / m3api

minimal modern MediaWiki API client

Home Page: https://www.npmjs.com/package/m3api

Automatically get and add CSRF token

lucaswerkmeister opened this issue

Many other MediaWiki API libraries support automatically adding a token parameter, transparently getting it from action=query&meta=tokens if it’s not known yet. We should add this to m3api – it’s useful for many different API modules.

The proposed interface is two new request options (i.e. members of the second request() parameter, the options object that follows the params and already holds the method):

  • tokentype: No default. If present, use this token type. (csrf would be the most common type, but there are a few others.)
  • tokenname: Defaults to token. I’m not sure if any API modules use another name (I seem to dimly remember one existing, but can’t find anything in Wikidata paraminfo), but it doesn’t cost much to include. (We can always remove it later if it really turns out to be unnecessary.)

Example:

session.request( {
    action: 'edit',
    // ...
}, {
    method: 'POST',
    tokentype: 'csrf',
} );
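A minimal sketch of how such options might be handled, under stated assumptions: requestWithToken and the savedTokens map are hypothetical names and not part of m3api, and the only thing assumed about session is an async request( params, options ) method that resolves to the parsed JSON response. The token lookup uses action=query&meta=tokens, whose response key is the token type followed by "token" (e.g. csrftoken).

```javascript
// Hypothetical sketch, not m3api's implementation.
// Assumes session.request( params, options ) resolves to the parsed JSON response.
const savedTokens = new Map(); // token type -> token value

async function requestWithToken( session, params, options = {} ) {
    const { tokentype, tokenname = 'token', ...otherOptions } = options;
    if ( tokentype === undefined ) {
        // no token requested: plain pass-through
        return session.request( params, otherOptions );
    }
    if ( !savedTokens.has( tokentype ) ) {
        // token not known yet: get it from action=query&meta=tokens
        const response = await session.request( {
            action: 'query',
            meta: 'tokens',
            type: tokentype,
        }, { method: 'GET' } );
        // the response key is the type plus "token", e.g. csrftoken
        savedTokens.set( tokentype, response.query.tokens[ `${ tokentype }token` ] );
    }
    // add the token under the requested parameter name and send the real request
    return session.request(
        { ...params, [ tokenname ]: savedTokens.get( tokentype ) },
        otherOptions
    );
}
```

With this shape, a second request for the same token type reuses the saved token and costs no extra roundtrip, which is also why stale saved tokens become a concern once the session state changes.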

One potential issue here is that the tokens need to be reset when the session state changes (i.e. the user logs in or out). I’m not sure if we should try to detect this automatically (successful POST to login, clientlogin, or logout?) or just provide a resetTokens() method and tell users to call it after changing the session state.

Hm, that makes me think of another thing as well: after you’re logged in (but not before!), you probably want to include assert=user in all API requests, to protect against accidentally logging out and continuing as an IP? But you can’t include assert=user in the default params at construction time.

This almost feels like you effectively want to have a different Session instance after login, with the same cookie jar, but different default params and saved tokens… 🤔

On the other hand, what should happen to the existing Session in that case? The saved tokens will still be invalid, since you’re sharing a cookie jar.

In Rust, we could say that a login method consumes the old Session and returns a new one, and then the compiler prevents you from continuing to use the old one. But that’s not a thing in JavaScript.

One idea:

  • document session.defaultParams, session.defaultOptions, and session.savedTokens (or session.tokens?) as stable members, which users are allowed to modify
  • after logging in using a bot password, it’s the user’s responsibility to assign session.defaultParams.assert = 'user' and session.savedTokens = {}
  • provide a separate package, m3api-botpassword, with a login method that does all the steps
  • a similar m3api-oauth2 package (see #9) also sets session.defaultOptions.authorize (HTTP Authorization header; not yet supported but planned in the back of my head)
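The steps such a login helper would take can be sketched as follows. This is only an illustration under assumptions: the login function is hypothetical (no m3api-botpassword API is being quoted here), session.defaultParams and session.savedTokens are the writable members proposed in the list above, and session.request( params, options ) is assumed to resolve to the parsed JSON response.

```javascript
// Hypothetical sketch of a bot password login helper; not an existing package's API.
// Assumes session.request( params, options ) resolves to the parsed JSON response,
// and that session.defaultParams / session.savedTokens are writable members.
async function login( session, username, password ) {
    // step 1: get a login token
    const tokenResponse = await session.request( {
        action: 'query',
        meta: 'tokens',
        type: 'login',
    }, { method: 'GET' } );
    // step 2: log in; note the prefixed parameter names (lg...)
    const response = await session.request( {
        action: 'login',
        lgname: username,
        lgpassword: password,
        lgtoken: tokenResponse.query.tokens.logintoken,
    }, { method: 'POST' } );
    if ( response.login.result !== 'Success' ) {
        throw new Error( `Login failed: ${ response.login.result }` );
    }
    // step 3: from now on, assert that we are still logged in
    session.defaultParams.assert = 'user';
    // step 4: tokens saved before login are no longer valid
    session.savedTokens = {};
    return response.login;
}
```

Keeping these steps in a separate helper means the core session never has to guess whether a given POST changed the session state.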

In case it's helpful, here's what I ended up with for mwapi (Rust):

assert=[whatever] is set for all requests except action=login, and meta=tokens&type=login. Hopefully people use OAuth though and avoid the login endpoint entirely.
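As a rough JavaScript rendering of that rule (not the mwapi code, which is Rust; withAssert is a hypothetical name used only for illustration), the exception for the login request and the login token request might look like this:

```javascript
// Hypothetical sketch of the rule described above, not the mwapi implementation.
// Adds assert=<assertValue> to params unless the request is part of logging in.
function withAssert( params, assertValue ) {
    const isLogin = params.action === 'login';
    const isLoginTokenQuery = params.action === 'query' &&
        params.meta === 'tokens' &&
        params.type === 'login';
    if ( isLogin || isLoginTokenQuery ) {
        // not logged in yet, so the assertion would fail
        return params;
    }
    return { ...params, assert: assertValue };
}
```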

I think you can avoid the entire resetting session issue by just retrying on "badtoken" errors. And really that's the benefit of having a convenient post_with_token interface, that you have enough information to retry. https://gitlab.com/mwbot-rs/mwbot/-/blob/master/mwapi/src/client.rs#L295 is my implementation.

I think retries are especially necessary for browser code where people could be adjusting or messing with their session/cookies in another tab, and you have no way to intercept or listen to those kinds of events.
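A minimal JavaScript sketch of that retry idea, under stated assumptions: postWithToken here is a hypothetical function (named after the Rust interface mentioned above, but not a port of it), session.request( params, options ) is assumed to resolve to the parsed JSON response, and API errors are assumed to reject with an error object carrying the API error code in a code property.

```javascript
// Hypothetical sketch: retry once when the API reports a "badtoken" error.
// Assumes session.request() rejects with an error whose `code` is the API error code.
async function postWithToken( session, params, tokentype, tokens = new Map() ) {
    async function getToken() {
        if ( !tokens.has( tokentype ) ) {
            const response = await session.request( {
                action: 'query',
                meta: 'tokens',
                type: tokentype,
            }, { method: 'GET' } );
            tokens.set( tokentype, response.query.tokens[ `${ tokentype }token` ] );
        }
        return tokens.get( tokentype );
    }

    for ( let attempt = 0; ; attempt++ ) {
        const token = await getToken();
        try {
            return await session.request( { ...params, token }, { method: 'POST' } );
        } catch ( error ) {
            if ( error.code !== 'badtoken' || attempt >= 1 ) {
                // unrelated error, or the fresh token was rejected too: give up
                throw error;
            }
            // the saved token is stale (e.g. session changed): drop it and retry once
            tokens.delete( tokentype );
        }
    }
}
```

Because the function knows both the original params and the token type, it has everything it needs to repeat the request, which is not true of a caller that pasted the token into the params by hand.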

Thanks, that helps a lot!

I like the idea of resetting the tokens on token errors. I think I still want to have the separate packages that wrap the two recommended ways to authenticate (bot passwords and OAuth2), both to make the interface more convenient and to avoid the network roundtrip of a badtoken error. (In the most common case, even users who don’t reset the tokens wouldn’t get a badtoken error, since they’d only get a login token before login and never use it again afterwards.) But having dedicated handling for badtoken errors is a great fallback for that, or, as you mentioned, for when the session gets modified in the background (in the browser).

tokenname: Defaults to token. I’m not sure if any API modules use another name (I seem to dimly remember one existing, but can’t find anything in Wikidata paraminfo), but it doesn’t cost much to include. (We can always remove it later if it really turns out to be unnecessary.)

Nah, let’s turn this around, and only add tokenname if somebody asks for it. It’s probably not needed.

Scratch that, in some modules the token parameter name is prefixed (e.g. action=login, lgtoken):

$ curl -s 'https://www.wikidata.org/w/api.php?action=paraminfo&modules=main+*&formatversion=2&format=json' | jq -r '.paraminfo.modules | .[] | select(.parameters | .[] | select(.name == "token" or .tokentype)) | select((.prefix | length) > 0) | .name'
changeauthenticationdata
clientlogin
createaccount
linkaccount
login

So we need tokenname after all.
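For readers less used to jq, the filter above can be read roughly as the following JavaScript over the same paraminfo response. The function names are hypothetical; the fields used (modules, name, prefix, parameters, tokentype) are the ones the jq filter itself accesses. One small difference: jq would print a module once per matching parameter, while this sketch lists each module once.

```javascript
// Rough JavaScript reading of the jq filter above; function names are hypothetical.
// `response` is the parsed JSON of the action=paraminfo request.

// Does this parameter look like a token parameter?
function isTokenParameter( parameter ) {
    return parameter.name === 'token' || Boolean( parameter.tokentype );
}

// Names of modules that take a token and have a non-empty parameter prefix.
function prefixedTokenModules( response ) {
    return response.paraminfo.modules
        .filter( ( module ) => module.parameters.some( isTokenParameter ) )
        .filter( ( module ) => ( module.prefix || '' ).length > 0 )
        .map( ( module ) => module.name );
}

// Full token parameter name for one module (prefix plus unprefixed name),
// or null if the module has no token parameter.
function tokenParameterName( module ) {
    const parameter = module.parameters.find( isTokenParameter );
    return parameter ? ( module.prefix || '' ) + parameter.name : null;
}
```

For a module with prefix lg and a parameter named token, tokenParameterName yields lgtoken, which is the value a caller would pass as tokenname.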