ebernhardson / mwbot

A simple, but flexible MediaWiki bot for Node.js

MWBot

Description

MWBot is a Node.js module for interacting with the MediaWiki API.

The library uses the Promise pattern and, behind the scenes, the NPM request library.

The design goal is to be as flexible as possible, with the ability to overwrite options and behaviour at any point. The library also lets you freely choose the abstraction/convenience level on which you want to work. You can use convenience functions that bundle (with concurrency) multiple API requests into one function, but you can also handcraft your own custom MediaWiki API and pure HTTP requests.

The library has extensive test coverage and is written in modern ECMAScript 2015.

Requirements

  • Node.js 4.0+

Technical API Documentation

Documentation

Typical Example

const MWBot = require('mwbot');

let bot = new MWBot();

bot.loginGetEditToken({
    apiUrl: settings.apiUrl,
    username: settings.username,
    password: settings.password
}).then(() => {
    return bot.edit('Test Page', '=Some more Wikitext=', 'Test Upload');
}).then((response) => {
    // Success
}).catch((err) => {
    // Error
});

For more examples, read the documentation or have a look at the /test directory.
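The typical example above reads its credentials from a settings object that is not defined in the snippet. A minimal sketch of one way to build it, assuming the values come from environment variables; the helper name loadSettings and the variable names MWBOT_API_URL, MWBOT_USERNAME and MWBOT_PASSWORD are hypothetical and not part of mwbot:

```javascript
// Hypothetical helper (not part of mwbot): builds the `settings` object
// used in the typical example from an environment-like object.
// The variable names below are assumptions for illustration.
function loadSettings(env) {
    const required = ['MWBOT_API_URL', 'MWBOT_USERNAME', 'MWBOT_PASSWORD'];
    const missing = required.filter((name) => !env[name]);

    // Fail early with a readable message instead of a failed login later
    if (missing.length > 0) {
        throw new Error('Missing environment variables: ' + missing.join(', '));
    }

    return {
        apiUrl: env.MWBOT_API_URL,
        username: env.MWBOT_USERNAME,
        password: env.MWBOT_PASSWORD
    };
}
```

With this in place, `const settings = loadSettings(process.env);` would provide the object passed to loginGetEditToken().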

Constructor and Settings

constructor(customOptions, customRequestOptions)

Constructs a new MWBot instance.

let bot = new MWBot();

Construct with custom options:

let bot = new MWBot({
    verbose: true
});

Construct with custom request options:

let bot = new MWBot({}, {
    timeout: 160000
});

setOptions(customOptions)

Overwrite/extend the default bot options

bot.setOptions({
    verbose: false,
    silent: false,
    defaultSummary: 'MWBot',
    concurrency: 1,
    apiUrl: false,
    sparqlEndpoint: 'https://query.wikidata.org/bigdata/namespace/wdq/sparql'
});

setGlobalRequestOptions(customRequestOptions)

Overwrite/extend the request options. This may be important for more advanced use cases, e.g. changing the user agent or adding additional authentication or certificates.

bot.setGlobalRequestOptions({
    method: 'POST',
    qs: {
        format: 'json'
    },
    headers: {
        'User-Agent': 'mwbot/1.0.3'
    },
    timeout: 120000, // 120 seconds
    jar: true,
    time: true,
    json: true
})

Login and Session Management

Log in with a username and password. This is necessary for most bot actions. A successful login adds the login token to the bot state.

.login(loginOptions)

bot.login({
  apiUrl: "http://localhost:8080/wiki01/api.php",
  username: "testuser",
  password: "testpassword"
}).then((response) => {
    // Logged In
}).catch((err) => {
    // Could not login
});

.getEditToken()

Fetches an edit token that is needed for certain MediaWiki API actions, like editing pages.

bot.getEditToken().then((response) => {
    // Success
}).catch((err) => {
    // Error: Could not get edit token
});

loginGetEditToken(loginOptions)

Combines .login() and .getEditToken() into one operation for convenience.

setApiUrl(apiUrl)

If no login is necessary for the bot actions, it is sufficient to set the API URL instead of logging in.

bot.setApiUrl('https://www.semantic-mediawiki.org/w/api.php');

Note that it is also possible to set the API URL with the constructor:

let bot = new MWBot({
    apiUrl: 'https://www.semantic-mediawiki.org/w/api.php'
});

CRUD Operations

create(title, content, summary, customRequestOptions)

Creates a wiki page. If the page already exists, it will fail.

bot.create('Test Page', 'Test Content', 'Test Summary').then((response) => {
    // Success
}).catch((err) => {
    // General error, or: page already exists
});

read(title, customRequestOptions)

Reads the content of a wiki page. To fetch more than one page, separate the page names with |

bot.read('Test Page|MediaWiki:Sidebar', {timeout: 8000}).then((response) => {
    // Success
    // The MediaWiki API Result is somewhat unwieldy:
    console.log(response.query.pages['1']['revisions'][0]['*']);
}).catch((err) => {
    // Error
});
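The example above hard-codes the page id '1', which only works for that particular page. A minimal sketch of a helper that collects the content of every returned page by title instead; the name extractPageContents is hypothetical, and the response shape is assumed to be the one shown in the example (query.pages keyed by page id, with the wikitext under revisions[0]['*']):

```javascript
// Hypothetical helper (not part of mwbot): collects the wikitext of every
// page in a bot.read() response, keyed by page title, without assuming
// any particular page id.
function extractPageContents(response) {
    const pages = (response.query && response.query.pages) || {};
    const contents = {};

    Object.keys(pages).forEach((pageId) => {
        const page = pages[pageId];
        // Pages that do not exist carry no `revisions` array; skip them
        if (page.revisions && page.revisions.length > 0) {
            contents[page.title] = page.revisions[0]['*'];
        }
    });

    return contents;
}
```

Inside the .then() callback of bot.read(), `extractPageContents(response)['Test Page']` would then give the wikitext of that page, if it exists.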

update(title, content, summary, customRequestOptions)

Updates a wiki page. If the page doesn't exist, it will fail.

bot.update('Test Page', 'Test Content', 'Test Summary').then((response) => {
    // Success
}).catch((err) => {
    // Error
});

edit(title, content, summary, customRequestOptions)

Edits a wiki page. If the page does not exist yet, it will be created.

bot.edit('Test Page', 'Test Content', 'Test Summary').then((response) => {
    // Success
}).catch((err) => {
    // Error
});

upload(title, pathToFile, comment, customParams, customRequestOptions)

Uploads a file to the wiki. If the file already exists, it will be skipped. Make sure your wiki is configured correctly for file uploads.

bot.upload(false, __dirname + '/mocking/example1.png').then((response) => {
  // Success
}).catch((err) => {
  // Error
});

uploadOverwrite(title, pathToFile, comment, customParams, customRequestOptions)

Like upload(), but will overwrite files on the server.

Convenience Operations

batch(jobs, summary, concurrency, customRequestOptions)

This function makes it more convenient to work with the MediaWiki API. It combines all CRUD operations and additionally manages concurrency, logging, error handling, etc.

let batchJobs = {
    create: {
        'TestPage1': 'TestContent1',
        'TestPage2': 'TestContent2'
    },
    update: {
        'TestPage1': 'TestContent1-Update'
    },
    delete: [
        'TestPage2'
    ],
    edit: {
        'TestPage2': 'TestContent2',
        'TestPage3': Math.random()
    },
    upload: {
        'Image1.png': '/path/to/Image1.png'
    }
};

bot.batch(batchJobs, 'Batch Upload Summary').then((response) => {
    // Success
}).catch((err) => {
    // Error
});

Alternatively, an array-of-arrays notation can be used. The first array item is the operation name, the second declares the page name. All following array items are used as function parameters.

bot.loginGetEditToken(loginCredentials.valid).then(() => {
    return bot.batch([
        [
            'create',
            'TestPage1',
            'TestContent1',
            'Batch Upload Reason'
        ],
        [
            'update',
            'TestPage1',
            'TestContent1-Update',
            'Batch Upload Reason'
        ],
        [
            'delete',
            'TestPage1',
            'Batch Upload Reason'
        ]
    ], false, 1);

}).then((response) => {
    // Success
}).catch((err) => {
    // Error
});
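When the pages to process are already available as a plain object, the array notation can be generated rather than written by hand. A minimal sketch; the helper name toBatchJobs is hypothetical and not part of mwbot, and it only fits operations that take a content argument (create, update, edit), not delete:

```javascript
// Hypothetical helper (not part of mwbot): turns a { pageName: content }
// object into the array notation described above, one job per page.
function toBatchJobs(operation, pages, summary) {
    return Object.keys(pages).map((pageName) => {
        // [operation name, page name, ...function parameters]
        return [operation, pageName, pages[pageName], summary];
    });
}
```

The result can be passed to bot.batch() in place of the literal array in the example above.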

sparqlQuery(query, endpointUrl, customRequestOptions)

Queries triplestores / SPARQL endpoints like those of Wikidata and DBpedia.

let endPoint = 'https://query.wikidata.org/bigdata/namespace/wdq/sparql';
let query = `
    PREFIX wd: <http://www.wikidata.org/entity/>
    PREFIX wdt: <http://www.wikidata.org/prop/direct/>
    PREFIX wikibase: <http://wikiba.se/ontology#>

    SELECT ?catLabel WHERE {
        ?cat  wdt:P31 wd:Q146 .

        SERVICE wikibase:label {
            bd:serviceParam wikibase:language "en" .
        }
    }
`;

bot.sparqlQuery(query, endPoint).then((response) => {
    // Success
}).catch((err) => {
    // Error
});
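The success callback above leaves open what to do with the response. A minimal sketch, assuming the promise resolves with the parsed result in the W3C "SPARQL 1.1 Query Results JSON Format" (rows under results.bindings); whether mwbot hands the response over in exactly this form is an assumption, and extractBindingValues is a hypothetical helper name:

```javascript
// Hypothetical helper (not part of mwbot): pulls the plain values of one
// variable out of a result object in the SPARQL 1.1 JSON results format.
function extractBindingValues(response, variableName) {
    const bindings = (response.results && response.results.bindings) || [];

    return bindings
        // A row may leave a variable unbound; skip those rows
        .filter((binding) => binding[variableName] !== undefined)
        .map((binding) => binding[variableName].value);
}
```

For the query above, `extractBindingValues(response, 'catLabel')` would give the list of labels as strings.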

askQuery(query, apiUrl, customRequestOptions)

Executes an ASK query against a Semantic MediaWiki API.

let apiUrl = 'https://www.semantic-mediawiki.org/w/api.php';
let query = `
    [[Category:City]]
    [[Located in::Germany]] 
    |?Population 
    |?Area#km² = Size in km²
`;

bot.askQuery(query, apiUrl).then((response) => {
    // Success
}).catch((err) => {
    // Error
});
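ASK query strings like the one above consist of conditions in double square brackets followed by printout statements that start with |?. A minimal sketch of assembling such a string from arrays; the helper name buildAskQuery is hypothetical and not part of mwbot:

```javascript
// Hypothetical helper (not part of mwbot): assembles an ASK query string
// from a list of conditions and a list of printout statements.
function buildAskQuery(conditions, printouts) {
    // Each condition is wrapped in [[...]]
    const conditionPart = conditions.map((condition) => '[[' + condition + ']]').join('');
    // Each printout is prefixed with |?
    const printoutPart = printouts.map((printout) => '|?' + printout).join('');

    return conditionPart + printoutPart;
}
```

`buildAskQuery(['Category:City', 'Located in::Germany'], ['Population'])` produces a compact one-line form of the first three lines of the query above.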

Basic Requests

If the standard CRUD requests are not sufficient, you can craft custom requests:

request(params, customRequestOptions)

This request assumes you're acting against a MediaWiki API. It allows you to easily craft custom MediaWiki API requests. It also does basic error handling, and uses the login data if given.

bot.request({
    action: 'edit',
    title: 'Main_Page',
    text: '=Some Wikitext 2=',
    summary: 'Test Edit',
    token: bot.editToken
}).then((response) => {
    // Success
}).catch((err) => {
    // Error
});
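The params object is plain data, so it can be built by a small function before it is handed to bot.request(). A minimal sketch for a purge request that covers several pages at once, relying on the MediaWiki convention of separating multiple titles with |; the helper name buildPurgeParams is hypothetical and not part of mwbot:

```javascript
// Hypothetical helper (not part of mwbot): builds the params object for a
// purge request covering several pages at once.
function buildPurgeParams(titles) {
    return {
        action: 'purge',
        // The MediaWiki API accepts multiple titles separated by "|"
        titles: titles.join('|'),
        forcelinkupdate: true
    };
}
```

The result would be used as `bot.request(buildPurgeParams(['Main Page', 'Test Page']))`.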

rawRequest(requestOptions)

This executes a standard request through the request library. It uses some default requestOptions, but you can overwrite any of them. Use this if you need full flexibility or want to do generic HTTP requests.

bot.rawRequest({
    method: 'GET',
    uri: 'https://jsonplaceholder.typicode.com/comments',
    json: true,
    qs: {
        postId: 1
    }
}).then((response) => {
    // Success
}).catch((err) => {
    // Error
});

Helper Functions

MWBot.logStatus(status, currentCounter, totalCounter, operation, pageName, reason)

Static function that prints "pretty" upload status log messages. It is used internally to print .batch() status messages.

MWBot.logStatus('[+] ', counter, total, 'USER', user.userName);

MWBot.Promise

Injection of a bluebird.js Promise

MWBot.map

Injection of a bluebird.js Promise.map (for concurrent batch requests)

MWBot.mapSeries

Injection of a bluebird.js Promise.mapSeries (for sequential batch requests)

Tips and Tricks

Learn how to use the Promise Pattern! It handles a lot, like concurrency, parallel or sequential requests, etc.

I can recommend the bluebird.js Promise library. This is the Promise library that mwbot is using internally, too.

  • If you want to do batch requests concurrently, use Promise.map
  • If you want to do batch requests in strictly sequential order, use Promise.mapSeries
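To make the "strictly sequential" guarantee concrete, here is a dependency-free sketch that mimics the ordering behaviour of Promise.mapSeries using native Promises. It is a stand-in for illustration only, not bluebird's implementation, and the name runInSeries is hypothetical:

```javascript
// Dependency-free stand-in (not bluebird's implementation) that mimics the
// ordering guarantee of Promise.mapSeries: each worker starts only after
// the previous one has finished.
function runInSeries(items, worker) {
    const results = [];

    return items.reduce((chain, item, index) => {
        return chain
            .then(() => worker(item, index))
            .then((result) => {
                results.push(result);
            });
    }, Promise.resolve()).then(() => results);
}
```

With Promise.map and a concurrency greater than 1, several workers may be in flight at the same time, so the order in which they finish is not guaranteed.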

Complete Examples

Fetch content of a page, change it and upload the changed page

let bot = new MWBot();
bot.loginGetEditToken({
    apiUrl: "http://localhost:8080/wiki01/api.php",
    username: "testuser",
    password: "testpassword"
}).then(() => {
    return bot.read('Main Page');
}).then((response) => {
    let pageContent = response.query.pages['1']['revisions'][0]['*'];
    pageContent += ' Appendix';
    return bot.update('Main Page', pageContent);
}).then((response) => {
    // Success
}).catch((err) => {
    // Error
});
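The read/modify/update chain above can be wrapped in a reusable function. A minimal sketch, assuming bot is an MWBot instance that is already logged in and that the read() response has the shape used in the example; the helper name appendToPage is hypothetical, and it takes the first returned page instead of assuming page id '1':

```javascript
// Hypothetical helper (not part of mwbot): reads a page, appends a suffix
// and writes it back. `bot` is expected to be a logged-in MWBot instance.
function appendToPage(bot, title, suffix, summary) {
    return bot.read(title).then((response) => {
        const pages = response.query.pages;
        // Take the first (only) page instead of assuming page id '1'
        const pageId = Object.keys(pages)[0];
        const page = pages[pageId];

        if (!page.revisions) {
            throw new Error('Page not found: ' + title);
        }

        return bot.update(title, page.revisions[0]['*'] + suffix, summary);
    });
}
```

Because the function returns the promise from bot.update(), it can be chained with .then() and .catch() like any other bot call.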

Concurrent execution of batch jobs with bluebird.js Promise.map

This example takes a list of pages and executes a purge action on each of them. It also demonstrates how to (re)use the static MWBot.logStatus helper function.

let bot = new MWBot();

let pages = [
    'Main Page',
    'Test Page'
];

let pagesTotal = pages.length || 0;
let pageCounter = 0;

bot.loginGetEditToken({
    apiUrl: "http://localhost:8080/wiki01/api.php",
    username: "testuser",
    password: "testpassword"
}).then(() => {

    return MWBot.map(pages, (page) => {

        pageCounter += 1;

        return bot.request({
            action: 'purge',
            titles: page,
            forcelinkupdate: true
        }).then((response) => {

            // Use MWBot.logStatus helper function
            if (response.error) {
                // An error response carries no purge result, so log the requested page name
                MWBot.logStatus('[E] ', pageCounter, pagesTotal, 'PURGE', page);
            } else {
                MWBot.logStatus('[=] ', pageCounter, pagesTotal, 'PURGE', response.purge[0].title);
            }
        }).catch((err) => {
            MWBot.logStatus('[E] ', pageCounter, pagesTotal, 'PURGE', page);
            console.log(err);
        });

    }, {
        concurrency: 2
    }).then((response) => {
        // Success
    }).catch((err) => {
        // Error
    });

}).catch((err) => {
    // Login Error
});
