ZoneMTA (internal code name X-699)

Modern outbound SMTP relay (MTA/MSA) built on Node.js, LevelDB (queue handling) and MongoDB (queue storage). It's kind of like Postfix for outbound but is able to use multiple local IP addresses and is easily extendable using plugins that are way more flexible than milters.

ZoneMTA is in beta, so handle with care! Currently there's a single ZoneMTA instance deployed to production. It delivers about 500 000 messages per day, with 70-80 messages per second at peak times. More than 20 000 000 messages have been delivered to date.

 _____             _____ _____ _____
|__   |___ ___ ___|     |_   _|  _  |
|   __| . |   | -_| | | | | | |     |
|_____|___|_|_|___|_|_|_| |_| |__|__|

The goal of this project is to provide granular control over routing different messages. Trusted senders can be routed through high-speed (more parallel connections) virtual "sending zones" that use high reputation IP addresses, while less trusted senders can be routed through slower (fewer connections) virtual "sending zones" or through IP addresses with lower reputation. In addition the server comes packed with features more common to commercial software, e.g. message rewriting, IP warm-up or an HTTP API for posting messages.

ZoneMTA is comparable to Haraka but unlike Haraka it's for outbound only. Both systems run on Node.js and have a built-in plugin system even though the designs are somewhat different. The plugin system (and a lot more as well) for ZoneMTA is inherited from the Nodemailer project and thus has no direct relation to Haraka.

There's also a web-based administration interface (needs to be installed separately).

(See all screenshots of ZMTA-WebAdmin here)

Quickstart

Assuming Node.js (v6.0.0+), MongoDB running on localhost, build tools and git. There must be nothing listening on ports 2525 (SMTP), 8080 (HTTP API) and 8081 (internal data channel). All these ports are configurable.

Requirements

Requirements: Node.js v6+ for running the app + compiler for building LevelDB bindings
If running on Windows, install the (free) build dependencies (Python, Visual Studio Build Tools etc.). From an elevated PowerShell (run as administrator) run npm install --global --production windows-build-tools to get these tools.

Create ZoneMTA application

If your user is not able to install global modules with npm then run the first command with sudo, otherwise you do not need root permissions to create or run ZoneMTA applications (at least not as long as you don't want to use privileged ports like 25 or 465).

$ npm install -g zone-mta
$ zone-mta create path/to/app
$ cd path/to/app
$ npm start

If everything succeeds then you should have an SMTP relay with no authentication running on localhost port 2525 (does not accept remote connections).

See default.js for all possible config options that you can use for config.json in your app folder.

Web administration console should be installed separately, it is not part of the default installation. See instructions in the ZMTA-WebAdmin page.

Birds-eye-view of the system

Incoming message pipeline

Messages are dropped for delivery either by SMTP or HTTP API. The message is processed as a stream, so it shouldn't matter if the message is very large in size (except if a very large message is submitted using the JSON API). This also applies to DKIM body hash calculation: the hash is calculated chunk by chunk as the message stream flows through (the actual signature is generated out of the body hash when delivering the message to destination). The incoming stream starts from the incoming connection and ends in LevelDB, so if there's an error in any step between these two, the error is reported back to the client and the message is rejected. If partial data is stored to LevelDB it gets garbage collected after some time (all message bodies without referencing delivery rows are deleted automatically).
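The chunk-by-chunk hashing described above can be sketched with a Node.js Transform stream. This is only an illustration, not ZoneMTA's actual code: the class name HashingPassThrough is made up for this example, and real DKIM body hashing also canonicalizes the body before hashing, which is left out here.

```javascript
'use strict';
const crypto = require('crypto');
const { Transform } = require('stream');

// Pass-through stream that updates a SHA-256 hash with every chunk it sees.
// NOTE: hypothetical helper for illustration; DKIM body canonicalization
// (RFC 6376) is intentionally omitted from this sketch.
class HashingPassThrough extends Transform {
    constructor(options) {
        super(options);
        this.hash = crypto.createHash('sha256');
        this.byteLength = 0;
        this.digest = null; // set once the stream has ended
    }

    _transform(chunk, encoding, callback) {
        this.hash.update(chunk); // only the current chunk is held in memory
        this.byteLength += chunk.length;
        callback(null, chunk); // forward the chunk untouched
    }

    _flush(callback) {
        this.digest = this.hash.digest('base64');
        callback();
    }
}
```

Piping an incoming message through such a stream on its way to storage means the digest is available as soon as the last chunk has passed, without the whole message ever being buffered.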

Outgoing message pipeline

Delivering messages to destination

Features

Web interface. See queue status and debug deferred messages through an easy to use web interface (needs to be installed separately).
Cross platform. You do need compile tools but this should be fairly easy to set up on every platform, even on Windows
Fast. Send millions of messages per day
Send large messages with low overhead
Automatic DKIM signing
Adds Message-Id and Date headers if missing
Sending Zone support: send different messages using different IP addresses
Built-in support for delayed messages. Just use a future value in the Date header and the message is not sent out before that time
Assign specific recipient domains to specific Sending Zones
Queue is stored in LevelDB
Built in IPv6 support
Uses STARTTLS for outgoing messages by default, so no broken padlock images in Gmail
Smarter bounce handling
Throttling per Sending Zone connection
Spam detection using Rspamd
HTTP API to send messages
Route messages to the onion network
Custom plugins
Automatic back-off if an IP address gets blacklisted

Check the WIKI for more details

Configuration

The default configuration can be found in default.js. In your application-specific configuration you override specific options, but you do not need to specify the values that you want to keep as default.

For example if the default.js states an object with multiple properties like this:

{
    mailerDaemon: {
        name: 'Mail Delivery Subsystem',
        address: 'mailer-daemon@' + os.hostname()
    }
}

Then you can override only a single property without changing the other values like this in config.json:

{
    "mailerDaemon": {
        "name": "Override default value"
    }
}
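The override semantics shown above amount to a recursive merge. The following is a minimal sketch of that behavior, assuming nested objects are merged key by key and all other values are replaced. The function names are made up for this example and the hostname is hard-coded; how ZoneMTA merges the files internally is not described here.

```javascript
'use strict';

// True for plain nested config objects, false for arrays and primitives.
function isPlainObject(value) {
    return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Return a new object where values from `overrides` win over `defaults`.
// Nested objects are merged recursively, everything else is replaced.
function mergeConfig(defaults, overrides) {
    const result = Object.assign({}, defaults);
    Object.keys(overrides).forEach(key => {
        if (isPlainObject(result[key]) && isPlainObject(overrides[key])) {
            result[key] = mergeConfig(result[key], overrides[key]);
        } else {
            result[key] = overrides[key];
        }
    });
    return result;
}

const defaults = {
    mailerDaemon: {
        name: 'Mail Delivery Subsystem',
        address: 'mailer-daemon@localhost' // hard-coded stand-in for os.hostname()
    }
};
const merged = mergeConfig(defaults, { mailerDaemon: { name: 'Override default value' } });
console.log(merged.mailerDaemon);
```

Only name changes in the merged result; address keeps its default value.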

Features

Large message support

All data is processed in chunks without reading the entire message into memory, so it does not matter if the message is 1kB or 1GB in size.

LevelDB backend

Using LevelDB means that you do not run out of inodes when you have a large queue, you can pile up even millions of messages (assuming you do not run out of disk space first). Read about storing queued messages to LevelDB in the Wiki. For better performance you can also use alternatives like the Basho fork of LevelDB (see here).

DKIM signing

DKIM signing support is built in to ZoneMTA. You can provide DKIM keys using the built in DKIM plugin (see here) or alternatively create your own plugin to handle key management. ZoneMTA calculates all required hashes and is able to sign messages if a key or multiple keys are provided.

Sending Zone

You can define as many Sending Zones as you want. Every Sending Zone can have its own local address IP pool that is used to send out messages designated for that Zone (IP addresses are not locked, you can assign the same IP for multiple Zones or multiple times for a single Zone). You can also specify the amount of maximum parallel outgoing connections (per process) for a Sending Zone.

Routing by Zone name

To preselect a Zone to be used for a specific message you can use the X-Sending-Zone header key

X-Sending-Zone: zone-identifier

For example if you have a Sending Zone called "zone-identifier" set then messages with such header are routed through this Sending Zone.

NB This behavior is enabled by default only for 'api' and 'bounce' zones, see the allowRountingHeaders option in default config for details

Routing based on specific header value

You can define specific header values in the Sending Zone configuration with the routingHeaders option. For example, if you want to route messages that contain the header 'X-User-ID' with value '123' through a specific Sending Zone, you can configure it like this:

'sending-zone': {
    ...
    routingHeaders: {
        'x-user-id': '123'
    }
}

Routing based on sender domain name

You can also define that all senders with a specific From domain name are routed through a specific Zone. Use the senderDomains option in the Zone config.

'sending-zone': {
    ...
    senderDomains: ['example.com']
}

Routing based on recipient domain name

You can also define that all recipients with a specific domain name are routed through a specific Zone. Use the recipientDomains option in the Zone config.

'sending-zone': {
    ...
    recipientDomains: ['gmail.com', 'kreata.ee']
}

Default routing

The routing priority is the following:

By the X-Sending-Zone header
By matching routingHeaders headers
By sender domain value in senderDomains
By recipient domain value in recipientDomains

If no routing can be detected, then the "default" zone is used.
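The priority order above can be sketched as a single lookup function. This is a rough illustration rather than ZoneMTA's own routing code. The message shape (headers, from, recipient) is hypothetical, it is assumed that a single matching routingHeaders entry is enough to select a Zone, and the check for whether the X-Sending-Zone header is allowed on the submitting interface is left out.

```javascript
'use strict';

// Extract the lowercased domain part of an email address.
function domainOf(address) {
    return String(address || '').split('@').pop().toLowerCase();
}

// zones: { zoneName: { routingHeaders, senderDomains, recipientDomains } }
// message: hypothetical shape { headers: { 'lowercased-key': 'value' }, from, recipient }
function selectZone(zones, message) {
    const names = Object.keys(zones);
    const headers = message.headers || {};

    // 1. Explicit X-Sending-Zone header
    const requested = headers['x-sending-zone'];
    if (requested && Object.prototype.hasOwnProperty.call(zones, requested)) {
        return requested;
    }

    // 2. routingHeaders (assumption: one matching key/value pair is enough)
    const byHeader = names.find(name => {
        const rules = zones[name].routingHeaders || {};
        return Object.keys(rules).some(key => headers[key] === rules[key]);
    });
    if (byHeader) return byHeader;

    // 3. Sender domain listed in senderDomains
    const senderDomain = domainOf(message.from);
    const bySender = names.find(name => (zones[name].senderDomains || []).indexOf(senderDomain) >= 0);
    if (bySender) return bySender;

    // 4. Recipient domain listed in recipientDomains
    const recipientDomain = domainOf(message.recipient);
    const byRecipient = names.find(name => (zones[name].recipientDomains || []).indexOf(recipientDomain) >= 0);
    if (byRecipient) return byRecipient;

    // Nothing matched: fall back to the "default" zone
    return 'default';
}
```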

IPv6 support

IPv6 is supported but not enabled by default. You can enable or disable it per Sending Zone with the ignoreIPv6 option.

HTTP based authentication

If authentication is required then all clients are authenticated against an HTTP endpoint using Basic access authentication. If the HTTP request succeeds then the user is considered authenticated. See more here. If you need some other authentication mechanism then you can create a plugin that handles the 'smtp:auth' hook. To enable authentication you need to set the authentication option to true for that specific SMTP interface.
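On the endpoint side, the check could look roughly like the sketch below. The user store and the function names are hypothetical. The only behavior taken from the description above is that credentials arrive as Basic access authentication and that a successful response means the user is authenticated.

```javascript
'use strict';

// Decode an "Authorization: Basic ..." header value.
// Returns { username, password } or null if the header is missing or malformed.
function parseBasicAuth(headerValue) {
    const match = /^Basic\s+([A-Za-z0-9+/=]+)$/i.exec(String(headerValue || '').trim());
    if (!match) return null;
    const decoded = Buffer.from(match[1], 'base64').toString('utf8');
    const separator = decoded.indexOf(':');
    if (separator < 0) return null;
    return {
        username: decoded.slice(0, separator),
        password: decoded.slice(separator + 1)
    };
}

// Hypothetical in-memory user store, for illustration only.
const users = { andris: 'secret' };

// HTTP status code the endpoint would respond with for a given header value.
function statusFor(headerValue) {
    const credentials = parseBasicAuth(headerValue);
    const ok = credentials !== null &&
        Object.prototype.hasOwnProperty.call(users, credentials.username) &&
        users[credentials.username] === credentials.password;
    return ok ? 200 : 401;
}
```

In a real endpoint, statusFor would be called with the request's authorization header inside an http.createServer handler, and the passwords would not be stored in plain text.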

Per-Zone domain connection limits

You can set connection limits for recipient domains per Sending Zone. For example if you have set max 2 connections to a specific domain then even if your queue processor has free slots and there are a lot of messages queued for that domain it will not create more connections than allowed.

Bounce handling

ZoneMTA tries to guess the reason behind rejecting a message – maybe the message was greylisted or maybe your sending IP is blocked by this recipient. Not every bounce is equal.

If the message hard bounces (or after too many retries for soft bounces) a bounce notification is POSTed to a URL. You can also define that a bounce response is sent to the sender email address. See more here.

Blacklist back-off

If the bounce occurred because your sending IP is blacklisted then this IP gets disabled for that MX for the next 6 hours and the message is retried from a different IP. You can also disable local IP addresses permanently for specific domains with the disabledAddresses option.
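The back-off behavior can be pictured as a small table of (local IP, MX) pairs with an expiry time. The class below is a hypothetical sketch of that idea and not how ZoneMTA stores this state. Only the 6 hour period comes from the description above.

```javascript
'use strict';

const BACKOFF_MS = 6 * 60 * 60 * 1000; // 6 hours, the period mentioned above

class BlacklistBackoff {
    constructor() {
        // "localAddress|mxHost" -> timestamp (ms) until which the pair is disabled
        this.disabledUntil = new Map();
    }

    // Call when a bounce indicates that localAddress is blacklisted at mxHost.
    disable(localAddress, mxHost, now) {
        this.disabledUntil.set(localAddress + '|' + mxHost, now + BACKOFF_MS);
    }

    isDisabled(localAddress, mxHost, now) {
        const until = this.disabledUntil.get(localAddress + '|' + mxHost);
        return until !== undefined && now < until;
    }

    // Return the addresses from the pool that may still be used for this MX.
    usable(pool, mxHost, now) {
        return pool.filter(address => !this.isDisabled(address, mxHost, now));
    }
}
```

The timestamp is passed in explicitly so the behavior is easy to test. A caller would normally pass Date.now().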

Error Recovery

ZoneMTA is an at-least-once delivery system, so messages are deleted from the queue only after positive response from the receiving MX server. If a child starts processing a message the child locks the message and the lock is released automatically if the child dies or master dies. Once normal operations are resumed, the same message can be fetched from the queue again.

Child processes that handle actual delivery keep a TCP connection up against the master process. This connection is used as the data channel for exchanging information about deliveries. If the connection drops for any reason, all current operations are cancelled by the child and non-delivered messages are re-queued by the master. This behavior should limit the possibility of multiple deliveries of the same message. Multiple deliveries can still happen if the process or connection dies at the exact moment when the MX server acknowledges the message and the notification does not get propagated to the master. This risk of multiple deliveries is preferred over losing messages completely.

Messages might get lost if the database gets into a corrupted state and it is not possible to recover data from it.

IP Warm-Up

You can assign a new IP to the IP pool with a lower load share than the other addresses by using the ratio option (a value in the range from 0 to 1 where 0 means that this IP is never used and 1 means that only this IP is used).

{
    pools: {
        default: [
            {name: 'host1.example.com', address: '1.2.3.1'},
            {name: 'host2.example.com', address: '1.2.3.2'},
            {name: 'host3.example.com', address: '1.2.3.3'},
            // the next address gets only 5% of the messages to handle
            {name: 'warmup.example.com', address: '1.2.3.4', ratio: 1/20}
        ]
    }
}

Once your IP address is warm enough then you can either increase the load ratio for it or remove the parameter entirely to share load evenly between all addresses. Be aware though that every time you change pool structure it mixes up the address resolving, so a message that is currently deferred for greylisting does not get the same IP address that it previously used and thus might get greylisted again.
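One way to picture how a ratio and a stable address choice can fit together is the sketch below. It is an assumption-laden illustration, not ZoneMTA's resolver: addresses with a ratio get that share of the [0, 1) range, the rest split the remainder evenly, and a hash of a stable key (for example queue id plus recipient) picks the range. The function names and the use of MD5 are choices made for this example.

```javascript
'use strict';
const crypto = require('crypto');

// Turn the pool into cumulative upper bounds within [0, 1].
function buildRanges(pool) {
    const fixed = pool.filter(entry => typeof entry.ratio === 'number');
    const fixedShare = fixed.reduce((sum, entry) => sum + entry.ratio, 0);
    const freeCount = pool.length - fixed.length;
    // Addresses without a ratio share whatever is left evenly
    const freeShare = freeCount ? Math.max(0, 1 - fixedShare) / freeCount : 0;
    let upper = 0;
    return pool.map(entry => {
        upper += typeof entry.ratio === 'number' ? entry.ratio : freeShare;
        return { entry, upper };
    });
}

// Map a stable key to a point in [0, 1) and pick the range that contains it.
function pickAddress(pool, key) {
    if (!pool.length) return null;
    const digest = crypto.createHash('md5').update(key).digest();
    const point = digest.readUInt32BE(0) / 0x100000000;
    const ranges = buildRanges(pool);
    const hit = ranges.find(range => point < range.upper);
    return (hit || ranges[ranges.length - 1]).entry;
}
```

Because the ranges are derived from the whole pool, adding an address or changing a ratio shifts the boundaries, which is why a deferred message may resolve to a different IP after the pool is edited.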

HTTP API

You can post a JSON structure to an HTTP endpoint (if enabled) and it will be converted into an rfc822 formatted message and delivered to destination. The JSON structure follows Nodemailer email config (see here) except that file and URL access is disabled. You can't define an attachment that loads its contents from a file path or from a URL, you need to provide the file contents as a base64 encoded string.

You can provide the authenticated username with X-Authenticated-User header and originating IP with X-Originating-IP header, both values are optional.

curl -H "Content-Type: application/json" -H "X-Authenticated-User: andris" -H "X-Originating-IP: 123.123.123.123" -X POST  http://localhost:8080/send -d '{
    "from": "sender@example.com",
    "to": "recipient1@example.com, recipient2@example.com",
    "subject": "hello",
    "text": "hello world!"
}'

In the same manner you could upload raw rfc822 message for delivery. In this case the sender and recipient info would be fetched from the message.

curl -H "Content-Type: message/rfc822" -H "X-Authenticated-User: andris" -H "X-Originating-IP: 123.123.123.123" -X POST  http://localhost:8080/send-raw -d 'From: sender@example.com
To: recipient1@example.com, recipient2@example.com
Subject: Hello!

Hello world'
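Since attachments for the JSON endpoint have to be inlined, a payload could be assembled roughly as follows. This sketch assumes the Nodemailer attachment fields filename, content and encoding are accepted by the endpoint as-is. The helper name and the sample file are made up.

```javascript
'use strict';

// Build a JSON API payload with one attachment inlined as base64.
// `fileContents` is a Buffer or string that the caller has already loaded.
function buildPayload(fileName, fileContents) {
    return {
        from: 'sender@example.com',
        to: 'recipient1@example.com',
        subject: 'hello',
        text: 'hello world!',
        attachments: [{
            filename: fileName,
            content: Buffer.from(fileContents).toString('base64'),
            encoding: 'base64' // tells Nodemailer how to decode `content`
        }]
    };
}

// The resulting string can be used as the request body for POST /send
const body = JSON.stringify(buildPayload('notes.txt', 'hello attachment'));
console.log(body);
```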

Zone status

You can check the current state of a sending zone (for example "default") with the following query

curl http://localhost:8080/counter/zone/default

The response includes counters about queued and deferred messages

{
    "active": {
        "rows": 13
    },
    "deferred": {
        "rows": 17
    }
}

You can check counters for all zones with:

curl http://localhost:8080/counter/zone/

Queued messages

You can list the first 1000 messages queued or deferred for a queue

curl http://localhost:8080/queued/active/default

Replace active with deferred to get the list of deferred messages.

The response includes an array of messages

{
    "list": [
        {
            "id":"157ca04cd5c000ddea",
            "zone":"default",
            "recipient":"example@example.com"
        }
    ]
}

Message status in Queue

If you know the queue id (for example 1578a823de00009fbb) then you can check the current status with the following query

curl http://localhost:8080/message/1578a823de00009fbb

The response includes general information about the message and lists all recipients that are currently queued (about to be sent) or deferred (scheduled to be sent in the future). This does not include messages already sent or bounced.

{
    "meta": {
        "id": "1578a823de00009fbb",
        "interface": "feeder",
        "from": "sender@example.com",
        "to": ["recipient1@example.com", "recipient2@example.com"],
        "origin": "127.0.0.1",
        "originhost": "[127.0.0.1]",
        "transhost": "foo",
        "transtype": "ESMTP",
        "time": 1475497588281,
        "dkim": {
            "hashAlgo": "sha256",
            "bodyHash": "HAuESLcsVfL2FGQCUtFOwTL6Ax18XDXZO2vOeAz+DpI="
        },
        "headers": [{
            "key": "date",
            "line": "Date: Mon, 03 Oct 2016 12:26:32 +0000"
        }, {
            "key": "from",
            "line": "From: Sender <sender@example.com>"
        }, {
            "key": "message-id",
            "line": "Message-ID: <95dc84ae-ff9e-4e95-aa75-8ee707bc018d@example.com>"
        }, {
            "key": "subject",
            "line": "subject: test"
        }],
        "messageId": "<95dc84ae-ff9e-4e95-aa75-8ee707bc018d@example.com>",
        "date": "Mon, 03 Oct 2016 12:26:32 +0000",
        "parsedEnvelope": {
            "from": "sender@example.com",
            "to": [],
            "cc": [],
            "bcc": [],
            "replyTo": false,
            "sender": false
        },
        "bodySize": 3458,
        "created": 1475497593204
    },
    "messages": [{
        "id": "1578a823de00009fbb",
        "seq": "002",
        "zone": "default",
        "recipient": "recipient1@example.com",
        "status": "DEFERRED",
        "deferred": {
            "first": 1475499253068,
            "count": 2,
            "last": 1475499774161,
            "next": 1475501274161,
            "response": "450 4.3.2 Service currently unavailable"
        }
    }]
}
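A response like the one above is easy to turn into a short retry report. The sketch below assumes that the numeric fields under deferred are millisecond timestamps and that count is the number of attempts so far. The function name is hypothetical.

```javascript
'use strict';

// `status` is the parsed JSON body of GET /message/<queue id>.
// Returns one summary row per deferred recipient.
function summarizeDeferred(status) {
    return (status.messages || [])
        .filter(entry => entry.status === 'DEFERRED' && entry.deferred)
        .map(entry => ({
            recipient: entry.recipient,
            attempts: entry.deferred.count,
            // assumption: `next` and `last` are Unix timestamps in milliseconds
            nextAttempt: new Date(entry.deferred.next).toISOString(),
            waitMinutes: Math.round((entry.deferred.next - entry.deferred.last) / 60000),
            response: entry.deferred.response
        }));
}
```

For the sample response this yields a single row for recipient1@example.com with a 25 minute gap between the last and the next attempt.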

Message body

If you know the queue id (for example 1578a823de00009fbb) then you can fetch the entire message contents

curl http://localhost:8080/fetch/1578a823de00009fbb

The response is a message/rfc822 message. It does not include a Received header for ZoneMTA or a DKIM signature header, these are added when sending out the message.

Content-Type: text/plain
From: sender@example.com
To: exmaple@example.com
Subject: testmessage
Message-ID: <4f7e73c3-009c-48c2-4b45-1cf20b2fe6d3@example.com>
Date: Sat, 15 Oct 2016 20:24:54 +0000
MIME-Version: 1.0

Hello world! This is a test message
...

List all keys

In case you need to see the internals of the database you can list all keys in it

curl http://localhost:8080/internals/list

The response is a plaintext (utf-8) list of keys in the DB. The final line is special, it includes stats about the listing

message 158e3fe97ca000121a #
message 158e3fe97ca000121a 158e3fe97db000
message 158e3fe97ca000121a 158e3fe97db001
message 158e3fe97ca000121a 158e3fe97db002
message 158e3fe97ca000121a 158e3fe97db003
...
message 158e3fe9903000121a 158e3fe9ed7001
message 158e3fe9903000121a 158e3fe9edf000
message 158e3fe9903000121a 158e3fe9edf001
Listed 1044 keys in 0.19s

Get value of specific key

You can fetch the value of a key with the following call:

curl http://localhost:8080/internals/key?key=KEY_ID

The response is an octet stream with the key contents

<X bytes of binary data>

Delete a specific key

You can also delete a key with the following call:

curl -XDELETE http://localhost:8080/internals/key?key=KEY_ID

The response is a JSON value with a success message

{"message": "Key deleted"}

You get the success message even if the key did not actually exist

Utilities

check-bounces

CLI command that reads an SMTP error response from stdin and returns bounce information

$ echo "552-5.7.0 This message was blocked because its content presents a potential
552-5.7.0 security issue. Please visit
552-5.7.0 http://support.google.com/mail/bin/answer.py?answer=6590 to review our
552 5.7.0 message content and attachment content guidelines. cp3si16622595oec.101 - gsmtp" | check-bounce

> data     : 552-5.7.0 This message was blocked because its content presents a potential
>            552-5.7.0 security issue. Please visit
>            552-5.7.0 http://support.google.com/mail/bin/answer.py?answer=6590 to review our
>            552 5.7.0 message content and attachment content guidelines. cp3si16622595oec.101 - gsmtp
> action   : reject
> message  : Suspicious attachment
> category : virus
> code     : 552
> status   : 5.7.0

TODO

1. Domain based throttling

Currently it is possible to limit active connections against a domain and you can limit sending speed per connection (eg. 10 messages/min per connection) but you can't limit sending speed per domain. If you have set 3 processes, 5 connections and limit sending with 10 messages / minute then what you actually get is 3 * 5 * 10 = 150 messages per minute for a Sending Zone.
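The arithmetic above, and its inverse for working back from a target rate, can be written as two small helpers. The function names are made up for this example.

```javascript
'use strict';

// Effective upper bound for a Sending Zone, in messages per minute.
function zoneRatePerMinute(processes, connectionsPerProcess, perConnectionLimit) {
    return processes * connectionsPerProcess * perConnectionLimit;
}

// Per-connection limit needed to stay at or below a target rate for the whole Zone.
function perConnectionLimitFor(targetPerMinute, processes, connectionsPerProcess) {
    return Math.floor(targetPerMinute / (processes * connectionsPerProcess));
}

console.log(zoneRatePerMinute(3, 5, 10)); // 150
console.log(perConnectionLimitFor(60, 3, 5)); // 4
```

Note that this still caps the Zone as a whole and not an individual recipient domain, which is the gap this TODO item describes.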

2. Web interface

It should be possible to administer queues using an easy to use web interface.

Update There is a web interface that is not open yet as it is still experimental, here's a preview

Update Update There is now a publicly available web interface, called ZMTA-WebAdmin

3. Replace LevelDB with RocksDB

RocksDB has much better performance both for reading and writing but it's more difficult to set up

Update You can use any LevelUp backend module. This module is tested with:

leveldown which is the default
leveldown-basho-andris which is a fork of leveldown that uses Basho fork of LevelDB

To use a different backend than the default leveldown you need to first install it with npm and set the package name as the 'queue'.'backend' config option value.

Personally I prefer the Basho fork. Original LevelDown caused some issues, probably related to compaction, where LevelDB threads were using 100% cpu very often and caused the app to be unresponsive. I have not had these problems with the Basho fork. YMMV

Notes

In production you probably would want to allow Node.js to use more memory, so you should probably start the app with --max-old-space-size option

node --max-old-space-size=8192 app.js

This is mostly needed if you want to allow large SMTP envelopes on submission (eg. someone wants to send mail to 10 000 recipients at once) as all recipient data is gathered in memory and copied around before storing to the queue.

Potential issues with LevelDB

ZoneMTA uses LevelDB as the storage backend. While extremely capable and fast, there is a small chance that LevelDB gets into a corrupted state. There are options to recover from such a state automatically but this usually means dropping a lot of data, so the application makes no automatic attempt to "fix" the corrupt database. What you probably want to do in such a situation is to move the queue folder to some other location for manual recovery and let ZoneMTA start over with a fresh and empty queue folder.

Repair failed queue folder

If your queue folder gets corrupted, take the following steps:

Stop ZoneMTA or make sure it does not restart automatically
Move queue folder to somewhere else
Create new empty queue folder and set ZoneMTA user as the owner of that folder
Start ZoneMTA to start accepting and processing new mail
Use the ZoneMTA Recovery tool to repair the corrupted database and pump messages from it to the fresh ZoneMTA instance or some other SMTP MTA

Replace LevelDB with basho fork of LevelDB

If you use LevelDB as the backend and start having 100% CPU usage then you might have run into endless compaction. Best bet would be to dump LevelDB and start using the Basho fork of it which is more optimized for servers and does not have such problems.

npm install leveldown-basho-andris --save

And then in your config:

{
  ...
  "queue": {
    "db": "/var/data/zone-mta",
    "backend": "leveldown-basho-andris",
    "leveldown-basho-andris": {
        "createIfMissing": true,
        "compression": true,
        "blockSize": 4096,
        "writeBufferSize": 62914560
    }
    ...
  }
}

You can't reuse your old LevelDB files, so you should start with an empty database folder (which in turn means that you lose your existing queue).

License

European Union Public License 1.1 (details)

In general, EUPLv1.1 is compatible with GPLv2, so it's a copyleft license. Unlike GPL the EUPL license has legally binding translations in every official language of the European Union, including the Estonian language. This is why it was preferred over GPL.

ZoneMTA is created and maintained in the European Union, licensed under EUPL and its authors have no relations to the US, thus there can not be any infringements of US-based patents.
