amphp / websocket-client

Async WebSocket client for PHP based on Amp.

Home Page: https://amphp.org/websocket-client

Adding/switching to an EventEmitter interface?

toverux opened this issue

I planned to use this library as a wrapper around Slack's API. As I'm writing a public library, the raw websockets client needs to be hidden and decorated by my own interface.

The problem is that the current interface leads to horrible gymnastics (only onOpen is shown here, but the same applies to onData and onClose):

class RtmClient extends WebClient
{
    public function start() : Promise
    {
        return pipe($this->callAsync('rtm.start'), function(array $rtmInfo) {
            $onOpen = \Closure::fromCallable([$this, 'handleOpen']);

            $repeater = new class($onOpen) implements Websocket {
                private $onOpen;

                public function __construct(callable $onOpen)
                {
                    $this->onOpen = $onOpen;
                }

                public function onOpen(Websocket\Endpoint $endpoint, array $headers)
                {
                    call_user_func($this->onOpen, $endpoint, $headers);
                }
            };

            return websocket($repeater, new Websocket\Handshake($rtmInfo['url']));
        });
    }

    private function handleOpen(Websocket\Endpoint $endpoint, array $headers) : void
    {
        // ...
    }
}

To avoid that (which is especially hideous and non-performant), I could also make RtmClient implement the Amp\Websocket interface, renaming handleOpen to onOpen (and likewise for onData, etc.) and making these methods public. But that makes them part of my public API.

Since this library is somewhat low level, I don't think the current API, although neat, is well suited; in most cases you'll want to decorate this websocket client and emit higher-level objects. IMO, a simple EventEmitter would make the component more easily reusable, because the current Amp\Websocket interface makes that task too hard.
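For illustration, here is a minimal sketch of what an EventEmitter-style hook could look like. SimpleEmitter and DemoClient are hypothetical names, not part of amphp/websocket-client; the point is only that the decorating class can keep its handlers private.

```php
<?php
// Hypothetical minimal emitter; not part of amphp/websocket-client.
final class SimpleEmitter
{
    /** @var array<string, callable[]> */
    private $listeners = [];

    public function on(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    public function emit(string $event, ...$args): void
    {
        // Call every listener registered for this event, in registration order.
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener(...$args);
        }
    }
}

// A decorating client subscribes with a closure, so handleOpen() stays private.
final class DemoClient
{
    private $opened = false;

    public function __construct(SimpleEmitter $socket)
    {
        $socket->on('open', function (array $headers): void {
            $this->handleOpen($headers);
        });
    }

    public function isOpen(): bool
    {
        return $this->opened;
    }

    private function handleOpen(array $headers): void
    {
        $this->opened = true;
    }
}

$socket = new SimpleEmitter();
$client = new DemoClient($socket);
$socket->emit('open', ['Upgrade' => ['websocket']]);
```

With this shape, no anonymous repeater class is needed and nothing extra leaks into the wrapper's public API.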

Your thoughts?

Regards,

I do fully agree … I just haven't given this library much love yet. It's quite a rip-off of the Aerys server API (where it makes much more sense).

I do not think we necessarily need an EventEmitter and think that there is a more idiomatic way in a Promise environment.

Basically, individual frames could be represented as an observable, i.e. the observable provides a new value on each new frame and it will be resolved when the Server or Client close the connection.

With the current Amp (i.e. v1) this would look a bit like:

// Websocket\Frames extends PromiseStream
return websocket($handshake, function(Websocket\Endpoint $endpoint, array $headers, Websocket\Frames $frames) {
    while (yield $frames->valid()) {
        $frame = yield $frames->consume();
        /* work on the current frame */
    }
    /* we're finished here and may want to access the Close frame message */
    list($closeId, $closeMessage) = yield $frames;
});

What do you think? That way the whole communication is nicely encapsulated in a single Generator.

Yeah, I saw that this library is not actively maintained; too bad, because I think PHP's ecosystem really needs these kinds of libraries nowadays.

Anyway, the API proposal you just posted is very elegant!

Edit: I presume you don't have much time to make such breaking changes to this library, at least before Amp v2 and its observables. If you can confirm that, maybe I should just fork the lib and patch it according to my needs?

@toverux thanks for the feedback, I'll try to find some time in the next days improving this API :-)

This lib is not unmaintained; there are just many Amp libraries. I cannot do everything at once, and we're also migrating to Amp v2 right now … And we're ultimately just 3-4 people doing the whole work. You've pointed out an issue and I'll tackle it now. Reported issues show which points need more immediate attention.

Thus, thanks for pointing out what needed an improvement!

commented

So ... I 100% agree we need a strong php websocket client library.

Let's use this thread to crowd-source the specific API we want to see out of this client websocket lib. We're strained for resources because there's so much in-flight amp async work all happening concurrently (haha see what I did there?) ... but let's do the planning here and we'll implement it. I'll add ideas for the API soon ...

commented

So here's my first pass at an API. Please share thoughts and ask questions if you're unclear how to accomplish something in particular using the functionality shown below.

Public API things

interface Websocket
{
    function listen(): \Generator<Awaitable<Data>>;
    function send(string $data): Awaitable<int>;
    function close(int $code = Code::NORMAL_CLOSE, string $reason = ""): Awaitable<void>;
    function info(): Awaitable<Info>;
}
interface Data
{
    // We would just use polymorphism to implement this interface
    // for objects like Frame, Control, Message, etc ...
    function getType(): int; // (is this a data frame, data message, control frame, etc)
    function getPayload(): string;
}
interface Info
{
    /*
    // An immutable value object exposing accessor
    // methods for info about connection, e.g.:
        - bytes read
        - bytes sent
        - frames read
        - frames sent
        - messages read
        - messages sent
        - timestamps, etc
    */
}

function websocket($uriOrRequest, array $options = []): \Generator<Websocket>
{
    $httpRequestObj = generateHttpRequest($uriOrRequest);
    $uriAuthority = fetchAuthorityFromRequest($httpRequestObj);
    $connection = yield connectSocket($uriAuthority);
    $info = yield doHandshake($httpRequestObj);

    return new WebsocketImplementationClass($connection, $info);
}

Alternatively we could also encapsulate the websocket() function logic in a connect() method on the Websocket interface and just have it return an Awaitable. In that case we'd just need to either (1) throw from the other methods if not yet connected or (2) make them all return Awaitable objects and auto-connect if one is called and the connection is not yet live. I prefer option 2 in that case.

Example Usage

use function Amp\execute;
use function Amp\repeat;
use function Amp\once;
use function Amp\wrap;

function myWebsocketClientApp(string $uri): \Generator
{
    $websocket = yield from websocket($uri);

    // schedule the connection to close in 30 seconds because why not
    once([$websocket, "close"], 30000);

    // say hello to the server every 5000ms just because we can
    repeat(wrap([$websocket, "send"]), 5000, "hello from the client");

    // listen to all data received from the server and echo it back
    foreach ($websocket->listen() as $nextDataPromise) {
        $data = yield $nextDataPromise;
        $payload = $data->getPayload();
        yield $websocket->send($payload); // echo it back
    }

    // we'll arrive here in ~30 seconds because when the connection
    // closes our listen() generator will complete
}

execute(wrap(myWebsocketClientApp("ws://foo.com/chat")));

The client could be configured via options to do things like not yield individual data frames or control frames and buffer data until full messages are received. Basically any sort of thing we'd want to configure could be supported by the initial $options array with sane defaults to show or hide as much protocol level detail as a user wants. Likewise supporting either a string URI input or a full blown HTTP request object provides users as much or as little control over the websocket handshake and protocol details as they like.

@rdlowrey There's a fundamental flaw with listen(), PHP has no yield foreach or such.

Each time a generator is advanced, something must be returned immediately, whether it's null or an Awaitable or whatever. I.e. at the time listen() is iterated, we may not know yet whether there will be any future frames … Thus, if we want to iterate that way, we'll have to either resolve the last returned Awaitable without any emits and force the user to do an if ($data == "") … or fail it (i.e. cause a throw in the coroutine) and force them to add a try/catch.
Both aren't very elegant solutions IMHO.

Additionally, $data->getPayload(): string is not an option - at least not if we want to allow the server to send longer bodies - same reason why we have Aerys\Body.

what'd work would be

while (yield $websocket->next()) {
    $stringdata = yield $websocket->consume();
   // if we want to still use a Data class: $
}

You've designed a Data class. What for? The only thing we care about is data. The other things are all pings or closes: control frames, which are handled by the websocket endpoint itself. Continuation frames are handled by the ws as well. We are left with text vs binary, which really only matters for clients that distinguish binary from UTF-8 strings. PHP doesn't. And if you need to be really sure, the classical if (preg_match("//u", $str)) is just as effective. We might add an option to the ws to only allow UTF-8 frames, but that's about it.

One thing I like in your proposal is yield websocket($UriOrHandshake). Thus I suggest:

$f = function() {
    // any coroutine
    $ws = yield websocket("ws://foo.com/chat");

    $headers = $ws->getHeaders();
    if (!in_array("BAR", $headers["FOO"])) {
        $ws->close();
        return;
    }

    while (yield $ws->next()) {
        $frame = yield $ws->getCurrent();
        /* work on the current frame; e.g. */
        $ws->send(yield $frame);
    }
    /* we're finished here and may want to access the Close frame message */
    list($closeId, $closeMessage) = yield $frames;
};

execute($f());

with:

class Websocket\Frame extends Observer implements Observable {} /* similar impl to Aerys\Body */
class Websocket extends Observer {
    function send(string $data): Awaitable<int>;
    function close(int $code = Code::NORMAL_CLOSE, string $reason = ""): Awaitable<null>;
    function info(): Awaitable<Info>;
    /* inherited from Observer: */
    function getCurrent(): Websocket\Frame;
    function next(): Awaitable<bool>;
}

@bwoebi Is there a reason to expose the websocket frames in this API? I wouldn't expect users to assemble frames into messages. getCurrent() should be returning Websocket\Message, perhaps this is what you meant?

commented

Round 3

So here's a reboot taking into account feedback so far. The only thing this iteration is missing is any potential backpressure from observer/observables. I'm not totally convinced that we need backpressure on the client side of a websocket, but I'm open to it.

Please share thoughts on or alternatives to what's below ...

Public API Things

<?php

/**
 * Connect to a websocket endpoint
 *
 * @param mixed $uriOrRequest
 * @param array $options
 * @return Awaitable<Websocket>
 */
function websocket($uriOrRequest, array $options = []): Awaitable;

interface Websocket
{
    /**
     * What to do when a data message is received in full
     *
     * If you pass a callback here then you're implicitly opting-in to all
     * frames of a message being buffered in memory until the message is
     * received in full.
     *
     * Callback signature: function (Message $message);
     *
     * @param callable $func the callback
     * @return void
     */
    function onMessage(callable $func);

    /**
     * Supply a callback if you care about frame-level granularity
     *
     * This method would also allow you to consume control frames if you're
     * interested in them (you could check a method on the emitted Frame object
     * to know what type of frame it was).
     *
     * If you pass a callback here then you're implicitly opting-in to frames
     * being fully buffered in memory.
     *
     * Callback signature: function (Frame $f);
     *
     * @param callable $func the callback
     * @param int $frameBitmask specify which frame types should be emitted
     * @return void
     */
    function onFrame(callable $func, int $frameBitmask);

    /**
     * If you want to bypass the frame construct altogether and just stream
     * data at a granularity smaller than individual data frames.
     *
     * By specifying $chunkSize in bytes your callback would be notified any
     * time that minimum threshold of data bytes was buffered (or when a message
     * payload was fully received if less than the threshold).
     *
     * Callback signature: function (string $dataPart);
     *
     * @return void
     */
    function onDataPart(int $chunkSize, callable $func);

    // the rest of these should be self-explanatory
    function send(string $data): Awaitable;
    function close(int $code, string $reason = ""): Awaitable;
    function info(): Info;
}

Usage

function () { // any coroutine
    $ws = yield websocket("ws://foo.com/chat");
    // do stuff with the resulting websocket object here
};

Here is a rough draft of a websocket connection API, modeling the connection as an observable set of messages.

use Amp\Observable;
use Interop\Async\Awaitable;

/**
 * Connect to a websocket endpoint
 *
 * @param mixed $uriOrRequest
 * @param array $options
 * @return Awaitable<Connection>
 */
function websocket($uriOrRequest, array $options = []): Awaitable;

class Connection {
    /**
     * Returns an observable of messages received from the websocket.
     *
     * @return Observable<Message, CloseStatus> Received messages and eventual close status.
     */
    function listen(): Observable;

    /**
     * Combined send method for strings or Message objects. Could be separate methods.
     *
     * @param string|Message $data
     * @param bool $isBinary Ignored if the first param is a Message instance.
     *
     * @return Awaitable<int> Number of bytes sent.
     */
    function send($data, bool $isBinary = false): Awaitable;

    /**
     * @param int $code One of the CloseStatus constants.
     * @param string $message Optional close message.
     *
     * @return Awaitable<int> Number of bytes sent.
     */
    function close(int $code = CloseStatus::NORMAL, string $message = ''): Awaitable;

    /**
     * @return string[][] Response headers.
     */
    function getHeaders(): array;
}

class Message extends Observer implements Observable {
    // Similar to Aerys\Body as @bwoebi suggested.

    public function isBinary(): bool;
}

class CloseStatus {
    const NORMAL =        1000;
    const GOING_AWAY =    1001;
    const PROTOCOL =      1002;
    const BAD_DATA =      1003;
    const NO_STATUS =     1005;
    const ABNORMAL =      1006;
    const INVALID_DATA =  1007;
    const VIOLATION =     1008;
    const TOO_BIG =       1009;
    const EXTENSION =     1010;
    const SERVER_ERROR =  1011;
    const TLS_ERROR =     1015;


    public function getStatus(): int; // Returns one of the constants above.
    public function getMessage(): string;
}

// Example usage:

Amp\execute(function () {
    $ws = yield websocket('ws://web.socket/endpoint');

    $messages = new Observer($ws->listen());

    while (yield $messages->next()) {
        $message = $messages->getCurrent();

        $data = yield $message; // Message can be streamed using $message->next() / $message->getCurrent().

        $ws->send("Message received.");
    }

    $closeStatus = $messages->getResult();
});

commented

function send($data, bool $isBinary = false): Awaitable;

I dislike optional boolean params. I'd rather have separate binary/non-binary send methods or (more preferably) just have the API internally check if there are non-utf8 characters in the data (a single regex call) and send the correct thing automatically. This latter approach is what I've been leaning towards for a while now.

Also, I don't really see why the compound observable publication is necessary. A "close" is a message just the same as any other. There's no reason to model it as a separate thing in the stream of received messages. This is what I was getting at in my initial Data concept ... you just have types of messages.

Also not a fan of function getHeaders(): array; ... would rather see that as a data element returned from a broader function info(): Info; method because there's a lot of other valuable meta information about the connection we should expose ... not just headers.

Unless your binary happens to be valid UTF-8... I'd rather just have a separate method, or force creating a Message object to send binary.
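To make the point concrete, here is a small sketch of the single-regex check being discussed. isValidUtf8 is a hypothetical helper name; the only library call is preg_match() with the /u modifier, which returns false rather than 0 when the subject is not well-formed UTF-8.

```php
<?php
// Returns true when $data is well-formed UTF-8.
function isValidUtf8(string $data): bool
{
    return preg_match('//u', $data) === 1;
}

$text = "h\u{e9}llo";                  // ordinary UTF-8 text
$invalid = "\xC3\x28";                 // broken two-byte sequence
$binary = pack('C*', 0x41, 0x42, 0x43); // raw bytes that also happen to be valid UTF-8
```

The third value is the problematic case: the check cannot tell that the sender meant those bytes as binary.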

commented

but if your binary is valid UTF-8 then you don't care ... the binary flag is strictly for transport, not for interpretation of what the data means, right?

I believe it's supposed to be for data interpretation... I really don't see the need for the distinction, but it's part of the protocol...

I'm also not sure if websockets should be handled as streaming or message based... and I don't think the RFC is sure either. At least the proposed interface above allows buffering whole messages or streaming.

commented

alternative send() idea:

function send(string $data, int $mode = self::MODE_AUTO): Awaitable<void>;

Where you have MODE_AUTO, MODE_TEXT and MODE_BINARY ... auto mode would try to figure it out. Or if we think auto mode is a bad idea we could just nix it and expect people to explicitly say, "this is text/binary mode" ... I'd just rather see a self-documenting const there than a boolean. And I don't love the idea of adding a second method just for that.
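A rough sketch of how such a mode argument might map to a frame opcode. The SendMode and Opcode classes and resolveOpcode() are hypothetical; the opcode values 0x1 (text) and 0x2 (binary) come from RFC 6455, and the auto branch assumes the UTF-8 regex check mentioned earlier in the thread.

```php
<?php
// Hypothetical constants; only the MODE_* idea comes from the discussion.
final class SendMode
{
    const AUTO = 0;
    const TEXT = 1;
    const BINARY = 2;
}

// Data-frame opcodes as defined by RFC 6455.
final class Opcode
{
    const TEXT = 0x1;
    const BINARY = 0x2;
}

function resolveOpcode(string $data, int $mode = SendMode::AUTO): int
{
    switch ($mode) {
        case SendMode::TEXT:
            return Opcode::TEXT;
        case SendMode::BINARY:
            return Opcode::BINARY;
        case SendMode::AUTO:
            // Auto mode: well-formed UTF-8 goes out as text, anything else as binary.
            return preg_match('//u', $data) === 1 ? Opcode::TEXT : Opcode::BINARY;
        default:
            throw new \InvalidArgumentException("Unknown send mode: {$mode}");
    }
}
```

An explicit mode always wins, so callers who distrust the guess can bypass it.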

commented

I think we're officially all on board with this opening function so we can stop including it in our code snippets:

<?php

/**
 * Connect to a websocket endpoint
 *
 * @param mixed $uriOrRequest
 * @param array $options
 * @return Awaitable<Connection>
 */
function websocket($uriOrRequest, array $options = []): Awaitable;

@trowski yes, I indeed meant message, sorry for the wrong class name ;o)

@rdlowrey I dislike auto-modes in context of encodings because they may yield wrong results (like interpret data as TEXT while meant as binary). I don't see how this could work 100% reliably.

Can we please just have send($text) and sendBinary($text) and be fine?

@rdlowrey The close status is the completion value of the observable, analogous to the return value of a Generator. It's not emitted as part of the set of values, but a separate value after emitting has completed.
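The Generator analogy can be seen with plain PHP 7 generators, no Amp involved. In this sketch receive() is a hypothetical stand-in for a message source: the yielded values are the "emitted" messages, and the return value plays the role of the close status.

```php
<?php
// Yields each message, then returns the close status as the completion value.
function receive(array $incoming): \Generator
{
    foreach ($incoming as $message) {
        yield $message;
    }

    return ['code' => 1000, 'reason' => 'done'];
}

$stream = receive(['first', 'second']);

$seen = [];
foreach ($stream as $message) {
    $seen[] = $message; // only yielded values show up in the loop
}

// The return value is available only once the generator has finished.
$closeStatus = $stream->getReturn();
```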

@bwoebi I'm all for separate send methods, perhaps:

function send(string $data): Awaitable;
function sendBinary(string $data): Awaitable;
function sendMessage(Message $message): Awaitable;

I think I agree with Daniel on merging headers into Info… but… is there any reason why Info actually needs to be a separate object and can't just be part of the Connection itself?

@trowski I like your API in general, though we should make status/message properties instead. No need for getter methods there.

@trowski sendMessage() is interesting. It could be used for streaming data over websocket (i.e. in the same frame). Aerys should also have such a method I think.

commented

Moar proposal:

<?php

interface Connection
{
    const MODE_TEXT = 1;
    const MODE_BINARY = 2;

    /**
     * Observe messages received from the websocket connection.
     *
     * By default only text, binary and close data is emitted.
     *
     * @param int $types Bitmask determining which data types will be emitted
     *                   from the returned observable
     * @return Observable<Message> Received data frames/messages/etc
     */
    function listen(int $types = Message::DATA | Message::CLOSE): Observable<Message>;

    /**
     * Send a data message.
     *
     * Method internally determines from the $data if it's binary or not
     * and transmits the message appropriately.
     * 
     * @param string $data The message payload
     * @param int $mode Is this TEXT or BINARY?
     * @return Awaitable<void>
     */
    function send(string $data, int $mode): Awaitable<void>;

    /**
     * Send a stream of frames constituting a larger message.
     *
     * Each string element of the iterable is sent as its own frame as part
     * of a larger message. The message is complete once the iterable is no
     * longer valid. Any Awaitables present in the iterable are resolved prior
     * to sending. Empty strings are ignored (as are Awaitables resolving to
     * an empty string). An Awaitable resolving to a non-string will error.
     *
     * @param array|Iterator $iterable a "stream" of frame strings or Awaitables
     * @param int $mode Is this TEXT or BINARY?
     * @return Observable<string> Emits the payload of each frame once sent
     */
    function stream($iterable, int $mode): Observable<string>;

    function close(int $code, string $reason = ""): Awaitable;
    function info(): Info;
}

interface Message
{
    const CONTROL   = 0b00010000;
    const HEARTBEAT = 0b00010100;
    const PING      = 0b00010101;
    const PONG      = 0b00010110;
    const CLOSE     = 0b00011000;
    const TEXT      = 0b10100000;
    const BINARY    = 0b11000000;
    const DATA      = 0b10000000;

    const FROM_PARTIAL = 0b00;
    const FROM_FRAME   = 0b01;
    const FROM_MESSAGE = 0b11;

    /**
     * Retrieve the 
     */
    function getType(): int;

    /**
     * Retrieve the payload of the "message"
     *
     * @return string
     */
    function getPayload(): string;

    /**
     * We can't assume we want to buffer full frames/messages. Initial
     * configuration of the connection determines the threshold for emission.
     * This function returns a bitmask of FROM_* constants explaining why this
     * message was emitted from the observable:
     *
     *  - a connection buffer size limit was reached
     *  - a protocol frame boundary was reached
     *  - a protocol message boundary was reached
     *
     * This allows us to configure the connection to notify us of data at any
     * granularity the user prefers ... including treating the data as just a
     * stream of bytes without being forced to buffer protocol frames which are
     * allowed to reach 2^64 bytes in size.
     *
     * @return int
     */
    function from(): int;
}
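One thing worth checking with these bit patterns is how a type gets tested against a category. A sketch, assuming the constant values exactly as proposed above (MessageType and isOfType are hypothetical names): a plain "non-zero AND" test misfires because CLOSE shares the control bit with HEARTBEAT, whereas comparing the masked value against the category behaves as intended.

```php
<?php
// Constant values copied from the proposal; the class name is made up.
final class MessageType
{
    const CONTROL   = 0b00010000;
    const HEARTBEAT = 0b00010100;
    const PING      = 0b00010101;
    const PONG      = 0b00010110;
    const CLOSE     = 0b00011000;
    const TEXT      = 0b10100000;
    const BINARY    = 0b11000000;
    const DATA      = 0b10000000;
}

// True when every bit of $category is also set in $type.
function isOfType(int $type, int $category): bool
{
    return ($type & $category) === $category;
}

$textIsData       = isOfType(MessageType::TEXT, MessageType::DATA);
$pingIsHeartbeat  = isOfType(MessageType::PING, MessageType::HEARTBEAT);
$closeIsHeartbeat = isOfType(MessageType::CLOSE, MessageType::HEARTBEAT);

// The naive test is non-zero here even though CLOSE is not a heartbeat.
$naiveOverlap = MessageType::CLOSE & MessageType::HEARTBEAT;
```

The same caveat would apply to a combined filter such as DATA | CLOSE, which cannot be tested with a single AND against these values.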

I think Aerys should use whatever interface we produce here, perhaps depending on this lib so it can at least share some of the impl and interfaces.

@trowski The difference is that Aerys is a server and will thus have many connections on a single Endpoint. The client here will have only one connection. This needs a different API, if only because of the first parameter of the Aerys API, $clientId. [No, please don't propose to change this; needing to run a Generator for every single client of the potentially thousands attached would consume too much memory, and you'd still need to broadcast there anyway… no, that doesn't go well.] We can have a similar API, but not share it.

commented

Like @bwoebi said, we've crossed that bridge before ... a performant server has some very different requirements from a single client. They're just different enough to make it too expensive if you need performance from the server.

@bwoebi I realize that it would probably need to be extended for Aerys... but I can see where that could get messy. I did not think that a generator per client would be a problem, but I never actually tried that, I assume you and @rdlowrey have tested this.

Fair enough then... just trying to reduce some code duplication where possible. We have so much to maintain already... 👀

@trowski yes, in an HTTP request the Generator is only active for fractions of a second, or at most a few seconds (in the normal case). For websockets it may be normal to hold tens or hundreds of thousands of connections open (most of them idle). The Generator itself is not that expensive, but when you attach an object persisting info, headers, and one or more objects used only in that single Generator, you end up with a very significant memory cost.
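For anyone who wants to get a feel for this on their own machine, here is a rough measurement sketch. clientHandler() and measureIdleGenerators() are hypothetical; the header values are placeholders, and the resulting number depends on the PHP version and build, so no figure is claimed here.

```php
<?php
// One suspended generator per "client", each holding a small state object.
function clientHandler(array $headers): \Generator
{
    $state = new \stdClass();
    $state->headers = $headers;

    while (true) {
        $frame = yield; // idle until a frame is sent in
        if ($frame === null) {
            return;
        }
    }
}

// Returns the additional bytes in use while $count generators are suspended.
function measureIdleGenerators(int $count): int
{
    $before = memory_get_usage();
    $handlers = [];

    for ($i = 0; $i < $count; $i++) {
        $handler = clientHandler([
            'Sec-WebSocket-Protocol' => ['chat'],
            'X-Client' => ["client-$i"],
        ]);
        $handler->current(); // run to the first yield so it is suspended, not merely created
        $handlers[] = $handler;
    }

    return memory_get_usage() - $before;
}

$bytes = measureIdleGenerators(10000);
$perClient = intdiv($bytes, 10000);
```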

commented

All right guys, here's another stab at it taking into account the things discussed in chat earlier ...

The Connection

<?php

interface Connection
{
    /**
     * Consume messages over the websocket connection.
     *
     * TEXT, BINARY and CLOSE messages are always received.
     *
     * The final Awaitable in the generator always resolves to the close message
     * responsible for ending the connection. In this way we never have to throw
     * to exit a listen() loop. Even if the connection is unexpectedly severed
     * without a proper handshake we can give back a CLOSE message with a 1006
     * (Abnormal Close) code.
     *
     * ### Future Option Scope ###
     *
     * - option to emit messages once a specified buffer size reached as opposed
     *   to waiting for protocol message completion
     * - option allowing listen() to receive PING/PONG messages as well
     * - option to prevent auto-response to PING/PONG/CLOSE for manual handling
     *
     * @param array $options
     * @return Generator<Awaitable(Message)> returned value is the close message
     */
    public function recv(array $options = []): \Generator;

    /**
     * Send data to the server.
     *
     * ### Streaming array|Iterator frames ###
     *
     * Each string element of the iterable is sent as its own frame as part
     * of a larger message. The message is complete once the iterable is no
     * longer valid. Any Awaitables present in the iterable are resolved prior
     * to sending. Empty strings are ignored (as are Awaitables resolving to
     * an empty string). An Awaitable resolving to a non-string will error.
     *
     * @param string|Message|array|Iterator $data The message to be sent
     * @return Awaitable<void>
     */
    public function send($data): Awaitable;

    /**
     * Fetch information about the connection environment
     *
     * @return Info
     */
    public function info(): Info;
}

Implementation Examples

<?php
# Listening on the connection

coroutine(function () {
    $connection = yield websocket("foo");
    $listener = $connection->recv();

    foreach ($listener as $nextMessage) {
        $message = yield $nextMessage;
        if ($message->type() & Message::DATA) {
            $payload = $message->payload();
        }
    }

    $closeMessage = $listener->getReturn();
});

# Sending data on the connection

coroutine(function () {
    $connection = yield websocket("foo");

    // Send text data -- passing a string message autodetects TEXT/BINARY
    yield $connection->send("hello");

    // (same as above but with a message object)
    yield $connection->send(new Text("hello"));

    // Send binary data
    yield $connection->send(new Binary("hello"));

    // Streaming multiple frames from any iterable. Awaitables are resolved as
    // they're encountered. The TEXT/BINARY distinction from the first frame
    // in the iterable is the message type. All others will be sent as CONT.
    // The canonical use case for streaming from iterables is incrementally
    // reading a huge file and sending it in framed chunks as part of a single
    // larger BINARY message.
    yield $connection->send([
        new Binary("foo"),
        new Success("bar"), // <-- awaitables should be resolved
        "baz",
    ]);

    // Close the connection ... we don't need a separate method for close as
    // we already provide the necessary functionality. Users can easily wrap
    // the connection class to add helper methods like close() if desired.
    yield $connection->send(new Close(1000, "my reason"));
});
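The streaming rule described in the comments above (the first frame carries the message type, the rest go out as continuation frames, empty strings are skipped) could be planned roughly as follows. FrameOpcode and planFrames() are hypothetical; the opcode values and the FIN-on-last-frame rule are from RFC 6455. This sketch buffers all chunks to find the last one, which a real streaming implementation would avoid by looking ahead a single chunk.

```php
<?php
// Opcodes defined by RFC 6455.
final class FrameOpcode
{
    const CONTINUATION = 0x0;
    const TEXT = 0x1;
    const BINARY = 0x2;
}

/**
 * Plans the frames for one fragmented message.
 *
 * @return array[] list of [opcode, fin, payload]
 */
function planFrames(iterable $chunks, int $messageOpcode): array
{
    // Drop empty chunks, as the proposal says they are ignored.
    $payloads = [];
    foreach ($chunks as $chunk) {
        if ($chunk !== "") {
            $payloads[] = $chunk;
        }
    }

    $frames = [];
    $last = count($payloads) - 1;
    foreach ($payloads as $i => $payload) {
        $frames[] = [
            $i === 0 ? $messageOpcode : FrameOpcode::CONTINUATION,
            $i === $last, // FIN bit is set only on the final frame
            $payload,
        ];
    }

    return $frames;
}

$frames = planFrames(["foo", "", "bar", "baz"], FrameOpcode::BINARY);
```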

Message Implementations

<?php
abstract class Message
{
    const PING          = 0b00010101;
    const PONG          = 0b00010110;
    const CLOSE         = 0b00011000;
    const TEXT          = 0b10100000;
    const BINARY        = 0b11000000;
    const DATA          = 0b10000000;

    private $type;
    private $payload;

    public function __construct(int $type, string $payload)
    {
        switch ($type) {
            case self::PING:
            case self::PONG:
            case self::CLOSE:
            case self::TEXT:
            case self::BINARY:
                $this->type = $type;
                break;
            default:
                throw new \DomainException(
                    "Unknown message type: {$type}"
                );
        }
        $this->payload = $payload;
    }

    /**
     * What is the message type?
     *
     * @return int
     */
    final public function type(): int
    {
        return $this->type;
    }

    /**
     * Retrieve the payload of the "message"
     *
     * @return string
     */
    final public function payload(): string
    {
        return $this->payload;
    }
}

final class Text extends Message
{
    public function __construct(string $payload)
    {
        parent::__construct(self::TEXT, $payload);
    }
}

final class Binary extends Message
{
    public function __construct(string $payload)
    {
        parent::__construct(self::BINARY, $payload);
    }
}

final class Ping extends Message
{
    public function __construct(string $payload)
    {
        parent::__construct(self::PING, $payload);
    }
}

final class Pong extends Message
{
    public function __construct(string $payload)
    {
        parent::__construct(self::PONG, $payload);
    }
}

final class Close extends Message
{
    private $code;
    private $reason;

    public function __construct(int $code, string $reason = "")
    {
        $this->setCode($code);
        $this->setReason($reason);
        $payload = \pack("n", $code) . $reason;
        parent::__construct(self::CLOSE, $payload);
    }

    private function setCode(int $code)
    {
        // @TODO validation logic
        $this->code = $code;
    }

    private function setReason(string $reason)
    {
        // @TODO validate UTF-8
        $this->reason = $reason;
    }

    public function getCode(): int
    {
        return $this->code;
    }

    public function getReason(): string
    {
        return $this->reason;
    }
}
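The Close payload built above with pack("n", ...) is a two-byte big-endian status code followed by the reason text. A sketch of the round trip, with hypothetical helper names; treating an empty payload as code 1005 follows RFC 6455's "no status received" value.

```php
<?php
function encodeClosePayload(int $code, string $reason = ""): string
{
    // "n" packs an unsigned 16-bit integer in network (big-endian) byte order.
    return pack("n", $code) . $reason;
}

/**
 * @return array [code, reason]
 */
function decodeClosePayload(string $payload): array
{
    if ($payload === "") {
        // No status code on the wire; 1005 is reserved for this case.
        return [1005, ""];
    }

    if (strlen($payload) < 2) {
        throw new \UnexpectedValueException("Close payload too short for a status code");
    }

    $code = unpack("n", $payload)[1];

    return [$code, (string) substr($payload, 2)];
}

$payload = encodeClosePayload(1000, "my reason");
$decoded = decodeClosePayload($payload);
```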

Why do we have Connection::recv? How about making it implement Observable directly?

@kelunik I suggested this as well, but since Observable extends Awaitable, you would be unable to resolve awaitables (i.e., the awaitable returned from websocket()) with Connection objects.

@rdlowrey One fundamental difference between our interfaces that needs to be resolved: Are websocket messages fully buffered wrappers around a string, or should we allow streaming of message contents similar to Aerys/Body? Personally, I would vote the latter, still being able to use yield $message to buffer the entire message body if desired.

commented

Are websocket messages fully buffered wrappers around a string, or should we allow streaming of message contents similar to Aerys/Body

^ @trowski I think the ideal implementation defaults to the former while optionally allowing the latter. The 95% use-case just wants everything buffered. But we have to support streams and I don't think we need a custom solution for that. Whatever mechanism emits messages only needs to emit strings as the buffer size threshold is reached.

@rdlowrey Sure. That's exactly what that streaming class would do: buffering in handy chunks, per message.

And that's why we should have a separate API for that. Something which just gives you fully buffered messages and something which gives you a message stream split into buffer size threshold segments.

@bwoebi If getting the fully buffered message is as simple as yield $message then perhaps we don't need a separate API emitting only buffered messages.

@trowski that's right - as long as we are fine with it throwing. But perhaps that's not an issue, as we could just create a helper function Websocket\read($ws) which consumes the Websocket::recv() API internally.

function read(Connection $ws): \Amp\Observable {
    return new \Amp\Emitter(function($emit) use ($ws) {
        try {
            foreach ($ws as $msg) {
                $emit(yield $msg);
            }
        } catch (ServerException $e) {
            // Deliberately ignored: the observable simply ends here.
        }
    });
}

which is most easily used like this:

$ws = yield websocket("ws://...");
$msgs = new \Amp\Observer(Websocket\read($ws));
while (yield $msgs->next()) {
    $msg = $msgs->getCurrent();
    // ...
}

While still allowing the full-blown recv() API to be used.

@trowski @rdlowrey what do you think?

@bwoebi The separate read() function to create an observable makes a lot of sense.


Revised proposal:

class Connection implements Iterator<Message> {
    public function send(string $data): Awaitable<int>;
    public function sendBinary(string $data): Awaitable<int>;
    public function sendMessage(Message $data): Awaitable<int>;
    public function close(Close $close = null): Awaitable<int>;
}

class Message extends Observer implements Observable {
    // Similar to Aerys\Body
    public function isBinary(): bool;
}

class Close {
    const NORMAL =        1000;
    const GOING_AWAY =    1001;
    const PROTOCOL =      1002;
    const BAD_DATA =      1003;
    const NO_STATUS =     1005;
    const ABNORMAL =      1006;
    const INVALID_DATA =  1007;
    const VIOLATION =     1008;
    const TOO_BIG =       1009;
    const EXTENSION =     1010;
    const SERVER_ERROR =  1011;
    const TLS_ERROR =     1015;

    public function __construct(int $code = self::NORMAL, string $message = "");
    public function getStatus(): int; // Returns one of the constants above.
    public function getMessage(): string;
}

// Example usage:

Amp\execute(function () {
    $ws = yield websocket("ws://web.socket/endpoint");

    yield $ws->send("Hello there!");

    foreach ($ws as $message) {
        $contents = yield $message; // Yielding the message buffers entire message contents.
        yield $ws->send("Message received.");

        if ($contents === "close") {
            yield $ws->close();
            break;
        }
    }
});

One minor quibble: If Connection does not implement Iterator and there's a method in the class to return an iterator, I'd much rather see it called receive(), read(), or listen() as opposed to recv().

I'm not sure whether a foreach ($ws as $message) is really any better than an onMessage callback.

Is there any reason why we need a dedicated Close object and can't just use $code, $msg?

@trowski you forgot the info() method, for stats as well as the close reason.

The Close object probably is not necessary… mostly it was a place for the constants to live with a meaningful name that didn't get too long, i.e.: Close::NORMAL vs. Connection::CLOSE_NORMAL. The latter is perfectly fine though.

What sort of information would be returned from info()? Should it be one method, or multiple methods such as getHeaders(), messageCount(), etc.? If we do go with a single method, does there need to be a dedicated Info object, or would an array do?

The current draft proposes a sendMessage(). I am really unsure about this as we'd have to queue subsequent send() calls behind that message, as the websocket protocol does not allow message interleaving (only spurious control frames between data frames).

This would delay later send() calls and could leave receivers confused about why nothing arrives. [In particular if an earlier sendMessage() call stalls because its Observable is only slowly being updated (or never completes).]

I'd leave it out - for now.
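A minimal sketch of the queuing problem, in plain PHP with no real framing or I/O. The OutgoingQueue class and its method names are hypothetical and only model the ordering constraint: once a streamed message has started, any send() has to wait until the final fragment is written.

```php
<?php
declare(strict_types=1);

/** Hypothetical model of an outgoing queue; not part of any Amp API. */
final class OutgoingQueue
{
    /** @var string[] What would hit the wire, in order. */
    public $wire = [];

    /** @var string[] Complete messages deferred behind an open streamed message. */
    private $pending = [];

    /** @var bool True while a streamed message still has fragments to come. */
    private $streaming = false;

    public function send(string $data): void
    {
        if ($this->streaming) {
            // Data messages must not interleave, so this one has to wait.
            $this->pending[] = $data;
            return;
        }
        $this->wire[] = "message:$data";
    }

    public function beginMessage(): void
    {
        $this->streaming = true;
    }

    public function writeChunk(string $chunk, bool $final): void
    {
        if (!$this->streaming) {
            throw new \LogicException('No streamed message in progress');
        }
        $this->wire[] = "fragment:$chunk" . ($final ? ' (final)' : '');
        if ($final) {
            // The streamed message is complete: flush everything that queued up.
            $this->streaming = false;
            $pending = $this->pending;
            $this->pending = [];
            foreach ($pending as $data) {
                $this->send($data);
            }
        }
    }
}

$queue = new OutgoingQueue();
$queue->send('a');
$queue->beginMessage();
$queue->writeChunk('part1', false);
$queue->send('urgent');            // deferred: a streamed message is open
$queue->writeChunk('part2', true); // only now does 'urgent' go out
// $queue->wire is ['message:a', 'fragment:part1', 'fragment:part2 (final)', 'message:urgent']
```

If the source feeding writeChunk() never delivers its final chunk, 'urgent' is never sent, which is the stall described above.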

I have https://github.com/amphp/websocket/tree/amp_v2 for now - still needs some tests eventually, but much better than whatever we had before :-)

Thanks @bwoebi, and everyone else who worked on the new API. I'm still very interested in the project, and I'll give you some feedback (if things go wrong lol) as soon as Amp 2 is out (so I can seriously work on my own project again). I really appreciate your efforts :)

We have Amp v2 now, what's the API we want?

@toverux Do you have time for a chat?

The new version will be released soon and is based on Amp\Iterator, so it should be pretty easy to turn that into an event emitter if anybody needs one.
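For anyone who wants the event-emitter style from the original request, here is a minimal sketch of such an adapter. It is plain PHP over any iterable, so it does not use the real Amp\Iterator or websocket connection; the MessageEmitter class and the event names 'open', 'message' and 'close' are assumptions for illustration. An asynchronous version would loop over the library's iterator inside a coroutine instead of a foreach.

```php
<?php
declare(strict_types=1);

/** Hypothetical adapter: forwards each item of an iterable to registered listeners. */
final class MessageEmitter
{
    /** @var array<string, callable[]> Listeners keyed by event name. */
    private $listeners = [];

    public function on(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    private function emit(string $event, ...$args): void
    {
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener(...$args);
        }
    }

    /** Consumes the message source and translates iteration into events. */
    public function run(iterable $messages): void
    {
        $this->emit('open');
        try {
            foreach ($messages as $message) {
                $this->emit('message', $message);
            }
            $this->emit('close', null);
        } catch (\Throwable $e) {
            // Any failure while iterating (or inside a listener) ends the run.
            $this->emit('close', $e);
        }
    }
}

$log = [];
$emitter = new MessageEmitter();
$emitter->on('open', function () use (&$log) { $log[] = 'open'; });
$emitter->on('message', function (string $m) use (&$log) { $log[] = "message:$m"; });
$emitter->on('close', function (?\Throwable $e) use (&$log) {
    $log[] = $e === null ? 'close' : 'error';
});

// An ArrayIterator stands in for the stream of incoming messages.
$emitter->run(new \ArrayIterator(['hello', 'world']));
// $log is ['open', 'message:hello', 'message:world', 'close']
```

A wrapper like this keeps the handlers private to the decorating class, which was the concern with implementing the callback interface directly.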