windowjs / windowjs

Window.js is an open-source JavaScript runtime for desktop graphics programming.

Home Page: https://windowjs.org

Codecs API

joaodasilva opened this issue

Window.js should have convenient APIs to encode and decode data:

  • base64
  • URL encode/decode
  • gzip compress/decompress

Use methods as in HTML5 where available; for example:

https://developer.mozilla.org/en-US/docs/Web/API/btoa
https://developer.mozilla.org/en-US/docs/Web/API/atob
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
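
For reference, here is a small sketch of how these standard functions behave in a browser or a recent Node.js. It is not Window.js code; the sample strings and the example.com URL are made up for illustration.

```javascript
// Base64 on strings: btoa encodes, atob decodes.
const encodedText = btoa("hello");
const decodedText = atob(encodedText);

// encodeURI keeps URL delimiters such as ":", "/", "?" and "&" intact.
const wholeUrl = encodeURI("https://example.com/a b?q=x&y");

// encodeURIComponent also escapes those delimiters, so it suits single values.
const queryValue = encodeURIComponent("a b&c=d");

console.log(encodedText);  // aGVsbG8=
console.log(decodedText);  // hello
console.log(wholeUrl);     // https://example.com/a%20b?q=x&y
console.log(queryValue);   // a%20b%26c%3Dd
```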

Gzip support could use similar global functions, for example ztob and btoz.

The btoa/atob APIs encode and decode from/to strings; Window.js could have a variant that also supports encoding ArrayBuffers and decoding into an ArrayBuffer.
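
To make the ArrayBuffer variant concrete, here is a minimal sketch layered on btoa and atob. The names bufferToBase64 and base64ToBuffer are hypothetical, chosen only for this example; a native implementation would avoid the intermediate string.

```javascript
// Hypothetical helpers; the names are illustrative and not Window.js APIs.

// Encodes the bytes of an ArrayBuffer as a base64 string.
function bufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  let binary = "";
  for (const byte of bytes) {
    // Each byte becomes one character in the range U+0000 to U+00FF.
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Decodes a base64 string into a new ArrayBuffer.
function base64ToBuffer(text) {
  const binary = atob(text);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

const sample = new Uint8Array([0, 255, 16]).buffer;
console.log(bufferToBase64(sample));  // AP8Q
```

Building the string one byte at a time is slow for large buffers, which is one reason to implement this natively rather than in script.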

I haven't looked too deeply into this repo, but it looks like it's mostly C++, so I found some implementations that could be used:

base64

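#include "base64.h"  // assumed to declare BYTE (unsigned char) and the functions below
#include <cctype>    // isalnum
#include <string>
#include <vector>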

static const std::string base64_chars =
             "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
             "abcdefghijklmnopqrstuvwxyz"
             "0123456789+/";


static inline bool is_base64(BYTE c) {
  return (isalnum(c) || (c == '+') || (c == '/'));
}

encode

std::string base64_encode(BYTE const* buf, unsigned int bufLen) {
  std::string ret;
  int i = 0;
  int j = 0;
  BYTE char_array_3[3];
  BYTE char_array_4[4];

  while (bufLen--) {
    char_array_3[i++] = *(buf++);
    if (i == 3) {
      char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
      char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
      char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
      char_array_4[3] = char_array_3[2] & 0x3f;

      for(i = 0; (i <4) ; i++)
        ret += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i)
  {
    for(j = i; j < 3; j++)
      char_array_3[j] = '\0';

    char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
    char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
    char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
    char_array_4[3] = char_array_3[2] & 0x3f;

    for (j = 0; (j < i + 1); j++)
      ret += base64_chars[char_array_4[j]];

    while((i++ < 3))
      ret += '=';
  }

  return ret;
}

decode

std::vector<BYTE> base64_decode(std::string const& encoded_string) {
  int in_len = encoded_string.size();
  int i = 0;
  int j = 0;
  int in_ = 0;
  BYTE char_array_4[4], char_array_3[3];
  std::vector<BYTE> ret;

  while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) {
    char_array_4[i++] = encoded_string[in_]; in_++;
    if (i ==4) {
      for (i = 0; i <4; i++)
        char_array_4[i] = base64_chars.find(char_array_4[i]);

      char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
      char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
      char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

      for (i = 0; (i < 3); i++)
          ret.push_back(char_array_3[i]);
      i = 0;
    }
  }

  if (i) {
    // Map only the characters actually read to their 6-bit values
    for (j = 0; j < i; j++)
      char_array_4[j] = base64_chars.find(char_array_4[j]);

    // Zero the unused slots so they contribute nothing to the output
    for (j = i; j < 4; j++)
      char_array_4[j] = 0;

    char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
    char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
    char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

    // A partial group of i characters carries i - 1 whole bytes
    for (j = 0; (j < i - 1); j++) ret.push_back(char_array_3[j]);
  }

  return ret;
}

usage

std::vector<BYTE> myData;
...
std::string encodedData = base64_encode(myData.data(), myData.size());  // data() is safe even when myData is empty
std::vector<BYTE> decodedData = base64_decode(encodedData);

url

#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>

using namespace std;

encode

string url_encode(const string &value) {
    ostringstream escaped;
    escaped.fill('0');
    escaped << hex;

    for (string::const_iterator i = value.begin(), n = value.end(); i != n; ++i) {
        string::value_type c = (*i);

        // Keep alphanumeric and other accepted characters intact
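        // (cast first: passing a negative char to isalnum is undefined behavior)
        if (isalnum(static_cast<unsigned char>(c)) || c == '-' || c == '_' || c == '.' || c == '~') {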
            escaped << c;
            continue;
        }

        // Any other characters are percent-encoded
        escaped << uppercase;
        escaped << '%' << setw(2) << int((unsigned char) c);
        escaped << nouppercase;
    }

    return escaped.str();
}

decode

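string urlDecode(const string &SRC) {
    string ret;
    for (string::size_type i = 0; i < SRC.length(); i++) {
        // Decode "%XY" only when two hex digits actually follow the percent sign
        if (SRC[i] == '%' && i + 2 < SRC.length() &&
            isxdigit(static_cast<unsigned char>(SRC[i + 1])) &&
            isxdigit(static_cast<unsigned char>(SRC[i + 2]))) {
            ret += static_cast<char>(stoi(SRC.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            // Copy ordinary characters and malformed escapes unchanged
            ret += SRC[i];
        }
    }
    return ret;
}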

usage

string myUrl;
...
string encodedUrl = url_encode(myUrl);
string decodedUrl = urlDecode(encodedUrl);

I couldn't find a good one for gzip but you could use this

Thanks for the pointers!

Regarding the C++ implementation: Window.js already depends on libuv, v8 and Skia which come with their own utilities and further dependencies. We should reuse them where possible, since that code is already working and tested, and is also already included in the binary so we shouldn't add duplicated code.

For base64:
The API is going to be atob and btoa to match HTML5, and Skia seems to have an implementation here:

libraries/skia/include/utils/SkBase64.h

This file only exists in the checkout after syncing dependencies. Here's the source in the Skia repo:

https://github.com/google/skia/blob/main/include/utils/SkBase64.h

For gzip:
Window.js is already using zlib via a v8 dependency. See here:

https://github.com/windowjs/windowjs/blob/main/src/zip.cc

Window.js uses that to decompress the embedded console source from src/console.js, and the welcome screen in src/welcome.js.

What's still unclear is how to expose this as an API. I thought of something like atoz and ztoa to match the base64 functions. However, I'd like to also have something to encode an ArrayBuffer, and to decode into an ArrayBuffer, not just strings. Do you have any suggestions?

For urlEncode:

I haven't looked yet but I guess that there's an implementation in v8 or its dependencies as well.

For the implementation:

I'd suggest adding utility functions in zip.h and zip.cc first (also for the base64 codecs; maybe rename zip.cc to codec.cc), and then registering the APIs as globals in js_api.cc:

scope.Set(global, StringId::devicePixelRatio, DevicePixelRatio);

Let me know if you'd like to prepare a PR. As stated above, I'd suggest starting with a PR just for atob and btoa.

I can start a PR now if you want.

Please feel free to contribute! If you send a PR then I'll have a look and comment on it.

If you add the APIs, please also include an update to the docs and to the TypeScript types for the new functions.

atob and btoa aren't great because they don't work with TypedArrays. I think a custom Codec API makes sense for these.
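
A short sketch of the limitation, runnable in a browser or a recent Node.js. The byte values and the euro sign are arbitrary sample inputs.

```javascript
// btoa converts its argument to a string first, so a typed array is encoded
// as the text "1,2,3" rather than as the bytes 1, 2, 3.
const fromTypedArray = btoa(new Uint8Array([1, 2, 3]));

// btoa also rejects any character above U+00FF.
let rejectedNonLatin1 = false;
try {
  btoa("\u20ac");  // the euro sign
} catch (error) {
  rejectedNonLatin1 = true;
}

console.log(fromTypedArray);     // MSwyLDM=
console.log(rejectedNonLatin1);  // true
```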

I need to inline some binary data in a script and base64 would be a good way to do it, so I'll add Codec.toBase64 and Codec.base64ToArrayBuffer now.

The codec for base64 has been added:

https://windowjs.org/doc/codec

Adding a similar codec for gzip is a good first bug. Here's what needs to be done:

  • Add Codec.zip and Codec.unzip
  • Add the strings "zip" and "unzip" to js_strings.h and js_strings.cc
  • Register the callbacks for these APIs in js_api_codec.cc
  • Use zip.h to compress and uncompress
  • "zip" should accept a string, ArrayBuffer, Uint8Array or Uint8ClampedArray argument and return an ArrayBuffer
  • "unzip" should accept an ArrayBuffer or Uint8Array and return an ArrayBuffer
  • "unzip" should throw on invalid input (but note that zip.h currently crashes on invalid input; it needs to be modified to propagate errors)
  • Add the TypeScript types to types/codec.d.ts
  • Document the API in docs/doc/codec.md
  • Add test coverage to tests/codec.js
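
The argument handling described for "zip" can be sketched in script form as follows. This is only an illustration of the contract: the real callback would live in js_api_codec.cc, the helper name toBytes is hypothetical, and encoding strings as UTF-8 is an assumption that the list above does not state.

```javascript
// Hypothetical helper: normalizes the argument types listed for "zip"
// into a Uint8Array view of the bytes to compress.
// Assumption: strings are encoded as UTF-8.
function toBytes(input) {
  if (typeof input === "string") {
    return new TextEncoder().encode(input);
  }
  if (input instanceof ArrayBuffer) {
    return new Uint8Array(input);
  }
  if (input instanceof Uint8Array || input instanceof Uint8ClampedArray) {
    // Respect byteOffset and byteLength in case the view covers part of a buffer.
    return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
  }
  throw new TypeError("Expected a string, ArrayBuffer, Uint8Array or Uint8ClampedArray");
}
```

Tests in tests/codec.js could then cover each accepted type, a view over part of a larger buffer, and a rejected argument such as a number.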