strophe corrupts unicode JID node names.

Question

jstressman opened this issue 13 years ago · comments

(As reported at https://github.com/amiadogroup/candy/issues/24 )

The Strophe JS library corrupts Unicode JID node names because of a shortcoming of JavaScript itself.
(Unicode node names are valid per http://xmpp.org/extensions/xep-0029.html#sect-id317032 )

This can be worked around using the method described at http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html and https://developer.mozilla.org/en/DOM/window.btoa#Unicode_Strings
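
As a minimal sketch of that workaround (the helper names `utf8Encode` and `utf8Decode` are made up for illustration and are not part of Strophe): `encodeURIComponent` percent-encodes the UTF-8 bytes of a string, and `unescape` turns each `%XX` sequence back into a single character, so the result holds one character per UTF-8 byte.

```javascript
// Hypothetical helpers illustrating the linked technique; not Strophe API.

// Turn a JavaScript string into a "byte string": one char per UTF-8 byte.
function utf8Encode(str) {
    return unescape(encodeURIComponent(str));
}

// Reverse the transformation: UTF-8 byte string back to a normal string.
function utf8Decode(bytes) {
    return decodeURIComponent(escape(bytes));
}

var node = "jos\u00e9";           // "josé", 4 characters
var encoded = utf8Encode(node);   // 5 characters: j, o, s, 0xC3, 0xA9
var roundTrip = utf8Decode(encoded);
```

Every character in `encoded` has a code below 256, which is what byte-oriented routines such as Base64 or MD5 implementations generally expect.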

I've applied the same changes as done for the candy libs to strophe 1.0.2 to create the diff below.

--- strophe.js  2011-08-25 00:25:54.839392300 -0400
+++ strophe-unicode.js  2011-08-25 00:29:37.644136000 -0400
@@ -3019,9 +3019,9 @@
         } else if (do_sasl_plain) {
             // Build the plain auth string (barejid null
             // username null password) and base 64 encoded.
-            auth_str = Strophe.getBareJidFromJid(this.jid);
+            auth_str = unescape(encodeURIComponent(Strophe.getBareJidFromJid(this.jid)));
             auth_str = auth_str + "\u0000";
-            auth_str = auth_str + Strophe.getNodeFromJid(this.jid);
+            auth_str = auth_str + unescape(encodeURIComponent(Strophe.getNodeFromJid(this.jid)));
             auth_str = auth_str + "\u0000";
             auth_str = auth_str + this.pass;

@@ -3102,14 +3102,14 @@
             digest_uri = digest_uri + "/" + host;
         }

-        var A1 = MD5.hash(Strophe.getNodeFromJid(this.jid) +
+        var A1 = MD5.hash(unescape(encodeURIComponent(Strophe.getNodeFromJid(this.jid))) +
                           ":" + realm + ":" + this.pass) +
             ":" + nonce + ":" + cnonce;
         var A2 = 'AUTHENTICATE:' + digest_uri;

         var responseText = "";
         responseText += 'username=' +
-            this._quote(Strophe.getNodeFromJid(this.jid)) + ',';
+            this._quote(unescape(encodeURIComponent(Strophe.getNodeFromJid(this.jid)))) + ',';
         responseText += 'realm=' + this._quote(realm) + ',';
         responseText += 'nonce=' + this._quote(nonce) + ',';
         responseText += 'cnonce=' + this._quote(cnonce) + ',';
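
To illustrate what the first hunk changes, here is a rough standalone sketch of building the SASL PLAIN string with the JID parts UTF-8 encoded. `buildPlainAuthString`, `toUtf8ByteString`, `isByteString` and their parameters are hypothetical stand-ins for the code inside Strophe, and the password is left untouched, as in the diff.

```javascript
// Hypothetical stand-in for the SASL PLAIN branch patched above.
function toUtf8ByteString(str) {
    return unescape(encodeURIComponent(str));
}

function buildPlainAuthString(bareJid, node, pass) {
    // bare JID, NUL, node, NUL, password (same order as in the diff).
    return toUtf8ByteString(bareJid) + "\u0000" +
           toUtf8ByteString(node) + "\u0000" +
           pass; // left unencoded, as in the diff
}

// True when every character code fits in a single byte.
function isByteString(str) {
    for (var i = 0; i < str.length; i++) {
        if (str.charCodeAt(i) > 0xFF) {
            return false;
        }
    }
    return true;
}

var authStr = buildPlainAuthString(
    "\u4f60\u597d@example.com", // bare JID with a two-character Chinese node
    "\u4f60\u597d",
    "secret"
);
var safeForBase64 = isByteString(authStr);
```

With the encoding in place, each of the two node characters becomes three bytes, so the string handed to the Base64 step contains only byte-sized character codes.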

▟ ▖▟ ▖ · Answer 1 · Wed Nov 09 2011 22:45:52 GMT+0800 (China Standard Time)

looks a little bit old to me .. oh and XEP0029 starts with:

WARNING: This document has been retracted by the author(s). Implementation of the protocol described herein is not recommended. Developers desiring similar functionality are advised to implement the protocol that supersedes this one (if any).

do you have any other references that are not outdated?

(sorry if that sounds a little bit harsh)

Jarosław Jedynak · Answer 2 · Mon Aug 15 2016 23:06:54 GMT+0800 (China Standard Time)

> looks a little bit old to me .. oh and XEP0029 starts with:

The point is that XMPP supports Unicode. If XEP-0029 was retracted, that is only because something newer replaced it.

For example, take a look at https://xmpp.org/extensions/xep-0106.html, which says: "RFC 3920 [1] defines the Nodeprep profile of stringprep (RFC 3454 [2]), which specifies that the following nine Unicode code points are disallowed in the node identifier portion of a Jabber Identifier (JID)". Only nine Unicode code points are explicitly listed as forbidden.
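
For reference, the nine code points listed in XEP-0106 are space, `"`, `&`, `'`, `/`, `:`, `<`, `>` and `@`. Below is a small sketch of checking a node for just those characters. The names `DISALLOWED_IN_NODE` and `findDisallowedNodeChars` are made up, and this is not a full Nodeprep implementation (stringprep also applies its own mapping and prohibition tables).

```javascript
// The nine code points XEP-0106 lists as disallowed in a JID node.
var DISALLOWED_IN_NODE = [" ", "\"", "&", "'", "/", ":", "<", ">", "@"];

// Hypothetical helper: return the distinct disallowed characters in a node.
function findDisallowedNodeChars(node) {
    var found = [];
    for (var i = 0; i < node.length; i++) {
        var ch = node.charAt(i);
        if (DISALLOWED_IN_NODE.indexOf(ch) !== -1 && found.indexOf(ch) === -1) {
            found.push(ch);
        }
    }
    return found;
}

var unicodeNode = findDisallowedNodeChars("j\u00f3zef"); // no listed characters
var badNode = findDisallowedNodeChars("a b@c");          // space and "@"
```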

I highly encourage you to apply these changes to the strophejs library. I had to patch this "by hand" for my website, which feels a bit hacky. With the patch applied, everything works as it should.

If you hesitate because of backward compatibility, remember that unescape(encodeURIComponent(x)) is a no-op as long as x is an ASCII string, so nothing that works with ASCII JIDs today can break after this change.
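
A quick way to check the no-op claim is to run every ASCII character through the transformation and compare. This is only a sketch, and `toUtf8ByteString` is a made-up name for the expression used in the patch.

```javascript
// Made-up name for the expression used throughout the patch.
function toUtf8ByteString(x) {
    return unescape(encodeURIComponent(x));
}

// Build a string containing every ASCII character (codes 0 through 127).
var ascii = "";
for (var i = 0; i < 128; i++) {
    ascii += String.fromCharCode(i);
}

// ASCII input comes back unchanged.
var asciiUnchanged = toUtf8ByteString(ascii) === ascii;

// Non-ASCII input is the only case where the output differs.
var nonAsciiChanged = toUtf8ByteString("\u00fcser") !== "\u00fcser";
```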