mysqljs / mysql

A pure node.js JavaScript Client implementing the MySQL protocol.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

TypeCast blows up values that come after it in row

tomasikp opened this issue · comments

Issue #536 reminded me of a problem I ran into recently. In reality, I am typecasting timestamps to unix time, but for the sake of simplicity here i just typecast all timestamps to the string '1' as you can see the values that follow get garbled. I didn't notice this until recently because most of my timestamp values are the last column in a given table in my tables...

var mysql = require('mysql')
  , assert = require('assert');

var conn= mysql.createConnection({
  host: 'localhost',
  user: 'root',
  database: 'test',
  typeCast: function (field, next) {
                if(field.type=='TIMESTAMP'){ 
                            return '1';
                }
                return next();
            }
});

conn.connect();

conn.query(
  "CREATE TABLE IF NOT EXISTS typecast_bug (id SERIAL,created timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,active tinyint(1) DEFAULT '1', expired tinyint(1) DEFAULT '0')", 
  function(err) {
    if (err) throw err;


    conn.query('INSERT INTO typecast_bug SET ?', { 
      active:true
    }, function(err, result) {
      if (err) throw err;

      var id = result.insertId;

      conn.query('SELECT * FROM typecast_bug WHERE ?', { id: id }, function(err, row) {
        if (err) throw err;

       assert.equal(true, row[0].active);


        conn.end();

      });
    });
  }
);

Throws: AssertionError: true == "NaN"

Yes, I see the issue here. When you make a custom typecast for any of the cases in RowDataPacket.prototype._typeCast that invoke the parser and you don't invoke the parser in the custom typecast for it (can you even? I'm not sure) then the rest of the fields will be garbage like you are experiencing.

@dougwilson I don't think ive seen an example of a custom typecast configuration invoking the parser...

The example in the README is as follows:

connection.query({
  sql: '...',
  typeCast: function (field, next) {
    if (field.type == 'TINY' && field.length == 1) {
      return (field.string() == '1'); // 1 = true, 0 = false
    }
    return next();
  }
});

The same as the way I use it in the sample code above, except that it is part of the connection config and not passed as part of the query.

@tomasikp right, probably because the example is too simplistic and perhaps it wasn't thought of before. Actually, I just looked and it looks like you have to invoke one of the methods on field. You example didn't invoke field.string() for TIMESTAMP which indirectly invokes the parser. If you do that then the following values won't be garbage any longer.

i.e.:

function (field, next) {
  if (field.type === 'TIMESTAMP') {
    field.string(); // Invokes parser and necessary
    return '1';
  }
  return next();
}

The moral is in your custom typecast function, you must invoke next or the proper method on field in all your branches or you may get garbage for the rest of the fields in the row packet. Your simple typecast in the OP was just missing field.string() in the if branch to parse the timestamp field from the row packet.

touche, will submit a pull request with the proper README change to explicitly state the importance of invoking field.string()

Sure, though you cannot always invoke field.string(), either. You have to invoke exactly the method that happens to be required for the type you are handling. field.string() is just what it is for TIMESTAMP. You can see the list by checking out the RowDataPacket.prototype._typeCast method, though for almost all of them it is indeed field.string().

I see what you mean:

field.string()
field.buffer()
field.geometry()

are aliases for

parser.parseLengthCodedString()
parser.parseLengthCodedBuffer()
parser.parseGeometryValue()

Yea. After seeing this I feel like the way custom type casting works right now makes to too easy for someone to shoot themselves in the foot (even calling field.string() twice in the same branch will garble the rest of the data). The custom typecast should probably be called within _typeCast so for all known types it can invoke the parser for the user correctly instead of the user needing to.

This is exactly what happened to me.
I originally had something like this:

typeCast: function (field, next) {
                if(field.type=='TIMESTAMP'){ 
                            return require('moment')(field.string()).unix();
                }
                return next();
            }

but in some cases field.string() would return null, which in turn would cause moment to throw an error.
tried this and started getting garbled data coming back.

typeCast: function (field, next) {
                if(field.type=='TIMESTAMP'){ 
                            return field.string()==null?'':require('moment')(field.string()).unix();
                }
                return next();
            }

now I understand why, called field.string twice.

It is prob not enough to just amend the documentation to warn of this possibility; too easy to shoot oneself in the foot and not realize it...

A fix for this would prob be a breaking change for everyone using custom typeCasting. The callback parameters would prob change to be something like (Original Field, Parsed) so you could parse the original field yourself if you like, or return the parsed value as-is

It depends on what the maintainers of the library think. The current version in still "alpha" so they may consider it before it becomes "beta" or stable or whatever, not sure.

We could make a change in Field to avoid double parsing. Maybe throwing if someone calls field.string() and then calls field.buffer(), but if someone calls field.string() and then again it would return the previous value. This should solve your problem.

About not calling field.*() and just returning something, we could do something about it but it will probably complicate the code.