PNixx / clickhouse-activerecord

A Ruby database ActiveRecord driver for ClickHouse

Binary string support

Ankk98 opened this issue

Data Insertion

ArgumentError: invalid byte sequence in UTF-8
from /home/user/.rvm/gems/ruby-2.7.4/gems/clickhouse-activerecord-0.5.7/lib/active_record/connection_adapters/clickhouse/schema_statements.rb:14:in `sub'
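
A minimal sketch of the likely cause, assuming the inserted value carries raw bytes in a string tagged as UTF-8. The helper names `as_binary` and `sample_payload` are hypothetical and not part of the gem. Regexp-based methods such as `String#sub` can raise `ArgumentError` on a string whose bytes are invalid for its encoding, while a copy retagged as ASCII-8BIT keeps the same bytes and is always valid:

```ruby
# Hypothetical helper (not part of the gem): retag a string as binary
# (ASCII-8BIT) without changing its bytes.
def as_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

# Hypothetical sample: raw bytes that do not form valid UTF-8,
# tagged as UTF-8 the way an ordinary Ruby string would be.
def sample_payload
  "\xFF\xFE\x00data".b.force_encoding(Encoding::UTF_8)
end

payload = sample_payload
puts payload.valid_encoding?             # validity under the UTF-8 tag
puts as_binary(payload).valid_encoding?  # validity under the binary tag

# The same kind of regexp call goes through on the binary copy,
# and the bytes are left untouched.
puts as_binary(payload).sub(/\A\s+/, "").bytesize
```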

Data fetching

  • To fetch data, we use the JSONCompact format. This format supports only valid UTF-8, so any invalid UTF-8 byte sequence is replaced with a placeholder character.

  • Can we allow usage of JSONEachRow?

  • This format does not replace invalid UTF-8 chars and ensures data integrity.

  • To do this, we will have to write a function that parses this data and inserts it into ActiveRecord::Result.

  • This function can be added in SchemaStatements.

  • In Ruby, binary strings are represented as strings with the ASCII-8BIT encoding.

  • We can support binary data the way MySQL does, by adding an encoding config option.

  • If the encoding is binary, we can use ASCII-8BIT; otherwise we can use standard UTF-8.

  • If you want, I can provide more links on this.

  • I can also contribute.
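
The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the gem's implementation: `parse_json_each_row` and its `encoding:` keyword are hypothetical names standing in for the proposed parsing function and config option. It reads a JSONEachRow body (one JSON object per line) and returns the `columns` and `rows` arrays in the shape that `ActiveRecord::Result.new(columns, rows)` accepts.

```ruby
require "json"

# Hypothetical helper, not the gem's API: split a JSONEachRow response
# body into column names and row arrays.
# `encoding` stands in for the proposed config option (an assumption):
# "binary" tags string values as ASCII-8BIT, anything else as UTF-8.
def parse_json_each_row(body, encoding: "utf8")
  target = encoding == "binary" ? Encoding::BINARY : Encoding::UTF_8
  columns = []
  rows = []

  # Work on a binary copy so line splitting never trips over invalid UTF-8.
  body.b.each_line(chomp: true) do |line|
    next if line.empty?

    record = JSON.parse(line)
    columns = record.keys if columns.empty?

    # Retag string values only; the bytes themselves are not converted.
    values = record.values.map do |value|
      value.is_a?(String) ? value.dup.force_encoding(target) : value
    end
    rows << values
  end

  [columns, rows]
end

body = %({"id":1,"payload":"abc"}\n{"id":2,"payload":"def"}\n)
columns, rows = parse_json_each_row(body, encoding: "binary")
p columns
p rows
```

Inside the adapter, the pair could then be passed to `ActiveRecord::Result.new(columns, rows)`. Whether `JSON.parse` accepts string values containing invalid UTF-8 bytes depends on the JSON library in use, so that part would need checking against real binary data before relying on this approach.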

Is there any timetable for this problem to be fixed?

It wasn't in the plan. You can open a PR.

Hi @PNixx, I can open the PR. Do you have any recommendations on where and what would probably need to be modified?