When a config key is /, how should the path be written in .custom.yaml?

Question

Streamlet opened this issue a year ago · comments

For example, the original configuration
(luna_pinyin.schema.yaml):

punctuator:
  half_shape:
    # ...
    "/": ["、", "､", "/", "／", "÷"]
    # ...

I want to change this to "/": "/" without copying everything under half_shape,
so in theory I should write:
(luna_pinyin.custom.yaml)

patch:
  punctuator/half_shape/<slash>: "/"

How should <slash> be expressed here?

I see that config_data.cc simply calls SplitPath and JoinPath directly:

vector<string> ConfigData::SplitPath(const string& path) {
  vector<string> keys;
  auto is_separator = boost::is_any_of("/");
  auto trimmed_path = boost::trim_left_copy_if(path, is_separator);
  boost::split(keys, trimmed_path, is_separator);
  return keys;
}

string ConfigData::JoinPath(const vector<string>& keys) {
  return boost::join(keys, "/");
}

Neither function escapes /.
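To make the consequence concrete, here is a minimal standard-library sketch of the same splitting rule (strip leading slashes, then split on every slash and keep empty tokens). SplitPathSketch is a hypothetical stand-in written for this illustration, not the librime function, and it may differ from the Boost-based original in edge cases.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for ConfigData::SplitPath using only the standard
// library: drop leading '/', then split on every '/', keeping empty tokens.
std::vector<std::string> SplitPathSketch(const std::string& path) {
  std::vector<std::string> keys;
  std::size_t start = path.find_first_not_of('/');
  if (start == std::string::npos) {
    keys.push_back("");  // nothing left after trimming: one empty key
    return keys;
  }
  while (true) {
    std::size_t end = path.find('/', start);
    if (end == std::string::npos) {
      keys.push_back(path.substr(start));  // last token (may be empty)
      break;
    }
    keys.push_back(path.substr(start, end - start));
    start = end + 1;
  }
  return keys;
}
```

With this rule, a path such as punctuator/half_shape// splits into punctuator, half_shape and two empty keys, so no spelling of the path produces a key that is itself a slash.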

Streamlet · Answer 1 · Wed Aug 02 2023 17:48:07 GMT+0800 (China Standard Time)

@fxliang That's great!

居戎氏 · Answer 2 · Thu Aug 03 2023 12:46:43 GMT+0800 (China Standard Time)

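The patch syntax is incomplete: it does not fully support keys containing / @ + =, because these characters have special meanings.
Implementing a complete escaping mechanism is somewhat complex and does not seem worth the cost. Changing it now could also break users' existing configurations.

For symbol definitions specifically, the original problem has a solution that was designed for it. There are two cases.

If you are the schema author and the schema defines the full symbol set, as in the question, you can edit the definitions in the schema directly; no patch is needed.

For a preset schema such as luna_pinyin.schema.yaml that has to be modified through a patch, the source file usually imports a preset symbol set, for example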

# luna_pinyin.schema.yaml

punctuator:
  import_preset: default

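rather than defining the whole symbol mapping directly under punctuator/half_shape.
In that case the recommended customization is to patch the punctuator/half_shape node with just the symbols you want to change.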

# luna_pinyin.custom.yaml

patch:
  punctuator/half_shape:
    '/': '/'

While the configuration is being compiled, the above expands to:

# luna_pinyin.schema.yaml

__patch:
  __include: luna_pinyin.custom:/patch

punctuator:
  __include: default:/punctuator

Expanding further:

# luna_pinyin.schema.yaml

__patch:
  punctuator/half_shape:
    '/': '/'

punctuator:
  punctuator/half_shape:
  full_shape: # full-width symbols
    # ...
  half_shape: # half-width symbols
    # ...
    '/' : [ '、', '/', '／', '÷' ]
  symbols: # special symbols
    # ...

Once the patch is applied, you get the desired result.

Streamlet · Answer 3 · Mon Aug 07 2023 02:01:34 GMT+0800 (China Standard Time)

fxliang's solution looks fairly concise. Could it be adopted?

居戎氏 · Answer 4 · Tue Feb 06 2024 11:22:48 GMT+0800 (China Standard Time)

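Not for now.
First, the implementation is fairly complex and the code is hard to follow. It is also hard to verify.
Second, users can currently write a/\/b to mean three nested levels a / \ / b; with escaping added it becomes two levels a / /b, and expressing the original key \ would require escaping it as \\. That changes the behavior of existing configurations and introduces new problems.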
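The compatibility concern can be reproduced with a small sketch. Both functions below are hypothetical and written only for this illustration: SplitPlain splits on every slash, and SplitEscaped treats a backslash as "take the next character literally". Neither is librime code or fxliang's proposed patch, and leading-slash trimming is omitted for brevity.

```cpp
#include <string>
#include <vector>

// Hypothetical: split on every '/', with no escaping at all.
std::vector<std::string> SplitPlain(const std::string& path) {
  std::vector<std::string> keys(1);
  for (char c : path) {
    if (c == '/') {
      keys.emplace_back();  // every '/' starts a new level
    } else {
      keys.back() += c;
    }
  }
  return keys;
}

// Hypothetical: same split, but '\' makes the next character literal.
std::vector<std::string> SplitEscaped(const std::string& path) {
  std::vector<std::string> keys(1);
  for (std::size_t i = 0; i < path.size(); ++i) {
    char c = path[i];
    if (c == '\\' && i + 1 < path.size()) {
      keys.back() += path[++i];  // escaped character joins the current key
    } else if (c == '/') {
      keys.emplace_back();       // unescaped '/' starts a new level
    } else {
      keys.back() += c;
    }
  }
  return keys;
}
```

For the path a/\/b, SplitPlain yields the three keys a, \ and b, while SplitEscaped yields the two keys a and /b, so the same existing patch line would address a different node after the change.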

hegotit · Answer 5 · Sat Mar 23 2024 15:23:30 GMT+0800 (China Standard Time)

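Could a new parameter be added to control whether full escaping is applied? It would be like color_format in weasel: the default is not rgba, but a compatibility option was added to accommodate the mainstream convention.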

Streamlet · Answer 6 · Mon Apr 29 2024 00:46:00 GMT+0800 (China Standard Time)

Right, that would keep things compatible. @lotem, would you consider it?

居戎氏 · Answer 7 · Mon Apr 29 2024 09:16:00 GMT+0800 (China Standard Time)

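No.

Describing multiple string values with YAML's own list data structure is better than parsing a home-grown string escape encoding.

What specific problem makes it necessary to design a new syntax?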
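One possible reading of the remark about lists, sketched under assumptions: if a path is carried as an explicit list of keys instead of one slash-joined string, a key that is itself a slash needs no escaping. Node and Traverse are hypothetical names for this illustration and do not correspond to any librime API or to a supported patch syntax.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal config tree: a node holds a scalar value and/or
// named children.
struct Node {
  std::string value;
  std::map<std::string, std::shared_ptr<Node>> children;
};

// Walk the tree along an explicit key list, creating nodes as needed.
// No key is ever parsed, so "/" is just another key.
Node& Traverse(Node& root, const std::vector<std::string>& keys) {
  Node* node = &root;
  for (const auto& key : keys) {
    auto& child = node->children[key];
    if (!child) {
      child = std::make_shared<Node>();
    }
    node = child.get();
  }
  return *node;
}
```

For example, Traverse(root, {"punctuator", "half_shape", "/"}).value = "/" sets the slash entry directly, with no ambiguity about where one key ends and the next begins.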

Shewer Lu · Answer 8 · Tue Jun 04 2024 09:16:56 GMT+0800 (China Standard Time)

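1 Use three components,
lua_processor@*punctuator, lua_segmentor@*punctuator
lua_translator@*punctuator
that check a property to switch between the punctuator of different schemas
2 Prepare several schemas, each with its own /punctuator, and switch between them using context.property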

------ Processor
local P = {}
function P.init(env)
   -- One punctuator processor per schema (punct1..punct3)
   env.puncts = {
       Component.Processor(env.engine, Schema('punct1'), '', 'punctuator'),
       Component.Processor(env.engine, Schema('punct2'), '', 'punctuator'),
       Component.Processor(env.engine, Schema('punct3'), '', 'punctuator'), }
end
function P.func(key, env)
    -- get_property returns a string; convert it to a table index (default 1)
    local p_no = tonumber(env.engine.context:get_property('puncts_sw')) or 1
    local proc = env.puncts[p_no]
    return proc:process_key_event(key)
end
------ Segmentor
local S = {}
function S.init(env)
   env.puncts = {
       Component.Segmentor(env.engine, Schema('punct1'), '', 'punct_segmentor'),
       Component.Segmentor(env.engine, Schema('punct2'), '', 'punct_segmentor'),
       Component.Segmentor(env.engine, Schema('punct3'), '', 'punct_segmentor'), }
end

function S.func(segments, env)
    local p_no = tonumber(env.engine.context:get_property('puncts_sw')) or 1
    local segm = env.puncts[p_no]
    return segm:proceed(segments)
end

------------ Translator
local T = {}
function T.init(env)
   env.puncts = {
       Component.Translator(env.engine, Schema('punct1'), '', 'punct_translator'),
       Component.Translator(env.engine, Schema('punct2'), '', 'punct_translator'),
       Component.Translator(env.engine, Schema('punct3'), '', 'punct_translator'), }
end
function T.func(input, seg, env)
   local p_no = tonumber(env.engine.context:get_property('puncts_sw')) or 1
   local tran = env.puncts[p_no]
   -- Query once and iterate over the resulting translation
   local translation = tran:query(input, seg)
   if translation then
       for cand in translation:iter() do
             yield(cand)
       end
   end
end