sammo3182 / regioncode

Software for converting regional administrative codes (China) over years

Adding a function of region and pinyin conversion

sammo3182 opened this issue

  1. 2area: convert names/codes to regions (hua-bei, dong-bei, etc.)
  2. 2pinyin: convert names/codes to pinyin (probably at the provincial level)

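Added the following features:

  1. 2area
    Added an option, 2area, to method. It converts any input code or province/city name to the region (area) it belongs to.
    To support this, added an area column to region_table and an example to the vignettes.
  2. topinyin
    Added an argument, topinyin, to the function. It converts the final result (names of provinces, cities, and areas) to pinyin, based on the pinyin package.
    Also added an example to the vignettes.

Other changes:

  1. Removed the error that the original code raised for code2code in province-level conversion. Reason: method no longer uses code2code, so the existing error handling already covers this case.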

I tested topinyin with the vignette data:

regioncode(data_input = vignette_data$prefecture_id, method = '2name', province = FALSE, year_from = 2019, year_to = 1999, topinyin = TRUE)

But the result shows that every pinyin conversion of a city name ends with '_fú'. The correct ending is '_shi'. Similarly, '地区' is converted to the strange pinyin '_de_ōu'.

\t临汾地区          \t吕梁地区                <NA>                <NA> 
"\t_lín_fēn_de_ōu" "\t_lǚ_liáng_de_ōu"                "NA"                "NA" 
            太原市              太原市              大同市              朔州市 
      "tā_yuán_fú"        "tā_yuán_fú"        "dà_tóng_fú"      "shuò_zhōu_fú" 
            大同市              朔州市              长治市              长治市 
      "dà_tóng_fú"      "shuò_zhōu_fú"      "chánɡ_chí_fú"      "chánɡ_chí_fú" 
            朔州市              朔州市              晋中市              晋中市 
    "shuò_zhōu_fú"      "shuò_zhōu_fú"      "jìn_zhōnɡ_fú"      "jìn_zhōnɡ_fú" 
[ reached getOption("max.print") -- omitted 18404 entries ]

@already-love

Please check the issue and remove the tones from the pinyin @already-love

I've removed the tones from pinyin and fixed these problems.
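
The actual change is not shown in this thread. As a rough illustration only, one way to strip tone marks from output like the above is a character-by-character mapping in base R. The helper name strip_tones is hypothetical (it is not part of regioncode or pinyin), writing ü as v is an assumed convention, and the sketch assumes a UTF-8 session. It also maps the script letter ɡ (U+0261) that appears in output such as "chánɡ" to a plain g. It does not address the wrong readings ('_fú', '_de_ōu'), which would need a separate fix.

```r
# Hypothetical helper, not part of regioncode or the pinyin package.
# Maps each toned vowel to its plain form; "ü" becomes "v" (an assumed
# convention) and the script "ɡ" (U+0261) becomes an ordinary "g".
strip_tones <- function(x) {
  chartr(
    "āáǎàēéěèīíǐìōóǒòūúǔùǖǘǚǜüɡ",
    "aaaaeeeeiiiioooouuuuvvvvvg",
    x
  )
}

strip_tones("shuò_zhōu_fú")  # "shuo_zhou_fu"
strip_tones("chánɡ_chí_fú")  # "chang_chi_fu"
```

chartr() is vectorized, so the same helper can be applied to a whole column of converted names.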

library(pinyin) is on line 260. I read the tidyverse style guide, which says:

If your script uses add-on packages, load them all at once at the very beginning of the file. This is more transparent than sprinkling library() calls throughout your code or having hidden dependencies that are loaded in a startup file, such as .Rprofile.

Should we move library(pinyin) to the top of the code?

@sammo3182

Good catch. Sure! Let's tidyversize our code!