tidyverse / stringr

A fresh approach to string manipulation in R

Home Page: https://stringr.tidyverse.org

str_trunc() incorrectly snips rhs-of-ellipsis for truncated strings

eauleaf opened this issue

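str_trunc() with side = 'left' or side = 'center' returns an incorrect result when width is close to the length of ellipsis: with width equal to nchar(ellipsis), both 'left' and 'center' fail, and with width one larger than nchar(ellipsis), 'center' fails. In these cases the part of the string to the right of the ellipsis is appended without truncation.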

datr <- c('', 'a', 'aa', 'aaa', 'aaaa', 'aaaaaaa')

stringr::str_trunc(datr, width=4, side = 'left',   ellipsis = "..") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "..aa"
stringr::str_trunc(datr, width=4, side = 'right',  ellipsis = "..") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "aa.."
stringr::str_trunc(datr, width=4, side = 'center', ellipsis = "..") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "a..a"
stringr::str_trunc(datr, width=3, side = 'left',   ellipsis = "..") # correct
#> [1] ""    "a"   "aa"  "aaa" "..a" "..a"
stringr::str_trunc(datr, width=3, side = 'right',  ellipsis = "..") # correct
#> [1] ""    "a"   "aa"  "aaa" "a.." "a.."
stringr::str_trunc(datr, width=3, side = 'center', ellipsis = "..") # should be c("","a","aa","aaa","a..","a..")
#> [1] ""           "a"          "aa"         "aaa"        "a..aaaa"      "a..aaaaaaa"
stringr::str_trunc(datr, width=2, side = 'left',   ellipsis = "..") # should be c("","a","aa","..","..","..")
#> [1] ""          "a"         "aa"        "..aaa"     "..aaaa"    "..aaaaaaa"
stringr::str_trunc(datr, width=2, side = 'right',  ellipsis = "..") # correct
#> [1] ""   "a"  "aa" ".." ".." ".."
stringr::str_trunc(datr, width=2, side = 'center', ellipsis = "..") # should be c("","a","aa","..","..","..")
#> [1] ""          "a"         "aa"        "..aaa"     "..aaaa"    "..aaaaaaa"
stringr::str_trunc(datr, width=1, side = 'left',   ellipsis = "..") # correct
#> Error in `stringr::str_trunc()`:
stringr::str_trunc(datr, width=1, side = 'right',  ellipsis = "..") # correct
#> Error in `stringr::str_trunc()`:
stringr::str_trunc(datr, width=1, side = 'center', ellipsis = "..") # correct
#> Error in `stringr::str_trunc()`:

stringr::str_trunc(datr, width=4, side = 'left',   ellipsis = "~") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "~aaa"
stringr::str_trunc(datr, width=4, side = 'right',  ellipsis = "~") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "aaa~"
stringr::str_trunc(datr, width=4, side = 'center', ellipsis = "~") # correct
#> [1] ""     "a"    "aa"   "aaa"  "aaaa" "aa~a"
stringr::str_trunc(datr, width=3, side = 'left',   ellipsis = "~") # correct
#> [1] ""    "a"   "aa"  "aaa" "~aa" "~aa"
stringr::str_trunc(datr, width=3, side = 'right',  ellipsis = "~") # correct
#> [1] ""    "a"   "aa"  "aaa" "aa~" "aa~"
stringr::str_trunc(datr, width=3, side = 'center', ellipsis = "~") # correct
#> [1] ""    "a"   "aa"  "aaa" "a~a" "a~a"
stringr::str_trunc(datr, width=2, side = 'left',   ellipsis = "~") # correct
#> [1] ""   "a"  "aa" "~a" "~a" "~a"
stringr::str_trunc(datr, width=2, side = 'right',  ellipsis = "~") # correct
#> [1] ""   "a"  "aa" "a~" "a~" "a~"
stringr::str_trunc(datr, width=2, side = 'center', ellipsis = "~") # should be c("","a","aa","a~","a~","a~")
#> [1] ""          "a"         "aa"        "a~aaa"     "a~aaaa"    "a~aaaaaaa"
stringr::str_trunc(datr, width=1, side = 'left',   ellipsis = "~") # should be c("","a","~","~","~","~")
#> [1] ""         "a"        "~aa"      "~aaa"     "~aaaa"    "~aaaaaaa"
stringr::str_trunc(datr, width=1, side = 'right',  ellipsis = "~") # correct
#> [1] ""  "a" "~" "~" "~" "~"
stringr::str_trunc(datr, width=1, side = 'center', ellipsis = "~") # should be c("","a","~","~","~","~")
#> [1] ""         "a"        "~aa"      "~aaa"     "~aaaa"    "~aaaaaaa"
stringr::str_trunc(datr, width=0, side = 'left',   ellipsis = "~") # correct
#> Error in `stringr::str_trunc()`:
stringr::str_trunc(datr, width=0, side = 'right',  ellipsis = "~") # correct
#> Error in `stringr::str_trunc()`:
stringr::str_trunc(datr, width=0, side = 'center', ellipsis = "~") # correct
#> Error in `stringr::str_trunc()`:

stringr::str_trunc(datr, width=3, side = 'left',   ellipsis = "") # correct
#> [1] ""    "a"   "aa"  "aaa" "aaa" "aaa"
stringr::str_trunc(datr, width=3, side = 'right',  ellipsis = "") # correct 
#> [1] ""    "a"   "aa"  "aaa" "aaa" "aaa"
stringr::str_trunc(datr, width=3, side = 'center', ellipsis = "") # correct 
#> [1] ""    "a"   "aa"  "aaa" "aaa" "aaa"
stringr::str_trunc(datr, width=2, side = 'left',   ellipsis = "") # correct
#> [1] ""   "a"  "aa" "aa" "aa" "aa"
stringr::str_trunc(datr, width=2, side = 'right',  ellipsis = "") # correct 
#> [1] ""   "a"  "aa" "aa" "aa" "aa"
stringr::str_trunc(datr, width=2, side = 'center', ellipsis = "") # correct 
#> [1] ""   "a"  "aa" "aa" "aa" "aa"
stringr::str_trunc(datr, width=1, side = 'left',   ellipsis = "") # correct
#> [1] ""  "a" "a" "a" "a" "a"
stringr::str_trunc(datr, width=1, side = 'right',  ellipsis = "") # correct 
#> [1] ""  "a" "a" "a" "a" "a"
stringr::str_trunc(datr, width=1, side = 'center', ellipsis = "") # should be c("","a","a","a","a","a")
#> [1] ""         "a"        "aaa"      "aaaa"     "aaaaa"    "aaaaaaaa"
stringr::str_trunc(datr, width=0, side = 'left',   ellipsis = "") # should be c("","","","","","")
#> [1] ""        "a"       "aa"      "aaa"     "aaaa"    "aaaaaaa"
stringr::str_trunc(datr, width=0, side = 'right',  ellipsis = "") # correct
#> [1] "" "" "" "" "" ""
stringr::str_trunc(datr, width=0, side = 'center', ellipsis = "") # should be c("","","","","","")
#> [1] ""        "a"       "aa"      "aaa"     "aaaa"    "aaaaaaa"

Created on 2023-06-03 with reprex v2.0.2.9000

I submitted a pull request for str_trunc and an additional unit test.

Below is the description of the issue in str_trunc().

  • when width is 1 more than nchar(ellipsis), side 'center' fails to return the correct result
  • when width is equal to nchar(ellipsis), sides 'center' and 'left' fail to return the correct result

Both cases happen because the piece to the right of the ellipsis should contain zero characters, so the call:
str_sub(string[too_long], -width..., -1)
becomes:
str_sub('characters', 0, -1)
which returns all of 'characters' rather than an empty string.
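
One way to avoid the zero-length edge case is to compute the right-hand piece from an explicit character count, so that a count of 0 produces an empty string. The following is a minimal base-R sketch of that idea, not the code from the pull request and not stringr's implementation. The names trunc_sketch, first_n, last_n and keep are hypothetical, and the sketch assumes that the characters left over after the ellipsis are split with ceiling() on the left and floor() on the right for side = 'center', which matches the expected outputs listed above.

```r
# Hypothetical helper for illustration only (base R, no stringr needed).
trunc_sketch <- function(string, width,
                         side = c("right", "left", "center"),
                         ellipsis = "...") {
  side <- match.arg(side)
  keep <- width - nchar(ellipsis)  # original characters that still fit
  if (keep < 0) {
    stop("`width` is shorter than `ellipsis`")
  }

  too_long <- !is.na(string) & nchar(string) > width
  if (!any(too_long)) {
    return(string)
  }

  # substr() returns "" when start > stop, so n = 0 gives an empty piece
  # instead of the whole string.
  first_n <- function(x, n) substr(x, 1, n)
  last_n  <- function(x, n) substr(x, nchar(x) - n + 1, nchar(x))

  s <- string[too_long]
  string[too_long] <- switch(side,
    right  = paste0(first_n(s, keep), ellipsis),
    left   = paste0(ellipsis, last_n(s, keep)),
    center = paste0(first_n(s, ceiling(keep / 2)), ellipsis,
                    last_n(s, floor(keep / 2)))
  )
  string
}

datr <- c('', 'a', 'aa', 'aaa', 'aaaa', 'aaaaaaa')

# The cases reported as failing above now match the "should be" values.
stopifnot(identical(trunc_sketch(datr, width = 3, side = 'center', ellipsis = ".."),
                    c("", "a", "aa", "aaa", "a..", "a..")))
stopifnot(identical(trunc_sketch(datr, width = 2, side = 'left', ellipsis = ".."),
                    c("", "a", "aa", "..", "..", "..")))
stopifnot(identical(trunc_sketch(datr, width = 0, side = 'center', ellipsis = ""),
                    c("", "", "", "", "", "")))
```

The sketch checks only character counts with nchar(); it does not attempt to reproduce any other argument checking that str_trunc() performs.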