elizagrames / litsearchr

litsearchr is an R package to partially automate search term selection for systematic reviews using keyword co-occurrence networks. In addition to identifying search terms, it can write Boolean searches and translate them into over 50 languages.

Home Page: https://elizagrames.github.io/litsearchr

Backslashes in write_search()

luketudge opened this issue

write_search() puts backslashes into the search string, but only when writing to file with writesearch=TRUE.

For example:

library(litsearchr)
library(readr)
terms <- list(c("a", "b"), c("c", "d"))
write_search(terms, closure="left", languages="English", writesearch=TRUE)
cat(read_file("search-inEnglish.txt"))
\(\(a OR b\) AND \(c OR d\)\)

The backslashes come before parentheses, which makes it look like an escape sequence of some sort. This is not an escape sequence that R itself recognizes, for example:

"\("
Error: '\(' is an unrecognized escape in character string starting ""\("
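
As a minimal base-R sketch of the point above: a literal backslash in an R string has to be written as `\\` in source code, and the resulting string then really contains that backslash, which is what ends up in a text file. The variable name is illustrative only and is not taken from litsearchr.

```r
# In R source, a literal backslash must be written as "\\".
escaped <- "\\(a OR b\\)"

# The string holds one backslash before each parenthesis,
# so it has 10 characters, not 12.
nchar(escaped)

# cat() emits the raw characters, backslashes included.
cat(escaped, "\n")

# print() shows the escaped (source-code) form instead.
print(escaped)
```

This is why a file written from such a string shows `\(` even though typing `"\("` directly at the console is an error.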

But I'm wondering whether it is an escape sequence that is needed when copying text into the search field of a database? If so, then please ignore this issue. But if not, then it seems like a bug, since it complicates copying and pasting from a text file, and it also does not occur when getting the search string straight from the console with writesearch=FALSE:

write_search(terms, closure="left", languages="English", writesearch=FALSE)
[1] "English is written"
[1] "((a OR b) AND (c OR d))"
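
Until this is settled, one possible workaround is to strip the backslashes after reading the file back in. The sketch below uses a hard-coded string standing in for the file contents; it assumes the search string contains no backslashes that are actually wanted, since it removes all of them.

```r
# Stand-in for the contents of search-inEnglish.txt
# as written with writesearch=TRUE.
escaped <- "\\(\\(a OR b\\) AND \\(c OR d\\)\\)"

# fixed = TRUE treats the pattern as a literal backslash,
# not as a regular expression.
cleaned <- gsub("\\", "", escaped, fixed = TRUE)

cat(cleaned, "\n")
```

The cleaned string matches what the console prints with writesearch=FALSE.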

Doesn't look like this is an encoding issue, but just in case here is my locale info:

getOption("encoding")
[1] "native.enc"
Sys.getlocale("LC_CTYPE")
[1] "en_US.UTF-8"

This is intentional: without the backslashes as escape characters, the output has errors. Maybe this is only a Linux thing though, so I could make it OS-specific if it causes problems on Mac and/or Windows?

Ok, I see. Curious.

On Linux I experimented with removing the backslashes completely (in b0aca33) and I get both the text file output and the console output as-is with no problems.
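
A quick way to check this kind of thing independently of litsearchr is a round trip through a temporary file in base R. This is only a sketch: it uses writeLines() and readLines(), which is an assumption and may not be how write_search() writes its files internally.

```r
# A search string with parentheses and asterisks, no escaping.
search_string <- "((alphabet* OR b) AND (concaten* OR d))"

# Write it to a temporary file and read it straight back.
path <- tempfile(fileext = ".txt")
writeLines(search_string, path)
read_back <- readLines(path)

# TRUE means the text survived the round trip unchanged.
print(identical(read_back, search_string))

unlink(path)
```

If this prints TRUE on a given OS, plain parentheses and asterisks at least survive basic file writing there without escape characters.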

Just to check: Do you mean that Windows throws an error if you don't include the backslashes? What sort of error?

Thanks again for your help!

Honestly, my documentation for this decision is pretty terrible, and at this point I don't know why this is the default; I just vaguely recall something about escape characters for asterisks. I'm a bit limited these days, working from home on a single OS with no access to lab computer resources, but if anyone can confirm that dropping the extra backslashes is fine on Windows and Mac, I can add that straight away.

Ok. No worries. We can just wait and see whether someone on Windows or Mac is able to check this out later. I also can't get to any other machine at the moment.

For anyone willing to test this out, you can install a temporary version of litsearchr without the backslashes for write_search() by doing:

library(devtools)
install_github("luketudge/litsearchr", ref="patch-1")

Then check that you succeeded in getting the temporary version with:

library(litsearchr)
grepl("\\\\", body(write_search)[7])

This should say FALSE.
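
For anyone curious how this kind of check works, here is a rough sketch using two hypothetical stand-in functions rather than the real write_search(). Unlike the command above, it deparses the whole function body instead of looking only at element 7.

```r
# Hypothetical stand-ins for the original and patched functions.
with_backslashes <- function(x) paste0("\\(", x, "\\)")
without_backslashes <- function(x) paste0("(", x, ")")

# deparse() turns the function body back into source text;
# the regex "\\\\" matches a single literal backslash in that text.
has_backslash <- function(f) any(grepl("\\\\", deparse(body(f))))

print(has_backslash(with_backslashes))
print(has_backslash(without_backslashes))
```

The first call finds a backslash in the source and the second does not, which is the same distinction the check relies on.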

Then run the following test and report the output here:

terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
write_search(terms, closure="left", languages="English", writesearch=TRUE)

And also report the output of:

library(readr)
cat(read_file("search-inEnglish.txt"))

@Shireen87 in case you have time, you could try this. But no rush.

Sure, I'm happy to help. I ran the code on a Mac laptop; here is the R output, plus the resulting file, "search-inEnglish.txt":

library(devtools)
Loading required package: usethis
install_github("luketudge/litsearchr", ref="patch-1")
Downloading GitHub repo luketudge/litsearchr@patch-1
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: stringdist (0.9.5.5 -> 0.9.6) [CRAN]

Enter one or more numbers, or an empty line to skip updates:
6
Enter one or more numbers, or an empty line to skip updates:
1
stringdist (0.9.5.5 -> 0.9.6) [CRAN]
Installing 1 packages: stringdist

There is a binary version available but the source version is later:
           binary  source needs_compilation
stringdist 0.9.5.5 0.9.6               TRUE

Do you want to install from sources the package which needs compilation? (Yes/no/cancel) yes
installing the source package ‘stringdist’

trying URL 'https://cran.rstudio.com/src/contrib/stringdist_0.9.6.tar.gz'
Content type 'application/x-gzip' length 839087 bytes (819 KB)

downloaded 819 KB

* installing *source* package ‘stringdist’ ...
** package ‘stringdist’ successfully unpacked and MD5 sums checked
** libs
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include -I/usr/local/include -fopenmp -fPIC -Wall -g -O2 -c R_register_native.c -o R_register_native.o
clang: error: unsupported option '-fopenmp'
make: *** [R_register_native.o] Error 1
ERROR: compilation failed for package ‘stringdist’
* removing ‘/Library/Frameworks/R.framework/Versions/3.5/Resources/library/stringdist’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.5/Resources/library/stringdist’
Error: Failed to install 'litsearchr' from GitHub:
(converted from warning) installation of package ‘stringdist’ had non-zero exit status

library(litsearchr)
terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
[1] "English is written"
[1] "((alphabet* OR b) AND (concaten* OR d))"
write_search(terms, closure="left", languages="English", writesearch=TRUE)
This is going to write .txt files to your computer containing the search strings. Are you sure you want to write the files?

1: yes
2: no

Selection: yes
[1] "English is written"
[1] "((alphabet* OR b) AND (concaten* OR d))"

Here is the output of the file:

library(readr)
cat(read_file("search-inEnglish.txt"))
((alphabet* OR b) AND (concaten* OR d))

Hi @Shireen87 thanks for taking the time to help out.

I'm not completely sure, but judging from the early part of the output it looks as though you did not in fact manage to get the temporary test version of litsearchr that I set up. Your output says:

Error: Failed to install 'litsearchr' from GitHub:
(converted from warning) installation of package ‘stringdist’ had non-zero exit status

This seems to have occurred because R tried to update another package (stringdist) at the same time, and failed. This isn't necessary, so you could skip it. Whenever you next have the time and patience, could you try it again? But this time when the console asks you about updating packages, just choose the 'None' option, which will not attempt any other updates.
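
An alternative to answering the prompt by hand is to tell the installer not to update anything. The sketch below assumes a reasonably recent version of the remotes package (which devtools::install_github() wraps), where the `upgrade` argument accepts "never". The guard around the call is only there so the snippet does nothing in a non-interactive session.

```r
# Arguments for the install call; upgrade = "never" skips the
# "Which would you like to update?" prompt entirely.
install_args <- list(
  repo = "luketudge/litsearchr",
  ref = "patch-1",
  upgrade = "never"
)

# Only attempt the install in an interactive session where
# the remotes package is available.
if (interactive() && requireNamespace("remotes", quietly = TRUE)) {
  do.call(remotes::install_github, install_args)
}
```

With this, other packages such as stringdist are left at their installed versions.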

Then once installation has finished and you have done library(litsearchr), you can test whether you really got the test version by entering the following command:

grepl("\\\\", body(write_search)[7])

This command checks for a piece of source code that is in the original litsearchr but not in the test version. So you should get the response FALSE if you succeeded in getting the test version. (I have also updated my instructions in the post above to include this check).

Then you can go ahead and run the rest of the test as before, so:

terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
write_search(terms, closure="left", languages="English", writesearch=TRUE)
library(readr)
cat(read_file("search-inEnglish.txt"))

Thanks!

I'm so sorry! I didn't notice that I had to choose a number from the given list, like 3 for None. I hope it is fixed now; the outputs are below:

library(devtools)
Loading required package: usethis
install_github("luketudge/litsearchr", ref="patch-1")
Downloading GitHub repo luketudge/litsearchr@patch-1
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: stringdist (0.9.5.5 -> 0.9.6) [CRAN]

Enter one or more numbers, or an empty line to skip updates:
3
✓ checking for file ‘/private/var/folders/gc/s1w8t2pd4_dgythn6wkh5w1r0000gn/T/Rtmp8OAQAc/remotes2efd7c0ee8e3/luketudge-litsearchr-b0aca33/DESCRIPTION’ ...
─ preparing ‘litsearchr’:
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘litsearchr_1.0.0.tar.gz’

* installing *source* package ‘litsearchr’ ...
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (litsearchr)

terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
Error in write_search(terms, closure = "left", languages = "English", :
could not find function "write_search"
write_search(terms, closure="left", languages="English", writesearch=TRUE)
Error in write_search(terms, closure = "left", languages = "English", :
could not find function "write_search"
library(readr)
cat(read_file("search-inEnglish.txt"))
(("depress* disord*" OR "depress* symptom*" OR "major* depress*" OR "negat* affect*" OR "neurotroph* factor*" OR "psychiatr* illness" OR "trauma* exposur*" OR "affect* disord*" OR "emot* abus*" OR "emot* reactiv*" OR "mental* disord*" OR "mental* illness" OR "negat* emot*" OR "psychiatr* disord*" OR "trauma* questionnair*" OR "traumat* event*") AND ("emot* dysregul*" OR "emot* process*" OR "emot* regul*" OR "emot* stimuli*") AND ("anxieti* disord*") AND ("acut* stress*" OR "chronic* stress*" OR "polymorph* region*" OR "posttraumat* stress* disord*" OR "psycholog* stress*" OR "psychosoci* stress*" OR "social* stress*" OR "stress* disord*" OR "stress* exposur*" OR "stress* reactiv*" OR "traumat* stress*" OR "environment* stress*" OR "mental* stress*" OR "stress* hormon*" OR "trier* social* stress*") AND ("child* maltreat*" OR "childhood* abus*" OR "childhood* advers*" OR "childhood* experi*" OR "childhood* maltreat*" OR "childhood* trauma*" OR "earli* advers*" OR "environ* interact*" OR "sexual* abus*" OR "advers* experi*" OR "critic* period*" OR "earli* childhood*" OR "earli* develop*" OR "matern* separ*" OR "sensit* period*" OR "young* adult*"))

Hi @Shireen87 this looks good and you are almost there. But it looks like you did not run library(litsearchr) in between doing the install step and running the test. You can see this from the message could not find function "write_search". R cannot find this function because you have not loaded the litsearchr package.
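
As a small sketch of how to tell these two situations apart (package not installed versus installed but not loaded), the following base-R snippet checks both. It makes no assumptions about litsearchr beyond the package name and the write_search() function mentioned in this thread.

```r
pkg <- "litsearchr"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Attach the package so write_search() is on the search path.
  library(pkg, character.only = TRUE)
  attached <- paste0("package:", pkg) %in% search()
  message("Attached: ", attached)
  message("write_search found: ",
          exists("write_search", mode = "function"))
} else {
  # The package is not installed at all.
  attached <- FALSE
  message(pkg, " is not installed; run the install step first.")
}
```

Calling the function with its namespace prefix, as in litsearchr::write_search(...), also works without library(), as long as the package is installed.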

Could you give it one more go when you next have time? The install steps seem to have worked, so you don't need to repeat those, but now after you start up RStudio, make sure you begin by loading litsearchr:

library(litsearchr)

Then run the quick test that I mentioned for checking that you really have the modified version that I set up:

grepl("\\\\", body(write_search)[7])

And then finally the main test again:

terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
write_search(terms, closure="left", languages="English", writesearch=TRUE)
library(readr)
cat(read_file("search-inEnglish.txt"))

Thanks!

Oh! Here it is :) I hope everything now is correct!

library(litsearchr)
grepl("\\\\", body(write_search)[7])
[1] FALSE
terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)
[1] "English is written"
[1] "((alphabet* OR b) AND (concaten* OR d))"
write_search(terms, closure="left", languages="English", writesearch=TRUE)
This is going to write .txt files to your computer containing the search strings. Are you sure you want to write the files?

1: yes
2: no

Selection: yes
[1] "English is written"
[1] "((alphabet* OR b) AND (concaten* OR d))"

library(readr)
cat(read_file("search-inEnglish.txt"))
((alphabet* OR b) AND (concaten* OR d))

Success! Thanks @Shireen87 for your patience.

So it looks as though removing the backslashes is fine for Mac. I'll see if I can find a Windows user.

Thanks, @Shireen87, for testing it out, and @luketudge, for coordinating this effort to figure out why on earth I made this a default setting in the first place! You guys are the best :)

Hi @elizagrames, we're glad to be able to help out.

It occurred to me that maybe your choice was originally about regular expressions. \( isn't an escape sequence for basic strings or writing text to file as far as I know, but it is an escape sequence that is necessary for representing literal parentheses in regular expressions, because regular expressions use parentheses for grouping.

This is something that can sometimes come up unexpectedly in the behavior of functions like gsub(), which treat the first argument not as a literal string but as a regular expression. For example:

gsub("(", ")", ":(")
Error in gsub("(", ")", ":(") :
  invalid regular expression '(', reason 'Missing ')''
gsub("\\(", ")", ":(")
[1] ":)"
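
For reference, here is a short sketch of three common ways to match a literal parenthesis with gsub() in base R. The variable names are just for illustration.

```r
face <- ":("

# 1. Escape the parenthesis: "\\(" in R source is the regex \(
escaped <- gsub("\\(", ")", face)

# 2. Put it in a character class, where it has no special meaning.
bracketed <- gsub("[(]", ")", face)

# 3. Skip regular expressions altogether with fixed = TRUE.
literal <- gsub("(", ")", face, fixed = TRUE)

# All three approaches produce the same string.
print(c(escaped, bracketed, literal))
```

Only the first form needs backslashes, and only in the pattern, never in the text being searched or written.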

I don't see anywhere in the source that the search strings are used as regular expressions, but maybe earlier you did so. Could that be it?

Anyway, I'll keep thinking about it and wait to see if we get a Windows test.

Hi everyone,

I'm using Windows and have followed @luketudge's instructions.

I was able to install the temporary test version, since:

library(litsearchr)
grepl("\\\\", body(write_search)[7])

returned FALSE.
Then:

terms <- list(c("alphabetically", "b"), c("concatenated", "d"))
write_search(terms, closure="left", languages="English", writesearch=FALSE)

returned

[1] "English is written"
[1] "((alphabet* OR b) AND (concaten* OR d))"

Then:

write_search(terms, closure="left", languages="English", writesearch=TRUE)

returned the file 'search-inEnglish.txt', which had ((alphabet* OR b) AND (concaten* OR d)) in it.

Thanks @mitchhenderson for picking this up and testing it out!

So I guess it looks as though a version without backslashes works ok on Windows too.