emmanuelparadis / ape

analysis of phylogenetics and evolution

Home Page: http://ape-package.ird.fr/

read.tree() silently (but correctly) fails on malformed Newick file

arcresu opened this issue

If read.tree() encounters a Newick file that is missing its trailing semicolon, it silently returns NULL thanks to this line:

if (is.na(y[1])) return(NULL)

Would it be possible to add a warning("trailing semicolon not found") or similar before returning here? I had a tree with this issue and it opened fine in some other tree viewers so it took some debugging to figure out why ape couldn't read it. Thanks!

Thanks for reporting this. Looking at the code, maybe this could be tested a few lines above:

y <- unlist(gregexpr(";", tree))

The test would then be:

if (all(y == -1)) {
    warning("no semicolon(s) [end(s) of tree] found")
    return(NULL)
}
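
For anyone who wants to try the proposed check in isolation, here is a minimal base-R sketch. The function name `check_semicolons` is hypothetical and not part of ape; it only assumes that `tree` is a character vector, and it does not reproduce the rest of `read.tree()`.

```r
# Hypothetical helper (not part of ape): the proposed test as a standalone function.
check_semicolons <- function(tree) {
  # gregexpr() yields -1 for every string in which ";" does not occur
  y <- unlist(gregexpr(";", tree, fixed = TRUE))
  if (all(y == -1)) {
    warning("no semicolon(s) [end(s) of tree] found")
    return(NULL)
  }
  y  # positions of the semicolons
}

# A well-formed string gives the position of its terminator.
check_semicolons("((A,B),C);")

# A string without a terminator gives NULL plus a warning
# (suppressed here so the script runs quietly).
suppressWarnings(check_semicolons("((A,B),C)"))
```

The point of the warning is that the NULL return value no longer goes unexplained.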

> I had a tree with this issue and it opened fine in some other tree viewers

I guess these viewers would not work if the Newick file has more than one tree. read.tree() can work with many trees, so the use of ; is critical.
Cheers,
E.
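
To illustrate why the semicolon matters when a file holds several trees, here is a rough base-R sketch that splits Newick text on its terminators. `split_newick` is a hypothetical name, not ape's parser; it assumes no semicolons or meaningful whitespace inside quoted labels or comments, and it drops any text after the last semicolon.

```r
# Hypothetical helper (not ape's implementation): split Newick text into
# one string per tree, using ";" as the end-of-tree marker.
split_newick <- function(text) {
  # Join the lines and remove all whitespace (assumes no quoted labels with spaces)
  s <- gsub("[[:space:]]", "", paste(text, collapse = ""))
  ends <- unlist(gregexpr(";", s, fixed = TRUE))
  # Without any terminator there is no way to tell where a tree ends
  if (all(ends == -1)) return(character(0))
  # Each tree starts right after the previous terminator
  starts <- c(1, ends[-length(ends)] + 1)
  substring(s, starts, ends)
}

split_newick(c("((A,B),C);", "(D,(E,F));"))  # two trees
split_newick("((A,B),C)")                    # no terminator: nothing to return
```

Without the terminator the boundary between trees is ambiguous, which is why a reader that supports multi-tree files cannot simply guess.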

Thanks, yes that sounds sensible.

I'm coming from a context where it's unusual to have more than one tree per file and some tools aren't careful with the semicolons. iTOL is one tree viewer that copes with these slightly malformed one-tree Newick files. I don't see any problem with read.tree() aborting since it expects to handle multiple trees, but the warning might save someone some debugging time in future.
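
In case it helps anyone else with one-tree files from such tools, a possible workaround is to append the missing semicolon before parsing. This is only a sketch: `ensure_semicolon` is a hypothetical helper, and it assumes the input really contains a single tree.

```r
# Hypothetical helper: make sure single-tree Newick text ends with ";".
# Assumes exactly one tree; do not use it on multi-tree input.
ensure_semicolon <- function(newick) {
  # Join lines and strip leading/trailing whitespace
  s <- trimws(paste(newick, collapse = ""))
  if (!endsWith(s, ";")) s <- paste0(s, ";")
  s
}

fixed <- ensure_semicolon("((A,B),C)")
fixed
# The result can then be passed to ape, e.g. ape::read.tree(text = fixed)
```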

Hi @arcresu,
It might also be useful to tell the authors of the program that produced the tree that they should add a semicolon.
I did this recently for a program called GRASP.
Kind regards,
Klaus

Fixed and pushed here (with updated date).