r-dbi / DBI

A database interface (DBI) definition for communication between R and RDBMSs

Home Page:https://dbi.r-dbi.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Why is `dbWriteTable` discouraged?

pnacht opened this issue · comments

The man page for dbWriteTable() discourages its use by stating that

New code should prefer dbCreateTable() and dbAppendTable().

However, I can't see why, given that it presents the same features and more in a tidy little function.

dbCreateTable() simply calls CREATE TABLE, without checking if the table already exists. So if one wishes to override a possibly-existing table, one must first call dbGetQuery("SELECT {table exists}") to determine whether the table exists, and then either dbExecute('DELETE FROM ...') if it does or dbCreateTable() if it doesn't, then dbAppendTable().

Or, you know, just call dbWriteTable(overwrite = TRUE). (And, if you want, add the row names as a column for "free")

In my ignorance, I can't seem to find the source code for dbWriteTable to understand what it does under the hood (I just found the default definition, which just looks like infinite recursion to me...), but I'd expect it has to do a lot of work to figure out precisely what it has to do in each situation.

So I understand that if the user knows they are creating a new table, they might be better served by dbCreate+AppendTable, which are clean pieces of code which likely outperform dbWriteTable for such tasks.

What I found odd was the term "New code", which implies dbWriteTable is inferior to the Create+Append combo, when it's seemingly a superset. Something like "However, a combination of dbCreateTable() and dbAppendTable() will in many cases be more performant".

Then again, I've usually found that there's something I've forgotten in these cases, so what haven't I taken into consideration here? Is it simply a maintenance attack-surface concern looking forward?

Thanks. There's dbExistsTable() and dbRemoveTable() too. The goal here is to simplify the database interface so that it uses simpler building blocks, dbWriteTable() does "too much" in one method.

Also why does the documentation for dbWriteTable() not explain (anymore? I thought it used to) that it can use a file name as value argument instead of a data.frame?
To me that's the main advantage of dbWriteTable() to me is that it's the only way I can add a csv directly into an sqlite file. (and it does seem to do it much more efficiently than if loading the file as data.frame first!).

The R markdown chunks (set to sqlite), the SQL script files not the dbExecute() function allows me to use any type of code to do that. When writing an r package, I can't access the sqlite terminal, so that's also not an option.

I'm just wondering if this "you should use these other functions" means this one will eventually be deprecated and my option for this gone. I did see from the source code I saw it's using a C function from sqlite itself to do so, so I guess if it does disappear I'll just have to figure how to use that.

Maybe, if the dbCreateTable() and dbAppendTable() combo are better for data.frames, the dbWriteTable() could become dbImportTable() and only do that import part?

Thanks for your input. I have tweaked the documentation, hope it's a bit clearer now -- in particular with the links to the methods in other packages.

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.