- General articles
- Getting help
- Logic
- Files management
- Data cleaning and manipulation
- Data visualization
- R programming
- Packages (ordered alphabetically)
- `?` or `help` gives the documentation of a specific function.
- `??` or `help.search` searches the help system for a given keyword (or regex pattern).
- `xor` indicates elementwise exclusive OR.
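A minimal sketch of how `xor` differs from the inclusive `|` on logical vectors. The vectors `a` and `b` are made up for illustration and cover all four input combinations.

```r
# Hypothetical logical vectors covering every TRUE/FALSE combination.
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

either <- a | b            # inclusive OR: TRUE when at least one side is TRUE
exactly_one <- xor(a, b)   # exclusive OR: TRUE when exactly one side is TRUE

print(either)       # values: TRUE TRUE TRUE FALSE
print(exactly_one)  # values: FALSE TRUE TRUE FALSE
```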
- `setwd` & `getwd`: Set and get the working directory.
- `list.files` & `list.dirs`: List all files or directories in the given path.
- `file.exists`, `file.copy`, `file.rename`, `file.remove`: System-level file manipulation.
- `download.file`: Download files from the Internet in an R session.
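The file functions above can be tried safely inside a temporary directory. This is only a sketch: the directory prefix and the file names are made up, and `download.file` is left out because it needs a network connection.

```r
# Work inside a fresh temporary directory so nothing outside it is touched.
old_wd <- getwd()
work_dir <- tempfile("file_demo")   # hypothetical, unique directory path
dir.create(work_dir)
setwd(work_dir)

# Create a small file, then copy, rename, and remove files.
writeLines("hello", "a.txt")
copied <- file.copy("a.txt", "b.txt")      # TRUE on success
renamed <- file.rename("b.txt", "c.txt")   # TRUE on success
files_before <- list.files(".")            # "a.txt" "c.txt"
removed <- file.remove("a.txt")            # TRUE on success
files_after <- list.files(".")             # "c.txt"

# file.exists is vectorized over paths.
exists_after <- file.exists(c("a.txt", "b.txt", "c.txt"))  # FALSE FALSE TRUE

# Restore the original working directory and clean up.
setwd(old_wd)
unlink(work_dir, recursive = TRUE)
```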
- `anyNA`, `complete.cases`, `is.na`, and `na.omit` are useful for finding or excluding NAs.
- `order` gives the row order of a data frame by the values in its column(s). For example, `airquality[order(airquality$Month), ]` and `airquality[order(airquality$Day), ]` order that data frame by Month and Day respectively. Multiple arguments to `order` are allowed; later ones break ties.
- `transform` transforms columns in a data frame.
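A short sketch of these functions on the built-in `airquality` data. The column name `TempC` is made up for illustration, and the descending sort via a minus sign is just one common way to do it.

```r
# airquality ships with R (package datasets) and contains NAs.
has_na <- anyNA(airquality)                           # TRUE
na_per_column <- colSums(is.na(airquality))           # NA count per column
complete <- airquality[complete.cases(airquality), ]  # rows without any NA
same_rows <- nrow(complete) == nrow(na.omit(airquality))

# Order by Month descending, then by Day ascending within each month.
ordered <- airquality[order(-airquality$Month, airquality$Day), ]

# Add a Celsius column derived from Temp (Fahrenheit).
with_celsius <- transform(airquality, TempC = (Temp - 32) * 5 / 9)
head(with_celsius, 3)
```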
- Plotting lines and the group aesthetic in ggplot2
  A good thing to know when using `ggplot2` to plot a line chart whose x-axis is a categorical variable.
- Never use [] or $ inside `aes`
  Avoid column slicing with `[]` or `$` in `ggplot2::aes`.
- What is the width argument in position_dodge?
  A decent explanation and demonstration of how `ggplot2::position_dodge` works.
- Share a legend between multiple plots using `grid.arrange`
  Uses `grid` to place the plots and the legends in an arbitrary layout. I also modified this function to allow shared axis titles and to specify only ncol or nrow.
- Heat maps with ggplot2
  A tutorial on creating heat maps in R with both the base and ggplot2 systems.
- Arguments after an ellipsis (`...`) can only be matched by their full names, so they usually have default values.
- Arguments can be passed by position or by name. A name can be given either bare or as a character string. For instance, `mean(x = 1:3)` is equivalent to `mean("x" = 1:3)`.
- `return` returns the result of an expression and skips all the following lines in that function.
- Generating messages for function users:
  - `message` generates a diagnostic message.
  - `warning` and `stop` generate warnings and fatal errors respectively.
  - `stopifnot`: "If any of the expressions in `...` are not all TRUE, `stop` is called, producing an error message indicating the first of the elements of `...` which were not true."
- `missing` tests whether a value was specified as an argument to a function. For instance, `test <- function(y = 1) {if (missing(y)) {print(y)}}`.
- `on.exit` records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error).
- `exists` tests whether the named object exists in the specified environment.
- `readline` reads a line from the terminal (in interactive use).
- `::` uses a function (once) without attaching the package. For example, calling `reshape2::melt` is equivalent to calling `library(reshape2)` or `require(reshape2)` before `melt`.
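The function tools above can be combined in one small function. This is only a sketch: `safe_sqrt` and its `verbose` argument are hypothetical names chosen for illustration, and `stats::median` stands in for any package function called through `::`.

```r
# safe_sqrt is a hypothetical helper that exercises the functions listed above.
safe_sqrt <- function(x, verbose = FALSE) {
  on.exit(message("safe_sqrt finished"))   # runs on normal exit and on error
  if (missing(x)) {
    stop("argument 'x' was not supplied")  # fatal error
  }
  stopifnot(is.numeric(x))                 # error unless the condition is TRUE
  if (any(x < 0)) {
    warning("negative values replaced by NA")
    x[x < 0] <- NA
  }
  if (verbose) {
    message("computing square roots")      # diagnostic message, not an error
  }
  return(sqrt(x))
  print("never reached")                   # skipped: return() already exited
}

ok <- safe_sqrt(c(4, 9))                            # 2 3
with_na <- suppressWarnings(safe_sqrt(c(4, -1)))    # 2 NA
failed <- tryCatch(safe_sqrt(), error = function(e) conditionMessage(e))
has_helper <- exists("safe_sqrt")                   # TRUE

# `::` calls a function without attaching its package first.
med <- stats::median(c(1, 5, 3))                    # 3
```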
- R的字串處理 (String processing in R)
- `grep`, `grepl`, `regexpr`, `gregexpr` and `regexec` search for matches to argument `pattern` within each element of a character vector: they differ in the format of and amount of detail in the results.
- `sub` and `gsub` perform replacement of the first and all matches respectively.
- `sprintf` returns a character vector containing a formatted combination of text and variable values.
- `substr` extracts or replaces substrings in a character vector.
- `strsplit` splits the elements of a character vector `x` into substrings according to the matches to substring `split` within them.
- `tolower` and `toupper` convert upper-case characters in a character vector to lower-case, or vice versa. Non-alphabetic characters are left unchanged.
- `nchar` takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of `x`.
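The string functions above, applied to a small made-up vector. The data and variable names are only for illustration.

```r
fruits <- c("apple pie", "banana split", "cherry tart")  # made-up data

# Searching: the functions differ in what they report about the matches.
idx <- grep("an", fruits)              # 2 (indices of matching elements)
hit <- grepl("an", fruits)             # FALSE TRUE FALSE
first_pos <- regexpr("an", fruits)     # first match position, -1 if none
all_pos <- gregexpr("an", fruits)      # all match positions, one list entry per element

# Replacing: first match versus every match.
first_a <- sub("a", "A", fruits)       # "Apple pie" "bAnana split" "cherry tArt"
every_a <- gsub("a", "A", fruits)      # "Apple pie" "bAnAnA split" "cherry tArt"

# Formatting, extracting, splitting, and measuring.
label <- sprintf("%s has %d characters", fruits[1], nchar(fruits[1]))
prefix <- substr(fruits, 1, 3)         # "app" "ban" "che"
words <- strsplit(fruits, " ")         # list of word vectors
shout <- toupper(fruits[1])            # "APPLE PIE"
sizes <- nchar(fruits)                 # 9 12 11
```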
- `split` divides the data in the vector `x` into the groups defined by `f`.
- `apply`, `sapply`, `lapply`, `tapply`, and `mapply` (the "apply" family). See an example of `mapply` since it's more complicated.
- `by` is an object-oriented wrapper for `tapply` applied to data frames.
- `Reduce` uses a binary function to successively combine the elements of a given vector and a possibly given initial value.
- `do.call` constructs and executes a function call from a name or a function and a list of arguments to be passed to it, while `call` only constructs the function call.

  ```r
  to_bind <- list(data.frame(A = 1:2, B = 3:4), data.frame(A = 7:9, B = 5:7))
  do.call(rbind, to_bind)
  #   A B
  # 1 1 3
  # 2 2 4
  # 3 7 5
  # 4 8 6
  # 5 9 7
  ```
- `replicate` is a wrapper for the common use of `sapply` for repeated evaluation of an expression (which will usually involve random number generation).
- `Vectorize` creates a function wrapper that vectorizes the action of its argument `FUN`.
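A compact sketch of several of the functions above, using the built-in `airquality` data where a data frame is needed. `pick_sign` is a hypothetical scalar function and the seed is arbitrary.

```r
# split + sapply: mean Temp per Month.
temp_by_month <- split(airquality$Temp, airquality$Month)
mean_split <- sapply(temp_by_month, mean)

# tapply does the same split-and-apply in a single call.
mean_tapply <- tapply(airquality$Temp, airquality$Month, mean)

# mapply walks over several vectors in parallel: rep(1, 3), rep(2, 2), rep(3, 1).
repeated <- mapply(rep, 1:3, 3:1)

# Reduce folds a binary function over a vector: ((1 + 2) + 3) + 4.
total <- Reduce(`+`, 1:4)                       # 10
running <- Reduce(`+`, 1:4, accumulate = TRUE)  # 1 3 6 10

# call only builds the call; eval runs it later.
built <- call("sum", 1, 2, 3)
evaluated <- eval(built)                        # 6

# replicate re-evaluates the expression each time (three draws of two numbers).
set.seed(1)
draws <- replicate(3, runif(2))                 # a 2 x 3 matrix

# Vectorize lets a scalar-only function accept a vector.
pick_sign <- function(v) if (v < 0) "negative" else "non-negative"
vpick_sign <- Vectorize(pick_sign)
signs <- vpick_sign(c(-1, 0, 2))
```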
- `class` returns the class of an object, which determines which methods are dispatched on it. Compare this with `mode`.
- `str` compactly displays the internal structure of an R object.
- `append` adds elements to a vector.
- `diff` returns suitably lagged and iterated differences, e.g. `diff(1:5)`.
- `identical` tests two objects for being exactly equal.
- `system.time` returns CPU (and other) times that `expr` used. Compare this with `Sys.time`.
- `unlist` simplifies a list to produce a vector which contains all the atomic components which occur in it.
- `unname` removes the names or dimnames attribute of an R object.
- `search` gives a list of attached packages (see `library`), and R objects, usually data frames.
- `rle` computes the lengths and values of runs of equal values in a vector.
- `sequence` can be regarded as the vectorized version of `seq_len`.
  ```r
  x <- c(rep(1:4, times = 1:4), 1, 1)
  sequence(rle(x)$lengths)
  # 1 1 2 1 2 3 1 2 3 4 1 2
  ```
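A few of the remaining functions on small made-up inputs, as a quick sketch. The variable names are only for illustration.

```r
scores <- c(a = 1, b = 4, c = 9)   # made-up named vector

# class versus mode: a data frame is stored as a list.
cls <- class(airquality)           # "data.frame"
md <- mode(airquality)             # "list"

appended <- append(1:3, 10, after = 1)       # 1 10 2 3
steps <- diff(c(1, 4, 9, 16))                # 3 5 7
same <- identical(c(1, 2), c(1, 2))          # TRUE
not_same <- identical(1L, 1)                 # FALSE: integer versus double
flat <- unlist(list(1, 2:3, list(4)))        # 1 2 3 4
bare <- unname(scores)                       # values without names
runs <- rle(c("a", "a", "b", "a"))           # lengths 2 1 1, values "a" "b" "a"

# system.time measures how long an expression takes to run.
timing <- system.time(for (i in 1:1e5) sqrt(i))
```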
- car
  Short for "Companion to Applied Regression". Two of its useful functions are `Anova` and `Manova`, which can calculate type-II or type-III ANOVA and MANOVA respectively.
- caret
  Short for "Classification And REgression Training". A package that integrates multiple machine learning packages. In addition, it helps with data preprocessing, cross-validation, and evaluation with `confusionMatrix`.
- cowplot
  Merges multiple ggplots into one graph and labels each of them.
- dendextend
  Extended functions for built-in dendrograms in R.
- dplyr
  Some other ways to manipulate or cleanse data.
- e1071
  LIBSVM package for R.
- ggmap
  Spatial visualization with ggplot2.
- ggplot2
  A popular plotting system in R.
- googleVis
  R interface to Google's chart tools, allowing users to create interactive charts based on data frames.
- gridExtra
  "Miscellaneous Functions for 'Grid' Graphics." A tutorial can be found here.
- leaflet
  Useful for adding markers and (interactive) polygons to a map.
- lme4
  Package for fitting (generalized) linear mixed-effects models. Also see regression on repeated measurements for discussions on this topic.
- magrittr
  The "pipe-like" operator `%>%` passes a value or object into an expression or function call.
- mice
  Short for "Multivariate Imputation by Chained Equations". Tutorials on ways to deal with missing data, including but not limited to MICE, can be found in this webpage (in Mandarin). Check also my understanding of MICE and Tutorial on 5 Powerful R Packages used for imputing missing values.
- MCMCglmm
  A package for fitting Bayesian mixed models in R. An introduction and tutorial can be found here.
- plotly
  A powerful package for building interactive plots. Its `plot_ly` function creates various types of plots, and `ggplotly` turns most `ggplot2` objects interactive.
- rattle
  A wonderful GUI for machine learning analyses. The author emphasizes its ability to log what users click in the GUI and to export those logs as a shortcut for further argument tuning. Programming is still encouraged.
- reshape2
  `melt` the data into a long format or `cast` it into a wide format. An example is provided here.
- shiny
  Builds interactive interfaces and presents data to others even if they don't know R. Its tutorial is well worth reading.