class: center, middle, inverse, title-slide # Package Development in R ###
Fabio Votta
favstats
@favstats
www.favstats.eu
develop-rpkgs.netlify.app
2020-11-04

---

<style>
.onehundredtwenty { font-size: 120%; }
.ninety { font-size: 90%; }
.eightyfive { font-size: 85%; }
.eighty { font-size: 80%; }
.seventyfive { font-size: 75%; }
.seventy { font-size: 70%; }
.fivty { font-size: 50%; }
</style>

### It's normal to struggle at first but it gets better!

<img src="images/r_first_then_new.png" width="5136" />

.fivty[Illustration adapted from [Allison Horst](https://twitter.com/allison_horst)]

--

+ My experience is that this stuff isn't super easy... but it gets better!

--

+ Mostly because of:
  + An awesome, inclusive community that is always ready to help
  + Great documentation of existing packages and functions
  + An active blogosphere with use cases and examples
  + and much more!

---

## Overview

+ Introduction
+ `usethis` magic
+ Creating a Package
+ Documentation (`roxygen2` & some `devtools`)
+ Unit Tests (with `testthat`)
+ Some Resources

---

### Why Even Create an R Package

+ R packages are a wonderful way to make functions and datasets easily accessible for everyone

--

+ R really comes alive because of the many awesome packages that hundreds and hundreds of volunteers help create and maintain

--

+ It's a great way to become active in the R community & give back

--

.center[
![](https://media1.giphy.com/media/5yWTr8kc2O7zaTDhW9/giphy.gif)
]

---

### Before we Start..

+ Creating packages is a process

--

+ There's no need to do everything at once!

--

+ Come up with milestones and focus on reaching them

--

+ Ask yourself: what's the main purpose of your package?
  + Should your package be about ...
    + Data Visualization
    + Implementation of statistical models
    + Wrapping an existing API
    + Data wrangling
    + ???
    + a little bit of everything?

---

### Before we Start..

+ Make sure you have installed the latest versions of R and RStudio
+ Install the following packages:

```r
pkgs <- c("devtools", "roxygen2", "usethis", "testthat")
install.packages(pkgs)
```

--

And now?
--

*Choosing a package name*

![](https://www.wired.com/wp-content/uploads/2016/05/1MINb-ejYCab5RKkhqAplDA-1.gif)

---

### Choosing a Package Name

The `available` package helps you check whether your desired package name is still.. well.. available:

```r
library(available)

available("datenguideR", browse = FALSE)
```

<img src="images/available.png" width="1512" />

---
class: center, middle, inverse

> tfw your chosen package name is still available

![](https://media.giphy.com/media/l3vR7aXoQX0OdHi48/giphy.gif)

---
class: center, middle, inverse

## Now we're ready to create our first R package!

---
class: center, middle, inverse

## Now we're ready to create our first R package!

### With the help of

### `usethis`

---

.pull-left[
### `usethis`

The purpose of `usethis` is to
]

.pull-right[
<img src="images/usethis.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

> … automate repetitive tasks that arise during project setup and development, both for R packages and non-package projects.

--

As you can guess from the description, `usethis` is **VERY** useful for package creation and we will be using its functions quite often during development.

--

We're going to focus on two types of functions within `usethis` (although there are MANY more)

--

+ `use_*`
  + for example `use_pipe` to include the pipe operator in your package
+ `create_*`
  + for example `create_from_github` which creates a local Git repository from GitHub

.pull-right[
[`usethis` pkgdown website](https://usethis.r-lib.org/index.html)
]

---

.pull-left[
### Creating your first R Package

The following code will create a *minimal* R package:
]

.pull-right[
<img src="images/usethis.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

```r
library(usethis)

create_package("~/git_projects/datenguideR")
```

All you need to do is specify a `path`. If it exists, it is used. If it does not exist, it is created, provided that the parent path exists.
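To make the `path` rule concrete, here is a small base R sketch that only *describes* what would happen for a given path, without creating anything. The helper `describe_pkg_path` is made up for this illustration and is not part of `usethis`; the real function may perform additional checks.

```r
# Hypothetical helper for illustration only; it is NOT part of usethis.
# It mirrors the rule described above without touching the file system.
describe_pkg_path <- function(path) {
  path <- path.expand(path)
  if (dir.exists(path)) {
    "path exists: it would be used as is"
  } else if (dir.exists(dirname(path))) {
    "path is missing but its parent exists: it would be created"
  } else {
    "parent path is missing: creation would fail"
  }
}

# Try the three cases with a temporary directory
describe_pkg_path(tempdir())
describe_pkg_path(file.path(tempdir(), "datenguideR"))
describe_pkg_path(file.path(tempdir(), "no_such_parent", "datenguideR"))
```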
--

But because *being tidy* is awesome, we will be using `create_tidy_package`:

```r
create_tidy_package("~/git_projects/datenguideR")
```

This function also creates a new package, *but* it additionally applies many great tidyverse conventions that will come in handy.

---

### Creating your first R Package

These files should be in your working directory after creating a tidy package:

```r
-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

---

### Creating your first R Package

```r
*-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

This is a folder which holds `.R` files with your package functions.

---

### Creating your first R Package

```r
-- R
*-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

This is a folder which holds tests for your package functions. We will talk about tests in greater detail later.

---

### Creating your first R Package

```r
-- R
-- tests
*-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

This is a file that contains meta-info about your package, including authors, description and license. It's also the place for adding package dependencies.

![](images/DESCRIPTION.png)

---

### Creating your first R Package

```r
-- R
-- tests
-- DESCRIPTION
*-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

This file declares which functions your package exports and which it imports from other packages. It is auto-generated by `roxygen2` (more on that later), so we don't have to worry about it at all.

---

### Creating your first R Package

```r
-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
*-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
```

Use this file to communicate with CRAN when you submit a package to their repository. The text should provide an overview of how your package performs on different operating systems.
---

### Creating your first R Package

```r
-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
*-- LICENSE.md
-- README.Rmd
-- README.md
```

This file holds some template text for the license under which you publish your package.

---

### Creating your first R Package

```r
-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
*-- README.Rmd
-- README.md
```

This is an R Markdown file that can be used to generate the README of your package. The knitted version will be displayed on GitHub.

---

### Creating your first R Package

```r
-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
*-- README.md
```

This is a markdown file that is generated from the `README.Rmd` file. You don't have to worry about this file.

---

### Creating your first R Package

Finally, the following two lines of code will set up Git and a GitHub repository for your package:

```r
use_git()
use_github()
```

This is important because people need to be able to install your package from a public source. GitHub is the primary source for that but there are others (GitLab, for example).

--

Once you have run the two lines above, you can do the following to install an already fully functioning R package on your computer:

```r
devtools::install_github("{github_username}/{packagename}")
```

---
class: center, middle, inverse

## Congratulations, you created your very first R package!

![](https://media.giphy.com/media/XGJY5KLuazndauTfZP/giphy.gif)

---
class: center, middle, inverse

![](http://fabian-brunke.com/portfolio-2019/wp-content/themes/portfolio/coming_soon.gif)

## Let's start adding functions to our package!

--

But first, let's talk about how to *name* our functions

---

## Naming Functions

The `rOpenSci Package guide` states that:

> Functions and arguments naming should be chosen to work together to form a common, logical programming API that is easy to read, and auto-complete.
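As a rough illustration of what "easy to read, and auto-complete" can mean in practice, here is a sketch in which related functions share a common prefix. The `calc_*` names are invented for this example and do not belong to any real package.

```r
# Hypothetical function names, chosen only to illustrate a shared prefix.
calc_add <- function(x, y) x + y
calc_subtract <- function(x, y) x - y
calc_multiply <- function(x, y) x * y

# Typing "calc_" and pressing Tab in RStudio would now suggest all three
calc_add(1, 2)
calc_subtract(5, 2)
calc_multiply(2, 3)
```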
--

So in the best case scenario you come up with a naming scheme in the following style:

+ `object_verb`

This scheme

+ helps avoid namespace conflicts with packages that may have similar verbs
+ makes code readable and easy to auto-complete

--

For example, `stringr` functions all start with `str_*` whereas `memer` functions all start with `meme_*`.

In our examples here we won't be using this naming style, just for convenience, but in your personal package you should absolutely consider using it!

---
class: center, middle, inverse

## It's time to type some R code

![](https://media1.tenor.com/images/72bf7922ac0b07b2f7f8f630e4ae01d2/tenor.gif?itemid=11364811)

---

## Creating your first Function in your Package

In order to get started with our first function we will create a new R script with the help of the `usethis` package:

```r
use_r("hello")
```

This will create and open a `hello.R` file in your package's `R` subfolder, the place where all the functions of your package will live!

--

Let's create an easy function that will add two numbers:

```r
add <- function(x, y) {
  x + y
}

add(1, 2)
```

```
## [1] 3
```

So far so good.

---
class: center, middle, inverse

### But *how* do we communicate what functions do

### to people who use our package?

---

## Documentation

With the help of *D O C U M E N T A T I O N*, of course!

```r
always
⊂_ヽ
   \\ document
     \( ͡° ͜ʖ ͡°)
       > ⌒ヽ
      / へ\
     / / \\your
     レ ノ ヽ_つ
    / /
   / /|
  ( (ヽ
  | |、\functions
  | 丿 \ ⌒)
  | | ) /
 ノ ) Lノ
(_/
```

---

.pull-left[
## Documentation

Documentation is absolutely *central*.
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

It's basically the guidance for your users so they know how to use the functions within your package.

--

Or as Hadley Wickham puts it:

> Documentation is one of the most important aspects of good code. Without it, users won’t know how to use your package, and are unlikely to do so.
--

R has a built-in way to document packages: `.Rd` files that are stored in the `man` subdirectory of your package. These files use a syntax that is similar to LaTeX.

Here to help us with documentation is `roxygen2`, which *conveniently* generates these files from comments written in its own, more intuitive syntax.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

This is what the documentation for our `add` function might look like:

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

--

Oi. This looks.. *different*. So what is actually going on here?

Let's take a look at what this piece of roxygen code will generate.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

<img src="images/add.png" width="600" height="350" style="display: block; margin: auto;" />

Ah. This looks more familiar!

Let's go through the original roxygen code line by line to understand it better.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

The first line of code highlighted here shows the title.

```r
*#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

This is a (very) short description of your function.
---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

You might have noticed the odd prefix at the start of every documentation line: `#'`

```r
#' Add together two numbers.
*#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

Roxygen comments start with this symbol in order to distinguish them from your usual comments (`#`).

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Next up are the `@param` descriptions of our function's arguments.

```r
#' Add together two numbers.
#'
*#' @param x A number.
*#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

These lines of code describe the function’s inputs or *parameters*. The description usually documents what the parameter does, what the default inputs are (if any) and what the object type should ideally be (e.g., string, numeric vector etc.).

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

We continue with the `@return` description.

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
*#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

The `@return` tag simply describes the output from the function. By using `\code{}` we can also make sure that the variables `x` and `y` are written in code font.
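As a sketch of how `@param` and `@return` descriptions can spell out object types and default values, consider the following function. Both `add_all` and its `na_rm` argument are hypothetical and only serve as an illustration; they are not part of the package built in these slides.

```r
#' Add together all numbers in a vector.
#'
#' Hypothetical example function, used only to illustrate the tags.
#'
#' @param x A numeric vector of values to add up.
#' @param na_rm A single logical value. Should missing values be
#'   dropped before summing? Defaults to FALSE.
#' @return A single number: the sum of all values in x.
add_all <- function(x, na_rm = FALSE) {
  sum(x, na.rm = na_rm)
}

# With the default, a missing value makes the whole result missing
add_all(c(1, 2, NA))

# Dropping missing values first gives the sum of the remaining numbers
add_all(c(1, 2, NA), na_rm = TRUE)
```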
---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

We continue with the `@return` description.

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
*#' @return The sum of `x` and `y`.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

The `@return` tag simply describes the output from the function. However, since we used `create_tidy_package`, *markdown* syntax is automatically turned on for our roxygen comments, so the `\code{}` syntax is not actually necessary here: plain backticks do the same job.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Next up is the `@export` tag.

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of `x` and `y`.
*#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

This one is pretty straightforward. If you want your function to be available to the user (and not just be an internal function used within the package), then it is crucial to add the `@export` tag.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/roxygen2.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Finally, the `@examples` tag.

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of `x` and `y`.
#' @export
*#' @examples
*#' add(1, 1)
*#' add(10, 1)
add <- function(x, y) {
  x + y
}
```

Adding examples to functions can *really* help your users figure out how your package is *meant* to be used.

---

.pull-left[
## Documentation
]

.pull-right[
<img src="images/devtools.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

With this, the roxygen code for your documentation is done.
The last thing to do is to convert your description to `.Rd` files so it can appear in the help file.

Luckily, with the help of the `devtools` package this step is *really* easy to do. Just type in the following code:

```r
library(devtools)

document()
```

```r
Updating datenguideR documentation
Writing NAMESPACE
Loading datenguideR
Writing NAMESPACE
Writing add.Rd
```

If you now type `?add` into R, it should render the development documentation that we saw earlier.

---
class: center, middle, inverse

## That's it!

## We created our very first function documentation!

![](https://media0.giphy.com/media/5aWC7vyZtQIoLemw1R/giphy.gif)

---
class: center, middle, inverse

However... this is not everything of course. There is so much more to know about documentation!

I recommend the following sources:

[Introduction to roxygen2](https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html)

[Object documentation chapter from R Packages book](http://r-pkgs.had.co.nz/man.html)

---
class: center, middle, inverse

## Next up:

## Just `testthat`

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

```r
    <⌒/ヽ-、_
  /<_/____/

'I cant sleep if you dont test your functions'

   ∧_∧
  ( ・ω・)
 _| ⊃/(___
/ └-(____/
 ̄ ̄ ̄ ̄ ̄ ̄ ̄
```

Testing is really important to make sure your package is functioning as intended.

--

As you keep developing and adding to your package code, keeping track of what might go wrong will become more and more complex.

--

Unit tests will help you identify issues within your code so you can pinpoint what went wrong and where.

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

The basic setup of a unit test is quite simple:

You write up example code with your package functions and define specific outputs that you expect.

1. If they are met, great, the test is passed!
2. If not, the test fails and you will know that something is wrong with your code and what failed.

--

To set up unit tests for your package with `testthat` you can simply run `use_testthat` (from our beloved `usethis` package):

```r
use_testthat()
```

--

This will create a folder `tests` and a subfolder `testthat`. This is the folder where your tests will live.

Since we created our package with `create_tidy_package`, we don't need to run this part as this function automatically sets up `testthat`.

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

How do we set up individual tests? `usethis` has got us covered again! Simply run this code:

```r
use_test("hello")
```

--

This will create the following file:

```r
tests/testthat/test-hello.R
```

--

With this content:

```r
test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})
```

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Let's take this short code apart:

```r
test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})
```

In the first line of code we can see the function `test_that`, which is our main tool for unit tests.

--

The first argument of `test_that` describes what is supposed to be tested, whereas the second argument wraps the code to be tested in curly braces `{}`.

--

Within these braces we write up our expectations with the help of `testthat`'s `expect_*` functions, of which there are many.

--

In this case we use `expect_equal`, which is quite straightforward. The 1st argument should be equal to the 2nd argument and if it's not, the test fails.

---

## You ready?

![](https://media1.tenor.com/images/bd7d6b583e936c6bc2f279b57d3005a5/tenor.gif?itemid=13422464)

## Good.
Because we are now designing our own test for the `add` function.

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Alright. So let's start testing our `add` function.

First, we will test whether addition works as intended:

```r
test_that("addition works as intended", {
  result <- add(5, 5)
  expect_equal(result, 10)
})
```

The test did not throw an error so this worked. Perfect!

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

How about testing that supplying a string should throw an error?

```r
test_that("supplying string should throw error", {
  expect_error(
    add(5, "i_am_not_a_number")
  )
})
```

Again, this ran without problems. So everything went as we expected!

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

<br>

.center[
Now, let's try to break our tests!

![](http://i.giphy.com/klPeFHrWqzPDW.gif)
]

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

Let's try to test whether 5 + 5 is more than 15:

```r
test_that("result is more than 15", {
  result <- add(5, 5)
  # expect_gt() replaces the deprecated expect_more_than()
  expect_gt(result, 15)
})
```

--

```r
Error: Test failed: 'result is more than 15'
* `result` is not strictly more than 15. Difference: -5
```

Success! We failed!

Since 5 + 5 is not more than 15 (it's actually 5 **less**), the test fails and we receive an informative error message as well.

---

.pull-left[
## Unit Testing
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

It takes a little bit of creativity to foresee the many ways functions within your package might break.
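For instance, here is a sketch of a few edge cases one might probe for `add`. Which cases actually matter depends on your package; the ones below are assumptions picked purely for illustration.

```r
library(testthat)

# Same `add` function as before, repeated so this sketch is self-contained
add <- function(x, y) {
  x + y
}

test_that("add handles a few edge cases", {
  # vectors are added element-wise
  expect_equal(add(1:3, 1), c(2, 3, 4))
  # floating point results are compared with a small tolerance
  expect_equal(add(0.1, 0.2), 0.3)
  # missing values propagate instead of raising an error
  expect_true(is.na(add(NA, 1)))
})
```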
In practice, writing the right kind of unit tests is considered sort of an [art](https://www.manning.com/books/the-art-of-unit-testing-second-edition) by some and a nuisance by others.

In any case, though, it is really important for writing robust code that can be relied upon.

---

.pull-left[
## Unit Testing

Now, we have successfully written up three tests for our `add` function.
]

.pull-right[
<img src="images/testthat.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

The `devtools` package will help us run all of our unit tests at once. Just run the following code within your package project:

```r
test()
```

<img src="images/test.png" width="1307" />

---

.pull-left[
## Bringing it all together

We can also run `check` from `devtools` to test pretty much everything about our package:
]

.pull-right[
<img src="images/devtools.png" width="100" height="120" style="display: block; margin: auto 0 auto auto;" />
]

```r
check()
```

<img src="images/check.png" width="1319" />

---
class: center, middle, inverse

And when your package passes all checks you get the sweet feeling of getting the following feedback:

<img src="images/check_results.png" width="827" />

---

## Useful stuff that I wasn't able to mention yet

Add package imports to `DESCRIPTION` with `usethis::use_package`:

```r
use_package("dplyr")
```

--

Set up continuous integration (in this case Travis CI) with the following code:

```r
use_travis()
```

Continuous integration allows you to run checks on your package code with various R versions and ensures that your package can run on different systems like Linux, macOS & Windows.

You can start reading about Travis CI here: [A BEGINNER'S GUIDE TO TRAVIS-CI FOR R](https://juliasilge.com/blog/beginners-guide-to-travis/).

--

Set up an awesome website for your package (documentation) with the help of `pkgdown`.
```r
use_pkgdown()
```

---

## Some Resources

+ [Writing an R Package from Scratch (Hilary Parker)](https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/)
+ [Writing an R Package from Scratch (Tomas Westlake)](https://r-mageddon.netlify.com/post/writing-an-r-package-from-scratch/)
+ [usethis workflow for package development](https://www.hvitfeldt.me/blog/usethis-workflow-for-package-development/)
+ [R packages by Hadley Wickham](http://r-pkgs.had.co.nz/)
+ [rOpenSci Packages: Development, Maintenance, and Peer Review](https://devguide.ropensci.org/)
+ [R Packages cheatsheet](https://rawgit.com/rstudio/cheatsheets/master/package-development.pdf)
+ [Generating .Rd files (roxygen2)](https://cran.r-project.org/web/packages/roxygen2/vignettes/rd.html)
+ [Writing R Extensions]()
+ [usethis website](https://usethis.r-lib.org/index.html)
+ [testthat website](https://testthat.r-lib.org/)

---
class: center, middle, inverse

### Thank you for listening

![](https://media1.tenor.com/images/da0f7d5d93faa11dfc36db1e6c6fdf2a/tenor.gif?itemid=6159389)