Package Development in R





Fabio Votta

 favstats
 @favstats
  www.favstats.eu
 develop-rpkgs.netlify.app

2020-11-04

1

It's normal to struggle at first but it gets better!

Illustration adapted from Allison Horst

  • My experience is that this stuff isn't super easy... but it gets better!

  • Mostly because of:

    • Awesome inclusive community that is always ready to help
    • Great documentation of existing packages and functions
    • Active blogosphere with use cases and examples
    • and much more!
2

Overview

  • Introduction

  • usethis magic

  • Creating a Package

  • Documentation (roxygen2 & some devtools)

  • Unit Tests (with testthat)

  • Some Resources

3

Why Even Create an R Package

  • R packages are a wonderful way to make functions and datasets easily accessible for everyone

  • R really comes alive because of the many awesome packages that hundreds and hundreds of volunteers help create and maintain

  • It's a great way to become active in the R community & give back

4

Before we Start..

  • Creating packages is a process

  • There's no need to do everything at once!

  • Come up with milestones and focus on reaching them

  • Ask yourself: what's the main purpose of your package?

  • Should your package be about ...
    • Data Visualization
    • Implementation of statistical models
    • Wrapping an existing API
    • Data wrangling
    • ???
    • a little bit of everything?
5

Before we Start..

  • Make sure you have installed the latest versions of R and RStudio

  • Install the following packages:

pkgs <- c("devtools", "roxygen2", "usethis", "testthat")
install.packages(pkgs)

And now?

Choosing a package name

6

Choosing a Package Name

The available package will help check whether your desired package name is still.. well.. available:

library(available)
available("datenguideR", browse = FALSE)

7

tfw your chosen package name is still available

8

Now we're ready to create our first R package!

With the help of

usethis

10

usethis

The purpose of usethis is to

… automate repetitive tasks that arise during project setup and development, both for R packages and non-package projects.

As you can guess from the description, usethis is VERY useful for package creation, and we will be using its functions quite often during development.

We're going to focus on two types of functions within usethis (although there are MANY more):

  • use_*
    • for example use_pipe to include the pipe operator in your package
  • create_*
    • for example create_from_github which creates a local Git repository from GitHub
11

Creating your first R Package

The following code will create a minimal R package:

library(usethis)
create_package("~/git_projects/datenguideR")

All you need to do is specify a path. If it exists, it is used. If it does not exist, it is created, provided that the parent path exists.

But because being tidy is awesome we will be using create_tidy_package

create_tidy_package("~/git_projects/datenguideR")

This function also creates a new package, and on top of that it applies many tidyverse conventions that will come in handy.

12

Creating your first R Package

These files should be in your working directory after creating a tidy package:

-- R
-- tests
-- DESCRIPTION
-- NAMESPACE
-- cran-comments.md
-- LICENSE.md
-- README.Rmd
-- README.md
13

Creating your first R Package

Here is what each of these files and folders is for:

  • R: a folder which holds the .R files with your package functions.
  • tests: a folder which holds the tests for your package functions. We will talk about tests in greater detail later.
  • DESCRIPTION: a file that contains meta-information about your package, including authors, description and license. It's also the place for adding package dependencies.
  • NAMESPACE: a file that records which functions your package exports and imports. It is auto-generated by roxygen2 (more on that later), so we don't have to edit it by hand.
  • cran-comments.md: use this file to communicate with CRAN when you submit a package to their repository. The text should provide an overview of how your package performs on different operating systems.
  • LICENSE.md: a file that holds some template text for the license under which you publish your package.
  • README.Rmd: an R Markdown file that can be used to generate the README of your package. The knitted version will be displayed on GitHub.
  • README.md: a markdown file that is generated from README.Rmd. You don't have to worry about this file.

21
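To make the role of these files a bit more concrete, here is a hand-rolled sketch that builds a bare-bones package skeleton in a temporary directory using only base R. It is not what create_package() or create_tidy_package() do internally, and the package name "mypkg" and all field values are placeholders chosen for illustration.

```r
# Hand-rolled sketch (base R only); usethis does much more than this.
pkg_dir <- file.path(tempdir(), "mypkg")  # "mypkg" is a placeholder name
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION is a plain-text file of "Field: value" pairs.
desc <- data.frame(
  Package = "mypkg",
  Title = "What the Package Does",
  Version = "0.0.0.9000",
  Description = "A longer description of the package.",
  License = "MIT + file LICENSE",
  stringsAsFactors = FALSE
)
write.dcf(desc, file.path(pkg_dir, "DESCRIPTION"))

# An empty NAMESPACE; roxygen2 would normally generate its contents.
file.create(file.path(pkg_dir, "NAMESPACE"))

# Package functions live in .R files inside the R/ folder.
writeLines("add <- function(x, y) x + y", file.path(pkg_dir, "R", "hello.R"))

# Read the metadata back in and list the files that were created.
pkg_meta <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
pkg_files <- list.files(pkg_dir, recursive = TRUE)
pkg_files
```

In practice you would let usethis create these files, since it also takes care of the license, README, tests and RStudio project setup.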

Creating your first R Package

Finally, the following two lines of code will set up Git and a GitHub repository for your package:

use_git()
use_github()

This is important because people need to be able to install your package from a public source. GitHub is the primary source for that, but there are others (GitLab, for example).

Once you have run the two code pieces above you can do the following to install an already fully functioning R package on your computer.

devtools::install_github("{github_username}/{packagename}")
22

Congratulations, you created your very first R package!

23

Let's start adding functions to our package!

By first of all talking about how to name our functions

24

Naming Functions

The rOpenSci Package guide states that:

Functions and arguments naming should be chosen to work together to form a common, logical programming API that is easy to read, and auto-complete.

So in the best case scenario you come up with a naming scheme in the following style:

  • object_verb

This scheme

  • helps avoid namespace conflicts with packages that may have similar verbs
  • makes code readable and easy to auto-complete

For example stringr functions all start with str_* whereas memer functions all start with meme_*.

In our examples here we won't be using this naming style just for convenience but in your personal package you should absolutely consider using it!
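As a small sketch of what such a prefix scheme can look like, the helpers below all share a temp_ prefix. The function names and the temperature use case are made up for illustration and do not come from any existing package.

```r
# Hypothetical helpers that all share the same temp_to_ prefix.
temp_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32
temp_to_celsius <- function(fahrenheit) (fahrenheit - 32) * 5 / 9
temp_to_kelvin <- function(celsius) celsius + 273.15

# Typing "temp_" and pressing Tab would offer these functions;
# ls() with a pattern mimics that lookup here.
temp_functions <- ls(pattern = "^temp_to_")
temp_functions

temp_to_fahrenheit(100)
temp_to_celsius(212)
```

Because every function starts with the same prefix, users only need to remember one stem to discover everything the package offers.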

25

It's time to type some R code

26

Creating your first Function in your Package

In order to get started with our first function we will create a new R script with the help of the usethis package:

use_r("hello")

This will create and open a hello.R file in your package's R subfolder: the place where all the functions of your package will live!

Let's create an easy function that will add two numbers:

add <- function(x, y) {
  x + y
}
add(1, 2)
## [1] 3

So far so good.
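Before moving on, it can help to poke at the function a little. The calls below are only an exploratory sketch: they show that add() inherits R's vectorised arithmetic and that non-numeric input raises an error, which becomes relevant once we start writing tests.

```r
add <- function(x, y) {
  x + y
}

# `+` is vectorised, so add() works element-wise on vectors too.
sums <- add(1:3, 10)
sums

# Non-numeric input makes `+` fail; tryCatch() captures the error message.
msg <- tryCatch(
  add(5, "i_am_not_a_number"),
  error = function(e) conditionMessage(e)
)
msg
```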

27

But how do we communicate what functions do

to people who use our package?

28

Documentation

With the help of D O C U M E N T A T I O N, of course!

always
⊂_ヽ
  \\ document
   \( ͡° ͜ʖ ͡°)
    > ⌒ヽ
   /   へ\
   /  / \\your
   レ ノ   ヽ_つ
  / /
  / /|
 ( (ヽ
 | |、\functions
 | 丿 \ ⌒)
 | |  ) /
ノ )  Lノ
(_/
29

Documentation

Documentation is absolutely central.

It's basically the guidance for your users so they know how to use the functions within your package.

Or as Hadley Wickham puts it:

Documentation is one of the most important aspects of good code. Without it, users won’t know how to use your package, and are unlikely to do so.

R has a built-in way to document packages: .Rd files that are stored in the man subdirectory of your package. These files use a syntax that is similar to LaTeX.

Here to help us with documentation is roxygen2 which conveniently creates a lot of the necessary files with its own more intuitive syntax style.

30

Documentation

This is what documentation for our add function might look like:

#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

Oi. This looks.. different. So what is actually going on here?

Let's take a look at what this piece of roxygen code will generate.

31

Documentation

Ah. This looks more familiar!

Let's go through the original roxygen code line by line to understand it better.

32

Documentation

The first line of the roxygen block is the title:

#' Add together two numbers.

This is a (very) short description of your function.

33

Documentation

You might have noticed this odd code in front of the documentation: #'

Roxygen comments start with this symbol in order to distinguish them from your usual comments (#)

34

Documentation

Next up are the @param tags, one for each argument of our function:

#' @param x A number.
#' @param y A number.

These lines of code describe the function’s inputs or parameters. The description usually documents what the parameter does, what the default inputs are (if any) and what the object type should ideally be (e.g., string, numeric vector etc.).
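As a sketch of how a parameter description can cover purpose, type and default value, here is a hypothetical variant of our function with an extra digits argument. Neither add_rounded nor its arguments are part of the package we are building; they only serve as an illustration.

```r
#' Add two numbers and round the result.
#'
#' @param x A numeric vector.
#' @param y A numeric vector, recycled against x if needed.
#' @param digits A single integer giving the number of decimal places
#'   to round to. Defaults to 0.
#' @return A numeric vector with the rounded sum of x and y.
#' @export
add_rounded <- function(x, y, digits = 0) {
  round(x + y, digits = digits)
}

add_rounded(1.26, 2.31)              # uses the default digits = 0
add_rounded(1.26, 2.31, digits = 1)  # overrides the default
```

Stating the default in the description saves users from having to read the function signature to find out what happens when they leave an argument out.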

35

Documentation

We continue with the @return tag:

#' @return The sum of \code{x} and \code{y}.

The @return tag simply describes the output from the function.

By using \code{} we can also make sure that the variables x and y are written in code font.

36

Documentation

However, since we used create_tidy_package, markdown syntax is automatically turned on within our roxygen comments, so the \code{} syntax is not actually necessary here. Backticks do the same job:

#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of `x` and `y`.
#' @export
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

37

Documentation

Next up is the @export tag.

This one is pretty straightforward. If you want your functions to be available to the user (and not just create an internal function to be used within the package) then it is crucial to add the @export tag.
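The difference between exported and internal functions can be sketched as follows. Both check_numeric and add_checked are hypothetical names used only for this illustration; the @noRd tag tells roxygen2 not to write a help file for the internal helper.

```r
#' Check that an input is numeric.
#'
#' Internal helper: there is no @export tag, so users of the package
#' would not see this function after loading the package.
#' @noRd
check_numeric <- function(value) {
  if (!is.numeric(value)) {
    stop("Input must be numeric.", call. = FALSE)
  }
  invisible(value)
}

#' Add together two numbers, with input checks.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of `x` and `y`.
#' @export
add_checked <- function(x, y) {
  check_numeric(x)
  check_numeric(y)
  x + y
}

add_checked(1, 2)
```

Inside a package, only add_checked would be available after library(), while check_numeric stays an implementation detail that you are free to change later.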

38

Documentation

Finally, the @examples tag.

Adding examples to functions can really help your users figure out how your package is meant to be used.

39

Documentation

With this, the roxygen code for your documentation is done. The last thing to do is to convert your roxygen comments into .Rd files so that they can appear in the help pages.

Luckily, with the help of the devtools package this step is really easy to do.

Just type in the following code:

library(devtools)
document()
Updating datenguideR documentation
Writing NAMESPACE
Loading datenguideR
Writing NAMESPACE
Writing add.Rd

If you now type ?add into R, it should render the development documentation that we saw earlier.

40

That's it!

We created our very first function documentation!

41

However...

this is not everything of course.

There is so much more to know about documentation!

I recommend the following sources:

Introduction to roxygen2

Object documentation chapter from R Packages book

42

Next up:

Just testthat

43

Unit Testing

<⌒/ヽ-、_
/<_/____/
'I cant sleep if you dont test your functions'
   ∧_∧
   ( ・ω・) 
  _| ⊃/(___
/ └-(____/
 ̄ ̄ ̄ ̄ ̄ ̄ ̄

Testing is really important to make sure your package is functioning as intended.

As you keep developing and adding to your package code, keeping track of what might go wrong will become more and more complex.

Unit tests will help you with identifying issues within your code so you can pinpoint what went wrong and where.

44

Unit Testing

The basic setup of a unit test is quite simple:

You write up example code with your package functions and define specific outputs that you expect.

  1. If they are met, great, the test is passed!
  2. If not, the test fails and you will know that something is wrong with your code and what failed.
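
The idea behind these two outcomes can be sketched in a few lines of base R. The helper check_expectation below is hypothetical and only illustrates the principle; it is not how testthat is implemented.

```r
# Hypothetical helper: compare an actual result with the expected one
# and report a pass or a fail instead of stopping.
check_expectation <- function(description, actual, expected) {
  passed <- isTRUE(all.equal(actual, expected))
  status <- if (passed) "PASS" else "FAIL"
  message(status, ": ", description)
  invisible(passed)
}

add <- function(x, y) x + y

# Expectation is met, so this check passes.
ok <- check_expectation("1 + 2 equals 3", add(1, 2), 3)

# Expectation is not met, so this check fails and tells us which one.
not_ok <- check_expectation("1 + 2 equals 4", add(1, 2), 4)
```

A testing framework adds a lot on top of this, such as richer comparisons, informative failure messages and running many test files at once.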

To set up unit tests for your package with testthat you can simply run use_testthat (from our beloved usethis package):

use_testthat()

This will create a folder tests and a subfolder testthat. This is the folder where your tests will live.

Since we created our package with create_tidy_package we don't need to run this part as this function automatically sets up testthat.

45

Unit Testing

How do we set up individual tests?

usethis has got us covered again!

Simply run this code:

use_test("hello")

This will create the following file:

tests/testthat/test-hello.R

With this content:

test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})
46

Unit Testing

Let's take this short code apart:

test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})

In the first line of code we can see the function test_that which is our main tool for unit tests.

The first argument of test_that describes what is supposed to be tested whereas the second argument encapsulates the code to be tested within a curly bracket {} environment.

Within this environment we write up our expectations with the help of testthat's expect_* functions, of which there are many.

In this case we use expect_equal which is quite straightforward. The 1st argument should be equal to the 2nd argument and if it's not, the test fails.
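
To give a feel for the range of expect_* functions, here is a short sketch with a handful of commonly used ones. The selection is only a sample chosen for illustration, and the code assumes that testthat is installed.

```r
library(testthat)

test_that("a few common expectations", {
  expect_equal(0.1 + 0.2, 0.3)       # equal up to a small numerical tolerance
  expect_true(is.numeric(2 * 2))     # condition evaluates to TRUE
  expect_length(c(1, 2, 3), 3)       # object has the expected length
  expect_type("hello", "character")  # object has the expected base type
  expect_gt(2 * 2, 3)                # strictly greater than
  expect_error(log("a"))             # code throws an error
})
```

Note that expect_equal allows for tiny floating point differences, which is why the first expectation passes even though 0.1 + 0.2 is not exactly identical to 0.3 in R.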

47

You ready?

Good. Because we are now designing our own test for the add function.

48

Unit Testing

Alright. So let's start testing our add function.

First, we will test whether addition works as intended:

test_that("addition works as intended", {
  result <- add(5, 5)
  expect_equal(result, 10)
})

The test did not throw an error so this worked. Perfect!

49

Unit Testing

How about testing that supplying a string should throw an error?

test_that("supplying string should throw error", {
  expect_error(
    add(5, "i_am_not_a_number")
  )
})

Again, this ran without problems. So everything went as we expected!
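
A slightly stricter version of this test also checks which error was thrown. The sketch below assumes a hypothetical add_strict function with its own error message; passing a second argument to expect_error matches that message against a regular expression.

```r
library(testthat)

# Hypothetical stricter variant of add() with its own error message.
add_strict <- function(x, y) {
  if (!is.numeric(x) || !is.numeric(y)) {
    stop("`x` and `y` must both be numeric.", call. = FALSE)
  }
  x + y
}

test_that("non-numeric input gives an informative error", {
  # The second argument is a regular expression matched against the message.
  expect_error(add_strict(5, "i_am_not_a_number"), "must both be numeric")
  expect_error(add_strict("5", 5), "must both be numeric")
})

test_that("numeric input still works", {
  expect_equal(add_strict(5, 5), 10)
})
```

Checking the message guards against the test passing for the wrong reason, for example because of a typo in the function name.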

50

Unit Testing


Now, let's try to break our tests!

51

Unit Testing

Let's try to test whether 5 + 5 is more than 15:

test_that("result is more than 15", {
  result <- add(5, 5)
  expect_gt(result, 15)
})
Error: Test failed: 'result is more than 15' *
`result` is not strictly more than 15. Difference: -5

Success! We failed! Since 5 + 5 is not more than 15 (it's actually 5 less) the test fails and we receive an informative error message as well.

52

Unit Testing

It takes a little bit of creativity to foresee the many ways functions within your package might break.

In practice, writing the right kind of unit tests is considered sort of an art by some and a nuisance by others.

In any case though it is really important for writing robust code that can be relied upon.

53

Unit Testing

Now, we have successfully written up three tests for our add function.

The devtools package will help us run all of our unit tests at once. Just run the following code within your package project:

test()

54

Bringing it all together

We can also run check from devtools to test pretty much everything about our package:

check()

55

And when your package passes all checks you get the sweet feeling of getting the following feedback:

56

Useful stuff that I wasn't able to mention yet

Add Package imports to DESCRIPTION with usethis::use_package:

use_package("dplyr")

Set up continuous integration (in this case Travis CI) with the following code:

use_travis()

Continuous integration allows you to run checks on your package code with various R versions and helps ensure that your package runs on different systems like Linux, macOS and Windows.

You can start reading about Travis CI here: A BEGINNER'S GUIDE TO TRAVIS-CI FOR R.

Set up an awesome website for your package (documentation) with the help of pkgdown.

use_pkgdown()
57

Thank you for listening

59
