Exercises


  1. Create a for-loop that loops over all numbers between 0 and 10, but only prints numbers below 5.

  1. Modify the for loop to only print the numbers 3, 4, and 5.

  1. Try to do the same thing without a for-loop, by subsetting a vector from 0 to 10 directly.

  1. Recreate the following matrix, where the numbers 1 to 8 are multiplied by 1 on the first row, 2 on the second, etc. Tip: use byrow = TRUE to fill a matrix left-to-right instead of top-to-bottom.

  1. Create a 6 by 6 matrix of strings, where each cell contains “row + column = sum”. For example, the second row, third column would yield “2 + 3 = 5”. Tip: Create an empty 6x6 matrix first and fill it with values later.

  1. Modify your loop to put "Sum > 8" in the matrix in the cells where that is true.
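
The tips above can be sketched on a smaller scale. This is a minimal illustration rather than a model answer: the object names v, m, and out are made up for this example, and the matrix sizes are deliberately smaller than those in the exercises.

```r
# Conditional printing: the if-statement decides per iteration what is shown
for (i in 0:10) {
  if (i < 5) print(i)
}

# The same selection without a loop: a logical condition subsets the vector
v <- 0:10
v[v < 5]

# byrow = TRUE fills the matrix row by row (left-to-right)
m <- matrix(1:6, nrow = 2, byrow = TRUE)
m

# Create an empty character matrix first, then fill it cell by cell
out <- matrix(NA_character_, nrow = 2, ncol = 2)
for (r in 1:2) {
  for (k in 1:2) {
    out[r, k] <- paste(r, "+", k, "=", r + k)
  }
}
out
```

Creating the matrix before the loop means each cell is simply overwritten, which keeps the loop body short and avoids growing an object inside the loop.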

The anscombe data set is a classic data set published in 1973 by Francis J. Anscombe. It was designed to demonstrate that pairs of variables can have nearly identical statistical properties while having completely different graphical representations. We will use this data set more often this week. If you’d like to know more about anscombe, call ?anscombe to open its help page.
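
As a quick illustration of that point, the sketch below computes the correlation within each of the four x/y pairs. It assumes the column names x1 to x4 and y1 to y4 described in ?anscombe; the names idx and cors are made up for this example.

```r
# Correlation between x and y for each of the four pairs
idx <- 1:4
cors <- sapply(idx, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
round(cors, 2)
# [1] 0.82 0.82 0.82 0.82
```

The four correlations agree to two decimals, even though scatter plots of the four pairs look very different.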

You can call anscombe directly from your console because datasets is a base package in R: it is always included and attached when you start an R session. To access functions or data sets from packages that are not attached automatically, you do not have to load the whole package with library(). You can instead call package::thing-we-need to directly ‘grab’ the thing-we-need from the package namespace. For example, datasets::anscombe returns the anscombe data set whether or not the datasets package is attached.

This is especially handy within functions, as we can call package::function-name to borrow functionality from installed packages without attaching the whole package. The package namespace is still loaded behind the scenes, but its other objects are not placed on the search path. This avoids name clashes and makes it clear where each function comes from.
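
A minimal sketch of this pattern, using the tools package, which ships with R but is not attached at start-up. The function name get_extension is hypothetical and only serves this example.

```r
# Borrow a single function from tools without calling library(tools)
get_extension <- function(path) {
  tools::file_ext(path)
}

get_extension("results.csv")
# [1] "csv"

# The namespace is loaded on first use of tools::, but the package is
# not attached (FALSE in a fresh session, unless tools was attached elsewhere)
isNamespaceLoaded("tools")
"package:tools" %in% search()
```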


  1. Display summary statistics (for example, using summary()) of each column of the anscombe dataset from the datasets package.

  1. Display summary statistics of each column of the anscombe dataset using apply().

  1. Display summary statistics of each column of the anscombe dataset using sapply().

  1. Display summary statistics of each column of the anscombe dataset using lapply().

  1. Write a function that takes a vector of numbers as input, and returns a string containing “The mean is XXX”, where XXX should be, of course, the mean of the input vector.

  1. Apply this to each column of anscombe.

  1. Now modify your function to round() off the means to have a single decimal, and apply it again to see the results.
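
The three apply-style functions differ mainly in what they return. The sketch below shows this on a small made-up data frame (toy is hypothetical, not part of the practical), together with one possible shape for a function that returns a string. The name describe_mean is likewise an assumption.

```r
# A small made-up data frame to compare the return types
toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 40))

apply(toy, 2, mean)   # MARGIN = 2 applies the function over columns
sapply(toy, mean)     # simplifies the result to a named vector
lapply(toy, mean)     # always returns a list, one element per column

# A function that returns a string, rounding the mean to one decimal
describe_mean <- function(x) {
  paste("The mean is", round(mean(x), 1))
}

sapply(toy, describe_mean)
#                  a                  b
#    "The mean is 2" "The mean is 23.3"
```

Note that apply() first converts a data frame to a matrix, so sapply() or lapply() are usually the safer choice when columns have different types.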

The mammalsleep data set from the mice package shows data collected by Allison and Cicchetti (1976). It holds information for 62 mammal species on the interrelationship between sleep, ecological, and constitutional variables. The dataset contains missing values on five variables, which poses challenges when analyses include these variables.

We will use this data set more often this week, but only once today. It is therefore more convenient to call mice::mammalsleep, which obtains the mammalsleep data set without attaching the whole mice package.


  1. Write a function that takes a vector as input, and that returns a string that contains either (1) the mean and standard deviation (sd()) of the vector, if the vector is numeric, or (2) the levels of the vector, if it is categorical.
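
One way to branch on the type of the input is sketched below. The function name describe_vector is hypothetical, and three choices are assumptions rather than requirements of the exercise: categorical is taken to mean a factor, missing values are dropped with na.rm = TRUE (because the data contain missing values), and results are rounded to two decimals.

```r
describe_vector <- function(x) {
  if (is.numeric(x)) {
    # Assumption: ignore missing values when summarising
    paste0("mean = ", round(mean(x, na.rm = TRUE), 2),
           ", sd = ", round(sd(x, na.rm = TRUE), 2))
  } else if (is.factor(x)) {
    paste("levels:", paste(levels(x), collapse = ", "))
  } else {
    "unsupported type"
  }
}

describe_vector(c(2, 4, 6, NA))
# [1] "mean = 4, sd = 2"

describe_vector(factor(c("low", "high", "low")))
# [1] "levels: high, low"

# Assuming the mice package is installed, the function could be applied
# to every column with:
# lapply(mice::mammalsleep, describe_vector)
```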

End of Practical


References

Allison, T., Cicchetti, D.V. (1976). Sleep in Mammals: Ecological and Constitutional Correlates. Science, 194(4266), 732-734.

Anscombe, F.J. (1973). Graphs in Statistical Analysis. The American Statistician, 27(1), 17-21.