1. Repeat the experiment from 10 on object y from Exercise4_data.RData.
  • Plot a histogram of y and report on the shape of the data
hist(y)

  • What do you think the mean is, based on the plot?

Hard to say, because the data seem to be bimodal. Probably somewhere around 6, as there seems to be more mass around 10 than around 1.

  • Calculate the mean (expected value)
mean(y)
## [1] 5.777224
  • Calculate the median (the center of the distribution)
median(y)
## [1] 6.12544
  • Calculate the mode (the most observed value). Calculate the mode on rounded numbers (HINT: use round()).
round(y) %>% table
## .
## -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 
##  3  3  3  3  8  8  8  5  5  7  5  5 12 13  5  4  1  2

The mode of the rounded values is 10 (13 observations), closely followed by 9 (12 observations).
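The mode can also be extracted directly instead of being read off the table. The following is a minimal sketch: so that it runs without the data file, it rebuilds the rounded values from the counts printed above. The names vals, counts, rounded and mode_value are illustrative only; with the data loaded, table(round(y)) would take the place of the rebuilt vector.

```r
# Rebuild the rounded values from the frequency table printed above
# (assumption: this stands in for round(y) when the data file is not loaded)
vals    <- -3:14
counts  <- c(3, 3, 3, 3, 8, 8, 8, 5, 5, 7, 5, 5, 12, 13, 5, 4, 1, 2)
rounded <- rep(vals, times = counts)

tab <- table(rounded)

# which.max() returns the position of the first maximum;
# the name at that position is the most frequently observed value
mode_value <- as.numeric(names(tab)[which.max(tab)])
mode_value
```

Note that which.max() reports only the first maximum, so ties between equally frequent values would go unnoticed with this approach.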

  • What do the mean, mode and median tell you about the shape of the data?

The mean, median and mode all differ, which indicates that the data are not normally distributed: for a normal distribution the three would roughly coincide. The histogram clearly shows this. If we plot the density, this becomes even more apparent:

y %>% density %>% plot

  • Plot the least squares curve from values 2 to 10, as in the previous exercise
lsfun <- function(meanestimate) apply(outer(y, meanestimate, "-")^2, 2, sum)
curve(lsfun, from = 2, to = 10)
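The outer() construction is needed because curve() passes a whole vector of candidate values to the function at once. A small sketch with a hypothetical three-value vector (y_small and lsfun_small are illustrative names, not part of the exercise data) shows what happens inside: each column of the outer() matrix holds the deviations from one candidate, and the column sums of the squared matrix give one sum of squares per candidate.

```r
# Hypothetical toy data, only to make the matrix small enough to inspect
y_small <- c(1, 2, 3)

# Same construction as lsfun, applied to the toy data
lsfun_small <- function(meanestimate) {
  apply(outer(y_small, meanestimate, "-")^2, 2, sum)
}

# Rows are observations, columns are candidate estimates:
# entry [i, j] equals y_small[i] - candidate[j]
outer(y_small, c(1, 2, 3), "-")

# One sum of squared deviations per candidate: 5 2 5
lsfun_small(c(1, 2, 3))
```

The smallest value occurs at the candidate 2, which is the mean of the toy data.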

  • Plot a curve that zooms in on the minimum (i.e. the least squared deviations)
curve(lsfun, from = 5.5, to = 6)

  • Report the meanestimate that would minimize the least squares function

Naturally, this is the mean of y, which equals 5.777224.
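The reason is that the derivative of sum((y - m)^2) with respect to m is -2 * sum(y - m), which is zero exactly when m equals mean(y). This can also be checked numerically. The sketch below uses simulated bimodal data (y_sim, an assumption standing in for y so that the code runs without the data file) and the base R function optimize() to search for the minimum on the plotted interval.

```r
# Simulated stand-in for y (assumption: a two-component normal mixture,
# chosen only to resemble the bimodal shape; it is not the exercise data)
set.seed(123)
y_sim <- c(rnorm(40, mean = 1, sd = 2), rnorm(60, mean = 10, sd = 2))

# Same least squares function as lsfun, applied to the simulated data
lsfun_sim <- function(meanestimate) {
  apply(outer(y_sim, meanestimate, "-")^2, 2, sum)
}

# Numerical search for the minimizer on the interval used in the plot
opt <- optimize(lsfun_sim, interval = c(2, 10))

opt$minimum   # numerical minimizer
mean(y_sim)   # analytical minimizer
```

The two values should agree up to the tolerance of optimize(). With the data loaded, the same call with lsfun in place of lsfun_sim would be expected to return a value close to mean(y).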


End of practical.