Quick Overview

Markup Languages and Reproducible Programming in Statistics 2019

This course gives an overview of the state of the art in statistical markup, reproducible programming and scientific digital representation. Students will get to know the professional field of statistical markup and its innovations and challenges. The course consists of 8 meetings in which students learn about markup languages (\(\LaTeX\) and Markdown), learn efficient programming with R Markdown, gain experience developing Shiny web apps, get to know version control with Git, and create and maintain their own data archive repository and personal (business-card-style) page through GitHub. Through these lectures, students become acquainted with different viewpoints on marking up statistical manuscripts, areas of innovation, and the challenges that people face when working with, analysing and reporting (simulated) data.

Knowledge obtained from this course will help students face multidimensional problems during their professional career.

Assignment and Grading

The final grade is computed as follows

| Graded part | Weight |
|---|---|
| Markup manuscript | 30% |
| Research repository | 30% |
| Personal repository | 10% |
| Shiny app | 15% |
| Visual presentation | 15% |

To develop the necessary skills for completing the assignment and the presentation, 7 exercises must be completed and submitted. These exercises are not graded, but students must complete them to pass the course.

In order to pass the course, the final grade must be 5.5 or higher, your contribution to the course should be sufficient and all assignments and practical assignments should be handed in and/or passed. Otherwise, additional work is required concerning the assignments and/or exercises you have failed.

Schedule

| When? | Where? | What? |
|---|---|---|
| 28-Oct, 10 am | Ruppert 031 | LaTeX and bibliographies |
| 04-Nov, 10 am | Ruppert 031 | Beamer presentations and equations |
| 11-Nov, 10 am | Ruppert 031 | Tables and figures |
| 18-Nov, 10 am | Ruppert 031 | A reproducible workflow with R Markdown |
| 25-Nov, 10 am | Ruppert 031 | Version control and GitHub repositories |
| 02-Dec, 10 am | Ruppert 031 | Presentations with Markdown |
| 09-Dec, 10 am | Ruppert 031 | GitHub Pages and Shiny apps |
| 16-Dec, 10 am | Ruppert 031 | Presentations |

For fun

Expand \((a+b)^n\): \[ \begin{gather*} (a + b)^n\\ (a\ + \ b)^n\\ (a\quad + \quad b)^n\\ (a\qquad + \qquad b)^n \end{gather*} \] source

Course Manual

Course manual

I’d rather have a pdf

Course description

This course gives an overview of the state of the art in statistical markup, reproducible programming and scientific digital representation. Students will get to know the professional field of statistical markup and its innovations and challenges. The course consists of 8 meetings in which students learn about markup languages (LaTeX and Markdown), learn efficient programming with R Markdown, gain experience developing Shiny web apps, get to know version control with Git, and create and maintain their own data archive repository and personal (business card) page through GitHub. Through these lectures, students become acquainted with different viewpoints on marking up statistical manuscripts, areas of innovation, and the challenges that people face when working with, analysing and reporting (simulated) data. Knowledge obtained from this course will help students face multidimensional problems during their professional career.

Assignment

Students will individually choose one statistical topic and work on a manuscript about this topic. Students will need to perform calculations and program code for this manuscript. All of the student's work needs to be combined into an easily understandable and insightful data archive and posted to a personal GitHub repository. The end result will be graded on:

  1. Quality of the markup language skills,
  2. Quality of the data archive, and
  3. Quality of the online repository.

Grading

Students will be evaluated on the following aspects:

  1. Developing and publishing a research archive that contains code, data and a typeset manuscript following a markup language;
  2. Developing and publishing a personal repository page;
  3. Creating a visual presentation about the progress made in this course.

Furthermore, the course has the following learning objectives:

  1. Students develop fundamental knowledge and understanding of the state of the art in statistical markup languages and reproducible programming (Knowledge and Understanding)
  2. They apply their knowledge in a multi-disciplinary context to contemporary problems (Applying)
  3. They can determine the most effective markup strategies to address a typesetting problem (Applying)
  4. They can efficiently organise a reproducible programming process (Applying)
  5. They can advise researchers in applying the current state of the art in markup and programming (Judgment)
  6. They can produce repositories up to the standards of international programming and coding conventions and initiatives (Communication)
  7. They can produce publications up to the typesetting standards of international peer-reviewed journals (Communication)
  8. They are capable of autonomous scholarly self-development (Learning skills)
  9. They give proof of being a responsible and scholarly professional (Learning skills)

After taking this course, students understand innovations in statistical markup, statistical simulation and reproducible research. They are also able to approach challenges from different professional viewpoints. They have gained experience in marking up a professional manuscript and designing a state-of-the-art statistical archive in an open-source repository.

To develop the necessary skills for completing the assignment and the presentation, 7 exercises must be completed and submitted. These exercises are not graded, but students must complete them to pass the course.

The final grade is computed as follows

| Graded part | Weight |
|---|---|
| Markup manuscript | 30% |
| Research repository | 30% |
| Personal repository | 10% |
| Shiny app | 15% |
| Visual presentation | 15% |

In order to pass the course, the final grade must be 5.5 or higher, your contribution to the course should be sufficient and all assignments and practical assignments should be handed in and/or passed. Otherwise, additional work is required concerning the assignments and/or exercises you have failed.
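Written as a formula, the weights listed above combine as follows. This is only a sketch: the symbols \(M\) (markup manuscript), \(R\) (research repository), \(P\) (personal repository), \(S\) (Shiny app) and \(V\) (visual presentation) are introduced here for illustration, and it assumes that each part is graded on the same scale as the final grade \(G\).

\[ G = 0.30\,M + 0.30\,R + 0.10\,P + 0.15\,S + 0.15\,V \]

For example, with the hypothetical part grades \(M = 7.0\), \(R = 8.0\), \(P = 6.0\), \(S = 7.5\) and \(V = 8.0\):

\[ G = 2.1 + 2.4 + 0.6 + 1.125 + 1.2 = 7.425 \]

This would be above the 5.5 threshold, provided the other requirements are also met.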

Instructions for preparing the repositories

The research repository has to be prepared as a supplementary archive that can serve as an extensive documentation of the research (e.g. as a supplement to be submitted to a journal). The archive has to be published in a public or private GitHub repository.

Time schedule

This course takes place in the second half of the first semester. For students who follow the Master's programme MSBBSS, the course starts the week after the submission deadline for the thesis proposal. The course runs for 8 weeks on Mondays from 10:00 to 12:45, starting October 28, 2019.

Prerequisites

Students will need their own laptop computer. Students should have experience in programming with R and should be familiar with the IDE RStudio.

Week 1

LaTeX and BibTeX

We start with a simple introduction to the LaTeX environment. Just as with any new language aimed at programming and/or scripting: practice makes perfect. Follow the two exercises for this week and you’ll have a head start on marking up your documents with \(\LaTeX\).
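To give a flavour of what this looks like, here is a minimal sketch of a LaTeX document with a single BibTeX citation. The file name references.bib, the citation key knuth1984 and the choice of the plain bibliography style are assumptions made for this illustration; they are not taken from the exercises.

```latex
% Write a small bibliography file next to this document
% (LaTeX skips this step if references.bib already exists).
\begin{filecontents*}{references.bib}
@article{knuth1984,
  author  = {Donald E. Knuth},
  title   = {Literate Programming},
  journal = {The Computer Journal},
  volume  = {27},
  number  = {2},
  pages   = {97--111},
  year    = {1984}
}
\end{filecontents*}

\documentclass{article}

\begin{document}

\section{Introduction}
% \cite prints a numbered reference that points to the entry above
Literate programming was introduced by Knuth~\cite{knuth1984}.

% plain is one of the standard BibTeX styles
\bibliographystyle{plain}
% reads references.bib; the .bib extension is left out
\bibliography{references}

\end{document}
```

Assuming the file is saved as main.tex, running pdflatex main, then bibtex main, and then pdflatex main twice should resolve the citation and build the reference list.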

All the best,

Gerko

For fun

source and LaTeX code

Week 2

Beamer and equations

This week we’ll cover equations in LaTeX; I’m sure you’ll love it. We will also use LaTeX to design slide show presentations. Later on in this course, we’ll focus on creating presentations with Markdown, which is much easier but also less flexible when it comes to perfectly detailed typesetting. For now, getting to know the basics of presentations and equations in LaTeX will pay off in the future.
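As a rough idea of how slides and equations come together, the sketch below sets up a two-frame Beamer presentation with one displayed equation. The title, the author placeholder and the sample-mean example are assumptions chosen for illustration, not material from the lecture.

```latex
\documentclass{beamer}
\usetheme{default}        % the plain built-in theme

\title{A first Beamer presentation}
\author{Your name}        % placeholder
\date{\today}

\begin{document}

% Title slide built from \title, \author and \date
\begin{frame}
  \titlepage
\end{frame}

% A content slide; the frame title goes in the braces
\begin{frame}{The sample mean}
  The sample mean of $n$ observations $x_1, \dots, x_n$ is
  % equation* gives an unnumbered displayed equation (beamer loads amsmath)
  \begin{equation*}
    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i .
  \end{equation*}
\end{frame}

\end{document}
```

Compiling this with pdflatex should give a PDF with one page per frame.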

All the best,

Gerko

For fun

Week 3

Tables and figures

This week we’ll cover tables and figures in LaTeX.
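The following sketch shows one common way to set a table and a figure as floats with captions and cross-references. The booktabs and graphicx packages are a choice made for this illustration, and example-image is a placeholder graphic from the mwe package that is assumed to be available in your TeX distribution; the table simply reuses the grading weights from the course manual.

```latex
\documentclass{article}
\usepackage{booktabs}   % \toprule, \midrule and \bottomrule
\usepackage{graphicx}   % \includegraphics

\begin{document}

Table~\ref{tab:weights} lists the grading weights and
Figure~\ref{fig:example} shows a placeholder image.

\begin{table}[ht]
  \centering
  \caption{Weights of the graded parts.}
  \label{tab:weights}          % the label goes after the caption
  \begin{tabular}{lr}          % left-aligned text, right-aligned numbers
    \toprule
    Graded part         & Weight (\%) \\
    \midrule
    Markup manuscript   & 30 \\
    Research repository & 30 \\
    Personal repository & 10 \\
    Shiny app           & 15 \\
    Visual presentation & 15 \\
    \bottomrule
  \end{tabular}
\end{table}

\begin{figure}[ht]
  \centering
  % example-image is a placeholder; replace it with your own file
  \includegraphics[width=0.6\textwidth]{example-image}
  \caption{A placeholder figure.}
  \label{fig:example}
\end{figure}

\end{document}
```

Running pdflatex twice lets the \ref commands pick up the table and figure numbers.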

All the best,

Gerko

For fun

Spoiler alert for Silicon Valley

Exercise

This week’s exercise:


Week 4

Reproducible workflows

This week we’ll cover reproducible workflows with rmarkdown in RStudio.

All the best,

Gerko

Week 5

Git and GitHub

This week we’ll cover version control with Git in RStudio.

All the best,

Gerko

For fun

source

Exercise and lecture

This week’s documents:

For fun 2

source

Week 6

Presentations with rmarkdown

This week we’ll cover presentations with rmarkdown in RStudio.

All the best,

Gerko

For fun

source and original

Exercise

This week’s documents:

Week 7

Online representation

This week we’ll cover Shiny web apps and GitHub Pages. Shiny is a wonderful means to showcase your work and offer online services. GitHub Pages lets developers and professionals introduce themselves to the world and host a personal webpage right from GitHub. And all this is free!

All the best,

Gerko

For fun

source

Exercise

This week’s documents:

Useful references

Definitely look at the book Mastering Shiny by Hadley Wickham. This book is currently under development.