Lecture 2

MVJ

12 April, 2018

Welcome to the lab

This course is heavily lab- and computer-focused.

We will work with the R statistical platform, with the RStudio interface and with the RMarkdown markup language.

R

The R platform is free, open-source statistical software, and one of the most widely used in the world.

SPSS, Stata, SAS: very expensive platforms with a lot of functionality.

Matlab: scientific computing with support for statistical calculations and graphing.

Excel has weak support for statistical computing.

RStudio

Comfortable way of interacting with R.

Everyone, open a web browser and log in now.

RMarkdown

Interweaves text and statistical computation into a single document.

All these lecture slides are written in RMarkdown.

Your lab reports, homework assignments and project reports will all be written in RMarkdown.

# Heading

Text goes here. ALWAYS write text that explains what you are doing.

```{r}
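# R evaluates the expression and prints the result in the document
2 + 5
# Read a CSV file into a data frame ("filename.csv" is a placeholder)
data <- read.csv("filename.csv")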
```

Data storage types

Data can be stored in several ways: Excel files, databases, … One very common way is in a Comma-Separated Values file (CSV file). The separator is usually a comma, but it can be another character such as a tab.

In a CSV file, each line holds one row of data, with the values separated by a separator character.

A CSV file may have variable names in the first row. This is called a header row.
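
To see what a header row does, here is a small sketch. The data is made up for illustration and is passed in through the text argument instead of a file.

```r
# A tiny CSV given inline; the names and values are invented for this example.
csv_text <- "name,height\nAda,170\nBo,182"

# header = TRUE: the first row becomes the variable names.
with_header <- read.csv(text = csv_text, header = TRUE)
names(with_header)
nrow(with_header)

# header = FALSE: the first row is treated as data,
# and R makes up the variable names V1, V2.
without_header <- read.csv(text = csv_text, header = FALSE)
names(without_header)
nrow(without_header)
```

Reading a file that has a header row with header = FALSE puts the variable names among the data, so numeric columns end up stored as text.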

read.csv assumes that the separator is a comma (,) and that there is a header row.

read.table assumes that the separator is some amount of whitespace, and that there is no header row.
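
A minimal sketch of the two sets of defaults, using made-up inline data in place of real files:

```r
# Comma-separated with a header row: the defaults of read.csv fit.
comma_text <- "x,y\n1,2\n3,4"
a <- read.csv(text = comma_text)
a

# Whitespace-separated without a header row: the defaults of read.table fit.
# Any amount of whitespace between the values counts as one separator.
space_text <- "1 2\n3   4"
b <- read.table(text = space_text)
b
```

Both calls return a data frame with two rows and two columns. Only a has the variable names x and y; b gets the automatic names V1 and V2.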

The book datasets use header rows and a tab separator. Read them with read.csv, passing sep="\t" to indicate the separator.
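
As a sketch of reading a tab-separated file: the file below is a made-up stand-in for a book dataset, written to a temporary location so the example runs anywhere. The variable names and values are assumptions for illustration.

```r
# Write a small tab-separated file with a header row (invented data).
path <- tempfile(fileext = ".txt")
writeLines(c("city\ttemp", "Oslo\t4.5", "Rome\t15.0"), path)

# sep = "\t" tells read.csv that the columns are separated by tabs.
weather <- read.csv(path, sep = "\t")
str(weather)
```

For a real dataset, replace path with the name of the file. R also has read.delim, which uses a tab separator and a header row by default.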

Today’s Lab

For today’s lab, you will all follow along with the lab as worked out on the projector, to make sure everyone gets started.

Lab instructions are at

https://www.math.csi.cuny.edu/~mvj/MTH214/mth214-laboration-1.html

You can find all labs through the Lab Instructions link on the course website.