MVJ
12 April, 2018
This course is heavily lab- and computer-focused.
We will work with the R statistical platform, with the RStudio interface and with the RMarkdown markup language.
The R platform is the world’s leading open source and free statistical software.
SPSS, Stata and SAS are very expensive platforms with a lot of functionality.
Matlab is a scientific computing platform with support for statistical calculations and graphing.
Excel has weak support for statistical computing.
RStudio is a comfortable way of interacting with R.
Everyone, open a web browser and log in now.
RMarkdown interweaves text and statistical computation into a single document.
All these lecture slides are written in RMarkdown.
Your lab reports, homework assignments and project reports will all be written in RMarkdown.
# Heading
Text goes here. ALWAYS write text that explains what you are doing
```{r}
2+5
data = read.csv("filename.csv")
```
| Graph | Function |
|---|---|
| Scatterplot | `gf_point` |
| Bar graph | `gf_bar` |
| Histogram | `gf_histogram` |
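As a rough sketch of how these three functions are called: they come from the `ggformula` package, which has to be installed first, and each takes a formula plus a `data` argument. The built-in `mtcars` data set and the variable names below are only stand-ins chosen for illustration, not part of the course material.

```r
# Assumes the ggformula package is installed: install.packages("ggformula")
library(ggformula)

# mtcars ships with R; it stands in for your own data here.
cars <- mtcars
cars$cyl <- factor(cars$cyl)  # treat the cylinder count as a category

# Scatterplot: formula has the form y ~ x
p_scatter <- gf_point(mpg ~ wt, data = cars)

# Bar graph: one-sided formula, counts the rows in each category
p_bar <- gf_bar(~ cyl, data = cars)

# Histogram: one-sided formula, bins a numeric variable
p_hist <- gf_histogram(~ mpg, data = cars, bins = 10)
```

Typing the name of a plot object (for example `p_scatter`) in the console or in an RMarkdown chunk displays the graph.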
Data can be stored in several ways: Excel files, databases, … One very common way is in a Character Separated Values file (CSV file).
In a CSV file, the values on each line are separated by a separator character (`,` `;` tab, space, …).
A CSV file may have variable names in the first row. This is called a header row.
`read.csv` assumes that the separator is `,` and that there is a header row.
`read.table` assumes that the separator is some amount of whitespace, and that there is no header row.
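The difference between the two defaults can be seen in a small sketch. It uses the `text =` argument to read from a string instead of a file, and the column names and numbers are made up for the example.

```r
# read.csv: comma separator, first row is treated as a header row
csv_text <- "height,weight\n170,65\n182,80"
a <- read.csv(text = csv_text)
names(a)  # "height" "weight"

# read.table: any amount of whitespace separates values, no header row,
# so R invents the column names V1, V2, ...
table_text <- "170 65\n182   80"
b <- read.table(text = table_text)
names(b)  # "V1" "V2"

# Either default can be overridden with the sep and header arguments
c2 <- read.table(text = csv_text, sep = ",", header = TRUE)
names(c2)  # "height" "weight"
```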
The book datasets use header rows and a tab separator. Read them with `read.csv`, using `sep="\t"` to indicate the separator.
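A minimal sketch of reading a tab-separated file with a header row. The file is a temporary one created just for this example, and its contents are invented; for a book dataset you would pass that file's name instead.

```r
# Write a small tab-separated file with a header row to a temporary location.
# The file and its contents are made up for this example.
path <- tempfile(fileext = ".csv")
writeLines(c("name\tage", "Ann\t31", "Bob\t27"), path)

# sep = "\t" tells read.csv that the columns are separated by tabs;
# the header row is picked up by default.
people <- read.csv(path, sep = "\t")
people
```

Base R also provides `read.delim`, whose defaults are a tab separator and a header row, so `read.delim(path)` should give the same result here.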
For today’s lab, you will all follow along with the lab as worked out on the projector, to make sure everyone gets started.
Lab instructions are at https://www.math.csi.cuny.edu/~mvj/MTH214/mth214-laboration-1.html
You can find all labs through the Lab Instructions link on the course website.