getting-started-with-r.Rmd
This vignette is written for complete beginners to R. For users familiar with R and RStudio, skip ahead to the “Basic Plots” vignette.
Download the most recent versions of R and RStudio for the appropriate OS using the links below:
An R Project file defines the root folder, ie, where R starts searching for your files. For best practice, follow the “one folder, one R project” rule.
Create a new folder to store your files. For example,
"Desktop/drug_lifespan"
.
Create your lifespan study in Excel, export it to
csv
, and paste it in the same
Desktop/drug_lifespan"
folder. Your csv
file
should have 4 columns: condition
, day
,
dead
, censored
. You can also add additional
columns (eg plate/vial). Here is an example:
condition | day | dead | censored |
---|---|---|---|
WT | 10 | 2 | 0 |
WT | 12 | 0 | 2 |
To create an R Project, open RStudio. Click on File
-> New Project
-> Existing folder
.
Navigate to the “Desktop/drug_lifespan
” folder, and click
“Create”.
RStudio should now open in a new window, and you should see your
lifespan study csv
file in the bottom right Files
pane.
The user interface
Code chunk vs text
Analysis in R is performed in files known as markdown files. These enable you to fuse code chunks (calculations) with text chunks (explainations). They come in two similar flavours - Quarto markdown and R markdown. For the purposes of this tutorial, we will be using the Quarto markdown file.
File
->
New File
-> Quarto Document
.Quarto documents consist of two parts - code chunks vs text chunks.
Here is a code chunk:
1 + 1
#> [1] 2
And here is some text:
1 + 1
What’s the difference?
A simple analogy - code chunks are the Microsoft Excel of R, while text chunks are the Microsoft Word of R.
For example, if you type =1+1
in Excel, Excel
automatically sums them up and returns the answer, 2. In contrast, if
you type =1+1
in Microsoft Word, it will remain as
text.
Code chunks are evaluated - meaning that any calculations
inside the chunk will be run. To run code chunks, click on the tiny
green triangle pointing to the right at the top right of the code chunk.
Code chunks have a different colour, and are delineated by the
’``{r}' header. Everything within a code chunk is run and evaluated. The only exceptions are comments, which start with a
#`:
# This is a comment: Let's try addition in R
1 + 1
#> [1] 2
The tldr: code chunks are for code, while text chunks are for text. Describe what you are analyzing and describe your conditions that you are testing. This will help your reader, and also your future self when you revisit this analysis several years down the road.
For example, I will begin by describing my fictional dataset:
This dataset contains lifespan data from 50 _C.elegans_ worms in 3 different conditions.
Summary of conditions:
condition | description
--------- | ------------------------------------------
WT | 50 N2 worms
Drug1 | 50 N2 worms treated with fictional Drug X at 50 μM
Drug2 | 50 N2 worms treated with fictional Drug Y at 50 μM
First, we load libraries. Libraries are packages that contain functions that you need to run. Copy and paste the following line of code into your Quarto Document.
If you’re new to R, I highly recommend the following resource by the Harvard Chan Bioinformatics Core. It provides a great overview of the RStudio interface, as well as R Projects that help keep your analysis organized. https://hbctraining.github.io/Intro-to-R-flipped/lessons/01_introR-R-and-RStudio.html