forked from dcossyleon/basic-course-website
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathlab1.Rmd
54 lines (44 loc) · 1.62 KB
/
lab1.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
---
title: "Lab: DNA exercise"
pagetitle: "Lab: DNA exercise"
---
```{r setup, include=FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE)
```
## DNA exercise
```{r dna_ex2, include = T, echo=T, eval=T}
# vector ... etc.
my_dna <- "AACGAATGAGTAAATGAGTAAATGAAGGAATGATTATTCCTTGCTTTAGAACTTCTGGAATTAGAGGACAATATTAATAATACCATCGCACAGTGTTTCTTTGTTGTTAATGCTACAACATACAAAGAGGAAGCATGCAG"
my_dna
length(my_dna)
class(my_dna)
str(my_dna)
nchar(my_dna)
```
```{r dna_ex3, include = T, echo=T, eval=T}
my_dna_list <- strsplit(x = my_dna, split = "", fixed = TRUE)
length(my_dna_list)
class(my_dna_list)
my_dna_vector <- unlist(my_dna_list)
length(my_dna_list[[1]])
str(my_dna_vector)
length(my_dna_vector)
# unique characters
unique(my_dna_vector)
# number of As
(my_dna_vector == "A")
length(my_dna_vector[my_dna_vector == "A"])
```
## Frequency distribution
A gene consists of a sequence of nucleotides {A, C, G, T}.The number of each nucleotide can be displayed in a frequency table. This will be illustrated by the Zyxin gene which plays an important role in cell adhesion (Golub et al., 1999). The accession number (X94991.1) of oneof its variants can be found via an NCBI UniGene search. The code below illustrates how to install the package ape, to load it, to read gene ”X94991.1”of the species homo sapiens from GenBank, and to make a frequency table of the four nucleotides
```{r freq1, include = T, echo=T, eval=T}
# install.packages(c("ape"), repo="http://cran.r-project.org", dep=TRUE)
library(ape)
gene <- read.GenBank(c("X94991.1"), as.character=TRUE)
table(gene)
pie(table(gene))
```
```{r knit_exit, include=F, echo=F}
knit_exit()
```