#!/usr/bin/env Rscript
# Normalize a set of CEL files
## Note: To use biocLite packages (i.e. 'affy,' 'rma()', ...), it may be
# necessary to do a one-time [re-]install of the related packages. This can be
# done with the following sequence:
# > source("http://bioconductor.org/biocLite.R"); # Tell R where to download package info from
# > biocLite(); # Update biocLite package grounds. This turns out to be necessary 1-time step before a call to rma(...), which requires up-to-date biocLite library functions
# > biocLite("affy"); # d/l, install 'affy' package (1-time req. only)
library("affy");
# We'll be using this library
# Read CEL files
filenames <- list.files("path/to/CEL", full.names=TRUE);
affy.data <- ReadAffy(filenames=filenames, verbose=TRUE);
## Normalize using RMA
write.table( exprs( rma(affy.data, verbose=TRUE)), file="rma.matrix", sep='\t', quote=FALSE); # file="" to write to stdout, but watch for verbose output from rma(...) and/or ReadAffy(...)
## Using Quantile Norm (this post is edited--the old way didn't collapse measurements per probe)
library("affyPLM")
eset <- threestep (affy.data, background.method="IdealMM", normalize.method="quantile", summary.method="tukey.biweight" )
write.table(eset, file="qnorm.matrix", sep='\t', quote=FALSE)
# for some reason, this transposes the data and adds an "X" in front of every numeric probe ID?
A repository of notes; by noto, for noto. Made public so noto can read it from anywhere. It won't make any sense to you.
Monday, December 21, 2009
Normalizing Affymetrix Data with bioconductor
Subscribe to:
Post Comments (Atom)
Depending on the contents of the CEL file(s), you may need to reinstall some packages; e.g.,
ReplyDeletebiocLite('hgu133acdf')