notolog: Normalizing Affymetrix Data with bioconductor


#!/usr/bin/env Rscript
# Normalize a set of CEL files 

## Note:  To use biocLite packages (i.e. 'affy,' 'rma()', ...), it may be
# necessary to do a one-time [re-]install of the related packages.  This can be
# done with the following sequence:

#    > source("http://bioconductor.org/biocLite.R"); # Tell R where to download package info from 
#    > biocLite();            # Update biocLite package grounds.  This turns out to be necessary 1-time step before a call to rma(...), which requires up-to-date biocLite library functions 
#    > biocLite("affy");     # d/l, install 'affy' package (1-time req. only)

library("affy");
# We'll be using this library

# Read CEL files
filenames <- list.files("path/to/CEL", full.names=TRUE); 
affy.data <- ReadAffy(filenames=filenames, verbose=TRUE);

## Normalize using RMA 
write.table( exprs( rma(affy.data, verbose=TRUE)), file="rma.matrix", sep='\t', quote=FALSE); # file="" to write to stdout, but watch for verbose output from rma(...) and/or ReadAffy(...)

## Using Quantile Norm (this post is edited--the old way didn't collapse measurements per probe)
library("affyPLM")
eset <- threestep (affy.data, background.method="IdealMM", normalize.method="quantile", summary.method="tukey.biweight" )
write.table(eset, file="qnorm.matrix", sep='\t', quote=FALSE)
# for some reason, this transposes the data and adds an "X" in front of every numeric probe ID?

notolog

Monday, December 21, 2009

Normalizing Affymetrix Data with bioconductor

1 comment:

Followers

Blog Archive

About Me