Comprehensive High-Throughput Arrays for Relative Methylation (CHARM)

Figure 20.1.1 Overview of McrBC-based fractionation (Lippman et al., 2005; Ordway et al., 2006) coupled with CHARM analysis (Irizarry et al., 2008). Genomic DNA is sheared to 1.5 to 3.0 kb and divided into two equal parts. The first is digested with McrBC, a methylcytosine-sensitive enzyme that recognizes Pu^mC(N_40-3000)Pu^mC, and the second is untreated. Both fractions are then resolved side by side on a 1% agarose gel, and fragments between 1.65 kb and 3.0 kb are excised and purified. Next, the untreated fraction, representing total input DNA, is labeled with cyanine-3 (Cy3) and the McrBC-treated fraction, representing unmethylated DNA, is labeled with cyanine-5 (Cy5), followed by cohybridization to a CHARM microarray. Sequences that are methylated will be present in the input fraction (Cy3) and depleted in the methyl-depleted fraction (Cy5). For each probe on the array, the log ratio of the Cy3 to Cy5 intensity is calculated and represents the methylation level (M-value) at that locus, with larger M-values representing more methylation and smaller M-values representing less methylation.

BASIC PROTOCOL

CHARM ARRAY HYBRIDIZATION AND ANALYSIS

In this protocol, we describe how to label fractionated DNA and hybridize the labeled DNA to a custom-designed CHARM microarray. In addition, we describe charm, an R package for basic analysis of CHARM arrays, including array normalization and identification of differentially methylated regions. Analysis is carried out using R and Bioconductor software packages (Gentleman et al., 2004).

Sample labeling and hybridization

A minimum of 3.5 µg of DNA is required for cyanine dye labeling. The fractionated DNA used in these labeling reactions should be prepared as described in Support Protocols 1 to 3, and should also have passed quality control by real-time PCR analysis as described in Support Protocol 4. The labeling and hybridization steps used for CHARM were originally provided by NimbleGen for methylated DNA immunoprecipitation (MeDIP). Here, we follow the protocol described in the NimbleGen Arrays User’s Guide for DNA methylation analysis (http://www.nimblegen.com/products/lit/methylation_userguide_v6p0.pdf) provided by Roche NimbleGen, with slight modifications that reflect differences between inputs for CHARM and MeDIP. For CHARM, we label the untreated DNA fraction with cyanine-3 and the methyl-depleted fraction with cyanine-5, followed by cohybridization to a CHARM array (Roche NimbleGen HD2 platform), as shown in Figure 20.1.1.

Analyzing CHARM DNA methylation data

Raw data obtained from the CHARM microarrays (.xys files) can most easily be analyzed using the R/Bioconductor software environment (Gentleman et al., 2004), as normalization functions and a custom array annotation package have been made available.

There are three main components of CHARM array analysis. The first part of the analysis involves normalization. The basic measure of methylation is the log-ratio of intensities in the untreated and methyl-depleted channels. Normalization serves the dual purposes of setting the zero-methylation signal level within each array and removing the nonlinear dependence of the log-ratio (M) on the average signal intensity in the two channels (A). Loess normalization is widely used in two-color expression arrays to correct this M-A bias under the assumption that most genes are not differentially expressed (M=0) (Yang et al., 2002). However, this assumption is not appropriate when examining DNA methylation data, since many sites may be methylated and represented by positive M-values. A modified strategy involves fitting a loess regression through a subset of control probes known to represent unmethylated regions, and then applying this correction curve to all the probes on the array (Irizarry et al., 2008). The control probes are selected from CpG-free regions of the genome that are guaranteed to remain uncut by McrBC.
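The control-probe strategy described above can be sketched in a few lines of base R. This is a minimal illustration on simulated data, not the implementation used by the charm package: the simulated intensities, the use of base-2 logarithms, the loess span, and all variable names (ut, md, is_control) are assumptions made for the example.

```r
# Simulated two-channel data (illustrative only; not real CHARM intensities)
set.seed(1)
n_probes   <- 2000
is_control <- seq_len(n_probes) <= 400            # hypothetical unmethylated control probes
A_sim      <- runif(n_probes, 7, 14)              # average log2 intensity
true_M     <- ifelse(is_control, 0, rexp(n_probes, rate = 1))
bias       <- 0.05 * (A_sim - 10)^2 - 0.3         # artificial intensity-dependent bias
M_sim      <- true_M + bias + rnorm(n_probes, sd = 0.2)
ut <- 2^(A_sim + M_sim / 2)                       # untreated (Cy3) channel
md <- 2^(A_sim - M_sim / 2)                       # methyl-depleted (Cy5) channel

# Log-ratio (M) and average log-intensity (A) for every probe
M <- log2(ut) - log2(md)
A <- (log2(ut) + log2(md)) / 2

# Fit the loess curve through the control probes only.
# surface = "direct" lets predict() extrapolate beyond the control A range.
ctrl <- data.frame(M = M[is_control], A = A[is_control])
fit  <- loess(M ~ A, data = ctrl, span = 0.5,
              control = loess.control(surface = "direct"))

# Subtract the fitted curve from all probes, so that unmethylated
# sequence sits near M = 0 at every intensity level
M_norm <- M - predict(fit, newdata = data.frame(A = A))

summary(M_norm[is_control])    # centered near zero
summary(M_norm[!is_control])   # shifted toward positive values
```

Because the curve is estimated from probes assumed to be unmethylated, positive M-values at the remaining probes are preserved rather than being pulled toward zero, which is what a conventional all-probe loess would do.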

Following normalization, a moving window smoother is applied. The smoothed M-value at a given location is obtained by taking a weighted mean of probes within a prespecified distance of the location (Irizarry et al., 2008). A typical window size is 10 to 20 probes. This procedure significantly reduces the impact of probe-effect biases and noise on methylation estimates at the single-CpG level.
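A moving-window smoother of this kind might look like the following sketch. The function name smooth_m, the triangular weights, and the 300-bp half-window are hypothetical choices for illustration; the source specifies only that a weighted mean is taken over probes within a prespecified distance.

```r
# Weighted moving-window mean of M-values along one chromosome.
# pos: probe positions (bp); m: normalized M-values; window_bp: half-window size.
smooth_m <- function(pos, m, window_bp = 300) {
  vapply(seq_along(pos), function(i) {
    d    <- abs(pos - pos[i])
    keep <- d <= window_bp
    w    <- 1 - d[keep] / (window_bp + 1)   # triangular weights, always > 0
    sum(w * m[keep]) / sum(w)
  }, numeric(1))
}

# Simulated example: 200 probes spaced 35 bp apart with a step in methylation
set.seed(3)
pos      <- seq(1, by = 35, length.out = 200)
truth    <- ifelse(seq_along(pos) > 100, 2, 0)
m_noisy  <- truth + rnorm(length(pos), sd = 0.5)
m_smooth <- smooth_m(pos, m_noisy)

# Mean squared error before and after smoothing
c(raw = mean((m_noisy - truth)^2), smoothed = mean((m_smooth - truth)^2))
```

With 35-bp probe spacing, a 300-bp half-window covers about 17 probes, which falls within the typical range quoted above.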

Lastly, regions with differential methylation between phenotypes are identified. For each probe, the average M-value is computed for each phenotype. Differential methylation is quantified for each pairwise tissue comparison by the difference of averaged M-values (DM). Replicates are used to estimate the probe-specific standard deviation (S.D.), which provides the standard error (S.E.M.) of DM. Z scores (DM/S.E.M.) are calculated, and probes with statistically significant values (typically p < 0.001 or p < 0.005) are grouped into candidate differentially methylated regions (DMRs). A useful metric for ranking regions is the area, defined as the region length multiplied by the average DM. In experiments with at least six biological replicates per experimental group, it is possible to calculate a statistical significance in terms of the false discovery rate (FDR) associated with each DMR. Permutation p values are generated using a null distribution of DMR areas obtained by repeatedly assigning permuted group labels to samples and rerunning the DMR identification procedure. Permutation p values are then converted into q values (corresponding to FDRs) using the Bayesian procedure of Storey (2003).
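The per-probe statistics and the grouping step can be illustrated with a short base R sketch on simulated data. It is not the dmrFinder implementation. The group sizes, the planted region, the normal approximation for p values, the minimum run length of three probes, and the choice to measure region length in probes are all assumptions of this example.

```r
# Simulated smoothed M-values: 300 probes, 5 replicates per group
set.seed(2)
n_probes <- 300
n_rep    <- 5
pos <- seq(1000, by = 40, length.out = n_probes)
grp <- rep(c("brain", "liver"), each = n_rep)
M   <- matrix(rnorm(n_probes * 2 * n_rep, sd = 0.3), nrow = n_probes)
M[101:130, grp == "brain"] <- M[101:130, grp == "brain"] + 2   # planted DMR

# Difference of group means (DM) and its standard error
dm  <- rowMeans(M[, grp == "brain"]) - rowMeans(M[, grp == "liver"])
sd1 <- apply(M[, grp == "brain"], 1, sd)
sd2 <- apply(M[, grp == "liver"], 1, sd)
sem <- sqrt(sd1^2 / n_rep + sd2^2 / n_rep)

# Z scores and two-sided p values (normal approximation)
z   <- dm / sem
p   <- 2 * pnorm(-abs(z))
sig <- p < 0.001

# Group consecutive significant probes that share the same direction
runs   <- rle(sig * sign(dm))              # run values are -1, 0, or 1
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
keep   <- which(runs$values != 0 & runs$lengths >= 3)

regions <- data.frame(
  start    = pos[starts[keep]],
  end      = pos[ends[keep]],
  n_probes = runs$lengths[keep],
  mean_dm  = vapply(keep, function(k) mean(dm[starts[k]:ends[k]]), numeric(1))
)

# Area = region length (in probes here) x average DM; rank by area
regions$area <- regions$n_probes * abs(regions$mean_dm)
regions <- regions[order(-regions$area), ]
regions
```

Ranking by area favors regions that are both long and strongly different, so a short run of probes that reaches significance by chance ranks well below the planted region.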

Materials

NimbleGen Array User’s Guide downloaded from the Roche NimbleGen Web site (http://www.nimblegen.com/products/lit/methylation_userguide_v6p0.pdf)

NimbleScan Software User’s Guide downloaded from the Roche NimbleGen Web site (http://www.nimblegen.com/products/lit/NimbleScan_v2p5_UsersGuide.pdf)

Bioconductor software (http://www.bioconductor.org), with the packages:

oligo (http://bioconductor.org/packages/bioc/html/oligo.html)
qvalue (http://bioconductor.org/packages/bioc/html/qvalue.html)

CHARM microarray annotation package: pd.feinberg.hg18.me.hx1 (http://rafalab.jhsph.edu/software.html)

NimbleScan feature extraction software (Roche NimbleGen)

3.5 µg untreated DNA sample (obtained using Support Protocols 1, 2, and 3), analyzed by quantitative real-time PCR to evaluate the specificity of McrBC in the digestion reaction (Support Protocol 4)

3.5 µg methyl-depleted DNA sample (obtained using Support Protocols 1, 2, and 3), analyzed by quantitative real-time PCR to evaluate the specificity of McrBC in the digestion reaction (Support Protocol 4)

CHARM HD2 microarrays and supplies for HD2 array hybridization and processing (NimbleGen): see the NimbleGen Array User’s Guide at the URL above

R Software (http://www.r-project.org)

1. Download the NimbleGen Arrays User’s Guide for DNA methylation analysis (http://www.nimblegen.com/products/lit/methylation_userguide_v6p0.pdf) from the Roche NimbleGen Web site. This guide describes in detail all steps required for labeling, hybridizing, and scanning HD2 arrays. Follow the User’s Guide beginning at Chapter 1, page 3 (Components Supplied) up to Chapter 5, page 41, with the following exceptions:

a. The twelve sample tracking controls (STCs), described on pages 7 and 19 to 20, are not required for the CHARM assay.

b. The experimental (IP) sample described throughout the NimbleGen Arrays User’s Guide should be replaced by the methyl-depleted (MD) fraction for each sample. The control (input) sample described throughout the NimbleGen Arrays User’s Guide should be replaced by the untreated (UT) fraction for each sample.

c. The nonspecific binding (NSB) sample described in Chapter 3, page 13, of the array User’s Guide is not required for the CHARM assay.

Throughout the user’s guide, be sure to follow all detailed guidelines and recipes specified for the 2.1 M Arrays.

After completing all steps of the NimbleGen Arrays User’s Guide, pages 3 to 41, a .tif image is obtained for each array. These .tif images are used in the subsequent step by the NimbleGen software to generate X, Y and Signal reports (.xys files).

2. Download the NimbleScan Software User’s Guide (http://www.nimblegen.com/products/lit/NimbleScan_v2p5_UsersGuide.pdf) from the Roche NimbleGen Web site. Follow the detailed instructions for generating X, Y and Signal reports (.xys files) in Chapter 5 (Producing Reports), pages 58 to 63.

The X, Y and Signal reports provide the coordinates of each feature and its intensity on each CHARM array. Pages 110 and 111 of the NimbleScan Software User’s Guide provide a description and example of X, Y and Signal reports.

The results of this procedure are .xys data files generated for each CHARM HD2 array, including both Cy3 and Cy5 intensities, using the NimbleScan feature extraction software (Roche NimbleGen).

Analyze raw data (.xys files) obtained from the CHARM HD2 arrays

Also see http://www.biostat.jhsph.edu/~maryee/charm for additional software details and support.

3. Download and install R (http://cran.r-project.org) and Bioconductor (http://bioconductor.org).

Detailed installation instructions are available at the download Web sites.

4. Start R and install the charm and pd.feinberg.hg18.me.hx1 packages by typing the following at the R prompt:

> source("http://www.bioconductor.org/biocLite.R")

> repos <- c("http://R-Forge.R-project.org", biocinstallRepos())

> pkgs <- c("charm", "pd.feinberg.hg18.me.hx1")

> install.packages(pkgs=pkgs, repos=repos)

Lines beginning with > indicate commands to be entered at the R prompt. Help for all commands used in steps 4 to 9 can be accessed within R using the ? command. For example type ?qcReport at the R command prompt to obtain information about the qcReport command.

5. Prepare a tab-delimited sample description file (example shown in Fig. 20.1.2). This can be created in Microsoft Excel and saved using the Save As > Tab Delimited Text option. The sample description file should have one line per channel, i.e., two lines per sample corresponding to the cyanine-3 (532.xys) and cyanine-5 (635.xys) channel data files. Three columns are required: the .xys file name (denoted by filename in Fig. 20.1.2), a sample identifier (denoted by sampleID in Fig. 20.1.2), and a group label (denoted by tissue in Fig. 20.1.2). The names of the columns are arbitrary, but the commands in steps 7 and 9 refer to the filename and tissue columns by name and must be adjusted if different names are used.

Figure 20.1.2 Tab-delimited sample description file (example).
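As an alternative to a spreadsheet, the sample description file can be written directly from R. The sketch below assumes hypothetical sample names and a hypothetical file-naming pattern (sample name followed by _532.xys or _635.xys). Substitute the actual .xys file names produced by NimbleScan.

```r
# Hypothetical samples and group labels
sample_ids <- c("brain_1", "brain_2", "liver_1", "liver_2")
tissue     <- c("brain", "brain", "liver", "liver")

# One row per channel: two rows (532 and 635) for each sample
pd <- data.frame(
  filename = paste0(rep(sample_ids, each = 2), c("_532.xys", "_635.xys")),
  sampleID = rep(sample_ids, each = 2),
  tissue   = rep(tissue, each = 2),
  stringsAsFactors = FALSE
)

# Written to a temporary directory here; use the working directory in practice
out_file <- file.path(tempdir(), "sample_description_file.txt")
write.table(pd, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)

pd
```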

6. Read the tab-delimited sample description file created in step 5 (denoted here as sample_description_file.txt) into R by typing the following at the R prompt:

> pd <- read.delim("sample_description_file.txt")
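Before reading the raw data, it can be worth checking that the sample description has the expected layout. The helper below, check_sample_sheet, is a hypothetical function written for this example and is not part of the charm package. It assumes the column names used in Figure 20.1.2 and file names ending in 532.xys and 635.xys.

```r
# Returns a character vector of problems (empty if the sheet looks consistent)
check_sample_sheet <- function(pd, file_col = "filename",
                               id_col = "sampleID", group_col = "tissue") {
  missing_cols <- setdiff(c(file_col, id_col, group_col), names(pd))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }
  problems   <- character(0)
  per_sample <- split(pd, pd[[id_col]])
  for (id in names(per_sample)) {
    rows  <- per_sample[[id]]
    n_532 <- sum(grepl("532\\.xys$", rows[[file_col]]))
    n_635 <- sum(grepl("635\\.xys$", rows[[file_col]]))
    if (nrow(rows) != 2 || n_532 != 1 || n_635 != 1) {
      problems <- c(problems,
                    paste0(id, ": expected one 532.xys and one 635.xys file"))
    }
    if (length(unique(rows[[group_col]])) != 1) {
      problems <- c(problems, paste0(id, ": inconsistent group labels"))
    }
  }
  problems
}

# Example with a small, well-formed sheet (hypothetical file names)
example_pd <- data.frame(
  filename = c("s1_532.xys", "s1_635.xys", "s2_532.xys", "s2_635.xys"),
  sampleID = c("s1", "s1", "s2", "s2"),
  tissue   = c("brain", "brain", "liver", "liver"),
  stringsAsFactors = FALSE
)
check_sample_sheet(example_pd)   # character(0) when no problems are found
```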

7. Read raw .xys file data into R by typing the following at the R prompt:

> rawData <- readCharm(files=pd$filename, sampleKey=pd)

8. Generate a hybridization quality report to identify outlier arrays with poor signal quality or spatial artifacts by typing the following at the R prompt:

> qual <- qcReport(rawData, file="qcReport.pdf")

The report is saved in PDF format.

9. Normalize the data and find differentially methylated regions (DMRs) by typing the following at the R prompt:

> grp <- pData(rawData)$tissue

> p <- methp(rawData)

> dmr <- dmrFinder(rawData, p=p, groups=grp, compare=c("brain", "liver"))

> dmr <- dmrFdr(dmr)

> head(dmr$tabs[["brain-liver"]])

An example of the output (a table listing DMRs) generated using the commands used above is shown in Figure 20.1.3.

Figure 20.1.3 Table listing DMRs (example of output generated using commands in step 9).

The dmr object produced by the above commands contains a table of candidate DMRs (example shown in Figure 20.1.3). Each DMR is identified by chromosome and start and end positions (columns 1 to 3). The columns p1 and p2 contain the average percentage methylation in group 1 and group 2, respectively. The last column (qval) gives the false discovery rate q-value of the DMR. Note that the q-value is not a reliable indicator of statistical significance when there are fewer than approximately 5 biological replicates in either group.
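Once the table has been extracted, ordinary data-frame operations can be used to filter and rank candidate DMRs. The sketch below uses a small mock table rather than real output. The column names chr, start, and end, the scale of p1 and p2, and the q-value cutoff of 0.05 are assumptions for illustration; only p1, p2, and qval are named in the text above.

```r
# Mock DMR table standing in for dmr$tabs[["brain-liver"]]
dmr_table <- data.frame(
  chr   = c("chr1", "chr2", "chr7", "chrX"),
  start = c(1000, 5000, 20000, 300),
  end   = c(1800, 5600, 21500, 900),
  p1    = c(0.80, 0.30, 0.55, 0.10),
  p2    = c(0.20, 0.35, 0.15, 0.60),
  qval  = c(0.01, 0.40, 0.03, 0.04),
  stringsAsFactors = FALSE
)

# Difference in average methylation between the two groups
dmr_table$delta <- dmr_table$p1 - dmr_table$p2

# Keep DMRs below the (illustrative) q-value cutoff, largest differences first
hits <- dmr_table[dmr_table$qval < 0.05, ]
hits <- hits[order(-abs(hits$delta)), ]
hits
```

The sign of delta shows which group is more methylated in each region, which is often the first thing examined when candidate regions are followed up.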

See Anticipated Results for interpretation of this information.

SUPPORT PROTOCOL 1

FRACTIONATION OF GENOMIC DNA BY RANDOM SHEARING

In this protocol, we describe how to randomly shear 5 µg of high-quality, high-molecular-weight genomic DNA (gDNA) to fragments of 1.5 kb to 3.0 kb using a HydroShear device (DigiLab, http://www.digilabglobal.com/). The HydroShear uses hydrodynamic forces to randomly shear gDNA to a specified size range. More specifically, the device forces the gDNA solution through a 0.05-mm diameter orifice in a ruby located within the shearing assembly. As the solution passes through the orifice, its flow rate accelerates, forcing the gDNA to stretch and break (Oefner et al., 1996; Thorstenson et al., 1998). Following shearing, the DNA is split equally into two aliquots and is ready for methylation-dependent fractionation. Figure 20.1.4 provides an example of the expected distribution of DNA sizes after shearing has been performed.

Figure 20.1.4 The expected size distribution, from 1.5 kb to 3.0 kb, of genomic DNA after shearing with a HydroShear device. We ran 500 ng of DNA on a 1% agarose gel in 1× TAE buffer for 60 min at 110 volts. Lane 1 contains a 1 Kb Plus DNA Ladder with the upper and lower arrows denoting 3.0 kb and 1.65 kb, respectively. Lane 2 contains unsheared, high-quality, high-molecular-weight genomic DNA as a reference. Lane 3 contains genomic DNA that was sheared using a standard shearing assembly on a HydroShear device. The two straight horizontal lines denote the expected size distribution of DNA.

Materials

5 µg of high-quality, high-molecular-weight, genomic DNA in 1× TE buffer (APPENDIX 3B)

1× TE buffer, pH 7.4 (Quality Biological), filtered prior to use to avoid clogging of the shearing assembly

0.2 M hydrochloric acid (wash solution I), filtered prior to use to avoid clogging of the shearing assembly

0.2 M sodium hydroxide (wash solution II), filtered prior to use to avoid clogging of the shearing assembly

50-ml conical tubes (BD Falcon)

HydroShear device, equipped with a standard shearing assembly and syringe (DigiLab; http://www.digilabglobal.com/)

1. Allow genomic DNA to equilibrate at room temperature for 30 min and mix well by vortexing prior to shearing.

2. Prepare DNA for shearing by aliquoting the following into a 1.5-ml microcentrifuge tube:

5 µg of genomic DNA

1× TE buffer to 100 µl.
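The volumes for this mix depend on the concentration of the DNA stock. The small helper below, shear_mix, is a hypothetical convenience function written for this example. It assumes the stock concentration is known in ng/µl.

```r
# Volumes of DNA stock and 1x TE needed for 5 ug DNA in a 100-ul final volume
shear_mix <- function(stock_ng_per_ul, dna_ug = 5, final_ul = 100) {
  dna_ul <- dna_ug * 1000 / stock_ng_per_ul
  if (dna_ul > final_ul) {
    stop("Stock is too dilute to deliver the requested DNA in the final volume")
  }
  c(dna_ul = dna_ul, te_ul = final_ul - dna_ul)
}

shear_mix(250)   # 250 ng/ul stock: 20 ul DNA plus 80 ul 1x TE
```

Stocks below 50 ng/µl cannot deliver 5 µg in 100 µl, and the function stops with an error in that case.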

3. Place ~20-ml aliquots of wash solution I (0.2 M HCl), wash solution II (0.2 M NaOH), and 1× TE buffer from the filtered stocks into three separate new 50-ml conical tubes.

These 50-ml-tube aliquots are used to wash the HydroShear device by placing the input tubing on the HydroShear device into the 50-ml tube containing the appropriate solution.
