Hi-C Analysis (2017 NGS Hands-on Training Course, September 1, 2017)


Heisei 29 (FY2017) NGS Hands-on Training Course: Hi-C Analysis (September 1, 2017)

- Hi-C
  - Chromosome Conformation Capture
  - Hi-C
- Hi-C
  - Hi-C
  - TAD
  - 3D

- Fastq; Hi-C; 3D
- python; python.py; Hi-C; 3D

Bio-Linux-8.0.7_hm_kh.ova
~/HiC
  1_mapping_read_to_genome, 2_filtering_reads, 3_normalization, 4_convert_Juice, 5_detect_TADs, 6_modeling_3D: python
  Results
  data: fastq
  Index: Bowtie2
  ref: fasta
  src

Hi-C (ex. Hi-C)

NGS
- Reseq
- RNA-seq
- ChIP-seq
- ATAC-seq
- Hi-C
- iRep

Chromosome Conformation Capture (3C)
3C-based method; DNA
Dekker, Job, et al. "Capturing chromosome conformation." Science 295.5558 (2002): 1306-1311.

3C-based technologies
de Wit, Elzo, and Wouter de Laat. "A decade of 3C technologies: insights into nuclear organization." Genes & development 26.1 (2012): 11-24.

4C: Chromosome conformation capture-on-chip
viewpoint; PCR; NGS; 4C-seq
de Wit, Elzo, and Wouter de Laat. "A decade of 3C technologies: insights into nuclear organization." Genes & development 26.1 (2012): 11-24.

4C: Chromosome conformation capture-on-chip
4C-seq; Hi-C validation
Ke, Yuwen, et al. "3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis." Cell 170.2 (2017): 367-381.

Hi-C
vs. Forward, Reverse
Lieberman-Aiden, Erez, et al. "Comprehensive mapping of long-range interactions reveals folding principles of the human genome." Science 326.5950 (2009): 289-293.

Hi-C contact matrix: entry (i, j) corresponds to genomic regions i and j
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.
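A minimal sketch of how a contact matrix indexed by (i, j) can be built from mapped read pairs. The function name, the bin size and the toy positions are illustrative assumptions, not part of the course scripts.

```python
import numpy as np

def build_contact_matrix(pairs, chrom_length, bin_size):
    """Count read pairs per (bin i, bin j) on a single chromosome.

    pairs: iterable of (pos1, pos2), 0-based mapped positions of R1 and R2.
    Returns a symmetric matrix whose entry (i, j) is the number of
    read pairs linking bin i and bin j.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    matrix = np.zeros((n_bins, n_bins), dtype=np.int64)
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i, j] += 1
        if i != j:
            matrix[j, i] += 1  # keep the matrix symmetric
    return matrix

# Hypothetical toy data: three read pairs on a 300 kb chromosome, 100 kb bins.
example_pairs = [(100, 250_000), (120_000, 130_000), (260_000, 90)]
contacts = build_contact_matrix(example_pairs, chrom_length=300_000, bin_size=100_000)
```

Larger bins give more reads per entry at the cost of resolution, which is why the bin size is usually chosen according to sequencing depth.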

Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

[Figure labels: L1, L2, L3]
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

Topologically Associated Domains (TADs)
TAD
Akdemir, Kadir Caner, and Lynda Chin. "HiCPlotter integrates genomic data with interaction matrices." Genome biology 16.1 (2015): 198.

Topologically Associated Domains (TADs)
Ke, Yuwen, et al. "3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis." Cell 170.2 (2017): 367-381.

Topologically Associated Domains (TADs)
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.
Fudenberg, Geoffrey, et al. "Formation of chromosomal domains by loop extrusion." Cell reports 15.9 (2016): 2038-2049.

Nagano, Takashi, et al. "Cell-cycle dynamics of chromosomal organization at single-cell resolution." Nature 547 (2017): 61-67.

Hi-C
3. meta3C
Marbouty, Martial, et al. "Metagenomic chromosome conformation capture (meta3C) unveils the diversity of chromosome organization in microorganisms." Elife 3 (2014): e03318.

Hi-C
- Bin; 10; 100

Hi-C (excl. single cell Hi-C)
- 3D
O'Sullivan, Justin M., et al. "The statistical-mechanics of chromosome conformation capture." Nucleus 4.5 (2013): 390-398.

Hi-C (excl. single cell Hi-C)
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

Hi-C
- 3C; 1; 1
- Genome Architecture Mapping
Beagrie, Robert A., et al. "Complex multi-enhancer contacts captured by genome architecture mapping." Nature 543.7646 (2017): 519-524.


Hi-C analysis tools by step
Read preprocessing: Trimmomatic, cutadapt, FastQC
Mapping: Bowtie2, BWA, Juicer, hiclib, HiCUP, HIPPIE
Filtering: Juicer, hiclib, HiCUP, HIPPIE, HOMER
Normalization: Juicer, hiclib, HIPPIE, HOMER
Significant interaction detection: Fit-Hi-C, GOTHiC, HOMER, HIPPIE, HiCCUPS
TAD detection: HiCseg, TADbit, Arrowhead, TADtree, Armatus
3D modeling: ChromSDE, ShRec3D, PASTIS


Results; mv

In situ Hi-C
Kilobase Hi-C; 100; fastq; 100 GB
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

~/HiC/data
Rao, et al. 2014; Human B-Lymphocyte: GM12878; 1000
$ cd ~/HiC/data
$ ls -l
R1, R2 fastq

~/HiC/Ref
hg19 FASTA
http://hgdownload.cse.ucsc.edu/downloads.html

~/HiC/Index
Bowtie2 index built with bowtie2-build from the hg19 reference in ~/HiC/Ref

Hi-C: Trimmomatic; TAD; 3D

Hi-C
$ cd ~/HiC/1_mapping_read_to_genome

Illumina
http://assets.illumina.com/content/dam/illuminamarketing/images/technology/paired-end-sequencing-figure.gif

Hi-C
Lieberman-Aiden, Erez, et al. "Comprehensive mapping of long-range interactions reveals folding principles of the human genome." Science 326.5950 (2009): 289-293.

Hi-C
R1, R2; R1; R2

Hi-C
1.
2. => R1, R2
Imakaev, Maxim, et al. "Iterative correction of Hi-C data reveals hallmarks of chromosome organization." Nature methods 9.10 (2012): 999-1003.

1. R1, R2; R1, R2; 3
   I. R1, R2
   II. a. LocusA, LocusB; LocusB; Locus A, B
       b.
   III.
2. Iterative alignment method

Iterative alignment method
Imakaev, Maxim, et al. "Iterative correction of Hi-C data reveals hallmarks of chromosome organization." Nature methods 9.10 (2012): 999-1003.
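The idea of iterative alignment is to map a short prefix of each read first and to extend the prefix only when it does not map uniquely. The sketch below illustrates that loop under stated assumptions: `align_unique` is a hypothetical callback standing in for a real aligner such as Bowtie2, the toy genome is made up, and the `min_len` and `step` values are illustrative rather than the settings used in mapping.py.

```python
def iterative_align(read, align_unique, min_len=25, step=5):
    """Map a read using progressively longer prefixes.

    align_unique(prefix) is a hypothetical callback that returns a mapped
    position when the prefix aligns uniquely, otherwise None.
    Returns (position, prefix_length_used), or (None, len(read)) when the
    read is still not uniquely mappable at full length.
    """
    length = min(min_len, len(read))
    while True:
        hit = align_unique(read[:length])
        if hit is not None:
            return hit, length
        if length == len(read):
            return None, length
        length = min(length + step, len(read))


# Toy "genome" with a repeated segment, standing in for a real index.
toy_genome = "ACGTACGTTTGACCAGTACGTACGTTTGACGGA"

def toy_align_unique(prefix):
    first = toy_genome.find(prefix)
    if first == -1 or toy_genome.find(prefix, first + 1) != -1:
        return None  # unmapped or multi-mapped
    return first

position, used = iterative_align("ACGTACGTTTGACG", toy_align_unique, min_len=8, step=2)
```

In the toy example the first 12 bases occur twice in the genome, so the read becomes unique only once the full 14 bases are used. Stopping at the shortest unique prefix reduces the chance that the read runs across a ligation junction.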

$ less mapping.py
R1; R2

bp; 35 bp

$ python mapping.py
Bowtie2

$ ls -l ../data

$ less parse_results.py
HDF5; Biopython Restriction

$ python parse_results.py
$ ls -l
HDF5; HDFView; python; HDF5

Hi-C
$ cd ~/HiC/2_filtering_reads

R1, R2; Hi-C
Imakaev, Maxim, et al. "Iterative correction of Hi-C data reveals hallmarks of chromosome organization." Nature methods 9.10 (2012): 999-1003.

$ less filtering.py
maximumMoleculeLength: 400 bp
HDF5

$ less filtering.py
filterRsiteStart(): DNA
filterDuplicates(): PCR duplicate
filterLarge(): 10^5 bp
filterExtreme(): 0.5%
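As an illustration of the PCR duplicate filter, the sketch below keeps one read pair per unique pair of mapped ends. It is a simplified stand-in written for this transcript, not the hiclib implementation; the tuple layout and the example reads are assumptions.

```python
def filter_duplicates(pairs):
    """Keep one read pair per unique pair of mapped ends.

    pairs: list of (chrom1, pos1, strand1, chrom2, pos2, strand2) tuples.
    Two pairs count as duplicates when both ends coincide, regardless of
    which end was sequenced as R1.
    """
    seen = set()
    kept = []
    for pair in pairs:
        end1, end2 = pair[:3], pair[3:]
        key = (end1, end2) if end1 <= end2 else (end2, end1)  # order-independent
        if key not in seen:
            seen.add(key)
            kept.append(pair)
    return kept

# Hypothetical mapped read pairs on chr19.
reads = [
    ("chr19", 1000, "+", "chr19", 50000, "-"),
    ("chr19", 1000, "+", "chr19", 50000, "-"),   # exact duplicate
    ("chr19", 50000, "-", "chr19", 1000, "+"),   # same ends, R1/R2 swapped
    ("chr19", 2000, "+", "chr19", 70000, "-"),
]
unique_reads = filter_duplicates(reads)
```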

$ less filtering.py
1 Mbp Bin; raw read count; 1 Mbp; 3,000; 3,000; Bin; 90%; 80%; 1000

$ python filtering.py
$ ls -l

$ less ./statistics.txt

Hi-C
$ cd ~/HiC/3_normalization

Hi-C biases
I. Ligation
II. GC
III. Mappability
ChIP-seq: INPUT; RNA-seq: 1; Hi-C

Hi-C normalization
1. Explicit: GC; Yaffe and Tanay 2011; HiCNorm
2. Implicit: Vanilla coverage; ICE; Knight and Ruiz 2012

[Figure: Raw heatmap; Normalized heatmap; Raw coverage; Corrected coverage]

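Vanilla coverage normalization: for row k and column l of the contact matrix A, the entry is scaled by the inverse of the row coverage and the inverse of the column coverage.

$$A^{\mathrm{VC}}_{kl} = \frac{1}{\sum_i A_{ki}} \, A_{kl} \, \frac{1}{\sum_i A_{il}}$$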

Vanilla coverage normalization
i, j; i, j; GC; implicit bias; Explicit
Explicit; Implicit
Imakaev, Maxim, et al. "Iterative correction of Hi-C data reveals hallmarks of chromosome organization." Nature methods 9.10 (2012): 999-1003.
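Vanilla coverage normalization divides each entry of the contact matrix by the coverage (sum) of its row and of its column. A minimal NumPy sketch follows; the function name and the toy matrix are illustrative assumptions.

```python
import numpy as np

def vanilla_coverage(matrix):
    """Divide each entry A[k, l] by the coverage of row k and of column l."""
    matrix = np.asarray(matrix, dtype=float)
    row_sums = matrix.sum(axis=1)
    col_sums = matrix.sum(axis=0)
    # Bins with zero coverage are left at zero instead of dividing by zero.
    row_scale = np.divide(1.0, row_sums, out=np.zeros_like(row_sums), where=row_sums > 0)
    col_scale = np.divide(1.0, col_sums, out=np.zeros_like(col_sums), where=col_sums > 0)
    return matrix * row_scale[:, None] * col_scale[None, :]

# Toy symmetric contact matrix (made-up counts).
raw = np.array([[4.0, 2.0, 2.0],
                [2.0, 6.0, 0.0],
                [2.0, 0.0, 2.0]])
normalized = vanilla_coverage(raw)
```

A single pass like this does not make all row sums equal, which is the motivation for repeating the correction.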

Iterative correction (ICE method)
Vanilla coverage normalization ⇒ repeated Vanilla coverage normalization: matrix balancing
ICE; matrix balancing; Knight and Ruiz 2012
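Iterative correction repeats a coverage-based rescaling until every bin has the same total coverage, which is a form of matrix balancing. The sketch below is an assumption-laden illustration rather than the published ICE code or what normalize.py runs: it applies the square root of the relative coverage at each step, a damped update chosen here because it converges reliably on small symmetric toy matrices.

```python
import numpy as np

def iterative_correction(matrix, max_iter=200, tol=1e-10):
    """Balance a symmetric contact matrix so non-empty bins share equal coverage.

    Returns (corrected, bias) such that matrix is approximately
    corrected * outer(bias, bias).
    """
    corrected = np.asarray(matrix, dtype=float).copy()
    bias = np.ones(corrected.shape[0])
    for _ in range(max_iter):
        coverage = corrected.sum(axis=1)
        nonzero = coverage > 0
        delta = np.ones_like(coverage)
        delta[nonzero] = coverage[nonzero] / coverage[nonzero].mean()
        scale = np.sqrt(delta)  # assumption: damped step for stable convergence
        corrected /= np.outer(scale, scale)
        bias *= scale
        if np.abs(delta - 1.0).max() < tol:
            break
    return corrected, bias

# Toy symmetric contact matrix (made-up counts).
raw_counts = np.array([[10.0, 4.0, 2.0],
                       [4.0, 8.0, 6.0],
                       [2.0, 6.0, 20.0]])
balanced, bias = iterative_correction(raw_counts)
```

The returned `bias` vector plays the role of the per-bin implicit bias: multiplying the balanced matrix by the outer product of the biases recovers the raw counts.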

$ less normalize.py
Raw read count; Bin; ICE

$ python normalize.py
heatmap.pdf

TAD; 3D; chr19
$ python submatrix.py
$ less norm_mat.txt

JuiceBox
$ cd ~/HiC/4_convert_Juice
$ less convert_to_juicetext.py
$ python convert_to_juicetext.py
$ less ./forjuice.txt
$ ./convert_to_juicehic.sh
test.hic; Juice

JuiceBox
$ ./execute_juicebox.sh
File => Open => Local: test.hic
Chromosomes; Annotations; ENCODE

Hi-C: TAD; 3D

[Figure labels: L1, L2, L3]
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

Forcato, Mattia, et al. "Comparison of computational methods for Hi-C data analysis." Nature methods 14.7 (2017): 679.

Fit-Hi-C (Global background)
ICE; p-value
Ay, Ferhat, Timothy L. Bailey, and William Stafford Noble. "Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts." Genome research 24.6 (2014): 999-1011.

HiCCUPS (Local background)
K&R; p-value
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.

Hi-C: TAD; 3D
$ cd ~/HiC/5_detect_TADs

Topologically Associated Domains (TADs)
TAD
Forcato, Mattia, et al. "Comparison of computational methods for Hi-C data analysis." Nature methods 14.7 (2017): 679.

TADtree
TAD, sub-TAD; Python
Weinreb, Caleb, and Benjamin J. Raphael. "Identification of hierarchical chromatin domains." Bioinformatics 32.11 (2016): 1601-1609.

TADtree
TAD
Weinreb, Caleb, and Benjamin J. Raphael. "Identification of hierarchical chromatin domains." Bioinformatics 32.11 (2016): 1601-1609.

$ less ./control_file.txt
TAD; Bin; TAD

TADtree
$ python TADtree.py ./control_file.txt
./output/chr19: N; TAD; proportion_duplicates.txt; TAD; BED; Bin

TAD; DNA; TAD
Forcato, Mattia, et al. "Comparison of computational methods for Hi-C data analysis." Nature methods 14.7 (2017): 679.
Ke, Yuwen, et al. "3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis." Cell 170.2 (2017): 367-381.

Hi-C: TAD; 3D
$ cd ~/HiC/6_modeling_3D

3D; Hi-C
Serra, et al. 2015
1.
2. a. b.

3D
1. https://ja.wikipedia.org/wiki/
2.

$$D_{i,j} = \frac{1}{A_{i,j}^{\alpha}}, \qquad \alpha = 1$$

When $A_{i,j} = 0$, $D_{i,j}$ cannot be computed directly ⇒ Shortest-path reconstruction
ShRec3D (Lesne, et al. 2014); MATLAB

Shortest-path reconstruction
Bin

Shortest-path reconstruction
Floyd-Warshall
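A minimal sketch of shortest-path reconstruction: contacts are converted to distances by taking the inverse of the contact count (raised to a power alpha), bin pairs without contacts start at infinity, and the Floyd-Warshall algorithm fills them in with the shortest path through intermediate bins. This version uses NumPy only; the function names and the toy matrix are assumptions, and the course script itself refers to NetworkX.

```python
import numpy as np

def contacts_to_distances(contacts, alpha=1.0):
    """Convert contact counts to distances: D = 1 / A**alpha, inf where A == 0."""
    contacts = np.asarray(contacts, dtype=float)
    dist = np.full(contacts.shape, np.inf)
    observed = contacts > 0
    dist[observed] = 1.0 / contacts[observed] ** alpha
    np.fill_diagonal(dist, 0.0)
    return dist

def floyd_warshall(dist):
    """All-pairs shortest paths on a dense distance matrix."""
    dist = dist.copy()
    for k in range(dist.shape[0]):
        # Allow paths that pass through bin k.
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist

# Toy contact matrix: bins 0 and 2 have no direct contact.
toy_contacts = np.array([[0.0, 4.0, 0.0],
                         [4.0, 0.0, 2.0],
                         [0.0, 2.0, 0.0]])
shortest = floyd_warshall(contacts_to_distances(toy_contacts))
```

Here the missing distance between bins 0 and 2 is replaced by the path through bin 1, 1/4 + 1/2.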

$ less ./convert_contact_to_distance.py
Python; NetworkX; chr19
$ python ./convert_contact_to_distance.py ../3_normalization/norm_mat.txt
dist.npy

3D modeling
Multi-dimensional scaling (MDS); cf. PCoA in 16S analysis
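A minimal sketch of classical MDS (the same computation as PCoA), which turns a distance matrix into 3D coordinates through double centering and an eigendecomposition. It is an illustration under stated assumptions: the function name and the toy points are made up, and modeling_3d.py may use a different MDS implementation.

```python
import numpy as np

def classical_mds(dist, n_dims=3):
    """Embed points in n_dims dimensions from a matrix of pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]  # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against small negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

# Toy check: distances computed from known 3D points.
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 3.0],
                   [1.0, 1.0, 1.0]])
toy_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(toy_dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The recovered coordinates match the original points only up to rotation, reflection and translation, so the comparison is made on pairwise distances.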

3D modeling
$ less ./modeling_3d.py
dist.npy; MDS
$ python modeling_3d.py

Dekker, Job, et al. "Capturing chromosome conformation." Science 295.5558 (2002): 1306-1311.
de Wit, Elzo, and Wouter de Laat. "A decade of 3C technologies: insights into nuclear organization." Genes & development 26.1 (2012): 11-24.
Ke, Yuwen, et al. "3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis." Cell 170.2 (2017): 367-381.
Lieberman-Aiden, Erez, et al. "Comprehensive mapping of long-range interactions reveals folding principles of the human genome." Science 326.5950 (2009): 289-293.
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.
Akdemir, Kadir Caner, and Lynda Chin. "HiCPlotter integrates genomic data with interaction matrices." Genome biology 16.1 (2015): 198.
Fudenberg, Geoffrey, et al. "Formation of chromosomal domains by loop extrusion." Cell reports 15.9 (2016): 2038-2049.
Nagano, Takashi, et al. "Cell-cycle dynamics of chromosomal organization at single-cell resolution." Nature 547 (2017): 61-67.
Marbouty, Martial, et al. "Metagenomic chromosome conformation capture (meta3C) unveils the diversity of chromosome organization in microorganisms." Elife 3 (2014): e03318.
O'Sullivan, Justin M., et al. "The statistical-mechanics of chromosome conformation capture." Nucleus 4.5 (2013): 390-398.
Beagrie, Robert A., et al. "Complex multi-enhancer contacts captured by genome architecture mapping." Nature 543.7646 (2017): 519-524.
Imakaev, Maxim, et al. "Iterative correction of Hi-C data reveals hallmarks of chromosome organization." Nature methods 9.10 (2012): 999-1003.
Yaffe, Eitan, and Amos Tanay. "Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture." Nature genetics 43.11 (2011): 1059-1065.
Hu, Ming, et al. "HiCNorm: removing biases in Hi-C data via Poisson regression." Bioinformatics 28.23 (2012): 3131-3133.
Knight, Philip A., and Daniel Ruiz. "A fast algorithm for matrix balancing." IMA Journal of Numerical Analysis 33.3 (2013): 1029-1047.
Forcato, Mattia, et al. "Comparison of computational methods for Hi-C data analysis." Nature methods 14.7 (2017): 679.
Ay, Ferhat, Timothy L. Bailey, and William Stafford Noble. "Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts." Genome research 24.6 (2014): 999-1011.
Weinreb, Caleb, and Benjamin J. Raphael. "Identification of hierarchical chromatin domains." Bioinformatics 32.11 (2016): 1601-1609.
Serra, François, et al. "Restraint-based three-dimensional modeling of genomes and genomic domains." FEBS letters 589.20PartA (2015): 2987-2995.
Lesne, Annick, et al. "3D genome reconstruction from chromosomal contacts." Nature methods 11.11 (2014): 1141-1143.