US20200102616A1 - COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5 HYDROXYMETHYLCYTOSINE (5-hmC)
- Publication number
- US20200102616A1 (application US16/713,657)
- Authority
- US
- United States
- Prior art keywords
- sensitive
- hmc
- nucleic acid
- dna
- modified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 149
- RYVNIFSIEDRLSJ-UHFFFAOYSA-N 5-(hydroxymethyl)cytosine Chemical compound NC=1NC(=O)N=CC=1CO RYVNIFSIEDRLSJ-UHFFFAOYSA-N 0.000 title claims description 89
- 239000000203 mixture Substances 0.000 title abstract description 37
- 230000004048 modification Effects 0.000 title description 72
- 238000012986 modification Methods 0.000 title description 72
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 191
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 191
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 191
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 claims description 74
- 229960002685 biotin Drugs 0.000 claims description 38
- 239000011616 biotin Substances 0.000 claims description 38
- 235000020958 biotin Nutrition 0.000 claims description 37
- 238000012163 sequencing technique Methods 0.000 claims description 37
- 238000003556 assay Methods 0.000 claims description 32
- HSCJRCZFDFQWRP-RDKQLNKOSA-N UDP-D-glucose Chemical class O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)OC1OP(O)(=O)OP(O)(=O)OC[C@@H]1[C@@H](O)[C@@H](O)[C@H](N2C(NC(=O)C=C2)=O)O1 HSCJRCZFDFQWRP-RDKQLNKOSA-N 0.000 claims description 22
- 239000000872 buffer Substances 0.000 claims description 21
- 230000002285 radioactive effect Effects 0.000 claims description 10
- 238000009396 hybridization Methods 0.000 claims description 9
- 238000002955 isolation Methods 0.000 claims description 8
- 230000000903 blocking effect Effects 0.000 claims description 7
- 238000002493 microarray Methods 0.000 claims description 7
- 230000002255 enzymatic effect Effects 0.000 claims description 5
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 claims description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical class NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 abstract description 33
- 238000013507 mapping Methods 0.000 abstract description 12
- 108020004414 DNA Proteins 0.000 description 251
- 108090000623 proteins and genes Proteins 0.000 description 138
- 102000004169 proteins and genes Human genes 0.000 description 107
- 235000018102 proteins Nutrition 0.000 description 105
- 238000007069 methylation reaction Methods 0.000 description 77
- 230000011987 methylation Effects 0.000 description 76
- HSCJRCZFDFQWRP-JZMIEXBBSA-N UDP-alpha-D-glucose Chemical class O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1OP(O)(=O)OP(O)(=O)OC[C@@H]1[C@@H](O)[C@@H](O)[C@H](N2C(NC(=O)C=C2)=O)O1 HSCJRCZFDFQWRP-JZMIEXBBSA-N 0.000 description 72
- 239000012634 fragment Substances 0.000 description 59
- 108091008146 restriction endonucleases Proteins 0.000 description 55
- 238000006243 chemical reaction Methods 0.000 description 52
- 239000000523 sample Substances 0.000 description 52
- 210000004027 cell Anatomy 0.000 description 49
- HSCJRCZFDFQWRP-UHFFFAOYSA-N Uridindiphosphoglukose Natural products OC1C(O)C(O)C(CO)OC1OP(O)(=O)OP(O)(=O)OCC1C(O)C(O)C(N2C(NC(=O)C=C2)=O)O1 HSCJRCZFDFQWRP-UHFFFAOYSA-N 0.000 description 48
- SBHSUMUTJOPRIK-HPFNVAMJSA-N 5-(beta-D-glucosylmethyl)cytosine Chemical compound NC1=NC(=O)NC=C1CO[C@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 SBHSUMUTJOPRIK-HPFNVAMJSA-N 0.000 description 46
- 102000004190 Enzymes Human genes 0.000 description 41
- 108090000790 Enzymes Proteins 0.000 description 41
- 210000001638 cerebellum Anatomy 0.000 description 37
- 238000002372 labelling Methods 0.000 description 37
- 241000699666 Mus <mouse, genus> Species 0.000 description 33
- 238000004458 analytical method Methods 0.000 description 30
- 125000002791 glucosyl group Chemical class C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 29
- 239000000243 solution Substances 0.000 description 29
- 230000029087 digestion Effects 0.000 description 28
- 239000008103 glucose Substances 0.000 description 28
- 238000006206 glycosylation reaction Methods 0.000 description 28
- 230000001771 impaired effect Effects 0.000 description 27
- 230000013595 glycosylation Effects 0.000 description 26
- 125000003729 nucleotide group Chemical group 0.000 description 26
- 235000001727 glucose Nutrition 0.000 description 25
- 238000000746 purification Methods 0.000 description 25
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 23
- 150000001875 compounds Chemical class 0.000 description 22
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 21
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 21
- 239000002773 nucleotide Substances 0.000 description 21
- 238000011282 treatment Methods 0.000 description 21
- 230000002441 reversible effect Effects 0.000 description 18
- 238000004128 high performance liquid chromatography Methods 0.000 description 17
- 239000000126 substance Substances 0.000 description 17
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 16
- 230000000875 corresponding effect Effects 0.000 description 16
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 16
- 108091029523 CpG island Proteins 0.000 description 15
- LSNNMFCWUKXFEE-UHFFFAOYSA-M Bisulfite Chemical compound OS([O-])=O LSNNMFCWUKXFEE-UHFFFAOYSA-M 0.000 description 14
- JUJWROOIHBZHMG-UHFFFAOYSA-N Pyridine Chemical compound C1=CC=NC=C1 JUJWROOIHBZHMG-UHFFFAOYSA-N 0.000 description 14
- 238000009826 distribution Methods 0.000 description 14
- 230000014509 gene expression Effects 0.000 description 14
- 239000002245 particle Substances 0.000 description 14
- 108010090804 Streptavidin Proteins 0.000 description 13
- 239000012472 biological sample Substances 0.000 description 13
- 238000001514 detection method Methods 0.000 description 13
- 238000001976 enzyme digestion Methods 0.000 description 13
- 239000000499 gel Substances 0.000 description 13
- 239000000758 substrate Substances 0.000 description 13
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 12
- YMWUJEATGCHHMB-UHFFFAOYSA-N Dichloromethane Chemical compound ClCCl YMWUJEATGCHHMB-UHFFFAOYSA-N 0.000 description 12
- ZMANZCXQSJIPKH-UHFFFAOYSA-N Triethylamine Chemical compound CCN(CC)CC ZMANZCXQSJIPKH-UHFFFAOYSA-N 0.000 description 12
- 150000001540 azides Chemical class 0.000 description 12
- -1 column Substances 0.000 description 12
- 229940104302 cytosine Drugs 0.000 description 12
- 238000011002 quantification Methods 0.000 description 12
- 230000015572 biosynthetic process Effects 0.000 description 11
- 239000003153 chemical reaction reagent Substances 0.000 description 11
- 238000003776 cleavage reaction Methods 0.000 description 11
- 238000012165 high-throughput sequencing Methods 0.000 description 11
- 230000003647 oxidation Effects 0.000 description 11
- 238000007254 oxidation reaction Methods 0.000 description 11
- 230000008569 process Effects 0.000 description 11
- 230000007017 scission Effects 0.000 description 11
- 238000003786 synthesis reaction Methods 0.000 description 11
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 10
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 10
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 10
- 101000903725 Enterobacteria phage T4 DNA beta-glucosyltransferase Proteins 0.000 description 10
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 10
- WFDIJRYMOXRFFG-UHFFFAOYSA-N acetic acid anhydride Natural products CC(=O)OC(C)=O WFDIJRYMOXRFFG-UHFFFAOYSA-N 0.000 description 10
- 238000013459 approach Methods 0.000 description 10
- 230000000694 effects Effects 0.000 description 10
- 125000000524 functional group Chemical group 0.000 description 10
- 238000007031 hydroxymethylation reaction Methods 0.000 description 10
- 239000002904 solvent Substances 0.000 description 10
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 9
- 102000053602 DNA Human genes 0.000 description 9
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 9
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 9
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 9
- 108700009124 Transcription Initiation Site Proteins 0.000 description 9
- 230000027455 binding Effects 0.000 description 9
- 239000003795 chemical substances by application Substances 0.000 description 9
- 239000010949 copper Substances 0.000 description 9
- 238000012350 deep sequencing Methods 0.000 description 9
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 9
- 238000004896 high resolution mass spectrometry Methods 0.000 description 9
- 238000001742 protein purification Methods 0.000 description 9
- 210000001519 tissue Anatomy 0.000 description 9
- 238000012546 transfer Methods 0.000 description 9
- 101000653374 Homo sapiens Methylcytosine dioxygenase TET2 Proteins 0.000 description 8
- 102100030803 Methylcytosine dioxygenase TET2 Human genes 0.000 description 8
- 230000008901 benefit Effects 0.000 description 8
- 230000006870 function Effects 0.000 description 8
- 239000012528 membrane Substances 0.000 description 8
- 239000012071 phase Substances 0.000 description 8
- 238000006116 polymerization reaction Methods 0.000 description 8
- 239000011347 resin Substances 0.000 description 8
- 229920005989 resin Polymers 0.000 description 8
- 239000011780 sodium chloride Substances 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- 101000653360 Homo sapiens Methylcytosine dioxygenase TET1 Proteins 0.000 description 7
- 102100030819 Methylcytosine dioxygenase TET1 Human genes 0.000 description 7
- 241000699670 Mus sp. Species 0.000 description 7
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 7
- 150000001345 alkine derivatives Chemical class 0.000 description 7
- 239000011324 bead Substances 0.000 description 7
- 238000005119 centrifugation Methods 0.000 description 7
- 238000004587 chromatography analysis Methods 0.000 description 7
- 201000010902 chronic myelomonocytic leukemia Diseases 0.000 description 7
- 201000010099 disease Diseases 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 239000007788 liquid Substances 0.000 description 7
- 238000002156 mixing Methods 0.000 description 7
- 239000003504 photosensitizing agent Substances 0.000 description 7
- 238000002360 preparation method Methods 0.000 description 7
- UMJSCPRVCHMLSP-UHFFFAOYSA-N pyridine Natural products COC1=CC=CN=C1 UMJSCPRVCHMLSP-UHFFFAOYSA-N 0.000 description 7
- 238000000926 separation method Methods 0.000 description 7
- 238000012360 testing method Methods 0.000 description 7
- 238000005160 1H NMR spectroscopy Methods 0.000 description 6
- 230000007067 DNA methylation Effects 0.000 description 6
- 241000588724 Escherichia coli Species 0.000 description 6
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 6
- 102100036263 Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Human genes 0.000 description 6
- 101001001786 Homo sapiens Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Proteins 0.000 description 6
- 102000004856 Lectins Human genes 0.000 description 6
- 108090001090 Lectins Proteins 0.000 description 6
- 208000033833 Myelomonocytic Chronic Leukemia Diseases 0.000 description 6
- ZMXDDKWLCZADIW-UHFFFAOYSA-N N,N-Dimethylformamide Chemical compound CN(C)C=O ZMXDDKWLCZADIW-UHFFFAOYSA-N 0.000 description 6
- 206010028980 Neoplasm Diseases 0.000 description 6
- 108091028043 Nucleic acid sequence Proteins 0.000 description 6
- WYURNTSHIVDZCO-UHFFFAOYSA-N Tetrahydrofuran Chemical compound C1CCOC1 WYURNTSHIVDZCO-UHFFFAOYSA-N 0.000 description 6
- YXFVVABEGXRONW-UHFFFAOYSA-N Toluene Chemical compound CC1=CC=CC=C1 YXFVVABEGXRONW-UHFFFAOYSA-N 0.000 description 6
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 6
- 238000011161 development Methods 0.000 description 6
- 230000018109 developmental process Effects 0.000 description 6
- 238000000605 extraction Methods 0.000 description 6
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 6
- 239000002523 lectin Substances 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 6
- 210000001178 neural stem cell Anatomy 0.000 description 6
- 230000037361 pathway Effects 0.000 description 6
- 238000003753 real-time PCR Methods 0.000 description 6
- 230000002829 reductive effect Effects 0.000 description 6
- 150000003573 thiols Chemical class 0.000 description 6
- 108090001008 Avidin Proteins 0.000 description 5
- 108090000288 Glycoproteins Proteins 0.000 description 5
- 102000003886 Glycoproteins Human genes 0.000 description 5
- PEEHTFAAVSWFBL-UHFFFAOYSA-N Maleimide Chemical compound O=C1NC(=O)C=C1 PEEHTFAAVSWFBL-UHFFFAOYSA-N 0.000 description 5
- 239000002202 Polyethylene glycol Substances 0.000 description 5
- 108020004682 Single-Stranded DNA Proteins 0.000 description 5
- 238000001261 affinity purification Methods 0.000 description 5
- 210000000349 chromosome Anatomy 0.000 description 5
- 238000012650 click reaction Methods 0.000 description 5
- 229940125904 compound 1 Drugs 0.000 description 5
- 230000007423 decrease Effects 0.000 description 5
- 230000004069 differentiation Effects 0.000 description 5
- 210000001671 embryonic stem cell Anatomy 0.000 description 5
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical class O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 5
- 238000001114 immunoprecipitation Methods 0.000 description 5
- 150000002500 ions Chemical group 0.000 description 5
- 239000003446 ligand Substances 0.000 description 5
- 229910001629 magnesium chloride Inorganic materials 0.000 description 5
- 230000036961 partial effect Effects 0.000 description 5
- 229920001223 polyethylene glycol Polymers 0.000 description 5
- 238000001556 precipitation Methods 0.000 description 5
- 230000035945 sensitivity Effects 0.000 description 5
- 150000003852 triazoles Chemical class 0.000 description 5
- YABZBTUZPWUEKP-BTVCFUMJSA-N (2r,3s,4r,5r)-2,3,4,5,6-pentahydroxyhexanal;azide Chemical compound [N-]=[N+]=[N-].OC[C@@H](O)[C@@H](O)[C@H](O)[C@@H](O)C=O YABZBTUZPWUEKP-BTVCFUMJSA-N 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 238000001644 13C nuclear magnetic resonance spectroscopy Methods 0.000 description 4
- FWBHETKCLVMNFS-UHFFFAOYSA-N 4',6-Diamino-2-phenylindol Chemical compound C1=CC(C(=N)N)=CC=C1C1=CC2=CC=C(C(N)=N)C=C2N1 FWBHETKCLVMNFS-UHFFFAOYSA-N 0.000 description 4
- 108091029430 CpG site Proteins 0.000 description 4
- OKKJLVBELUTLKV-MZCSYVLQSA-N Deuterated methanol Chemical compound [2H]OC([2H])([2H])[2H] OKKJLVBELUTLKV-MZCSYVLQSA-N 0.000 description 4
- 108060004795 Methyltransferase Proteins 0.000 description 4
- 102000016397 Methyltransferase Human genes 0.000 description 4
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 4
- NQRYJNQNLNOLGT-UHFFFAOYSA-N Piperidine Chemical compound C1CCNCC1 NQRYJNQNLNOLGT-UHFFFAOYSA-N 0.000 description 4
- PXIPVTKHYLBLMZ-UHFFFAOYSA-N Sodium azide Chemical compound [Na+].[N-]=[N+]=[N-] PXIPVTKHYLBLMZ-UHFFFAOYSA-N 0.000 description 4
- QAOWNCQODCNURD-UHFFFAOYSA-N Sulfuric acid Chemical compound OS(O)(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-N 0.000 description 4
- HEDRZPFGACZZDS-MICDWDOJSA-N Trichloro(2H)methane Chemical compound [2H]C(Cl)(Cl)Cl HEDRZPFGACZZDS-MICDWDOJSA-N 0.000 description 4
- 239000007983 Tris buffer Substances 0.000 description 4
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 4
- 150000001413 amino acids Chemical class 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 108010030694 avidin-horseradish peroxidase complex Proteins 0.000 description 4
- IVRMZWNICZWHMI-UHFFFAOYSA-N azide group Chemical group [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- WGQKYBSKWIADBV-UHFFFAOYSA-N benzylamine Chemical compound NCC1=CC=CC=C1 WGQKYBSKWIADBV-UHFFFAOYSA-N 0.000 description 4
- 201000011510 cancer Diseases 0.000 description 4
- 239000003054 catalyst Substances 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 238000005520 cutting process Methods 0.000 description 4
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 4
- 239000003599 detergent Substances 0.000 description 4
- 238000001914 filtration Methods 0.000 description 4
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 4
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 150000003384 small molecules Chemical class 0.000 description 4
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 4
- 239000007787 solid Substances 0.000 description 4
- 238000000527 sonication Methods 0.000 description 4
- 238000002198 surface plasmon resonance spectroscopy Methods 0.000 description 4
- 238000004885 tandem mass spectrometry Methods 0.000 description 4
- MPLHNVLQVRSVEE-UHFFFAOYSA-N texas red Chemical compound [O-]S(=O)(=O)C1=CC(S(Cl)(=O)=O)=CC=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 MPLHNVLQVRSVEE-UHFFFAOYSA-N 0.000 description 4
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 4
- HWPZZUQOWRWFDB-UHFFFAOYSA-N 1-methylcytosine Chemical compound CN1C=CC(N)=NC1=O HWPZZUQOWRWFDB-UHFFFAOYSA-N 0.000 description 3
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 3
- 238000004679 31P NMR spectroscopy Methods 0.000 description 3
- UWAUSMGZOHPBJJ-UHFFFAOYSA-N 4-nitro-1,2,3-benzoxadiazole Chemical compound [O-][N+](=O)C1=CC=CC2=C1N=NO2 UWAUSMGZOHPBJJ-UHFFFAOYSA-N 0.000 description 3
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 description 3
- 102100034330 Chromaffin granule amine transporter Human genes 0.000 description 3
- RTZKZFJDLAIYFH-UHFFFAOYSA-N Diethyl ether Chemical compound CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 3
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 3
- 239000007995 HEPES buffer Substances 0.000 description 3
- 101000641221 Homo sapiens Chromaffin granule amine transporter Proteins 0.000 description 3
- 206010021143 Hypoxia Diseases 0.000 description 3
- 108010052285 Membrane Proteins Proteins 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- 206010034972 Photosensitivity reaction Diseases 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- PMZURENOXWZQFD-UHFFFAOYSA-L Sodium Sulfate Chemical compound [Na+].[Na+].[O-]S([O-])(=O)=O PMZURENOXWZQFD-UHFFFAOYSA-L 0.000 description 3
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical group O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- DPKHZNPWBDQZCN-UHFFFAOYSA-N acridine orange free base Chemical compound C1=CC(N(C)C)=CC2=NC3=CC(N(C)C)=CC=C3C=C21 DPKHZNPWBDQZCN-UHFFFAOYSA-N 0.000 description 3
- 235000001014 amino acid Nutrition 0.000 description 3
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 3
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 3
- 230000033115 angiogenesis Effects 0.000 description 3
- DZBUGLKDJFMEHC-UHFFFAOYSA-N benzoquinolinylidene Natural products C1=CC=CC2=CC3=CC=CC=C3N=C21 DZBUGLKDJFMEHC-UHFFFAOYSA-N 0.000 description 3
- 239000011230 binding agent Substances 0.000 description 3
- 150000001615 biotins Chemical class 0.000 description 3
- 238000007413 biotinylation Methods 0.000 description 3
- 230000006287 biotinylation Effects 0.000 description 3
- 238000001369 bisulfite sequencing Methods 0.000 description 3
- 239000012267 brine Substances 0.000 description 3
- 239000004202 carbamide Substances 0.000 description 3
- 150000001720 carbohydrates Chemical class 0.000 description 3
- 235000014633 carbohydrates Nutrition 0.000 description 3
- CZPLANDPABRVHX-UHFFFAOYSA-N cascade blue Chemical compound C=1C2=CC=CC=C2C(NCC)=CC=1C(C=1C=CC(=CC=1)N(CC)CC)=C1C=CC(=[N+](CC)CC)C=C1 CZPLANDPABRVHX-UHFFFAOYSA-N 0.000 description 3
- 210000000170 cell membrane Anatomy 0.000 description 3
- 125000003636 chemical group Chemical group 0.000 description 3
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 3
- 230000002759 chromosomal effect Effects 0.000 description 3
- 239000013068 control sample Substances 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 238000004925 denaturation Methods 0.000 description 3
- 230000036425 denaturation Effects 0.000 description 3
- 239000000412 dendrimer Substances 0.000 description 3
- 229920000736 dendritic polymer Polymers 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- WQZGKKKJIJFFOK-UKLRSMCWSA-N dextrose-2-13c Chemical compound OC[C@H]1OC(O)[13C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-UKLRSMCWSA-N 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 239000012895 dilution Substances 0.000 description 3
- 239000000839 emulsion Substances 0.000 description 3
- 238000006911 enzymatic reaction Methods 0.000 description 3
- 230000001973 epigenetic effect Effects 0.000 description 3
- 239000000284 extract Substances 0.000 description 3
- 239000000706 filtrate Substances 0.000 description 3
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 3
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 3
- 239000006260 foam Substances 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 229910052737 gold Inorganic materials 0.000 description 3
- 239000010931 gold Substances 0.000 description 3
- 210000005260 human cell Anatomy 0.000 description 3
- 230000007954 hypoxia Effects 0.000 description 3
- 230000003993 interaction Effects 0.000 description 3
- 208000032839 leukemia Diseases 0.000 description 3
- DLBFLQKQABVKGT-UHFFFAOYSA-L lucifer yellow dye Chemical compound [Li+].[Li+].[O-]S(=O)(=O)C1=CC(C(N(C(=O)NN)C2=O)=O)=C3C2=CC(S([O-])(=O)=O)=CC3=C1N DLBFLQKQABVKGT-UHFFFAOYSA-L 0.000 description 3
- 210000004962 mammalian cell Anatomy 0.000 description 3
- 230000001404 mediated effect Effects 0.000 description 3
- 208000015122 neurodegenerative disease Diseases 0.000 description 3
- 239000002777 nucleoside Substances 0.000 description 3
- 125000003835 nucleoside group Chemical group 0.000 description 3
- 229910000073 phosphorus hydride Inorganic materials 0.000 description 3
- INAAIJLSXJJHOZ-UHFFFAOYSA-N pibenzimol Chemical compound C1CN(C)CCN1C1=CC=C(N=C(N2)C=3C=C4NC(=NC4=CC=3)C=3C=CC(O)=CC=3)C2=C1 INAAIJLSXJJHOZ-UHFFFAOYSA-N 0.000 description 3
- 239000013612 plasmid Substances 0.000 description 3
- 238000005498 polishing Methods 0.000 description 3
- 229920002704 polyhistidine Polymers 0.000 description 3
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 238000004007 reversed phase HPLC Methods 0.000 description 3
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 3
- 238000010008 shearing Methods 0.000 description 3
- 238000010898 silica gel chromatography Methods 0.000 description 3
- 229910052938 sodium sulfate Inorganic materials 0.000 description 3
- 235000011152 sodium sulphate Nutrition 0.000 description 3
- HPALAKNZSZLMCH-UHFFFAOYSA-M sodium;chloride;hydrate Chemical compound O.[Na+].[Cl-] HPALAKNZSZLMCH-UHFFFAOYSA-M 0.000 description 3
- 238000001228 spectrum Methods 0.000 description 3
- 238000010186 staining Methods 0.000 description 3
- 235000000346 sugar Nutrition 0.000 description 3
- ANRHNWWPFJCPAZ-UHFFFAOYSA-M thionine Chemical class [Cl-].C1=CC(N)=CC2=[S+]C3=CC(N)=CC=C3N=C21 ANRHNWWPFJCPAZ-UHFFFAOYSA-M 0.000 description 3
- RIOQSEWOXXDEQQ-UHFFFAOYSA-N triphenylphosphine Chemical compound C1=CC=CC=C1P(C=1C=CC=CC=1)C1=CC=CC=C1 RIOQSEWOXXDEQQ-UHFFFAOYSA-N 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- VGIRNWJSIRVFRT-UHFFFAOYSA-N 2',7'-difluorofluorescein Chemical compound OC(=O)C1=CC=CC=C1C1=C2C=C(F)C(=O)C=C2OC2=CC(O)=C(F)C=C21 VGIRNWJSIRVFRT-UHFFFAOYSA-N 0.000 description 2
- PRDFBSVERLRRMY-UHFFFAOYSA-N 2'-(4-ethoxyphenyl)-5-(4-methylpiperazin-1-yl)-2,5'-bibenzimidazole Chemical compound C1=CC(OCC)=CC=C1C1=NC2=CC=C(C=3NC4=CC(=CC=C4N=3)N3CCN(C)CC3)C=C2N1 PRDFBSVERLRRMY-UHFFFAOYSA-N 0.000 description 2
- PXBFMLJZNCDSMP-UHFFFAOYSA-N 2-Aminobenzamide Chemical compound NC(=O)C1=CC=CC=C1N PXBFMLJZNCDSMP-UHFFFAOYSA-N 0.000 description 2
- BVOITXUNGDUXRW-UHFFFAOYSA-N 2-chloro-1,3,2-benzodioxaphosphinin-4-one Chemical compound C1=CC=C2OP(Cl)OC(=O)C2=C1 BVOITXUNGDUXRW-UHFFFAOYSA-N 0.000 description 2
- UMCMPZBLKLEWAF-BCTGSCMUSA-N 3-[(3-cholamidopropyl)dimethylammonio]propane-1-sulfonate Chemical compound C([C@H]1C[C@H]2O)[C@H](O)CC[C@]1(C)[C@@H]1[C@@H]2[C@@H]2CC[C@H]([C@@H](CCC(=O)NCCC[N+](C)(C)CCCS([O-])(=O)=O)C)[C@@]2(C)[C@@H](O)C1 UMCMPZBLKLEWAF-BCTGSCMUSA-N 0.000 description 2
- MJKVTPMWOKAVMS-UHFFFAOYSA-N 3-hydroxy-1-benzopyran-2-one Chemical compound C1=CC=C2OC(=O)C(O)=CC2=C1 MJKVTPMWOKAVMS-UHFFFAOYSA-N 0.000 description 2
- LVSPDZAGCBEQAV-UHFFFAOYSA-N 4-chloronaphthalen-1-ol Chemical compound C1=CC=C2C(O)=CC=C(Cl)C2=C1 LVSPDZAGCBEQAV-UHFFFAOYSA-N 0.000 description 2
- IKYJCHYORFJFRR-UHFFFAOYSA-N Alexa Fluor 350 Chemical compound O=C1OC=2C=C(N)C(S(O)(=O)=O)=CC=2C(C)=C1CC(=O)ON1C(=O)CCC1=O IKYJCHYORFJFRR-UHFFFAOYSA-N 0.000 description 2
- JLDSMZIBHYTPPR-UHFFFAOYSA-N Alexa Fluor 405 Chemical compound CC[NH+](CC)CC.CC[NH+](CC)CC.CC[NH+](CC)CC.C12=C3C=4C=CC2=C(S([O-])(=O)=O)C=C(S([O-])(=O)=O)C1=CC=C3C(S(=O)(=O)[O-])=CC=4OCC(=O)N(CC1)CCC1C(=O)ON1C(=O)CCC1=O JLDSMZIBHYTPPR-UHFFFAOYSA-N 0.000 description 2
- WEJVZSAYICGDCK-UHFFFAOYSA-N Alexa Fluor 430 Chemical compound CC[NH+](CC)CC.CC1(C)C=C(CS([O-])(=O)=O)C2=CC=3C(C(F)(F)F)=CC(=O)OC=3C=C2N1CCCCCC(=O)ON1C(=O)CCC1=O WEJVZSAYICGDCK-UHFFFAOYSA-N 0.000 description 2
- WHVNXSBKJGAXKU-UHFFFAOYSA-N Alexa Fluor 532 Chemical compound [H+].[H+].CC1(C)C(C)NC(C(=C2OC3=C(C=4C(C(C(C)N=4)(C)C)=CC3=3)S([O-])(=O)=O)S([O-])(=O)=O)=C1C=C2C=3C(C=C1)=CC=C1C(=O)ON1C(=O)CCC1=O WHVNXSBKJGAXKU-UHFFFAOYSA-N 0.000 description 2
- ZAINTDRBUHCDPZ-UHFFFAOYSA-M Alexa Fluor 546 Chemical compound [H+].[Na+].CC1CC(C)(C)NC(C(=C2OC3=C(C4=NC(C)(C)CC(C)C4=CC3=3)S([O-])(=O)=O)S([O-])(=O)=O)=C1C=C2C=3C(C(=C(Cl)C=1Cl)C(O)=O)=C(Cl)C=1SCC(=O)NCCCCCC(=O)ON1C(=O)CCC1=O ZAINTDRBUHCDPZ-UHFFFAOYSA-M 0.000 description 2
- IGAZHQIYONOHQN-UHFFFAOYSA-N Alexa Fluor 555 Chemical compound C=12C=CC(=N)C(S(O)(=O)=O)=C2OC2=C(S(O)(=O)=O)C(N)=CC=C2C=1C1=CC=C(C(O)=O)C=C1C(O)=O IGAZHQIYONOHQN-UHFFFAOYSA-N 0.000 description 2
- 239000012099 Alexa Fluor family Substances 0.000 description 2
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 2
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 2
- 206010003591 Ataxia Diseases 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- BPYKTIZUTYGOLE-IFADSCNNSA-N Bilirubin Chemical compound N1C(=O)C(C)=C(C=C)\C1=C\C1=C(C)C(CCC(O)=O)=C(CC2=C(C(C)=C(\C=C/3C(=C(C=C)C(=O)N\3)C)N2)CCC(O)=O)N1 BPYKTIZUTYGOLE-IFADSCNNSA-N 0.000 description 2
- HMGPOTVBMZFBFI-ZIJVCQRQSA-N CCN(CC)CC.CC(O[C@H]([C@@H](CN=[N+]=[N-])O[C@@H]([C@@H]1OC(C)=O)OP(O)(O)=O)[C@@H]1OC(C)=O)=O Chemical compound CCN(CC)CC.CC(O[C@H]([C@@H](CN=[N+]=[N-])O[C@@H]([C@@H]1OC(C)=O)OP(O)(O)=O)[C@@H]1OC(C)=O)=O HMGPOTVBMZFBFI-ZIJVCQRQSA-N 0.000 description 2
- 241000218645 Cedrus Species 0.000 description 2
- 238000001353 Chip-sequencing Methods 0.000 description 2
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 2
- 101100477411 Dictyostelium discoideum set1 gene Proteins 0.000 description 2
- MYMOFIZGZYHOMD-UHFFFAOYSA-N Dioxygen Chemical compound O=O MYMOFIZGZYHOMD-UHFFFAOYSA-N 0.000 description 2
- 108010067770 Endopeptidase K Proteins 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- 108030004665 Glucosyl-DNA beta-glucosyltransferases Proteins 0.000 description 2
- 108700023372 Glycosyltransferases Proteins 0.000 description 2
- 102000051366 Glycosyltransferases Human genes 0.000 description 2
- 238000006736 Huisgen cycloaddition reaction Methods 0.000 description 2
- OAKJQQAXSVQMHS-UHFFFAOYSA-N Hydrazine Chemical compound NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 2
- XEEYBQQBJWHFJM-UHFFFAOYSA-N Iron Chemical compound [Fe] XEEYBQQBJWHFJM-UHFFFAOYSA-N 0.000 description 2
- 101500006448 Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97) Endonuclease PI-MboI Proteins 0.000 description 2
- 201000003793 Myelodysplastic syndrome Diseases 0.000 description 2
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 101710149004 Nuclease P1 Proteins 0.000 description 2
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- XYFCBTPGUUZFHI-UHFFFAOYSA-N Phosphine Chemical compound P XYFCBTPGUUZFHI-UHFFFAOYSA-N 0.000 description 2
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 2
- LOUPRKONTZGTKE-WZBLMQSHSA-N Quinine Chemical compound C([C@H]([C@H](C1)C=C)C2)C[N@@]1[C@@H]2[C@H](O)C1=CC=NC2=CC=C(OC)C=C21 LOUPRKONTZGTKE-WZBLMQSHSA-N 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 2
- CGNLCCVKSWNSDG-UHFFFAOYSA-N SYBR Green I Chemical compound CN(C)CCCN(CCC)C1=CC(C=C2N(C3=CC=CC=C3S2)C)=C2C=CC=CC2=[N+]1C1=CC=CC=C1 CGNLCCVKSWNSDG-UHFFFAOYSA-N 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- WQDUMFSSJAZKTM-UHFFFAOYSA-N Sodium methoxide Chemical compound [Na+].[O-]C WQDUMFSSJAZKTM-UHFFFAOYSA-N 0.000 description 2
- 238000000692 Student's t-test Methods 0.000 description 2
- 239000012505 Superdex™ Substances 0.000 description 2
- 101150012475 TET2 gene Proteins 0.000 description 2
- 102000004357 Transferases Human genes 0.000 description 2
- 108090000992 Transferases Proteins 0.000 description 2
- LFQWTDKRDROWBN-RGDJUOJXSA-N [(2r,3r,4s,5r,6r)-4,5,6-triacetyloxy-2-(azidomethyl)oxan-3-yl] acetate Chemical compound CC(=O)O[C@H]1O[C@H](CN=[N+]=[N-])[C@@H](OC(C)=O)[C@H](OC(C)=O)[C@H]1OC(C)=O LFQWTDKRDROWBN-RGDJUOJXSA-N 0.000 description 2
- MUTUZTFDNMUQCN-ZIQFBCGOSA-N [(2r,3r,4s,5r,6s)-4,5-diacetyloxy-2-(azidomethyl)-6-hydroxyoxan-3-yl] acetate Chemical compound CC(=O)O[C@H]1[C@@H](O)O[C@H](CN=[N+]=[N-])[C@@H](OC(C)=O)[C@@H]1OC(C)=O MUTUZTFDNMUQCN-ZIQFBCGOSA-N 0.000 description 2
- RRMTXDYAGCJCDG-LBELIVKGSA-N [(2r,3r,4s,5r,6s)-4,5-diacetyloxy-2-(azidomethyl)-6-methoxyoxan-3-yl] acetate Chemical compound CO[C@H]1O[C@H](CN=[N+]=[N-])[C@@H](OC(C)=O)[C@H](OC(C)=O)[C@H]1OC(C)=O RRMTXDYAGCJCDG-LBELIVKGSA-N 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- LPQOADBMXVRBNX-UHFFFAOYSA-N ac1ldcw0 Chemical compound Cl.C1CN(C)CCN1C1=C(F)C=C2C(=O)C(C(O)=O)=CN3CCSC1=C32 LPQOADBMXVRBNX-UHFFFAOYSA-N 0.000 description 2
- 230000021736 acetylation Effects 0.000 description 2
- 238000006640 acetylation reaction Methods 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 230000032683 aging Effects 0.000 description 2
- 108010004469 allophycocyanin Proteins 0.000 description 2
- 235000011130 ammonium sulphate Nutrition 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 238000010462 azide-alkyne Huisgen cycloaddition reaction Methods 0.000 description 2
- 108010058966 bacteriophage T7 induced DNA polymerase Proteins 0.000 description 2
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 2
- AFYNADDZULBEJA-UHFFFAOYSA-N bicinchoninic acid Chemical compound C1=CC=CC2=NC(C=3C=C(C4=CC=CC=C4N=3)C(=O)O)=CC(C(O)=O)=C21 AFYNADDZULBEJA-UHFFFAOYSA-N 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 210000001185 bone marrow Anatomy 0.000 description 2
- 210000004958 brain cell Anatomy 0.000 description 2
- 239000003729 cation exchange resin Substances 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 150000001793 charged compounds Chemical class 0.000 description 2
- 239000013611 chromosomal DNA Substances 0.000 description 2
- 230000000052 comparative effect Effects 0.000 description 2
- 239000012141 concentrate Substances 0.000 description 2
- 230000001268 conjugating effect Effects 0.000 description 2
- 229910052802 copper Inorganic materials 0.000 description 2
- 230000007850 degeneration Effects 0.000 description 2
- 229910001882 dioxygen Inorganic materials 0.000 description 2
- 208000035475 disorder Diseases 0.000 description 2
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 239000000975 dye Substances 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- OAYLNYINCPYISS-UHFFFAOYSA-N ethyl acetate;hexane Chemical compound CCCCCC.CCOC(C)=O OAYLNYINCPYISS-UHFFFAOYSA-N 0.000 description 2
- 230000007717 exclusion Effects 0.000 description 2
- 108010021843 fluorescent protein 583 Proteins 0.000 description 2
- 238000002523 gelfiltration Methods 0.000 description 2
- 238000011331 genomic analysis Methods 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 235000014304 histidine Nutrition 0.000 description 2
- QAOWNCQODCNURD-UHFFFAOYSA-M hydrogensulfate Chemical class OS([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-M 0.000 description 2
- SMWDFEZZVXVKRB-UHFFFAOYSA-O hydron;quinoline Chemical compound [NH+]1=CC=CC2=CC=CC=C21 SMWDFEZZVXVKRB-UHFFFAOYSA-O 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 125000004029 hydroxymethyl group Chemical group [H]OC([H])([H])* 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 238000001294 liquid chromatography-tandem mass spectrometry Methods 0.000 description 2
- 238000011068 loading method Methods 0.000 description 2
- 238000004949 mass spectrometry Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 229910052751 metal Inorganic materials 0.000 description 2
- 239000002184 metal Substances 0.000 description 2
- HQCYVSPJIOJEGA-UHFFFAOYSA-N methoxycoumarin Chemical compound C1=CC=C2OC(=O)C(OC)=CC2=C1 HQCYVSPJIOJEGA-UHFFFAOYSA-N 0.000 description 2
- 238000001823 molecular biology technique Methods 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- RUAYLFCHPJYFJL-HVVPPSBSSA-N n,n'-dicyclohexylmorpholine-4-carboximidamide;[(2r,3s,4r,5r)-5-(2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-morpholin-4-ylphosphinic acid Chemical compound C1CCCCC1NC(N1CCOCC1)=NC1CCCCC1.C([C@@H]1[C@H]([C@H]([C@@H](O1)N1C(NC(=O)C=C1)=O)O)O)OP(O)(=O)N1CCOCC1 RUAYLFCHPJYFJL-HVVPPSBSSA-N 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 230000004770 neurodegeneration Effects 0.000 description 2
- XJCPMUIIBDVFDM-UHFFFAOYSA-M nile blue A Chemical compound [Cl-].C1=CC=C2C3=NC4=CC=C(N(CC)CC)C=C4[O+]=C3C=C(N)C2=C1 XJCPMUIIBDVFDM-UHFFFAOYSA-M 0.000 description 2
- VOFUROIFQGPCGE-UHFFFAOYSA-N nile red Chemical compound C1=CC=C2C3=NC4=CC=C(N(CC)CC)C=C4OC3=CC(=O)C2=C1 VOFUROIFQGPCGE-UHFFFAOYSA-N 0.000 description 2
- 239000002853 nucleic acid probe Substances 0.000 description 2
- 239000003960 organic solvent Substances 0.000 description 2
- 238000003068 pathway analysis Methods 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 2
- 150000008300 phosphoramidites Chemical class 0.000 description 2
- 238000007539 photo-oxidation reaction Methods 0.000 description 2
- 239000003880 polar aprotic solvent Substances 0.000 description 2
- 229920002401 polyacrylamide Polymers 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 108090000765 processed proteins & peptides Proteins 0.000 description 2
- 238000004393 prognosis Methods 0.000 description 2
- XJMOSONTPMZWPB-UHFFFAOYSA-M propidium iodide Chemical compound [I-].[I-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CCC[N+](C)(CC)CC)=C1C1=CC=CC=C1 XJMOSONTPMZWPB-UHFFFAOYSA-M 0.000 description 2
- 235000019419 proteases Nutrition 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 210000000449 purkinje cell Anatomy 0.000 description 2
- 239000002096 quantum dot Substances 0.000 description 2
- 238000010791 quenching Methods 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 230000035939 shock Effects 0.000 description 2
- 238000001542 size-exclusion chromatography Methods 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 239000007858 starting material Substances 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 230000009897 systematic effect Effects 0.000 description 2
- 150000003536 tetrazoles Chemical class 0.000 description 2
- ACOJCCLIDPZYJC-UHFFFAOYSA-M thiazole orange Chemical compound CC1=CC=C(S([O-])(=O)=O)C=C1.C1=CC=C2C(C=C3N(C4=CC=CC=C4S3)C)=CC=[N+](C)C2=C1 ACOJCCLIDPZYJC-UHFFFAOYSA-M 0.000 description 2
- 125000003396 thiol group Chemical group [H]S* 0.000 description 2
- AVBGNFCMKJOFIN-UHFFFAOYSA-N triethylammonium acetate Chemical compound CC(O)=O.CCN(CC)CC AVBGNFCMKJOFIN-UHFFFAOYSA-N 0.000 description 2
- 241001515965 unidentified phage Species 0.000 description 2
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 2
- 229940045145 uridine Drugs 0.000 description 2
- 238000010200 validation analysis Methods 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- IJTNSXPMYKJZPR-ZSCYQOFPSA-N (9Z,11E,13E,15Z)-octadecatetraenoic acid Chemical compound CC\C=C/C=C/C=C/C=C\CCCCCCCC(O)=O IJTNSXPMYKJZPR-ZSCYQOFPSA-N 0.000 description 1
- SLLFVLKNXABYGI-UHFFFAOYSA-N 1,2,3-benzoxadiazole Chemical compound C1=CC=C2ON=NC2=C1 SLLFVLKNXABYGI-UHFFFAOYSA-N 0.000 description 1
- NWUYHJFMYQTDRP-UHFFFAOYSA-N 1,2-bis(ethenyl)benzene;1-ethenyl-2-ethylbenzene;styrene Chemical compound C=CC1=CC=CC=C1.CCC1=CC=CC=C1C=C.C=CC1=CC=CC=C1C=C NWUYHJFMYQTDRP-UHFFFAOYSA-N 0.000 description 1
- BOBLSBAZCVBABY-WPWUJOAOSA-N 1,6-diphenylhexatriene Chemical compound C=1C=CC=CC=1\C=C\C=C\C=C\C1=CC=CC=C1 BOBLSBAZCVBABY-WPWUJOAOSA-N 0.000 description 1
- JTTIOYHBNXDJOD-UHFFFAOYSA-N 2,4,6-triaminopyrimidine Chemical compound NC1=CC(N)=NC(N)=N1 JTTIOYHBNXDJOD-UHFFFAOYSA-N 0.000 description 1
- XDFNWJDGWJVGGN-UHFFFAOYSA-N 2-(2,7-dichloro-3,6-dihydroxy-9h-xanthen-9-yl)benzoic acid Chemical compound OC(=O)C1=CC=CC=C1C1C2=CC(Cl)=C(O)C=C2OC2=CC(O)=C(Cl)C=C21 XDFNWJDGWJVGGN-UHFFFAOYSA-N 0.000 description 1
- IJSMFQNTEUNRPY-UHFFFAOYSA-N 2-[3-(dimethylamino)-6-dimethylazaniumylidenexanthen-9-yl]-5-isothiocyanatobenzoate Chemical compound C=12C=CC(=[N+](C)C)C=C2OC2=CC(N(C)C)=CC=C2C=1C1=CC=C(N=C=S)C=C1C([O-])=O IJSMFQNTEUNRPY-UHFFFAOYSA-N 0.000 description 1
- IOOMXAQUNPWDLL-UHFFFAOYSA-N 2-[6-(diethylamino)-3-(diethyliminiumyl)-3h-xanthen-9-yl]-5-sulfobenzene-1-sulfonate Chemical compound C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=C(S(O)(=O)=O)C=C1S([O-])(=O)=O IOOMXAQUNPWDLL-UHFFFAOYSA-N 0.000 description 1
- MPPQGYCZBNURDG-UHFFFAOYSA-N 2-propionyl-6-dimethylaminonaphthalene Chemical class C1=C(N(C)C)C=CC2=CC(C(=O)CC)=CC=C21 MPPQGYCZBNURDG-UHFFFAOYSA-N 0.000 description 1
- BNBQQYFXBLBYJK-UHFFFAOYSA-N 2-pyridin-2-yl-1,3-oxazole Chemical compound C1=COC(C=2N=CC=CC=2)=N1 BNBQQYFXBLBYJK-UHFFFAOYSA-N 0.000 description 1
- QWZHDKGQKYEBKK-UHFFFAOYSA-N 3-aminochromen-2-one Chemical compound C1=CC=C2OC(=O)C(N)=CC2=C1 QWZHDKGQKYEBKK-UHFFFAOYSA-N 0.000 description 1
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 1
- MZWWAGMOVQUICZ-UHFFFAOYSA-N 4-[6-[6-(4-methylpiperazin-1-yl)-1H-benzimidazol-2-yl]-1H-benzimidazol-2-yl]phenol hydrate trihydrochloride Chemical compound O.Cl.Cl.Cl.C1CN(C)CCN1C1=CC=C(NC(=N2)C=3C=C4N=C(NC4=CC=3)C=3C=CC(O)=CC=3)C2=C1 MZWWAGMOVQUICZ-UHFFFAOYSA-N 0.000 description 1
- QEYONPKSDTUPAX-UHFFFAOYSA-N 4-bromo-2-chloro-6-fluorophenol Chemical compound OC1=C(F)C=C(Br)C=C1Cl QEYONPKSDTUPAX-UHFFFAOYSA-N 0.000 description 1
- RLGKZMHUIXDGGT-UFLZEWODSA-N 5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoic acid;cyclooctyne Chemical compound C1CCCC#CCC1.N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 RLGKZMHUIXDGGT-UFLZEWODSA-N 0.000 description 1
- VESMEBBJISCMLQ-UFLZEWODSA-N 5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoic acid;phosphane Chemical compound P.N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 VESMEBBJISCMLQ-UFLZEWODSA-N 0.000 description 1
- HOSGXJWQVBHGLT-UHFFFAOYSA-N 6-hydroxy-3,4-dihydro-1h-quinolin-2-one Chemical group N1C(=O)CCC2=CC(O)=CC=C21 HOSGXJWQVBHGLT-UHFFFAOYSA-N 0.000 description 1
- YXHLJMWYDTXDHS-IRFLANFNSA-N 7-aminoactinomycin D Chemical compound C[C@H]1OC(=O)[C@H](C(C)C)N(C)C(=O)CN(C)C(=O)[C@@H]2CCCN2C(=O)[C@@H](C(C)C)NC(=O)[C@H]1NC(=O)C1=C(N)C(=O)C(C)=C2OC(C(C)=C(N)C=C3C(=O)N[C@@H]4C(=O)N[C@@H](C(N5CCC[C@H]5C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]4C)=O)C(C)C)=C3N=C21 YXHLJMWYDTXDHS-IRFLANFNSA-N 0.000 description 1
- 108700012813 7-aminoactinomycin D Proteins 0.000 description 1
- ZCYVEMRRCGMTRW-UHFFFAOYSA-N 7553-56-2 Chemical compound [I] ZCYVEMRRCGMTRW-UHFFFAOYSA-N 0.000 description 1
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 1
- 208000014697 Acute lymphocytic leukaemia Diseases 0.000 description 1
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 239000012103 Alexa Fluor 488 Substances 0.000 description 1
- 239000012104 Alexa Fluor 500 Substances 0.000 description 1
- 239000012105 Alexa Fluor 514 Substances 0.000 description 1
- 239000012109 Alexa Fluor 568 Substances 0.000 description 1
- 239000012110 Alexa Fluor 594 Substances 0.000 description 1
- 239000012111 Alexa Fluor 610 Substances 0.000 description 1
- 239000012112 Alexa Fluor 633 Substances 0.000 description 1
- 239000012114 Alexa Fluor 647 Substances 0.000 description 1
- 239000012115 Alexa Fluor 660 Substances 0.000 description 1
- 239000012116 Alexa Fluor 680 Substances 0.000 description 1
- 239000012117 Alexa Fluor 700 Substances 0.000 description 1
- 239000012118 Alexa Fluor 750 Substances 0.000 description 1
- 239000012119 Alexa Fluor 790 Substances 0.000 description 1
- NLXLAEXVIDQMFP-UHFFFAOYSA-N Ammonia chloride Chemical class [NH4+].[Cl-] NLXLAEXVIDQMFP-UHFFFAOYSA-N 0.000 description 1
- QGZKDVFQNNGYKY-UHFFFAOYSA-O Ammonium Chemical compound [NH4+] QGZKDVFQNNGYKY-UHFFFAOYSA-O 0.000 description 1
- USFZMSVCRYTOJT-UHFFFAOYSA-N Ammonium acetate Chemical compound N.CC(O)=O USFZMSVCRYTOJT-UHFFFAOYSA-N 0.000 description 1
- 239000005695 Ammonium acetate Substances 0.000 description 1
- VHUUQVKOLVNVRT-UHFFFAOYSA-N Ammonium hydroxide Chemical compound [NH4+].[OH-] VHUUQVKOLVNVRT-UHFFFAOYSA-N 0.000 description 1
- 241001244729 Apalis Species 0.000 description 1
- 102100035029 Ataxin-1 Human genes 0.000 description 1
- 108010032963 Ataxin-1 Proteins 0.000 description 1
- 108091005950 Azurite Proteins 0.000 description 1
- IFUUTHPIYVCYNT-VJXWHUIBSA-L C.C.C1CCOC1.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1O.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1OC(C)=O.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1OP(=O)(O)O.CO[C@H]1OC(CBr)[C@@H](O)C(O)[C@@H]1O.CO[C@H]1OC(CN=[N+]=[N-])[C@@H](C)C(C)[C@@H]1OC(C)=O.CO[C@H]1OC(CO)[C@@H](O)C(O)[C@@H]1O.CP(=O)(OC[C@H]1O[C@@H](N2C=CC(=O)NC2=O)C(O)[C@H]1O)N1CCOCC1.ClP1OC2=C(C=CC=C2)O1.NCC1=CC=CC=C1.O.O=S(=O)(O)O.[2H]CF.[N-]=[N+]=NCC1O[C@H](OP(=O)([O-])OP(=O)([O-])OC[C@H]2O[C@@H](N3C=CC(=O)NC3=O)C(O)[C@H]2O)[C@@H](O)C(O)[C@@H]1O Chemical compound C.C.C1CCOC1.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1O.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1OC(C)=O.CC(=O)O[C@H]1C(C)[C@H](C)C(CN=[N+]=[N-])O[C@@H]1OP(=O)(O)O.CO[C@H]1OC(CBr)[C@@H](O)C(O)[C@@H]1O.CO[C@H]1OC(CN=[N+]=[N-])[C@@H](C)C(C)[C@@H]1OC(C)=O.CO[C@H]1OC(CO)[C@@H](O)C(O)[C@@H]1O.CP(=O)(OC[C@H]1O[C@@H](N2C=CC(=O)NC2=O)C(O)[C@H]1O)N1CCOCC1.ClP1OC2=C(C=CC=C2)O1.NCC1=CC=CC=C1.O.O=S(=O)(O)O.[2H]CF.[N-]=[N+]=NCC1O[C@H](OP(=O)([O-])OP(=O)([O-])OC[C@H]2O[C@@H](N3C=CC(=O)NC3=O)C(O)[C@H]2O)[C@@H](O)C(O)[C@@H]1O IFUUTHPIYVCYNT-VJXWHUIBSA-L 0.000 description 1
- ZUHQCDZJPTXVCU-UHFFFAOYSA-N C1#CCCC2=CC=CC=C2C2=CC=CC=C21 Chemical group C1#CCCC2=CC=CC=C2C2=CC=CC=C21 ZUHQCDZJPTXVCU-UHFFFAOYSA-N 0.000 description 1
- 238000011740 C57BL/6 mouse Methods 0.000 description 1
- 101100240528 Caenorhabditis elegans nhr-23 gene Proteins 0.000 description 1
- 102100025570 Cancer/testis antigen 1 Human genes 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108091005944 Cerulean Proteins 0.000 description 1
- 108091006146 Channels Proteins 0.000 description 1
- 241000579895 Chlorostilbon Species 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- RURLVUZRUFHCJO-UHFFFAOYSA-N Chromomycin A3 Natural products COC(C1Cc2cc3cc(OC4CC(OC(=O)C)C(OC5CC(O)C(OC)C(C)O5)C(C)O4)c(C)c(O)c3c(O)c2C(=O)C1OC6CC(OC7CC(C)(O)C(OC(=O)C)C(C)O7)C(O)C(C)O6)C(=O)C(O)C(C)O RURLVUZRUFHCJO-UHFFFAOYSA-N 0.000 description 1
- 235000001258 Cinchona calisaya Nutrition 0.000 description 1
- 241000614261 Citrus hongheensis Species 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 108091005943 CyPet Proteins 0.000 description 1
- 108010009540 DNA (Cytosine-5-)-Methyltransferase 1 Proteins 0.000 description 1
- 102100036279 DNA (cytosine-5)-methyltransferase 1 Human genes 0.000 description 1
- 108010033065 DNA beta-glucosyltransferase Proteins 0.000 description 1
- 230000004543 DNA replication Effects 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 101100364969 Dictyostelium discoideum scai gene Proteins 0.000 description 1
- 102000016680 Dioxygenases Human genes 0.000 description 1
- 108010028143 Dioxygenases Proteins 0.000 description 1
- 230000010777 Disulfide Reduction Effects 0.000 description 1
- 241000255601 Drosophila melanogaster Species 0.000 description 1
- 108091005941 EBFP Proteins 0.000 description 1
- 108091005947 EBFP2 Proteins 0.000 description 1
- 108091005942 ECFP Proteins 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 208000034454 F12-related hereditary angioedema with normal C1Inh Diseases 0.000 description 1
- OZLGRUXZXMRXGP-UHFFFAOYSA-N Fluo-3 Chemical compound CC1=CC=C(N(CC(O)=O)CC(O)=O)C(OCCOC=2C(=CC=C(C=2)C2=C3C=C(Cl)C(=O)C=C3OC3=CC(O)=C(Cl)C=C32)N(CC(O)=O)CC(O)=O)=C1 OZLGRUXZXMRXGP-UHFFFAOYSA-N 0.000 description 1
- 102220566467 GDNF family receptor alpha-1_S65A_mutation Human genes 0.000 description 1
- 102220566469 GDNF family receptor alpha-1_S65T_mutation Human genes 0.000 description 1
- 102220566453 GDNF family receptor alpha-1_Y66F_mutation Human genes 0.000 description 1
- 102220566451 GDNF family receptor alpha-1_Y66H_mutation Human genes 0.000 description 1
- 102220566455 GDNF family receptor alpha-1_Y66W_mutation Human genes 0.000 description 1
- SXRSQZLOMIGNAQ-UHFFFAOYSA-N Glutaraldehyde Chemical compound O=CCCCC=O SXRSQZLOMIGNAQ-UHFFFAOYSA-N 0.000 description 1
- 208000002250 Hematologic Neoplasms Diseases 0.000 description 1
- 108090000027 Hexosyltransferases Proteins 0.000 description 1
- 102000003726 Hexosyltransferases Human genes 0.000 description 1
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 1
- 108010033040 Histones Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000856237 Homo sapiens Cancer/testis antigen 1 Proteins 0.000 description 1
- 101000724418 Homo sapiens Neutral amino acid transporter B(0) Proteins 0.000 description 1
- 101000935123 Homo sapiens Voltage-dependent N-type calcium channel subunit alpha-1B Proteins 0.000 description 1
- FGBAVQUHSKYMTC-UHFFFAOYSA-M LDS 751 dye Chemical compound [O-]Cl(=O)(=O)=O.C1=CC2=CC(N(C)C)=CC=C2[N+](CC)=C1C=CC=CC1=CC=C(N(C)C)C=C1 FGBAVQUHSKYMTC-UHFFFAOYSA-M 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- 241000699660 Mus musculus Species 0.000 description 1
- 101100364971 Mus musculus Scai gene Proteins 0.000 description 1
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 1
- PJKKQFAEFWCNAQ-UHFFFAOYSA-N N(4)-methylcytosine Chemical class CNC=1C=CNC(=O)N=1 PJKKQFAEFWCNAQ-UHFFFAOYSA-N 0.000 description 1
- 208000012902 Nervous system disease Diseases 0.000 description 1
- 208000025966 Neurological disease Diseases 0.000 description 1
- 102100028267 Neutral amino acid transporter B(0) Human genes 0.000 description 1
- VEQPNABPJHWNSG-UHFFFAOYSA-N Nickel(2+) Chemical compound [Ni+2] VEQPNABPJHWNSG-UHFFFAOYSA-N 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 102000003992 Peroxidases Human genes 0.000 description 1
- 108010004729 Phycoerythrin Proteins 0.000 description 1
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 description 1
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 description 1
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 1
- WDVSHHCDHLJJJR-UHFFFAOYSA-N Proflavine Chemical compound C1=CC(N)=CC2=NC3=CC(N)=CC=C3C=C21 WDVSHHCDHLJJJR-UHFFFAOYSA-N 0.000 description 1
- 239000013614 RNA sample Substances 0.000 description 1
- KJTLSVCANCCWHF-UHFFFAOYSA-N Ruthenium Chemical compound [Ru] KJTLSVCANCCWHF-UHFFFAOYSA-N 0.000 description 1
- MEFKEPWMEQBLKI-AIRLBKTGSA-N S-adenosyl-L-methioninate Chemical compound O[C@@H]1[C@H](O)[C@@H](C[S+](CC[C@H](N)C([O-])=O)C)O[C@H]1N1C2=NC=NC(N)=C2N=C1 MEFKEPWMEQBLKI-AIRLBKTGSA-N 0.000 description 1
- 229920005654 Sephadex Polymers 0.000 description 1
- 239000012507 Sephadex™ Substances 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 1
- UIIMBOGNXHQVGW-DEQYMQKBSA-M Sodium bicarbonate-14C Chemical class [Na+].O[14C]([O-])=O UIIMBOGNXHQVGW-DEQYMQKBSA-M 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- LSNNMFCWUKXFEE-UHFFFAOYSA-N Sulfurous acid Chemical class OS(O)=O LSNNMFCWUKXFEE-UHFFFAOYSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 241000053227 Themus Species 0.000 description 1
- DPXHITFUCHFTKR-UHFFFAOYSA-L To-Pro-1 Chemical compound [I-].[I-].S1C2=CC=CC=C2[N+](C)=C1C=C1C2=CC=CC=C2N(CCC[N+](C)(C)C)C=C1 DPXHITFUCHFTKR-UHFFFAOYSA-L 0.000 description 1
- QHNORJFCVHUPNH-UHFFFAOYSA-L To-Pro-3 Chemical compound [I-].[I-].S1C2=CC=CC=C2[N+](C)=C1C=CC=C1C2=CC=CC=C2N(CCC[N+](C)(C)C)C=C1 QHNORJFCVHUPNH-UHFFFAOYSA-L 0.000 description 1
- MZZINWWGSYUHGU-UHFFFAOYSA-J ToTo-1 Chemical compound [I-].[I-].[I-].[I-].C12=CC=CC=C2C(C=C2N(C3=CC=CC=C3S2)C)=CC=[N+]1CCC[N+](C)(C)CCC[N+](C)(C)CCC[N+](C1=CC=CC=C11)=CC=C1C=C1N(C)C2=CC=CC=C2S1 MZZINWWGSYUHGU-UHFFFAOYSA-J 0.000 description 1
- 102220615016 Transcription elongation regulator 1_S65C_mutation Human genes 0.000 description 1
- 101710135349 Venom phosphodiesterase Proteins 0.000 description 1
- 102100025342 Voltage-dependent N-type calcium channel subunit alpha-1B Human genes 0.000 description 1
- 108091005971 Wild-type GFP Proteins 0.000 description 1
- GRRMZXFOOGQMFA-UHFFFAOYSA-J YoYo-1 Chemical compound [I-].[I-].[I-].[I-].C12=CC=CC=C2C(C=C2N(C3=CC=CC=C3O2)C)=CC=[N+]1CCC[N+](C)(C)CCC[N+](C)(C)CCC[N+](C1=CC=CC=C11)=CC=C1C=C1N(C)C2=CC=CC=C2O1 GRRMZXFOOGQMFA-UHFFFAOYSA-J 0.000 description 1
- UYRDHEJRPVSJFM-VSWVFQEASA-N [(1s,3r)-3-hydroxy-4-[(3e,5e,7e,9e,11z)-11-[4-[(e)-2-[(1r,3s,6s)-3-hydroxy-1,5,5-trimethyl-7-oxabicyclo[4.1.0]heptan-6-yl]ethenyl]-5-oxofuran-2-ylidene]-3,10-dimethylundeca-1,3,5,7,9-pentaenylidene]-3,5,5-trimethylcyclohexyl] acetate Chemical compound C[C@@]1(O)C[C@@H](OC(=O)C)CC(C)(C)C1=C=C\C(C)=C\C=C\C=C\C=C(/C)\C=C/1C=C(\C=C\[C@]23[C@@](O2)(C)C[C@@H](O)CC3(C)C)C(=O)O\1 UYRDHEJRPVSJFM-VSWVFQEASA-N 0.000 description 1
- JGRUGFGFHIPFGW-UHFFFAOYSA-J [7-dimethylazaniumylidene-1,9-bis[4-[3-(3-methyl-1,3-benzothiazol-2-ylidene)prop-1-enyl]quinolin-1-ium-1-yl]nonan-3-ylidene]-dimethylazanium;tetraiodide Chemical compound [I-].[I-].[I-].[I-].C12=CC=CC=C2C(\C=C\C=C2/N(C3=CC=CC=C3S2)C)=CC=[N+]1CCC(=[N+](C)C)CCCC(=[N+](C)C)CC[N+](C1=CC=CC=C11)=CC=C1\C=C\C=C/1N(C)C2=CC=CC=C2S\1 JGRUGFGFHIPFGW-UHFFFAOYSA-J 0.000 description 1
- ZHAFUINZIZIXFC-UHFFFAOYSA-N [9-(dimethylamino)-10-methylbenzo[a]phenoxazin-5-ylidene]azanium;chloride Chemical compound [Cl-].O1C2=CC(=[NH2+])C3=CC=CC=C3C2=NC2=C1C=C(N(C)C)C(C)=C2 ZHAFUINZIZIXFC-UHFFFAOYSA-N 0.000 description 1
- ZKHQWZAMYRWXGA-KNYAHOBESA-N [[(2r,3s,4r,5r)-5-(6-aminopurin-9-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] dihydroxyphosphoryl hydrogen phosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)O[32P](O)(O)=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KNYAHOBESA-N 0.000 description 1
- GPADSBRECKAOGT-UHFFFAOYSA-N [hydroxy(oxido)phosphoryl] hydrogen phosphate;triethylazanium Chemical compound CC[NH+](CC)CC.CC[NH+](CC)CC.OP([O-])(=O)OP(O)([O-])=O GPADSBRECKAOGT-UHFFFAOYSA-N 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- FXXACINHVKSMDR-UHFFFAOYSA-N acetyl bromide Chemical compound CC(Br)=O FXXACINHVKSMDR-UHFFFAOYSA-N 0.000 description 1
- 125000002777 acetyl group Chemical group [H]C([H])([H])C(*)=O 0.000 description 1
- BGLGAKMTYHWWKW-UHFFFAOYSA-N acridine yellow Chemical compound [H+].[Cl-].CC1=C(N)C=C2N=C(C=C(C(C)=C3)N)C3=CC2=C1 BGLGAKMTYHWWKW-UHFFFAOYSA-N 0.000 description 1
- 150000001251 acridines Chemical class 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 229960001570 ademetionine Drugs 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- IJTNSXPMYKJZPR-WVRBZULHSA-N alpha-parinaric acid Natural products CCC=C/C=C/C=C/C=CCCCCCCCC(=O)O IJTNSXPMYKJZPR-WVRBZULHSA-N 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 229910000147 aluminium phosphate Inorganic materials 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 229940043376 ammonium acetate Drugs 0.000 description 1
- 235000019257 ammonium acetate Nutrition 0.000 description 1
- 235000012538 ammonium bicarbonate Nutrition 0.000 description 1
- 239000000908 ammonium hydroxide Substances 0.000 description 1
- 239000003957 anion exchange resin Substances 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000003466 anti-cipated effect Effects 0.000 description 1
- 230000009830 antibody antigen interaction Effects 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 229940027998 antiseptic and disinfectant acridine derivative Drugs 0.000 description 1
- 239000012300 argon atmosphere Substances 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 210000001130 astrocyte Anatomy 0.000 description 1
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 1
- JPIYZTWMUGTEHX-UHFFFAOYSA-N auramine O free base Chemical compound C1=CC(N(C)C)=CC=C1C(=N)C1=CC=C(N(C)C)C=C1 JPIYZTWMUGTEHX-UHFFFAOYSA-N 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 125000000656 azaniumyl group Chemical group [H][N+]([H])([H])[*] 0.000 description 1
- KFZNPGQYVZZSNV-UHFFFAOYSA-M azure B Chemical compound [Cl-].C1=CC(N(C)C)=CC2=[S+]C3=CC(NC)=CC=C3N=C21 KFZNPGQYVZZSNV-UHFFFAOYSA-M 0.000 description 1
- 238000007630 basic procedure Methods 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- 125000000188 beta-D-glucosyl group Chemical group C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000001045 blue dye Substances 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 230000004641 brain development Effects 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- VTJUKNSKBAOEHE-UHFFFAOYSA-N calixarene Chemical class COC(=O)COC1=C(CC=2C(=C(CC=3C(=C(C4)C=C(C=3)C(C)(C)C)OCC(=O)OC)C=C(C=2)C(C)(C)C)OCC(=O)OC)C=C(C(C)(C)C)C=C1CC1=C(OCC(=O)OC)C4=CC(C(C)(C)C)=C1 VTJUKNSKBAOEHE-UHFFFAOYSA-N 0.000 description 1
- 150000001718 carbodiimides Chemical class 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 150000007942 carboxylates Chemical class 0.000 description 1
- PTIUZRZHZRYCJE-UHFFFAOYSA-N cascade yellow Chemical compound C1=C(S([O-])(=O)=O)C(OC)=CC=C1C1=CN=C(C=2C=C[N+](CC=3C=C(C=CC=3)C(=O)ON3C(CCC3=O)=O)=CC=2)O1 PTIUZRZHZRYCJE-UHFFFAOYSA-N 0.000 description 1
- 230000003197 catalytic effect Effects 0.000 description 1
- 229940023913 cation exchange resins Drugs 0.000 description 1
- 150000001768 cations Chemical class 0.000 description 1
- 230000003915 cell function Effects 0.000 description 1
- 230000004663 cell proliferation Effects 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 210000002230 centromere Anatomy 0.000 description 1
- 230000017712 cerebellum development Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- OQNGCCWBHLEQFN-UHFFFAOYSA-N chloroform;hexane Chemical compound ClC(Cl)Cl.CCCCCC OQNGCCWBHLEQFN-UHFFFAOYSA-N 0.000 description 1
- 229930002875 chlorophyll Natural products 0.000 description 1
- 235000019804 chlorophyll Nutrition 0.000 description 1
- ATNHDLDRLWWWCB-AENOIHSZSA-M chlorophyll a Chemical compound C1([C@@H](C(=O)OC)C(=O)C2=C3C)=C2N2C3=CC(C(CC)=C3C)=[N+]4C3=CC3=C(C=C)C(C)=C5N3[Mg-2]42[N+]2=C1[C@@H](CCC(=O)OC\C=C(/C)CCC[C@H](C)CCC[C@H](C)CCCC(C)C)[C@H](C)C2=C5 ATNHDLDRLWWWCB-AENOIHSZSA-M 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 238000011210 chromatographic step Methods 0.000 description 1
- 239000012539 chromatography resin Substances 0.000 description 1
- ZYVSOIYQKUDENJ-WKSBCEQHSA-N chromomycin A3 Chemical compound O([C@@H]1C[C@@H](O[C@H](C)[C@@H]1OC(C)=O)OC=1C=C2C=C3C[C@H]([C@@H](C(=O)C3=C(O)C2=C(O)C=1C)O[C@@H]1O[C@H](C)[C@@H](O)[C@H](O[C@@H]2O[C@H](C)[C@@H](O)[C@H](O[C@@H]3O[C@@H](C)[C@H](OC(C)=O)[C@@](C)(O)C3)C2)C1)[C@H](OC)C(=O)[C@@H](O)[C@@H](C)O)[C@@H]1C[C@@H](O)[C@@H](OC)[C@@H](C)O1 ZYVSOIYQKUDENJ-WKSBCEQHSA-N 0.000 description 1
- 238000003200 chromosome mapping Methods 0.000 description 1
- 230000011855 chromosome organization Effects 0.000 description 1
- LOUPRKONTZGTKE-UHFFFAOYSA-N cinchonine Natural products C1C(C(C2)C=C)CCN2C1C(O)C1=CC=NC2=CC=C(OC)C=C21 LOUPRKONTZGTKE-UHFFFAOYSA-N 0.000 description 1
- 229910017052 cobalt Inorganic materials 0.000 description 1
- 239000010941 cobalt Substances 0.000 description 1
- GUTLYIVDDKVIGB-UHFFFAOYSA-N cobalt atom Chemical compound [Co] GUTLYIVDDKVIGB-UHFFFAOYSA-N 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 150000004775 coumarins Chemical class 0.000 description 1
- 238000009295 crossflow filtration Methods 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 108010082025 cyan fluorescent protein Proteins 0.000 description 1
- 230000001351 cycling effect Effects 0.000 description 1
- 125000001295 dansyl group Chemical class [H]C1=C([H])C(N(C([H])([H])[H])C([H])([H])[H])=C2C([H])=C([H])C([H])=C(C2=C1[H])S(*)(=O)=O 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 230000017858 demethylation Effects 0.000 description 1
- 238000010520 demethylation reaction Methods 0.000 description 1
- CFCUWKMKBJTWLW-UHFFFAOYSA-N deoliosyl-3C-alpha-L-digitoxosyl-MTM Natural products CC=1C(O)=C2C(O)=C3C(=O)C(OC4OC(C)C(O)C(OC5OC(C)C(O)C(OC6OC(C)C(O)C(C)(O)C6)C5)C4)C(C(OC)C(=O)C(O)C(C)O)CC3=CC2=CC=1OC(OC(C)C1O)CC1OC1CC(O)C(O)C(C)O1 CFCUWKMKBJTWLW-UHFFFAOYSA-N 0.000 description 1
- 238000010511 deprotection reaction Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000003795 desorption Methods 0.000 description 1
- 230000009025 developmental regulation Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 125000004177 diethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 230000019975 dosage compensation by inactivation of X chromosome Effects 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 238000007876 drug discovery Methods 0.000 description 1
- 230000005684 electric field Effects 0.000 description 1
- 238000001378 electrochemiluminescence detection Methods 0.000 description 1
- 239000003480 eluent Substances 0.000 description 1
- 239000010976 emerald Substances 0.000 description 1
- 229910052876 emerald Inorganic materials 0.000 description 1
- 108010030074 endodeoxyribonuclease MluI Proteins 0.000 description 1
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 1
- 230000009144 enzymatic modification Effects 0.000 description 1
- YQGOJNYOYNNSMM-UHFFFAOYSA-N eosin Chemical compound [Na+].OC(=O)C1=CC=CC=C1C1=C2C=C(Br)C(=O)C(Br)=C2OC2=C(Br)C(O)=C(Br)C=C21 YQGOJNYOYNNSMM-UHFFFAOYSA-N 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- NLFBCYMMUAKCPC-KQQUZDAGSA-N ethyl (e)-3-[3-amino-2-cyano-1-[(e)-3-ethoxy-3-oxoprop-1-enyl]sulfanyl-3-oxoprop-1-enyl]sulfanylprop-2-enoate Chemical compound CCOC(=O)\C=C\SC(=C(C#N)C(N)=O)S\C=C\C(=O)OCC NLFBCYMMUAKCPC-KQQUZDAGSA-N 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 108091006047 fluorescent proteins Proteins 0.000 description 1
- 102000034287 fluorescent proteins Human genes 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 238000007710 freezing Methods 0.000 description 1
- 230000008014 freezing Effects 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 230000001279 glycosylating effect Effects 0.000 description 1
- 108700014210 glycosyltransferase activity proteins Proteins 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 210000002216 heart Anatomy 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 230000009033 hematopoietic malignancy Effects 0.000 description 1
- 208000016861 hereditary angioedema type 3 Diseases 0.000 description 1
- 150000002411 histidines Chemical class 0.000 description 1
- 238000000265 homogenisation Methods 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000033444 hydroxylation Effects 0.000 description 1
- 238000005805 hydroxylation reaction Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000003100 immobilizing effect Effects 0.000 description 1
- 239000012535 impurity Substances 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- PNDZEEPOYCVIIY-UHFFFAOYSA-N indo-1 Chemical compound CC1=CC=C(N(CC(O)=O)CC(O)=O)C(OCCOC=2C(=CC=C(C=2)C=2N=C3[CH]C(=CC=C3C=2)C(O)=O)N(CC(O)=O)CC(O)=O)=C1 PNDZEEPOYCVIIY-UHFFFAOYSA-N 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 238000006713 insertion reaction Methods 0.000 description 1
- 239000011630 iodine Substances 0.000 description 1
- 229910052740 iodine Inorganic materials 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 229910052742 iron Inorganic materials 0.000 description 1
- 150000002540 isothiocyanates Chemical class 0.000 description 1
- 230000000155 isotopic effect Effects 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 238000011005 laboratory method Methods 0.000 description 1
- 238000000670 ligand binding assay Methods 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 238000005710 macrocyclization reaction Methods 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 238000002826 magnetic-activated cell sorting Methods 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 229940107698 malachite green Drugs 0.000 description 1
- FDZZZRQASAIRJF-UHFFFAOYSA-M malachite green Chemical compound [Cl-].C1=CC(N(C)C)=CC=C1C(C=1C=CC=CC=1)=C1C=CC(=[N+](C)C)C=C1 FDZZZRQASAIRJF-UHFFFAOYSA-M 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 238000001840 matrix-assisted laser desorption--ionisation time-of-flight mass spectrometry Methods 0.000 description 1
- 230000035800 maturation Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- DZVCFNFOPIZQKX-LTHRDKTGSA-M merocyanine Chemical compound [Na+].O=C1N(CCCC)C(=O)N(CCCC)C(=O)C1=C\C=C\C=C/1N(CCCS([O-])(=O)=O)C2=CC=CC=C2O\1 DZVCFNFOPIZQKX-LTHRDKTGSA-M 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 229910021645 metal ion Inorganic materials 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- HOVAGTYPODGVJG-ZFYZTMLRSA-N methyl alpha-D-glucopyranoside Chemical compound CO[C@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O HOVAGTYPODGVJG-ZFYZTMLRSA-N 0.000 description 1
- HOVAGTYPODGVJG-UHFFFAOYSA-N methyl beta-galactoside Natural products COC1OC(CO)C(O)C(O)C1O HOVAGTYPODGVJG-UHFFFAOYSA-N 0.000 description 1
- 238000010208 microarray analysis Methods 0.000 description 1
- 238000009629 microbiological culture Methods 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- CFCUWKMKBJTWLW-BKHRDMLASA-N mithramycin Chemical compound O([C@@H]1C[C@@H](O[C@H](C)[C@H]1O)OC=1C=C2C=C3C[C@H]([C@@H](C(=O)C3=C(O)C2=C(O)C=1C)O[C@@H]1O[C@H](C)[C@@H](O)[C@H](O[C@@H]2O[C@H](C)[C@H](O)[C@H](O[C@@H]3O[C@H](C)[C@@H](O)[C@@](C)(O)C3)C2)C1)[C@H](OC)C(=O)[C@@H](O)[C@@H](C)O)[C@H]1C[C@@H](O)[C@H](O)[C@@H](C)O1 CFCUWKMKBJTWLW-BKHRDMLASA-N 0.000 description 1
- 210000003470 mitochondria Anatomy 0.000 description 1
- 210000001700 mitochondrial membrane Anatomy 0.000 description 1
- 239000003068 molecular probe Substances 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- SYSQUGFVNFXIIT-UHFFFAOYSA-N n-[4-(1,3-benzoxazol-2-yl)phenyl]-4-nitrobenzenesulfonamide Chemical class C1=CC([N+](=O)[O-])=CC=C1S(=O)(=O)NC1=CC=C(C=2OC3=CC=CC=C3N=2)C=C1 SYSQUGFVNFXIIT-UHFFFAOYSA-N 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 150000002790 naphthalenes Chemical class 0.000 description 1
- 229930014626 natural product Natural products 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 230000007472 neurodevelopment Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000009207 neuronal maturation Effects 0.000 description 1
- 238000007481 next generation sequencing Methods 0.000 description 1
- 229910052759 nickel Inorganic materials 0.000 description 1
- 229910001453 nickel ion Inorganic materials 0.000 description 1
- 238000003499 nucleic acid array Methods 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 150000002482 oligosaccharides Chemical class 0.000 description 1
- 239000001048 orange dye Substances 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 150000004866 oxadiazoles Chemical class 0.000 description 1
- GHTWDWCFRFTBRB-UHFFFAOYSA-M oxazine-170 Chemical compound [O-]Cl(=O)(=O)=O.N1=C2C3=CC=CC=C3C(NCC)=CC2=[O+]C2=C1C=C(C)C(N(C)CC)=C2 GHTWDWCFRFTBRB-UHFFFAOYSA-M 0.000 description 1
- 150000004893 oxazines Chemical class 0.000 description 1
- 239000001301 oxygen Substances 0.000 description 1
- 229910052760 oxygen Inorganic materials 0.000 description 1
- VYNDHICBIRRPFP-UHFFFAOYSA-N pacific blue Chemical compound FC1=C(O)C(F)=C2OC(=O)C(C(=O)O)=CC2=C1 VYNDHICBIRRPFP-UHFFFAOYSA-N 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 239000013618 particulate matter Substances 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- UTIQDNPUHSAVDN-UHFFFAOYSA-N peridinin Natural products CC(=O)OC1CC(C)(C)C(=C=CC(=CC=CC=CC=C2/OC(=O)C(=C2)C=CC34OC3(C)CC(O)CC4(C)C)C)C(C)(O)C1 UTIQDNPUHSAVDN-UHFFFAOYSA-N 0.000 description 1
- 230000008823 permeabilization Effects 0.000 description 1
- 108040007629 peroxidase activity proteins Proteins 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-N phosphoric acid Substances OP(O)(O)=O NBIIXXVUZAFLBC-UHFFFAOYSA-N 0.000 description 1
- QYSPLQLAKJAUJT-UHFFFAOYSA-N piroxicam Chemical compound OC=1C2=CC=CC=C2S(=O)(=O)N(C)C=1C(=O)NC1=CC=CC=N1 QYSPLQLAKJAUJT-UHFFFAOYSA-N 0.000 description 1
- 229960002702 piroxicam Drugs 0.000 description 1
- 229960003171 plicamycin Drugs 0.000 description 1
- 210000001778 pluripotent stem cell Anatomy 0.000 description 1
- 229920001467 poly(styrenesulfonates) Polymers 0.000 description 1
- 239000004810 polytetrafluoroethylene Substances 0.000 description 1
- 229920001343 polytetrafluoroethylene Polymers 0.000 description 1
- RKCAIXNGYQCCAL-UHFFFAOYSA-N porphin Chemical compound N1C(C=C2N=C(C=C3NC(=C4)C=C3)C=C2)=CC=C1C=C1C=CC4=N1 RKCAIXNGYQCCAL-UHFFFAOYSA-N 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 229960000286 proflavine Drugs 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000001915 proofreading effect Effects 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 238000002731 protein assay Methods 0.000 description 1
- 108020001775 protein parts Proteins 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 150000003220 pyrenes Chemical class 0.000 description 1
- 229960000948 quinine Drugs 0.000 description 1
- 239000011535 reaction buffer Substances 0.000 description 1
- 239000012429 reaction media Substances 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 108010054624 red fluorescent protein Proteins 0.000 description 1
- 230000014493 regulation of gene expression Effects 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000003938 response to stress Effects 0.000 description 1
- 102200089551 rs5030826 Human genes 0.000 description 1
- 229910052707 ruthenium Inorganic materials 0.000 description 1
- 229910052594 sapphire Inorganic materials 0.000 description 1
- 239000010980 sapphire Substances 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- DYPYMMHZGRPOCK-UHFFFAOYSA-N seminaphtharhodafluor Chemical compound O1C(=O)C2=CC=CC=C2C21C(C=CC=1C3=CC=C(O)C=1)=C3OC1=CC(N)=CC=C21 DYPYMMHZGRPOCK-UHFFFAOYSA-N 0.000 description 1
- 238000011896 sensitive detection Methods 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 238000010583 slow cooling Methods 0.000 description 1
- 229910000030 sodium bicarbonate Inorganic materials 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 238000009987 spinning Methods 0.000 description 1
- 210000000952 spleen Anatomy 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 210000000130 stem cell Anatomy 0.000 description 1
- 238000003756 stirring Methods 0.000 description 1
- 239000012089 stop solution Substances 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 238000000856 sucrose gradient centrifugation Methods 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 238000006277 sulfonation reaction Methods 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 210000001550 testis Anatomy 0.000 description 1
- JGVWCANSWKRBCS-UHFFFAOYSA-N tetramethylrhodamine thiocyanate Chemical compound [Cl-].C=12C=CC(N(C)C)=CC2=[O+]C2=CC(N(C)C)=CC=C2C=1C1=CC=C(SC#N)C=C1C(O)=O JGVWCANSWKRBCS-UHFFFAOYSA-N 0.000 description 1
- 238000010257 thawing Methods 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 238000004809 thin layer chromatography Methods 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 108091006106 transcriptional activators Proteins 0.000 description 1
- 238000006276 transfer reaction Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- GWBUNZLLLLDXMD-UHFFFAOYSA-H tricopper;dicarbonate;dihydroxide Chemical compound [OH-].[OH-].[Cu+2].[Cu+2].[Cu+2].[O-]C([O-])=O.[O-]C([O-])=O GWBUNZLLLLDXMD-UHFFFAOYSA-H 0.000 description 1
- GPRLSGONYQIRFK-MNYXATJNSA-N triton Chemical compound [3H+] GPRLSGONYQIRFK-MNYXATJNSA-N 0.000 description 1
- 238000000108 ultra-filtration Methods 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 239000007762 w/o emulsion Substances 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 150000003732 xanthenes Chemical class 0.000 description 1
- NLIVDORGVGAOOJ-MAHBNPEESA-M xylene cyanol Chemical compound [Na+].C1=C(C)C(NCC)=CC=C1C(\C=1C(=CC(OS([O-])=O)=CC=1)OS([O-])=O)=C\1C=C(C)\C(=[NH+]/CC)\C=C/1 NLIVDORGVGAOOJ-MAHBNPEESA-M 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
- C12P19/28—N-glycosides
- C12P19/30—Nucleotides
- C12P19/34—Polynucleotides, e.g. nucleic acids, oligoribonucleotides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/48—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving transferase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y204/00—Glycosyltransferases (2.4)
- C12Y204/01—Hexosyltransferases (2.4.1)
- C12Y204/01028—Glucosyl-DNA beta-glucosyltransferase (2.4.1.28)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/154—Methylation markers
Definitions
- the present invention relates generally to the field of molecular biology. More particularly, it concerns methods and compositions for detecting, evaluating, and/or mapping 5-hydroxymethyl-modified cytosine bases within a nucleic acid molecule.
- 5-Methylcytosine constitutes approximately 2-8% of the total cytosines in human genomic DNA, and impacts a broad range of biological functions, including gene expression, maintenance of genome integrity, parental imprinting, X-chromosome inactivation, regulation of development, aging, and cancer. Recently, the presence of an oxidized 5-mC, 5-hydroxymethylcytosine (5-hmC), has been discovered in embryonic and neuronal stem cells, certain adult brain cells, and some cancer cells.
- Methods and compositions involve β-glucosyltransferase (βGT), which is in the glycosyltransferase family of enzymes and which selectively glycosylates 5-hmC.
- βGT: β-glucosyltransferase
- embodiments involve selectively glycosylating 5-hmC in a nucleic acid sample and directly or indirectly detecting, qualitatively and/or quantitatively, the glycosylated nucleotides based on a molecule or compound that is attached to the glycosylated nucleotide.
- the attachment to the glycosylated nucleotide may occur at the time the nucleotide is glycosylated, through the use of a modified UDP-Glu molecule that already carries the attachment, or the attachment may be added after the nucleotide has been glycosylated with the modified Glu molecule.
- Other embodiments involve a modified and glycosylated nucleic acid molecule. Subsequent manipulation of the glycosylated nucleic acid using any number of different nucleic acid modifications is contemplated.
- Methods may involve any of the following steps described herein.
- methods involve incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosine in the nucleic acid molecule with a modified glucose (Glu) molecule.
- methods may involve mixing the nucleic acids with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule.
- methods may involve contacting the nucleic acids with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule.
- a composition comprising nucleic acids, an effective amount of β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule is generated and then placed under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule.
- reactions involving any enzymes may be restricted or limited by time, enzyme concentration, substrate concentration, and/or template concentration.
- Reaction conditions may be adjusted so that the reaction is carried out under conditions that result in about, at least about, or at most about 20, 30, 40, 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, 100% completion, or any range derivable therein.
- methods may also involve one or more of the following regarding nucleic acids prior to and/or concurrent with glycosylation of nucleic acids (generating a nucleic acid that is glycosylated on nucleotides that were 5-hmC nucleotides): obtaining nucleic acid molecules; obtaining nucleic acid molecules from a biological sample; obtaining a biological sample containing nucleic acids from a subject; isolating nucleic acid molecules; purifying nucleic acid molecules; obtaining an array or microarray containing nucleic acids to be glycosylated; denaturing nucleic acid molecules; shearing or cutting nucleic acid; hybridizing nucleic acid molecules; incubating the nucleic acid molecule with an enzyme that is not β-glucosyltransferase; incubating the nucleic acid molecule with a restriction enzyme; attaching one or more chemical groups or compounds to the nucleic acid; conjugating one or more chemical groups or compounds to the
- Methods may further involve one or more of the following steps that are concurrent with and/or subsequent to glycosylation of nucleic acids: isolating nucleic acids glycosylated with the modified glucose; isolating glycosylated (and modified) nucleic acids based on the modification to the glucose; purifying glycosylated (and modified) nucleic acids based on the modification to the glucose; reacting the modified glucose in the glycosylated nucleic acid molecule with a detectable or functional moiety, such as a linker; conjugating or attaching a detectable or functional moiety to the glycosylated nucleotide; exposing to, incubating with, or mixing with the glycosylated nucleic acid an enzyme that will use the glycosylated nucleic acid as a substrate independent of the modification to the glucose; exposing to, incubating with, or mixing with the glycosylated nucleic acid an enzyme that will use the glycosylated nucleic acid as a substrate unless the modification
- Methods may also involve the following steps: cloning β-glucosyltransferase (βGT); synthesizing β-glucosyltransferase or a functional fragment thereof; isolating β-glucosyltransferase; purifying β-glucosyltransferase; placing β-glucosyltransferase in a sterile container; shipping purified or isolated β-glucosyltransferase in a container; and/or providing instructions regarding use of β-glucosyltransferase; incubating β-glucosyltransferase with UDP-glucose molecules and a nucleic acid substrate under conditions to promote glycosylation of the nucleic acid with the glucose molecule (which may or may not be modified) and result in a nucleic acid that is glycosylated at one or more 5-hydroxymethylcytosines.
- compositions may involve a purified nucleic acid, modified UDP-Glu, and/or enzyme, such as β-glucosyltransferase.
- purification may result in a molecule that is about or at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9% or more pure, or any range derivable therein, relative to any contaminating components (w/w or w/v).
- steps including, but not limited to, obtaining information (qualitative and/or quantitative) about one or more 5-hydroxymethylcytosines in a nucleic acid sample; ordering an assay to determine, identify, and/or map 5-hydroxymethylcytosines in a nucleic acid sample; reporting information (qualitative and/or quantitative) about one or more 5-hydroxymethylcytosines in a nucleic acid sample; comparing that information to information about 5-hydroxymethylcytosines in a control or comparative sample.
- the terms “determine,” “analyze,” “assay,” and “evaluate” in the context of a sample refer to transformation of that sample to gather qualitative and/or quantitative data about the sample.
- nucleic acid molecules may be DNA, RNA, or a combination of both. Nucleic acids may be recombinant, genomic, or synthesized. In additional embodiments, methods involve nucleic acid molecules that are isolated and/or purified. The nucleic acid may be isolated from a cell or biological sample in some embodiments. Certain embodiments involve isolating nucleic acids from a eukaryotic, mammalian, or human cell. In some cases, they are isolated from non-nucleic acids. In some embodiments, the nucleic acid molecule is eukaryotic; in some cases, the nucleic acid is mammalian, which may be human.
- nucleic acid molecule is isolated from a human cell and/or has a sequence that identifies it as human.
- the nucleic acid molecule is not a prokaryotic nucleic acid, such as a bacterial nucleic acid molecule.
- isolated nucleic acid molecules are on an array.
- the array is a microarray.
- a nucleic acid is isolated by any technique known to those of skill in the art, including, but not limited to, using a gel, column, matrix or filter to isolate the nucleic acids.
- the gel is a polyacrylamide or agarose gel.
- Methods and compositions may also include a modified UDP-Glu.
- the modified UDP-Glu comprises a modification moiety. In some embodiments, more than one modification moiety is included.
- modification moiety refers to a chemical compound or element that is added to a UDP-Glu molecule.
- a modified UDP-Glu refers to a UDP-Glu molecule having i) a modification moiety or ii) a chemical compound or element that is substituted for a molecule in UDP-Glu, such that the resulting modified compound has a different chemical formula than unmodified UDP-Glu.
- a modified UDP-Glu does not include a UDP-Glu that is radioactive by substitution of a molecule or compound in a UDP-Glu with the same molecule or compound, for example, a molecule or compound that is merely radioactive.
- a modified UDP-Glu is not employed; instead, an unmodified UDP-Glu molecule is used in which one or more chemical compounds are a radioactive version of the same molecule.
- modified UDP-Glu or a modification moiety may comprise one or more detectable moieties.
- a detectable moiety refers to a chemical compound or element that is capable of being detected.
- a modified UDP-Glu is not a version of UDP-Glu that is radioactive, and in specific embodiments, a modified UDP-Glu does not have a radioactive carbon molecule.
- a detectable moiety is fluorescent, radioactive, enzymatic, electrochemical, or colorimetric.
- the detectable moiety is a fluorophore or quantum dot.
- FRET may be employed to detect glycosylated nucleotides.
- a modification moiety may be a linker that allows one or more functional or detectable moieties or isolation tags to be attached to the glycosylated 5-hmC molecules.
- the linker is an azide linker or a thiol linker.
- the modification moiety may be an isolation tag, which means the tag can be used to isolate a molecule that is attached to the tag.
- the isolation tag is biotin or a histidine tag.
- the tag is modified, such as with a detectable moiety. It is contemplated that the linker allows for other chemical compounds or substances to be attached to the glycosylated nucleic acid at 5-hmC.
- a functional moiety is attached to the modified UDP-Glu molecule, which is then used to glycosylate 5-hmC nucleotides.
- a functional moiety is attached to the modified glucose after 5-hmC nucleotides have been glycosylated.
- one or more functional and/or detectable moieties and/or isolation tags are attached to each 5-hmC nucleotide.
- a functional moiety comprises a molecule or compound that inhibits or blocks an enzyme from using the glycosylated 5-hydroxymethylcytosine in the nucleic acid molecule as a substrate.
- the inhibition is sufficiently complete to prevent detection of an enzymatic reaction involving the glycosylated 5-hydroxymethylcytosine.
- the molecule or compound that blocks an enzyme may be doing this by sterically blocking access of the enzyme.
- Such steric blocking moieties are specifically contemplated as modification moieties.
- the steric blocking moieties contain 1, 2, or 3 ringed structures, including but not limited to aromatic ring structures.
- the blocking moiety is polyethylene glycol. In other embodiments, it is a nucleic acid, amino acid, carbohydrate, or fatty acid (including mono-, di-, or tri-versions).
- Methods and compositions may also involve one or more enzymes in addition to β-glucosyltransferase.
- the enzyme is a restriction enzyme or a polymerase.
- embodiments involve a restriction enzyme.
- the restriction enzyme may be methylation-insensitive.
- the enzyme is polymerase.
- nucleic acids are contacted with a restriction enzyme prior to, concurrent with, or subsequent to glycosylation of nucleic acids with a modified UDP-Glu.
- the glycosylated nucleic acid may be contacted with a polymerase before or after the nucleic acid has been exposed to a restriction enzyme.
- Methods and compositions involve distinguishing between 5-hydroxymethylcytosine and methylcytosine after modifying the 5-hydroxymethylcytosines and not the methylcytosines.
- Methods may involve identifying 5-hydroxymethylcytosines in the nucleic acids by comparing glycosylated nucleic acids with unglycosylated nucleic acids or to nucleic acids whose glycosylation state is already known. Detection of the modification can involve a wide variety of recombinant nucleic acid techniques.
- a glycosylated nucleic acid molecule is incubated with polymerase, at least one primer, and one or more nucleotides under conditions to allow polymerization of the glycosylated nucleic acid.
- methods may involve sequencing a glycosylated nucleic acid molecule.
- a glycosylated nucleic acid is used in a primer extension assay.
- Methods and compositions may involve a control nucleic acid.
- the control may be used to evaluate whether glycosylation or other enzymatic reactions are occurring.
- the control may be used to compare glycosylation states.
- the control may be a negative control or it may be a positive control. It may be a control that was not incubated with one or more reagents in the glycosylation reaction.
- a control nucleic acid may be a reference nucleic acid, which means its glycosylation state (based on qualitative and/or quantitative information related to glycosylation at 5-hydroxymethylcytosines, or the absence thereof) is used for comparing to a nucleic acid being evaluated.
- control nucleic acid provides the basis for a control nucleic acid.
- the control nucleic acid is from a normal sample with respect to a particular attribute, such as a disease or condition, or other phenotype.
- the control sample is from a different patient population, a different cell type or organ type, a different disease state, a different phase or severity of a disease state, a different prognosis, a different developmental stage, etc.
- there are methods for distinguishing 5-hydroxymethylcytosine from 5-methylcytosine in a nucleic acid molecule comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with a modified glucose molecule.
- nucleic acid molecule containing at least one 5-hydroxymethylcytosine comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with the modified Glu molecule.
- Particular embodiments involve identifying 5-hydroxymethylcytosines in genomic DNA comprising: a) isolating the genomic DNA; b) shearing or cutting the genomic DNA into pieces; c) mixing the genomic DNA pieces with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the genomic DNA with the modified UDP-Glu molecule; and, d) identifying 5-hydroxymethylcytosines in the genomic DNA using the modified UDP-Glu molecule.
- there are methods for identifying 5-hydroxymethylcytosines in a nucleic acid molecule comprising: a) mixing the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acid with the modified UDP-Glu molecule; b) mixing the glycosylated nucleic acid with a methylation-insensitive restriction enzyme, wherein the modified UDP-Glu molecule comprises a molecule or compound that prevents cleavage of the nucleic acid molecule at a site that would have been cleaved if the nucleic acid molecule had not been glycosylated with the modified UDP-Glu; and, c) identifying 5-hydroxymethylcytosines in the genomic DNA using the modified UDP-Glu molecule.
- Embodiments may involve methods for mapping 5-hydroxymethylcytosine in a nucleic acid molecule comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with the modified UDP-Glu molecule; and mapping the 5-hydroxymethylcytosines in the nucleic acid molecule.
- the 5-hydroxymethylcytosines in the nucleic acid may be mapped by a number of ways, including being mapped by sequencing the glycosylated nucleic acid and comparing the results to a control nucleic acid or by subjecting the glycosylated nucleic acid to a primer extension assay and comparing the results to a control nucleic acid.
- 5-hydroxymethylcytosines in the nucleic acid are mapped by subjecting the glycosylated nucleic acid to a hybridization assay and comparing the results to a control nucleic acid.
- Additional embodiments include methods for obtaining information about the presence and/or absence of 5-hydroxymethylcytosine in nucleic acids in a first sample from a subject comprising: a) retrieving a first sample comprising nucleic acids from a biological sample; b) obtaining information about the presence and/or absence of 5-hydroxymethylcytosine in nucleic acids in the first sample, wherein the information is obtained by i) incubating the first nucleic acid sample with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule, wherein a modified Glu is enzymatically attached to 5-hydroxymethylcytosines in nucleic acid molecules in the first nucleic acid sample; ii) detecting or measuring the 5-hydroxymethylcytosines based on the presence of the modified Glu to determine the 5-hydroxymethylcytosine status of the first nucleic acid sample; and, iii) comparing the 5-hydroxymethylcytosine status of the first nucleic acid sample with the 5-hydroxy
- methods concern obtaining a biological sample directly from a patient or extracting nucleic acids from a biological sample.
- the biological sample is from a patient.
- the patient is a human patient.
- methods comprise reporting information about the presence or absence of 5-hmC.
- the reporting is done on a document or an electronic version of a document. It is contemplated that in some embodiments, a clinician reports this information.
- kits which may be in a suitable container, that can be used to achieve the described methods.
- kits comprising purified β-glucosyltransferase and one or more modified uridine diphosphoglucose (UDP-Glu) molecules.
- the molecules may have or involve different types of modifications.
- a kit may include one or more buffers, such as buffers for nucleic acids or for reactions involving nucleic acids.
- Other enzymes may be included in kits in addition to or instead of β-glucosyltransferase.
- an enzyme is a polymerase. Kits may also include nucleotides for use with the polymerase.
- a restriction enzyme is included in addition to or instead of a polymerase.
- inventions also concern an array or microarray containing nucleic acid molecules that have been modified at the nucleotides that were 5-hmC.
- compositions and kits of the invention can be used to achieve methods of the invention.
- the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
- FIGS. 1A-1B General strategy for 5-hmC modification and identification.
- FIG. 1A The 5-hmC in duplex DNA is modified using a β-glucosyltransferase enzyme (βGT) which covalently links a glucose molecule from UDP-Glucose (UDP-Glu) to the hydroxymethyl-modified base to produce 5-gmC.
- X Functional tagging groups
- the 5-gmC modification in duplex DNA completely blocks the activity of the restriction enzyme MspI.
- FIG. 3 Synthetic scheme for UDP-6-N3-Glu.
- the synthesis started from commercially available I.
- Treatment of I with NBS and Ph3P in DMF selectively afforded 6-bromo derivative II. Without isolation, treatment of II with sodium azide followed by acetylation of the hydroxyl groups in pyridine generated compound III.
- Conversion of the 1-MeO to the corresponding 1-OAc by treatment of III with acetic acid and acetic anhydride in the presence of sulfuric acid gave IV.
- UDP-6-N3-Glu was obtained by treatment of VI with uridine 5′-monophosphomorpholidate 4-morpholine-N,N′-dicyclohexylcarboxamidine salt and tetrazole in pyridine, followed by treatment with triethylamine and an aqueous solution of NH4HCO3 in methanol to remove the acetyl groups.
- UDP-6-N3-Glu was purified by C18 reverse-phase HPLC and its structure was confirmed by 1H NMR, 13C NMR, 31P NMR, MALDI-TOF MS, and HRMS.
- FIGS. 4A-4B High-throughput methods to detect the 5-hmC modification in genomic DNA.
- FIG. 4A Reactivity differences of 5-meC, 5-hmC, and 5-gmC to bisulfite can be exploited to differentiate 5-hmC or 5-gmC from 5-meC.
- FIG. 4B A photosensitizer installed specifically on 5-gmC can lead to photosensitized oxidation of the labeled (5-gmC)pG, and subsequent base-mediated strand cleavage selective to this region.
- FIGS. 5A-5B Mass spec of 5-hmC, 5-N3-gmC and biotin-5-N3-gmC-containing 15-mer DNA with the corresponding reactions on the side.
- FIG. 5A MALDI-TOF of 5-hmC, 5-N3-gmC and biotin-5-N3-gmC-containing 15-mer DNA, respectively, with the calculated molecular weight and observed molecular weight indicated.
- FIG. 5B Corresponding reactions of βGT transferring 5-N3-glucose to 5-N3-gmC and the subsequent copper-free click chemistry on 5-N3-gmC.
- FIG. 6 Activity assays of wild-type βGT on UDP-Glu and UDP-6-N3-Glu.
- FIGS. 7A-7B HPLC analysis of the click reaction.
- A Reaction scheme of the click chemistry between compound 1 and the 11-mer synthetic DNA containing N3-5-gmC.
- B HPLC chromatograms (at 260 nm) of the nucleosides derived from the 11-mer N3-5-gmC-containing synthetic DNA before and after the click chemistry. The peak corresponding to N3-5-gmC decreased dramatically after the click chemistry, indicating the reaction yield is over 90%.
- DNA was digested by Nuclease P1 (Sigma) and Alkaline Phosphatase (Sigma). Samples were analyzed by HPLC with a C18 reverse-phase column equilibrated with buffer A (5 mM ammonium acetate, pH 7.5) and buffer B (5 mM ammonium, 0.01% TFA, 60% CH3CN).
- FIGS. 8A-8B HPLC, and MS identification of biotin-N3-5-gmC.
- A HPLC chromatograms (260 nm) of the nucleosides derived from the 11-mer biotin-N3-5-gmC-containing synthetic DNA. The peaks corresponding to biotin-N3-5-gmC (a pair of isomers) were collected and subjected to HRMS analysis.
- B HRMS of biotin-N3-5-gmC (structures are shown in the insets). Theoretical m/z values are shown; observed m/z values are also shown.
- FIGS. 9A-9B The streptavidin adduct of biotin-5-N3-gmC hinders primer extension.
- A Sequence of the 40-mer DNA containing cytosine derivatives used in primer extension. Cytosine 25, counting from right to left and marked with an asterisk, represents the modified cytosine in the sequences, and is also the position at which DNA polymerases tended to stall when the streptavidin adduct of biotin-5-N3-gmC-containing DNA was used as template. The arrow corresponds to the reverse PCR primer used for primer extension.
- B Primer extension assays for 40 mer DNA containing different cytosine species, shown beside a Sanger sequencing ladder.
- FIGS. 10A-10D Quantification of 5-hmC in various cell lines and tissues.
- A Dot-blot assay of avidin-HRP detection and quantification of mouse cerebellum genomic DNA containing biotin-N3-5-gmC. Top row: 40 ng of biotin-labeled samples using UDP-6-N3-Glu. Bottom row: 40 ng of control samples using regular UDP-Glu without biotin label. The exact same procedures were followed for experiments in both rows. P7, P14 and P21 represent postnatal day 7, 14 and 21, respectively.
- B Amounts of 5-hmC are shown as a percentage of total nucleotides in the mouse genome. *, P < 0.05, Student's t-test; means ± s.e.m.
- C Dot-blot assay of avidin-HRP detection and quantification of genomic DNA samples from four cell lines (from the same blot as in A), except that each dot contains 700 ng DNA.
- FIGS. 11A-11C Validation of the 5-hmC labeling method by antibody and HRMS.
- A Dot-blot assay using anti-5-hmC antibody (Active Motif) with cerebellum genomic DNAs confirming an age-dependent accumulation of 5-hmC in mouse cerebellum. Quantification is shown on the right.
- FIGS. 12A-12D Genome-wide distribution of 5-hmC in adult mouse cerebellum and gene-specific acquisition of intragenic 5-hmC during mouse cerebellum development.
- A Genome-scale reproducibility of 5-hmC profiles and enrichment relative to genomic DNA and control-treated DNA in adult mouse cerebellum. Heatmap representations of read densities have been equally scaled and then normalized based on the total number of mapped reads per sample. Data are derived from a single lane of sequence from each condition. Control, UDP-Glu treated without biotin; Input, genomic DNA; 5-hmC, UDP-6-N3-Glu treated with biotin incorporated.
- RefSeq transcripts were divided into four equally sized bins based on gene expression level and 5-hmC or input genomic DNA reads falling in 10-bp bins centered on transcription start sites or end sites. The reads were summed and normalized based on the total number of aligned reads (in millions).
- Input genomic DNA reads were mapped to each of the four gene expression level bins and are plotted here in black. The profiles completely overlap and so are collectively referred to as ‘Input’.
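- As a minimal sketch of the normalization described for FIG. 12 (summed read counts scaled by the total number of aligned reads, in millions), the following Python snippet may help. The function name, the list-of-counts input, and the example numbers are illustrative assumptions and are not part of the described analysis.

```python
def reads_per_million(bin_counts, total_aligned_reads):
    """Scale raw per-bin read counts by the total aligned reads, in millions."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    millions = total_aligned_reads / 1_000_000
    return [count / millions for count in bin_counts]


# Hypothetical example: three bins from a sample with 2 million aligned reads.
normalized = reads_per_million([10, 20, 40], 2_000_000)
```

Scaling by sequencing depth in this way lets profiles from samples with different numbers of aligned reads be placed on a common axis.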
- FIGS. 13A-13B Reads mapping to 5-hmC and control spike.
- A Sequences of the 5-hmC spike and control spike.
- B Equal amounts of the two spikes were added into mouse genomic DNA. After 5-hmC labeling, enrichment and deep sequencing, reads mapping to the 5-hmC spike and the control spike are shown. A total of 131 reads mapped to the 5-hmC spike and 5 reads mapped to the negative control, indicating that enrichment for 5-hmC was successful.
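- Because the two spikes were added in equal amounts, the ratio of mapped reads gives a rough estimate of enrichment. The sketch below computes that ratio from the read counts reported for FIG. 13B; the function name is hypothetical, and a simple ratio is an assumed summary statistic rather than one stated in the text.

```python
def fold_enrichment(target_reads, control_reads):
    """Ratio of reads mapped to the labeled spike versus the control spike."""
    if control_reads <= 0:
        raise ValueError("control_reads must be positive to form a ratio")
    return target_reads / control_reads


# Read counts reported for the 5-hmC spike and the negative-control spike.
ratio = fold_enrichment(131, 5)
```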
- FIG. 15 Percentages of sequencing reads from MBD-Seq, MeDIP-Seq and 5-hmC-Seq mapped to RepeatMasker (Rmsk) and RefSeq. MeDIP-Seq, MBD-Seq, and 5-hmC reads were aligned to the NCBI37/mm9 assembly using identical parameters and identified as RepeatMasker (Rmsk) or RefSeq if overlapping ≥1 bp of a particular annotation. The fraction of total reads corresponding to each was then determined. The expected fraction of reads based on the fraction of genomic sequence corresponding to either Rmsk or RefSeq was also plotted for comparison.
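- The read classification described for FIG. 15 can be sketched as an interval-overlap test. The snippet below assumes a minimum overlap of 1 bp, half-open (chromosome, start, end) tuples, and a simple all-against-all comparison; these representations and all names are illustrative assumptions, and a real analysis would likely use an indexed interval structure.

```python
def overlap_bp(read, feature):
    """Number of overlapping bases between two (chrom, start, end) intervals."""
    r_chrom, r_start, r_end = read
    f_chrom, f_start, f_end = feature
    if r_chrom != f_chrom:
        return 0
    return max(0, min(r_end, f_end) - max(r_start, f_start))


def fraction_overlapping(reads, features, min_overlap=1):
    """Fraction of reads overlapping any feature by at least min_overlap bp."""
    if not reads:
        return 0.0
    hits = sum(
        1
        for read in reads
        if any(overlap_bp(read, feature) >= min_overlap for feature in features)
    )
    return hits / len(reads)


# Hypothetical reads and a single hypothetical annotation.
reads = [("chr1", 0, 36), ("chr1", 100, 136), ("chr2", 0, 36)]
annotation = [("chr1", 30, 50)]
fraction = fraction_overlapping(reads, annotation)
```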
- FIG. 16 Examples of intragenic enrichment of 5-hmC at genes that have been linked to ataxia and disorders of Purkinje cell degeneration in mouse and human.
- Top panel shows Ataxin 1 while bottom panel shows RORa, with pink representing female and blue representing male.
- FIG. 17 Genomic DNA was extracted from HeLa cells and subsequently sonicated into 100-500 bp fragments. These fragments were divided into two groups, and either azide-glucose or regular glucose (control group) was added to potential 5-hmC sites using β-glucosyltransferase, followed by biotinylation of the fragments using click chemistry. Only the azide-glucose group is biotinylated; the control group is not. Both groups were then applied to a monomeric avidin column. After elution, UV measurement showed that only the azide-glucose group yielded pulled-down DNA; the control group did not.
- FIG. 18 The βGT-catalyzed formation of N3-5-gmC and the subsequent click chemistry to yield biotin-N3-5-gmC on the TCGA site in duplex DNA. Modification on only one strand is shown.
- FIGS. 19A-19B TaqαI-mediated digestion of 5-hmC-, N3-5-gmC-, and biotin-N3-5-gmC-containing DNA with the sequences shown on top. *C indicates the modified position; arrows indicate TaqαI cutting sites.
- A Digestion of fully-modified DNA.
- B Digestion of hemi-modified DNA.
- the 32-mer dsDNA (1 pmol) was digested with 100 U of TaqαI (New England BioLabs) for 1 hr at 65° C. Samples were analyzed by 16% PAGE/Urea gel and visualized using SYBR Green I staining (Lumiprobe).
- FIGS. 20A-20B MspI digestion of 5-hmC-, N3-5-gmC-, and biotin-N3-5-gmC-containing DNA with the sequences shown on top. *C indicates the modified position; arrows indicate MspI cutting sites.
- A Digestion of fully-modified DNA.
- B Digestion of hemi-modified DNA.
- the 32-mer dsDNA (1 pmol) was digested with 100 U of MspI (New England BioLabs) for 1 hr at 37° C. Samples were analyzed by 16% PAGE/Urea gel and visualized using SYBR Green I staining (Lumiprobe).
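- The logic of these digestion assays can be illustrated with a small in silico sketch: a restriction site is cut unless it contains a modified cytosine that blocks the enzyme. The recognition sequences used (MspI, C^CGG; TaqαI, T^CGA) are the standard ones; the single-strand representation, the all-or-nothing blocking rule, the function name, and the test sequence are simplifying assumptions for illustration only.

```python
# Recognition sequence and cut offset (bases from the start of the site).
SITES = {"MspI": ("CCGG", 1), "TaqαI": ("TCGA", 1)}


def digest(seq, enzyme, blocked_positions=frozenset()):
    """Return fragments after cutting at every site not containing a blocked base.

    blocked_positions holds 0-based indices of modified cytosines (for example
    5-gmC) that are assumed to prevent cleavage of any site that includes them.
    """
    seq = seq.upper()
    site, cut_offset = SITES[enzyme]
    cuts = []
    start = seq.find(site)
    while start != -1:
        site_range = range(start, start + len(site))
        if not any(pos in blocked_positions for pos in site_range):
            cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical 10-mer with one MspI site; index 4 is the inner cytosine.
unmodified = digest("AAACCGGTTT", "MspI")
modified = digest("AAACCGGTTT", "MspI", blocked_positions={4})
```

Comparing the fragment lists for modified and unmodified templates mirrors the gel comparison in the figures: a protected site yields the full-length product.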
- FIGS. 21A-21C Development of a cleavable biotin-containing capture agent with a disulfide linker as the click reaction partner to form biotin-S-S-N3-5-gmC.
- Certain embodiments are directed to methods and compositions for modifying 5-hmC, detecting 5-hmC, and/or evaluating 5-hmC in nucleic acids.
- 5-hmC is glycosylated.
- 5-hmC is coupled to a labeled or modified glucose moiety. Using the methods described herein a large variety of detectable groups (biotin, fluorescent tag, radioactive groups, etc.) can be coupled to 5-hmC via a glucose modification.
- Modification of 5-hmC can be performed using the enzyme β-glucosyltransferase (βGT), or a similar enzyme, that catalyzes the transfer of a glucose moiety from uridine diphosphoglucose (UDP-Glu) to the hydroxyl group of 5-hmC, yielding β-glycosyl-5-hydroxymethyl-cytosine (5-gmC).
- a glucose molecule chemically modified to contain an azide (N3) group may be covalently attached to 5-hmC through this enzyme-catalyzed glycosylation.
- phosphine-activated reagents including but not limited to biotin-phosphine, fluorophore-phosphine, and NHS-phosphine, or other affinity tags can be specifically installed onto glycosylated 5-hmC via reactions with the azide.
- Chemical tagging can be used to determine the precise locations of 5-hmC in a high throughput manner.
- the inventors have shown that the 5-gmC modification renders the labeled DNA resistant to restriction enzyme digestion and/or polymerization.
- glycosylated and unmodified genomic DNA may be treated with restriction enzymes and subsequently subjected to various sequencing methods to reveal the precise locations of each cytosine modification that hampers the digestion.
- a functional group e.g., an azide group
- This incorporation of a functional group allows further labeling or tagging cytosine residues with biotin and other tags.
- the labeling or tagging of 5-hmC can use, for example, click chemistry or other functional/coupling groups known to those skilled in the art.
- the labeled or tagged DNA fragments containing 5-hmC can be isolated and/or evaluated using modified versions of methods currently used to evaluate 5-mC-containing nucleic acids.
- compositions of the invention may be used to introduce a sterically bulky group to 5-hmC.
- the presence of a bulky group on the DNA template strand will interfere with the synthesis of a nucleic acid strand by DNA polymerase or RNA polymerase, or the efficient cleavage of DNA by a restriction endonuclease or inhibition of other enzymatic modifications of nucleic acid containing 5-hmC.
- primer extensions or other assays can be employed, for example, to evaluate a partially extended primer of certain length and the modification sites can be revealed by sequencing the partially extended primers.
- Other approaches taking advantage of this chemical tagging method are also contemplated.
- differential modification of nucleic acid between two or more samples can be evaluated.
- Studies including heart, liver, lungs, kidney, muscle, testes, spleen, and brain indicate that under normal conditions 5-hmC is predominately in normal brain cells. Additional studies have shown that 5-hmC is also present in mouse embryonic stem cells.
- the Ten-eleven translocation 1 (TET1) protein has been identified as the catalyst for converting 5-mC to 5-hmC. Studies have shown that TET1 expression is inversely correlated to 5-mC expression. Overexpression of TET1 in cells seems to correlate with increased expression of 5-hmC. Also, TET1 is known to be involved in pediatric and adult acute myeloid leukemia and acute lymphoblastic leukemia. Thus, evaluating and comparing 5-hmC levels can be used in evaluating various disease states and comparing various nucleic acid samples.
- Certain embodiments are directed to methods and compositions for modifying eukaryotic nucleic acids containing 5-hmC.
- a target nucleic acid is contacted with a β-glucosyltransferase enzyme and a UDP substrate comprising a modified or modifiable glucose moiety.
- a glucosyl-DNA beta-glucosyltransferase (EC 2.4.1.28, β-glycosyltransferase (βGT)) is an enzyme that catalyzes the chemical reaction in which a beta-D-glucosyl residue is transferred from UDP-glucose to a glucosylhydroxymethylcytosine residue in a nucleic acid.
- This enzyme resembles DNA beta-glucosyltransferase in that respect.
- This enzyme belongs to the family of glycosyltransferases, specifically the hexosyltransferases. The systematic name of this enzyme class is UDP-glucose:D-glucosyl-DNA beta-D-glucosyltransferase.
- T6-glucosyl-HMC-beta-glucosyl transferase T6-beta-glucosyl transferase
- uridine diphosphoglucose-glucosyldeoxyribonucleate T6-beta-glucosyl transferase
- beta-glucosyltransferase T6-glucosyl-HMC-beta-glucosyl transferase
- the β-glucosyltransferase is a His-tag fusion protein having the amino acid sequence (βGT begins at amino acid 25(mer)):
- the protein may be used without the His-tag (hexa-histidine tag shown above) portion.
- βGT was cloned into the target vector pMCSG19 by the Ligation Independent Cloning (LIC) method according to Donnelly et al. (2006).
- the resulting plasmid was transformed into BL21 star (DE3) competent cells containing pRK1037 (Science Reagents, Inc.) by heat shock. Positive colonies were selected with 150 μg/ml Ampicillin and 30 μg/ml Kanamycin.
- One liter of cells was grown at 37° C. from a 1:100 dilution of an overnight culture. The cells were induced with 1 mM IPTG when the OD600 reached 0.6-0.8.
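- The volumes implied by this step follow from the dilution relation C1·V1 = C2·V2. The sketch below works through the one-liter culture; the 1 M IPTG stock concentration is an assumption (the text gives only the final 1 mM), the 1:100 dilution is read as one part in one hundred total, and the function name is hypothetical.

```python
def stock_volume_ml(final_conc, final_volume_ml, stock_conc):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final.

    final_conc and stock_conc must be given in the same units.
    """
    if stock_conc <= 0:
        raise ValueError("stock_conc must be positive")
    return final_conc * final_volume_ml / stock_conc


culture_ml = 1000.0  # one liter of culture, as stated in the text

# 1:100 dilution of the overnight culture (one part per hundred total).
inoculum_ml = culture_ml / 100.0

# 1 mM final IPTG from an assumed 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(final_conc=1.0, final_volume_ml=culture_ml, stock_conc=1000.0)
```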
- Ni-NTA buffer A (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 30 mM imidazole, and 10 mM β-ME) with protease inhibitor PMSF.
- Ni-NTA buffer B (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 400 mM imidazole, and 10 mM β-ME).
- βGT-containing fractions were further purified by MonoS (Buffer A: 10 mM Tris-HCl pH 7.5; Buffer B: 10 mM Tris-HCl pH 7.5, and 1 M NaCl) to remove DNA. Finally, the collected protein fractions were loaded onto a Superdex 200 (GE) gel-filtration column equilibrated with 50 mM Tris-HCl pH 7.5, 20 mM MgCl2, and 10 mM β-ME. SDS-PAGE gel revealed a high degree of purity of βGT. βGT was concentrated to 45 μM and stored frozen at −80° C. with an addition of 30% glycerol.
- Protein purification is a series of processes intended to isolate a single type of protein from a complex mixture. Protein purification is vital for the characterization of the function, structure and interactions of the protein of interest.
- the starting material is usually a biological tissue or a microbial culture.
- the various steps in the purification process may free the protein from a matrix that confines it, separate the protein and non-protein parts of the mixture, and finally separate the desired protein from all other proteins. Separation of one protein from all others is typically the most laborious aspect of protein purification. Separation steps exploit differences in protein size, physico-chemical properties and binding affinity.
- the amount of the specific protein has to be compared to the amount of total protein.
- the latter can be determined by the Bradford total protein assay or by absorbance of light at 280 nm; however, some reagents used during the purification process may interfere with the quantification.
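- As a minimal sketch of the two quantities discussed here, total protein can be estimated from absorbance at 280 nm with the Beer-Lambert law (A = ε·c·l), and purity can be expressed as the specific protein divided by total protein. The extinction coefficient and all numeric values below are hypothetical placeholders; a real coefficient depends on the protein's sequence.

```python
def concentration_molar(a280, ext_coeff_per_m_cm, path_cm=1.0):
    """Beer-Lambert estimate: c = A / (epsilon * l), in mol/L."""
    if ext_coeff_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff_per_m_cm * path_cm)


def purity_percent(specific_protein_mg, total_protein_mg):
    """Specific protein as a percentage of total protein (w/w)."""
    if total_protein_mg <= 0:
        raise ValueError("total_protein_mg must be positive")
    return 100.0 * specific_protein_mg / total_protein_mg


# Hypothetical readings: A280 of 0.5 with an assumed epsilon of 50,000 M^-1 cm^-1.
conc = concentration_molar(0.5, 50_000)
# Hypothetical masses: 9 mg of the protein of interest in 10 mg total protein.
purity = purity_percent(9.0, 10.0)
```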
- imidazole commonly used for purification of polyhistidine-tagged recombinant proteins
- BCA bicinchoninic acid
- SPR Surface Plasmon Resonance
- SPR can detect binding of label-free molecules on the surface of a chip. If the desired protein is an antibody, binding can be translated directly to the activity of the protein. One can express the active concentration of the protein as a percentage of the total protein. SPR can be a powerful method for quickly determining protein activity and overall yield, although it requires an instrument to perform.
- the protein has to be brought into solution by breaking the tissue or cells containing it.
- soluble proteins will be in the solvent, and can be separated from cell membranes, DNA etc. by centrifugation.
- the extraction process also extracts proteases, which will start digesting the proteins in the solution. If the protein is sensitive to proteolysis, it is usually desirable to proceed quickly, and keep the extract cooled, to slow down proteolysis.
- a common first step to isolate proteins is precipitation with ammonium sulfate ((NH4)2SO4). This is performed by adding increasing amounts of ammonium sulfate and collecting the different fractions of precipitated protein.
- the first proteins to be purified are water-soluble proteins. Purification of integral membrane proteins requires disruption of the cell membrane in order to isolate any one particular protein from others that are in the same membrane compartment. Sometimes a particular membrane fraction can be isolated first, such as isolating mitochondria from cells before purifying a protein located in a mitochondrial membrane.
- a detergent such as sodium dodecyl sulfate (SDS) can be used to dissolve cell membranes and keep membrane proteins in solution during purification; however, because SDS causes denaturation, milder detergents such as TRITON X-100 (poly(oxy-1,2-ethanediyl), α-(4-(1,1,3,3-tetramethylbutyl)phenyl)-ω-hydroxy) or CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate) can be used to retain the protein's native conformation during complete purification.
- Centrifugation is a process that uses centrifugal force to separate mixtures of particles of varying masses or densities suspended in a liquid.
- a vessel typically a tube or bottle
- a mixture of proteins or other particulate matter such as bacterial cells
- the angular momentum yields an outward force to each particle that is proportional to its mass.
- the tendency of a given particle to move through the liquid because of this force is offset by the resistance the liquid exerts on the particle.
- the net effect of “spinning” the sample in a centrifuge is that massive, small, and dense particles move outward faster than less massive particles or particles with more “drag” in the liquid.
- When suspensions of particles are “spun” in a centrifuge, a “pellet” may form at the bottom of the vessel that is enriched for the most massive particles with low drag in the liquid. Non-compacted particles still remaining mostly in the liquid are called the “supernatant” and can be removed from the vessel to separate the supernatant from the pellet.
- the rate of centrifugation is specified by the acceleration applied to the sample, typically expressed as a multiple of g, the standard acceleration due to gravity. If samples are centrifuged long enough, the particles in the vessel will reach equilibrium wherein the particles accumulate specifically at a point in the vessel where their buoyant density is balanced with centrifugal force. Such an “equilibrium” centrifugation can allow extensive purification of a given particle.
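As an illustration of expressing centrifugation as a multiple of g, the following minimal sketch converts rotor speed to relative centrifugal force (RCF) using the standard relation a = ω²r. The function names and the example rotor radius are assumptions chosen for illustration and do not come from the source.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def relative_centrifugal_force(rpm: float, radius_cm: float) -> float:
    """Return the centrifugal acceleration at the given radius as a multiple of g."""
    omega = 2.0 * math.pi * rpm / 60.0   # angular velocity in rad/s
    radius_m = radius_cm / 100.0         # rotor radius in metres
    return omega ** 2 * radius_m / STANDARD_GRAVITY


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) needed to reach a target multiple of g."""
    radius_m = radius_cm / 100.0
    omega = math.sqrt(rcf * STANDARD_GRAVITY / radius_m)
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical rotor: 10 cm radius spun at 10,000 rpm.
    print(round(relative_centrifugal_force(10_000, 10.0)))
```

Because the force depends on the radius, the same rpm gives different g values on different rotors, which is why protocols are usually reported in multiples of g rather than rpm.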
- In sucrose gradient centrifugation, a linear concentration gradient of sugar (typically sucrose, glycerol, or a silica-based density gradient medium such as Percoll™) is generated in a tube such that the highest concentration is at the bottom and the lowest at the top.
- a protein sample is then layered on top of the gradient and spun at high speeds in an ultracentrifuge. This causes heavy macromolecules to migrate towards the bottom of the tube faster than lighter material. After separating the protein/particles, the gradient is then fractionated and collected.
- a protein purification protocol contains one or more chromatographic steps.
- the basic procedure in chromatography is to flow the solution containing the protein through a column packed with various materials. Different proteins interact differently with the column material, and can thus be separated by the time required to pass the column, or the conditions required to elute the protein from the column. Usually proteins are detected as they are coming off the column by their absorbance at 280 nm. Many different chromatographic methods exist:
- Chromatography can be used to separate proteins in solution or under denaturing conditions by using porous gels. This technique is known as size exclusion chromatography. The principle is that smaller molecules have to traverse a larger volume in a porous matrix. Consequently, proteins of a certain range in size will require a variable volume of eluent (solvent) before being collected at the other end of the column of gel.
- the eluent is usually collected in separate test tubes. All test tubes containing no measurable trace of the protein to be purified are discarded. The remaining solution thus consists of the protein to be purified and any other similarly sized proteins.
- Ion exchange chromatography separates compounds according to the nature and degree of their ionic charge.
- the column to be used is selected according to its type and strength of charge.
- Anion exchange resins have a positive charge and are used to retain and separate negatively charged compounds, while cation exchange resins have a negative charge and are used to separate positively charged molecules.
- a buffer is pumped through the column to equilibrate the opposing charged ions.
- solute molecules will exchange with the buffer ions as each competes for the binding sites on the resin.
- the length of retention for each solute depends upon the strength of its charge. The most weakly charged compounds will elute first, followed by those with successively stronger charges. Because of the nature of the separating mechanism, pH, buffer type, buffer concentration, and temperature all play important roles in controlling the separation.
- Affinity Chromatography is a separation technique based upon molecular conformation, which frequently utilizes application specific resins. These resins have ligands attached to their surfaces which are specific for the compounds to be separated. Most frequently, these ligands function in a fashion similar to that of antibody-antigen interactions. This “lock and key” fit between the ligand and its target compound makes it highly specific, frequently generating a single peak, while all else in the sample is unretained.
- membrane proteins are glycoproteins and can be purified by lectin affinity chromatography.
- Detergent-solubilized proteins can be allowed to bind to a chromatography resin that has been modified to have a covalently attached lectin. Proteins that do not bind to the lectin are washed away and then specifically bound glycoproteins can be eluted by adding a high concentration of a sugar that competes with the bound glycoproteins at the lectin binding site.
- Some lectins have high affinity binding to oligosaccharides of glycoproteins that is hard to compete with sugars, and bound glycoproteins need to be released by denaturing the lectin.
- a common technique involves engineering a sequence of 6 to 8 histidines into the N- or C-terminus of the protein.
- the polyhistidine binds strongly to divalent metal ions such as nickel and cobalt.
- the protein can be passed through a column containing immobilized nickel ions, which binds the polyhistidine tag. All untagged proteins pass through the column.
- the protein can be eluted with imidazole, which competes with the polyhistidine tag for binding to the column, or by a decrease in pH (typically to 4.5), which decreases the affinity of the tag for the resin. While this procedure is generally used for the purification of recombinant proteins with an engineered affinity tag (such as a 6×His tag or Clontech's HAT tag), it can also be used for natural proteins with an inherent affinity for divalent cations.
- Immunoaffinity chromatography uses the specific binding of an antibody to the target protein to selectively purify the protein.
- the procedure involves immobilizing an antibody to a column material, which then selectively binds the protein, while everything else flows through.
- the protein can be eluted by changing the pH or the salinity. Because this method does not involve engineering in a tag, it can be used for proteins from natural sources.
- Another way to tag proteins is to engineer an antigen peptide tag onto the protein, and then purify the protein on a column or by incubating with a loose resin that is coated with an immobilized antibody. This particular procedure is known as immunoprecipitation. Immunoprecipitation is quite capable of generating an extremely specific interaction which usually results in binding only the desired protein. The purified tagged proteins can then easily be separated from the other proteins in solution and later eluted back into clean solution. Tags can be cleaved by use of a protease. This often involves engineering a protease cleavage site between the tag and the protein.
- High performance liquid chromatography or high pressure liquid chromatography is a form of chromatography applying high pressure to drive the solutes through the column faster. This means that the diffusion is limited and the resolution is improved.
- the most common form is “reversed phase” HPLC, where the column material is hydrophobic.
- the proteins are eluted by a gradient of increasing amounts of an organic solvent, such as acetonitrile. The proteins elute according to their hydrophobicity. After purification by HPLC the protein is in a solution that only contains volatile compounds, and can easily be lyophilized. HPLC purification frequently results in denaturation of the purified proteins and is thus not applicable to proteins that do not spontaneously refold.
- At the end of a protein purification, the protein often has to be concentrated. Different methods exist. If the solution does not contain any soluble component other than the protein in question, the protein can be lyophilized (dried). This is commonly done after an HPLC run. This simply removes all volatile components, leaving the proteins behind.
- Ultrafiltration concentrates a protein solution using selective permeable membranes.
- the function of the membrane is to let the water and small molecules pass through while retaining the protein.
- the solution is forced against the membrane by mechanical pump or gas pressure or centrifugation.
- Gel electrophoresis is a common laboratory technique that can be used both as preparative and analytical method.
- the principle of electrophoresis relies on the movement of a charged ion in an electric field.
- the proteins are denatured in a solution containing a detergent (SDS).
- the proteins are unfolded and coated with negatively charged detergent molecules.
- the proteins in SDS-PAGE are separated on the sole basis of their size.
- the proteins migrate as bands based on size. Each band can be detected using stains such as Coomassie blue dye or silver stain.
- Preparative methods to purify large amounts of protein require the extraction of the protein from the electrophoretic gel. This extraction may involve excision of the gel containing a band, or eluting the band directly off the gel as it runs off the end of the gel.
- denaturing-condition electrophoresis provides improved resolution over size exclusion chromatography, but does not scale to large quantities of protein in a sample as well as chromatography columns do.
- a functionalized or labeled glucose molecule can be used in conjunction with βGT to modify 5-hmC in a nucleic polymer such as DNA or RNA.
- the βGT UDP substrate comprises a functionalized or labeled glucose moiety.
- the glucose moiety can be modified or functionalized using click chemistry or other coupling chemistries known in the art. Click chemistry is a chemical philosophy introduced by K. Barry Sharpless in 2001 (Kolb et al., 2001; Evans, 2007) and describes chemistry tailored to generate substances quickly and reliably by joining small units.
- the label can be any label that is detected, or is capable of being detected.
- suitable labels include, e.g., a chromogenic label, a radiolabel, a fluorescent label, and a biotinylated label.
- the label can be, e.g., fluorescent glucose, biotin-labeled glucose, radiolabeled glucose and the like.
- the label is a chromogenic label.
- chromogenic label includes all agents that have a distinct color or otherwise detectable marker.
- other markers used include fluorescent groups, biotin tags, enzymes (that may be used in a reaction that results in the formation of a colored product), magnetic and isotopic markers, and so on. The foregoing list of detectable markers is for illustrative purposes only, and is in no way intended to be limiting or exhaustive.
- Labels include any detectable group attached to the glucose molecule, or detection agent that does not interfere with its function.
- Further labels that may be used include fluorescent labels, such as Fluorescein, TEXAS RED (sulforhodamine 101 acid chloride), Lucifer Yellow, Rhodamine, Nile-red (NILE BLUE oxazone), tetramethyl-rhodamine-5-isothiocyanate, 1,6-diphenyl-1,3,5-hexatriene, cis-Parinaric acid, Phycoerythrin, Allophycocyanin, 4′,6-diamidino-2-phenylindole (DAPI), HOECHST 33258 (2′-(4-hydroxyphenyl)-5-(4-methyl-1-piperazinyl)-2,5′-bi-1H-benzimidazole trihydrochloride hydrate), 2-aminobenzamide, and the like.
- a fluorophore contains or is a functional group that will absorb energy of a specific wavelength and re-emit energy at a different (but equally specific) wavelength. The amount and wavelength of the emitted energy depend on both the fluorophore and the chemical environment of the fluorophore.
- Fluorophores can be attached to protein using functional groups and/or linkers, such as amino groups (active ester, carboxylate, isothiocyanate, hydrazine); carboxyl groups (carbodiimide); thiol (maleimide, acetyl bromide); azide (via click chemistry); or non-specifically (glutaraldehyde).
- Fluorophores can be proteins, quantum dots (fluorescent semiconductor nanoparticles), or small molecules. Common dye families include, but are not limited to Xanthene derivatives: fluorescein, rhodamine, OREGON GREEN (2′,7′-difluorofluorescein), eosin, TEXAS RED, etc.; Cyanine derivatives: cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine and merocyanine; Naphthalene derivatives (dansyl and prodan derivatives); Coumarin derivatives; oxadiazole derivatives: pyridyloxazole, nitrobenzoxadiazole and benzoxadiazole; Pyrene derivatives: cascade blue etc.; BODIPY (Invitrogen); Oxazine derivatives: Nile red, Nile blue, cresyl violet, oxazine 170 etc.; Acridine derivatives: proflavin
- fluorophores include: Hydroxycoumarin; Aminocoumarin; Methoxycoumarin; CASCADE BLUE ([4-[(4-diethylaminophenyl)-(4-ethylaminonaphthalen-2-yl)methylidene]-1-cyclohexa-2,5-dienylidene]-diethyl-azanium); Pacific Blue; Pacific Orange; Lucifer yellow; NBD (nitrobenzoxadiazole); R-Phycoerythrin (PE); PE-Cy5 conjugates; PE-Cy7 conjugates; Red 613 (PE-TEXAS RED; or conjugate of TEXAS RED with R-phycoerythrin); PerCP (Peridinin chlorophyll); TruRed (PerCP-Cy5.5 conjugate); FluorX; Fluorescein; BODIPY-FL (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene conjugate substitute for fluorescein);
- ALEXA FLUOR dyes include: ALEXA FLUOR 350, ALEXA FLUOR 405, ALEXA FLUOR 430, ALEXA FLUOR 488, ALEXA FLUOR 500, ALEXA FLUOR 514, ALEXA FLUOR 532, ALEXA FLUOR 546, ALEXA FLUOR 555, ALEXA FLUOR 568, ALEXA FLUOR 594, ALEXA FLUOR 610, ALEXA FLUOR 633, ALEXA FLUOR 647, ALEXA FLUOR 660, ALEXA FLUOR 680, ALEXA FLUOR 700, ALEXA FLUOR 750, and ALEXA FLUOR 790.
- Cy Dyes include Cy2, Cy3, Cy3B, Cy3.5, Cy5, Cy5.5 and Cy7.
- Nucleic acid probes include HOECHST 33342 (2′-(4-ethoxyphenyl)-5-(4-methyl-1-piperazinyl)-2,5-bi(1H-benzimidazole), DAPI, HOECHST 33258, SYTOX Blue, Chromomycin A3, Mithramycin, YOYO-1 (2-([1-(3-[[3-(dimethyl(3-[4-[(E)-(3-methyl-1,3-benzoxazol-3-ium-2-yl)methylidene]-1(4H)-quinolinyl]propyl)ammonio)propyl]diethyl)ammonio]propyl)-4(1H)-quinolinylidene]methyl)-3-methyl-1,3-benzoxazol-3-ium tetraiodide), Ethidium Bromide, Acridine Orange, SYTOX Green, TOTO-1 (thiazole orange dye),
- Cell function probes include Indo-1, Fluo-3, DCFH, DHR, SNARF.
- Fluorescent proteins include Y66H, Y66F, EBFP, EBFP2, Azurite, GFPuv, T-Sapphire, Cerulean, mCFP, ECFP, CyPet, Y66W, mKeima-Red, TagCFP, AmCyan1, mTFP1, S65A, Midoriishi Cyan, Wild Type GFP, S65C, TurboGFP, TagGFP, S65L, Emerald, S65T (Invitrogen), EGFP (Clontech), Azami Green (MBL), ZsGreen1 (Clontech), TagYFP (Evrogen), EYFP (Clontech), Topaz, Venus, mCitrine, YPet, TurboYFP, ZsYellow1 (Clontech), Kusabira Orange (MBL), mOrange, mKO, TurboRFP (Evrogen), tdTomato, TagRFP (Evrogen), DsRed (Clon
- the Huisgen 1,3-dipolar cycloaddition, in particular the Cu(I)-catalyzed stepwise variant, is often referred to simply as the “click reaction”.
- the Cu(I)-catalyzed variant (Tornoe et al., 2002) was first reported by Morten Meldal and co-workers from Carlsberg Laboratory, Denmark for the synthesis of peptidotriazoles on solid support. Fokin and Sharpless independently described it as a reliable catalytic process offering “an unprecedented level of selectivity, reliability, and scope for those organic synthesis endeavors which depend on the creation of covalent links between diverse building blocks”, firmly placing it among the most reliable processes fitting the click criteria.
- Copper catalyzed click reactions work essentially on terminal alkynes.
- the Cu species undergo metal insertion reaction into the terminal alkynes.
- Commonly used solvents are polar aprotic solvents such as THF, DMSO, CH3CN, and DMF, as well as non-polar aprotic solvents such as toluene. Neat solvents or a mixture of solvents may be used.
- Click chemistry has widespread applications. Some of them are: preparative organic synthesis of 1,4-substituted triazoles; modification of peptide function with triazoles; modification of natural products and pharmaceuticals; drug discovery; macrocyclizations using Cu(I) catalyzed triazole couplings; modification of DNA and nucleotides by triazole ligation; supramolecular chemistry: calixarenes, rotaxanes, and catenanes; dendrimer design; carbohydrate clusters and carbohydrate conjugation by Cu(I) catalyzed triazole ligation reactions; polymers; material science; and nanotechnology (Moses and Moorhouse, 2007; Hein et al., 2008, each of which is incorporated herein by reference).
- the functional group installed on 5-gmC (a thiol or an azide) can be readily labeled with commercially available biotin-linked maleimide or biotin-linked alkyne (click chemistry), respectively.
- the reaction of thiol with maleimide is highly efficient; however, this labeling reaction cannot tolerate proteins or small molecules that bear thiol groups.
- genomic DNA must be isolated from other cellular components prior to the labeling, which can be readily achieved.
- the azide labeling with commercially available biotin-linked alkyne is completely bio-orthogonal, thus genomic DNA with bound proteins can be directly used.
- the biotin-labeled DNA fragments may be pulled down with streptavidin and submitted for high-throughput sequencing in order to map out the global distribution and the locations of 5-hmC in chromosomes. This will reveal a distribution map of 5-hmC in genomic DNA at different developmental stages of a particular cell or cell line.
- An alternative strategy that does not rely on converting 5-hmC:G base pair to a different base pair is to tether a photosensitizer to 5-gmC using approaches indicated in FIG. 1 .
- Photosensitized one-electron oxidation can lead to site-specific oxidation of the modified 5-gmC or the nearby guanines (Tanabe et al., 2007; Meyer et al., 2003).
- Subsequent base (piperidine) treatment will lead to specific strand cleavage on the oxidized site ( FIG. 4B ) (Tanabe et al., 2007; Meyer et al., 2003).
- genomic DNA containing 5-gmC labeled with photosensitizer can be subjected to photo-oxidation and base treatment. DNA fragments will be generated with oxidation sites at the end. High-throughput sequencing will reveal these modification sites.
- a sterically bulky group such as polyethyleneglycol (PEG), a dendrimer, or a protein such as streptavidin can be introduced to the thiol- or azide-modified 5-gmC.
- Although 5-gmC in duplex DNA does not interfere with the polymerization reaction catalyzed by various polymerases, the presence of an additional bulky group on 5-gmC on the DNA template strand can interfere with the synthesis of the new strand by DNA polymerase.
- primer extension will lead to a partially extended primer of certain length.
- the modification sites can be revealed by sequencing the partially extended primers. This method can be very versatile. It can be used to determine the modification sites for a given promoter site of interest.
- a high-throughput format can be developed as well.
- DNA fragments containing multiple 5-hmC can be affinity purified and random or designed primers can be used to perform primer extension experiments on these DNA fragments.
- Partially extended primers can be collected and subjected to high-throughput sequencing using a similar protocol as described in the restriction enzyme digestion method.
- a bulky modification may stop the polymerization reaction a few bases ahead of the modification site. Still, this method will map the modification sites to a resolution of a few bases. Considering that most 5-hmC exists in a CpG sequence, the resolution can be adequate for most applications. With a bulky substitution on 5-gmC, digestion of modified DNA by restriction enzymes could be blocked for the restriction enzyme digestion-based assay.
- Nucleic acid analysis and evaluation includes various methods of amplifying, fragmenting, and/or hybridizing nucleic acids that have or have not been modified.
- Methodologies are available for large scale sequence analysis.
- the methods described exploit these genomic analysis methodologies and adapt them for uses incorporating the methodologies described herein.
- the methods can be used to perform high resolution hydroxymethylation analysis on several thousand CpGs in genomic DNA.
- methods are directed to analysis of the hydroxymethylation status of a genomic DNA sample, comprising one or more of the steps: (a) fragmenting the sample and enriching the sample for sequences comprising CpG islands, (b) generating a single stranded DNA library, (c) subjecting the sample to bisulfite treatment, (d) amplifying individual members of the single stranded DNA library by means of PCR, e.g., emulsion PCR, and (e) sequencing the amplified single stranded DNA library.
- the present methods allow for analyzing the hydroxymethylation status of all regions of a complete genome, where changes in hydroxymethylation status are expected to have an influence on gene expression. Due to the combination of bisulfite treatment, amplification and high throughput sequencing, it is possible to analyze the hydroxymethylation status of at least 1000 and preferably 5000 CpG islands in parallel.
- CpG island refers to regions of DNA with a high G/C content and a high frequency of CpG dinucleotides relative to the whole genome of an organism of interest. Also used interchangeably in the art is the term “CG island.”
- the ‘p’ in “CpG island” refers to the phosphodiester bond between the cytosine and guanine nucleotides.
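The two properties named in the definition above (G/C content and CpG dinucleotide frequency) can be computed for any candidate region. The sketch below is illustrative only: the observed/expected CpG ratio is a commonly used normalization that the source does not specify, and the function name `cpg_stats` is hypothetical. No threshold for calling an island is assumed, because the source gives none.

```python
def cpg_stats(seq: str) -> dict:
    """Summarize G/C content and CpG dinucleotide frequency for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides read 5' to 3'
    # Expected CpG count if C and G were placed independently (assumed normalization).
    expected = c_count * g_count / n
    return {
        "gc_content": (c_count + g_count) / n,
        "cpg_count": cpg_count,
        "obs_exp": cpg_count / expected if expected else 0.0,
    }


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(cpg_stats("ATATATAT"))
```

Comparing these values for a window against the same values for the whole genome reflects the "relative to the whole genome" wording of the definition.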
- DNA may be isolated from an organism of interest, including, but not limited to eukaryotic organisms and prokaryotic organisms, preferably mammalian organisms, such as humans.
- the step of enriching a sample for sequences comprising CpG islands can be done in different ways.
- One technique for enrichment is immunoprecipitation of methylated DNA using a methyl-Cytosine specific antibody (Weber et al., 2005).
- an enrichment step can comprise digesting the sample with one or more restriction enzymes that more frequently cut regions of DNA comprising no CpG islands and less frequently cut regions comprising CpG islands, and isolating DNA fragments within a specific size range.
- the inventors have demonstrated that while the methylation-insensitive restriction enzyme MspI can completely cut C(5-meC)GG and partially cut C(5-hmC)GG, its activity is completely blocked by C(5-gmC)GG. This indicates that the introduction of a glucose moiety can change the property of 5-hmC in duplex DNA. With bulkier groups on 5-hmC, digestions by other restriction enzymes that recognize DNA sequences containing CpG can be blocked. Since 5-gmC can block restriction enzyme digestion, the genomic DNA modified with 5-gmC can be treated with and without restriction enzymes and subjected to known methods of mapping the genome-wide distribution and location of the 5-hmC modification.
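The MspI behavior described above can be summarized as a lookup from the state of the inner cytosine of each CCGG site to the expected cleavage outcome. The following is a minimal sketch: the labels, the dictionary, and the function name are hypothetical, and the entry for unmodified cytosine relies on the general fact that MspI cuts CCGG rather than on a statement in this passage.

```python
MSPI_SITE = "CCGG"

# Outcome by state of the inner cytosine, per the observations described above.
CLEAVAGE = {
    "C": "complete",     # unmodified site (general MspI behavior, assumed here)
    "5mC": "complete",
    "5hmC": "partial",
    "5gmC": "blocked",
}


def mspi_site_outcomes(seq: str, modifications: dict) -> dict:
    """Map the start index of each CCGG site to its expected cleavage outcome.

    `modifications` maps the 0-based index of a cytosine to one of the
    labels in CLEAVAGE; unlisted cytosines are treated as unmodified.
    """
    seq = seq.upper()
    outcomes = {}
    start = seq.find(MSPI_SITE)
    while start != -1:
        inner_c = start + 1  # the second C of CCGG carries the modification
        outcomes[start] = CLEAVAGE[modifications.get(inner_c, "C")]
        start = seq.find(MSPI_SITE, start + 1)
    return outcomes


if __name__ == "__main__":
    print(mspi_site_outcomes("AACCGGTTCCGGAA", {3: "5gmC"}))
```

Sites that are cut without glucosylation but protected after it are candidates for 5-hmC, which is the comparison the with-and-without digestion strategy relies on.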
- Such restriction enzymes can be selected by a person skilled in the art using conventional bioinformatics approaches.
- the selection of appropriate enzymes also has a substantial influence on the average size of fragments that ultimately will be generated and sequenced.
- the selection of appropriate enzymes may be designed in such a way that it promotes enrichment of a certain fragment length. Thus, the selection may be adjusted to the kind of sequencing method which is finally applied. For most sequencing methods, a fragment length between 100 and 1000 bp has been proven to be efficient. Therefore, in one embodiment, said fragment size range is from 100, 200 or 300 base pairs to 400, 500, 600, 700, 800, 900, or 1000 base pairs (bp), including all ranges and values there between.
- the human genome reference sequence (NCBI Build 36.1 from March 2006; assembled parts of chromosomes only) has a length of 3,142,044,949 bp and contains 26,567 annotated CpG islands (CpGs) for a total length of 21,073,737 bp (0.67%).
- a DNA sequence read hits a CpG if the read overlaps with the CpG by at least 50 bp.
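The hit criterion above is an interval-overlap test. A minimal sketch follows, assuming half-open, 0-based coordinates on the same chromosome; the coordinate convention and the function names are assumptions, while the 50 bp threshold is taken from the text.

```python
def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def read_hits_island(read: tuple, island: tuple, min_overlap: int = 50) -> bool:
    """Return True if the read overlaps the CpG island by at least min_overlap bp."""
    return overlap_bp(read[0], read[1], island[0], island[1]) >= min_overlap


if __name__ == "__main__":
    island = (300, 900)                              # hypothetical island coordinates
    print(read_hits_island((100, 350), island))      # 50 bp shared
    print(read_hits_island((100, 349), island))      # 49 bp shared
```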
- the following enzymes or their isoschizomers can be used for a method according to the present invention: MseI (TTAA), Tsp509 (AATT), AluI (AGCT), NlaIII (CATG), BfaI (CTAG), HpyCH4 (TGCA), Dpul (GATC), MboII (GAAGA), MlyI (GAGTC), BCCI (CCATC).
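An in-silico digestion with a few of the listed enzymes, followed by size selection, can be sketched as below. This is a simplified illustration rather than the method of the invention: it cuts at the first base of each recognition site, which is an assumption (real enzymes cut at enzyme-specific positions within or near the site), and it uses only five of the listed recognition sequences.

```python
import re

# Recognition sequences for a subset of the enzymes listed above.
SITES = {
    "MseI": "TTAA",
    "Tsp509": "AATT",
    "AluI": "AGCT",
    "NlaIII": "CATG",
    "BfaI": "CTAG",
}


def digest(seq: str, sites: dict) -> list:
    """Split seq at every occurrence of any recognition site.

    Simplification: the cut is placed at the first base of the site.
    """
    seq = seq.upper()
    cuts = set()
    for site in sites.values():
        # A lookahead finds overlapping occurrences as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start())
    cuts.discard(0)  # a cut at position 0 would create an empty fragment
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list, min_bp: int = 100, max_bp: int = 1000) -> list:
    """Keep fragments whose length falls within the inclusive size range."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


if __name__ == "__main__":
    fragments = digest("GGGGTTAACCCCAGCTGG", SITES)
    print(fragments)
    print(size_select(fragments, min_bp=5, max_bp=10))
```

Because the recognition sequences are A/T-rich or lack CpG, G/C-rich regions are cut less often and tend to survive as longer fragments, which is the basis of the enrichment described in the text.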
- Isoschizomers are pairs of restriction enzymes that are specific to the same recognition sequence and cut at the same location.
- Embodiments include a CG island enriched library produced from genomic DNA by digestion with several restriction enzymes that preferably cut within non-CG island regions.
- the restriction enzymes are selected in such a way that digestion can result in fragments with a size range between 300, 400, 500, 600 to 500, 600, 800, 900 bp or greater, including all ranges and values there between.
- the library fragments are ligated to adaptors.
- a conventional bisulfite treatment is performed according to methods that are well known in the art. As a result, unmethylated cytosine residues are converted to uracil residues, which are identified as “T” instead of “C” during base calling in a subsequent sequencing reaction when compared with a non-bisulfite-treated reference. Subsequent to bisulfite treatment, the sample is subjected to a conventional sequencing protocol.
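The read-out logic of bisulfite sequencing described above can be sketched as follows. The set of protected positions is supplied by the caller and is an assumption of this sketch: the text only states that unmethylated cytosines are converted, so which modified cytosines resist conversion is not modeled. The function names are hypothetical, and only a single strand is considered.

```python
def bisulfite_convert(seq: str, protected: frozenset = frozenset()) -> str:
    """Return the sequence as read after bisulfite treatment and amplification.

    Cytosines at indices in `protected` stay "C"; all other cytosines are
    converted (C -> U, read as T). Other bases are unchanged.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_protected_cytosines(reference: str, converted_read: str) -> list:
    """List reference cytosine positions that still read as C after treatment."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper()))
        if ref == "C" and obs == "C"
    ]


if __name__ == "__main__":
    reference = "ACGTCCGA"
    read = bisulfite_convert(reference, protected=frozenset({5}))
    print(read)
    print(call_protected_cytosines(reference, read))
```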
- the 454 Genome Sequencer System supports the sequencing of samples from a wide variety of starting materials including, but not limited to, eukaryotic or bacterial genomic DNA. Genomic DNAs are fractionated into small, 100- to 1000-bp fragments with an appropriate specific combination of restriction enzymes which enriches for CpG island containing fragments.
- the restriction enzymes used for a method according to the present invention are selected from a group consisting of MseI, Tsp509, AluI, NlaIII, BfaI, HpyCH4, Dpul, MboII, MlyI, and BCCI, or any isoschizomer of any of the enzymes mentioned. Preferably, 4-5 different enzymes are selected.
- a and B short adaptors
- the adaptors are used for purification, amplification, and sequencing steps. Single-stranded fragments with A and B adaptors compose the sample library used for subsequent steps.
- Prior to ligation of the adaptors, the fragments can be completely double stranded without any single stranded overhang.
- a fragment polishing reaction is performed using e.g. E. coli T4 DNA polymerase.
- the polishing reaction is performed in the presence of hydroxymethyl-dCTP instead of dCTP.
- the fragment polishing reaction is performed in the presence of a DNA polymerase which lacks proofreading activity, such as Tth DNA polymerase (Roche Applied Science Cat. No: 11 480 014 001).
- the two different double stranded adaptors A and B are ligated to the ends of the fragments. Some or all of the C-residues of adaptors A and B can be hydroxymethyl-C residues.
- the fragments containing at least one B adaptor are immobilized on a streptavidin coated solid support and a nick repair-fill-in synthesis is performed using a strand displacement enzyme such as Bst Polymerase (New England Biolabs).
- said reaction is performed in the presence of hydroxymethyl-dCTP instead of dCTP.
- the bisulfite treatment can be done according to standard methods that are well known in the art (Frommer et al., 1992; Zeschnigk et al., 1997; Clark et al., 1994).
- the sample can be purified, for example by a Sephadex size exclusion column or, at least by means of precipitation. It is also within the scope of the present invention, if directly after bisulfite treatment, or directly after bisulfite treatment followed by purification, the sample is amplified by means of performing a conventional PCR using amplification primers with sequences corresponding to the A and B adaptor sequences.
- the bisulfite treated and optionally purified and/or amplified single-stranded DNA library is immobilized onto specifically designed DNA Capture Beads. Each bead carries a unique single-stranded DNA library fragment.
- a library fragment can be amplified within its own microreactor comprised of a water-in-oil emulsion, excluding competing or contaminating sequences. Amplification of the entire fragment collection can be done in parallel; for each fragment, this results in a copy number of several million clonally amplified copies of the unique fragment per bead. After PCR amplification within the emulsion, the emulsion is broken while the amplified fragments remain bound to their specific beads.
- DNA methyltransferases (MTases), which transfer a methyl group from S-adenosylmethionine to either adenine or cytosine residues, are found in a wide variety of prokaryotes and eukaryotes. Methylation should be considered when digesting DNA with restriction endonucleases because cleavage can be blocked or impaired when a particular base in the recognition site is methylated or otherwise modified.
- MTases have most often been identified as elements of restriction/modification systems that act to protect host DNA from cleavage by the corresponding restriction endonuclease.
- Most laboratory strains of E. coli contain three site-specific DNA methylases. Some or all of the sites for a restriction endonuclease may be resistant to cleavage when the DNA is isolated from strains expressing the Dam or Dcm methylases, if the methylase recognition site overlaps the endonuclease recognition site.
- plasmid DNA isolated from dam+ E. coli is completely resistant to cleavage by MboI, which cleaves at GATC sites.
- CpG MTases found in higher eukaryotes (e.g., Dnmt1) transfer a methyl group to the C5 position of cytosine residues. Patterns of CpG methylation are heritable, tissue specific, and correlate with gene expression. Consequently, CpG methylation has been postulated to play a role in differentiation and gene expression (Josse and Kornberg, 1962). The effects of CpG methylation are mainly a concern when digesting eukaryotic genomic DNA. CpG methylation patterns are not retained once the DNA is cloned into a bacterial host.
- Microarray methods can be used in conjunction with the methods described herein for simultaneous testing of numerous genetic alterations of the human genome.
- the subject matter described herein can also be used in various fields to greatly improve the accuracy and reliability of nucleic acid analyses, chromosome mapping, and genetic testing.
- Selected chromosomal target elements can be included on the array and evaluated for 5-hmC content in conjunction with hybridization to a nucleic acid array.
- a diagnostic array such as a microarray used for comparative genomic hybridization (CGH)
- 5-hmC in genomic DNA fragments are specifically labeled using radio-labels, fluorescent labels or amplifiable signals. These labeled target DNA fragments are then screened by hybridization using microarrays.
- This method involves using AC impedance as a measurement for the presence of 5-hmC.
- a nucleic acid probe specific for the sequence to be analyzed is immobilized on a gold electrode.
- the DNA fragment to be analyzed is added and allowed to hybridize to the probe.
- Excess non-hybridized, single-strand DNA is digested using nucleases.
- Biotin is covalently linked to the 5-hmC using the methods of the invention either before or after hybridization.
- Avidin-HRP is bound to the biotinylated DNA sequence, and then 4-chloronaphthol is added.
- If the HRP molecule is bound to the hybridized target DNA near the gold electrode, the HRP oxidizes the 4-chloronaphthol to a hydrophobic product that adsorbs to the electrode surface. This results in a higher AC impedance if 5-hmC is present in the target DNA compared to a control sequence lacking 5-hmC.
- Chromosomal DNA is prepared using standard karotyping techniques known in the art.
- the 5-hmC in the chromosomal DNA is labeled with a detectable moiety (fluorophore, radio-label, amplifiable signal) and imaged in the context of the intact chromosomes.
- kits for modifying cytosine bases of nucleic acids and/or subjecting such modified nucleic acids to further analysis can include one or more of a modification agent(s), a labeling reagent for detecting or modifying glucose or a 5-hmC, and, if desired, a substrate that contains or is capable of attaching to one or more modified 5-gmC.
- the substrate can be, e.g., a microsphere, antibody, or other binding agent.
- Each kit preferably includes a 5-hmC modifying agent or agents, e.g., βGT and its functionalized substrate.
- One or more reagent is preferably supplied in a solid form or liquid buffer that is suitable for inventory storage, and later for addition into the reaction medium when the method of using the reagent is performed. Suitable packaging is provided.
- the kit may optionally provide additional components that are useful in the procedure. These optional components include buffers, capture reagents, developing reagents, labels, reacting surfaces, means for detection, control samples, instructions, and interpretive information.
- the kit may optionally include a detectable label or a modified glucose-binding agent and, if desired, reagents for detecting the binding agent.
- the first step is to identify the locations of 5-hmC within genomic DNA, but so far it has remained challenging to distinguish 5-hmC from 5-mC and to enrich 5-hmC-containing genomic DNA fragments.
- the inventors describe a chemical tagging technology. It has been shown that 5-hmC is present in the genome of the T-even bacteriophages.
- a viral enzyme, β-glucosyltransferase (β-GT), can catalyze the transfer of a glucose moiety from uridine diphosphoglucose (UDP-Glu) to the hydroxyl group of 5-hmC, yielding β-glucosyl-5-hydroxymethyl-cytosine (5-gmC) in duplex DNA (Josse and Kornberg, 1962; Lariviere and Morera, 2004) ( FIG. 1A ).
- the inventors took advantage of this enzymatic process and used β-GT to transfer a chemically modified glucose, 6-N3-glucose, onto 5-hmC for selective bio-orthogonal labeling of 5-hmC in genomic DNA ( FIG. 1B ).
- a biotin tag or any other tag can be installed using Huisgen cycloaddition (click) chemistry for a variety of enrichment, detection and sequencing applications (Kolb et al., 2001; Speers and Cravatt, 2004; Sletten and Bertozzi, 2009).
- the inventors used the biotin tag for high-affinity capture and/or enrichment of 5-hmC-containing DNA for sensitive detection and deep sequencing to reveal genomic locations of 5-hmC ( FIG. 1B ).
- the covalent chemical labeling coupled with biotin-based affinity purification provides considerable advantages over noncovalent, antibody-based immunoprecipitation as it ensures accurate and comprehensive capture of 5-hmC-containing DNA fragments, while still providing high selectivity.
- the inventors chemically synthesized UDP-6-N3-Glu ( FIG. 3 ) and attempted the glycosylation reaction of an 11-mer duplex DNA containing a 5-hmC modification as a model system ( FIG. 5 ).
- Wild-type β-GT worked efficiently using UDP-6-N3-Glu as the co-factor, showing only a sixfold decrease of the reaction rate compared to the native co-factor UDP-Glu ( FIG. 6 ).
- the 6-N3-glucose transfer reaction finished within 5 min with as low as 1% enzyme concentration.
- The properties of 5-hmC in duplex DNA are quite similar to those of 5-mC in terms of its sensitivity toward enzymatic reactions such as restriction enzyme digestion and polymerization (Flusberg et al., 2010; Josse and Kornberg, 1962; Lariviere and Morera, 2004).
- primer extension with a biotin-N3-5-gmC-modified DNA template was tested.
- Addition of streptavidin tetramer (binds biotin tightly) completely stops replication by Taq polymerase specifically at the modified position as well as one base before the modified position ( FIG. 6 ). Therefore, this method has the potential to provide single-base resolution of the location of 5-hmC in DNA loci of interest.
- Genomic DNA from various sources was sonicated into small fragments (~100-500 base pairs), treated with β-GT in the presence of UDP-6-N3-Glu or regular UDP-Glu (control group) to yield N3-5-gmC or 5-gmC modifications and finally labeled with cyclooctyne-biotin (1) to install biotin. Because each step is efficient and bio-orthogonal, this protocol ensures selective labeling of most 5-hmC in genomic DNA. The presence of biotin-N3-5-gmC allows affinity enrichment of this modification and accurate quantification of the amount of 5-hmC in a genome using avidin-horseradish peroxidase (HRP).
- the inventors determined the total amount of 5-hmC in mouse cerebellum at different stages of development ( FIGS. 10A and 10B ).
- the control group showed almost no signal, demonstrating the high selectivity of this method.
- the amount of 5-hmC depends on the developmental stage of the mouse cerebellum ( FIG. 10B ).
- the 5-hmC level of mouse embryonic stem cells was determined to be comparable to results reported previously (~0.05% of total nucleotides) ( FIGS. 10C and 10D ) (Tahiliani et al. 2009).
- the amount of 5-hmC in mouse adult neural stem cells (aNSC) was tested, which proved comparable to that of mESC (~0.04% of total nucleotides) ( FIGS. 10C and 10D ).
- The inventors also tested human cell lines ( FIGS. 10C and 10D ). Notably, the presence of 5-hmC was detected in HeLa and HEK293FT cell lines, although in much lower abundance (~0.01% of total nucleotides) ( FIG. 10D ) than in other cells or tissues that have been previously reported to contain 5-hmC (previous studies did not show the presence of 5-hmC in HeLa cells due to the limited sensitivity of the methods employed (Kriaucionis and Heintz, 2009)). These results suggest that this modification may be more widespread than previously anticipated. By contrast, no 5-hmC signal was detected in wild-type Drosophila melanogaster, consistent with a lack of DNA methylation in this organism (Lyko et al., 2000).
- the inventors confirmed the presence of 5-hmC in the genomic DNA from HeLa cells.
- a monomeric avidin column was used to pull down the biotin-N3-5-gmC-containing DNA after genomic DNA labeling.
- These enriched DNA fragments were digested into single nucleotides, purified by HPLC and subjected to HRMS analysis.
- the inventors obtained HRMS as well as MS/MS spectra of biotin-N3-5-gmC identical to the standard from synthetic DNA ( FIG. 8 , and FIGS. 12B and 12C ).
- two 60-mer double-stranded (ds)DNAs one with a single 5-hmC in its sequence and the other without the modification, were prepared.
- the inventors spiked equal amounts of both samples into mouse genomic DNA and performed labeling and subsequent affinity purification of the biotinylated DNA.
- the pull-down sample was subjected to deep sequencing, and the result showed that reads from the 5-hmC-containing DNA were >25-fold more abundant than reads from the unmodified control DNA ( FIG. 13 ).
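The enrichment figure above is a ratio of reads mapping to the two spiked 60-mers. Because equal amounts of both were spiked in, a minimal sketch needs no input normalization. The read counts below are hypothetical (the source reports only the >25-fold result), and `fold_enrichment` is an illustrative helper name.

```python
def fold_enrichment(hmc_reads: int, control_reads: int, pseudocount: float = 0.0) -> float:
    """Ratio of reads from the 5-hmC spike-in to reads from the unmodified spike-in.

    Assumes both 60-mers were spiked at equal molar amounts, as stated in the text.
    A pseudocount can be supplied to avoid division by zero for very clean pull-downs.
    """
    denominator = control_reads + pseudocount
    if denominator <= 0:
        raise ValueError("control read count must be positive (or supply a pseudocount)")
    return (hmc_reads + pseudocount) / denominator


# Hypothetical counts, for illustration only.
example = fold_enrichment(hmc_reads=2600, control_reads=100)
print(example)  # 26.0
```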
- the inventors performed chemical labeling of genomic DNA from mouse cerebellum, subjecting the enriched fragments to deep sequencing such that 5-hmC-containing genomic regions could be identified.
- the inventors compared male and female adult mice (2.5 months old), sequencing multiple independent biological samples and multiple libraries prepared from the same genomic DNA.
- Genome-scale density profiles are nearly identical between male and female and are clearly distinguishable from both input genomic DNA and control DNA labeled with regular glucose (no biotin) ( FIG. 12A ).
- Peak identification revealed a total of 39,011 high-confidence regions enriched consistently with 5-hmC in both male and female ( FIG. 12A ).
- All of the 13 selected, enriched regions were subsequently successfully verified in both adult female and male cerebellum by quantitative PCR (qPCR), whereas multiple control regions did not display enrichment ( FIG. 14 ).
- DNA methylation is widespread in mammalian genomes, with the exception of most transcription start sites (TSS) (Meissner et al., 2008; Lister et al., 2009; Edwards et al., 2010).
- Previous studies have mostly assessed DNA methylation by bisulfite sequencing and methylation-sensitive restriction digests. It has since been appreciated that neither of these methods adequately distinguishes 5-mC from 5-hmC (Huang et al., 2010; Jin et al., 2010).
- metagene 5-hmC read density profiles were generated for RefSeq transcripts.
- Normalized 5-hmC read densities differ by an average of 2.10 ± 0.04% (mean ± s.e.m.) in adult male and female cerebellum samples, indicating that the profiles are accurate and reproducible. Enrichment of 5-hmC was observed in gene bodies as well as in proximal upstream and downstream regions relative to TSS, transcription termination sites (TTS) and distal regions ( FIG. 12B ).
- β-GT was used to transfer a radiolabeled glucose for 5-hmC quantification (Szwagierczak et al., 2010).
- a major advantage of the technology described herein is its ability to selectively label 5-hmC in genomic DNA with any tag. With a biotin tag attached to 5-hmC, DNA fragments containing 5-hmC can be affinity purified for deep sequencing to reveal distribution and/or location of 5-hmC in mammalian genomes. Because biotin is covalently linked to 5-hmC and biotin-avidin/streptavidin interaction is strong and highly specific, this technology promises high robustness as compared to potential anti-5-hmC, antibody-based, immune-purification methods (Ito et al., 2010).
- fluorescent or affinity tags may be readily installed using the same approach for various other applications. For instance, imaging of 5-hmC in fixed cells or even live cells (if labeling can be performed in one step with a mutant enzyme) may be achieved with a fluorescent tag.
- the chemical labeling of 5-hmC with a bulky group could interfere with restriction enzyme digestion or ligation, which may be used to detect 5-hmC in specific genome regions.
- the attachment of biotin or other tags to 5-hmC also dramatically enhances the sensitivity and simplicity of the 5-hmC detection and/or quantification in various biological samples (Szwagierczak et al., 2010). The detection limit of this method can reach 0.004% ( FIG. 10D ) and the method can be readily applied to study a large number of biological samples.
- the inventors observed the developmental stage-dependent increase of 5-hmC in mouse cerebellum. Compared to postnatal day 7 at a time of massive cell proliferation in the mouse cerebellum, adult cerebellum has a significantly increased level of 5-hmC, suggesting that 5-hmC might be involved in neuronal development and maturation. Indeed, the inventors also observed an increase of 5-hmC in aNSCs upon differentiation (unpublished data).
- This technology enables the selective capture of 5-hmC-enriched regions in the cerebellums of both P7 and adult mice, and the determination of the genome-wide distribution of 5-hmC by deep sequencing.
- the inventors' analyses revealed general features of 5-hmC in mouse cerebellum.
- 5-hmC was enriched specifically in gene bodies as well as defined gene proximal regions relative to more distal regions. This differs from the distribution of 5-mC, where DNA methylation has been found both within gene bodies as well as in more distal regions (Meissner et al., 2008; Lister et al., 2009; Edwards et al., 2010; Maunakea et al., 2010).
- the enrichment of 5-hmC is higher in gene bodies that are more highly expressed, suggesting a potential role for 5-hmC in activating and/or maintaining gene expression. It is possible that conversion of 5-mC to 5-hmC is a pathway to offset the gene repression effect of 5-mC during this process without going through demethylation (Wu and Zhang, 2010).
- the inventors observed an enrichment of 5-hmC in genes linked to hypoxia and angiogenesis. The oxidation of 5-mC to 5-hmC by Tet proteins requires dioxygen (Tahiliani et al. 2009; Ito et al., 2010).
- A well-known oxygen sensor in mammalian systems that is involved in hypoxia and angiogenesis is the HIF protein, which belongs to the same mononuclear iron-containing dioxygenase superfamily as the active domain of the Tet proteins (Hausinger, 2004). It is tempting to speculate that oxidation of 5-mC to 5-hmC by Tet proteins may constitute another oxygen-sensing and regulation pathway in mammalian cells. Lastly, the association of 5-hmC with genes that have been implicated in neurodegenerative disorders suggests that this base modification could potentially contribute to the pathogenesis of human neurological disorders. Should a connection between 5-hmC levels and human disease be established, the affinity purification approach shown in the current work could be used to purify and/or enrich 5-hmC-containing DNA fragments as a simple and sensitive method for disease prognosis and diagnosis.
- β-GT was cloned from the extract of T4 bacteriophage (American Type Culture Collection) into the target vector pMCSG19 by the ligation independent cloning method (Donnelly et al., 2006).
- the resulting plasmid was transformed into BL21 star (DE3)-competent cells containing pRK1037 (Science Reagents) by heat shock. Positive colonies were selected with 150 μg/ml ampicillin and 30 μg/ml kanamycin. One liter of cells was grown at 37° C. from a 1:100 dilution of an overnight culture.
- the cells were induced with 1 mM of isopropyl-β-D-thiogalactoside when OD600 reached 0.6-0.8. After overnight growth at 16° C. with shaking, the cells were collected by centrifugation and suspended in 30 ml Ni-NTA buffer A (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 30 mM imidazole and 10 mM β-mercaptoethanol) with the protease inhibitor phenylmethylsulfonyl fluoride.
- Ni-NTA buffer B (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 400 mM imidazole and 10 mM β-mercaptoethanol).
- β-GT-containing fractions were further purified by MonoS (GE Healthcare) (buffer A: 10 mM Tris-HCl, pH 7.5; buffer B: 10 mM Tris-HCl, pH 7.5 and 1 M NaCl).
- βGT-catalyzed 5-hmC glycosylation in duplex DNA. The inventors synthesized the phosphoramidite of 5-hmC (now commercially available from Glen Research) and prepared duplex DNA with 5-hmC incorporated at specific locations. The inventors found that incubation of 5-hmC-containing duplex DNA (either 15 mer or 40 mer) with 10% purified βGT at 37° C. for 3 hours led to complete conversion of 5-hmC to 5-gmC, as judged by mass spectrometry analysis and digestion of DNA into single nucleosides for HPLC analysis.
- the genomic DNA modified with 5-gmC can be treated with and without restriction enzymes.
- the two samples can be subjected to next generation sequencing in order to map out the genome-wide distribution and location of the 5-hmC modification in the specific sequences recognized by corresponding restriction enzymes.
- MspI-digested genomic DNA is ligated to a double-stranded adaptor with biotinylation on the upper strand. DNA is sheared further by partial nuclease digest, and size-selected for the 300-500 bp range. A second adaptor is then ligated onto ends that have not yet been filled by the first adaptor.
- Biotinylated fragments are then pulled down by streptavidin-coated beads, and denaturation will release single-stranded fragments flanked by both types of adaptors. These fragments are amplified by PCR for use in high-throughput sequencing. Internal MspI sites in the sequencing reads indicate resistance to MspI digest and hence the presence of 5-hmC at those sites. Bulky groups with modified glucose can be installed to interfere with other restriction enzymes.
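The read-level test described above (an intact CCGG inside a read implies the site resisted MspI) can be sketched as a simple scan. This is a minimal illustration, not the pipeline used in the source: `internal_mspi_sites` is a hypothetical helper, and it assumes adaptor sequence has already been trimmed so that every CCGG found lies within genomic sequence.

```python
import re

# MspI recognition sequence (C^CGG). The site is palindromic, so one strand suffices.
MSPI_SITE = re.compile("CCGG")


def internal_mspi_sites(read: str) -> list[int]:
    """Return 0-based start positions of intact CCGG sites within a read.

    Fragment ends produced by MspI cleavage begin with CGG (not CCGG), so any
    full CCGG found in the read is an uncut, internal site.
    """
    return [match.start() for match in MSPI_SITE.finditer(read.upper())]


# Hypothetical read starting at an MspI-cut end (CGG...) with two protected sites.
print(internal_mspi_sites("CGGATCCGGTTACCGGA"))  # [5, 12]
```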
- Biotinylation of 5-hmC in genomic DNA for affinity purification and sequencing. The thiol- or azide-modified 5-gmC can be readily labeled with commercially available maleimide or alkyne (click chemistry) linked to biotin, respectively.
- the reaction of thiol with maleimide is highly efficient; however, this labeling reaction cannot tolerate proteins or small molecules that bear thiol groups.
- genomic DNA must be isolated from other cellular components prior to the labeling, which can be readily achieved.
- the azide labeling with commercially available biotin-linked alkyne is completely bio-orthogonal, thus genomic DNA with bound proteins can be directly used.
- biotin-labeled DNA fragments may be pulled down with streptavidin and submitted for high-throughput sequencing in order to map out the global distribution and locations of 5-hmC in chromosomes. This will reveal a distribution map of 5-hmC in genomic DNA at different developmental stages of a particular cell or cell line.
- a sterically bulky group such as polyethyleneglycol (PEG), a dendrimer, or a protein such as streptavidin can be introduced to the thiol- or azide-modified 5-gmC.
- Although 5-gmC in duplex DNA does not interfere with the polymerization reaction catalyzed by various polymerases, the presence of an additional bulky group on 5-gmC on the DNA template strand can interfere with the synthesis of the new strand by DNA polymerase.
- primer extension will lead to a partially extended primer of certain length.
- the modification sites can be revealed by sequencing the partially extended primers. This method can be very versatile.
- DNA fragments containing multiple 5-hmC can be affinity purified and random or designed primers can be used to perform primer extension experiments on these DNA fragments.
- Partially extended primers can be collected and subjected to high-throughput sequencing using a similar protocol as described in the restriction enzyme digestion method.
- a bulky modification may stop the polymerization reaction a few bases ahead of the modification site. Still, this method will map the modification sites to the resolution of a few bases. Considering that most 5-hmC exists in a CpG sequence, the resolution can be adequate for most applications. With a bulky substitution on 5-gmC, digestion of modified DNA by restriction enzymes other than MspI could be blocked for the restriction enzyme digestion-based assay.
- 5-hmC in mouse embryonic stem cells (ESCs) and fibroblasts is mapped. This produces a general picture of how 5-hmC patterns compare and contrast between a pluripotent stem cell type and a terminally differentiated cell type.
- 5-hmC is mapped in mouse neural stem cells (NSCs) as well as neurons and astrocytes derived via the differentiation of these NSCs. This elucidates how the process of lineage differentiation affects 5-hmC patterns.
- Affinity purified genomic DNA fragments enriched for 5-hmC as described in Example 3 can be directly subjected to high-throughput sequencing to identify these fragments.
- CMML chronic myelomonocytic leukemia
- the TET2 gene is closely related to TET1, which encodes an enzyme capable of converting 5-mC to 5-hmC (Tahiliani et al., 2009), raising the possibility that patients with CMML have altered content or distribution of 5-hmC. Indeed, when lentiviruses were used to increase the level of TET2 three-fold in leukemia cells, thin layer chromatography showed that 5-hmC content increased by 1.5-fold, and the bone marrow of patients with homozygous TET2 mutations showed a 20% decrease in 5-hmC levels (Szpurka et al., 2009).
- TET2, which is a homologue of TET1, modifies RNA. 5-hmC exists in human RNA. TET2 is defective in various leukemias (Abdel-Wahab et al., 2009).
- Azide modified UDP-Glucose can be synthesized by the following reaction scheme.
- genomic DNA analysis can be divided into two general strategies. Initially genomic DNA is fragmented, for example by sonication or restriction enzyme digestion. After fragmentation, 5-hmC in genomic DNA is detectably modified, for example using click chemistry after introduction of an azide-glucose. In a first strategy, the biotin-labeled DNA fragments will be pulled down using an avidin column. These affinity purified genomic DNA fragments are enriched for 5-hmC and will be directly subjected to high-throughput sequencing for sequence identification. This will give the global distribution map of 5-hmC in the genomic DNA analyzed. Such analysis can be performed at different developmental stages or in various tissue or cell samples.
- In a second strategy, the biotin-labeled genomic DNA fragments will be ligated to an adaptor with known sequence, and primer extension will then be performed in order to map out the exact location of 5-hmC.
- Genomic DNA was purified using Wizard genomic DNA purification kit (Promega) with additional Proteinase K treatment and rehydrated in 10 mM Tris (pH 7.9). Genomic DNA samples were further sonicated in Eppendorf tubes into 100-500 bp by Misonix sonicator 3000 (using microtip, three pulses of 30 s each with 2 min of rest and a power output level of 2) or Bioruptor UCD-200 sonicator (Diagenode, Sparta). For the Bioruptor, the output selector switch was set on High (H), and the sonication interval was 30 s with 30 cycles of sonication performed.
- Oligonucleotide synthesis. Oligonucleotides containing 5-hmC were prepared using Applied Biosystems 392 DNA synthesizer. 5-Hydroxymethyl-dC-CE phosphoramidite (Glen Research) was used to incorporate 5-hmC at the desired position during solid-phase synthesis, followed by postsynthetic deprotection by treatment with 30% ammonium hydroxide first and then 25-30% wt/wt solution of sodium methoxide in methanol (Alfa Aesar) overnight at 25° C. The 11-mer DNA was purified by reversed-phase HPLC and confirmed by MALDI-TOF. Other DNA was purified by denaturing PAGE.
- Concentrations of the oligonucleotides were estimated by UV at 260 nm. Duplexes were prepared by combining equimolar portions of each strand in annealing buffer (10 mM Tris, pH 7.5, 100 mM NaCl), heating for 10 min at 95° C. followed by slow cooling overnight.
- the 5-hmC labeling reactions were performed in a 100-μl solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 300 ng/μl sonicated genomic DNA (100-500 bp), 250 μM UDP-6-N3-Glu, and 2.25 μM wild-type βGT. The reactions were incubated for 1 h at 37° C. After the reaction, the DNA substrates were purified by Qiagen DNA purification kit or by phenol-chloroform precipitation and reconstituted in H2O.
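The absolute amounts in this labeling reaction follow directly from the stated concentrations and the 100 μl volume. The sketch below works them out. The helper names are illustrative, and the 10 mM cofactor stock is a hypothetical value chosen only to show the dilution arithmetic, since the source does not give stock concentrations.

```python
def pmol_from_micromolar(conc_um: float, volume_ul: float) -> float:
    """1 uM equals 1 pmol/ul, so pmol = uM x ul."""
    return conc_um * volume_ul


def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed (C1*V1 = C2*V2); both concentrations in the same unit."""
    if stock_conc < final_conc:
        raise ValueError("stock must be at least as concentrated as the final solution")
    return final_conc * final_volume_ul / stock_conc


REACTION_UL = 100.0

dna_ug = 300.0 * REACTION_UL / 1000.0                      # 300 ng/ul -> 30 ug DNA
cofactor_pmol = pmol_from_micromolar(250.0, REACTION_UL)   # UDP-6-N3-Glu: 25,000 pmol (25 nmol)
enzyme_pmol = pmol_from_micromolar(2.25, REACTION_UL)      # beta-GT: 225 pmol

# Hypothetical 10 mM (10,000 uM) UDP-6-N3-Glu stock, not taken from the source.
cofactor_stock_ul = stock_volume_ul(250.0, 10_000.0, REACTION_UL)  # 2.5 ul

print(dna_ug, cofactor_pmol, enzyme_pmol, cofactor_stock_ul)
```

One reaction therefore consumes 30 μg of sonicated DNA, which matches the starting amount quoted for the pull-down protocol elsewhere in this description.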
- the click chemistry was performed with addition of 150 μM dibenzocyclooctyne-modified biotin (compound 1) into the DNA solution, and the reaction mixture was incubated for 2 h at 37° C.
- the DNA samples were then purified by Qiagen DNA purification kit, after which they were ready for further applications.
- Affinity enrichment of the biotinylated 5-hmC (biotin-N3-5-gmC).
- Genomic DNAs used for deep sequencing were purified/enriched by Pierce Monomeric Avidin Kit (Thermo) twice following manufacturer's recommendations.
- the biotin-N3-5-gmC containing DNA was concentrated by 10 K Amicon Ultra-0.5 ml Centrifugal Filters (Millipore) and purified by Qiagen DNA purification kit. Starting with 30 μg total genomic DNA, it is possible to obtain 100-300 ng enriched DNA samples following the labeling and pull-down protocol described here.
- the deep sequencing experiment can be performed with as low as 10 ng DNA sample.
- the inventors have also developed a cleavable biotin-containing capture agent with a disulfide linker as the click reaction partner to form biotin-S-S-N3-5-gmC ( FIGS. 21A-21C ).
- the 5-hmC-containing DNA fragments from genomic DNA are captured by streptavidin beads, allowing non-modified DNA to be removed.
- a simple dithiothreitol (DTT) treatment releases the bound DNA fragments of interest with 5-hmC modified as HS-N3-5-gmC ( FIGS. 21A-21C ).
- Primer extension assay. Reverse primer (14-mer, 5′-AAGCTTCTGGAGTG-3′ (SEQ ID NO:2), purchased from Eurofins MWG Operon and PAGE purified) was end-labeled with T4 polynucleotide kinase (T4 PNK) (New England Biolabs) and 15 μCi of [γ-32P]-ATP (PerkinElmer) for 0.5 h at 37° C., and then purified by Bio-Spin 6 column (Bio-Rad). For primer extension assay, REDTaq DNA polymerase (Sigma) was used.
- the inventors first mixed 0.2 pmol template and 0.25 pmol γ-32P-labeled primers with dNTP in the polymerase reaction buffer without adding polymerase. The mixture was heated at 65° C. for 2 min and allowed to cool slowly for 30 min. Streptavidin in PBS was then added if needed and allowed to mix at 25° C. for 5 min. REDTaq DNA polymerase was then added (final volume 20 μl) and the extension reaction was run at 72° C. for 1 min. The reaction was quenched by 2× stop solution (98% formamide, 10 mM EDTA, 0.1% xylene cyanol, 0.1% bromophenol blue) and loaded onto a 20% denaturing polyacrylamide gel (7 M urea). Sanger sequencing was performed using Sequenase DNA Sequencing Kit (USB) with 1 pmol template and 0.5 pmol [γ-32P]-labeled primer. The results were visualized by autoradiography.
- the 5-hmC labeling reaction was carried out in a 4 ml solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 550 ng/μl sonicated HeLa genomic DNA, 250 μM UDP-6-N3-Glu and 2.25 μM wild-type β-GT.
- the reaction was incubated for 1 h at 37° C., purified by phenol-chloroform precipitation and reconstituted in 4 ml H2O.
- the inventors added 20 μl of 30 mM dibenzocyclooctyne-modified biotin (compound 1) and incubated the mixture for 2 h at 37° C.
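As a consistency check, adding 20 μl of a 30 mM stock to 4 ml of sample gives a final biotin reagent concentration close to the 150 μM used in the small-scale click reaction. A minimal sketch of that dilution follows; `final_concentration_um` is an illustrative helper, and the calculation assumes the volumes are simply additive.

```python
def final_concentration_um(stock_mm: float, added_ul: float, sample_ul: float) -> float:
    """Final concentration (uM) after adding `added_ul` of a stock at `stock_mm` (mM)
    to `sample_ul` of sample, assuming additive volumes."""
    total_ul = sample_ul + added_ul
    return stock_mm * 1000.0 * added_ul / total_ul


biotin_um = final_concentration_um(stock_mm=30.0, added_ul=20.0, sample_ul=4000.0)
print(round(biotin_um, 1))  # 149.3
```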
- the DNA sample was purified again by phenol-chloroform precipitation and then enriched for biotin-N3-5-gmC by monomeric avidin column as noted before.
- the pull-down DNA was concentrated and digested by nuclease P1 (Sigma), venom phosphodiesterase I (Type VI) (Sigma) and alkaline phosphatase (Sigma) according to published protocols (Crain, 1990).
- the sample was purified by HPLC C18 reversed-phase column as noted in FIG. 7 .
- the peaks corresponding to the biotin-N3-5-gmC from synthetic DNA were collected, lyophilized and subjected to HRMS analysis.
- For HRMS analysis, lyophilized fractions were dissolved in 100 μl of 50% methanol and 5-20 μl samples were injected for LC-MS/MS analysis.
- the LC-MS/MS system is composed of an Agilent 1200 HPLC system and an Agilent 6520 QTOF system controlled by MassHunter Workstation Acquisition software (B.02.01 Build 2116).
- a reversed-phase C18 column (Kinetex C18, 50 mm × 2.1 mm, 1.7 μm, with 0.2 μm guard cartridge) flowing at 0.4 ml min−1 was used for online separation to avoid potential ion suppression.
- the gradient was from 98% solvent A (0.05% (vol/vol) acetic acid in MilliQ water), held for 0.5 min, to 100% solvent B (90% acetonitrile (vol/vol) with 0.05% acetic acid (vol/vol)) in 4 min.
- MS and MS/MS data were acquired in extended dynamic range (1,700 m/z) mode, with post-column addition of reference mass solution for real time mass calibration.
- Reads were mapped to the Mus musculus reference genome (NCBI37/mm9), excluding sequences that were not finished or that have not been placed with certainty (i.e., exclusion of sequences contained in the chrUn_random.fa and chrN_random.fa files provided by the UCSC genome browser) and appended to contain fasta sequences corresponding to the positive and negative spiked controls. Sequence alignment was accomplished using bwa (Li and Durbin, 2009) and default alignment settings.
- a total of 91,751 peaks were identified in adult female cerebellum and a total of 240,147 peaks were identified in adult male cerebellum using these parameters; 39,011 peaks overlapped ≥1 bp between sexes and are reported as the set of high-confidence peaks consistently detected in adult cerebellum.
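The high-confidence set above is defined by a ≥1 bp overlap between female and male peaks. A minimal sketch of that intersection is shown below. It assumes peaks are given as half-open `(chrom, start, end)` tuples and uses a simple per-chromosome scan. The function names and coordinates are illustrative, and the source does not specify the tool actually used.

```python
from collections import defaultdict


def overlap_bp(start_a: int, end_a: int, start_b: int, end_b: int) -> int:
    """Number of shared bases between two half-open intervals [start, end)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def shared_peaks(peaks_a, peaks_b, min_overlap: int = 1):
    """Return peaks from peaks_a that overlap any peak in peaks_b by >= min_overlap bp."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))
    shared = []
    for chrom, start, end in peaks_a:
        # Quadratic within a chromosome; adequate for a sketch, not for 240,147 peaks.
        if any(overlap_bp(start, end, s, e) >= min_overlap for s, e in by_chrom[chrom]):
            shared.append((chrom, start, end))
    return shared


# Hypothetical coordinates, for illustration only.
female = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
male = [("chr1", 199, 300), ("chr1", 600, 700), ("chr3", 100, 200)]
print(shared_peaks(female, male))  # [('chr1', 100, 200)]
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the inner scan.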
- Regions enriched for 5-hmC in adult cerebellum relative to P7 cerebellum were identified using a single lane of adult female 5-hmC reads as the treatment and the single lane of P7 reads as the background and/or control sample.
- a total of 20,092 regions were identified as enriched for 5-hmC in adult female cerebellum relative to P7 cerebellum. Of these, 15,388 (76.6%) were intragenic to 5,425 unique RefSeq transcripts.
- Genes acquiring 5-hmC during development are those with peaks overlapping ≥1 bp of a RefSeq gene.
- Metagene RefSeq transcript profiles were generated by first determining the distance between any given read and the closest TSS or TTS and then summing the number of 5′ ends within 10 bp bins centered on either TSS or txEnds. Ten bp bins were then examined 5 kb upstream and 3 kb downstream to assess the level of 5-hmC in gene bodies relative to TSS and txEnds.
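The binning step above can be sketched as follows. This is a simplified illustration under stated assumptions: a single chromosome, plus-strand transcripts only, bins whose edges (rather than centers) fall on the TSS, and a hypothetical function name `metagene_counts`. The same routine would apply to txEnds by passing transcript end positions instead.

```python
from bisect import bisect_left


def metagene_counts(read_starts, tss_positions, upstream=5000, downstream=3000, bin_size=10):
    """Count read 5' ends in fixed-width bins relative to the closest TSS.

    Bin 0 covers offsets [-upstream, -upstream + bin_size); the TSS itself
    falls in bin upstream // bin_size.
    """
    tss_sorted = sorted(tss_positions)
    if not tss_sorted:
        raise ValueError("need at least one TSS position")
    counts = [0] * ((upstream + downstream) // bin_size)
    for pos in read_starts:
        i = bisect_left(tss_sorted, pos)
        # Closest TSS is one of the two neighbours around the insertion point.
        candidates = tss_sorted[max(i - 1, 0): i + 1]
        nearest = min(candidates, key=lambda t: abs(pos - t))
        offset = pos - nearest
        if -upstream <= offset < downstream:
            counts[(offset + upstream) // bin_size] += 1
    return counts


# Hypothetical positions, for illustration only.
profile = metagene_counts([5000, 9995, 10000, 10004, 12999, 13000, 4999], [10000])
print(len(profile), sum(profile), profile[500])  # 800 5 2
```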
- the RefSeq reference file was obtained through the UCSC Genome Browser Tables (downloaded May 20, 2010).
- Read densities were calculated for each individual lane of sequence and then normalized per million reads of aligned sequence to generate a normalized read density. For samples sequenced on multiple lanes, normalized read densities were averaged. To generate the metagene profile for adult cerebellum the inventors averaged normalized read densities from male and female. The inventors observed excellent consistency in normalized read densities between both technical replicates (independent library preparation and sequencing the same library on multiple lanes) as well as between biological replicates (male and female adult samples). For genomic DNA input libraries from male and female samples normalized read densities differed by 3.41 ± 0.05% (mean ± s.e.m.). For 5-hmC libraries from male and female samples normalized read densities differed by 2.10 ± 0.04% (mean ± s.e.m.).
- MeDIP-Seq and MBD-Seq data and analysis.
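The per-lane normalization and lane averaging described above amount to two small operations, sketched here. The function names and the bin counts are hypothetical. The exact formula behind the reported percentage differences is not given in the source and is therefore not reproduced.

```python
def normalize_per_million(bin_counts, aligned_reads: int):
    """Scale raw per-bin read counts to reads per million aligned reads."""
    if aligned_reads <= 0:
        raise ValueError("aligned_reads must be positive")
    scale = 1_000_000 / aligned_reads
    return [count * scale for count in bin_counts]


def average_profiles(profiles):
    """Element-wise mean of equally long normalized profiles (e.g. lanes or sexes)."""
    if not profiles:
        raise ValueError("need at least one profile")
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


# Hypothetical two-bin profiles from two lanes of the same sample.
lane1 = normalize_per_million([10, 20], aligned_reads=2_000_000)   # [5.0, 10.0]
lane2 = normalize_per_million([28, 56], aligned_reads=4_000_000)   # [7.0, 14.0]
print(average_profiles([lane1, lane2]))  # [6.0, 12.0]
```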
- MBD-Seq data were downloaded from NCBI GEO number GSE19786, data sets SRR037089 and SRR037090 (Skene et al., 2010).
- Methyl cytosine containing DNA was immunoprecipitated as previously described (Szulwach et al., 2010) using 4 μg sonicated genomic DNA from adult female mouse cerebellum. The inventors used 25 ng immunoprecipitated DNA to generate libraries for sequencing as described above.
- MeDIP-Seq and MBD-Seq reads were aligned to NCBI37/mm9 using parameters identical to those used for 5-hmC reads. Using these parameters SRR037089 provided 15,351,672 aligned reads, SRR037090 provided 15,586,459 aligned reads and MeDIP-Seq provided 14,104,172 aligned reads. Reads were identified as either RepeatMasker (Rmsk, NCBI37/mm9) or RefSeq (based on 05/20/10 UCSC download) if overlapping ≥1 bp of a particular annotation. The fraction of total reads corresponding to each was then determined. The expected fraction of reads based on the fraction of genomic sequence corresponding to either Rmsk or RefSeq was also plotted for comparison.
- Modified uridine diphosphate glucose (UDP-Glu) bearing thiol or azide.
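The comparison above sets an observed read fraction against the fraction of the genome covered by an annotation. The sketch below shows one way to compute both, merging overlapping annotation intervals so bases are not counted twice. It assumes half-open intervals on a single sequence. All names and numbers other than the 14,104,172 MeDIP-Seq aligned reads are hypothetical.

```python
def covered_bp(intervals) -> int:
    """Total bases covered by half-open (start, end) intervals, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def expected_fraction(intervals, genome_bp: int) -> float:
    """Fraction of reads expected by chance: annotated bases / genome size."""
    return covered_bp(intervals) / genome_bp


def observed_fraction(overlapping_reads: int, aligned_reads: int) -> float:
    """Fraction of aligned reads overlapping the annotation by >= 1 bp."""
    return overlapping_reads / aligned_reads


# Hypothetical toy annotation on a 100 bp "genome".
print(expected_fraction([(0, 10), (5, 20), (30, 40)], genome_bp=100))  # 0.3
# Hypothetical overlap count against the MeDIP-Seq aligned-read total from the text.
print(observed_fraction(7_052_086, 14_104_172))  # 0.5
```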
- the inventors have synthesized azide-substituted UDP-Glu and expect to synthesize thiol-substituted UDP-Glu for 5-hmC labeling.
- An azide tag is preferred since this functional group is not present inside cells.
- the click chemistry to label this group is completely bio-orthogonal, meaning no interference from biological samples (Kolb et al., 2001).
- An azide-substituted UDP-Glu shown in FIG. 3 The azide-substituted glucoses can be transferred to 5-hmC, as shown below.
- Uridine 5′-(2,3,4-tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranosyl) diphosphate bistriethylammonium salt UDP-6-N3-UDP.
- 2,3,4-Tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranosyl phosphate mono-triethylamine salt VI, 23 mg, 0.045 mmol
- Uridine 5′-monophosphomorpholidate 4-morpholine-N,N′-dicyclohexylcarboxamidine salt (77 mg) and tetrazole (250 μL, 0.45 M) were added, and the mixture was co-evaporated with dry pyridine (5 mL) three times under reduced pressure. The residue was dried under vacuum overnight, and distilled pyridine (5 mL) was added. The mixture was stirred at rt under an argon atmosphere. After three days, pyridine was removed under reduced pressure and the residue was co-evaporated with toluene.
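- As a quick check of the quantities quoted above, the amount of tetrazole and its ratio to compound VI can be worked out from the stated volume and concentration. The sketch below uses only the figures given in the text (250 μL of 0.45 M tetrazole; 23 mg, 0.045 mmol of VI); the variable names are illustrative, and the "implied formula weight" is simply the ratio of the two quoted numbers, not an independently verified value.

```python
def mmol_from_solution(volume_ul, molarity):
    """Amount in mmol from a volume in microliters and a molarity in mol/L."""
    # uL * mol/L = umol; divide by 1000 to convert umol to mmol.
    return volume_ul * molarity / 1000.0


vi_mg = 23.0      # mass of compound VI quoted in the text
vi_mmol = 0.045   # amount of compound VI quoted in the text

tetrazole_mmol = mmol_from_solution(250, 0.45)
tetrazole_equiv = tetrazole_mmol / vi_mmol   # molar ratio relative to VI
implied_formula_weight = vi_mg / vi_mmol     # g/mol implied by 23 mg / 0.045 mmol

print(f"tetrazole: {tetrazole_mmol:.4f} mmol ({tetrazole_equiv:.1f} equiv)")
print(f"implied formula weight of VI salt: {implied_formula_weight:.0f} g/mol")
```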
- CGRA Combined Glycosylation Restriction Analysis
- The inventors developed a chemical labeling method that uses β-glucosyltransferase (βGT) to selectively label 5-hmC with a glucose, e.g., an azide-modified glucose.
- βGT β-glucosyltransferase
- the glucose is subsequently coupled to a probe that allows detection of 5-hmC in genomic DNA (Song et al., 2011).
- methylation-sensitive restriction enzymes are a classic approach to the study of DNA methylation at specific loci (Singer-Sam et al., 1990).
- 5-mC and 5-hmC are indistinguishable to most restriction enzymes (Jin et al., 2010; Nestor et al., 2010; Tardy-Planechaud et al., 1997; Szwagierczak et al., 2011).
- The resulting 5-N3-gmC or biotin-5-N3-gmC in a duplex DNA may be able to block digestion by a methylation-insensitive restriction enzyme, which can digest both 5-hmC- and 5-mC-containing DNA.
- Zymo Research and New England Biolabs have launched products based on this combined glycosylation restriction analysis (CGRA). They utilize βGT to transfer a regular glucose to 5-hmC and show that it can block the methylation-insensitive restriction enzyme MspI, which has a recognition sequence of C^CGG (Davis and Vaisvila, 2011).
- (i) MspI is also blocked if the outer C is 5-mC or 5-hmC, regardless of the cytosine modification status of the inner C, which limits the use of this approach on many CCGG sites where the outer C is methylated (Tardy-Planechaud et al., 1997); (ii) it cannot tell whether 5-hmC occurs on only one strand or both strands of the CpG dinucleotide.
- The inventors demonstrate that TaqαI, another methylation-insensitive restriction enzyme, which recognizes and cuts T^CGA, can also be used in CGRA when coupled with our chemical labeling method. This new approach can differentiate fully versus hemi-hydroxymethylated states in the CpG dinucleotide.
- The inventors prepared the same 32-mer double-stranded DNA with hemi-hydroxymethylation, performed the same labeling procedure, and subjected it to TaqαI digestion ( FIG. 19B ). While the hemi-5-hmC can be cut, hemi-N3-5-glucose cannot block digestion as well as the fully modified one ( FIG. 19B , lanes 1-4). Even with the bulkier group, biotin-N3-5-gmC, present, the majority of DNA was still digested ( FIG. 19B , lanes 5-6). Thus, TaqαI digests the hemi-modified sequence but is blocked by the fully modified one with biotin-N3-5-gmC. This noticeable difference in the response of TaqαI to fully and hemi-hydroxymethylated states after modification provides a method to distinguish these two states at TCGA sites by comparing their sensitivity to restriction.
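- To make the CGRA logic concrete, the sketch below locates MspI (CCGG) and TaqαI (TCGA) recognition sites in a sequence and encodes the qualitative TaqαI behavior reported above for biotin-N3-5-gmC-labeled sites. It is a deliberately simplified illustration under stated assumptions: the example sequence, the function names, and the three outcome labels are hypothetical, and real digestion is partial rather than all-or-nothing.

```python
# Recognition sequences named in the text (both are palindromic).
SITES = {"MspI": "CCGG", "TaqαI": "TCGA"}


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def taq_outcome(top_labeled, bottom_labeled):
    """Qualitative TaqαI result at a TCGA site after biotin labeling.

    Simplified from the text: a site labeled on both strands is blocked,
    a hemi-labeled site is mostly digested, an unlabeled site is cut.
    """
    if top_labeled and bottom_labeled:
        return "blocked"
    if top_labeled or bottom_labeled:
        return "mostly cut"
    return "cut"


# Hypothetical sequence containing one TCGA site and one CCGG site.
example = "AATCGATTCCGGAA"
taq_sites = find_sites(example, SITES["TaqαI"])
msp_sites = find_sites(example, SITES["MspI"])
```

Comparing the digestion outcome at a TCGA site with and without labeling is what allows fully and hemi-hydroxymethylated CpGs to be told apart in this scheme.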
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 14/267,727 filed May 1, 2014; which is a divisional of U.S. patent application Ser. No. 13/095,505 filed Apr. 27, 2011, which issued as U.S. Pat. No. 8,741,567 on May 14, 2014; which is a continuation of International Application PCT/US2011/031370 filed Apr. 6, 2011; which claims priority to U.S. Provisional Patent Application Ser. No. 61/321,198 filed Apr. 6, 2010, all of which are incorporated herein by reference in their entirety.
- This invention was made with government support under GM071440 awarded by the National Institutes of Health. The government has certain rights in the invention.
- The present invention relates generally to the field of molecular biology. More particularly, it concerns methods and compositions for detecting, evaluating, and/or mapping 5-hydroxymethyl-modified cytosine bases within a nucleic acid molecule.
- 5-Methylcytosine (5-mC) constitutes approximately 2-8% of the total cytosines in human genomic DNA, and impacts a broad range of biological functions, including gene expression, maintenance of genome integrity, parental imprinting, X-chromosome inactivation, regulation of development, aging, and cancer. Recently, the presence of an oxidized 5-mC, 5-hydroxymethylcytosine (5-hmC), has been discovered in embryonic and neuronal stem cells, certain adult brain cells, and some cancer cells.
- There is a need for methods and compositions for detecting and evaluating 5-hmC in the genome of eukaryotic organisms.
- The inability to distinguish between 5-hydroxymethylcytosine (5-hmC) and 5-methylcytosine (5-mC) presents challenges to studying and understanding better the significance of endogenous hydroxylation of 5-methylcytosine in genomic DNA. Solutions that enable detection or mapping of the hydroxymethylation state open the door to diagnostic and therapeutic applications. Accordingly, methods and compositions are provided and described.
- In a number of embodiments, 5-hydroxymethylcytosines (5-hmC)—and specifically not 5-methylcytosines (5-mC)—are modified in a nucleic acid molecule. Methods and compositions involve β-glucosyltransferase (βGT), which is in the glycosyltransferase family of enzymes and which selectively glycosylates 5-hmC.
- Generally, embodiments involve selectively glycosylating 5-hmC in a nucleic acid sample and directly or indirectly detecting, qualitatively and/or quantitatively, the glycosylated nucleotides based on a molecule or compound that is attached to the glycosylated nucleotide. The attachment to the glycosylated nucleotide may occur at the time the nucleotide is glycosylated, through the use of a modified UDP-Glu molecule that already carries the attachment, or the attachment may be made subsequent to glycosylation with the modified Glu molecule. Other embodiments involve a modified and glycosylated nucleic acid molecule. Subsequent manipulation of the glycosylated nucleic acid using any number of different nucleic acid modifications is contemplated.
- In some embodiments, there are methods for at least the following: distinguishing 5-hydroxymethylcytosine from 5-methylcytosine in a nucleic acid molecule; modifying a nucleic acid molecule containing at least one 5-hydroxymethylcytosine; identifying 5-hydroxymethylcytosine in genomic DNA; preparing a nucleic acid that has been modified at nucleotides that contained a 5-hmC prior to modification (and specifically not modifying nucleotides that have a 5-mC); and, comparing a first nucleic acid sample with at least a second nucleic acid sample based on the presence and/or absence of 5-hydroxymethylcytosine in nucleic acids in the first sample.
- Methods may involve any of the following steps described herein. In some embodiments, methods involve incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosine in the nucleic acid molecule with a modified glucose (Glu) molecule. In other embodiments, methods may involve mixing the nucleic acids with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule. Other embodiments may involve contacting the nucleic acids with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule. In still further embodiments, a composition comprising nucleic acids, an effective amount of β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule is generated and then placed under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acids with a modified glucose (Glu) molecule. It is specifically contemplated that reactions involving any enzymes may be restricted or limited by time, enzyme concentration, substrate concentration, and/or template concentration. For example, there may be a partial restriction enzyme digest or partial glycosylation of nucleic acid molecules. Reaction conditions may be adjusted so that the reaction is carried out under conditions that result in about, at least about, or at most about 20, 30, 40, 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, 100% completion, or any range derivable therein.
- In some embodiments, methods may also involve one or more of the following regarding nucleic acids prior to and/or concurrent with glycosylation of nucleic acids (generating a nucleic acid that is glycosylated on nucleotides that were 5-hmC nucleotides): obtaining nucleic acid molecules; obtaining nucleic acid molecules from a biological sample; obtaining a biological sample containing nucleic acids from a subject; isolating nucleic acid molecules; purifying nucleic acid molecules; obtaining an array or microarray containing nucleic acids to be glycosylated; denaturing nucleic acid molecules; shearing or cutting nucleic acid; hybridizing nucleic acid molecules; incubating the nucleic acid molecule with an enzyme that is not β-glucosyltransferase; incubating the nucleic acid molecule with a restriction enzyme; attaching one or more chemical groups or compounds to the nucleic acid; conjugating one or more chemical groups or compounds to the nucleic acid; incubating nucleic acid molecules with an enzyme that modifies the nucleic acid molecules by adding or removing one or more elements, chemical groups, or compounds.
- Methods may further involve one or more of the following steps that are concurrent with and/or subsequent to glycosylation of nucleic acids: isolating nucleic acids glycosylated with the modified glucose; isolating glycosylated (and modified) nucleic acids based on the modification to the glucose; purifying glycosylated (and modified) nucleic acids based on the modification to the glucose; reacting the modified glucose in the glycosylated nucleic acid molecule with a detectable or functional moiety, such as a linker; conjugating or attaching a detectable or functional moiety to the glycosylated nucleotide; exposing to, incubating with, or mixing with the glycosylated nucleic acid an enzyme that will use the glycosylated nucleic acid as a substrate independent of the modification to the glucose; exposing to, incubating with, or mixing with the glycosylated nucleic acid an enzyme that will use the glycosylated nucleic acid as a substrate unless the modification to the glucose modifies, alters, prevents, or hinders it; exposing to, incubating with, or mixing with the glycosylated nucleic acid an enzyme that will use the glycosylated nucleic acid as a substrate unless the modification sterically prevents or inhibits the enzyme; enriching for nucleic acids containing modified and glycosylated nucleic acids; identifying 5-hydroxymethylcytosines in the nucleic acids using the modified glucose molecule; identifying 5-hydroxymethylcytosines in the nucleic acid by comparing glycosylated nucleic acids with unglycosylated nucleic acids; mapping the 5-hydroxymethylcytosines in the nucleic acid molecule; subjecting the glycosylated nucleic acid to chromatography; subjecting the glycosylated nucleic acid to a primer extension assay and comparing the results to a control nucleic acid; subjecting the glycosylated nucleic acid to a hybridization assay and comparing the results to a control nucleic acid; and/or sequencing the glycosylated nucleic acid and comparing the results to a control nucleic acid.
- Methods may also involve the following steps: cloning β-glucosyltransferase (βGT); synthesizing β-glucosyltransferase or a functional fragment thereof; isolating β-glucosyltransferase; purifying β-glucosyltransferase; synthesizing β-glucosyltransferase; placing β-glucosyltransferase in a sterile container; shipping purified or isolated β-glucosyltransferase in a container; and/or providing instructions regarding use of β-glucosyltransferase; incubating β-glucosyltransferase with UDP-glucose molecules and a nucleic acid substrate under conditions to promote glycosylation of the nucleic acid with the glucose molecule (which may or may not be modified) and result in a nucleic acid that is glycosylated at one or more 5-hydroxymethylcytosines.
- Methods and compositions may involve a purified nucleic acid, modified UDP-Glu, and/or enzyme, such as β-glucosyltransferase. Such protocols are known to those of skill in the art. In certain embodiments, purification may result in a molecule that is about or at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9% or more pure, or any range derivable therein, relative to any contaminating components (w/w or w/v).
- In other methods, there may be steps including, but not limited to, obtaining information (qualitative and/or quantitative) about one or more 5-hydroxymethylcytosines in a nucleic acid sample; ordering an assay to determine, identify, and/or map 5-hydroxymethylcytosines in a nucleic acid sample; reporting information (qualitative and/or quantitative) about one or more 5-hydroxymethylcytosines in a nucleic acid sample; comparing that information to information about 5-hydroxymethylcytosines in a control or comparative sample. Unless otherwise stated, the terms “determine,” “analyze,” “assay,” and “evaluate” in the context of a sample refer to transformation of that sample to gather qualitative and/or quantitative data about the sample.
- In some embodiments, nucleic acid molecules may be DNA, RNA, or a combination of both. Nucleic acids may be recombinant, genomic, or synthesized. In additional embodiments, methods involve nucleic acid molecules that are isolated and/or purified. The nucleic acid may be isolated from a cell or biological sample in some embodiments. Certain embodiments involve isolating nucleic acids from a eukaryotic, mammalian, or human cell. In some cases, they are isolated from non-nucleic acids. In some embodiments, the nucleic acid molecule is eukaryotic; in some cases, the nucleic acid is mammalian, which may be human. This means the nucleic acid molecule is isolated from a human cell and/or has a sequence that identifies it as human. In particular embodiments, it is contemplated that the nucleic acid molecule is not a prokaryotic nucleic acid, such as a bacterial nucleic acid molecule. In additional embodiments, isolated nucleic acid molecules are on an array. In particular cases, the array is a microarray. In some cases, a nucleic acid is isolated by any technique known to those of skill in the art, including, but not limited to, using a gel, column, matrix or filter to isolate the nucleic acids. In some embodiments, the gel is a polyacrylamide or agarose gel.
- Methods and compositions may also include a modified UDP-Glu. In some embodiments, the modified UDP-Glu comprises a modification moiety. In some embodiments, more than one modification moiety is included. The term “modification moiety” refers to a chemical compound or element that is added to a UDP-Glu molecule. A modified UDP-Glu refers to a UDP-Glu molecule having i) a modification moiety or ii) a chemical compound or element that is substituted for a molecule in UDP-Glu, such that the resulting modified compound has a different chemical formula than unmodified UDP-Glu. It is specifically contemplated that a modified UDP-Glu does not include a UDP-Glu that is radioactive by substitution of a molecule or compound in a UDP-Glu with the same molecule or compound, for example, a molecule or compound that is merely radioactive. In certain embodiments, it is contemplated that a modified UDP-Glu is not employed; instead, an unmodified UDP-Glu molecule is used in which one or more chemical compounds are a radioactive version of the same molecule.
- In certain embodiments, modified UDP-Glu or a modification moiety may comprise one or more detectable moieties. A detectable moiety refers to a chemical compound or element that is capable of being detected. In particular embodiments, a modified UDP-Glu is not a version of UDP-Glu that is radioactive, and in specific embodiments, a modified UDP-Glu does not have a radioactive carbon molecule. In certain embodiments, a detectable moiety is fluorescent, radioactive, enzymatic, electrochemical, or colorimetric. In some embodiments, the detectable moiety is a fluorophore or quantum dot. In particular embodiments, FRET may be employed to detect glycosylated nucleotides.
- In some embodiments, a modification moiety may be a linker that allows one or more functional or detectable moieties or isolation tags to be attached to the glycosylated 5-hmC molecules. In some embodiments the linker is an azide linker or a thiol linker. In further embodiments, the modification moiety may be an isolation tag, which means the tag can be used to isolate a molecule that is attached to the tag. In certain embodiments, the isolation tag is biotin or a histidine tag. In some cases, the tag is modified, such as with a detectable moiety. It is contemplated that the linker allows for other chemical compounds or substances to be attached to the glycosylated nucleic acid at 5-hmC. In some embodiments, a functional moiety is attached to the modified UDP-Glu molecule, which is then used to glycosylate 5-hmC nucleotides. In other embodiments, a functional moiety is attached to the modified glucose after 5-hmC nucleotides have been glycosylated. In certain embodiments one or more functional and/or detectable moieties and/or isolation tags are attached to each 5-hmC nucleotide.
- In further embodiments, a functional moiety comprises a molecule or compound that inhibits or blocks an enzyme from using the glycosylated 5-hydroxymethylcytosine in the nucleic acid molecule as a substrate. In some embodiments, the inhibition is sufficiently complete to prevent detection of an enzymatic reaction involving the glycosylated 5-hydroxymethylcytosine. It is contemplated that the molecule or compound that blocks an enzyme may do so by sterically blocking access of the enzyme. Such steric blocking moieties are specifically contemplated as modification moieties. In specific embodiments, the steric blocking moieties contain 1, 2, or 3 ringed structures, including but not limited to aromatic ring structures. In certain embodiments the blocking moiety is polyethylene glycol. In other embodiments, it is a nucleic acid, amino acid, carbohydrate, or fatty acid (including mono-, di-, or tri-versions).
- Methods and compositions may also involve one or more enzymes in addition to β-glucosyltransferase. In some embodiments, the enzyme is a restriction enzyme or a polymerase. In certain cases, embodiments involve a restriction enzyme. The restriction enzyme may be methylation-insensitive. In other embodiments, the enzyme is polymerase. In certain embodiments, nucleic acids are contacted with a restriction enzyme prior to, concurrent with, or subsequent to glycosylation of nucleic acids with a modified UDP-Glu. The glycosylated nucleic acid may be contacted with a polymerase before or after the nucleic acid has been exposed to a restriction enzyme.
- Methods and compositions involve distinguishing between 5-hydroxymethylcytosine and methylcytosine after modifying the 5-hydroxymethylcytosines and not the methylcytosines. Methods may involve identifying 5-hydroxymethylcytosines in the nucleic acids by comparing glycosylated nucleic acids with unglycosylated nucleic acids or to nucleic acids whose glycosylation state is already known. Detection of the modification can involve a wide variety of recombinant nucleic acid techniques. In some embodiments, a glycosylated nucleic acid molecule is incubated with polymerase, at least one primer, and one or more nucleotides under conditions to allow polymerization of the glycosylated nucleic acid. In additional embodiments, methods may involve sequencing a glycosylated nucleic acid molecule. In other embodiments, a glycosylated nucleic acid is used in a primer extension assay.
- Methods and compositions may involve a control nucleic acid. The control may be used to evaluate whether glycosylation or other enzymatic reactions are occurring. Alternatively, the control may be used to compare glycosylation states. The control may be a negative control or it may be a positive control. It may be a control that was not incubated with one or more reagents in the glycosylation reaction. Alternatively, a control nucleic acid may be a reference nucleic acid, which means its glycosylation state (based on qualitative and/or quantitative information related to glycosylation at 5-hydroxymethylcytosines, or the absence thereof) is used for comparing to a nucleic acid being evaluated. In some embodiments, multiple nucleic acids from different sources provide the basis for a control nucleic acid. Moreover, in some cases, the control nucleic acid is from a normal sample with respect to a particular attribute, such as a disease or condition, or other phenotype. In some embodiments, the control sample is from a different patient population, a different cell type or organ type, a different disease state, a different phase or severity of a disease state, a different prognosis, a different developmental stage, etc.
- In particular embodiments, there are methods for distinguishing 5-hydroxymethylcytosine from 5-methylcytosine in a nucleic acid molecule comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with a modified glucose molecule.
- Other methods concern modifying a nucleic acid molecule containing at least one 5-hydroxymethylcytosine comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with the modified Glu molecule.
- Particular embodiments involve identifying 5-hydroxymethylcytosines in genomic DNA comprising: a) isolating the genomic DNA; b) shearing or cutting the genomic DNA into pieces; c) mixing the genomic DNA pieces with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the genomic DNA with the modified UDP-Glu molecule; and, d) identifying 5-hydroxymethylcytosines in the genomic DNA using the modified UDP-Glu molecule.
- In further embodiments, there are methods for identifying 5-hydroxymethylcytosines in a nucleic acid molecule comprising: a) mixing the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule under conditions to promote glycosylation of the 5-hydroxymethylcytosines in the nucleic acid with the modified UDP-Glu molecule; b) mixing the glycosylated nucleic acid with a methylation-insensitive restriction enzyme, wherein the modified UDP-Glu molecule comprises a molecule or compound that prevents cleavage of the nucleic acid molecule at a site that would have been cleaved if the nucleic acid molecule had not been glycosylated with the modified UDP-Glu; and, c) identifying 5-hydroxymethylcytosines in the genomic DNA using the modified UDP-Glu molecule.
- Embodiments may involve methods for mapping 5-hydroxymethylcytosine in a nucleic acid molecule comprising incubating the nucleic acid molecule with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule to glycosylate 5-hydroxymethylcytosines in the nucleic acid molecule with the modified UDP-Glu molecule; and mapping the 5-hydroxymethylcytosines in the nucleic acid molecule. As discussed above, the 5-hydroxymethylcytosines in the nucleic acid may be mapped by a number of ways, including being mapped by sequencing the glycosylated nucleic acid and comparing the results to a control nucleic acid or by subjecting the glycosylated nucleic acid to a primer extension assay and comparing the results to a control nucleic acid. In some embodiments, 5-hydroxymethylcytosines in the nucleic acid are mapped by subjecting the glycosylated nucleic acid to a hybridization assay and comparing the results to a control nucleic acid.
- Additional embodiments include methods for obtaining information about the presence and/or absence of 5-hydroxymethylcytosine in nucleic acids in a first sample from a subject comprising: a) retrieving a first sample comprising nucleic acids from a biological sample; b) obtaining information about the presence and/or absence of 5-hydroxymethylcytosine in nucleic acids in the first sample, wherein the information is obtained by i) incubating the first nucleic acid sample with β-glucosyltransferase and a modified uridine diphosphoglucose (UDP-Glu) molecule, wherein a modified Glu is enzymatically attached to 5-hydroxymethylcytosines in nucleic acid molecules in the first nucleic acid sample; ii) detecting or measuring the 5-hydroxymethylcytosines based on the presence of the modified Glu to determine the 5-hydroxymethylcytosine status of the first nucleic acid sample; and, iii) comparing the 5-hydroxymethylcytosine status of the first nucleic acid sample with the 5-hydroxymethylcytosine status of nucleic acids in at least a second nucleic acid sample. In additional embodiments, instead of retrieving a first sample, methods concern obtaining a biological sample directly from a patient or extracting nucleic acids from a biological sample. In certain embodiments, the biological sample is from a patient. In further embodiments, the patient is a human patient.
- In some embodiments, methods comprise reporting information about the presence or absence of 5-hmC. In certain embodiments, the reporting is done on a document or an electronic version of a document. It is contemplated that in some embodiments, a clinician reports this information.
- Embodiments also concern kits, which may be in a suitable container, that can be used to achieve the described methods. In some embodiments, there are kits comprising purified β-glucosyltransferase and one or more modified uridine diphosphoglucose (UDP-Glu) molecules. The molecules may have or involve different types of modifications. In further embodiments, a kit may include one or more buffers, such as buffers for nucleic acids or for reactions involving nucleic acids. Other enzymes may be included in kits in addition to or instead of β-glucosyltransferase. In some embodiments, an enzyme is a polymerase. Kits may also include nucleotides for use with the polymerase. In some cases, a restriction enzyme is included in addition to or instead of a polymerase.
- Other embodiments also concern an array or microarray containing nucleic acid molecules that have been modified at the nucleotides that were 5-hmC.
- The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”
- It is contemplated that any embodiment discussed herein can be implemented with respect to any method or composition of the invention, and vice versa. Furthermore, compositions and kits of the invention can be used to achieve methods of the invention.
- Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.
- The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.” It is also contemplated that anything listed using the term “or” may also be specifically excluded.
- As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
- Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
- The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
-
FIGS. 1A-1B . General strategy for 5-hmC modification and identification. (FIG. 1A ) The 5-hmC in duplex DNA is modified using a β-glucosyltransferase enzyme (βGT) which covalently links a glucose molecule from UDP-Glucose (UDP-Glu) to the hydroxymethyl-modified base to produce 5-gmC. Functional tagging groups (X) can be installed onto 5-hmC using synthetically modified UDP-Glu. (FIG. 1B ) Functional tagging of 5-hmC within nucleic acid molecules, such as with thiol or azide reactive groups, allows for the further covalent attachment of functional groups, such as biotin, that have been developed for use in a myriad of molecular biology techniques. -
FIG. 2 . Restriction enzyme digestion assay of a 40 mer DNA with CC*GG (where C*=C, 5-meC, 5-hmC, or 5-gmC). The 5-gmC modification in duplex DNA completely blocks the activity of the restriction enzyme MspI. -
FIG. 3 . Synthetic scheme for UDP-6-N3-UDP. The synthesis started from commercially available I. Treatment of I with NBS and Ph3P in DMF selectively afforded 6-bromo derivative II. Without isolation, treatment of II with sodium azide followed by acetylation of the hydroxyl groups in pyridine generated compound III. Conversion of the 1-MeO to the corresponding 1-OAc by treatment of III with acetic acid and acetic anhydride in the presence of sulfuric acid gave IV. Selective removal of the 1-α-acetyl protecting group with benzylamine provided compound V, which was converted to 1-α-phosphoric acid by treating V with 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one followed by hydrolysis and oxidation. The target molecule UDP-6-N3-UDP was obtained by treatment of VI with uridine 5-monophosphomorpholidate 4-morpholine-N,N-dicyclohexylcarboxamidine salt and tetrazole in pyridine and the subsequent treatment reaction with triethylamine and aqueous solution of NH4HCO3 in methanol to remove the acetyl groups. UDP-6-N3-UDP was purified by C18 reverse-phase HPLC and its structure was confirmed by 1H NMR, 13C NMR, 31P NMR, MALDI-TOF MS, and HRMS. -
FIGS. 4A-4B . High-throughput methods to detect the 5-hmC modification in genomic DNA. (FIG. 4A ) Reactivity differences of 5-meC, 5-hmC, and 5-gmC to bisulfite can be exploited to differentiate 5-hmC or 5-gmC from 5-meC. (FIG. 4B ) A photosensitizer installed specifically on 5-gmC can lead to photosensitized oxidation of the labeled (5-gmC)pG, and subsequent base-mediated strand cleavage selective to this region. -
FIGS. 5A-5B . Mass spectra of 5-hmC, 5-N3-gmC and biotin-5-N3-gmC-containing 15 mer DNA with the corresponding reactions on the side. (FIG. 5A) MALDI-TOF of 5-hmC, 5-N3-gmC and biotin-5-N3-gmC-containing 15 mer DNA, respectively, with the calculated molecular weight and observed molecular weight indicated. (FIG. 5B) Corresponding reactions: βGT-catalyzed transfer of N3-glucose to 5-hmC to form 5-N3-gmC, and the subsequent copper-free click chemistry on 5-N3-gmC. -
FIG. 6 . Activity assays of wild-type βGT on UDP-Glu and UDP-6-N3-Glu. For all activity assays, a 60 μL reaction solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 30 μM dsDNA (sequence is shown in the inset), 300 μM UDP-Glu or UDP-6-N3-Glu, and 0.03 μM wild-type βGT (for UDP-Glu) or 0.15 μM wild-type βGT (for UDP-6-N3-Glu) was incubated at 37° C. The reaction was stopped at different incubation time points up to 6 min by immediately adding 100 mM EDTA, and the mixture was passed through a Bio-Spin 6 column (Bio-Rad) to remove the excess UDP-Glu or UDP-6-N3-Glu. Samples were analyzed by HPLC with a C18 reverse-phase column equilibrated with buffer A (0.1 M TEAA, pH 7.0) and buffer B (CH3CN), showing an approximately 6-fold decrease in rate (kcat) for reactions with UDP-6-N3-Glu compared to those with regular UDP-Glu. -
FIGS. 7A-7B . HPLC analysis of the click reaction. (A), Reaction scheme of the click chemistry between compound 1 and the 11-mer synthetic DNA containing N3-5-gmC. (B), HPLC chromatograms (at 260 nm) of the nucleosides derived from the 11-mer N3-5-gmC-containing synthetic DNA before and after the click chemistry. The peak corresponding to N3-5-gmC decreased dramatically after the click chemistry, indicating that the reaction yield is over 90%. DNA was digested by Nuclease P1 (Sigma) and Alkaline Phosphatase (Sigma). Samples were analyzed by HPLC with a C18 reverse-phase column equilibrated with buffer A (5 mM ammonium acetate, pH 7.5) and buffer B (5 mM ammonium, 0.01% TFA, 60% CH3CN). -
FIGS. 8A-8B . HPLC, and MS identification of biotin-N3-5-gmC. (A), HPLC chromatograms (260 nm) of the nucleosides derived from the 11-mer biotin-N3-5-gmC-containing synthetic DNA. The peaks corresponding to biotin-N3-5-gmC (a pair of isomers) were collected and subjected to HRMS analysis. (B), HRMS of biotin-N3-5-gmC (structures are shown in the insets). Theoretical m/z values are shown; observed m/z values are also shown. -
FIGS. 9A-9B . The streptavidin adduct of biotin-5-N3-gmC hinders primer extension. (A) Sequence of the 40 mer DNA containing cytosine derivatives used in primer extension. Cytosine 25, counting from right to left and marked with an asterisk, represents the modified cytosine in the sequences and is also the position at which DNA polymerases tended to stall when the streptavidin adduct of biotin-5-N3-gmC-containing DNA was used as template. The arrow corresponds to the reverse PCR primer used for primer extension. (B) Primer extension assays for 40 mer DNA containing different cytosine species, shown beside a Sanger sequencing ladder. No significant incomplete extension was observed with regular cytosine-, 5-mC-, or 5-hmC-containing DNA. Partial stalling of the primer extension at the modified position was observed with biotin-5-N3-gmC-containing DNA. Primer extension was completely stalled when 5-N3-gmC-containing DNA was treated with 6 eq.-48 eq. of streptavidin. The position with the most significant stalling was position 24, one base before the modified position, although significant stalling was still observed at the modified position (position 25, arrow). Primer extension used Sigma TaqRED polymerase, with extension at 72° C. for 1 min. The sequencing ladder was generated with Sequenase from USB. -
FIGS. 10A-10D . Quantification of 5-hmC in various cell lines and tissues. (A) Dot-blot assay of avidin-HRP detection and quantification of mouse cerebellum genomic DNA containing biotin-N3-5-gmC. Top row: 40 ng of biotin-labeled samples using UDP-6-N3-Glu. Bottom row: 40 ng of control samples using regular UDP-Glu without biotin label. The exact same procedures were followed for experiments in both rows. P7, P14 and P21 represent postnatal day -
FIGS. 11A-11C . Validation of the 5-hmC labeling method by antibody and HRMS. (A), Dot-blot assay using anti-5-hmC antibody (Active Motif) with cerebellum genomic DNAs confirming an age-dependent accumulation of 5-hmC in mouse cerebellum. Quantification is shown on the right. (B), High resolution MS/MS CID spectrum (collision energy=15) of biotin-N3-5-gmC from the digestion of enriched HeLa genomic DNA labeled with biotin on 5-hmC. Structures are shown in the insets. Theoretical m/z values are shown; observed m/z values are also shown. Parent and fragment ion structures are shown in the panel. (C), MS/MS spectrum for (M+2H)2+ of biotin-N3-5-gmC obtained from the digestion of enriched HeLa genomic DNA labeled with biotin on 5-hmC. Structures are shown in the insets. Theoretical m/z values and observed m/z values are shown. -
FIGS. 12A-12D . Genome-wide distribution of 5-hmC in adult mouse cerebellum and gene-specific acquisition of intragenic 5-hmC during mouse cerebellum development. (A) Genome-scale reproducibility of 5-hmC profiles and enrichment relative to genomic DNA and control-treated DNA in adult mouse cerebellum. Heatmap representations of read densities have been equally scaled and then normalized based on the total number of mapped reads per sample. Data are derived from a single lane of sequence from each condition. Control, UDP-Glu treated without biotin; Input, genomic DNA; 5-hmC, UDP-6-N3-Glu treated with biotin incorporated. (B) Metagene profiles of 5-hmC and input genomic DNA reads mapped relative to RefSeq transcripts expressed at different levels in adult mouse cerebellum. RefSeq transcripts were divided into four equally sized bins based on gene expression level and 5-hmC or input genomic DNA reads falling in 10-bp bins centered on transcription start sites or end sites. The reads were summed and normalized based on the total number of aligned reads (in millions). Input genomic DNA reads were mapped to each of the four gene expression level bins and are plotted here in black. The profiles completely overlap and so are collectively referred to as ‘Input’. (C) Proximal and intragenic enrichment of 5-hmC relative to surrounding regions in adult and P7 mouse cerebellum. Reads from 5-hmC-captured samples and input genomic DNA were summed in 10-bp intervals centered on either TSS or txEnds and normalized to the total number of aligned reads from each sample (in millions). (D) Enrichment of pathways associated with age-related neurodegenerative diseases in genes acquiring intragenic 5-hmC in adult mice relative to P7 mice. Shown are the number of genes that acquired 5-hmC in adult cerebellum and the number of genes expected based on the total number of genes associated with that pathway in mouse. **, P<10−10; *, P<10−5. -
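The read-count normalization described for FIG. 12 (reads summed in 10-bp bins around an anchor such as a TSS, then scaled by the total number of aligned reads in millions) can be sketched as follows. This is a minimal illustration only: the function name, the 50-bp flank, and the example coordinates are hypothetical, strand orientation is ignored, and a single anchor is used rather than the sum over many transcripts that a metagene profile would require.

```python
from collections import Counter

def binned_rpm(read_positions, anchor, total_mapped_reads, bin_size=10, flank=50):
    """Count reads in fixed-width bins around an anchor, scaled to reads per million.

    read_positions: genomic coordinates of reads (one integer per read)
    anchor: coordinate of the TSS or transcript end
    total_mapped_reads: total aligned reads in the sample
    """
    counts = Counter()
    for pos in read_positions:
        offset = pos - anchor
        if -flank <= offset < flank:
            # Floor division assigns each read to the bin whose start
            # is a multiple of bin_size relative to the anchor.
            counts[(offset // bin_size) * bin_size] += 1
    scale = 1_000_000 / total_mapped_reads
    return {start: counts.get(start, 0) * scale
            for start in range(-flank, flank, bin_size)}

# Hypothetical reads near an anchor at coordinate 1000
profile = binned_rpm([995, 1000, 1004, 1012, 1100], anchor=1000,
                     total_mapped_reads=2_000_000)
```

Dividing by the total aligned reads is what allows samples of different sequencing depth (for example, 5-hmC-captured and input DNA) to be plotted on the same scale.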
FIGS. 13A-13B . Reads mapping to 5-hmC and control spike. (A), Sequences of the 5-hmC spike and control spike. (B), Equal amounts of the two spikes were added into mouse genomic DNA. After 5-hmC labeling, enrichment and deep sequencing, reads mapping to the 5-hmC spike and the control spike are shown. A total of 131 reads mapped to the 5-hmC spike and 5 reads mapped to the negative control, indicating that enrichment for 5-hmC was successful. -
FIG. 14 . Verification of 5-hmC-enriched regions by qPCR. Regions were identified as peaks in Adult Female relative to P7 and subsequently verified by qPCR in both Adult Female and Male. The X-axis is labeled with the names of the genes within which each identified peak was located. Fold Enrichment is calculated as 2^(-dCt), where dCt=Ct (5-hmC enriched)−Ct (Input). Control regions are two regions that were not identified as 5-hmC peaks in Adult Female relative to P7. -
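The fold-enrichment formula stated for FIG. 14 can be written out as a short calculation. The function name and the example Ct values below are hypothetical and are not taken from the figure; the sketch assumes ideal doubling per PCR cycle, which is what the base of 2 in the formula implies.

```python
def fold_enrichment(ct_enriched, ct_input):
    """Fold enrichment = 2^(-dCt), with dCt = Ct(5-hmC enriched) - Ct(Input)."""
    d_ct = ct_enriched - ct_input
    return 2.0 ** (-d_ct)

# Hypothetical example: the enriched sample crosses threshold 3 cycles
# earlier than input, so dCt = -3.
example = fold_enrichment(ct_enriched=24.0, ct_input=27.0)
```

A lower Ct in the enriched sample than in input gives a negative dCt and therefore a fold enrichment greater than 1; equal Ct values give exactly 1.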
FIG. 15 . Percentages of sequencing reads from MBD-Seq, MeDIP-Seq and 5-hmC-Seq mapped to RepeatMasker (Rmsk) and RefSeq. MeDIP-Seq, MBD-Seq, and 5-hmC reads were aligned to the NCBI37, mm9 using identical parameters and identified as RepeatMasker (Rmsk) or RefSeq if overlapping ≥1 bp of a particular annotation. The fraction of total reads corresponding to each was then determined. The expected fraction of reads based on the fraction of genomic sequence corresponding to either Rmsk or RefSeq was also plotted for comparison. -
FIG. 16 . Examples of intragenic enrichment of 5-hmC at genes that have been linked to ataxia and disorders of Purkinje cell degeneration in mouse and human. Top panel shows Ataxin 1 while bottom panel shows RORa, with pink representing female and blue representing male. -
FIG. 17 . Genomic DNA was extracted from HeLa cells and subsequently sonicated into 100-500 bp fragments. These fragments were divided into two groups; β-glucosyltransferase was used to add either azide-glucose or regular glucose (control group) to potential 5-hmC sites, followed by biotinylation of the fragments using click chemistry. Only the azide-glucose group is biotinylated; the control group is not. Both groups were then applied to a monomeric avidin column. After elution, UV absorbance showed that only the azide-glucose group yielded pulled-down DNA; the control group did not. -
FIG. 18 . The βGT-catalyzed formation of N3-5-gmC and the subsequent click chemistry to yield biotin-N3-5-gmC on the TCGA site in duplex DNA. Modification on only one strand is shown. -
FIGS. 19A-19B . TaqαI-mediated digestion of 5-hmC-, N3-5-gmC-, and biotin-N3-5-gmC-containing DNA with the sequences shown on top. *C indicates the modified position; arrows indicate TaqαI cutting sites. (A), Digestion of fully-modified DNA. (B), Digestion of hemi-modified DNA. The 32-mer dsDNA (1 pmol) was digested with 100 U of TaqαI (New England BioLabs) for 1 hr at 65° C. Samples were analyzed by 16% PAGE/Urea gel and visualized using SYBR Green I staining (Lumiprobe). -
FIGS. 20A-20B . MspI digestion of 5-hmC-, N3-5-gmC-, and biotin-N3-5-gmC-containing DNA with the sequences shown on top. *C indicates the modified position; arrows indicate MspI cutting sites. (A), Digestion of fully-modified DNA. (B), Digestion of hemi-modified DNA. The 32-mer dsDNA (1 pmol) was digested with 100 U of MspI (New England BioLabs) for 1 hr at 37° C. Samples were analyzed by 16% PAGE/Urea gel and visualized using SYBR Green I staining (Lumiprobe). -
FIGS. 21A-21C . Development of a cleavable biotin-containing capture agent with a disulfide linker as the click reaction partner to form biotin-S-S-N3-5-gmC.
- Certain embodiments are directed to methods and compositions for modifying 5-hmC, detecting 5-hmC, and/or evaluating 5-hmC in nucleic acids. In certain aspects, 5-hmC is glycosylated. In a further aspect, 5-hmC is coupled to a labeled or modified glucose moiety. Using the methods described herein, a large variety of detectable groups (biotin, fluorescent tag, radioactive groups, etc.) can be coupled to 5-hmC via a glucose modification.
- Modification of 5-hmC can be performed using the enzyme β-glucosyltransferase (βGT), or a similar enzyme, that catalyzes the transfer of a glucose moiety from uridine diphosphoglucose (UDP-Glu) to the hydroxyl group of 5-hmC, yielding β-glucosyl-5-hydroxymethyl-cytosine (5-gmC). The inventors have found that this enzymatic glycosylation offers a strategy for incorporating modified glucose molecules for labeling or tagging 5-hmC in eukaryotic nucleic acids. For instance, a glucose molecule chemically modified to contain an azide (N3) group may be covalently attached to 5-hmC through this enzyme-catalyzed glycosylation. Thereafter, phosphine-activated reagents, including but not limited to biotin-phosphine, fluorophore-phosphine, and NHS-phosphine, or other affinity tags can be specifically installed onto glycosylated 5-hmC via reactions with the azide.
- Chemical tagging can be used to determine the precise locations of 5-hmC in a high throughput manner. The inventors have shown that the 5-gmC modification renders the labeled DNA resistant to restriction enzyme digestion and/or polymerization. In certain aspects, glycosylated and unmodified genomic DNA may be treated with restriction enzymes and subsequently subjected to various sequencing methods to reveal the precise locations of each cytosine modification that hampers the digestion.
- The inventors have shown that a functional group (e.g., an azide group) can be incorporated into DNA using methods described herein. This incorporation of a functional group allows further labeling or tagging of cytosine residues with biotin and other tags. The labeling or tagging of 5-hmC can use, for example, click chemistry or other functional/coupling groups known to those skilled in the art. The labeled or tagged DNA fragments containing 5-hmC can be isolated and/or evaluated using modified versions of methods currently used to evaluate 5-mC-containing nucleic acids.
- Furthermore, methods and compositions of the invention may be used to introduce a sterically bulky group to 5-hmC. The presence of a bulky group on the DNA template strand will interfere with the synthesis of a nucleic acid strand by DNA polymerase or RNA polymerase, with the efficient cleavage of DNA by a restriction endonuclease, or with other enzymatic modifications of nucleic acid containing 5-hmC. As a result, primer extension or other assays can be employed, for example, to evaluate a partially extended primer of a certain length, and the modification sites can be revealed by sequencing the partially extended primers. Other approaches taking advantage of this chemical tagging method are also contemplated.
- In certain aspects, differential modification of nucleic acid between two or more samples can be evaluated. Studies including heart, liver, lungs, kidney, muscle, testes, spleen, and brain indicate that under normal conditions 5-hmC is found predominantly in normal brain cells. Additional studies have shown that 5-hmC is also present in mouse embryonic stem cells. The Ten-eleven translocation 1 (TET1) protein has been identified as the catalyst for converting 5-mC to 5-hmC. Studies have shown that TET1 expression is inversely correlated with 5-mC levels. Overexpression of TET1 in cells seems to correlate with increased levels of 5-hmC. Also, TET1 is known to be involved in pediatric and adult acute myeloid leukemia and acute lymphoblastic leukemia. Thus, evaluating and comparing 5-hmC levels can be used in evaluating various disease states and comparing various nucleic acid samples.
- I. Modification of 5-hmC
- Certain embodiments are directed to methods and compositions for modifying eukaryotic nucleic acids containing 5-hmC. In certain aspects a target nucleic acid is contacted with a β-glucosyltransferase enzyme and a UDP substrate comprising a modified or modifiable glucose moiety.
- A. β-glucosyltransferase (βGT)
- A glucosyl-DNA beta-glucosyltransferase (EC 2.4.1.28, β-glucosyltransferase (βGT)) is an enzyme that catalyzes the chemical reaction in which a beta-D-glucosyl residue is transferred from UDP-glucose to a glucosylhydroxymethylcytosine residue in a nucleic acid. This enzyme resembles DNA beta-glucosyltransferase in that respect. This enzyme belongs to the family of glycosyltransferases, specifically the hexosyltransferases. The systematic name of this enzyme class is UDP-glucose:D-glucosyl-DNA beta-D-glucosyltransferase. Other names in common use include T6-glucosyl-HMC-beta-glucosyl transferase, T6-beta-glucosyl transferase, uridine diphosphoglucose-glucosyldeoxyribonucleate, and beta-glucosyltransferase.
- In certain aspects, the β-glucosyltransferase is a His-tag fusion protein having the amino acid sequence (βGT begins at amino acid 25 (Met)):
-
(SEQ ID NO: 1) SHHHHHHSSGVDLGTENLYFQSNAMKIAIINMGNNVINFKTVPSSETIYL FKVISEMGLNVDIISLKNGVYTKSFDEVDVNDYDRLIVVNSSINFFGGKP NLAILSAQKFMAKYKSKIYYLFTDIRLPFSQSWPNVKNRPWAYLYTEEEL LIKSPIKVISQGINLDIAKAAHKKVDNVIEFEYFPIEQYKIHMNDFQLSK PTKKTLDVIYGGSFRSGQRESKMVEFLFDTGLNIEFFGNAREKQFKNPKY PWTKAPVFTGKIPMNMVSEKNSQAIAALIIGDKNYNDNFITLRVWETMAS DAVMLIDEEFDTKHRIINDARFYVNNRAELIDRVNELKHSDVLRKEMLSI QHDILNKTRAKKAEWQDAFKKAIDL.
- In other embodiments, the protein may be used without the His-tag (hexa-histidine tag shown above) portion. For example, βGT was cloned into the target vector pMCSG19 by the Ligation Independent Cloning (LIC) method according to Donnelly et al. (2006). The resulting plasmid was transformed into BL21 star (DE3) competent cells containing pRK1037 (Science Reagents, Inc.) by heat shock. Positive colonies were selected with 150 μg/ml Ampicillin and 30 μg/ml Kanamycin. One liter of cells was grown at 37° C. from a 1:100 dilution of an overnight culture. The cells were induced with 1 mM of IPTG when OD600 reached 0.6-0.8. After overnight growth at 16° C. with shaking, the cells were collected by centrifugation and suspended in 30 mL Ni-NTA buffer A (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 30 mM imidazole, and 10 mM β-ME) with protease inhibitor PMSF. After loading onto a Ni-NTA column, proteins were eluted with a 0-100% gradient of Ni-NTA buffer B (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 400 mM imidazole, and 10 mM β-ME). βGT-containing fractions were further purified by MonoS (Buffer A: 10 mM Tris-HCl pH 7.5; Buffer B: 10 mM Tris-HCl pH 7.5, and 1M NaCl) to remove DNA. Finally, the collected protein fractions were loaded onto a Superdex 200 (GE) gel-filtration column equilibrated with 50 mM Tris-HCl pH 7.5, 20 mM MgCl2, and 10 mM β-ME. SDS-PAGE gel revealed a high degree of purity of βGT. βGT was concentrated to 45 μM and stored frozen at −80° C. with an addition of 30% glycerol.
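The tag layout stated for SEQ ID NO: 1 (a hexa-histidine tag near the N-terminus, with βGT beginning at amino acid 25) can be checked directly against the printed sequence. The sketch below is illustrative only: it uses just the first 30 residues as printed, and the helper name is hypothetical.

```python
# First 30 residues of SEQ ID NO: 1, copied from the sequence as printed above.
SEQ_PREFIX = "SHHHHHHSSGVDLGTENLYFQSNAMKIAII"

def first_index_1based(sequence, motif):
    """Return the 1-based position of the first occurrence of motif, or None."""
    idx = sequence.find(motif)
    return idx + 1 if idx >= 0 else None

# Position where the six-histidine run starts
his_start = first_index_1based(SEQ_PREFIX, "HHHHHH")

# Position of the first methionine, taken here as the start of the βGT portion
met_pos = first_index_1based(SEQ_PREFIX, "M")
```

With the prefix as printed, the histidine run starts at residue 2 and the first methionine falls at residue 25, which agrees with the statement that βGT begins at amino acid 25.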
- A variety of proteins can be purified using methods known in the art. Protein purification is a series of processes intended to isolate a single type of protein from a complex mixture. Protein purification is vital for the characterization of the function, structure and interactions of the protein of interest. The starting material is usually a biological tissue or a microbial culture. The various steps in the purification process may free the protein from a matrix that confines it, separate the protein and non-protein parts of the mixture, and finally separate the desired protein from all other proteins. Separation of one protein from all others is typically the most laborious aspect of protein purification. Separation steps exploit differences in protein size, physico-chemical properties and binding affinity.
- Evaluating purification yield. The most general method to monitor the purification process is by running an SDS-PAGE of the different steps. This method only gives a rough measure of the amounts of different proteins in the mixture, and it is not able to distinguish between proteins with similar molecular weight. If the protein has a distinguishing spectroscopic feature or an enzymatic activity, this property can be used to detect and quantify the specific protein, and thus to select the fractions of the separation that contain the protein. If antibodies against the protein are available, then western blotting and ELISA can specifically detect and quantify the amount of desired protein. Some proteins function as receptors and can be detected during purification steps by a ligand binding assay, often using a radioactive ligand.
- In order to evaluate the process of multistep purification, the amount of the specific protein has to be compared to the amount of total protein. The latter can be determined by the Bradford total protein assay or by absorbance of light at 280 nm; however, some reagents used during the purification process may interfere with the quantification. For example, imidazole (commonly used for purification of polyhistidine-tagged recombinant proteins) is an amino acid analogue and at low concentrations will interfere with the bicinchoninic acid (BCA) assay for total protein quantification. Impurities in low-grade imidazole will also absorb at 280 nm, resulting in an inaccurate reading of protein concentration from UV absorbance.
- Another method to be considered is Surface Plasmon Resonance (SPR). SPR can detect binding of label-free molecules on the surface of a chip. If the desired protein is an antibody, binding can be translated directly to the activity of the protein. One can express the active concentration of the protein as the percent of the total protein. SPR can be a powerful method for quickly determining protein activity and overall yield, although it requires a dedicated instrument.
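The comparison of specific protein to total protein across a multistep purification is commonly summarized as specific activity, yield, and fold purification. The following is a minimal sketch under the assumption that enzymatic activity is the measure of the specific protein; the function name, step names, and all numerical values are hypothetical and do not come from the purification described above.

```python
def purification_summary(steps):
    """Summarize a multistep purification.

    steps: list of (name, total_protein_mg, total_activity_units),
           with the first entry being the starting material.
    Returns one dict per step with specific activity (units/mg),
    yield (% of starting activity) and fold purification.
    """
    _, first_protein, first_activity = steps[0]
    base_specific = first_activity / first_protein
    rows = []
    for name, protein_mg, activity_units in steps:
        specific = activity_units / protein_mg
        rows.append({
            "step": name,
            "specific_activity": specific,
            "yield_percent": 100.0 * activity_units / first_activity,
            "fold_purification": specific / base_specific,
        })
    return rows

# Hypothetical numbers for illustration only
table = purification_summary([
    ("crude lysate", 1000.0, 10000.0),
    ("affinity column", 50.0, 8000.0),
])
```

In this hypothetical example the affinity step retains 80% of the starting activity while removing most of the total protein, so the specific activity rises 16-fold.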
- Methods of protein purification. The methods used in protein purification can roughly be divided into analytical and preparative methods. The distinction is not exact, but the deciding factor is the amount of protein that can practically be purified with that method. Analytical methods aim to detect and identify a protein in a mixture, whereas preparative methods aim to produce large quantities of the protein for other purposes, such as structural biology or industrial use.
- Depending on the source, the protein has to be brought into solution by breaking the tissue or cells containing it. There are several methods to achieve this: Repeated freezing and thawing, sonication, homogenization by high pressure, filtration (either via cellulose-based depth filters or cross-flow filtration), or permeabilization by organic solvents. The method of choice depends on how fragile the protein is and how sturdy the cells are. After this extraction process soluble proteins will be in the solvent, and can be separated from cell membranes, DNA etc. by centrifugation. The extraction process also extracts proteases, which will start digesting the proteins in the solution. If the protein is sensitive to proteolysis, it is usually desirable to proceed quickly, and keep the extract cooled, to slow down proteolysis.
- In bulk protein purification, a common first step to isolate proteins is precipitation with ammonium sulfate ((NH4)2SO4). This is performed by adding increasing amounts of ammonium sulfate and collecting the different fractions of precipitated protein. One advantage of this method is that it can be performed inexpensively with very large volumes.
- The first proteins to be purified are water-soluble proteins. Purification of integral membrane proteins requires disruption of the cell membrane in order to isolate any one particular protein from others that are in the same membrane compartment. Sometimes a particular membrane fraction can be isolated first, such as isolating mitochondria from cells before purifying a protein located in a mitochondrial membrane. A detergent such as sodium dodecyl sulfate (SDS) can be used to dissolve cell membranes and keep membrane proteins in solution during purification; however, because SDS causes denaturation, milder detergents such as TRITON X-100 (poly(oxy-1,2-ethanediyl), α-(4-(1,1,3,3-tetramethylbutyl)phenyl)-ω-hydroxy) or CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate) can be used to retain the protein's native conformation during complete purification.
- Centrifugation is a process that uses centrifugal force to separate mixtures of particles of varying masses or densities suspended in a liquid. When a vessel (typically a tube or bottle) containing a mixture of proteins or other particulate matter, such as bacterial cells, is rotated at high speeds, the rotation subjects each particle to an outward (centrifugal) force that is proportional to its mass. The tendency of a given particle to move through the liquid because of this force is offset by the resistance the liquid exerts on the particle. The net effect of “spinning” the sample in a centrifuge is that massive, small, and dense particles move outward faster than less massive particles or particles with more “drag” in the liquid. When suspensions of particles are “spun” in a centrifuge, a “pellet” may form at the bottom of the vessel that is enriched for the most massive particles with low drag in the liquid. Non-compacted particles still remaining mostly in the liquid are called the “supernatant” and can be removed from the vessel to separate the supernatant from the pellet. The rate of centrifugation is specified by the acceleration applied to the sample, typically expressed as a multiple of g, the acceleration due to gravity at the Earth's surface. If samples are centrifuged long enough, the particles in the vessel will reach equilibrium wherein the particles accumulate specifically at a point in the vessel where their buoyant density is balanced with centrifugal force. Such an “equilibrium” centrifugation can allow extensive purification of a given particle.
- In sucrose gradient centrifugation, a linear concentration gradient of sugar (typically sucrose, glycerol, or a silica-based density gradient medium, like Percoll™) is generated in a tube such that the highest concentration is on the bottom and the lowest on top. A protein sample is then layered on top of the gradient and spun at high speeds in an ultracentrifuge. This causes heavy macromolecules to migrate towards the bottom of the tube faster than lighter material. After the proteins/particles have been separated, the gradient is fractionated and collected.
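Expressing the applied acceleration as a multiple of g can be illustrated with the standard relation between rotor speed, rotor radius, and relative centrifugal force (RCF = ω²r/g). The sketch below is a generic calculation, not a protocol from this disclosure; the function names and the example rotor radius and speed are hypothetical.

```python
import math

G = 9.80665  # standard acceleration due to gravity, m/s^2

def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    omega = 2.0 * math.pi * rpm / 60.0          # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100.0) / G  # centripetal acceleration / g

def rpm_for_rcf(target_rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target RCF at a given radius."""
    omega = math.sqrt(target_rcf * G / (radius_cm / 100.0))
    return omega * 60.0 / (2.0 * math.pi)

# Hypothetical rotor: 10 cm radius spun at 10,000 rpm
example_rcf = rcf(10_000, 10.0)
```

Because RCF scales with the square of the rotor speed and linearly with radius, the same rpm setting gives different accelerations in different rotors, which is why protocols are usually specified in multiples of g rather than rpm.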
- Usually a protein purification protocol contains one or more chromatographic steps. The basic procedure in chromatography is to flow the solution containing the protein through a column packed with various materials. Different proteins interact differently with the column material, and can thus be separated by the time required to pass the column, or the conditions required to elute the protein from the column. Usually proteins are detected as they are coming off the column by their absorbance at 280 nm. Many different chromatographic methods exist:
- Chromatography can be used to separate proteins under native (in solution) or denaturing conditions by using porous gels. This technique is known as size exclusion chromatography. The principle is that smaller molecules have to traverse a larger volume in a porous matrix. Consequently, proteins within a certain size range will require a variable volume of eluent (solvent) before being collected at the other end of the column of gel.
- In the context of protein purification, the eluate is usually collected in different test tubes. All test tubes containing no measurable trace of the protein to be purified are discarded. The remaining solution is thus made up of the protein to be purified and any other similarly sized proteins.
- Ion exchange chromatography separates compounds according to the nature and degree of their ionic charge. The column to be used is selected according to its type and strength of charge. Anion exchange resins have a positive charge and are used to retain and separate negatively charged compounds, while cation exchange resins have a negative charge and are used to separate positively charged molecules. Before the separation begins a buffer is pumped through the column to equilibrate the opposing charged ions. Upon injection of the sample, solute molecules will exchange with the buffer ions as each competes for the binding sites on the resin. The length of retention for each solute depends upon the strength of its charge. The most weakly charged compounds will elute first, followed by those with successively stronger charges. Because of the nature of the separating mechanism, pH, buffer type, buffer concentration, and temperature all play important roles in controlling the separation.
- Affinity Chromatography is a separation technique based upon molecular conformation, which frequently utilizes application specific resins. These resins have ligands attached to their surfaces which are specific for the compounds to be separated. Most frequently, these ligands function in a fashion similar to that of antibody-antigen interactions. This “lock and key” fit between the ligand and its target compound makes it highly specific, frequently generating a single peak, while all else in the sample is unretained.
- Many membrane proteins are glycoproteins and can be purified by lectin affinity chromatography. Detergent-solubilized proteins can be allowed to bind to a chromatography resin that has been modified to have a covalently attached lectin. Proteins that do not bind to the lectin are washed away and then specifically bound glycoproteins can be eluted by adding a high concentration of a sugar that competes with the bound glycoproteins at the lectin binding site. Some lectins have high affinity binding to oligosaccharides of glycoproteins that is hard to compete with sugars, and bound glycoproteins need to be released by denaturing the lectin.
- A common technique involves engineering a sequence of 6 to 8 histidines into the N- or C-terminal of the protein. The polyhistidine binds strongly to divalent metal ions such as nickel and cobalt. The protein can be passed through a column containing immobilized nickel ions, which binds the polyhistidine tag. All untagged proteins pass through the column. The protein can be eluted with imidazole, which competes with the polyhistidine tag for binding to the column, or by a decrease in pH (typically to 4.5), which decreases the affinity of the tag for the resin. While this procedure is generally used for the purification of recombinant proteins with an engineered affinity tag (such as a 6× His tag or Clontech's HAT tag), it can also be used for natural proteins with an inherent affinity for divalent cations.
- Immunoaffinity chromatography uses the specific binding of an antibody to the target protein to selectively purify the protein. The procedure involves immobilizing an antibody to a column material, which then selectively binds the protein, while everything else flows through. The protein can be eluted by changing the pH or the salinity. Because this method does not involve engineering in a tag, it can be used for proteins from natural sources.
- Another way to tag proteins is to engineer an antigen peptide tag onto the protein, and then purify the protein on a column or by incubating with a loose resin that is coated with an immobilized antibody. This particular procedure is known as immunoprecipitation. Immunoprecipitation is quite capable of generating an extremely specific interaction which usually results in binding only the desired protein. The purified tagged proteins can then easily be separated from the other proteins in solution and later eluted back into clean solution. Tags can be cleaved by use of a protease. This often involves engineering a protease cleavage site between the tag and the protein.
- High performance liquid chromatography or high pressure liquid chromatography is a form of chromatography applying high pressure to drive the solutes through the column faster. This means that the diffusion is limited and the resolution is improved. The most common form is “reversed phase” HPLC, where the column material is hydrophobic. The proteins are eluted by a gradient of increasing amounts of an organic solvent, such as acetonitrile. The proteins elute according to their hydrophobicity. After purification by HPLC the protein is in a solution that only contains volatile compounds, and can easily be lyophilized. HPLC purification frequently results in denaturation of the purified proteins and is thus not applicable to proteins that do not spontaneously refold.
- At the end of a protein purification, the protein often has to be concentrated. Different methods exist. If the solution does not contain any soluble component other than the protein in question, the protein can be lyophilized (dried). This is commonly done after an HPLC run. This simply removes all volatile components, leaving the proteins behind.
- Ultrafiltration concentrates a protein solution using selective permeable membranes. The function of the membrane is to let the water and small molecules pass through while retaining the protein. The solution is forced against the membrane by mechanical pump or gas pressure or centrifugation.
- Gel electrophoresis is a common laboratory technique that can be used as both a preparative and an analytical method. The principle of electrophoresis relies on the movement of a charged ion in an electric field. In practice, the proteins are denatured in a solution containing a detergent (SDS). Under these conditions, the proteins are unfolded and coated with negatively charged detergent molecules. The proteins in SDS-PAGE are separated on the sole basis of their size.
- In analytical methods, the proteins migrate as bands based on size. Each band can be detected using stains such as Coomassie blue dye or silver stain. Preparative methods to purify large amounts of protein require the extraction of the protein from the electrophoretic gel. This extraction may involve excision of the gel containing a band, or eluting the band directly off the gel as it runs off the end of the gel.
- In the context of a purification strategy, denaturing condition electrophoresis provides an improved resolution over size exclusion chromatography, but does not scale to large quantities of protein in a sample as well as chromatography columns do.
- B. Modified Glucose Molecule
- A functionalized or labeled glucose molecule can be used in conjunction with βGT to modify 5-hmC in a nucleic polymer such as DNA or RNA.
- In certain aspects, the βGT UDP substrate comprises a functionalized or labeled glucose moiety. In a further aspect, the glucose moiety can be modified or functionalized using click chemistry or other coupling chemistries known in the art. Click chemistry is a chemical philosophy introduced by K. Barry Sharpless in 2001 (Kolb et al., 2001; Evans, 2007) and describes chemistry tailored to generate substances quickly and reliably by joining small units.
- The label can be any label that is detected, or is capable of being detected. Examples of suitable labels include, e.g., chromogenic label, a radiolabel, a fluorescent label, and a biotinylated label. Thus, the label can be, e.g., fluorescent glucose, biotin-labeled glucose, radiolabeled glucose and the like. In certain aspects, the label is a chromogenic label. The term “chromogenic label” includes all agents that have a distinct color or otherwise detectable marker. In addition to chemical structures having intrinsic, readily-observable colors in the visible range, other markers used include fluorescent groups, biotin tags, enzymes (that may be used in a reaction that results in the formation of a colored product), magnetic and isotopic markers, and so on. The foregoing list of detectable markers is for illustrative purposes only, and is in no way intended to be limiting or exhaustive.
- The label may be attached to the agent using methods known in the art. Labels include any detectable group attached to the glucose molecule, or detection agent that does not interfere with its function. Further labels that may be used include fluorescent labels, such as Fluorescein, TEXAS RED (sulforhodamine 101 acid chloride), Lucifer Yellow, Rhodamine, Nile-red (NILE BLUE oxazone), tetramethyl-rhodamine-5-isothiocyanate, 1,6-diphenyl-1,3,5-hexatriene, cis-Parinaric acid, Phycoerythrin, Allophycocyanin, 4′,6-diamidino-2-phenylindole (DAPI), HOECHST 33258 (2′-(4-hydroxyphenyl)-5-(4-methyl-1-piperazinyl)-2,5′-bi-1H-benzimidazole trihydrochloride hydrate), 2-aminobenzamide, and the like. Further labels include electron-dense metals, such as gold; ligands; haptens, such as biotin; and radioactive labels.
- A fluorophore contains or is a functional group that will absorb energy of a specific wavelength and re-emit energy at a different (but equally specific) wavelength. The amount and wavelength of the emitted energy depend on both the fluorophore and the chemical environment of the fluorophore. Fluorophores can be attached to protein using functional groups and/or linkers, such as amino groups (active ester, carboxylate, isothiocyanate, hydrazine); carboxyl groups (carbodiimide); thiol (maleimide, acetyl bromide); azide (via click chemistry); or non-specifically (glutaraldehyde).
- Fluorophores can be proteins, quantum dots (fluorescent semiconductor nanoparticles), or small molecules. Common dye families include, but are not limited to, Xanthene derivatives: fluorescein, rhodamine, OREGON GREEN (2′,7′-difluorofluorescein), eosin, TEXAS RED, etc.; Cyanine derivatives: cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine and merocyanine; Naphthalene derivatives (dansyl and prodan derivatives); Coumarin derivatives; oxadiazole derivatives: pyridyloxazole, nitrobenzoxadiazole and benzoxadiazole; Pyrene derivatives: cascade blue etc.; BODIPY (Invitrogen); Oxazine derivatives: Nile red, Nile blue, cresyl violet, oxazine 170 etc.; Acridine derivatives: proflavin, acridine orange, acridine yellow etc.; Arylmethine derivatives: auramine, crystal violet, malachite green; CF dye (Biotium); ALEXA FLUOR (Invitrogen); ATTO and TRACY (Sigma Aldrich); FLUOPROBES (Interchim); Tetrapyrrole derivatives: porphin, phthalocyanine, bilirubin; cascade yellow; azure B; acridine orange; DAPI; HOECHST 33258; lucifer yellow; piroxicam; quinine and anthraquinone; squarylium; oligophenylenes; and the like.
- Other fluorophores include: Hydroxycoumarin; Aminocoumarin; Methoxycoumarin; CASCADE BLUE ([4-[(4-diethylaminophenyl)-(4-ethylaminonaphthalen-2-yl)methylidene]-1-cyclohexa-2,5-dienylidene]-diethyl-azanium); Pacific Blue; Pacific Orange; Lucifer yellow; NBD (nitrobenzoxadiazole); R-Phycoerythrin (PE); PE-Cy5 conjugates; PE-Cy7 conjugates; Red 613 (PE-TEXAS RED; or conjugate of TEXAS RED with R-phycoerythrin); PerCP (Peridinin chlorophyll); TruRed (PerCP-Cy5.5 conjugate); FluorX; Fluorescein; BODIPY-FL (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene conjugate substitute for fluorescein); TRITC (isothiocyanate derivative of rhodamine); X-Rhodamine; Lissamine Rhodamine B; TEXAS RED; Allophycocyanin; APC-Cy7 conjugates.
- ALEXA FLUOR dyes (Molecular Probes) include: ALEXA FLUOR 350, ALEXA FLUOR 405, ALEXA FLUOR 430, ALEXA FLUOR 488, ALEXA FLUOR 500, ALEXA FLUOR 514, ALEXA FLUOR 532, ALEXA FLUOR 546, ALEXA FLUOR 555, ALEXA FLUOR 568, ALEXA FLUOR 594, ALEXA FLUOR 610, ALEXA FLUOR 633, ALEXA FLUOR 647, ALEXA FLUOR 660, ALEXA FLUOR 680, ALEXA FLUOR 700, ALEXA FLUOR 750, and ALEXA FLUOR 790.
- Cy Dyes (GE Healthcare) include Cy2, Cy3, Cy3B, Cy3.5, Cy5, Cy5.5 and Cy7.
- Nucleic acid probes include HOECHST 33342 (2′-(4-ethoxyphenyl)-5-(4-methyl-1-piperazinyl)-2,5-bi(1H-benzimidazole), DAPI, HOECHST 33258, SYTOX Blue, Chromomycin A3, Mithramycin, YOYO-1 (2-([1-(3-[[3-(dimethyl(3-[4-[(E)-(3-methyl-1,3-benzoxazol-3-ium-2-yl)methylidene]-1(4H)-quinolinyl]propyl)ammonio)propyl]diethyl)ammonio]propyl)-4(1H)-quinolinylidene]methyl)-3-methyl-1,3-benzoxazol-3-ium tetraiodide), Ethidium Bromide, Acridine Orange, SYTOX Green, TOTO-1 (thiazole orange dye), TO-PRO-1 (quinolinium, 4-[(3-methyl-2(3H)-benzothiazolylidene)methyl]-1-[3-(trimethylammonio)propyl]-, diiodide), TO-PRO: Cyanine Monomer, Thiazole Orange, Propidium Iodide (PI), LDS 751, 7-AAD, SYTOX Orange, TOTO-3 (quinolinium, 1,1′-[1,3-propanediylbis[(dimethyliminio)-3,1-propanediyl]]bis [4-[3-(3-methyl-2(3H)-benzothiazolylidene)-1-propenyl]-, tetraiodide), TO-PRO-3 (quinolinium, 4-[3-(3-methyl-2(3H)-benzothiazolylidene)-1-propenyl]-1-[3-(trimethylammonio)propyl]-, diiodide), and DRAQ5.
- Cell function probes include Indo-1, Fluo-3, DCFH, DHR, SNARF.
- Fluorescent proteins include Y66H, Y66F, EBFP, EBFP2, Azurite, GFPuv, T-Sapphire, Cerulean, mCFP, ECFP, CyPet, Y66W, mKeima-Red, TagCFP, AmCyan1, mTFP1, S65A, Midoriishi Cyan, Wild Type GFP, S65C, TurboGFP, TagGFP, S65L, Emerald, S65T (Invitrogen), EGFP (Clontech), Azami Green (MBL), ZsGreen1 (Clontech), TagYFP (Evrogen), EYFP (Clontech), Topaz, Venus, mCitrine, YPet, TurboYFP, ZsYellow1 (Clontech), Kusabira Orange (MBL), mOrange, mKO, TurboRFP (Evrogen), tdTomato, TagRFP (Evrogen), DsRed (Clontech), DsRed2 (Clontech), mStrawberry, TurboFP602 (Evrogen), AsRed2 (Clontech), mRFP1, J-Red, mCherry, HcRed1 (Clontech), Katushka, Kate (Evrogen), TurboFP635 (Evrogen), mPlum, and mRaspberry.
- 1. Click Chemistry
- The Huisgen 1,3-dipolar cycloaddition, in particular the Cu(I)-catalyzed stepwise variant, is often referred to simply as the “click reaction”. The Cu(I)-catalyzed variant (Tornoe et al., 2002) was first reported by Morten Meldal and co-workers from Carlsberg Laboratory, Denmark for the synthesis of peptidotriazoles on solid support. Fokin and Sharpless independently described it as a reliable catalytic process offering “an unprecedented level of selectivity, reliability, and scope for those organic synthesis endeavors which depend on the creation of covalent links between diverse building blocks”, firmly placing it among the most reliable processes fitting the click criteria.
- One of the most popular reactions within the click chemistry philosophy is the azide alkyne Huisgen cycloaddition using a Cu catalyst at room temperature discovered concurrently and independently by the groups of K. Barry Sharpless and Morten Meldal. This was an improvement over the same reaction first popularized by Rolf Huisgen in the 1970s, albeit at elevated temperatures in the absence of water and without a Cu catalyst (it is explained fully in 1,3-Dipolar Cycloaddition Chemistry, published by Wiley and updated in 2002). However, the azides and alkynes are both kinetically stable. Copper and Ruthenium are the commonly used catalysts in the reaction.
- Copper catalyzed click reactions work essentially on terminal alkynes. The Cu species undergo metal insertion reaction into the terminal alkynes. Commonly used solvents are polar aprotic solvents such as THF, DMSO, CH3CN, and DMF, as well as non-polar aprotic solvents such as toluene. Neat solvents or a mixture of solvents may be used.
- Click chemistry has widespread applications. Some of them are: preparative organic synthesis of 1,4-substituted triazoles; modification of peptide function with triazoles; modification of natural products and pharmaceuticals; drug discovery; macrocyclizations using Cu(I) catalyzed triazole couplings; modification of DNA and nucleotides by triazole ligation; supramolecular chemistry: calixarenes, rotaxanes, and catenanes; dendrimer design; carbohydrate clusters and carbohydrate conjugation by Cu(I) catalyzed triazole ligation reactions; polymers; material science; and nanotechnology (Moses and Moorhouse, 2007; Hein et al., 2008, each of which is incorporated herein by reference).
- 2. Synthesis of Modified Uridine Diphosphate Glucose (UDP-Glu) Bearing Thiol or Azide.
- The initial success of 5-hmC glycosylation led to the hypothesis that thiol- or azide-modified glucose can be similarly transferred to 5-hmC in duplex DNA. Thus, the inventors have synthesized azide-substituted UDP-Glu and contemplate synthesizing thiol-substituted UDP-Glu for 5-hmC labeling. An azide tag is preferred since this functional group is not present inside cells. The click chemistry to label this group is completely bio-orthogonal, meaning no interference from biological samples (Kolb et al., 2001). An azide-substituted UDP-Glu is shown in FIG. 3. The azide-substituted glucoses can be transferred to 5-hmC; see Song et al., 2011, which is incorporated herein by reference.
- 3. Biotinylation of 5-hmC in Genomic DNA for Affinity Purification
- The functional group installed on 5-gmC can be readily labeled with commercially available maleimide or alkyne (click chemistry) linked with a biotin, respectively. The reaction of thiol with maleimide is highly efficient; however, this labeling reaction cannot tolerate proteins or small molecules that bear thiol groups. Thus, genomic DNA must be isolated from other cellular components prior to the labeling, which can be readily achieved. The azide labeling with commercially available biotin-linked alkyne is completely bio-orthogonal, thus genomic DNA with bound proteins can be directly used. In both cases, the biotin-labeled DNA fragments may be pulled down with streptavidin and submitted for high-throughput sequencing in order to map out global distributions and the locations of 5-hmC in chromosomes. This will reveal a distribution map of 5-hmC in genomic DNA at different development stages of a particular cell or cell line.
- 4. Labeling 5-gmC with a Photosensitizer
- An alternative strategy that does not rely on converting 5-hmC:G base pair to a different base pair is to tether a photosensitizer to 5-gmC using approaches indicated in FIG. 1. Photosensitized one-electron oxidation can lead to site-specific oxidation of the modified 5-gmC or the nearby guanines (Tanabe et al., 2007; Meyer et al., 2003). Subsequent base (piperidine) treatment will lead to specific strand cleavage on the oxidized site (FIG. 4B) (Tanabe et al., 2007; Meyer et al., 2003). Thus, genomic DNA containing 5-gmC labeled with photosensitizer can be subjected to photo-oxidation and base treatment. DNA fragments will be generated with oxidation sites at the end. High-throughput sequencing will reveal these modification sites.
- 5. Attachment of a Sterically Bulky Group to 5-gmC
- In another strategy, a sterically bulky group such as polyethyleneglycol (PEG), a dendrimer, or a protein such as streptavidin can be introduced to the thiol- or azide-modified 5-gmC. Although 5-gmC in duplex DNA does not interfere with the polymerization reaction catalyzed by various different polymerases, the presence of an additional bulky group on 5-gmC on the DNA template strand can interfere with the synthesis of the new strand by DNA polymerase. As a result, primer extension will lead to a partially extended primer of certain length. The modification sites can be revealed by sequencing the partially extended primers. This method can be very versatile. It can be used to determine the modification sites for a given promoter site of interest. A high-throughput format can be developed as well. DNA fragments containing multiple 5-hmC can be affinity purified and random or designed primers can be used to perform primer extension experiments on these DNA fragments. Partially extended primers can be collected and subjected to high-throughput sequencing using a similar protocol as described in the restriction enzyme digestion method. A bulky modification may stop the polymerization reaction a few bases ahead of the modification site. Still, this method will map the modification sites to the resolution of a few bases. Considering that most 5-hmC exists in a CpG sequence, the resolution can be adequate for most applications. With a bulky substitution on 5-gmC digestion of modified DNA by restriction enzymes could be blocked for the restriction enzyme digestion-based assay.
- II. Assays Utilizing 5-hmC Modification
- Nucleic acid analysis and evaluation includes various methods of amplifying, fragmenting, and/or hybridizing nucleic acids that have or have not been modified.
- A. Genomic Analysis
- Methodologies are available for large scale sequence analysis. In certain aspects, the methods described exploit these genomic analysis methodologies and adapt them for uses incorporating the methodologies described herein. In certain instances the methods can be used to perform high resolution hydroxymethylation analysis on several thousand CpGs in genomic DNA. Therefore, methods are directed to analysis of the hydroxymethylation status of a genomic DNA sample, comprising one or more of the steps: (a) fragmenting the sample and enriching the sample for sequences comprising CpG islands, (b) generating a single stranded DNA library, (c) subjecting the sample to bisulfite treatment, (d) amplifying individual members of the single stranded DNA library by means of PCR, e.g., emulsion PCR, and (e) sequencing the amplified single stranded DNA library.
- The present methods allow for analyzing the hydroxymethylation status of all regions of a complete genome, where changes in hydroxymethylation status are expected to have an influence on gene expression. Due to the combination of bisulfite treatment, amplification and high throughput sequencing, it is possible to analyze the hydroxymethylation status of at least 1000 and preferably 5000 CpG islands in parallel.
- A “CpG island” as used herein refers to regions of DNA with a high G/C content and a high frequency of CpG dinucleotides relative to the whole genome of an organism of interest. Also used interchangeably in the art is the term “CG island.” The ‘p’ in “CpG island” refers to the phosphodiester bond between the cytosine and guanine nucleotides.
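- As an illustration of the definition above, the following minimal Python sketch computes the G/C fraction and the observed/expected CpG ratio of a sequence. The function names and the default thresholds (length of at least 200 bp, G/C fraction above 0.5, observed/expected ratio above 0.6) are assumptions chosen for illustration; they are commonly used screening values and are not taken from this disclosure.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact CpG count.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    expected = (c * g) / n  # CpG count expected from base composition alone
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Screen a sequence against illustrative (assumed) CpG island thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

A sequence that passes this screen is only a candidate; the annotated CpG islands of a reference genome would normally be taken from an existing annotation.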
- DNA may be isolated from an organism of interest, including, but not limited to eukaryotic organisms and prokaryotic organisms, preferably mammalian organisms, such as humans.
- In certain aspects, the step of enriching a sample for sequences comprising CpG islands can be done in different ways. One technique for enrichment is immunoprecipitation of methylated DNA using a methyl-Cytosine specific antibody (Weber et al., 2005). Alternatively, an enrichment step can comprise digesting the sample with one or more restriction enzymes which more frequently cut regions of DNA comprising no CpG islands and less frequently cut regions comprising CpG islands, and isolating DNA fragments with a specific size range.
- The inventors have demonstrated that while the methylation-insensitive restriction enzyme MspI can completely cut C(5-meC)GG and partially cut C(5-hmC)GG, its activity is completely blocked by C(5-gmC)GG. This indicates that the introduction of a glucose moiety can change the property of 5-hmC in duplex DNA. With bulkier groups on 5-hmC, digestions by other restriction enzymes that recognize DNA sequences containing CpG can be blocked. Since 5-gmC can block restriction enzyme digestion, the genomic DNA modified with 5-gmC can be treated with and without restriction enzymes and subjected to known methods of mapping the genome-wide distribution and location of the 5-hmC modification.
- Such restriction enzymes can be selected by a person skilled in the art using conventional bioinformatics approaches. The selection of appropriate enzymes also has a substantial influence on the average size of fragments that ultimately will be generated and sequenced. The selection of appropriate enzymes may be designed in such a way that it promotes enrichment of a certain fragment length. Thus, the selection may be adjusted to the kind of sequencing method which is finally applied. For most sequencing methods, a fragment length between 100 and 1000 bp has proven to be efficient. Therefore, in one embodiment, said fragment size range is from 100, 200 or 300 base pairs to 400, 500, 600, 700, 800, 900, or 1000 base pairs (bp), including all ranges and values there between.
- The human genome reference sequence (NCBI Build 36.1 from March 2006; assembled parts of chromosomes only) has a length of 3,142,044,949 bp and contains 26,567 annotated CpG islands (CpGs) for a total length of 21,073,737 bp (0.67%). In certain aspects, a DNA sequence read hits a CpG if the read overlaps with the CpG by at least 50 bp.
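- The figures and the hit rule in the preceding paragraph can be checked with a short Python sketch. The constants are the values stated above; the function names and the use of half-open coordinates (start inclusive, end exclusive) are assumptions made for illustration.

```python
GENOME_BP = 3_142_044_949      # assembled reference length stated above
CPG_ISLAND_BP = 21_073_737     # total length of annotated CpG islands
MIN_OVERLAP_BP = 50            # minimum overlap for a read to "hit" an island


def island_fraction_percent():
    """Percentage of the reference covered by annotated CpG islands."""
    return 100.0 * CPG_ISLAND_BP / GENOME_BP


def overlap_bp(read_start, read_end, island_start, island_end):
    """Overlap in bp of two half-open intervals [start, end)."""
    return max(0, min(read_end, island_end) - max(read_start, island_start))


def read_hits_island(read_start, read_end, island_start, island_end):
    """True if the read overlaps the island by at least MIN_OVERLAP_BP."""
    return overlap_bp(read_start, read_end, island_start, island_end) >= MIN_OVERLAP_BP
```

For example, a read spanning positions 100 to 200 overlaps an island spanning 150 to 400 by exactly 50 bp and therefore counts as a hit under this rule.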
- As a non-limiting example, the following enzymes or their isoschizomers (with the following restriction sites) can be used for a method according to the present invention: MseI (TTAA), Tsp509 (AATT), AluI (AGCT), NlaIII (CATG), BfaI (CTAG), HpyCH4 (TGCA), Dpul (GATC), MboII (GAAGA), MlyI (GAGTC), BccI (CCATC). Isoschizomers are restriction enzymes that are specific to the same recognition sequence and cut at the same location.
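- An in-silico digestion followed by size selection can be sketched as follows in Python. The recognition sequences are those listed above for five of the enzymes. Placing the cut at the first base of each recognition site is a simplifying assumption made only for illustration, since the actual cut position differs between enzymes, and the function names are hypothetical. The default 100 to 1000 bp window corresponds to the fragment length range discussed above.

```python
import re

# Recognition sequences as listed in the text (subset).
SITES = {
    "MseI": "TTAA",
    "Tsp509": "AATT",
    "AluI": "AGCT",
    "NlaIII": "CATG",
    "BfaI": "CTAG",
}


def cut_positions(seq, enzymes):
    """Sorted cut coordinates; ASSUMES each enzyme cuts at the site start."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        # A lookahead pattern also reports overlapping occurrences.
        for match in re.finditer("(?=" + SITES[name] + ")", seq):
            cuts.add(match.start())
    cuts.discard(0)  # a cut at position 0 would create an empty fragment
    return sorted(cuts)


def digest(seq, enzymes):
    """Split seq into fragments at every cut position."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, lo=100, hi=1000):
    """Keep fragments whose length lies in the inclusive range [lo, hi]."""
    return [f for f in fragments if lo <= len(f) <= hi]
```

Comparing the size-selected fragments against annotated CpG islands for different enzyme combinations is one way such a selection could be evaluated before any bench work.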
- Embodiments include a CG island enriched library produced from genomic DNA by digestion with several restriction enzymes that preferably cut within non-CG island regions. In certain aspects, the restriction enzymes are selected in such a way that digestion can result in fragments with a size range between 300, 400, 500, 600 to 500, 600, 800, 900 bp or greater, including all ranges and values there between. The library fragments are ligated to adaptors. Subsequently, a conventional bisulfite treatment is performed according to methods that are well known in the art. As a result, unmethylated cytosine residues are converted to Uracil residues, which in a subsequent sequencing reaction base calling are identified as “T” instead of “C”, when compared with a non bisulfite treated reference. Subsequent to bisulfite treatment, the sample is subjected to a conventional sequencing protocol.
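- The base-calling logic described above can be sketched in Python as follows. Unprotected cytosines are converted and read as “T”, and any cytosine still read as “C” when compared with the untreated reference is reported as protected. Which modifications protect a cytosine is not modeled here; the caller supplies the protected positions, and the function names are hypothetical.

```python
def bisulfite_convert(seq, protected_positions=()):
    """Simulate bisulfite treatment plus sequencing of one strand.

    Every C is read as T unless its 0-based index is in protected_positions.
    """
    protected = set(protected_positions)
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            out.append("T")  # C -> U, which is base-called as T
        else:
            out.append(base)
    return "".join(out)


def call_protected(reference, treated_read):
    """Positions where a reference C is still read as C after treatment."""
    pairs = zip(reference.upper(), treated_read.upper())
    return [i for i, (ref, obs) in enumerate(pairs) if ref == "C" and obs == "C"]
```

For the reference ACGTCCGA with position 5 protected, the simulated read is ATGTTCGA, and comparison with the reference recovers position 5.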
- As one example, the 454 Genome Sequencer System supports the sequencing of samples from a wide variety of starting materials including, but not limited to, eukaryotic or bacterial genomic DNA. Genomic DNAs are fractionated into small, 100- to 1000-bp fragments with an appropriate specific combination of restriction enzymes which enriches for CpG island containing fragments. In one embodiment, the restriction enzymes used for a method according to the present invention are selected from a group consisting of MseI, Tsp509, AluI, NlaIII, BfaI, HpyCH4, Dpul, MboII, MlyI, and BccI, or any isoschizomer of any of the enzymes mentioned. Preferably, 4-5 different enzymes are selected.
- Using a series of standard molecular biology techniques, short adaptors (A and B) are added to each fragment. The adaptors are used for purification, amplification, and sequencing steps. Single-stranded fragments with A and B adaptors compose the sample library used for subsequent steps.
- Prior to ligation of the adaptors, the fragments can be completely double stranded without any single stranded overhang. A fragment polishing reaction is performed using e.g. E. coli T4 DNA polymerase. In one embodiment, the polishing reaction is performed in the presence of hydroxymethyl-dCTP instead of dCTP. In another embodiment, the fragment polishing reaction is performed in the presence of a DNA polymerase which lacks proofreading activity, such as Tth DNA polymerase (Roche Applied Science Cat. No: 11 480 014 001).
- The two different double stranded adaptors A and B are ligated to the ends of the fragments. Some or all of the C-residues of adaptors A and B can be hydroxymethyl-C residues. Subsequently, the fragments containing at least one B adaptor are immobilized on a streptavidin coated solid support and a nick repair-fill-in synthesis is performed using a strand displacement enzyme such as Bst Polymerase (New England Biolabs). Preferably said reaction is performed in the presence of hydroxymethyl-dCTP instead of dCTP. Subsequently single stranded molecules comprising one adaptor A and one adaptor B are removed from the streptavidin coated beads as disclosed in Margulies et al., 2005. In those cases where hydroxymethyl-dCTP replaces dCTP, it can be used at the same concentrations as dCTP is used in the original protocol.
- The bisulfite treatment can be done according to standard methods that are well known in the art (Frommer et al., 1992; Zeschnigk et al., 1997; Clark et al., 1994). The sample can be purified, for example by a Sephadex size exclusion column or, at least, by means of precipitation. It is also within the scope of the present invention that, directly after bisulfite treatment, or directly after bisulfite treatment followed by purification, the sample is amplified by means of performing a conventional PCR using amplification primers with sequences corresponding to the A and B adaptor sequences.
- In certain aspects, the bisulfite treated and optionally purified and/or amplified single-stranded DNA library is immobilized onto specifically designed DNA Capture Beads. Each bead carries a unique single-stranded DNA library fragment.
- A library fragment can be amplified within its own microreactor comprised of a water-in-oil emulsion, excluding competing or contaminating sequences. Amplification of the entire fragment collection can be done in parallel; for each fragment, this results in a copy number of several million clonally amplified copies of the unique fragment per bead. After PCR amplification within the emulsion, the emulsion is broken while the amplified fragments remain bound to their specific beads.
- B. Modification Sensitive Enzymes.
- DNA methyltransferases (MTases), which transfer a methyl group from S-adenosylmethionine to either adenine or cytosine residues, are found in a wide variety of prokaryotes and eukaryotes. Methylation should be considered when digesting DNA with restriction endonucleases because cleavage can be blocked or impaired when a particular base in the recognition site is methylated or otherwise modified.
- In prokaryotes, MTases have most often been identified as elements of restriction/modification systems that act to protect host DNA from cleavage by the corresponding restriction endonuclease. Most laboratory strains of E. coli contain three site-specific DNA methylases. Some or all of the sites for a restriction endonuclease may be resistant to cleavage when isolated from strains expressing the Dam or Dcm methylases if the methylase recognition site overlaps the endonuclease recognition site. For example, plasmid DNA isolated from dam+ E. coli is completely resistant to cleavage by MboI, which cleaves at GATC sites.
- Not all DNA isolated from E. coli is methylated to the same extent. While pBR322 DNA is fully modified (and is therefore completely resistant to MboI digestion), only about 50% of λ DNA Dam sites are methylated, presumably because the methylase does not have the opportunity to methylate the DNA fully before it is packaged into the phage head. As a result, enzymes blocked by Dam or Dcm modification will yield partial digestion patterns with λ DNA. Restriction sites that are blocked by Dam or Dcm methylation can be un-methylated by cloning DNA into a dam-/dcm- strain of E. coli, such as dam-/dcm- Competent E. coli (NEB #C2925).
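- The notion of a methylase recognition site overlapping an endonuclease recognition site can be illustrated with a short Python sketch that flags, for each occurrence of a restriction site, whether a Dam site (GATC) overlaps it. The function names are hypothetical, only literal (non-degenerate) recognition sequences are handled, and treating any overlap as potentially blocking is a simplification: whether cleavage is actually blocked depends on the enzyme and on which base is methylated.

```python
import re

DAM_SITE = "GATC"  # recognition sequence of the Dam methylase


def find_all(seq, motif):
    """Start indices of all (possibly overlapping) occurrences of motif."""
    return [m.start() for m in re.finditer("(?=" + motif + ")", seq)]


def sites_overlapping_dam(seq, site):
    """Return (start, overlaps_dam) for every occurrence of site in seq."""
    seq = seq.upper()
    dam_starts = find_all(seq, DAM_SITE)
    results = []
    for start in find_all(seq, site):
        end = start + len(site)
        # Two half-open intervals overlap if each starts before the other ends.
        overlaps = any(d < end and d + len(DAM_SITE) > start for d in dam_starts)
        results.append((start, overlaps))
    return results
```

In the sequence GATCGATAAAATCGATT, the first ATCGAT site (the recognition sequence listed for ClaI and BspDI in the table that follows) overlaps a GATC and is flagged, while the second does not. Every GATC site of MboI is flagged by definition.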
- CpG MTases, found in higher eukaryotes (e.g., Dnmt1), transfer a methyl group to the C5 position of cytosine residues. Patterns of CpG methylation are heritable, tissue specific and correlate with gene expression. Consequently, CpG methylation has been postulated to play a role in differentiation and gene expression (Josse and Kornberg, 1962). The effects of CpG methylation are mainly a concern when digesting eukaryotic genomic DNA. CpG methylation patterns are not retained once the DNA is cloned into a bacterial host.
- The table below summarizes methylation sensitivity for NEB restriction enzymes, indicating whether or not cleavage is blocked or impaired by Dam, Dcm or CpG methylation if or when it overlaps each recognition site. REBASE, the restriction enzyme database, can be consulted for more detailed information and specific examples. (Marinus and Morris, 1973; Geier and Modrich, 1979; May and Hattman, 1975; Siegfried and Cedar, 1997).
- 
Enzyme | Sequence | Dam | Dcm | CpG
AatII | GACGT/C | Not Sensitive | Not Sensitive | Blocked
Acc65I | G/GTACC | Not Sensitive | Blocked by Some Overlapping Combinations | Blocked by Some Overlapping Combinations
AccI | GT/MKAC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
AciI | CCGC(−3/−1) | Not Sensitive | Not Sensitive | Blocked
AclI | AA/CGTT | Not Sensitive | Not Sensitive | Blocked
AcuI | CTGAAG(16/14) | Not Sensitive | Not Sensitive | Not Sensitive
AfeI | AGC/GCT | Not Sensitive | Not Sensitive | Blocked
AflII | C/TTAAG | Not Sensitive | Not Sensitive | Not Sensitive
AflIII | A/CRYGT | Not Sensitive | Not Sensitive | Not Sensitive
AgeI | A/CCGGT | Not Sensitive | Not Sensitive | Blocked
AgeI-HF™ | A/CCGGT | — | — | —
AhdI | GACNNN/NNGTC | Not Sensitive | Not Sensitive | Impaired by Some Overlapping Combinations
AleI | CACNN/NNGTG | Not Sensitive | Not Sensitive | Impaired by Some Overlapping Combinations
AluI | AG/CT | Not Sensitive | Not Sensitive | Not Sensitive
AlwI | GGATC(4/5) | Blocked | Not Sensitive | Not Sensitive
AlwNI | CAGNNN/CTG | Not Sensitive | Blocked by Overlapping Methylation | Not Sensitive
ApaI | GGGCC/C | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Overlapping Methylation
ApaLI | G/TGCAC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
ApeKI | G/CWGC | Not Sensitive | Not Sensitive | Not Sensitive
ApoI | R/AATTY | Not Sensitive | Not Sensitive | Not Sensitive
AscI | GG/CGCGCC | Not Sensitive | Not Sensitive | Blocked
AseI | AT/TAAT | Not Sensitive | Not Sensitive | Not Sensitive
AsiSI | GCGAT/CGC | Not Sensitive | Not Sensitive | Blocked
AvaI | C/YCGRG | Not Sensitive | Not Sensitive | Blocked
AvaII | G/GWCC | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Overlapping Methylation
AvrII | C/CTAGG | Not Sensitive | Not Sensitive | Not Sensitive
BaeGI | GKGCM/C | Not Sensitive | Not Sensitive | Not Sensitive
BaeI | (10/15)ACNNNNGTAYC(12/7) | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
BamHI | G/GATCC | Not Sensitive | Not Sensitive | Not Sensitive
BamHI-HF™ | G/GATCC | Not Sensitive | Not Sensitive | Not Sensitive
BanI | G/GYRCC | Not Sensitive | Blocked by Some Overlapping Combinations | Blocked by Some Overlapping Combinations
BanII | GRGCY/C | Not Sensitive | Not Sensitive | Not Sensitive
BbsI | GAAGAC(2/6) | Not Sensitive | Not Sensitive | Not Sensitive
BbvCI | CCTCAGC(−5/−2) | Not Sensitive | Not Sensitive | Impaired by Overlapping Methylation
BbvI | GCAGC(8/12) | Not Sensitive | Not Sensitive | Not Sensitive
BccI | CCATC(4/5) | Not Sensitive | Not Sensitive | Not Sensitive
BceAI | ACGGC(12/14) | Not Sensitive | Not Sensitive | Blocked
BcgI | (10/12)CGANNNNNNTGC(12/10) | Blocked by Overlapping Methylation | Not Sensitive | Blocked by Some Overlapping Combinations
BciVI | GTATCC(6/5) | Not Sensitive | Not Sensitive | Not Sensitive
BclI | T/GATCA | Blocked | Not Sensitive | Not Sensitive
BfaI | C/TAG | Not Sensitive | Not Sensitive | Not Sensitive
BfuAI | ACCTGC(4/8) | Not Sensitive | Not Sensitive | Impaired by Overlapping Methylation
BfuCI | /GATC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
BglI | GCCNNNN/NGGC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
BglII | A/GATCT | Not Sensitive | Not Sensitive | Not Sensitive
BlpI | GC/TNAGC | Not Sensitive | Not Sensitive | Not Sensitive
BmgBI | CACGTC(−3/−3) | Not Sensitive | Not Sensitive | Blocked
BmrI | ACTGGG(5/4) | Not Sensitive | Not Sensitive | Not Sensitive
BmtI | GCTAG/C | Not Sensitive | Not Sensitive | Not Sensitive
BpmI | CTGGAG(16/14) | Not Sensitive | Not Sensitive | Not Sensitive
Bpu10I | CCTNAGC(−5/−2) | Not Sensitive | Not Sensitive | Not Sensitive
BpuEI | CTTGAG(16/14) | Not Sensitive | Not Sensitive | Not Sensitive
BsaAI | YAC/GTR | Not Sensitive | Not Sensitive | Blocked
BsaBI | GATNN/NNATC | Blocked by Overlapping Methylation | Not Sensitive | Blocked by Some Overlapping Combinations
BsaHI | GR/CGYC | Not Sensitive | Blocked by Some Overlapping Combinations | Blocked
BsaI | GGTCTC(1/5) | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Some Overlapping Combinations
BsaI-HF™ | GGTCTC(1/5) | — | Blocked by Overlapping Methylation | —
BsaJI | C/CNNGG | Not Sensitive | Not Sensitive | Not Sensitive
BsaWI | W/CCGGW | Not Sensitive | Not Sensitive | Not Sensitive
BsaXI | (9/12)ACNNNNNCTCC(10/7) | Not Sensitive | Not Sensitive | Not Sensitive
BseRI | GAGGAG(10/8) | Not Sensitive | Not Sensitive | Not Sensitive
BseYI | CCCAGC(−5/−1) | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
BsgI | GTGCAG(16/14) | Not Sensitive | Not Sensitive | Not Sensitive
BsiEI | CGRY/CG | Not Sensitive | Not Sensitive | Blocked
BsiHKAI | GWGCW/C | Not Sensitive | Not Sensitive | Not Sensitive
BsiWI | C/GTACG | Not Sensitive | Not Sensitive | Blocked
BslI | CCNNNNN/NNGG | Not Sensitive | Blocked by Some Overlapping Combinations | Blocked by Some Overlapping Combinations
BsmAI | GTCTC(1/5) | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
BsmBI | CGTCTC(1/5) | Not Sensitive | Not Sensitive | Blocked
BsmFI | GGGAC(10/14) | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Overlapping Methylation
BsmI | GAATGC(1/−1) | Not Sensitive | Not Sensitive | Not Sensitive
BsoBI | C/YCGRG | Not Sensitive | Not Sensitive | Not Sensitive
Bsp1286I | GDGCH/C | Not Sensitive | Not Sensitive | Not Sensitive
BspCNI | CTCAG(9/7) | Not Sensitive | Not Sensitive | Not Sensitive
BspDI | AT/CGAT | Blocked by Overlapping Methylation | Not Sensitive | Blocked
BspEI | T/CCGGA | Blocked by Overlapping Methylation | Not Sensitive | Impaired
BspHI | T/CATGA | Blocked by Overlapping Methylation | Not Sensitive | Not Sensitive
BspMI | ACCTGC(4/8) | Not Sensitive | Not Sensitive | Not Sensitive
BspQI | GCTCTTC(1/4) | Not Sensitive | Not Sensitive | Not Sensitive
BsrBI | CCGCTC(−3/−3) | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
BsrDI | GCAATG(2/0) | Not Sensitive | Not Sensitive | Not Sensitive
BsrFI | R/CCGGY | Not Sensitive | Not Sensitive | Blocked
BsrGI | T/GTACA | Not Sensitive | Not Sensitive | Not Sensitive
BsrI | ACTGG(1/−1) | Not Sensitive | Not Sensitive | Not Sensitive
BssHII | G/CGCGC | Not Sensitive | Not Sensitive | Blocked
BssKI | /CCNGG | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Overlapping Methylation
BssSI | CACGAG(−5/−1) | Not Sensitive | Not Sensitive | Not Sensitive
BstAPI | GCANNNN/NTGC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
BstBI | TT/CGAA | Not Sensitive | Not Sensitive | Blocked
BstEII | G/GTNACC | Not Sensitive | Not Sensitive | Not Sensitive
BstNI | CC/WGG | Not Sensitive | Not Sensitive | Not Sensitive
BstUI | CG/CG | Not Sensitive | Not Sensitive | Blocked
BstXI | CCANNNNN/NTGG | Not Sensitive | Blocked by Some Overlapping Combinations | Not Sensitive
BstYI | R/GATCY | Not Sensitive | Not Sensitive | Not Sensitive
BstZ17I | GTA/TAC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
Bsu36I | CC/TNAGG | Not Sensitive | Not Sensitive | Not Sensitive
BtgI | C/CRYGG | Not Sensitive | Not Sensitive | Not Sensitive
BtgZI | GCGATG(10/14) | Not Sensitive | Not Sensitive | Impaired
BtsCI | GGATG(2/0) | Not Sensitive | Not Sensitive | Not Sensitive
BtsI | GCAGTG(2/0) | Not Sensitive | Not Sensitive | Not Sensitive
Cac8I | GCN/NGC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
ClaI | AT/CGAT | Blocked by Overlapping Methylation | Not Sensitive | Blocked
CspCI | (11/13)CAANNNNNGTGG(12/10) | Not Sensitive | Not Sensitive | Not Sensitive
CviAII | C/ATG | Not Sensitive | Not Sensitive | Not Sensitive
CviKI-1 | RG/CY | Not Sensitive | Not Sensitive | Not Sensitive
CviQI | G/TAC | Not Sensitive | Not Sensitive | Not Sensitive
DdeI | C/TNAG | Not Sensitive | Not Sensitive | Not Sensitive
DpnI | GA/TC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
DpnII | /GATC | Blocked | Not Sensitive | Not Sensitive
DraI | TTT/AAA | Not Sensitive | Not Sensitive | Not Sensitive
DraIII | CACNNN/GTG | Not Sensitive | Not Sensitive | Impaired by Overlapping Methylation
DrdI | GACNNNN/NNGTC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
EaeI | Y/GGCCR | Not Sensitive | Blocked by Overlapping Methylation | Blocked by Overlapping Methylation
EagI | C/GGCCG | Not Sensitive | Not Sensitive | Blocked
EagI-HF™ | C/GGCCG | Not Sensitive | Not Sensitive | Blocked
EarI | CTCTTC(1/4) | Not Sensitive | Not Sensitive | Impaired by Overlapping Methylation
EciI | GGCGGA(11/9) | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
Eco53kI | GAG/CTC | — | — | —
EcoNI | CCTNN/NNNAGG | Not Sensitive | Not Sensitive | Not Sensitive
EcoO109I | RG/GNCCY | Not Sensitive | Blocked by Overlapping Methylation | Not Sensitive
EcoP15I | CAGCAG(25/27) | Not Sensitive | Not Sensitive | Not Sensitive
EcoRI | G/AATTC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
EcoRI-HF™ | G/AATTC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
EcoRV | GAT/ATC | Not Sensitive | Not Sensitive | Impaired by Some Overlapping Combinations
EcoRV-HF™ | GAT/ATC | Not Sensitive | Not Sensitive | Impaired by Some Overlapping Combinations
FatI | /CATG | Not Sensitive | Not Sensitive | Not Sensitive
FauI | CCCGC(4/6) | Not Sensitive | Not Sensitive | Blocked
Fnu4HI | GC/NGC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
FokI | GGATG(9/13) | Not Sensitive | Impaired by Overlapping Methylation | Impaired by Overlapping Methylation
FseI | GGCCGG/CC | Not Sensitive | Impaired by Some Overlapping Combinations | Blocked
FspI | TGC/GCA | Not Sensitive | Not Sensitive | Blocked
HaeII | RGCGC/Y | Not Sensitive | Not Sensitive | Blocked
HaeIII | GG/CC | Not Sensitive | Not Sensitive | Not Sensitive
HgaI | GACGC(5/10) | Not Sensitive | Not Sensitive | Blocked
HhaI | GCG/C | Not Sensitive | Not Sensitive | Blocked
HincII | GTY/RAC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
HindIII | A/AGCTT | Not Sensitive | Not Sensitive | Not Sensitive
HinfI | G/ANTC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
HinP1I | G/CGC | Not Sensitive | Not Sensitive | Blocked
HpaI | GTT/AAC | Not Sensitive | Not Sensitive | Blocked by Some Overlapping Combinations
HpaII | C/CGG | Not Sensitive | Not Sensitive | Blocked
HphI | GGTGA(8/7) | Blocked by Overlapping Methylation | Not Sensitive | Not Sensitive
Hpy166II | GTN/NAC | Not Sensitive | Not Sensitive | Blocked by Overlapping Methylation
Hpy188I | TCN/GA | Blocked by Overlapping Methylation | Not Sensitive | Not Sensitive
Hpy188III | TC/NNGA | Blocked by Overlapping Methylation | Not Sensitive | Blocked by Overlapping Methylation
Hpy99I | CGWCG/ | Not Sensitive | Not Sensitive | Blocked
HpyAV | CCTTC(6/5) | Not Sensitive | Not Sensitive | Impaired by Overlapping Methylation
HpyCH4III | ACN/GT | Not Sensitive | Not Sensitive | Not Sensitive
HpyCH4IV | A/CGT | Not Sensitive | Not Sensitive | Blocked
HpyCH4V 
TG/CA Not Sensitive Not Sensitive Not Sensitive I-CeuI CGTAACTATAACGGTCCTAAGGTAGCGAA — — — (−9/−13) I-SceI TAGGGATAACAGGGTAAT(−9/−13) — — — KasI G/GCGCC Not Sensitive Not Sensitive Blocked KpnI GGTAC/C Not Sensitive Not Sensitive Not Sensitive KpnI-HF ™ GGTAC/C — — — MboI /GATC Blocked Not Sensitive Impaired by Overlapping Methylation MboII GAAGA(8/7) Blocked by Not Sensitive Not Sensitive Overlapping Methylation MfeI C/AATTG Not Sensitive Not Sensitive Not Sensitive MfeI-HF ™ C/AATTG Not Sensitive Not Sensitive Not Sensitive MluI A/CGCGT Not Sensitive Not Sensitive Blocked MlyI GAGTC(5/5) Not Sensitive Not Sensitive MmeI TCCRAC(20/18) Not Sensitive Not Sensitive Blocked by Overlapping Methylation MnlI CCTC(7/6) Not Sensitive Not Sensitive Not Sensitive MscI TGG/CCA Not Sensitive Blocked by Not Sensitive Overlapping Methylation MseI T/TAA Not Sensitive Not Sensitive Not Sensitive MslI CAYNN/NNRTG Not Sensitive Not Sensitive Not Sensitive MspA1I CMG/CKG Not Sensitive Not Sensitive Blocked by Overlapping Methylation MspI C/CGG Not Sensitive Not Sensitive Not Sensitive MwoI GCNNNNN/NNGC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations NaeI GCC/GGC Not Sensitive Not Sensitive Blocked NarI GG/CGCC Not Sensitive Not Sensitive Blocked Nb.BbvCI CCTCAGC Not Sensitive Not Sensitive Not Sensitive Nb.BsmI GAATGC Not Sensitive Not Sensitive Not Sensitive Nb.BsrDI GCAATG Not Sensitive Not Sensitive Not Sensitive Nb.BtsI GCAGTG — — — NciI CC/SGG Not Sensitive Not Sensitive Impaired by Overlapping Methylation NcoI C/CATGG Not Sensitive Not Sensitive Not Sensitive NcoI-HF ™ C/CATGG Not Sensitive Not Sensitive Not Sensitive NdeI CA/TATG Not Sensitive Not Sensitive Not Sensitive NgoMIV G/CCGGC Not Sensitive Not Sensitive Blocked NheI G/CTAGC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations NheI-HF ™ G/CTAGC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations NlaIII CATG/ Not Sensitive Not Sensitive Not Sensitive NlaIV GGN/NCC 
Not Sensitive Blocked by Blocked by Overlapping Overlapping Methylation Methylation NmeAIII GCCGAG(21/19) Not Sensitive Not Sensitive Not Sensitive NotI GC/GGCCGC Not Sensitive Not Sensitive Blocked NotI-HF ™ GC/GGCCGC Not Sensitive Not Sensitive Blocked NruI TCG/CGA Blocked by Not Sensitive Blocked Overlapping Methylation NsiI ATGCA/T Not Sensitive Not Sensitive Not Sensitive NspI RCATG/Y Not Sensitive Not Sensitive Not Sensitive Nt.AlwI GGATC(4/−5) Blocked Not Sensitive Not Sensitive Nt.BbvCI CCTCAGC(−5/−7) Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations Nt.BsmAI GTCTC(1/−5) Not Sensitive Not Sensitive Blocked Nt.BspQI GCTCTTC(1/−7) Not Sensitive Not Sensitive Not Sensitive Nt.BstNBI GAGTC(4/−5) Not Sensitive Not Sensitive Not Sensitive Nt.CviPII (0/−1)CCD Not Sensitive Not Sensitive Blocked PacI TTAAT/TAA Not Sensitive Not Sensitive Not Sensitive PaeR7I C/TCGAG Not Sensitive Not Sensitive Blocked PciI A/CATGT Not Sensitive Not Sensitive Not Sensitive PflFI GACN/NNGTC Not Sensitive Not Sensitive Not Sensitive PflMI CCANNNN/NTGG Not Sensitive Blocked by Not Sensitive Overlapping Methylation PhoI GG/CC Not Sensitive Impaired by Impaired by Some Some Overlapping Overlapping Combinations Combinations Pl-PspI TGGCAAACAGCTATTATGGGTATTATGGGT — — — (−13/−17) Pl-SceI ATCTATGTCGGGTGCGGAGAAAGAGGTAAT (−15/−19) — — — PleI GAGTC(4/5) Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations PmeI GTTT/AAAC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations PmlI CAC/GTG Not Sensitive Not Sensitive Blocked PpuMI RG/GWCCY Not Sensitive Blocked by Not Sensitive Overlapping Methylation PshAI GACNN/NNGTC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations PsiI TTA/TAA Not Sensitive Not Sensitive Not Sensitive PspGI /CCWGG Not Sensitive Blocked Not Sensitive PspOMI G/GGCCC Not Sensitive Blocked by Blocked by Overlapping Overlapping Methylation Methylation PspXI VC/TCGAGB Not Sensitive Not Sensitive Impaired PstI 
CTGCA/G Not Sensitive Not Sensitive Not Sensitive PstI-HF ™ CTGCA/G — — — PvuI CGAT/CG Not Sensitive Not Sensitive Blocked PvuII CAG/CTG Not Sensitive Not Sensitive Not Sensitive PvuII-HF ™ CAG/CTG Not Sensitive Not Sensitive Not Sensitive RsaI GT/AC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations RsrII CG/GWCCG Not Sensitive Not Sensitive Blocked SacI GAGCT/C Not Sensitive Not Sensitive Not Sensitive SacI-HF ™ GAGCT/C Not Sensitive Not Sensitive Not Sensitive SacII CCGC/GG Not Sensitive Not Sensitive Blocked SalI G/TCGAC Not Sensitive Not Sensitive Blocked SalI-HF ™ G/TCGAC Not Sensitive Not Sensitive Blocked SapI GCTCTTC(1/4) Not Sensitive Not Sensitive Not Sensitive Sau3AI /GATC Not Sensitive Not Sensitive Blocked by Overlapping Methylation Sau96I G/GNCC Not Sensitive Blocked by Blocked by Overlapping Overlapping Methylation Methylation SbfI CCTGCA/GG Not Sensitive Not Sensitive Not Sensitive SbfI-HF ™ CCTGCA/GG Not Sensitive Not Sensitive Not Sensitive ScaI AGT/ACT Not Sensitive Not Sensitive Not Sensitive ScaI-HF ™ AGT/ACT Not Sensitive Not Sensitive Not Sensitive ScrFI CC/NGG Not Sensitive Blocked by Blocked by Overlapping Overlapping Methylation Methylation SexAI A/CCWGGT Not Sensitive Blocked Not Sensitive SfaNI GCATC(5/9) Not Sensitive Not Sensitive Impaired by Some Overlapping Combinations SfcI C/TRYAG Not Sensitive Not Sensitive Not Sensitive SfiI GGCCNNNN/NGGCC Not Sensitive Impaired by Blocked by Overlapping Some Methylation Overlapping Combinations SfoI GGC/GCC Not Sensitive Blocked by Blocked Some Overlapping Combinations SgrAI CR/CCGGYG Not Sensitive Not Sensitive Blocked SmaI CCC/GGG Not Sensitive Not Sensitive Blocked SmlI C/TYRAG Not Sensitive Not Sensitive Not Sensitive SnaBI TAC/GTA Not Sensitive Not Sensitive Blocked SpeI A/CTAGT Not Sensitive Not Sensitive Not Sensitive SphI GCATG/C Not Sensitive Not Sensitive Not Sensitive SphI-HF ™ GCATG/C Not Sensitive Not Sensitive Not Sensitive SspI AAT/ATT Not Sensitive Not 
Sensitive Not Sensitive SspI-HF ™ AAT/ATT Not Sensitive Not Sensitive Not Sensitive StuI AGG/CCT Not Sensitive Blocked by Not Sensitive Overlapping Methylation StyD4I /CCNGG Not Sensitive Blocked by Impaired by Overlapping Overlapping Methylation Methylation StyI C/CWWGG Not Sensitive Not Sensitive Not Sensitive StyI-HF ™ C/CWWGG — — — SwaI ATTT/AAAT Not Sensitive Not Sensitive Not Sensitive TaqαI T/CGA Blocked by Not Sensitive Not Sensitive Overlapping Methylation TfiI G/AWTC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations TliI C/TCGAG Not Sensitive Not Sensitive Impaired TseI G/CWGC Not Sensitive Not Sensitive Blocked by Some Overlapping Combinations Tsp45I /GTSAC Not Sensitive Not Sensitive Not Sensitive Tsp509I /AATT Not Sensitive Not Sensitive Not Sensitive TspMI C/CCGGG Not Sensitive Not Sensitive Blocked TspRI NNCASTGNN/ Not Sensitive Not Sensitive Not Sensitive Tth111I GACN/NNGTC Not Sensitive Not Sensitive Not Sensitive XbaI T/CTAGA Blocked by Not Sensitive Not Sensitive Overlapping Methylation XcmI CCANNNNN/NNNNTGG Not Sensitive Not Sensitive Not Sensitive XhoI C/TCGAG Not Sensitive Not Sensitive Impaired XmaI C/CCGGG Not Sensitive Not Sensitive Impaired XmnI GAANN/NNTTC Not Sensitive Not Sensitive Not Sensitive ZraI GAC/GTC Not Sensitive Not Sensitive Blocked - C. Microarray Analysis
- Microarray methods can be used in conjunction with the methods described herein for simultaneous testing of numerous genetic alterations of the human genome. The subject matter described herein can also be used in various fields to greatly improve the accuracy and reliability of nucleic acid analyses, chromosome mapping, and genetic testing. Selected chromosomal target elements can be included on the array and evaluated for 5-hmC content in conjunction with hybridization to a nucleic acid array. In an implementation that uses a diagnostic array (hereafter, “array”), such as a microarray used for comparative genomic hybridization (CGH), a comprehensive battery of clinically relevant chromosomal loci can be selected and evaluated for 5-hmC status or content. 5-hmC in genomic DNA fragments is specifically labeled using radio-labels, fluorescent labels or amplifiable signals. These labeled target DNA fragments are then screened by hybridization using microarrays.
- D. FRET-Based Hybridization Assay
- A fluorescent tag is attached to the 5-hmC. The labeled DNA is then hybridized to a probe containing a nucleotide labeled with a fluorescent tag that functions as a FRET partner to the first. If the labeled base in the probe is juxtaposed with the labeled 5-hmC, a FRET signal will be observed.
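- As a rough illustration of why the two labels must be juxtaposed, the standard Förster relation, E = 1/(1 + (r/R0)^6), describes how steeply energy-transfer efficiency falls with donor-acceptor distance. The sketch below is not part of the disclosed method; the function name and the Förster radius of 5 nm are hypothetical values chosen only for illustration.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Return the FRET efficiency E = 1 / (1 + (r / R0)**6).

    distance_nm: donor-acceptor separation r (assumed known or estimated).
    forster_radius_nm: Förster radius R0 of the dye pair (hypothetical here).
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    # Hypothetical dye pair with R0 = 5 nm; distances are illustrative only.
    for r in (2.0, 5.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, 5.0):.3f}")
```

Under these assumptions, efficiency is one half at r = R0 and drops sharply beyond it, which is why a signal is expected only when the labeled probe base lies close to the labeled 5-hmC.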
- E. Electrochemical Labeling
- This method uses AC impedance as a measure of the presence of 5-hmC. Briefly, a nucleic acid probe specific for the sequence to be analyzed is immobilized on a gold electrode. The DNA fragment to be analyzed is added and allowed to hybridize to the probe. Excess non-hybridized, single-stranded DNA is digested using nucleases. Biotin is covalently linked to the 5-hmC using the methods of the invention either before or after hybridization. Avidin-HRP is bound to the biotinylated DNA sequence, and 4-chloronaphthol is then added. If the HRP molecule is bound to the hybridized target DNA near the gold electrode, the HRP oxidizes the 4-chloronaphthol to a hydrophobic product that adsorbs to the electrode surface. This results in a higher AC impedance if 5-hmC is present in the target DNA compared to a control sequence lacking 5-hmC.
- F. Chromosomal Staining
- Chromosomal DNA is prepared using standard karyotyping techniques known in the art. The 5-hmC in the chromosomal DNA is labeled with a detectable moiety (fluorophore, radio-label, amplifiable signal) and imaged in the context of the intact chromosomes.
- The invention additionally provides kits for modifying cytosine bases of nucleic acids and/or subjecting such modified nucleic acids to further analysis. The contents of a kit can include one or more of a modification agent(s), a labeling reagent for detecting or modifying glucose or a 5-hmC, and, if desired, a substrate that contains or is capable of attaching to one or more modified 5-gmC. The substrate can be, e.g., a microsphere, antibody, or other binding agent.
- Each kit preferably includes a 5-hmC modifying agent or agents, e.g., βGT and its functionalized substrate. One or more reagents are preferably supplied in a solid form or liquid buffer that is suitable for inventory storage, and later for addition into the reaction medium when the method of using the reagent is performed. Suitable packaging is provided. The kit may optionally provide additional components that are useful in the procedure. These optional components include buffers, capture reagents, developing reagents, labels, reacting surfaces, means for detection, control samples, instructions, and interpretive information.
- The kit may optionally include a detectable label or a modified glucose-binding agent and, if desired, reagents for detecting the binding agent.
- The following examples are given for the purpose of illustrating various embodiments of the invention and are not meant to limit the present invention in any fashion. One skilled in the art will appreciate readily that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those objects, ends and advantages inherent herein. The present examples, along with the methods described herein, are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Changes therein and other uses which are encompassed within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.
- To elucidate the biology of 5-hmC, the first step is to identify the locations of 5-hmC within genomic DNA, but so far it has remained challenging to distinguish 5-hmC from 5-mC and to enrich 5-hmC-containing genomic DNA fragments.
- Widely used methods to probe 5-mC, such as bisulfite sequencing and methylation-sensitive restriction digestion, cannot discriminate between 5-hmC and 5-mC (Huang et al., 2010; Jin et al., 2010). Anti-5-hmC antibodies have only recently become commercially available. However, attempts to use the antibodies to immuno-enrich 5-hmC-containing genomic DNA from complex genomes for sequencing have yet to be successful (Ito et al., 2010). A single-molecule, real-time sequencing technology has been applied to distinguish between cytosine, 5-mC and 5-hmC, but further improvements are necessary to affinity-enrich 5-hmC-containing DNA and to achieve base-resolution sequencing (Flusberg et al., 2010).
- In certain aspects, the inventors describe a chemical tagging technology. It has been shown that 5-hmC is present in the genome of the T-even bacteriophages. A viral enzyme, β-glucosyltransferase (β-GT), can catalyze the transfer of a glucose moiety from uridine diphosphoglucose (UDP-Glu) to the hydroxyl group of 5-hmC, yielding β-glucosyl-5-hydroxymethyl-cytosine (5-gmC) in duplex DNA (Josse and Kornberg, 1962; Lariviere and Morera, 2004) (
FIG. 1A ). The inventors took advantage of this enzymatic process and used β-GT to transfer a chemically modified glucose, 6-N3-glucose, onto 5-hmC for selective bio-orthogonal labeling of 5-hmC in genomic DNA (FIG. 1B ). With an azide group present, a biotin tag or any other tag can be installed using Huisgen cycloaddition (click) chemistry for a variety of enrichment, detection and sequencing applications (Kolb et al., 2001; Speers and Cravatt, 2004; Sletten and Bertozzi, 2009). - The inventors used the biotin tag for high-affinity capture and/or enrichment of 5-hmC-containing DNA for sensitive detection and deep sequencing to reveal genomic locations of 5-hmC (
FIG. 1B ). The covalent chemical labeling coupled with biotin-based affinity purification provides considerable advantages over noncovalent, antibody-based immunoprecipitation as it ensures accurate and comprehensive capture of 5-hmC-containing DNA fragments, while still providing high selectivity. - The inventors chemically synthesized UDP-6-N3-Glu (
FIG. 3 ) and attempted the glycosylation reaction of an 11-mer duplex DNA containing a 5-hmC modification as a model system (FIG. 5 ). Wild-type β-GT worked efficiently using UDP-6-N3-Glu as the co-factor, showing only a sixfold decrease of the reaction rate compared to the native co-factor UDP-Glu (FIG. 6 ). The 6-N3-glucose transfer reaction finished within 5 min with as little as 1% enzyme concentration. The identity of the resulting β-6-azide-glucosyl-5-hydroxymethyl-cytosine (N3-5-gmC) of the 11-mer DNA was confirmed by matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) analysis (FIG. 5 ). One can readily couple N3-5-gmC with dibenzocyclooctyne-modified biotin (compound 1) by copper-free click chemistry to introduce a biotin group (FIG. 5 ) (Baskin et al., 2007; Ning et al., 2008). Again, the identity of the 11-mer DNA with the biotin-N3-5-gmC label was confirmed by MALDI-TOF analysis (FIG. 5 ). High-performance liquid chromatography (HPLC) analysis indicated that the click chemistry is high yielding (˜90%) (FIG. 7 ). High-resolution mass spectrometry (HRMS) analysis of the corresponding HPLC hydrolysates further verified that biotin-N3-5-gmC was formed (FIG. 8 ). - The properties of 5-hmC in duplex DNA are quite similar to those of 5-mC in terms of its sensitivity toward enzymatic reactions such as restriction enzyme digestion and polymerization (Flusberg et al., 2010; Josse and Kornberg, 1962; Lariviere and Morera, 2004). In an attempt to develop a method to differentiate these two bases in DNA, primer extension with a biotin-N3-5-gmC-modified DNA template was tested. Addition of streptavidin tetramer (which binds biotin tightly) completely stops replication by Taq polymerase specifically at the modified position as well as one base before the modified position (
FIG. 6 ). Therefore, this method has the potential to provide single-base resolution of the location of 5-hmC in DNA loci of interest. - Next, the inventors performed selective labeling of 5-hmC in genomic DNA from various cell lines and animal tissues (
FIG. 10 ). Genomic DNA from various sources was sonicated into small fragments (˜100-500 base pairs), treated with β-GT in the presence of UDP-6-N3-Glu or regular UDP-Glu (control group) to yield N3-5-gmC or 5-gmC modifications and finally labeled with cyclooctyne-biotin (1) to install biotin. Because each step is efficient and bio-orthogonal, this protocol ensures selective labeling of most 5-hmC in genomic DNA. The presence of biotin-N3-5-gmC allows affinity enrichment of this modification and accurate quantification of the amount of 5-hmC in a genome using avidin-horseradish peroxidase (HRP). - The inventors determined the total amount of 5-hmC in mouse cerebellum at different stages of development (
FIGS. 10A and 10B ). The control group showed almost no signal, demonstrating the high selectivity of this method. The amount of 5-hmC depends on the developmental stage of the mouse cerebellum (FIG. 10B ). A gradual increase from postnatal day 7 (P7, 0.1% of total nucleotides in the genome) to adult stage (0.4% of total nucleotides) was observed (Munzel et al., 2010), which was further confirmed by a dot-blot assay using an antibody against 5-hmC (FIG. 11 ). These observations suggest that 5-hmC might play an important role in brain development. The 5-hmC level of mouse embryonic stem cells (mESC) was determined to be comparable to results reported previously (˜0.05% of total nucleotides) (FIGS. 10C and 10D ) (Tahiliani et al., 2009). In addition, the amount of 5-hmC in mouse adult neural stem cells (aNSC) was tested, which proved comparable to that of mESC (˜0.04% of total nucleotides) (FIGS. 10C and 10D ). - The inventors also tested human cell lines (
FIGS. 10C and 10D ). Notably, the presence of 5-hmC was detected in HeLa and HEK293FT cell lines, although in much lower abundance (˜0.01% of total nucleotides) (FIG. 10D ) than in other cells or tissues that have been previously reported to contain 5-hmC (previous studies did not show the presence of 5-hmC in HeLa cells due to the limited sensitivity of the methods employed (Kriaucionis and Heintz, 2009)). These results suggest that this modification may be more widespread than previously anticipated. By contrast, no 5-hmC signal was detected in wild-type Drosophila melanogaster, consistent with a lack of DNA methylation in this organism (Lyko et al., 2000). - To further validate the utility of the method for biological samples the inventors confirmed the presence of 5-hmC in the genomic DNA from HeLa cells. A monomeric avidin column was used to pull down the biotin-N3-5-gmC-containing DNA after genomic DNA labeling. These enriched DNA fragments were digested into single nucleotides, purified by HPLC and subjected to HRMS analysis. The inventors obtained HRMS as well as MS/MS spectra of biotin-N3-5-gmC identical to the standard from synthetic DNA (
FIG. 8 , and FIGS. 12B and 12C ). In addition, two 60-mer double-stranded (ds)DNAs, one with a single 5-hmC in its sequence and the other without the modification, were prepared. The inventors spiked equal amounts of both samples into mouse genomic DNA and performed labeling and subsequent affinity purification of the biotinylated DNA. The pull-down sample was subjected to deep sequencing, and the result showed that the 5-hmC-containing DNA was >25-fold more abundant than the control DNA (FIG. 13 ). - The inventors performed chemical labeling of genomic DNA from mouse cerebellum, subjecting the enriched fragments to deep sequencing such that 5-hmC-containing genomic regions could be identified. Initially, the inventors compared male and female adult mice (2.5 months old), sequencing multiple independent biological samples and multiple libraries prepared from the same genomic DNA. Genome-scale density profiles are nearly identical between male and female and are clearly distinguishable from both input genomic DNA and control DNA labeled with regular glucose (no biotin) (
FIG. 12A ). Peak identification revealed a total of 39,011 high-confidence regions enriched consistently with 5-hmC in both male and female (FIG. 12A ). All of the 13 selected, enriched regions were subsequently successfully verified in both adult female and male cerebellum by quantitative PCR (qPCR), whereas multiple control regions did not display enrichment (FIG. 14 ). - DNA methylation is widespread in mammalian genomes, with the exception of most transcription start sites (TSS) (Meissner et al., 2008; Lister et al., 2009; Edwards et al., 2010). Previous studies have mostly assessed DNA methylation by bisulfite sequencing and methylation-sensitive restriction digests. It has since been appreciated that neither of these methods adequately distinguishes 5-mC from 5-hmC (Huang et al., 2010; Jin et al., 2010). To determine the genome-wide distribution of 5-hmC, metagene 5-hmC read density profiles were generated for RefSeq transcripts. Normalized 5-hmC read densities differ by an average of 2.10±0.04% (mean±s.e.m.) in adult male and female cerebellum samples, indicating that the profiles are accurate and reproducible. Enrichment of 5-hmC was observed in gene bodies as well as in proximal upstream and downstream regions relative to TSS, transcription termination sites (TTS) and distal regions (
FIG. 12B ). This is in contrast to previously generated methyl-binding domain-sequencing (MBD-Seq) (Skene et al., 2010), as well as our own methylated DNA immunoprecipitation sequencing (MeDIP-Seq) from mouse cerebellum genomic DNA, in which the majority (˜80%) of 5-mC-enriched DNA sequences were derived from satellite and/or repeat regions (FIG. 15 ). Further analyses also reveal that both intragenic and proximal enrichment of 5-hmC is associated with more highly expressed genes, consistent with a role for 5-hmC in maintaining and/or promoting gene expression (FIG. 12B ). Proximal enrichment of 5-hmC ˜875 bp upstream of TSSs and ˜160-200 bp downstream of the annotated TTSs further suggests a role for these regions in the regulation of gene expression through 5-hmC. - Quantification of bulk 5-hmC in the cerebellum of P7 and adult mice indicates genomic acquisition of 5-hmC during cerebellum maturation (
FIG. 10A ). The inventors further explored this phenomenon by sequencing 5-hmC-enriched DNA from P7 cerebellum and compared these sequences to those derived from adult mice. Metagene profiles at RefSeq transcripts confirmed an increase in proximal and intragenic 5-hmC in adult relative to P7 cerebellum, although there was little to no difference and minimal enrichment over input genomic DNA in distal regions (FIG. 12C ). Peak identification using P7 as background identified a total of 20,092 enriched regions that showed significant differences between P7 and adult tissues. Of those, 15,388 (76.6%) occurred within 5,425 genes acquiring intragenic 5-hmC in adult females. - Gene ontology pathway analysis of the 5,425 genes acquiring 5-hmC during aging identified significant enrichment of pathways associated with age-related neurodegenerative disorders as well as angiogenesis and hypoxia response (
FIG. 12D ). This is of particular interest given that all these pathways have been linked to oxidative stress response and that the conversion of 5-mC to 5-hmC requires dioxygen. Furthermore, an assessment of the gene list revealed that 15/23 genes previously identified as causing ataxia and disorders of Purkinje cell degeneration in mouse and human acquired intragenic 5-hmC in adult mice (FIG. 16 ) (Lim et al., 2006). Together, these observations suggest that 5-hmC may play a role in age-related neurodegeneration. - Recently, β-GT was used to transfer a radiolabeled glucose for 5-hmC quantification (Szwagierczak et al., 2010). A major advantage of the technology described herein is its ability to selectively label 5-hmC in genomic DNA with any tag. With a biotin tag attached to 5-hmC, DNA fragments containing 5-hmC can be affinity purified for deep sequencing to reveal distribution and/or location of 5-hmC in mammalian genomes. Because biotin is covalently linked to 5-hmC and biotin-avidin/streptavidin interaction is strong and highly specific, this technology promises high robustness as compared to potential anti-5-hmC, antibody-based immunopurification methods (Ito et al., 2010). Other fluorescent or affinity tags may be readily installed using the same approach for various other applications. For instance, imaging of 5-hmC in fixed cells or even live cells (if labeling can be performed in one step with a mutant enzyme) may be achieved with a fluorescent tag. In addition, the chemical labeling of 5-hmC with a bulky group could interfere with restriction enzyme digestion or ligation, which may be used to detect 5-hmC in specific genome regions. The attachment of biotin or other tags to 5-hmC also dramatically enhances the sensitivity and simplicity of the 5-hmC detection and/or quantification in various biological samples (Szwagierczak et al., 2010). The detection limit of this method can reach 0.004% (
FIG. 10D ) and the method can be readily applied to study a large number of biological samples. - With the technology described herein, the inventors observed the developmental stage-dependent increase of 5-hmC in mouse cerebellum. Compared to
postnatal day 7 at a time of massive cell proliferation in the mouse cerebellum, adult cerebellum has a significantly increased level of 5-hmC, suggesting that 5-hmC might be involved in neuronal development and maturation. Indeed, the inventors also observed an increase of 5-hmC in aNSCs upon differentiation (unpublished data). - This technology enables the selective capture of 5-hmC-enriched regions in the cerebellums from both P7 and adult mice, and the determination of the genome-wide distribution of 5-hmC by deep sequencing. The inventors' analyses revealed general features of 5-hmC in mouse cerebellum. First, 5-hmC was enriched specifically in gene bodies as well as defined gene proximal regions relative to more distal regions. This differs from the distribution of 5-mC, where DNA methylation has been found both within gene bodies as well as in more distal regions (Meissner et al., 2008; Lister et al., 2009; Edwards et al., 2010; Maunakea et al., 2010). Second, the enrichment of 5-hmC is higher in gene bodies that are more highly expressed, suggesting a potential role for 5-hmC in activating and/or maintaining gene expression. It is possible that conversion of 5-mC to 5-hmC is a pathway to offset the gene repression effect of 5-mC during this process without going through demethylation (Wu and Zhang, 2010). Third, the inventors observed an enrichment of 5-hmC in genes linked to hypoxia and angiogenesis. The oxidation of 5-mC to 5-hmC by Tet proteins requires dioxygen (Tahiliani et al., 2009; Ito et al., 2010). A well-known oxygen sensor in mammalian systems that is involved in hypoxia and angiogenesis is the HIF protein, which belongs to the same mononuclear iron-containing dioxygenase superfamily as the active domain of the Tet proteins (Hausinger, 2004). It is tempting to speculate that oxidation of 5-mC to 5-hmC by Tet proteins may constitute another oxygen-sensing and regulation pathway in mammalian cells. 
Lastly, the association of 5-hmC with genes that have been implicated in neurodegenerative disorders suggests that this base modification could potentially contribute to the pathogenesis of human neurological disorders. Should a connection between 5-hmC levels and human disease be established, the affinity purification approach shown in the current work could be used to purify and/or enrich 5-hmC-containing DNA fragments as a simple and sensitive method for disease prognosis and diagnosis.
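- The proportions quoted in the results above can be re-derived with simple arithmetic. The following Python sketch recomputes the share of differential regions that fall within genes (15,388 of 20,092) and the share of ataxia-related genes acquiring intragenic 5-hmC (15 of 23), and shows how a fold enrichment such as the >25-fold spike-in result could be computed from read counts. The function names and the read counts passed to the fold-enrichment call are hypothetical and are not taken from the disclosure.

```python
def percent_of_total(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def fold_enrichment(target_reads: float, control_reads: float) -> float:
    """Return the ratio of reads for the 5-hmC spike-in to reads for the control."""
    if control_reads <= 0:
        raise ValueError("control_reads must be positive")
    return target_reads / control_reads


if __name__ == "__main__":
    # Figures stated in the text.
    print(f"Intragenic share of differential regions: "
          f"{percent_of_total(15388, 20092):.1f}%")
    print(f"Ataxia-related genes acquiring 5-hmC: "
          f"{percent_of_total(15, 23):.1f}%")
    # Hypothetical read counts, for illustration only.
    print(f"Spike-in fold enrichment: {fold_enrichment(2600, 100):.1f}x")
```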
- Construction, expression and purification of wild-type β-GT. β-GT was cloned from the extract of T4 bacteriophage (American Type Culture Collection) into the target vector pMCSG19 by the ligation independent cloning method (Donnelly et al., 2006). The resulting plasmid was transformed into BL21 star (DE3)-competent cells containing pRK1037 (Science Reagents) by heat shock. Positive colonies were selected with 150 μg/ml ampicillin and 30 μg/ml kanamycin. One liter of cells was grown at 37° C. from a 1:100 dilution of an overnight culture. The cells were induced with 1 mM of isopropyl-β-D-thiogalactoside when OD600 reached 0.6-0.8. After overnight growth at 16° C. with shaking, the cells were collected by centrifugation and suspended in 30 ml Ni-NTA buffer A (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 30 mM imidazole and 10 mM β-mercaptoethanol) with protease inhibitor phenylmethylsulfonyl fluoride. After loading to a Ni-NTA column, proteins were eluted with a 0-100% gradient of Ni-NTA buffer B (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 400 mM imidazole and 10 mM β-mercaptoethanol). β-GT-containing fractions were further purified by MonoS (GE Healthcare) (buffer A: 10 mM Tris-HCl, pH 7.5; buffer B: 10 mM Tris-HCl, pH 7.5 and 1 M NaCl). Finally, the collected protein fractions were loaded onto a Superdex 200 (GE Healthcare) gel-filtration column equilibrated with 50 mM Tris-HCl (pH 7.5), 20 mM MgCl2 and 10 mM β-ME. The purity of the purified protein was determined by SDS-PAGE to be >95%. β-GT was concentrated to 45 μM and stored frozen at −80° C. with an addition of 30% glycerol.
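- For orientation, the imidazole concentration at any point of the Ni-NTA elution can be estimated from the two buffer compositions given above (30 mM in buffer A, 400 mM in buffer B). The sketch below assumes ideal linear mixing of the two buffers; the function name is hypothetical and the calculation is an illustration, not part of the described protocol.

```python
def gradient_concentration(percent_b: float,
                           conc_a_mm: float = 30.0,
                           conc_b_mm: float = 400.0) -> float:
    """Return the imidazole concentration (mM) at a given % of buffer B.

    Assumes ideal linear mixing of buffer A (30 mM imidazole) and
    buffer B (400 mM imidazole), as listed for the Ni-NTA step.
    """
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    fraction_b = percent_b / 100.0
    return conc_a_mm * (1.0 - fraction_b) + conc_b_mm * fraction_b


if __name__ == "__main__":
    for pct in (0, 25, 50, 75, 100):
        print(f"{pct:3d}% B -> {gradient_concentration(pct):.1f} mM imidazole")
```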
- βGT-catalyzed 5-hmC glycosylation in duplex DNA. The inventors synthesized the phosphoramidite of 5-hmC (now commercially available from Glen Research) and prepared duplex DNA with 5-hmC incorporated at specific locations. The inventors found that incubation of 5-hmC-containing duplex DNA (either 15 mer or 40 mer) with 10% purified βGT at 37° C. for 3 hours led to complete conversion of 5-hmC to 5-gmC, as judged by mass spectrometry analysis and digestion of DNA into single nucleosides for HPLC analysis. Primer extension and PCR experiments demonstrated that the presence of 5-gmC in DNA does not exhibit any interference to the polymerization reaction catalyzed by four different polymerases tested, and that guanine is incorporated into the complementary strand opposite 5-gmC. This result demonstrates that modification of 5-hmC with a molecule of glucose is not sufficient for use in primer extension and other polymerase-based assays to determine the exact location of this modification and that addition of much bulkier groups to 5-hmC using functionally modified glucose molecules is preferred.
- A restriction enzyme digestion-based method to detect the precise locations of 5-hmC. The inventors tested a restriction enzyme digestion assay with a 40-mer DNA containing a CC*GG (C*=C, 5-meC, 5-hmC, or 5-gmC) sequence in the middle. As expected, the methylation-sensitive restriction enzyme HpaII does not cut sequences containing 5-meC, 5-hmC, or 5-gmC. However, as shown in
FIG. 2, while the methylation-insensitive restriction enzyme MspI can completely cut C(5-meC)GG and partially cut C(5-hmC)GG, its activity is completely blocked by C(5-gmC)GG. This indicates that the introduction of a glucose moiety can change the property of 5-hmC in duplex DNA. With bulkier groups on 5-hmC, digestions by other restriction enzymes that recognize DNA sequences containing CpG can be blocked.
- Since 5-gmC can block restriction enzyme digestion, the genomic DNA modified with 5-gmC can be treated with and without restriction enzymes. The two samples can be subjected to next generation sequencing in order to map out the genome-wide distribution and location of the 5-hmC modification in the specific sequences recognized by corresponding restriction enzymes. Briefly, MspI-digested genomic DNA is ligated to a double-stranded adaptor with biotinylation on the upper strand. DNA is sheared further by partial nuclease digest, and size-selected for the 300-500 bp range. A second adaptor is then ligated onto ends that have not yet been filled by the first adaptor. Biotinylated fragments are then pulled down by streptavidin-coated beads, and denaturation will release single-stranded fragments flanked by both types of adaptors. These fragments are amplified by PCR for use in high-throughput sequencing. Internal MspI sites in the sequencing reads indicate resistance to MspI digest and hence the presence of 5-hmC at those sites. Bulky groups with modified glucose can be installed to interfere with other restriction enzymes.
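By way of non-limiting illustration, the read-level test described above (an MspI site that survives inside a sequencing read marks a protected, and hence 5-hmC-bearing, site) might be sketched in Python as follows. The function name `internal_mspi_sites` and the example read are hypothetical and are not part of the described protocol; the sketch assumes reads are plain strings and treats any CCGG that touches either end of a read as a cleavage product rather than a protected site.

```python
def internal_mspi_sites(read, site="CCGG"):
    """Return 0-based start positions of MspI sites lying strictly inside a read.

    Assumption (illustrative only): a site touching either read end reflects
    cleavage that generated the fragment, so only interior sites are reported.
    """
    read = read.upper()
    positions = []
    start = read.find(site)
    while start != -1:
        end = start + len(site)
        if start > 0 and end < len(read):
            positions.append(start)
        # Advance by one base so overlapping occurrences are not missed.
        start = read.find(site, start + 1)
    return positions


# Hypothetical read: one interior CCGG and one CCGG at the 3' end.
example_read = "CGGATCCGGTTACCGG"
protected = internal_mspi_sites(example_read)  # interior site at position 5
```

In practice the reported positions would be converted to genomic coordinates using the alignment of each read.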
- Glycosylation of 5-hmC with modified UDP-Glu, including azide-substituted UDP-Glu. The structure of βGT with bound UDP-Glu has been solved and shows that the 6-hydroxyl group on the glucose of UDP-Glu is quite exposed on the protein surface, with a water-filled channel located directly on top of it, implying that this 6-hydroxyl group may be modified to incorporate functional groups. The inventors have synthesized azide-substituted UDP-Glu (FIG. 3, Compound 8). The inventors have demonstrated that βGT can catalyze the transfer of 6-azide-glucose to 5-hmC-containing duplex DNA. The reaction with azide-substituted UDP-Glu proceeds as efficiently as with UDP-Glu. These results demonstrate that the azide substitution at this position can still be tolerated and recognized by βGT. Other substitutions at the position occupied by the azide are expected to yield substrates for this reaction as well.
- Biotinylation of 5-hmC in genomic DNA for affinity purification and sequencing. The functional group installed on 5-gmC (thiol or azide) can be readily labeled with commercially available biotin-linked maleimide or biotin-linked alkyne (click chemistry), respectively. The reaction of thiol with maleimide is highly efficient; however, this labeling reaction cannot tolerate proteins or small molecules that bear thiol groups. Thus, genomic DNA must be isolated from other cellular components prior to the labeling, which can be readily achieved. The azide labeling with commercially available biotin-linked alkyne is completely bio-orthogonal, thus genomic DNA with bound proteins can be directly used. In both cases, the biotin-labeled DNA fragments may be pulled down with streptavidin and submitted for high-throughput sequencing in order to map out global distributions and the locations of 5-hmC in chromosomes. This will reveal a distribution map of 5-hmC in genomic DNA at different developmental stages of a particular cell or cell line.
- Modified bisulfite sequencing method. 5-meC in DNA reacts with bisulfite slowly, mostly due to the electron-donating effect of the 5-methyl substituent on the cytosine ring (Hayatsu et al., 1970). Instead of attacking the 6-position of the cytosine, bisulfite actually reacts with the hydroxyl group of 5-hmC first. This process has been shown to be fast at pH 4.5 (Hayatsu and Shiragami, 1979). The resulting 5-methylenesulfonate cytosine deaminates slowly to afford 5-methylenesulfonate uracil (FIG. 4A). However, the 5-methylenesulfonate substituent is less electron-donating than the methyl substituent. By varying the reaction temperature or pH, the inventors believe it is feasible to identify conditions in which bisulfite will react with 5-hmC, but not with 5-meC. Alternatively, at elevated temperature, both 5-meC and 5-hmC will react slowly with bisulfite. However, 5-gmC should be quite inactive since the sulfonation of the hydroxyl groups on the glucose will have minimal electronic effect on the cytosine ring. Through systematic variation of reaction conditions, a modified bisulfite sequencing method can be developed that allows the differentiation between 5-meC, 5-hmC, and 5-gmC.
- Labeling 5-gmC with a photosensitizer for high-throughput sequencing. An alternative strategy that does not rely on converting 5-hmC:G base pair to a different base pair is to tether a photosensitizer to 5-gmC using approaches indicated in
FIG. 1 . Photosensitized one-electron oxidation can lead to site-specific oxidation of the modified 5-gmC or the nearby guanines (Tanabe et al., 2007; Meyer et al., 2003). Subsequent base (piperidine) treatment will lead to specific strand cleavage on the oxidized site (FIG. 4B ) (Tanabe et al., 2007; Meyer et al., 2003). Thus, genomic DNA containing 5-gmC labeled with photosensitizer can be subjected to photo-oxidation and base treatment. DNA fragments will be generated with oxidation sites at the end. High-throughput sequencing will reveal these modification sites. - Attachment of a sterically bulky group to 5-gmC. In another strategy, a sterically bulky group such as polyethyleneglycol (PEG), a dendrimer, or a protein such as streptavidin can be introduced to the thiol- or azide-modified 5-gmC. Although 5-gmC in duplex DNA does not interfere with the polymerization reaction catalyzed by various different polymerases, the presence of an additional bulky group on 5-gmC on the DNA template strand can interfere with the synthesis of the new strand by DNA polymerase. As a result, primer extension will lead to a partially extended primer of certain length. The modification sites can be revealed by sequencing the partially extended primers. This method can be very versatile. It can be used to determine the modification sites for a given promoter site of interest. A high-throughput format can be developed as well. DNA fragments containing multiple 5-hmC can be affinity purified and random or designed primers can be used to perform primer extension experiments on these DNA fragments. Partially extended primers can be collected and subjected to high-throughput sequencing using a similar protocol as described in the restriction enzyme digestion method. A bulky modification may stop the polymerization reaction a few bases ahead of the modification site. Still, this method will map the modification sites to the resolution of a few bases. 
Considering that most 5-hmC exists in a CpG sequence, the resolution can be adequate for most applications. With a bulky substitution on 5-gmC, digestion of modified DNA by restriction enzymes other than MspI could be blocked for the restriction enzyme digestion-based assay.
- Applications in embryonic stem cells and neural stem cells. To test the methods and to demonstrate their relevance to real biological situations, the inventors will apply them to the mapping of 5-hmC in several mammalian cell types. This involves two approaches. First, 5-hmC in mouse embryonic stem cells (ESCs) and fibroblasts is mapped. This would produce a general picture of how 5-hmC patterns compare and contrast between a pluripotent stem cell type and a terminally differentiated cell type. Second, 5-hmC is mapped in mouse neural stem cells (NSCs) as well as neurons and astrocytes derived via the differentiation of these NSCs. This elucidates how the process of lineage differentiation affects 5-hmC patterns.
- These cell types have been selected to parallel a comprehensive set of whole-genome epigenetic studies, including the mapping of the methylome using bisulfite sequencing (which would not differentiate between 5-mC and 5-hmC), the transcriptome, and a variety of histone modifications implicated in gene regulation such as H3K4 acetylation, and H3K4, H3K9 and H3K27 methylation (Goldberg et al., 2007). Furthermore, a novel epigenetic state called “occlusion” was also recently identified, whereby affected genes are silenced by cis-acting chromatin mechanisms in a manner that blocks them from responding to the trans-acting transcriptional activators in the cell (Lee et al., 2009a; Lee et al., 2009b). Other labs are in the process of producing occludome maps (i.e., genome-wide maps of occluded genes) for the aforementioned cell types. Results from these comprehensive epigenetic studies would provide a highly informative context for interpreting the 5-hmC mapping data.
- Affinity purified genomic DNA fragments enriched for 5-hmC as described in Example 3 can be directly subjected to high-throughput sequencing to identify these fragments.
- Applications in cancer screening. The redistribution of 5-meC in cancer cells is well documented across essentially every known human tumor type, and drugs that alter the DNA methylation state have become the standard of care for patients with myelodysplastic syndrome and hold promise for patients with other hematopoietic malignancies. Recently, mutations of the TET2 gene have been found in a variety of myelodysplastic syndromes, including in approximately 50% of cases of chronic myelomonocytic leukemia (CMML) (Abdel-Wahab et al., 2009; Kohlmann et al., 2009; Smith et al., 2009). The TET2 gene is closely related to TET1, which encodes an enzyme capable of converting 5-meC to 5-hmC (Tahiliani et al., 2009), raising the possibility that patients with CMML have altered content or distribution of 5-hmC. Indeed, when lentiviruses have been used to increase the level of TET2 three-fold in leukemia cells, thin layer chromatography showed that 5-hmC content increased by 1.5-fold, and the bone marrow of patients with homozygous TET2 mutations showed a 20% decrease in 5-hmC levels (Szpurka et al., 2009).
- These data suggest that alterations in TET2 levels influence the amount and/or location of 5-hmC within myeloid malignancies. To study this hypothesis, the inventors contemplate the use of the described methods to determine the precise location of 5-hmC bases within three cases of TET2-mutated CMML versus two CMML cases with wild-type TET2. Cases are identified using frozen tumor banks, which contain thousands of stored human leukemia samples. Cases of CMML are identified, and complete sequencing of the TET2 coding region is performed. Three cases of TET2-mutated CMML are chosen, giving preference to samples with a high disease burden (>85% bone marrow involvement) and with adequate cell numbers. For comparison, two cases of CMML will be analyzed that contain wild-type TET2. Determination of 5-hmC within these samples will allow the inventors to measure 5-hmC content as well as the precise location of this modified base. Of particular interest is to note whether the 5-hmC base occurs preferentially at promoter/CpG island/CpG shore regions, repetitive elements, or near centromeres, and whether 5-hmC is concentrated at particular chromosomes.
- Labeling of 5-hmC in RNA samples. TET2, which is a homologue of TET1, modifies RNA. 5-hmC exists in human RNA. TET2 is defective in various leukemias (Abdel-Wahab et al., 2009).
- Azide modified UDP-Glucose can be synthesized by the following reaction scheme.
- Methods used for the genomic DNA analysis can be divided into two general strategies. Initially, genomic DNA is fragmented, for example by sonication or restriction enzyme digestion. After fragmentation, 5-hmC in genomic DNA is detectably modified, for example using click chemistry after introduction of an azide-glucose. In a first strategy, the biotin-labeled DNA fragments will be pulled down using an avidin column. These affinity purified genomic DNA fragments are enriched for 5-hmC and will be directly subjected to high-throughput sequencing for sequence identification. This will give the global distribution map of 5-hmC in the genomic DNA analyzed. Such analysis can be performed at different developmental stages or in various tissue or cell samples.
- In a second strategy, the biotin-labeled genomic DNA fragments will be ligated to an adaptor of known sequence, and primer extension will then be performed in order to map out the exact location of 5-hmC.
- Preparation of genomic DNA. All animal procedures were performed according to protocols approved by Emory University Institutional Animal Care and Use Committee. Genomic DNA from tissues and cell lines was purified using Wizard genomic DNA purification kit (Promega) with additional Proteinase K treatment and rehydrated in 10 mM Tris (pH 7.9). Genomic DNA samples were further sonicated in Eppendorf tubes into 100-500 bp by Misonix sonicator 3000 (using microtip, three pulses of 30 s each with 2 min of rest and a power output level of 2) or Bioruptor UCD-200 sonicator (Diagenode, Sparta). (The output selector switch was set on High (H), and sonication interval was 30 s with 30 cycles of sonication performed. In addition, samples were resuspended and centrifuged briefly every five cycles to keep the constancy of DNA shearing.) Cerebellums from P7 and 10-week-old C57BL/6 were used. Mouse feeder-free E14Tg2A ES cells (mESC) were cultured as reported (Silva et al., 2008). Adult neural stem cells (aNSCs) were isolated and cultured as described previously (Szulwach et al. 2010).
- Accession codes. The sequencing data have been deposited in NCBI's Gene Expression Omnibus with accession number GSE25398.
- Oligonucleotide synthesis. Oligonucleotides containing 5-hmC were prepared using an Applied Biosystems 392 DNA synthesizer. 5-Hydroxymethyl-dC-CE phosphoramidite (Glen Research) was used to incorporate 5-hmC at the desired position during solid-phase synthesis, followed by postsynthetic deprotection by treatment first with 30% ammonium hydroxide and then with a 25-30% wt/wt solution of sodium methoxide in methanol (Alfa Aesar) overnight at 25° C. The 11-mer DNA was purified by reversed-phase HPLC and confirmed by MALDI-TOF. Other DNA was purified by denaturing PAGE. Concentrations of the oligonucleotides were estimated by UV at 260 nm. Duplexes were prepared by combining equimolar portions of each strand in annealing buffer (10 mM Tris, pH 7.5, 100 mM NaCl), heating for 10 min at 95° C. followed by slow cooling overnight.
- 5-hmC labeling reaction and click chemistry. The 5-hmC labeling reactions were performed in a 100-μl solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 300 ng/μl sonicated genomic DNA (100-500 bp), 250 μM UDP-6-N3-Glu, and 2.25 μM wild-type βGT. The reactions were incubated for 1 h at 37° C. After the reaction, the DNA substrates were purified by Qiagen DNA purification kit or by phenol-chloroform precipitation and reconstituted in H2O. The click chemistry was performed with addition of 150 μM dibenzocyclooctyne modified biotin (compound 1) into the DNA solution, and the reaction mixture was incubated for 2 h at 37° C. The DNA samples were then purified by Qiagen DNA purification kit, which were ready for further applications.
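By way of non-limiting illustration, the reagent quantities in this labeling and click protocol can be checked with simple dilution arithmetic. The Python sketch below uses hypothetical helper names (`stock_volume_ul`, `dna_mass_ug`) and assumes a 30 mM stock of the dibenzocyclooctyne-modified biotin, the stock concentration stated for the large-scale HeLa pull-down elsewhere in this description; other stock concentrations would change the volumes accordingly.

```python
def stock_volume_ul(target_um, sample_volume_ul, stock_mm):
    """Volume of stock (in µl) to add to a sample so that the mixture reaches target_um.

    Solves stock * v / (sample + v) = target for v, i.e. the added volume is
    accounted for in the final volume.
    """
    stock_um = stock_mm * 1000.0
    if target_um >= stock_um:
        raise ValueError("target concentration must be below the stock concentration")
    return target_um * sample_volume_ul / (stock_um - target_um)


def dna_mass_ug(conc_ng_per_ul, volume_ul):
    """Total DNA mass (in µg) present in a reaction of the given volume."""
    return conc_ng_per_ul * volume_ul / 1000.0


# 100-µl labeling reaction at 300 ng/µl contains 30 µg of sonicated genomic DNA.
dna_in_reaction = dna_mass_ug(300, 100)

# Reaching 150 µM biotin reagent in a 4-ml DNA solution from an assumed 30 mM
# stock requires about 20 µl, in line with the large-scale protocol.
biotin_stock_ul = stock_volume_ul(150, 4000, 30)
```

The same helper applies to any reconstitution volume chosen after the DNA purification step.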
- Affinity enrichment of the biotinylated 5-hmC (biotin-N3-5-gmC). Genomic DNAs used for deep sequencing were purified/enriched by Pierce Monomeric Avidin Kit (Thermo) twice following manufacturer's recommendations. After elution, the biotin-N3-5-gmC containing DNA was concentrated by 10 K Amicon Ultra-0.5 ml Centrifugal Filters (Millipore) and purified by Qiagen DNA purification kit. Starting with 30 μg total genomic DNA, it is possible to obtain 100-300 ng enriched DNA samples following the labeling and pull-down protocol described here. The deep sequencing experiment can be performed with as low as 10 ng DNA sample.
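By way of non-limiting illustration, the recovery quoted above can be expressed as a percent yield. The helper name `yield_percent` is hypothetical; the inputs are the figures stated in this paragraph (30 μg of starting genomic DNA and 100-300 ng of enriched DNA).

```python
def yield_percent(recovered_ng, input_ug):
    """Pull-down yield as a percentage of the input DNA mass."""
    return recovered_ng / (input_ug * 1000.0) * 100.0


# 100-300 ng recovered from 30 µg corresponds to roughly 0.33% to 1.0%.
low_yield = yield_percent(100, 30)
high_yield = yield_percent(300, 30)
```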
- The inventors have also developed a cleavable biotin-containing capture agent with a disulfide linker as the click reaction partner to form biotin-S-S-N3-5-gmC (FIGS. 21A-21C). The 5-hmC-containing DNA fragments from genomic DNA are captured by streptavidin beads, allowing non-modified DNA to be removed. A simple dithiothreitol (DTT) treatment releases the bound DNA fragments of interest with 5-hmC modified as HS-N3-5-gmC (FIGS. 21A-21C). This disulfide-reduction strategy to release desired DNA fragments is less time-consuming and more efficient than the previous monomeric avidin column-based purification method, increasing the pull-down efficiency by 2-3 fold. For comparison, pull-down yields of the previous method with mouse postnatal day 7 and mouse adult cerebellum genomic DNA were 0.46% and 1.3%, respectively, while the new method improved the yields to 1.6% and 3.1%, respectively.
- Primer extension assay. Reverse primer (14-mer, 5′-AAGCTTCTGGAGTG-3′ (SEQ ID NO:2), purchased from Eurofins MWG Operon and PAGE purified) was end-labeled with T4 polynucleotide kinase (T4 PNK) (New England Biolabs) and 15 μCi of [γ-32P]-ATP (PerkinElmer) for 0.5 h at 37° C., and then purified by Bio-Spin 6 column (Bio-Rad). For primer extension assay, REDTaq DNA polymerase (Sigma) was used. The inventors first mixed 0.2 pmol template and 0.25 pmol γ-32P-labeled primers with dNTP in the polymerase reaction buffer without adding polymerase. The mixture was heated at 65° C. for 2 min and allowed to cool slowly for 30 min. Streptavidin in PBS was then added if needed and allowed to mix at 25° C. for 5 min. REDTaq DNA polymerase was then added (final volume 20 μl) and the extension reaction was run at 72° C. for 1 min. The reaction was quenched by 2× stop solution (98% formamide, 10 mM EDTA, 0.1% xylene cyanol, 0.1% bromophenol blue) and loaded onto a 20% denaturing polyacrylamide gel (7 M urea). Sanger sequencing was performed using Sequenase DNA Sequencing Kit (USB) with 1 pmol template and 0.5 pmol [γ-32P]-labeled primer. The results were visualized by autoradiography.
- Large-scale HeLa 5-hmC pull-down. Twenty dishes (15 cm) of HeLa cells were harvested and resuspended in 20 ml of 10 mM Tris (pH 8.0), 10 mM EDTA. Sodium dodecyl sulfate (SDS) and Proteinase K were added to final concentrations of 0.5% and 200 μg/ml, respectively, and the solution was allowed to incubate at 55° C. for 2 h. After adding NaCl to a final concentration of 0.2 M, the sample was extracted twice with equal volumes of phenol/chloroform/isoamyl alcohol (25:24:1) and once with chloroform. Chloroform was evaporated by placing the tube in a 55° C. water bath for 1 h with the cap open. RNase A was then added to a final concentration of 25 μg/ml and the solution incubated for 1 h at 37° C. DNA was then extracted once with phenol/chloroform/isoamyl alcohol (25:24:1) and once with chloroform and precipitated with 1.5 volumes of ethanol. Genomic DNA was washed twice with 20 ml of 70% ethanol, dried and resuspended in 10 mM Tris (pH 7.9) at 37° C. Genomic DNA was then sonicated by Bioruptor UCD-200 sonicator into 100-1,000 bp as noted before.
The 5-hmC labeling reaction was carried out in a 4 ml solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 550 ng/μl sonicated HeLa genomic DNA, 250 μM UDP-6-N3-Glu and 2.25 μM wild-type β-GT. The reaction was incubated for 1 h at 37° C., purified by phenol-chloroform precipitation and reconstituted in 4 ml H2O. The inventors added 20 μl of 30 mM dibenzocyclooctyne-modified biotin (compound 1) and incubated the mixture for 2 h at 37° C. The DNA sample was purified again by phenol-chloroform precipitation and then enriched for biotin-N3-5-gmC by monomeric avidin column as noted before. The pull-down DNA was concentrated and digested by nuclease P1 (Sigma), venom phosphodiesterase I (Type VI) (Sigma) and alkaline phosphatase (Sigma) according to published protocols (Crain, 1990). The sample was purified by HPLC C18 reversed-phase column as noted in FIG. 7. The peaks corresponding to the biotin-N3-5-gmC from synthetic DNA were collected, lyophilized and subjected to HRMS analysis. For HRMS analysis, lyophilized fractions were dissolved in 100 μl of 50% methanol and 5-20 μl samples were injected for LC-MS/MS analysis. The LC-MS/MS system is composed of an Agilent 1200 HPLC system and an Agilent 6520 QTOF system controlled by MassHunter Workstation Acquisition software (B.02.01 Build 2116). A reversed-phase C18 column (Kinetex C18, 50 mm×2.1 mm, 1.7 μm, with 0.2 μm guard cartridge) flowing at 0.4 ml min−1 was used for online separation to avoid potential ion suppression. The gradient was from 98% solvent A (0.05% (vol/vol) acetic acid in MilliQ water), held for 0.5 min, to 100% solvent B (90% acetonitrile (vol/vol) with 0.05% acetic acid (vol/vol)) in 4 min. MS and MS/MS data were acquired in extended dynamic range (1,700 m/z) mode, with post-column addition of reference mass solution for real time mass calibration.
- Dot-blot assays and quantification of genomic DNA containing 5-hmC. Labeled genomic DNA samples (biotin-N3-5-gmC, 40 ng for mouse cerebellum samples, 700 ng for other samples) were spotted on an Amersham Hybond-N+ membrane (GE Healthcare). DNA was fixed to the membrane by Stratagene UV Stratalinker 2400 (auto-crosslink). The membrane was then blocked with 5% BSA and incubated with avidin-HRP (1:20,000) (Bio-Rad), which was visualized by enhanced chemiluminescence. Quantification was calculated using a working curve generated by 1-8 ng of 32 bp synthetic biotin-N3-5-gmC-containing DNA. Polyclonal antibody against 5-hmC (Active Motif) was also used for dot-blot assay (1:10,000 dilution).
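By way of non-limiting illustration, quantification against a working curve of the kind described for the dot-blot assay might be sketched as follows. The sketch assumes the chemiluminescence signal is linear in the amount of standard over the 1-8 ng range, which the description does not state explicitly; the function names and all signal values are hypothetical placeholders, not measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the working curve to get the standard-equivalent amount (ng)."""
    return (signal - intercept) / slope


# Hypothetical standards: ng of synthetic biotin-N3-5-gmC DNA versus spot signal.
standard_ng = [1, 2, 4, 8]
standard_signal = [110, 210, 410, 810]  # made-up, perfectly linear values

slope, intercept = fit_line(standard_ng, standard_signal)
sample_equivalent_ng = amount_from_signal(510, slope, intercept)  # about 5 ng
```

Samples whose signal falls outside the range spanned by the standards would need to be re-spotted at a different loading rather than extrapolated.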
- 5-hmC-enrichment test. Two solutions of 60-mer dsDNA (see FIG. 13) were prepared as noted. Mouse DNA (30 μg) was spiked with 3 pg from each DNA solution. The inventors performed 5-hmC labeling and enrichment as noted. The pull-down DNA (10 ng) was end-repaired, adenylated, ligated to adapters (size selection 140-400 bp) and sequenced on an Illumina Genome Analyzer according to the manufacturer's recommendations for Illumina ChIP-Seq to identify spike enrichment.
- Reads were mapped to the Mus musculus reference genome (NCBI37/mm9), from which sequences that were not finished or that have not been placed with certainty were excluded (i.e., sequences contained in the chrUn_random.fa and chrN_random.fa files provided by the UCSC genome browser) and to which fasta sequences corresponding to the positive and negative spiked controls were appended. Sequence alignment was accomplished using bwa (Li and Durbin, 2009) and default alignment settings.
- Deep sequencing of mouse cerebellum genomic DNA. DNA libraries were generated following the Illumina protocol for “Preparing Samples for ChIP Sequencing of DNA” (Part# 111257047 Rev. A). 25 ng genomic DNA, 5-hmC-captured DNA, or control captured DNA (in the absence of biotin) were used to initiate the protocol. In some instances <25 ng DNA was eluted in the no-biotin control treatment. In these cases the entire amount of eluted DNA was used to initiate library preparation. DNA fragments ˜150-300 bp were gel purified after the adaptor ligation step. PCR amplified DNA libraries were quantified on an Agilent 2100 Bioanalyzer and diluted to 6 pM for cluster generation and sequencing. 38-cycle single end sequencing was performed using
Version 4 Cluster Generation and Sequencing Kits (Part #15002739 and #15005236 respectively) and Version 7.0 recipes. Image processing and sequence extraction were done using the standard Illumina Pipeline. - Sequence alignment and peak identification. FASTQ sequence files were aligned to Mus musculus reference genome (NCBI37/mm9) using Bowtie (Langmead et al., 2009). The best alignment and reporting option was used for all conditions, corresponding to no more than 2 bp mismatches across each 38 bp read. 5-hmC peak identification was performed using nonduplicate reads with MACS (Zhang et al., 2008). Parameters were as follows: effective genome size=2.72e+09; tag size=38; band width=100; model fold=10; P value cutoff=1.00e−05; ranges for calculating regional lambda are: peak_region, 1,000, 5,000, 10,000.
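By way of non-limiting illustration, the restriction of peak identification to nonduplicate reads might be sketched as follows. The description does not define how duplicates were identified; the sketch assumes, as one common convention, that reads sharing chromosome, 5′ position and strand are duplicates, and the function name `remove_duplicates` is hypothetical.

```python
def remove_duplicates(reads):
    """Keep the first read seen for each (chromosome, 5' position, strand) key.

    `reads` is an iterable of (chrom, pos, strand) tuples; input order is preserved.
    """
    seen = set()
    unique = []
    for chrom, pos, strand in reads:
        key = (chrom, pos, strand)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical aligned reads: the second entry duplicates the first.
aligned = [("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 100, "-"), ("chr2", 100, "+")]
nonduplicate = remove_duplicates(aligned)
```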
- For identification of high-confidence peaks consistently detected in adult female and male samples, data from all lanes were merged per condition (5-hmC enriched, nonenriched genomic DNA input) for each sex and used in the analysis described above. Using a combined input genomic DNA sequence set (male input plus female input) as background, 78.7% overlap in identified peaks was observed between male and female samples. As a more stringent analysis, the inventors also used sex-matched input genomic DNA as background/control samples for peak identification. A total of 91,751 peaks were identified in adult female cerebellum and a total of 240,147 peaks were identified in adult male cerebellum using these parameters; 39,011 peaks overlapped ≥1 bp between sexes and are reported as the set of high-confidence peaks consistently detected in adult cerebellum. Regions enriched for 5-hmC in adult cerebellum relative to P7 cerebellum were identified using a single lane of adult female 5-hmC reads as the treatment and the single lane of P7 reads as the background and/or control sample. A total of 20,092 regions were identified as enriched for 5-hmC in adult female cerebellum relative to P7 cerebellum. Of these, 15,388 (76.6%) were intragenic to 5,425 unique RefSeq transcripts. Genes acquiring 5-hmC during development are those with peaks overlapping ≥1 bp of a RefSeq gene.
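By way of non-limiting illustration, the "overlapping ≥1 bp" criterion used to define high-confidence peaks might be sketched as follows. The sketch assumes peaks are given as (chromosome, start, end) tuples with 0-based, half-open coordinates, and uses a simple pairwise comparison for clarity; the function names are hypothetical, and a sorted sweep or interval index would be preferable for sets of the size reported here.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlapping_peaks(peaks_a, peaks_b):
    """Return the peaks in peaks_a that overlap any peak in peaks_b by >= 1 bp."""
    return [a for a in peaks_a if any(overlaps(a, b) for b in peaks_b)]


# Hypothetical peak sets standing in for the female and male peak calls.
female = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
male = [("chr1", 199, 250), ("chr1", 400, 500), ("chr3", 100, 200)]
shared = overlapping_peaks(female, male)  # only ("chr1", 100, 200) overlaps
```

Note that intervals that merely abut, such as 300-400 and 400-500 in half-open coordinates, share no base and are not counted.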
- Generation of metagene profiles and heatmaps. Metagene RefSeq transcript profiles were generated by first determining the distance between any given read and the closest TSS or TTS (txEnd) and then summing the number of 5′ ends within 10 bp bins centered on either TSS or txEnds. Ten bp bins were then examined 5 kb upstream and 3 kb downstream to assess the level of 5-hmC in gene bodies relative to TSS and txEnds. The RefSeq reference file was obtained through the UCSC Genome Browser Tables (downloaded May 20, 2010).
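By way of non-limiting illustration, the binning of read 5′ ends around a reference point might be sketched as follows for a single TSS. The function name `metagene_counts` and the example positions are hypothetical; the sketch assumes a window of 5 kb upstream to 3 kb downstream with 10 bp bins, as described above, and flips the coordinate for minus-strand transcripts so that "upstream" is always 5′ of the TSS. Summing such vectors over all transcripts would give a metagene profile.

```python
def metagene_counts(read_starts, tss, strand, upstream=5000, downstream=3000, bin_size=10):
    """Count read 5' ends in fixed-width bins around one TSS.

    Bin 0 starts `upstream` bp 5' of the TSS; the TSS itself falls in
    bin upstream // bin_size.
    """
    n_bins = (upstream + downstream) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        # Signed distance from the TSS in the direction of transcription.
        rel = pos - tss if strand == "+" else tss - pos
        if -upstream <= rel < downstream:
            counts[(rel + upstream) // bin_size] += 1
    return counts


# Hypothetical 5' end positions near a plus-strand TSS at coordinate 10,000.
profile = metagene_counts([10000, 10005, 9990, 4999, 13000], tss=10000, strand="+")
```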
- Read densities (Reads/10 bp) were calculated for each individual lane of sequence and then normalized per million reads of aligned sequence to generate a normalized read density. For samples sequenced on multiple lanes, normalized read densities were averaged. To generate the metagene profile for adult cerebellum the inventors averaged normalized read densities from male and female. The inventors observed excellent consistency in normalized read densities between both technical replicates (independent library preparation and sequencing the same library on multiple lanes) as well as between biological replicates (male and female adult samples). For genomic DNA input libraries from male and female samples normalized read densities differed by 3.41±0.05% (mean±s.e.m.). For 5-hmC libraries from male and female samples normalized read densities differed by 2.10±0.04% (mean±s.e.m.).
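By way of non-limiting illustration, the normalization described above (reads per 10 bp bin scaled per million aligned reads, then averaged across lanes or replicates) might be sketched as follows. The function names and the example counts are hypothetical.

```python
def normalized_density(bin_counts, total_aligned_reads):
    """Scale per-bin read counts to reads per bin per million aligned reads."""
    scale = 1_000_000 / total_aligned_reads
    return [count * scale for count in bin_counts]


def average_profiles(profiles):
    """Average several equal-length normalized profiles bin by bin."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


# Two hypothetical lanes of the same sample with different sequencing depths.
lane_1 = normalized_density([2, 4], total_aligned_reads=2_000_000)
lane_2 = normalized_density([9, 12], total_aligned_reads=3_000_000)
sample_profile = average_profiles([lane_1, lane_2])
```

Normalizing each lane before averaging keeps a deeply sequenced lane from dominating the combined profile.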
- To assess 5-hmC in genes expressed at different levels, the inventors obtained adult cerebellum gene expression data from the NCBI GEO sample GSM82974. Signal intensities were downloaded directly, divided into four bins of equal size, and converted into RefSeq mRNA IDs. The inventors then mapped 5-hmC reads to the TSS and txEnds as described above. Heatmap representations of sequence densities were generated using Integrated Genomics Viewer tools and browser (IGV 1.4.2, https://www.broadinstitute.org/igv) with a window size (−w) of 25 and a read extend (−e) of 200.
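By way of non-limiting illustration, the division of expression signal intensities into four bins of equal size might be sketched as follows. The sketch interprets "equal size" as an equal number of genes per bin (quartiles by rank), which is an assumption; the function name `expression_quartiles`, the gene labels and the intensities are hypothetical.

```python
def expression_quartiles(signal_by_gene):
    """Split genes into four rank-based bins, from lowest to highest signal.

    `signal_by_gene` maps a gene identifier to its signal intensity.
    """
    ranked = sorted(signal_by_gene, key=signal_by_gene.get)
    n = len(ranked)
    return [ranked[i * n // 4:(i + 1) * n // 4] for i in range(4)]


# Hypothetical intensities for eight genes.
signals = {"g1": 5.0, "g2": 1.0, "g3": 8.0, "g4": 3.0,
           "g5": 2.0, "g6": 7.0, "g7": 4.0, "g8": 6.0}
bins = expression_quartiles(signals)  # bins[0] holds the two lowest-expressed genes
```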
- MeDIP-Seq, MBD-Seq data and analysis. MBD-Seq data were downloaded from NCBI GEO number GSE19786, data sets SRR037089 and SRR037090 (Skene et al., 2010). Methyl cytosine containing DNA was immunoprecipitated as previously described (Szulwach et al., 2010) using 4 μg sonicated genomic DNA from adult female mouse cerebellum. The inventors used 25 ng immunoprecipitated DNA to generate libraries for sequencing as described above.
- MeDIP-Seq and MBD-Seq reads were aligned to the NCBI37, mm9 using identical parameters as that used for 5-hmC reads. Using these parameters SRR037089 provided 15,351,672 aligned reads, SRR037090 provided 15,586,459 aligned reads and MeDIP-Seq provided 14,104,172 aligned reads. Reads were identified as either RepeatMasker (Rmsk, NCBI37, mm9) or RefSeq (based on 05/20/10 UCSC download) if overlapping ≥1 bp of a particular annotation. The fraction of total reads corresponding to each was then determined. The expected fraction of reads based on the fraction of genomic sequence corresponding to either Rmsk or RefSeq was also plotted for comparison.
- qPCR validation of 5-hmC-enriched regions. Input genomic DNA and 5-hmC-enriched DNA were diluted to 1 ng/μl, and 1 μl was used in triplicate 20 μl qPCR reactions, each containing 1× PowerSYBR Green PCR Master Mix (ABI), 0.5 μM forward and reverse primers, and water. Reactions were run on an SDS 7500 Fast Instrument using the standard cycling conditions. Fold-enrichment was calculated as 2^(−dCt), where dCt = Ct (5-hmC enriched) − Ct (Input). Primers are listed below by the genomic location of the identified peak and the gene with which it is associated; the last two rows are control primers.
Genomic location_Gene | Forward primer | Reverse primer |
---|---|---|
Chr3: 65106415-65106915_Kcnab1 | AAGCTATGCCCGTGTCACTCA (SEQ ID NO:3) | TGCATCAAGCGACACACAGA (SEQ ID NO:4) |
Chr15: 27460605-27461105_Ank | ATCGGCAGAAGGTAGGAGGAA (SEQ ID NO:5) | CCTCACTTGTCTCCCTGCTTATC (SEQ ID NO:6) |
Chr8: 24136542-24137042_Ank1 | GAGACCCTCTTGGGACAGTTACC (SEQ ID NO:7) | TGGGTTACATTCCTCACTCGAA (SEQ ID NO:8) |
Chr19: 16420423-16420923_Gnaq | ATGAGTGAACCATCCCATGCA (SEQ ID NO:9) | TCAGCCAGTGCCTCGTGAT (SEQ ID NO:10) |
Chr1: 36417273-36417773_4632411B12Rik | TGCAACAAGTGCCTGACATACA (SEQ ID NO:11) | TTGTGTGTGCAATCATTGTTCATT (SEQ ID NO:12) |
Chr11: 53835569-53836069_Slc22a4 | CCTCCAGTCCAGGCAGTGAT (SEQ ID NO:13) | CGTCAAAGGAGTCCTGGTCAA (SEQ ID NO:14) |
Chr15: 99352255-99352755_Faim2 | CCTCCTTAGGGCCATTCTCAA (SEQ ID NO:15) | CGGACCTGATGGGCATAGTAG (SEQ ID NO:16) |
Chr16: 7197547-7198047_A2bp1 | TCTACTCCCGTTTCACCGTTTATAT (SEQ ID NO:17) | GCCCATGCAGCCAGTTG (SEQ ID NO:18) |
Chr17: 12879263-12879763_Igf2r | AGAGGGACATGGGCATCACA (SEQ ID NO:19) | ACCGCTGACTGCCAGTACCT (SEQ ID NO:20) |
Chr17: 32919340-32919840_Zfp871 | GACCCAGGAGAGAAAGCATGAG (SEQ ID NO:21) | TGACTCCGTGAACAGGAATGG (SEQ ID NO:22) |
Chr2: 25147087-25147587_Grin1 | AGAGAGATAGAGGTGGAAGTCAGGTT (SEQ ID NO:23) | AGGAGCCTGGAGCAGAAATG (SEQ ID NO:24) |
Chr5: 117916917-117917417_Ksr2 | GAACAGTGTAAGGTCCACCCAAGT (SEQ ID NO:25) | GGAAAAACGGGTTCGGAAAG (SEQ ID NO:26) |
Chr7: 88013448-88013948_Zscan2 | TGGCACACTTGAGCAAATCCTA (SEQ ID NO:27) | TGCCAACTATTGGAATGGAAAATA (SEQ ID NO:28) |
Chr17: 31829767-31830267_Control1 | GAACAGCCAGCAACCTTCTAAAA (SEQ ID NO:29) | CAACAGCGTCATGGGATAACA (SEQ ID NO:30) |
Chr12: 98299598-98300098_Control2 | ACAACCCGCCCACCAAT (SEQ ID NO:31) | TTTAGCTACCCCCAAGTTTAATGG (SEQ ID NO:32) |
- GO pathway analysis. Peaks enriched for 5-hmC in adult female relative to P7 were overlapped with RefSeq annotations, and those overlapping by ≥1 bp were retained. A unique set of genes with ≥1 enriched 5-hmC region was then generated and used as input for the binomial gene list comparison tool provided by the Protein Analysis Through Evolutionary Relationships (PANTHER) classification system (Thomas et al., 2003; Thomas et al., 2006).
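As a minimal sketch of the two calculations described above, the Python below computes fold-enrichment as 2^(−dCt) and collects the unique set of genes overlapped by at least 1 bp of an enriched peak. Averaging the triplicate Ct values before subtraction, the half-open coordinate convention, the function names, and all Ct values and gene coordinates are illustrative assumptions rather than details from the specification; only the Chr3 peak interval is taken from the primer list.

```python
from statistics import mean


def fold_enrichment(ct_enriched, ct_input):
    """Return 2^(-dCt), with dCt = mean Ct(5-hmC enriched) - mean Ct(Input).

    Averaging replicate Ct values first is an assumption of this sketch.
    """
    d_ct = mean(ct_enriched) - mean(ct_input)
    return 2.0 ** (-d_ct)


def genes_with_enriched_peak(peaks, genes):
    """Return the sorted, unique gene names overlapped by >= 1 bp of any peak.

    peaks: iterable of (chrom, start, end)
    genes: iterable of (name, chrom, start, end)
    Intervals are treated as half-open [start, end) (an assumption).
    """
    hits = set()
    for name, g_chrom, g_start, g_end in genes:
        for p_chrom, p_start, p_end in peaks:
            same_chrom = p_chrom == g_chrom
            if same_chrom and p_start < g_end and g_start < p_end:
                hits.add(name)
                break
    return sorted(hits)


# Hypothetical triplicate Ct values (not measured data).
enriched_ct = [24.0, 24.1, 23.9]
input_ct = [27.0, 27.1, 26.9]
print(fold_enrichment(enriched_ct, input_ct))

# Peak interval from the primer list; gene models are hypothetical.
peaks = [("chr3", 65106415, 65106915)]
genes = [
    ("GeneA", "chr3", 65100000, 65110000),   # contains the peak
    ("GeneB", "chr3", 65106915, 65120000),   # abuts the peak, 0 bp overlap
    ("GeneC", "chr15", 65100000, 65110000),  # different chromosome
]
print(genes_with_enriched_peak(peaks, genes))
```

The resulting gene list is what would be submitted to a gene-list comparison tool such as the PANTHER binomial test; that submission step is not modeled here.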
- Chemical synthesis. Compound 1 was prepared according to published procedures (Ning et al., 2008; Jung and Miller, 1981). UDP-6-N3-UDP was chemically synthesized as described in Song et al., 2011.
- Statistical methods. The inventors used unpaired two-tailed Student's t-tests (assuming equal variance) to determine significance and calculate P-values between mouse samples of different ages. A minimum of three data points was used for each analysis.
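The statistical test named above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the inventors' analysis script: the function names are hypothetical, the sample values are invented, and the P-value is obtained by numerically integrating the Student t density with Simpson's rule instead of calling a statistics library.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Unpaired Student's t statistic and degrees of freedom (equal variance)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance from the two sample variances.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return t, df


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value: 1 minus the t density integrated over [-|t|, |t|].

    Uses Simpson's rule with an even number of steps; suitable for the small
    degrees of freedom typical of triplicate data (math.gamma overflows for
    very large df).
    """
    if t == 0:
        return 1.0
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    lo = -abs(t)
    h = 2 * abs(t) / steps
    total = pdf(lo) + pdf(-lo)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(lo + i * h)
    return 1.0 - total * h / 3


# Hypothetical measurements for two age groups (three data points each).
young = [1.0, 2.0, 3.0]
adult = [2.0, 3.0, 4.0]
t_stat, dof = pooled_t(young, adult)
print(t_stat, dof, two_tailed_p(t_stat, dof))
```

With at least three data points per group, as stated above, the smallest possible degrees of freedom is 4.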
- Synthesis of modified uridine diphosphate glucose (UDP-Glu) bearing thiol or azide. The initial success of 5-hmC glycosylation led to the hypothesis that thiol- or azide-modified glucose can be similarly transferred to 5-hmC in duplex DNA. Thus, the inventors have synthesized azide-substituted UDP-Glu and expect to synthesize thiol-substituted UDP-Glu for 5-hmC labeling. An azide tag is preferred since this functional group is not present inside cells. The click chemistry used to label this group is bio-orthogonal, meaning that it is not subject to interference from biological samples (Kolb et al., 2001). An azide-substituted UDP-Glu is shown in FIG. 3. The azide-substituted glucose can be transferred to 5-hmC, as shown below.
Methyl - 1,2,3,4-tetra-O-acetyl-6-azido-6-deoxy-α-D-glucopyranose (IV). To a solution of
methyl - 2,3,4-tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranose (V). To a solution of 1,2,3,4-tetra-O-acetyl-6-azido-6-deoxy-α-D-glucopyranose (IV, 850 mg, 2.78 mmol) in THF (12 mL) was added benzylamine (0.23 mL), and the mixture was stirred at rt overnight. Solvents were removed under reduced pressure, and the resulting residue was dissolved in dichloromethane (50 mL), washed with water, saturated ammonium chloride, and brine, dried over sodium sulfate, and concentrated. The residue was purified by silica gel chromatography, eluting with 1:1 hexane-ethyl acetate, to give V (640 mg, 85%) as a white foam. 1H NMR (500.1 MHz) (CDCl3) δ: 5.50 (m, 2H), 5.03 (t, J=10.5 Hz, 1H), 4.90 (m, 1H), 4.24 (m, 1H), 3.35 (m, 2H), 2.12 (s, 3H), 2.05 (s, 3H), 2.02 (s, 3H). 13C NMR (125.8 MHz) (CDCl3) δ: 170.26, 170.23, 169.78, 89.98, 71.09, 69.71, 68.34, 51.05, 20.68, 20.62, 20.58.
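The yield arithmetic behind this step can be reproduced from the reported masses alone. The sketch below assumes molecular formulas of C14H19N3O9 for IV and C12H17N3O8 for V (derived here from the compound names, not stated in the text), standard average atomic masses, and 1:1 stoichiometry for the anomeric deacetylation; the function names are illustrative.

```python
# Average atomic masses in g/mol, rounded standard values.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed molecular formulas, derived from the compound names.
FORMULA_IV = {"C": 14, "H": 19, "N": 3, "O": 9}  # tetra-O-acetyl azido sugar
FORMULA_V = {"C": 12, "H": 17, "N": 3, "O": 8}   # tri-O-acetyl azido sugar


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def millimoles(mass_mg, formula):
    """Convert a mass in mg to mmol (mg divided by g/mol gives mmol)."""
    return mass_mg / molar_mass(formula)


def percent_yield(product_mg, product, reactant_mg, reactant):
    """Percent yield assuming one mole of product per mole of reactant."""
    return 100.0 * millimoles(product_mg, product) / millimoles(reactant_mg, reactant)


# Masses reported in the procedure: 850 mg of IV gave 640 mg of V.
print(round(percent_yield(640, FORMULA_V, 850, FORMULA_IV), 1))
# Amount of V carried into the following phosphorylation step (390 mg).
print(round(millimoles(390, FORMULA_V), 2))
```

Under these assumptions the mass-based yield rounds to the reported 85%, and 390 mg of V corresponds to the 1.18 mmol quoted in the next procedure.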
- 2,3,4-tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranosyl phosphate mono-triethylamine salt (VI). To a solution of 2,3,4-tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranose (V, 390 mg, 1.18 mmol) in diethyl ether at 0° C. was added triethylamine (1 mL), followed by 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one (239 mg). After stirring at 0° C. for 1 hr, water (1 mL) was added and the mixture was concentrated. The residue was dissolved in THF (20 mL), and the resulting precipitate was removed by filtration. The filtrate was passed through a cation-exchange resin column with additional THF and then concentrated. To a solution of the resulting oil in THF was added iodine (100 mg), the mixture was stirred at rt for 24 hr, and triethylamine was added to neutralize the solution to pH 7. The concentrate was purified on a C18 reverse-phase column to give the phosphate VI. 1H NMR (500.1 MHz) (CD3OD) δ: 5.75 (dd, J=9.5, 4.0 Hz, 1H), 5.52 (m, 1H), 5.14 (t, J=12.0 Hz, 1H), 4.89 (m, 1H), 4.29 (m, 1H), 3.60 (dd, J=17.0, 4.0 Hz, 1H), 3.34 (m, 2H), 3.20 (q, 6H), 2.07 (s, 3H), 2.04 (s, 3H), 2.01 (s, 3H). 31P NMR (202.5 MHz) (CD3OD) δ: −0.94.
- Uridine 5′-(2,3,4-tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranosyl) diphosphate bistriethylammonium salt (UDP-6-N3-UDP). 2,3,4-Tri-O-acetyl-6-azido-6-deoxy-α-D-glucopyranosyl phosphate mono-triethylamine salt (VI, 23 mg, 0.045 mmol) was co-evaporated with dry pyridine (5 mL) three times under reduced pressure. Uridine 5′-monophosphomorpholidate 4-morpholine-N,N′-dicyclohexylcarboxamidine salt (77 mg) and tetrazole (250 μL, 0.45 M) were added, and the mixture was co-evaporated with dry pyridine (5 mL) three times under reduced pressure. The residue was dried under vacuum overnight, and distilled pyridine (5 mL) was added. The mixture was stirred at rt under an argon atmosphere. After three days, pyridine was removed under reduced pressure and the residue was co-evaporated with toluene. Methanol (3.4 mL), an aqueous solution of NH4HCO3 (0.1 M, 4.5 mL), and triethylamine (0.18 mL) were then added to the residue, and the mixture was stirred at 0 for 24 hr. Water (20 mL) was then added and the pH was adjusted to 7.5 with DOWEX 50W (H+ form) resin. The resin was removed by filtration through a PTFE filter and washed with water (10 mL). After concentration, the product was purified by C18 reverse-phase HPLC, eluting with 0-20% CH3CN in 0.1 M TEAA. 1H NMR (500.1 MHz) (D2O) δ: 7.85 (d, J=8.0 Hz, 1H), 5.85 (m, 2H), 5.46 (m, 1H), 4.09-4.66 (m, 7H), 3.63 (m, 2H), 3.43 (m, 2H), 3.37 (Et3NHOAc), 1.78 (Et3NHOAc), 1.15 (Et3NHOAc). 13C NMR (125.8 MHz) (D2O) δ: 181.46 (Et3NHOAc), 166.30, 151.87, 141.61, 102.66, 95.47, 88.38, 83.22, 73.78, 72.66, 71.62, 69.78, 69.61, 64.90, 50.53, 46.68 (Et3NHOAc), 23.25 (Et3NHOAc), 8.22 (Et3NHOAc). 31P NMR (202.5 MHz) (D2O) δ: −10.69 ppm (d, J=20.65 Hz) and −12.49 ppm (d, J=21.01 Hz). HRMS (ESI, negative mode) for C49H57N4NaO9P, [M−H]−: 590.0542 (calcd.); 590.0523 (found).
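A reader who wants to check the HRMS figures can recompute the expected [M−H]− value from a molecular formula. The sketch below is illustrative only: it assumes the neutral, deacetylated UDP-6-azido-glucose free acid has the formula C15H23N5O16P2, which is an assumption made for this example and is not the formula printed in the text, and it uses standard monoisotopic masses.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}
ELECTRON_MASS = 0.00054858

# Assumed neutral formula (hypothetical for this sketch, see lead-in).
ASSUMED_FORMULA = {"C": 15, "H": 23, "N": 5, "O": 16, "P": 2}


def mz_m_minus_h(formula):
    """m/z of the singly charged [M-H]- ion: lose H+, i.e. H minus one electron."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula.items())
    return neutral - MONOISOTOPIC["H"] + ELECTRON_MASS


def ppm_error(found, calculated):
    """Mass accuracy of a measurement in parts per million."""
    return (found - calculated) / calculated * 1e6


calculated = mz_m_minus_h(ASSUMED_FORMULA)
print(round(calculated, 4))
# Reported values from the text: 590.0542 (calcd.) and 590.0523 (found).
print(round(ppm_error(590.0523, 590.0542), 1))
```

Under the assumed formula the computed ion mass agrees with the reported calculated value of 590.0542, and the reported found value lies about 3 ppm below it.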
- Here the inventors describe an example of using a methylation-insensitive restriction enzyme coupled with selective chemical labeling of 5-hmC in a Combined Glycosylation Restriction Analysis (CGRA) to detect 5-hmC in TCGA sequences. This example provides a proof-of-principle demonstration using the methylation-insensitive restriction enzyme TaqαI. This method, which differentiates fully from hemi-hydroxymethylated cytosine in the CpG dinucleotide, adds a new tool to facilitate biological studies of 5-hmC.
- As described herein, the inventors developed a chemical labeling method that uses β-glucosyltransferase (βGT) to selectively label 5-hmC with a glucose, e.g., an azide-modified glucose. The glucose is subsequently coupled to a probe that allows detection of 5-hmC in genomic DNA (Song et al., 2011). With this method, 5-hmC-containing genomic DNA fragments can be enriched and sequenced to provide the genomic distribution of this modification.
- The use of methylation-sensitive restriction enzymes is a classic approach to the study of DNA methylation at specific loci (Singer-Sam et al., 1990). However, due to the similar size of the methyl and hydroxymethyl groups, 5-mC and 5-hmC are indistinguishable to most restriction enzymes (Jin et al., 2010; Nestor et al., 2010; Tardy-Planechaud et al., 1997; Szwagierczak et al., 2011). The inventors contemplated that after adding the bulky glucose group, or a subsequent biotin attachment, to 5-hmC, the glucosylated base would have different properties from 5-mC (Huang et al., 1982). For example, the resulting 5-N3-gmC or biotin-5-N3-gmC in a duplex DNA may be able to block digestion by a methylation-insensitive restriction enzyme that can otherwise digest both 5-hmC- and 5-mC-containing DNA. Zymo Research and New England Biolabs have launched products based on this combined glycosylation restriction analysis (CGRA). They utilize βGT to transfer a regular glucose to 5-hmC and show that it can block the methylation-insensitive restriction enzyme MspI, which has a recognition sequence of C^CGG (Davis and Vaisvila, 2011). Although the use of MspI in CGRA can detect the presence of 5-hmC at CCGG sites, it has two limitations: (i) MspI is also blocked if the outer C is 5-mC or 5-hmC, regardless of the modification status of the inner C, which limits the use of this approach at the many CCGG sites where the outer C is methylated (Tardy-Planechaud et al., 1997); and (ii) it cannot tell whether 5-hmC occurs on only one strand or on both strands of the CpG dinucleotide. The inventors demonstrate that TaqαI, another methylation-insensitive restriction enzyme, which recognizes and cuts T^CGA, can also be used in CGRA when coupled with the inventors' chemical labeling method. This new approach can differentiate fully versus hemi-hydroxymethylated states in the CpG dinucleotide.
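To make the restriction logic concrete, the following sketch locates TCGA and CCGG sites in an oligonucleotide and reports the top-strand fragment lengths expected when a site is cut or when digestion at that site is blocked by a bulky modification. The two 32-mer sequences are hypothetical placeholders written for this example and are not the oligonucleotides used by the inventors; the function names and the simple blocked-site model are likewise assumptions.

```python
def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)  # step by one so overlapping sites are found
    return positions


def fragment_lengths(seq, site, cut_offset, blocked=()):
    """Top-strand fragment lengths after cutting at every unblocked site.

    cut_offset is the number of bases of the site left of the cut, so both
    T^CGA and C^CGG use cut_offset=1. `blocked` holds the start positions of
    sites assumed to be protected by a modification.
    """
    cuts = [p + cut_offset for p in find_sites(seq, site) if p not in blocked]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 32-mers with one central recognition site each.
TAQ_OLIGO = "AGCTTGCATGCCAGTCGAGTCTAGAGGATCCA"  # contains a single TCGA
MSP_OLIGO = "AGCTTGCATGCCAGCCGGGTCTAGAGGATCCA"  # contains a single CCGG

taq_sites = find_sites(TAQ_OLIGO, "TCGA")
print(taq_sites)
print(fragment_lengths(TAQ_OLIGO, "TCGA", 1))                     # site cut
print(fragment_lengths(TAQ_OLIGO, "TCGA", 1, blocked=taq_sites))  # site blocked
print(fragment_lengths(MSP_OLIGO, "CCGG", 1))
```

On a gel, a blocked site would appear as the intact full-length band and a cut site as the two shorter fragments; partial blocking would show a mixture of both.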
- The inventors first synthesized a 32-mer double-stranded DNA bearing T*CGA (*C=5-hmC) on both strands (FIG. 19A). Instead of glucose, the inventors employed βGT to transfer chemically modified 6-N3-glucose onto 5-hmC. The modified DNA was then subjected to TaqαI-mediated digestion. TaqαI completely cuts DNA containing unmodified 5-hmC but only partially cuts DNA containing N3-5-gmC (FIG. 19A, lanes 1-4) (Huang et al., 1982). Because TaqαI is relatively tolerant of cytosine modification, bulkier modifications are required to achieve a satisfactory difference in CGRA. The presence of the azide group on the glucose allows further modifications to be added by click chemistry (Rostovtsev et al., 2002; Sletten and Bertozzi, 2009; Speers and Cravatt, 2004). N3-5-gmC was coupled with dibenzocyclooctyne-modified biotin using copper-free click chemistry to introduce a sterically bulky dibenzocyclooctyne moiety (FIG. 18) (Ning et al., 2008; Jewett and Bertozzi, 2010). When the inventors subjected DNA carrying the biotin-N3-5-gmC modification to TaqαI digestion, digestion was almost completely blocked (FIG. 19A, lanes 5-6).
- Owing to semi-conservative DNA replication, a hemi-methylation state exists in the mammalian genome in addition to the full methylation state. The conversion of 5-mC to 5-hmC suggests that fully and hemi-hydroxymethylated states may also exist in the mammalian genome. If so, a method that distinguishes between these two states could be used to understand the formation of 5-hmC and the conversion process between 5-mC and 5-hmC. Since the blocking efficiency of TaqαI depends largely on the size of the modification group on the hydroxyl group, TaqαI was tested to see whether it behaves differently toward fully and hemi-hydroxymethylated states. The inventors prepared the same 32-mer double-stranded DNA with hemi-hydroxymethylation, performed the same labeling procedure, and subjected it to TaqαI digestion (FIG. 19B). While hemi-5-hmC can be cut, hemi-N3-5-gmC cannot block digestion as well as the fully modified site (FIG. 19B, lanes 1-4). Even with the bulkier group, biotin-N3-5-gmC, present, the majority of the DNA was still digested (FIG. 19B, lanes 5-6). Thus, TaqαI digests the hemi-modified sequence but is blocked by the fully modified one bearing biotin-N3-5-gmC. This noticeable difference in the response of TaqαI to fully and hemi-hydroxymethylated states after modification provides a method to distinguish these two states at TCGA sites by comparing their sensitivity to restriction.
- To further investigate whether the difference in digestion between fully and hemi-hydroxymethylated states is universal among restriction enzymes, the inventors replaced the T*CGA site with C*CGG (*C=5-hmC) in the previous 32-mer duplex DNA and performed the same assays using MspI (FIG. 20). Both fully and hemi-5-hmC sites are cut completely by MspI before modification (FIG. 20A, lanes 1-2 and FIG. 20B, lanes 1-2). For a fully 5-hmC site, both N3-5-gmC and biotin-N3-5-gmC block MspI digestion completely (FIG. 20A, lanes 3-6), suggesting that MspI is more sensitive to the steric hindrance of cytosine modification in the CpG dinucleotide; the presence of a glucose group on the inner C is enough to protect the site from digestion. The hemi-5-hmC site gave the same results as the fully modified one: both N3-5-gmC and biotin-N3-5-gmC protect the site from digestion (FIG. 20B, lanes 3-6). This result is in accordance with the assumption that MspI is much easier to block and that, unlike TaqαI, it cannot distinguish a fully modified 5-hmC site from a hemi-5-hmC site.
- The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.
- Abdel-Wahab et al., Blood, 114:144-147, 2009.
- Baskin et al., Proc. Natl. Acad. Sci. USA, 104:16793-16797, 2007.
- Clark et al., Nucleic Acids Res. 22:2990-2997, 1994.
- Crain, Methods Enzymol., 193:782-790, 1990.
- Davis and Vaisvila, J. Vis. Exp., (48), 2011.
- Donnelly et al., Protein Expr. Purif., 47(2):446-454, 2006.
- Edwards et al., Genome Res., 20:972-980, 2010.
- Evans, Austral. J. Chem., 60(6):384-395, 2007.
- Flusberg et al., Nat. Methods, 7:461-465, 2010.
- Frommer et al., Proc. Natl. Acad. Sci. USA, 89:1827-1831, 1992.
- Geier and Modrich, J. Biol. Chem., 254:1408-1413, 1979.
- Goldberg et al., Cell, 128:635-638, 2007.
- Hausinger, Crit. Rev. Biochem. Mol. Biol., 39:21-68, 2004.
- Hayatsu and Shiragami, Biochemistry, 18:632-637, 1979.
- Hayatsu et al., Biochemistry, 9:2858-2865, 1970.
- Hein et al., Pharmaceut. Res., 25(10):2216-2230, 2008.
- Huang et al., Nucleic Acids Res., 10:1579, 1982.
- Huang et al., PLoS One, 5:e8888, 2010.
- Ito et al., Nature, 466:1129-1133, 2010.
- Jewett and Bertozzi, Chem. Soc. Rev., 39:1272, 2010.
- Jin et al., Nucleic Acids Res., 38:e125, 2010.
- Josse and Kornberg, J. Biol. Chem., 237:1968-1976, 1962.
- Jung and Miller, J. Am. Chem. Soc., 103:1984-1992, 1981.
- Kohlmann et al., In: American Society of Hematology, Abstract 417, LA, December 2009.
- Kolb et al., Angew. Chem. Int. Ed., 40:2004-2021, 2001.
- Kriaucionis and Heintz, Science, 324:929-930, 2009.
- Langmead et al., Genome Biol., 10:R25, 2009.
- Lariviere and Morera, J. Biol. Chem., 279:34715-34720, 2004.
- Lee et al., Hum. Mol. Genet., 18:2567-2574, 2009b.
- Lee et al., Hum. Mol. Genet., 18:835-846, 2009a.
- Li and Durbin, Bioinformatics, 25:1754-1760, 2009.
- Lim et al., Cell, 125:801-814, 2006.
- Lister et al., Nature, 462:315-322, 2009.
- Lyko et al., Nature, 408:538-540, 2000.
- Margulies et al., Nature, 437:376-380, 2005.
- Marinus and Morris, J. Bacteriol., 114:1143-1150, 1973.
- Maunakea et al., Nature, 466:253-257, 2010.
- May and Hattman, J. Bacteriol., 123:768-770, 1975.
- Meissner et al., Nature, 454:766-770, 2008.
- Meyer et al., ChemBioChem, 4:610-614, 2003.
- Moses and Moorhouse, Chem. Soc. Rev., (36):1249-1262, 2007.
- Munzel et al., Angew. Chem. Int. Ed., 49:5375-5377, 2010.
- Nestor et al., BioTechniques, 48:317, 2010.
- Ning et al., Angew. Chem. Int. Ed., 47:2253-2255, 2008.
- Rostovtsev et al., Angew. Chem., Int. Ed., 41:2596, 2002.
- Siegfried and Cedar, Curr. Biol., 7:r305-307, 1997.
- Silva et al., PLoS Biol., 6:e253, 2008.
- Singer-Sam et al., Mol. Cell. Biol., 10:4987, 1990.
- Skene et al., Mol. Cell, 37:457-468, 2010.
- Sletten and Bertozzi, Angew. Chem. Int. Ed., 48:6974-6998, 2009.
- Smith et al., In: American Society of Hematology, Abstract 733, LA, December 2009.
- Song et al., Nat. Biotechnol., 29:68, 2011.
- Speers and Cravatt, Chem. Biol., 11:535-546, 2004.
- Szpurka et al., In: American Society of Hematology, Abstract 2908, LA, December 2009.
- Szulwach et al., J. Cell Biol., 189:127-141, 2010.
- Szwagierczak et al., Nucleic Acids Res., 2011 (ahead of Pub)
- Szwagierczak et al., Nucleic Acids Res., 38:e181, 2010.
- Tahiliani et al., Science, 324:930-935, 2009.
- Tanabe et al., J. Am. Chem. Soc., 129:8034-8040, 2007.
- Tardy-Planechaud et al., Nucleic Acids Res., 25:553, 1997.
- Thomas et al., Genome Res., 13:2129-2141, 2003.
- Thomas et al., Nucleic Acids Res., 34:W645-650, 2006.
- Tornoe et al., J. Organic Chem., 67(9):3057-3064, 2002.
- Weber et al., Nature Genetics, 37:853-862, 2005.
- Wu and Zhang, Nat. Rev. Mol. Cell Biol., 11:607-620, 2010.
- Zeschnigk et al., Hum. Mol. Genet., 6:387-395, 1997.
- Zhang et al., Genome Biol., 9:R137, 2008.
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/713,657 US20200102616A1 (en) | 2010-04-06 | 2019-12-13 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5 HYDROXYMETHYLCYTOSINE (5-hmC) |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US32119810P | 2010-04-06 | 2010-04-06 | |
PCT/US2011/031370 WO2011127136A1 (en) | 2010-04-06 | 2011-04-06 | Composition and methods related to modification of 5-hydroxymethylcytosine (5-hmc) |
US13/095,505 US8741567B2 (en) | 2010-04-06 | 2011-04-27 | Composition and methods related to modification of 5-hydroxymethylcytosine (5-hmC) |
US14/267,727 US20150056616A1 (en) | 2010-04-06 | 2014-05-01 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5-HYDROXYMETHYLCYTOSINE (5-hmC) |
US16/713,657 US20200102616A1 (en) | 2010-04-06 | 2019-12-13 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5 HYDROXYMETHYLCYTOSINE (5-hmC) |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/267,727 Continuation US20150056616A1 (en) | 2010-04-06 | 2014-05-01 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5-HYDROXYMETHYLCYTOSINE (5-hmC) |
Publications (1)
Publication Number | Publication Date |
---|---|
US20200102616A1 true US20200102616A1 (en) | 2020-04-02 |
Family
ID=44763258
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/095,505 Active US8741567B2 (en) | 2010-04-06 | 2011-04-27 | Composition and methods related to modification of 5-hydroxymethylcytosine (5-hmC) |
US14/267,727 Abandoned US20150056616A1 (en) | 2010-04-06 | 2014-05-01 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5-HYDROXYMETHYLCYTOSINE (5-hmC) |
US16/713,657 Pending US20200102616A1 (en) | 2010-04-06 | 2019-12-13 | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5 HYDROXYMETHYLCYTOSINE (5-hmC) |
Country Status (2)
Country | Link |
---|---|
US (3) | US8741567B2 (en) |
WO (1) | WO2011127136A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023278528A1 (en) * | 2021-07-02 | 2023-01-05 | Enzo Biochem, Inc. | Method for detecting and quantifying dna methylation in a selected locus or region of dna |
US11608518B2 (en) | 2020-07-30 | 2023-03-21 | Cambridge Epigenetix Limited | Methods for analyzing nucleic acids |
Families Citing this family (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010037001A2 (en) | 2008-09-26 | 2010-04-01 | Immune Disease Institute, Inc. | Selective oxidation of 5-methylcytosine by tet-family proteins |
EP2470675B1 (en) | 2009-08-25 | 2016-03-30 | New England Biolabs, Inc. | Detection and quantification of hydroxymethylated nucleotides in a polynucleotide preparation |
WO2011127136A1 (en) * | 2010-04-06 | 2011-10-13 | University Of Chicago | Composition and methods related to modification of 5-hydroxymethylcytosine (5-hmc) |
US20120064521A1 (en) * | 2010-09-09 | 2012-03-15 | James Yen | Detection of dna hydroxymethylation |
EP2694686B2 (en) | 2011-04-06 | 2023-07-19 | The University of Chicago | COMPOSITION AND METHODS RELATED TO MODIFICATION OF 5-METHYLCYTOSINE (5mC) |
WO2012149047A1 (en) * | 2011-04-29 | 2012-11-01 | Sequenom, Inc. | Multimer glycosylated nucleic acid binding protein conjugates and uses thereof |
US9290807B2 (en) | 2011-07-29 | 2016-03-22 | Cambridge Epigenetix Limited | Methods for detection of nucleotide modification |
CN103131754B (en) * | 2011-11-24 | 2014-07-30 | 深圳华大基因科技服务有限公司 | Method for detecting nucleic acid hydroxylmethylation modification, and application thereof |
US10023909B2 (en) | 2011-12-13 | 2018-07-17 | Oslo Universitetssykehus Hf | Methods and kits for detection of methylation status |
WO2013131981A1 (en) | 2012-03-08 | 2013-09-12 | Novartis Ag | Predictive markers useful in the diagnosis and treatment of fragile x syndrome (fxs) |
US9040239B2 (en) | 2012-03-15 | 2015-05-26 | New England Biolabs, Inc. | Composition and methods of oxygenation of nucleic acids containing 5-methylpyrimidine |
US10081827B2 (en) | 2012-03-15 | 2018-09-25 | New England Biolabs, Inc. | Mapping cytosine modifications |
US9297806B2 (en) | 2012-08-01 | 2016-03-29 | The Johns Hopkins University | 5-hydroxymethylcytosine in human cancer |
WO2014062431A1 (en) * | 2012-10-18 | 2014-04-24 | Nexcelom Bioscience Llc | Automated yeast budding measurement |
GB2523919B (en) * | 2012-11-06 | 2017-07-05 | New England Biolabs Inc | Mapping cytosine modifications |
ES2669512T3 (en) | 2012-11-30 | 2018-05-28 | Cambridge Epigenetix Limited | Oxidizing agent for modified nucleotides |
KR20160011670A (en) | 2013-05-28 | 2016-02-01 | 라모트 앳 텔-아비브 유니버시티 리미티드 | Detection of hydroxymethylcytosine bases |
WO2015009844A2 (en) * | 2013-07-16 | 2015-01-22 | Zymo Research Corp. | Mirror bisulfite analysis |
WO2015021282A1 (en) * | 2013-08-09 | 2015-02-12 | New England Biolabs, Inc. | Detecting, sequencing and/or mapping 5-hydroxymethylcytosine and 5-formylcytosine at single-base resolution |
GB201403216D0 (en) | 2014-02-24 | 2014-04-09 | Cambridge Epigenetix Ltd | Nucleic acid sample preparation |
US20180327855A1 (en) * | 2015-11-11 | 2018-11-15 | Ramot At Tel-Aviv University Ltd. | Methods of detecting 5-hydroxymethylcytosine and diagnosing of cancer |
WO2017176630A1 (en) | 2016-04-07 | 2017-10-12 | The Board Of Trustees Of The Leland Stanford Junior University | Noninvasive diagnostics by sequencing 5-hydroxymethylated cell-free dna |
WO2018129120A1 (en) * | 2017-01-04 | 2018-07-12 | The University Of Chicago | Methods for detecting cytosine modifications |
US11130991B2 (en) | 2017-03-08 | 2021-09-28 | The University Of Chicago | Method for highly sensitive DNA methylation analysis |
MX2020007259A (en) | 2018-01-08 | 2022-08-26 | Ludwig Inst For Cancer Res Ltd | Bisulfite-free, base-resolution identification of cytosine modifications. |
WO2019160994A1 (en) | 2018-02-14 | 2019-08-22 | Bluestar Genomics, Inc. | Methods for the epigenetic analysis of dna, particularly cell-free dna |
WO2019191429A2 (en) * | 2018-03-28 | 2019-10-03 | Board Of Regents, The University Of Texas System | Identification of epigenetic alterations in dna isolated from exosomes |
CN112236520A (en) | 2018-04-02 | 2021-01-15 | 格里尔公司 | Methylation signatures and target methylation probe plates |
CN112204666A (en) | 2018-04-13 | 2021-01-08 | 格里尔公司 | Multiple assay predictive model for cancer detection |
WO2019232435A1 (en) | 2018-06-01 | 2019-12-05 | Grail, Inc. | Convolutional neural network systems and methods for data classification |
WO2020023842A1 (en) | 2018-07-27 | 2020-01-30 | The University Of Chicago | Methods for the amplification of bisulfite-treated dna |
CA3112880A1 (en) * | 2018-09-19 | 2020-03-26 | Bluestar Genomics, Inc. | Cell-free dna hydroxymethylation profiles in the evaluation of pancreatic lesions |
EP3856903A4 (en) | 2018-09-27 | 2022-07-27 | Grail, LLC | Methylation markers and targeted methylation probe panel |
MX2021003847A (en) | 2018-10-04 | 2021-05-27 | Bluestar Genomics Inc | Simultaneous, sequencing-based analysis of proteins, nucleosomes, and cell-free nucleic acids from a single biological sample. |
US11581062B2 (en) | 2018-12-10 | 2023-02-14 | Grail, Llc | Systems and methods for classifying patients with respect to multiple cancer classes |
WO2020132151A1 (en) | 2018-12-19 | 2020-06-25 | Grail, Inc. | Cancer tissue source of origin prediction with multi-tier analysis of small variants in cell-free dna samples |
PL3914736T3 (en) | 2019-01-25 | 2024-06-17 | Grail, Llc | Detecting cancer, cancer tissue of origin, and/or a cancer cell type |
CA3127894A1 (en) | 2019-02-05 | 2020-08-13 | Grail, Inc. | Detecting cancer, cancer tissue of origin, and/or a cancer cell type |
WO2020163410A1 (en) | 2019-02-05 | 2020-08-13 | Grail, Inc. | Detecting cancer, cancer tissue of origin, and/or a cancer cell type |
CA3136204A1 (en) | 2019-05-13 | 2020-11-19 | Grail, Inc. | Model-based featurization and classification |
WO2021041968A1 (en) | 2019-08-28 | 2021-03-04 | Grail, Inc. | Systems and methods for predicting and monitoring treatment response from cell-free nucleic acids |
WO2021257854A1 (en) | 2020-06-20 | 2021-12-23 | Grail, Inc. | Detection and classification of human papillomavirus associated cancers |
US20230357852A1 (en) | 2020-08-19 | 2023-11-09 | Mayo Foundation For Medical Education And Research | Detecting non-hodgkin lymphoma |
AU2022213409A1 (en) | 2021-01-29 | 2023-08-17 | Exact Sciences Corporation | Detecting the presence or absence of multiple types of cancer |
CN112858419B (en) * | 2021-02-26 | 2021-11-23 | 山东农业大学 | Method for detecting 5-hydroxymethylcytosine by constructing photoelectrochemical sensor |
EP4373967A1 (en) * | 2021-07-20 | 2024-05-29 | Freenome Holdings, Inc. | Compositions and methods for improved 5-hydroxymethylated cytosine resolution in nucleic acid sequencing |
CN113637752B (en) * | 2021-07-21 | 2023-07-18 | 中山大学 | Whole genome whole 5hmC detection method and application thereof |
IL314574A (en) | 2022-02-17 | 2024-09-01 | Grail Llc | Tumor fraction estimation using methylation variants |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010037001A2 (en) * | 2008-09-26 | 2010-04-01 | Immune Disease Institute, Inc. | Selective oxidation of 5-methylcytosine by tet-family proteins |
US8741567B2 (en) * | 2010-04-06 | 2014-06-03 | The University Of Chicago | Composition and methods related to modification of 5-hydroxymethylcytosine (5-hmC) |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5558096A (en) * | 1978-10-25 | 1980-04-30 | Noda Sangyo Kagaku Kenkyusho | Method of making novel recombined dna |
US5281524A (en) * | 1988-01-22 | 1994-01-25 | Kanebo, Ltd. | Cell-associated glucosyltransferase from Streptococcus mutans serotype, c, e or f |
US5075216A (en) * | 1988-09-23 | 1991-12-24 | Cetus Corporation | Methods for dna sequencing with thermus aquaticus dna polymerase |
DE4407423A1 (en) * | 1994-03-05 | 1995-09-07 | Boehringer Mannheim Gmbh | Anti-interference agent for use in immunoassays |
US6043062A (en) * | 1995-02-17 | 2000-03-28 | The Regents Of The University Of California | Constitutively active phosphatidylinositol 3-kinase and uses thereof |
ZA973642B (en) * | 1996-04-26 | 1997-11-25 | Merck & Co Inc | DNA vaccine formulations. |
AU2965500A (en) * | 1999-01-15 | 2000-08-01 | Gene Logic, Inc. | Immobilized nucleic acid hybridization reagent and method |
US6913895B1 (en) * | 1999-08-17 | 2005-07-05 | Advanced Medicine East, Inc. | Methods for assaying transglycosylase reactions, and for identifying inhibitors thereof |
WO2002048331A2 (en) * | 2000-12-13 | 2002-06-20 | Memorial Sloan-Kettering Cancer Center | Active-site engineering of nucleotidyltransferases and enzymatic methods for the synthesis of natural and 'unnatural' udp- and nucleotide sugars |
AU2002365404A1 (en) * | 2001-11-27 | 2003-06-10 | Compound Therapeutics, Inc. | Solid-phase immobilization of proteins and peptides |
ES2334220T3 (en) * | 2002-10-11 | 2010-03-08 | Astellas Pharma Europe B.V. | GLUCOSE-BASED COMPOUNDS WITH AFFINITY FOR P-SELECTIVE. |
US20060160110A1 (en) * | 2004-12-02 | 2006-07-20 | Takayuki Mizutani | Methods of designing small interfering RNAs, antisense polynucleotides, and other hybridizing polynucleotides |
US20090148850A1 (en) * | 2007-10-04 | 2009-06-11 | Stacia Kargman | Methods for identifying modulators of P2RY14 |
US7608402B2 (en) * | 2008-01-10 | 2009-10-27 | Weiwei Li | DNA methylation specific signal amplification |
- 2011-04-06 WO PCT/US2011/031370 patent/WO2011127136A1/en active Application Filing
- 2011-04-27 US US13/095,505 patent/US8741567B2/en active Active
- 2014-05-01 US US14/267,727 patent/US20150056616A1/en not_active Abandoned
- 2019-12-13 US US16/713,657 patent/US20200102616A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US20150056616A1 (en) | 2015-02-26 |
WO2011127136A1 (en) | 2011-10-13 |
US20110301045A1 (en) | 2011-12-08 |
US8741567B2 (en) | 2014-06-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: THE UNIVERSITY OF CHICAGO, ILLINOIS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HE, CHUAN;SONG, CHUNXIAO;SIGNING DATES FROM 20140618 TO 20140708;REEL/FRAME:051277/0331 |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
 | STCV | Information on status: appeal procedure | Free format text: NOTICE OF APPEAL FILED |
 | STCV | Information on status: appeal procedure | Free format text: APPEAL BRIEF (OR SUPPLEMENTAL BRIEF) ENTERED AND FORWARDED TO EXAMINER |
 | STCV | Information on status: appeal procedure | Free format text: EXAMINER'S ANSWER TO APPEAL BRIEF MAILED |
 | STCV | Information on status: appeal procedure | Free format text: ON APPEAL -- AWAITING DECISION BY THE BOARD OF APPEALS |