WO2024173277A2 - DELFI-derived cell-free DNA fragmentation patterns differentiate histologic subtypes of lung cancers in a non-invasive manner
- Publication number
- WO2024173277A2 (PCT/US2024/015444)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cfDNA
- coverage
- transcription factor
- fragment
- cancer
- Prior art date
- 238000013467 fragmentation Methods 0.000 title claims description 124
- 238000006062 fragmentation reaction Methods 0.000 title claims description 124
- 208000020816 lung neoplasm Diseases 0.000 title description 11
- 230000002962 histologic effect Effects 0.000 title description 3
- 238000000034 method Methods 0.000 claims abstract description 143
- 230000000955 neuroendocrine Effects 0.000 claims abstract description 54
- 208000002154 non-small cell lung carcinoma Diseases 0.000 claims abstract description 44
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 claims abstract description 43
- 238000012070 whole genome sequencing analysis Methods 0.000 claims abstract description 34
- 239000012634 fragment Substances 0.000 claims description 277
- 206010041067 Small cell lung cancer Diseases 0.000 claims description 176
- 208000000587 small cell lung carcinoma Diseases 0.000 claims description 168
- 206010028980 Neoplasm Diseases 0.000 claims description 148
- 108091023040 Transcription factor Proteins 0.000 claims description 128
- 102000040945 Transcription factor Human genes 0.000 claims description 128
- 201000011510 cancer Diseases 0.000 claims description 109
- 238000011282 treatment Methods 0.000 claims description 55
- 238000012163 sequencing technique Methods 0.000 claims description 47
- 208000009956 adenocarcinoma Diseases 0.000 claims description 38
- 206010041823 squamous cell carcinoma Diseases 0.000 claims description 37
- 238000009826 distribution Methods 0.000 claims description 34
- 101000901099 Homo sapiens Achaete-scute homolog 1 Proteins 0.000 claims description 29
- 238000012545 processing Methods 0.000 claims description 29
- 102100022142 Achaete-scute homolog 1 Human genes 0.000 claims description 26
- 230000004913 activation Effects 0.000 claims description 26
- 238000012937 correction Methods 0.000 claims description 26
- 239000003814 drug Substances 0.000 claims description 26
- 238000009169 immunotherapy Methods 0.000 claims description 25
- 229940124597 therapeutic agent Drugs 0.000 claims description 25
- 230000007423 decrease Effects 0.000 claims description 24
- 238000010801 machine learning Methods 0.000 claims description 23
- 238000013507 mapping Methods 0.000 claims description 20
- 238000011156 evaluation Methods 0.000 claims description 15
- 230000004044 response Effects 0.000 claims description 15
- 101000775102 Homo sapiens Transcriptional coactivator YAP1 Proteins 0.000 claims description 13
- 102100031873 Transcriptional coactivator YAP1 Human genes 0.000 claims description 13
- 239000000654 additive Substances 0.000 claims description 13
- 230000000996 additive effect Effects 0.000 claims description 13
- 238000004364 calculation method Methods 0.000 claims description 12
- 108700028369 Alleles Proteins 0.000 claims description 9
- 241000124008 Mammalia Species 0.000 description 111
- 239000000523 sample Substances 0.000 description 54
- 108020004414 DNA Proteins 0.000 description 41
- 238000004458 analytical method Methods 0.000 description 20
- 210000004369 blood Anatomy 0.000 description 19
- 239000008280 blood Substances 0.000 description 19
- 230000004075 alteration Effects 0.000 description 17
- 239000000463 material Substances 0.000 description 16
- 238000012544 monitoring process Methods 0.000 description 15
- 238000013459 approach Methods 0.000 description 14
- 201000010099 disease Diseases 0.000 description 13
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 13
- 238000002203 pretreatment Methods 0.000 description 12
- 210000002381 plasma Anatomy 0.000 description 10
- 210000001519 tissue Anatomy 0.000 description 10
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 8
- 238000004590 computer program Methods 0.000 description 8
- 201000005202 lung cancer Diseases 0.000 description 8
- 238000003860 storage Methods 0.000 description 8
- 108700009124 Transcription Initiation Site Proteins 0.000 description 7
- 239000012472 biological sample Substances 0.000 description 7
- 238000002512 chemotherapy Methods 0.000 description 7
- 230000002759 chromosomal effect Effects 0.000 description 7
- 238000003745 diagnosis Methods 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 238000000513 principal component analysis Methods 0.000 description 7
- 210000000349 chromosome Anatomy 0.000 description 6
- 238000004891 communication Methods 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 230000014509 gene expression Effects 0.000 description 6
- 206010009944 Colon cancer Diseases 0.000 description 5
- 210000001744 T-lymphocyte Anatomy 0.000 description 5
- 238000003556 assay Methods 0.000 description 5
- 238000001574 biopsy Methods 0.000 description 5
- 210000004027 cell Anatomy 0.000 description 5
- 239000012530 fluid Substances 0.000 description 5
- 230000011132 hemopoiesis Effects 0.000 description 5
- 230000015654 memory Effects 0.000 description 5
- 230000035772 mutation Effects 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 241000282412 Homo Species 0.000 description 4
- 230000001973 epigenetic effect Effects 0.000 description 4
- -1 erlotinib hydrochlorides Chemical class 0.000 description 4
- 238000011528 liquid biopsy Methods 0.000 description 4
- 150000007523 nucleic acids Chemical group 0.000 description 4
- 238000001959 radiotherapy Methods 0.000 description 4
- 230000035945 sensitivity Effects 0.000 description 4
- 238000001356 surgical procedure Methods 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- 206010004593 Bile duct cancer Diseases 0.000 description 3
- 206010006187 Breast cancer Diseases 0.000 description 3
- 208000026310 Breast neoplasm Diseases 0.000 description 3
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 3
- 206010033128 Ovarian cancer Diseases 0.000 description 3
- 206010061535 Ovarian neoplasm Diseases 0.000 description 3
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 3
- 238000003559 RNA-seq method Methods 0.000 description 3
- 208000005718 Stomach Neoplasms Diseases 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 208000026900 bile duct neoplasm Diseases 0.000 description 3
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 3
- 208000006990 cholangiocarcinoma Diseases 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000004069 differentiation Effects 0.000 description 3
- 229950009791 durvalumab Drugs 0.000 description 3
- 206010017758 gastric cancer Diseases 0.000 description 3
- 238000003384 imaging method Methods 0.000 description 3
- 229940043355 kinase inhibitor Drugs 0.000 description 3
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 102000039446 nucleic acids Human genes 0.000 description 3
- 108020004707 nucleic acids Proteins 0.000 description 3
- 229960000572 olaparib Drugs 0.000 description 3
- FAQDUNYVKQKNLD-UHFFFAOYSA-N olaparib Chemical compound FC1=CC=C(CC2=C3[CH]C=CC=C3C(=O)N=N2)C=C1C(=O)N(CC1)CCN1C(=O)C1CC1 FAQDUNYVKQKNLD-UHFFFAOYSA-N 0.000 description 3
- 201000002528 pancreatic cancer Diseases 0.000 description 3
- 208000008443 pancreatic carcinoma Diseases 0.000 description 3
- 239000003757 phosphotransferase inhibitor Substances 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 108090000765 processed proteins & peptides Proteins 0.000 description 3
- 102000004196 processed proteins & peptides Human genes 0.000 description 3
- 208000037821 progressive disease Diseases 0.000 description 3
- 108090000623 proteins and genes Proteins 0.000 description 3
- 201000011549 stomach cancer Diseases 0.000 description 3
- 230000004083 survival effect Effects 0.000 description 3
- 238000002560 therapeutic procedure Methods 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- 241000283690 Bos taurus Species 0.000 description 2
- 241000282472 Canis lupus familiaris Species 0.000 description 2
- 241000282693 Cercopithecidae Species 0.000 description 2
- AOJJSUZBOXZQNB-TZSSRYMLSA-N Doxorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-TZSSRYMLSA-N 0.000 description 2
- 241000283086 Equidae Species 0.000 description 2
- 241000282326 Felis catus Species 0.000 description 2
- 206010061218 Inflammation Diseases 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- 241000699670 Mus sp. Species 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- 241000288906 Primates Species 0.000 description 2
- 206010036790 Productive cough Diseases 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 102000006382 Ribonucleases Human genes 0.000 description 2
- 108010083644 Ribonucleases Proteins 0.000 description 2
- 241000282887 Suidae Species 0.000 description 2
- 208000036878 aneuploidy Diseases 0.000 description 2
- 231100001075 aneuploidy Toxicity 0.000 description 2
- 239000002246 antineoplastic agent Substances 0.000 description 2
- 230000006037 cell lysis Effects 0.000 description 2
- 210000002939 cerumen Anatomy 0.000 description 2
- 230000000875 corresponding effect Effects 0.000 description 2
- 229940127089 cytotoxic agent Drugs 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 239000003599 detergent Substances 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 210000004251 human milk Anatomy 0.000 description 2
- 235000020256 human milk Nutrition 0.000 description 2
- 230000004054 inflammatory process Effects 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 210000000265 leukocyte Anatomy 0.000 description 2
- 210000004698 lymphocyte Anatomy 0.000 description 2
- 230000036210 malignancy Effects 0.000 description 2
- 206010061289 metastatic neoplasm Diseases 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 238000010837 poor prognosis Methods 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 210000003296 saliva Anatomy 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 210000003802 sputum Anatomy 0.000 description 2
- 208000024794 sputum Diseases 0.000 description 2
- 230000000153 supplemental effect Effects 0.000 description 2
- 239000004094 surface-active agent Substances 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 210000003171 tumor-infiltrating lymphocyte Anatomy 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- FDKXTQMXEQVLRF-ZHACJKMWSA-N (E)-dacarbazine Chemical compound CN(C)\N=N\c1[nH]cnc1C(N)=O FDKXTQMXEQVLRF-ZHACJKMWSA-N 0.000 description 1
- VSNHCAURESNICA-NJFSPNSNSA-N 1-oxidanylurea Chemical compound N[14C](=O)NO VSNHCAURESNICA-NJFSPNSNSA-N 0.000 description 1
- NDMPLJNOPCLANR-UHFFFAOYSA-N 3,4-dihydroxy-15-(4-hydroxy-18-methoxycarbonyl-5,18-seco-ibogamin-18-yl)-16-methoxy-1-methyl-6,7-didehydro-aspidospermidine-3-carboxylic acid methyl ester Natural products C1C(CC)(O)CC(CC2(C(=O)OC)C=3C(=CC4=C(C56C(C(C(O)C7(CC)C=CCN(C67)CC5)(O)C(=O)OC)N4C)C=3)OC)CN1CCC1=C2NC2=CC=CC=C12 NDMPLJNOPCLANR-UHFFFAOYSA-N 0.000 description 1
- AOJJSUZBOXZQNB-VTZDEGQISA-N 4'-epidoxorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-VTZDEGQISA-N 0.000 description 1
- IDPUKCWIGUEADI-UHFFFAOYSA-N 5-[bis(2-chloroethyl)amino]uracil Chemical compound ClCCN(CCCl)C1=CNC(=O)NC1=O IDPUKCWIGUEADI-UHFFFAOYSA-N 0.000 description 1
- NMUSYJAQQFHJEW-KVTDHHQDSA-N 5-azacytidine Chemical compound O=C1N=C(N)N=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 NMUSYJAQQFHJEW-KVTDHHQDSA-N 0.000 description 1
- WYWHKKSPHMUBEB-UHFFFAOYSA-N 6-Mercaptoguanine Natural products N1C(N)=NC(=S)C2=C1N=CN2 WYWHKKSPHMUBEB-UHFFFAOYSA-N 0.000 description 1
- STQGQHZAVUOBTE-UHFFFAOYSA-N 7-Cyan-hept-2t-en-4,6-diinsaeure Natural products C1=2C(O)=C3C(=O)C=4C(OC)=CC=CC=4C(=O)C3=C(O)C=2CC(O)(C(C)=O)CC1OC1CC(N)C(O)C(C)O1 STQGQHZAVUOBTE-UHFFFAOYSA-N 0.000 description 1
- 206010069754 Acquired gene mutation Diseases 0.000 description 1
- 206010003445 Ascites Diseases 0.000 description 1
- 102000008096 B7-H1 Antigen Human genes 0.000 description 1
- 108010074708 B7-H1 Antigen Proteins 0.000 description 1
- 108010006654 Bleomycin Proteins 0.000 description 1
- COVZYZSDYWQREU-UHFFFAOYSA-N Busulfan Chemical compound CS(=O)(=O)OCCCCOS(C)(=O)=O COVZYZSDYWQREU-UHFFFAOYSA-N 0.000 description 1
- GAGWJHPBXLXJQN-UORFTKCHSA-N Capecitabine Chemical compound C1=C(F)C(NC(=O)OCCCCC)=NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](C)O1 GAGWJHPBXLXJQN-UORFTKCHSA-N 0.000 description 1
- GAGWJHPBXLXJQN-UHFFFAOYSA-N Capecitabine Natural products C1=C(F)C(NC(=O)OCCCCC)=NC(=O)N1C1C(O)C(O)C(C)O1 GAGWJHPBXLXJQN-UHFFFAOYSA-N 0.000 description 1
- 206010050337 Cerumen impaction Diseases 0.000 description 1
- 108010019670 Chimeric Antigen Receptors Proteins 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 208000005443 Circulating Neoplastic Cells Diseases 0.000 description 1
- CMSMOCZEIVJLDB-UHFFFAOYSA-N Cyclophosphamide Chemical compound ClCCN(CCCl)P1(=O)NCCCO1 CMSMOCZEIVJLDB-UHFFFAOYSA-N 0.000 description 1
- UHDGCWIWMRVCDJ-CCXZUQQUSA-N Cytarabine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](CO)O1 UHDGCWIWMRVCDJ-CCXZUQQUSA-N 0.000 description 1
- 238000007399 DNA isolation Methods 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- HTIJFSOGRVMCQR-UHFFFAOYSA-N Epirubicin Natural products COc1cccc2C(=O)c3c(O)c4CC(O)(CC(OC5CC(N)C(=O)C(C)O5)c4c(O)c3C(=O)c12)C(=O)CO HTIJFSOGRVMCQR-UHFFFAOYSA-N 0.000 description 1
- GHASVSINZRGABV-UHFFFAOYSA-N Fluorouracil Chemical compound FC1=CNC(=O)NC1=O GHASVSINZRGABV-UHFFFAOYSA-N 0.000 description 1
- 101001139134 Homo sapiens Krueppel-like factor 4 Proteins 0.000 description 1
- 101000572976 Homo sapiens POU domain, class 2, transcription factor 3 Proteins 0.000 description 1
- XDXDZDZNSLXDNA-TZNDIEGXSA-N Idarubicin Chemical compound C1[C@H](N)[C@H](O)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2C[C@@](O)(C(C)=O)C1 XDXDZDZNSLXDNA-TZNDIEGXSA-N 0.000 description 1
- XDXDZDZNSLXDNA-UHFFFAOYSA-N Idarubicin Natural products C1C(N)C(O)C(C)OC1OC1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2CC(O)(C(C)=O)C1 XDXDZDZNSLXDNA-UHFFFAOYSA-N 0.000 description 1
- 229940076838 Immune checkpoint inhibitor Drugs 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 206010062717 Increased upper airway secretion Diseases 0.000 description 1
- 102000037984 Inhibitory immune checkpoint proteins Human genes 0.000 description 1
- 108091008026 Inhibitory immune checkpoint proteins Proteins 0.000 description 1
- 102100020677 Krueppel-like factor 4 Human genes 0.000 description 1
- 239000005551 L01XE03 - Erlotinib Substances 0.000 description 1
- GQYIWUVLTXOXAJ-UHFFFAOYSA-N Lomustine Chemical compound ClCCN(N=O)C(=O)NC1CCCCC1 GQYIWUVLTXOXAJ-UHFFFAOYSA-N 0.000 description 1
- 108020005196 Mitochondrial DNA Proteins 0.000 description 1
- 229930192392 Mitomycin Natural products 0.000 description 1
- NWIBSHFKIJFRCO-WUDYKRTCSA-N Mytomycin Chemical compound C1N2C(C(C(C)=C(N)C3=O)=O)=C3[C@@H](COC(N)=O)[C@@]2(OC)[C@@H]2[C@H]1N2 NWIBSHFKIJFRCO-WUDYKRTCSA-N 0.000 description 1
- ZDZOTLJHXYCWBA-VCVYQWHSSA-N N-debenzoyl-N-(tert-butoxycarbonyl)-10-deacetyltaxol Chemical compound O([C@H]1[C@H]2[C@@](C([C@H](O)C3=C(C)[C@@H](OC(=O)[C@H](O)[C@@H](NC(=O)OC(C)(C)C)C=4C=CC=CC=4)C[C@]1(O)C3(C)C)=O)(C)[C@@H](O)C[C@H]1OC[C@]12OC(=O)C)C(=O)C1=CC=CC=C1 ZDZOTLJHXYCWBA-VCVYQWHSSA-N 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- 102100026466 POU domain, class 2, transcription factor 3 Human genes 0.000 description 1
- 229930012538 Paclitaxel Natural products 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 208000005228 Pericardial Effusion Diseases 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 108091008109 Pseudogenes Proteins 0.000 description 1
- 102000057361 Pseudogenes Human genes 0.000 description 1
- 208000007660 Residual Neoplasm Diseases 0.000 description 1
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 1
- 241000219061 Rheum Species 0.000 description 1
- 108091027544 Subgenomic mRNA Proteins 0.000 description 1
- 108091008874 T cell receptors Proteins 0.000 description 1
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 description 1
- BPEGJWRSRHCHSN-UHFFFAOYSA-N Temozolomide Chemical compound O=C1N(C)N=NC2=C(C(N)=O)N=CN21 BPEGJWRSRHCHSN-UHFFFAOYSA-N 0.000 description 1
- JXLYSJRDGCGARV-WWYNWVTFSA-N Vinblastine Natural products O=C(O[C@H]1[C@](O)(C(=O)OC)[C@@H]2N(C)c3c(cc(c(OC)c3)[C@]3(C(=O)OC)c4[nH]c5c(c4CCN4C[C@](O)(CC)C[C@H](C3)C4)cccc5)[C@@]32[C@H]2[C@@]1(CC)C=CCN2CC3)C JXLYSJRDGCGARV-WWYNWVTFSA-N 0.000 description 1
- RTJVUHUGTUDWRK-CSLCKUBZSA-N [(2r,4ar,6r,7r,8s,8ar)-6-[[(5s,5ar,8ar,9r)-9-(3,5-dimethoxy-4-phosphonooxyphenyl)-8-oxo-5a,6,8a,9-tetrahydro-5h-[2]benzofuro[6,5-f][1,3]benzodioxol-5-yl]oxy]-2-methyl-7-[2-(2,3,4,5,6-pentafluorophenoxy)acetyl]oxy-4,4a,6,7,8,8a-hexahydropyrano[3,2-d][1,3]d Chemical compound COC1=C(OP(O)(O)=O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](OC(=O)COC=4C(=C(F)C(F)=C(F)C=4F)F)[C@@H]4O[C@H](C)OC[C@H]4O3)OC(=O)COC=3C(=C(F)C(F)=C(F)C=3F)F)[C@@H]3[C@@H]2C(OC3)=O)=C1 RTJVUHUGTUDWRK-CSLCKUBZSA-N 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000011226 adjuvant chemotherapy Methods 0.000 description 1
- 230000004931 aggregating effect Effects 0.000 description 1
- SHGAZHPCJJPHSC-YCNIQYBTSA-N all-trans-retinoic acid Chemical compound OC(=O)\C=C(/C)\C=C\C=C(/C)\C=C\C1=C(C)CCCC1(C)C SHGAZHPCJJPHSC-YCNIQYBTSA-N 0.000 description 1
- 210000001691 amnion Anatomy 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 229960001220 amsacrine Drugs 0.000 description 1
- XCPGHVQEEXUHNC-UHFFFAOYSA-N amsacrine Chemical compound COC1=CC(NS(C)(=O)=O)=CC=C1NC1=C(C=CC=C2)C2=NC2=CC=CC=C12 XCPGHVQEEXUHNC-UHFFFAOYSA-N 0.000 description 1
- 229940124650 anti-cancer therapies Drugs 0.000 description 1
- 238000011319 anticancer therapy Methods 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 210000001742 aqueous humor Anatomy 0.000 description 1
- 210000003567 ascitic fluid Anatomy 0.000 description 1
- 229960002756 azacitidine Drugs 0.000 description 1
- 229960000397 bevacizumab Drugs 0.000 description 1
- 210000000941 bile Anatomy 0.000 description 1
- 239000013060 biological fluid Substances 0.000 description 1
- 239000000090 biomarker Substances 0.000 description 1
- 229960001561 bleomycin Drugs 0.000 description 1
- OYVAGSVQBOHSSS-UAPAGMARSA-O bleomycin A2 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-UAPAGMARSA-O 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 229960002092 busulfan Drugs 0.000 description 1
- 229960004117 capecitabine Drugs 0.000 description 1
- 229960004562 carboplatin Drugs 0.000 description 1
- 190000008236 carboplatin Chemical compound 0.000 description 1
- 239000013592 cell lysate Substances 0.000 description 1
- 238000002659 cell therapy Methods 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- JCKYGMPEJWAADB-UHFFFAOYSA-N chlorambucil Chemical compound OC(=O)CCCC1=CC=C(N(CCCl)CCCl)C=C1 JCKYGMPEJWAADB-UHFFFAOYSA-N 0.000 description 1
- 229960004630 chlorambucil Drugs 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 210000001268 chyle Anatomy 0.000 description 1
- 108091092240 circulating cell-free DNA Proteins 0.000 description 1
- DQLATGHUWYMOKM-UHFFFAOYSA-L cisplatin Chemical compound N[Pt](N)(Cl)Cl DQLATGHUWYMOKM-UHFFFAOYSA-L 0.000 description 1
- 229960004316 cisplatin Drugs 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 229960004397 cyclophosphamide Drugs 0.000 description 1
- 210000002726 cyst fluid Anatomy 0.000 description 1
- 229960000684 cytarabine Drugs 0.000 description 1
- 231100000433 cytotoxic Toxicity 0.000 description 1
- 230000001472 cytotoxic effect Effects 0.000 description 1
- 229960003901 dacarbazine Drugs 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 229960000975 daunorubicin Drugs 0.000 description 1
- STQGQHZAVUOBTE-VGBVRHCVSA-N daunorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(C)=O)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 STQGQHZAVUOBTE-VGBVRHCVSA-N 0.000 description 1
- 238000011982 device technology Methods 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 229960003668 docetaxel Drugs 0.000 description 1
- ZWAOHEXOSAUJHY-ZIYNGMLESA-N doxifluridine Chemical compound O[C@@H]1[C@H](O)[C@@H](C)O[C@H]1N1C(=O)NC(=O)C(F)=C1 ZWAOHEXOSAUJHY-ZIYNGMLESA-N 0.000 description 1
- 229950005454 doxifluridine Drugs 0.000 description 1
- 229960004679 doxorubicin Drugs 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 210000003060 endolymph Anatomy 0.000 description 1
- 239000003256 environmental substance Substances 0.000 description 1
- 229960001904 epirubicin Drugs 0.000 description 1
- 229960001433 erlotinib Drugs 0.000 description 1
- VJJPUSNTGOMMGY-MRVIYFEKSA-N etoposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@H](C)OC[C@H]4O3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 VJJPUSNTGOMMGY-MRVIYFEKSA-N 0.000 description 1
- 229960005420 etoposide Drugs 0.000 description 1
- 238000000802 evaporation-induced self-assembly Methods 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 210000003608 fece Anatomy 0.000 description 1
- 230000012953 feeding on blood of other organism Effects 0.000 description 1
- 229960000961 floxuridine Drugs 0.000 description 1
- ODKNJVUHOIMIIZ-RRKCRQDMSA-N floxuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(F)=C1 ODKNJVUHOIMIIZ-RRKCRQDMSA-N 0.000 description 1
- 229960000390 fludarabine Drugs 0.000 description 1
- GIUYCYHIANZCFB-FJFJXFQQSA-N fludarabine phosphate Chemical compound C1=NC=2C(N)=NC(F)=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O GIUYCYHIANZCFB-FJFJXFQQSA-N 0.000 description 1
- 229960002949 fluorouracil Drugs 0.000 description 1
- 238000007672 fourth generation sequencing Methods 0.000 description 1
- 210000004211 gastric acid Anatomy 0.000 description 1
- 230000002496 gastric effect Effects 0.000 description 1
- 210000004051 gastric juice Anatomy 0.000 description 1
- 229960005277 gemcitabine Drugs 0.000 description 1
- SDUQYLNIPVEERB-QPPQHZFASA-N gemcitabine Chemical compound O=C1N=C(N)C=CN1[C@H]1C(F)(F)[C@H](O)[C@@H](CO)O1 SDUQYLNIPVEERB-QPPQHZFASA-N 0.000 description 1
- 230000003394 haemopoietic effect Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 201000005787 hematologic cancer Diseases 0.000 description 1
- 208000024200 hematopoietic and lymphoid system neoplasm Diseases 0.000 description 1
- 238000001794 hormone therapy Methods 0.000 description 1
- 229960000908 idarubicin Drugs 0.000 description 1
- HOMGKSMUEGBAAB-UHFFFAOYSA-N ifosfamide Chemical compound ClCCNP1(=O)OCCCN1CCCl HOMGKSMUEGBAAB-UHFFFAOYSA-N 0.000 description 1
- 229960001101 ifosfamide Drugs 0.000 description 1
- 230000005746 immune checkpoint blockade Effects 0.000 description 1
- 230000008088 immune pathway Effects 0.000 description 1
- 239000012274 immune-checkpoint protein inhibitor Substances 0.000 description 1
- 238000003364 immunohistochemistry Methods 0.000 description 1
- 230000002757 inflammatory effect Effects 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 229960004768 irinotecan Drugs 0.000 description 1
- UWKQSNNFCGGAFS-XIFFEERXSA-N irinotecan Chemical compound C1=C2C(CC)=C3CN(C(C4=C([C@@](C(=O)OC4)(O)CC)C=4)=O)C=4C3=NC2=CC=C1OC(=O)N(CC1)CCC1N1CCCCC1 UWKQSNNFCGGAFS-XIFFEERXSA-N 0.000 description 1
- 208000003849 large cell carcinoma Diseases 0.000 description 1
- 230000003902 lesion Effects 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 229960002247 lomustine Drugs 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 210000002751 lymph Anatomy 0.000 description 1
- 230000001926 lymphatic effect Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- HAWPXGHAZFHHAD-UHFFFAOYSA-N mechlorethamine Chemical compound ClCCN(C)CCCl HAWPXGHAZFHHAD-UHFFFAOYSA-N 0.000 description 1
- 229960004961 mechlorethamine Drugs 0.000 description 1
- SGDBTWWWUNNDEQ-LBPRGKRZSA-N melphalan Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N(CCCl)CCCl)C=C1 SGDBTWWWUNNDEQ-LBPRGKRZSA-N 0.000 description 1
- 229960001924 melphalan Drugs 0.000 description 1
- GLVAUDGFNGKCSF-UHFFFAOYSA-N mercaptopurine Chemical compound S=C1NC=NC2=C1NC=N2 GLVAUDGFNGKCSF-UHFFFAOYSA-N 0.000 description 1
- 229960001428 mercaptopurine Drugs 0.000 description 1
- 230000001394 metastatic effect Effects 0.000 description 1
- 229960004857 mitomycin Drugs 0.000 description 1
- 229960001156 mitoxantrone Drugs 0.000 description 1
- KKZJGLLVHKMTCM-UHFFFAOYSA-N mitoxantrone Chemical compound O=C1C2=C(O)C=CC(O)=C2C(=O)C2=C1C(NCCNCCO)=CC=C2NCCNCCO KKZJGLLVHKMTCM-UHFFFAOYSA-N 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- 238000013188 needle biopsy Methods 0.000 description 1
- 238000011227 neoadjuvant chemotherapy Methods 0.000 description 1
- 210000000440 neutrophil Anatomy 0.000 description 1
- 210000000019 nipple aspirate fluid Anatomy 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 238000011369 optimal treatment Methods 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- DWAFYCQODLXJNR-BNTLRKBRSA-L oxaliplatin Chemical compound O1C(=O)C(=O)O[Pt]11N[C@@H]2CCCC[C@H]2N1 DWAFYCQODLXJNR-BNTLRKBRSA-L 0.000 description 1
- 229960001756 oxaliplatin Drugs 0.000 description 1
- 229960001592 paclitaxel Drugs 0.000 description 1
- 238000009595 pap smear Methods 0.000 description 1
- 229960005079 pemetrexed Drugs 0.000 description 1
- QOFFJEBXNKRSPX-ZDUSSCGKSA-N pemetrexed Chemical compound C1=N[C]2NC(N)=NC(=O)C2=C1CCC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 QOFFJEBXNKRSPX-ZDUSSCGKSA-N 0.000 description 1
- 210000004912 pericardial fluid Anatomy 0.000 description 1
- 210000004049 perilymph Anatomy 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 208000026435 phlegm Diseases 0.000 description 1
- 210000004910 pleural fluid Anatomy 0.000 description 1
- CPTBDICYNRMXFX-UHFFFAOYSA-N procarbazine Chemical compound CNNCC1=CC=C(C(=O)NC(C)C)C=C1 CPTBDICYNRMXFX-UHFFFAOYSA-N 0.000 description 1
- 229960000624 procarbazine Drugs 0.000 description 1
- 210000004908 prostatic fluid Anatomy 0.000 description 1
- 102000004169 proteins and genes Human genes 0.000 description 1
- 210000004915 pus Anatomy 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000002271 resection Methods 0.000 description 1
- 229930002330 retinoic acid Natural products 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 210000002374 sebum Anatomy 0.000 description 1
- 210000000582 semen Anatomy 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 230000019491 signal transduction Effects 0.000 description 1
- 230000000391 smoking effect Effects 0.000 description 1
- 230000037439 somatic mutation Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 229960001052 streptozocin Drugs 0.000 description 1
- ZSJLQEPLLKMAKR-GKHCUFPYSA-N streptozocin Chemical compound O=NN(C)C(=O)N[C@H]1[C@@H](O)O[C@H](CO)[C@@H](O)[C@@H]1O ZSJLQEPLLKMAKR-GKHCUFPYSA-N 0.000 description 1
- 230000003319 supportive effect Effects 0.000 description 1
- 238000011477 surgical intervention Methods 0.000 description 1
- 210000004243 sweat Anatomy 0.000 description 1
- 210000001179 synovial fluid Anatomy 0.000 description 1
- 229950003999 tafluposide Drugs 0.000 description 1
- 238000002626 targeted therapy Methods 0.000 description 1
- RCINICONZNJXQF-MZXODVADSA-N taxol Chemical compound O([C@@H]1[C@@]2(C[C@@H](C(C)=C(C2(C)C)[C@H](C([C@]2(C)[C@@H](O)C[C@H]3OC[C@]3([C@H]21)OC(C)=O)=O)OC(=O)C)OC(=O)[C@H](O)[C@@H](NC(=O)C=1C=CC=CC=1)C=1C=CC=CC=1)O)C(=O)C1=CC=CC=C1 RCINICONZNJXQF-MZXODVADSA-N 0.000 description 1
- 210000001138 tear Anatomy 0.000 description 1
- 229960004964 temozolomide Drugs 0.000 description 1
- NRUKOCRGYNPUPR-QBPJDGROSA-N teniposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@@H](OC[C@H]4O3)C=3SC=CC=3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 NRUKOCRGYNPUPR-QBPJDGROSA-N 0.000 description 1
- 229960001278 teniposide Drugs 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 229960003087 tioguanine Drugs 0.000 description 1
- MNRILEROXIRVNJ-UHFFFAOYSA-N tioguanine Chemical compound N1C(N)=NC(=S)C2=NC=N[C]21 MNRILEROXIRVNJ-UHFFFAOYSA-N 0.000 description 1
- 229960000303 topotecan Drugs 0.000 description 1
- UCFGDBYHRUNTLO-QHCPKHFHSA-N topotecan Chemical compound C1=C(O)C(CN(C)C)=C2C=C(CN3C4=CC5=C(C3=O)COC(=O)[C@]5(O)CC)C4=NC2=C1 UCFGDBYHRUNTLO-QHCPKHFHSA-N 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- 229960001055 uracil mustard Drugs 0.000 description 1
- 229960000653 valrubicin Drugs 0.000 description 1
- ZOCKGBMQLCSHFP-KQRAQHLDSA-N valrubicin Chemical compound O([C@H]1C[C@](CC2=C(O)C=3C(=O)C4=CC=CC(OC)=C4C(=O)C=3C(O)=C21)(O)C(=O)COC(=O)CCCC)[C@H]1C[C@H](NC(=O)C(F)(F)F)[C@H](O)[C@H](C)O1 ZOCKGBMQLCSHFP-KQRAQHLDSA-N 0.000 description 1
- 229960003048 vinblastine Drugs 0.000 description 1
- JXLYSJRDGCGARV-XQKSVPLYSA-N vincaleukoblastine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](OC(C)=O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(=O)OC)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1NC1=CC=CC=C21 JXLYSJRDGCGARV-XQKSVPLYSA-N 0.000 description 1
- 229960004528 vincristine Drugs 0.000 description 1
- OGWKCGZFUXNPDA-XQKSVPLYSA-N vincristine Chemical compound C([N@]1C[C@@H](C[C@]2(C(=O)OC)C=3C(=CC4=C([C@]56[C@H]([C@@]([C@H](OC(C)=O)[C@]7(CC)C=CCN([C@H]67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)C[C@@](C1)(O)CC)CC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-XQKSVPLYSA-N 0.000 description 1
- OGWKCGZFUXNPDA-UHFFFAOYSA-N vincristine Natural products C1C(CC)(O)CC(CC2(C(=O)OC)C=3C(=CC4=C(C56C(C(C(OC(C)=O)C7(CC)C=CCN(C67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)CN1CCC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-UHFFFAOYSA-N 0.000 description 1
- 229960004355 vindesine Drugs 0.000 description 1
- UGGWPQSBPIFKDZ-KOTLKJBCSA-N vindesine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(N)=O)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1N=C1[C]2C=CC=C1 UGGWPQSBPIFKDZ-KOTLKJBCSA-N 0.000 description 1
- 229960002066 vinorelbine Drugs 0.000 description 1
- GBABOYUKABKIAF-GHYRFKGUSA-N vinorelbine Chemical compound C1N(CC=2C3=CC=CC=C3NC=22)CC(CC)=C[C@H]1C[C@]2(C(=O)OC)C1=CC([C@]23[C@H]([C@]([C@H](OC(C)=O)[C@]4(CC)C=CCN([C@H]34)CC2)(O)C(=O)OC)N2C)=C2C=C1OC GBABOYUKABKIAF-GHYRFKGUSA-N 0.000 description 1
- 210000004127 vitreous body Anatomy 0.000 description 1
- 210000004916 vomit Anatomy 0.000 description 1
- 230000008673 vomiting Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
Definitions
- the present invention relates generally to the non-invasive diagnosis and subtyping of small cell lung cancer (SCLC) and more specifically to the analysis of genome-wide patterns of fragmented cell-free DNA (cfDNA) in conjunction with clinical and demographic features of individual patients.
- SCLC small cell lung cancer
- cfDNA fragmented cell-free DNA
- the present invention is also related to the use of cell-free DNA fragmentation profiling as a method for tumor fraction assessment and treatment monitoring in non-small cell lung cancer (NSCLC).
- NSCLC non-small cell lung cancer
- Small cell lung cancer is an aggressive malignancy with a poor prognosis. Although small cell lung cancers are clinically managed as a single cancer type, new evidence indicates that subtypes of small cell lung cancer (high-neuroendocrine vs low-neuroendocrine) acquire diverse transcriptional and epigenetic states. Furthermore, distinct small cell lung cancer subtypes respond to specific treatments, such as immunotherapy in the case of the low-neuroendocrine, highly inflamed small cell lung cancer subtype. Non-small cell lung cancer is any type of epithelial lung cancer other than small cell lung cancer. The most common types of non-small cell lung cancer are squamous cell carcinoma, large cell carcinoma, and adenocarcinoma, but there are several other types that occur less frequently, and all types can
- Non-small cell lung cancer is usually less sensitive to chemotherapy and radiation therapy than small cell lung cancer.
- Patients with resectable disease may be cured by surgery or surgery followed by chemotherapy, as well as chemotherapy followed by surgery. Local control can be achieved with radiation therapy in many patients with unresectable disease, but cure is seen in relatively few patients.
- Patients with locally advanced unresectable disease may achieve long-term survival with radiation therapy combined with chemotherapy.
- Patients with advanced metastatic disease may achieve improved survival and palliation of symptoms with chemotherapy, targeted agents, and other supportive measures.
- Novel, non-invasive methods for identifying and subtyping non-small cell lung cancer and small cell lung cancer are needed to improve patient outcomes.
- the present invention is based on the seminal discovery that characterizing genome-wide patterns of fragmentation of cell-free DNA (cfDNA) in plasma using low-coverage whole-genome sequencing can improve cancer diagnosis when analyzed in conjunction with certain clinical and demographic features of individual patients.
- cfDNA cell-free DNA
- the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- ATTORNEY DOCKET NO. DELFI2150-3WO
- the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
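- To make the stated coverage ranges (about 30x down to 0.1x) concrete, the following is a minimal sketch relating fragment counts to mean genome coverage. It assumes that each mapped fragment contributes its full length to coverage and that the genome is roughly 3.1 billion base pairs; the constant `HUMAN_GENOME_BP` and both function names are hypothetical and are not taken from the source.

```python
import math

# Assumption: approximate haploid human genome length in base pairs.
HUMAN_GENOME_BP = 3_100_000_000


def mean_coverage(n_fragments, mean_fragment_bp, genome_bp=HUMAN_GENOME_BP):
    """Mean coverage = total mapped fragment bases / genome length."""
    return n_fragments * mean_fragment_bp / genome_bp


def fragments_needed(target_coverage, mean_fragment_bp, genome_bp=HUMAN_GENOME_BP):
    """Smallest fragment count that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_bp / mean_fragment_bp)
```

- Under these assumptions, the lower the target coverage, the proportionally fewer fragments are required, which is what makes the low-coverage end of the range inexpensive to sequence.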
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in a subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
- Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
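- The per-interval fragmentation profile described above (median fragment size and fragment size distribution within non-overlapping genomic intervals) could be sketched as follows. This is an illustration only: the 5 Mb interval width `BIN_BP`, the assignment of each fragment to an interval by its midpoint, and the function name are assumptions, not details given in the source.

```python
from collections import Counter, defaultdict
from statistics import median

# Assumption: 5 Mb non-overlapping intervals; the text only says
# "thousands to millions of base pairs".
BIN_BP = 5_000_000


def fragmentation_profile(fragments, bin_bp=BIN_BP):
    """Summarize fragment sizes per non-overlapping genomic interval.

    fragments: iterable of (chrom, start, end) with half-open coordinates.
    Returns {(chrom, interval_index): {"n_fragments", "median_size", "size_counts"}}.
    """
    sizes = defaultdict(list)
    for chrom, start, end in fragments:
        midpoint = (start + end) // 2
        # Each fragment is counted in exactly one interval (by midpoint).
        sizes[(chrom, midpoint // bin_bp)].append(end - start)

    profile = {}
    for key, lengths in sizes.items():
        profile[key] = {
            "n_fragments": len(lengths),
            "median_size": median(lengths),
            "size_counts": dict(Counter(lengths)),  # fragment size distribution
        }
    return profile
```

- In practice the fragment coordinates would come from the mapped sequencing data; here they are passed in as plain tuples so the sketch stays self-contained.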
- Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- all coverage calculations are shifted by 1 to avoid divisions by 0.
- the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
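- The log2 (Read Depth Ratio) calculation described above could be sketched as follows. The sketch assumes the input is per-base coverage over the window around one binding site, that coverage is summed within consecutive 20bp windows, and that the correction factor is the median of the 20bp-window totals lying inside the first and last 500bp. These interpretive choices and all function names are assumptions; the source does not spell them out.

```python
import numpy as np

WINDOW_BP = 20    # window width stated in the text
ANCHOR_BP = 500   # 5' and 3' anchor width stated in the text


def log2_read_depth_ratio(coverage):
    """log2((Read Depth + 1) / (Correction Factor + 1)) per 20bp window.

    coverage: per-base depth across the region around one binding site.
    """
    coverage = np.asarray(coverage, dtype=float)
    n_windows = coverage.size // WINDOW_BP
    # Total coverage in each consecutive 20bp window.
    windows = coverage[: n_windows * WINDOW_BP].reshape(n_windows, WINDOW_BP).sum(axis=1)
    # Correction factor: median window coverage within the two end anchors.
    k = ANCHOR_BP // WINDOW_BP
    correction = np.median(np.concatenate([windows[:k], windows[-k:]]))
    # Shift both terms by 1 to avoid division by 0, as stated in the text.
    return np.log2((windows + 1.0) / (correction + 1.0))


def fragment_coverage_score(site_coverages):
    """Aggregate sites by taking the median of each window across sites."""
    ratios = np.vstack([log2_read_depth_ratio(c) for c in site_coverages])
    return np.median(ratios, axis=0)
```

- With this formulation, a dip in coverage at the center of the binding sites relative to the anchors yields negative values in the central windows, consistent with the decrease in aggregate coverage score described above.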
- Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele fraction (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
- MAF mirroring major allele fraction
- Figure 1 illustrates DELFI-scores of the patients diagnosed with SCLC in the study of Example 1.
- Figure 2 illustrates that genome-wide fragmentation profiles were significantly consistent among pre-treatment, post-treatment, progression and response time points, suggesting that genome-wide analysis of circulating tumor DNA fragment sizes is a powerful method for detecting changes while monitoring immunotherapy treatment of SCLC.
- Figure 3 illustrates genome-wide fragmentation profiles (and corresponding DELFI-scores below) showing partial differences between the two main subtypes of SCLC cases, High Neuroendocrine and Low Neuroendocrine, suggesting that genome-wide analysis of circulating tumor DNA fragment sizes may be a powerful method for subtyping while monitoring immunotherapy treatment of SCLC.
- Figure 4 illustrates differential genes between High and Low neuroendocrine SCLC cases, using publicly available data (Lissa et al. 2022) and a principal component analysis (PCA) of DELFI 30x plasma WGS data to cluster the two subtypes.
- PCA revealed distinguishable clusters between SCLC High-NE and Low-NE pre-treatment samples.
- Figures 5A-5B illustrate an Eigencor analysis revealing a significant correlation of Principal Components 1 and 2 with the NE status of the samples.
- Figures 6A-6B illustrate the potential of the DELFI assay to subtype SCLC using transcription factor binding sites (TFBS). Genome-wide cfDNA fragmentation analyses were performed at ASCL1 binding sites in the pre-treatment NCI samples.
- Figure 5A illustrates that distinct clusters of SCLC samples were observed corresponding to different levels of ASCL1 activation. Importantly, the two clusters corresponded perfectly to the different SCLC subtypes. The main driver of differentiation between samples was in fact differences in the fragment coverage identified at the center of the ASCL1 binding sites.
- Figure 6B illustrates that other clinical or sample characteristics have a negligible impact in the clustering of these samples. Overall, these data show how DELFI analysis can perfectly distinguish between SCLC subtypes using fragment coverage at TFBS.
- Figure 7 illustrates a patient example supporting the hypothesis that the detected signal comes from tumor infiltrating lymphocytes.
- Patient NCI-0422 diagnosed with SCLC with inflammation subtype was treated with Durvalumab plus Olaparib. NCI-0422 responded well to treatment but was later diagnosed with progressive disease. Variable genomic binding is detectable at pre-treatment and at progression timepoints. Peaks are centered at a mix of TFBS and TSS.
- Figures 8A-8B illustrate ASCL1 binding sites in responders vs non-responders.
- Figure 8A shows 500 cell type specific bins from the PMD data to determine fraction of white blood cells.
- Figure 9 illustrates a machine learning model to predict high vs low neuroendocrine SCLC.
- Figures 10A-10B illustrate that the method described herein is able to differentiate neuroendocrine vs non-neuroendocrine by calculating Read Depth Ratio at SCLC specific genomic coordinates (Figure 10A). Figure 10B shows that the method described herein appears to subtype SCLC cases better.
- Figure 11 illustrates that Read Depth Ratios at SCLC specific genomic coordinates are transformed into a 2-dimensional PCA to separate samples into neuroendocrine vs non-neuroendocrine groups. Most samples are clearly divided into different clusters.
- Figure 12 illustrates that PCA derived from Read Depth Ratio at SCLC specific genomic coordinates is able to separate pre-treatment samples into specific SCLC subtypes (A, N, P, Y).
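- The 2-dimensional PCA projection of Read Depth Ratio features mentioned for Figure 11 could be sketched as follows. This is a generic PCA via singular value decomposition, offered as an illustration; the source does not state which PCA implementation or preprocessing was used, and the function name `pca_2d` and the samples-by-features input layout are assumptions.

```python
import numpy as np


def pca_2d(features):
    """Project samples onto their first two principal components.

    features: array of shape (n_samples, n_features), e.g. one row of
    log2 Read Depth Ratio values per plasma sample (assumed layout).
    Returns (scores, explained) where scores has shape (n_samples, 2) and
    explained holds the variance fraction captured by PC1 and PC2.
    """
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)           # center each feature
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :2] * S[:2]                # sample coordinates on PC1, PC2
    explained = (S ** 2) / np.sum(S ** 2)    # variance fraction per component
    return scores, explained[:2]
```

- Samples whose coverage profiles differ systematically, such as those with and without a coverage dip at the binding sites, would be expected to fall on opposite sides of the first component. The sign of each component is arbitrary.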
- the ground truth of the SCLC subtypes was calculated using tissue RNA-seq gene expression differential analysis.
- Figure 13 depicts how fragmentation profiles inform on lung cancer status and treatment response patterns.
- Figure 14 illustrates that model derived DELFI-TF exhibits a strong correlation with Mutant Allele Frequency across samples.
- Figure 15 illustrates that DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation.
- Figure 16 illustrates that tumor derived cfDNA has altered fragmentation.
- Figures 17A-17C illustrate that DELFI-TF accurately detects circulating tumor fraction without being confounded by clonal hematopoiesis.
- Figure 18 depicts that cfDNA fragmentation patterns accurately differentiate NSCLC subtypes.
- Figure 19 depicts proof-of-concept DELFI-TF model development.
- Figure 20 is an example computer 800 that may be used to implement the methods described herein.
- the present invention is based on the seminal discovery that characterizing genome-wide patterns of fragmentation of cell-free DNA (cfDNA) in plasma using low-coverage whole-genome sequencing improves cancer diagnosis when analyzed in conjunction with certain clinical and demographic features of individual patients.
- cfDNA cell-free DNA
- Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
- Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
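- As a rough illustration of GC bias correction, the sketch below rescales interval coverage by the median coverage of intervals with similar GC content. Note that this binned-median approach is a deliberately simpler stand-in and is not the additive model named above; the function name, the number of GC bins, and the input layout are all assumptions made for the example.

```python
import numpy as np


def gc_correct_binned(coverage, gc_fraction, n_bins=20):
    """Rescale coverage so every GC stratum has the same median coverage.

    coverage: coverage per genomic interval.
    gc_fraction: GC content per interval, as a fraction in [0, 1].
    NOTE: a binned-median stand-in, not a fitted additive model.
    """
    coverage = np.asarray(coverage, dtype=float)
    gc_fraction = np.asarray(gc_fraction, dtype=float)
    # Assign each interval to an equal-width GC bin on [0, 1].
    bins = np.minimum((gc_fraction * n_bins).astype(int), n_bins - 1)
    overall_median = np.median(coverage)
    corrected = coverage.copy()
    for b in np.unique(bins):
        mask = bins == b
        bin_median = np.median(coverage[mask])
        if bin_median > 0:  # leave empty-coverage strata untouched
            corrected[mask] = coverage[mask] * overall_median / bin_median
    return corrected
```

- A smooth model fitted across GC content would avoid the step changes at bin edges that this stand-in introduces; the sketch is meant only to show the direction of the correction.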
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
- transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy.
- non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
- Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30* to 0.1 x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
- all coverage calculations are shifted by 1 to avoid divisions by 0.
- the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
- Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
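The read depth ratio described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the application: coverage is assumed to be already summarized as a list of 20bp-window values for each binding-site region, the first and last 25 windows (500bp each) are taken as the anchors, and the "shift by 1" is interpreted as adding 1 to both the read depth and the correction factor. The function names are hypothetical, and the general additive model smoothing and GC correction are not shown.

```python
import math
from statistics import median

WINDOW_BP = 20                            # window size stated in the text
ANCHOR_BP = 500                           # anchor length stated in the text
ANCHOR_WINDOWS = ANCHOR_BP // WINDOW_BP   # 25 windows per anchor


def log2_read_depth_ratio(window_coverage):
    """Return log2((coverage + 1) / (correction + 1)) for each 20bp window.

    `window_coverage` holds the coverage of consecutive 20bp windows around
    one binding site. Assumption: the first and last 25 windows are the
    5' and 3' 500bp anchors.
    """
    if len(window_coverage) < 2 * ANCHOR_WINDOWS:
        raise ValueError("region is shorter than the two 500bp anchors")
    anchors = (list(window_coverage[:ANCHOR_WINDOWS])
               + list(window_coverage[-ANCHOR_WINDOWS:]))
    correction = median(anchors)
    # Shift numerator and denominator by 1 to avoid division by zero.
    return [math.log2((c + 1) / (correction + 1)) for c in window_coverage]


def fragment_coverage_score(sites):
    """Aggregate per-site ratios by taking the median at each window position."""
    ratios = [log2_read_depth_ratio(site) for site in sites]
    return [median(column) for column in zip(*ratios)]
```

For a flat region with coverage 9 and a single window with coverage 4, the dip window evaluates to log2(5/10) = -1 and the flanking windows to 0, so a lower score at the binding site reflects reduced coverage relative to the anchors.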
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
- the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
- the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
- the method uses machine learning to subtype small cell lung cancer in the subject.
- subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
- the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0.
- the genomic intervals are non-overlapping.
- the genomic intervals each comprise thousands to millions of base pairs.
- a cfDNA fragmentation profile is determined within each genomic interval.
- the cfDNA fragmentation profile comprises a median fragment size.
- the cfDNA fragmentation profile comprises a fragment size distribution.
- Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
- the therapeutic agent is an immunotherapy.
- Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
- NSCLC subtype non-small cell lung cancer
- cfDNA cell-free DNA fragmentomes.
- NSCLC is often presented in different subtypes such as adenocarcinoma and squamous cell carcinoma.
- This novel non-invasive DELFI-based approach distinguishes NSCLC adenocarcinoma from squamous carcinoma subtypes by investigating cell-free fragments that reflect the unique epigenetic states of NSCLCs.
- Other cfDNA-based liquid biopsies are not known to differentiate NSCLC subtypes without having access to tissue clinical data.
- NSCLC subtypes can be predicted from the DELFI-based approach using cfDNA LC-WGS data.
- the DELFI machine learning classifier detects the presence of cancer in patients with NSCLC.
- Both genome-wide fragmentation profiles and the corresponding DELFI-Tumor Fraction (TF) scores are investigated to identify preliminary differences between the two main clinical subtypes of NSCLC cases: adenocarcinoma and squamous cell carcinoma.
- DELFI-TF accurately detects circulating tumor fraction without being confused by clonal hematopoiesis.
- DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation in both Adenocarcinoma and Squamous carcinoma.
- FIG. 19 depicts proof-of-concept DELFI-TF model development.
- TSS differentially accessible transcription start sites
- a list of differentially accessible transcription start sites can be obtained from public databases such as, but not limited to, UCSC or Ensembl. The creation of such lists is similar to what would be done for RNA-Seq experiments, except that the read depth ratio, rather than transcripts per million, is used as a proxy for expression.
- sites with the largest and most significant log fold changes from 20% of a test cohort are selected and applied to the remainder of the cohort at the same loci, followed by hierarchical clustering to determine whether subtypes remained together.
- the most highly expressed transcripts from TCGA for a given cancer type are selected and those transcripts which are also expressed at any level in AML (blood cancer as an imperfect proxy for normal blood) are filtered out from the list.
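The transcript filtering step described above can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary inputs, and the `top_n` cutoff are hypothetical, and the expression values would in practice come from TCGA and AML expression tables that are not shown here.

```python
def candidate_transcripts(tumor_expression, aml_expression, top_n):
    """Keep the most highly expressed tumor transcripts absent from AML.

    Both arguments map a transcript identifier to an expression value.
    A transcript missing from `aml_expression` is treated as not expressed.
    """
    # Rank transcripts by tumor expression, highest first.
    ranked = sorted(tumor_expression, key=tumor_expression.get, reverse=True)
    top = ranked[:top_n]
    # Drop any transcript expressed at any level in AML.
    return [t for t in top if aml_expression.get(t, 0.0) == 0.0]


# Hypothetical transcript identifiers and values, for illustration only.
tumor = {"tx_A": 50.0, "tx_B": 40.0, "tx_C": 30.0, "tx_D": 1.0}
aml = {"tx_B": 0.5, "tx_C": 0.0}
selected = candidate_transcripts(tumor, aml, top_n=3)
```

Here `tx_B` is removed because it is expressed in AML, and `tx_D` never enters the list because it falls outside the top three.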
- whether the most differential DELFI-TSSs exhibit short/long changes was investigated, further confirming that these cfDNA molecules are tumor-derived.
- ROC receiver operating characteristic
- This document provides methods and materials for determining a cfDNA fragmentation profile in a mammal (e.g., in a sample obtained from a mammal).
- the terms "fragmentation profile," "position dependent differences in fragmentation patterns," and "… genome" are equivalent and can be used interchangeably.
- determining a cfDNA fragmentation profile in a mammal can be used for identifying a mammal as having cancer.
- cfDNA fragments obtained from a mammal e.g., from a sample obtained from a mammal
- a cfDNA fragmentation profile of a mammal having cancer is more heterogeneous (e.g., in fragment lengths) than a cfDNA fragmentation profile of a healthy mammal (e.g., a mammal not having cancer).
- this document also provides methods and materials for assessing, monitoring, and/or treating mammals (e.g., humans) having, or suspected of having, cancer. In some cases, this document provides methods and materials for identifying a mammal as having cancer.
- a sample obtained from a mammal can be assessed to determine the presence and, optionally, the tissue of origin of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal.
- this document provides methods and materials for monitoring a mammal as having cancer.
- a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine the presence of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal.
- this document provides methods and materials for identifying a mammal as having cancer and administering one or more cancer treatments to the mammal to treat the mammal.
- a sample obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal, and one or more cancer treatments can be administered to the mammal.
- a cfDNA fragmentation profile can include one or more cfDNA fragmentation patterns.
- a cfDNA fragmentation pattern can include any appropriate cfDNA fragmentation pattern. Examples of cfDNA fragmentation patterns include, without limitation, median fragment size, fragment size distribution, ratio of small cfDNA fragments to large cfDNA fragments, and the coverage of cfDNA fragments.
- a cfDNA fragmentation pattern includes two or more (e.g., two, three, or four) of median fragment size, fragment size distribution, ratio of small cfDNA fragments to large cfDNA fragments, and the coverage of cfDNA fragments.
- a cfDNA fragmentation profile can be a genome-wide cfDNA profile (e.g., a genome-wide cfDNA profile in windows across the genome).
- a cfDNA fragmentation profile can be a targeted region profile.
- a targeted region can be any appropriate portion of the genome (e.g., a chromosomal region).
- chromosomal regions for which a cfDNA fragmentation profile can be determined as described herein include, without limitation, a portion of a chromosome (e.g., a portion of 2q, 4p, 5p, 6q, 7p, 8q, 9q, 10q, 11q, 12q, and/or 14q) and a chromosomal arm (e.g., a chromosomal arm of 8q, 13q, 11q, and/or 3p).
- a cfDNA fragmentation profile can include two or more targeted region profiles.
- a cfDNA fragmentation profile can be used to identify changes (e.g., alterations) in cfDNA fragment lengths.
- An alteration can be a genome-wide alteration or an alteration in one or more targeted regions/loci.
- a target region can be any region containing one or more cancer-specific alterations. Examples of cancer-specific alterations, and their chromosomal locations, include, without limitation, those shown in Table 3 (Appendix C) and those shown in Table 6 (Appendix F).
- a cfDNA fragmentation profile can be used to identify (e.g., simultaneously identify) from about 10 alterations to about 500 alterations (e.g., from about 25 to about 500, from about 50 to about 500, from about 100 to about 500, from about 200 to about 500, from about 300 to about 500, from about 10 to about 400, from about 10 to about 300, from about 10 to about 200, from about 10 to about 100, from about 10 to about 50, from about 20 to about 400, from about 30 to about 300, from about 40 to about 200, from about 50 to about 100, from about 20 to about 100, from about 25 to about 75, from about 50 to about 250, or from about 100 to about 200, alterations).
- a cfDNA fragmentation profile can be used to detect tumor-derived DNA.
- a cfDNA fragmentation profile can be used to detect tumor-derived DNA by comparing a cfDNA fragmentation profile of a mammal having, or suspected of having, cancer to a reference cfDNA fragmentation profile (e.g., a cfDNA fragmentation profile of a healthy mammal and/or a nucleosomal DNA fragmentation profile of healthy cells from the mammal having, or suspected of having, cancer).
- a reference cfDNA fragmentation profile is a previously generated profile from a healthy mammal.
- methods provided herein can be used to determine a reference cfDNA fragmentation profile in a healthy mammal, and that reference cfDNA fragmentation profile can be stored (e.g., in a computer or other electronic storage medium) for future comparison to a test cfDNA fragmentation profile in a mammal having, or suspected of having, cancer.
- a reference cfDNA fragmentation profile (e.g., a stored cfDNA fragmentation profile of a healthy mammal) is determined over the whole genome.
- a cfDNA fragmentation profile can be used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer).
- a cfDNA fragmentation profile can include a cfDNA fragment size pattern.
- cfDNA fragments can be any appropriate size.
- a cfDNA fragment can be from about 50 base pairs (bp) to about 400 bp in length.
- a mammal having cancer can have a cfDNA fragment size pattern that contains a shorter median cfDNA fragment size than the median cfDNA fragment size in a healthy mammal.
- a mammal having cancer can have cfDNA fragment sizes that are, on average, about 1.28 bp to about 2.49 bp (e.g., about 1.88 bp) shorter than cfDNA fragment sizes in a healthy mammal.
- a mammal having cancer can have cfDNA fragment sizes having a median cfDNA fragment size of about 164.11 bp to about 165.92 bp (e.g., about 165.02 bp).
- a cfDNA fragmentation profile can include a cfDNA fragment size distribution.
- a mammal having cancer can have a cfDNA size distribution that is more variable than a cfDNA fragment size distribution in a healthy mammal.
- a size distribution can be within a targeted region.
- a mammal having cancer can have a targeted region cfDNA fragment size distribution that is longer (e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50 or more bp longer, or any number of base pairs between these numbers) than a targeted region cfDNA fragment size distribution in a healthy mammal.
- a mammal having cancer can have a targeted region cfDNA fragment size distribution that is shorter (e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50 or more bp shorter, or any number of base pairs between these numbers) than a targeted region cfDNA fragment size distribution in a healthy mammal.
- a mammal having cancer can have a targeted region cfDNA fragment size distribution that is about 47 bp smaller to about 30 bp longer than a targeted region cfDNA fragment size distribution in a healthy mammal.
- a mammal having cancer can have a targeted region cfDNA fragment size distribution of, on average, about a 13 bp difference in lengths of cfDNA fragments.
- a size distribution can be a genome-wide size distribution.
- a mammal having cancer can have, genome-wide, one or more alterations (e.g., increases and decreases) in cfDNA fragment sizes.
- the one or more alterations can be any appropriate chromosomal region of the genome.
- an alteration can be in a portion of a chromosome.
- portions of chromosomes that can contain one or more alterations in cfDNA fragment sizes include, without limitation, portions of 2q, 4p, 5p, 6q, 7p, 8q, 9q, 10q, 11q, 12q, and 14q.
- an alteration can be across a chromosome arm (e.g., an entire chromosome arm).
- a cfDNA fragmentation profile can include a ratio of small cfDNA fragments to large cfDNA fragments and a correlation of fragment ratios to reference fragment ratios.
- a small cfDNA fragment can be from about 100 bp in length to about 150 bp in length.
- a large cfDNA fragment can be from about 151 bp in length to 220 bp in length.
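The ratio of small to large cfDNA fragments can be illustrated with a short Python sketch. The size ranges follow the text (about 100 to 150 bp for small fragments, about 151 to 220 bp for large fragments); treating both ranges as inclusive, and returning `None` when no large fragments are present, are assumptions made for this example, and the function name is hypothetical.

```python
SHORT_RANGE = (100, 150)  # bp, treated as inclusive (assumption)
LONG_RANGE = (151, 220)   # bp, treated as inclusive (assumption)


def short_long_ratio(fragment_lengths):
    """Return the count of small fragments divided by the count of large ones.

    Fragments outside both ranges are ignored. Returns None when there are
    no large fragments, so that the caller decides how to handle that case.
    """
    short = sum(1 for n in fragment_lengths
                if SHORT_RANGE[0] <= n <= SHORT_RANGE[1])
    long_ = sum(1 for n in fragment_lengths
                if LONG_RANGE[0] <= n <= LONG_RANGE[1])
    if long_ == 0:
        return None
    return short / long_
```

For the lengths 120, 140, 160, 170, 200, and 300 bp, two fragments are small, three are large, and the 300 bp fragment is ignored, giving a ratio of 2/3.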
- a mammal having cancer can have a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) that is lower (e.g., 2-fold lower, 3-fold lower, 4-fold lower, 5-fold lower, 6-fold lower, 7-fold lower, 8-fold lower, 9-fold lower, 10-fold lower, or more) than in a healthy mammal.
- a mammal having cancer can have a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) that is, on average, about 0.19 to about 0.30 (e.g., about 0.25) lower than a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) in a healthy mammal.
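One way to picture the correlation of fragment ratios is as a Pearson correlation between a sample's per-window ratio profile and a reference profile. The sketch below assumes that reading; the text does not state which correlation measure is used, and the function name and inputs are hypothetical. The reference profile could, for example, be built from one or more healthy individuals.

```python
import math


def pearson_correlation(sample_ratios, reference_ratios):
    """Pearson correlation between two equally long per-window ratio profiles."""
    if len(sample_ratios) != len(reference_ratios) or len(sample_ratios) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(sample_ratios)
    mean_s = sum(sample_ratios) / n
    mean_r = sum(reference_ratios) / n
    cov = sum((s - mean_s) * (r - mean_r)
              for s, r in zip(sample_ratios, reference_ratios))
    var_s = sum((s - mean_s) ** 2 for s in sample_ratios)
    var_r = sum((r - mean_r) ** 2 for r in reference_ratios)
    if var_s == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_s * var_r)
```

A profile that tracks the reference closely gives a value near 1, while a profile that departs from the reference gives a lower value, which is the direction of change the text associates with cancer.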
- a cfDNA fragmentation profile can include coverage of all fragments.
- Coverage of all fragments can include windows (e.g., non-overlapping windows) of coverage.
- coverage of all fragments can include windows of small fragments (e.g., fragments from about 100 bp to about 150 bp in length).
- coverage of all fragments can include windows of large fragments (e.g., fragments from about 151 bp to about 220 bp in length).
- a cfDNA fragmentation profile can be obtained using any appropriate method.
- cfDNA from a mammal (e.g., a mammal having, or suspected of having, cancer) can be processed into sequencing libraries which can be subjected to whole genome sequencing (e.g., low-coverage whole genome sequencing), mapped to the genome, and analyzed to determine cfDNA fragment lengths.
- Mapped sequences can be analyzed in non-overlapping windows covering the genome. Windows can be any appropriate size. For example, windows can be from thousands to millions of bases in length. As one non-limiting example, a window can be about 5 megabases (Mb) long. Any appropriate number of windows can be mapped.
- a cfDNA fragmentation profile can be determined within each window.
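The windowed analysis can be sketched as follows, assuming fragments have already been mapped and are available as `(chromosome, start, length)` tuples. The 5 Mb window size is the non-limiting example given in the text; assigning each fragment to a window by its start coordinate, and the function and key names, are assumptions made for this illustration.

```python
from collections import defaultdict
from statistics import median

WINDOW_SIZE = 5_000_000  # 5 Mb, the non-limiting example from the text


def window_profiles(fragments, window_size=WINDOW_SIZE):
    """Summarize mapped fragments in non-overlapping genomic windows.

    `fragments` is an iterable of (chromosome, start, length) tuples.
    Returns a dict keyed by (chromosome, window_index) holding the fragment
    count ("coverage") and the median fragment length ("median_size").
    """
    lengths = defaultdict(list)
    for chrom, start, length in fragments:
        # Integer division places each fragment in exactly one window.
        lengths[(chrom, start // window_size)].append(length)
    return {
        key: {"coverage": len(values), "median_size": median(values)}
        for key, values in lengths.items()
    }
```

Each window's summary can then be compared with the same window in a reference profile, or passed to a downstream classifier.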
- a cfDNA fragmentation profile can be obtained as described in Example 1.
- a cfDNA fragmentation profile can be obtained as shown in FIG. 1.
- methods and materials described herein also can include machine learning.
- machine learning can be used for identifying an altered fragmentation profile (e.g., using coverage of cfDNA fragments, fragment size of cfDNA fragments, coverage of chromosomes, and mtDNA).
- methods and materials described herein can be the sole method used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer).
- determining a cfDNA fragmentation profile can be the sole method used to identify a mammal as having cancer.
- methods and materials described herein can be used together with one or more additional methods used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer).
- methods used to identify a mammal as having cancer include, without limitation, identifying one or more cancer-specific sequence alterations, identifying one or more chromosomal alterations (e.g., aneuploidies and rearrangements), and identifying other cfDNA alterations.
- determining a cfDNA fragmentation profile can be used together with identifying one or more cancer-specific mutations in a mammal's genome to identify a mammal as having cancer.
- determining a cfDNA fragmentation profile can be used together with identifying one or more aneuploidies in a mammal's genome to identify a mammal as having cancer.
- this document also provides methods and materials for assessing, monitoring, and/or treating mammals (e.g., humans) having, or suspected of having, cancer.
- this document provides methods and materials for identifying a mammal as having cancer. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal.
- this document provides methods and materials for identifying the location (e.g., the anatomic site or tissue of origin) of a cancer in a mammal.
- a sample obtained from a mammal can be assessed to determine the tissue of origin of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal.
- this document provides methods and materials for identifying a mammal as having cancer and administering one or more cancer treatments to the mammal to treat the mammal.
- a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal, and one or more cancer treatments can be administered to the mammal.
- this document provides methods and materials for treating a mammal having cancer.
- one or more cancer treatments can be administered to a mammal identified as having cancer (e.g., based, at least in part, on the cfDNA fragmentation profile of the mammal) to treat the mammal.
- a mammal can undergo monitoring (or be selected for increased monitoring) and/or further diagnostic testing.
- monitoring can include assessing mammals having, or suspected of having, cancer by, for example, assessing a sample (e.g., a blood sample) obtained from the mammal to determine the cfDNA fragmentation profile of the mammal as described herein, and changes in the cfDNA fragmentation profiles over time can be used to identify response to treatment and/or identify the mammal as having cancer (e.g., a residual cancer).
- a mammal can be a mammal having cancer.
- a mammal can be a mammal suspected of having cancer.
- mammals that can be assessed, monitored, and/or treated as described herein include, without limitation, humans, primates such as monkeys, dogs, cats, horses, cows, pigs, sheep, mice, and rats.
- a human having, or suspected of having, cancer can be assessed to determine a cfDNA fragmentation profile as described herein and, optionally, can be treated with one or more cancer treatments as described herein.
- a sample can include DNA (e.g., genomic DNA).
- a sample can include cfDNA (e.g., circulating tumor DNA (ctDNA)).
- a sample can be a fluid sample (e.g., a liquid biopsy).
- samples that can contain DNA and/or polypeptides include, without limitation, blood (e.g., whole blood, serum, or plasma), amnion, tissue, urine, cerebrospinal fluid, saliva, sputum, broncho-alveolar lavage, bile, lymphatic fluid, cyst fluid, stool, ascites, pap smears, breast milk, and exhaled breath condensate.
- for example, a plasma sample can be assessed to determine a cfDNA fragmentation profile as described herein.
- a sample from a mammal to be assessed as described herein can include any appropriate amount of cfDNA.
- a sample can include a limited amount of DNA.
- a cfDNA fragmentation profile can be obtained from a sample that includes less DNA than is typically required for other cfDNA analysis methods (such as those described in, for example, Phallen et al., 2017 Sci Transl Med 9; Cohen et al., 2018 Science 359:926; Newman et al., 2014 Nat Med 20:548; and Newman et al., 2016 Nat Biotechnol 34:547).
- a sample can be processed (e.g., to isolate and/or purify DNA and/or polypeptides from the sample).
- DNA isolation and/or purification can include cell lysis (e.g., using detergents and/or surfactants), protein removal (e.g., using a protease), and/or RNA removal (e.g., using an RNase).
- polypeptide isolation and/or purification can include cell lysis (e.g., using detergents and/or surfactants), DNA removal (e.g., using a DNase), and/or RNA removal (e.g., using an RNase).
- Figure 20 illustrates an example computer 800 that may be used to implement the methods described herein.
- the computer 800 may include a machine learning system that trains a machine learning model to subtype small cell lung cancer or non-small cell lung cancer as described above or a portion or combination thereof in some embodiments.
- the computer 800 may be any electronic device that runs software applications derived from compiled instructions, including without limitation personal computers, servers, smart phones, media players, electronic tablets, game consoles, email devices, etc.
- the computer 800 may include one or more processors 802, one or more input devices 804, one or more display devices 806, one or more network interfaces 808, and one or more computer-readable media 812. Each of these components may be coupled by bus 810, and in some embodiments, these components may be distributed among multiple physical locations and coupled by a network.
- Display device 806 may be any known display technology, including but not limited to display devices using Liquid Crystal Display (LCD) or Light Emitting Diode (LED) technology.
- Processor(s) 802 may use any known processor technology, including but not limited to graphics processors and multi-core processors.
- Input device 804 may be any known input device technology, including but not limited to a keyboard (including a virtual keyboard), mouse, track ball, camera, and touch-sensitive pad or display.
- Bus 810 may be any known internal or external bus technology, including but not limited to ISA, EISA, PCI, PCI Express, USB, Serial ATA or FireWire.
- Computer-readable medium 812 may be any non-transitory medium that participates in providing instructions to processor(s) 802 for execution, including without limitation, non-volatile storage media (e.g., optical disks, magnetic disks, flash drives, etc.), or volatile media (e.g., SDRAM, ROM, etc.).
- Computer-readable medium 812 may include various instructions 814 for implementing an operating system (e.g., Mac OS®, Windows®, Linux).
- the operating system may be multi-user, multiprocessing, multitasking, multithreading, real-time, and the like.
- the operating system may perform basic tasks, including but not limited to: recognizing input from input device 804; sending output to display device 806; keeping track of files and directories on computer-readable medium 812; controlling peripheral devices (e.g., disk drives, printers, etc.) which can be controlled directly or through an I/O controller; and managing traffic on bus 810.
- Network communications instructions 816 may establish and maintain network connections (e.g., software for implementing communication protocols, such as TCP/IP, HTTP, Ethernet, telephony, etc.).
- Machine learning instructions 818 may include instructions that enable computer 800 to function as a machine learning system and/or to train machine learning models to generate DMS values as described herein.
- Application(s) 820 may be an application that uses or implements the processes described herein and/or other processes. The processes may also be implemented in operating system 814. For example, application 820 and/or operating system may create tasks in applications as described herein.
- the described features may be implemented in one or more computer programs that may be executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
- a computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result.
- a computer program may be written in any form of programming language (e.g., Objective-C, Java), including compiled or interpreted languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Suitable processors for the execution of a program of instructions may include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors or cores, of any kind of computer.
- a processor may receive instructions and data from a read-only memory or a random-access memory or both.
- the essential elements of a computer may include a processor for executing instructions and one or more memories for storing instructions and data.
- a computer may also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks.
- Storage devices suitable for tangibly embodying computer program instructions and data may include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- the processor and the memory may be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- the features may be implemented on a computer having a display device such as an LED or LCD monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer.
- the features may be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination thereof.
- the components of the system may be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include, e.g., a telephone network, a LAN, a WAN, and the computers and networks forming the Internet.
- the computer system may include clients and servers.
- a client and server may generally be remote from each other and may typically interact through a network.
- the relationship of client and server may arise by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- An API may define one or more parameters that are passed between a calling application and other software code (e.g., an operating system, library routine, function) that provides a service, that provides data, or that performs an operation or a computation.
- the API may be implemented as one or more calls in program code that send or receive one or more parameters through a parameter list or other structure based on a call convention defined in an API specification document.
- a parameter may be a constant, a key, a data structure, an object, an object class, a variable, a data type, a pointer, an array, a list, or another call.
- API calls and parameters may be implemented in any programming language.
- the programming language may define the vocabulary and calling convention that a programmer will employ to access functions supporting the API.
- an API call may report to an application the capabilities of a device running the application, such as input capability, output capability, processing capability, power capability, communications capability, etc.
- the presently described methods and systems are useful for subtyping non-small cell lung cancer or small cell lung cancer in a subject and optionally treating the cancer subtype in the subject.
- Any appropriate subject, such as a mammal, can be assessed and/or treated as described herein.
- Examples of some mammals that can be assessed and/or treated as described herein include, without limitation, humans, primates such as monkeys, dogs, cats, horses, cows, pigs, sheep, mice, and rats.
- a human having, or suspected of having, cancer can be assessed using a method described herein and, optionally, can be treated with one or more cancer treatments as described herein.
- the methods disclosed herein may include administering to the subject identified as having the type of cancer, a therapeutic agent suitable for the treatment of the type of cancer.
- the subject can be administered one or more cancer treatments.
- a cancer treatment can be any appropriate cancer treatment.
- One or more cancer treatments described herein can be administered to a subject at any appropriate frequency (e.g., once or multiple times over a
- a cancer treatment can reduce the severity of the cancer, reduce a symptom of the cancer, and/or to reduce the number
- a cancer treatment can be a chemotherapeutic agent.
- chemotherapeutic agents include: amsacrine, azacitidine, azathioprine, bevacizumab (or an antigen-binding fragment thereof), bleomycin, busulfan, carboplatin, capecitabine, chlorambucil, cisplatin, cyclophosphamide, cytarabine, dacarbazine, daunorubicin, docetaxel, doxifluridine, doxorubicin, epirubicin, erlotinib hydrochloride, etoposide, floxuridine, fludarabine, fluorouracil, gemcitabine, hydroxyurea, idarubicin, ifosfamide, irinotecan, lomustine, mechlorethamine, melphalan, mercaptopurine, methotr
- DNA is present in a biological sample taken from a subject and used in the methodology of the invention.
- the biological sample can be virtually any type of biological sample that includes DNA.
- the biological sample is typically a fluid, such as whole blood or a portion thereof with circulating cfDNA.
- the sample includes DNA from a tumor or a liquid biopsy, such as, but not limited to, amniotic fluid, aqueous humor, vitreous humor, blood, whole blood, fractionated blood, plasma, serum, breast milk, cerebrospinal fluid (CSF), cerumen (earwax), chyle, chyme, endolymph, perilymph, feces, breath, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm),
- the sample includes DNA from a circulating tumor cell.
- the biological sample can be a blood sample.
- the blood sample can be obtained using methods known in the art, such as finger prick or phlebotomy.
- the blood sample is approximately 0.1 to 20 ml, or alternatively approximately 1 to 15 ml with the volume of blood being approximately 10 ml. Smaller amounts may also be used, as well as circulating free DNA in blood.
- Microsampling and sampling by needle biopsy, catheter, excretion or production of bodily fluids containing DNA are also potential biological sample sources.
- the methods and systems of the disclosure utilize nucleic acid sequence information and can therefore include any method or sequencing device for performing nucleic acid sequencing, including nucleic acid amplification, polymerase chain reaction (PCR), nanopore sequencing, 454 sequencing, and insertion tagged sequencing.
- the methodology or systems of the disclosure utilize systems such as those provided by Illumina, Inc. (including but not limited to HiSeq™ X10, HiSeq™ 1000, HiSeq™ 2000, HiSeq™ 2500, Genome Analyzers™, MiSeq™, NextSeq, and NovaSeq 6000 systems), Applied Biosystems Life Technologies (SOLiD™ System, Ion PGM™ Sequencer, Ion Proton™ Sequencer), Genapsys, BGI MGI, and other systems. Nucleic acid analysis can also be carried out by systems provided by Oxford Nanopore Technologies (GridION™, MinION™) or Pacific Biosciences (PacBio™ RS II or Sequel I or II).
- the present invention includes systems for performing steps of the disclosed methods and is described partly in terms of functional components and various processing steps.
- Such functional components and processing steps may be realized by any number of components, operations and techniques configured to perform the specified functions and achieve the various results.
- the present invention may employ various biological samples, biomarkers, elements, materials, computers, data sources, storage systems and media, information gathering techniques and processes, data processing criteria, statistical analyses, regression analyses and the like, which may carry out a variety of functions.
- the invention further provides a non-invasive system for subtyping small cell lung cancer or non-small cell lung cancer.
- the system includes: (a) a sequencer configured to generate a low-coverage whole genome sequencing data set for a sample; and (b) a computer system and/or processor with functionality to perform a method of the invention.
- the computer system further includes one or more additional modules.
- the system may include one or more of an extraction and/or isolation unit operable to select suitable genetic components for analysis, e.g., cfDNA fragments of a particular size.
- the computer system further includes a visual display device.
- the visual display device may be operable to display a curve fit line, a reference curve fit line, and/or a comparison of both.
- Methods for the non-invasive subtyping of small cell lung cancer or non-small cell lung cancer may be implemented in any suitable manner, for example using a computer program operating on the computer system.
- an exemplary system may be implemented in conjunction with a computer system, for example a conventional computer system comprising a processor and a random access memory, such as a remotely-accessible application server, network server, personal computer or workstation.
- the computer system also suitably includes additional memory devices or information storage systems, such as a mass storage system and a user interface, for example a conventional monitor, keyboard and tracking device.
- the computer system may, however, include any suitable computer system and associated equipment and may be configured in any suitable manner.
- the computer system comprises a stand-alone system.
- the computer system is part of a network of computers including a server and a database.
- the software required for receiving, processing, and analyzing information may be implemented in a single device or implemented in a plurality of devices.
- the software may be accessible via a network such that storage and processing of information takes place remotely with respect to users.
- the system according to various aspects of the present invention and its various elements provide functions and operations to facilitate detection and/or analysis, such as data gathering, processing, analysis, reporting and/or diagnosis.
- the computer system executes the computer program, which may receive, store, search, analyze, and report information relating to the human genome or region thereof.
- the computer program may comprise multiple modules performing various functions or operations, such as a processing module for processing raw data and generating supplemental data and an analysis module for analyzing raw data and supplemental data to generate quantitative assessments of a disease status model and/or diagnosis information.
- the procedures performed by the system may comprise any suitable processes to facilitate analysis and/or subtyping of small cell lung cancer or non-small cell lung cancer.
- the system is configured to establish a disease subtype model and/or determine disease subtype in a patient. Determining or identifying disease subtype may include generating any useful information regarding the condition of the patient relative to the disease, such as performing a diagnosis, providing information helpful to a diagnosis, assessing the stage or progress of a disease, identifying a condition that may indicate a susceptibility to the disease, identifying whether further tests may be recommended, predicting and/or assessing the efficacy of one or more treatment programs, or otherwise assessing the disease status, likelihood of disease, or other health aspect of the patient.
- SCLC: Small Cell Lung Cancer
- Patients with SCLC have a 5-year survival rate of less than 8% (Gay, C. M. et al. Patterns of transcription factor programs and immune pathway activation define four major subtypes of SCLC with distinct therapeutic vulnerabilities. Cancer Cell 39, 346-360.e7 (2021)).
- SCLC is a heterogeneous tumor type consisting of tumor cells with neuroendocrine and non-neuroendocrine features. SCLC has two major subtypes: High-grade Neuroendocrine (High-NE) versus Low-grade Neuroendocrine (Low-NE) (Gay et al.).
- High-NE subtypes are characterized by the activation of lineage-specific transcription factors: ASCL1 and NEUROD1.
- Low-NE is characterized by non-neuroendocrine factors such as POU2F3 (SCLC-P) or the high presence of inflammatory T-cells (SCLC-I).
- SCLC is treated as a single entity with predictably poor results.
- SCLC-I: inflamed group
- DELFI-based cell-free fragmentomes accurately differentiate High-NE vs Low-NE SCLC subtypes in a non-invasive manner.
- the final goal of DELFI SCLC subtyping is to guide optimal treatment selection for each patient diagnosed with advanced SCLC.
- RESULTS
- cfDNA: circulating cell-free DNA
- DELFI calculates the fragment distribution for each TFBS within a 2 kb window and scales it to the range 0 to 1 independently for each sample.
- a principal component analysis was then performed in R, and clusters were defined based on ASCL1 TFBS sites (Mathios, D. et al. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat Commun 12, 5060 (2021)). Clinical information was examined in orthogonal analyses.
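The per-sample scaling and the principal component step described above can be sketched as follows. This is a minimal illustration in Python (the text states the original analysis was run in R), not the DELFI implementation: `min_max_scale` and `first_principal_component` are hypothetical helper names, the coverage values are made-up toy numbers, and plain power iteration stands in for whatever PCA routine was actually used.

```python
import math


def min_max_scale(profile):
    """Scale one sample's coverage profile to [0, 1], independently of other samples."""
    lo, hi = min(profile), max(profile)
    if hi == lo:  # flat profile: nothing to scale
        return [0.0] * len(profile)
    return [(v - lo) / (hi - lo) for v in profile]


def first_principal_component(rows, n_iter=500):
    """Return (unit vector, per-sample scores) for PC1 via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d); requires at least two samples.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Deterministic, non-uniform start vector (an arbitrary choice).
    vec = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance in the data
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores


# Toy aggregate coverage across a window centered on a binding site (made-up values).
toy_samples = {
    "sample_a": [10, 8, 2, 8, 10],
    "sample_b": [20, 17, 5, 16, 20],
    "sample_c": [9, 10, 11, 10, 9],
    "sample_d": [12, 13, 14, 13, 12],
}
scaled = [min_max_scale(profile) for profile in toy_samples.values()]
component, pc1_scores = first_principal_component(scaled)
for name, score in zip(toy_samples, pc1_scores):
    print(name, round(score, 3))
```

Because each profile is scaled on its own, samples sequenced to different depths become comparable in shape before clustering; the sign of a principal component is arbitrary, so only the relative positions of the scores are meaningful.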
- Genome-wide cfDNA fragmentation analyses at ASCL1 binding sites (~12,000 genomic coordinates) in the SCLC patients reveal a decrease in coverage near transcription factor binding sites of SCLC patients compared to noncancer and NSCLC samples. The difference at the center of the ASCL1 binding sites was also the main driver of the differentiation among pre-treatment SCLC samples. Investigation of the fragment distribution at ASCL1 binding sites was capable of differentiating High-NE vs Low-NE samples.
- Table 1 Patient characteristics and associated clinical information
- SCLC is an aggressive malignancy with a poor prognosis.
- although SCLCs are clinically managed as a single cancer type, new evidence supports that the subtypes (high- vs low-neuroendocrine) of SCLC acquire diverse transcriptional and epigenetic states.
- SCLC subtypes respond to specific treatments, such as immunotherapy in the case of the low-neuroendocrine, highly inflamed SCLC subtype.
- the novel non-invasive DELFI-based approach described herein distinguishes between high- and low-neuroendocrine SCLC subtypes by investigating cell-free fragments that reflect the unique epigenetic states of SCLCs. This novel method is capable of identifying the patients that will likely respond to immunotherapy, such as immune checkpoint blockade.
- SCLC subtypes can be predicted from the DELFI-based approach using cfDNA LC-WGS data.
- the DELFI machine learning classifier detects the presence of cancer in patients with SCLC. Both genome-wide fragmentation profiles and the corresponding DELFI scores were investigated to identify preliminary differences between the two main clinical subtypes of SCLC cases: high-neuroendocrine and low-neuroendocrine. Publicly available data were later used to run a principal component analysis to cluster the two subtypes. Second, using publicly available data, it was determined that these clusters of SCLC samples exhibit a decrease in aggregate fragment coverage at ASCL1 transcription factor binding sites, classifying these cases as high-neuroendocrine SCLCs.
- ROC: receiver operating characteristic
- Figure 2 illustrates that genome-wide fragmentation profiles were significantly consistent among pre-treatment, post-treatment, progression, and response time points, suggesting that analysis of genome-wide circulating tumor DNA fragment sizes is a powerful method to detect changes while monitoring immunotherapy treatment of SCLC.
- Figure 3 illustrates genome-wide fragmentation profiles (with the corresponding DELFI scores below) showing partial differences between the two main subtypes of SCLC cases, High Neuroendocrine and Low Neuroendocrine, suggesting that analysis of genome-wide circulating tumor DNA fragment sizes may be a powerful method for subtyping while monitoring immunotherapy treatment of SCLC.
- Figure 4 illustrates differential genes between High and Low neuroendocrine SCLC cases, using publicly available data (Lissa et al. 2022) and a Principal Component Analysis (on DELFI 30X plasma WGS data) to cluster the two subtypes.
- the PCA revealed distinguishable clusters between SCLC High-NE and Low-NE pre-treatment samples.
- Figure 5 illustrates an Eigencor analysis revealing a significant correlation of Principal Components 1 and 2 with the NE status of the samples.
- Figure 6 A-B illustrates the potential of the DELFI assay to subtype SCLC using TFBS. Genome-wide cfDNA fragmentation analyses were performed at ASCL1 binding sites in the pre-treatment NCI samples.
- Figure 6A illustrates that distinct clusters of SCLC samples were observed corresponding to different levels of ASCL1 activation. Importantly, the two clusters corresponded perfectly to the different SCLC subtypes. The main driver of differentiation between samples was in fact the difference in fragment coverage at the center of the ASCL1 binding sites.
- Figure 6B illustrates that other clinical or sample characteristics have a negligible impact on the clustering of these samples. Overall, these data show how DELFI analysis can perfectly distinguish between SCLC subtypes using fragment coverage at TFBS.
- Figure 7 illustrates a patient example supporting the hypothesis that the detected signal comes from tumor infiltrating lymphocytes.
- Patient NCI-0422 diagnosed with SCLC with inflammation subtype was treated with Durvalumab plus Olaparib. NCI-0422 responded well to treatment but was later diagnosed with progressive disease. Variable genomic binding is detectable at pre-treatment and at progression timepoints. Peaks are centered at a mix of TFBS and TSS.
- Figure 8 A-B illustrates ASCL1 binding sites in responders vs non-responders.
- Figure 8A shows 500 cell type specific bins from the PMD data to determine fraction of white blood cells.
- Figure 8B shows lymphocyte tissue specific bins.
- Figure 9 illustrates a machine learning model to predict high vs low neuroendocrine SCLC.
- Subtyping is performed by calculating the log2(Read Depth Ratio) across 20 bp windows starting 2 kb from ASCL1 TFBS sites.
- Read Depth Ratio is the total coverage divided by a coverage correction factor, which is calculated as the median coverage of the 5' and 3' 500 bp anchors at the ends of the 2 kb window (i.e., log2(Read Depth/Correction Factor)).
- a Generalized Additive Model is then used to smooth coverage and correct for GC bias.
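The read depth ratio calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DELFI implementation: the function name `log2_read_depth_ratio` is hypothetical, the input is assumed to be per-base aggregate coverage across one 2 kb window, the `pseudocount` argument is an added safeguard against taking the logarithm of zero that the text does not mention, and the GAM smoothing and GC correction step is omitted.

```python
import math
from statistics import median

BIN_SIZE = 20      # bp per window (from the text)
ANCHOR_SIZE = 500  # bp per flanking anchor (from the text)


def log2_read_depth_ratio(coverage, bin_size=BIN_SIZE, anchor_size=ANCHOR_SIZE,
                          pseudocount=0.5):
    """Return log2(read depth / correction factor) for each bin of the window.

    coverage: per-base coverage across the window (assumed layout, e.g. 2000 values).
    pseudocount: assumption added here to avoid log2(0); not part of the source text.
    """
    if len(coverage) % bin_size != 0 or len(coverage) < 2 * anchor_size:
        raise ValueError("window must be a multiple of bin_size and hold both anchors")
    # Mean coverage in consecutive, non-overlapping bins.
    bins = [sum(coverage[i:i + bin_size]) / bin_size
            for i in range(0, len(coverage), bin_size)]
    # Correction factor: median coverage over the 5' and 3' anchor bins.
    n_anchor_bins = anchor_size // bin_size
    correction = median(bins[:n_anchor_bins] + bins[-n_anchor_bins:])
    if correction + pseudocount <= 0:
        raise ValueError("anchor coverage is zero; cannot normalize this window")
    return [math.log2((b + pseudocount) / (correction + pseudocount)) for b in bins]


# Toy window (made-up values): flat coverage of 8 with a dip to 4 around the center.
toy_coverage = [8.0] * 2000
for position in range(900, 1100):
    toy_coverage[position] = 4.0
ratios = log2_read_depth_ratio(toy_coverage)
print(len(ratios), round(ratios[0], 3), round(ratios[50], 3))
```

Normalizing to the flanking anchors expresses the central signal relative to each sample's local background, so a value below zero at the center indicates reduced coverage at the binding site compared with the window edges.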
- neuroendocrine status is derived from the mean NE50 score per subject/treatment status; a positive value is classified as NE+.
- Subtype status is derived by aggregating the mean for each of the 4 subtypes (ASCL1, NEUROD1, POU2F3, YAP1) such that each subject/treatment status has one value for each subtype. The subtype with the highest score is then selected as the truth. Fragmentomics are calculated as previously described (Cristiano et al.). Scores are for the latest available models.
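The two labeling rules above can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the record layouts, the function names `neuroendocrine_status` and `subtype_truth`, and the `NE-` label for non-positive means are assumptions, and all scores are made-up toy values.

```python
from collections import defaultdict
from statistics import mean

SUBTYPES = ("ASCL1", "NEUROD1", "POU2F3", "YAP1")


def neuroendocrine_status(ne50_records):
    """Map (subject, treatment_status) -> 'NE+' or 'NE-' from the mean NE50 score.

    ne50_records: iterable of (subject, treatment_status, ne50_score) tuples.
    """
    grouped = defaultdict(list)
    for subject, treatment_status, score in ne50_records:
        grouped[(subject, treatment_status)].append(score)
    # A positive mean is NE+ (from the text); 'NE-' for the rest is an assumed label.
    return {key: "NE+" if mean(scores) > 0 else "NE-"
            for key, scores in grouped.items()}


def subtype_truth(subtype_records):
    """Map (subject, treatment_status) -> subtype with the highest mean score.

    subtype_records: iterable of (subject, treatment_status, subtype, score) tuples.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, treatment_status, subtype, score in subtype_records:
        if subtype not in SUBTYPES:
            raise ValueError(f"unknown subtype: {subtype}")
        grouped[(subject, treatment_status)][subtype].append(score)
    truth = {}
    for key, per_subtype in grouped.items():
        # One aggregated value per subtype, then pick the largest.
        means = {name: mean(values) for name, values in per_subtype.items()}
        truth[key] = max(means, key=means.get)
    return truth


# Toy records (made-up values).
ne50 = [("P1", "pre", 0.4), ("P1", "pre", 0.2), ("P2", "pre", -0.3)]
subtype_scores = [
    ("P1", "pre", "ASCL1", 2.0), ("P1", "pre", "ASCL1", 1.0),
    ("P1", "pre", "YAP1", 0.5),
    ("P2", "pre", "POU2F3", 1.2), ("P2", "pre", "NEUROD1", 0.1),
]
print(neuroendocrine_status(ne50))
print(subtype_truth(subtype_scores))
```

Grouping by the (subject, treatment status) pair keeps pre-treatment and later samples from the same patient as separate labels, which matches the "per subject/treatment status" wording in the text.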
- Figure 10 A-B illustrates that the method described herein is able to differentiate neuroendocrine vs non-neuroendocrine samples by calculating Read Depth Ratio at SCLC-specific genomic coordinates (Figure 10A).
- Figure 10B shows that the method described herein appears to better subtype SCLC cases.
- Figure 11 illustrates that Read Depth Ratios at SCLC-specific genomic coordinates are transformed into a 2-dimensional PCA to separate samples into neuroendocrine vs non-neuroendocrine groups. Most samples are clearly divided into different clusters.
- Figure 12 illustrates PCA derived from Read Depth Ratio at SCLC specific genomic coordinates is able to separate pre-treatment samples into specific SCLC subtypes (A,N,P,Y). The ground truth of the SCLC subtypes was calculated using tissue RNA-seq gene expression differential analysis.
- the fragmentomics platform described herein detects small cell lung cancer (SCLC) with high sensitivity.
- the DELFI SCLC Subtyping Assay is capable of subtyping SCLC samples without the need for clinical knowledge.
- the DELFI SCLC Subtyping Assay identified four groups of SCLC samples belonging to different levels of ASCL1 activation. Samples with high activation of ASCL1 were confirmed to belong to Neuroendocrine subtypes (SCLC-A).
- cfDNA Fragmentomics Background: Traditionally, monitoring is done with imaging, but many individuals do not have ready access to hospitals with appropriate imaging equipment and expertise and need to travel significant distances for such procedures, making continual monitoring problematic. With cfDNA fragmentomics, there is no a priori need to know where a tumor is located, and no need to know beforehand the somatic mutations a tumor harbors. Liquid biopsies are inexpensive compared to imaging, and their cost should improve further as sequencing costs continue to decline.
- cfDNA mutations have limited signal and can be confounded by clonal hematopoiesis. DELFI captures many features of the cfDNA universe. cfDNA fragmentation patterns are determined by basal chromatin organization. cfDNA fragmentation profiles are highly consistent in healthy people and altered in patients with cancer.
- DELFI-TF Model Overview and Application to Monitoring NSCLC Patients during Therapy: cfDNA aliquots from plasma samples of CRC patients were analyzed using ddPCR for RAS mutation status as well as low-pass WGS. WGS data were aligned and fragment size distributions were obtained for 504 5 Mb bins across the genome. A Bayesian regression model was trained and cross-validated against RAS MT samples using fragmentomics features. Figure 20 depicts proof-of-concept DELFI-TF model development.
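The binning step mentioned above (fragment size distributions per 5 Mb bin) can be sketched as follows. This is a rough illustration only: the function names are hypothetical, fragments are assumed to arrive as already-aligned (chromosome, start, end) tuples, assigning a fragment to a bin by its midpoint is an assumption, and the selection of the 504 bins and the Bayesian regression itself are not shown because the text does not specify them.

```python
from collections import Counter, defaultdict

BIN_SIZE = 5_000_000  # 5 Mb bins (from the text)


def fragment_size_distributions(fragments, bin_size=BIN_SIZE):
    """Return {(chrom, bin_index): Counter(fragment_length -> count)}.

    fragments: iterable of (chrom, start, end) with 0-based, end-exclusive coordinates.
    Each fragment is assigned to the bin containing its midpoint (an assumption).
    """
    distributions = defaultdict(Counter)
    for chrom, start, end in fragments:
        length = end - start
        if length <= 0:
            raise ValueError("fragment end must be greater than start")
        midpoint = start + length // 2
        distributions[(chrom, midpoint // bin_size)][length] += 1
    return distributions


def mean_fragment_length(distribution):
    """Summarize one bin's size distribution by its mean fragment length."""
    total = sum(distribution.values())
    return sum(length * count for length, count in distribution.items()) / total


# Toy fragments (made-up coordinates).
toy_fragments = [
    ("chr1", 100, 267),
    ("chr1", 200, 367),
    ("chr1", 5_000_100, 5_000_250),
]
per_bin = fragment_size_distributions(toy_fragments)
for key in sorted(per_bin):
    print(key, dict(per_bin[key]), mean_fragment_length(per_bin[key]))
```

Per-bin summaries such as these are one plausible form for the "fragmentomics features" passed to a regression model; the actual feature definitions used for DELFI-TF are not given in this section.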
- a CRC-trained DELFI-TF model was applied to a real-world NSCLC cohort.
- Figure 13 depicts how fragmentation profiles inform on lung cancer status and treatment response patterns.
- Figure 14 illustrates that model-derived DELFI-TF exhibits a strong correlation with Mutant Allele Frequency across samples.
- Figure 15 illustrates that DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation.
- Figure 16 illustrates that tumor derived cfDNA has altered fragmentation.
- Figure 17 A-C illustrates that DELFI-TF accurately detects circulating tumor fraction without being confounded by clonal hematopoiesis.
- Figure 18 depicts that cfDNA fragmentation patterns accurately differentiate NSCLC subtypes.
- cfDNA fragmentation scores are highly correlated with known mutation allele frequencies. cfDNA fragmentation predicts RECIST status. cfDNA fragmentation is not confounded by clonal hematopoiesis. cfDNA fragmentation features can noninvasively distinguish histologic subtypes of lung cancers.
Abstract
The present disclosure provides methods and uses thereof for improved diagnostic applications using genome-wide patterns of fragmented cell-free DNA (cfDNA) from plasma, derived by low-coverage whole-genome sequencing. In particular, the present invention provides new and effective methods for subtyping non-small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine non-small cell lung cancer.
Description
DELFI-DERIVED CELL-FREE DNA FRAGMENTATION PATTERNS DIFFERENTIATE HISTOLOGIC SUBTYPES OF LUNG CANCERS IN A NON-INVASIVE MANNER
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Serial No. 63/445,284 filed on February 13, 2023, U.S. Provisional Patent Application Serial No. 63/470,101 filed on May 31, 2023, and U.S. Provisional Patent Application Serial No. 63/528,237 filed on July 21, 2023. The disclosures of the prior applications are considered part of and are herein incorporated by reference in the disclosure of this application in their entirety.
FIELD OF THE INVENTION
[0002] The present invention relates generally to the non-invasive diagnosis and subtyping of small cell lung cancer (SCLC) and more specifically to the analysis of genome-wide patterns of fragmented cell-free DNA (cfDNA) in conjunction with clinical and demographic features of individual patients. The present invention also relates to the use of cell-free DNA fragmentation profiling as a method for tumor fraction assessment and treatment monitoring in non-small cell lung cancer (NSCLC). Given the logistical difficulties of performing SCLC and NSCLC tumor biopsies, new approaches are needed that provide viable and quick methods of subtyping SCLC and NSCLC in a non-invasive manner.
BACKGROUND
[0003] Small cell lung cancer is an aggressive malignancy with a poor prognosis. Although small cell lung cancers are clinically managed as a single cancer type, new evidence supports that subtypes of small cell lung cancer (high-neuroendocrine vs. low-neuroendocrine) acquire diverse transcriptional and epigenetic states. Furthermore, distinct small cell lung cancer subtypes respond to specific treatments, such as immunotherapy in the case of the low-neuroendocrine, highly inflamed small cell lung cancer subtype. Non-small cell lung cancer is any type of epithelial lung cancer other than small cell lung cancer. The most common types of non-small cell lung cancer are squamous cell carcinoma, large cell carcinoma, and adenocarcinoma, but there are several other types that occur less frequently, and all types can occur in unusual histological variants. As a class, non-small cell lung cancer is usually less sensitive to chemotherapy and radiation therapy than small cell lung cancer. Patients with resectable disease may be cured by surgery or surgery followed by chemotherapy, as well as chemotherapy followed by surgery. Local control can be achieved with radiation therapy in many patients with unresectable disease, but cure is seen in relatively few patients. Patients with locally advanced unresectable disease may achieve long-term survival with radiation therapy combined with chemotherapy. Patients with advanced metastatic disease may achieve improved survival and palliation of symptoms with chemotherapy, targeted agents, and other supportive measures. Novel, non-invasive methods for identifying and subtyping non-small cell lung cancer and small cell lung cancer are needed to improve patient outcomes.
PATENT
ATTORNEY DOCKET NO. DELFI2150-3WO
SUMMARY OF THE INVENTION
[0004] The present invention is based on the seminal discovery that characterizing genome-wide patterns of fragmentation of cell-free DNA (cfDNA) in plasma using low-coverage whole-genome sequencing can improve cancer diagnosis when analyzed in conjunction with certain clinical and demographic features of individual patients.
[0005] In one embodiment, the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
[0006] In one embodiment, the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
ACTIVE\1607300700.2
[0007] In one embodiment, the present invention provides methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
[0008] In some aspects, the method uses machine learning to subtype small cell lung cancer in a subject.
[0009] In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
[0010] In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
[0011] In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)).
[0012] In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
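As an illustration only, the log2 (Read Depth Ratio) calculation described above can be sketched in Python. The sketch assumes that each binding-site region spans 2kb on either side of the site center, that per-window read depths have already been counted, and that the anchors are the outermost 500bp at each end of that region. The function names and the simulated depths are hypothetical and are not taken from the disclosure.

```python
import math
from statistics import median

WINDOW_BP = 20    # window size stated in the text
FLANK_BP = 2000   # assumption: region spans 2 kb on each side of the site center
ANCHOR_BP = 500   # anchor length stated in the text


def log2_read_depth_ratio(window_depths):
    """Return log2((depth + 1) / (median(anchors) + 1)) for each 20 bp window.

    window_depths: read depth per window across one binding-site region,
    ordered 5' to 3'. The first and last 500 bp are used as anchors.
    """
    n_anchor = ANCHOR_BP // WINDOW_BP  # 25 windows per anchor
    if len(window_depths) <= 2 * n_anchor:
        raise ValueError("region too short to hold both anchors")
    anchors = list(window_depths[:n_anchor]) + list(window_depths[-n_anchor:])
    correction = median(anchors) + 1  # shifted by 1 to avoid division by 0
    return [math.log2((depth + 1) / correction) for depth in window_depths]


def aggregate_profile(per_site_ratios):
    """Median of each window position across all binding sites."""
    return [median(column) for column in zip(*per_site_ratios)]


def simulated_site(center_depth, flank_depth=9):
    """Hypothetical test data: flat flanks with a dip in the 10 central windows."""
    n_windows = 2 * FLANK_BP // WINDOW_BP  # 200 windows
    depths = [flank_depth] * n_windows
    for i in range(n_windows // 2 - 5, n_windows // 2 + 5):
        depths[i] = center_depth
    return depths


sites = [simulated_site(center) for center in (4, 4, 1)]
profile = aggregate_profile([log2_read_depth_ratio(site) for site in sites])
center_score = profile[len(profile) // 2]
print(profile[0], center_score)
```

With this simulated input the aggregate profile is 0 in the flanks and -1 at the center, which corresponds to reduced fragment coverage at the binding sites.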
[0013] Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias.
[0014] In some aspects, the genomic intervals are non-overlapping.
[0015] In some aspects, the genomic intervals each comprise thousands to millions of base pairs.
[0016] In some aspects, a cfDNA fragmentation profile is determined within each genomic interval.
[0017] In some aspects, the cfDNA fragmentation profile comprises a median fragment size.
[0018] In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution.
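A minimal sketch of such a fragmentation profile follows, assuming that mapped fragments are available as (chromosome, start, end) tuples and that each fragment is assigned to a non-overlapping interval by its midpoint. The 5 Mb interval size, the function name, and the example fragments are illustrative assumptions rather than values from the disclosure.

```python
from collections import Counter, defaultdict
from statistics import median

BIN_BP = 5_000_000  # assumed interval size; the text says thousands to millions of bp


def fragmentation_profile(fragments, bin_bp=BIN_BP):
    """Summarize fragment sizes within non-overlapping genomic intervals.

    fragments: iterable of (chrom, start, end), 0-based half-open coordinates.
    Returns {(chrom, bin_index): {"median_size": ..., "size_counts": Counter}}.
    """
    lengths = defaultdict(list)
    for chrom, start, end in fragments:
        midpoint = (start + end) // 2  # assign each fragment to exactly one bin
        lengths[(chrom, midpoint // bin_bp)].append(end - start)
    return {
        key: {"median_size": median(sizes), "size_counts": Counter(sizes)}
        for key, sizes in lengths.items()
    }


# Hypothetical fragments: three in the first interval, one in the second.
fragments = [
    ("chr1", 1_000, 1_167),
    ("chr1", 2_000, 2_150),
    ("chr1", 9_000, 9_180),
    ("chr1", 5_000_100, 5_000_266),
]
profile = fragmentation_profile(fragments)
for key in sorted(profile):
    print(key, profile[key]["median_size"], dict(profile[key]["size_counts"]))
```

Assigning by midpoint keeps the intervals non-overlapping even when a fragment straddles an interval boundary.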
[0019] Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
[0020] In some aspects, the therapeutic agent is an immunotherapy.
[0021] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the non-small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0022] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the non-small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0023] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the non-small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0024] In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
[0025] In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject.
[0026] In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
[0027] In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
[0028] In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)).
[0029] In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
[0030] Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias.
[0031] In some aspects, the genomic intervals are non-overlapping.
[0032] In some aspects, the genomic intervals each comprise thousands to millions of base pairs.
[0033] In some aspects, a cfDNA fragmentation profile is determined within each genomic interval.
[0034] In some aspects, the cfDNA fragmentation profile comprises a median fragment size.
[0035] In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution.
[0036] Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
[0037] In some aspects, the therapeutic agent is an immunotherapy.
[0038] Some aspects further comprise quantifying a circulating tumor fraction, mirroring mutant allele fraction (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both adenocarcinoma and squamous carcinoma.
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] Figure 1 illustrates DELFI-scores of the patients diagnosed with SCLC in the study of Example 1.
[0040] Figure 2 illustrates that genome-wide fragmentation profiles were significantly consistent among pre-treatment, post-treatment, progression and response time points, suggesting that profiling genome-wide circulating tumor DNA fragment sizes is a powerful method to detect changes during monitoring of immunotherapy treatment of SCLC.
[0041] Figure 3 illustrates genome-wide fragmentation profiles (and corresponding DELFI-scores below) showing partial differences between the two main subtypes of SCLC cases, High Neuroendocrine and Low Neuroendocrine, suggesting that profiling genome-wide circulating tumor DNA fragment sizes may be a powerful method for subtyping during monitoring of immunotherapy treatment of SCLC.
[0042] Figure 4 illustrates differential genes between High and Low neuroendocrine SCLC cases, using publicly available data (Lissa et al. 2022) and a principal component analysis (PCA) on DELFI 30X plasma WGS data to cluster the two subtypes. The PCA revealed distinguishable clusters between SCLC High-NE and Low-NE pre-treatment samples.
[0043] Figures 5A-5B illustrate an Eigencor analysis revealing a significant correlation of Principal Components 1 and 2 with the NE status of the samples.
[0044] Figures 6A-6B illustrate the potential of the DELFI assay to subtype SCLC using transcription factor binding sites (TFBS). Genome-wide cfDNA fragmentation analyses were performed at ASCL1 binding sites in the pre-treatment NCI samples. Figure 6A illustrates that distinct clusters of SCLC samples were observed corresponding to different levels of ASCL1 activation. Importantly, the two clusters corresponded perfectly to the different SCLC subtypes. The main driver of differentiation between samples was the difference in fragment coverage at the center of the ASCL1 binding sites. Figure 6B illustrates that other clinical or sample characteristics have a negligible impact on the clustering of these samples. Overall, these data show how DELFI analysis can perfectly distinguish between SCLC subtypes using fragment coverage at TFBS.
[0045] Figure 7 illustrates a patient example supporting the hypothesis that the detected signal comes from tumor-infiltrating lymphocytes. Patient NCI-0422, diagnosed with SCLC of the inflammation subtype, was treated with Durvalumab plus Olaparib. NCI-0422 responded well to treatment but was later diagnosed with progressive disease. Variable genomic binding is detectable at pre-treatment and at progression timepoints. Peaks are centered at a mix of TFBS and TSS.
[0046] Figures 8A-8B illustrate ASCL1 binding sites in responders vs. non-responders. Figure 8A shows 500 cell type specific bins from the PMD data to determine fraction of white blood cells.
[0047] Figure 9 illustrates a machine learning model to predict high vs low neuroendocrine SCLC.
[0048] Figures 10A-10B illustrate that the method described herein is able to differentiate neuroendocrine vs. non-neuroendocrine samples by calculating the Read Depth Ratio at SCLC-specific genomic coordinates (Figure 10A). Figure 10B shows that the method described herein appears to better subtype SCLC cases.
[0049] Figure 11 illustrates that Read Depth Ratio values at SCLC-specific genomic coordinates are transformed into a 2-dimensional PCA to separate samples into neuroendocrine vs. non-neuroendocrine groups. Most of the samples are clearly divided into different clusters.
[0050] Figure 12 illustrates that a PCA derived from the Read Depth Ratio at SCLC-specific genomic coordinates is able to separate pre-treatment samples into specific SCLC subtypes (A, N, P, Y). The ground truth of the SCLC subtypes was calculated using tissue RNA-seq gene expression differential analysis.
[0051] Figure 13 depicts how fragmentation profiles inform on lung cancer status and treatment response patterns.
[0052] Figure 14 illustrates that model derived DELFI-TF exhibits a strong correlation with Mutant Allele Frequency across samples.
[0053] Figure 15 illustrates that DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation.
[0054] Figure 16 illustrates that tumor derived cfDNA has altered fragmentation.
[0055] Figures 17A-17C illustrate that DELFI-TF accurately detects circulating tumor fraction without being confounded by clonal hematopoiesis.
[0056] Figure 18 depicts that cfDNA fragmentation patterns accurately differentiate NSCLC subtypes.
[0057] Figure 19 depicts proof-of-concept DELFI-TF model development.
[0058] Figure 20 is an example computer 800 that may be used to implement the methods described herein.
DETAILED DESCRIPTION
[0059] The present invention is based on the seminal discovery that characterizing genome-wide patterns of fragmentation of cell-free DNA (cfDNA) in plasma using low-coverage whole-genome sequencing improves cancer diagnosis when analyzed in conjunction with certain clinical and demographic features of individual patients.
[0060] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
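For orientation, the genome coverage range recited above can be related to sequencing output with the standard approximation that mean coverage equals sequenced bases divided by genome size. The sketch below is illustrative only; the 3.1 Gb genome size, the read lengths, and the function names are assumptions and not values from the disclosure.

```python
import math

HUMAN_GENOME_BP = 3.1e9  # assumed approximate haploid human genome size


def mean_coverage(n_fragments, read_length_bp, paired_end=True,
                  genome_bp=HUMAN_GENOME_BP):
    """Approximate mean coverage as sequenced bases divided by genome size."""
    reads = n_fragments * (2 if paired_end else 1)
    return reads * read_length_bp / genome_bp


def fragments_needed(target_coverage, read_length_bp, paired_end=True,
                     genome_bp=HUMAN_GENOME_BP):
    """Approximate number of sequenced fragments needed for a target coverage."""
    reads_per_fragment = 2 if paired_end else 1
    bases_needed = target_coverage * genome_bp
    return math.ceil(bases_needed / (reads_per_fragment * read_length_bp))


# Hypothetical run: 31 million fragments sequenced as 2 x 100 bp reads.
print(mean_coverage(31_000_000, 100))
# Fragments needed at the two ends of the recited range, for 2 x 100 bp reads.
print(fragments_needed(0.1, 100), fragments_needed(30, 100))
```

The approximation ignores unmapped reads, duplicates, and overlap between paired reads, so realized coverage is typically somewhat lower than this estimate.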
[0061] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
[0062] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
[0063] In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
[0064] In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject.
[0065] In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
[0066] In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
[0067] In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)).
[0068] In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
[0069] Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias.
[0070] In some aspects, the genomic intervals are non-overlapping.
[0071] In some aspects, the genomic intervals each comprise thousands to millions of base pairs.
[0072] In some aspects, a cfDNA fragmentation profile is determined within each genomic interval.
[0073] In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution.
[0074] Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer.
[0075] In some aspects, the therapeutic agent is an immunotherapy.
[0076] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
[0077] analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy.
[0078] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
[0079] analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy.
[0080] Described herein are non-invasive methods for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites;
[0081] analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid division by 0. In some aspects, this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a generalized additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy.
[0082] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0083] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0084] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
[0085] In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
[0086] In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject.
[0087] In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
[0088] In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
[0089] In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0. In some aspects this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)).
[0090] In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
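The coverage calculation of paragraphs [0087] to [0090] can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes that the depths for one binding site are supplied as a list of consecutive 20bp windows, and that the first and last 25 windows (500bp each) serve as the 5' and 3' anchors. The function names and the `anchor_windows` parameter are hypothetical. The sketch returns the per-window median across sites; how that profile is reduced to a single score is not specified here and is left out.

```python
import math
from statistics import median

WINDOW_BP = 20   # window width stated in the text
ANCHOR_BP = 500  # anchor width stated in the text


def log2_read_depth_ratio(window_depths, anchor_windows=ANCHOR_BP // WINDOW_BP):
    """Return log2((window depth + 1) / (median(anchors) + 1)) per window.

    Assumption: the first and last `anchor_windows` entries of
    `window_depths` are the 5' and 3' anchors of the region.
    """
    depths = list(window_depths)
    anchors = depths[:anchor_windows] + depths[-anchor_windows:]
    correction = median(anchors) + 1  # shift by 1 avoids division by 0
    return [math.log2((d + 1) / correction) for d in depths]


def aggregate_coverage_profile(sites, anchor_windows=ANCHOR_BP // WINDOW_BP):
    """Median of the per-site log2 ratios at each window position.

    `sites` is a list of equally long depth lists, one per binding site.
    """
    per_site = [log2_read_depth_ratio(s, anchor_windows) for s in sites]
    return [median(column) for column in zip(*per_site)]
```

A dip in the central windows of the aggregated profile relative to the anchors corresponds to the decrease in coverage that the text associates with transcription factor activity.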
[0091] Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias.
[0092] In some aspects, the genomic intervals are non-overlapping.
[0093] In some aspects, the genomic intervals each comprise thousands to millions of base pairs.
[0094] In some aspects, a cfDNA fragmentation profile is determined within each genomic interval.
[0095] In some aspects, the cfDNA fragmentation profile comprises a median fragment size.
[0096] In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution.
[0097] Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
[0098] In some aspects, the therapeutic agent is an immunotherapy.
[0099] Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
[0100] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0. In some aspects this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy. Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
[0101] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 10x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0. In some aspects this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy. Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
[0102] In one embodiment, the present invention provides methods for the non-invasive subtyping of non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 9x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage score at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype. In some aspects, the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof. In some aspects, the method uses machine learning to subtype small cell lung cancer in the subject. In some aspects, subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites. In some aspects, the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). In some aspects, all coverage calculations are shifted by 1 to avoid divisions by 0. In some aspects this adjustment is expressed as f(x) = log2((window depth + 1)/(median(anchors) + 1)). In some aspects, the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score. Some aspects further comprise the use of a general additive model to smooth coverage and correct for GC bias. In some aspects, the genomic intervals are non-overlapping. In some aspects, the genomic intervals each comprise thousands to millions of base pairs. In some aspects, a cfDNA fragmentation profile is determined within each genomic interval. In some aspects, the cfDNA fragmentation profile comprises a median fragment size. In some aspects, the cfDNA fragmentation profile comprises a fragment size distribution. Some aspects further comprise administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer. In some aspects, the therapeutic agent is an immunotherapy. Some aspects further comprise quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both Adenocarcinoma and Squamous carcinoma.
[0103] Before the present compositions and methods are described, it is to be understood that this invention is not limited to the particular methods and systems described, as such methods and systems may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only in the appended claims.
[0104] As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Thus, for example, reference to “the method” includes one or more methods, and/or steps of the type described herein which will become apparent to those persons skilled in the art upon reading this disclosure and so forth.
[0105] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods and materials are now described.
[0106] Described herein are methods to subtype non-small cell lung cancer (NSCLC) in a non-invasive manner using cell-free DNA (cfDNA) fragmentomes. NSCLC often presents as different subtypes, such as adenocarcinoma and squamous cell carcinoma. This novel non-invasive DELFI-based approach distinguishes the adenocarcinoma and squamous carcinoma subtypes of NSCLC by investigating cell-free fragments that reflect the unique epigenetic states of NSCLCs. Other cfDNA-based liquid biopsies are not known to differentiate NSCLC subtypes without having access to tissue clinical data.
[0107] NSCLC subtypes can be predicted from the DELFI-based approach using cfDNA LC-WGS data. First, the DELFI machine learning classifier detects the presence of cancer in patients with NSCLC. Both genome-wide fragmentation profiles and corresponding DELFI-Tumor Fraction (TF) scores are investigated to identify preliminary differences between the two main clinical subtypes of NSCLC cases: adenocarcinoma and squamous cell carcinoma. DELFI-TF accurately detects circulating tumor fraction without being confused by clonal hematopoiesis. Furthermore, DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation in both adenocarcinoma and squamous carcinoma. Publicly available data were used to run a principal component analysis to cluster the two subtypes using whole-genome fragmentation data. Figure 19 depicts proof-of-concept DELFI-TF model development. Second, a list of differentially accessible transcription start sites (TSS) capable of differentiating adenocarcinoma from squamous carcinoma is applied. A list of differentially accessible transcription start sites can be obtained from public databases such as, but not limited to, UCSC or Ensembl. The creation of such lists is similar to what would be done for RNA-Seq experiments, except that the read depth ratio, instead of transcripts per million, is used as a proxy for expression. In some aspects, sites with the largest and most significant log fold changes from 20% of a test cohort are selected and applied to the remainder of the cohort at the same loci, followed by hierarchical clustering to determine whether subtypes remain together. In another aspect, the most highly expressed transcripts from TCGA for a given cancer type are selected, and those transcripts that are also expressed at any level in AML (a blood cancer used as an imperfect proxy for normal blood) are filtered out from the list.
Third, whether the most differential DELFI-TSSs exhibit short/long changes is investigated, further confirming that these cfDNA molecules are tumor-derived. Finally, a receiver operating characteristic (ROC) curve representing the sensitivity and specificity of the DELFI-fragmentome approach to identify NSCLC subtypes is used to assess the diagnostic accuracy of our approach between the identified cluster samples. Given the logistical difficulties of performing NSCLC tumor biopsies, we believe this approach could be a viable and quick method of subtyping NSCLC in a non-invasive manner.
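The ROC evaluation mentioned in the preceding paragraph can be illustrated with a short sketch. It assumes each sample has a numeric subtype score and a binary label (1 for one subtype, 0 for the other). The function names, the label encoding, and the threshold rule (score at or above the threshold is called positive) are hypothetical choices for illustration, not details taken from this disclosure. The area under the curve is computed with the rank-based (Mann-Whitney) identity rather than by tracing the curve.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < threshold and y == 0 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    positive sample scores higher than a randomly chosen negative one,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means the scores separate the two labelled groups perfectly, while 0.5 is what random scores would give.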
[0108] This document provides methods and materials for determining a cfDNA fragmentation profile in a mammal (e.g., in a sample obtained from a mammal). As used herein, the terms “fragmentation profile,” “position dependent differences in fragmentation patterns,” and “differences in fragment size and coverage in a position dependent manner across the genome” are equivalent and can be used interchangeably. In some cases, determining a cfDNA fragmentation profile in a mammal can be used for identifying a mammal as having cancer. For example, cfDNA fragments obtained from a mammal (e.g., from a sample obtained from a mammal) can be subjected to low coverage whole-genome sequencing, and the sequenced fragments can be mapped to the genome (e.g., in non-overlapping windows) and assessed to determine a cfDNA fragmentation profile. As described herein, a cfDNA fragmentation profile of a mammal having cancer is more heterogeneous (e.g., in fragment lengths) than a cfDNA fragmentation profile of a healthy mammal (e.g., a mammal not having cancer). As such, this document also provides methods and materials for assessing, monitoring, and/or treating mammals (e.g., humans) having, or suspected of having, cancer. In some cases, this document provides methods and materials for identifying a mammal as having cancer. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine the presence and, optionally, the tissue of origin of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal. In some cases, this document provides methods and materials for monitoring a mammal as having cancer. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine the presence of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal. In some cases, this document provides methods and materials for identifying a mammal as having cancer and administering one or more cancer treatments to the mammal to treat the mammal.
For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal, and one or more cancer treatments can be administered to the mammal.
[0109] A cfDNA fragmentation profile can include one or more cfDNA fragmentation patterns. A cfDNA fragmentation pattern can include any appropriate cfDNA fragmentation pattern. Examples of cfDNA fragmentation patterns include, without limitation, median fragment size, fragment size distribution, ratio of small cfDNA fragments to large cfDNA fragments, and the coverage of cfDNA fragments. In some cases, a cfDNA fragmentation pattern includes two or more (e.g., two, three, or four) of median fragment size, fragment size distribution, ratio of small cfDNA fragments to large cfDNA fragments, and the coverage of cfDNA fragments. In some cases, a cfDNA fragmentation profile can be a genome-wide cfDNA profile (e.g., a genome-wide cfDNA profile in windows across the genome). In some cases, a cfDNA fragmentation profile can be a targeted region profile. A targeted region can be any appropriate portion of the genome (e.g., a chromosomal region). Examples of chromosomal regions for which a cfDNA fragmentation profile can be determined as described herein include, without limitation, a portion of a chromosome (e.g., a portion of 2q, 4p, 5p, 6q, 7p, 8q, 9q, 10q, 11q, 12q, and/or 14q) and a chromosomal arm (e.g., a chromosomal arm of 8q, 13q, 11q, and/or 3p). In some cases, a cfDNA fragmentation profile can include two or more targeted region profiles.
[0110] In some cases, a cfDNA fragmentation profile can be used to identify changes (e.g., alterations) in cfDNA fragment lengths. An alteration can be a genome-wide alteration or an alteration in one or more targeted regions/loci. A target region can be any region containing one or more cancer-specific alterations. Examples of cancer-specific alterations, and their chromosomal locations, include, without limitation, those shown in Table 3 (Appendix C) and those shown in Table 6 (Appendix F). In some cases, a cfDNA fragmentation profile can be used to identify (e.g., simultaneously identify) from about 10 alterations to about 500 alterations (e.g., from about 25 to about 500, from about 50 to about 500, from about 100 to about 500, from about 200 to about 500, from about 300 to about 500, from about 10 to about 400, from about 10 to about 300, from about 10 to about 200, from about 10 to about 100, from about 10 to about 50, from about 20 to about 400, from about 30 to about 300, from about 40 to about 200, from about 50 to about 100, from about 20 to about 100, from about 25 to about 75, from about 50 to about 250, or from about 100 to about 200, alterations).
[0111] In some cases, a cfDNA fragmentation profile can be used to detect tumor-derived DNA. For example, a cfDNA fragmentation profile can be used to detect tumor-derived DNA by comparing a cfDNA fragmentation profile of a mammal having, or suspected of having, cancer to a reference cfDNA fragmentation profile (e.g., a cfDNA fragmentation profile of a healthy mammal and/or a nucleosomal DNA fragmentation profile of healthy cells from the mammal having, or suspected of having, cancer). In some cases, a reference cfDNA fragmentation profile is a previously generated profile from a healthy mammal. For example, methods provided herein can be used to determine a reference cfDNA fragmentation profile in a healthy mammal, and that reference cfDNA fragmentation profile can be stored (e.g., in a computer or other electronic storage medium) for future comparison to a test cfDNA fragmentation profile in a mammal having, or suspected of having, cancer. In some cases, a reference cfDNA fragmentation profile (e.g., a stored cfDNA fragmentation profile) of a healthy mammal is determined over the whole genome. In some cases, a reference cfDNA fragmentation profile (e.g., a stored cfDNA fragmentation profile) of a healthy mammal is determined over a subgenomic interval.
[0112] In some cases, a cfDNA fragmentation profile can be used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer).
[0113] A cfDNA fragmentation profile can include a cfDNA fragment size pattern. cfDNA fragments can be any appropriate size. For example, a cfDNA fragment can be from about 50 base pairs (bp) to about 400 bp in length. As described herein, a mammal having cancer can have a cfDNA fragment size pattern that contains a shorter median cfDNA fragment size than the median cfDNA fragment size in a healthy mammal. A healthy mammal (e.g., a mammal not having cancer) can have cfDNA fragment sizes having a median cfDNA fragment size from about 166.6 bp to about 167.2 bp (e.g., about 166.9 bp). In some cases, a mammal having cancer can have cfDNA fragment sizes that are, on average, about 1.28 bp to about 2.49 bp (e.g., about 1.88 bp) shorter than cfDNA fragment sizes in a healthy mammal. For example, a mammal having cancer can have cfDNA fragment sizes having a median cfDNA fragment size of about 164.11 bp to about 165.92 bp (e.g., about 165.02 bp).
[0114] A cfDNA fragmentation profile can include a cfDNA fragment size distribution. As described herein, a mammal having cancer can have a cfDNA size distribution that is more variable than a cfDNA fragment size distribution in a healthy mammal. In some cases, a size distribution can be within a targeted region. A healthy mammal (e.g., a mammal not having cancer) can have a targeted region cfDNA fragment size distribution of about 1 or less than about 1. In some cases, a mammal having cancer can have a targeted region cfDNA fragment size distribution that is longer (e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50 or more bp longer, or any number of base pairs between these numbers) than a targeted region cfDNA fragment size distribution in a healthy mammal. In some cases, a mammal having cancer can have a targeted region cfDNA fragment size distribution that is shorter (e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50 or more bp shorter, or any number of base pairs between these numbers) than a targeted region cfDNA fragment size distribution in a healthy mammal. In some cases, a mammal having cancer can have a targeted region cfDNA fragment size distribution that is about 47 bp smaller to about 30 bp longer than a targeted region cfDNA fragment size distribution in a healthy mammal. In some cases, a mammal having cancer can have a targeted region cfDNA fragment size distribution of, on average, a 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more bp difference in lengths of cfDNA fragments. For example, a mammal having cancer can have a targeted region cfDNA fragment size distribution of, on average, about a 13 bp difference in lengths of cfDNA fragments. In some cases, a size distribution can be a genome-wide size distribution. A healthy mammal (e.g., a mammal not having cancer) can have very similar distributions of short and long cfDNA fragments genome-wide. In some cases, a mammal having cancer can have, genome-wide, one or more alterations (e.g., increases and decreases) in cfDNA fragment sizes. The one or more alterations can be in any appropriate chromosomal region of the genome. For example, an alteration can be in a portion of a chromosome. Examples of portions of chromosomes that can contain one or more alterations in cfDNA fragment sizes include, without limitation, portions of 2q, 4p, 5p, 6q, 7p, 8q, 9q, 10q, 11q, 12q, and 14q. For example, an alteration can be across a chromosome arm (e.g., an entire chromosome arm).
[0115] A cfDNA fragmentation profile can include a ratio of small cfDNA fragments to large cfDNA fragments and a correlation of fragment ratios to reference fragment ratios. As used herein, with respect to ratios of small cfDNA fragments to large cfDNA fragments, a small cfDNA fragment can be from about 100 bp in length to about 150 bp in length. As used herein, with respect to ratios of small cfDNA fragments to large cfDNA fragments, a large cfDNA fragment can be from about 151 bp in length to 220 bp in length. As described herein, a mammal having cancer can have a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) that is lower (e.g., 2-fold lower, 3-fold lower, 4-fold lower, 5-fold lower, 6-fold lower, 7-fold lower, 8-fold lower, 9-fold lower, 10-fold lower, or more) than in a healthy mammal. A healthy mammal (e.g., a mammal not having cancer) can have a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) of about 1 (e.g., about 0.96). In some cases, a mammal having cancer can have a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) that is, on average, about 0.19 to about 0.30 (e.g., about 0.25) lower than a correlation of fragment ratios (e.g., a correlation of cfDNA fragment ratios to reference DNA fragment ratios such as DNA fragment ratios from one or more healthy mammals) in a healthy mammal.
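The ratio of small to large fragments, and its correlation with a reference, can be sketched as follows. The size ranges (100 to 150 bp and 151 to 220 bp) come from the paragraph above. The use of a Pearson correlation, the function names, and the representation of each window as a plain list of fragment lengths are assumptions made for illustration; the text does not state which correlation measure is used.

```python
from math import sqrt

SHORT_RANGE = (100, 150)  # small fragments, bp (from the text)
LONG_RANGE = (151, 220)   # large fragments, bp (from the text)


def short_long_ratio(fragment_lengths):
    """Ratio of small to large fragment counts in one window."""
    short = sum(SHORT_RANGE[0] <= n <= SHORT_RANGE[1] for n in fragment_lengths)
    long_ = sum(LONG_RANGE[0] <= n <= LONG_RANGE[1] for n in fragment_lengths)
    return short / long_ if long_ else float("nan")


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed correlation measure)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ratio_correlation(sample_windows, reference_ratios):
    """Correlate a sample's per-window ratios with reference ratios.

    `sample_windows` holds one list of fragment lengths per window;
    `reference_ratios` holds the matching ratios from healthy references.
    """
    ratios = [short_long_ratio(w) for w in sample_windows]
    return pearson(ratios, reference_ratios)
```

Under these assumptions, a sample whose per-window ratios track the reference gives a correlation near 1, and the lower values described above for cancer would show up as a drop in this number.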
[0116] A cfDNA fragmentation profile can include coverage of all fragments. Coverage of all fragments can include windows (e.g., non-overlapping windows) of coverage. In some cases, coverage of all fragments can include windows of small fragments (e.g., fragments from about 100 bp to about 150 bp in length). In some cases, coverage of all fragments can include windows of large fragments (e.g., fragments from about 151 bp to about 220 bp in length).
[0117] A cfDNA fragmentation profile can be obtained using any appropriate method. In some cases, cfDNA from a mammal (e.g., a mammal having, or suspected of having, cancer) can be processed into sequencing libraries which can be subjected to whole genome sequencing (e.g., low-coverage whole genome sequencing), mapped to the genome, and analyzed to determine cfDNA fragment lengths. Mapped sequences can be analyzed in non-overlapping windows covering the genome. Windows can be any appropriate size. For example, windows can be from thousands to millions of bases in length. As one non-limiting example, a window can be about 5 megabases (Mb) long. Any appropriate number of windows can be mapped. For example, tens to thousands of windows can be mapped in the genome. For example, hundreds to thousands of windows can be mapped in the genome. A cfDNA fragmentation profile can be determined within each window. In some cases, a cfDNA fragmentation profile can be obtained as described in Example 1. In some cases, a cfDNA fragmentation profile can be obtained as shown in FIG. 1.
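The windowing step described above can be sketched in a few lines. The sketch assumes that mapped fragments are already available as (chromosome, start position, fragment length) tuples and that a fragment is assigned to a window by its start coordinate; both are simplifying assumptions, and the function name is hypothetical. The 5 Mb window size is the non-limiting example given in the text.

```python
from collections import defaultdict
from statistics import median

WINDOW_SIZE = 5_000_000  # 5 Mb, the non-limiting example in the text


def fragmentation_profile(fragments, window_size=WINDOW_SIZE):
    """Per-window fragment count and median fragment size.

    `fragments` is an iterable of (chrom, start, length) tuples.
    Windows are non-overlapping and indexed by start // window_size.
    """
    bins = defaultdict(list)
    for chrom, start, length in fragments:
        bins[(chrom, start // window_size)].append(length)
    return {
        key: {"coverage": len(lengths), "median_size": median(lengths)}
        for key, lengths in sorted(bins.items())
    }
```

For example, fragments starting at positions 100 and 4,999,999 on the same chromosome fall in window 0, while one starting at 5,000,000 falls in window 1.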
[0118] In some cases, methods and materials described herein also can include machine learning. For example, machine learning can be used for identifying an altered fragmentation profile (e.g., using coverage of cfDNA fragments, fragment size of cfDNA fragments, coverage of chromosomes, and mtDNA).
[0119] In some cases, methods and materials described herein can be the sole method used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer). For example, determining a cfDNA fragmentation profile can be the sole method used to identify a mammal as having cancer.
[0120] In some cases, methods and materials described herein can be used together with one or more additional methods used to identify a mammal (e.g., a human) as having cancer (e.g., a colorectal cancer, a lung cancer, a breast cancer, a gastric cancer, a pancreatic cancer, a bile duct cancer, and/or an ovarian cancer). Examples of methods used to identify a mammal as having cancer include, without limitation, identifying one or more cancer-specific sequence alterations, identifying one or more chromosomal alterations (e.g., aneuploidies and rearrangements), and identifying other cfDNA alterations. For example, determining a cfDNA fragmentation profile can be used together with identifying one or more cancer-specific mutations in a mammal's genome to identify a mammal as having cancer. For example, determining a cfDNA fragmentation profile can be used together with identifying one or more aneuploidies in a mammal's genome to identify a mammal as having cancer.
[0121] In some aspects, this document also provides methods and materials for assessing, monitoring, and/or treating mammals (e.g., humans) having, or suspected of having, cancer. In some cases, this document provides methods and materials for identifying a mammal as having cancer. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal. In some cases, this document provides methods and materials for identifying the location (e.g., the anatomic site or tissue of origin) of a cancer in a mammal. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine the tissue of origin of the cancer in the mammal based, at least in part, on the cfDNA fragmentation profile of the mammal. In some cases, this document provides methods and materials for identifying a mammal as having cancer and administering one or more cancer treatments to the mammal to treat the mammal. For example, a sample (e.g., a blood sample) obtained from a mammal can be assessed to determine if the mammal has cancer based, at least in part, on the cfDNA fragmentation profile of the mammal, and administering one or more cancer treatments to the mammal. In some cases, this document provides methods and materials for treating a mammal having cancer. For example, one or more cancer treatments can be administered to a mammal identified as having cancer (e.g., based, at least in part, on the cfDNA fragmentation profile of the mammal) to treat the mammal. In some cases, during or after the course of a cancer treatment (e.g., any of the cancer treatments described herein), a mammal can undergo monitoring (or be selected for increased monitoring) and/or further diagnostic testing. 
In some cases, monitoring can include assessing mammals having, or suspected of having, cancer by, for example, assessing a sample (e.g., a blood sample) obtained from the mammal to determine the cfDNA fragmentation profile of the mammal as described herein, and changes in the cfDNA fragmentation profiles over time can be used to identify response to treatment and/or identify the mammal as having cancer (e.g., a residual cancer).
[0122] Any appropriate mammal can be assessed, monitored, and/or treated as described herein. A mammal can be a mammal having cancer. A mammal can be a mammal suspected of having cancer. Examples of mammals that can be assessed, monitored, and/or treated as described herein include, without limitation, humans, primates such as monkeys, dogs, cats, horses, cows, pigs, sheep, mice, and rats. For example, a human having, or suspected of having, cancer can be assessed to determine a cfDNA fragmentation profile as described herein and, optionally, can be treated with one or more cancer treatments as described herein.
[0123] Any appropriate sample from a mammal can be assessed as described herein (e.g., assessed for a DNA fragmentation pattern). In some cases, a sample can include DNA (e.g., genomic DNA). In some cases, a sample can include cfDNA (e.g., circulating tumor DNA (ctDNA)). In some cases, a sample can be a fluid sample (e.g., a liquid biopsy). Examples of samples that can contain DNA and/or polypeptides include, without limitation, blood (e.g., whole blood, serum, or plasma), amnion, tissue, urine, cerebrospinal fluid, saliva, sputum, broncho-alveolar lavage, bile, lymphatic fluid, cyst fluid, stool, ascites, pap smears, breast milk, and exhaled breath condensate. For example, a plasma sample can be assessed to determine a cfDNA fragmentation profile as described herein.
[0124] A sample from a mammal to be assessed as described herein (e.g., assessed for a DNA fragmentation pattern) can include any appropriate amount of cfDNA. In some cases, a sample can include a limited amount of DNA. For example, a cfDNA fragmentation profile can be obtained from a sample that includes less DNA than is typically required for other cfDNA analysis methods, such as those described in, for example, Phallen et al., 2017 Sci Transl Med 9; Cohen et al., 2018 Science 359:926; Newman et al., 2014 Nat Med 20:548; and Newman et al., 2016 Nat Biotechnol 34:547.
[0125] In some cases, a sample can be processed (e.g., to isolate and/or purify DNA and/or polypeptides from the sample). For example, DNA isolation and/or purification can include cell lysis (e.g., using detergents and/or surfactants), protein removal (e.g., using a protease), and/or RNA removal (e.g., using an RNase). As another example, polypeptide isolation and/or purification can include cell lysis (e.g., using detergents and/or surfactants), DNA removal (e.g., using a DNase), and/or RNA removal (e.g., using an RNase).
[0126] Additional methods are described in U.S. Patents No. 10,982,279 and No. 10,975,431, the disclosures of which are considered part of and are herein incorporated by reference in the disclosure of this application in their entirety.
Example Hardware Implementation
[0127] Figure 20 illustrates an example computer 800 that may be used to implement the methods described herein. For example, the computer 800 may include a machine learning system that trains a machine learning model to subtype small cell lung cancer or non-small cell lung cancer as described above or a portion or combination thereof in some embodiments. The computer 800 may be any electronic device that runs software applications derived from compiled instructions, including without limitation personal computers, servers, smart phones, media players, electronic tablets, game consoles, email devices, etc. In some implementations, the computer 800 may include one or more processors 802, one or more input devices 804, one or more display devices 806, one or more network interfaces 808, and one or more computer-readable mediums 812. Each of these components may be coupled by bus 810, and in some embodiments, these components may be distributed among multiple physical locations and coupled by a network.
[0128] Display device 806 may be any known display technology, including but not limited to display devices using Liquid Crystal Display (LCD) or Light Emitting Diode (LED) technology. Processor(s) 802 may use any known processor technology, including but not limited to graphics processors and multi-core processors. Input device 804 may be any known input device technology, including but not limited to a keyboard (including a virtual keyboard), mouse, track ball, camera, and touch-sensitive pad or display. Bus 810 may be any known internal or external bus technology, including but not limited to ISA, EISA, PCI, PCI Express, USB, Serial ATA or FireWire. Computer-readable medium 812 may be any non-transitory medium that participates in providing instructions to processor(s) 802 for execution, including without limitation, non-volatile storage media (e.g., optical disks, magnetic disks, flash drives, etc.), or volatile media (e.g., SDRAM, ROM, etc.).
[0129] Computer-readable medium 812 may include various instructions 814 for implementing an operating system (e.g., Mac OS®, Windows®, Linux). The operating system may be multi-user, multiprocessing, multitasking, multithreading, real-time, and the like. The operating system may perform basic tasks, including but not limited to: recognizing input from input device 804; sending output to display device 806; keeping track of files and directories on computer-readable medium 812; controlling peripheral devices (e.g., disk drives, printers, etc.) which can be controlled directly or through an I/O controller; and managing traffic on bus 810. Network communications instructions 816 may establish and maintain network connections (e.g., software for implementing communication protocols, such as TCP/IP, HTTP, Ethernet, telephony, etc.).
[0130] Machine learning instructions 818 may include instructions that enable computer 800 to function as a machine learning system and/or to train machine learning models to generate DMS values as described herein. Application(s) 820 may be an application that uses or implements the processes described herein and/or other processes. The processes may also be implemented in operating system 814. For example, application 820 and/or operating system 814 may create tasks in applications as described herein.
[0131] The described features may be implemented in one or more computer programs that may be executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program may be written in any form of programming language (e.g., Objective-C, Java), including compiled or interpreted languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
[0132] Suitable processors for the execution of a program of instructions may include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors or cores, of any kind of computer. Generally, a processor may receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer may include a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer may also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data may include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
[0133] To provide for interaction with a user, the features may be implemented on a computer having a display device such as an LED or LCD monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer.
[0134] The features may be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination thereof. The components of the system may be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include, e.g., a telephone network, a LAN, a WAN, and the computers and networks forming the Internet.
[0135] The computer system may include clients and servers. A client and server may generally be remote from each other and may typically interact through a network. The relationship of client and server may arise by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
[0136] One or more features or steps of the disclosed embodiments may be implemented using an Application Programming Interface (API). An API may define one or more parameters that are passed between a calling application and other software code (e.g., an operating system, library routine, function) that provides a service, that provides data, or that performs an operation or a computation.
[0137] The API may be implemented as one or more calls in program code that send or receive one or more parameters through a parameter list or other structure based on a call convention defined in an API specification document. A parameter may be a constant, a key, a data structure, an object, an object class, a variable, a data type, a pointer, an array, a list, or another call. API calls and parameters may be implemented in any programming language. The programming language may define the vocabulary and calling convention that a programmer will employ to access functions supporting the API.
[0138] In some implementations, an API call may report to an application the capabilities of a device running the application, such as input capability, output capability, processing capability, power capability, communications capability, etc.
[0139] While various embodiments have been described above, it should be understood that they have been presented by way of example and not limitation. It will be apparent to persons skilled in the relevant art(s) that various changes in form and detail can be made therein without departing from the spirit and scope. In fact, after reading the above description, it will be apparent to one skilled in the relevant art(s) how to implement alternative embodiments. For example, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other implementations are within the scope of the following claims.
[0140] In addition, it should be understood that any figures which highlight the functionality and advantages are presented for example purposes only. The disclosed methodology and system are each sufficiently flexible and configurable such that they may be utilized in ways other than that shown.
[0141] Although the term "at least one" may often be used in the specification, claims and drawings, the terms "a", "an", "the", "said", etc. also signify "at least one" or "the at least one" in the specification, claims and drawings.
[0142] Finally, it is the applicant's intent that only claims that include the express language "means for" or "step for" be interpreted under 35 U.S.C. 112(f). Claims that do not expressly include the phrase "means for" or "step for" are not to be interpreted under 35 U.S.C. 112(f).
[0143] The presently described methods and systems are useful for subtyping non-small cell lung cancer or small cell lung cancer in a subject and optionally treating the cancer subtype in the subject. Any appropriate subject, such as a mammal can be assessed, and/or treated as described herein. Examples of some mammals that can be assessed, and/or treated as described herein include, without limitation, humans, primates such as monkeys, dogs, cats, horses, cows, pigs, sheep, mice, and rats. For example, a human having, or suspected of having, cancer can be assessed using a method described herein and, optionally, can be treated with one or more cancer treatments as described herein. The methods disclosed herein may include administering to the subject identified as having the type of cancer, a therapeutic agent suitable for the treatment of the type of cancer.
[0144] When treating a subject having, or suspected of having, cancer as described herein, the subject can be administered one or more cancer treatments. A cancer treatment can be any appropriate cancer treatment. One or more cancer treatments described herein can be administered to a subject at any appropriate frequency (e.g., once or multiple times over a period of time ranging from days to weeks). Examples of cancer treatments include, without limitation, surgical intervention, adjuvant chemotherapy, neoadjuvant chemotherapy, radiation therapy, hormone therapy, cytotoxic therapy, immunotherapy, adoptive T cell therapy (e.g., chimeric antigen receptors and/or T cells having wild-type or modified T cell receptors), targeted therapy such as administration of kinase inhibitors (e.g., kinase inhibitors that target a particular genetic lesion, such as a translocation or mutation), (e.g., a kinase inhibitor, an antibody, a bispecific antibody), signal transduction inhibitors, bispecific antibodies or antibody fragments (e.g., BiTEs), monoclonal antibodies, immune checkpoint inhibitors, surgery (e.g., surgical resection), or any combination of the above. In some aspects, a cancer treatment can reduce the severity of the cancer, reduce a symptom of the cancer, and/or reduce the number of cancer cells present within the subject.
[0145] In some aspects, a cancer treatment can be a chemotherapeutic agent. Non-limiting examples of chemotherapeutic agents include: amsacrine, azacitidine, azathioprine, bevacizumab (or an antigen-binding fragment thereof), bleomycin, busulfan, carboplatin, capecitabine, chlorambucil, cisplatin, cyclophosphamide, cytarabine, dacarbazine, daunorubicin, docetaxel, doxifluridine, doxorubicin, epirubicin, erlotinib hydrochlorides, etoposide, floxuridine, fludarabine, fluorouracil, gemcitabine, hydroxyurea, idarubicin, ifosfamide, irinotecan, lomustine, mechlorethamine, melphalan, mercaptopurine, methotrexate, mitomycin, mitoxantrone, oxaliplatin, paclitaxel, pemetrexed, procarbazine, all-trans retinoic acid, streptozocin, tafluposide, temozolomide, teniposide, tioguanine, topotecan, uramustine, valrubicin, vinblastine, vincristine, vindesine, vinorelbine, and combinations thereof. Additional examples of anti-cancer therapies are known in the art; see, e.g., the guidelines for therapy from the American Society of Clinical Oncology (ASCO), European Society for Medical Oncology (ESMO), or National Comprehensive Cancer Network (NCCN).
[0146] In various aspects, DNA is present in a biological sample taken from a subject and used in the methodology of the invention. The biological sample can be virtually any type of biological sample that includes DNA. The biological sample is typically a fluid, such as whole blood or a portion thereof with circulating cfDNA. In embodiments, the sample includes DNA from a tumor or a liquid biopsy, such as, but not limited to amniotic fluid, aqueous humor, vitreous humor, blood, whole blood, fractionated blood, plasma, serum, breast milk, cerebrospinal fluid (CSF), cerumen (earwax), chyle, chyme, endolymph, perilymph, feces, breath, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, exhaled breath condensates, sebum, semen, sputum, sweat, synovial fluid, tears, vomit, prostatic fluid, nipple aspirate fluid, lachrymal fluid, perspiration, cheek swabs, cell lysate, gastrointestinal fluid, biopsy tissue and urine or other biological fluid. In one aspect, the sample includes DNA from a circulating tumor cell.
[0147] As disclosed above, the biological sample can be a blood sample. The blood sample can be obtained using methods known in the art, such as finger prick or phlebotomy. Suitably, the blood sample is approximately 0.1 to 20 ml, or alternatively approximately 1 to 15 ml with the volume of blood being approximately 10 ml. Smaller amounts may also be used, as well as circulating free DNA in blood. Microsampling and sampling by needle biopsy, catheter, excretion or production of bodily fluids containing DNA are also potential biological sample sources.
[0148] The methods and systems of the disclosure utilize nucleic acid sequence information and can therefore include any method or sequencing device for performing nucleic acid sequencing including nucleic acid amplification, polymerase chain reaction (PCR), nanopore sequencing, 454 sequencing, and insertion tagged sequencing. In some aspects, the methodology or systems of the disclosure utilize systems such as those provided by Illumina, Inc. (including but not limited to HiSeq™ X10, HiSeq™ 1000, HiSeq™ 2000, HiSeq™ 2500, Genome Analyzers™, MiSeq™, NextSeq, NovaSeq 6000 systems), Applied Biosystems Life Technologies (SOLiD™ System, Ion PGM™ Sequencer, Ion Proton™ Sequencer) or Genapsys or BGI MGI and other systems. Nucleic acid analysis can also be carried out by systems provided by Oxford Nanopore Technologies (GridION™, MinION™) or Pacific Biosciences (PacBio™ RS II or Sequel I or II).
[0149] The present invention includes systems for performing steps of the disclosed methods and is described partly in terms of functional components and various processing steps. Such functional components and processing steps may be realized by any number of components, operations and techniques configured to perform the specified functions and achieve the various results. For example, the present invention may employ various biological samples, biomarkers, elements, materials, computers, data sources, storage systems and media, information gathering techniques and processes, data processing criteria, statistical analyses, regression analyses and the like, which may carry out a variety of functions.
[0150] Accordingly, the invention further provides a non-invasive system for subtyping small cell lung cancer or non-small cell lung cancer. In various aspects, the system includes: (a) a sequencer configured to generate a low-coverage whole genome sequencing data set for a sample; and (b) a computer system and/or processor with functionality to perform a method of the invention.
[0151] In some aspects, the computer system further includes one or more additional modules. For example, the system may include one or more of an extraction and/or isolation unit operable to select suitable genetic components for analysis, e.g., cfDNA fragments of a particular size.
[0152] In some aspects, the computer system further includes a visual display device. The visual display device may be operable to display a curve fit line, a reference curve fit line, and/or a comparison of both.
[0153] Methods for the non-invasive subtyping of small cell lung cancer or non-small cell lung cancer according to various aspects of the present invention may be implemented in any suitable manner, for example using a computer program operating on the computer system. As discussed herein, an exemplary system, according to various aspects of the present invention, may be implemented in conjunction with a computer system, for example a conventional computer system comprising a processor and a random access memory, such as a remotely-accessible application server, network server, personal computer or workstation. The computer system also suitably includes additional memory devices or information storage systems, such as a mass storage system and a user interface, for example a conventional monitor, keyboard and tracking device. The computer system may, however, include any suitable computer system and associated equipment and may be configured in any suitable manner. In one embodiment, the computer system comprises a stand-alone system. In another embodiment, the computer system is part of a network of computers including a server and a database.
[0154] The software required for receiving, processing, and analyzing information may be implemented in a single device or implemented in a plurality of devices. The software may be accessible via a network such that storage and processing of information takes place remotely with respect to users. The system according to various aspects of the present invention and its various elements provide functions and operations to facilitate detection and/or analysis, such as data gathering, processing, analysis, reporting and/or diagnosis. For example, in the present aspect, the computer system executes the computer program, which may receive, store, search, analyze, and report information relating to the human genome or region thereof. The computer program may comprise multiple modules performing various functions or operations, such as a processing module for processing raw data and generating supplemental data and an analysis module for analyzing raw data and supplemental data to generate quantitative assessments of a disease status model and/or diagnosis information.
[0155] The procedures performed by the system may comprise any suitable processes to facilitate analysis and/or subtyping of small cell lung cancer or non-small cell lung cancer. In one embodiment, the system is configured to establish a disease subtype model and/or determine disease subtype in a patient. Determining or identifying disease subtype may include generating any useful information regarding the condition of the patient relative to the disease, such as performing a diagnosis, providing information helpful to a diagnosis, assessing the stage or progress of a disease, identifying a condition that may indicate a susceptibility to the disease, identifying whether further tests may be recommended, predicting and/or assessing the efficacy of one or more treatment programs, or otherwise assessing the disease status, likelihood of disease, or other health aspect of the patient.
[0156] The following examples are provided to further illustrate the advantages and features of the present invention, but they are not intended to limit the scope of the invention. While these examples are typical of those that might be used, other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.
EXAMPLE 1
[0157] Dissecting Small Cell Lung Cancer Subtypes by cell-free DNA fragmentomes
[0158] BACKGROUND: Small Cell Lung Cancer (SCLC) is an aggressive form of lung cancer, strongly associated with smoking and exposure to other environmental chemicals. Patients with SCLC have a 5-year survival rate of less than 8% (Gay, C. M. et al. Patterns of transcription factor programs and immune pathway activation define four major subtypes of SCLC with distinct therapeutic vulnerabilities. Cancer Cell 39, 346-360.e7 (2021)). SCLC is a heterogeneous tumor type consisting of tumor cells with neuroendocrine and non-neuroendocrine features. SCLC has two major subtypes: High-grade Neuroendocrine (High-NE) versus Low-grade Neuroendocrine (Low-NE) (Gay et al.). High-NE subtypes are characterized by the activation of lineage-specific transcription factors: ASCL1 and NEUROD1. Low-NE is characterized by non-neuroendocrine factors such as POU2F3 (SCLC-P) or the high presence of inflammatory T-cells (SCLC-I). Despite these molecular and clinical heterogeneities, SCLC is treated as a single entity with predictably poor results. Recent results showed that the inflamed group (SCLC-I) has distinct biology and responds better to immunotherapy compared to the other SCLC subtypes (Gay et al. and Lissa, D. et al. Heterogeneity of neuroendocrine transcriptional states in metastatic small cell lung cancers and patient-derived models. Nat Commun 13, 2023 (2022)). DELFI-based cell-free fragmentomes accurately differentiate High-NE vs Low-NE SCLC subtypes in a non-invasive manner. The final goal of DELFI SCLC subtyping is to guide optimal treatment selection for each patient diagnosed with advanced SCLC.
[0159] METHODS: Circulating cell-free DNA (cfDNA) was isolated from plasma samples of patients diagnosed with relapsed SCLC and treated with durvalumab plus olaparib in a phase II trial (NCT02484404). Immunohistochemistry and genomics data from pre-treatment tissue biopsies were used to classify SCLC subtypes into High-NE (n=10) vs Low-NE (n=5). To infer tumor gene expression profiles in cfDNA, we investigated genome-wide signals of tissue-specific transcription factors differentially regulated in SCLC and applied a novel DELFI-based approach to inform SCLC molecular subtypes. In detail, DELFI calculates the fragment distribution for each transcription factor binding site (TFBS) within a 2 kb window and scales it between 0 and 1 independently for each sample. A principal component analysis was then performed in R and clusters were defined based on ASCL1 TFBS sites (Mathios, D. et al. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat Commun 12, 5060 (2021)). Clinical information was examined in orthogonal analyses.
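The per-site aggregation and per-sample scaling described in the methods can be sketched as follows. This is a minimal illustration under stated assumptions: "fragment distribution" is read here as per-base fragment coverage summed over sites, all sites lie on one chromosome, and the function names are hypothetical. The principal component analysis, which the methods state was performed in R, is not reproduced.

```python
HALF_WINDOW = 1_000  # 2 kb window centered on each binding site


def aggregate_profile(sites, fragments, half_window=HALF_WINDOW):
    """Sum per-base coverage around each site center.

    sites: center positions on a single chromosome (simplifying assumption).
    fragments: (start, end) pairs, 0-based and half-open.
    Returns a list of length 2 * half_window. The nested loop is
    O(sites x fragments) and is meant only for illustration.
    """
    width = 2 * half_window
    profile = [0] * width
    for center in sites:
        left = center - half_window
        for start, end in fragments:
            lo = max(start, left) - left
            hi = min(end, left + width) - left
            for offset in range(lo, hi):  # empty when there is no overlap
                profile[offset] += 1
    return profile


def min_max_scale(values):
    """Scale one sample's profile to the range 0-1, independently of other samples."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy input with a 10 bp window so the result is easy to inspect.
toy_profile = aggregate_profile([10], [(5, 9), (6, 8), (12, 15)], half_window=5)
toy_scaled = min_max_scale(toy_profile)
```

Scaling each sample separately removes differences in overall sequencing depth, so that the shape of the profile around the binding sites, rather than its absolute height, is what would be compared across samples.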
[0160] RESULTS: DELFI's Proprietary Fragmentomics Platform detects SCLC cases (N=47) with high sensitivity, with a median DELFI score of 1.0 (95% CI 0.99-1), matching previous data from internal DELFI SCLC samples. Genome-wide cfDNA fragmentation analyses at ASCL1 binding sites (~12,000 genomic coordinates) in the SCLC patients reveal a decrease in coverage near transcription factor binding sites of SCLC patients compared to non-cancer and NSCLC samples. The difference at the center of the ASCL1 binding sites was also the main driver of the differentiation between pre-treatment SCLC samples. Investigation of fragment distribution at ASCL1 binding sites was capable of differentiating High-NE vs Low-NE samples. Patients (treated with I/O) classified as LOW-NE by the DELFI subtyping classifier had a better treatment response than DELFI-classified HIGH-NE. Differential pseudogene expression identified thousands of highly enriched Transcription Start Sites (TSS) in LOW-NE cases compared to HIGH-NE. LOW-NE samples showed a strong enrichment for the T-lymphocyte-specific factor KLF4, mirroring the neutrophil-to-lymphocyte ratio of this SCLC subtype. These data reflect the inflamed phenotype, possibly explaining why these cancer subtypes are more likely to respond to I/O treatment.
EXAMPLE 2
[0162] Use of DELFI circulating cfDNA-based approach to subtype small cell lung cancer (SCLC) in a non-invasive manner using cell-free DNA fragmentomes.
[0163] SCLC is an aggressive malignancy with a poor prognosis. Although SCLCs are clinically managed as a single cancer type, new evidence supports that these subtypes (high- vs low-neuroendocrine) of SCLC acquire diverse transcriptional and epigenetic states. Furthermore, distinct SCLC subtypes respond to specific treatment such as immunotherapy in the case of the low-neuroendocrine highly inflamed SCLC subtype.
[0164] The novel non-invasive DELFI-based approach described herein distinguishes between high- and low-neuroendocrine SCLC subtypes by investigating cell-free DNA fragments that reflect the unique epigenetic states of SCLCs. This novel method is capable of differentiating the patients that will likely respond to immunotherapy, such as immune checkpoint blockade.
[0165] Using WGS data from plasma-derived cfDNA, fragment coverage at specific genomic coordinates was investigated to reveal the specific subtype of SCLC cases. The DELFI-based targeted analysis is capable of differentiating between high-neuroendocrine and low-neuroendocrine SCLC cases.
[0166] SCLC subtypes can be predicted from the DELFI-based approach using cfDNA LC-WGS data.
[0167] First, the DELFI machine learning classifier detects the presence of cancer in patients with SCLC. Both genome-wide fragmentation profiles and corresponding DELFI scores were investigated to identify preliminary differences between the two main clinical subtypes of SCLC cases: high-neuroendocrine and low-neuroendocrine. Publicly available data were then used to run a principal component analysis to cluster the two subtypes.
[0168] Second, using publicly available data, it was determined that these clusters of SCLC samples exhibit a decrease in aggregate fragment coverage at ASCL1 transcription factor binding sites, classifying these cases as high-neuroendocrine SCLCs.
[0169] Third, whether one of these clusters of SCLC samples exhibits a reduction of fragment coverage at genomic binding sites regulated by hematopoietic transcription factors (low-neuroendocrine SCLC) was also investigated.
[0170] Fourth, High vs Low Neuroendocrine cases were distinguished based on fragment length distribution at T-cell-specific partially methylated domains.
[0171] Finally, a receiver operating characteristic (ROC) representing sensitivity and specificity of the DELFI-fragmentome approach to identify SCLC subtypes was used to identify the diagnostic accuracy of our approach between the identified cluster samples.
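The ROC evaluation mentioned above can be illustrated with a short Python sketch. It assumes each sample has a single numeric subtype score (higher meaning more likely high-neuroendocrine) and a binary label taken from tissue-based classification. The scores, labels and function names below are hypothetical and are not taken from the study.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    labels: 1 for the positive class (assumed here: high-neuroendocrine), 0 otherwise.
    scores: hypothetical classifier scores, higher meaning more likely positive.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the highest score to the lowest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0]            # toy ground truth
    scores = [0.9, 0.8, 0.4, 0.5, 0.2]  # toy subtype scores
    print(round(auc(roc_points(labels, scores)), 3))
```

Each point on the curve pairs sensitivity (true positive rate) with 1 minus specificity (false positive rate) at one threshold, and the area under the curve summarizes diagnostic accuracy across all thresholds.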
[0172] Given the limited access to SCLC tumor biopsies, it is believed that this approach could be a viable method of subtyping SCLC in a non-invasive manner. It is a promising assay both for pharmaceutical companies evaluating novel immunotherapy drugs during clinical trials and for clinicians selecting which patients diagnosed with SCLC have the best chance of responding to immunotherapy.
[0173] The DELFI-scores of the patients diagnosed with SCLC in this study are shown in Figure 1.
[0174] Figure 2 illustrates that genome-wide fragmentation profiles were significantly consistent among pre-treatment, post-treatment, progression and response time points, suggesting that genome-wide analysis of circulating tumor DNA fragment sizes is a powerful method to detect changes during monitoring of immunotherapy treatment of SCLC.
[0175] Figure 3 illustrates genome-wide fragmentation profiles (with corresponding DELFI scores below) showing partial differences between the two main subtypes of SCLC cases, High Neuroendocrine and Low Neuroendocrine, suggesting that genome-wide analysis of circulating tumor DNA fragment sizes may be a powerful method for subtyping during monitoring of immunotherapy treatment of SCLC.
[0176] Figure 4 illustrates differential genes between High and Low neuroendocrine SCLC cases, using publicly available data (Lissa et al. 2022) and a principal component analysis (on DELFI 30X plasma WGS data) to cluster the two subtypes. The PCA revealed distinguishable clusters between SCLC High-NE vs Low-NE pre-treatment samples.
[0177] Figure 5 illustrates an Eigencor analysis revealing a significant correlation of Principal Components 1 and 2 with the NE status of the samples.
[0178] Figure 6 A-B illustrates the potential of the DELFI assay to subtype SCLC using TFBS. Genome-wide cfDNA fragmentation analyses were performed at ASCL1 binding sites in the pre-treatment NCI samples. Figure 6A illustrates that distinct clusters of SCLC samples were observed corresponding to different levels of ASCL1 activation. Importantly, the two clusters corresponded perfectly to the different SCLC subtypes. The main driver of differentiation between samples was in fact the difference in fragment coverage identified at the center of the ASCL1 binding sites. Figure 6B illustrates that other clinical or sample characteristics have a negligible impact on the clustering of these samples. Overall, these data show how DELFI analysis can perfectly distinguish between SCLC subtypes using fragment coverage at TFBS.
[0179] Figure 7 illustrates a patient example supporting the hypothesis that the detected signal comes from tumor infiltrating lymphocytes. Patient NCI-0422, diagnosed with SCLC with inflammation subtype, was treated with Durvalumab plus Olaparib. NCI-0422 responded well to treatment but was later diagnosed with progressive disease. Variable genomic binding is detectable at pre-treatment and at progression timepoints. Peaks are centered at a mix of TFBS and TSS.
[0180] Figure 8 A-B compares ASCL1 binding sites between responders and non-responders. Figure 8A shows 500 cell-type-specific bins from the PMD data used to determine the fraction of white blood cells. Figure 8B shows lymphocyte tissue-specific bins.
[0181] Figure 9 illustrates a machine learning model to predict high vs low neuroendocrine SCLC.
[0182] Subtyping is performed by calculating the log2 (Read Depth Ratio) across 20bp windows starting 2kb from ASCL1 TFBS sites. Read Depth Ratio is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)). A General Additive Model is then used to smooth coverage and correct for GC bias. To validate the approach described herein, neuroendocrine status is derived from the mean NE50 score per subject/treatment status; a positive value is classified as NE+. Subtype status is derived by aggregating the mean for each of the 4 subtypes (ASCL1, NEUROD1, POU2F3, YAP1) such that each subject/treatment status has one value for each subtype. The subtype with the highest score is then selected as the truth. Fragmentomics are calculated as previously described (Cristiano et al.). Scores are for the latest available models.
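The Read Depth Ratio calculation above can be sketched in Python as follows. This is a hedged illustration under several assumptions that the text does not spell out:

- the input is per-base coverage over a 2kb region centered on one TFBS;
- the read depth of a 20bp window is the mean per-base coverage in that window;
- the correction factor is the median window depth over the first and last 500bp of the region;
- both numerator and denominator are shifted by 1, following the claims' statement that coverage calculations are shifted by 1 to avoid division by 0.

Function and constant names are hypothetical. The General Additive Model smoothing and GC correction are not implemented here.

```python
import math
from statistics import mean, median

WINDOW_BP = 20   # window size stated in the text
ANCHOR_BP = 500  # 5' and 3' anchor size stated in the text


def window_depths(per_base_coverage, window_bp=WINDOW_BP):
    """Mean coverage in consecutive non-overlapping windows (an assumption)."""
    if len(per_base_coverage) % window_bp != 0:
        raise ValueError("region length must be a multiple of the window size")
    return [
        mean(per_base_coverage[i:i + window_bp])
        for i in range(0, len(per_base_coverage), window_bp)
    ]


def log2_read_depth_ratio(per_base_coverage, window_bp=WINDOW_BP, anchor_bp=ANCHOR_BP):
    """log2(Read Depth / Correction Factor) for each window of one TFBS region."""
    depths = window_depths(per_base_coverage, window_bp)
    n_anchor = anchor_bp // window_bp
    anchors = depths[:n_anchor] + depths[-n_anchor:]
    correction = median(anchors) + 1.0  # shifted by 1 to avoid division by 0
    return [math.log2((d + 1.0) / correction) for d in depths]


def aggregate_sites(per_site_ratios):
    """Median of each window position across sites, giving one coverage profile."""
    if len({len(r) for r in per_site_ratios}) != 1:
        raise ValueError("all sites must have the same number of windows")
    return [median(column) for column in zip(*per_site_ratios)]


if __name__ == "__main__":
    # Toy region: flat coverage of 7 with a dip to 3 in the central 200bp.
    site = [7.0] * 900 + [3.0] * 200 + [7.0] * 900
    profile = log2_read_depth_ratio(site)
    print(len(profile), profile[0], profile[50])
```

In this toy example the flanks sit at a log2 ratio of 0 and the center dips below 0, which is the shape the text associates with reduced coverage at the center of active binding sites.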
[0183] Figure 10 A-B illustrates that the method described herein is able to differentiate neuroendocrine vs non-neuroendocrine cases by calculating the Read Depth Ratio at SCLC-specific genomic coordinates (Figure 10A). Figure 10B shows that the method described herein appears to subtype SCLC cases better.
[0184] Figure 11 illustrates that Read Depth Ratios at SCLC-specific genomic coordinates are transformed into a 2-dimensional PCA to separate samples into neuroendocrine vs non-neuroendocrine groups. Most of the samples are clearly divided into different clusters.
[0185] Figure 12 illustrates that a PCA derived from the Read Depth Ratio at SCLC-specific genomic coordinates is able to separate pre-treatment samples into specific SCLC subtypes (A, N, P, Y). The ground truth of the SCLC subtypes was calculated using tissue RNA-seq gene expression differential analysis.
[0186] The fragmentomics platform described herein detects small cell lung cancer (SCLC) with high sensitivity. The DELFI SCLC Subtyping Assay is capable of subtyping SCLC samples without the need for clinical knowledge. The DELFI SCLC Subtyping Assay identified four groups of SCLC samples corresponding to different levels of ASCL1 activation. Samples with high activation of ASCL1 were confirmed to belong to neuroendocrine subtypes (SCLC-A and SCLC-N), while samples with low ASCL1 signal were confirmed to belong to non-neuroendocrine subtypes (SCLC-P and SCLC-Y).
EXAMPLE 3
[0187] Cell-free DNA fragmentation profiling as a method for tumor fraction assessment and treatment monitoring in NSCLC
[0188] cfDNA Fragmentomics Background: Traditionally, monitoring is done with imaging. Many individuals do not have ready access to hospitals with appropriate imaging equipment and expertise and need to travel significant distances for such procedures, making continual monitoring problematic. With a liquid biopsy, there is no a priori need to know where a tumor is located, nor is there a need to know beforehand the somatic mutations a tumor harbors. Liquid biopsies are also inexpensive compared to imaging, and their cost should improve further as sequencing costs continue to decline.
[0189] cfDNA Mutations have Limited Signal and can be Confounded by Clonal Hematopoiesis. DELFI captures many features of the cfDNA universe. cfDNA fragmentation patterns are determined by basal chromatin organization. cfDNA fragmentation profiles are highly consistent in healthy people and altered in patients with cancer.
[0190] DELFI-TF Model Overview and Application to Monitoring NSCLC Patients during Therapy: cfDNA aliquots from plasma samples of CRC patients were analyzed using ddPCR for RAS mutation status as well as low-pass WGS. WGS data were aligned and fragment size distributions were obtained for 504 5Mb bins across the genome. A Bayesian regression model was trained and cross-validated against RAS MT samples using fragmentomics features. Figure 20 depicts proof-of-concept DELFI-TF model development.
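The binning step above can be illustrated with a short Python sketch that groups aligned fragments into 5Mb bins and summarizes the fragment size distribution in each bin. It assumes fragments are given as 0-based, half-open (chromosome, start, end) tuples and are assigned to a bin by their start coordinate. The definition of the 504 bins used by the model is not given in this section, so fixed bins starting at position 0 of each chromosome are used purely for illustration. All names and values are hypothetical, and the Bayesian regression model is not implemented.

```python
from collections import defaultdict
from statistics import median

BIN_BP = 5_000_000  # 5Mb bin size stated in the text


def bin_fragment_lengths(fragments, bin_bp=BIN_BP):
    """Group fragment lengths by (chromosome, bin index).

    fragments: iterable of (chrom, start, end), assumed 0-based and half-open.
    """
    bins = defaultdict(list)
    for chrom, start, end in fragments:
        if end <= start:
            raise ValueError("fragment end must be greater than start")
        bins[(chrom, start // bin_bp)].append(end - start)
    return bins


def summarize_bins(bins):
    """Per-bin fragment count and median fragment length."""
    return {
        key: {"n_fragments": len(lengths), "median_length": median(lengths)}
        for key, lengths in bins.items()
    }


if __name__ == "__main__":
    # Toy fragments; real input would come from aligned WGS reads.
    fragments = [
        ("chr1", 1_000_000, 1_000_167),
        ("chr1", 1_200_000, 1_200_150),
        ("chr1", 4_999_990, 5_000_155),
        ("chr1", 5_000_100, 5_000_260),
        ("chr2", 100, 270),
    ]
    for key, stats in sorted(summarize_bins(bin_fragment_lengths(fragments)).items()):
        print(key, stats)
```

Per-bin summaries of this kind are one possible form of the fragmentomics features a downstream regression model could take as input.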
[0191] A CRC trained DELFI-TF model was applied to a real-world NSCLC cohort.
Table 2 - NSCLC Cohort Clinical Features
Clinical feature | Value |
---|---|
Median age (range) | 71.5 (53-88) |
Sex | |
Male | 11 |
Female | 10 |
Median BMI (range) | 24.5 (17-37) |
Stage | |
Stage IV | |
PD-L1 expression | |
Strong positive | |
Weak positive | |
Negative | |
NA | 5 |
Chemotherapy | |
Progressive disease | |
Ongoing treatment | |
[0192] Results: Figure 13 depicts how fragmentation profiles inform on lung cancer status and treatment response patterns.
[0193] Figure 14 illustrates that model derived DELFI-TF exhibits a strong correlation with Mutant Allele Frequency across samples.
[0194] Figure 15 illustrates that DELFI-TF accurately quantifies circulating tumor fraction, mirroring MAF performance in relation to RECIST evaluation.
[0195] Figure 16 illustrates that tumor derived cfDNA has altered fragmentation.
[0196] Figure 17 A-C illustrates that DELFI-TF accurately detects circulating tumor fraction without being confounded by clonal hematopoiesis.
[0197] Figure 18 depicts that cfDNA fragmentation patterns accurately differentiate NSCLC subtypes.
[0198] Conclusions: Genome-wide cfDNA fragment profiles are abnormal in patients with cancer.
[0199] cfDNA fragmentation scores (DELFI-TF) are highly correlated with known mutation allele frequencies. cfDNA fragmentation predicts RECIST status. cfDNA fragmentation is not confounded by clonal hematopoiesis. cfDNA fragmentation features can noninvasively distinguish histologic subtypes of lung cancers.
Claims
1. A non-invasive method for subtyping small cell lung cancer in a subject as low neuroendocrine or high neuroendocrine small cell lung cancer comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of a high-neuroendocrine small cell lung cancer subtype.
2. The method of claim 1, wherein the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
3. The method of claim 1, wherein the method uses machine learning to subtype small cell lung cancer in the subject.
4. The method of claim 1, wherein subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
5. The method of claim 4, wherein the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
6. The method of claim 5, wherein all coverage calculations are shifted by 1 to avoid divisions by 0.
7. The method of claim 5, wherein the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
8. The method of claim 1, further comprising the use of a general additive model to smooth coverage and correct for GC bias.
9. The method of claim 1, wherein the genomic intervals are non-overlapping.
10. The method of claim 1, wherein the genomic intervals each comprise thousands to millions of base pairs.
11. The method of claim 1, wherein a cfDNA fragmentation profile is determined within each genomic interval.
12. The method of claim 11, wherein the cfDNA fragmentation profile comprises a median fragment size.
13. The method of claim 11, wherein the cfDNA fragmentation profile comprises a fragment size distribution.
14. The method of claim 1, further comprising administering to the subject identified as having a high-neuroendocrine small cell lung cancer subtype, a therapeutic agent suitable for the treatment of the subtype of cancer.
15. The method of claim 14 wherein the therapeutic agent is an immunotherapy.
16. A non-invasive method for subtyping non-small cell lung cancer in a subject as adenocarcinoma or squamous carcinoma comprising: processing cfDNA fragments from a sample obtained from the subject and generating sequencing libraries; subjecting the sequencing libraries to whole genome sequencing to obtain sequenced fragments, wherein genome coverage is about 30x to 0.1x; mapping the sequenced fragments to a genome to obtain genomic intervals of mapped sequences at specified transcription factor binding sites; analyzing the genomic intervals of mapped sequences to determine cfDNA fragment lengths and amounts to establish a cfDNA fragment coverage score at specified transcription factor binding sites using the cfDNA fragment lengths and amounts; and subtyping the small cell lung cancer in the subject based on transcription factor activation; wherein a decrease in an aggregate cfDNA fragment coverage scores at the specified transcription factor binding sites is indicative of an adenocarcinoma or squamous carcinoma subtype.
17. The method of claim 16, wherein the specified transcription factor is ASCL1, NEUROD1, POU2F3, YAP1, or any combination thereof.
18. The method of claim 16, wherein the method uses machine learning to subtype small cell lung cancer in the subject.
19. The method of claim 16, wherein subtyping is performed by calculating a log2 (Read Depth Ratio) across 20bp windows starting 2kb from the specified transcription factor binding sites.
20. The method of claim 19, wherein the log2 (Read Depth Ratio) is the total coverage over a coverage correction factor calculated using the median coverage of 5' and 3' 500bp anchors at the ends of the 2kb window (i.e. log2(Read Depth/Correction Factor)).
21. The method of claim 19, wherein all coverage calculations are shifted by 1 to avoid divisions by 0.
22. The method of claim 19, wherein the log2 (Read Depth Ratio) is calculated for each transcription factor binding site and then aggregated by calculating the median of each window to obtain a fragment coverage score.
23. The method of claim 16, further comprising the use of a general additive model to smooth coverage and correct for GC bias.
24. The method of claim 16, wherein the genomic intervals are non-overlapping.
25. The method of claim 16, wherein the genomic intervals each comprise thousands to millions of base pairs.
26. The method of claim 16, wherein a cfDNA fragmentation profile is determined within each genomic interval.
27. The method of claim 26, wherein the cfDNA fragmentation profile comprises a median fragment size.
28. The method of claim 26, wherein the cfDNA fragmentation profile comprises a fragment size distribution.
29. The method of claim 16, further comprising administering to the subject identified as having an adenocarcinoma or squamous carcinoma subtype, a therapeutic agent suitable for the treatment of the type of cancer.
30. The method of claim 29, wherein the therapeutic agent is an immunotherapy.
31. The method of claim 16, further comprising quantifying a circulating tumor fraction, mirroring major allele frequency (MAF) performance in relation to Response Evaluation Criteria in Solid Tumors (RECIST) evaluation in both adenocarcinoma and squamous carcinoma.
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202363445284P | 2023-02-13 | 2023-02-13 | |
US63/445,284 | 2023-02-13 | ||
US202363470101P | 2023-05-31 | 2023-05-31 | |
US63/470,101 | 2023-05-31 | ||
US202363528237P | 2023-07-21 | 2023-07-21 | |
US63/528,237 | 2023-07-21 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2024173277A2 (en) | 2024-08-22 |
WO2024173277A3 (en) | 2024-10-24 |
Family
ID=92420632
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2024/015444 WO2024173277A2 (en) | 2023-02-13 | 2024-02-12 | Delfi-derived cell-free dna fragmentation patterns differentiate histologic subtypes of lung cancers in a non-invasive manner |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024173277A2 (en) |
-
2024
- 2024-02-12 WO PCT/US2024/015444 patent/WO2024173277A2/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2024173277A3 (en) | 2024-10-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24757501 Country of ref document: EP Kind code of ref document: A2 |