WO2024089270A2 - Pore monomers and pores - Google Patents

Pore monomers and pores

Publication number
WO2024089270A2
Authority
WO
WIPO (PCT)
Prior art keywords
pore
chimeric
pores
porarc
different
Prior art date
Application number
PCT/EP2023/080135
Other languages
French (fr)
Other versions
WO2024089270A3 (en)
Inventor
Elizabeth Jayne Wallace
Riera ALBERTO
Lakmal Nishantha JAYASINGHE
Alistair James SCOTT
Han REMAUT
Original Assignee
Oxford Nanopore Technologies Plc
Vib Vzw
Vrije Universiteit Brussel
Priority date
Filing date
Publication date
Priority claimed from GBGB2216026.1A (GB202216026D0)
Priority claimed from GBGB2312689.9A (GB202312689D0)
Application filed by Oxford Nanopore Technologies Plc, Vib Vzw, Vrije Universiteit Brussel
Publication of WO2024089270A2
Publication of WO2024089270A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/35: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Mycobacteriaceae (F)
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/483: Physical analysis of biological material
    • G01N33/487: Physical analysis of biological material of liquid biological material
    • G01N33/48707: Physical analysis of biological material of liquid biological material by electrical means
    • G01N33/48721: Investigating individual macromolecules, e.g. by translocation through nanopores

Definitions

  • the present invention relates to novel pore monomers, pores formed from the pore monomers and their uses in analyte detection and characterisation.
  • Nanopore sensing is an approach to analyte detection and characterization that relies on the observation of individual binding or interaction events between the analyte molecules and an ion conducting channel.
  • Two of the essential components of analyte characterization using nanopore sensing are (1) the control of analyte movement through the pore and (2) the discrimination of the composing building blocks as the analyte is moved through the pore.
  • the narrowest part of the pore forms the most discriminating part of the nanopore with respect to the current signatures as a function of the passing analyte.
  • nucleotide discrimination is achieved by measuring the current as the polynucleotide passes through the pore. Multiple nucleotides contribute to the observed current, so the height of the channel constriction and extent of the interaction with the polynucleotide affect the relationship between observed current and polynucleotide sequence. While the current range and signal-to-noise ratio for nucleotide discrimination have been improved through mutations of protein pores, a sequencing system would have higher performance if the current differences between nucleotides could be improved further. Accordingly, there is a need to identify novel ways to improve nanopore sensing features.
  • Ghanem et al. (2022), FEBS J, 289: 3505-3520 discloses chimeric mutants of alpha-hemolysin and gamma-hemolysin. However, these chimeric mutants are not used in nanopore sensing.
  • chimeric pores formed from at least two different pores display an increased signal-to-noise ratio (SNR), an increased current range, decreased noise and increased normalised median absolute deviation (nMAD) during analyte characterisation compared with the different pores from which the chimeras are derived.
  • the invention therefore provides a chimeric pore monomer comprising two or more regions, wherein at least two of the two or more regions are from at least two different pores, and wherein the at least two different pores do not comprise alpha-hemolysin and gamma-hemolysin.
  • the invention also provides:
  • a chimeric pore comprising at least one chimeric pore monomer of the invention or at least one construct of the invention
  • a chimeric pore multimer comprising two or more pores, wherein at least one of the pores is a chimeric pore of the invention
  • a chimeric pore of the invention or a chimeric pore multimer of the invention, which is comprised in a membrane.
  • a membrane comprising a chimeric pore of the invention or a chimeric pore multimer of the invention
  • a method for producing a chimeric pore monomer of the invention comprising attaching the at least two regions from at least two different pores;
  • a method of characterising a target analyte using (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a);
  • use of (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a) to determine the presence, absence or one or more characteristics of a target analyte;
  • a kit for characterising a target polynucleotide comprising:
  • an apparatus for characterising a target polynucleotide in a sample comprising:
  • a kit for characterising a target analyte comprising (a) a chimeric pore of the invention or a chimeric pore multimer of the invention and (b) the components of a membrane;
  • a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s);
  • an apparatus produced by a method comprising (i) obtaining a chimeric pore of the invention or a chimeric pore multimer of the invention and (ii) contacting the chimeric pore or a pore multimer with an in vitro membrane such that the chimeric pore or the pore multimer is inserted in the in vitro membrane.
  • the inventors have also surprisingly shown the PorARc pores from various species are capable of nanopore sensing with a high signal-to-noise ratio (SNR), a good current range, minimal noise, and a good normalised median absolute deviation (nMAD).
  • the invention therefore provides a PorARc pore monomer which comprises a sequence having at least 88% identity to the sequence shown in SEQ ID NO: 2 or a sequence having at least about 20% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
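The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and it assumes one particular convention for the denominator, namely the number of alignment columns that are not gaps in both sequences; this section does not state how identity is calculated, so that convention is an assumption. The function names and the toy sequences are hypothetical and are not PorARc sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention (not taken from the text): identical residues divided
    by the number of alignment columns, ignoring columns that are gaps in both.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no residue columns")
    # After the filter above, a == b can only be true for two residues.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_query, aligned_reference, threshold):
    """True if the query reaches at least `threshold` percent identity."""
    return percent_identity(aligned_query, aligned_reference) >= threshold


# Toy aligned fragments for illustration only.
reference = "MKTA-YIAK"
candidate = "MKTAGYLAK"
print(round(percent_identity(reference, candidate), 1))
print(meets_identity_threshold(candidate, reference, 88.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the convention should be fixed before comparing a sequence against a threshold such as 88%.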
  • the invention also provides:
  • a PorARc construct comprising two or more covalently attached PorARc pore monomers of the invention;
  • a PorARc pore comprising at least one PorARc pore monomer of the invention or at least one construct of the invention
  • a PorARc pore multimer comprising two or more pores, wherein at least one of the pores is a PorARc pore of the invention
  • a PorARc pore of the invention or a PorARc pore multimer of the invention, which is comprised in a membrane;
  • use of a PorARc pore of the invention or a PorARc pore multimer of the invention to determine the presence, absence or one or more characteristics of a target analyte
  • a kit for characterising a target polynucleotide comprising (a) a PorARc pore of the invention or a PorARc pore multimer of the invention and (b) a polynucleotide binding protein;
  • an apparatus for characterising a target polynucleotide in a sample comprising (a) a plurality of PorARc pores of the invention or a plurality of PorARc pore multimers of the invention and (b) a plurality of polynucleotide binding proteins;
  • a polynucleotide which encodes a PorARc pore monomer of the invention or a PorARc construct of the invention;
  • a kit for characterising a target analyte comprising (a) a PorARc pore of the invention or a PorARc pore multimer of the invention and (b) the components of a membrane;
  • a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s);
  • an apparatus produced by a method comprising (i) obtaining a PorARc pore of the invention or a PorARc pore multimer of the invention and (ii) contacting the chimeric pore or a pore multimer with an in vitro membrane such that the chimeric pore or the pore multimer is inserted in the in vitro membrane.
  • Figure 1 Schematic showing a chimeric pore formed from the cap region (also known as the scaffold) of one pore (A) and the constriction region of another (B).
  • Figure 2 An alignment of the sequences of the constriction pore monomer chimeras of the invention.
  • the dark shading shows the consistency of the cap region (or scaffold) between the chimeras and the lack of shading shows the differences between the constriction regions.
  • Figure 3 Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc from Rhodococcus corynebacteroides (PorARc_Rco) (for comparative purposes).
  • Figure 4 Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc pore from Mycolicibacterium phlei (PorARc_Mph).
  • Figure 5 Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc_Rco_Mel_ONLZ18401_ONLP19805.
  • Figure 6 Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc_Aku_Mph_ONLZ19310_ONLP20864 (SEQ ID NO: 40).
  • Figure 7 SNR, current range (pA) and noise (pA) for all the chimeras tested in Example 1.
  • the comparative line relates to PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N (SEQ ID NO: 2).
  • Figure 8 nMAD for all the chimeras tested in Example 1.
  • the comparative line relates to PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N (SEQ ID NO: 2).
  • Figure 9 Representative ionic current (pA) versus time (s) traces as single stranded DNA translocates through a CsgG chimeric nanopore (CsgG-Eco-Vmae in Table 7) in Example 2.
  • the raw current trace is shown in black lines and the event detected signal is shown in red lines.
  • A shows consecutive DNA translocation events through a single nanopore
  • B shows an individual DNA translocation event
  • C shows a zoomed in view in the x-axis of the first section of the current trace
  • D shows a zoomed in view in the x- and y-axes of the first section of the current trace.
  • Figure 10 SNR, current range (pA) and noise (pA) for the CsgG chimeras tested. The results are shown from left to right in the order in which the chimeras appear in Table 7.
  • the comparative line relates to CsgG wild type pore from E. coli (CsgG-Eco-WT).
  • Figure 11 The structure and size of the wild-type CsgG pore from Escherichia coli strain K12 (the databank accession code for this structure is 4UV3). The distances shown are measured from backbone to backbone of the amino acids forming the pore structure.
  • the CsgG pore is a tightly interconnected symmetrical nonameric pore that resembles a crown.
  • the overall height is 98 Å, and the largest outer diameter is 120 Å. It defines a central channel and consists of three parts: (A) the cap region, (B) the constriction region and (C) the transmembrane beta barrel region.
  • the cap has an axial length, or height, of 39 Å, an inner diameter of 43 Å and a 66 Å mouth.
  • the beta barrel has 36 strands, an axial length of 39 Å and an inner diameter of 55 Å. The transition between the pore cap and the beta barrel is sharp, with the constriction located between them, at the level of the predicted lipid-aqueous interface.
  • the constriction is approximately 18.5 Å in diameter and has a length of 20 Å along the axis of the channel.
  • Figure 12 Structure and dimensions of PorARc (cryoEM structure). The distances are measured from backbone to backbone of the amino acids forming the pore structure.
  • the PorARc pore is a symmetrical octameric pore. The overall height is 90.4 Å, and the largest outer dimension is 90.7 Å.
  • the PorARc pore consists of the cap region (A) and the transmembrane beta barrel region (B) which together make the cap region (or scaffold) (C), and the constriction region (D).
  • the cap region (or scaffold) has a height of 73.6 Å corresponding to the heights of the cap region (A), 44.7 Å, and the transmembrane beta barrel region (B), 26.2 Å.
  • the constriction region (C) has a height of 19.7 Å.
  • the PorARc pore has an overall funnel shape with an entry width of 49 Å which narrows to 41.5 Å at the bottom of the cap region (A) and narrows further to 39.8 Å at the transmembrane beta barrel region (B).
  • the constriction region (C) has a sharp narrowing to 27.4 Å and then widens to 36.1 Å at the base of the pore structure.
  • Figure 13 Representative ionic current (pA) versus time (s) traces as single stranded DNA translocates through a CsgG chimeric nanopore (CsgG-Eco-Vfu in Table 10) in Example 3.
  • A-D are the same as in Figure 9.
  • Figure 14 SNR, current range (pA) and noise (pA) for the CsgG chimeras tested. The results are shown from left to right in the order in which the chimeras appear in Tables 7 and 12. The data for the first nine chimeras from left to right is for the chimeras in Example 2 (and these data are identical to the data in Figure 10). The data for the final three chimeras left to right is for the chimeras in Example 3.
  • the comparative line relates to CsgG wild type pore from E. coli (CsgG-Eco-WT).
  • SEQ ID NO: 1 shows the amino acid sequence of PorARc from Rhodococcus corynebacteroides (PorARc_Rco) with the substitutions E78R/D82S/E116T/E125A/D165S.
  • SEQ ID NO: 2 shows the amino acid sequence of PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N.
  • SEQ ID NO: 50 shows the amino acid sequence of PorARc pore from Mycobacterium sp. (PorARc_Msp) with the substitutions D91N/D92N.
  • SEQ ID NO: 51 shows the amino acid sequence of PorARc pore from Mycolicibacterium rhodesiae (PorARc_Mrh) with the substitutions D91N/D92N.
  • SEQ ID NO: 52 shows the amino acid sequence of PorARc pore from Mycolicibacterium elephantis (PorARc_Mel) with the substitutions D91N/E101Q.
  • SEQ ID NO: 53 shows the amino acid sequence of PorARc pore from Mycolicibacterium cosmeticum (PorARc_Mco) with the substitutions D91N/D92N.
  • SEQ ID NO: 54 shows the amino acid sequence of PorARc pore from unclassified Rhodococcus (WP_056447532.1; PorARc_Rsp) with the substitutions E89Q/D91N/D93N/D100N.
  • SEQ ID NO: 55 shows the amino acid sequence of PorARc pore from Rhodococcus sp. PSBB049 (WP_206003768.1; PorARc_Rsp) with the substitutions D90N/D95N/D103N.
  • SEQ ID NOs: 56-64 and 73-75 show the amino acid sequences of CsgG pores in Table 4 below. The signal peptide in each sequence is underlined. Any discussion below of specific position numbering in SEQ ID NO: 56 (e.g., Q100) or any of SEQ ID NOs: 57-64 and 73-75 excludes the signal peptide.
  • SEQ ID NOs: 65-72 show the amino acid sequences of the CsgG constriction chimeras in Table 7 below. The signal peptide in each sequence is underlined.
  • SEQ ID NOs: 76-78 show the amino acid sequences of the CsgG constriction chimeras in Table 12 below. The signal peptide in each sequence is underlined.
  • reference to “a polynucleotide” includes two or more polynucleotides
  • reference to “a polynucleotide binding protein” includes two or more such proteins
  • reference to “a helicase” includes two or more helicases
  • reference to “a monomer” includes two or more monomers
  • reference to “a pore” includes two or more pores and the like.
  • the / symbol means “or” when it separates alternative substitutions at one position.
  • D91N/Q means D91N or D91Q.
  • the / symbol means “and” when it separates positions, such that D91/D92 is D91 and D92.
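The two uses of the separator symbol described above can be made concrete with a small sketch. The helper names and the regular expression are hypothetical and cover only the simple forms shown in the text (one-letter amino acid codes, one position per token); combined lists of complete substitutions such as D91N/D92N are deliberately outside its scope.

```python
import re

# Hypothetical pattern: original residue, position, then one or more
# replacement residues separated by "/" (e.g. "D91N" or "D91N/Q").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")
_POSITION = re.compile(r"[A-Z]\d+")


def expand_substitution(token):
    """Expand alternatives at one position: 'D91N/Q' gives D91N or D91Q."""
    match = _SUBSTITUTION.match(token)
    if match is None:
        raise ValueError(f"unrecognised substitution: {token!r}")
    original, position, replacements = match.groups()
    return [f"{original}{position}{new}" for new in replacements.split("/")]


def split_positions(token):
    """Split a list of positions: 'D91/D92' gives D91 and D92."""
    parts = token.split("/")
    if not all(_POSITION.fullmatch(part) for part in parts):
        raise ValueError(f"unrecognised position list: {token!r}")
    return parts


print(expand_substitution("D91N/Q"))
print(split_positions("D91/D92"))
```

Keeping the two readings in separate functions mirrors the text: the first returns alternatives of which one applies, and the second returns positions that apply together.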
  • a “pore” in the context of the invention is a transmembrane protein structure defining a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.
  • the translocation of ionic species through the pore may be driven by an electrical potential difference applied to either side of the pore.
  • a “nanopore” is a biological pore in which the minimum diameter of the channel through which molecules or ions pass is in the order of nanometres (10⁻⁹ metres).
  • the pore can be a transmembrane protein pore.
  • the transmembrane protein structure of a biological pore may be monomeric or oligomeric in nature.
  • the pore comprises a plurality of polypeptide monomers or subunits arranged around a central axis thereby forming a protein-lined channel that extends substantially perpendicular to the membrane in which the pore resides.
  • the number of polypeptide monomers or subunits is not limited. Typically, the number of monomers or subunits is from 5 to up to 30, suitably the number of monomers or subunits is from 6 to 10.
  • the portions of the protein monomers or subunits within the pore that form the protein-lined channel typically comprise secondary structural motifs that may include one or more transmembrane β-barrel and/or α-helix sections.
  • the chimeric pore monomers of the invention are formed from at least two different pores, i.e. from at least two monomers from at least two different pores. This means the chimeric pore monomers are formed from pores or pore monomers that in their natural state are/form a pore as defined above. Different pores are defined below.
  • the invention provides a chimeric pore monomer.
  • the chimeric pore monomer is typically a protein or polypeptide.
  • the chimeric pore monomer is capable of forming a pore. This can be measured using routine methods, including any of those described in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety) and in the Examples.
  • the chimeric pore monomer comprises two or more regions.
  • the chimeric pore monomer may comprise any number of regions, such as three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more or ten or more regions.
  • the chimeric pore monomer may comprise three, four, five, six, seven, eight, nine or ten regions.
  • the chimeric pore monomer preferably comprises two regions.
  • the chimeric pore monomer preferably comprises three regions.
  • the chimeric pore monomer preferably comprises five regions.
  • Regions in proteins of unknown structure can be defined by aligning the protein sequence with a homologous protein of known structure. If there is sufficiently high sequence identity or similarity between the sequences, there is confidence in the region boundaries. It is possible to add further confidence in a region boundary through use of tools that predict tertiary structure (e.g., AlphaFold) or secondary structural elements (e.g., PSIPRED). For the latter, if both the protein of unknown structure and the protein of known structure are flanked by the same predicted secondary structural element, then there is more confidence of the region boundary.
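The alignment-based definition of region boundaries described above can be sketched as follows. The sketch assumes a pairwise alignment between the protein of known structure (the reference) and the protein of unknown structure (the query) is already available, with gaps written as "-", and that the region is given as a 1-based inclusive residue range in reference numbering. The function name and the toy sequences are hypothetical.

```python
def map_region(aligned_ref, aligned_query, ref_start, ref_end, gap="-"):
    """Map a region from reference numbering onto query numbering.

    Returns (query_start, query_end), both 1-based and inclusive, taken from
    the first and last aligned residue pairs inside the reference region, or
    None if no query residue aligns with the region.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_pos = 0
    query_pos = 0
    mapped = []
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_pos += 1
        if query_char != gap:
            query_pos += 1
        # Only columns where both sequences have a residue define a boundary.
        if ref_char != gap and query_char != gap and ref_start <= ref_pos <= ref_end:
            mapped.append(query_pos)
    if not mapped:
        return None
    return mapped[0], mapped[-1]


# Toy alignment for illustration only.
aligned_reference = "MKT-AYIAKQR"
aligned_query = "MRTGAY-AKQK"
print(map_region(aligned_reference, aligned_query, 4, 7))
```

Boundaries obtained this way are only as reliable as the alignment, which is why the text suggests checking them against predicted structural elements on both sides of the boundary.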
  • the at least two regions preferably comprise a cap region and a constriction region.
  • the cap region typically forms the structural core of the protein pore and may partially reside in a membrane. When the cap region forms the structural core and partially resides in a membrane, it is also known as a scaffold.
  • the constriction region typically comprises at least one constriction.
  • the “constriction” refers to an aperture defined by a luminal surface of a pore, which acts to allow the passage of ions and target molecules (e.g., but not limited to polynucleotides, polypeptides, or individual nucleotides) but not other non-target molecules through the pore channel.
  • pores formed from the chimeric pore monomers of the invention may comprise two or more constrictions.
  • the constriction(s) are typically the narrowest aperture(s) within a pore or within the channel defined by the pore.
  • the constriction(s) may serve to limit the passage of molecules through the pore.
  • the size of the constriction is typically a key factor in determining suitability of a pore for analyte characterisation. If the constriction is too small, the molecule to be characterised will not be able to pass through. However, to achieve a maximal effect on ion flow through the channel, the constriction should not be too large. For example, the constriction should not be wider than the solvent-accessible transverse diameter of a target analyte.
  • any constriction should be as close as possible in diameter to the transverse diameter of the analyte passing through.
  • the narrowest point in the constriction region preferably forms a constriction at least 5 Å in diameter, such as at least about 10 Å, at least about 15 Å, at least about 18 Å, at least about 20 Å, at least about 25 Å or at least about 27 Å in diameter.
  • the constriction region may comprise any number of constrictions, such as at least two, at least three, at least four, or at least five constrictions.
  • the constriction region typically resides in a membrane.
  • the constriction is preferably transmembrane.
  • the skilled person is capable of identifying cap regions (or scaffolds) and constriction regions in pores. Specific examples of cap regions and constriction regions are provided below.
  • the cap region (or scaffold) in the chimeric pore monomer may be longer, shorter or the same length as the cap region (or scaffold) in the pore which the constriction region in the chimeric pore monomer is from or derived from.
  • the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore which the cap region (or scaffold) in the chimeric pore monomer is from or derived from.
  • the constriction region in the chimeric pore monomer is preferably shorter than the constriction region in the pore which the cap region (or scaffold) in the chimeric pore monomer is from or derived from. Length may be measured in terms of amino acid number and/or length along the sagittal (or longitudinal) plane of the pore.
  • the at least two regions preferably comprise a cap region, a constriction region, and a transmembrane region.
  • the transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region.
  • the transmembrane region is preferably a transmembrane beta barrel region.
  • CsgG pores typically comprise these three regions and they are discussed in more detail below in relation to CsgG pores.
  • the cap region and transmembrane region together are also known as a scaffold.
  • the cap region may further comprise two subregions, namely a landing platform region and a carboxy-terminal (C-terminal) region.
  • the at least two regions preferably comprise a cap region, a landing platform region, a C-terminal region, a constriction region, and a transmembrane region.
  • the transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region.
  • the transmembrane region is preferably a transmembrane beta barrel region.
  • CsgG pores typically comprise these five regions and they are discussed in more detail below in relation to CsgG pores.
  • the cap region, the landing platform region, the C-terminal region and transmembrane region together are also known as a scaffold.
  • the cap region in the chimeric pore monomer may be longer, shorter or the same length as the cap region in the pore(s) which the constriction region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from.
  • the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from.
  • the transmembrane region in the chimeric pore monomer may be longer, shorter or the same length as the transmembrane region in the pore(s) which the cap region and/or the constriction region in the chimeric pore monomer is/are from or derived from.
  • the constriction region in the chimeric pore monomer may be shorter than the constriction region in the pore(s) which the cap region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from.
  • Length may be measured in terms of amino acid number and/or length along the sagittal (or longitudinal) plane of the pore.
  • the chimeric pore monomer preferably comprises two regions.
  • the two regions are preferably a cap region (or scaffold) and a constriction region.
  • the cap region (or scaffold) and the constriction region are typically from different pores.
  • An example of this is shown in Figure 1.
  • the constriction transplants and cap transplants (the latter also known as scaffold transplants) tested in Example 1 are also examples of chimeric pore monomers formed from two different pores.
  • a transplant indicates when a chimeric pore monomer is created by transplanting all or part of a region from one pore monomer into the monomer from a different pore.
  • a constriction transplant involves transplanting all or part of the constriction from a pore A monomer into a pore B monomer.
  • the chimeric pore monomer preferably comprises three regions.
  • the three regions are preferably a cap region, a constriction region, and a transmembrane region.
  • the cap region and transmembrane region together may also be known as a scaffold.
  • the constriction region may be from one pore and the cap region and the transmembrane region (together also known as the scaffold) may be from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores.
  • the cap region, the constriction region, and the transmembrane region may each be from a different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores.
  • the transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region.
  • the transmembrane region is preferably a transmembrane beta barrel region. Such regions are found in CsgG pores.
  • the chimeric pore monomer preferably comprises five regions.
  • the five regions are preferably a cap region, a landing platform region, a C-terminal region, a constriction region, and a transmembrane region.
  • the cap region, the landing platform region, the C-terminal region, and transmembrane region together may also be known as a scaffold.
  • the constriction region may be from one pore and the cap region, the landing platform region, the C-terminal region, and the transmembrane region (together also known as the scaffold) may be from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores.
  • the cap region, the landing platform region, the C-terminal region, the constriction region, and the transmembrane region may each be from a different pore, i.e., the chimeric pore monomer is formed from/derived from five different pores.
  • the transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region.
  • the transmembrane region is preferably a transmembrane beta barrel region. Such regions are found in CsgG pores.
  • the constriction region in the chimeric pore monomer may be formed from the constriction regions of the two different pores.
  • the constriction region may be a hybrid constriction region.
  • part of the constriction region in one pore (e.g., pore A) may be replaced with part of the constriction region of a different pore (e.g., pore B), such that the chimeric pore monomer comprises a cap region (or scaffold) from pore A and a constriction region from pores A and B.
  • the two regions are from two different pores. Any amount or part of the constriction region in one pore may be replaced with any amount or part from the constriction region of the different pore.
  • At least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99% of the constriction region in one pore may be replaced with at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99% of the constriction region from the different pore.
  • At least about 5 amino acids, such as at least about 10, 15, 20, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105 or 110 amino acids, in the constriction region in one pore may be replaced with at least about 5 amino acids, such as at least about 10, 15, 20, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105 or 110 amino acids, from the constriction region of the different pore. Specific examples of this are discussed below with reference to CsgG chimeras.
  • the whole constriction region of one pore may be replaced with the whole constriction region of a different pore.
  • 100% of the constriction region of one pore may be replaced with 100% of the constriction region of a different pore.
  • All of the amino acids in the constriction region of one pore may be replaced with all of the amino acids in the constriction region of a different pore.
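The partial and whole replacement of a constriction region described above can be illustrated with a minimal Python sketch. The function name `replace_segment` and the placeholder sequences are hypothetical and are not taken from the specification; the sketch only shows how a hybrid constriction region and the percentage of the original region that was replaced might be computed.

```python
def replace_segment(region_a: str, segment_b: str, start: int, end: int) -> tuple[str, float]:
    """Replace residues start..end (1-based, inclusive) of region_a with segment_b.

    Returns the hybrid region and the percentage of region_a that was replaced.
    """
    if not (1 <= start <= end <= len(region_a)):
        raise ValueError("start/end must lie within region_a")
    # Keep everything before 'start' and after 'end'; swap in the donor segment.
    hybrid = region_a[:start - 1] + segment_b + region_a[end:]
    percent_replaced = 100.0 * (end - start + 1) / len(region_a)
    return hybrid, percent_replaced


# Hypothetical placeholder sequences (not real pore sequences).
constriction_a = "A" * 20
segment_from_b = "GSGSG"
constriction_b = "G" * 20

# Partial replacement: residues 6-10 of pore A's constriction region.
hybrid, pct = replace_segment(constriction_a, segment_from_b, start=6, end=10)

# Whole replacement: 100% of pore A's constriction region.
whole, pct_whole = replace_segment(constriction_a, constriction_b, 1, len(constriction_a))
```

The replaced and inserted segments need not be the same length, which is why the percentage is computed only against the original region.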
  • At least two of the two or more regions are from, preferably derived from, at least two different pores.
  • the pore may comprise any number of regions from any number of different pores as long as at least two of the regions are from or derived from two different pores.
  • the chimeric pore monomer may comprise two or more different regions from at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine or at least ten different pores.
  • the number of two or more regions is typically the same as the number of at least two different pores with each region being from or derived from a different pore.
  • the chimeric pore monomer preferably comprises two regions from or derived from two different pores.
  • the chimeric pore monomer preferably comprises three regions from or derived from three different pores.
  • the chimeric pore monomer preferably comprises five regions from or derived from five different pores.
  • the regions from at least two different pores are typically attached, preferably covalently attached, to each other to form the chimeric pore monomer.
  • the regions may be attached directly or may be attached using one or more linkers, preferably one or more peptide linkers. Suitable linkers are discussed below with reference to the constructs of the invention.
  • the chimeric pore monomer of the invention is typically created by genetic fusion of the two or more regions.
  • a polynucleotide encoding the chimeric pore monomer can be designed using the sequences of the at least two different pores. The polynucleotide can then be used to express the chimeric pore monomer as a genetic fusion. This is discussed in more detail below.
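As a rough illustration of the fusion described above, the following Python sketch joins region sequences at the amino acid level, either directly or through a peptide linker. The function `assemble_chimera`, the placeholder sequences and the `GGS` linker are assumptions made for illustration only; in practice the fusion is made at the polynucleotide level, and the order and boundaries of the regions depend on the pores used.

```python
def assemble_chimera(regions: list[str], linker: str = "") -> str:
    """Join region sequences in N- to C-terminal order, optionally via a peptide linker."""
    if len(regions) < 2:
        raise ValueError("a chimeric monomer needs at least two regions")
    return linker.join(regions)


# Hypothetical placeholder sequences (not real pore sequences).
cap_from_pore_a = "MKTAYIAK"
constriction_from_pore_b = "QRQISFVK"

# Direct attachment of the two regions.
direct = assemble_chimera([cap_from_pore_a, constriction_from_pore_b])

# Attachment through a hypothetical peptide linker.
linked = assemble_chimera([cap_from_pore_a, constriction_from_pore_b], linker="GGS")
```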
  • a region is "from" or "derived from" a pore if it shares significant homology/identity with a region from the pore.
  • the region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore.
  • the region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having 100% homology to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore.
  • the region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore.
  • the region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having 100% identity to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore. Homology and/or identity is typically measured over the entire length of the region.
  • the cap region, the constriction region or the transmembrane region preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence of the cap region, the constriction region or the transmembrane region from the pore.
  • the cap region, the constriction region or the transmembrane region preferably comprises a sequence having 100% homology to the sequence of the cap region, the constriction region, or the transmembrane region from the pore.
  • the cap region, the constriction region or the transmembrane region preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence of the cap region, the constriction region or the transmembrane region from the pore.
  • the cap region, the constriction region or the transmembrane region preferably comprises a sequence having 100% identity to the sequence of the cap region, the constriction region, or the transmembrane region from the pore. Homology and/or identity is typically measured over the entire length of the region.
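The idea of measuring identity over the entire length of a region can be sketched as follows. This is a minimal Python illustration that assumes the two region sequences have already been aligned by an external tool and are supplied as equal-length strings with `-` marking gaps; the function name `percent_identity` and the choice of the alignment length as the denominator are assumptions, and alignment programs may define percent identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences over the full alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Hypothetical placeholder regions (not real pore sequences).
one_substitution = percent_identity("ACDEFGHIKL", "ACDEYGHIKL")
one_gap = percent_identity("ACDE-GHIKL", "ACDEFGHIKL")
identical = percent_identity("ACDEFGHIKL", "ACDEFGHIKL")
```

A region with one substitution or one gap in ten aligned positions scores 90% here, and an unchanged region scores 100%.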
  • the chimeric pore monomer comprises at least two regions from or derived from at least two different pores.
  • the at least two different pores are typically at least two different pores that appear in nature.
  • the at least two different pores are typically at least two different wild-type or naturally occurring pores.
  • the at least two different pores are preferably different before any artificial or synthetic modifications, such as additions, deletions and/or substitutions, are made to them.
  • the at least two different pores are preferably different before any modifications, such as additions, deletions and/or substitutions, are made to their wild-type or naturally occurring sequences.
  • one or more of the at least two regions or two regions from or derived from the at least two different pores or two different pores may be modified when constructing the chimeric pore monomer in accordance with the invention.
  • one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • the at least two different pores are preferably homologues, for example structural homologues.
  • a structural homologue refers to a protein or molecule that shares a similar three-dimensional structure with another protein or molecule. This can be determined using standard methods in the art (e.g., AlphaFold or PSIPRED).
  • Structural homologues typically have similar sequences. Structural homologues are normally identified in similar species.
  • the at least two different pores may be at least two PorARc pores selected from Table 2 below.
  • the at least two different pores may be at least two CsgG pores selected from Table 4 below. These show structural homologues amongst different species.
  • Each region in the chimeric pore monomer typically shares low homology or identity with the corresponding region in the different pore(s) which the other region(s) in the chimeric pore monomer are from or derived from.
  • a region corresponds to a region in a different pore when they share a similar structure and/or function.
  • a region may correspond to a region in a different pore when they share a similar sequence. This can be determined as discussed above.
  • the cap region (or scaffold) in one pore corresponds to the cap region (or scaffold) in a different pore.
  • the cap region in one pore corresponds to the cap region in a different pore.
  • the constriction region in one pore corresponds to the constriction region in a different pore.
  • the transmembrane region in one pore corresponds to the transmembrane region in a different pore.
  • the transmembrane beta barrel region in one pore corresponds to the transmembrane beta barrel region in a
  • Each region in the chimeric pore preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the corresponding region in the different pore(s) which the other region(s) in the chimeric pore monomer are from or derived from.
  • Each region in the chimeric pore preferably comprises a sequence that is about 99% or less, about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the corresponding region in the different pore(s) which the other region(s) in the chimeric pore are from or derived from.
  • Each region preferably comprises the cap region (or scaffold) and the constriction region.
  • Each region preferably comprises the cap region, the constriction region, or the transmembrane region. Homology and/or identity is typically measured over the entire length of the region.
  • the chimeric pore monomer comprises a cap region (or scaffold) and a constriction region from two different pores.
  • the chimeric pore monomer comprises a constriction region formed from two different pores and a cap region (or scaffold) from one of the different pores, i.e., the chimeric pore monomer is formed from/derived from two different pores.
  • the cap region (or scaffold) in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the cap region (or scaffold) in the pore which the constriction region is from or derived from.
  • the cap region (or scaffold) in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the cap region (or scaffold) in the pore which the constriction region is from or derived from.
  • the constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the constriction region in the pore which the cap region (or scaffold) is from or derived from.
  • the constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the constriction region in the pore which the cap region (or scaffold) is from or derived from. Homology and/or identity is typically measured over the entire length of the region.
  • the chimeric pore monomer comprises a constriction region from one pore and a cap region and a transmembrane region from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores.
  • the chimeric pore monomer comprises a constriction region formed from two different pores and a cap region and a transmembrane region from one of the different pores, i.e., the chimeric pore monomer is formed from/derived from two different pores.
  • the constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the constriction region in the pore which the cap region and transmembrane region are from or derived from.
  • the constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 94% or less, about 90% or less, about 85% or less, about 82% or less, about 80% or less, about 75% or less, about 72% or less, about 70% or less, about 69% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the constriction region in the pore which the cap region and transmembrane region are from or derived from.
  • the cap region and/or the transmembrane region in the chimeric pore monomer preferably comprise(s) a sequence/sequences that is/are about 99% or less homologous or identical to the sequence(s) of the cap region and/or the transmembrane region in the pore which the constriction region is from or derived from.
  • the cap region and/or the transmembrane region in the chimeric pore monomer more preferably comprise(s) a sequence/sequences that is/are about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the cap region and/or the transmembrane region in the pore which the constriction region is from or derived from. Homology and/or identity is typically measured over the entire length of the region.
  • the chimeric pore monomer comprises a constriction region, a cap region, and a transmembrane region each from a different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores.
  • the chimeric pore monomer comprises a constriction region formed from two different pores, a transmembrane region from one of the two different pores and a cap region from a third different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores.
  • the chimeric pore monomer comprises a constriction region, a cap region, a landing platform region, a C-terminal region, and a transmembrane region each from a different pore, i.e., the chimeric pore monomer is formed from/derived from five different pores.
  • the chimeric pore monomer comprises a constriction region formed from two different pores, a transmembrane region from one of the two different pores and a cap region, a landing platform region, a C-terminal region, from three different pores, i.e., the chimeric pore monomer is formed from/derived from five different pores.
  • the constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the constriction region in the pore(s) which the cap region and/or transmembrane region is/are from or derived from.
  • the constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 94% or less, about 90% or less, about 85% or less, about 82% or less, about 80% or less, about 75% or less, about 72% or less, about 70% or less, about 69% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the constriction region in the pore(s) which the cap region and/or transmembrane region is/are from or derived from.
  • the cap region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the cap region in the pore(s) which the constriction region and/or transmembrane regions is/are from or derived from.
  • the cap region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the cap region in the pore(s) which the constriction region and/or the transmembrane region is/are from or derived from.
  • the transmembrane region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the transmembrane region in the pore(s) which the cap region and/or constriction region is/are from or derived from.
  • the transmembrane region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the transmembrane region in the pore(s) which the cap region and/or constriction region is/are from or derived from. Homology and/or identity is typically measured over the entire length of the region.
  • the constriction region in the chimeric pore monomer preferably comprises at least about 1 amino acid difference, such as at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
  • where the constriction region in the chimeric pore monomer is formed from the constriction regions of two different pores (i.e., is a hybrid constriction region), it preferably comprises at least about 1 amino acid difference, such as at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
  • the entire or complete chimeric pore monomer preferably comprises a sequence that is about 96% or less homologous or identical to the wild type monomer sequences of the at least two different pores.
  • the chimeric pore monomer preferably comprises a sequence that is about 96% or less homologous or identical to the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived.
  • the chimeric pore monomer more preferably comprises a sequence that is about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the wild type monomer sequences of the at least two different pores or the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the entire or complete chimeric pore monomer preferably comprises a sequence that is about 99.7% or less homologous or identical to the wild type monomer sequences of the at least two different pores.
  • the chimeric pore monomer preferably comprises a sequence that is about 99.7% or less homologous or identical to the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived.
  • the chimeric pore monomer more preferably comprises a sequence that is about 99.6% or less, about 99.5% or less, about 99.4% or less, about 99.3% or less, about
  • Standard methods in the art may be used to determine homology or identity.
  • the UWGCG Package provides the BESTFIT program which can be used to calculate homology or identity, for example used on its default settings (Devereux et al (1984) Nucleic Acids Research 12, p387-395).
  • the PILEUP and BLAST algorithms can be used to calculate homology and identity or line up sequences (such as identifying equivalent residues or corresponding sequences (typically on their default settings)), for example as described in Altschul S. F. (1993) J Mol Evol 36:290-300; Altschul, S.F et al (1990) J Mol Biol 215:403-10.
  • the term "pore" is well known to a skilled person and typically refers to a biological pore, i.e., a typical protein structure which defines a channel.
  • the term "pore" typically relates to a structure, preferably a protein structure, that in its native state associates with or crosses a membrane, such as a cell membrane. Other ring-like or channel-like structures are preferably not included.
  • the at least two different pores or the two different pores preferably do not comprise a proteasome.
  • the at least two different pores or the two different pores preferably do not comprise mouse proteasome activator 28α (also called REG or 11S activators).
  • the at least two different pores or the two different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, and Clostridium perfringens beta toxin.
  • the at least two different pores or the two different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, Clostridium perfringens beta toxin, parasporin-2, epsilon toxin, lectin from the parasitic mushroom Laetiporus sulphureus (LSL), volvatoxin, Cry toxins, Cyt1Aa and
  • the at least three different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, and Clostridium perfringens beta toxin.
  • the five different pores may be selected from any of these pores.
  • the at least three different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, Clostridium perfringens beta toxin, parasporin-2, epsilon toxin, lectin from the parasitic mushroom Laetiporus sulphureus (LSL), volvatoxin, Cry toxins, Cyt1Aa and Cyt2Aa.
  • the five different pores may be selected from any of these pores.
  • the pore may be from any species.
  • the different pores may be selected from any of the pores from any of the species.
  • the two different pores may be two different CsgG pores.
  • the three different pores may be three different CsgG pores.
  • the at least two different pores may be two different PorARc pores or three different PorARc pores.
  • the at least two different pores may be five different PorARc pores.
  • the PorARc pore or two different PorARc pores preferably comprise(s) a cap region (or scaffold) (e.g., C in Figure 12) and a constriction region (e.g., D in Figure 12).
  • the PorARc pore, two different PorARc pores or three different PorARc pores preferably comprise(s) one or more of (a) a cap region (e.g.
  • the chimeric pore monomer preferably comprises one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c).
  • the chimeric pore monomer preferably comprises (a)-(c).
  • the chimeric pore monomer preferably comprises (a) and (c) from one PorARc pore and (b) from a different PorARc pore.
  • the chimeric pore monomer preferably comprises (a), (b) and (c) each from a different PorARc pore or (a), (b) and (c) from three different PorARc pores.
  • the PorARc pore, two different PorARc pores or three different PorARc pores may have any structure but preferably has/have or comprise(s) the structure of the wild-type PorARc pore (Figure 12).
  • the protein structure of PorARc defines a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.
  • the PorARc pore, two different PorARc pores or three different PorARc pores may be any size but preferably has/have the dimensions of the wild-type PorARc_Rco (Figure 12).
  • the PorARc pore or at least two different PorARc pores preferably has/have an external diameter of from about 70 to about 110 Å at its widest point, such as from about 80 to about 100 Å or from about 85 to about 95 Å at its widest point.
  • the PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have an external diameter of about 90.7 Å at its widest point.
  • the PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have a total length of from about 70 to about 110 Å, such as from about 80 to about 100 Å or from about 85 to about 95 Å.
  • the PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have a total length of about 90.4 Å. References to "total length" and "length" relate to the length of the pore or pore region when viewed from the side (see, e.g., the side view in Figure 12).
  • the cap region (A in Figure 12) preferably has a length of from about 25 to about 65 Å, such as from about 35 to about 55 Å or from about 40 to about 50 Å.
  • the cap region preferably has a length of about 44.7 Å.
  • the channel defined by the cap region preferably has an opening of from about 30 to about 70 Å in diameter, such as from about 40 to about 60 Å or from about 45 to about 55 Å in diameter.
  • the channel defined by the cap region preferably has an opening of about 49 Å in diameter.
  • the channel defined by the cap region is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point.
  • the channel defined by the cap region is preferably about 41.5 Å in diameter at its narrowest point.
  • the transmembrane beta barrel region (B in Figure 12) preferably has a length of from about 5 to about 45 Å, such as from about 15 to about 35 Å or from about 20 to about 30 Å.
  • the transmembrane beta barrel preferably has a length of about 26.2 Å.
  • the channel defined by the transmembrane beta barrel region is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point.
  • the channel defined by the transmembrane beta barrel region is preferably about 39.8 Å in diameter at its narrowest point.
  • the cap region (or scaffold) (C in Figure 12 and formed from A and B) preferably has a length of from about 55 to about 95 Å, such as from about 65 to about 85 Å or from about 70 to about 80 Å.
  • the cap region (or scaffold) preferably has a length of about 73.6 Å.
  • the channel defined by the cap region (or scaffold) (C) is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point.
  • the channel defined by the cap region (or scaffold) (C) is preferably about 39.8 Å in diameter at its narrowest point.
  • the constriction region (D in Figure 12) preferably has a length of from about 5 to about 40 Å, such as from about 10 to about 30 Å or from about 15 to about 25 Å.
  • the constriction region preferably has a length of about 19.7 Å.
  • the channel defined by the constriction region is preferably from about 10 to about 50 Å in diameter at its narrowest point, such as from about 20 to about 40 Å, from about 22 to about 32 Å or from about 25 to about 35 Å in diameter at its narrowest point.
  • the channel defined by the constriction region is preferably about 27.4 Å in diameter at its narrowest point.
  • the channel defined by the constriction region is preferably from about 10 to about 50 Å in diameter at its narrowest point, such as from about 15 to about 55 Å, from about 25 to about 45 Å or from about 30 to about 40 Å in diameter at the base of the pore structure.
  • the channel defined by the constriction region is preferably about 36.1 Å in diameter at the base of the pore structure.
  • the constriction region is preferably from about 20 to about 60 Å in diameter, such as from about 30 to about 50 Å or from about 35 to about 45 Å.
  • the constriction region is preferably about 41.9 Å in diameter. All of the measurements above are based on measuring from backbone to backbone of the amino acids forming the different regions (as shown in Figure 12).
  • the cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 67 to about 187 amino acids in length.
  • the cap region (or scaffold) in the PorARc, at least two different PorARc pores or at least three different PorARc pores is preferably about 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
  • the cap region (or scaffold) in the PorARc, at least two different PorARc pores or at least three different PorARc pores is preferably from about 87 to about 167 amino acids in length.
  • the cap region (or scaffold) in the PorARc, at least two different PorARc pores or at least three different PorARc pores is preferably about 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
  • the cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 87, 159, 160, 166, or 167 amino acids in length.
  • the constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 5 to about 114 amino acids in length.
  • the constriction region in the PorARc, at least two different PorARc pores or at least three different PorARc pores is preferably about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
  • the constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 15 to about 94 amino acids in length.
  • the constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, or 94 amino acids in length.
  • the constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 15, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 39, 50, 52, 86 or 94 amino acids in length.
  • the PorARc pore is preferably selected from the pores in Table 2.
  • the (fourth) Reference column provides a reference to the wild-type (or naturally occurring) sequence of each pore on GenBank.
  • the PorARc pore may be selected from any of the wild-type pores in Table 2 (i.e., from the references in the fourth column).
  • the PorARc pore may be selected from any of the wild-type pores in Table 2 with the signal peptide removed and a methionine (M) at the N terminus (i.e., at position 1). The skilled person can determine these sequences from the references in the fourth column.
  • Preferred cap and constriction regions for each pore are shown in the fifth and sixth columns of Table 2.
  • the residue numbering in the fifth and sixth columns corresponds to the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1).
  • the residue numbering of the cap and constriction regions will need to be adjusted for the wild-type (or naturally occurring) sequences in the references in the fourth column.
  • the cap and constriction regions shown in the fifth and sixth columns of Table 2 are preferred cap and constriction regions.
  • the invention covers additional cap and constriction regions which differ from those shown in the fifth and sixth columns by ± about 10 amino acids. For instance, in PorARc_Rco (pore 1):
  • 1-86 includes 1-76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
  • 107-180 includes 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116 or 117-170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189 or 190
  • 87-106 includes 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93,
  • one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • One or more negatively charged amino acids in any of the pores in Table 2 may be removed, for instance by deletion or substitution.
  • One or more negatively charged amino acids in any of the pores in Table 2, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids and/or one or more uncharged amino acids.
  • Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more.
  • the one or more negatively charged amino acids are preferably in the constriction region of the pore.
  • the preferred constriction region for each pore is defined in the sixth column of Table 2. This applies to any of the sequences in Table 2, including the wild-type sequences and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). Preferred constriction substitutions for each pore are shown in the seventh column.
  • the PorARc pore may be selected from any of the pores in Table 2 comprising substitution of the negatively charged amino acid, such as D or E, with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of the positions shown, preferably all of the positions shown.
  • the PorARc pore may be pore 2 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 92.
  • the PorARc pore may be pore 3, pore 8 or pore 19 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 92.
  • the PorARc pore may be pore 17 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 101.
  • the PorARc pore may be pore 25 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, position 89, 91, 93 and 100.
  • the PorARc pore may be pore 27 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, positions 90, 95 and 103.
  • the PorARc pore may be selected from any of the pores in Table 2 comprising one or more of, or all of, the substitutions shown in the seventh column.
  • the PorARc pore may be pore 2 in Table 2 with D91N and/or D92N.
  • the PorARc pore may be pore 3, pore 8 or pore 19 with D91N and/or D92N.
  • the PorARc pore may be pore 17 with D91N and/or E101Q.
  • the PorARc pore may be pore 25 with one or more of, preferably all of, E89Q, D91N, D93N and D100N.
  • the PorARc pore may be pore 27 with one or more of, preferably all of, D90N, D95N and D103N.
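The substitution shorthand used above (for example D91N, meaning the D at position 91 is replaced with N) can be applied mechanically once the numbering convention is fixed. The following is a minimal sketch, assuming 1-based numbering on the sequence without the signal peptide and with M at position 1. The function name `apply_substitutions` and the toy sequence are hypothetical illustrations and are not taken from Table 2 or the sequence listing.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement (e.g. "D91N").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with each substitution applied.

    Raises ValueError if a substitution is malformed, out of range, or if the
    residue found at the position is not the one the substitution names.
    """
    residues = list(sequence)
    for substitution in substitutions:
        match = _SUBSTITUTION.match(substitution)
        if match is None:
            raise ValueError(f"unrecognised substitution: {substitution!r}")
        old = match.group(1)
        position = int(match.group(2))
        new = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"expected {old} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy five-residue sequence (hypothetical): M1 A2 D3 E4 K5.
toy_sequence = "MADEK"
modified = apply_substitutions(toy_sequence, ["D3N", "E4Q"])
```

Checking the named wild-type residue before replacing it guards against applying a substitution to a sequence that still carries its signal peptide, where the numbering would be shifted.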
  • the PorARc pore preferably comprises or consists of (a) the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55 or (b) a sequence having at least about 20% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. Such sequences are discussed in more detail below.
  • the PorARc pore may be pore 1 in Table 2 with one or more cap substitutions.
  • the PorARc pore is preferably pore 1 in Table 2 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, positions 78, 82, 116, 125 and 165.
  • the PorARc pore is preferably pore 1 in Table 2 with one or more of, preferably all of, E78R, D82S, E116T, E125A and D165S.
  • the PorARc pore preferably comprises or consists of the sequence shown in SEQ ID NO: 1.
  • the PorARc pore may be pore 1 in Table 2 with one or more constriction substitutions.
  • the PorARc pore is preferably pore 1 in Table 2 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 89 and/or position 104.
  • the PorARc pore is preferably pore 1 in Table 2 with E89R, E89Q, E89L or E89A and/or D104S or D104N, such as E89R/D104S, E89Q/D104S, E89Q/D104N, E89L/D104S, E89L/D104N, E89A/D104S or E89A/D104N.
  • the PorARc pore is preferably pore 1 in Table 2 with E89R and/or D104S, such as E89R/D104S. This applies to any of the pore 1 sequences in Table 2, including the wild-type sequence and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). These one or more constriction substitutions may be made in addition to the one or more cap substitutions discussed above.
  • the at least two different pores preferably comprise at least two different PorARc pores.
  • the two different pores are preferably two different PorARc pores.
  • the at least two or two different PorARc pores may be selected from the pores in Table 2.
  • One of the at least two different pores is preferably PorARc_Rco (pore 1 in Table 2) or PorARc_Mph (pore 2 in Table 2).
  • One of the two different pores is preferably PorARc_Rco (pore 1 in Table 2) or PorARc_Mph (pore 2 in Table 2).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) PorARc_Rco or PorARc_Mph and (b) one of the pores in Table 2.
  • the one of the pores in Table 2 is preferably pore 3, 8, 17, 19, 20, 25 or 27.
  • the at least two different pores preferably comprise or the two different pores preferably are PorARc_Rco and PorARc_Rco_Mph.
  • Any reference in this paragraph to a particular pore in Table 2 includes any of the pores discussed above in relation to that pore in Table 2, including the wild-type sequence, the sequence lacking the signal peptide and including an M at the N terminus (i.e., at position 1) and either of such pores comprising the one or more substitutions discussed above.
  • PorARc_Rco pore preferably comprises or consists of SEQ ID NO: 1.
  • PorARc_Mph preferably comprises or consists of SEQ ID NO: 2.
  • the constriction transplants and cap transplants (the latter also known as scaffold transplants) in Example 1 are formed from different PorARc pores.
  • the at least two different pores preferably comprise or the two different pores preferably are the combination of two different PorARc pores in any of those transplants.
  • the at least two different pores preferably comprise or the two different pores preferably are any combination shown in a row in Table 3.
  • the cap region (or scaffold) is preferably from or derived from the pore in column A and the constriction region is preferably from or derived from the pore in column B.
  • the constriction region is preferably from or derived from the pore in column A and the cap region (or scaffold) is preferably from or derived from the pore in column B.
  • a reference in Table 3 to a particular pore in Table 2 includes any of the pores discussed above in relation to that pore in Table 2, including the wild-type sequence, the sequence lacking the signal peptide and including an M at the N terminus (i.e., at position 1) and either of such pores comprising the one or more substitutions discussed above.
  • Where preferred pores have a sequence identifier number, these are also shown in Table 3.
  • the pore preferably comprises or consists of the sequence identifier number.
  • the chimeric pore monomer preferably comprises a sequence having at least about 40% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49.
  • the chimeric pore monomer preferably comprises a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49.
  • SEQ ID NOs: 3-49 is of course equivalent to SEQ ID NO: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 or 49.
  • Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 3-49, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 3-49.
  • Any one of SEQ ID NOs: 3-49 is of course equivalent to SEQ ID NO: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 or 49.
  • Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
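As an illustration of measuring identity over the entire length of the chimeric pore monomer, the sketch below computes percent identity from two sequences that have already been aligned. The text does not specify an alignment method or the exact denominator, so both are assumptions here: the input is taken to be pre-aligned with `-` as the gap character, and the denominator is the number of residues in the monomer (the query). The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Percent identity of a pre-aligned query against a reference.

    The denominator is the number of residues in the query (gap columns in
    the query are ignored), so identity is expressed over the query's
    entire length. This denominator is an assumption of the sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    query_length = sum(1 for residue in aligned_query if residue != gap)
    if query_length == 0:
        raise ValueError("query contains no residues")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q != gap and q == r
    )
    return 100.0 * matches / query_length


# Toy alignment (hypothetical sequences): one mismatch in five residues.
identity = percent_identity("MADEK", "MANEK")
meets_threshold = identity >= 40.0  # e.g. the "at least about 40%" level
```

Other conventions (for example dividing by the alignment length or by the shorter sequence) give different percentages for the same alignment, so the convention should be stated alongside any reported value.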
  • the chimeric pore monomer preferably does not comprise the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions, three regions or five regions, are from or derived from.
  • the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions, three regions or five regions, are from or derived from.
  • the at least two different pores preferably comprise or the two different pores are preferably (a) PorARc_Rco and MspA, (b) two different CsgG pores or three different CsgG pores, (c) alpha-hemolysin and CytK, or (d) NetB and CytK.
  • the at least two different pores preferably comprise or the two different pores are preferably (a) PorARc_Rco and MspA, (b) two different CsgG pores, (c) alpha-hemolysin and CytK, or (d) NetB and CytK.
  • the at least two different pores preferably comprise five different CsgG pores.
  • the two different CsgG pores or three different CsgG pores in (b) may be selected from any of the pores disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068, CN 113773373 A, CN 113896776 A, CN 113912683 A, and CN 113754743 A or a variant thereof (all incorporated by reference herein in their entirety).
  • the five different CsgG pores may be selected from any of the pores disclosed in these references.
  • the two different CsgG pores may be CsgG from E. coli and CsgG from a different species.
  • the three different CsgG pores may be CsgG from E. coli and CsgG from two different non-E. coli species.
  • the five different CsgG pores may be CsgG from E. coli and CsgG from four different non-E. coli species.
  • the CsgG pore, two different CsgG pores or three different CsgG pores preferably comprise(s) one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c).
  • a cap region (a) in CsgG is also known as the rim. Together, (a) and (c) are also known as a scaffold.
  • the chimeric pore monomer preferably comprises one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c).
  • the chimeric pore monomer preferably comprises (a)-(c).
  • the chimeric pore monomer preferably comprises (a) and (c) from one CsgG pore and (b) from a different CsgG pore.
  • the chimeric pore monomer preferably comprises (a), (b) and (c) each from a different CsgG pore or (a), (b) and (c) from three different CsgG pores.
  • the residues of SEQ ID NOs: 56-64 which form these regions are defined below.
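The arrangement in which the cap and transmembrane beta barrel regions come from one pore and the constriction region from a different pore amounts to splicing one residue span into another sequence. The following is a minimal sketch under stated assumptions: spans are 1-based and inclusive, the helper name `transplant_region` is hypothetical, and the sequences are toy strings rather than any sequence from the sequence listing.

```python
def transplant_region(scaffold, scaffold_span, donor, donor_span):
    """Replace a span of `scaffold` with a span taken from `donor`.

    Spans are (start, end) tuples, 1-based and inclusive. The donor span may
    be longer or shorter than the scaffold span it replaces, so the chimeric
    sequence can differ in length from the scaffold.
    """
    for name, sequence, (start, end) in (
        ("scaffold", scaffold, scaffold_span),
        ("donor", donor, donor_span),
    ):
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"{name} span {start}-{end} is out of range")
    s_start, s_end = scaffold_span
    d_start, d_end = donor_span
    return scaffold[: s_start - 1] + donor[d_start - 1 : d_end] + scaffold[s_end:]


# Toy sequences (hypothetical): the scaffold's constriction is residues 6-10,
# the donor's constriction is residues 5-11.
scaffold = "AAAAAGGGGGCCCCC"
donor = "TTTTWWWWWWWTTTT"
chimera = transplant_region(scaffold, (6, 10), donor, (5, 11))
```

In this toy case the donor span is two residues longer than the span it replaces, so the chimeric sequence is two residues longer than the scaffold.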
  • the CsgG pore, two different CsgG pores or three different CsgG pores may have any structure but preferably has/have or comprise(s) the structure of the wild-type CsgG pore (Figure 11). This also applies to five different CsgG pores.
  • the protein structure of CsgG defines a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.
  • constriction refers to an aperture defined by a luminal surface of a pore or pore complex, which acts to allow the passage of ions and target molecules (e.g., but not limited to polynucleotides or individual nucleotides) but not other non-target molecules through the pore or pore complex channel.
  • the constriction(s) are typically the narrowest aperture(s) within a pore or pore complex or within the channel defined by the pore or pore complex. The constriction(s) may serve to limit the passage of molecules through the pore.
  • the size of the constriction is typically a key factor in determining suitability of a pore or pore complex for analyte characterisation. If the constriction is too small, the molecule to be characterised will not be able to pass through. However, to achieve a maximal effect on ion flow through the channel, the constriction should not be too large. For example, the constriction should not be wider than the solvent-accessible transverse diameter of a target analyte. Ideally, any constriction should be as close as possible in diameter to the transverse diameter of the analyte passing through.
  • the CsgG pore, two different CsgG pores or three different CsgG pores may be any size but preferably has/have the dimensions of the wild-type CsgG pore (Figure 11).
  • the CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have an external diameter of from about 100 to about 150 Å at its widest point, such as from about 110 to about 140 Å or from about 115 to about 125 Å at its widest point.
  • the CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have an external diameter of about 120 Å at its widest point.
  • the CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have a total length of from about 80 to about 120 Å, such as from about 90 to about 110 Å or from about 95 to about 105 Å.
  • the CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have a total length of about 98 Å.
  • References to "total length" and "length" relate to the length of the pore or pore region when viewed from the side (see, e.g., the side view in Figure 11). These sizes also apply to the five different CsgG pores.
  • the cap region preferably has a length of from about 20 to about 60 Å, such as from about 30 to about 50 Å or from about 35 to about 45 Å.
  • the cap region preferably has a length of about 39 Å.
  • the channel defined by the cap region preferably has an opening of from about 45 to about 85 Å in diameter, such as from about 55 to about 75 Å or from about 60 to about 70 Å in diameter.
  • the channel defined by the cap region preferably has an opening of about 66 Å in diameter.
  • the channel defined by the cap region is preferably from about 30 to about 70 Å in diameter at its narrowest point, such as from about 35 to about 60 Å or from about 40 to about 50 Å in diameter at its narrowest point.
  • the channel defined by the cap region is preferably about 43 Å in diameter at its narrowest point.
  • the constriction region preferably has a length of from about 5 to about 40 Å, such as from about 10 to about 30 Å or from about 15 to about 25 Å.
  • the constriction region preferably has a length of about 20 Å.
  • the channel defined by the constriction region is preferably from about 2 to about 40 Å in diameter at its narrowest point, such as from about 5 to about 35 Å, from about 8 to about 25 Å or from about 10 to about 20 Å in diameter at its narrowest point.
  • the channel defined by the constriction region is preferably about 9 Å or 12 Å in diameter.
  • the channel defined by the constriction region is preferably about 18.5 Å in diameter.
  • the constriction is preferably from about 2 to about 40 Å in diameter, such as from about 5 to about 35 Å, from about 8 to about 25 Å or from about 10 to about 20 Å in diameter.
  • the constriction is preferably about 9 Å or 12 Å in diameter.
  • the constriction is preferably about 12 Å in diameter.
  • the transmembrane beta barrel region preferably has a length of from about 20 to about 60 Å, such as from about 30 to about 50 Å or from about 35 to about 45 Å.
  • the transmembrane beta barrel preferably has a length of about 39 Å.
  • the channel defined by the transmembrane beta barrel region is preferably from about 35 to about 75 Å in diameter at its narrowest point, such as from about 45 to about 65 Å or from about 50 to about 60 Å in diameter at its narrowest point.
  • the channel defined by the transmembrane beta barrel region is preferably about 55 Å in diameter at its narrowest point.
  • the cap region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 164 to about 210 amino acids in length.
  • the cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209 or 210 amino acids in length.
  • the cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 184 to about 190 amino acids in length.
  • the cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 184, 185, 186, 187, 188, 189 or 190 amino acids in length. These lengths equally apply to the five different CsgG pores.
  • the cap region preferably comprises a landing platform region and a carboxy-terminal (C-terminal) region.
  • the constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 6 to about 46 amino acids in length.
  • the constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46 amino acids in length.
  • the constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 16 to about 36 amino acids in length.
  • the constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35 or 36 amino acids in length.
  • the constriction in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 26 amino acids in length. These lengths equally apply to the five different CsgG pores.
  • the transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 28 to about 68 amino acids in length.
  • the transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67 or 68 amino acids in length.
  • the transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 48 amino acids in length. These lengths equally apply to the five different CsgG pores.
  • CsgG pores are highly conserved (as can be readily appreciated from Figures 45 to 47 of WO 2017/149317).
  • the CsgG pore, two different CsgG pores or three different CsgG pores may have any of the sequences shown in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) and comprise any of the modifications or mutations disclosed therein.
  • the CsgG pore, two different CsgG pores or three different CsgG pores may also be any of the sequences shown in CN 113773373 A, CN 113896776 A, CN 113912683 A, and CN 113754743 A or variants thereof.
  • the five different CsgG pores may have any of these sequences. It will further be appreciated that the invention extends to other variant CsgG pores not expressly identified in the specification that show highly conserved regions.
  • the CsgG pore, the two different CsgG pores or the three different CsgG pores preferably is/are selected from the pores in Table 4.
  • the five different CsgG pores are preferably selected from the pores in Table 4.
  • the fourth column provides the SEQ ID NO: for the wild-type pore.
  • the CsgG pore, two different CsgG pores or three different CsgG pores may be selected from the wild-type pores in Table 4 (i.e., from the sequences in the fourth column).
  • the CsgG pore, two different CsgG pores or three different CsgG pores may be selected from the wild-type pores in Table 4 with the signal peptide removed, i.e., from SEQ ID NOs: 56-64 with the signal peptide removed or from SEQ ID NO: 55-64 and 73-75 with the signal peptide removed.
  • the five different CsgG pores may be selected from any of these pores. The skilled person can determine these sequences from the sequences in the sequence listing where the signal peptide is underlined.
  • Preferred cap, constriction and transmembrane beta barrel regions for each pore are shown in the fifth, sixth and seventh columns of Table 4. These preferred regions may be used to construct the chimeric pore monomers of the invention as discussed above.
  • the residue numbering in the fifth, sixth and seventh columns corresponds to the wild-type sequences without the signal peptide. The skilled person can determine these sequences from the sequences in the sequence listing where the signal peptide is underlined.
  • the cap, transmembrane beta barrel and constriction regions in the fifth, sixth and seventh columns of Table 4 are preferred cap, transmembrane beta barrel and constriction regions.
  • the invention covers cap, transmembrane beta barrel and constriction regions which differ from those shown in the fifth, sixth and seventh columns by ± about 10 amino acids. For instance, in CsgG_Eco_WT:
  • - 1-37 includes 1-27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46 or 47,
  • - 64-134 includes 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73 or 74-124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143 or 144,
  • - 155-181 includes 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164 or 165-171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190 or 191,
  • - 210-262 includes 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219 or 220-252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271 or 272,
  • - 38-63 includes 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48-53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72 or 73,
  • - 135-154 includes 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144 or 145-144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163 or 164, and
  • - 182-209 includes 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191 or 192-199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218 or 219.
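The enumerated boundaries above follow a simple rule: each region boundary may shift by up to about 10 residues in either direction. The sketch below generates the same start and end positions for a given region. It is an illustration only: the function name `region_variants` is hypothetical, and holding a start fixed when it sits at the first residue is an assumption read from the 1-37 example, where only the end position varies.

```python
def region_variants(start, end, tolerance=10, first_residue=1):
    """Return (starts, ends): the boundary positions within +/- `tolerance`.

    A start at the first residue of the sequence is held fixed, mirroring
    the 1-37 example in the text. Not every start/end pairing is a valid
    region; callers should keep only pairs with start <= end.
    """
    if start == first_residue:
        starts = [start]
    else:
        lowest = max(first_residue, start - tolerance)
        starts = list(range(lowest, start + tolerance + 1))
    ends = list(range(end - tolerance, end + tolerance + 1))
    return starts, ends


# Region 64-134 from the CsgG_Eco_WT example above.
starts, ends = region_variants(64, 134)
```

For short regions such as 135-154 the start and end windows overlap, which is why the note about filtering out pairs where the start exceeds the end matters.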
  • the cap region may further comprise two subregions, namely a landing platform region and a carboxy-terminal (C-terminal) region.
  • the landing platform region comprises helix 2, which shapes the pore surface on the cis side and with which molecules engage the channel, such as N22 peptides in CsgA-like sequences, or the enzyme in the analyte characterisation applications discussed below.
  • This is a separate structural and functional unit that can be replaced with equivalent sequences from homologues.
  • the C-terminal tail can carry additional sequences in fusion to the channel. These can come from CsgG homologues.
  • constriction regions in Table 4 may be 46 amino acids in length (e.g., 28-73 in CsgG_Eco_WT).
  • the whole constriction region from one CsgG pore may be used, i.e., all 46 amino acids from one CsgG pore may be used. This typically involves complete replacement (i.e., absence) of the constriction region(s) from the other different CsgG pore or the other two different CsgG pores.
  • the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane beta barrel region in the chimeric pore monomer is/are from or derived from.
  • At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region of one CsgG pore may be introduced into the constriction region of a different CsgG pore.
  • At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region from a different CsgG pore.
  • Preferred constriction regions for each pore are shown in the sixth column of Table 4. Any of these constriction regions may be used to construct the chimeric pore monomers of the invention as discussed above.
  • the preferred constriction regions in Table 4 are all 26 amino acids in length.
  • the whole constriction region from one CsgG pore may be used, i.e., all 26 amino acids from one CsgG pore may be used. This typically involves complete replacement (i.e., absence) of the constriction region(s) from the other different CsgG pore or the other two different CsgG pores. This also applies to chimeric pore monomers constructed from five different CsgG pores.
  • part of the constriction region from one CsgG pore may be replaced with all or part of the constriction region from a different CsgG pore.
  • the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane beta barrel region in the chimeric pore monomer is/are from or derived from. At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region of one CsgG pore may be introduced into the constriction region of a different CsgG pore.
  • At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region from a different CsgG pore.
  • At least about 15 of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 14 of the amino acids in the constriction region from a different CsgG pore.
  • At least about 18 of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 20 of the amino acids in the constriction region from a different CsgG pore.
  • one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • One or more negatively charged amino acids in any of the pores in Table 4 may be removed, for instance by deletion or substitution.
  • One or more negatively charged amino acids in any of the pores in Table 4, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids and/or one or more uncharged amino acids. This removes negative charge from the sequence of the pore. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more.
  • the one or more negatively charged amino acids are preferably in the constriction region of the pore.
  • the preferred constriction region for each pore is defined in the sixth column of Table 4. This applies to any of the sequences in Table 4, including the wild-type sequences and the wild-type sequences without the signal peptide.
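  • The charge-removal step described above can be illustrated with a minimal sketch. It assumes a D to N and E to Q swap as the uncharged replacements and uses a made-up toy sequence and window; the helper name `neutralise_region` and the window coordinates are hypothetical and do not correspond to any sequence or constriction region in Table 4.

```python
# Hypothetical helper: neutralise acidic residues inside a chosen window.
# Assumption: D->N and E->Q are used as the uncharged replacements.
NEUTRAL_SWAP = {"D": "N", "E": "Q"}


def neutralise_region(seq, start, end, swap=NEUTRAL_SWAP):
    """Substitute D/E inside the 1-based inclusive window [start, end].

    Returns the modified sequence and the list of substitutions made,
    written in the usual <old><position><new> form (e.g. "D3N").
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("window lies outside the sequence")
    residues = list(seq)
    changes = []
    for pos in range(start, end + 1):
        old = residues[pos - 1]
        if old in swap:
            residues[pos - 1] = swap[old]
            changes.append(f"{old}{pos}{swap[old]}")
    return "".join(residues), changes


# Toy sequence, not a real pore sequence.
toy = "MKDEAGDLEK"
new_seq, changes = neutralise_region(toy, 3, 7)
print(new_seq, changes)
```

  • Residues outside the window (here E9) are left untouched, which mirrors the preference for removing negative charge within the constriction region only.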
  • SEQ ID NO: 56 is the wild-type CsgG pore from Escherichia coli Str. K-12 substr. MC4100.
  • the CsgG pore, one of the two different CsgG pores or one of the three different CsgG pores may comprise the sequence of SEQ ID NO: 56 or may comprise SEQ ID NO: 56 having any of the substitutions present in another CsgG homologue. This also applies to the five different CsgG pores.
  • Preferred CsgG homologues are shown in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) and in Table 4 above.
  • the CsgG pore, one of the two different CsgG pores or one of the three different CsgG pores may comprise combinations of one or more of the substitutions present in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) or in the other homologues in Table 4 compared with SEQ ID NO: 56, including one or more substitutions, one or more conservative mutations, one or more deletions or one or more insertion mutations, such as deletion or insertion of 1 to 10 amino acids, such as of 2 to 8 or 3 to 6 amino acids. This also applies to one of the five different CsgG pores.
  • the chimeric pore monomer of the invention typically retains the ability to form the same 3D structure as the wild-type CsgG pore monomer, such as the same 3D structure as a CsgG pore having the sequence of SEQ ID NO: 56.
  • the 3D structure of CsgG is known in the art and is disclosed, for example, in Goyal et al (2014) Nature 516(7530):250-3. Any number of modifications or mutations may be made in the wild-type CsgG sequence in addition to the modifications and mutations described herein provided that the chimeric pore monomer retains the improved properties of the invention.
  • a chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores will retain the ability to form a structure comprising three alpha-helices and five beta-sheets. This also applies to chimeric pore monomers constructed from five different CsgG pores.
  • One or more modifications may be made at least in the region which is N-terminal to the first alpha helix (which starts at S63 in SEQ ID NO: 56), in the second alpha helix (from G85 to A99 of SEQ ID NO: 56), in the loop between the second alpha helix and the first beta sheet (from Q100 to N120 of SEQ ID NO: 56), in the fourth and fifth beta sheets (S173 to R192 and R198 to T107 of SEQ ID NO: 56, respectively) and in the loop between the fourth and fifth beta sheets (F193 to Q197 of SEQ ID NO: 56) without affecting the ability of the chimeric pore monomer to form a transmembrane pore which is capable of translocating analytes.
  • any chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores without affecting the ability of the chimeric pore monomer to form a pore that can translocate analytes. This also applies to chimeric pore monomers constructed from five different CsgG pores.
  • deletions of one or more amino acids can be made in any of the loop regions linking the alpha helices and beta sheets and/or in the N-terminal and/or C-terminal regions without affecting the ability of the chimeric pore monomer to form a pore that can translocate analytes.
  • the chimeric pore monomer may contain the region(s) of SEQ ID NO: 56 that is/are responsible for pore formation.
  • the pore forming ability of CsgG, which contains a β-barrel, is provided by β-sheets in each subunit.
  • the chimeric pore monomer may comprise the regions in SEQ ID NO: 56 that form β-sheets, namely K134-Q154 and S183-S208.
  • One or more modifications can be made to the regions of SEQ ID NO: 56 that form β-sheets as long as the resulting variant retains its ability to form a pore.
  • the chimeric pore monomer preferably includes one or more modifications, such as substitutions, additions, or deletions, within the α-helices and/or loop regions of SEQ ID NO: 56.
  • the one or more modifications in the CsgG pore, two different CsgG pores or three different CsgG pores preferably improve the ability of a chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores to characterise an analyte.
  • modifications/mutations/substitutions are contemplated to alter the number, size, shape, placement, or orientation of the constriction within a channel of the chimeric pore monomer.
  • the CsgG pore, two different CsgG pores or three different CsgG pores may have any of the particular modifications or substitutions disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety). This also applies to chimeric pore monomers constructed from five different CsgG pores.
  • Preferred modifications or substitutions in SEQ ID NO: 56 include, but are not limited to, one or more of, such as 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more or all of:
  • a substitution at position Y51 such as Y51I, Y51L, Y51A, Y51V, Y51T, Y51S, Y51Q or Y51N;
  • a substitution at position N55, such as N55I, N55L, N55A, N55V, N55T, N55S or N55Q;
  • a substitution at position N91, such as N91D, N91E, N91R or N91K;
  • a substitution at position C215 such as C215T, C215S, C215I, C215L, C215A, C215V, or C215G.
  • Preferred modifications or substitutions in SEQ ID NO: 56 include, but are not limited to, one or more of, such as 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more or all of:
  • a substitution at position Y51 such as Y51I, Y51L, Y51A, Y51V, Y51T, Y51S, Y51Q or Y51N;
  • a substitution at position N55, such as N55I, N55L, N55A, N55V, N55T, N55S or N55Q;
  • a substitution at position N91, such as N91D, N91E, N91R or N91K;
  • a substitution at position K94 such as K94R, K94F, K94Y, K94Q, K94W, K94L, K94S or K94N;
  • a substitution at position R97, such as R97H, R97K, R97A, R97V, R97I, R97L, R97M, R97F, R97W, R97Y, R97S, R97T, R97Q, R97D, R97E, R97N, R97C, R97P or R97G;
  • a substitution at position E101, such as E101V, E101I, E101L, E101M, E101A, E101F, E101Y, E101W, E101S, E101T, E101N, E101Q, E101C, E101G or E101P;
  • a substitution at position N102, such as N102E, N102R, N102H, N102K, N102S, N102T, N102D, N102Q, N102V, N102I, N102L, N102M, N102F, N102Y, N102W or N102A;
  • a substitution at position T104, such as T104E, T104R, T104H, T104K, T104S, T104T, T104Q, T104V, T104D, T104I, T104L, T104M, T104F, T104Y, T104W or T104A;
  • the CsgG pore may comprise a deletion of one or more positions from SEQ ID NO: 56, such as a deletion of V105-I107, a deletion of F193-L199 or a deletion of F195-L199.
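  • The substitution notation used above (wild-type residue, position, new residue, e.g. Y51A) can be applied programmatically. The following is a minimal sketch under stated assumptions: positions are 1-based against whatever reference sequence is supplied, the function name `apply_substitutions` is hypothetical, and the toy sequence is made up rather than taken from SEQ ID NO: 56.

```python
import re

# Matches point substitutions such as "Y51A": old residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(seq, substitutions):
    """Apply point substitutions to seq, checking each wild-type residue first."""
    residues = list(seq)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        # Compare against the unmodified reference so a numbering offset is caught early.
        if seq[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


# Toy reference, not a real pore sequence: M1 A2 Y3 N4 K5.
variant = apply_substitutions("MAYNK", ["Y3A", "N4S"])
print(variant)
```

  • Checking the wild-type residue before substituting guards against applying a position from one numbering scheme (for example, with or without a signal peptide) to a sequence numbered differently.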
  • CsgG pores are highly conserved (as can be readily appreciated from Figures 45 to 47 of WO2017/149317). Furthermore, from knowledge of the modifications in relation to SEQ ID NO: 56, it is possible to determine the equivalent positions for modifications of other CsgG pores, especially CsgG pores having the sequences shown in SEQ ID NOs: 57-64 in Table 4.
  • Amino acid substitutions may be made to the amino acid sequences of any of SEQ ID NOs: 56-64 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions.
  • Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume.
  • the amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace.
  • the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid.
  • Conservative amino acid changes are well-known in the art.
  • the CsgG pore, two different CsgG pores or three different CsgG pores may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition. This also applies to chimeric pore monomers constructed from five different CsgG pores.
  • One or more amino acid residues of the amino acid sequences of any of SEQ ID NOs: 56-64 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
  • One or more amino acids may be alternatively or additionally added to the CsgG polypeptides described above.
  • An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequences of any of SEQ ID NOs: 56-64 or polypeptide variants or fragments thereof.
  • the extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids.
  • the at least two different pores preferably comprise at least two different CsgG pores or at least three different CsgG pores.
  • the two different pores are preferably two different CsgG pores.
  • the three different pores are preferably three different CsgG pores.
  • the at least two different pores preferably comprise at least five different CsgG pores or five different CsgG pores.
  • the at least two different CsgG pores, the two different CsgG pores or the three different CsgG pores may be selected from the pores in Table 4.
  • One of the at least two different pores is preferably CsgG_Eco_WT (row 1 in Table 4).
  • One of the two different pores is preferably CsgG_Eco_WT (row 1 in Table 4).
  • One of the three different pores is preferably CsgG_Eco_WT (row 1 in Table 4).
  • One of the five different pores is preferably CsgG_Eco_WT (row 1 in Table 4).
  • the at least two different pores preferably comprise or the two or three different pores preferably are selected from (a) CsgG_Eco_WT, (b) CsgG_Vdi_WT, (c) CsgG_Vmae_WT, (d) CsgG_Vsp_WT, (e) CsgG_Ler_WT, (f) CsgG_Vcr_WT, (g) CsgG_Psh_WT, (h) CsgG_Vhi_WT or (i) CsgG_Vma_WT. (a)-(i) are defined in Table 4.
  • the five different pores may be selected from (a)-(i).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (b) , (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), (a) and (i), (b) and (c), (b) and (d), (b) and (e), (b) and (f), (b) and (g), (b) and (h), (b) and (i), (c) and (d), (c) and (e), (c) and (f), (c) and (g), (c) and (c) and
  • the at least three different pores preferably comprise or the three different pores preferably are (a), (b) and (c); (a), (b) and (d); (a), (b) and (e); (a), (b) and (f); (a), (b) and (g); (a),
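  • The pairings and triplets listed above appear to follow the order of all unordered combinations of the labels (a)-(i). As an illustrative sketch only (the label list is taken from the text; nothing here is part of the disclosed method), the full set of such combinations can be generated as follows:

```python
from itertools import combinations

# Labels (a)-(i) as used in the text for the nine CsgG pores of Table 4.
labels = [f"({letter})" for letter in "abcdefghi"]

# Every unordered pair and triplet of distinct pores.
pairs = list(combinations(labels, 2))
triples = list(combinations(labels, 3))

print(len(pairs), len(triples))
print(" and ".join(pairs[0]))
```

  • Nine pores give 36 distinct pairs and 84 distinct triplets.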
  • the at least two different pores preferably comprise or the two or three different pores preferably are selected from (a) CsgG_Eco_WT, (b) CsgG_Vdi_WT, (c) CsgG_Vmae_WT, (d) CsgG_Vsp_WT, (e) CsgG_Ler_WT, (f) CsgG_Vcr_WT, (g) CsgG_Psh_WT, (h) CsgG_Vhi_WT,
  • the five different pores may be selected from (a)-(l).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (b), (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), (a) and (i), (a) and
  • the at least three different pores preferably comprise or the three different pores preferably are (a), (b) and (c); (a), (b) and (d); (a), (b) and (e); (a), (b) and (f); (a), (b) and (g); (a),
  • the constriction transplants in Example 2 are formed from two different CsgG pores.
  • the at least two different pores preferably comprise or the two different pores preferably are the combination of two different CsgG pores in any of those transplants.
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (b), (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), or (a) and (i).
  • the chimeric pore monomers preferably comprise a cap region and transmembrane beta barrel region from (a) and a constriction region from one of (b)-(i).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (c), (a) and (d), (a) and (h), (a) and (i), or (a) and (b).
  • the chimeric pore monomers preferably comprise a cap region and a transmembrane beta barrel region from (a) and a constriction region from (c), (d), (h), (i) or (b).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (f), or (a) and (g).
  • the chimeric pore monomers preferably comprise a cap region and a transmembrane beta barrel region from (a) and a constriction region from (f) or (g).
  • the at least two different pores preferably comprise or the two different pores preferably are (a) and (c), (a) and (h), or (a) and (i).
  • the chimeric pore monomers preferably comprise a cap region from (a) and a constriction region from (c), (h) or (i).
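  • A constriction transplant of the kind described above, keeping the cap and transmembrane beta barrel regions of one pore and swapping in the constriction region of another, can be sketched as a simple splice. This is an illustrative sketch only: the function name `transplant_constriction`, the region coordinates and the toy sequences are all hypothetical, and real boundaries would come from the sixth column of Table 4.

```python
def transplant_constriction(scaffold, donor, scaffold_span, donor_span):
    """Replace a region of scaffold with a region of donor.

    Both spans are (start, end) tuples, 1-based and inclusive. The donor
    region may be longer or shorter than the region it replaces.
    """
    s_start, s_end = scaffold_span
    d_start, d_end = donor_span
    if not 1 <= s_start <= s_end <= len(scaffold):
        raise ValueError("scaffold span lies outside the scaffold sequence")
    if not 1 <= d_start <= d_end <= len(donor):
        raise ValueError("donor span lies outside the donor sequence")
    upstream = scaffold[: s_start - 1]      # kept from the scaffold pore
    insert = donor[d_start - 1 : d_end]     # constriction from the donor pore
    downstream = scaffold[s_end:]           # kept from the scaffold pore
    return upstream + insert + downstream


# Toy sequences: a 4-residue scaffold region is replaced by a 6-residue donor region.
scaffold = "GGGGSSSSTTTT"
donor = "PPPPNNNNNNQQQQ"
chimera = transplant_constriction(scaffold, donor, (5, 8), (5, 10))
print(chimera)
```

  • Because the donor region here is longer than the region it replaces, the chimera is longer than the scaffold, consistent with the constriction region being longer, shorter or the same length as that of the scaffold pore.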
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72, to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78, to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • Any one of SEQ ID NOs: 65-72 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71 or 72.
  • Any one of SEQ ID NOs: 65-72 and 76-78 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71, 72, 76, 77 or 78.
  • Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
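  • As a rough illustration of measuring identity over the entire length, the sketch below computes percent identity for two sequences that have already been aligned to equal length with "-" marking gaps. It is not the method referred to in the description: the function name `percent_identity`, the choice of the alignment length as the denominator and the toy alignment are all assumptions, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as "-". A column counts as a match only when both
    sequences carry the same residue. The denominator is the full
    alignment length (an assumption; other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 5 matching columns out of 7.
identity = percent_identity("MKT-AGL", "MKSQAGL")
print(round(identity, 1))
```

  • A threshold such as "at least about 90% identity" would then be checked by comparing the returned value with 90.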
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide.
  • Any one of SEQ ID NOs: 66, 67, 71, 72 and 65 is of course equivalent to SEQ ID NO: 66, 67, 71, 72 or 65. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 67,
  • Any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 is of course equivalent to SEQ ID NO: 66, 67, 71, 72, 65, 76, 77 or 78.
  • Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or to the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or to the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 69 and 70 or the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide. Any one of SEQ ID NOs: 69 and 70 is of course equivalent to SEQ ID NO: 69 or 70. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide. Any one of SEQ ID NOs: 66, 71 and 72 is of course equivalent to SEQ ID NO: 66, 71 or 72. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide.
  • the chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide.
  • the chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide. Any one of SEQ ID NOs: 66, 71, 72 and 76 is of course equivalent to SEQ ID NO: 66, 71, 72 or 76. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the chimeric pore monomer comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 or any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • the chimeric pore monomer preferably does not comprise the entire sequence of any wild-type pore.
  • the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore.
  • the chimeric pore monomer preferably does not comprise the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions or three regions, are from or derived from. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions or three regions, are from or derived from.
  • the at least two different pores do not comprise alpha-hemolysin and gamma-hemolysin.
  • the at least two different pores preferably do not comprise two sodium channels.
  • the chimeric pore monomer preferably does not comprise the sequence shown in SEQ ID NO: 70.
  • the invention also provides a chimeric pore monomer, comprising a fusion protein comprising the following three structural regions: (1) a cap region, (2) a constriction region, and (5) a transmembrane region, wherein the three structural regions are derived from at least two different CsgG pores.
  • the invention also provides a chimeric pore monomer, comprising a fusion protein comprising the following five structural regions: (1) a cap region, (2) a landing platform region, (3) a C-terminal region, (4) a constriction region, and (5) a transmembrane region, wherein the five structural regions are derived from at least two different CsgG pores. Any of the embodiments discussed above equally apply to these chimeric pore monomers of the invention.
  • the three or five structural regions may be derived from two, three or five different CsgG pores.
  • any of the chimeric pore monomers of the invention may further comprise one or more CsgF peptides.
  • Such peptides and their association with pore monomers are described in WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
  • One or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. All the at least two regions or both of the two regions preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • the cap region (or scaffold) and/or the constriction region preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • One or more of, such as all of, the cap region, the constriction region, and the transmembrane region preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte.
  • the CsgG pore monomer of the invention preferably comprises one or more modifications which stabilise a CsgG pore and/or improve the ability of a CsgG pore formed from the CsgG pore monomer to characterise a target analyte.
  • the one or more modifications are preferably (a) one or more deletions, (b) one or more substitutions, (c) one or more additions or (d) any combination of (a) to (c), including (a) and (b), (a) and (c), (b) and (c) or (a), (b) and (c).
  • the one or more modifications are preferably one or more substitutions.
  • the one or more modifications are preferably one or more of the substitutions to PorARc discussed above, especially in relation to Table 2.
  • the one or more modifications are preferably one or more of the substitutions to CsgG discussed above, especially in relation to SEQ ID NO: 56.
  • the one or more substitutions discussed above preferably remove negative charge.
  • Such one or more substitutions improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise negatively charged target analytes, such as polynucleotides.
  • the one or more modifications may be any of those discussed below with reference to SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55.
  • the one or more modifications are preferably one or more of the modifications to SEQ ID NO: 56 discussed above.
  • Suitable one or more modifications are known in the art. For instance, modifications to CsgG which improve its ability to characterise target analytes are disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety). Suitable modifications are also described in United Kingdom Patent Application No. 2118939.4 filed on 23 December 2021 (incorporated by reference herein in its entirety).
  • the invention provides various PorARc pore monomers from different species which are collectively known as PorARc pore monomers of the invention.
  • SEQ ID NO: 2 is the wild-type PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N. This is pore 2 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
  • the invention provides a PorARc_Mph pore monomer which comprises or consists of a sequence having at least about 88% homology or identity to the sequence shown in SEQ ID NO: 2.
  • the PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology to the sequence shown in SEQ ID NO: 2.
  • the PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to the sequence shown in SEQ ID NO: 2.
  • the PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 2. Homology and/or identity is typically measured over the entire length of the pore monomer. Methods for determining homology and/or identity are described above.
  • One or more negatively charged amino acids in SEQ ID NO: 2, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids, such as R, H or K, and/or one or more uncharged amino acids, such as S, T, N or Q. This removes negative charge from the sequence. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more.
  • the one or more negatively charged amino acids are preferably in the constriction region of SEQ ID NO: 2. This is identified in Table 2.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 2 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 2.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 2 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 2.
  • SEQ ID NO: 50 shows the amino acid sequence of PorARc pore from Mycobacterium sp. (PorARc_Msp) with the substitutions D91N/D92N. This is pore 3 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
  • SEQ ID NO: 51 shows the amino acid sequence of PorARc pore from Mycolicibacterium rhodesiae (PorARc_Mrh) with the substitutions D91N/D92N.
  • SEQ ID NO: 52 shows the amino acid sequence of PorARc pore from Mycolicibacterium elephantis (PorARc_Mel) with the substitutions D91N/E101Q.
  • This is pore 17 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
  • SEQ ID NO: 53 shows the amino acid sequence of PorARc pore from Mycolicibacterium cosmeticum (PorARc_Mco) with the substitutions D91N/D92N (seventh column). This is pore 19 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
  • SEQ ID NO: 54 shows the amino acid sequence of PorARc pore from unclassified Rhodococcus (WP_056447532.1; PorARc_Rsp) with the substitutions E89Q/D91N/D93N/D100N.
  • SEQ ID NO: 55 shows the amino acid sequence of PorARc pore from Rhodococcus sp PSBB049 (WP_206003768.1; PorARc_Rsp) with the substitutions D90N/D95N/D103N.
  • This is pore 27 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
  • the invention provides a PorARc pore monomer which comprises or consists of a sequence having at least about 40% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • a pore monomer based on SEQ ID NO: 50 may be referred to as a PorARc_Msp pore monomer.
  • a pore monomer based on SEQ ID NO: 51 may be referred to as a PorARc_Mrh pore monomer.
  • a pore monomer based on SEQ ID NO: 52 may be referred to as a PorARc_Mel pore monomer.
  • a pore monomer based on SEQ ID NO: 53 may be referred to as a PorARc_Mco pore monomer.
  • a pore monomer based on SEQ ID NO: 54 may be referred to as a PorARc_Rsp pore monomer.
  • a pore monomer based on SEQ ID NO: 55 may be referred to as a PorARc_Rsp pore monomer.
  • the PorARc pore monomer of the invention preferably comprises or consists of a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • the PorARc pore monomer preferably comprises or consists of a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • the PorARc pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • the invention provides a PorARc pore monomer which comprises or consists of a sequence having at least about 20% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • a pore monomer based on SEQ ID NO: 50 may be referred to as a PorARc_Msp pore monomer.
  • a pore monomer based on SEQ ID NO: 51 may be referred to as a PorARc_Mrh pore monomer.
  • a pore monomer based on SEQ ID NO: 52 may be referred to as a PorARc_Mel pore monomer.
  • a pore monomer based on SEQ ID NO: 53 may be referred to as a PorARc_Mco pore monomer.
  • a pore monomer based on SEQ ID NO: 54 may be referred to as a PorARc_Rsp pore monomer.
  • a pore monomer based on SEQ ID NO: 55 may be referred to as a PorARc_Rsp pore monomer.
  • the PorARc pore monomer of the invention preferably comprises or consists of a sequence having at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • the PorARc pore monomer preferably comprises or consists of a sequence having at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
  • the PorARc pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. Homology and/or identity is typically measured over the entire length of the pore monomer. Methods for determining homology and/or identity are described above.
  • One or more negatively charged amino acids in SEQ ID NO: 50, 51, 52, 53, 54 or 55 are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids, such as R, H or K, and/or one or more uncharged amino acids, such as S, T, N or Q. This removes negative charge from the sequence. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more.
  • the one or more negatively charged amino acids are preferably in the constriction region of SEQ ID NO: 50, 51, 52, 53, 54 or 55. These are identified in Table 2.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 50 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 50.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 50 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 50.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 51 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 51.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 51 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 51.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 52 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 101 in SEQ ID NO: 52.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 52 preferably comprises Q at the position(s) corresponding to position 91 and/or position 101 in SEQ ID NO: 52.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 53 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 53.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 53 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 53.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 54 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, such as all of, the positions corresponding to positions 89, 91, 93 and 100 in SEQ ID NO: 54.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 54 preferably comprises N at the positions corresponding to positions 89, 91, 93 and 100 in SEQ ID NO: 54.
  • sequence having homology or identity to the sequence shown in SEQ ID NO: 55 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, such as all of, the positions corresponding to positions 90, 95 and 103 in SEQ ID NO: 55.
  • sequence that is homologous or identical to the sequence shown in SEQ ID NO: 55 preferably comprises N at the positions corresponding to positions 90, 95 and 103 in SEQ ID NO: 55.
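  • The substitution notation used above (for example D91N/D92N, meaning D at position 91 replaced by N and D at position 92 replaced by N) can be applied to a sequence programmatically. The following is a minimal sketch under stated assumptions: positions are 1-based with the N-terminal methionine at position 1, the helper names are hypothetical, and the test sequence in the usage note is a toy string rather than any SEQ ID NO of the patent.

```python
import re

# One substitution token: original residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, spec: str) -> str:
    """Apply substitutions written as 'D91N/D92N' to a one-letter sequence.

    Raises ValueError if a token is malformed, a position is out of range,
    or the residue found at a position differs from the one the token names.
    """
    residues = list(sequence)
    for token in spec.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"malformed substitution: {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"expected {old} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


def count_negative(sequence: str) -> int:
    """Count negatively charged residues (D and E) in the sequence."""
    return sum(sequence.count(residue) for residue in "DE")
```

For the toy sequence "MADDE", applying "D3N/D4N" replaces both aspartates with asparagine and lowers the count of negatively charged residues from three to one.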
  • the sequence of the PorARc pore monomer may comprise any of the substitutions present in other PorARc pores including any of the pores listed in Table 2.
  • the PorARc pore monomer typically retains the ability to form the same 3D structure as the wild-type PorARc pore monomer, such as the same 3D structure as a PorARc pore monomer having the sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55.
  • the PorARc pore monomer is capable of forming a pore. Methods for measuring this are discussed above with reference to the chimeric pore monomers of the invention.
  • Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55, for example up to 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more substitutions.
  • Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume.
  • the amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace.
  • the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid.
  • Conservative amino acid changes are well-known in the art.
  • the PorARc pore monomer may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition.
  • One or more amino acid residues of the amino acid sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
  • the PorARc pore monomer may comprise a fragment of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55. Such fragments retain pore forming activity. Fragments may be at least about 50, at least about 100, at least about 150, or at least about 200 amino acids in length. Such fragments may be used to produce the pores of the invention.
  • a fragment preferably comprises the transmembrane beta barrel region of the relevant sequence, namely residues W73 to T84 and G113 to N122 of SEQ ID NO: 2, residues W73 to T84 and G114 to N123 of SEQ ID NO: 50, residues W73 to T84 and G127 to N136 of SEQ ID NO: 51, residues W73 to T84 and G111 to N120 of SEQ ID NO: 52, residues W73 to T84 and G109 to N118 of SEQ ID NO: 53, residues A73 to S84 and Q104 to P113 of SEQ ID NO: 54, residues G74 to S85 and Q107 to P116 of SEQ ID NO: 55, or a variant thereof as discussed above.
  • One or more amino acids may be alternatively or additionally added to the polypeptides described above.
  • An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 or polypeptide variant or fragment thereof.
  • the extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids.
  • a carrier protein may be fused to an amino acid sequence according to the invention.
  • the sequence of the PorARc pore monomer has an amino acid sequence which varies from that of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 and which retains its ability to form a pore.
  • the sequence typically contains the regions of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 that are responsible for pore formation.
  • the pore forming ability of PorARc, which contains a β-barrel, is provided by β-strands in the transmembrane beta barrel region of each monomer.
  • a variant of SEQ ID NO: 2 typically comprises the region in the relevant sequence that forms β-strands, namely residues W73 to T84 and G113 to N122 of SEQ ID NO: 2, residues W73 to T84 and G114 to N123 of SEQ ID NO: 50, residues W73 to T84 and G127 to N136 of SEQ ID NO: 51, residues W73 to T84 and G111 to N120 of SEQ ID NO: 52, residues W73 to T84 and G109 to N118 of SEQ ID NO: 53, residues A73 to S84 and Q104 to P113 of SEQ ID NO: 54, residues G74 to S85 and Q107 to P116 of SEQ ID NO: 55, or a variant thereof as discussed above.
  • One or more modifications can be made to the region of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 that forms β-strands as long as the resulting variant retains its ability to form a pore.
  • the one or more modifications in the PorARc pore monomer preferably improve the ability of a pore comprising the pore monomer to characterise an analyte.
  • the invention also provides various CsgG pore monomers which are collectively known as CsgG pore monomers of the invention.
  • the invention also provides a CsgG pore monomer comprising a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide.
  • the CsgG pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide.
  • the CsgG pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide.
  • Any one of SEQ ID NOs: 65-72 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71 or 72. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the invention also provides a CsgG pore monomer comprising a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • the CsgG pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • the CsgG pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
  • Any one of SEQ ID NOs: 65-72 and 76-78 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71, 72, 76, 77 or 78.
  • Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
  • the CsgG pore monomer preferably does not comprise the entire sequence of any wild-type pore.
  • the CsgG pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore.
  • the CsgG pore monomer preferably does not comprise the entire sequence of any of the CsgG pores identified in Table 4, including SEQ ID NO: 58 or 73.
  • the invention also provides a CsgG pore monomer comprising or consisting of a sequence having at least about 68% homology or identity to the sequence shown in SEQ ID NO: 58.
  • the CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 69%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology or identity to the sequence shown in SEQ ID NO: 58.
  • the CsgG pore monomer may comprise or consist of SEQ ID NO: 58. Homology and/or identity is typically measured over the entire length of SEQ ID NO: 58.
  • the invention also provides a CsgG pore monomer comprising or consisting of a sequence having at least about 79% homology or identity to the sequence shown in SEQ ID NO: 73.
  • the CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology or identity to the sequence shown in SEQ ID NO: 73.
  • the CsgG pore monomer may comprise or consist of SEQ ID NO: 73. Homology and/or identity is typically measured over the entire length of SEQ ID NO: 73.
  • the CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology to the sequence shown in SEQ ID NO: 58 or 73.
  • the CsgG pore monomer of the invention may comprise one or more of the particular modifications or substitutions disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
  • the CsgG pore monomer may comprise one or more of the modifications or substitutions discussed above with reference to SEQ ID NO: 56.
  • the one or more modifications in the CsgG pore monomer preferably improve the ability of a pore comprising the pore monomer to characterise an analyte.
  • the CsgG pore monomer typically retains the ability to form the same 3D structure as the wild-type CsgG pore monomer, such as the same 3D structure as a CsgG pore monomer having the sequence of SEQ ID NO: 58 or 73.
  • the CsgG pore monomer is capable of forming a pore. Methods for measuring this are discussed above with reference to the chimeric pore monomers of the invention.
  • Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 58 or 73, for example up to 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more substitutions.
  • Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume.
  • the amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace.
  • the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid.
  • Conservative amino acid changes are well-known in the art.
  • the CsgG pore monomer may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition.
  • One or more amino acid residues of the amino acid sequence of SEQ ID NO: 58 or 73 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
  • the CsgG pore monomer may comprise a fragment of SEQ ID NO: 58 or 73. Such fragments retain pore forming activity. Fragments may be at least about 50, at least about 100, at least about 150, or at least about 200 amino acids in length. Such fragments may be used to produce the pores of the invention.
  • a fragment preferably comprises the transmembrane beta barrel region of the relevant sequence (as identified above).
  • One or more amino acids may be alternatively or additionally added to the polypeptides described above.
  • An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequence of SEQ ID NO: 58 or 73 or polypeptide variant or fragment thereof.
  • the extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids.
  • a carrier protein may be fused to an amino acid sequence according to the invention.
  • the invention also provides a construct comprising two or more covalently attached chimeric pore monomers of the invention or two or more covalently attached PorARc pore monomers of the invention.
  • the chimeric pore monomers of the invention and the PorARc pore monomers of the invention are collectively referred to as "pore monomers of the invention".
  • the chimeric pore monomers of the invention, the PorARc pore monomers of the invention and the CsgG pore monomers of the invention are collectively referred to as "pore monomers of the invention".
  • the chimeric construct of the invention and the PorARc construct of the invention are collectively referred to as "constructs of the invention".
  • the invention also provides a construct comprising two or more covalently attached CsgG pore monomers of the invention.
  • the chimeric construct of the invention, the PorARc construct of the invention and the CsgG construct of the invention are collectively referred to as "constructs of the invention".
  • the construct of the invention may comprise 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more or 10 or more pore monomers of the invention.
  • the construct may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9 or at least 10 pore monomers of the invention.
  • the two or more pore monomers of the invention may be the same or different.
  • the two or more pore monomers of the invention may differ based on their sequences.
  • the two or more pore monomers of the invention are preferably the same (i.e., identical).
  • the construct of the invention preferably comprises two pore monomers of the invention.
  • the two pore monomers of the invention may be the same or different.
  • the two pore monomers of the invention are preferably the same (i.e., identical).
  • the pore monomers of the invention may be genetically fused.
  • the pore monomers of the invention may be attached via a linker, or chemically fused, for instance via a chemical crosslinker.
  • Methods for covalently attaching pore monomers of the invention are disclosed in WO 2017/149316, WO 2017/149317, and WO 2017/149318 (incorporated herein by reference in their entirety).
  • the pore monomers of the invention may be genetically fused using a linker.
  • the linker is preferably an amino acid sequence and/or a chemical crosslinker.
  • Suitable amino acid linkers, such as peptide linkers, are known in the art.
  • the length, flexibility and hydrophilicity of the amino acid or peptide linker are typically designed to facilitate the formation of pores from the constructs.
  • Preferred flexible peptide linkers are stretches of 2 to 20, such as 4, 6, 8, 10 or 16, serine and/or glycine amino acids. More preferred flexible linkers include (SG)₁, (SG)₂, (SG)₃, (SG)₄, (SG)₅, (SG)₈, (SG)₁₀, (SG)₁₅ or (SG)₂₀ wherein S is serine and G is glycine.
  • Preferred rigid linkers are stretches of 2 to 30, such as 4, 6, 8, 16 or 24, proline amino acids. More preferred rigid linkers include (P)₁₂ wherein P is proline.
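  • The repeat notation for the flexible serine-glycine linkers and the rigid proline linkers can be expanded into one-letter sequences as sketched below. The function names are hypothetical, and the simple end-to-end concatenation in `fuse` is only an assumption about how a genetically fused construct of two monomers joined by a peptide linker might be represented as a string.

```python
def flexible_linker(repeats: int) -> str:
    """Expand (SG)n into its one-letter sequence, e.g. n=3 gives 'SGSGSG'."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "SG" * repeats


def rigid_linker(length: int) -> str:
    """Expand (P)n into a run of n proline residues."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "P" * length


def fuse(monomer_a: str, linker: str, monomer_b: str) -> str:
    """Join two monomer sequences through a peptide linker.

    Assumption for illustration: the fusion is a plain N-to-C concatenation.
    """
    return monomer_a + linker + monomer_b
```

For example, `fuse("MAAA", flexible_linker(2), "MCCC")` yields "MAAASGSGMCCC" for two toy monomer strings.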
  • Suitable chemical crosslinkers are well-known in the art. Suitable chemical crosslinkers include, but are not limited to, those including the following functional groups: maleimide, active esters, succinimide, azide, alkyne (such as dibenzocyclooctynol (DIBO or DBCO), difluoro cycloalkynes and linear alkynes), phosphine (such as those used in traceless and non-traceless Staudinger ligations), haloacetyl (such as iodoacetamide), phosgene type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazines, disulfides, vinyl sulfones, aziridines and photoreactive reagents (such as aryl azides, diaziridines).
  • Linkers can comprise any molecule that stretches across the distance required. Linkers can vary in length from one carbon (phosgene-type linkers) to many Angstroms. Examples of linker molecules include, but are not limited to, polyethyleneglycols (PEGs), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons, and polyamides.
  • linkers may be inert or reactive, in particular they may be chemically cleavable at a defined position, or may be themselves modified with a fluorophore or ligand.
  • the linker is preferably resistant to reducing agents, such as dithiothreitol (DTT), following the covalent attachment of the monomers.
  • Preferred crosslinkers include 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate and 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octanoate, di-maleimide PEG 1k, di-maleimide PEG 3.4k, di-maleimide PEG 5k, di-maleimide PEG 10k, bis(maleimido)ethane (BMOE), bis-maleimidohexane (BMH), 1,4-bis-maleimidobutane (BMB), 1,4-bis-maleimidyl-2,3-dihydroxybutane (BMDB), BM[PEO]2 (1,8-bis-maleimidodiethylenegly
  • the linker is preferably resistant to dithiothreitol (DTT).
  • Suitable linkers include, but are not limited to, iodoacetamide-based and maleimide-based linkers.
  • the pore monomers of the invention may be connected using two or more linkers each comprising a hybridizable region and a group capable of forming a covalent bond.
  • the hybridizable regions in the linkers hybridize and link the pore monomers of the invention.
  • the linked pore monomers of the invention are then coupled via the formation of covalent bonds between the groups.
  • Any of the specific linkers disclosed in WO 2010/086602 (incorporated herein by reference in its entirety) may be used in accordance with the invention.
  • the linkers may be labelled. Suitable labels include, but are not limited to, fluorescent molecules (such as Cy3 or AlexaFluor®555), radioisotopes (e.g., ¹²⁵I, ³⁵S, ³²P), enzymes, antibodies, antigens, polynucleotides, and ligands such as biotin. Such labels allow the amount of linker to be quantified.
  • the label could also be a cleavable purification tag, such as biotin, or a specific sequence to show up in an identification method, such as a peptide that is not present in the protein itself, but that is released by trypsin digestion.
  • a preferred method of connecting the pore monomers of the invention is via cysteine linkage. This can be mediated by a bi-functional chemical crosslinker or by an amino acid linker with a terminal presented cysteine residue.
  • Another preferred method of attachment via 4-azidophenylalanine or Faz linkage can be mediated by a bi-functional chemical linker or by a polypeptide linker with a terminal presented 4-azidophenylalanine or Faz residue. Additional suitable linkers are discussed in more detail below.
  • the invention provides a chimeric pore comprising at least one chimeric pore monomer of the invention or at least one construct of the invention comprising two or more covalently linked chimeric pore monomers of the invention.
  • the invention also provides a CsgG pore comprising at least one CsgG pore monomer of the invention or at least one construct comprising two or more covalently linked CsgG pore monomers of the invention.
  • the invention also provides a PorARc pore comprising at least one PorARc pore monomer of the invention or at least one construct comprising two or more covalently linked PorARc pore monomers of the invention.
  • the chimeric pore of the invention and the PorARc pore of the invention are collectively referred to as "pores of the invention".
  • the chimeric pore of the invention, the PorARc pore of the invention and the CsgG pore of the invention are collectively referred to as "pores of the invention".
  • pore refers to an oligomeric pore comprising at least one pore monomer of the invention (including, e.g., one or more pore monomers of the invention such as two or more pore monomers of the invention, three or more pore monomers of the invention etc.).
  • the pore of the invention has the features of a biological pore, i.e., it has a typical protein structure and defines a channel. When the pore is provided in an environment having membrane components, membranes, cells, or an insulating layer, the pore will insert in the membrane or the insulating layer and form a "transmembrane pore".
  • the chimeric pores of the invention typically display improved target analyte characterisation compared with the at least two different pores or the two different pores from which they are derived.
  • the PorARc pores of the invention typically display improved target analyte characterisation compared with other pores used in nanopore sensing.
  • the pores of the invention display one or more of (a) an increased signal-to-noise ratio (SNR), (b) an increased current range, (c) decreased noise and (d) an increased normalised median absolute deviation (nMAD).
  • the median absolute deviation (MAD) is the median of the absolute differences between the current at a given event and the overall mean current of the squiggle (i.e., the current trace during analyte translocation).
  • Normalised MAD is this value normalised by the overall range of the squiggle current.
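  • As an illustration only, the MAD and nMAD definitions given here can be sketched in Python. The sketch assumes that the squiggle has already been segmented into per-event current levels; the function name `normalised_mad` and the example values are hypothetical and are not taken from the Examples.

```python
from statistics import mean, median


def normalised_mad(event_currents):
    """Return (MAD, nMAD) for a sequence of per-event current levels."""
    if len(event_currents) < 2:
        raise ValueError("need at least two events")
    overall_mean = mean(event_currents)
    # MAD: median of the absolute differences between each event current
    # and the overall mean current of the squiggle.
    mad = median(abs(c - overall_mean) for c in event_currents)
    # nMAD: MAD normalised by the overall range of the squiggle current.
    current_range = max(event_currents) - min(event_currents)
    if current_range == 0:
        raise ValueError("squiggle has zero current range")
    return mad, mad / current_range


# Hypothetical event currents (arbitrary units), for illustration only.
squiggle = [60.0, 70.0, 80.0, 90.0, 100.0]
mad, nmad = normalised_mad(squiggle)
```

  • Because nMAD is a ratio of two currents, it is dimensionless and can be compared between pores whose absolute current ranges differ.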
  • the pore of the invention may display (a); (b); (c); (d); (a) and (b); (a) and (c); (a) and (d); (b) and (c); (b) and (d); (c) and (d); (a), (b) and (c); (a), (b) and (d); (a), (c) and (d); (b), (c) and (d); or (a), (b), (c) and (d).
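  • The combinations listed above are every non-empty subset of properties (a) to (d). A minimal Python sketch, using only the standard library, confirms that there are fifteen such combinations; the variable names are illustrative.

```python
from itertools import combinations

properties = ["a", "b", "c", "d"]

# Every non-empty subset of the four properties, smallest subsets first.
subsets = [
    combo
    for size in range(1, len(properties) + 1)
    for combo in combinations(properties, size)
]
```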
  • one or more of (a) to (d) are preferably compared with the at least two different pores or the two different pores from which the chimeric pores of the invention are constructed.
  • one or more of (a) to (d) are preferably compared with other pores used in nanopore sensing, including any of the pores described herein.
  • the pore of the invention typically comprises at least one constriction. Constrictions and their functions are defined above.
  • the pore of the invention may comprise two or more constrictions, such as three or more, four or more, five or more or six or more constrictions.
  • the additional constriction(s) expand(s) the contact surface with passing analytes and can improve analyte detection and characterization.
  • Pores comprising the pore monomers of the invention can improve the characterisation of analytes, such as polynucleotides, by providing a more discriminating direct relationship between the observed current and the polynucleotide as it moves through the pore.
  • the pore may facilitate characterization of polynucleotides that contain at least one homopolymeric stretch, e.g., several consecutive copies of the same nucleotide that otherwise exceed the interaction length of the single constriction.
  • small molecule analytes including organic or inorganic drugs and pollutants passing through the pore will consecutively pass the two constrictions.
  • the chemical nature of either constriction can be independently modified, each giving unique interaction properties with the analyte, thus providing additional discriminating power during analyte detection.
  • the pore of the invention is preferably a homooligomer comprising 6 to 20, such as 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20, pore monomers of the invention.
  • the pore of the invention is preferably a homooligomer comprising 6 to 10, such as 6, 7, 8, 9 or 10, pore monomers of the invention.
  • the pore monomers of the invention are typically identical.
  • the pore preferably comprises 8 or 9 identical pore monomers of the invention.
  • the pore monomers of the invention may be any of those discussed above.
  • the invention provides pores comprising at least one construct of the invention.
  • the pore typically comprises at least 1, 2, 3, 4 or 5 constructs of the invention.
  • the pore comprises sufficient monomers to form a pore.
  • an octameric pore may comprise (a) four constructs each comprising two pore monomers of the invention, (b) two constructs each comprising four pore monomers of the invention, (c) one construct comprising two pore monomers of the invention and six pore monomers of the invention that do not form part of a construct, (d) three constructs each comprising two pore monomers of the invention and two pore monomers of the invention that do not form part of a construct, or (e) combinations thereof. The same and additional possibilities apply to, for instance, a nonameric pore.
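  • The ways of assembling an oligomeric pore from constructs and free monomers can be enumerated mechanically. The following Python sketch is illustrative only: it assumes constructs of two or four monomers (the sizes named in the octameric example), and the function name `pore_compositions` and the key `"free"` are hypothetical.

```python
def pore_compositions(n_subunits, construct_sizes=(2, 4)):
    """List every way to reach n_subunits from constructs and free monomers.

    Each result maps a construct size to the number of such constructs,
    with the key "free" holding the monomers outside any construct.
    """
    sizes = sorted(construct_sizes)
    results = []

    def fill(index, remaining, counts):
        if index == len(sizes):
            # Whatever is left is supplied as monomers that do not
            # form part of a construct.
            results.append({**counts, "free": remaining})
            return
        size = sizes[index]
        for n in range(remaining // size + 1):
            fill(index + 1, remaining - n * size, {**counts, size: n})

    fill(0, n_subunits, {})
    return results


octamer = pore_compositions(8)
```

  • Calling `pore_compositions(9)` gives the corresponding possibilities for a nonameric pore. The enumeration includes the case with no constructs at all, which corresponds to a pore formed only from unlinked monomers.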
  • One or more constructs of the invention may be used to form a pore for characterising, such as sequencing, polynucleotides.
  • the pore preferably comprises 4 constructs of the invention each of which comprises two chimeric pore monomers or pore monomers.
  • the constructs are typically the same (i.e., identical).
  • the pore of the invention is preferably a homooligomer comprising 1-5, such as 1, 2, 3, 4 or 5, constructs of the invention.
  • the constructs are typically the same (i.e., identical).
  • the pore preferably comprises 4 identical constructs of the invention each of which comprises two pore monomers of the invention.
  • the constructs may be any of those discussed above.
  • the pore monomers of the invention in the pore are preferably all approximately the same length or are the same length.
  • the barrels of the pore monomers of the invention in the pore are preferably approximately the same length or are the same length. Length may be measured in number of amino acids and/or units of length.
  • any of the pores of the invention may further comprise one or more CsgF peptides.
  • Such peptides and their association with pores are described in WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
  • the pore of the invention may be isolated, substantially isolated, purified or substantially purified.
  • a pore of the invention is isolated or purified if it is completely free of any other components, such as lipids or other pores.
  • a pore is substantially isolated if it is mixed with carriers or diluents which will not interfere with its intended use.
  • a pore is substantially isolated or substantially purified if it is present in a form that comprises less than 10%, less than 5%, less than 2% or less than 1% of other components, such as block copolymers, lipids, or other pores.
  • a pore of the invention may be present in a membrane. Suitable membranes are discussed below.
  • a pore of the invention may be present as an individual or single pore.
  • a pore of the invention may be present in a homologous or heterologous population of two or more pores.
  • Other formats involving the pores of the invention are discussed in more detail below.
  • the invention also provides a chimeric pore multimer comprising two or more pores, wherein at least one of the pores is a chimeric pore of the invention.
  • the invention also provides a PorARc pore multimer comprising two or more pores, wherein at least one of the pores is a PorARc pore of the invention.
  • the chimeric pore multimer of the invention and the PorARc pore multimer of the invention are collectively referred to as "pore multimers of the invention".
  • the invention also provides a CsgG pore multimer comprising two or more pores, wherein at least one of the pores is a CsgG pore of the invention.
  • the chimeric pore multimer of the invention, the PorARc pore multimer of the invention and the CsgG pore multimer of the invention are collectively referred to as "pore multimers of the invention".
  • the multimer of the invention may comprise any number of pores, such as 2, 3, 4, 5, 6, 7 or 8 or more pores. Any number of the pores in the multimer, including all of them, may be a pore of the invention.
  • the pore multimer of the invention may be a double pore comprising a first pore of the invention and a second pore.
  • the double pore may be orientated in any way.
  • the double pore may be two pores end to end.
  • the double pore may be two pores adjacent to each other (i.e., side to side).
  • the second pore may be a pore of the invention.
  • Both the first pore and the second pore are preferably pores of the invention.
  • the first pore may be attached to the second pore by hydrophobic interactions and/or by one or more disulfide bonds.
  • One or more, such as 2, 3, 4, 5, 6, 8, 9, for example all, of the monomers in the first pore and/or the second pore (complex) may be modified to enhance such interactions. This may be achieved in any suitable way. Particular methods of forming double pores are described in WO 2019/002893 (incorporated by reference herein in its entirety).
  • the pore multimer of the invention may be isolated, substantially isolated, purified or substantially purified. Such terms are defined above with reference to the pores of the invention.
  • the invention also provides a pore of the invention or a pore multimer of the invention which is comprised in a membrane.
  • the invention also provides a membrane comprising a pore of the invention or a pore multimer of the invention.
  • proteins may be modified to assist their identification or purification, for example by the addition of a streptavidin tag or by the addition of a signal sequence to promote their secretion from a cell where the monomer does not naturally contain such a sequence.
  • the proteins may also be produced using D-amino acids or a mixture of L-amino acids and D-amino acids. This is conventional in the art for producing such proteins or peptides.
  • the chimeric pore monomer, the PorARc pore monomer, the chimeric construct, the PorARc construct, the chimeric pore, the PorARc pore, the chimeric pore multimer or the PorARc pore multimer may be chemically modified.
  • these proteins of the invention are collectively referred to as "the protein".
  • the protein can be chemically modified in any way and at any site.
  • the protein may be chemically modified by attachment of a molecule to one or more cysteines (cysteine linkage), attachment of a molecule to one or more lysines, attachment of a molecule to one or more non-natural amino acids, enzyme modification of an epitope or modification of a terminus. Suitable methods for carrying out such modifications are well-known in the art.
  • the protein may be chemically modified by the attachment of any molecule, such as a dye or a fluorophore.
  • the protein may be chemically modified with a molecular adaptor that facilitates the interaction between a pore comprising the monomer and a target nucleotide or target polynucleotide sequence.
  • Suitable adaptors, including a cyclic molecule, a cyclodextrin, a species that is capable of hybridization, a DNA binder or intercalator, a peptide or peptide analogue, a synthetic polymer, an aromatic planar molecule, a small positively charged molecule or a small molecule capable of hydrogen-bonding, are described in WO 2019/002893 (incorporated by reference herein in its entirety).
  • the molecular adaptor may be attached using any of the methods and linkers discussed above.
  • the protein may be attached to a polynucleotide binding protein.
  • Polynucleotide binding proteins are discussed below.
  • the protein can be covalently attached to the monomer using any method known in the art.
  • the monomer and protein may be chemically fused or genetically fused. Genetic fusion of a monomer to a polynucleotide binding protein is discussed in WO 2010/004265 (incorporated herein by reference in its entirety).
  • the polynucleotide binding protein may be attached via cysteine linkage using any method described above.
  • the polynucleotide binding protein may be attached directly to the protein via one or more linkers.
  • the molecule may be attached to the pore monomer using the hybridization linkers described in WO 2010/086602 (incorporated herein by reference in its entirety).
  • peptide linkers may be used. Suitable peptide linkers are discussed above.
  • any of the proteins may be modified to assist their identification or purification, for example by the addition of histidine residues (a his tag), aspartic acid residues (an asp tag), a streptavidin tag, a flag tag, a SUMO tag, a GST tag or a MBP tag, or by the addition of a signal sequence to promote their secretion from a cell where the polypeptide does not naturally contain such a sequence.
  • An alternative to introducing a genetic tag is to chemically react a tag onto a native or engineered position on the protein. An example of this would be to react a gel-shift reagent to a cysteine engineered on the outside of the protein. This has been demonstrated as a method for separating hemolysin heterooligomers (Chem Biol. 1997 Jul;4(7):497-505).
  • any of the proteins may be labelled with a revealing label.
  • the revealing label may be any suitable label which allows the protein to be detected. Suitable labels include, but are not limited to, fluorescent molecules, radioisotopes, e.g., ¹²⁵I, ³⁵S, enzymes, antibodies, antigens, polynucleotides, and ligands such as biotin.
  • the protein may also contain other non-specific modifications as long as they do not interfere with the function of the protein.
  • a number of non-specific side chain modifications are known in the art and may be made to the side chains of the protein(s). Such modifications include, for example, reductive alkylation of amino acids by reaction with an aldehyde followed by reduction with NaBH4, amidation with methylacetimidate or acylation with acetic anhydride.
  • Polynucleotide sequences encoding a protein may be derived and replicated using standard methods in the art. Polynucleotide sequences encoding a protein may be expressed in a bacterial host cell using standard techniques in the art. The protein may be produced in a cell by in situ expression of the polypeptide from a recombinant expression vector. The expression vector optionally carries an inducible promoter to control the expression of the polypeptide. These methods are described in Sambrook, J. and Russell, D. (2001). Molecular Cloning: A Laboratory Manual, 3rd Edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
  • Proteins may be produced in large scale following purification by any protein liquid chromatography system from protein producing organisms or after recombinant expression.
  • Typical protein liquid chromatography systems include FPLC, AKTA systems, the Bio-Cad system, the Bio-Rad BioLogic system, and the Gilson HPLC system.
  • the invention provides methods for producing a chimeric pore monomer of the invention.
  • the method comprises attaching, preferably covalently attaching, the at least two regions from or derived from at least two different pores.
  • the at least two regions may be attached or covalently attached using one or more linkers as described above.
  • the invention provides a method for producing a chimeric pore monomer of the invention comprising (a) designing a polynucleotide which encodes the chimeric pore monomer of the invention as a genetic fusion and (b) expressing the chimeric pore monomer from the polynucleotide.
  • Methods for designing polynucleotide sequences, constructing polynucleotides, and expressing them are well known in the art.
  • the invention also provides a method for producing a pore of the invention or a pore multimer of the invention.
  • the pore of the invention may be a chimeric pore of the invention or a PorARc pore of the invention.
  • the pore multimer of the invention may be a chimeric pore multimer of the invention or a PorARc pore multimer of the invention.
  • the method may involve expressing the pore in a host cell.
  • the method may comprise expressing at least one pore monomer of the invention or at least one construct of the invention and sufficient pore monomers or constructs to form the pore or the pore multimer in a host cell and allowing the pore or pore multimer to form in the host cell.
  • the sufficient pore monomers or constructs are preferably sufficient pore monomers of the invention or sufficient constructs of the invention.
  • the pore monomer(s) of the invention may be chimeric pore monomer(s) of the invention or PorARc pore monomer(s) of the invention.
  • the construct(s) of the invention may be chimeric construct(s) of the invention or PorARc constructs of the invention.
  • the numbers of pore monomers or constructs needed to form the pores of the invention or pore multimers of the invention are discussed above. Suitable host cells and expression systems are known in the art and are discussed in the Examples.
  • the method may involve forming the pore in a non-cellular or in vitro context.
  • the method may comprise contacting at least one pore monomer of the invention or at least one construct of the invention with sufficient pore monomers or constructs in vitro and allowing the formation of the pore or pore multimer.
  • the pore monomer(s) or the construct(s) may be produced separately by in vitro translation and transcription (IVTT) and then incubated with the sufficient pore monomers or constructs.
  • the sufficient pore monomers or constructs are preferably sufficient pore monomers of the invention or sufficient constructs of the invention.
  • the pore monomer(s) of the invention may be chimeric pore monomer(s) of the invention or PorARc pore monomer(s) of the invention.
  • the construct(s) of the invention may be chimeric construct(s) of the invention or PorARc construct(s) of the invention.
  • the numbers of pore monomers or constructs needed to form the pores of the invention or pore multimers of the invention are discussed above.
  • the method may be conducted in an "in vitro system", which refers to a system comprising at least the necessary components and environment to execute said method, and makes use of biological molecules, organisms, a cell (or part of a cell) outside of their normal naturally occurring environment, permitting a more detailed, more convenient, or more efficient analysis than can be done with whole organisms.
  • An in vitro system may also comprise a suitable buffer composition provided in a test tube, wherein said protein components to form the complex have been added. A person skilled in the art is aware of the options to provide said system.
  • Some or all of the components of the pore or pore multimer may be tagged to facilitate purification. Purification can also be performed when the components are untagged. Methods known in the art (e.g., ion exchange, gel filtration, hydrophobic interaction column chromatography etc.) can be used alone or in different combinations to purify the components of the pore.
  • the pore or pore multimer can be made prior to insertion into a membrane or after insertion of the components into a membrane.
  • the invention provides a method of determining the presence, absence or one or more characteristics of a target analyte.
  • the method involves contacting the target analyte with a pore or pore multimer such that the target analyte moves with respect to, such as into or through, the pore or pore multimer and taking one or more measurements as the target analyte moves with respect to the pore or pore multimer and thereby determining the presence, absence or one or more characteristics of the target analyte.
  • the target analyte may also be called the template analyte or the analyte of interest.
  • the pore may be a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores. All of the discussion above in relation to chimeric pores, two or more regions and at least two of the regions being from or derived from at least two different pores equally apply to the method of the invention. For the avoidance of doubt, in the context of the method of the invention, the at least two different pores can comprise alpha-hemolysin and gamma-hemolysin.
  • the chimeric pore is preferably a chimeric pore of the invention.
  • the pore may be a PorARc pore of the invention.
  • the pore may be a CsgG pore of the invention.
  • the pore multimer may comprise two or more pores wherein at least one pore is a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores. All of the discussion above in relation to chimeric pores, two or more regions and at least two of the regions being from or derived from at least two different pores equally applies to the method of the invention.
  • the pore multimer is preferably a chimeric pore multimer of the invention.
  • the pore multimer may be a PorARc pore multimer of the invention.
  • the pore multimer may be a CsgG pore multimer of the invention.
  • the method is for determining the presence, absence or one or more characteristics of a target analyte.
  • the method may be for determining the presence, absence or one or more characteristics of at least one analyte.
  • the method may concern determining the presence, absence or one or more characteristics of two or more analytes.
  • the method may comprise determining the presence, absence or one or more characteristics of any number of analytes, such as 2, 5, 10, 15, 20, 30, 40, 50, 100 or more analytes. Any number of characteristics of the one or more analytes may be determined, such as 1, 2, 3, 4, 5, 10 or more characteristics.
  • the degree of reduction in ion flow is related to the size of the obstruction within, or in the vicinity of, the pore. Binding of a molecule of interest, also referred to as an "analyte", in or near the pore therefore provides a detectable and measurable event, thereby forming the basis of a "biological sensor".
  • Suitable molecules for nanopore sensing include polynucleotides/nucleic acids, proteins, peptides, polynucleotide-polypeptide conjugates, polysaccharides, and small molecules (here meaning low molecular weight (e.g., < 900 Da or < 500 Da) organic or inorganic compounds) such as pharmaceuticals, toxins, cytokines, and pollutants. Detecting the presence of biological molecules finds application in personalised drug development, medicine, diagnostics, life science research, environmental monitoring and in the security and/or the defence industry.
  • the pore or pore multimer may serve as a molecular or biological sensor.
  • the target analyte molecule that is to be detected may bind to either face of the channel, or within the lumen of the channel itself. The position of binding may be determined by the size of the molecule to be sensed.
  • the target analyte preferably comprises or is a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a polypeptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a polynucleotide-polypeptide conjugate, a monosaccharide, an oligosaccharide, a polysaccharide, a dye, a bleach, a pharmaceutical, a diagnostic agent, a recreational drug, an explosive, a toxic compound, or an environmental pollutant.
  • the target analyte preferably comprises or is a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a polypeptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a polynucleotide-polypeptide conjugate, a monosaccharide, an oligosaccharide, a polysaccharide, a dye, a bleach, a pharmaceutical, a diagnostic agent, a recreational drug, an explosive, a toxic compound, an environmental pollutant, or a metabolite.
  • the target analyte may comprise two or more different molecules, such as a peptide and a polypeptide.
  • the target analyte may be a polynucleotide-polypeptide conjugate.
  • the method may concern determining the presence, absence or one or more characteristics of two or more analytes of the same type, such as two or more proteins, two or more nucleotides or two or more pharmaceuticals.
  • the method may concern determining the presence, absence or one or more characteristics of two or more analytes of different types, such as one or more proteins, one or more nucleotides and one or more pharmaceuticals.
  • the target analyte can be secreted from cells.
  • the target analyte can be an analyte that is present inside cells such that the target analyte must be extracted from the cells before the method can be carried out.
  • the target analyte may be obtained from or extracted from any organism or microorganism.
  • the target analyte may be obtained from a human or animal, e.g., from urine, lymph, saliva, mucus, seminal fluid, or amniotic fluid, or from whole blood, plasma, or serum.
  • the target analyte may be obtained from a plant e.g., a cereal, legume, fruit, or vegetable.
  • the pore or pore multimer may be modified via recombinant or chemical methods to increase the strength of binding, the position of binding, or the specificity of binding of the molecule to be sensed. Typical modifications include addition of a specific binding moiety complementary to the structure of the molecule to be sensed.
  • this binding moiety may comprise a cyclodextrin or an oligonucleotide; for small molecules this may be a known complementary binding region, for example the antigen binding portion of an antibody or of a non-antibody molecule, including a single chain variable fragment (scFv) region or an antigen recognition domain from a T-cell receptor (TCR); or for proteins, it may be a known ligand of the target protein.
  • the pore or pore multimer may be rendered capable of acting as a molecular sensor for detecting presence in a sample of suitable antigens (including epitopes) that may include cell surface antigens, including receptors, markers of solid tumours or haematologic cancer cells (e.g. lymphoma or leukaemia), viral antigens, bacterial antigens, protozoal antigens, allergens, allergy related molecules, albumin (e.g. human, rodent, or bovine), fluorescent molecules (including fluorescein), blood group antigens, small molecules, drugs, enzymes, catalytic sites of enzymes or enzyme substrates, and transition state analogues of enzyme substrates.
  • modifications may be achieved using known genetic engineering and recombinant DNA techniques.
  • the positioning of any adaptation would be dependent on the nature of the molecule to be sensed, for example, the size, three-dimensional structure, and its biochemical nature.
  • the choice of adapted structure may make use of computational structural design. Determination and optimization of protein-protein interactions or protein-small molecule interactions can be investigated using technologies such as a BIAcore® which detects molecular interactions using surface plasmon resonance (BIAcore, Inc., Piscataway, NJ; see also www.biacore.com).
  • the target analyte preferably comprises or is an amino acid, a peptide, a polypeptide, or protein.
  • the amino acid, peptide, polypeptide, or protein can be naturally occurring or non-naturally occurring.
  • the polypeptide or protein can include within them synthetic or modified amino acids. Several different types of modification to amino acids are known in the art. Suitable amino acids and modifications thereof are discussed above. It is to be understood that the target analyte can be modified by any method available in the art.
  • the target analyte preferably comprises or is a polynucleotide, such as a nucleic acid.
  • a polynucleotide is defined as a macromolecule comprising two or more nucleotides.
  • Nucleic acids are particularly suitable for nanopore sequencing.
  • the naturally occurring nucleic acid bases in DNA and RNA may be distinguished by their physical size. As a nucleic acid molecule, or individual base, passes through the channel of a nanopore, the size differential between the bases causes a directly correlated reduction in the ion flow through the channel. The variation in ion flow may be recorded. Suitable electrical measurement techniques for recording ion flow variations are discussed above.
  • the characteristic reduction in ion flow can be used to identify the particular nucleotide and associated base traversing the channel in real-time.
  • the open-channel ion flow is reduced as the individual nucleotides of the nucleic acid sequence of interest sequentially pass through the channel of the nanopore due to the partial blockage of the channel by the nucleotide. It is this reduction in ion flow that is measured using the suitable recording techniques described above.
  • the reduction in ion flow may be calibrated to the reduction in measured ion flow for known nucleotides through the channel resulting in a means for determining which nucleotide is passing through the channel, and therefore, when done sequentially, a way of determining the nucleotide sequence of the nucleic acid passing through the nanopore.
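  • The calibration step described above can be sketched, purely for illustration, as a nearest-match lookup in Python. The calibration values below are hypothetical placeholders, not measured data, and the function name `call_nucleotide` is an assumption. The sketch also simplifies by treating each measured reduction as arising from a single nucleotide.

```python
def call_nucleotide(observed_reduction, calibration):
    """Return the nucleotide whose calibrated reduction is closest."""
    return min(
        calibration,
        key=lambda nt: abs(calibration[nt] - observed_reduction),
    )


# Hypothetical calibration: fractional reduction of the open-channel
# ion flow for each known nucleotide (illustrative values only).
calibration = {"A": 0.50, "C": 0.40, "G": 0.60, "T": 0.45}

# Hypothetical sequential measurements as a strand passes through the channel.
observed = [0.61, 0.41, 0.49, 0.44]

# Calling each measurement in order yields the nucleotide sequence.
sequence = "".join(call_nucleotide(r, calibration) for r in observed)
```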
  • it has typically been required that the reduction in ion flow through the channel be directly correlated to the size of the individual nucleotide passing through the constriction.
  • sequencing may be performed upon an intact nucleic acid polymer that is 'threaded' through the pore via the action of an associated polymerase, for example.
  • sequences may be determined by passage of nucleotide triphosphate bases that have been sequentially removed from a target nucleic acid in proximity to the pore (see for example WO 2014/187924 incorporated herein by reference in its entirety).
  • the polynucleotide or nucleic acid may comprise any combination of any nucleotides.
  • the nucleotides can be naturally occurring or artificial.
  • One or more nucleotides in the polynucleotide can be oxidized or methylated.
  • One or more nucleotides in the polynucleotide may be damaged.
  • the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanomas.
  • One or more nucleotides in the polynucleotide may be modified, for instance with a label or a tag, for which suitable examples are known by a skilled person.
  • the polynucleotide may comprise one or more spacers.
  • a nucleotide typically contains a nucleobase, a sugar and at least one phosphate group.
  • the nucleobase and sugar form a nucleoside.
  • the nucleobase is typically heterocyclic.
  • Nucleobases include, but are not limited to, purines and pyrimidines and more specifically adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C).
  • the sugar is typically a pentose sugar.
  • Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The sugar is preferably a deoxyribose.
  • the polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).
  • the nucleotide is typically a ribonucleotide or deoxyribonucleotide.
  • the nucleotide typically contains a monophosphate, diphosphate, or triphosphate.
  • the nucleotide may comprise more than three phosphates, such as 4 or 5 phosphates. Phosphates may be attached on the 5' or 3' side of a nucleotide.
  • the nucleotides in the polynucleotide may be attached to each other in any manner.
  • the nucleotides are typically attached by their sugar and phosphate groups as in nucleic acids.
  • the nucleotides may be connected via their nucleobases as in pyrimidine dimers.
  • the polynucleotide may be single stranded or double stranded. At least a portion of the polynucleotide is preferably double stranded.
  • the polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA).
  • said method using a polynucleotide as an analyte alternatively comprises determining one or more characteristics selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide and (v) whether or not the polynucleotide is modified.
  • the polynucleotide can be any length (i).
  • the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400 or at least 500 nucleotides or nucleotide pairs in length.
  • the polynucleotide can be 1000 or more nucleotides or nucleotide pairs, 5000 or more nucleotides or nucleotide pairs in length or 100000 or more nucleotides or nucleotide pairs in length. Any number of polynucleotides can be investigated. For instance, the method may concern characterising 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 100 or more polynucleotides.
  • polynucleotides may be different polynucleotides or two instances of the same polynucleotide.
  • the polynucleotide can be naturally occurring or artificial.
  • the method may be used to verify the sequence of a manufactured oligonucleotide. The method is typically carried out in vitro.
  • Nucleotides can have any identity (ii), and include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), 5-methylcytidine monophosphate, 5-hydroxymethylcytidine monophosphate, cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP), deoxycytidine monophosphate (dCMP) and deoxymethylcytidine monophosphate.
  • the nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP.
  • a nucleotide may be abasic (i.e., lack a nucleobase).
  • a nucleotide may also lack a nucleobase and a sugar (i.e., is a C3 spacer).
  • the sequence of the nucleotides (iii) is determined by the consecutive identity of following nucleotides attached to each other throughout the polynucleotide strand, in the 5' to 3' direction of the strand.
  • the movement of the polynucleotide with respect to the pore or pore multimer, such as through the pore or pore multimer, is preferably controlled using a polynucleotide binding protein. Suitable proteins are discussed in more detail below.
  • the invention provides a method for determining the presence, absence or one or more characteristics of a target polynucleotide, comprising the steps of:
  • the chimeric pore in (a) is preferably a chimeric pore of the invention.
  • the chimeric pore multimer in (b) is preferably a chimeric pore multimer of the invention.
  • the target analyte preferably comprises a polypeptide.
  • Any suitable polypeptide can be characterised.
  • the polypeptide may be an unmodified protein or a portion thereof, or a naturally occurring polypeptide or a portion thereof.
  • the target polypeptide may be secreted from cells.
  • the target polypeptide can be produced inside cells such that it must be extracted from cells for characterisation.
  • the polypeptide may comprise the products of cellular expression of a plasmid, e.g., a plasmid used in cloning of proteins in accordance with the methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Press, Plainview, New York (2012); and Ausubel et al., Current Protocols in Molecular Biology (Supplement 114), John Wiley & Sons, New York (2016).
  • the polypeptide can be provided as an impure mixture of one or more polypeptides and one or more impurities.
  • Impurities may comprise truncated forms of the target polypeptide which are distinct from the "target polypeptides" for characterisation.
  • the target polypeptide may be a full-length protein and impurities may comprise fractions of the protein.
  • Impurities may also comprise proteins other than the target protein, e.g., which may be co-purified from a cell culture or obtained from a sample.
  • a polypeptide may comprise any combination of any amino acids, amino acid analogs and modified amino acids (i.e., amino acid derivatives).
  • Amino acids (and derivatives, analogs etc) in the polypeptide can be distinguished by their physical size and charge.
  • the amino acids/derivatives/analogs can be naturally occurring or artificial.
  • the polypeptide may comprise any naturally occurring amino acid.
  • the polypeptide may be modified.
  • the polypeptide may be modified for detection using the method of the invention.
  • the method may be for characterising modifications in the target polypeptide.
  • One or more of the amino acids/derivatives/analogs in the polypeptide may be modified.
  • One or more of the amino acids/derivatives/analogs in the polypeptide may be post-translationally modified.
  • the method of the invention can be used to detect the presence, absence, number or positions of post-translational modifications in a polypeptide.
  • the method can be used to characterise the extent to which a polypeptide has been post-translationally modified.
  • post-translational modifications may be present in the polypeptide.
  • Typical post-translational modifications include modification with a hydrophobic group, modification with a cofactor, addition of a chemical group, glycation (the non-enzymatic attachment of a sugar), biotinylation and pegylation.
  • Post-translational modifications can also be nonnatural, such that they are chemical modifications done in the laboratory for biotechnological or biomedical purposes. This can allow monitoring the levels of the laboratory made peptide, polypeptide, or protein in contrast to the natural counterparts.
  • Examples of post-translational modification with a hydrophobic group include myristoylation, attachment of myristate, a C14 saturated acid; palmitoylation, attachment of palmitate, a C16 saturated acid; isoprenylation or prenylation, the attachment of an isoprenoid group; farnesylation, the attachment of a farnesol group; geranylgeranylation, the attachment of a geranylgeraniol group; and glypiation, glycosylphosphatidylinositol (GPI) anchor formation via an amide bond.
  • Examples of post-translational modification with a cofactor include lipoylation, attachment of a lipoate (C8) functional group; flavination, attachment of a flavin moiety (e.g. flavin mononucleotide (FMN) or flavin adenine dinucleotide (FAD)); attachment of heme C, for instance via a thioether bond with cysteine; phosphopantetheinylation, the attachment of a 4'-phosphopantetheinyl group; and retinylidene Schiff base formation.
  • Examples of post-translational modification by addition of a chemical group include acylation, e.g. O-acylation (esters), N-acylation (amides) or S-acylation (thioesters); acetylation, the attachment of an acetyl group for instance to the N-terminus or to lysine; formylation; alkylation, the addition of an alkyl group, such as methyl or ethyl; methylation, the addition of a methyl group for instance to lysine or arginine; amidation; butyrylation; gamma-carboxylation; glycosylation, the enzymatic attachment of a glycosyl group for instance to arginine, asparagine, cysteine, hydroxylysine, serine, threonine, tyrosine or tryptophan; polysialylation, the attachment of polysialic acid; malonylation; hydroxylation; iodination; bromination; citrulin
  • the polypeptide may be labelled with a molecular label.
  • a molecular label may be a modification to the polypeptide which promotes the detection of the polypeptide in the method of the invention.
  • the label may be a modification to the polypeptide which alters the signal obtained as the conjugate is characterised.
  • the label may interfere with a flux of ions through the nanopore. In such a manner, the label may improve the sensitivity of the method.
  • the polypeptide may contain one or more cross-linked sections, e.g., C-C bridges.
  • the polypeptide may not be cross-linked prior to being characterised using the method.
  • the polypeptide may comprise sulphide-containing amino acids and thus has the potential to form disulphide bonds.
  • the polypeptide is reduced using a reagent such as DTT (Dithiothreitol) or TCEP (tris(2-carboxyethyl)phosphine) prior to being characterised using the method.
  • the polypeptide may be a full-length protein or naturally occurring polypeptide.
  • the protein or naturally occurring polypeptide may be fragmented prior to conjugation to the polynucleotide.
  • the protein or polypeptide may be chemically or enzymatically fragmented.
  • the polypeptides or polypeptide fragments can be conjugated to form a longer target polypeptide.
  • the polypeptide can be any suitable length.
  • the polypeptide preferably has a length of from about 2 to about 300 peptide units or amino acids.
  • the polypeptide has a length of from about 2 to about 100 peptide units, for example from about 2 to about 50 peptide units, e.g., from about 3 to about 50 peptide units, such as from about 5 to about 25 peptide units, e.g., from about 7 to about 16 peptide units, such as from about 9 to about 12 peptide units.
  • "Peptide unit" is interchangeable with "amino acid".
  • the one or more characteristics of the polypeptide are preferably selected from (i) the length of the polypeptide, (ii) the identity of the polypeptide, (iii) the sequence of the polypeptide, (iv) the secondary structure of the polypeptide and (v) whether or not the polypeptide is modified.
  • the one or more characteristics may be the sequence of the polypeptide or whether or not the polypeptide is modified, e.g., by one or more post-translational modifications.
  • the one or more characteristics are preferably the sequence of the polypeptide.
  • the polypeptide may be in a relaxed form.
  • the polypeptide may be held in a linearized form.
  • Holding the polypeptide in a linearized form can facilitate the characterisation of the polypeptide on a residue-by-residue basis as "bunching up" of the polypeptide within the nanopore is prevented.
  • the polypeptide can be held in a linearized form using any suitable means. For example, if the polypeptide is charged, the polypeptide can be held in a linearized form by applying a voltage.
  • the charge can be altered or controlled by adjusting the pH.
  • the polypeptide can be held in a linearized form by using high pH to increase the relative negative charge of the polypeptide.
  • Increasing the negative charge of the polypeptide allows it to be held in a linearized form under, e.g., a positive voltage.
  • the polypeptide can be held in a linearized form by using low pH to increase the relative positive charge of the polypeptide.
  • Increasing the positive charge of the polypeptide allows it to be held in a linearized form under, e.g., a negative voltage.
  • a polynucleotide-handling protein is used to control the movement of a polynucleotide with respect to a nanopore.
  • As a polynucleotide is typically negatively charged, it is generally most suitable to increase the linearization of the polypeptide by increasing the pH, thus making the polypeptide more negatively charged, in common with the polynucleotide. In this way, the conjugate retains an overall negative charge and thus can readily move, e.g., under an applied voltage.
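
The relationship between pH and polypeptide charge described above can be illustrated with a minimal Python sketch. It estimates net charge with the Henderson-Hasselbalch relation. The pKa values, the function name `net_charge` and the example sequence are assumptions made for illustration only; none of them comes from the specification, and real pKa values shift with sequence context.

```python
# Approximate pKa values for ionisable groups. These are illustrative
# textbook-style numbers chosen for this sketch, not values from the source.
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a linear polypeptide at a given pH."""
    # The N-terminal amine contributes +1 when protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    # The C-terminal carboxyl contributes -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in sequence.upper():
        if residue in PKA_BASIC:
            # Basic side chains: fraction protonated carries +1.
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_BASIC[residue]))
        elif residue in PKA_ACIDIC:
            # Acidic side chains: fraction deprotonated carries -1.
            charge -= 1.0 / (1.0 + 10 ** (PKA_ACIDIC[residue] - ph))
    return charge


if __name__ == "__main__":
    peptide = "AKDE"  # hypothetical example sequence
    for ph in (3.0, 7.5, 10.0):
        print(f"pH {ph:4.1f}: net charge {net_charge(peptide, ph):+.2f}")
```

Under these assumed pKa values the estimated charge falls monotonically as pH rises, which is consistent with using high pH to make the polypeptide more negatively charged and low pH to make it more positively charged.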
  • the polypeptide can be held in a linearized form by using suitable denaturing conditions.
  • suitable denaturing conditions include, for example, the presence of appropriate concentrations of denaturants such as guanidine HCl and/or urea.
  • the concentration of such denaturants to use in the disclosed methods is dependent on the target polypeptide to be characterised in the methods and can be readily selected by those of skill in the art.
  • the polypeptide can be held in a linearized form by using suitable detergents.
  • suitable detergents for use in the disclosed methods include SDS (sodium dodecyl sulfate).
  • the polypeptide can be held in a linearized form by carrying out the disclosed methods at an elevated temperature. Increasing the temperature overcomes intra-strand bonding and allows the polypeptide to adopt a linearized form.
  • the polypeptide can be held in a linearized form by carrying out the method under strong electro-osmotic forces. Such forces can be provided by using asymmetric salt conditions and/or providing suitable charge in the channel of the nanopore.
  • the charge in the channel of a pore can be altered, e.g., by mutagenesis. Altering the charge of a pore is well within the capacity of those skilled in the art. Altering the charge of a pore generates strong electro-osmotic forces from the unbalanced flow of cations and anions through the nanopore when a voltage potential is applied across the nanopore.
  • the polypeptide can be held in a linearized form by passing it through a structure such as an array of nanopillars, through a nanoslit or across a nanogap. The physical constraints of such structures can force the polypeptide to adopt a linearized form.
  • the target analyte may comprise a polynucleotide and a polypeptide.
  • the target analyte may be a polynucleotide-polypeptide conjugate.
  • the conjugate preferably comprises a polynucleotide conjugated to a polypeptide.
  • One or both of the polynucleotide and polypeptide may be the target and may be characterised in accordance with the invention.
  • the polypeptide can be conjugated to the polynucleotide at any suitable position.
  • the polypeptide can be conjugated to the polynucleotide at the N-terminus or the C-terminus of the polypeptide.
  • the polypeptide can be conjugated to the polynucleotide via a side chain group of a residue (e.g., an amino acid residue) in the polypeptide.
  • the polypeptide may have a naturally occurring reactive functional group which can be used to facilitate conjugation to the polynucleotide.
  • a cysteine residue can be used to form a disulphide bond to the polynucleotide or to a modified group thereon.
  • the polypeptide may be modified in order to facilitate its conjugation to the polynucleotide.
  • the polypeptide may be modified by attaching a moiety comprising a reactive functional group for attaching to the polynucleotide.
  • the polypeptide can be extended at the N-terminus or the C-terminus by one or more residues (e.g., amino acid residues) comprising one or more reactive functional groups for reacting with a corresponding reactive functional group on the polynucleotide.
  • the polypeptide can be extended at the N-terminus and/or the C-terminus by one or more cysteine residues.
  • Such residues can be used for attachment to the polynucleotide portion of the conjugate, e.g., by maleimide chemistry (e.g., by reaction of cysteine with an azido-maleimide compound such as azido-[Pol]-maleimide wherein [Pol] is typically a short chain polymer such as PEG, e.g., PEG2, PEG3, or PEG4; followed by coupling to appropriately functionalised polynucleotide e.g., polynucleotide carrying a BCN group for reaction with the azide).
  • if the polypeptide comprises an appropriate naturally occurring residue at the N- and/or C-terminus (e.g., a naturally occurring cysteine residue at the N- and/or C-terminus) then such residue(s) can be used for attachment to the polynucleotide.
  • a residue in the polypeptide may be modified to facilitate attachment of the polypeptide to the polynucleotide.
  • a residue (e.g., an amino acid residue) in the polypeptide may be chemically modified for attachment to the polynucleotide.
  • a residue (e.g., an amino acid residue) in the polypeptide may be enzymatically modified for attachment to the polynucleotide.
  • the conjugation chemistry between the polynucleotide and the polypeptide in the conjugate is not particularly limited. Any suitable combination of reactive functional groups can be used. Many suitable reactive groups and their chemical targets are known in the art.
  • Some exemplary reactive groups and their corresponding targets include aryl azides which may react with amines, carbodiimides which may react with amines and carboxyl groups, hydrazides which may react with carbohydrates, hydroxymethyl phosphines which may react with amines, imidoesters which may react with amines, isocyanates which may react with hydroxyl groups, carbonyls which may react with hydrazines, maleimides which may react with sulfhydryl groups, NHS-esters which may react with amines, PFP-esters which may react with amines, psoralens which may react with thymine, pyridyl disulfides which may react with sulfhydryl groups, vinyl sulfones which may react with sulfhydryl, amine and hydroxyl groups, vinylsulfonamides, and the like.
  • Suitable methods for conjugating the polypeptide to the polynucleotide include click chemistry.
  • Many suitable click chemistry reagents are known in the art. Suitable examples of click chemistry include, but are not limited to, the following: copper(I)-catalyzed azide-alkyne cycloadditions (azide alkyne Huisgen cycloadditions); strain-promoted azide-alkyne cycloadditions; including alkene and azide [3+2] cycloadditions; alkene and tetrazine inverse-demand Diels-Alder reactions; and alkene and tetrazole photoclick reactions; copper-free variant of the 1,3 dipolar cycloaddition reaction, where an azide reacts with an alkyne under strain, for example in a cyclooctane ring such as in bicyclo[6.1.0]nonyne (BCN); the reaction
  • Any reactive group may be used to form the conjugate.
  • suitable reactive groups include 1,4-Bis[3-(2-pyridyldithio)propionamido]butane; 1,11-bis-maleimidotriethyleneglycol; 3,3'-dithiodipropionic acid di(N-hydroxysuccinimide ester); ethylene glycol-bis(succinic acid N-hydroxysuccinimide ester); 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid disodium salt; Bis[2-(4-azidosalicylamido)ethyl] disulphide; 3-(2-pyridyldithio)propionic acid N-hydroxysuccinimide ester; 4-maleimidobutyric acid N-hydroxysuccinimide ester; Iodoacetic acid N-hydroxysuccinimide ester; S-acetylthioglycolic acid N-hydroxysuccin
  • the reactive functional group may be comprised in the polynucleotide and the target functional group may be comprised in the polypeptide prior to the conjugation step.
  • the reactive functional group may be comprised in the polypeptide and the target functional group may be comprised in the polynucleotide prior to the conjugation step.
  • the reactive functional group may be attached directly to the polypeptide.
  • the reactive functional group may be attached to the polypeptide via a spacer. Any suitable spacer can be used. Suitable spacers include for example alkyl diamines such as ethyl diamine, etc.
  • the conjugate may comprise a plurality of polypeptide sections and/or a plurality of polynucleotide sections.
  • the conjugate may comprise a structure of the form ...-P-N-P-N-P-N... wherein P is a polypeptide and N is a polynucleotide.
  • a polynucleotide-handling protein may sequentially control the movement of the N portions of the conjugate with respect to the pore and thus sequentially control the movement of the P sections with respect to the pore, thus allowing the sequential characterisation of the P sections.
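
The ...-P-N-P-N-P-N... arrangement can be sketched as a simple data structure. The following Python sketch is illustrative only: the helper names `parse_layout`, `is_alternating` and `polypeptide_positions` are hypothetical, and the strict P/N alternation check is an assumption drawn from the example layout rather than a stated requirement.

```python
from typing import List

# P = polypeptide section, N = polynucleotide section (labels from the text).
VALID_SECTIONS = {"P", "N"}


def parse_layout(layout: str) -> List[str]:
    """Split a layout such as 'P-N-P-N' into its ordered sections."""
    sections = [part.strip() for part in layout.split("-") if part.strip()]
    unknown = set(sections) - VALID_SECTIONS
    if unknown:
        raise ValueError(f"unknown section labels: {sorted(unknown)}")
    return sections


def is_alternating(sections: List[str]) -> bool:
    """Return True when no two neighbouring sections are of the same type."""
    return all(a != b for a, b in zip(sections, sections[1:]))


def polypeptide_positions(sections: List[str]) -> List[int]:
    """Return the indices of the P sections in the order they occur."""
    return [index for index, kind in enumerate(sections) if kind == "P"]


if __name__ == "__main__":
    conjugate = parse_layout("P-N-P-N-P-N")
    print(conjugate, is_alternating(conjugate), polypeptide_positions(conjugate))
```

In this sketch each P section is adjacent to an N section, which mirrors the idea that controlling each N portion in turn also controls the movement of the neighbouring P section.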
  • the plurality of polynucleotides and polypeptides may be conjugated together by the same or different chemistries.
  • the conjugate may comprise a leader. Any suitable leader may be used.
  • the leader may be a polynucleotide.
  • the leader may be the same sort of polynucleotide as the polynucleotide used in the conjugate, or it may be a different type of polynucleotide.
  • the polynucleotide in the conjugate may be DNA and the leader may be RNA or vice versa.
  • the leader may be a charged polymer, e.g., a negatively charged polymer.
  • the leader may comprise a polymer such as PEG or a polysaccharide.
  • the leader may be from 10 to 150 monomer units (e.g., ethylene glycol or saccharide units) in length, such as from 20 to 120, e.g., 30 to 100, for example 40 to 80 such as 50 to 70 monomer units (e.g., ethylene glycol or saccharide units) in length.
  • the methods of characterising a target polypeptide of the invention may comprise conjugating a polypeptide to a polynucleotide.
  • the one or more characteristics of the target analyte are preferably measured by electrical measurement and/or optical measurement.
  • the electrical measurement is a current measurement, an impedance measurement, a tunnelling measurement, or a field effect transistor (FET) measurement.
  • the method preferably comprises measuring the current flowing through the pore or the pore multimer as the target analyte moves with respect to, such as through, the pore.
  • the invention also provides a polynucleotide which encodes a pore of the invention or a construct of the invention, including a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention.
  • the invention also provides a polynucleotide which encodes a CsgG pore of the invention.
  • the polynucleotide may be any of those discussed above.
  • the invention also provides an expression vector comprising a polynucleotide of the invention.
  • the invention also provides a host cell comprising a polynucleotide of the invention or an expression vector of the invention. Suitable vectors and host cells are known in the art.
  • the invention also provides a kit for characterising a target analyte. The kit comprises (a) a pore monomer of the invention or a construct of the invention and (b) the components of a membrane.
  • the kit preferably comprises (a) a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention and (b) the components of a membrane.
  • the kit preferably comprises (a) a CsgG pore monomer or a CsgG construct of the invention and (b) the components of a membrane. Suitable membranes and components are discussed below.
  • the kit comprises:
  • a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores
  • a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a),
  • a PorARc pore multimer of the invention and a polynucleotide binding protein.
  • the kit comprises:
  • the kit preferably further comprises the components of a membrane.
  • the kit may comprise components of any type of membranes, such as an amphiphilic layer or a triblock copolymer membrane.
  • Preferred polynucleotide binding proteins are polymerases, exonucleases, helicases, and topoisomerases, such as gyrases.
  • Suitable enzymes include, but are not limited to, exonuclease I from E. coli, exonuclease III enzyme from E. coli, RecJ from T. thermophilus and bacteriophage lambda exonuclease, TatD exonuclease and variants thereof.
  • thermophilus or a variant thereof interact to form a trimer exonuclease.
  • the polymerase may be PyroPhage® 3173 DNA Polymerase (which is commercially available from Lucigen® Corporation), SD Polymerase (commercially available from Bioron®) or variants thereof.
  • the enzyme may be Phi29 DNA polymerase or a variant thereof.
  • the topoisomerase is preferably a member of any of the Enzyme Classification (EC) groups 5.99.1.2 and 5.99.1.3.
  • the enzyme is most preferably derived from a helicase, such as Hel308 Mbu, Hel308 Csy, Hel308 Tga, Hel308 Mhu, TraI Eco, XPD Mbu or a variant thereof.
  • Any helicase may be used in the invention.
  • the helicase may be or be derived from a Hel308 helicase, a RecD helicase, such as TraI helicase or a TrwC helicase, a XPD helicase or a Dda helicase.
  • the helicase may be any of the helicases, modified helicases or helicase constructs disclosed in WO 2013/057495; WO 2013/098562; WO 2013/098561; WO 2014/013260; WO 2014/013259; WO 2014/013262 and WO 2015/055981. All of these are incorporated by reference in their entirety.
  • the kit may further comprise one or more anchors, such as cholesterol, for coupling the target analyte to the membrane.
  • the kit may further comprise one or more polynucleotide adaptors that can be attached to a target polynucleotide to facilitate characterisation of the polynucleotide.
  • the anchor, such as cholesterol, is preferably attached to the polynucleotide adaptor.
  • the kit may additionally comprise one or more other reagents or instruments which enable any of the embodiments mentioned above to be carried out.
  • Such reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means to obtain a sample from a subject (such as a vessel or an instrument comprising a needle), means to amplify and/or express polynucleotides, or voltage or patch clamp apparatus.
  • Reagents may be present in the kit in a dry state such that a fluid sample resuspends the reagents.
  • the kit may also, optionally, comprise instructions to enable the kit to be used in the method of the invention or details regarding for which organism the method may be used.
  • the kit may also comprise additional components useful in analyte characterization.
  • the invention also provides an apparatus for characterising target analytes, such as target polynucleotides, in a sample, comprising a
  • the invention also provides an apparatus for characterising target analytes in a sample, comprising a
  • the plurality of pores or plurality of pore multimers may be any of those discussed above.
  • the invention also provides an apparatus comprising a pore of the invention or a pore multimer of the invention inserted into an in vitro membrane.
  • the apparatus preferably comprises a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention inserted into an in vitro membrane.
  • the apparatus preferably comprises a CsgG pore of the invention or a CsgG pore multimer of the invention inserted into an in vitro membrane.
  • the invention also provides an apparatus produced by a method comprising: (i) obtaining a pore of the invention or a pore multimer of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane.
  • the apparatus is preferably produced by a method comprising: (i) obtaining a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane.
  • the apparatus is preferably produced by a method comprising: (i) obtaining a CsgG pore of the invention or a CsgG pore multimer of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane.
  • the invention also provides an array comprising a plurality of membranes of the invention. Any of the embodiments discussed above with respect to the membranes of the invention equally apply to the array of the invention.
  • the array may be set up to perform any of the methods described below.
  • each membrane in the array comprises one pore or pore multimer. Due to the manner in which the array is formed, for example, the array may comprise one or more membranes that do not comprise a pore or pore multimer, and/or one or more membranes that comprise two or more pores or pore multimers.
  • the array may comprise from about 2 to about 1000, such as from about 10 to about 800, from about 20 to about 600 or from about 30 to about 500 membranes.
  • the invention provides a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s).
  • the pores and membranes may be any as described above and below.
  • the system further comprises a first chamber and a second chamber, wherein the first and second chambers are separated by the membrane(s).
  • the system may further comprise a target analyte, wherein the target analyte is transiently located within the continuous channel and wherein one end of the target analyte is located in the first chamber and one end of the target analyte is located in the second chamber.
  • the target analyte is preferably a target polypeptide or a target polynucleotide.
  • the system further comprises an electrically conductive solution in contact with the pore(s), electrodes providing a voltage potential across the membrane(s), and a measurement system for measuring the current through the pore(s).
  • the voltage applied across the membranes and pore is preferably from +5 V to -5 V, such as -600 mV to +600 mV or -400 mV to +400 mV.
  • the voltage used is preferably in the range 100 mV to 240 mV and more preferably in the range of 120 mV to 220 mV. It is possible to increase discrimination between different amino acids or nucleotides by a pore by using an increased applied potential. Any suitable electrically conductive solution may be used.
  • the solution may comprise charge carriers, such as metal salts, for example alkali metal salt, halide salts, for example chloride salts, such as alkali metal chloride salt.
  • Charge carriers may include ionic liquids or organic salts, for example tetramethyl ammonium chloride, trimethylphenyl ammonium chloride, phenyltrimethyl ammonium chloride, or 1-ethyl-3-methyl imidazolium chloride.
  • salt is present in the aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl), caesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used.
  • KCl, NaCl and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred.
  • the charge carriers may be asymmetric across the membrane. For instance, the type and/or concentration of the charge carriers may be different on each side of the membrane, e.g., in each chamber.
  • the salt concentration may be at saturation.
  • the salt concentration may be 3 M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1 M to 1.4 M.
  • the salt concentration is preferably from 150 mM to 1 M.
  • the method is preferably carried out using a salt concentration of at least 0.3 M, such as at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M or at least 3.0 M.
  • High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of an amino acid or nucleotide to be identified against the background of normal current fluctuations.
  • a buffer may be present in the electrically conductive solution.
  • the buffer is phosphate buffer.
  • Other suitable buffers are HEPES and Tris-HCl buffer.
  • the pH of the electrically conductive solution may be from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5.
  • the pH used is preferably about 7.5.
  • the system may be comprised in an apparatus.
  • the apparatus may be any conventional apparatus for analyte analysis, such as an array or a chip.
  • the apparatus is preferably set up to carry out the disclosed method.
  • the apparatus may comprise a chamber comprising an aqueous solution and a barrier that separates the chamber into two sections.
  • the barrier typically has an aperture in which the membrane(s) containing the pore(s) are formed.
  • the barrier forms the membrane in which the pore is present.
  • the apparatus may also comprise an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore.
  • the apparatus may be any of those described in WO 2008/102120, WO 2009/077734, WO 2010/122293, WO 2011/067559, or WO 00/28312 (all incorporated herein by reference in their entirety).
  • the membrane is preferably an amphiphilic layer.
  • An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties.
  • the amphiphilic molecules may be synthetic or naturally occurring.
  • Non-naturally occurring amphiphiles and amphiphiles which form a monolayer are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450).
  • Block copolymers are polymeric materials in which two or more monomer sub-units are polymerized together to create a single polymer chain.
  • Block copolymers typically have properties that are contributed by each monomer sub-unit. However, a block copolymer may have unique properties that polymers formed from the individual sub-units do not possess. Block copolymers can be engineered such that one of the monomer sub-units is hydrophobic (i.e., lipophilic), whilst the other sub-unit(s) are hydrophilic whilst in aqueous media. In this case, the block copolymer may possess amphiphilic properties and may form a structure that mimics a biological membrane.
  • the block copolymer may be a diblock (consisting of two monomer sub-units) but may also be constructed from more than two monomer sub-units to form more complex arrangements that behave as amphiphiles.
  • the copolymer may be a triblock, tetrablock or pentablock copolymer.
  • the membrane is preferably a triblock copolymer membrane.
  • the membrane may comprise one of the membranes disclosed in International Application No. WO 2014/064443 or WO 2014/064444.
  • the amphiphilic molecules may be chemically modified or functionalised to facilitate coupling of the polynucleotide.
  • the amphiphilic layer may be a monolayer or a bilayer.
  • the amphiphilic layer is typically planar.
  • the amphiphilic layer may be curved.
  • the amphiphilic layer may be supported.
  • Amphiphilic membranes are typically naturally mobile, essentially acting as two-dimensional fluids with lipid diffusion rates of approximately 10⁻⁸ cm s⁻¹. This means that the pore and coupled polynucleotide can typically move within an amphiphilic membrane.
  • the membrane may be a lipid bilayer.
  • Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies.
  • lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording.
  • lipid bilayers can be used as biosensors to detect the presence of a range of substances.
  • the lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, a planar lipid bilayer, a supported bilayer, or a liposome.
  • the lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in WO 2008/102121, WO 2009/077734, and WO 2006/100484 (all incorporated herein by reference in their entirety).
  • the membrane may comprise a solid-state layer.
  • Solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si₃N₄, Al₂O₃, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses.
  • the solid-state layer may be formed from graphene. Suitable graphene layers are disclosed in WO 2009/035647 (incorporated herein by reference in its entirety).
  • the pore is typically present in an amphiphilic membrane or layer contained within the solid-state layer, for instance within a hole, well, gap, channel, trench or slit within the solid-state layer.
  • suitable solid state/amphiphilic hybrid systems are disclosed in WO 2009/020682 and WO 2012/005857 (both incorporated herein by reference in their entirety). Any of the amphiphilic membranes or layers discussed above may be used.
  • the method is typically carried out using (i) an artificial amphiphilic layer comprising a pore, (ii) an isolated, naturally occurring lipid bilayer comprising a pore, or (iii) a cell having a pore inserted therein.
  • the method is typically carried out using an artificial amphiphilic layer, such as a di- or tri-block copolymer layer.
  • the layer may comprise other transmembrane and/or intramembrane proteins as well as other molecules in addition to the pore. Suitable apparatus and conditions are discussed below.
  • the method of the invention is typically carried out in vitro.
  • constriction transplants and (2) cap transplants (also known as scaffold transplants).
  • PorARc_Rco Rhodococcus corynebacteroides
  • SEQ ID NO: 1 Chimeras were named with the cap region (or scaffold) first and then the constriction region.
  • PorARc_Rco_Mph contained the cap region (or scaffold) from PorARc_Rco and the constriction from PorARc pore from Mycolicibacterium phlei (PorARc_Mph; SEQ ID NO: 2).
  • PorARc_Mph For (2), different cap regions (or scaffolds) were transplanted around the constriction region from PorARc_Mph (SEQ ID NO: 2). The same naming convention was applied. For instance, PorARc_Gcr_Mph was the cap region (or scaffolds) from PorARc from Gordonia crocea transplanted around the constriction from PorARc_Mph (SEQ ID NO: 2).
  • Recombinant expression vectors encoding the chimeric pore monomers and PorARc pores with a C-terminal Strep affinity tag and ampicillin resistance gene were transformed into chemically competent E. coli cells.
  • the cells were plated onto an LB Agar plate containing appropriate antibiotics for selection.
  • a single colony from the agar plate was inoculated in LB Media with antibiotics and grown overnight.
  • the culture was diluted into LB media plus necessary antibiotics and incubated at 37°C for 6.5 hours. Following incubation at 37°C, glucose was added and the temperature was dropped to 18°C. After incubation at 18°C for 1 hour, lactose was added and the culture was incubated at 18°C for a further 16 hours.
  • the cells were harvested through centrifugation before being lysed and extracted into 1x BugBuster extraction reagent (Merck 70921) and 0.1% DDM.
  • the chimeric pore monomers were purified from the supernatant using affinity chromatography and ion exchange chromatography, selecting for oligomeric nanopores as judged by SDS-PAGE.
  • DNA squiggle (i.e., DNA translocation current traces)
  • PorARc from Rhodococcus corynebacteroides was tested for comparative purposes.
  • PorARc pore from Mycolicibacterium phlei was also tested.
  • the chimeric pores tested are listed in the tables below. All pores tested were homo-oligomers.
  • the analyte being used to assess the DNA squiggle was a 3.6-kilobase DNA section from the 3' end of the lambda genome.
  • Preparation of the analyte, ligating the analyte to the Y-adapter, SPRI-bead clean-up of the ligated analyte and addition to a MinION flow cell was carried out using the Oxford Nanopore Technologies Q-SQK-LSK109 protocol.
  • Peptide-DNA conjugate squiggles (i.e., peptide-DNA translocation current traces)
  • Example current versus time traces as a peptide translocates through the pores were obtained by using a conjugate comprising a polypeptide flanked by two pieces of polynucleotide: a dsDNA Y adapter (DNA1) and a dsDNA tail (DNA2).
  • a polynucleotide-handling protein at the cis side of the nanopore controls the movement of the conjugate by first unwinding DNA1 and translocating 5'-3' on ssDNA, then sliding across the polypeptide section to finally unwind the DNA2 segment.
  • the DNA and polypeptide sections can be visualized on a current vs time plot.
  • the adaptor was from the Oxford Nanopore Technologies Q-SQK-LSK109 kit as above.
  • the DNA tail was made by annealing two DNA oligonucleotides. It also contains a side arm for tethering, resulting in two tethering sites per construct to increase the efficiency of capture.
  • the polypeptide analytes were obtained with azide moieties at the N-terminus and directly after the C-terminus using an ethyl diamine spacer in line with the peptide backbone. Each analyte was then conjugated to the Y-adapter and DNA tail via copper-free Click Chemistry reaction between the azide and BCN (bicyclo[6.1.0]nonyne) moieties.
  • the sample was purified using Agencourt AMPure XP (Beckman Coulter) beads, with two washes in 28% PEG 8K, 2.5 M NaCl, 25 mM Tris (pH 8.0) buffer, and eluted into 10 mM Tris-Cl, 50 mM NaCl (pH 8.0).
  • a standard sequencing script at -180 mV was run for 1-6 hours, with static flicks every 1 minute to remove extended nanopore blocks.
  • Raw data was collected in a bulk FAST5 file using MinKNOW software (Oxford Nanopore Technologies).
  • EXAMPLE 2 - CSGG CHIMERAS Various CsgG constriction transplant chimeras were produced and tested as described above in Example 1. The chimeras are summarised in Table 7 below.
  • Each chimeric pore represents the cap region and transmembrane beta barrel region (together known as the scaffold) from CsgG_Eco_WT containing the constriction from a different CsgG pore from a different species.
  • CsgG-Eco-Vdi is the cap region and transmembrane beta barrel region (or scaffold) from CsgG_Eco_WT containing the constriction from CsgG_Vdi_WT.
  • the pores from which the constrictions were derived are summarised in Table 4 above.
  • Table 7 CsgG constriction transplant chimeric pores
  • Table 8 Sequence identities of pores in Table 7 (calculated including the signal peptide)
  • A Sequence identity of full homologous pore from which the constriction in the chimera is derived to CsgG-Eco-WT
  • B Sequence identity of full chimera to CsgG-Eco-WT.
  • C Sequence identity of chimera to CsgG-Eco-WT (constriction only, E44-A59).
  • E Sequence identity of chimera to CsgG-Eco-WT (constriction only, V38-S63). E shows the sequence identities of the constrictions in Table 4 to the CsgG-Eco-WT constriction in Table 4. Representative electrophysiology results for CsgG-Eco-Vmae are shown in Figure 9.
  • Each chimeric pore represents the cap region and transmembrane beta barrel region (together known as the scaffold) from CsgG_Eco_WT containing the constriction from a different CsgG pore from a different species.
  • CsgG-Eco-Vfu is the cap region and transmembrane beta barrel region (or scaffold) from CsgG_Eco_WT containing the constriction from CsgG_Vfu_WT.
  • the pores from which the constrictions were derived are summarised in Table 4 above.
  • A Sequence identity of full homologous pore from which the constriction in the chimera is derived to CsgG-Eco-WT
  • B Sequence identity of full chimera to CsgG-Eco-WT.
  • C Sequence identity of chimera to CsgG-Eco-WT (constriction only, E44-A59).
  • E Sequence identity of chimera to CsgG-Eco-WT (constriction only: V38-S63). E shows the sequence identities of the constrictions in Table 4 to the CsgG-Eco-WT constriction in Table 4. Representative electrophysiology results for CsgG-Eco-Vfu are shown in Figure 13.
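The naming convention used for the chimeras in the Examples (cap region or scaffold first, then the constriction donor) can be sketched as a small parser. This is an illustrative sketch only: the `PoreName` type and `parse_pore_name` function are hypothetical names, and treating a third field of `WT` as "not a chimera" is an assumption based on names such as CsgG_Eco_WT.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PoreName:
    family: str                  # pore family, e.g. "PorARc" or "CsgG"
    scaffold: str                # species code of the cap region (or scaffold)
    constriction: Optional[str]  # species code of the constriction donor, None if not a chimera
    extra: Tuple[str, ...]       # trailing identifiers, kept verbatim


def parse_pore_name(name: str) -> PoreName:
    """Split a pore name on '_' or '-' into family, scaffold and constriction."""
    parts = [p for p in re.split(r"[_-]", name.strip()) if p]
    if len(parts) < 2:
        raise ValueError(f"expected at least a family and a species code: {name!r}")
    family, scaffold = parts[0], parts[1]
    constriction = None
    extra = tuple(parts[2:])
    # Assumption: a third field other than "WT" names the constriction donor.
    if len(parts) >= 3 and parts[2] != "WT":
        constriction = parts[2]
        extra = tuple(parts[3:])
    return PoreName(family, scaffold, constriction, extra)


if __name__ == "__main__":
    for example in ("PorARc_Gcr_Mph", "CsgG-Eco-Vfu", "PorARc_Mph", "CsgG_Eco_WT"):
        print(example, "->", parse_pore_name(example))
```

Under these assumptions, PorARc_Gcr_Mph parses to a Gcr scaffold with an Mph constriction, while PorARc_Mph and CsgG_Eco_WT parse as non-chimeric pores.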

Abstract

The present invention relates to novel pore monomers, pores formed from the pore monomers and their uses in analyte detection and characterisation.

Description

PORE MONOMERS AND PORES
TECHNICAL FIELD
The present invention relates to novel pore monomers, pores formed from the pore monomers and their uses in analyte detection and characterisation.
BACKGROUND
Nanopore sensing is an approach to analyte detection and characterization that relies on the observation of individual binding or interaction events between the analyte molecules and an ion conducting channel. Two of the essential components of analyte characterization using nanopore sensing are (1) the control of analyte movement through the pore and (2) the discrimination of the composing building blocks as the analyte is moved through the pore. During nanopore sensing, the narrowest part of the pore forms the most discriminating part of the nanopore with respect to the current signatures as a function of the passing analyte.
For polynucleotide analytes, nucleotide discrimination is achieved by measuring the current as the polynucleotide passes through the pore. Multiple nucleotides contribute to the observed current, so the height of the channel constriction and extent of the interaction with the polynucleotide affect the relationship between observed current and polynucleotide sequence. While the current range and signal-to-noise ratio for nucleotide discrimination have been improved through mutations of protein pores, a sequencing system would have higher performance if the current differences between nucleotides could be improved further. Accordingly, there is a need to identify novel ways to improve nanopore sensing features.
Ghanem et al. (2022), FEBS J, 289: 3505-3520 discloses chimeric mutants of alpha-hemolysin and gamma-hemolysin. However, these chimeric mutants are not used in nanopore sensing.
SUMMARY OF THE INVENTION
The inventors have surprisingly shown that chimeric pores formed from at least two different pores display an increased signal-to-noise ratio (SNR), an increased current range, decreased noise and increased normalised median absolute deviation (nMAD) during analyte characterisation compared with the different pores from which the chimeras are derived. Increased SNR, increased current range, decreased noise and increased nMAD improve the pore's ability to discriminate analytes as they pass through the pore. The invention therefore provides a chimeric pore monomer comprising two or more regions, wherein at least two of the two or more regions are from at least two different pores, and wherein the at least two different pores do not comprise alpha-hemolysin and gamma-hemolysin. The invention also provides:
- a chimeric construct comprising two or more covalently attached chimeric pore monomers of the invention;
- a chimeric pore comprising at least one chimeric pore monomer of the invention or at least one construct of the invention;
- a chimeric pore multimer comprising two or more pores, wherein at least one of the pores is a chimeric pore of the invention;
- a chimeric pore of the invention or a chimeric pore multimer of the invention, which is comprised in a membrane;
- a membrane comprising a chimeric pore of the invention or a chimeric pore multimer of the invention;
- a method for producing a chimeric pore monomer of the invention comprising attaching the at least two regions from at least two different pores;
- a method for determining the presence, absence or one or more characteristics of a target analyte, comprising the steps of:
(i) contacting the target analyte with (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a); and
(ii) taking one or more measurements as the target analyte moves with respect to the pore or pore multimer and thereby determining the presence, absence or one or more characteristics of the target analyte;
- a method of characterising a target analyte using (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a);
- use of (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a) to determine the presence, absence or one or more characteristics of a target analyte;
- a kit for characterising a target polynucleotide comprising:
- (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a); and
- a polynucleotide binding protein;
- an apparatus for characterising a target polynucleotide in a sample, comprising:
- (a) a plurality of chimeric pores comprising two or more regions wherein at least two of the two or more regions are from at least two different pores or (b) a plurality of chimeric pore multimers comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a); and
- a plurality of polynucleotide binding proteins;
- a polynucleotide which encodes a chimeric pore monomer of the invention or a chimeric construct of the invention;
- a kit for characterising a target analyte comprising (a) a chimeric pore of the invention or a chimeric pore multimer of the invention and (b) the components of a membrane;
- an array comprising a plurality of membranes of the invention;
- a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s);
- an apparatus comprising a chimeric pore of the invention or a chimeric pore multimer of the invention inserted into an in vitro membrane; and
- an apparatus produced by a method comprising (i) obtaining a chimeric pore of the invention or a chimeric pore multimer of the invention and (ii) contacting the chimeric pore or a pore multimer with an in vitro membrane such that the chimeric pore or the pore multimer is inserted in the in vitro membrane.
The inventors have also surprisingly shown the PorARc pores from various species are capable of nanopore sensing with a high signal-to-noise ratio (SNR), a good current range, minimal noise, and a good normalised median absolute deviation (nMAD). The invention therefore provides a PorARc pore monomer which comprises a sequence having at least 88% identity to the sequence shown in SEQ ID NO: 2 or a sequence having at least about 20% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
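A percent-identity threshold such as "at least 88% identity" can be illustrated with a minimal sketch. The method the specification uses to calculate identity is not shown in this section, so the following is an assumption: the two sequences are taken to be pre-aligned to equal length with `-` as the gap character, and identity is the number of matching columns divided by the number of columns in which at least one sequence has a residue. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column with no residue in either sequence is ignored
        compared += 1
        if a == b:    # both are residues here, since gap/gap columns were skipped
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACDE", "ACDF"))
    print(meets_identity("ACDE", "ACDF", 88.0))
```

Other conventions (for example dividing by the length of the shorter sequence, or using a specific alignment program) give different values, so the denominator chosen here should not be read as the definition used for the claimed percentages.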
The invention also provides:
- a PorARc construct comprising two or more covalently attached PorARc pore monomers of the invention;
- a PorARc pore comprising at least one PorARc pore monomer of the invention or at least one construct of the invention;
- a PorARc pore multimer comprising two or more pores, wherein at least one of the pores is a PorARc pore of the invention;
- a PorARc pore of the invention or a PorARc pore multimer of the invention, which is comprised in a membrane;
- a membrane comprising a PorARc pore of the invention or a PorARc pore multimer of the invention;
- a method for determining the presence, absence or one or more characteristics of a target analyte, comprising the steps of:
(i) contacting the target analyte with a PorARc pore of the invention or a PorARc pore multimer of the invention; and
(ii) taking one or more measurements as the target analyte moves with respect to the pore or pore multimer and thereby determining the presence, absence or one or more characteristics of the target analyte;
- a method of characterising a target analyte using a PorARc pore of the invention or a PorARc pore multimer of the invention;
- use of a PorARc pore of the invention or a PorARc pore multimer of the invention to determine the presence, absence or one or more characteristics of a target analyte;
- a kit for characterising a target polynucleotide comprising (a) a PorARc pore of the invention or a PorARc pore multimer of the invention and (b) a polynucleotide binding protein;
- an apparatus for characterising a target polynucleotide in a sample, comprising (a) a plurality of PorARc pores of the invention or a plurality of PorARc pore multimers of the invention and (b) a plurality of polynucleotide binding proteins;
- a polynucleotide which encodes a PorARc pore monomer of the invention or a PorARc construct of the invention;
- a kit for characterising a target analyte comprising (a) a PorARc pore of the invention or a PorARc pore multimer of the invention and (b) the components of a membrane;
- an array comprising a plurality of membranes of the invention;
- a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s);
- an apparatus comprising a PorARc pore of the invention or a PorARc pore multimer of the invention inserted into an in vitro membrane; and
- an apparatus produced by a method comprising (i) obtaining a PorARc pore of the invention or a PorARc pore multimer of the invention and (ii) contacting the chimeric pore or a pore multimer with an in vitro membrane such that the chimeric pore or the pore multimer is inserted in the in vitro membrane.
DESCRIPTION OF THE FIGURES
Figure 1: Schematic showing a chimeric pore formed from the cap region (also known as the scaffold) of one pore (A) and the constriction region of another (B).
Figure 2: An alignment of the sequences of the constriction pore monomer chimeras of the invention. The dark shading shows the consistency of the cap region (or scaffold) between the chimeras and the lack of shading shows the differences between the constriction regions.
Figure 3: Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc from Rhodococcus corynebacteroides (PorARc_Rco) (for comparative purposes).
Figure 4: Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc pore from Mycolicibacterium phlei (PorARc_Mph).
Figure 5: Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc_Rco_Mel_ONLZ18401_ONLP19805 (SEQ ID NO: 18).
Figure 6: Snapshots of run reports showing the ionic current (pA) versus time (s) as single stranded DNA or a peptide-DNA conjugate (bottom squiggle) translocates through PorARc_Aku_Mph_ONLZ19310_ONLP20864 (SEQ ID NO: 40).
Figure 7: SNR, current range (pA) and noise (pA) for all the chimeras tested in Example 1. The comparative line relates to PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N (SEQ ID NO: 2).
Figure 8: nMAD for all the chimeras tested in Example 1. The comparative line relates to PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N (SEQ ID NO: 2).
Figure 9: Representative ionic current (pA) versus time (s) traces as single stranded DNA translocates through a CsgG chimeric nanopore (CsgG-Eco-Vmae in Table 7) in Example 2. The raw current trace is shown in black lines and the event detected signal is shown in red lines. A shows consecutive DNA translocation events through a single nanopore, B shows an individual DNA translocation event, C shows a zoomed in view in the x-axis of the first section of the current trace, and D shows a zoomed in view in the x- and y-axes of the first section of the current trace.
Figure 10: SNR, current range (pA) and noise (pA) for the CsgG chimeras tested. The results are shown from left to right in the order in which the chimeras appear in Table 7. The comparative line relates to CsgG wild type pore from E. coli (CsgG-Eco-WT).
Figure 11: The structure and size of the wild-type CsgG pore from Escherichia coli strain K12 (the databank accession code for this structure is 4UV3). The distances shown are measured from backbone to backbone of the amino acids forming the pore structure. The CsgG pore is a tightly interconnected symmetrical nonameric pore that resembles a crown. The overall height is 98 Å, and the largest outer diameter is 120 Å. It defines a central channel and consists of three parts: (A) the cap region, (B) the constriction region and (C) the transmembrane beta barrel region. The cap has an axial length, or height, of 39 Å. It has an inner diameter of 43 Å and a 66 Å mouth. The beta barrel has 36 strands, an axial length of 39 Å and an inner diameter of 55 Å. The transition between the pore cap and the beta barrel is sharp, with the constriction located between them, at the level of the predicted lipid-aqueous interface. The constriction is approximately 18.5 Å in diameter and has a length of 20 Å along the axis of the channel.
Figure 12: Structure and dimensions of PorARc (cryoEM structure). The distances are measured from backbone to backbone of the amino acids forming the pore structure. The PorARc pore is a symmetrical octameric pore. The overall height is 90.4 Å, and the largest outer dimension is 90.7 Å. The PorARc pore consists of the cap region (A) and the transmembrane beta barrel region (B) which together make the cap region (or scaffold) (C), and the constriction region (D). The cap region (or scaffold) has a height of 73.6 Å corresponding to the height of the cap region (A), 44.7 Å, and the transmembrane beta barrel region (B), 26.2 Å. The constriction region (C) has a height of 19.7 Å. The PorARc pore has an overall funnel shape with an entry width of 49 Å which narrows to 41.5 Å at the bottom of the cap region (A) and narrows further to 39.8 Å at the transmembrane beta barrel region (B). The constriction region (C) has a sharp narrowing to 27.4 Å and then widens to 36.1 Å at the base of the pore structure.
Figure 13: Representative ionic current (pA) versus time (s) traces as single stranded DNA translocates through a CsgG chimeric nanopore (CsgG-Eco-Vfu in Table 10) in Example 3. A-D are the same as in Figure 9.
Figure 14: SNR, current range (pA) and noise (pA) for the CsgG chimeras tested. The results are shown from left to right in the order in which the chimeras appear in Tables 7 and 12. The data for the first nine chimeras from left to right is for the chimeras in Example 2 (and these data are identical to the data in Figure 10). The data for the final three chimeras left to right is for the chimeras in Example 3. The comparative line relates to CsgG wild type pore from E. coli (CsgG-Eco-WT).
DESCRIPTION OF THE SEQUENCE LISTING
SEQ ID NO: 1 shows the amino acid sequence of PorARc from Rhodococcus corynebacteroides (PorARc_Rco) with the substitutions E78R/D82S/E116T/E125A/D165S.
SEQ ID NO: 2 shows the amino acid sequence of PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N.
Table 1 - Description of SEQ ID NOs: 3-49
[Table 1 appears as images in the original publication: Figures imgf000008_0001 to imgf000012_0001]
SEQ ID NO: 50 shows the amino acid sequence of PorARc pore from Mycobacterium sp. (PorARc_Msp) with the substitutions D91N/D92N.
SEQ ID NO: 51 shows the amino acid sequence of PorARc pore from Mycolicibacterium rhodesiae (PorARc_Mrh) with the substitutions D91N/D92N.
SEQ ID NO: 52 shows the amino acid sequence of PorARc pore from Mycolicibacterium elephantis (PorARc_Mel) with the substitutions D91N/E101Q.
SEQ ID NO: 53 shows the amino acid sequence of PorARc pore from Mycolicibacterium cosmeticum (PorARc_Mco) with the substitutions D91N/D92N.
SEQ ID NO: 54 shows the amino acid sequence of PorARc pore from unclassified Rhodococcus (WP_056447532.1; PorARc_Rsp) with the substitutions E89Q/D91N/D93N/D100N.
SEQ ID NO: 55 shows the amino acid sequence of PorARc pore from Rhodococcus sp PSBB049 (WP_206003768.1; PorARc_Rsp) with the substitutions D90N/D95N/D103N.
SEQ ID NOs: 56-64 and 73-75 show the amino acid sequences of CsgG pores in Table 4 below. The signal peptide in each sequence is underlined. Any discussion below of specific position numbering in SEQ ID NO: 56 (e.g., Q100) or any of SEQ ID NOs: 57-64 and 73-75 excludes the signal peptide.
SEQ ID NOs: 65-72 show the amino acid sequences of the CsgG constriction chimeras in Table 7 below. The signal peptide in each sequence is underlined.
SEQ ID NOs: 76-78 show the amino acid sequences of the CsgG constriction chimeras in Table 12 below. The signal peptide in each sequence is underlined.
DETAILED DESCRIPTION
All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety. All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the invention contained in the specification, the specification is intended to supersede and/or take precedence over any such contradictory material.
The present invention will be described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto but only by the claims. Any reference signs in the claims shall not be construed as limiting the scope. Of course, it is to be understood that not necessarily all aspects or advantages may be achieved in accordance with any particular embodiment of the invention. Thus, for example those skilled in the art will recognize that the invention may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other aspects or advantages as may be taught or suggested herein.
In addition, as used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a polynucleotide" includes two or more polynucleotides, reference to "a polynucleotide binding protein" includes two or more such proteins, reference to "a helicase" includes two or more helicases, reference to "a monomer" includes two or more monomers, reference to "a pore" includes two or more pores and the like.
In all instances here, the terms "comprises" or "comprising" cover and can be replaced by "consists of" or "consisting of". In all of the discussion herein, the standard one letter codes for amino acids are used. These are as follows: alanine (A), arginine (R), asparagine (N), aspartic acid (D), cysteine (C), glutamic acid (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y) and valine (V). Standard substitution notation is also used, i.e., D91N means that D at position 91 is replaced with N.
In the paragraphs herein where different amino acids at a specific position are separated by the / symbol, the / symbol means "or". For instance, D91N/Q means D91N or D91Q. In the paragraphs herein where different positions are separated by the / symbol, the / symbol means "and" such that D91/D92 is D91 and D92.
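The substitution notation described above can be illustrated with a short parsing sketch. This is a minimal, non-authoritative example: the function names `parse_substitutions` and `apply_substitutions` and the toy sequence are hypothetical and do not appear in the specification. It assumes that a token with a position number starts a new position ("and") and that a bare one-letter token is a further alternative for the preceding position ("or").

```python
import re

# Standard one-letter amino acid codes listed in the specification.
AMINO_ACIDS = set("ARNDCEQGHILKMFPSTWYV")
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z]?)$")


def parse_substitutions(spec):
    """Parse e.g. 'D91N/Q' or 'D90N/D95N/D103N'.

    Returns a list of (wild_type, position, [alternative residues]).
    Hypothetical helper: a token carrying a position opens a new entry,
    a bare letter adds an alternative to the previous entry.
    """
    parsed = []
    for token in spec.split("/"):
        token = token.strip()
        match = _TOKEN.match(token)
        if match:
            wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
            if wild_type not in AMINO_ACIDS or (new and new not in AMINO_ACIDS):
                raise ValueError(f"unknown amino acid code in {token!r}")
            parsed.append((wild_type, position, [new] if new else []))
        elif len(token) == 1 and token in AMINO_ACIDS and parsed:
            parsed[-1][2].append(token)
        else:
            raise ValueError(f"cannot parse token {token!r}")
    return parsed


def apply_substitutions(sequence, spec):
    """Apply unambiguous substitutions (one replacement per position).

    Positions are 1-indexed; the wild-type residue is checked first.
    """
    residues = list(sequence)
    for wild_type, position, alternatives in parse_substitutions(spec):
        if len(alternatives) != 1:
            raise ValueError(f"{wild_type}{position} needs exactly one replacement")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = alternatives[0]
    return "".join(residues)
```

Under these assumptions, `parse_substitutions("D91N/Q")` yields one position with the two alternatives N and Q, whereas `parse_substitutions("D90N/D95N/D103N")` yields three positions that are all substituted.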
A "pore" in the context of the invention is a transmembrane protein structure defining a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other. The translocation of ionic species through the pore may be driven by an electrical potential difference applied to either side of the pore. A "nanopore" is a biological pore in which the minimum diameter of the channel through which molecules or ions pass is in the order of nanometres (10⁻⁹ metres). In some embodiments, the pore can be a transmembrane protein pore. The transmembrane protein structure of a biological pore may be monomeric or oligomeric in nature. Typically, the pore comprises a plurality of polypeptide monomers or subunits arranged around a central axis thereby forming a protein-lined channel that extends substantially perpendicular to the membrane in which the pore resides. The number of polypeptide monomers or subunits is not limited. Typically, the number of monomers or subunits is from 5 to 30, suitably the number of monomers or subunits is from 6 to 10. The portions of the protein monomers or subunits within the pore that form the protein-lined channel typically comprise secondary structural motifs that may include one or more trans-membrane β-barrel and/or α-helix sections.
The chimeric pore monomers of the invention are formed from at least two different pores, i.e. from at least two monomers from at least two different pores. This means the chimeric pore monomers are formed from pores or pore monomers that in their natural state are/form a pore as defined above. Different pores are defined below.
The general definitions in WO 2019/002893 are incorporated by reference herein in their entirety.
Chimeric pore monomers
The invention provides a chimeric pore monomer. The chimeric pore monomer is typically a protein or polypeptide. The chimeric pore monomer is capable of forming a pore. This can be measured using routine methods, including any of those described in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety) and in the Examples.
The chimeric pore monomer comprises two or more regions. The chimeric pore monomer may comprise any number of regions, such as three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more or ten or more regions. The chimeric pore monomer may comprise three, four, five, six, seven, eight, nine or ten regions. The chimeric pore monomer preferably comprises two regions. The chimeric pore monomer preferably comprises three regions. The chimeric pore monomer preferably comprises five regions.
Specific regions, such as the cap region or constriction region, in a pore may be identified using standard methods. Regions in proteins of unknown structure can be defined by aligning the protein sequence with a homologous protein of known structure. If there is sufficiently high sequence identity or similarity between the sequences, there is confidence in the region boundaries. It is possible to add further confidence in a region boundary through use of tools that predict tertiary structure (e.g., AlphaFold) or secondary structural elements (e.g., PSIPRED). For the latter, if both the protein of unknown structure and the protein of known structure are flanked by the same predicted secondary structural element, then there is more confidence of the region boundary.
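The alignment-based approach to defining region boundaries described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `map_region` is hypothetical, the two sequences are assumed to have already been aligned by any standard pairwise alignment tool (gaps written as "-"), and positions are 1-indexed on the ungapped sequences.

```python
def map_region(aligned_reference, aligned_query, ref_start, ref_end):
    """Transfer a region boundary from a reference of known structure.

    aligned_reference / aligned_query: equal-length gapped strings from a
    pairwise alignment (assumed to be computed elsewhere).
    ref_start / ref_end: 1-indexed, inclusive region boundaries on the
    ungapped reference sequence.
    Returns (query_start, query_end) on the ungapped query sequence, or
    None if no query residue aligns to the reference region.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = 0
    query_pos = 0
    mapped = []
    for ref_char, query_char in zip(aligned_reference, aligned_query):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        # Keep only columns where both sequences have a residue and the
        # reference residue lies inside the region of interest.
        if ref_char != "-" and query_char != "-" and ref_start <= ref_pos <= ref_end:
            mapped.append(query_pos)
    return (mapped[0], mapped[-1]) if mapped else None
```

The mapped boundaries are only as reliable as the alignment itself, which is why the text above suggests cross-checking them against predicted secondary or tertiary structure.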
The at least two regions preferably comprise a cap region and a constriction region. The cap region typically forms the structural core of the protein pore and may partially reside in a membrane. When the cap region forms the structural core and partially resides in a membrane, it is also known as a scaffold. The constriction region typically comprises at least one constriction. The "constriction" refers to an aperture defined by a luminal surface of a pore, which acts to allow the passage of ions and target molecules (e.g., but not limited to polynucleotides, polypeptides, or individual nucleotides) but not other non-target molecules through the pore channel. As explained in more detail below, pores formed from the chimeric pore monomers of the invention may comprise two or more constrictions. The constriction(s) are typically the narrowest aperture(s) within a pore or within the channel defined by the pore. The constriction(s) may serve to limit the passage of molecules through the pore. The size of the constriction is typically a key factor in determining suitability of a pore for analyte characterisation. If the constriction is too small, the molecule to be characterised will not be able to pass through. However, to achieve a maximal effect on ion flow through the channel, the constriction should not be too large. For example, the constriction should not be wider than the solvent-accessible transverse diameter of a target analyte. Ideally, any constriction should be as close as possible in diameter to the transverse diameter of the analyte passing through. The narrowest point in the constriction region preferably forms a constriction at least 5 Å in diameter, such as at least about 10 Å, at least about 15 Å, at least about 18 Å, at least about 20 Å, at least about 25 Å or at least about 27 Å in diameter.
The constriction region may comprise any number of constrictions, such as at least two, at least three, at least four, or at least five constrictions. The constriction region typically resides in a membrane. The constriction is preferably transmembrane. The skilled person is capable of identifying cap regions (or scaffolds) and constriction regions in pores. Specific examples of cap regions and constriction regions are provided below.
The cap region (or scaffold) in the chimeric pore monomer may be longer, shorter or the same length as the cap region (or scaffold) in the pore which the constriction region in the chimeric pore monomer is from or derived from. The constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore which the cap region (or scaffold) in the chimeric pore monomer is from or derived from. The constriction region in the chimeric pore monomer is preferably shorter than the constriction region in the pore which the cap region (or scaffold) in the chimeric pore monomer is from or derived from. Length may be measured in terms of amino acid number and/or length along the sagittal (or longitudinal) plane of the pore.
The at least two regions preferably comprise a cap region, a constriction region, and a transmembrane region. The transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region. The transmembrane region is preferably a transmembrane beta barrel region. CsgG pores typically comprise these three regions and they are discussed in more detail below in relation to CsgG pores. In these embodiments with a cap region, a constriction region, and a transmembrane region, the cap region and transmembrane region together is also known as a scaffold.
The cap region may further comprise two subregions, namely a landing platform region and a carboxy-terminal (C-terminal) region. The at least two regions preferably comprise a cap region, a landing platform region, a C-terminal region, a constriction region, and a transmembrane region. The transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region. The transmembrane region is preferably a transmembrane beta barrel region. CsgG pores typically comprise these five regions and they are discussed in more detail below in relation to CsgG pores. In these embodiments, the cap region, the landing platform region, the C-terminal region and transmembrane region together is also known as a scaffold.
The cap region in the chimeric pore monomer may be longer, shorter or the same length as the cap region in the pore(s) which the constriction region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from. The constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from. The transmembrane region in the chimeric pore monomer may be longer, shorter or the same length as the transmembrane region in the pore(s) which the cap region and/or the constriction region in the chimeric pore monomer is/are from or derived from. The constriction region in the chimeric pore monomer may be shorter than the constriction region in the pore(s) which the cap region and/or the transmembrane region in the chimeric pore monomer is/are from or derived from. Length may be measured in terms of amino acid number and/or length along the sagittal (or longitudinal) plane of the pore.
The chimeric pore monomer preferably comprises two regions. The two regions are preferably a cap region (or scaffold) and a constriction region. In such instances, the cap region (or scaffold) and the constriction region are typically from different pores. An example of this is shown in Figure 1. The constriction transplants and cap transplants (the latter also known as scaffold transplants) tested in Example 1 are also examples of chimeric pore monomers formed from two different pores. In the context of the invention, a transplant indicates when a chimeric pore monomer is created by transplanting all or part of a region from one pore monomer into the monomer from a different pore. For instance, a constriction transplant involves transplanting all or part of the constriction from a pore A monomer into a pore B monomer.
The chimeric pore monomer preferably comprises three regions. The three regions are preferably a cap region, a constriction region, and a transmembrane region. As explained above, the cap region and transmembrane region together may also be known as a scaffold. In such instances, the constriction region may be from one pore and the cap region and the transmembrane region (together also known as the scaffold) may be from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores. These are also constriction transplants, such as the chimeric pores produced in Example 2. Alternatively, the cap region, the constriction region, and the transmembrane region may each be from a different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores. The transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region. The transmembrane region is preferably a transmembrane beta barrel region. Such regions are found in CsgG pores.
The chimeric pore monomer preferably comprises five regions. The five regions are preferably a cap region, a landing platform region, a C-terminal region, a constriction region, and a transmembrane region. As explained above, the cap region, the landing platform region, the C-terminal region, and transmembrane region together may also be known as a scaffold. In such instances, the constriction region may be from one pore and the cap region, the landing platform region, the C-terminal region, and the transmembrane region (together also known as the scaffold) may be from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores. These are also constriction transplants, such as the chimeric pores produced in Example 2. Alternatively, the cap region, the landing platform region, the C-terminal region, the constriction region, and the transmembrane region may each be from a different pore, i.e., the chimeric pore monomer is formed from/derived from five different pores. The transmembrane region may be a transmembrane beta barrel region or a transmembrane alpha helical region. The transmembrane region is preferably a transmembrane beta barrel region. Such regions are found in CsgG pores.
The constriction region in the chimeric pore monomer may be formed from the constriction regions of the two different pores. In other words, the constriction region may be a hybrid constriction region. For instance, part of the constriction region in one pore (e.g., pore A) may be replaced with a part or the corresponding part from the constriction region of the different pore (e.g., pore B). In this example, the chimeric pore monomer comprises a cap region (or scaffold) from pore A and a constriction region from pores A and B. In this sense, the two regions are from two different pores. Any amount or part of the constriction region in one pore may be replaced with any amount or part from the constriction region of the different pore. For instance, at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99% of the constriction region in one pore may be replaced with at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99% of the constriction region from the different pore. At least about 5 amino acids, such as at least about 10, 15, 20, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105 or 110 amino acids, in the constriction region in one pore may be replaced with at least about 5 amino acids, such as at least about 10, 15, 20, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105 or 110 amino acids, from the constriction region of the different pore. Specific examples of this are discussed below with reference to CsgG chimeras. 
The whole constriction region of one pore may be replaced with the whole constriction region of a different pore. 100% of the constriction region of one pore may be replaced with 100% of the constriction region of a different pore. All of the amino acids in the constriction region of one pore may be replaced with all of the amino acids in the constriction region of a different pore.
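Partial and whole replacement of a constriction region, as described above, can be sketched at the sequence level as follows. This is a minimal illustration only: the function name `hybrid_constriction`, the span convention (1-indexed, inclusive) and the toy sequences in the usage note are assumptions made for the example and are not taken from the specification.

```python
def _check_span(name, sequence, span):
    """Validate a 1-indexed, inclusive (start, end) span on a sequence."""
    start, end = span
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(f"{name} span {span} is outside the sequence")


def hybrid_constriction(host_constriction, donor_constriction, host_span, donor_span):
    """Replace part (or all) of a host constriction with donor residues.

    host_span: residues of the host constriction (pore A) to remove.
    donor_span: residues of the donor constriction (pore B) to insert.
    Returns (hybrid_sequence, percent_of_host_replaced, percent_of_donor_used).
    """
    _check_span("host", host_constriction, host_span)
    _check_span("donor", donor_constriction, donor_span)
    h_start, h_end = host_span
    d_start, d_end = donor_span
    hybrid = (
        host_constriction[: h_start - 1]
        + donor_constriction[d_start - 1 : d_end]
        + host_constriction[h_end:]
    )
    percent_host_replaced = 100.0 * (h_end - h_start + 1) / len(host_constriction)
    percent_donor_used = 100.0 * (d_end - d_start + 1) / len(donor_constriction)
    return hybrid, percent_host_replaced, percent_donor_used
```

For a toy host constriction `"AAAAAAAAAA"` and donor constriction `"WWWWWWWW"`, replacing host residues 3 to 7 with donor residues 1 to 4 gives `"AAWWWWAAA"`, with 50% of the host replaced by 50% of the donor. Passing the full spans of both sequences reproduces the whole-region replacement, i.e. 100% replaced with 100%.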
At least two of the two or more regions are from, preferably derived from, at least two different pores. The pore may comprise any number of regions from any number of different pores as long as at least two of the regions are from or derived from two different pores. The chimeric pore monomer may comprise two or more different regions from at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine or at least ten different pores. The number of two or more regions is typically the same as the number of at least two different pores with each region being from or derived from a different pore. The chimeric pore monomer preferably comprises two regions from or derived from two different pores. The chimeric pore monomer preferably comprises three regions from or derived from three different pores. The chimeric pore monomer preferably comprises five regions from or derived from five different pores.
The regions from at least two different pores (or from two different pores, from three different pores or five different pores) are typically attached, preferably covalently attached, to each other to form the chimeric pore monomer. The regions may be attached directly or may be attached using one or more linkers, preferably one or more peptide linkers. Suitable linkers are discussed below with reference to the constructs of the invention. The chimeric pore monomer of the invention is typically created by genetic fusion of the two or more regions. A polynucleotide encoding the chimeric pore monomer can be designed using the sequences of the at least two different pores. The polynucleotide can then be used to express the chimeric pore monomer as a genetic fusion. This is discussed in more detail below.
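The design of a fused chimeric sequence from regions of different pores can be sketched as a simple concatenation. This is a hedged sketch rather than a description of the actual design procedure: the function name `assemble_chimera`, the tuple layout and the toy sequences are hypothetical, and the sketch assumes for simplicity that each region is a single contiguous stretch listed in N- to C-terminal order, which need not hold for real pores.

```python
def assemble_chimera(regions, linker=""):
    """Concatenate regions into one chimeric amino acid sequence.

    regions: ordered list of (region_name, source_pore, sequence) tuples,
             listed from N-terminus to C-terminus (an assumption of this sketch).
    linker:  optional peptide linker inserted between adjacent regions;
             an empty string means the regions are attached directly.
    """
    if len(regions) < 2:
        raise ValueError("a chimeric pore monomer needs two or more regions")
    source_pores = {source for _, source, _ in regions}
    if len(source_pores) < 2:
        raise ValueError("regions must come from at least two different pores")
    return linker.join(sequence for _, _, sequence in regions)
```

For example, a cap segment `"MKV"` and a transmembrane segment `"LLG"` from a hypothetical pore A, joined around a constriction segment `"WYF"` from a hypothetical pore B, give `"MKVWYFLLG"` when attached directly and `"MKVGSWYFGSLLG"` with a `"GS"` linker. The resulting protein sequence would then be back-translated into a polynucleotide for expression, a step this sketch does not cover.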
A region is "from" or "derived from" a pore if it shares significant homology/identity with a region from the pore. The region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore. The region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having 100% homology to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore. The region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore. The region, preferably the cap region (or scaffold) or the constriction region, preferably comprises a sequence having 100% identity to the sequence of the corresponding region, preferably the cap region (or scaffold) or the constriction region, from the pore. Homology and/or identity is typically measured over the entire length of the region.
The cap region, the constriction region or the transmembrane region preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence of the cap region, the constriction region or the transmembrane region from the pore. The cap region, the constriction region or the transmembrane region preferably comprises a sequence having 100% homology to the sequence of the cap region, the constriction region, or the transmembrane region from the pore. The cap region, the constriction region or the transmembrane region preferably comprises a sequence having at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence of the cap region, the constriction region or the transmembrane region from the pore. The cap region, the constriction region or the transmembrane region preferably comprises a sequence having 100% identity to the sequence of the cap region, the constriction region, or the transmembrane region from the pore. Homology and/or identity is typically measured over the entire length of the region.
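The identity thresholds above can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions: the function names `percent_identity` and `meets_identity_threshold` are hypothetical, the two region sequences are assumed to be already aligned (gaps written as "-"), and identity is taken as the number of identical columns divided by the total number of alignment columns. Other definitions of percent identity exist and may give different values.

```python
def percent_identity(aligned_region, aligned_reference):
    """Percent identity over the full length of an aligned region pair.

    Both inputs are equal-length gapped strings. Assumption of this sketch:
    the denominator is the number of alignment columns, so a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_region) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_region:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_region, aligned_reference)
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_region)


def meets_identity_threshold(aligned_region, aligned_reference, threshold=40.0):
    """True if the region is at least `threshold` percent identical."""
    return percent_identity(aligned_region, aligned_reference) >= threshold
```

With this definition, two six-residue regions differing at one position are about 83.3% identical, which satisfies the "at least about 80%" level but not the "at least about 85%" level.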
The chimeric pore monomer comprises at least two regions from or derived from at least two different pores. The at least two different pores are typically at least two different pores that appear in nature. The at least two different pores are typically at least two different wild-type or naturally occurring pores. The at least two different pores are preferably different before any artificial or synthetic modifications, such as additions, deletions and/or substitutions, are made to them. The at least two different pores are preferably different before any modifications, such as additions, deletions and/or substitutions, are made to their wild-type or naturally occurring sequences. However, as explained below, one or more of the at least two regions or two regions from or derived from the at least two different pores or two different pores may be modified when constructing the chimeric pore monomer in accordance with the invention. In particular, one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. Specific modifications are discussed in more detail below. The at least two different pores are preferably homologues, for example structural homologues. A structural homologue refers to a protein or molecule that shares a similar three-dimensional structure with another protein or molecule. This can be determined using standard methods in the art (e.g., AlphaFold or PSIPRED). Structural homologues typically have similar sequences. Structural homologues are normally identified in similar species. For instance, the at least two different pores may be at least two PorARc pores selected from Table 2 below. For instance, the at least two different pores may be at least two CsgG pores selected from Table 4 below. 
These show structural homologues amongst different species.
Each region in the chimeric pore monomer typically shares low homology or identity with the corresponding region in the different pore(s) which the other region(s) in the chimeric pore monomer are from or derived from. In the context of the invention, a region corresponds to a region in a different pore when they share a similar structure and/or function. A region may correspond to a region in a different pore when they share a similar sequence. This can be determined as discussed above. The cap region (or scaffold) in one pore corresponds to the cap region (or scaffold) in a different pore. The cap region in one pore corresponds to the cap region in a different pore. The constriction region in one pore corresponds to the constriction region in a different pore. The transmembrane region in one pore corresponds to the transmembrane region in a different pore. The transmembrane beta barrel region in one pore corresponds to the transmembrane beta barrel region in a different pore.
Each region in the chimeric pore preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the corresponding region in the different pore(s) which the other region(s) in the chimeric pore monomer are from or derived from. Each region in the chimeric pore preferably comprises a sequence that is about 99% or less, about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the corresponding region in the different pore(s) which the other region(s) in the chimeric pore are from or derived from. Each region preferably comprises the cap region (or scaffold) and the constriction region. Each region preferably comprises the cap region, the constriction region, or the transmembrane region. Homology and/or identity is typically measured over the entire length of the region.
In some embodiments, the chimeric pore monomer comprises a cap region (or scaffold) and a constriction region from two different pores. In some embodiments, the chimeric pore monomer comprises a constriction region formed from two different pores and a cap region (or scaffold) from one of the different pores, i.e., the chimeric pore monomer is formed from/derived from two different pores. The cap region (or scaffold) in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the cap region (or scaffold) in the pore which the constriction region is from or derived from. The cap region (or scaffold) in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the cap region (or scaffold) in the pore which the constriction region is from or derived from. The constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the constriction region in the pore which the cap region (or scaffold) is from or derived from. The constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the constriction region in the pore which the cap region (or scaffold) is from or derived from. 
Homology and/or identity is typically measured over the entire length of the region.
In some embodiments, the chimeric pore monomer comprises a constriction region from one pore and a cap region and a transmembrane region from a different pore, i.e., the chimeric pore monomer is formed from/derived from two different pores. In some embodiments, the chimeric pore monomer comprises a constriction region formed from two different pores and a cap region and a transmembrane region from one of the different pores, i.e., the chimeric pore monomer is formed from/derived from two different pores. The constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence of the constriction region in the pore which the cap region and transmembrane region are from or derived from. The constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 94% or less, about 90% or less, about 85% or less, about 82% or less, about 80% or less, about 75% or less, about 72% or less, about 70% or less, about 69% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the constriction region in the pore which the cap region and transmembrane region are from or derived from. The cap region and/or the transmembrane region in the chimeric pore monomer preferably comprise(s) a sequence/sequences that is/are about 99% or less homologous or identical to the sequence(s) of the cap region and/or the transmembrane region in the pore which the constriction region is from or derived from. 
The cap region and/or the transmembrane region in the chimeric pore monomer more preferably comprise(s) a sequence/sequences that is/are about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the cap region and/or the transmembrane region in the pore which the constriction region is from or derived from. Homology and/or identity is typically measured over the entire length of the region.
In some embodiments, the chimeric pore monomer comprises a constriction region, a cap region, and a transmembrane region each from a different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores. In some embodiments, the chimeric pore monomer comprises a constriction region formed from two different pores, a transmembrane region from one of the two different pores and a cap region from a third different pore, i.e., the chimeric pore monomer is formed from/derived from three different pores. In some embodiments, the chimeric pore monomer comprises a constriction region, a cap region, a landing platform region, a C-terminal region, and a transmembrane region each from a different pore, i.e., the chimeric pore monomer is formed from/derived from five different pores. In some embodiments, the chimeric pore monomer comprises a constriction region formed from two different pores, a transmembrane region from one of the two different pores and a cap region, a landing platform region and a C-terminal region from three different pores, i.e., the chimeric pore monomer is formed from/derived from five different pores. The constriction region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the constriction region in the pore(s) which the cap region and/or transmembrane region is/are from or derived from. 
The constriction region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 94% or less, about 90% or less, about 85% or less, about 82% or less, about 80% or less, about 75% or less, about 72% or less, about 70% or less, about 69% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the constriction region in the pore(s) which the cap region and/or transmembrane region is/are from or derived from. The cap region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the cap region in the pore(s) which the constriction region and/or transmembrane region is/are from or derived from. The cap region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence(s) of the cap region in the pore(s) which the constriction region and/or the transmembrane region is/are from or derived from. The transmembrane region in the chimeric pore monomer preferably comprises a sequence that is about 99% or less homologous or identical to the sequence(s) of the transmembrane region in the pore(s) which the cap region and/or constriction region is/are from or derived from. 
The transmembrane region in the chimeric pore monomer more preferably comprises a sequence that is about 98% or less, about 97% or less, about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the sequence of the transmembrane region in the pore(s) which the cap region and/or constriction region is/are from or derived from. Homology and/or identity is typically measured over the entire length of the region.
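The percentage thresholds above can be illustrated with a minimal sketch. It assumes the two region sequences have already been aligned to equal length (for example with one of the standard alignment tools), uses "-" as a hypothetical gap character, and takes the full alignment length as the denominator; the function names are illustrative only and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the entire length of two pre-aligned region sequences.

    Assumption: both inputs come from the same pairwise alignment and therefore
    have equal length; gap columns count towards the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Number of aligned columns at which the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)
```

Under these assumptions, a region that differs from its counterpart at one position out of ten would be 90% identical and would therefore satisfy, for instance, the "about 95% or less" threshold.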
The constriction region in the chimeric pore monomer preferably comprises at least about 1 amino acid difference, such as at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, or 114 amino acid differences, compared with the constriction region(s) in the pore from which the cap region (or scaffold) is derived or in the pore from which the cap region and transmembrane region are derived.
If the constriction region in the chimeric pore monomer is formed from the constriction regions of two different pores (i.e., is a hybrid constriction region), it preferably comprises at least about 1 amino acid difference, such as at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, or 114 amino acid differences, compared with the constriction regions in the two different pores.
The entire or complete chimeric pore monomer preferably comprises a sequence that is about 96% or less homologous or identical to the wild type monomer sequences of the at least two different pores. In other words, the chimeric pore monomer preferably comprises a sequence that is about 96% or less homologous or identical to the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived. The chimeric pore monomer more preferably comprises a sequence that is about 95% or less, about 90% or less, about 85% or less, about 80% or less, about 75% or less, about 70% or less, about 65% or less, about 60% or less, about 55% or less or more preferably about 50% or less, about 45% or less or about 40% or less homologous or identical to the wild type monomer sequences of the at least two different pores or the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The entire or complete chimeric pore monomer preferably comprises a sequence that is about 99.7% or less homologous or identical to the wild type monomer sequences of the at least two different pores. This especially applies to chimeric pore monomers formed from a CsgG pore, two different CsgG pores, such as the constriction transplants described in Example 2, or three different CsgG pores. This also applies to chimeric pore monomers formed from five different CsgG pores. In other words, the chimeric pore monomer preferably comprises a sequence that is about 99.7% or less homologous or identical to the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived. The chimeric pore monomer more preferably comprises a sequence that is about 99.6% or less, about 99.5% or less, about 99.4% or less, about 99.3% or less, about 99.2% or less, about 99.1% or less, about 99.0% or less, about 98.9% or less, or about 98.7% or less homologous or identical to the wild type monomer sequences of the at least two different pores or the wild type monomer sequences of the different pores from which the chimeric pore monomer is derived. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
Standard methods in the art may be used to determine homology or identity. For example, the UWGCG Package provides the BESTFIT program which can be used to calculate homology or identity, for example used on its default settings (Devereux et al (1984) Nucleic Acids Research 12, p387-395). The PILEUP and BLAST algorithms can be used to calculate homology and identity or line up sequences (such as identifying equivalent residues or corresponding sequences (typically on their default settings)), for example as described in Altschul S. F. (1993) J Mol Evol 36:290-300; Altschul, S.F. et al (1990) J Mol Biol 215:403-10. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/).
The term "pore" is well known to a skilled person and typically refers to a biological pore, i.e., a typical protein structure which defines a channel. In the context of defining the chimeric pore monomers of the invention, the term "pore" typically relates to a structure, preferably a protein structure, that in its native state associates with or crosses a membrane, such as a cell membrane. Other ring-like or channel-like structures are preferably not included. The at least two different pores or the two different pores preferably do not comprise a proteasome. The at least two different pores or the two different pores preferably do not comprise mouse proteasome activator 28α (also called REG or 11S activators).
The at least two different pores or the two different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, and Clostridium perfringens beta toxin.
The at least two different pores or the two different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, Clostridium perfringens beta toxin, parasporin-2, epsilon toxin, lectin from the parasitic mushroom Laetiporus sulphureus (LSL), volvatoxin, Cry toxins, Cyt1Aa and Cyt2Aa.
The at least three different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, and Clostridium perfringens beta toxin. The five different pores may be selected from any of these pores.
The at least three different pores are preferably selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, Clostridium perfringens beta toxin, parasporin-2, epsilon toxin, lectin from the parasitic mushroom Laetiporus sulphureus (LSL), volvatoxin, Cry toxins, Cyt1Aa and Cyt2Aa. The five different pores may be selected from any of these pores. The pore may be from any species. When particular pores can be found in multiple species, the different pores may be selected from any of the pores from any of the species. For instance, the two different pores may be two different CsgG pores. For instance, the three different pores may be three different CsgG pores.
The at least two different pores may be two different PorARc pores or three different PorARc pores. The at least two different pores may be five different PorARc pores. The PorARc pore or two different PorARc pores preferably comprise(s) a cap region (or scaffold) (e.g., C in Figure 12) and a constriction region (e.g., D in Figure 12). The PorARc pore, two different PorARc pores or three different PorARc pores preferably comprise(s) one or more of (a) a cap region (e.g., A in Figure 12), (b) a constriction region (e.g., D in Figure 12), and (c) a transmembrane beta barrel region (e.g., B in Figure 12), such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The chimeric pore monomer preferably comprises one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The chimeric pore monomer preferably comprises (a)-(c). The chimeric pore monomer preferably comprises (a) and (c) from one PorARc pore and (b) from a different PorARc pore. The chimeric pore monomer preferably comprises (a), (b) and (c) each from a different PorARc pore or (a), (b) and (c) from three different PorARc pores. The PorARc pore, two different PorARc pores or three different PorARc pores may have any structure but preferably has/have or comprise(s) the structure of the wild-type PorARc pore (Figure 12). The protein structure of PorARc defines a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.
The PorARc pore, two different PorARc pores or three different PorARc pores may be any size but preferably has/have the dimensions of the wild-type PorARc_Rco (Figure 12). The PorARc pore or at least two different PorARc pores preferably has/have an external diameter of from about 70 to about 110 Å at its widest point, such as from about 80 to about 100 Å or from about 85 to about 95 Å at its widest point. The PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have an external diameter of about 90.7 Å at its widest point. The PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have a total length of from about 70 to about 110 Å, such as from about 80 to about 100 Å or from about 85 to about 95 Å. The PorARc pore, two different PorARc pores or three different PorARc pores preferably has/have a total length of about 90.4 Å. References to "total length" and "length" relate to the length of the pore or pore region when viewed from the side (see, e.g., the side view in Figure 12).
The cap region (A in Figure 12) preferably has a length of from about 25 to about 65 Å, such as from about 35 to about 55 Å or from about 40 to about 50 Å. The cap region preferably has a length of about 44.7 Å. The channel defined by the cap region preferably has an opening of from about 30 to about 70 Å in diameter, such as from about 40 to about 60 Å or from about 45 to about 55 Å in diameter. The channel defined by the cap region preferably has an opening of about 49 Å in diameter. The channel defined by the cap region is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point. The channel defined by the cap region is preferably about 41.5 Å in diameter at its narrowest point.
The transmembrane beta barrel region (B in Figure 12) preferably has a length of from about 5 to about 45 Å, such as from about 15 to about 35 Å or from about 20 to about 30 Å. The transmembrane beta barrel preferably has a length of about 26.2 Å. The channel defined by the transmembrane beta barrel region is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point. The channel defined by the transmembrane beta barrel region is preferably about 39.8 Å in diameter at its narrowest point.
The cap region (or scaffold) (C in Figure 12 and formed from A and B) preferably has a length of from about 55 to about 95 Å, such as from about 65 to about 85 Å or from about 70 to about 80 Å. The cap region (or scaffold) preferably has a length of about 73.6 Å. The channel defined by the cap region (or scaffold) (C) is preferably from about 20 to about 60 Å in diameter at its narrowest point, such as from about 30 to about 50 Å or from about 35 to about 45 Å in diameter at its narrowest point. The channel defined by the cap region (or scaffold) (C) is preferably about 39.8 Å in diameter at its narrowest point.
The constriction region (D in Figure 12) preferably has a length of from about 5 to about 40 Å, such as from about 10 to about 30 Å or from about 15 to about 25 Å. The constriction region preferably has a length of about 19.7 Å. The channel defined by the constriction region is preferably from about 10 to about 50 Å in diameter at its narrowest point, such as from about 20 to about 40 Å, from about 22 to about 32 Å or from about 25 to about 35 Å in diameter at its narrowest point. The channel defined by the constriction region is preferably about 27.4 Å in diameter at its narrowest point. The channel defined by the constriction region is preferably from about 10 to about 50 Å in diameter at its narrowest point, such as from about 15 to about 55 Å, from about 25 to about 45 Å or from about 30 to about 40 Å in diameter at the base of the pore structure. The channel defined by the constriction region is preferably about 36.1 Å in diameter at the base of the pore structure. The constriction region is preferably from about 20 to about 60 Å in diameter, such as from about 30 to about 50 Å or from about 35 to about 45 Å. The constriction region is preferably about 41.9 Å in diameter. All of the measurements above are based on measuring from backbone to backbone of the amino acids forming the different regions (as shown in Figure 12).
The cap region (or scaffold) (e.g., C in Figure 12) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 67 to about 187 amino acids in length. The cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, or 187 amino acids in length. The cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 87 to about 167 amino acids in length. The cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, or 167 amino acids in length.
The cap region (or scaffold) in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 87, 159, 160, 166, or 167 amino acids in length.
The constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 5 to about 114 amino acids in length. The constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, or 114 amino acids in length. The constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably from about 15 to about 94 amino acids in length. The constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, or 94 amino acids in length. The constriction region in the PorARc pore, at least two different PorARc pores or at least three different PorARc pores is preferably about 15, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 39, 50, 52, 86, or 94 amino acids in length.
The PorARc pore is preferably selected from the pores in Table 2.
Table 2 - Different PorARc pores for use in the invention (aa = amino acids)
[Table 2 is reproduced as images in the published document: Figure imgf000030_0001 to Figure imgf000039_0001]
The (fourth) Reference column provides a reference to the wild-type (or naturally occurring) sequence of each pore on GenBank. The PorARc pore may be selected from any of the wild-type pores in Table 2 (i.e., from the references in the fourth column). The PorARc pore may be selected from any of the wild-type pores in Table 2 with the signal peptide removed and a methionine (M) at the N terminus (i.e., at position 1). The skilled person can determine these sequences from the references in the fourth column.
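The relationship between a wild-type sequence and the corresponding sequence lacking the signal peptide can be sketched as follows. This is a minimal illustration under stated assumptions: the signal peptide length is supplied by the user (it is not given here), the methionine is assumed to be placed in front of the first residue that follows the signal peptide rather than replacing it, and the function names and the toy sequence used with them are hypothetical.

```python
def mature_with_met(wild_type: str, signal_peptide_length: int) -> str:
    """Remove the signal peptide and place a methionine (M) at position 1.

    Assumption: the M is added in front of the first residue that follows
    the signal peptide; it does not replace that residue.
    """
    if not 0 <= signal_peptide_length < len(wild_type):
        raise ValueError("signal peptide length is out of range")
    return "M" + wild_type[signal_peptide_length:]


def mature_to_wild_type_position(mature_pos: int, signal_peptide_length: int) -> int:
    """Map a 1-based position in the processed sequence to the wild-type numbering."""
    if mature_pos < 2:
        raise ValueError("position 1 is the added methionine and has no wild-type equivalent")
    return signal_peptide_length + mature_pos - 1
```

For a toy wild-type sequence `MKKLLAAGTDE` with an assumed six-residue signal peptide, `mature_with_met` gives `MAGTDE`, and position 5 of that sequence (D) maps back to wild-type position 10.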
Preferred cap and constriction regions for each pore are shown in the fifth and sixth columns of Table 2. The residue numbering in the fifth and sixth columns corresponds to the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). The residue numbering of the cap and constriction regions will need to be adjusted for the wild-type (or naturally occurring) sequences in the references in the fourth column. The cap and constriction regions shown in the fifth and sixth columns of Table 2 are preferred cap and constriction regions. The invention covers additional cap and constriction regions which differ from those shown in the fifth and sixth columns by ± about 10 amino acids. For instance, in PorARc_Rco (pore 1):
1-86 includes 1-76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95 or 96;
107-180 includes 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116 or 117-170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189 or 190; and
87-106 includes 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96 or 97-96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116 or 117.
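The way a region boundary may shift by ± about 10 amino acids can be enumerated with a short sketch. It treats the tolerance as exactly 10 residues and clamps positions at residue 1, both of which are simplifying assumptions for illustration; the function name is hypothetical.

```python
def shifted_positions(position: int, tolerance: int = 10, min_pos: int = 1) -> list[int]:
    """List the residue positions within +/- tolerance of a region boundary.

    Assumptions: the tolerance is applied as an exact integer and positions
    below min_pos (residue 1) are excluded.
    """
    low = max(min_pos, position - tolerance)
    return list(range(low, position + tolerance + 1))


# Boundaries of the 107-180 region discussed above.
start_options = shifted_positions(107)  # 97 to 117
end_options = shifted_positions(180)    # 170 to 190
```

Any start option may be combined with any end option to give an alternative region, provided the start does not exceed the end.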
This equally applies to pores 2-161 in Table 2.
As explained in more detail below, one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. One or more negatively charged amino acids in any of the pores in Table 2 may be removed, for instance by deletion or substitution. One or more negatively charged amino acids in any of the pores in Table 2, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids and/or one or more uncharged amino acids. This removes negative charge from the sequence of the pore. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. The one or more negatively charged amino acids are preferably in the constriction region of the pore. The preferred constriction region for each pore is defined in the sixth column of Table 2. This applies to any of the sequences in Table 2, including the wild-type sequences and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). Preferred constriction substitutions for each pore are shown in the seventh column. The PorARc pore may be selected from any of the pores in Table 2 comprising substitution of the negatively charged amino acid, such as D or E, with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of the positions shown, preferably all of the positions shown. For instance, the PorARc pore may be pore 2 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 92.
The PorARc pore may be pore 3, pore 8 or pore 19 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 92. The PorARc pore may be pore 17 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 91 and/or position 101. The PorARc pore may be pore 25 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, positions 89, 91, 93 and 100. The PorARc pore may be pore 27 with substitution of D with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, positions 90, 95 and 103. This applies to any of the sequences in Table 2, including the wild-type sequences and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). The residue numbering in the seventh column corresponds to the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1).
The PorARc pore may be selected from any of the pores in Table 2 comprising one or more of, or all of, the substitutions shown in the seventh column. For instance, the PorARc pore may be pore 2 in Table 2 with D91N and/or D92N. The PorARc pore may be pore 3, pore 8 or pore 19 with D91N and/or D92N. The PorARc pore may be pore 17 with D91N and/or E101Q. The PorARc pore may be pore 25 with one or more of, preferably all of, E89Q, D91N, D93N and D100N. The PorARc pore may be pore 27 with one or more of, preferably all of, D90N, D95N and D103N. This applies to any of the sequences in Table 2, including the wild-type sequences and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). The residue numbering in the seventh column corresponds to the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). The PorARc pore preferably comprises or consists of (a) the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55 or (b) a sequence having at least about 20% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. Such sequences are discussed in more detail below.
The PorARc pore may be pore 1 in Table 2 with one or more cap substitutions. For instance, the PorARc pore is preferably pore 1 in Table 2 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, preferably all of, positions 78, 82, 116, 125 and 165. The PorARc pore is preferably pore 1 in Table 2 with one or more of, preferably all of, E78R, D82S, E116T, E125A and D165S. This applies to any of the pore 1 sequences in Table 2, including the wild-type sequence and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). The PorARc pore preferably comprises or consists of the sequence shown in SEQ ID NO: 1.
The PorARc pore may be pore 1 in Table 2 with one or more constriction substitutions. For instance, the PorARc pore is preferably pore 1 in Table 2 with substitution of D or E with a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at position 89 and/or position 104. The PorARc pore is preferably pore 1 in Table 2 with E89R, E89Q, E89L or E89A and/or D104S or D104N, such as E89R/D104S, E89Q/D104S, E89Q/D104N, E89L/D104S, E89L/D104N, E89A/D104S or E89A/D104N. The PorARc pore is preferably pore 1 in Table 2 with E89R and/or D104S, such as E89R/D104S. This applies to any of the pore 1 sequences in Table 2, including the wild-type sequence and the wild-type sequences without the signal peptide and having a methionine (M) at the N terminus (i.e., at position 1). These one or more constriction substitutions may be made in addition to the one or more cap substitutions discussed above.
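The substitution notation used above (e.g., E89R/D104S) can be read as original residue, 1-based position, replacement residue. The following minimal sketch applies such a specification to a sequence numbered with the methionine at position 1; the function name is hypothetical, and any sequence passed to it in an example is a toy input rather than one of the sequences of the invention.

```python
import re

# One substitution token: original residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, spec: str) -> str:
    """Apply substitutions written as e.g. "E89R/D104S" to a protein sequence.

    Positions are 1-based. The original residue named in each token is
    checked against the sequence so that numbering errors are caught early.
    """
    residues = list(sequence)
    for token in spec.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)
```

With the toy sequence `MADEK`, the specification `D3N/E4Q` yields `MANQK`. The residue check is a design choice: it fails loudly if a specification numbered for the sequence without the signal peptide is applied to a wild-type sequence that still carries it.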
The at least two different pores preferably comprise at least two different PorARc pores. The two different pores are preferably two different PorARc pores. The at least two or two different PorARc pores may be selected from the pores in Table 2. One of the at least two different pores is preferably PorARc_Rco (pore 1 in Table 2) or PorARc_Mph (pore 2 in Table 2). One of the two different pores is preferably PorARc_Rco (pore 1 in Table 2) or PorARc_Mph (pore 2 in Table 2). The at least two different pores preferably comprise or the two different pores preferably are (a) PorARc_Rco or PorARc_Mph and (b) one of the pores in Table 2. In (b), the one of the pores in Table 2 is preferably pore 3, 8, 17, 19, 20, 25 or 27. The at least two different pores preferably comprise or the two different pores preferably are PorARc_Rco and PorARc_Rco_Mph. Any reference in this paragraph to a particular pore in Table 2 (e.g., pore 1) includes any of the pores discussed above in relation to that pore in Table 2, including the wild-type sequence, the sequence lacking the signal peptide and including an M at the N terminus (i.e., at position 1) and either of such pores comprising the one or more substitutions discussed above. The PorARc_Rco pore preferably comprises or consists of SEQ ID NO: 1. The PorARc_Mph pore preferably comprises or consists of SEQ ID NO: 2.
The constriction transplants and cap transplants (the latter also known as scaffold transplants) in Example 1 are formed from different PorARc pores. The at least two different pores preferably comprise or the two different pores preferably are the combination of two different PorARc pores in any of those transplants. The at least two different pores preferably comprise or the two different pores preferably are any combination shown in a row in Table 3. In relation to any row in Table 3, the cap region (or scaffold) is preferably from or derived from the pore in column A and the constriction region is preferably from or derived from the pore in column B. In relation to any row in Table 3, the constriction region is preferably from or derived from the pore in column A and the cap region (or scaffold) is preferably from or derived from the pore in column B. A reference in Table 3 to a particular pore in Table 2 (e.g., pore 3) includes any of the pores discussed above in relation to that pore in Table 2, including the wild-type sequence, the sequence lacking the signal peptide and including an M at the N terminus (i.e., at position 1) and either of such pores comprising the one or more substitutions discussed above. Where preferred pores have a sequence identifier number, these are also shown in Table 3. The pore preferably comprises or consists of the sequence identifier number.
Table 3 - Preferred combinations of PorARc pores for use in the invention
[Table 3 is reproduced as images in the published document: Figure imgf000043_0001 to Figure imgf000050_0001]
The chimeric pore monomer preferably comprises a sequence having at least about 40% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49. The chimeric pore monomer preferably comprises a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49. Any one of SEQ ID NOs: 3-49 is of course equivalent to SEQ ID NO: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 or 49. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 3-49. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 3-49, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 3-49. Any one of SEQ ID NOs: 3-49 is of course equivalent to SEQ ID NO: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 or 49. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
In any of these embodiments, the chimeric pore monomer preferably does not comprise the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions, three regions or five regions, are from or derived from. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions, three regions or five regions, are from or derived from.
The at least two different pores preferably comprise or the two different pores are preferably (a) PorARc_Rco and MspA, (b) two different CsgG pores or three different CsgG pores, (c) alpha-hemolysin and CytK, or (d) NetB and CytK. The at least two different pores preferably comprise or the two different pores are preferably (a) PorARc_Rco and MspA, (b) two different CsgG pores, (c) alpha-hemolysin and CytK, or (d) NetB and CytK. The at least two different pores preferably comprise five different CsgG pores. The two different CsgG pores or three different CsgG pores in (b) may be selected from any of the pores disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068, CN 113773373 A, CN 113896776 A, CN 113912683 A, and CN 113754743 A or a variant thereof (all incorporated by reference herein in their entirety). The five different CsgG pores may be selected from any of the pores disclosed in these references. The two different CsgG pores may be CsgG from E. coli and CsgG from a different species. The three different CsgG pores may be CsgG from E. coli and CsgG from two different non-E. coli species. The five different CsgG pores may be CsgG from E. coli and CsgG from four different non-E. coli species.
CsgG pores are known in the art, especially from WO 2019/002893 (incorporated by reference herein in its entirety). The CsgG pore, two different CsgG pores or three different CsgG pores preferably comprise(s) one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). This also applies to the five different CsgG pores. The cap region (a) in CsgG is also known as the rim. Together, (a) and (c) are also known as a scaffold. The chimeric pore monomer preferably comprises one or more of (a) a cap region, (b) a constriction region, and (c) a transmembrane beta barrel region, such as (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The chimeric pore monomer preferably comprises (a)-(c). The chimeric pore monomer preferably comprises (a) and (c) from one CsgG pore and (b) from a different CsgG pore. The chimeric pore monomer preferably comprises (a), (b) and (c) each from a different CsgG pore or (a), (b) and (c) from three different CsgG pores. The residues of SEQ ID NOs: 56-64 which form these regions are defined below. The CsgG pore, two different CsgG pores or three different CsgG pores may have any structure but preferably has/have or comprise(s) the structure of the wild-type CsgG pore (Figure 11). This also applies to five different CsgG pores. The protein structure of CsgG defines a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.
The "constriction", "orifice", "constriction region", "channel constriction", or "constriction site", as used interchangeably herein, refers to an aperture defined by a luminal surface of a pore or pore complex, which acts to allow the passage of ions and target molecules (e.g., but not limited to polynucleotides or individual nucleotides) but not other non-target molecules through the pore or pore complex channel. The constriction(s) are typically the narrowest aperture(s) within a pore or pore complex or within the channel defined by the pore or pore complex. The constriction(s) may serve to limit the passage of molecules through the pore. The size of the constriction is typically a key factor in determining suitability of a pore or pore complex for analyte characterisation. If the constriction is too small, the molecule to be characterised will not be able to pass through. However, to achieve a maximal effect on ion flow through the channel, the constriction should not be too large. For example, the constriction should not be wider than the solvent-accessible transverse diameter of a target analyte. Ideally, any constriction should be as close as possible in diameter to the transverse diameter of the analyte passing through.
The CsgG pore, two different CsgG pores or three different CsgG pores may be any size but preferably has/have the dimensions of the wild-type CsgG pore (Figure 11). The CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have an external diameter of from about 100 to about 150 Å at its widest point, such as from about 110 to about 140 Å or from about 115 to about 125 Å at its widest point. The CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have an external diameter of about 120 Å at its widest point. The CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have a total length of from about 80 to about 120 Å, such as from about 90 to about 110 Å or from about 95 to about 105 Å. The CsgG pore, two different CsgG pores or three different CsgG pores preferably has/have a total length of about 98 Å. References to "total length" and "length" relate to the length of the pore or pore region when viewed from the side (see, e.g., the side view in Figure 11). These sizes also apply to the five different CsgG pores. The cap region preferably has a length of from about 20 to about 60 Å, such as from about 30 to about 50 Å or from about 35 to about 45 Å. The cap region preferably has a length of about 39 Å. The channel defined by the cap region preferably has an opening of from about 45 to about 85 Å in diameter, such as from about 55 to about 75 Å or from about 60 to about 70 Å in diameter. The channel defined by the cap region preferably has an opening of about 66 Å in diameter. The channel defined by the cap region is preferably from about 30 to about 70 Å in diameter at its narrowest point, such as from about 35 to about 60 Å or from about 40 to about 50 Å in diameter at its narrowest point. The channel defined by the cap region is preferably about 43 Å in diameter at its narrowest point.
The constriction region preferably has a length of from about 5 to about 40 Å, such as from about 10 to about 30 Å or from about 15 to about 25 Å. The constriction region preferably has a length of about 20 Å. The channel defined by the constriction region is preferably from about 2 to about 40 Å in diameter at its narrowest point, such as from about 5 to about 35 Å, from about 8 to about 25 Å or from about 10 to about 20 Å in diameter at its narrowest point. The channel defined by the constriction region is preferably about 9 Å or 12 Å in diameter. The channel defined by the constriction region is preferably about 18.5 Å in diameter. The constriction is preferably from about 2 to about 40 Å in diameter, such as from about 5 to about 35 Å, from about 8 to about 25 Å or from about 10 to about 20 Å in diameter. The constriction is preferably about 9 Å or 12 Å in diameter. The constriction is preferably about 12 Å in diameter.
The transmembrane beta barrel region preferably has a length of from about 20 to about 60 Å, such as from about 30 to about 50 Å or from about 35 to about 45 Å. The transmembrane beta barrel preferably has a length of about 39 Å. The channel defined by the transmembrane beta barrel region is preferably from about 35 to about 75 Å in diameter at its narrowest point, such as from about 45 to about 65 Å or from about 50 to about 60 Å in diameter at its narrowest point. The channel defined by the transmembrane beta barrel region is preferably about 55 Å in diameter at its narrowest point.
All of the measurements above are based on measuring from backbone to backbone of the amino acids forming the different regions (as shown in Figure 11).
The cap region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 164 to about 210 amino acids in length. The cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209 or 210 amino acids in length. The cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 184 to about 190 amino acids in length. The cap region (or scaffold) in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 184, 185, 186, 187, 188, 189 or 190 amino acids in length. These lengths equally apply to the five different CsgG pores. The cap region preferably comprises a landing platform region and a carboxy-terminal (C-terminal) region.
The constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 6 to about 46 amino acids in length. The constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46 amino acids in length. The constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 16 to about 36 amino acids in length. The constriction region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35 or 36 amino acids in length. The constriction in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 26 amino acids in length. These lengths equally apply to the five different CsgG pores.
The transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably from about 28 to about 68 amino acids in length. The transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67 or 68 amino acids in length. The transmembrane beta barrel region in the CsgG pore, at least two different CsgG pores or at least three different CsgG pores is preferably about 48 amino acids in length. These lengths equally apply to the five different CsgG pores.
CsgG pores are highly conserved (as can be readily appreciated from Figures 45 to 47 of WO 2017/149317). The CsgG pore, two different CsgG pores or three different CsgG pores may have any of the sequences shown in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) and comprise any of the modifications or mutations disclosed therein. The CsgG pore, two different CsgG pores or three different CsgG pores may also be any of the sequences shown in CN 113773373 A, CN 113896776 A, CN 113912683 A, and CN 113754743 A or variants thereof. The five different CsgG pores may have any of these sequences. It will further be appreciated that the invention extends to other variant CsgG pores not expressly identified in the specification that show highly conserved regions.
The CsgG pore, the two different CsgG pores or the three different CsgG pores preferably is/are selected from the pores in Table 4. The five different CsgG pores are preferably selected from the pores in Table 4.
Table 4 - Different CsgG pores for use in the invention (the position numbering in the last three columns excludes the signal peptide shown in SEQ ID NOs: 56-64, i.e., positions 1-37 in SEQ ID NO: 56 in the fifth column relates to positions 16-52 in the sequence for SEQ ID NO: 56 shown in the sequence listing; TM = transmembrane; aa = amino acids)
Figure imgf000055_0001
Figure imgf000056_0001
Figure imgf000057_0001
The fourth column provides the SEQ ID NO: for the wild-type pore. The CsgG pore, two different CsgG pores or three different CsgG pores may be selected from the wild-type pores in Table 4 (i.e., from the sequences in the fourth column). The CsgG pore, two different CsgG pores or three different CsgG pores may be selected from the wild-type pores in Table 4 with the signal peptide removed, i.e., from SEQ ID NOs: 56-64 with the signal peptide removed or from SEQ ID NO: 55-64 and 73-75 with the signal peptide removed. The five different CsgG pores may be selected from any of these pores. The skilled person can determine these sequences from the sequences in the sequence listing where the signal peptide is underlined.
Preferred cap, constriction and transmembrane beta barrel regions for each pore are shown in the fifth, sixth and seventh columns of Table 4. These preferred regions may be used to construct the chimeric pore monomers of the invention as discussed above. The residue numbering in the fifth, sixth and seventh columns corresponds to the wild-type sequences without the signal peptide. The skilled person can determine these sequences from the sequences in the sequence listing where the signal peptide is underlined. The cap, transmembrane beta barrel and constriction regions in the fifth, sixth and seventh columns of Table 4 are preferred cap, transmembrane beta barrel and constriction regions. The invention covers cap, transmembrane beta barrel and constriction regions which differ from those shown in the fifth, sixth and seventh columns by ± about 10 amino acids. For instance, in CsgG_Eco_WT:
- 1-37 includes 1-27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46 or 47,
- 64-134 includes 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73 or 74-124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143 or 144,
- 155-181 includes 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164 or 165-171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190 or 191,
- 210-262 includes 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219 or 220-252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271 or 272,
- 38-63 includes 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48-53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72 or 73,
- 135-154 includes 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144 or 145-144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163 or 164, and
- 182-209 includes 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191 or 192-199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218 or 219.
This equally applies to the other pores in Table 4.
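The enumeration above follows a simple rule, which may be sketched in Python for illustration. The function name `boundary_ranges` and the treatment of a region starting at residue 1 as having a fixed start (as in the 1-37 example, where only the end position varies) are assumptions of this sketch rather than definitions from the specification. Positions use the mature numbering, i.e., without the signal peptide.

```python
def boundary_ranges(start: int, end: int, tolerance: int = 10):
    """Return the permitted start and end positions for a region.

    Positions are 1-based and inclusive. A region that begins at residue 1
    is treated as having a fixed start (assumption based on the 1-37 example).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    if start == 1:
        starts = [1]
    else:
        # Clamp at 1 so that no start position falls before the first residue.
        starts = list(range(max(1, start - tolerance), start + tolerance + 1))
    ends = list(range(end - tolerance, end + tolerance + 1))
    return starts, ends


# Constriction region 38-63 of CsgG_Eco_WT: starts 28-48 and ends 53-73.
constriction_starts, constriction_ends = boundary_ranges(38, 63)
```

Under this rule the widest variant of the 38-63 region runs from 28 to 73, i.e., 46 amino acids, and the narrowest from 48 to 53, i.e., 6 amino acids.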
The cap region may further comprise two subregions, namely a landing platform region and a carboxy-terminal (C-terminal) region. The landing platform region comprises helix 2, which shapes the pore surface on the cis side and with which molecules engage the channel, such as N22 peptides in CsgA-like sequences, or the enzyme in the analyte characterisation applications discussed below. This is a separate structural and functional unit that can be replaced with equivalent sequences from homologues. The C-terminal tail can carry additional sequences in fusion to the channel. These can come from CsgG homologues.
These regions correspond to the following residues in the CsgG pores in Table 4 (numbering without the signal peptide):
• CsgG_Eco_WT in SEQ ID NO: 56: landing platform region = 85-117 and C-terminal region = 242-262
• CsgG_Vdi_WT in SEQ ID NO: 57: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Vmae_WT in SEQ ID NO: 58: landing platform region = 87-119 and C-terminal region = 244-264
• CsgG_Vsp_WT in SEQ ID NO: 59: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Ler_WT in SEQ ID NO: 60: landing platform region = 85-117 and C-terminal region = 242-261
• CsgG_Vcr_WT in SEQ ID NO: 61: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Psh_WT in SEQ ID NO: 62: landing platform region = 86-118 and C-terminal region = 243-263
• CsgG_Vhi_WT in SEQ ID NO: 63: landing platform region = 87-115 and C-terminal region = 240-259
• CsgG_Vma_WT in SEQ ID NO: 64: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Vfu_WT in SEQ ID NO: 73: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Vme_WT in SEQ ID NO: 74: landing platform region = 87-119 and C-terminal region = 244-263
• CsgG_Vge_WT in SEQ ID NO: 75: landing platform region = 87-119 and C-terminal region = 244-263
It will be clear from this that the constriction regions in Table 4 may be 46 amino acids in length (e.g., 28-73 in CsgG_Eco_WT). When constructing chimeric pore monomers from two or three different CsgG pores, the whole constriction region from one CsgG pore may be used, i.e., all 46 amino acids from one CsgG pore may be used. This typically involves complete replacement (i.e., absence) of the constriction region(s) from the other different CsgG pore or the other two different CsgG pores. This also applies to chimeric pore monomers constructed from five different CsgG pores. Alternatively, part of the constriction region from one CsgG pore may be replaced with all or part of the constriction region from a different CsgG pore. This is an example of the hybrid constriction regions discussed above. At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region of one CsgG pore may be replaced with the constriction region from a different CsgG pore. As explained above, the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane beta barrel region in the chimeric pore monomer is/are from or derived from. At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region of one CsgG pore may be introduced into the constriction region of a different CsgG pore.
At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or 46, of the amino acids in the constriction region from a different CsgG pore.
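Purely as an illustration of the replacement described above, the following Python sketch splices a constriction region from one sequence into another. The function name `swap_region`, the toy poly-alanine and poly-glycine sequences, and the coordinates passed in the usage lines are hypothetical; real region coordinates would be taken from Table 4 using the numbering without the signal peptide. Because the two regions need not be the same length, the resulting chimeric sequence may be longer, shorter or the same length as the scaffold sequence.

```python
def swap_region(scaffold: str, donor: str, scaffold_region, donor_region) -> str:
    """Replace a region of `scaffold` with a region of `donor`.

    Regions are (start, end) tuples, 1-based and inclusive, in the numbering
    of the respective sequence.
    """
    for seq, (start, end) in ((scaffold, scaffold_region), (donor, donor_region)):
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"region {start}-{end} lies outside the sequence")
    s_start, s_end = scaffold_region
    d_start, d_end = donor_region
    # Keep the scaffold either side of the region and insert the donor segment.
    return scaffold[: s_start - 1] + donor[d_start - 1 : d_end] + scaffold[s_end:]


# Toy sequences only (hypothetical, not CsgG sequences).
toy_scaffold = "A" * 70
toy_donor = "G" * 70

# Whole-region swap of a 26-residue region (positions 38-63 in both).
same_length = swap_region(toy_scaffold, toy_donor, (38, 63), (38, 63))

# Hybrid case: 26 scaffold residues replaced with 28 donor residues.
longer = swap_region(toy_scaffold, toy_donor, (38, 63), (36, 63))
```

A partial (hybrid) constriction can be sketched in the same way by passing a sub-range of the constriction coordinates for either or both sequences.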
Preferred constriction regions for each pore are shown in the sixth column of Table 4. Any of these constriction regions may be used to construct the chimeric pore monomers of the invention as discussed above. The preferred constriction regions in Table 4 are all 26 amino acids in length. When constructing chimeric pore monomers from two or three different CsgG pores, the whole constriction region from one CsgG pore may be used, i.e., all 26 amino acids from one CsgG pore may be used. This typically involves complete replacement (i.e., absence) of the constriction region(s) from the other different CsgG pore or the other two different CsgG pores. This also applies to chimeric pore monomers constructed from five different CsgG pores. Alternatively, part of the constriction region from one CsgG pore may be replaced with all or part of the constriction region from a different CsgG pore. This is an example of the hybrid constriction regions discussed above. At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region of one CsgG pore may be replaced with the constriction region from a different CsgG pore. As explained above, the constriction region in the chimeric pore monomer may be longer, shorter or the same length as the constriction region in the pore(s) which the cap region and/or the transmembrane beta barrel region in the chimeric pore monomer is/are from or derived from. At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region of one CsgG pore may be introduced into the constriction region of a different CsgG pore.
At least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 6, such as at least about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25, of the amino acids in the constriction region from a different CsgG pore. At least about 15 of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 14 of the amino acids in the constriction region from a different CsgG pore. At least about 18 of the amino acids in the constriction region of one CsgG pore may be replaced with at least about 20 of the amino acids in the constriction region from a different CsgG pore. As explained in more detail below, one or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. One or more negatively charged amino acids in any of the pores in Table 4 may be removed, for instance by deletion or substitution. One or more negatively charged amino acids in any of the pores in Table 4, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids and/or one or more uncharged amino acids. This removes negative charge from the sequence of the pore. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. The one or more negatively charged amino acids are preferably in the constriction region of the pore. The preferred constriction region for each pore is defined in the sixth column of Table 4.
This applies to any of the sequences in Table 4, including the wild-type sequences and the wild-type sequences without the signal peptide.
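The removal of negative charge by substitution may be illustrated with the following Python sketch. The function name `neutralise_region`, the toy sequence, and the specific choice of replacing D with N and E with Q (uncharged amino acids of similar size) are assumptions made for the example only; the text above equally contemplates deletion and substitution with positively charged amino acids.

```python
# Illustrative mapping only: each acidic residue is replaced with its
# uncharged amide counterpart (an assumption of this sketch).
NEUTRAL_REPLACEMENT = {"D": "N", "E": "Q"}


def neutralise_region(sequence: str, start: int, end: int) -> str:
    """Substitute D and E within positions start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    region = sequence[start - 1 : end]
    modified = "".join(NEUTRAL_REPLACEMENT.get(aa, aa) for aa in region)
    # Residues outside the region are left unchanged.
    return sequence[: start - 1] + modified + sequence[end:]


def negative_charge_count(sequence: str) -> int:
    """Number of negatively charged residues (D or E) in a sequence."""
    return sum(1 for aa in sequence if aa in "DE")


# Toy sequence (hypothetical); only positions 2-3 are treated as the region.
toy = "MDEKDE"
toy_neutralised = neutralise_region(toy, 2, 3)
```

In this toy case two of the four negatively charged residues lie in the selected region and are substituted, while those outside the region are retained.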
SEQ ID NO: 56 is the wild-type CsgG pore from Escherichia coli Str. K-12 substr. MC4100. The CsgG pore, one of the two different CsgG pores or one of the three different CsgG pores may comprise the sequence of SEQ ID NO: 56 or may comprise SEQ ID NO: 56 having any of the substitutions present in another CsgG homologue. This also applies to the five different CsgG pores. Preferred CsgG homologues are shown in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) and in Table 4 above. The CsgG pore, one of the two different CsgG pores or one of the three different CsgG pores may comprise combinations of one or more of the substitutions present in SEQ ID NOs: 68 to 88 of WO 2019/002893 (incorporated by reference herein in its entirety) or in the other homologues in Table 4 compared with SEQ ID NO: 56, including one or more substitutions, one or more conservative mutations, one or more deletions or one or more insertion mutations, such as deletion or insertion of 1 to 10 amino acids, such as of 2 to 8 or 3 to 6 amino acids. This also applies to one of the five different CsgG pores.
The chimeric pore monomer of the invention typically retains the ability to form the same 3D structure as the wild-type CsgG pore monomer, such as the same 3D structure as a CsgG pore having the sequence of SEQ ID NO: 56. The 3D structure of CsgG is known in the art and is disclosed, for example, in Goyal et al (2014) Nature 516(7530):250-3. Any number of modifications or mutations may be made in the wild-type CsgG sequence in addition to the modifications and mutations described herein provided that the chimeric pore monomer retains the improved properties of the invention.
Typically a chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores will retain the ability to form a structure comprising three alpha-helices and five beta-sheets. This also applies to chimeric pore monomers constructed from five different CsgG pores. One or more modifications may be made at least in the region which is N-terminal to the first alpha helix (which starts at S63 in SEQ ID NO: 56), in the second alpha helix (from G85 to A99 of SEQ ID NO: 56), in the loop between the second alpha helix and the first beta sheet (from Q100 to N120 of SEQ ID NO: 56), in the fourth and fifth beta sheets (S173 to R192 and R198 to T107 of SEQ ID NO: 56, respectively) and in the loop between the fourth and fifth beta sheets (F193 to Q197 of SEQ ID NO: 56) without affecting the ability of the chimeric pore monomer to form a transmembrane pore which is capable of translocating analytes. Further modifications may be made in any of these regions in any chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores without affecting the ability of the chimeric pore monomer to form a pore that can translocate analytes. This also applies to chimeric pore monomers constructed from five different CsgG pores.
It is also expected that one or more modifications may be made in other regions, such as in any of the alpha helices (S63 to R76, G85 to A99 or V211 to L236 of SEQ ID NO: 56) or in any of the beta sheets (I121 to N133, K135 to R142, I146 to R162, S173 to R192 or R198 to T107 of SEQ ID NO: 56) without affecting the ability of the chimeric pore monomer to form a pore that can translocate analytes. It is also expected that deletions of one or more amino acids can be made in any of the loop regions linking the alpha helices and beta sheets and/or in the N-terminal and/or C-terminal regions without affecting the ability of the chimeric pore monomer to form a pore that can translocate analytes.
The chimeric pore monomer may contain the region(s) of SEQ ID NO: 56 that is/are responsible for pore formation. The pore forming ability of CsgG, which contains a β-barrel, is provided by β-sheets in each subunit. The chimeric pore monomer may comprise the regions in SEQ ID NO: 56 that form β-sheets, namely K134-Q154 and S183-S208. One or more modifications can be made to the regions of SEQ ID NO: 56 that form β-sheets as long as the resulting variant retains its ability to form a pore. The chimeric pore monomer preferably includes one or more modifications, such as substitutions, additions, or deletions, within the α-helices and/or loop regions of SEQ ID NO: 56.
The one or more modifications in the CsgG pore, two different CsgG pores or three different CsgG pores preferably improve the ability of a chimeric pore monomer formed from a CsgG pore, two different CsgG pores or three different CsgG pores to characterise an analyte. For example, modifications/mutations/substitutions are contemplated to alter the number, size, shape, placement, or orientation of the constriction within a channel of the chimeric pore monomer. The CsgG pore, two different CsgG pores or three different CsgG pores may have any of the particular modifications or substitutions disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety). This also applies to chimeric pore monomers constructed from five different CsgG pores.
Preferred modifications or substitutions in SEQ ID NO: 56 include, but are not limited to, one or more of, such as 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more or all of:
(a) a substitution at position Y51, such as Y51I, Y51L, Y51A, Y51V, Y51T, Y51S, Y51Q or Y51N;
(b) a substitution at position N55, such as N55I, N55L, N55A, N55V, N55T, N55S or N55Q;
(c) a substitution at position F56, such as F56I, F56L, F56A, F56V, F56T, F56S, F56Q or F56N;
(d) a substitution at position L90, such as L90N, L90D, L90E, L90R or L90K;
(e) a substitution at position N91, such as N91D, N91E, N91R or N91K;
(f) a substitution at position K94, such as K94R, K94F, K94Y, K94Q, K94W, K94L, K94S or K94N;
(g) a substitution at position R192, such as R192Q, R192F, R192S, R192D or R192T; and
(i) a substitution at position C215, such as C215T, C215S, C215I, C215L, C215A, C215V, or C215G.
Preferred modifications or substitutions in SEQ ID NO: 56 include, but are not limited to, one or more of, such as 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more or all of:
(a) a substitution at position Y51, such as Y51I, Y51L, Y51A, Y51V, Y51T, Y51S, Y51Q or Y51N;
(b) a substitution at position N55, such as N55I, N55L, N55A, N55V, N55T, N55S or N55Q;
(c) a substitution at position F56, such as F56I, F56L, F56A, F56V, F56T, F56S, F56Q or F56N;
(d) a substitution at position L90, such as L90N, L90D, L90E, L90R or L90K;
(e) a substitution at position N91, such as N91D, N91E, N91R or N91K;
(f) a substitution at position K94, such as K94R, K94F, K94Y, K94Q, K94W, K94L, K94S or K94N;
(g) a substitution at position R97, such as with R97H, R97K, R97A, R97V, R97I, R97L, R97M, R97F, R97W, R97Y, R97S, R97T, R97Q, R97D, R97E, R97N, R97C, R97P or R97G;
(h) a substitution at position Q100, such as Q100R, Q100H, Q100K, Q100W, Q100A, Q100V, Q100I, Q100L, Q100M, Q100F, Q100Y, Q100T, Q100N or Q100S;
(i) a substitution at position E101, such as E101V, E101I, E101L, E101M, E101A, E101F, E101Y, E101W, E101S, E101T, E101N, E101Q, E101C, E101G or E101P;
(j) a substitution at position N102, such as N102E, N102R, N102H, N102K, N102S, N102T, N102D, N102Q, N102V, N102I, N102L, N102M, N102F, N102Y, N102W or N102A;
(k) a substitution at position T104, such as T104E, T104R, T104H, T104K, T104S, T104T, T104Q, T104V, T104D, T104I, T104L, T104M, T104F, T104Y, T104W or T104A;
(l) a substitution at position R192, such as R192Q, R192F, R192S, R192D or R192T; and
(m) a substitution at position C215, such as C215T, C215S, C215I, C215L, C215A, C215V, or C215G.
The CsgG pore may comprise a deletion of one or more positions from SEQ ID NO: 56, such as a deletion of V105-I107, a deletion of F193-L199 or a deletion of F195-L199.
All of the references to specific positions in SEQ ID NO: 56 above (e.g., V105) relate to the sequence of SEQ ID NO: 56 without the signal peptide (which is underlined in the sequence listing below).
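As an illustration of the position convention above, the following minimal sketch applies a substitution code such as Y51A to a sequence numbered from the first residue after the signal peptide. The function names, the toy sequence and the three-residue signal peptide are hypothetical and are not taken from the sequence listing; the sketch only shows that the wild-type residue is checked against the mature (signal-peptide-free) numbering before it is replaced.

```python
import re

# Substitution notation: wild-type residue, 1-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def strip_signal_peptide(full_seq: str, signal_len: int) -> str:
    """Return the mature sequence, i.e. the sequence without its signal peptide."""
    return full_seq[signal_len:]


def apply_substitution(mature_seq: str, code: str) -> str:
    """Apply a code such as 'Y51A' using numbering of the mature sequence."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(mature_seq):
        raise ValueError(f"position {position} is outside the sequence")
    if mature_seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {mature_seq[position - 1]}"
        )
    return mature_seq[: position - 1] + replacement + mature_seq[position:]


# Toy data only (hypothetical): a 3-residue "signal peptide" followed by 10 residues.
toy_full = "MKK" + "ACDEFGHIKL"
toy_mature = strip_signal_peptide(toy_full, 3)
toy_mutant = apply_substitution(toy_mature, "E4Q")
```

Checking the wild-type residue before substituting guards against applying a position that was numbered with the signal peptide still attached.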
CsgG pores are highly conserved (as can be readily appreciated from Figures 45 to 47 of WO2017/149317). Furthermore, from knowledge of the modifications in relation to SEQ ID NO: 56, it is possible to determine the equivalent positions for modifications of other CsgG pores, especially CsgG pores having the sequences shown in SEQ ID NOs: 57-64 in Table 4.
Amino acid substitutions may be made to the amino acid sequences of any of SEQ ID NOs: 56-64 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions. Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume. The amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace. Alternatively, the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid. Conservative amino acid changes are well-known in the art.
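One way to picture a conservative substitution is as a replacement within a shared property class. The sketch below uses one common textbook grouping of side chains; the grouping, the class names and the function name are assumptions made for illustration, and the specification does not prescribe any particular scheme. Residues left out of the groups (G, P, C, M) are simply never treated as conservative here.

```python
# Illustrative property classes (an assumption, not a scheme defined in the text).
PROPERTY_GROUPS = {
    "aliphatic": set("AVLI"),
    "aromatic": set("FWY"),
    "basic": set("KRH"),
    "acidic": set("DE"),
    "polar_uncharged": set("STNQ"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues fall in the same illustrative property class."""
    if original == replacement:
        return False
    return any(
        original in members and replacement in members
        for members in PROPERTY_GROUPS.values()
    )


leu_to_ile = is_conservative("L", "I")   # aliphatic for aliphatic
asp_to_lys = is_conservative("D", "K")   # acidic for basic: charge is reversed
```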
The CsgG pore, two different CsgG pores or three different CsgG pores may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition. This also applies to chimeric pore monomers constructed from five different CsgG pores.
One or more amino acid residues of the amino acid sequences of any of SEQ ID NOs: 56-64 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
One or more amino acids may be alternatively or additionally added to the CsgG polypeptides described above. An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequences of any of SEQ ID NOs: 56-64 or polypeptide variants or fragments thereof. The extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids.
The at least two different pores preferably comprise at least two different CsgG pores or at least three different CsgG pores. The two different pores are preferably two different CsgG pores. The three different pores are preferably three different CsgG pores. The at least two different pores preferably comprise at least five different CsgG pores or five different CsgG pores.
The at least two different CsgG pores, the two different CsgG pores or the three different CsgG pores may be selected from the pores in Table 4. One of the at least two different pores is preferably CsgG_Eco_WT (row 1 in Table 4). One of the two different pores is preferably CsgG_Eco_WT (row 1 in Table 4). One of the three different pores is preferably CsgG_Eco_WT (row 1 in Table 4). One of the five different pores is preferably CsgG_Eco_WT (row 1 in Table 4).
The at least two different pores preferably comprise or the two or three different pores preferably are selected from (a) CsgG_Eco_WT, (b) CsgG_Vdi_WT, (c) CsgG_Vmae_WT, (d) CsgG_Vsp_WT, (e) CsgG_Ler_WT, (f) CsgG_Vcr_WT, (g) CsgG_Psh_WT, (h) CsgG_Vhi_WT or (i) CsgG_Vma_WT. (a)-(i) are defined in Table 4. The five different pores may be selected from (a)-(i). The at least two different pores preferably comprise or the two different pores preferably are (a) and (b), (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), (a) and (i), (b) and (c), (b) and (d), (b) and (e), (b) and (f), (b) and (g), (b) and (h), (b) and (i), (c) and (d), (c) and (e), (c) and (f), (c) and (g), (c) and (h), (c) and (i), (d) and (e), (d) and (f), (d) and (g), (d) and (h), (d) and (i), (e) and (f), (e) and (g), (e) and (h), (e) and (i), (f) and (g), (f) and (h), (f) and (i), (g) and (h), (g) and (i), or (h) and (i).
The at least three different pores preferably comprise or the three different pores preferably are (a), (b) and (c); (a), (b) and (d); (a), (b) and (e); (a), (b) and (f); (a), (b) and (g); (a), (b) and (h); (a), (b) and (i); (a), (c) and (d); (a), (c) and (e); (a), (c) and (f); (a), (c) and (g); (a), (c) and (h); (a), (c) and (i); (a), (d) and (e); (a), (d) and (f); (a), (d) and (g); (a), (d) and (h); (a), (d) and (i); (a), (e) and (f); (a), (e) and (g); (a), (e) and (h); (a), (e) and (i); (a), (f) and (g); (a), (f) and (h); (a), (f) and (i); (a), (g) and (h); (a), (g) and (i); (a), (h) and (i); (b), (c) and (d); (b), (c) and (e); (b), (c) and (f); (b), (c) and (g); (b), (c) and (h); (b), (c) and (i); (b), (d) and (e); (b), (d) and (f); (b), (d) and (g); (b), (d) and (h); (b), (d) and (i); (b), (e) and (f); (b), (e) and (g); (b), (e) and (h); (b), (e) and (i); (b), (f) and (g); (b), (f) and (h); (b), (f) and (i); (b), (g) and (h); (b), (g) and (i); (b), (h) and (i); (c), (d) and (e); (c), (d) and (f); (c), (d) and (g); (c), (d) and (h); (c), (d) and (i); (c), (e) and (f); (c), (e) and (g); (c), (e) and (h); (c), (e) and (i); (c), (f) and (g); (c), (f) and (h); (c), (f) and (i); (c), (g) and (h); (c), (g) and (i); (c), (h) and (i); (d), (e) and (f); (d), (e) and (g); (d), (e) and (h); (d), (e) and (i); (d), (f) and (g); (d), (f) and (h); (d), (f) and (i); (d), (g) and (h); (d), (g) and (i); (d), (h) and (i); (e), (f) and (g); (e), (f) and (h); (e), (f) and (i); (e), (g) and (h); (e), (g) and (i); (e), (h) and (i); (f), (g) and (h); (f), (g) and (i); (f), (h) and (i); or (g), (h) and (i).
The at least two different pores preferably comprise or the two or three different pores preferably are selected from (a) CsgG_Eco_WT, (b) CsgG_Vdi_WT, (c) CsgG_Vmae_WT, (d) CsgG_Vsp_WT, (e) CsgG_Ler_WT, (f) CsgG_Vcr_WT, (g) CsgG_Psh_WT, (h) CsgG_Vhi_WT, (i) CsgG_Vma_WT, (j) CsgG_Vfu_WT, (k) CsgG_Vme_WT or (l) CsgG_Vge_WT. (a)-(l) are defined in Table 4. The five different pores may be selected from (a)-(l). The at least two different pores preferably comprise or the two different pores preferably are (a) and (b), (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), (a) and (i), (a) and (j), (a) and (k), (a) and (l), (b) and (c), (b) and (d), (b) and (e), (b) and (f), (b) and (g), (b) and (h), (b) and (i), (b) and (j), (b) and (k), (b) and (l), (c) and (d), (c) and (e), (c) and (f), (c) and (g), (c) and (h), (c) and (i), (c) and (j), (c) and (k), (c) and (l), (d) and (e), (d) and (f), (d) and (g), (d) and (h), (d) and (i), (d) and (j), (d) and (k), (d) and (l), (e) and (f), (e) and (g), (e) and (h), (e) and (i), (e) and (j), (e) and (k), (e) and (l), (f) and (g), (f) and (h), (f) and (i), (f) and (j), (f) and (k), (f) and (l), (g) and (h), (g) and (i), (g) and (j), (g) and (k), (g) and (l), (h) and (i), (h) and (j), (h) and (k), (h) and (l), (i) and (j), (i) and (k), (i) and (l), (j) and (k), (j) and (l), or (k) and (l).
The at least three different pores preferably comprise or the three different pores preferably are (a), (b) and (c); (a), (b) and (d); (a), (b) and (e); (a), (b) and (f); (a), (b) and (g); (a), (b) and (h); (a), (b) and (i); (a), (b) and (j); (a), (b) and (k); (a), (b) and (l); (a), (c) and (d); (a), (c) and (e); (a), (c) and (f); (a), (c) and (g); (a), (c) and (h); (a), (c) and (i); (a), (c) and (j); (a), (c) and (k); (a), (c) and (l); (a), (d) and (e); (a), (d) and (f); (a), (d) and (g); (a), (d) and (h); (a), (d) and (i); (a), (d) and (j); (a), (d) and (k); (a), (d) and (l); (a), (e) and (f); (a), (e) and (g); (a), (e) and (h); (a), (e) and (i); (a), (e) and (j); (a), (e) and (k); (a), (e) and (l); (a), (f) and (g); (a), (f) and (h); (a), (f) and (i); (a), (f) and (j); (a), (f) and (k); (a), (f) and (l); (a), (g) and (h); (a), (g) and (i); (a), (g) and (j); (a), (g) and (k); (a), (g) and (l); (a), (h) and (i); (a), (h) and (j); (a), (h) and (k); (a), (h) and (l); (a), (i) and (j); (a), (i) and (k); (a), (i) and (l); (a), (j) and (k); (a), (j) and (l); (a), (k) and (l); (b), (c) and (d); (b), (c) and (e); (b), (c) and (f); (b), (c) and (g); (b), (c) and (h); (b), (c) and (i); (b), (c) and (j); (b), (c) and (k); (b), (c) and (l); (b), (d) and (e); (b), (d) and (f); (b), (d) and (g); (b), (d) and (h); (b), (d) and (i); (b), (d) and (j); (b), (d) and (k); (b), (d) and (l); (b), (e) and (f); (b), (e) and (g); (b), (e) and (h); (b), (e) and (i); (b), (e) and (j); (b), (e) and (k); (b), (e) and (l); (b), (f) and (g); (b), (f) and (h); (b), (f) and (i); (b), (f) and (j); (b), (f) and (k); (b), (f) and (l); (b), (g) and (h); (b), (g) and (i); (b), (g) and (j); (b), (g) and (k); (b), (g) and (l); (b), (h) and (i); (b), (h) and (j); (b), (h) and (k); (b), (h) and (l); (b), (i) and (j); (b), (i) and (k); (b), (i) and (l); (b), (j) and (k); (b), (j) and (l); (b), (k) and (l); (c), (d) and (e); (c), (d) and (f); (c), (d) and (g); (c), (d) and (h); (c), (d) and (i); (c), (d) and (j); (c), (d) and (k); (c), (d) and (l); (c), (e) and (f); (c), (e) and (g); (c), (e) and (h); (c), (e) and (i); (c), (e) and (j); (c), (e) and (k); (c), (e) and (l); (c), (f) and (g); (c), (f) and (h); (c), (f) and (i); (c), (f) and (j); (c), (f) and (k); (c), (f) and (l); (c), (g) and (h); (c), (g) and (i); (c), (g) and (j); (c), (g) and (k); (c), (g) and (l); (c), (h) and (i); (c), (h) and (j); (c), (h) and (k); (c), (h) and (l); (c), (i) and (j); (c), (i) and (k); (c), (i) and (l); (c), (j) and (k); (c), (j) and (l); (c), (k) and (l); (d), (e) and (f); (d), (e) and (g); (d), (e) and (h); (d), (e) and (i); (d), (e) and (j); (d), (e) and (k); (d), (e) and (l); (d), (f) and (g); (d), (f) and (h); (d), (f) and (i); (d), (f) and (j); (d), (f) and (k); (d), (f) and (l); (d), (g) and (h); (d), (g) and (i); (d), (g) and (j); (d), (g) and (k); (d), (g) and (l); (d), (h) and (i); (d), (h) and (j); (d), (h) and (k); (d), (h) and (l); (d), (i) and (j); (d), (i) and (k); (d), (i) and (l); (d), (j) and (k); (d), (j) and (l); (d), (k) and (l); (e), (f) and (g); (e), (f) and (h); (e), (f) and (i); (e), (f) and (j); (e), (f) and (k); (e), (f) and (l); (e), (g) and (h); (e), (g) and (i); (e), (g) and (j); (e), (g) and (k); (e), (g) and (l); (e), (h) and (i); (e), (h) and (j); (e), (h) and (k); (e), (h) and (l); (e), (i) and (j); (e), (i) and (k); (e), (i) and (l); (e), (j) and (k); (e), (j) and (l); (e), (k) and (l); (f), (g) and (h); (f), (g) and (i); (f), (g) and (j); (f), (g) and (k); (f), (g) and (l); (f), (h) and (i); (f), (h) and (j); (f), (h) and (k); (f), (h) and (l); (f), (i) and (j); (f), (i) and (k); (f), (i) and (l); (f), (j) and (k); (f), (j) and (l); (f), (k) and (l); (g), (h) and (i); (g), (h) and (j); (g), (h) and (k); (g), (h) and (l); (g), (i) and (j); (g), (i) and (k); (g), (i) and (l); (g), (j) and (k); (g), (j) and (l); (g), (k) and (l); (h), (i) and (j); (h), (i) and (k); (h), (i) and (l); (h), (j) and (k); (h), (j) and (l); (h), (k) and (l); (i), (j) and (k); (i), (j) and (l); (i), (k) and (l); or (j), (k) and (l).
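The pairs and triples recited above are the unordered combinations of the labelled pores, so they can be regenerated rather than read from the list. The following sketch does this with the Python standard library; the function name is hypothetical and the labels simply reuse the (a)-(i) and (a)-(l) lettering above.

```python
from itertools import combinations
from string import ascii_lowercase


def labelled_sets(n_pores: int, k: int) -> list[tuple[str, ...]]:
    """All unordered selections of k labels from the first n_pores letters."""
    labels = [f"({letter})" for letter in ascii_lowercase[:n_pores]]
    return list(combinations(labels, k))


pairs_of_9 = labelled_sets(9, 2)      # pairs drawn from (a)-(i)
triples_of_9 = labelled_sets(9, 3)    # triples drawn from (a)-(i)
pairs_of_12 = labelled_sets(12, 2)    # pairs drawn from (a)-(l)
triples_of_12 = labelled_sets(12, 3)  # triples drawn from (a)-(l)

print(len(pairs_of_9), len(triples_of_9), len(pairs_of_12), len(triples_of_12))
# prints: 36 84 66 220
```

`itertools.combinations` yields the selections in the same lexicographic order as the text, from (a) and (b) through to (j), (k) and (l).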
The constriction transplants in Example 2 are formed from two different CsgG pores. The at least two different pores preferably comprise or the two different pores preferably are the combination of two different CsgG pores in any of those transplants. Using the (a)-(i) definitions above, the at least two different pores preferably comprise or the two different pores preferably are (a) and (b), (a) and (c), (a) and (d), (a) and (e), (a) and (f), (a) and (g), (a) and (h), or (a) and (i). In these embodiments, the chimeric pore monomers preferably comprise a cap region and transmembrane beta barrel region from (a) and a constriction region from one of (b)-(i).
Using the (a)-(i) definitions above, the at least two different pores preferably comprise or the two different pores preferably are (a) and (c), (a) and (d), (a) and (h), (a) and (i), or (a) and (b). In these embodiments, the chimeric pore monomers preferably comprise a cap region and a transmembrane beta barrel region from (a) and a constriction region from (c), (d), (h), (i) or (b).
Using the (a)-(i) definitions above, the at least two different pores preferably comprise or the two different pores preferably are (a) and (f), or (a) and (g). In these embodiments, the chimeric pore monomers preferably comprise a cap region and a transmembrane beta barrel region from (a) and a constriction region from (f) or (g).
Using the (a)-(i) definitions above, the at least two different pores preferably comprise or the two different pores preferably are (a) and (c), (a) and (h), or (a) and (i). In these embodiments, the chimeric pore monomers preferably comprise a cap region from (a) and a constriction region from (c), (h) or (i).
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide.
The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72, to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78, to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide. Any one of SEQ ID NOs: 65-72 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71 or 72. Any one of SEQ ID NOs: 65-72 and 76-78 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71, 72, 76, 77 or 78. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72 and 65 without the signal peptide. Any one of SEQ ID NOs: 66, 67, 71, 72 and 65 is of course equivalent to SEQ ID NO: 66, 67, 71, 72 or 65. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 or the sequence shown in any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 without the signal peptide. Any one of SEQ ID NOs: 66, 67, 71, 72, 65, 76, 77 and 78 is of course equivalent to SEQ ID NO: 66, 67, 71, 72, 65, 76, 77 or 78. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or to the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or to the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 69 and 70 or the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 69 and 70 or the sequence shown in any one of SEQ ID NOs: 69 and 70 without the signal peptide. Any one of SEQ ID NOs: 69 and 70 is of course equivalent to SEQ ID NO: 69 or 70. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 or the sequence shown in any one of SEQ ID NOs: 66, 71 and 72 without the signal peptide. Any one of SEQ ID NOs: 66, 71 and 72 is of course equivalent to SEQ ID NO: 66, 71 or 72. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The chimeric pore monomer preferably comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide. The chimeric pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide. The chimeric pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 or the sequence shown in any one of SEQ ID NOs: 66, 71, 72 and 76 without the signal peptide. Any one of SEQ ID NOs: 66, 71, 72 and 76 is of course equivalent to SEQ ID NO: 66, 71, 72 or 76. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
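The identity thresholds above can be pictured with a simple calculation over two sequences that have already been aligned. The sketch below is an illustration only: it assumes the alignment is supplied with "-" marking gaps, and it divides by the full alignment length, which is one of several conventions in use. It is not the method for determining homology and/or identity described elsewhere in the specification, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be the same length; '-' marks a gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences (hypothetical): one mismatch in ten aligned positions.
toy_identity = percent_identity("ACDEFGHIKL", "ACDQFGHIKL")
```

With these toy inputs the identity is 90%, so the pair would satisfy an "at least about 90%" criterion but not "at least about 95%".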
In any of the embodiments above where the chimeric pore monomer comprises a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 or any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide, the chimeric pore monomer preferably does not comprise the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore. In any of these embodiments, the chimeric pore monomer preferably does not comprise the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions or three regions, are from or derived from. In any of these embodiments, the chimeric pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any of the different pores used to create the chimeric pore monomer or any of the different pores which the two or more regions, such as two regions or three regions, are from or derived from.
When defining the chimeric pore monomers of the invention, the at least two different pores do not comprise alpha-hemolysin and gamma-hemolysin. The at least two different pores preferably do not comprise two sodium channels.
The chimeric pore monomer preferably does not comprise the sequence shown in SEQ ID NO: 70.
The invention also provides a chimeric pore monomer, comprising a fusion protein comprising the following three structural regions: (1) a cap region, (2) a constriction region, and (3) a transmembrane region, wherein the three structural regions are derived from at least two different CsgG pores. The invention also provides a chimeric pore monomer, comprising a fusion protein comprising the following five structural regions: (1) a cap region, (2) a landing platform region, (3) a C-terminal region, (4) a constriction region, and (5) a transmembrane region, wherein the five structural regions are derived from at least two different CsgG pores. Any of the embodiments discussed above equally apply to these chimeric pore monomers of the invention. In particular, the three or five structural regions may be derived from two, three or five different CsgG pores.
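The requirement that the structural regions be derived from at least two different CsgG pores can be expressed as a small data-structure check. The sketch below is a hypothetical bookkeeping model, not a description of how the fusion protein is built: the class and function names are invented for illustration, and only the region names and the pore names (taken from Table 4 as referenced above) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A structural region and the pore it is derived from (illustrative model)."""
    name: str
    source_pore: str


def source_pores(regions: list[Region]) -> set[str]:
    """The distinct pores contributing at least one region."""
    return {region.source_pore for region in regions}


def is_chimeric(regions: list[Region]) -> bool:
    """True if the regions are derived from at least two different pores."""
    return len(source_pores(regions)) >= 2


# A constriction transplant: cap and transmembrane regions from one pore,
# constriction region from another.
transplant = [
    Region("cap", "CsgG_Eco_WT"),
    Region("constriction", "CsgG_Vdi_WT"),
    Region("transmembrane", "CsgG_Eco_WT"),
]
single_source = [Region(region.name, "CsgG_Eco_WT") for region in transplant]
```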
Any of the chimeric pore monomers of the invention may further comprise one or more CsgF peptides. Such peptides and their association with pore monomers are described in WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
Stabilisation and other mutations
One or more of the at least two regions or one or more of the two regions in the chimeric pore monomer preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. All the at least two regions or both of the two regions preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. The cap region (or scaffold) and/or the constriction region preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. One or more of, such as all of, the cap region, the constriction region, and the transmembrane region preferably comprise one or more modifications which stabilise the chimeric pore and/or improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise a target analyte. The CsgG pore monomer of the invention preferably comprises one or more modifications which stabilise a CsgG pore and/or improve the ability of a CsgG pore formed from the CsgG pore monomer to characterise a target analyte.
The one or more modifications are preferably (a) one or more deletions, (b) one or more substitutions, (c) one or more additions or (d) any combination of (a) to (c), including (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The one or more modifications are preferably one or more substitutions. The one or more modifications are preferably one or more of the substitutions to PorARc discussed above, especially in relation to Table 2. The one or more modifications are preferably one or more of the substitutions to CsgG discussed above, especially in relation to SEQ ID NO: 56. The one or more substitutions discussed above preferably remove negative charge. Such one or more substitutions improve the ability of a chimeric pore formed from the chimeric pore monomer to characterise negatively charged target analytes, such as polynucleotides. The one or more modifications may be any of those discussed below with reference to SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55. The one or more modifications are preferably one or more of the modifications to SEQ ID NO: 56 discussed above.
Suitable one or more modifications are known in the art. For instance, modifications to CsgG which improve its ability to characterise target analytes are disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety). Suitable modifications are also described in United Kingdom Patent Application No. 2118939.4 filed on 23 December 2021 (incorporated by reference herein in its entirety).
PorARc pore monomers
The invention provides various PorARc pore monomers from different species which are collectively known as PorARc pore monomers of the invention.
SEQ ID NO: 2 is the wild-type PorARc pore from Mycolicibacterium phlei (PorARc_Mph) with the substitutions D91N/D92N. This is pore 2 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). The invention provides a PorARc_Mph pore monomer which comprises or consists of a sequence having at least about 88% homology or identity to the sequence shown in SEQ ID NO: 2. The PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology to the sequence shown in SEQ ID NO: 2. The PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to the sequence shown in SEQ ID NO: 2. The PorARc_Mph pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 2. Homology and/or identity is typically measured over the entire length of the pore monomer. Methods for determining homology and/or identity are described above.
One or more negatively charged amino acids in SEQ ID NO: 2, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids, such as R, H or K, and/or one or more uncharged amino acids, such as S, T, N or Q. This removes negative charge from the sequence. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. The one or more negatively charged amino acids are preferably in the constriction region of SEQ ID NO: 2. This is identified in Table 2.
The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 2 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 2. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 2 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 2.
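The effect of replacing negatively charged residues with uncharged ones can be sketched on a toy sequence. The code below swaps D for N and E for Q inside a chosen window, mirroring the D-to-N and E-to-Q substitutions named in this section, and reports a crude net charge. The sequence, the window positions, the function names and the simple charge count (K and R as +1, D and E as -1, H ignored) are all assumptions made for illustration and do not reproduce SEQ ID NO: 2 or the constriction region identified in Table 2.

```python
# Uncharged replacements for the two negatively charged residues.
NEUTRAL_REPLACEMENT = {"D": "N", "E": "Q"}


def neutralise_region(seq: str, start: int, end: int) -> str:
    """Replace D/E with N/Q at 1-based positions start..end (inclusive)."""
    return "".join(
        NEUTRAL_REPLACEMENT.get(residue, residue) if start <= position <= end else residue
        for position, residue in enumerate(seq, start=1)
    )


def crude_net_charge(seq: str) -> int:
    """Count K/R as +1 and D/E as -1; a deliberately simple approximation."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")


# Toy sequence (hypothetical) with a two-residue acidic window at positions 3-4.
toy_seq = "GADDKEL"
toy_neutralised = neutralise_region(toy_seq, 3, 4)
charge_before = crude_net_charge(toy_seq)
charge_after = crude_net_charge(toy_neutralised)
```

Here the two aspartates inside the window become asparagines while the glutamate outside the window is left untouched, so the crude net charge rises from -2 to 0.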
SEQ ID NO: 50 shows the amino acid sequence of PorARc pore from Mycobacterium sp. (PorARc_Msp) with the substitutions D91N/D92N. This is pore 3 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). SEQ ID NO: 51 shows the amino acid sequence of PorARc pore from Mycolicibacterium rhodesiae (PorARc_Mrh) with the substitutions D91N/D92N. This is pore 8 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). SEQ ID NO: 52 shows the amino acid sequence of PorARc pore from Mycolicibacterium elephantis (PorARc_Mel) with the substitutions D91N/E101Q. This is pore 17 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). SEQ ID NO: 53 shows the amino acid sequence of PorARc pore from Mycolicibacterium cosmeticum (PorARc_Mco) with the substitutions D91N/D92N. This is pore 19 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). SEQ ID NO: 54 shows the amino acid sequence of PorARc pore from unclassified Rhodococcus (WP_056447532.1; PorARc_Rsp) with the substitutions E89Q/D91N/D93N/D100N. This is pore 25 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column). SEQ ID NO: 55 shows the amino acid sequence of PorARc pore from Rhodococcus sp. PSBB049 (WP_206003768.1; PorARc_Rsp) with the substitutions D90N/D95N/D103N. This is pore 27 in Table 2 above with the signal peptide removed, a methionine (M) at the N terminus (i.e., at position 1) and its preferred substitutions (seventh column).
The invention provides a PorARc pore monomer which comprises or consists of a sequence having at least about 40% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. A pore monomer based on SEQ ID NO: 50 may be referred to as a PorARc_Msp pore monomer. A pore monomer based on SEQ ID NO: 51 may be referred to as a PorARc_Mrh pore monomer. A pore monomer based on SEQ ID NO: 52 may be referred to as a PorARc_Mel pore monomer. A pore monomer based on SEQ ID NO: 53 may be referred to as a PorARc_Mco pore monomer. A pore monomer based on SEQ ID NO: 54 may be referred to as a PorARc_Rsp pore monomer. A pore monomer based on SEQ ID NO: 55 may be referred to as a PorARc_Rsp pore monomer.
The PorARc pore monomer of the invention preferably comprises or consists of a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. The PorARc pore monomer preferably comprises or consists of a sequence having at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. The PorARc pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. Homology and/or identity is typically measured over the entire length of the pore monomer. Methods for determining homology and/or identity are described above. The invention provides a PorARc pore monomer which comprises or consists of a sequence having at least about 20% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. A pore monomer based on SEQ ID NO: 50 may be referred to as a PorARc_Msp pore monomer. A pore monomer based on SEQ ID NO: 51 may be referred to as a PorARc_Mrh pore monomer. A pore monomer based on SEQ ID NO: 52 may be referred to as a PorARc_Mel pore monomer. A pore monomer based on SEQ ID NO: 53 may be referred to as a PorARc_Mco pore monomer. A pore monomer based on SEQ ID NO: 54 may be referred to as a PorARc_Rsp pore monomer. A pore monomer based on SEQ ID NO: 55 may be referred to as a PorARc_Rsp pore monomer.
The PorARc pore monomer of the invention preferably comprises or consists of a sequence having at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. The PorARc pore monomer preferably comprises or consists of a sequence having at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. The PorARc pore monomer of the invention preferably comprises or consists of a sequence having 100% homology or identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55. Homology and/or identity is typically measured over the entire length of the pore monomer. Methods for determining homology and/or identity are described above.
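The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool and that identity is counted as matching columns divided by alignment columns; this is only one of several conventions and is not necessarily the method described elsewhere in this document. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a precomputed pairwise alignment.

    Assumption: identity = identical columns / alignment columns,
    ignoring columns where both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (not a real PorARc sequence): 6 identical columns out of 8.
query = "MKTAYD-A"
reference = "MKSAYDNA"
identity = percent_identity(query, reference)
```

With this toy pair, `meets_threshold(query, reference, 40.0)` is true and `meets_threshold(query, reference, 95.0)` is false.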
One or more negatively charged amino acids in SEQ ID NO: 50, 51, 52, 53, 54 or 55, such as one or more E and/or D, are preferably deleted or substituted with one or more different amino acids, such as one or more positively charged amino acids, such as R, H or K, and/or one or more uncharged amino acids, such as S, T, N or Q. This removes negative charge from the sequence. Any number of negatively charged amino acids may be deleted or substituted, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. The one or more negatively charged amino acids are preferably in the constriction region of SEQ ID NO: 50, 51, 52, 53, 54 or 55. These are identified in Table 2.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 50 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 50. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 50 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 50.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 51 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 51. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 51 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 51.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 52 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 101 in SEQ ID NO: 52. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 52 preferably comprises Q at the position(s) corresponding to position 91 and/or position 101 in SEQ ID NO: 52.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 53 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or both of the positions corresponding to positions 91 and 92 in SEQ ID NO: 53. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 53 preferably comprises N at the position(s) corresponding to position 91 and/or position 92 in SEQ ID NO: 53.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 54 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, such as all of, the positions corresponding to positions 89, 91, 93 and 100 in SEQ ID NO: 54. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 54 preferably comprises N at the positions corresponding to positions 89, 91, 93 and 100 in SEQ ID NO: 54.
The sequence having homology or identity to the sequence shown in SEQ ID NO: 55 preferably comprises a positively charged amino acid, such as R, H or K, or an uncharged amino acid, such as S, T, N or Q, at one or more of, such as all of, the positions corresponding to positions 90, 95 and 103 in SEQ ID NO: 55. The sequence that is homologous or identical to the sequence shown in SEQ ID NO: 55 preferably comprises N at the positions corresponding to positions 90, 95 and 103 in SEQ ID NO: 55.
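The substitution notation used above (for example D91N/D92N, with 1-based positions) can be applied programmatically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the toy sequence is a placeholder, not any of SEQ ID NOs: 50 to 55.

```python
import re


def apply_substitutions(sequence: str, spec: str) -> str:
    """Apply substitutions written like 'D91N/D92N' (1-based positions).

    Raises ValueError if an item cannot be parsed, is out of range, or
    names an original residue that does not match the sequence.
    """
    residues = list(sequence)
    for item in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", item)
        if match is None:
            raise ValueError(f"cannot parse substitution {item!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(f"expected {old} at position {position}")
        residues[position - 1] = new
    return "".join(residues)


def count_acidic(sequence: str) -> int:
    """Number of negatively charged residues (D or E)."""
    return sequence.count("D") + sequence.count("E")


# Hypothetical toy sequence with aspartates at positions 91 and 92.
TOY = "A" * 90 + "DD" + "A" * 8
mutant = apply_substitutions(TOY, "D91N/D92N")
```

Here `count_acidic(TOY)` is 2 and `count_acidic(mutant)` is 0, reflecting the removal of negative charge at the two substituted positions.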
The sequence of the PorARc pore monomer (i.e., a sequence having homology or identity to SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55) may comprise any of the substitutions present in other PorARc pores including any of the pores listed in Table 2. The PorARc pore monomer typically retains the ability to form the same 3D structure as the wild-type PorARc pore monomer, such as the same 3D structure as a PorARc pore monomer having the sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55. The PorARc pore monomer is capable of forming a pore. Methods for measuring this are discussed above with reference to the chimeric pore monomers of the invention.
Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 2, 50, 51,
52, 53, 54 or 55, for example up to 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more substitutions. Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume. The amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace. Alternatively, the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid. Conservative amino acid changes are well-known in the art.
The PorARc pore monomer may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition.
One or more amino acid residues of the amino acid sequence of SEQ ID NO: 2, 50, 51, 52,
53, 54 or 55 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
The PorARc pore monomer may comprise a fragment of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55. Such fragments retain pore forming activity. Fragments may be at least about 50, at least about 100, at least about 150, or at least about 200 amino acids in length. Such fragments may be used to produce the pores of the invention. A fragment preferably comprises the transmembrane beta barrel region of the relevant sequence, namely residues W73 to T84 and G113 to N122 of SEQ ID NO: 2, residues W73 to T84 and G114 to N123 of SEQ ID NO: 50, residues W73 to T84 and G127 to N136 of SEQ ID NO: 51, residues W73 to T84 and G111 to N120 of SEQ ID NO: 52, residues W73 to T84 and G109 to N118 of SEQ ID NO: 53, residues A73 to S84 and Q104 to P113 of SEQ ID NO: 54, residues G74 to S85 and Q107 to P116 of SEQ ID NO: 55, or a variant thereof as discussed above.
One or more amino acids may be alternatively or additionally added to the polypeptides described above. An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequence of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 or polypeptide variant or fragment thereof. The extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids. A carrier protein may be fused to an amino acid sequence according to the invention.
The sequence of the PorARc pore monomer has an amino acid sequence which varies from that of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 and which retains its ability to form a pore. The sequence typically contains the regions of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 that are responsible for pore formation. The pore forming ability of PorARc, which contains a β-barrel, is provided by β-strands in the transmembrane beta barrel region of each monomer. A variant of SEQ ID NO: 2 typically comprises the region in the relevant sequence that forms β-strands, namely residues W73 to T84 and G113 to N122 of SEQ ID NO: 2, residues W73 to T84 and G114 to N123 of SEQ ID NO: 50, residues W73 to T84 and G127 to N136 of SEQ ID NO: 51, residues W73 to T84 and G111 to N120 of SEQ ID NO: 52, residues W73 to T84 and G109 to N118 of SEQ ID NO: 53, residues A73 to S84 and Q104 to P113 of SEQ ID NO: 54, residues G74 to S85 and Q107 to P116 of SEQ ID NO: 55, or a variant thereof as discussed above. One or more modifications can be made to the region of SEQ ID NO: 2, 50, 51, 52, 53, 54 or 55 that forms β-strands as long as the resulting variant retains its ability to form a pore.
The one or more modifications in the PorARc pore monomer preferably improve the ability of a pore comprising the pore monomer to characterise an analyte.
CsgG pore monomers
The invention also provides various CsgG pore monomers which are collectively known as CsgG pore monomers of the invention.
The invention also provides a CsgG pore monomer comprising a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide. The CsgG pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 or to the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide. The CsgG pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 or the sequence shown in any one of SEQ ID NOs: 65-72 without the signal peptide. Any one of SEQ ID NOs: 65-72 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71 or 72. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The invention also provides a CsgG pore monomer comprising a sequence having at least about 20% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide. The CsgG pore monomer preferably comprises a sequence having at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90% or more preferably at least about 95%, at least about 97%, at least about 98% or at least about 99% homology or identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide. The CsgG pore monomer preferably comprises the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide, i.e., comprises a sequence having 100% identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 or the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78 without the signal peptide. Any one of SEQ ID NOs: 65-72 and 76-78 is of course equivalent to SEQ ID NO: 65, 66, 67, 68, 69, 70, 71, 72, 76, 77 or 78. Homology and/or identity is typically measured over the entire length of the chimeric pore monomer.
The CsgG pore monomer preferably does not comprise the entire sequence of any wild-type pore. The CsgG pore monomer preferably does not comprise a sequence having 100% identity to the entire sequence of any wild-type pore. The CsgG pore monomer preferably does not comprise the entire sequence of any of the CsgG pores identified in Table 4, including SEQ ID NO: 58 or 73.
The invention also provides a CsgG pore monomer comprising or consisting of a sequence having at least about 68% homology or identity to the sequence shown in SEQ ID NO: 58. The CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 69%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology or identity to the sequence shown in SEQ ID NO: 58. The CsgG pore monomer may comprise or consist of SEQ ID NO: 58. Homology and/or identity is typically measured over the entire length of SEQ ID NO: 58. The invention also provides a CsgG pore monomer comprising or consisting of a sequence having at least about 79% homology or identity to the sequence shown in SEQ ID NO: 73. The CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology or identity to the sequence shown in SEQ ID NO: 73. The CsgG pore monomer may comprise or consist of SEQ ID NO: 73. Homology and/or identity is typically measured over the entire length of SEQ ID NO: 73.
The CsgG pore monomer of the invention preferably comprises or consists of a sequence having at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% homology to the sequence shown in SEQ ID NO: 58 or 73.
The CsgG pore monomer of the invention may comprise one or more of the particular modifications or substitutions disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety). The CsgG pore monomer may comprise one or more of the modifications or substitutions discussed above with reference to SEQ ID NO: 56. The one or more modifications in the CsgG pore monomer preferably improve the ability of a pore comprising the pore monomer to characterise an analyte.
The CsgG pore monomer typically retains the ability to form the same 3D structure as the wild-type CsgG pore monomer, such as the same 3D structure as a CsgG pore monomer having the sequence of SEQ ID NO: 58 or 73. The CsgG pore monomer is capable of forming a pore. Methods for measuring this are discussed above with reference to the chimeric pore monomers of the invention.
Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 58 or 73, for example up to 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more substitutions. Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume. The amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace. Alternatively, the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid. Conservative amino acid changes are well-known in the art.
The CsgG pore monomer may be modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more non-native amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Any number and combination of such introductions may be made. The introduction is preferably by substitution or addition.
One or more amino acid residues of the amino acid sequence of SEQ ID NO: 58 or 73 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20 or 30 or more residues may be deleted.
The CsgG pore monomer may comprise a fragment of SEQ ID NO: 58 or 73. Such fragments retain pore forming activity. Fragments may be at least about 50, at least about 100, at least about 150, or at least about 200 amino acids in length. Such fragments may be used to produce the pores of the invention. A fragment preferably comprises the transmembrane beta barrel region of the relevant sequence (as identified above).
One or more amino acids may be alternatively or additionally added to the polypeptides described above. An extension may be provided at the amino terminal or carboxy terminal of the amino acid sequence of SEQ ID NO: 58 or 73 or polypeptide variant or fragment thereof. The extension may be quite short, for example from 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example up to 50 or 100 amino acids. A carrier protein may be fused to an amino acid sequence according to the invention.
Constructs
The invention also provides a construct comprising two or more covalently attached chimeric pore monomers of the invention or two or more covalently attached PorARc pore monomers of the invention. In this and subsequent sections, the chimeric pore monomers of the invention and the PorARc pore monomers of the invention are collectively referred to as "pore monomers of the invention". In this and subsequent sections, the chimeric pore monomers of the invention, the PorARc pore monomers of the invention and the CsgG pore monomers of the invention are collectively referred to as "pore monomers of the invention". In this and subsequent sections, the chimeric construct of the invention and the PorARc construct of the invention are collectively referred to as "constructs of the invention".
The invention also provides a construct comprising two or more covalently attached CsgG pore monomers of the invention. In this and subsequent sections, the chimeric construct of the invention, the PorARc construct of the invention and the CsgG construct of the invention are collectively referred to as "constructs of the invention". The construct of the invention may comprise 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more or 10 or more pore monomers of the invention. The construct may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9 or at least 10 pore monomers of the invention. The two or more pore monomers of the invention may be the same or different. The two or more pore monomers of the invention may differ based on their sequences. The two or more pore monomers of the invention are preferably the same (i.e., identical).
The construct of the invention preferably comprises two pore monomers of the invention. The two pore monomers of the invention may be the same or different. The two pore monomers of the invention are preferably the same (i.e., identical).
The pore monomers of the invention may be genetically fused. The pore monomers of the invention may be attached via a linker, or chemically fused, for instance via a chemical crosslinker. Methods for covalently attaching pore monomers of the invention are disclosed in WO 2017/149316, WO 2017/149317, and WO 2017/149318 (incorporated herein by reference in their entirety). The pore monomers of the invention may be genetically fused using a linker.
The linker is preferably an amino acid sequence and/or a chemical crosslinker. Suitable amino acid linkers, such as peptide linkers, are known in the art. The length, flexibility and hydrophilicity of the amino acid or peptide linker are typically designed to facilitate the formation of pores from the constructs. Preferred flexible peptide linkers are stretches of 2 to 20, such as 4, 6, 8, 10 or 16, serine and/or glycine amino acids. More preferred flexible linkers include (SG)1, (SG)2, (SG)3, (SG)4, (SG)5, (SG)8, (SG)10, (SG)15 or (SG)20 wherein S is serine and G is glycine. Preferred rigid linkers are stretches of 2 to 30, such as 4, 6, 8, 16 or 24, proline amino acids. More preferred rigid linkers include (P)12 wherein P is proline.
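As an illustration of the peptide linkers described above, the following minimal sketch builds (SG)n and (P)n linker sequences and joins monomer sequences into a single genetically fused polypeptide. The function names and the short placeholder monomer sequences are hypothetical and are not sequences of the invention.

```python
from typing import Sequence


def flexible_linker(repeats: int) -> str:
    """Return an (SG)n linker, e.g. repeats=3 gives 'SGSGSG'."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "SG" * repeats


def rigid_linker(length: int) -> str:
    """Return a (P)n polyproline linker of the given length."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "P" * length


def fuse(monomers: Sequence[str], linker: str) -> str:
    """Join monomer sequences end to end with the linker between each pair."""
    if len(monomers) < 2:
        raise ValueError("a construct needs two or more monomers")
    return linker.join(monomers)


# Placeholder monomers (hypothetical, for illustration only).
dimer = fuse(["MAAA", "MCCC"], flexible_linker(2))
```

In this toy case `dimer` is the two placeholder monomers separated by an (SG)2 linker; `rigid_linker(12)` gives the (P)12 linker mentioned above.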
Suitable chemical crosslinkers are well-known in the art. Suitable chemical crosslinkers include, but are not limited to, those including the following functional groups: maleimide, active esters, succinimide, azide, alkyne (such as dibenzocyclooctynol (DIBO or DBCO), difluoro cycloalkynes and linear alkynes), phosphine (such as those used in traceless and non-traceless Staudinger ligations), haloacetyl (such as iodoacetamide), phosgene type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazines, disulfides, vinyl sulfones, aziridines and photoreactive reagents (such as aryl azides, diaziridines).
Reactions between amino acids and functional groups may be spontaneous, such as cysteine/maleimide, or may require external reagents, such as Cu(I) for linking azide and linear alkynes. Linkers can comprise any molecule that stretches across the distance required. Linkers can vary in length from one carbon (phosgene-type linkers) to many Angstroms. Examples of linker molecules include, but are not limited to, polyethyleneglycols (PEGs), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons, and polyamides. These linkers may be inert or reactive; in particular, they may be chemically cleavable at a defined position, or may themselves be modified with a fluorophore or ligand. The linker is preferably resistant to reducing agents, such as dithiothreitol (DTT), following the covalent attachment of the monomers.
Preferred crosslinkers include 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate and 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octananoate, di-maleimide PEG 1k, di-maleimide PEG 3.4k, di-maleimide PEG 5k, di-maleimide PEG 10k, bis(maleimido)ethane (BMOE), bis-maleimidohexane (BMH), 1,4-bis-maleimidobutane (BMB), 1,4-bis-maleimidyl-2,3-dihydroxybutane (BMDB), BM[PEO]2 (1,8-bis-maleimidodiethyleneglycol), BM[PEO]3 (1,11-bis-maleimidotriethylene glycol), tris[2-maleimidoethyl]amine (TMEA), DTME dithiobismaleimidoethane, bis-maleimide PEG3, bis-maleimide PEG11, DBCO-maleimide, DBCO-PEG4-maleimide, DBCO-PEG4-NH2, DBCO-PEG4-NHS, DBCO-NHS, DBCO-PEG-DBCO 2.8kDa, DBCO-PEG-DBCO 4.0kDa, DBCO-15 atoms-DBCO, DBCO-26 atoms-DBCO, DBCO-35 atoms-DBCO, DBCO-PEG4-S-S-PEG3-biotin, DBCO-S-S-PEG3-biotin, DBCO-S-S-PEG11-biotin, (succinimidyl 3-(2-pyridyldithio)propionate (SPDP) and maleimide-PEG(2kDa)-maleimide (ALPHA, OMEGA-BIS-MALEIMIDO POLYETHYLENE GLYCOL)). The most preferred crosslinker is maleimide-propyl-SRDFWRS-(1,2-diaminoethane)-propyl-maleimide.
The linker is preferably resistant to dithiothreitol (DTT). Suitable linkers include, but are not limited to, iodoacetamide-based and maleimide-based linkers.
The pore monomers of the invention may be connected using two or more linkers each comprising a hybridizable region and a group capable of forming a covalent bond. The hybridizable regions in the linkers hybridize and link the pore monomers of the invention. The linked pore monomers of the invention are then coupled via the formation of covalent bonds between the groups. Any of the specific linkers disclosed in WO 2010/086602 (incorporated herein by reference in its entirety) may be used in accordance with the invention.
The linkers may be labelled. Suitable labels include, but are not limited to, fluorescent molecules (such as Cy3 or AlexaFluor®555), radioisotopes, e.g., 125I, 35S, 32P, enzymes, antibodies, antigens, polynucleotides, and ligands such as biotin. Such labels allow the amount of linker to be quantified. The label could also be a cleavable purification tag, such as biotin, or a specific sequence to show up in an identification method, such as a peptide that is not present in the protein itself, but that is released by trypsin digestion.
A preferred method of connecting the pore monomers of the invention is via cysteine linkage. This can be mediated by a bi-functional chemical crosslinker or by an amino acid linker with a terminal presented cysteine residue.
Another preferred method of attachment is via 4-azidophenylalanine or Faz linkage. This can be mediated by a bi-functional chemical linker or by a polypeptide linker with a terminal presented 4-azidophenylalanine or Faz residue. Additional suitable linkers are discussed in more detail below.
Pore of the invention
The invention provides a chimeric pore comprising at least one chimeric pore monomer of the invention or at least one construct of the invention comprising two or more covalently linked chimeric pore monomers of the invention. The invention also provides a CsgG pore comprising at least one CsgG pore monomer of the invention or at least one construct comprising two or more covalently linked CsgG pore monomers of the invention. The invention also provides a PorARc pore comprising at least one PorARc pore monomer of the invention or at least one construct comprising two or more covalently linked PorARc pore monomers of the invention. In this and subsequent sections, the chimeric pore of the invention and the PorARc pore of the invention are collectively referred to as "pores of the invention". In this and subsequent sections, the chimeric pore of the invention, the PorARc pore of the invention and the CsgG pore of the invention are collectively referred to as "pores of the invention".
The term "pore" refers to an oligomeric pore comprising at least one pore monomer of the invention (including, e.g., one or more pore monomers of the invention such as two or more pore monomers of the invention, three or more pore monomers of the invention etc.). The pore of the invention has the features of a biological pore, i.e., it has a typical protein structure and defines a channel. When the pore is provided in an environment having membrane components, membranes, cells, or an insulating layer, the pore will insert in the membrane or the insulating layer and form a "transmembrane pore".
The chimeric pores of the invention typically display improved target analyte characterisation compared with the at least two different pores or the two different pores from which they are derived. The PorARc pores of the invention typically display improved target analyte characterisation compared with other pores used in nanopore sensing.
As explained above, the pores of the invention display one or more of (a) increased signal-to-noise ratio (SNR), (b) an increased current range, (c) decreased noise and (d) an increased normalised median absolute deviation (nMAD). The median absolute deviation (MAD) is the median of the absolute values between the current at a given event and the overall mean current of the squiggle (i.e., current trace during analyte translocation). Normalised MAD (nMAD) is this value normalised by the overall range of the squiggle current. The pore of the invention may display (a); (b); (c); (d); (a) and (b); (a) and (c); (a) and (d); (b) and (c); (b) and (d); (c) and (d); (a), (b) and (c); (a), (b) and (d); (a), (c) and (d); (b), (c) and (d); or (a), (b), (c) and (d). For the chimeric pores of the invention, one or more of (a) to (d) are preferably compared with the at least two different pores or the two different pores from which the chimeric pores of the invention are constructed. For the PorARc pores of the invention, one or more of (a) to (d) are preferably compared with other pores used in nanopore sensing, including any of the pores described herein.
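The MAD and nMAD definitions above can be written out as a short calculation. This is a minimal sketch that assumes each event has already been reduced to a single current value and that the "overall range" is the maximum minus the minimum of those values; both are assumptions, as are the function names and the example currents.

```python
from statistics import mean, median
from typing import Sequence


def mad(event_currents: Sequence[float]) -> float:
    """Median of |event current - overall mean current| across events."""
    centre = mean(event_currents)
    return median([abs(current - centre) for current in event_currents])


def nmad(event_currents: Sequence[float]) -> float:
    """MAD normalised by the overall current range (assumed max - min)."""
    current_range = max(event_currents) - min(event_currents)
    if current_range == 0:
        raise ValueError("current range is zero; nMAD is undefined")
    return mad(event_currents) / current_range


# Hypothetical per-event currents (arbitrary units) for one squiggle.
squiggle = [10.0, 12.0, 14.0, 20.0, 24.0]
squiggle_mad = mad(squiggle)
squiggle_nmad = nmad(squiggle)
```

For these example values the mean is 16, the absolute deviations are 6, 4, 2, 4 and 8, so the MAD is 4 and the nMAD is 4 divided by the range of 14.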
The pore of the invention typically comprises at least one constriction. Constrictions and their functions are defined above. The pore of the invention may comprise two or more constrictions, such as three or more, four or more, five or more or six or more constrictions. The additional constriction(s) expand(s) the contact surface with passing analytes and can improve analyte detection and characterization. Pores comprising the pore monomers of the invention can improve the characterisation of analytes, such as polynucleotides, providing a more discriminating direct relationship between the observed current as the polynucleotide moves through the pore. In particular, by having two stacked constrictions spaced at a defined distance, the pore may facilitate characterization of polynucleotides that contain at least one homopolymeric stretch, e.g., several consecutive copies of the same nucleotide that otherwise exceed the interaction length of the single constriction. Additionally, by having two stacked constrictions at a defined distance, small molecule analytes including organic or inorganic drugs and pollutants passing through the pore will consecutively pass the two constrictions. The chemical nature of either constriction can be independently modified, each giving unique interaction properties with the analyte, thus providing additional discriminating power during analyte detection.
The pore of the invention is preferably a homooligomer comprising 6 to 20, such as 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20, pore monomers of the invention. The pore of the invention is preferably a homooligomer comprising 6 to 10, such as 6, 7, 8, 9 or 10, pore monomers of the invention. The pore monomers of the invention are typically identical. The pore preferably comprises 8 or 9 identical pore monomers of the invention. The pore monomers of the invention may be any of those discussed above.
The invention provides pores comprising at least one construct of the invention. The pore typically comprises at least 1, 2, 3, 4 or 5 constructs of the invention. The pore comprises sufficient monomers to form a pore. For instance, an octameric pore may comprise (a) four constructs each comprising two pore monomers of the invention, (b) two constructs each comprising four pore monomers of the invention, (c) one construct comprising two pore monomers of the invention and six pore monomers of the invention that do not form part of a construct, (d) three constructs each comprising two pore monomers of the invention and two pore monomers of the invention that do not form part of a construct, or (e) combinations thereof. The same and additional possibilities exist, for instance, for a nonameric pore. Other combinations of constructs and monomers can be envisaged by the skilled person. One or more constructs of the invention may be used to form a pore for characterising, such as sequencing, polynucleotides. The pore preferably comprises 4 constructs of the invention each of which comprises two chimeric pore monomers or pore monomers. The constructs are typically the same (i.e., identical).
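The octameric examples above can be enumerated with a short Python sketch. It assumes, for simplicity, that all constructs in a given pore contain the same number of monomers; the function name is hypothetical and pores mixing constructs of different sizes are not covered.

```python
def pore_compositions(total_monomers, construct_size):
    """List (number of constructs, number of free monomers) pairs that
    together supply exactly total_monomers, using at least one construct."""
    options = []
    max_constructs = total_monomers // construct_size
    for n_constructs in range(1, max_constructs + 1):
        free_monomers = total_monomers - n_constructs * construct_size
        options.append((n_constructs, free_monomers))
    return options


# Octameric pore built from constructs of two monomers each.
print(pore_compositions(8, 2))
# Octameric pore built from constructs of four monomers each.
print(pore_compositions(8, 4))
```

With constructs of two monomers, the pairs (1, 6), (3, 2) and (4, 0) correspond to examples (c), (d) and (a) above; with constructs of four monomers, the pair (2, 0) corresponds to example (b).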
The pore of the invention is preferably a homooligomer comprising 1-5, such as 1, 2, 3, 4 or 5, constructs of the invention. The constructs are typically the same (i.e., identical). The pore preferably comprises 4 identical constructs of the invention each of which comprises two pore monomers of the invention. The constructs may be any of those discussed above.
The pore monomers of the invention in the pore are preferably all approximately the same length or are the same length. The barrels of the pore monomers of the invention in the pore are preferably approximately the same length or are the same length. Length may be measured in number of amino acids and/or units of length.
Any of the pores of the invention may further comprise one or more CsgF peptides. Such peptides and their association with pores are described in WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
The pore of the invention may be isolated, substantially isolated, purified or substantially purified. A pore of the invention is isolated or purified if it is completely free of any other components, such as lipids or other pores. A pore is substantially isolated if it is mixed with carriers or diluents which will not interfere with its intended use. For instance, a pore is substantially isolated or substantially purified if it is present in a form that comprises less than 10%, less than 5%, less than 2% or less than 1% of other components, such as block copolymers, lipids, or other pores. Alternatively, a pore of the invention may be present in a membrane. Suitable membranes are discussed below.
A pore of the invention may be present as an individual or single pore. Alternatively, a pore of the invention may be present in a homologous or heterologous population of two or more pores. Other formats involving the pores of the invention are discussed in more detail below.
Multimeric pores
The invention also provides a chimeric pore multimer comprising two or more pores, wherein at least one of the pores is a chimeric pore of the invention. The invention also provides a PorARc pore multimer comprising two or more pores, wherein at least one of the pores is a PorARc pore of the invention. In this and subsequent sections, the chimeric pore multimer of the invention and the PorARc pore multimer of the invention are collectively referred to as "pore multimers of the invention". The invention also provides a CsgG pore multimer comprising two or more pores, wherein at least one of the pores is a CsgG pore of the invention. In this and subsequent sections, the chimeric pore multimer of the invention, the PorARc pore multimer of the invention and the CsgG pore multimer of the invention are collectively referred to as "pore multimers of the invention".
The multimer of the invention may comprise any number of pores, such as 2, 3, 4, 5, 6, 7 or 8 or more pores. Any number of the pores in the multimer, including all of them, may be a pore of the invention.
The pore multimer of the invention may be a double pore comprising a first pore of the invention and a second pore. The double pore may be orientated in any way. The double pore may be two pores end to end. The double pore may be two pores adjacent to each other (i.e., side to side). The second pore may be a pore of the invention. Both the first pore and the second pore are preferably pores of the invention. In the double pore, the first pore may be attached to the second pore by hydrophobic interactions and/or by one or more disulfide bonds. One or more, such as 2, 3, 4, 5, 6, 8, 9, for example all, of the monomers in the first pore and/or the second pore (complex) may be modified to enhance such interactions. This may be achieved in any suitable way. Particular methods of forming double pores are described in WO 2019/002893 (incorporated by reference herein in its entirety).
The pore multimer of the invention may be isolated, substantially isolated, purified or substantially purified. Such terms are defined above with reference to the pores of the invention.
Membrane embodiments
The invention also provides a pore of the invention or a pore multimer of the invention which is comprised in a membrane. The invention also provides a membrane comprising a pore of the invention or a pore multimer of the invention. These products are directly applicable for use in molecular sensing, such as analyte characterisation and polynucleotide sequencing. Suitable membranes are discussed in more detail below.
Method for making modified proteins
Methods for introducing or substituting non-naturally occurring amino acids in pore monomers are also well known in the art and described in WO 2019/002893 (incorporated by reference herein in its entirety). The proteins may be modified to assist their identification or purification, for example by the addition of a streptavidin tag or by the addition of a signal sequence to promote their secretion from a cell where the monomer does not naturally contain such a sequence. The proteins may also be produced using D-amino acids or a mixture of L-amino acids and D-amino acids. This is conventional in the art for producing such proteins or peptides.
The chimeric pore monomer, the PorARc pore monomer, the chimeric construct, the PorARc construct, the chimeric pore, the PorARc pore, the chimeric pore multimer or the PorARc pore multimer (i.e., any protein of the invention) may be chemically modified. In this section, these proteins of the invention are collectively referred to as "the protein". The protein can be chemically modified in any way and at any site. The protein may be chemically modified by attachment of a molecule to one or more cysteines (cysteine linkage), attachment of a molecule to one or more lysines, attachment of a molecule to one or more non-natural amino acids, enzyme modification of an epitope or modification of a terminus. Suitable methods for carrying out such modifications are well-known in the art. The protein may be chemically modified by the attachment of any molecule, such as a dye or a fluorophore.
The protein may be chemically modified with a molecular adaptor that facilitates the interaction between a pore comprising the monomer and a target nucleotide or target polynucleotide sequence. Suitable adaptors, including a cyclic molecule, a cyclodextrin, a species that is capable of hybridization, a DNA binder or intercalator, a peptide or peptide analogue, a synthetic polymer, an aromatic planar molecule, a small positively charged molecule or a small molecule capable of hydrogen-bonding, are described in WO 2019/002893 (incorporated by reference herein in its entirety). The molecular adaptor may be attached using any of the methods and linkers discussed above.
The protein may be attached to a polynucleotide binding protein. This forms a modular sequencing system that may be used in the methods of sequencing of the invention. Polynucleotide binding proteins are discussed below. The protein can be covalently attached to the monomer using any method known in the art. The monomer and protein may be chemically fused or genetically fused. Genetic fusion of a monomer to a polynucleotide binding protein is discussed in WO 2010/004265 (incorporated herein by reference in its entirety). The polynucleotide binding protein may be attached via cysteine linkage using any method described above.
The polynucleotide binding protein may be attached directly to the protein or via one or more linkers. The molecule may be attached to the pore monomer using the hybridization linkers described in WO 2010/086602 (incorporated herein by reference in its entirety). Alternatively, peptide linkers may be used. Suitable peptide linkers are discussed above.
Any of the proteins may be modified to assist their identification or purification, for example by the addition of histidine residues (a his tag), aspartic acid residues (an asp tag), a streptavidin tag, a flag tag, a SUMO tag, a GST tag or a MBP tag, or by the addition of a signal sequence to promote their secretion from a cell where the polypeptide does not naturally contain such a sequence. An alternative to introducing a genetic tag is to chemically react a tag onto a native or engineered position on the protein. An example of this would be to react a gel-shift reagent to a cysteine engineered on the outside of the protein. This has been demonstrated as a method for separating hemolysin heterooligomers (Chem Biol. 1997 Jul;4(7):497-505).
Any of the proteins may be labelled with a revealing label. The revealing label may be any suitable label which allows the protein to be detected. Suitable labels include, but are not limited to, fluorescent molecules, radioisotopes, e.g., 125I, 35S, enzymes, antibodies, antigens, polynucleotides, and ligands such as biotin.
The protein may also contain other non-specific modifications as long as they do not interfere with the function of the protein. A number of non-specific side chain modifications are known in the art and may be made to the side chains of the protein(s). Such modifications include, for example, reductive alkylation of amino acids by reaction with an aldehyde followed by reduction with NaBH4, amidation with methylacetimidate or acylation with acetic anhydride.
Any of the proteins can be produced using standard methods known in the art. Polynucleotide sequences encoding a protein may be derived and replicated using standard methods in the art. Polynucleotide sequences encoding a protein may be expressed in a bacterial host cell using standard techniques in the art. The protein may be produced in a cell by in situ expression of the polypeptide from a recombinant expression vector. The expression vector optionally carries an inducible promoter to control the expression of the polypeptide. These methods are described in Sambrook, J. and Russell, D. (2001). Molecular Cloning: A Laboratory Manual, 3rd Edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Proteins may be produced in large scale following purification by any protein liquid chromatography system from protein producing organisms or after recombinant expression. Typical protein liquid chromatography systems include FPLC, AKTA systems, the Bio-Cad system, the Bio-Rad BioLogic system, and the Gilson HPLC system.
Method for producing chimeric pore monomers
The invention provides methods for producing a chimeric pore monomer of the invention. The method comprises attaching, preferably covalently attaching, the at least two regions from or derived from at least two different pores. The at least two regions may be attached or covalently attached using one or more linkers as described above. In particular, the invention provides a method for producing a chimeric pore monomer of the invention comprising (a) designing a polynucleotide which encodes the chimeric pore monomer of the invention as a genetic fusion and (b) expressing the chimeric pore monomer from the polynucleotide. Methods for designing polynucleotide sequences, constructing polynucleotides, and expressing them are well known in the art.
Any of the embodiments discussed above with reference to the chimeric pore monomers of the invention equally apply to these methods.
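As a minimal sketch of design step (a) only, a polynucleotide encoding a genetic fusion of two regions joined by a peptide linker could be derived in Python as shown below. The region sequences, the linker, the function names and the choice of one codon per amino acid are all hypothetical placeholders; a real design would use the actual region sequences and a codon usage suited to the expression host.

```python
# One codon per amino acid from the standard genetic code. The particular
# codon chosen for each residue is an illustrative assumption.
CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
CODON_TO_AA = {codon: aa for aa, codon in CODON.items()}
STOP = "TAA"


def fuse_regions(regions, linker=""):
    """Join region amino acid sequences in order, separated by the linker."""
    return linker.join(regions)


def back_translate(protein):
    """Return a DNA coding sequence for the protein, ending in a stop codon."""
    return "".join(CODON[aa] for aa in protein) + STOP


def translate(dna):
    """Translate a coding sequence produced by back_translate."""
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon == STOP:
            break
        protein.append(CODON_TO_AA[codon])
    return "".join(protein)


# Hypothetical placeholder regions from two different pores and a linker.
region_from_pore_1 = "MKTLV"
region_from_pore_2 = "GSWYE"
fusion = fuse_regions([region_from_pore_1, region_from_pore_2], linker="GGS")
coding_sequence = back_translate(fusion)
print(fusion, coding_sequence)
```

Translating the designed sequence back to protein is a simple consistency check that the polynucleotide encodes the intended fusion.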
Method of producing pores
The invention also provides a method for producing a pore of the invention or a pore multimer of the invention. The pore of the invention may be a chimeric pore of the invention or a PorARc pore of the invention. The pore multimer of the invention may be a chimeric pore multimer of the invention or a PorARc pore multimer of the invention.
The method may involve expressing the pore in a host cell. In particular, the method may comprise expressing at least one pore monomer of the invention or at least one construct of the invention and sufficient pore monomers or constructs to form the pore or the pore multimer in a host cell and allowing the pore or pore multimer to form in the host cell. The sufficient pore monomers or constructs are preferably sufficient pore monomers of the invention or sufficient constructs of the invention. In this context, the pore monomer(s) of the invention may be chimeric pore monomer(s) of the invention or PorARc pore monomer(s) of the invention. The construct(s) of the invention may be chimeric construct(s) of the invention or PorARc constructs of the invention. The numbers of pore monomers or constructs needed to form the pores of the invention or pore multimers of the invention are discussed above. Suitable host cells and expression systems are known in the art and are discussed in the Examples.
The method may involve forming the pore in a non-cellular or in vitro context. In particular, the method may comprise contacting at least one pore monomer of the invention or at least one construct of the invention with sufficient pore monomers or constructs in vitro and allowing the formation of the pore or pore multimer. The pore monomer(s) or the construct(s) may be produced separately by in vitro translation and transcription (IVTT) and then incubated with the sufficient pore monomers or constructs. The sufficient pore monomers or constructs are preferably sufficient pore monomers of the invention or sufficient constructs of the invention. In this context, the pore monomer(s) of the invention may be chimeric pore monomer(s) of the invention or PorARc pore monomer(s) of the invention. The construct(s) of the invention may be chimeric construct(s) of the invention or PorARc construct(s) of the invention. The numbers of pore monomers or constructs needed to form the pores of the invention or pore multimers of the invention are discussed above. The method may be conducted in an "in vitro system", which refers to a system comprising at least the necessary components and environment to execute said method, and makes use of biological molecules, organisms, a cell (or part of a cell) outside of their normal naturally occurring environment, permitting a more detailed, more convenient, or more efficient analysis than can be done with whole organisms. An in vitro system may also comprise a suitable buffer composition provided in a test tube, wherein said protein components to form the complex have been added. A person skilled in the art is aware of the options to provide said system.
Some or all of the components of the pore or pore multimer may be tagged to facilitate purification. Purification can also be performed when the components are untagged. Methods known in the art (e.g., ion exchange, gel filtration, hydrophobic interaction column chromatography etc.) can be used alone or in different combinations to purify the components of the pore.
The pore or pore multimer can be made prior to insertion into a membrane or after insertion of the components into a membrane.
Methods for making the pores and complexes of the invention and ways of tagging them are disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
Methods of characterising an analyte
The invention provides a method of determining the presence, absence or one or more characteristics of a target analyte. The method involves contacting the target analyte with a pore or pore multimer such that the target analyte moves with respect to, such as into or through, the pore or pore multimer and taking one or more measurements as the target analyte moves with respect to the pore or pore multimer and thereby determining the presence, absence or one or more characteristics of the target analyte. The target analyte may also be called the template analyte or the analyte of interest.
The pore may be a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores. All of the discussion above in relation to chimeric pores, two or more regions and at least two of the regions being from or derived from at least two different pores equally apply to the method of the invention. For the avoidance of doubt, in the context of the method of the invention, the at least two different pores can comprise alpha-hemolysin and gamma-hemolysin. The chimeric pore is preferably a chimeric pore of the invention.
The pore may be a PorARc pore of the invention.
The pore may be a CsgG pore of the invention.
The pore multimer may comprise two or more pores wherein at least one pore is a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores. All of the discussion above in relation to chimeric pores, two or more regions and at least two of the regions being from or derived from at least two different pores equally applies to the method of the invention. The pore multimer is preferably a chimeric pore multimer of the invention.
The pore multimer may be a PorARc pore multimer of the invention.
The pore multimer may be a CsgG pore multimer of the invention.
The method is for determining the presence, absence or one or more characteristics of a target analyte. The method may be for determining the presence, absence or one or more characteristics of at least one analyte. The method may concern determining the presence, absence or one or more characteristics of two or more analytes. The method may comprise determining the presence, absence or one or more characteristics of any number of analytes, such as 2, 5, 10, 15, 20, 30, 40, 50, 100 or more analytes. Any number of characteristics of the one or more analytes may be determined, such as 1, 2, 3, 4, 5, 10 or more characteristics.
The binding of a molecule in the channel of the pore or pore multimer, or in the vicinity of either opening of the channel will have an effect on the open-channel ion flow through the pore or pore multimer, which is the essence of "molecular sensing". In a similar manner to the nucleic acid sequencing application, variation in the open-channel ion flow can be measured using suitable measurement techniques by the change in electrical current (for example, WO 2000/28312 and D. Stoddart et al., Proc. Natl. Acad. Sci., 2010, 106, 7702-7 or WO 2009/077734; all incorporated herein by reference in their entirety). The degree of reduction in ion flow, as measured by the reduction in electrical current, is related to the size of the obstruction within, or in the vicinity of, the pore. Binding of a molecule of interest, also referred to as an "analyte", in or near the pore therefore provides a detectable and measurable event, thereby forming the basis of a "biological sensor". Suitable molecules for nanopore sensing include polynucleotides/nucleic acids, proteins, peptides, polynucleotide-polypeptide conjugates, polysaccharides, and small molecules (refers here to a low molecular weight (e.g., < 900Da or < 500Da) organic or inorganic compound) such as pharmaceuticals, toxins, cytokines, and pollutants. Detecting the presence of biological molecules finds application in personalised drug development, medicine, diagnostics, life science research, environmental monitoring and in the security and/or the defence industry.
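The relationship described above between analyte binding and reduced ion flow can be illustrated with a simple Python sketch that expresses each current sample as a fractional reduction relative to the open-channel current and groups consecutive reduced samples into events. The threshold, the trace values and the function names are hypothetical; practical event detection would also need to handle noise and filtering.

```python
def blockade_fraction(open_current, measured_current):
    """Fractional reduction in current relative to the open-channel current."""
    return (open_current - measured_current) / open_current


def detect_events(trace, open_current, threshold=0.2):
    """Return (start, end, mean_blockade) for each run of consecutive samples
    whose blockade fraction exceeds the threshold; end is exclusive."""
    events = []
    start = None
    for i, current in enumerate(trace):
        blocked = blockade_fraction(open_current, current) > threshold
        if blocked and start is None:
            start = i
        elif not blocked and start is not None:
            events.append(_summarise(trace, start, i, open_current))
            start = None
    if start is not None:
        events.append(_summarise(trace, start, len(trace), open_current))
    return events


def _summarise(trace, start, end, open_current):
    """Summarise one event by the mean current across its samples."""
    mean_current = sum(trace[start:end]) / (end - start)
    return (start, end, blockade_fraction(open_current, mean_current))


# Hypothetical current trace in pA with an open-channel current of 100 pA.
trace = [100.0, 99.0, 60.0, 62.0, 58.0, 100.0, 101.0, 30.0, 100.0]
events = detect_events(trace, open_current=100.0)
print(events)
```

In this illustrative trace the deeper second event would, under the relationship described above, correspond to a larger obstruction than the first.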
The pore or pore multimer may serve as a molecular or biological sensor. The target analyte molecule that is to be detected may bind to either face of the channel, or within the lumen of the channel itself. The position of binding may be determined by the size of the molecule to be sensed.
The target analyte preferably comprises or is a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a polypeptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a polynucleotide-polypeptide conjugate, a monosaccharide, an oligosaccharide, a polysaccharide, a dye, a bleach, a pharmaceutical, a diagnostic agent, a recreational drug, an explosive, a toxic compound, or an environmental pollutant. The target analyte preferably comprises or is a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a polypeptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a polynucleotide-polypeptide conjugate, a monosaccharide, an oligosaccharide, a polysaccharide, a dye, a bleach, a pharmaceutical, a diagnostic agent, a recreational drug, an explosive, a toxic compound, an environmental pollutant, or a metabolite.
The target analyte may comprise two or more different molecules, such as a peptide and a polypeptide. The target analyte may be a polynucleotide-polypeptide conjugate. The method may concern determining the presence, absence or one or more characteristics of two or more analytes of the same type, such as two or more proteins, two or more nucleotides or two or more pharmaceuticals. Alternatively, the method may concern determining the presence, absence or one or more characteristics of two or more analytes of different types, such as one or more proteins, one or more nucleotides and one or more pharmaceuticals.
The target analyte can be secreted from cells. Alternatively, the target analyte can be an analyte that is present inside cells such that the target analyte must be extracted from the cells before the method can be carried out. The target analyte may be obtained from or extracted from any organism or microorganism. The target analyte may be obtained from a human or animal, e.g., from urine, lymph, saliva, mucus, seminal fluid, or amniotic fluid, or from whole blood, plasma, or serum. The target analyte may be obtained from a plant e.g., a cereal, legume, fruit, or vegetable.
The pore or pore multimer may be modified via recombinant or chemical methods to increase the strength of binding, the position of binding, or the specificity of binding of the molecule to be sensed. Typical modifications include addition of a specific binding moiety complementary to the structure of the molecule to be sensed. Where the target analyte molecule comprises a nucleic acid, this binding moiety may comprise a cyclodextrin or an oligonucleotide; for small molecules this may be a known complementary binding region, for example the antigen binding portion of an antibody or of a non-antibody molecule, including a single chain variable fragment (scFv) region or an antigen recognition domain from a T-cell receptor (TCR); or for proteins, it may be a known ligand of the target protein. In this way the pore or pore multimer may be rendered capable of acting as a molecular sensor for detecting presence in a sample of suitable antigens (including epitopes) that may include cell surface antigens, including receptors, markers of solid tumours or haematologic cancer cells (e.g. lymphoma or leukaemia), viral antigens, bacterial antigens, protozoal antigens, allergens, allergy related molecules, albumin (e.g. human, rodent, or bovine), fluorescent molecules (including fluorescein), blood group antigens, small molecules, drugs, enzymes, catalytic sites of enzymes or enzyme substrates, and transition state analogues of enzyme substrates. As described above, modifications may be achieved using known genetic engineering and recombinant DNA techniques. The positioning of any adaptation would be dependent on the nature of the molecule to be sensed, for example, the size, three-dimensional structure, and its biochemical nature. The choice of adapted structure may make use of computational structural design.
Determination and optimization of protein-protein interactions or protein-small molecule interactions can be investigated using technologies such as a BIAcore® which detects molecular interactions using surface plasmon resonance (BIAcore, Inc., Piscataway, NJ; see also www.biacore.com).
The target analyte preferably comprises or is an amino acid, a peptide, a polypeptide, or protein. The amino acid, peptide, polypeptide, or protein can be naturally occurring or non-naturally occurring. The polypeptide or protein can include within them synthetic or modified amino acids. Several different types of modification to amino acids are known in the art. Suitable amino acids and modifications thereof are discussed above. It is to be understood that the target analyte can be modified by any method available in the art.
The target analyte preferably comprises or is a polynucleotide, such as a nucleic acid. A polynucleotide is defined as a macromolecule comprising two or more nucleotides. Nucleic acids are particularly suitable for nanopore sequencing. The naturally occurring nucleic acid bases in DNA and RNA may be distinguished by their physical size. As a nucleic acid molecule, or individual base, passes through the channel of a nanopore, the size differential between the bases causes a directly correlated reduction in the ion flow through the channel. The variation in ion flow may be recorded. Suitable electrical measurement techniques for recording ion flow variations are discussed above. Through suitable calibration, the characteristic reduction in ion flow can be used to identify the particular nucleotide and associated base traversing the channel in real-time. In typical nanopore nucleic acid sequencing, the open-channel ion flow is reduced as the individual nucleotides of the nucleic acid sequence of interest sequentially pass through the channel of the nanopore due to the partial blockage of the channel by the nucleotide. It is this reduction in ion flow that is measured using the suitable recording techniques described above. The reduction in ion flow may be calibrated to the reduction in measured ion flow for known nucleotides through the channel resulting in a means for determining which nucleotide is passing through the channel, and therefore, when done sequentially, a way of determining the nucleotide sequence of the nucleic acid passing through the nanopore. For the accurate determination of individual nucleotides, it has typically been required for the reduction in ion flow through the channel to be directly correlated to the size of the individual nucleotide passing through the constriction. It will be appreciated that sequencing may be performed upon an intact nucleic acid polymer that is 'threaded' through the pore via the action of an associated polymerase, for example.
Alternatively, sequences may be determined by passage of nucleotide triphosphate bases that have been sequentially removed from a target nucleic acid in proximity to the pore (see for example WO 2014/187924 incorporated herein by reference in its entirety).
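The calibration approach described above, in which the measured reduction in ion flow is compared with reductions recorded for known nucleotides, can be sketched in Python as a nearest-level assignment. This is a deliberately simplified illustration that assumes one characteristic reduction per nucleotide; all numerical values and function names are hypothetical.

```python
def calibrate(known_measurements):
    """Map each known nucleotide to the mean of its measured reductions."""
    return {
        nucleotide: sum(values) / len(values)
        for nucleotide, values in known_measurements.items()
    }


def identify(reduction, calibration):
    """Return the nucleotide whose calibrated reduction is closest."""
    return min(calibration, key=lambda nt: abs(calibration[nt] - reduction))


def call_sequence(reductions, calibration):
    """Identify nucleotides sequentially to give a nucleotide sequence."""
    return "".join(identify(r, calibration) for r in reductions)


# Hypothetical fractional reductions in ion flow for known nucleotides.
known = {
    "A": [0.30, 0.32],
    "C": [0.40, 0.42],
    "G": [0.50, 0.52],
    "T": [0.60, 0.62],
}
calibration = calibrate(known)

# Hypothetical reductions measured as a strand passes through the channel.
measured = [0.31, 0.59, 0.43, 0.50]
print(call_sequence(measured, calibration))
```

Each measured reduction is assigned independently here, which mirrors the sequential identification described in the text.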
The polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides can be naturally occurring or artificial. One or more nucleotides in the polynucleotide can be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For instance, the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanomas. One or more nucleotides in the polynucleotide may be modified, for instance with a label or a tag, for which suitable examples are known by a skilled person. The polynucleotide may comprise one or more spacers. A nucleotide typically contains a nucleobase, a sugar and at least one phosphate group. The nucleobase and sugar form a nucleoside. The nucleobase is typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines and more specifically adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C). The sugar is typically a pentose sugar. Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The sugar is preferably a deoxyribose. The polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC). The nucleotide is typically a ribonucleotide or deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate, or triphosphate. The nucleotide may comprise more than three phosphates, such as 4 or 5 phosphates. Phosphates may be attached on the 5' or 3' side of a nucleotide. The nucleotides in the polynucleotide may be attached to each other in any manner. The nucleotides are typically attached by their sugar and phosphate groups as in nucleic acids. The nucleotides may be connected via their nucleobases as in pyrimidine dimers. The polynucleotide may be single stranded or double stranded.
At least a portion of the polynucleotide is preferably double stranded. The polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). In particular, said method using a polynucleotide as an analyte alternatively comprises determining one or more characteristics selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide and (v) whether or not the polynucleotide is modified.
The polynucleotide can be any length (i). For example, the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400 or at least 500 nucleotides or nucleotide pairs in length. The polynucleotide can be 1000 or more nucleotides or nucleotide pairs, 5000 or more nucleotides or nucleotide pairs in length or 100000 or more nucleotides or nucleotide pairs in length. Any number of polynucleotides can be investigated. For instance, the method may concern characterising 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 100 or more polynucleotides. If two or more polynucleotides are characterised, they may be different polynucleotides or two instances of the same polynucleotide. The polynucleotide can be naturally occurring or artificial. For instance, the method may be used to verify the sequence of a manufactured oligonucleotide. The method is typically carried out in vitro.
Nucleotides can have any identity (ii), and include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), 5-methylcytidine monophosphate, 5-hydroxymethylcytidine monophosphate, cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP), deoxycytidine monophosphate (dCMP) and deoxymethylcytidine monophosphate. The nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP. A nucleotide may be abasic (i.e., lack a nucleobase). A nucleotide may also lack a nucleobase and a sugar (i.e., is a C3 spacer). The sequence of the nucleotides (iii) is determined by the consecutive identity of the nucleotides attached to each other throughout the polynucleotide strand, in the 5' to 3' direction of the strand.
The movement of the polynucleotide with respect to the pore or pore multimer, such as through the pore or pore multimer, is preferably controlled using a polynucleotide binding protein. Suitable proteins are discussed in more detail below. The invention provides a method for determining the presence, absence or one or more characteristics of a target polynucleotide, comprising the steps of:
(i) contacting the target analyte with (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore of the invention, or (d) a PorARc pore multimer of the invention and a polynucleotide binding protein, such that the polynucleotide binding protein controls the movement of the target analyte with respect to, such as through, the pore or the pore multimer; and
(ii) taking one or more measurements as the polynucleotide moves with respect to, such as through, the pore or the pore multimer and thereby determining the presence, absence or one or more characteristics of the polynucleotide.
The chimeric pore in (a) is preferably a chimeric pore of the invention. The chimeric pore multimer in (b) is preferably a chimeric pore multimer of the invention.
The target analyte preferably comprises a polypeptide. Any suitable polypeptide can be characterised. The polypeptide may be an unmodified protein or a portion thereof, or a naturally occurring polypeptide or a portion thereof. The target polypeptide may be secreted from cells. Alternatively, the target polypeptide can be produced inside cells such that it must be extracted from cells for characterisation. The polypeptide may comprise the products of cellular expression of a plasmid, e.g., a plasmid used in cloning of proteins in accordance with the methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Press, Plainsview, New York (2012); and Ausubel et al., Current Protocols in Molecular Biology (Supplement 114), John Wiley & Sons, New York (2016).
The polypeptide can be provided as an impure mixture of one or more polypeptides and one or more impurities. Impurities may comprise truncated forms of the target polypeptide which are distinct from the "target polypeptides" for characterisation. For example, the target polypeptide may be a full-length protein and impurities may comprise fractions of the protein. Impurities may also comprise proteins other than the target protein, e.g., which may be co-purified from a cell culture or obtained from a sample.
A polypeptide may comprise any combination of any amino acids, amino acid analogs and modified amino acids (i.e., amino acid derivatives). Amino acids (and derivatives, analogs etc) in the polypeptide can be distinguished by their physical size and charge. The amino acids/derivatives/analogs can be naturally occurring or artificial. The polypeptide may comprise any naturally occurring amino acid.
The polypeptide may be modified. The polypeptide may be modified for detection using the method of the invention. The method may be for characterising modifications in the target polypeptide. One or more of the amino acids/derivatives/analogs in the polypeptide may be modified. One or more of the amino acids/derivatives/analogs in the polypeptide may be post-translationally modified. As such, the method of the invention can be used to detect the presence, absence, number or positions of post-translational modifications in a polypeptide. The method can be used to characterise the extent to which a polypeptide has been post-translationally modified.
Any one or more post-translational modifications may be present in the polypeptide. Typical post-translational modifications include modification with a hydrophobic group, modification with a cofactor, addition of a chemical group, glycation (the non-enzymatic attachment of a sugar), biotinylation and pegylation. Post-translational modifications can also be non-natural, such that they are chemical modifications made in the laboratory for biotechnological or biomedical purposes. This can allow the levels of the laboratory-made peptide, polypeptide or protein to be monitored in contrast to its natural counterparts.
Examples of post-translational modification with a hydrophobic group include myristoylation, attachment of myristate, a C14 saturated acid; palmitoylation, attachment of palmitate, a C16 saturated acid; isoprenylation or prenylation, the attachment of an isoprenoid group; farnesylation, the attachment of a farnesol group; geranylgeranylation, the attachment of a geranylgeraniol group; and glypiation, glycosylphosphatidylinositol (GPI) anchor formation via an amide bond.
Examples of post-translational modification with a cofactor include lipoylation, attachment of a lipoate (C8) functional group; flavination, attachment of a flavin moiety (e.g. flavin mononucleotide (FMN) or flavin adenine dinucleotide (FAD)); attachment of heme C, for instance via a thioether bond with cysteine; phosphopantetheinylation, the attachment of a 4'-phosphopantetheinyl group; and retinylidene Schiff base formation.
Examples of post-translational modification by addition of a chemical group include acylation, e.g. O-acylation (esters), N-acylation (amides) or S-acylation (thioesters); acetylation, the attachment of an acetyl group for instance to the N-terminus or to lysine; formylation; alkylation, the addition of an alkyl group, such as methyl or ethyl; methylation, the addition of a methyl group for instance to lysine or arginine; amidation; butyrylation; gamma-carboxylation; glycosylation, the enzymatic attachment of a glycosyl group for instance to arginine, asparagine, cysteine, hydroxylysine, serine, threonine, tyrosine or tryptophan; polysialylation, the attachment of polysialic acid; malonylation; hydroxylation; iodination; bromination; citrullination; nucleotide addition, the attachment of any nucleotide such as any of those discussed above; ADP ribosylation; oxidation; phosphorylation, the attachment of a phosphate group for instance to serine, threonine or tyrosine (O-linked) or histidine (N-linked); adenylylation, the attachment of an adenylyl moiety for instance to tyrosine (O-linked) or to histidine or lysine (N-linked); propionylation; pyroglutamate formation; S-glutathionylation; SUMOylation; S-nitrosylation; succinylation, the attachment of a succinyl group for instance to lysine; selenoylation, the incorporation of selenium; and ubiquitination, the addition of ubiquitin subunits (N-linked).
The polypeptide may be labelled with a molecular label. A molecular label may be a modification to the polypeptide which promotes the detection of the polypeptide in the method of the invention. For example, the label may be a modification to the polypeptide which alters the signal obtained as the conjugate is characterised. For example, the label may interfere with a flux of ions through the nanopore. In such a manner, the label may improve the sensitivity of the method.
The polypeptide may contain one or more cross-linked sections, e.g., C-C bridges. The polypeptide may not be cross-linked prior to being characterised using the method.
The polypeptide may comprise sulphide-containing amino acids and thus has the potential to form disulphide bonds. Typically, in such embodiments, the polypeptide is reduced using a reagent such as DTT (Dithiothreitol) or TCEP (tris(2-carboxyethyl)phosphine) prior to being characterised using the method.
The polypeptide may be a full-length protein or naturally occurring polypeptide. The protein or naturally occurring polypeptide may be fragmented prior to conjugation to the polynucleotide. The protein or polypeptide may be chemically or enzymatically fragmented. The polypeptides or polypeptide fragments can be conjugated to form a longer target polypeptide.
The polypeptide can be any suitable length. The polypeptide preferably has a length of from about 2 to about 300 peptide units or amino acids. The polypeptide has a length of from about 2 to about 100 peptide units, for example from about 2 to about 50 peptide units, e.g., from about 3 to about 50 peptide units, such as from about 5 to about 25 peptide units, e.g., from about 7 to about 16 peptide units, such as from about 9 to about 12 peptide units. "Peptide unit" is interchangeable with "amino acid".
The one or more characteristics of the polypeptide are preferably selected from (i) the length of the polypeptide, (ii) the identity of the polypeptide, (iii) the sequence of the polypeptide, (iv) the secondary structure of the polypeptide and (v) whether or not the polypeptide is modified. The one or more characteristics may be the sequence of the polypeptide or whether or not the polypeptide is modified, e.g., by one or more post-translational modifications. The one or more characteristics are preferably the sequence of the polypeptide. The polypeptide may be in a relaxed form. The polypeptide may be held in a linearized form. Holding the polypeptide in a linearized form can facilitate the characterisation of the polypeptide on a residue-by-residue basis as "bunching up" of the polypeptide within the nanopore is prevented. The polypeptide can be held in a linearized form using any suitable means. For example, if the polypeptide is charged, the polypeptide can be held in a linearized form by applying a voltage.
If the polypeptide is not charged or is only weakly charged then the charge can be altered or controlled by adjusting the pH. For example, the polypeptide can be held in a linearized form by using high pH to increase the relative negative charge of the polypeptide.
Increasing the negative charge of the polypeptide allows it to be held in a linearized form under, e.g., a positive voltage. Alternatively, the polypeptide can be held in a linearized form by using low pH to increase the relative positive charge of the polypeptide. Increasing the positive charge of the polypeptide allows it to be held in a linearized form under, e.g., a negative voltage. In the disclosed methods a polynucleotide-handling protein is used to control the movement of a polynucleotide with respect to a nanopore. As a polynucleotide is typically negatively charged it is generally most suitable to increase the linearization of the polypeptide by increasing the pH thus making the polypeptide more negatively charged, in common with the polynucleotide. In this way, the conjugate retains an overall negative charge and thus can readily move, e.g., under an applied voltage.
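The effect of pH on polypeptide charge described above can be illustrated with a minimal sketch that estimates the net charge of a linear peptide from the Henderson-Hasselbalch relationship. The pKa values below are approximate textbook figures and the function name `net_charge` is hypothetical; neither is taken from this disclosure, and the estimate ignores local environment effects within a folded or conjugated polypeptide.

```python
# Approximate pKa values for ionisable groups (textbook estimates; assumptions,
# not values taken from this disclosure).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_term": 2.0}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a linear peptide at the given pH."""
    # Every residue plus the two free termini is a candidate ionisable group.
    groups = list(sequence.upper()) + ["N_term", "C_term"]
    charge = 0.0
    for group in groups:
        if group in PKA_POSITIVE:
            # Fraction of the basic group that is protonated (+1).
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            # Fraction of the acidic group that is deprotonated (-1).
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


if __name__ == "__main__":
    for ph in (3.0, 7.5, 11.0):
        print(ph, round(net_charge("GAKDE", ph), 2))
```

Under these assumptions the estimated net charge falls monotonically as the pH rises, which is consistent with using high pH to make a weakly charged polypeptide more negative, in common with the polynucleotide.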
The polypeptide can be held in a linearized form by using suitable denaturing conditions. Suitable denaturing conditions include, for example, the presence of appropriate concentrations of denaturants such as guanidine HCl and/or urea. The concentration of such denaturants to use in the disclosed methods is dependent on the target polypeptide to be characterised in the methods and can be readily selected by those of skill in the art.
The polypeptide can be held in a linearized form by using suitable detergents. Suitable detergents for use in the disclosed methods include SDS (sodium dodecyl sulfate). The polypeptide can be held in a linearized form by carrying out the disclosed methods at an elevated temperature. Increasing the temperature overcomes intra-strand bonding and allows the polypeptide to adopt a linearized form.
The polypeptide can be held in a linearized form by carrying out the method under strong electro-osmotic forces. Such forces can be provided by using asymmetric salt conditions and/or providing suitable charge in the channel of the nanopore. The charge in the channel of a pore can be altered, e.g., by mutagenesis. Altering the charge of a pore is well within the capacity of those skilled in the art. Altering the charge of a pore generates strong electro-osmotic forces from the unbalanced flow of cations and anions through the nanopore when a voltage potential is applied across the nanopore. The polypeptide can be held in a linearized form by passing it through a structure such as an array of nanopillars, through a nanoslit or across a nanogap. The physical constraints of such structures can force the polypeptide to adopt a linearized form.
The target analyte may comprise a polynucleotide and a polypeptide. The target analyte may be a polynucleotide-polypeptide conjugate. The conjugate preferably comprises a polynucleotide conjugated to a polypeptide. One or both of the polynucleotide and polypeptide may be the target and may be characterised in accordance with the invention.
The polypeptide can be conjugated to the polynucleotide at any suitable position. For example, the polypeptide can be conjugated to the polynucleotide at the N-terminus or the C-terminus of the polypeptide. The polypeptide can be conjugated to the polynucleotide via a side chain group of a residue (e.g., an amino acid residue) in the polypeptide. The polypeptide may have a naturally occurring reactive functional group which can be used to facilitate conjugation to the polynucleotide. For example, a cysteine residue can be used to form a disulphide bond to the polynucleotide or to a modified group thereon.
The polypeptide may be modified in order to facilitate its conjugation to the polynucleotide. For example, the polypeptide may be modified by attaching a moiety comprising a reactive functional group for attaching to the polynucleotide. For example, the polypeptide can be extended at the N-terminus or the C-terminus by one or more residues (e.g., amino acid residues) comprising one or more reactive functional groups for reacting with a corresponding reactive functional group on the polynucleotide. For example, the polypeptide can be extended at the N-terminus and/or the C-terminus by one or more cysteine residues. Such residues can be used for attachment to the polynucleotide portion of the conjugate, e.g., by maleimide chemistry (e.g., by reaction of cysteine with an azido-maleimide compound such as azido-[Pol]-maleimide wherein [Pol] is typically a short chain polymer such as PEG, e.g., PEG2, PEG3, or PEG4; followed by coupling to appropriately functionalised polynucleotide e.g., polynucleotide carrying a BCN group for reaction with the azide). Such chemistry is described in Example 2. For avoidance of doubt, when the polypeptide comprises an appropriate naturally occurring residue at the N- and/or C- terminus (e.g., a naturally occurring cysteine residue at the N- and/or C-terminus) then such residue(s) can be used for attachment to the polynucleotide.
A residue in the polypeptide may be modified to facilitate attachment of the polypeptide to the polynucleotide. A residue (e.g., an amino acid residue) in the polypeptide may be chemically modified for attachment to the polynucleotide. A residue (e.g., an amino acid residue) in the polypeptide may be enzymatically modified for attachment to the polynucleotide. The conjugation chemistry between the polynucleotide and the polypeptide in the conjugate is not particularly limited. Any suitable combination of reactive functional groups can be used. Many suitable reactive groups and their chemical targets are known in the art. Some exemplary reactive groups and their corresponding targets include aryl azides which may react with amines, carbodiimides which may react with amines and carboxyl groups, hydrazides which may react with carbohydrates, hydroxymethyl phosphines which may react with amines, imidoesters which may react with amines, isocyanates which may react with hydroxyl groups, carbonyls which may react with hydrazines, maleimides which may react with sulfhydryl groups, NHS-esters which may react with amines, PFP-esters which may react with amines, psoralens which may react with thymine, pyridyl disulfides which may react with sulfhydryl groups, vinyl sulfones which may react with sulfhydryl, amine and hydroxyl groups, vinylsulfonamides, and the like. Other suitable chemistry for conjugating the polypeptide to the polynucleotide includes click chemistry. Many suitable click chemistry reagents are known in the art.
Suitable examples of click chemistry include, but are not limited to, the following: copper(I)-catalyzed azide-alkyne cycloadditions (azide alkyne Huisgen cycloadditions); strain-promoted azide-alkyne cycloadditions, including alkene and azide [3+2] cycloadditions; alkene and tetrazine inverse-demand Diels-Alder reactions; alkene and tetrazole photoclick reactions; the copper-free variant of the 1,3 dipolar cycloaddition reaction, where an azide reacts with an alkyne under strain, for example in a cyclooctyne ring such as in bicyclo[6.1.0]nonyne (BCN); the reaction of an oxygen nucleophile on one linker with an epoxide or aziridine reactive moiety on the other; and the Staudinger ligation, where the alkyne moiety can be replaced by an aryl phosphine, resulting in a specific reaction with the azide to give an amide bond.
Any reactive group may be used to form the conjugate. Some suitable reactive groups include 1,4-Bis[3-(2-pyridyldithio)propionamido]butane; 1,11-bis-maleimidotriethyleneglycol; 3,3'-dithiodipropionic acid di(N-hydroxysuccinimide ester); ethylene glycol-bis(succinic acid N-hydroxysuccinimide ester); 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid disodium salt; Bis[2-(4-azidosalicylamido)ethyl] disulphide; 3-(2-pyridyldithio)propionic acid N-hydroxysuccinimide ester; 4-maleimidobutyric acid N-hydroxysuccinimide ester; Iodoacetic acid N-hydroxysuccinimide ester; S-acetylthioglycolic acid N-hydroxysuccinimide ester; azide-PEG-maleimide; and alkyne-PEG-maleimide. The reactive group may be any of those disclosed in WO 2010/086602, particularly in Table 3 of that application.
The reactive functional group may be comprised in the polynucleotide and the target functional group may be comprised in the polypeptide prior to the conjugation step. The reactive functional group may be comprised in the polypeptide and the target functional group may be comprised in the polynucleotide prior to the conjugation step. The reactive functional group may be attached directly to the polypeptide. The reactive functional group may be attached to the polypeptide via a spacer. Any suitable spacer can be used. Suitable spacers include for example alkyl diamines such as ethyl diamine, etc.
The conjugate may comprise a plurality of polypeptide sections and/or a plurality of polynucleotide sections. For example, the conjugate may comprise a structure of the form ...-P-N-P-N-P-N... wherein P is a polypeptide and N is a polynucleotide. A polynucleotide-handling protein may sequentially control the movement of the N portions of the conjugate with respect to the pore and thus sequentially control the movement of the P sections with respect to the pore, thus allowing the sequential characterisation of the P sections. The plurality of polynucleotides and polypeptides may be conjugated together by the same or different chemistries.
The conjugate may comprise a leader. Any suitable leader may be used. The leader may be a polynucleotide. The leader may be the same sort of polynucleotide as the polynucleotide used in the conjugate, or it may be a different type of polynucleotide. For example, the polynucleotide in the conjugate may be DNA and the leader may be RNA or vice versa.
The leader may be a charged polymer, e.g., a negatively charged polymer. The leader may comprise a polymer such as PEG or a polysaccharide. The leader may be from 10 to 150 monomer units (e.g., ethylene glycol or saccharide units) in length, such as from 20 to 120, e.g., 30 to 100, for example 40 to 80 such as 50 to 70 monomer units (e.g., ethylene glycol or saccharide units) in length. The methods of characterising a target polypeptide of the invention may comprise conjugating a polypeptide to a polynucleotide.
In any of the methods, the one or more characteristics of the target analyte are preferably measured by electrical measurement and/or optical measurement. The electrical measurement is a current measurement, an impedance measurement, a tunnelling measurement, or a field effect transistor (FET) measurement. The method preferably comprises measuring the current flowing through the pore or the pore multimer as the target analyte moves with respect to, such as through, the pore.
General conditions for conducting the methods of the invention are discussed in more detail below with reference to the kits and systems of the invention.
Polynucleotides of the invention
The invention also provides a polynucleotide which encodes a pore of the invention or a construct of the invention, including a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention. The invention also provides a polynucleotide which encodes a CsgG pore of the invention. The polynucleotide may be any of those discussed above. The invention also provides an expression vector comprising a polynucleotide of the invention. The invention also provides a host cell comprising a polynucleotide of the invention or an expression vector of the invention. Suitable vectors and host cells are known in the art.
Kits
The invention also provides kits for characterising a target analyte. In one embodiment, the kit comprises (a) a pore monomer of the invention or a construct of the invention and (b) the components of a membrane. The kit preferably comprises (a) a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention and (b) the components of a membrane. The kit preferably comprises (a) a CsgG pore monomer or a CsgG construct of the invention and (b) the components of a membrane. Suitable membranes and components are discussed below.
In another embodiment, the kit comprises:
(a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore of the invention, or (d) a PorARc pore multimer of the invention; and a polynucleotide binding protein.
In another embodiment, the kit comprises:
(a) a CsgG pore of the invention or a CsgG pore multimer of the invention; and
(b) a polynucleotide binding protein.
Any of the embodiments discussed above with reference to pores and pore multimers equally apply to the kits of the invention.
The kit preferably further comprises the components of a membrane. The kit may comprise components of any type of membrane, such as an amphiphilic layer or a triblock copolymer membrane. Preferred polynucleotide binding proteins are polymerases, exonucleases, helicases, and topoisomerases, such as gyrases. Suitable enzymes include, but are not limited to, exonuclease I from E. coli, exonuclease III enzyme from E. coli, RecJ from T. thermophilus and bacteriophage lambda exonuclease, TatD exonuclease and variants thereof. Three subunits comprising the RecJ sequence from T. thermophilus or a variant thereof interact to form a trimer exonuclease. The polymerase may be PyroPhage® 3173 DNA Polymerase (which is commercially available from Lucigen® Corporation), SD Polymerase (commercially available from Bioron®) or variants thereof. The enzyme may be Phi29 DNA polymerase or a variant thereof. The topoisomerase is preferably a member of any of the Enzyme Classification (EC) groups 5.99.1.2 and 5.99.1.3.
The enzyme is most preferably derived from a helicase, such as Hel308 Mbu, Hel308 Csy, Hel308 Tga, Hel308 Mhu, TraI Eco, XPD Mbu or a variant thereof. Any helicase may be used in the invention. The helicase may be or be derived from a Hel308 helicase, a RecD helicase, such as TraI helicase or a TrwC helicase, a XPD helicase or a Dda helicase. The helicase may be any of the helicases, modified helicases or helicase constructs disclosed in WO 2013/057495; WO 2013/098562; WO 2013/098561; WO 2014/013260; WO 2014/013259; WO 2014/013262 and WO 2015/055981. All of these are incorporated by reference in their entirety.
The kit may further comprise one or more anchors, such as cholesterol, for coupling the target analyte to the membrane. The kit may further comprise one or more polynucleotide adaptors that can be attached to a target polynucleotide to facilitate characterisation of the polynucleotide. The anchor, such as cholesterol, is preferably attached to the polynucleotide adaptor.
The kit may additionally comprise one or more other reagents or instruments which enable any of the embodiments mentioned above to be carried out. Such reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means to obtain a sample from a subject (such as a vessel or an instrument comprising a needle), means to amplify and/or express polynucleotides or voltage or patch clamp apparatus. Reagents may be present in the kit in a dry state such that a fluid sample resuspends the reagents. The kit may also, optionally, comprise instructions to enable the kit to be used in the method of the invention or details regarding for which organism the method may be used. Finally, the kit may also comprise additional components useful in analyte characterization.
Apparatus
The invention also provides an apparatus for characterising target analytes, such as target polynucleotides, in a sample, comprising:
(a) a plurality of chimeric pores comprising two or more regions wherein at least two of the two or more regions are from or derived from at least two different pores, (b) a plurality of chimeric pore multimers comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a plurality of PorARc pores of the invention, or (d) a plurality of PorARc pore multimers of the invention; and a plurality of polynucleotide binding proteins.
The invention also provides an apparatus for characterising target analytes in a sample, comprising:
(a) a plurality of CsgG pores of the invention or a plurality of CsgG pore multimers of the invention; and
(b) a plurality of polynucleotide binding proteins.
The plurality of pores or plurality of pore multimers may be any of those discussed above.
The invention also provides an apparatus comprising a pore of the invention or a pore multimer of the invention inserted into an in vitro membrane. The apparatus preferably comprises a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention inserted into an in vitro membrane. The apparatus preferably comprises a CsgG pore of the invention or a CsgG pore multimer of the invention inserted into an in vitro membrane.
The invention also provides an apparatus produced by a method comprising: (i) obtaining a pore of the invention or a pore multimer of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane. The apparatus is preferably produced by a method comprising: (i) obtaining a chimeric pore monomer of the invention, a PorARc pore monomer of the invention, a chimeric construct of the invention or a PorARc construct of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane. The apparatus is preferably produced by a method comprising: (i) obtaining a CsgG pore of the invention or a CsgG pore multimer of the invention and (ii) contacting the pore or pore multimer with an in vitro membrane such that the pore or pore multimer is inserted in the in vitro membrane. Any of the specific embodiments discussed above are equally applicable to the apparatuses of the invention.
Arrays
The invention also provides an array comprising a plurality of membranes of the invention. Any of the embodiments discussed above with respect to the membranes of the invention equally apply to the array of the invention. The array may be set up to perform any of the methods described below.
In a preferred embodiment, each membrane in the array comprises one pore or pore multimer. Due to the manner in which the array is formed, however, the array may comprise one or more membranes that do not comprise a pore or pore multimer, and/or one or more membranes that comprise two or more pores or pore multimers. The array may comprise from about 2 to about 1000, such as from about 10 to about 800, from about 20 to about 600 or from about 30 to about 500 membranes.
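As a purely illustrative sketch of why an array may contain empty, singly occupied and multiply occupied membranes, the following assumes that pores insert independently and at random, so that the number of pores per membrane follows a Poisson distribution. This statistical model, the function name `occupancy_fractions` and the parameter `mean_pores_per_membrane` are assumptions made for illustration and are not stated in this disclosure.

```python
import math


def occupancy_fractions(mean_pores_per_membrane: float) -> dict:
    """Expected fractions of membranes with 0, 1, or 2+ pores (Poisson assumption)."""
    lam = mean_pores_per_membrane
    p_empty = math.exp(-lam)          # P(k = 0)
    p_single = lam * math.exp(-lam)   # P(k = 1)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,  # P(k >= 2)
    }


if __name__ == "__main__":
    for mean in (0.5, 1.0, 2.0):
        print(mean, occupancy_fractions(mean))
```

Under this assumed model the singly occupied fraction cannot reach 100% for any mean insertion level, which is one way to rationalise the presence of some empty and some multiply occupied membranes in an array.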
System
The invention provides a system comprising (a) a membrane of the invention or an array of the invention, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s).
The pores and membranes may be any as described above and below.
In one embodiment, the system further comprises a first chamber and a second chamber, wherein the first and second chambers are separated by the membrane(s). When used to characterise a target analyte, the system may further comprise a target analyte, wherein the target analyte is transiently located within the continuous channel and wherein one end of the target analyte is located in the first chamber and one end of the target analyte is located in the second chamber. The target analyte is preferably a target polypeptide or a target polynucleotide.
In one embodiment, the system further comprises an electrically conductive solution in contact with the pore(s), electrodes providing a voltage potential across the membrane(s), and a measurement system for measuring the current through the pore(s). The voltage applied across the membranes and pore is preferably from +5 V to -5 V, such as -600 mV to +600 mV or -400 mV to +400 mV. The voltage used is preferably in the range 100 mV to 240 mV and more preferably in the range of 120 mV to 220 mV. It is possible to increase discrimination between different amino acids or nucleotides by a pore by using an increased applied potential. Any suitable electrically conductive solution may be used. For example, the solution may comprise charge carriers, such as metal salts, for example alkali metal salt, halide salts, for example chloride salts, such as alkali metal chloride salt. Charge carriers may include ionic liquids or organic salts, for example tetramethyl ammonium chloride, trimethylphenyl ammonium chloride, phenyltrimethyl ammonium chloride, or 1-ethyl-3-methyl imidazolium chloride. In an exemplary system, salt is present in the aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl), caesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used. KCl, NaCl and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred. The charge carriers may be asymmetric across the membrane. For instance, the type and/or concentration of the charge carriers may be different on each side of the membrane, e.g., in each chamber. The salt concentration may be at saturation. The salt concentration may be 3 M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1 M to 1.4 M. The salt concentration is preferably from 150 mM to 1 M.
The method is preferably carried out using a salt concentration of at least 0.3 M, such as at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M or at least 3.0 M. High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of an amino acid or nucleotide to be identified against the background of normal current fluctuations.
A buffer may be present in the electrically conductive solution. Typically, the buffer is phosphate buffer. Other suitable buffers are HEPES and Tris-HCl buffer. The pH of the electrically conductive solution may be from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5. The pH used is preferably about 7.5.
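The voltage, salt and pH ranges recited above can be captured in a small checker. The following is a minimal sketch only: the class and function names are hypothetical, the thresholds are copied from the preceding paragraphs, and applying the 100-240 mV and 120-220 mV ranges to the magnitude of the potential (so that a run at -180 mV qualifies) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConditions:
    """Hypothetical container for one measurement set-up (illustrative names)."""
    voltage_mv: float   # applied potential in millivolts; the sign gives polarity
    salt_molar: float   # charge-carrier concentration in mol/L
    ph: float           # pH of the electrically conductive solution


def voltage_tier(voltage_mv: float) -> str:
    """Classify the magnitude of the applied potential against the stated ranges."""
    magnitude = abs(voltage_mv)  # assumption: the ranges refer to magnitude
    if 120 <= magnitude <= 220:
        return "more preferred"
    if 100 <= magnitude <= 240:
        return "preferred"
    if magnitude <= 5000:        # the broad +5 V to -5 V range
        return "broad"
    return "outside"


def check(conditions: RunConditions) -> dict:
    """Return a small report comparing the conditions with the stated ranges."""
    return {
        "voltage": voltage_tier(conditions.voltage_mv),
        "salt_at_least_0.3M": conditions.salt_molar >= 0.3,
        "salt_at_most_3M": conditions.salt_molar <= 3.0,
        "ph_in_broad_range": 4.0 <= conditions.ph <= 12.0,
    }


if __name__ == "__main__":
    print(check(RunConditions(voltage_mv=-180, salt_molar=0.3, ph=8.0)))
```

The report keeps each criterion separate rather than returning a single pass/fail, because the text states several nested, non-exclusive preferences.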
The system may be comprised in an apparatus. The apparatus may be any conventional apparatus for analyte analysis, such as an array or a chip. The apparatus is preferably set up to carry out the disclosed method. For example, the apparatus may comprise a chamber comprising an aqueous solution and a barrier that separates the chamber into two sections. The barrier typically has an aperture in which the membrane(s) containing the pore(s) are formed. Alternatively, the barrier forms the membrane in which the pore is present.
The apparatus may also comprise an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore.
The apparatus may be any of those described in WO 2008/102120, WO 2009/077734, WO 2010/122293, WO 2011/067559, or WO 00/28312 (all incorporated herein by reference in their entirety).
Membrane
Any suitable membrane may be used in the system. The membrane is preferably an amphiphilic layer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic molecules may be synthetic or naturally occurring. Non-naturally occurring amphiphiles and amphiphiles which form a monolayer are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450). Block copolymers are polymeric materials in which two or more monomer sub-units are polymerized together to create a single polymer chain. Block copolymers typically have properties that are contributed by each monomer sub-unit. However, a block copolymer may have unique properties that polymers formed from the individual sub-units do not possess. Block copolymers can be engineered such that one of the monomer sub-units is hydrophobic (i.e., lipophilic), whilst the other sub-unit(s) are hydrophilic whilst in aqueous media. In this case, the block copolymer may possess amphiphilic properties and may form a structure that mimics a biological membrane. The block copolymer may be a diblock (consisting of two monomer sub-units) but may also be constructed from more than two monomer sub-units to form more complex arrangements that behave as amphiphiles. The copolymer may be a triblock, tetrablock or pentablock copolymer. The membrane is preferably a triblock copolymer membrane.
The membrane may comprise one of the membranes disclosed in International Application No. WO 2014/064443 or WO 2014/064444.
The amphiphilic molecules may be chemically modified or functionalised to facilitate coupling of the polynucleotide. The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer is typically planar. The amphiphilic layer may be curved. The amphiphilic layer may be supported.
Amphiphilic membranes are typically naturally mobile, essentially acting as two-dimensional fluids with lipid diffusion rates of approximately 10⁻⁸ cm s⁻¹. This means that the pore and coupled polynucleotide can typically move within an amphiphilic membrane.
The membrane may be a lipid bilayer. Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies. For example, lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording. Alternatively, lipid bilayers can be used as biosensors to detect the presence of a range of substances. The lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, a planar lipid bilayer, a supported bilayer, or a liposome. The lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in WO 2008/102121, WO 2009/077734, and WO 2006/100484 (all incorporated herein by reference in their entirety).
The membrane may comprise a solid-state layer. Solid-state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si3N4, Al2O3, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses. The solid-state layer may be formed from graphene. Suitable graphene layers are disclosed in WO 2009/035647 (incorporated herein by reference in its entirety). If the membrane comprises a solid-state layer, the pore is typically present in an amphiphilic membrane or layer contained within the solid-state layer, for instance within a hole, well, gap, channel, trench or slit within the solid-state layer. The skilled person can prepare suitable solid-state/amphiphilic hybrid systems. Suitable systems are disclosed in WO 2009/020682 and WO 2012/005857 (both incorporated herein by reference in their entirety). Any of the amphiphilic membranes or layers discussed above may be used.
The method is typically carried out using (i) an artificial amphiphilic layer comprising a pore, (ii) an isolated, naturally occurring lipid bilayer comprising a pore, or (iii) a cell having a pore inserted therein. The method is typically carried out using an artificial amphiphilic layer, such as a di- or tri-block copolymer layer. The layer may comprise other transmembrane and/or intramembrane proteins as well as other molecules in addition to the pore. Suitable apparatus and conditions are discussed below. The method of the invention is typically carried out in vitro.
SEQUENCE LISTING
SEQ ID NO: 1
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPGLELGISSAGAVSGTLSDVLPQAGVGVTLTPGPGIETVAVASGAASGAHTEI QIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 2
MGLDNELSLVDGQGRTLTIQQWDTFLNGVFPLDRNRLTREWFHSGRAKYIVAGEGAEDFEGTLELGYQI GFPWSLGVGINFSYTTPNILLNNVSLFPAFNPLGSVITPNLFPGVSISADLGNGPGIQEVATFSVDVEGPE GGVAVSNAHGTVTGAAGGVLLRPFARLISSAGDSVTTYGEPWNMNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 3
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNVSLFPAFNPLGSVITPNLFPQAGVGVTLTPGPGIETVAVASGAASGA HTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 4
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNVSIAPGAFNPLGSVITPNLFPQAGVGVTLTPGPGIETVAVASGAASG AHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSH PQFEK
SEQ ID NO: 5
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNFLFNNAQVYAIPGTPSANNGIGPFNSIITPNILPQAGVGVTLTPGPGIETVAV ASGAASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGS GSAWSHPQFEK
SEQ ID NO: 6
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVLFNQAPLNPAGYLNPNNGFITTPNFFPQAGVGVTLTPGPGIETVAVASGA ASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSA WSHPQFEK
SEQ ID NO: 7
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNVSISPTNFNPLAQVITPNLFPQAGVGVTLTPGPGIETVAVASGAASG AHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSH PQFEK
SEQ ID NO: 8
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNQAPPNLNPAAGFLTTPNLFPQAGVGVTLTPGPGIETVAVASGAASGAH TEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQ FEK
SEQ ID NO: 9
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNLLINNASIAPNLTPGSPFGPTVGTGFAPLGSIITPNLFPQAGVGVTLTPGPGI ETVAVASGAASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYF QGSGSGSAWSHPQFEK
SEQ ID NO: 10
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNVNPFPSAWGPLGYQGNGGIITPNLFPQAGVGVTLTPGPGIETVAVA SGAASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGS GSAWSHPQFEK
SEQ ID NO: 11
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVLVYNASIAPPPLGVGITPLSSVVTPNLFPQAGVGVTLTPGPGIETVAVASG AASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSA WSHPQFEK
SEQ ID NO: 12
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVALNAQNVIGTAVGPINFFPPISTPPLLPQAGVGVTLTPGPGIETVAVASGA ASGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSA WSHPQFEK
SEQ ID NO: 13
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNAPLPGFPTAIFGPGFITTPNLFPQAGVGVTLTPGPGIETVAVASGAAS GAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWS HPQFEK
SEQ ID NO: 14
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILFNQLNPVVPINPNGGLGFITTPNLFPQAGVGVTLTPGPGIETVAVASGAA SGAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAW SHPQFEK
SEQ ID NO: 15
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVALNAQNVILAPGINLFPPISTPPLFPQAGVGVTLTPGPGIETVAVASGAAS GAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWS HPQFEK
SEQ ID NO: 16
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVLLNQAPPPGANFTQFGFLTTPNLFPQAGVGVTLTPGPGIETVAVASGAAS GAHTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWS HPQFEK
SEQ ID NO: 17
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNVLLNQFSPTAPLGPAFTTPNLFPQAGVGVTLTPGPGIETVAVASGAASGAHT EIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQF EK
SEQ ID NO: 18
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNSAPFPPVAGQFITTPNLFPQAGVGVTLTPGPGIETVAVASGAASGAHT EIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQF EK
SEQ ID NO: 19
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILISNAPITNPLSSIITPNLLPQAGVGVTLTPGPGIETVAVASGAASGAHTEI QIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 20
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNATPANPLQVITPNLFPQAGVGVTLTPGPGIETVAVASGAASGAHTEI QIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 21
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILINNGNITAPPFGLNSVITPNLFPQAGVGVTLTPGPGIETVAVASGAASGA HTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 22
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPQLQLNAGNIIVTNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIANLH GTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 23
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPQLSVNANGGTLSNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIANL HGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 24
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPQLSVSTNNFTVSNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIANL HGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 25
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPGLSISSNNFTLSNVLPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIANLH GTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 26
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPQLQLNFNGGGTVSNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIAN
LHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 27
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPQLQIGNGNAFTVSNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIAN
LHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 28
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPGLNLSVGNGVAATVTNVLPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQI
ANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 29
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPGLQVNLPNASATVTNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIA
NLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 30
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPQLQVGNGNAFTVSNILPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIAN
LHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 31
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPGLQINFNGGGTINSLIPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIANL
HGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 32
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPGLQVQVGTNTQVNIFNLIPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQI
ANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 33
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ
VGYPASLGGRLTFSYTTPGINSRPPPGQGIQLKNLIPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIA
NLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 34
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPGLNVQVFPNVAVGLTNVIPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQI ANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 35
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPGLQLQLSAPSTLNVFNLIPQAGVGVTLTPGPGIETVAVASGAASGAHTEIQIA NLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 36
MAVDDSNSVVDGGGNTITVSQSDTFINSVFPLDGSPLTREWFHNGRAIVDVTGPDAEDFSGTVTIGYQ VGYPASLGGRLTFSYTTPNILLNNVSLFPAFNPLGSVITPNLFPQAGVGVTLTPGPGIETVAVASGAASGA HTEIQIANLHGTATKIAGNVSVRPYVQVVSSNGDVATTFGQPWRFNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 37
MRVEGGDSVVDGKKRKIAALVSDTRVESVVPIDSNPTSRQWRYSGKSTLRITGGDEDEDGAKVEDWG GTVTFGFLIGYPATLTGSIGVSYSTPNILLNNVSLFPAFNPLGSVITPNLFPTVTGNLAVGFGPGVRAVDAT TGQAKGAGGVLRVKNFHGAVTGVLGQVTIQPYIRVWSNEGDTIMVYGRKHLLGSGGENLYFQGSGSG SAWSHPQFEK
SEQ ID NO: 38
MEVNDQNRIVSNGHEVIVTQEDTFINGVPPIGGSPLSREWFHNGRGIANIVGPEAADFEDSTFQFGYQV AWSAAIDGSVGFNWTSPNILLNNVSLFPAFNPLGSVITPNLFPQLTAGIELTPAPGIEELVVAEGQFDGDY KEVQIANVHGAATGVVGPVSVRPFVRVITENGDNVTTYGQVWTLGSGGENLYFQGSGSGSAWSHPQF EK
SEQ ID NO: 39
MEVNDQNRIVSNGHEVIVTQEDTFINGVPPIGGSPLSRQWFHNGRGIANVVGPQADEFEDSTFQFGYQ VAWPASIDGAIGFSWTSPNILLNNVSLFPAFNPLGSVITPNLFPQVTASVALTPAPGIAELVVAEGQFDGD YKEVQIANVSGAATGVIGPVSVRPFVRVITHNGDNVTTYGQVWTLGSGGENLYFQGSGSGSAWSHPQ FEK
SEQ ID NO: 40
MALDDTKQIETGDNLTIEARQSDTDIRFVAPLDGNPLTREWFHDAIAGFHIDGAGADEFLGKITIGYQIG YPATLSGQIKFSYTSPNILLNNVSLFPAFNPLGSVITPNLFPTVGVEISAGFGPGIKSVDIATVAIAGADGW IKIAGVHGTVTGVVGRTTIRPYVTVTSVRGDTVTTYGKDWKAGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 41
MDVDSTDRVIDGKQRTISAIQADTTIRAVPPLDRNPLTRQWFHDVTAKFTVEGDGAEEFAGTIKIGYLV GFPATVDGRIKFGYSSPNILLNNVSLFPAFNPLGSVITPNLFPTVTGEIETGFGPGVRQIEAVSGRITGAEG SVRLTNSIGTVTGVIGTATVQPYVTVVSDTGDSVTTLGKPYEVNGSGGENLYFQGSGSGSAWSHPQFE K
SEQ ID NO: 42
MVVDNADIVVDAQQRTITAIQADTMIRGLAPLDRNPLTRMWEHDGRAEFTVTGDKADEFKGTIKIGYLV GFPATFGGKIRVSYSTPNILLNNVSLFPAFNPLGSVITPNLFPTVTGEIEVGFGPGVQQVELASGAITGAS GHISLVNFVGTVTGVIGPATIQPYVTVIADSGDTVTTLGRPWDIGSGGENLYFQGSGSGSAWSHPQFEK
SEQ ID NO: 43
MAVDSTNTVVDANGNVITVSLSDTFINSVSPLDGNPLTREWFANGVAGWTVTGPDADDFEGTVAIGYQ VGYPMSLGGSITFGYTTPNILLNNVSLFPAFNPLGSVITPNLFPSVGFEAEIAPGPGIVDATAASGNIQGTE GDAAGPSGTIQIANAHGTATGILGNVRVRPYVSVTSSTGDVAVTYGTPWTFNGSGGENLYFQGSGSGS AWSHPQFEK
SEQ ID NO: 44
MAVDDSNSVVDGGGNTITVSQADTFINSVFPLDGSPLTREWFHNGRAIVEVTGPDAADFEGDITLGYQ FGYPASLGGELTFSYSTPNILLNNVSLFPAFNPLGSVITPNLFPQVGTTVTLEPGPGIEDVEVGTGSASGE RTEIQISNVHGTATNIAGNVSVRPYVKVVSSNGDTAVTYGKPWRFNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 45
MRLDDQLSAVDGAGRTLTVQQWDTVVEGVASLDRNPLTREWFYSGKATYAVSGPDAAGFKGKLELGY QVGFPWALGMNVAFTYTTPNILLNNVSLFPAFNPLGSVITPNLFPGVSITSNLSNGPGNQEVSTFSTDVA GADGVVAVSGAHGTVTGVAGGVVLRPFVRLTSNTGATVTTYTHPWNLDGSGGENLYFQGSGSGSAWS HPQFEK
SEQ ID NO: 46
MELDDEHSLIDAQGRTLKIQQWDTFLNGVAPLDRNRLTREWFYNGRVKYAVEGPGAESFEGSVELGYQ IQFPWSMGVGLNFTYTTPNILLNNVSLFPAFNPLGSVITPNLFPGASINVMLENGPGIEDIVVVAIPVSGPS GGTAVSNGHGTVTGAAGGVTLRPFARLVSSTGDTATTYGEIWNMNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 47
MAVDDQNRIVSTDGYEVVVTQEDTFIQGVPALGGSPFNREFFHNGRGTANLVGADAADAEGTTFQFGY QFAWAGSIDGAIGVTYSTPNILLNNVSLFPAFNPLGSVITPNLFPQAYAELRLTPAPGIEELVVAEGRFDG DFKSVQFSNVHGTASGVLGAVQVRPFVRAITANGDNVTTYGKPWTVGSGGENLYFQGSGSGSAWSHP
QFEK
SEQ ID NO: 48
MGLDNEKSASDRSGHQLTVQQWDTAVHAVPPMDKNRLTREWFYSGKAAYRVTGAGAETFSGTLEFGY QIGIPWTVGVGLNFTYTTPNILLNNVSLFPAFNPLGSVITPNLFPGASISTDIGNSPGVQELVTFAVPVSG HGGAVAVAKAHGTVTGVAGGIQVRPFARLTWPDHASITTYGDLTNVDGSGGENLYFQGSGSGSAWSH PQFEK
SEQ ID NO: 49
MGLDDELSLVDGQGRTLTVQQWDIFLDGVSPLDRNRLTREWFHSGQAKYTVSGPGAEDFEGTLILGYE VGFPWSLGVAIGFSYTTPNILLNNVSLFPAFNPLGSVITPNLFPGVNFSADLGNGPGVQEISTFEVDVSGS AGGVAVSKAHGTVTGAAGGVLLRPFARLVASTGDLVTTYGEPWNMNGSGGENLYFQGSGSGSAWSHP QFEK
SEQ ID NO: 50
MGLDNQQSLVDGKGRTMTIQQWDTFLDGVFPLDRNRLTREWFHSGKAIYAVVGPGASDFAGTLELGY QVGFPWSLGVGINFSYTTPNILLNNVSIAPGAFNPLGSVITPNLFPGVSISSDLGNGPGI
SEQ ID NO: 51
MGLDNELSLVDGQDRTLTVQQWDTFLNGVFPLDRNRLTREWFHSGRAKYIVAGPGADKFEGTLELGYQ IGFPWSLGVGINFSYTTPNLLINNASIAPNLTPGSPFGPTVGTGFAPLGSIITPNLFPGVSISADLGNGPGI QEVATFSVDVAGPQGGVAVSNAHGTVTGAAGGVLLRPFARLISKSGDSVTTYGEPWNMN
SEQ ID NO: 52
MGLDNELSLIDGRDRTLTIQQWDTFLNGVFPLDRNRLTREWFHSGRAKYIVAGPDAEEFEGTLELGYQI GFPWSLGVGINFSYTTPNILLNSAPFPPVAGQFITTPNLFPGVSISADLGNGPGIQEVATFSVDVAGPNG GVAVSNAHGTVTGAAGGVLLRPFARLIASTGDGLTTYGDPWNMN
SEQ ID NO: 53
MGLDNELSLVDGKDRTLTIQQWDTFLNGVFPLDRNRLTREWFHSGKAKYIVSGPGADEFEGTLELGYQI GFPWSLGVGINFSYTTPNILLNNATPANPLQVITPNLFPGASISADLGNGPGIQEVATFSTDVSGANGAV AVSNAHGTVTGAAGGVLLRPFARLIAKEGDSVTTYGEPWNMN
SEQ ID NO: 54
MAVDDSNSVVDGGGNTITVSQADTFINSVFPLDGSPLTREWFHNGRAIVEVTGPDAADFEGDITLGYQ FGYPASLGGELTFSYSTPQLQLNFNGGGTVSNILPQVGTTVTLEPGPGIEDVEVGTGSASGARTEIQISN VHGTATNIAGNVSVRPYVKVVSSNGDTAVTYGKPWRFN
SEQ ID NO: 55
MAIDDQNRIVSTDGYEVVVTQENTVIQGVPALGGSPFNREFFHNGRGTTNLIGAGAADAEGTTFQFGYQ
FAWAGSIDGAIGVTYSTPGLNLSVGNGVAATVTNVLPQAYAELELTPAPGIEELVVAESTFDGDFKSVQL
ANVHGAASGVLGSVQVRPFVRAVTANGDNVTTYGKAWTI
SEQ ID NO: 56
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAOSYKDLTHLPAPTGKIFVSVYNIODETGOFKPYPASNF
STAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEGSI
IGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFIDYQ
RLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 57
MKKGLCLAVILVLSLTGCSNFMDTPDAEEYPTLAPRGAIYKDLINLPLPKGKIMVSVYDFRDOTGOYKDY
PSSTFATAVPQGGTSMLTSSLLDSKWFLPLEREGLQNLLTERKIIRAAQKKDEAPVNIGDDLPALKSANL
VIEGGIIGYESDLKSGGHGIGYFGLATYGEYRMDQVTVNLRAVDVRTGQILLSVTTSKTIFSHALSGSVF
RYIAYQDLLEMESGYTNNEPVNIAVMSAIDSAVIHMIIDGIQKGLWEPADKKQMESPVMKRYMQESTTIL
SEQ ID NO: 58
MKAVISLLSVLLISACSTSLSVPDVDEAPOIMORSSTYTDLLSLPAPKGRILVSVYDFRDOTGOYKSSPAS
SFSTAVPQGGTALLTTALLESNWFIPLEREGLQNLLTERKIIRAAQGKGETVNNHNNGLPSLNSANIMIE
GGVVAYDWNIKTGGAGAKYLGISAAGEYRADQVTVNLRAVDVRSGRILSSVTISKTIYSHQLSMGAFRY
IDYQELLEAELGYSNNEPVNIALMSAIDASIIHLIVDGVARGLWQPRDFDDLTKNKIYQKYSSQVREIL
SEQ ID NO: 59
MKGLILLCASLVLGGCSYSLEIPETSASPKLMORGSVYTDLTSLPPPVGKIMVSVYDFRDOTGOYKPSPN
SNFSTAVPQGGTSLLTTALIDSKWFVPLEREGLQNLLTERKIIRAAQKKDTVVSNHGTDLSSLNSANVVI
EGGIVAYDSNLRTGGAGARYLGVGGSGQYRTDQVTVNLRAVDVRTGRVLLSVTTSKTISSHEIGLGAFR
FIDYEELLEVELGYSNNEPVNIAVMSAIDAAVIHLIVKGMERGMWSSNDPQAMSHPIIAKYSQATTEIL
SEQ ID NO: 60
MRHLLIIASLFLLNGCLTAPPKOAAEPTLLPRSQSYODLVOLPTPKGKIFVSVYNIODETGOFKPYPASNF
STSVPQSATAMLISALKDSNWFIPLERQGLQNLLNERKIIRAAQENGRVAINNAQPLSSLVAANVLIEGSI
IGYESNVKSGGIGARYFGIGGSTQYQLDQIAVNLRVVDVSTGEILSSVNTSKTILSYEVQAGVFRFIDYQ
RLLEGEVGYTSNEPVMMCLMSAIETGVIYLINDGITRNLWQLQNPKDINTPVFERYKNLKVPTA
SEQ ID NO: 61
MKGLFSIIVILIMTGCSASLDIPDADSEPKLMPRGTTYTDLISLPTPKGKILVSVYDFRDOTGOYKPYPNST
YSTAVPQGGTTLLTNSLLDSQWFIPLEREGLQNLLTERKIIRAAQKKETKISNHGSNLSSLNSANVVIEG
GIVAYDSNIKTGGLGAKYLGIGGSGQYRTDQVTVSLRAIDVRTGQVLLSVTTSKTISSHEIGLGAFRFID
YQELLEVELGYSNNEPVNIAVMSAIDAAVIHLIVKGMSLGMWQSNDPNVESNPIIAKYSQATREIL
SEQ ID NO: 62
MRAMILIIAVLLGGCSITEVPKEAAKPTLMPRASTYKDLVALPKPNGKIIVSVYSVODETGOFKPLPASNF
STAVPQSGNAMLTSALKDSGWFVPLEREGLQNLLNERKIIRAAQENGTVAANNQQPLPSLLSANVVIEG
AIIGYDSDIKTGGAGARYFGIGADGKYRVDQVAVNLRAVDVRTGEVLLSVNTSKTILSSELSAGVFRFIE
YQRLLELEAGYTTNEPVMMCMMSALEAGVAHLIVEGIRQNLWSLQNPSDINNPIIQRYMKEDVPLAI
SEQ ID NO: 63
MKRLFLFIAVIMVAGCSNSLSIPDTDEAPKLMORGSTYODLIHLPDPKGKLYVSIYDFRDOTGOYKPOPN
SNFSTAVPQGAISLLIMSLIDSKWFVPLEREGLQNLLTERKIIRAAQSSSKGQANIASLRSANVMIEGGIV
AYDTNIKTGGAGARYLGVGASTQYRTDQVTVSLRVVDVSSGAILSSVTTSKTIFSQEMQTGAFRFIDYK
DLLEVELGYTNNEPVNIALMSAIDAAVIYLVVQGIDQGLWQAGTSDNINNKIYKKYSQNKAEIL
SEQ ID NO: 64
MKGLISIGLVLLLSGCAYSLDIPDTDASPKLMPRGATYTDLVSLPKPAGKILVSVYDFRDOTGOYKPOPN
SNFSTAVPQGGTSLLTTSLLDSQWFVPLEREGLQNLLTERKIIRAAQKKDKVISNHGADLSSLNSANVVI
EGGIVAYDSNIRTGGLGAKYLGIGASGQYRTDQVTVNLRAVDVRSGQVLLSITTSKTISSHEMGLGAFR
FIDYKELLEVEMGYSNNEPVNIAVMSAIDAAVIHLIVKGMERGMWSASDPQAMSNPIIARYSQAETEIL
SEQ ID NO: 65
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKDYPSST
FATAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG
SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID
YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 66
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKSSPASS
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG
SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID
YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 67
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIQDETGQYKPSPNSN FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 68
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOFKPYPASNF STSVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEGSI IGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFIDYQ RLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 69
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKPYPNST
YSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 70
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOFKPLPASNF
STAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEGSI IGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFIDYQ RLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 71
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKPOPNSN
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 72
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODOTGOYKPOPNSN
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPESAW SHPQFEK
SEQ ID NO: 73
MKTGLMLCVALLAGCSNSMSIPDADESPTLTPRGPTYNDLVKLPLPKGKIMVSVYDFRDOTGOYKSSPN
SSFSTAVPQGGTSMLTTALLDSGWFLPLEREGLQNLLTERKIIRAAQKKDQTPANIGDDLPALKSANLVI
EGGIIGYESDLKTGGHGAGYLGFAAYGQYRMDQVTVNLRAVDVRTGQIVLSVTTSKTIFSQEVSASVFR
YIAYQDLLELESGYTNNEPVNIAIMSAIDSAVIHMVVDGIKRGLWQPADEAQLKNPIIQRYSDETVAIL
SEQ ID NO: 74
MKLIISCILVLVMTGCSNSMGIPEADSAPTLMPRGATYODLVRLPEPKGKILVSVYDFRDOTGOYKAOPN
SNFSTAVPQGGTALLTTSLLDSRWFIPLEREGLQNLLTERKIIRAAQKKGEGASNHGDDLSSLSSANVVI
EGGIIAYDSNIRTGGLGARYLGVGSSGEYRADQVTVNLRAVNVRTGQILLSITTSKTIFSHQISAGAFRFV
DYKDLLEIEMGYSNNEPVNIAVMSAIDAAVIHLVVKGMERGMWQPAATEGEGFDVIERYAAQTQVIL
SEQ ID NO: 75
MNRLFLLVTLLVLAGCSNSLSVPESDEAPRLMPRGATYSDLIALPKPKGRILVSVYDFRDOTGOYKSSPN
SNFSTAVPQGGTALLTTSLLDSNWFTPLEREGLQNLLTERKIIRAAQKKDQSVSNHGADLPSLSSANVVI
EGGIVAYDSNVKTGGFGARYLGIGGATEYRSDMVTVNLRVVDVRTGQILLSVTTSKTILSMQVTGDVFR
FVDYKDLLEVEAGYTNNEPVNVAVMSAIDASVIHLVIEGIERGMWQPANLEELNSPIIERYAKEKHHIL
SEQ ID NO: 76
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKSSPNSS
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG
SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID
YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 77
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKAOPNSN
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG
SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID
YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
SEQ ID NO: 78
MORLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIODETGOYKSSPNSN
FSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEG
SIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFID
YQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
The following Examples illustrate the invention. It is to be understood that although particular embodiments, specific configurations as well as materials and/or molecules, have been discussed herein for engineered cells and methods according to the present invention, various changes or modifications in form and detail may be made without departing from the scope and spirit of this invention. The following examples are provided to better illustrate particular embodiments, and they should not be considered as limiting the application. The application is limited only by the claims.
EXAMPLES
Detailed methods for making and testing pores are described in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, WO 2019/002893, PCT/EP2023/059821, PCT/EP2023/072113, PCT/EP2023/072065, PCT/EP2023/072106 and PCT/EP2023/072068 (all incorporated by reference herein in their entirety).
EXAMPLE 1 - PORARC CHIMERAS
Chimeric pore monomers
Two types of chimeras were designed: (1) constriction transplants and (2) cap transplants (also known as scaffold transplants).
For (1), different constriction regions were transplanted into the cap region (or scaffold) of PorARc from Rhodococcus corynebacteroides (PorARc_Rco; SEQ ID NO: 1). Chimeras were named with the cap region (or scaffold) first and then the constriction region. For instance, PorARc_Rco_Mph contained the cap region (or scaffold) from PorARc_Rco and the constriction from the PorARc pore from Mycolicibacterium phlei (PorARc_Mph; SEQ ID NO: 2).
For (2), different cap regions (or scaffolds) were transplanted around the constriction region from PorARc_Mph (SEQ ID NO: 2). The same naming convention was applied. For instance, PorARc_Gcr_Mph was the cap region (or scaffold) from PorARc from Gordonia crocea transplanted around the constriction from PorARc_Mph (SEQ ID NO: 2).
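The transplant design and the scaffold-first naming convention can be illustrated with a short sketch. This is illustrative only. The function names are hypothetical. The sequence fragments are short excerpts copied from SEQ ID NO: 1 and SEQ ID NO: 2 in the listing above. The segment boundaries are an assumption inferred by comparing SEQ ID NO: 1 with SEQ ID NO: 3, and are not the constriction definition used by the application.

```python
def chimera_name(family: str, scaffold: str, constriction: str) -> str:
    """Build a name with the scaffold (cap) species first, then the constriction."""
    return f"{family}_{scaffold}_{constriction}"


def transplant(scaffold_seq: str, native_segment: str, donor_segment: str) -> str:
    """Replace one native segment of the scaffold with a donor segment.

    Raises ValueError unless the native segment occurs exactly once, so that
    the splice position is unambiguous.
    """
    if scaffold_seq.count(native_segment) != 1:
        raise ValueError("native segment must occur exactly once in the scaffold")
    return scaffold_seq.replace(native_segment, donor_segment, 1)


# Excerpt of SEQ ID NO: 1 (PorARc_Rco) around the replaced segment.
RCO_FRAGMENT = "RLTFSYTTPGLELGISSAGAVSGTLSDVLPQAGVGVTL"
# Assumed native segment in the Rco scaffold (inferred, not from the text).
RCO_SEGMENT = "GLELGISSAGAVSGTLSDVL"
# Assumed donor segment, present in SEQ ID NO: 2 (PorARc_Mph).
MPH_SEGMENT = "NILLNNVSLFPAFNPLGSVITPNLF"

if __name__ == "__main__":
    print(chimera_name("PorARc", "Rco", "Mph"))
    print(transplant(RCO_FRAGMENT, RCO_SEGMENT, MPH_SEGMENT))
```

Under these assumed boundaries, the spliced fragment matches the corresponding stretch of SEQ ID NO: 3 in the listing.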
E. coli pore production
Recombinant expression vectors encoding the chimeric pore monomers and PorARc pores with a C-terminal Strep affinity tag and ampicillin resistance gene were transformed into chemically competent E. coli cells. The cells were plated onto an LB agar plate containing appropriate antibiotics for selection. A single colony from the agar plate was inoculated in LB media with antibiotics and grown overnight. The culture was diluted into LB media plus necessary antibiotics and incubated at 37°C for 6.5 hours. Following incubation at 37°C, glucose was added and the temperature was dropped to 18°C. After incubation at 18°C for 1 hour, lactose was added and the culture was incubated at 18°C for a further 16 hours. The cells were harvested through centrifugation before being lysed and extracted into 1x Bugbuster extraction reagent (Merck 70921) and 0.1% DDM. The chimeric pore monomers were purified from the supernatant using affinity chromatography and ion exchange chromatography, selecting for oligomeric nanopores as judged by SDS-PAGE.
DNA squiggle (i.e., DNA translocation current traces)
PorARc from Rhodococcus corynebacteroides (PorARc_Rco; SEQ ID NO: 1) was tested for comparative purposes. PorARc pore from Mycolicibacterium phlei (PorARc_Mph; SEQ ID NO: 2) was also tested. The chimeric pores tested are listed in the tables below. All pores tested were homo-oligomers.
Electrical measurements were acquired from pores that were inserted into MinION flow cells. After a single pore had inserted into the block co-polymer membrane, 1 mL of a buffer comprising 25 mM Potassium Phosphate, 150 mM Potassium Ferrocyanide (II), 150 mM Potassium Ferricyanide (III), pH 8.0 was flowed through the system to remove any excess nanopores.
The analyte being used to assess the DNA squiggle was a 3.6-kilobase DNA section from the 3' end of the lambda genome. Preparation of the analyte, ligating the analyte to the Y-adapter, SPRI-bead clean-up of the ligated analyte and addition to a MinION flow cell was carried out using the Oxford Nanopore Technologies Q-SQK-LSK109 protocol.
Electrical measurements were acquired using a MinION Mk1B from Oxford Nanopore Technologies. A standard sequencing script at -180 mV was run for 2-6 hours, with static flicks every 5 minutes to remove extended nanopore blocks. Raw data was collected in a bulk FAST5 file using MinKNOW software (Oxford Nanopore Technologies). The median number of DNA strands used to calculate the metrics shown in Figures 7 and 8 was 1990.
Peptide-DNA conjugate squiggles (i.e., peptide-DNA translocation current traces)
Example current versus time traces as a peptide translocates through the pores were obtained by using a conjugate comprising a polypeptide flanked by two pieces of polynucleotide: a dsDNA Y-adapter (DNA1) and a dsDNA tail (DNA2). A polynucleotide-handling protein at the cis side of the nanopore controls the movement of the conjugate by first unwinding DNA1 and translocating 5'-3' on ssDNA, then sliding across the polypeptide section to finally unwind the DNA2 segment. As this construct moves from the cis to trans side of the nanopore, the DNA and polypeptide sections can be visualized on a current vs time plot.
The adapter was from the Oxford Nanopore Technologies Q-SQK-LSK109 kit as above. The DNA tail was made by annealing two DNA oligonucleotides; it also contains a side arm for tethering, resulting in two tethering sites per construct to increase efficiency of capture.
The polypeptide analytes were obtained with azide moieties at the N-terminus and directly after the C-terminus using an ethyl diamine spacer in line with the peptide backbone. Each analyte was then conjugated to the Y-adapter and DNA tail via copper-free Click Chemistry reaction between the azide and BCN (bicyclo[6.1.0]nonyne) moieties. The sample was purified using Agencourt AMPure XP (Beckman Coulter) beads, with two washes in 28% PEG 8K, 2.5 M NaCl, 25 mM Tris (pH 8.0) buffer, and eluted into 10 mM Tris-Cl, 50 mM NaCl (pH 8.0).
Electrical measurements were acquired using MinlON Mklb from Oxford Nanopore Technologies and a custom MinlON flow cell with pores inserted. Flow cells were flushed with a tether mix containing 50 nM of DNA tether and SQB buffer lacking ATP. Initially 800 pL of tether mix was added for 5 minutes, then a further 200 pL of mix were flowed through the system with the SpotON port open. DNA-peptide constructs were prepared at 0.5nM concentration in buffer like SQB from Oxford Nanopore Technologies sequencing kit (SQK- LSK109) but lacking ATP, and LB from Oxford Nanopore Technologies sequencing kit (SQK- LSK109), yielding "sequencing mix". 75 pL of the sequencing mix was added to a MinlON flow cell via the SpotON flow cell port. The mixture was incubated on the flow cell for 5-10 minutes to allow for construct tethering and subsequent capture by the nanopores. In the absence of ATP, the DNA motor remains stalled on the spacer region of the Y-adapter, the conjugates are captured by the nanopores but there is no translocation. After the incubation, 200 pL of SQB from Oxford Nanopore Technologies sequencing kit (SQK- LSK109) was added, in the presence of ATP the captured DNA-peptide conjugate is moved across the nanopore by the helicase resulting in a reproducible current footprint.
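As a worked illustration of the quantities involved, the amount of tether and of DNA-peptide construct loaded can be estimated from amount = concentration x volume. The sketch below is not part of the described protocol. It reads the stated volumes as microlitres and treats 0.5 nM as the construct concentration in the final sequencing mix, both of which are assumptions, and the helper name is hypothetical.

```python
NANO = 1e-9    # nM -> mol/L
MICRO = 1e-6   # µL -> L


def amount_mol(concentration_molar: float, volume_litre: float) -> float:
    """Amount of substance in moles from molar concentration and volume."""
    return concentration_molar * volume_litre


# Tether mix: 50 nM, flushed as 800 µL followed by 200 µL.
tether_mol = amount_mol(50 * NANO, (800 + 200) * MICRO)

# Sequencing mix: 0.5 nM construct, 75 µL loaded (assumed final concentration).
construct_mol = amount_mol(0.5 * NANO, 75 * MICRO)

if __name__ == "__main__":
    print(f"tether: {tether_mol / 1e-12:.1f} pmol")
    print(f"construct: {construct_mol / 1e-15:.1f} fmol")
```

On these assumptions the flush delivers about 50 pmol of tether, and the load contains about 37.5 fmol of construct.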
A standard sequencing script was run at -180 mV for 1-6 hours, with static flicks every minute to remove extended nanopore blocks. Raw data was collected in a bulk FAST5 file using MinKNOW software (Oxford Nanopore Technologies).
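As an illustration of how raw data of this kind is typically turned into a current trace, the sketch below scales raw ADC counts to picoamps using the commonly documented relation (raw + offset) x range / digitisation. The class name, field names and numeric values are hypothetical and are not taken from this document; the actual calibration values would come from the metadata stored with each channel.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ChannelCalibration:
    """Per-channel scaling values (hypothetical container; names are assumptions)."""
    offset: float        # ADC offset, in raw counts
    range_pa: float      # full-scale current range, in pA
    digitisation: float  # number of ADC levels


def raw_to_picoamps(raw: Iterable[int], cal: ChannelCalibration) -> List[float]:
    """Scale raw ADC counts to current in pA: (raw + offset) * range / digitisation."""
    scale = cal.range_pa / cal.digitisation
    return [(value + cal.offset) * scale for value in raw]


# Illustrative values only, chosen so the arithmetic is easy to follow.
cal = ChannelCalibration(offset=10.0, range_pa=1024.0, digitisation=8192.0)
currents = raw_to_picoamps([0, 790, 1590], cal)
```

Keeping the calibration in a small immutable record makes it harder to mix up values between channels when many channels are processed in one run.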
Results
Representative results are shown in Figures 3-6. The results in Figures 7 and 8 for all the chimeric pores tested are summarised in Tables 5 and 6 below. The final two columns compare the chimeric pores with PorARc_Rco (ONLP21293) and PorARc_Mph (ONLP21323).
Table 5 - Constriction transplants
Figure imgf000125_0001
Figure imgf000126_0001
Figure imgf000127_0001
Figure imgf000128_0001
Figure imgf000129_0001
Table 6 - Cap transplants (also known as scaffold transplants)
Figure imgf000129_0002
Figure imgf000130_0001
EXAMPLE 2 - CSGG CHIMERAS
Various CsgG constriction transplant chimeras were produced and tested as described above in Example 1. The chimeras are summarised in Table 7 below. Each chimeric pore represents the cap region and transmembrane beta barrel region (together known as the scaffold) from CsgG_Eco_WT containing the constriction from a different CsgG pore from a different species. For example, CsgG-Eco-Vdi is the cap region and transmembrane beta barrel region (or scaffold) from CsgG_Eco_WT containing the constriction from CsgG_Vdi_WT. The pores from which the constrictions were derived are summarised in Table 4 above.
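At the sequence level, a constriction transplant of this kind can be pictured as a splice: the scaffold residues spanning the constriction are replaced by the donor pore's constriction. The following is a minimal sketch under stated assumptions, not the procedure used in the Examples. The function name, the toy sequences and the residue window are all hypothetical, and real use would need the window defined in the scaffold's own numbering (for instance with or without the signal peptide).

```python
def transplant_constriction(scaffold: str, donor_constriction: str,
                            first: int, last: int) -> str:
    """Replace scaffold residues first..last (1-based, inclusive) with the donor constriction."""
    if not (1 <= first <= last <= len(scaffold)):
        raise ValueError("residue window lies outside the scaffold sequence")
    # Keep everything before the window, insert the donor segment, keep everything after.
    return scaffold[:first - 1] + donor_constriction + scaffold[last:]


# Toy sequences for illustration only; these are NOT real CsgG sequences.
scaffold = "MKTAYIAKQR"
donor = "GGSG"
chimera = transplant_constriction(scaffold, donor, first=4, last=6)
```

Because the donor segment need not have the same length as the window it replaces, the chimera may be longer or shorter than the scaffold; here residues 4-6 ("AYI") are swapped for a four-residue segment.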
Table 7 - CsgG constriction transplant chimeric pores
Figure imgf000131_0001
Table 8 - Sequence identities of pores in Table 7 (calculated including the signal peptide)
Figure imgf000132_0001
A = Sequence identity of full homologous pore from which the constriction in the chimera is derived to CsgG-Eco-WT.
B = Sequence identity of full chimera to CsgG-Eco-WT.
C = Sequence identity of chimera to CsgG-Eco-WT (constriction only, E44-A59).
D = Sequence identity of chimera to CsgG-Eco-WT (constriction only, F48-A59).
E = Sequence identity of chimera to CsgG-Eco-WT (constriction only, V38-S63). E shows the sequence identities of the constrictions in Table 4 to the CsgG-Eco-WT constriction in Table 4.
Representative electrophysiology results for CsgG-Eco-Vmae are shown in Figure 9.
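Columns B to E differ only in the residue window over which identity to CsgG-Eco-WT is counted (the full sequence, or a constriction window such as E44-A59). The sketch below shows one simple way such a windowed identity could be computed from two pre-aligned sequences. It is an assumption-laden illustration: the function name and toy sequences are hypothetical, gaps are counted as mismatches, and the identity algorithm actually used for the tables is not stated in this section.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str,
                     first: int = 1, last: Optional[int] = None) -> float:
    """Percent identity over alignment columns first..last (1-based, inclusive).

    Both inputs must already be aligned to the same length ('-' marks a gap).
    Gapped columns count as mismatches (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    last = len(aligned_a) if last is None else last
    if not (1 <= first <= last <= len(aligned_a)):
        raise ValueError("window lies outside the alignment")
    columns = list(zip(aligned_a[first - 1:last], aligned_b[first - 1:last]))
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy aligned sequences for illustration only; NOT real CsgG sequences.
reference = "MKTAYIAKQR"
chimera = "MKTGGIAKQR"
full_identity = percent_identity(reference, chimera)                  # whole sequence
constriction_only = percent_identity(reference, chimera, first=4, last=6)  # window only
```

In this toy case the full-length identity is high while the windowed identity is low, which mirrors why the tables report both: a chimera can be nearly identical to its scaffold overall and still differ substantially in the constriction.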
Equivalent traces were obtained for all of the pores in Table 7 (data not shown). The results in Figure 10 for all the chimeric pores tested are summarised in Table 9 below. The final column compares the chimeric pores with CsgG-Eco-WT.
Table 9 - Summary of data in Figure 11
Figure imgf000133_0001
EXAMPLE 3 - MORE CSGG CHIMERAS
Various CsgG constriction transplant chimeras were produced and tested as described above in Examples 1 and 2. The chimeras are summarised in Table 10 below. Each chimeric pore represents the cap region and transmembrane beta barrel region (together known as the scaffold) from CsgG_Eco_WT containing the constriction from a different CsgG pore from a different species. For example, CsgG-Eco-Vfu is the cap region and transmembrane beta barrel region (or scaffold) from CsgG_Eco_WT containing the constriction from CsgG_Vfu_WT. The pores from which the constrictions were derived are summarised in Table 4 above.
Table 10 - CsgG constriction transplant chimeric pores
Figure imgf000134_0001
Table 11 - Sequence identities of pores in Table 10 (calculated including the signal peptide)
Figure imgf000134_0002
A = Sequence identity of full homologous pore from which the constriction in the chimera is derived to CsgG-Eco-WT.
B = Sequence identity of full chimera to CsgG-Eco-WT.
C = Sequence identity of chimera to CsgG-Eco-WT (constriction only, E44-A59).
D = Sequence identity of chimera to CsgG-Eco-WT (constriction only, F48-A59).
E = Sequence identity of chimera to CsgG-Eco-WT (constriction only, V38-S63). E shows the sequence identities of the constrictions in Table 4 to the CsgG-Eco-WT constriction in Table 4.
Representative electrophysiology results for CsgG-Eco-Vfu are shown in Figure 13.
Equivalent traces were obtained for all of the pores in Table 11 (data not shown). The results in Figure 14 for the three chimeric pores tested (last three left to right) are summarised in Table 12 below. The final column compares the chimeric pores with CsgG-Eco-WT.
Table 12 - Summary of the data in Figure 14
Figure imgf000135_0001

Claims

CLAIMS
1. A chimeric pore monomer comprising two or more regions, wherein at least two of the two or more regions are from at least two different pores, and wherein the at least two different pores do not comprise alpha-hemolysin and gamma-hemolysin.
2. A chimeric pore monomer according to claim 1, wherein the chimeric pore monomer comprises two regions or three regions from two different pores and wherein the two different pores are not alpha-hemolysin and gamma-hemolysin.
3. A chimeric pore monomer according to claim 1 or 2, wherein the chimeric pore monomer comprises a sequence that is about 99.7% or less or about 96% or less identical to the wild type monomer sequences of the at least two different pores or two different pores.
4. A chimeric pore monomer according to any one of the preceding claims, wherein the at least two different pores or the two different pores are selected from Wza, Iota toxin, Anthrax protective antigen, Vibrio cholerae cytolysin, Cytotoxin K (CytK), CELIII, CsgG, Aerolysin, alpha hemolysin, InvG, GspD, MspA, MspB, MspC, PorARr, PorBRr, PorARc, PilQ, necrotic enteritis B-like toxin (NetB), FraC, portal proteins including G20c, P23_45, T4, SPP1, P22 and Phi29, gamma hemolysin, Monalysin, Lysenin, ClyA, Clostridium perfringens beta toxin, parasporin-2, epsilon toxin, lectin from the parasitic mushroom Laetiporus sulphureus (LSL), volvatoxin, Cry toxins, Cyt1Aa and Cyt2Aa.
5. A chimeric pore monomer according to claim 4, wherein the PorARc is selected from the pores in Table 2.
6. A chimeric pore monomer according to claim 4 or 5, wherein the at least two different pores comprise at least two different PorARc pores or the two different pores are two different PorARc pores.
7. A chimeric monomer according to any one of claims 4-6, wherein one of the at least two different pores is PorARc_Rco or PorARc_Mph or one of the two different pores is PorARc_Rco or PorARc_Mph.
8. A chimeric pore monomer according to any one of claims 4-7, wherein the at least two different pores comprise or the two different pores are (a) PorARc_Rco or PorARc_Mph and (b) one of the pores in Table 2.
9. A chimeric pore monomer according to claim 8, wherein the at least two different pores comprise or the two different pores are PorARc_Rco and PorARc_Rco_Mph.
10. A chimeric pore monomer according to any one of claims 1-4, wherein the at least two different pores comprise or the two different pores are (a) PorARc_Rco and MspA, (b) two different CsgG pores, three different CsgG pores or five different CsgG pores, (c) alpha-hemolysin and CytK, or (d) NetB and CytK.
11. A chimeric pore monomer according to claim 4 or 10, wherein the CsgG pore is selected from the CsgG pores in Table 4 or the two different CsgG pores, the three different CsgG pores or the five different CsgG pores are selected from the CsgG pores in Table 4.
12. A chimeric pore monomer according to any one of the preceding claims, wherein the at least two regions comprise or the two regions are a cap region and a constriction region or the at least two regions comprise or the three regions are a cap region, a constriction region, and a transmembrane region.
13. A chimeric pore monomer according to claim 12, wherein the chimeric pore monomer comprises a sequence having at least about 20% or at least about 40% identity to the sequence shown in any one of SEQ ID NOs: 3-49.
14. A chimeric pore monomer according to claim 12, wherein the chimeric pore monomer comprises a sequence having at least about 20% identity to the sequence shown in any one of SEQ ID NOs: 65-72 and 76-78.
15. A chimeric construct comprising two or more covalently attached chimeric pore monomers according to any one of the preceding claims.
16. A chimeric pore comprising at least one chimeric pore monomer according to any one of claims 1-14 or at least one construct according to claim 15.
17. A chimeric pore multimer comprising two or more pores, wherein at least one of the pores is a chimeric pore according to claim 16.
18. A PorARc pore monomer which comprises a sequence having at least about 88% identity to the sequence shown in SEQ ID NO: 2 or a sequence having at least about 20% or at least about 40% identity to the sequence shown in SEQ ID NO: 50, 51, 52, 53, 54 or 55.
19. A PorARc construct comprising two or more covalently attached PorARc pore monomers according to claim 18.
20. A PorARc pore comprising at least one PorARc pore monomer according to claim 18 or at least one construct according to claim 19.
21. A PorARc pore multimer comprising two or more pores, wherein at least one of the pores is a PorARc pore according to claim 20.
22. A chimeric pore according to claim 16, a chimeric pore multimer according to claim 17, a PorARc pore according to claim 20 or a PorARc pore multimer according to claim 22, which is comprised in a membrane.
23. A membrane comprising a chimeric pore according to claim 16, a chimeric pore multimer according to claim 17, a PorARc pore according to claim 20 or a PorARc pore multimer according to claim 21.
24. A method for producing a chimeric pore monomer according to any one of claims 1-14 comprising attaching the at least two regions from at least two different pores.
25. A method for determining the presence, absence or one or more characteristics of a target analyte, comprising the steps of:
(i) contacting the target analyte with (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore according to claim 20, or (d) a PorARc pore multimer according to claim 21; and
(ii) taking one or more measurements as the target analyte moves with respect to the pore or pore multimer and thereby determining the presence, absence or one or more characteristics of the target analyte.
26. A method according to claim 25, wherein the target analyte comprises a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a polypeptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a polynucleotide-polypeptide conjugate, a monosaccharide, an oligosaccharide, a polysaccharide, a dye, a bleach, a pharmaceutical, a diagnostic agent, a recreational drug, an explosive, a toxic compound, an environmental pollutant, or a metabolite.
27. A method according to claim 26, wherein the target analyte comprises a polynucleotide.
28. A method according to claim 26, wherein the target analyte comprises a peptide.
29. A method of characterising a target analyte using (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore according to claim 20, or (d) a PorARc pore multimer according to claim 21.
30. Use of (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore according to claim 20, or (d) a PorARc pore multimer according to claim 21 to determine the presence, absence or one or more characteristics of a target analyte.
31. A kit for characterising a target polynucleotide comprising: (a) a chimeric pore comprising two or more regions wherein at least two of the two or more regions are from at least two different pores, (b) a chimeric pore multimer comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a PorARc pore according to claim 20, or (d) a PorARc pore multimer according to claim 21; and a polynucleotide binding protein.
32. An apparatus for characterising a target polynucleotide in a sample, comprising: (a) a plurality of chimeric pores comprising two or more regions wherein at least two of the two or more regions are from at least two different pores, (b) a plurality of chimeric pore multimers comprising two or more pores wherein at least one pore is a chimeric pore as defined in (a), (c) a plurality of PorARc pores according to claim 20, or (d) a plurality of PorARc pore multimers according to claim 21; and a plurality of polynucleotide binding proteins.
33. A method according to any one of claims 25-29, use according to claim 30, a kit according to claim 31 or an apparatus according to claim 32, wherein the at least two different pores are as defined in any of claims 4-12.
34. A method according to any one of claims 25-29, use according to claim 30, a kit according to claim 31 or an apparatus according to claim 32, wherein the chimeric pore is a chimeric pore according to claim 16, wherein the plurality of chimeric pores is a plurality of chimeric pores according to claim 16, wherein the chimeric pore multimer is a chimeric pore multimer according to claim 17, or wherein the plurality of chimeric pore multimers is a plurality of chimeric pore multimers according to claim 17.
35. A polynucleotide which encodes a chimeric pore monomer according to any one of claims 1-14, a chimeric construct according to claim 15, a PorARc pore monomer according to claim 18 or a PorARc construct according to claim 19.
36. A kit for characterising a target analyte comprising (a) a chimeric pore according to claim 16, a chimeric pore multimer according to claim 17, a PorARc pore according to claim 20 or a PorARc pore multimer according to claim 21 and (b) the components of a membrane.
37. An array comprising a plurality of membranes according to claim 23.
38. A system comprising (a) a membrane according to claim 23 or an array according to claim 37, (b) means for applying a potential across the membrane(s) and (c) means for detecting electrical or optical signals across the membrane(s).
39. An apparatus comprising a chimeric pore according to claim 16, a chimeric pore multimer according to claim 17, a PorARc pore according to claim 20 or a PorARc pore multimer according to claim 21 inserted into an in vitro membrane.
40. An apparatus produced by a method comprising (i) obtaining a chimeric pore according to claim 16, a chimeric pore multimer according to claim 17, a PorARc pore according to claim 20 or a PorARc pore multimer according to claim 21 and (ii) contacting the chimeric pore or a pore multimer with an in vitro membrane such that the chimeric pore or the pore multimer is inserted in the in vitro membrane.
PCT/EP2023/080135 2022-10-28 2023-10-27 Pore monomers and pores WO2024089270A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB2216026.1 2022-10-28
GBGB2216026.1A GB202216026D0 (en) 2022-10-28 2022-10-28 Pore monomers and pores
GB2312689.9 2023-08-18
GBGB2312689.9A GB202312689D0 (en) 2023-08-18 2023-08-18 Pore monomers and pores

Publications (2)

Publication Number Publication Date
WO2024089270A2 true WO2024089270A2 (en) 2024-05-02
WO2024089270A3 WO2024089270A3 (en) 2024-07-18

Family

ID=88793191

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/080135 WO2024089270A2 (en) 2022-10-28 2023-10-27 Pore monomers and pores

Country Status (1)

Country Link
WO (1) WO2024089270A2 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
HUE029215T2 (en) * 2008-09-22 2017-02-28 Univ Washington Msp nanopores and related methods
BR112013020411B1 (en) * 2011-02-11 2021-09-08 Oxford Nanopore Technologies Limited MUTANT MSP MONOMER, CONSTRUCT, POLYNUCLEOTIDE, PORE, KIT AND APPARATUS TO CHARACTERIZE A TARGET NUCLEIC ACID SEQUENCE, AND METHOD TO CHARACTERIZE A TARGET NUCLEIC ACID SEQUENCE
CN115777019A (en) * 2021-04-06 2023-03-10 成都齐碳科技有限公司 Modified Prp43 helicases and uses thereof
GB202118939D0 (en) * 2021-12-23 2022-02-09 Oxford Nanopore Tech Plc Pore

Patent Citations (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000028312A1 (en) 1998-11-06 2000-05-18 The Regents Of The University Of California A miniature support for thin films containing single channels or nanopores and methods for using same
WO2006100484A2 (en) 2005-03-23 2006-09-28 Isis Innovation Limited Delivery of molecules to a lipid bilayer
WO2008102121A1 (en) 2007-02-20 2008-08-28 Oxford Nanopore Technologies Limited Formation of lipid bilayers
WO2008102120A1 (en) 2007-02-20 2008-08-28 Oxford Nanopore Technologies Limited Lipid bilayer sensor system
WO2009020682A2 (en) 2007-05-08 2009-02-12 The Trustees Of Boston University Chemical functionalization of solid-state nanopores and nanopore arrays and applications thereof
WO2009035647A1 (en) 2007-09-12 2009-03-19 President And Fellows Of Harvard College High-resolution molecular graphene sensor comprising an aperture in the graphene layer
WO2009077734A2 (en) 2007-12-19 2009-06-25 Oxford Nanopore Technologies Limited Formation of layers of amphiphilic molecules
WO2010004265A1 (en) 2008-07-07 2010-01-14 Oxford Nanopore Technologies Limited Enzyme-pore constructs
WO2010086602A1 (en) 2009-01-30 2010-08-05 Oxford Nanopore Technologies Limited Hybridization linkers
WO2010122293A1 (en) 2009-04-20 2010-10-28 Oxford Nanopore Technologies Limited Lipid bilayer sensor array
WO2011067559A1 (en) 2009-12-01 2011-06-09 Oxford Nanopore Technologies Limited Biochemical analysis instrument
WO2012005857A1 (en) 2010-06-08 2012-01-12 President And Fellows Of Harvard College Nanopore device with graphene supported artificial lipid membrane
WO2013057495A2 (en) 2011-10-21 2013-04-25 Oxford Nanopore Technologies Limited Enzyme method
WO2013098561A1 (en) 2011-12-29 2013-07-04 Oxford Nanopore Technologies Limited Method for characterising a polynucelotide by using a xpd helicase
WO2013098562A2 (en) 2011-12-29 2013-07-04 Oxford Nanopore Technologies Limited Enzyme method
WO2014013259A1 (en) 2012-07-19 2014-01-23 Oxford Nanopore Technologies Limited Ssb method
WO2014013260A1 (en) 2012-07-19 2014-01-23 Oxford Nanopore Technologies Limited Modified helicases
WO2014013262A1 (en) 2012-07-19 2014-01-23 Oxford Nanopore Technologies Limited Enzyme construct
WO2014064444A1 (en) 2012-10-26 2014-05-01 Oxford Nanopore Technologies Limited Droplet interfaces
WO2014064443A2 (en) 2012-10-26 2014-05-01 Oxford Nanopore Technologies Limited Formation of array of membranes and apparatus therefor
WO2014187924A1 (en) 2013-05-24 2014-11-27 Illumina Cambridge Limited Pyrophosphorolytic sequencing
WO2015055981A2 (en) 2013-10-18 2015-04-23 Oxford Nanopore Technologies Limited Modified enzymes
WO2016034591A2 (en) 2014-09-01 2016-03-10 Vib Vzw Mutant pores
WO2017149317A1 (en) 2016-03-02 2017-09-08 Oxford Nanopore Technologies Limited Mutant pore
WO2017149318A1 (en) 2016-03-02 2017-09-08 Oxford Nanopore Technologies Limited Mutant pores
WO2017149316A1 (en) 2016-03-02 2017-09-08 Oxford Nanopore Technologies Limited Mutant pore
WO2018211241A1 (en) 2017-05-04 2018-11-22 Oxford Nanopore Technologies Limited Transmembrane pore consisting of two csgg pores
WO2019002893A1 (en) 2017-06-30 2019-01-03 Vib Vzw Novel protein pores
CN113754743A (en) 2021-10-12 2021-12-07 成都齐碳科技有限公司 Mutant of porin monomer, protein pore and application thereof
CN113773373A (en) 2021-10-12 2021-12-10 成都齐碳科技有限公司 Mutant of porin monomer, protein pore and application thereof
CN113896776A (en) 2021-10-12 2022-01-07 成都齐碳科技有限公司 Mutant of porin monomer, protein pore and application thereof
CN113912683A (en) 2021-10-12 2022-01-11 成都齐碳科技有限公司 Mutant of porin monomer, protein pore and application thereof

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
ALTSCHUL S. F., J MOL EVOL, vol. 36, 1993, pages 290 - 300
ALTSCHUL, S.F ET AL., J MOL BIOL, vol. 215, 1990, pages 403 - 10
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", 2016, JOHN WILEY & SONS
CHEM BIOL, vol. 4, no. 7, July 1997 (1997-07-01), pages 497 - 505
D. STODDART ET AL., PROC. NATL. ACAD. SCI., vol. 106, 2010, pages 7702 - 7
DEVEREUX ET AL., NUCLEIC ACIDS RESEARCH, vol. 12, 1984, pages 387 - 395
GHANEM ET AL., FEBS J, vol. 289, 2022, pages 3505 - 3520
GONZALEZ-PEREZ ET AL., LANGMUIR, vol. 25, 2009, pages 10447 - 10450
GOYAL ET AL., NATURE, vol. 516, no. 7530, 2014, pages 250 - 3
SAMBROOK, J.RUSSELL, D.: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
SAMBROOK: "Manual", 2012, COLD SPRING HARBOR PRESS

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23805472

Country of ref document: EP

Kind code of ref document: A2