CN111534518B

CN111534518B - Universal blocking sequence and application thereof

Info

Publication number: CN111534518B
Application number: CN202010421923.XA
Authority: CN
Inventors: 胡玉刚; 汪彪; 郑文莉; 吴强
Original assignee: Naonda Nanjing Biological Technology Co ltd
Current assignee: Naonda Nanjing Biological Technology Co ltd
Priority date: 2020-05-18
Filing date: 2020-05-18
Publication date: 2021-07-23
Anticipated expiration: 2040-05-18
Also published as: CN111534518A; WO2021232793A1

Abstract

The invention provides a universal closed sequence and application thereof. The universal closed sequence comprises a left non-tag region closed sequence, a middle tag region closed sequence and a right non-tag region closed sequence which are sequentially connected in a direction from 5' to 3', wherein the left non-tag region closed sequence comprises 5-7 bases modified by LAN or BNA, the middle tag region closed sequence is a universal closed base sequence, the right non-tag region closed sequence comprises 7-10 bases modified by LAN or BNA, and the 3' end of the right non-tag region closed sequence is provided with a closed modification. 5-7 and 7-10 LNA or BNA modified base sequences are respectively designed on the left side and the right side of the non-tag area, so that the binding capacity with a sequence to be blocked can be remarkably enhanced, and the blocking effect is improved; the blocking modification of the 3' end can reduce or avoid the nonspecific capture of the library, and improve the target hit rate of the target library capture.

Description

Universal blocking sequence and application thereof

Technical Field

The invention relates to the field of high-throughput sequencing library construction, in particular to a universal closed sequence and application thereof.

Background

With the improvement of the importance of high-throughput sequencing in the auxiliary diagnosis of clinical application, how to reduce the sequencing cost is a very critical problem, and the reduction of the sequencing cost has corresponding performances at different levels: huada Zhi (MGI) continuously puts forward a sequencer with higher sequencing flux, the sequencing cost is continuously reduced, and MGI-200, MGI-2000 and T7 sequencers are successively put forward, wherein the T7 sequencer is the sequencer with the highest sequencing flux and the lowest sequencing cost in the current market. In addition, the target capture sequencing is also an effective way for realizing large-scale detection cost reduction while detecting a target sequence.

In the sequencing process, different samples are distinguished by different Index sequences, and then a plurality of samples are subjected to mixed sequencing, so that the high-throughput sequencing mode can reduce the cost of a single sample. However, if single-ended indexes are used, the use of the Index adapters or primers inevitably leads to contamination and/or cross-talk during the synthesis, library construction experimental procedures, and sequencing steps. Therefore, a method for solving the low-frequency mutual crosstalk between samples is needed, and the current method for solving the problem is to use the double-ended Index to distinguish different samples, so that the double-ended Index can effectively remove the mutual crosstalk between the samples.

The sequencing cost of the detection target can be effectively reduced through hybridization capture, and meanwhile, if the proportion of a captured target area can be improved in the hybridization blocking process, the sequencing cost can also be saved. However, there is no effective solution for efficient capture of a sample library with index at both ends.

Disclosure of Invention

The invention mainly aims to provide a universal blocking sequence and application thereof, and aims to solve the problem of low hybrid capture efficiency of a double-end index library in the prior art.

In order to achieve the above object, according to one aspect of the present invention, there is provided a universal closed sequence comprising a left non-tag region closed sequence, a middle tag region closed sequence and a right non-tag region closed sequence connected in sequence in a 5' to 3' direction, wherein the left non-tag region closed sequence comprises 5 to 7 bases modified by LAN or BNA, the middle tag region closed sequence is a universal closed base sequence, the right non-tag region closed sequence comprises 7 to 10 bases modified by LAN or BNA, and the 3' end of the right non-tag region closed sequence carries a closed modification.

Further, the blocking modification of the 3 'end is MGB modification, C3 spacer modification, phosphorylation modification, digoxin modification or biotin modification, or the base of the 3' end is a dideoxy base.

Further, the universal blocking base is hypoxanthine or a C3 spacer.

Further, the universal blocking sequence is a blocking sequence of a P1 linker with a first tag sequence or a blocking sequence of a P2 linker with a second tag sequence of the MGI sequencing platform, wherein the blocking sequence of the P1 linker is SEQ ID NO: 3:

CTCTCA + GTACG + TCA + GCA + GT + TXXXXXXXCA + ACTCCT + TGGC + TCACAGA + ACGA + CATGG + CTACGATC + CGACTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification;

the blocking sequence of the P2 linker is SEQ ID NO: 4:

GCA + TGGC + GA + CCTT + ATCA + GXXXXXXXXXXTTGTCTT + CCTA + AGA + CCGC + TTG + GCC + TCCGA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification.

Further, the universal blocking sequence is a blocking sequence of a P1 linker with a first tag sequence or a blocking sequence of a P2 linker with a second tag sequence of the MGI sequencing platform;

the blocking sequence of the P1 linker is SEQ ID NO:

CTC + TCA + GT + ACG + TCA + GCA + GT + TXXXXXXXXXCA + ACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGACTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification;

the blocking sequence of the P2 linker is SEQ ID NO:

GCA + TG + GC + GA + CC + TT + ATCA + GXXXXXXXXXTTG + TCTT + CCTA + AGA + CC + GC + TTG + GCC + TCC + GA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification.

In order to achieve the above object, according to a second aspect of the present invention, there is provided a capture kit comprising a universal blocking sequence, the blocking sequence being any one of the universal blocking sequences described above.

Furthermore, the working concentration of the universal capture probe in the kit is 0.4-0.8 μ g of universal blocking sequence/1 μ g of library to be captured, based on a single library.

According to a third aspect of the present invention, there is provided a library hybridization capture method, which comprises capturing a library to be captured using a capture kit, wherein the capture kit is any one of the capture kits described above.

Further, the step of capturing the library to be captured by using the capture kit comprises the following steps of: the ratio of 1 was used for blocking.

According to a fourth aspect of the present invention, there is provided a library construction method, comprising: constructing a fragmentation library; performing hybrid capture on the fragmentation library to obtain a capture library; carrying out PCR amplification on the capture library to obtain a sequencing library; performing hybridization capture by using any one of the capture kits or any one of the methods.

By applying the technical scheme of the invention, LNA or BNA modification is respectively carried out on 5-7 bases and 7-10 bases on the left non-tag region closed sequence and the right non-tag region closed sequence, the binding capacity of the sequence to be closed can be enhanced, and the closing effect is improved; and (2) carrying out blocking modification on the 3' end of the blocking sequence of the right non-tag region, so that the blocking sequence of the application can reduce or avoid the capture of a non-target library during library capture, and the capture rate of the target library (or the hit rate of the target library) is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a schematic diagram illustrating a current MGI platform dual-end Index library building process;

FIG. 2 is a schematic diagram illustrating the principle of improving the hit rate of target sequence capture when a universal blocking sequence is used for capture in the prior art;

FIG. 3 is a graph showing the statistical results of cross-talk between libraries constructed using 12 paired-ends Index for MGI platform paired-ends Index libraries in the prior art;

FIGS. 4A and 4B are schematic diagrams showing the principle of normal and abnormal blocking of a hybridization library by a modified universal blocking sequence;

FIG. 5 shows a schematic graph of the effect of using different numbers of modifications and different concentrations of universal blocking sequences on the on-target rate of target sequence capture;

FIG. 6 shows the effect of too few and too many base-modified universal blocking sequences on the blocking effect;

FIG. 7 shows the effect of the same number of modified bases but different modification positions on the blocking effect of a universal blocking sequence;

FIG. 8 shows the effect of different library input on blocking effect.

Detailed Description

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail with reference to examples.

Interpretation of terms:

double-ended Index linker: high throughput sequencing requires the end of each fragment to be ligated with a universal sequencing adaptor, and the non-complementary regions of the adaptor each have a variable sequence region sequence which is an Index sequence and is used for sequencing and de-sequencing data.

Linker blocking sequence: during library capture, each library has the same and similar adaptor sequence, the adaptor parts of the target fragment and the non-target fragment can be combined with each other during hybridization to reduce the hit rate, and the adaptor blocking sequence is used for specifically combining the adaptor part sequence to play a role in improving the hit rate. The universal blocking sequence is a sequence which can realize blocking aiming at all the linkers with different Inedx in the library.

C3 spacer arm: c3 Splicer is used mainly to mimic the three-carbon spacing between the 3 'and 5' hydroxyl groups of ribose, or to "substitute" for unknown bases in a sequence, and is primarily responsible for ligation in the middle of a nucleic acid sequence, and is not able to pair with complementary bases to stabilize, but only to ligate the front and back bases.

The index length for the MGI sequencing platform mentioned in the present application is exemplified by 10 bp. The length of the label block for an index in this application can be adapted to the blocking of index junctions of different lengths by adjusting the length of the hypoxanthine I and C3 spacer arms.

As mentioned in the background art, the sequencing cost needs to be reduced by the existing high-throughput sequencer, the single sequencing throughput is higher and higher, in order to save the sequencing cost, the T7 sequencer which has the highest throughput and the lowest sequencing cost in the current market is also proposed by the existing MGI sequencer, the sequencing throughput is increased, a plurality of sequencing samples need to be mixed at one time for sequencing, and the high-throughput sequencing is realized by connecting different samples with different Index joints to mix and sequence a plurality of samples together. The MGI firstly provides a library building scheme of a single-ended Index joint, the scheme has a remarkable problem of mutual crosstalk of samples, and the library building scheme of a double-ended Index is also provided for solving the problem of crosstalk, so that the problem of low-frequency crosstalk caused by mutual influence of joint synthesis, experimental operation and sequencing processes can be solved, and crosstalk data is filtered out through two Index data splitting.

However, when the paired-end Index library is subjected to capture library sequencing, the defect of low capture efficiency of a library containing a target fragment still exists, and in order to further improve the capture efficiency of the paired-end Index library, the method utilizes a paired-end Index library construction scheme (a specific paired-end Index library construction flow is shown in figure 1) developed by a Naon aiming at an MGI sequencer, and provides a corresponding hybridization capture improvement scheme.

Another effective way to reduce the sequencing cost is targeted capture sequencing, where the human genome size is 3Gb, and the region encoding the gene occupies less than 2% of the region (in terms of the whole exon V2 version of IDT) covering about 2 ten thousand genes in human, about 34Mb in size, so that one whole genome sequencing cost can be 10 whole exon sequencing, and the detection region for tumor targeted drug use has a larger cost difference compared with whole genome sequencing, and is more important for targeted sequencing. Moreover, the tumor mutation is a low-frequency mutation, and two main problems need to be considered, one is that the low-frequency mutation cannot have crosstalk between samples, or if the crosstalk cannot be avoided, a method for rejecting crosstalk data must be available, the double-ended Index is a necessary scheme for rejecting crosstalk when the crosstalk cannot be avoided, and therefore the double-ended Index is the greatest meaning of the existence of the double-ended Index. Another condition for detecting low frequency mutations is that the sequencing depth is guaranteed, typically requiring thousands to tens of thousands of sequencing depths.

In the process of target sequencing, two key factors determine the target-targeting rate of target capture, namely the specificity of a designed region, the designed probe cannot fall into a highly repetitive region, and the joint blocking effect during capture. The probe sequence is determined by the sequence of the region to be captured, and the highly repetitive region is generally avoided when designing the probe, so that the blocking effect in the targeted capture can be improved. If no blocking sequence is added during capture, both theoretical and practical tests show that the hit rate (of the target fragment library) does not exceed 50%. As shown in FIG. 2, without the addition of a blocking linker sequence, the linker moiety will bind to the non-targeted library thereby reducing the on-target rate. On the premise of better specificity of the capture probe region, the addition of different grades of blocking sequences can make the targeting rate fluctuate between 45 and 90 percent. Therefore, the application explains the improved and designed universal blocking sequence and the blocking effect achieved by the universal blocking sequence by taking the double-ended Index adapter library of the MGI platform as an example, and can greatly improve the on-target rate and further reduce the sequencing cost.

The two ends of the library of the double-ended Index of the MGI platform are respectively provided with an adapter sequence, and Index sequences carried by Index adapters with different numbers are all a variable region with 10bp, and the variable region is used for distinguishing different samples in hybrid capture and hybrid sequencing. The purpose of the double-ended Index mentioned above is to remove mutual crosstalk, as shown in fig. 3, 3-7 sequences can be filtered out by the double-ended Index per ten thousand sequences, and if one-thousandth or less mutation detection is not reliable data without the double-ended Index, the detection accuracy can be improved by the double-ended Index. In order to increase the on-target rate during hybrid capture, the invention develops a universal blocking sequence for MGI platform paired-end Index. One of the characteristics of the universal sequence is that the 10bp Index region is used for selecting universal bases (hypoxanthine or C3 spacer) to play a role in blocking/occupying, in order to improve the blocking effect, base modification substitution for improving the hybridization temperature is carried out on the fixed sequence regions at two ends, and LNA or BNA modification is carried out on partial bases of the fixed sequence. Specific characteristics and requirements for universal probes for double-ended indexes are as follows: (1) the tag sequence region is a universal closed base, such as a spacer sequence of hypoxanthine (I), a C3 spacer arm and the like or a combination thereof; (2) in order to enhance the binding efficiency of the universal blocking sequence, LNA or BNA modification is carried out on part of bases in the non-tag sequence region at the upstream and downstream of the blocking sequence, and the number of the modified bases is respectively 5-7 and 7-10, or 20-40%. (3) The optimal use concentration of the universal closed sequence is inversely proportional to the number of modified bases, the number of modified bases is large, the optimal concentration is low relative to the concentration, and the adverse effect is generated when the concentration is too high; on the contrary, the modified base is less, and higher concentration of the blocking sequence is required to achieve the blocking effect.

The invention designs a universal closed modified sequence according to the sequence characteristics of a joint of MGI double-end Index, wherein two universal closed original sequences of MGI are as follows:

SEQ ID NO:1：

CTCTCAGTACGTCAGCAGTTNNNNNNNNNNCAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT。

SEQ ID NO:2：

GCATGGCGACCTTATCAGNNNNNNNNNNTTGTCTTCCTAAGACCGCTTGGCCTCCGACTT。

the N part is a closed sequence of an Index sequence, and the Index with the length of 10bp is compared with the Index with the length of 6bp and 8bp, so that the selection for designing different indexes can be increased, and the defects are that the difficulty and the instability are increased for designing a universal closed sequence. When the universal closed sequence is designed, bases in the Index region are designed into degenerate bases N, C3 spacing wall and hypoxanthine, the obtained universal closed sequence has higher instability, and the universal closed sequence at two ends of the Index is required to modify more bases, so that the number of modified bases for improving the blocking effect is increased from 3-6 to 5-10.

In a preferred embodiment of the present application, the P1(5+7 modified) end blocking sequence is SEQ ID NO: 3:

CTCTCA + GTACG + TCA + GCA + GT + TXXXXXXXCA + ACTCCT + TGGC + TCACAGA + ACGA + CATGG + CTACGATC + CGACTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a blocked modification of the 3' terminus. In a preferred embodiment, the P2 terminal blocking sequence is SEQ ID NO: 4:

GCA + TGGC + GA + CCTT + ATCA + GXXXXXXXXXXTTGTCTT + CCTA + AGA + CCGC + TTG + GCC + TCCGA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a blocked modification of the 3' terminus.

In another preferred embodiment of the present application, the P1 terminal blocking sequence is SEQ ID NO: 5:

CTC + TCA + GT + ACG + TCA + GCA + GT + TXXXXXXXXXCA + ACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGACTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a blocked modification of the 3' terminus. In a preferred embodiment, the P2 terminal blocking sequence is SEQ ID NO: 6:

GCA + TG + GC + GA + CC + TT + ATCA + GXXXXXXXXXTTG + TCTT + CCTA + AGA + CC + GC + TTG + GCC + TCC + GA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a blocked modification of the 3' terminus.

Further research finds that the number of modified bases is large, and the sealing effect is poor when the concentration of the added universal sealing sequence is high, as shown in fig. 4A and 4B, when the number of modified bases in the sealing sequence is large, strong binding capacity is formed in a local area, most of the modified bases can be shown in fig. 4A, the sealing sequence can correctly seal a library sequence, but cross star type sealing can also be formed at the same time, a non-target sequence is captured, and therefore the target rate in the library is reduced, as shown in fig. 4B.

It was further found that the amount of library added during hybridization was also the most limited, and it was found that the amount of library added during hybridization could not exceed 6.5. mu.g (25pmol/L, referred to herein as single-ended sequence) in a single hybridization, and that the total exon assay found that 500ng of each library was used in hybridization, 12 libraries had better hybridization than 14-16 libraries, and that the spatial distance of the library was shorter due to the excessive library input during hybridization, and that there were two libraries in part and the universal block forming the double-library and double-block cross-star structure shown in FIG. 4B, resulting in a decrease in capture efficiency.

Based on the above research and findings, the applicant proposes a technical solution of the present application. In an exemplary embodiment, a universal seal is providedA closed sequence, the universal closed sequence comprising in the 5 'to 3' direction: comprises a left non-tag region blocking sequence, a middle tag region blocking sequence and a right non-tag region blocking sequence which are connected in sequence, wherein the left non-tag region blocking sequence comprises 5-7 LANs (Locked nucleic acids) or BNAs (Bridged nucleic acids, namely 2',4' -BNAs)^NCNamely, 2' -O,4' -aminoethylene bridged nucleic acid is a base (mainly LNA or BNA modification is carried out on C base) modified by a compound with a six-membered bridging structure with NO bond), the middle tag region closed sequence is a general closed base sequence, the right non-tag region closed sequence comprises 7-10 bases modified by LAN or BNA, and the 3' end of the right non-tag region closed sequence is provided with a closed modification.

LNA or BNA modification is carried out on 5-7 bases on the left non-tag region closed sequence and 7-10 bases on the right non-tag region closed sequence, so that the binding capacity with a sequence to be closed can be remarkably enhanced, and the closing effect is improved; and the 3' end of the right non-tag region blocking sequence is subjected to blocking modification, so that redundant linkers in the library cannot be used as primers to amplify the linkers of other libraries, further the nonspecific capture of the library can be reduced or avoided, and the target hit rate of the target library capture is improved.

The blocking modification of the 3' end of the right non-tag region blocking sequence may be MGB modification, C3 spacer modification, 3' phosphorylation modification, 3' digoxin modification, 3' biotin modification or dideoxy base as the base at the 3' end. In the present application, the C3 spacer arm modification is preferably used.

As the tag region-blocked sequence, a universal block base can be used, and a base sequence having a weak binding ability to both of the A, T, C and G four bases can be used. In a preferred embodiment of the present application, the universal blocking base is a hypoguanine I and/or C3 spacer. The number of bases of the blocking sequence of the specific tag region is not limited to 10bp, and can be set reasonably according to the number of bases of the sample tag in the library to be blocked. For example, it may be 6bp, 7bp, 8bp, 9bp, 11bp, or 12 bp.

When the C3 spacer was used as the tag region blocking sequence, it was 10C 3 spacers, or 10 hypoxanthines (I), as exemplified by the blocking sequence of the P1 and P2 linkers of the MGI platform described previously. Hypoxanthine has the advantage of weak pairing ability with all bases, whereas the C3 spacer arm occupies only one base, has no binding ability with the paired base, and cannot play a stabilizing role.

In the above-mentioned universal block sequences, the number of LNA or BNA modified bases in the left non-tag region block sequence and the right non-tag region block sequence is generally considered to be inversely related to the sequence length of the left non-tag region block sequence or the right non-tag region block sequence, and the number of bases to be modified is small when the sequence is long and large when the sequence is short. However, in the present application, the inventors found that, for a blocking sequence of a non-tag region of a specific length, the LNA or BNA modified bases have the strongest binding ability with the target linker when the number of bases in the left non-tag region blocking sequence or the right non-tag region blocking sequence is 5 to 10 bases. And when less than 5 bases, the binding to the objective linker is unstable, thereby making the capture efficiency low.

In addition, when the universal blocking sequence is used for library capture, the total amount of the library to be captured and the amount of the universal blocking sequence added are preferably matched, and if the amount of the library added is too large, the blocking sequences are easily hybridized to form a star-shaped structure, so that nonspecific capture is caused, and the capture efficiency is reduced. For example, when the addition amount of the universal block sequence is 2.4. mu.g, the hybridization capture effect is better than the capture effect of the hybridization of 14 to 16 libraries when the total amount of 12 libraries hybridized, i.e., 6. mu.g, is calculated for each library by 500 ng. Of course, less than 500ng, such as 400ng, per library, 2.4. mu.g of universal blocking sequence was used, and the capture efficiency was highest for 15 libraries hybridized simultaneously.

The present application also provides universal blocking sequences capable of inhibiting paired-end Index sequencing library capture of MGI sequencing platforms, for paired-end Index adapters of MGI platforms. In a preferred embodiment of the present application, the P1(5+7 modified) end blocking sequence is SEQ ID NO: 3:

The universal block sequences provided in the above two preferred embodiments not only increase the number of modified bases to improve the binding ability to the target linker, but also enhance the binding ability to the target linker when modifying the modified bases at specific positions in the universal block sequence as compared with modifying bases at other positions. That is, the preferred universal blocking sequence has the best blocking effect on the target adaptor, and the capture efficiency of the target library is the highest in hybrid capture.

In a second exemplary embodiment of the present application, based on the various modified universal blocking sequences described above, a capture kit is provided that includes any of the universal blocking sequences described above. The universal blocking sequence in the capture kit has strong binding capacity with the target joint, and can realize high-efficiency capture of the target library when used for capturing library construction.

In order to further improve the capture efficiency (i.e., the target-targeting rate) of the target library, in a preferred embodiment, the working concentration of the universal capture probe in the kit is 0.4-0.8. mu.g of universal blocking sequence/1. mu.g of library to be captured. The capture is carried out according to the dosage, so that the phenomenon that the amount of the library is too large to form a cross star-shaped closure can be further avoided, and the target rate in the library is reduced. The working concentration may vary depending on the particular blocking protocol, for example, when the nucleic acid sequence of SEQ ID NO:5 and SEQ ID NO:6, the working concentration is captured according to 0.4. mu.g of universal blocking sequence/1. mu.g of library to be captured, and the target rate of the target library is higher. When using SEQ ID NO:3 and SEQ ID NO:4, the working concentration is higher in the target-hit rate for capturing the target library according to 0.8. mu.g of the universal blocking sequence/1. mu.g of the library to be captured.

In a third exemplary embodiment of the present application, a library hybrid capture method is further provided, which comprises capturing a library to be captured by using a capture kit, wherein the capture kit is the capture kit. The blocking sequence in the capture kit can realize the high-efficiency capture of a target library when the capture library is constructed.

The inventors also found that the molar ratio of the universal blocking sequence to the library to be captured was 10: 1-20: the sealing effect is better when 1 is used. Thus, in a preferred embodiment of the present application, in the step of capturing the library to be captured using the capture kit, the blocking sequence is added to the library to be captured in a molar ratio of 10: 1-20: the ratio of 1 was used for blocking.

In a fourth exemplary embodiment of the present application, a library construction method is provided, which includes: constructing a fragmentation library; performing hybrid capture on the fragmentation library to obtain a capture library; carrying out PCR amplification on the capture library to obtain a sequencing library; the capture kit is used for carrying out hybridization capture, or any one of the methods is used for carrying out hybridization capture. The library constructed by the library construction method has high occupation ratio of a target library and high occupation ratio of effective data generated by the library.

The advantageous effects of the present application will be further described with reference to specific examples.

In the following examples, NadPrep was used^TMThe library construction procedure provided by the DNA library construction kit (for MGI) (201909Version2.0) (Nanon (Nanjing) Biotechnology Ltd.) was carried out. It should be further noted that the following examples are only illustrative, and the method of the present application is not limited to the following method. The specific process is briefly described as follows:

DNA sample fragmentation-end repair and A addition-joint connection-fragment screening-PCR amplification-library purification, quantification and quality control-sequencing by using MGI platform or sequencing after target capture.

Example 1 general blocking protocol for how much modification and addition concentration differences

The method comprises the following steps: library construction reference NadPrep^TMDNA library construction kit (for MGI) (201909version2.0) instructions. Wherein, the step of hybrid capture is carried out as follows, when multi-library hybrid capture is carried out after vacuum concentration, the specific hybrid library mixing steps are as follows:

table 1:

components	Total library amount	Number of
			Total library	6μg	1～12
Human Cot DNA	5μl	/
			General blocking sequence (sequence listed below)	2μl	/

1) Modified MGI (minor groove interface) connector double-end Index universal blocking sequence

1.1 the P1 terminal block sequence shown in SEQ ID NO. 3 and the P2 terminal block sequence shown in SEQ ID NO. 4.

CTCTCA+GTACG+TCA+GCA+GT+T10XXXXXXXXXXCA+ACTCCT+TGGC+TCACAGA+ACGA+CATGG+CTACGATC+CGACTT/3SpC3/；

GCA+TGGC+GA+CCTT+ATCA+GXXXXXXXXXXTTGTCTT+CCTA+AGA+CCGC+TTG+GCC+TCCGA+CTT/3SpC3/

1.2 SEQ ID NO: 5: the P1 terminal blocking sequence shown and the P2 terminal blocking sequence CTC + TCA + GT + ACG + TCA + GCA + GT + TXXXXXXXCA + ACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGACTT/3SpC3 shown in SEQ ID NO 6;

GCA+TG+GC+GA+CC+TT+ATCA+GXXXXXXXXXXTTG+TCTT+CCTA+AGA+CC+GC+TTG+GCC+TCC+GA+CTT/3SpC3/

in the above four sequences, X represents C3 Spacer/hypoxanthine, + N represents LNA or BNA modified base (the two modification effects are equivalent, for example LNA modification),/3 SpC 3/3' C3 Spacer block.

The universal blocking sequences and concentrations put in during the hybridization are listed below:

table 2:

2) the specific hybridization capture steps are as follows:

1. the components were mixed in a 0.2/1.5ml low adsorption centrifuge tube according to the above table, vortexed and mixed, and centrifuged instantaneously.

2. The centrifuge tube was dried in a vacuum concentrator preheated to 60 ℃.

3. After all the liquid had evaporated and completely dried, the centrifuge tube was sealed for use.

4. Taking out

The Exome Research Panel v2.0 was naturally thawed on ice and used and then dispensed in small quantities as required.

5. And (3) preparing hybridization reaction liquid according to the following table 3, uniformly mixing the hybridization reaction liquid by using a pipettor, adding the mixture to the bottom of a centrifugal tube which is concentrated and dried in vacuum, softly blowing and sucking the mixture by using the pipettor, uniformly mixing the mixture for 15 to 20 times, carrying out instantaneous centrifugation, and incubating the mixture for 5 to 10min at 25 ℃.

Table 3:

6. vortex and mix the hybridization reaction mixture, after instantaneous centrifugation, transfer the total 17 μ l hybridization reaction mixture in the centrifuge tube to a new 0.2ml PCR tube, instantaneous centrifugation, put into the PCR instrument, start the following hybridization procedure:

table 4:

7. hybrid library elution

(1) Preparation work

1. Taking out

Other reagents in Hybridization and Wash Kit were thawed naturally at room temperature and vortexed until homogeneous (note: Wash Buffer I could be incubated in a water bath at 65 ℃ until complete thawing if it could not be thawed).

2、Dynabeads^TMAnd (3) uniformly mixing M-270 Streptavidin Beads in a vortex mode, and after the mixture is balanced for 30min at room temperature, washing and capturing Streptavidin magnetic Beads can be carried out.

(2) Reagent preparation

1. Preparation of elution buffer

1X working solution of elution buffer was prepared according to the following table system:

table 5:

component name	RNase-free water	Buffer solution	Total of
				2X magnetic bead elution buffer	160μl	160μl	320μl
10 Xelution buffer I	252μl	28μl	280μl
				10 Xelution buffer II	144μl	16μl	160μl
10 Xelution buffer III	144μl	16μl	160μl
				10X Stringent elution buffer	288μl	32μl	320μl

2. Magnetic bead suspensions were prepared as in table 6.

Table 6:

(3) avidin magnetic bead washing

1. Dynabeads of the general formula^TMM-270 Streptavidin Beads were vortexed for 15s to ensure complete mixing. Pipette 50. mu. lM270 magnetic beads into 1.5ml low adsorption centrifuge tubes.

2. Adding 100 μ l of 1X Bead Wash Buffer into a centrifuge tube, gently blowing and sucking for 10 times, performing instantaneous centrifugation, placing on a magnetic frame for several minutes, and removing the supernatant by using a pipette after the liquid is completely clarified. The centrifuge tube was removed from the magnetic stand.

3. Repeat step 2 twice.

4. Add 17. mu.l of the magnetic bead suspension to the tube, mix it gently by pipetting, and transfer the whole magnetic bead suspension to 1 new 0.2ml low adsorption PCR tube.

(4) Streptavidin magnetic bead capture

1. After 16h of hybridization, the PCR machine was adjusted to enter the elution procedure.

2. Adding the resuspended streptavidin magnetic beads into the hybridization system, and gently pipetting and mixing the streptavidin magnetic beads or vortexing the streptavidin magnetic beads and the hybridization system.

3. Incubating for 45min at 65 ℃, and gently swirling once every 10-12 min to ensure that the magnetic beads are completely resuspended.

(5) Heat elution (note: the operation is rapid in the heat elution process; avoid bubbles in the blowing and sucking process)

1. After the incubation is finished, the PCR tube is taken down from the PCR instrument, 100 mu l of 1X Wash Buffer I at 65 ℃ is added into the PCR tube, and the hybridization system containing the magnetic beads is uniformly blown and sucked.

2. The PCR tube was placed on a magnetic stand for 1min, and after the liquid was completely clarified, the supernatant was removed by pipetting.

3. The PCR tube was removed from the magnetic stand, 150. mu.l of 1X Stringent Wash Buffer at 65 ℃ was added, mixed well by gentle pipetting for 10 times, and placed in a PCR instrument for incubation at 65 ℃ for 5 min.

4. Repeat steps 2 and 3 once.

(6) Elution at room temperature

1. And (3) placing the PCR tube on a magnetic frame for 1min after instantaneous centrifugation, sucking and removing supernatant after the liquid is completely clarified, adding 150 mu l of room-temperature 1X Wash Buffer I, carrying out vortex mixing, incubating at room temperature for 2min, carrying out vortex mixing for 30s, standing for 30s, and carrying out alternation to ensure full mixing.

2. And (3) placing the PCR tube on a magnetic frame for 1min after instantaneous centrifugation, sucking and removing supernatant after the liquid is completely clarified, adding 150 mu l of room-temperature 1X Wash Buffer II, carrying out vortex mixing, incubating at room temperature for 2min, carrying out vortex mixing for 30s, standing for 30s, and carrying out alternation to ensure full mixing.

3. And (3) placing the PCR tube on a magnetic frame for 1min after instantaneous centrifugation, sucking and removing supernatant after the liquid is completely clarified, adding 150 mu l of room-temperature 1X Wash Buffer III, carrying out vortex mixing, incubating at room temperature for 2min, carrying out vortex mixing for 30s, standing for 30s, and carrying out alternation to ensure full mixing.

4. And (3) placing the PCR tube on a magnetic frame for 1min after instantaneous centrifugation, sucking and removing supernatant after the liquid is completely clarified, and then removing a small amount of residual Buffer by using a 10 mu l suction head.

5. Remove the PCR tube from the magnetic stand, add 22.5. mu.l of nucleic Free Water, gently pipette 10 times to ensure uniform mixing, and transfer all the liquid to a new 0.2ml PCR tube.

Subsequent PCR amplification and library purification and quantification steps according to NadPrep^TMThe DNA library construction kit (for MGI) (201909Version2.0) may be used.

In the test of this example, it was found that when universal bases are used in the Index region, the influence of the amount of base modification in the region fixed at both ends and the use concentration of the universal blocking sequence on the final blocking effect is increased, and 200. mu. mol/L (20: 1 ratio of blocking sequence to library) is required for achieving the optimal blocking effect when the amount of modified bases (5+7) is small; the best effect can be achieved when the number of modified bases (7+10) is 100. mu. mol/L, inhibition is already generated when 200. mu. mol/L is reached, as shown in FIG. 5, strong binding capacity can be generated in a local area, and when the number of added universal blocking sequences is too large, the collision chance is increased in the reaction system, so that an abnormal cross-star structure is formed (as shown in FIG. 4B).

Example 2

Example 2 the procedure of example 1 is the same, the only difference being the number of modified bases of the universal blocking sequence used. In this example, the number of modified bases of the universal blocking sequence is shown in the following table:

table 8:

closed combination	4+6 modified sealing (sealing scheme 3)	8+11 modification combination (closed scheme 4)
			Concentration (μmol/L)	200	100

In blocking scheme 3, the P1 sequence was not changed (P1 terminal blocking sequence shown in SEQ ID NO:3,

CTCTCA + GTACG + TCA + GCA + GT + T10 XXXXXXXCCA + ACTCCT + TGGC + TCACAGA + ACGA + CATGG + CTACGATC + CGACTT/3SpC3/), P2 sequence: SEQ ID NO 7

GCA+TGGC+GA+CCTT+ATCAGXXXXXXXXXXTTGTCTT+CCTA+AGA+CCGC+TTG+GCCTCCGA+CTT/3SpC3/。

In blocking scheme 4, the P1 blocking sequence is SEQ ID NO 8

CTC + TCA + GT + ACG + TCA + G + CA + GT + TXXXXXXXXXCA + ACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGA + CTT/3SpC 3/; the P2 blocking sequence is SEQ ID NO 9

GCA+TG+GC+GA+CC+TT+AT+CA+GXXXXXXXXXXTTG+TCTT+CCTA+AGA+CC+GC+TTG+GCC+TC+C+GA+CTT/3SpC3/。

Scheme 3 is that the modification of P1 is kept unchanged on the basis of scheme 1, the number of blocked modified bases at two ends of a label of P2 is reduced by one, and the number of modified bases is 4+ 6; scheme 4 is that scheme 3 is based on scheme 3, and each of the universal blocks at two ends of the label is added with a block modified base. When the number of blocked modified bases on the left side of the tag sequence is 4 and the number on the right side is 6, the blocking effect is significantly inferior to that of the 5+7 combination scheme of scheme 1. Meanwhile, the 8+11 modification mode of scheme 4 has a poor blocking effect compared with the 7+10 scheme of scheme 2, and the result is shown in FIG. 6. Therefore, the scheme that the left non-tag region closed sequence comprises 5-7 modified bases and the right non-tag region closed sequence comprises 7-10 modified bases is better in effect.

Example 3

Example 3 the same procedure as in example 1, the same number of modified bases of the universal blocking sequence as in scheme 2 of example 1, the only difference being that the positions of the modified bases of the universal blocking sequence are different, and the specific sequences are as follows:

blocking scheme 5: the P1 blocking sequence is SEQ ID NO 10

CTC + T + CA + GT + ACG + TCA + GCA + GTTXXXXXXXCAACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGA + CTT/3SpC3/, P2 blocking sequence is SEQ ID NO 11

G+CA+TG+GC+GA+CC+TT+ATCAGXXXXXXXXXXTTGTCTT+CCTA+AGA+CC+GC+TTG+GCC+TC+C+GA+CTT/3SpC3/。

Blocking scheme 6: the P1 blocking sequence is SEQ ID NO 12

CTCA + GT + ACG + TCA + G + CA + GT + TXXXXXXXXXCA + ACT + CCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATCCGACTT/3SpC 3/; p2 blocking sequence is SEQ ID NO 13

GCATG+GC+GA+CC+TT+AT+CA+GXXXXXXXXXXTTG+TCTT+C+CTA+AGA+CC+GC+TTG+GCC+TCC+GACTT/3SpC3/。

The closed scheme 5 is that the positions of the modifications are changed on the basis of the same number of modifications as the scheme 2, and the scheme 5 is that one modification at each end is reduced and the modification at the position closer to the middle tag is added; the reverse is true for the blocking scheme 6, which increases the blocking modified bases at both ends and reduces the modifications closer to the middle tag. The blocking sequences of scheme 5, scheme 6 and scheme 2 were used at a concentration of 100. mu. mol/L and hybridized with the same input library. As a result, it was found that the effects of the schemes 5 and 2 are close to each other and the effect of the scheme 6 is poor, as shown in FIG. 7, which indicates that not only the number of modified bases affects the blocking effect but also the position of the modification affects the blocking, and it was found that the effect of equalizing the modification and increasing the modification in the region closer to the middle tag sequence is significantly better than the effect of increasing the number of modifications at both ends.

Example 4 multiple library hybridization assay

The experimental procedure of library construction and hybridization was the same as in example 1, with 7+10 base modifications selected by universal block and concentration of 100. mu. mol/L, except that multiple libraries were hybridized in mixture, and the specific library input amounts and total library input amounts were as follows:

table 7.

Number of input libraries (500 ng/library)	10 are provided with	12 pieces of	14 are provided with	16 are provided with
					Total amount of library input	5μg	6μg	7μg	8μg

The sequencing indexes are better when the input amount is 500 ng/library in hybridization capture, and if a single hybridization can allow more input amount, namely more samples in the single hybridization can reduce the hybridization capture cost of each sample. The test of the universal closed sequence of the application finds that each index of hybridization is better when no more than 12 samples are hybridized together at a single time, and the target hit rate is reduced to a certain extent when the number of hybridized libraries reaches 14 samples and 16 samples. As shown in FIG. 8, the total input amount of 14 libraries and 16 libraries is 7 μ g and 8 μ g, respectively, and in the fixed capture system, as the number of libraries increases, part of the libraries have the opportunity to form a cross star structure closure with the universal closure, thereby reducing the hit rate.

In conclusion, the universal blocking sequence is designed for hybridization of the double-ended Index library of the MGI platform, the variable region of the Index is replaced by universal bases, and base modification for increasing annealing temperature is added in the fixed sequences at two sides of the Index, so that the capture efficiency of the double-ended Index library can be improved. Further, the application also finds that the sealing effect of the universal sealing sequence and the number of the modified bases for improving the annealing temperature are controlled to be 5-7 on the left side and 7-10 on the right side, so that the sealing effect of the sealing sequence is better. It has also been found that the blocking effect is best if the specific position of the modified base is further optimized as the preferred position in the examples of the present application. Accordingly, the concentration of universal blocking sequence used also has an effect on the targeting rate of the library of interest. When the using amount of the universal blocking sequence is 0.4-0.8 mu g, the method can support simultaneous hybridization of 12 samples, and greatly reduces the hybridization capture cost of a single sample.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Sequence listing

<110> Naon Dada (Nanjing) Biotechnology Ltd

<120> Universal blocking sequence and uses thereof

<130> PN132184NAGD

<160> 13

<170> SIPOSequenceListing 1.0

<210> 1

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag blocking sequence, n represents A, T, C, or G;

<400> 1

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 2

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag blocking sequence, n represents A, T, C or G

<400> 2

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 3

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(20)

<223> left non-tag region blocking sequence, bases 6, 12, 15, 18 and 20 are modified by LAN or BNA;

<220>

<221> misc_feature

<222> (31)..(72)

<223> right non-tag region blocking sequence, bases 33, 39, 43, 50, 54, 59 and 67 are modified by LAN or BNA, and 72 is terminated by C3 spacer arm modification;

<400> 3

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 4

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 4, 8, 10, 14 and 18 are modified by LAN or BNA;

<220>

<221> misc_feature

<222> (1)..(18)

<223> right non-tag region blocking sequence, bases at positions 36, 40, 43, 47, 50, 53 and 58 being modifications of LAN or BNA;

<400> 4

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 5

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(20)

<223> left non-tag region blocking sequence, bases at positions 4, 7, 9, 12, 15, 18 and 20 are modified with LAN or BNA;

<220>

<221> misc_feature

<222> (31)..(72)

<223> right non-tag region blocking sequence, bases at positions 33, 39, 43, 47, 50, 54, 57, 59, 63 and 67 are modified by LAN or BNA, and the 72 th end is modified by C3 spacer arm;

<400> 5

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 6

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 4, 6, 8, 10, 12, 14 and 18 are modified with LAN or BNA;

<220>

<221> misc_feature

<222> (29)..(60)

<223> right non-tag region blocking sequence, bases at positions 32, 36, 40, 43, 45, 47, 50, 53, 56 and 58 are modified by LAN or BNA, and the end at position 60 is modified by C3 spacer arm;

<400> 6

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 8

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 4, 8, 10, 14 are modified by LAN or BNA;

<220>

<221> misc_feature

<222> (29)..(60)

<223> right non-tag region blocking sequence, bases at positions 36, 40, 43, 47, 50 and 58 are modified by LAN or BNA, and the end at position 60 is modified by C3 spacer arm;

<400> 8

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 8

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(20)

<223> left non-tag region blocking sequence, bases at positions 4, 7, 9, 12, 15, 16, 18 and 20 are modified by LAN or BNA;

<220>

<221> misc_feature

<222> (31)..(72)

<223> right non-tag region blocking sequence, LNA or BAN modification at position 33, 39, 43, 47, 50, 54, 57, 59, 63, 67, 70, C3 spacer arm blocking modification at 72 th base end

<400> 8

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 9

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 4, 6, 8, 10, 12, 14, 16 and 18 are modified with LAN or BNA;

<220>

<221> misc_feature

<222> (29)..(60)

<223> right non-tag region blocking sequence, bases at positions 32, 36, 40, 43, 45, 47, 50, 53, 55, 56 and 58 are modified by LAN or BNA, and the end at position 60 is modified by C3 spacer arm;

<400> 9

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 10

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(20)

<223> tag region blocking sequence, bases at

positions

4, 5, 7, 9, 12, 15 and 18 are LAN or BNA modifications;

<220>

<221> misc_feature

<222> (31)..(72)

<223> right non-tag region blocking sequence, LNA or BAN modification at position 39, 43, 47, 50, 54, 57, 59, 63, 67, 70, C3 spacer arm blocking modification at 72 th base end

<400> 10

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 11

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 2, 4, 6, 8, 10, 12 and 14 are modified by LAN or BNA;

<220>

<221> misc_feature

<222> (29)..(60)

<223> right non-tag region blocking sequence, bases at positions 36, 40, 43, 45, 47, 50, 53, 55, 56 and 58 are modified by LAN or BNA, and the end at position 60 is modified by C3 spacer arm;

<400> 11

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

<210> 12

<211> 72

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (21)..(30)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(20)

<223> tag region blocking sequence, bases at positions 7, 9, 12, 15, 16, 18 and 20 are LAN or BNA modifications;

<220>

<221> misc_feature

<222> (31)..(72)

<223> right non-tag region blocking sequence, LNA or BAN modification at the 33, 36, 39, 43, 47, 50, 54, 57, 59 and 63 bases, C3 spacer arm blocking modification at the 72 th base end

<400> 12

ctctcagtac gtcagcagtt nnnnnnnnnn caactccttg gctcacagaa cgacatggct 60

acgatccgac tt 72

<210> 13

<211> 60

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<220>

<221> misc_feature

<222> (19)..(28)

<223> tag region blocking sequence, n represents hypoxanthine or the C3 spacer;

<220>

<221> misc_feature

<222> (1)..(18)

<223> left non-tag region blocking sequence, bases 6, 8, 10, 12, 14, 16 and 18 are modified with LAN or BNA;

<220>

<221> misc_feature

<222> (29)..(60)

<223> right non-tag region blocking sequence, bases at positions 32, 36, 37, 40, 43, 45, 47, 50, 53 and 56 are modified by LAN or BNA, and the end at position 60 is modified by a C3 spacer arm;

<400> 13

gcatggcgac cttatcagnn nnnnnnnntt gtcttcctaa gaccgcttgg cctccgactt 60

Claims

1. a universal blocking sequence combination, wherein the universal blocking sequence combination is a combination of a blocking sequence with a P1 linker of a first tag sequence and a blocking sequence with a P2 linker of a second tag sequence of an MGI sequencing platform, wherein the combination is selected from the group consisting of:

combination 1): the closed sequence of the P1 joint is as follows:

the closed sequence of the P2 joint is as follows:

GCA + TGGC + GA + CCTT + ATCA + GXXXXXXXXXXTTGTCTT + CCTA + AGA + CCGC + TTG + GCC + TCCGA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification;

or

Combination 2): the closed sequence of the P1 joint is as follows:

the blocking sequence of the P2 linker:

GCA + TG + GC + GA + CC + TT + ATCA + GXXXXXXXXXTTG + TCTT + CCTA + AGA + CC + GC + TTG + GCC + TCC + GA + CTT/3SpC 3/; wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a 3' C3 spacer modification;

or

Combination 3) the blocking sequence of the P1 linker is:

CTC + T + CA + GT + ACG + TCA + GCA + GTTXXXXXXXCAACTCCT + TGGC + TCAC + AGA + ACGA + CAT + GG + CTAC + GATC + CGA + CTT/3SpC3/, wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/is a C3 spacer arm modification at the 3' end;

the closed sequence of the P2 linker is:

g + CA + TG + GC + GA + CC + TT + ATCAGAXXXXXXXTTGTCTT + CCTA + AGA + CC + GC + TTG + GCC + TC + C + GA + CTT/3SpC3/, wherein + represents a LAN or BNA modification, X represents hypoxanthine or C3 spacer,/3 SpC 3/C3 spacer modification at the 3' end.

2. A capture kit comprising a combination of universal blocking sequences, wherein the combination of universal blocking sequences is according to claim 1.

3. The kit according to claim 2, wherein the universal capture probe is present in the kit at a working concentration of 0.4 to 0.8 μ g of any one of the universal blocking sequence combinations per 1 μ g of library to be captured.

4. A library hybridization capture method, comprising capturing a library to be captured by using a capture kit, wherein the capture kit is the capture kit of claim 2 or 3, and the molar ratio of any blocking sequence in the universal blocking sequence combination to the library to be captured is 20: the ratio of 1 was used for blocking.

5. The method of claim 4,

when the combination of the general closed sequences is the combination 1), adding the concentration of any closed sequence to 200 [ mu ] mol/L;

when the combination of the universal closed sequences is the combination 2), adding the concentration of any closed sequence to be 100 [ mu ] mol/L;

when the combination of the universal blocking sequences is the combination 3), the addition concentration of any one blocking sequence is 100 [ mu ] mol/L.

6. A multi-library capture method, comprising:

constructing a fragmentation library;

performing hybrid capture on the fragmentation library by using the kit of claim 2 or 3 or the method of claim 4 or 5 to obtain a capture library;

carrying out PCR amplification on the capture library to obtain a sequencing library;

in the step of performing hybrid capture on the fragmentation libraries, no more than 12 fragmentation libraries are subjected to hybrid capture in a single time, and the working concentration of the universal blocking sequence is 0.4-0.8 μ g of any one of the universal blocking sequence combinations/1 μ g of library to be captured.