Releases: ncbi/amr
AMRFinderPlus v4.0.3
NOTE: This is a new major revision of AMRFinderPlus that has a changed database format. You will need to run amrfinder -U to get database release 2024-10-22.1 or later to use this version.
This AMRFinderPlus release includes StxTyper version 1.0.27 for E. coli Stx operon typing.
Software and database format changes
- See https://github.com/evolarjun/amr/wiki/New-in-AMRFinderPlus/ for more details.
- StxTyper (https://github.com/ncbi/stxtyper), which accurately types full-length stx operons, is now included with AMRFinderPlus.
- Database filename changes to standardize filenames and make writing scripts easier (See https://github.com/ncbi/amr/wiki/New-in-AMRFinderPlus for more information).
- Column names in the output have been changed to more closely match other NCBI Pathogen Detection resources such as MicroBIGG-E.
- Point mutations can now be reported with symbols different from the standard format.
- The optional hierarchy_node field will now contain two node IDs for fusion proteins and stx operons, separated by '::'.
- The ability to have different curated blast cutoffs at multiple levels of the hierarchy with the same reference protein has been added.
- The same protein can be used both as a reference gene to be reported and as the reference for point mutations.
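For scripts that read the hierarchy_node field, the '::' separator described above can be handled with a small helper. This is only a sketch: the function name and the node IDs in the usage note are hypothetical, and it assumes the field has already been read as a plain string.

```python
def split_hierarchy_node(value: str) -> list[str]:
    """Split a hierarchy_node value into its node IDs.

    Fusion proteins and stx operons carry two IDs joined by '::';
    other rows carry a single ID, which is returned as a one-item list.
    """
    return [part for part in value.split("::") if part]
```

For example, a hypothetical value "nodeA::nodeB" yields two IDs, while a value without the separator comes back unchanged as a single-element list.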
AMRFinderPlus v3.12.8
AMRFinderPlus v3.12.8 includes a new database format version. Because this version is incompatible with AMRFinderPlus databases from previous versions, amrfinder -u must be run once the software update is installed to get the latest version of the database.
This update includes the changes:
- Reference proteins now have stop codons. Input proteins that are not terminated with a '*' character are assumed to have a '*' at the end.
- Extended proteins are no longer “EXACT” hits, so the protein names will be that of the parent node for alleles.
- If a protein is hit by blast and the Blast Rule is violated, the Blast Rule of the parent family is checked, etc.
- --annotation_format prodigal input format added
This means that:
- Previously, extended hits that were 100% identical to, but longer than, proteins in the database were called with the method "EXACTP"; now they are called "BLASTP". Blast results will still indicate that 100% of the database reference was covered at 100% identity, and you should still see that the "Target length" is longer than the "Reference sequence length".
- Extended hits to allele sequences that were previously called EXACTP already had an element symbol one node up the hierarchy (i.e. the gene symbol, not the allele symbol), but the "Sequence name" was that of the allele reference and thus inconsistent with the element symbol assigned. This has been changed. Again, this does not affect previous "Element symbol" calls.
- In combined protein + nucleotide runs of AMRFinderPlus, extended protein sequences with exact matches in the database were ignored and the nucleotide match was taken, resulting in an EXACTX match with no protein accession provided. Now the protein match with method BLASTP will be reported.
To reiterate, gene symbols should not change. Sequence names will only change if the query is an extended version of a protein 100% identical and completely covering an allele in the database.
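To find the rows affected by this change in a results file, one could filter for BLASTP hits that cover the whole reference at full identity but are longer than it. The sketch below assumes a tab-delimited report with a header row. "Target length" and "Reference sequence length" are named in these notes, but the "Method", "% Coverage of reference sequence" and "% Identity to reference sequence" column names are assumptions that may differ between versions, and the function names are hypothetical.

```python
import csv
import io


def is_extended_exact_hit(row: dict) -> bool:
    """True for a BLASTP hit that fully and identically covers the
    reference but is longer than it (an "extended" hit)."""
    return (
        row["Method"] == "BLASTP"
        and float(row["% Coverage of reference sequence"]) == 100.0
        and float(row["% Identity to reference sequence"]) == 100.0
        and int(row["Target length"]) > int(row["Reference sequence length"])
    )


def extended_hits(tsv_text: str) -> list[dict]:
    """Return the extended hits found in a tab-delimited report."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if is_extended_exact_hit(row)]
```

Passing the text of a report to extended_hits returns the matching rows as dictionaries keyed by column name, so the "Sequence name" values of those rows can be compared between software versions.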
AMRFinderPlus v3.11.26
Fix for rare crash and change in behavior of hierarchy_node. This release will likely have no effect for most AMRFinderPlus users.
- Fixes an unusual crash when using the --ident_min option with very low identity cutoffs (we generally don't recommend using this option).
- A change in the behavior of the --print_node option. It will now print the node associated with the element reported, rather than the most specific node hit. This changes the node reported for non-exact blast hits to reference sequences of named alleles to report the parent node. It should have no effect on all other hits and does not affect the behavior of AMRFinderPlus in any field other than hierarchy_node.
AMRFinderPlus v3.11.20
This is a minor bug-fix release with two fixes:
- Fix for a protein sequence that could cause AMRFinderPlus to crash. This was very rare: we saw one failure when running on over a million assemblies and all RefSeq proteins.
- Found and fixed another case where AMRFinderPlus would produce two lines for what should really be a single partial hit reported.
AMRFinderPlus v3.11.18
AMRFinderPlus version 3.11.18 is another minor bug fix release.
It fixes a bug where some point mutation results could be duplicated in AMRFinderPlus output. There will no longer be two rows in the AMRFinderPlus output for these circumstances.
This bug only affected a few point mutations that occur at/near the end of proteins where we had to add mutated reference sequences to a BLAST database to get BLAST to align through them. We have only seen this bug in our data for the following point mutations: ftsI_I336IKYRI, ftsI_N337NYRIN, mgrB_W47R, and ramR_A19V. See the point mutations in the Reference Gene Catalog for more information about those specific mutations.
AMRFinderPlus v3.11.17
This is a minor software release update with two changes: a bug fix for a very rare bug with mild consequences, and a minor improvement in the functionality of the --update option.
The release fixes a bug when AMRFinderPlus is run in combined nucleotide and protein mode where only a partial gene was assembled for proteins with long amino-acid repeats (e.g., espF, CAI43856.1). In the right circumstances AMRFinderPlus could report multiple combined hits to the same gene with the same start and stop coordinates. We've only seen the exact triggering circumstances a few times in more than a million assemblies, so it should be very rare. This release fixes that bug.
The release also contains a minor improvement to amrfinder_update: it now creates the parent database directory if it doesn't exist when amrfinder_update is run.
AMRFinderPlus v3.11.14
This release addresses a few issues brought up on GitHub.
Changes:
- On failure no -o output file is created - #115
- AMRFinderPlus will now automatically decompress files ending in .gz with gunzip (this relies on gunzip being in PATH) - #61
- AMRFinderPlus does not support unicode, but it will not check GFF files to prohibit extended ASCII or UTF-8 characters specifically (still prohibits GFF files with ASCII control characters between 0x00 and 0x1F) - #119
- Add reporting of curl error messages - #120
AMRFinderPlus v3.11.11
This release has two primary changes:
- Version checking for blast on Mac to avoid the bug in BLAST 2.12.0 with the -mt_mode parameter
- Updated handling of special characters in the sequence identifiers in GFF and FASTA files.
Special character handling
Implemented special character handling in GFF files according to https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md. The following need to be escaped in the sequence identifiers using URL-style codes in .gff files:
- # (comment start)
- tab (%09)
- newline (%0A)
- carriage return (%0D)
- % percent (%25)
- control characters (%00 through %1F, %7F)
- ; semicolon (%3B)
- = equals (%3D)
- & ampersand (%26)
- , comma (%2C)
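A minimal sketch of how a script might apply the escapes listed above before writing sequence identifiers to a .gff file. The function name is hypothetical and this is not AMRFinderPlus code. The list gives no code for '#', so the sketch uses %23, its standard percent-encoding. Characters are escaped one at a time so that the '%' introduced by an escape is never escaped again.

```python
# Printable characters from the list above that need URL-style escaping.
_RESERVED = set("#%;=&,")


def escape_gff_identifier(identifier: str) -> str:
    """Percent-encode reserved and control characters in a GFF sequence identifier."""
    out = []
    for ch in identifier:
        code = ord(ch)
        # Control characters are 0x00 through 0x1F plus 0x7F (this covers
        # tab, newline and carriage return).
        if ch in _RESERVED or code <= 0x1F or code == 0x7F:
            out.append(f"%{code:02X}")
        else:
            out.append(ch)
    return "".join(out)
```

Identifiers that contain none of the listed characters pass through unchanged, so the helper can be applied to every identifier.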
Added input checking for nucleotide sequences because 'makeblastdb' truncates and/or alters sequence identifiers with the following characteristics. Now nucleotide FASTA identifiers (characters after '>' and before the first whitespace) with any of the following will cause amrfinder to exit with an error message.
- FASTA identifier starts with '?'
- FASTA identifier contains the two character sequence ',,' or '\t' (the character '\' followed by the character 't')
- FASTA identifier ends with ';' '~' ',' or '.'
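The three rules above can be checked ahead of time with a short pre-flight script. This is a sketch under stated assumptions, not the check that amrfinder itself performs: the function name and the wording of the messages are hypothetical, and the identifier is taken to be everything after '>' up to the first whitespace character, as described above.

```python
import re


def fasta_identifier_problems(header_line: str) -> list[str]:
    """Return the reasons a nucleotide FASTA identifier would be rejected.

    header_line is a FASTA definition line beginning with '>'.
    An empty list means none of the three rules is violated.
    """
    # Identifier: characters after '>' and before the first whitespace.
    identifier = re.split(r"\s", header_line[1:], maxsplit=1)[0]
    problems = []
    if identifier.startswith("?"):
        problems.append("starts with '?'")
    if ",," in identifier:
        problems.append("contains ',,'")
    if "\\t" in identifier:  # a literal backslash followed by 't', not a tab
        problems.append("contains backslash followed by 't'")
    if identifier.endswith((";", "~", ",", ".")):
        problems.append("ends with ';', '~', ',' or '.'")
    return problems
```

Running this over every line that starts with '>' in a nucleotide FASTA file and reporting any non-empty result would flag identifiers to rename before running amrfinder.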
Bug fix for Bioconda compatibility
On Mac the BLAST parameter -mt_mode requires BLAST version 2.13.0 to run on short query sequences.
This release includes a bug-fix to allow AMRFinderPlus to be compatible with Bioconda which doesn't have BLAST+ version 2.13.0 for the Mac. The previous release 3.11.8 could not be added to Bioconda because at the time of release Bioconda did not have BLAST+ version 2.13.0 for Mac.
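The version constraint described here can be expressed as a small comparison. The sketch below is only an illustration of the rule as stated in these notes (the -mt_mode option arrived in BLAST+ 2.12.0, and on Mac it needs 2.13.0). It is not how AMRFinderPlus implements its check, and both function names are hypothetical.

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a version string such as '2.13.0' or '2.13.0+' into (2, 13, 0)."""
    return tuple(int(part) for part in text.strip().rstrip("+").split("."))


def mt_mode_usable(blast_version: str, system: str = "") -> bool:
    """Whether -mt_mode can be used with this BLAST+ version on this platform.

    system defaults to platform.system(), which returns 'Darwin' on macOS.
    """
    system = system or platform.system()
    minimum = (2, 13, 0) if system == "Darwin" else (2, 12, 0)
    return parse_version(blast_version) >= minimum
```

Comparing tuples of integers avoids the pitfall of string comparison, where "2.9.0" would sort after "2.13.0".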
See also release notes for version 3.11.8: https://github.com/ncbi/amr/releases/tag/amrfinder_v3.11.8
AMRFinderPlus v3.11.8
NOTE: This version uses the blastx option -mt_mode 1, which was released in BLAST+ version 2.12.0, but there was a bug in the Mac OSX implementation. The current latest BLAST+ version (2.13.0) does not have that bug, so this release is only compatible with BLAST+ 2.13.0 on the Mac.
The Bioconda package for BLAST+ 2.13.0 was not compiled for the Mac for some reason (bioconda/bioconda-recipes#35897), so the latest BLAST+ version available in bioconda for Mac OSX is 2.12.0. For that reason this version will not be released in bioconda.
We are working on a fix and hope to have a new Bioconda-compatible release soon.
- Performance improvements by optimizing blast parameters
  - Faster by 70% on a single-threaded nucleotide-only run
  - Faster by 64% on a single-threaded protein-only run
  - Faster by 58% on a single-threaded combined run
- Improved handling of special characters
  - To simplify issues I would avoid '#' ',' '%' '=' '&' and ':' in sequence identifiers, though they should now be correctly handled with escape sequences in GFF files. See the documentation of --gff for more information.
  - Fixed handling for FASTA identifiers with leading underscore "_" (#115)
- Added --annotation_format standard
AMRFinderPlus v3.11.4
This includes a new amrfinder_index program to re-index the AMRFinderPlus database.
It also includes some mostly cosmetic code and error message cleanup and minor updates to github actions.
There were no changes in the way AMRFinderPlus is run or how it works.