Tags: diegozea/MIToS.jl
[Diff since v2.22.0](v2.22.0...v3.0.0)

**MIToS v3.0.0** requires Julia v1.9 or higher, dropping support for older versions. This release introduces several breaking changes to improve the usability of the package. When possible, deprecation warnings are used to inform you of the changes.

The MSA module now includes ways to read, write, and work with unaligned protein sequences:

* The `MSA` module now exports the `AnnotatedSequence` type to represent a single protein sequence with annotations. This type is a subtype of the new `AbstractSequence` type, which is itself a subtype of the new `AbstractResidueMatrix` type.
* The `MSA` module now exports the `sequence_id` function to get the identifier of a sequence object.
* The `MSA` module now defines the `FASTASequences`, `PIRSequences`, and `RawSequences` file formats to read and write (unaligned) protein sequences in FASTA, PIR, and raw formats, respectively.
* *[Breaking change]* The behavior of the `getannotresidue`, `getannotsequence`, `setannotresidue!`, and `setannotsequence!` functions has changed for sequence objects, such as `AnnotatedSequence`, `AnnotatedAlignedSequence`, and `AlignedSequence`. These functions now take the feature name, rather than the sequence name, as the second positional argument. As an example of migration, `getannotsequence(sequence, "sequence_name", "feature_name")` should be replaced by `getannotsequence(sequence, "feature_name")`. You still need to specify the sequence name when working with MSA objects.

Other changes in the MSA module are:

* *[Breaking change]* The `join` function for `AnnotatedMultipleSequenceAlignment` objects is deprecated in favor of the `join_msas` function.
* *[Breaking change]* The `Clusters` type is no longer a subtype of `ClusteringResult` from the `Clustering.jl` package. Instead, the `Clusters` type is now a subtype of the new `AbstractCluster` type. Support for the `Clustering.jl` interface is still available through package extensions. You now need to load the `Clustering.jl` package to use the `assignments`, `nclusters`, and `counts` functions.

The PDB module now depends on the `BioStructures` package. The main changes in the PDB module are:

* The `PDB` module now exports the `MMCIFFile` file format to read and write PDB files in the mmCIF format (using `BioStructures` under the hood).
* *[Breaking change]* The `download_alphafold_structure` function can now download the predicted structures from the *AlphaFold Protein Structure Database* using the mmCIF format (`format=MMCIFFile`). This is the new default format. Therefore, you should use `format=PDBFile` to get a PDB file like before. For example, `download_alphafold_structure("P00520")` in previous versions is the same as `download_alphafold_structure("P00520", format=PDBFile)` in this version.
* *[Breaking change]* The `downloadpdb` function now returns an mmCIF file by default. Therefore, you should use `format=PDBML` to get a PDBML file. As an example of migration, `downloadpdb("1IVO")` should be replaced by `downloadpdb("1IVO", format=PDBML)`, unless you want to get an mmCIF file.
* *[Breaking change]* The `PDBAtom` type now has two extra fields, `alt_id` and `charge`, to represent the alternative location indicator and the atom's charge, respectively. This improves compatibility with the mmCIF format and the `BioStructures` package.
* *[Breaking change]* The `query_alphafolddb` function now returns the EntrySummary object of the returned JSON response instead of the Root list. Therefore, there is no need to take the first element of the list to get the required information. For example, `query_alphafolddb("P00520")[1]["uniprotId"]` should be replaced by `query_alphafolddb("P00520")["uniprotId"]`.
* *[Breaking change]* The `MIToS.Utils.Scripts` module and the MIToS scripts have been moved to their own package at [MIToS_Scripts.jl](https://github.com/MIToSOrg/MIToS_Scripts.jl). Therefore, the `MIToS.Utils.Scripts` module is no longer exported. This reduces the number of MIToS dependencies and improves load time.
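
The PDB-related migrations for v3.0.0 can be collected into a short sketch. It uses only the calls quoted in the notes above and assumes MIToS v3.0.0 or later is installed and that network access to the PDB and the AlphaFold Protein Structure Database is available. Treat it as an illustration of the changed defaults, not as a tested recipe.

```julia
using MIToS.PDB

# From v3.0.0, the default download format is mmCIF.
downloadpdb("1IVO")

# To keep the pre-3.0 behavior, request PDBML explicitly.
downloadpdb("1IVO", format = PDBML)

# AlphaFold models also default to mmCIF now; ask for a PDB file
# explicitly to get the same result as in previous versions.
download_alphafold_structure("P00520", format = PDBFile)

# The query result is indexed directly; the former `[1]` is no longer needed.
uniprot_id = query_alphafolddb("P00520")["uniprotId"]
```

Passing `format` explicitly in both cases keeps scripts independent of the default, which may be worth doing if the same code must run under more than one MIToS version.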
[Diff since v2.21.0](v2.21.0...v2.22.0)

This version introduces several breaking changes to improve the usability of the `Information` module. The main changes are:

* *[Breaking change]* The `Information` module deprecates the `Counts` type in favor of the new `Frequencies` type. The new type has the same signature and behavior as the old one.
* *[Breaking change]* The `count` function on sequences has been deprecated in favor of the `frequencies` function, which has the same signature and behavior as the old one.
* *[Breaking change]* The `count!` function is deprecated in favor of `frequencies!`. The new function uses keyword arguments to define the weights and pseudocounts. As an example of migration, `count!(table, weights, pseudocounts, seqs...)` should be replaced by `frequencies!(table, seqs..., weights=weights, pseudocounts=pseudocounts)`.
* *[Breaking change]* The `probabilities!` method using positional arguments for the weights, pseudocounts, and pseudofrequencies is deprecated in favor of the one that uses keyword arguments. As an example of migration, `probabilities!(table, weights, pseudocounts, pseudofrequencies, seqs...)` should be replaced by `probabilities!(table, seqs..., weights=weights, pseudocounts=pseudocounts, pseudofrequencies=pseudofrequencies)`.
* *[Breaking change]* The `Information` module has deprecated the `entropy` method on `Frequencies` and `Probabilities` in favor of the `shannon_entropy` function. The base is now set using the `base` keyword argument. As an example of migration, `entropy(p, 2)` should be replaced by `shannon_entropy(p, base=2)`.
* *[Breaking change]* The `marginal_entropy` methods based on positional arguments are deprecated in favor of a method relying on the `margin` and `base` keyword arguments. As an example of migration, `marginal_entropy(p, 2, 2.0)` should be replaced by `marginal_entropy(p, margin=2, base=2.0)`.
* *[Breaking change]* The `mutual_information` method based on positional arguments is deprecated in favor of a method relying on the `base` keyword argument. As an example of migration, `mutual_information(p, 2)` should be replaced by `mutual_information(p, base=2)`.
* *[Breaking change]* The `mapcolpairfreq!` and `mapseqpairfreq!` functions now use the boolean `usediagonal` keyword argument to indicate whether the function should be applied to the diagonal elements of the matrix (the default is `true`). Previously, this was done by passing `Val{true}` or `Val{false}` as the last positional argument.
* The `mapcolfreq!`, `mapseqfreq!`, `mapcolpairfreq!`, and `mapseqpairfreq!` methods using keyword arguments now pass the extra keyword arguments to the mapped function.
* The `Information` module now exports the `mapfreq` function, which offers a higher-level interface to the `mapcolfreq!`, `mapseqfreq!`, `mapcolpairfreq!`, and `mapseqpairfreq!` functions. This function allows the user to map a function to the residue frequencies or probabilities of the columns or sequences of an MSA. When `rank = 2`, the function is applied to pairs of sequences or columns.
* The `Information` module now exports methods of the `shannon_entropy`, `kullback_leibler`, `mutual_information`, and `normalized_mutual_information` functions that take an `AbstractArray{Residue}` as input, e.g. an MSA. Those methods use the `mapfreq` function under the hood to ease the calculation of the information measures on MSAs.
* The `frequencies!`, `frequencies`, `probabilities!`, and `probabilities` functions now accept arrays of `Residue`s of any dimension. Therefore, there is no need to use the `vec` function to convert the arrays to vectors.
* The `MSA` module now exports the `WeightType` union type to represent `weights`.
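
As a rough illustration of the positional-to-keyword migrations listed above, the following sketch computes pairwise frequencies for two columns and derives information measures from them. It assumes MIToS v2.22.0 or later; the two residue vectors are made-up example data, and the variable names are illustrative only.

```julia
using MIToS.MSA          # provides the res"..." string macro for Vector{Residue}
using MIToS.Information

# Two hypothetical alignment columns of equal length.
col_i = res"AARANHDDRDC-"
col_j = res"-ARRNHADRAVY"

# Before v2.22.0: count(col_i, col_j)
Nij = frequencies(col_i, col_j)

# Before v2.22.0: entropy(Nij, 2)
H = shannon_entropy(Nij, base = 2)

# Before v2.22.0: mutual_information(Nij, 2)
MI = mutual_information(Nij, base = 2)
```

The old positional forms are described above as deprecated rather than removed, so existing code should still run with warnings; switching to the keyword forms avoids those warnings and makes the meaning of each argument explicit at the call site.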