SIGNIFICANT USER-VISIBLE CHANGES
- Remove warnings related to using coverage_matrix() or expressed_regions() on Windows as rtracklayer::import() does work with local BigWig files on that operating system. I'm not sure if it will work with remote BigWig files given that remote BigWig file access on other operating systems is not working due to lawremi/rtracklayer#83 and related issues.
BUG FIXES
- Merged a pull request by @hpages which addresses some changes to GenomicFeatures. See #24 for details.
BUG FIXES
- Fix a bug in geo_info() for reading files on Windows where a trailing \r was added to all variables.
- Avoid the "implicit list embedding of S4 objects is deprecated" warning that was noted at https://github.com/leekgroup/recount/runs/3286046827?check_suite_focus=true#step:20:1417.
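As a minimal sketch of the kind of clean-up behind the geo_info() fix above (not the package's actual code), trailing carriage returns left over from Windows line endings can be stripped in base R. The helper name strip_cr() and the sample values are hypothetical.

```r
# Hypothetical helper: remove a single trailing carriage return from each string.
strip_cr <- function(x) {
    sub("\r$", "", x)
}

# Made-up values as they might look after reading a CRLF file on Windows.
fields <- c("cell line: HeLa\r", "treatment: control\r", "no_cr")
clean <- strip_cr(fields)
print(clean)
```

Strings without a trailing \r pass through unchanged, so the helper is safe to apply on every operating system.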
SIGNIFICANT USER-VISIBLE CHANGES
- Documentation website is now available at https://leekgroup.github.io/recount/. It gets updated with every commit on the master branch (bioc-devel) using GitHub Actions and pkgdown.
- Mention in reproduce_ranges() the link to https://support.bioconductor.org/p/126148/#126173 which shows how to update the gene symbols in the RSE objects in recount.
- Added a NEWS.md file to track changes to the package.
NEW FEATURES
- Added the function getTPM() as discussed in https://support.bioconductor.org/p/124265 and based on Sonali Arora et al https://www.biorxiv.org/content/10.1101/445601v2.
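For readers unfamiliar with TPM, the following is a generic base R sketch of the usual definition (length-normalized counts rescaled so each sample sums to one million). It is an illustration under stated assumptions, not necessarily the exact computation performed by getTPM(). The helper tpm_sketch(), the counts, and the gene lengths are all made up.

```r
# Made-up counts: 3 genes (rows) by 2 samples (columns).
counts <- matrix(c(10, 20, 30, 0, 50, 50),
    nrow = 3,
    dimnames = list(c("g1", "g2", "g3"), c("s1", "s2"))
)
# Made-up gene lengths in base pairs, one per row of `counts`.
gene_length_bp <- c(g1 = 1000, g2 = 2000, g3 = 3000)

# Hypothetical helper: counts per kilobase, rescaled so each column sums to 1e6.
tpm_sketch <- function(counts, length_bp) {
    # The length vector has one entry per row, so this divides row-wise.
    rate <- counts / (length_bp / 1000)
    sweep(rate, 2, colSums(rate), "/") * 1e6
}

tpm <- tpm_sketch(counts, gene_length_bp)
print(colSums(tpm))
```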
BUG FIXES
- Now geo_characteristics() can deal with the scenario reported at https://support.bioconductor.org/p/116480/ by @Jacques.van-Helden.
SIGNIFICANT USER-VISIBLE CHANGES
- Renamed .load_install() to .load_check() since this function now only checks that the package(s) are installed and returns an error if any are missing. The error shows the user how to install the package(s) they are missing instead of installing them automatically. This complies with Marcel Ramos' request at #14.
NEW FEATURES
- Added the function download_retry() based on https://bioconductor.org/developers/how-to/web-query/ such that download_file() and other recount functions will re-try to download a file 3 times before giving up. This should help reduce the number of occasional failed Bioconductor nightly checks.
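The retry idea described above can be sketched in base R as follows. This is an assumption-laden illustration rather than the implementation of download_retry(): the helper retry_sketch(), its arguments, and the simulated flaky() download are all hypothetical.

```r
# Hypothetical helper: call `f` up to `n_tries` times, returning the first
# successful result and re-raising the last error if every attempt fails.
retry_sketch <- function(f, n_tries = 3, wait = 0) {
    for (i in seq_len(n_tries)) {
        result <- tryCatch(f(), error = function(e) e)
        if (!inherits(result, "error")) {
            return(result)
        }
        message(
            "Attempt ", i, " of ", n_tries, " failed: ",
            conditionMessage(result)
        )
        # Pause between attempts, but not after the final one.
        if (i < n_tries) Sys.sleep(wait)
    }
    stop(result)
}

# Simulated flaky download: fails twice, then succeeds on the third call.
attempts <- 0
flaky <- function() {
    attempts <<- attempts + 1
    if (attempts < 3) stop("temporary network error")
    "downloaded"
}

status <- retry_sketch(flaky)
print(status)
```

A non-zero wait between attempts is usually worthwhile for real network calls, since transient server errors often clear within a few seconds.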
SIGNIFICANT USER-VISIBLE CHANGES
- Cleaned up the documentation of add_metadata() and changed the default source from recount_brain_v1 to recount_brain_v2.
NEW FEATURES
- citation('recount')[5] now lists the recount-brain bioRxiv pre-print citation information.
NEW FEATURES
- We have now released the FANTOM-CAT/recount2 RSE files which you can access now through download_study(type = 'rse-fc'). See Imada EL, Sanchez DF, et al, bioRxiv, 2019 https://www.biorxiv.org/content/10.1101/659490v1 for more information.
BUG FIXES
- I made the example code for geo_characteristics() more robust since currently rentrez can occasionally fail.
NEW FEATURES
- Add ORCIDs following changes at https://bioconductor.org/developers/package-guidelines/#description
BUG FIXES
- Added the argument async to snaptron_query() which can be set to FALSE to address the issue reported at ChristopherWilks/snaptron#11
BUG FIXES
- Updated reproduce_ranges() to match the URL change in Gencode from ftp://ftp.sanger.ac.uk to ftp://ftp.ebi.ac.uk
SIGNIFICANT USER-VISIBLE CHANGES
- add_metadata() can now download the recount_brain_v2 data.
BUG FIXES
- Fix a NOTE about RefManageR.
SIGNIFICANT USER-VISIBLE CHANGES
- Use BiocManager.
BUG FIXES
- Fix a unit test.
SIGNIFICANT USER-VISIBLE CHANGES
- rse_tx URLs now point to v2 to reflect recent changes by Fu et al.
BUG FIXES
- Change some examples to dontrun and improve the code that cleans up after the tests. This should reduce the size of files left in tmp although they didn't seem too big to begin with.
SIGNIFICANT USER-VISIBLE CHANGES
- The functions add_metadata() and add_predictions() now return the sample metadata or predictions when the rse argument is missing.
NEW FEATURES
- Added the function add_metadata() which can be used to append curated metadata to a recount rse object. Currently, add_metadata() only supports the recount_brain_v1 data available at https://lieberinstitute.github.io/recount-brain/ and to be further described in Razmara et al, in prep, 2018.
BUG FIXES
- Fix doc link in geo_characteristics() which affected the Windows build machines.
BUG FIXES
- Fix a unit test for download_study(), add another test for the versions, and fix a NOTE in R CMD check.
NEW FEATURES
- download_study() can now download the transcript counts (rse_tx.RData) files. The transcript estimation is described in Fu et al, 2018.
SIGNIFICANT USER-VISIBLE CHANGES
- download_study() now has a version parameter (defaults to 2). This argument controls which version of the files to download based on the change in how exons were defined. Version 1 uses reduced exons while version 2 uses disjoint exons, as described in further detail in the documentation tab of the recount website https://jhubiostatistics.shinyapps.io/recount/.
- recount_url and the example rse_gene_SRP009615 have been updated to match the changes in version 2.
BUG FIXES
- Changed reproduce_ranges() since disjoint exons are more useful than reduced exons for downstream analyses.
NEW FEATURES
- Added the function read_counts().
SIGNIFICANT USER-VISIBLE CHANGES
- Added citations for https://www.biorxiv.org/content/early/2017/06/03/145656 and https://f1000research.com/articles/6-1558/v1 as well as mentions of them in the vignette.
SIGNIFICANT USER-VISIBLE CHANGES
- add_predictions() was bumped to version 0.0.05
SIGNIFICANT USER-VISIBLE CHANGES
- Vignette now uses the new BiocStyle::html_document that was recently released.
NEW FEATURES
- coverage_matrix() now has two new arguments: scale and round. Use scale = FALSE to get raw coverage counts, which you can then scale with scale_counts(). scale is set to TRUE by default, so the counts are scaled to a library size of 40 million reads. round is set to FALSE by default, but can be set to TRUE if you want to get integer counts, just as in the default of scale_counts().
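To make the scale and round behavior concrete, here is a generic base R sketch of scaling raw counts to a target library size of 40 million and optionally rounding. It assumes each sample's total is simply its column sum, which is a simplification and not how coverage_matrix() or scale_counts() derive the total. The helper scale_sketch(), its do_round argument, and the counts are hypothetical.

```r
# Made-up raw counts: 3 regions (rows) by 2 samples (columns).
raw <- matrix(c(100, 300, 600, 50, 50, 100), nrow = 3)

# Hypothetical helper: rescale each column so it sums to `target_size`.
# Assumption: the sample total is the column sum of `raw`.
scale_sketch <- function(raw, target_size = 4e7, do_round = FALSE) {
    scaled <- sweep(raw, 2, colSums(raw), "/") * target_size
    if (do_round) round(scaled) else scaled
}

scaled <- scale_sketch(raw)
scaled_int <- scale_sketch(raw, do_round = TRUE)
print(colSums(scaled))
```

Keeping the unrounded values by default avoids discarding information, while rounding is convenient for downstream tools that expect integer counts.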
SIGNIFICANT USER-VISIBLE CHANGES
- Changed the default version argument of add_predictions() to latest. Internally, that's still 0.0.03.
NEW FEATURES
- Added the add_predictions() function which appends the predicted phenotypes to an RSE object downloaded with recount. The phenotypes were predicted by Shannon Ellis et al, 2017 (citation coming up soon!).
SIGNIFICANT USER-VISIBLE CHANGES
- Changed the citation now that the recount2 paper has been published at https://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
NEW FEATURES
- Added the function getRPKM() which can be used with RangedSummarizedExperiment objects from recount and from other sources.
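As background for getRPKM(), the standard RPKM definition (reads per kilobase of feature per million mapped reads) can be sketched in base R on a plain matrix. This is a generic illustration, not the function's implementation, and it does not use RangedSummarizedExperiment objects. The helper rpkm_sketch(), the counts, lengths, and totals are made up.

```r
# Made-up counts: 2 genes (rows) by 2 samples (columns).
counts <- matrix(c(100, 400, 200, 0),
    nrow = 2,
    dimnames = list(c("g1", "g2"), c("s1", "s2"))
)
gene_length_bp <- c(g1 = 1000, g2 = 2000) # made-up lengths in base pairs
mapped_reads <- c(s1 = 1e6, s2 = 2e6)     # made-up mapped read totals

# Hypothetical helper: reads per kilobase per million mapped reads.
rpkm_sketch <- function(counts, length_bp, total = colSums(counts)) {
    # Divide each column by its sample total in millions.
    per_million <- sweep(counts, 2, total / 1e6, "/")
    # The length vector has one entry per row, so this divides row-wise.
    per_million / (length_bp / 1000)
}

rpkm <- rpkm_sketch(counts, gene_length_bp, total = mapped_reads)
print(rpkm)
```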
SIGNIFICANT USER-VISIBLE CHANGES
- recount_url now includes the URLs for the GTEx bigWig files.
SIGNIFICANT USER-VISIBLE CHANGES
- coverage_matrix() now returns a RangedSummarizedExperiment object. This matches the behavior of recount.bwtool::coverage_matrix_bwtool() and is more consistent with the use of RSE objects in recount.
BUG FIXES
- coverage_matrix()'s helper function .read_pheno() was failing for some projects.
BUG FIXES
- Fixed a bug in the counts in coverage_matrix(). They were being incorrectly multiplied by 100.
SIGNIFICANT USER-VISIBLE CHANGES
- Completed the change to Gencode v25 annotation for exon and gene counts.
SIGNIFICANT USER-VISIBLE CHANGES
- We dropped TxDb.Hsapiens.UCSC.hg38.knownGene completely from recount and will be using Gencode v25 instead.
BUG FIXES
- Updated snaptron_query() to comply with recent changes in Snaptron.
SIGNIFICANT USER-VISIBLE CHANGES
- Updated the package so you can now access TCGA data. Now there are over 8 terabytes of data available in the recount project!
SIGNIFICANT USER-VISIBLE CHANGES
- snaptron_query() can now access GTEx and TCGA data.
SIGNIFICANT USER-VISIBLE CHANGES
- Snaptron changed from stingray.cs.jhu.edu:8090 to snaptron.cs.jhu.edu so snaptron_query() has been changed accordingly.
SIGNIFICANT USER-VISIBLE CHANGES
- The function reproduce_ranges() now has the db argument. By default it's set to TxDb.Hsapiens.UCSC.hg38.knownGene to reproduce the actual information used in recount. But it can also be used with EnsDb.Hsapiens.v79 to use the ENSEMBL annotation. Then with coverage_matrix() you can get the counts for either an updated TxDb.Hsapiens.UCSC.hg38.knownGene or for EnsDb.Hsapiens.v79 at the exon and/or gene levels as shown in the vignette.
SIGNIFICANT USER-VISIBLE CHANGES
- The vignette now describes how to download all the data, how to check exon-exon junctions by class, and how to use SciServer compute to access all the recount data (over 6 TB) via https://www.sciserver.org/
NEW FEATURES
- Added the function snaptron_query() which queries Intropolis via Snaptron to find if an exon-exon junction is present in the data.
BUG FIXES
- Fixed a bug in the vignette. Thanks to Michael Love for noticing it!
NEW FEATURES
- Created the package skeleton for recount.
- Added the function reproduce_ranges() for re-creating the gene or exon level information used in the recount project.
- Added the function abstract_search() for identifying SRA projects of interest by searching the abstracts.
- Added the function browse_study() for opening a browser tab for further exploring a project.
- Added the function download_study() for downloading the data from the recount project.
- Added the function scale_counts() for properly scaling the counts before performing a differential expression analysis with the RangedSummarizedExperiment objects hosted in the recount project.
- Added the function expressed_regions() for defining the expressed regions in a chromosome for a given SRA study.
- Added the function coverage_matrix() for computing the coverage matrix based on the regions of interest for a given SRA study.
- Added the function geo_info() for obtaining sample information from GEO.
- Added the function find_geo() for finding the GEO accession id given an SRA run accession (id). This function will be useful for SRA projects that did not have GEO entries at the time recount's data was created.
- Added the function geo_characteristics() for building a data.frame() from geo_info()'s results for the characteristics.
- Added the function all_metadata() which downloads all the phenotype data for all projects. This function can be useful for identifying projects and/or samples of interest.