Grabs and downloads full-text versions of PubMed records in different formats.

MohammadHalawani/pubmed-articles-grabber

pubmed-articles-grabber

Grabs and downloads full-text versions of PubMed records in different formats and arranges them in folders named after the file formats (XML, PDF, ...). Two extra files are recommended: 'clickThroughToken.txt', which contains the click-through token used to access articles that need a subscription, and 'wanted.csv', which contains the list of wanted PMIDs.

The contents of:    

The main functions for grabbing articles are grab(), grabViaCrossref() and grabViaPMCOAI(). The remaining functions are service functions that can also be used for other purposes (e.g. converting XML to txt format).
   

Recommended steps:

1. Put the list of PMIDs you want to "grab" in the 'wanted.csv' file, one PMID per row.
2. (Optional) Put your click-through token in the 'clickThroughToken.txt' file. Some publishers require a click-through token.
3. Create an object of this class and use its main grabber functions. An example of its use is given under Usage and can also be found in 'main.py'.
4. Do not forget to change the email address from "[email protected]" to your own.
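
The two input files from steps 1 and 2 can be prepared with a short script. The following is only a minimal sketch, assuming that 'wanted.csv' holds one PMID per row with no header line. ``write_inputs`` is a hypothetical helper written for this illustration and is not part of pubmed-articles-grabber.

```python
import csv
from pathlib import Path


def write_inputs(pmids, token=None, folder="."):
    """Write wanted.csv (one PMID per row) and, optionally, clickThroughToken.txt.

    Hypothetical helper for illustration; not part of pubmed-articles-grabber.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Assumption: one PMID per row, no header line.
    wanted_path = folder / "wanted.csv"
    with open(wanted_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for pmid in pmids:
            writer.writerow([str(pmid).strip()])

    # The token file is optional, so it is only written when a token is given.
    if token:
        (folder / "clickThroughToken.txt").write_text(token.strip())

    return wanted_path
```

Calling ``write_inputs(["<PMID 1>", "<PMID 2>"])`` with your own PMIDs writes 'wanted.csv' to the current folder. Pass ``token="..."`` only if a publisher requires a click-through token.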
       

Usage::

    from pubMedArticleGrabber import PubMedArticleGrabber

    # Replace the email address with your own (see step 4).
    wanted = PubMedArticleGrabber('wanted', '[email protected]')
    wanted.grab()