CurateScienceBots

This project scrapes article metadata from academic journals, such as Psychological Science, Collabra: Psychology, and Journal of Cognition, to feed the Curate Science database.

Curate Science is a platform that helps verify the transparency and credibility of research.

Extracted data

The extracted data looks like this sample:

{
    'title': 'A New Replication Norm for Psychology',
    'year': '2015',
    'article_type': 'Original research report',
    'doi': '10.1525/collabra.23',
    'keywords': 'Independent replication, cumulative knowledge, replication norm',
    'peer_review_url': 'https://dx.doi.org/10.1525/collabra.23.opr',
    'conflict_of_interests': 'The author declares that they have no competing interests.',
    'views': '2244',
    'downloads': '412'
}
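Every value in the sample above is a string, including the numeric fields. If you need typed values downstream, a post-processing step along these lines may help. This is a minimal sketch and not part of the project: `normalize_item` is a hypothetical helper, and it assumes the numeric fields hold plain digit strings and that keywords are comma-separated, as in the sample.

```python
def normalize_item(item):
    """Return a copy of a scraped item with typed fields (hypothetical helper)."""
    cleaned = dict(item)

    # Assumption: these fields hold plain digit strings, as in the sample.
    for field in ("year", "views", "downloads"):
        value = cleaned.get(field)
        cleaned[field] = int(value) if value not in (None, "") else None

    # Assumption: keywords are a single comma-separated string.
    keywords = cleaned.get("keywords") or ""
    cleaned["keywords"] = [k.strip() for k in keywords.split(",") if k.strip()]
    return cleaned


sample = {
    "title": "A New Replication Norm for Psychology",
    "year": "2015",
    "doi": "10.1525/collabra.23",
    "keywords": "Independent replication, cumulative knowledge, replication norm",
    "views": "2244",
    "downloads": "412",
}

normalized = normalize_item(sample)
print(normalized["year"], normalized["views"], normalized["keywords"])
```

The helper returns a copy, so the original scraped item stays untouched.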

Spiders

This project contains three spiders and you can list them using the list command:

$ scrapy list
collabra
jofcognition
psych_science

You can learn more about the spiders by going through the Scrapy Tutorial.

Running the spiders

You can run a spider using the scrapy crawl command, such as:

$ scrapy crawl collabra
$ scrapy crawl jofcognition
$ scrapy crawl psych_science

If you want to save the scraped data to a file, you can pass the -o option:

$ scrapy crawl psych_science -o psychological_science.csv
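Once the data is saved, the CSV file can be read back with Python's standard `csv` module. The sketch below is illustrative only: `load_articles` is a hypothetical helper, and it assumes the CSV header row uses the same field names as the extracted data sample. An in-memory string stands in for the real file so the example runs on its own.

```python
import csv
import io


def load_articles(csv_file):
    """Read scraped rows into a list of dicts keyed by the CSV header (hypothetical helper)."""
    return list(csv.DictReader(csv_file))


# Stand-in for: open("psychological_science.csv", newline="", encoding="utf-8")
# Assumption: the header row matches the field names of the scraped items.
sample_csv = io.StringIO(
    "title,year,doi\r\n"
    "A New Replication Norm for Psychology,2015,10.1525/collabra.23\r\n"
)

articles = load_articles(sample_csv)
print(len(articles), articles[0]["doi"])
```

Values read this way are strings, so numeric fields such as the year still need conversion before any arithmetic.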

dominik-lenda/web-scraping-curate-science
