webscraper-coding-challenge

This was written as part of a coding challenge: it stores sanitised web-scraping results in a MongoDB database that is accessible via an API.

Instructions

Start a virtualenv environment and install the packages listed in requirements.txt.
Create a config file in the news directory containing the username and password for the database.
Navigate to the news directory and start the web crawler with scrapy crawl news.
Once the spider has finished writing the results to the MongoDB database, start the Flask API by running api.py.
The results will now be available via the API.
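
The instructions above do not say what the config file should look like, so the following is only a sketch of one possible layout, not the format this project necessarily reads. The file name config.ini, the [mongodb] section, its key names, and the localhost:27017 host and port are all assumptions made for illustration; check the code in the news directory for the names it actually expects. The sketch writes the credentials to an INI file and builds a MongoDB connection URI from them, percent-encoding the username and password so that characters such as @ or / do not break the URI.

```python
import configparser
from urllib.parse import quote_plus


def write_config(path, username, password):
    """Write the database credentials to an INI file (hypothetical layout)."""
    # interpolation=None lets passwords contain a literal "%" character.
    parser = configparser.ConfigParser(interpolation=None)
    parser["mongodb"] = {"username": username, "password": password}
    with open(path, "w", encoding="utf-8") as handle:
        parser.write(handle)


def mongo_uri_from_config(path, host="localhost", port=27017):
    """Read the credentials back and build a MongoDB connection URI.

    The host and port defaults are assumptions, not values from this project.
    """
    parser = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as handle:
        parser.read_file(handle)
    section = parser["mongodb"]
    # Reserved characters in the credentials must be percent-encoded.
    user = quote_plus(section["username"])
    password = quote_plus(section["password"])
    return f"mongodb://{user}:{password}@{host}:{port}/"
```

Keeping the credentials in a separate file like this means they can be listed in .gitignore and kept out of version control.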

Using the API

Send a POST request to the /intext endpoint with a JSON body in the following format: {"searchterm": "sea cucumber example text"}. The search terms are matched across all text fields in the database, and the result is returned as JSON.
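
As a rough illustration of the request described above, here is a minimal client sketch that uses only the Python standard library. The base URL http://127.0.0.1:5000 is an assumption (it is Flask's default development address, and api.py may be configured differently), and the function names build_intext_request and search are hypothetical helpers, not part of this project.

```python
import json
import urllib.request


def build_intext_request(searchterm, base_url="http://127.0.0.1:5000"):
    """Build a POST request for the /intext endpoint.

    base_url is an assumed default; adjust it to wherever api.py is served.
    """
    body = json.dumps({"searchterm": searchterm}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/intext",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(searchterm, base_url="http://127.0.0.1:5000"):
    """Send the search request and return the decoded JSON response."""
    request = build_intext_request(searchterm, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, calling search("sea cucumber example text") should return the parsed JSON result. The shape of that result is not documented here, so the sketch returns it as-is.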


benhili/python-webscraper
