Python Web Scraper

Built with the BeautifulSoup library. The script extracts data from a saved copy of a website and writes it to a JSON file.
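As a rough illustration of the extraction step, the sketch below parses a small HTML fragment with BeautifulSoup and collects paragraph text and image URLs in document order. The sample markup, the choice of `p` and `img` tags, and the function name `extract_items` are assumptions made for this example; the selectors used by `webscraper.py` may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for a saved page.
SAMPLE_HTML = """
<div class="question">
  <p>Which column should you add to the table?</p>
  <img src="https://example.com/table.png">
  <p>Pick one answer.</p>
</div>
"""


def extract_items(html):
    """Return (kind, value) pairs for text lines and images, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    # find_all with a list of tag names returns matches in document order.
    for tag in soup.find_all(["p", "img"]):
        if tag.name == "img":
            items.append(("image", tag.get("src", "")))
        else:
            items.append(("text", tag.get_text(strip=True)))
    return items


if __name__ == "__main__":
    for kind, value in extract_items(SAMPLE_HTML):
        print(kind, value)
```

Keeping text and images in one ordered list makes it straightforward to number them with a single running index later on.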

Feed the script with website data:

Save the website as an .html file and copy that file to:

/input/
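A minimal sketch of how the saved pages could be picked up, assuming the input folder is named `input` and sits next to the script. The helper name `read_saved_pages` is hypothetical and not taken from `webscraper.py`.

```python
from pathlib import Path


def read_saved_pages(input_dir="input"):
    """Return {file name: HTML text} for every .html file in input_dir."""
    pages = {}
    folder = Path(input_dir)
    if not folder.is_dir():
        return pages  # nothing has been copied into the folder yet
    for path in sorted(folder.glob("*.html")):
        # errors="replace" keeps going if a page was saved with an odd encoding.
        pages[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return pages


if __name__ == "__main__":
    for name, html in read_saved_pages().items():
        print(name, len(html), "characters")
```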

Run script using:

python webscraper.py

Script Output:

The script creates the following folders if they don't already exist:

/output/
/images/
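Creating the folders only when they are missing can be sketched as follows. The folder names come from the list above; the helper name `ensure_folders` and the `base_dir` parameter are assumptions for illustration.

```python
from pathlib import Path


def ensure_folders(base_dir=".", names=("output", "images")):
    """Create each folder under base_dir if it is missing and return the paths."""
    created = []
    for name in names:
        folder = Path(base_dir) / name
        # exist_ok=True makes this a no-op when the folder is already there.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in ensure_folders():
        print("ready:", folder)
```

Calling the function a second time is safe, so it can run at the start of every scrape.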

Output folder:

Contains a JSON file similar to the example below:

{
    "QuestionID": 0,
    "TxtLine_0": "You have a table in an Azure Synapse Analytics dedicated SQL pool. The table was created by using the following Transact-SQL statement.",
    "Image_1": "URL: `IMAGE URL`",
    "TxtLine_2": "You need to alter the table to meet the following requirements:",
    "TxtLine_3": "Ensure that users can identify the current manager of employees.",
    "TxtLine_4": "Support creating an employee reporting hierarchy for your entire company.",
    "TxtLine_5": "Provide fast lookup of the managers' attributes such as name and job title.",
    "TxtLine_6": "Which column should you add to the table?",
    "Variant_A": "[ManagerEmployeeID] [smallint] NULL",
    "Variant_B": "[ManagerEmployeeKey] [smallint] NULL",
    "Variant_C": "[ManagerEmployeeKey] [int] NULL",
    "Variant_D": "[ManagerName] [varchar](200) NULL",
    "Correct Answer:": "C",
    "Explanation:": "We need an extra column to identify the Manager. Use the data type as the EmployeeKey column, an int column.Reference: `WEBSITE LINK`"
}
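The key layout in the example above, where text lines and images share one running index, could be produced along these lines. This is a sketch under assumptions: the function names, the `(kind, value)` input format, and the `URL: ` prefix handling are illustrative guesses based on the sample, not the actual code in `webscraper.py`.

```python
import json
from pathlib import Path


def build_record(question_id, items, variants, answer, explanation):
    """Assemble one question dictionary with the key names shown in the example."""
    record = {"QuestionID": question_id}
    # Text lines and images share a single running index.
    for index, (kind, value) in enumerate(items):
        if kind == "image":
            record[f"Image_{index}"] = f"URL: {value}"
        else:
            record[f"TxtLine_{index}"] = value
    for letter, text in variants.items():
        record[f"Variant_{letter}"] = text
    record["Correct Answer:"] = answer
    record["Explanation:"] = explanation
    return record


def write_record(record, path):
    """Write the record as indented UTF-8 JSON."""
    text = json.dumps(record, indent=4, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")
```

A call such as `write_record(build_record(0, items, {"A": "...", "B": "..."}, "B", "..."), "output/question_0.json")` would then yield a file shaped like the sample; the file name here is also hypothetical.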

Images folder:

Contains the images saved from the website.
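One possible way to save an image, shown only as a sketch: derive a file name from the last segment of the image URL and stream the response into the images folder using the standard library. The helper names, the `image.bin` fallback, and the use of `urllib` are assumptions; the original script may download images differently.

```python
import shutil
from pathlib import Path
from posixpath import basename
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def image_file_name(url, fallback="image.bin"):
    """Derive a local file name from the last path segment of url."""
    name = basename(unquote(urlparse(url).path))
    return name or fallback


def save_image(url, images_dir="images"):
    """Download url into images_dir and return the path of the saved file."""
    folder = Path(images_dir)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / image_file_name(url)
    # Stream the response to disk instead of holding it all in memory.
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Two images with the same file name would overwrite each other in this sketch, so a real run may need to add the question ID or an index to the name.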

ursustefan11/py-web-scraper
