
A Google-like web search engine that returns the websites most relevant to the user's query, using crawled and indexed textual data and PageRank.


shaxxxob/mini_google

 
 


Mini Google

Course project for the Architecture of Computer Systems course.

Overview:
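
The project description names PageRank as the ranking signal. As an illustration only, here is a minimal sketch of PageRank by power iteration. It is not taken from this repository's code: the function name `pagerank`, the `links` dictionary format, the damping factor of 0.85 and the fixed iteration count are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Illustrative PageRank via power iteration (hypothetical helper).

    links: dict mapping a page to the list of pages it links to.
    Returns a dict mapping every page to its rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in pages}

        # Each page passes an equal share of its rank to its targets.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank

    return rank


if __name__ == "__main__":
    # Tiny made-up link graph, purely for demonstration.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

In a search engine of this kind, a score like this would typically be combined with a text-relevance score from the index; how this project combines them is not stated here.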

Architecture:

We are working on multiple components of the web crawler at the same time:

Each component is intended to run as a separate Docker container, so that we can run any number of instances of each and distribute them across different computers/servers.

Progress can be tracked over here.

Usage:

Launch each container independently using the instructions in its respective directory, or launch all of them together:

# Download the file with crawled websites, or crawl the websites on your own into
# the root of the project as out.txt: https://drive.google.com/file/d/1XsnWbmk4YzLmZqWjRaMXDzMC_-Rv0Zwm/view

docker-compose build

docker-compose up

Prerequisites:

Credits:

License:

MIT License

