MiniSearch

A mini search engine for keywords in files. Each sentence in the input file is treated as a document, and documents can be large. Each document is identified by a number, starting from 0.

Arguments:
./minisearch -i docfile -k K
./minisearch -i docfile (K defaults to 10)
./minisearch -k K -i docfile
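
The three invocations above amount to two flags that may appear in either order: `-i` (required) and `-k` (optional, defaulting to 10). A minimal Python sketch of that parsing follows. It is only an illustration of the command-line contract, not the program's own code, and `parse_args` is a hypothetical name.

```python
import argparse

def parse_args(argv):
    """Parse minisearch-style flags; -i and -k may appear in either order."""
    parser = argparse.ArgumentParser(prog="minisearch")
    # -i is mandatory: the file holding the documents.
    parser.add_argument("-i", dest="docfile", required=True, help="input document file")
    # -k is optional and falls back to 10 when omitted.
    parser.add_argument("-k", dest="k", type=int, default=10, help="K (defaults to 10)")
    return parser.parse_args(argv)

args = parse_args(["-k", "5", "-i", "docfile"])
```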

Commands:
/search q1, q2 ... q10 (search for 1 to 10 words)
/df (document frequency of all words)
/df q1 (document frequency of specific word)
/tf (term frequency)
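
For reference, the two statistics behind `/df` and `/tf` can be sketched in Python as below. Document frequency is the number of documents that contain a word, and term frequency is the number of times a word occurs in one document. The function names, the whitespace tokenization, and the `(doc_id, word)` arguments for term frequency are assumptions made for illustration; the arguments that `/tf` actually takes are not listed here.

```python
from collections import Counter

def document_frequency(docs):
    """Map each word to the number of documents that contain it."""
    df = Counter()
    for text in docs:
        # set() ensures a word is counted once per document.
        df.update(set(text.split()))
    return df

def term_frequency(docs, doc_id, word):
    """Count how many times `word` occurs in document `doc_id`."""
    return docs[doc_id].split().count(word)

# Hypothetical sample documents, numbered from 0.
docs = ["the quick brown fox", "the lazy dog", "the fox and the dog"]
df = document_frequency(docs)
```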

The score of each document is calculated using BM25.
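
As a rough illustration of BM25 scoring, here is a minimal Python sketch of the standard Okapi BM25 formula. The parameter values `k1 = 1.2` and `b = 0.75`, the non-negative IDF variant, the whitespace tokenization, and the name `bm25_score` are all assumptions; the values and variant used by this project are not stated here.

```python
import math

def bm25_score(query_words, doc_id, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of document `doc_id` for the given query words."""
    tokenized = [d.split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs  # average document length
    doc = tokenized[doc_id]
    score = 0.0
    for q in query_words:
        n_q = sum(1 for t in tokenized if q in t)  # document frequency of q
        if n_q == 0:
            continue  # a word that appears nowhere contributes nothing
        # Assumed IDF variant: the +1 inside the log keeps it non-negative.
        idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5) + 1.0)
        f = doc.count(q)  # term frequency of q in this document
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

# Hypothetical sample documents, numbered from 0.
docs = ["the quick brown fox", "the lazy dog", "the fox and the dog"]
```

With equal term frequency, the length normalization gives a shorter document a higher score than a longer one, which is why document 0 outranks document 2 for the query `fox` in this sample.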

The trie structure can be seen in the image below

[Image: trie structure]
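
To make the idea of a trie-backed index concrete, here is a small Python sketch. It assumes that the node ending each word stores a posting list mapping document numbers to term frequencies. That layout and the names `TrieNode`, `Trie`, `insert`, and `lookup` are hypothetical; the actual structure is the one shown in the image.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> child TrieNode
        self.postings = {}  # doc_id -> term frequency, filled only at word ends

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, doc_id):
        """Record one occurrence of `word` in document `doc_id`."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.postings[doc_id] = node.postings.get(doc_id, 0) + 1

    def lookup(self, word):
        """Return the posting list for `word`, or an empty dict if absent."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return {}
        return node.postings

# Hypothetical sample documents, numbered from 0.
docs = ["the quick brown fox", "the lazy dog", "the fox and the dog"]
trie = Trie()
for doc_id, text in enumerate(docs):
    for word in text.split():
        trie.insert(word, doc_id)
```

Under this layout, the document frequency of a word is the length of its posting list, and the term frequency in a given document is the value stored under that document's number.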