joaaobr/searcher


What is Searcher?

Searcher is a study project that uses the Vector Space Model (VSM) to search a set of documents and report how similar each document is to the query.
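To make the VSM idea concrete, here is a minimal sketch of cosine similarity between two texts. It is not Searcher's code: the object `CosineSketch` and both function names are hypothetical, and it assumes raw term counts as weights, so its scores need not match the values Searcher prints.

```scala
object CosineSketch {
  // Assumption: weights are raw term counts; Searcher may weight terms differently.
  def termFrequencies(text: String): Map[String, Double] =
    text.toLowerCase
      .split("\\s+")
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (term, occurrences) => term -> occurrences.length.toDouble }

  // Cosine of the angle between two sparse term vectors.
  def cosine(a: Map[String, Double], b: Map[String, Double]): Double = {
    val dot   = a.iterator.map { case (term, w) => w * b.getOrElse(term, 0.0) }.sum
    val normA = math.sqrt(a.values.map(w => w * w).sum)
    val normB = math.sqrt(b.values.map(w => w * w).sum)
    // An empty vector has no direction, so treat its similarity as 0.
    if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
  }
}
```

Under these assumptions, `CosineSketch.cosine(CosineSketch.termFrequencies("Hello World"), CosineSketch.termFrequencies("Hello Guys"))` comes out at approximately 0.5, because the two texts share one of their two terms.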

Example:

The document table is the module that stores documents and transforms them into lists of sentences.

  val documentTable = new DocumentTable()

Then we insert the documents into the table:

  documentTable.pushText("Hello World")
  documentTable.pushText("Hello Guys")
  documentTable.pushText("Hello Main")
  documentTable.pushText("Hello Man Guys World")
  documentTable.pushText("Hello hello")
  documentTable.pushText("Hello guys guys hello hello guys hello")

Next, we set the search query:

  documentTable.pushQuery("Hello Guys")

Execute the query:

  documentTable.result().show()

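Result (each Id appears to correspond to a document's insertion order, starting at 1; the value slightly above 1 for document 2 is floating-point rounding):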

  Id  Similarity
  5   0.8164965809277259
  1   0.6666666666666667
  6   0.8082903768654761
  2   1.0000000000000002
  3   0.6666666666666667
  4   0.8660254037844387

How does Searcher work behind the scenes?

In the first step, we normalize the input by removing accents and punctuation and converting all letters to lowercase. Immediately after that, we index the sentences in an inverted index.
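The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Searcher's implementation: `InvertedIndexSketch`, `normalize`, and `buildIndex` are hypothetical names, tokens are split on whitespace, and document ids are assumed to start at 1 in insertion order.

```scala
import java.text.Normalizer

object InvertedIndexSketch {
  // Hypothetical helper: strip accents and punctuation, lowercase, split into terms.
  def normalize(text: String): Seq[String] =
    Normalizer
      .normalize(text, Normalizer.Form.NFD)   // decompose accented letters
      .replaceAll("\\p{M}", "")               // drop the combining accent marks
      .replaceAll("[^\\p{L}\\p{Nd}\\s]", " ") // replace punctuation with spaces
      .toLowerCase
      .split("\\s+")
      .filter(_.nonEmpty)
      .toSeq

  // Inverted index: term -> (document id -> number of occurrences in that document).
  def buildIndex(docs: Seq[String]): Map[String, Map[Int, Int]] = {
    val postings = for {
      (text, i) <- docs.zipWithIndex
      term      <- normalize(text)
    } yield (term, i + 1) // assumed 1-based ids

    postings
      .groupBy(_._1)
      .map { case (term, ps) =>
        term -> ps.groupBy(_._2).map { case (id, hits) => id -> hits.size }
      }
  }
}
```

With this layout, looking up a query term returns only the documents that contain it, so a search never has to scan documents that share no terms with the query.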
