Add to onboarding reproduction logs #2492
docs/experiments-msmarco-passage.md
Outdated
@@ -322,8 +316,8 @@ It turns out that optimizing for MRR@10 and MAP yields the same settings.

Here's the comparison between the Anserini default and optimized parameters:

| Setting | MRR@10 | MAP | Recall@1000 |
|:------------------------------------------------|-------:|-------:|------------:|
| Setting | MRR@10 | MAP | Recall@1000 |
I think you introduced inconsistencies here? Old table seems fine to me?
Oops, I clicked on the file and my editor did some formatting automatically. I will fix that.
docs/start-here.md
Outdated
@@ -20,8 +20,8 @@ What's the problem we're trying to solve?

This is the definition I typically give:

> Given an information need expressed as a query _q_, the text retrieval task is to return a ranked list of _k_ texts {_d<sub>1</sub>_, _d<sub>2</sub>_ ... _d<sub>k</sub>_} from an arbitrarily large but finite collection
of texts _C_ = {_d<sub>i</sub>_} that maximizes a metric of interest, for example, nDCG, AP, etc.
> Given an information need expressed as a query _q_, the text retrieval task is to return a ranked list of _k_ texts {_d`<sub>`1`</sub>`_, _d`<sub>`2`</sub>`_ ... _d`<sub>`k`</sub>`_} from an arbitrarily large but finite collection
nah, I think I like original better...
Yeah same here, I didn't mean to change that. My editor did that automatically for some reason. Will fix.
thanks for the de-linting. left comments.
docs/experiments-msmarco-passage.md
Outdated
@@ -89,7 +86,7 @@ On the other hand, retrieval needs to be fast, i.e., low latency, high throughpu

With the data prep above, we can now index the MS MARCO passage collection in `collections/msmarco-passage/collection_jsonl`.

If you haven't built Anserini already, build it now using the instructions in [anserini#-getting-started](https://github.com/castorini/anserini#-getting-started).
If you haven't built Anserini already, build it now using the instructions in [anserini#-try-it](https://github.com/castorini/anserini?tab=readme-ov-file#-try-it).
I think "Installation" is the better link?
@lintool I have made the changes accordingly. Please let me know if anything I didn't expect breaks lol
System setup:
OS: macOS Sonoma 14.4.1
Memory: 16GB
Chip: Apple M1 Pro
Python Version: 3.12.3
Java Version: 21.0.3
Maven: 3.9.6
Suggestion: I think linking try-it instead of getting-started is more suitable, as I don't see a getting-started section in the README.