User Stories #7

Open
ngawangtrinley opened this issue Jul 20, 2020 · 0 comments

User Stories

| # | As a... | I want to... | so that... |
|---|---------|--------------|------------|
| 1 | researcher | segment a collection of Tibetan texts | I can do statistics in AntConc |
| 2 | Tibetan text proofreader | mark potential errors | I can catch and correct more mistakes |
| 3 | corpus researcher for Amdo dialects | create several custom profiles | I can do statistics on different spoken dialects |
| 4 | corpus researcher on literary Tibetan | create a custom profile for the Kangyur | I can do accurate statistics on the Kangyur and Tengyur |
| 5 | | | |

Rule-based segmentation steps (for stories 3 & 4)

  1. Segment a volume with the default profile
  2. Create a word list from the volume, ordered by frequency
  3. Manually clean up the word list
  4. Use the word list as the main list
  5. Segment the volume again
  6. Edit the custom profile (word / remove / adjustments) until the segmentation is good
  7. Merge custom profile with main profile
  8. Repeat with a second volume
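
Steps 2 and 7 above could be sketched roughly as follows. This is only an illustration, not the project's implementation: it assumes the segmenter output is plain text with tokens separated by whitespace, and the helper names `frequency_wordlist` and `merge_wordlists` are hypothetical.

```python
from collections import Counter


def frequency_wordlist(segmented_text):
    """Return (word, count) pairs ordered by descending frequency.

    Assumption: tokens in the segmented volume are separated by whitespace.
    """
    counts = Counter(segmented_text.split())
    # Sort by count (descending), then by the word itself for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def merge_wordlists(main_words, custom_words, removed_words=()):
    """Merge a custom word list into the main one, dropping removed entries.

    Hypothetical helper: a real profile may store more than bare words.
    """
    merged = (set(main_words) | set(custom_words)) - set(removed_words)
    return sorted(merged)


# Tiny made-up sample standing in for a segmented volume.
sample = "བཀྲ་ཤིས་ བདེ་ལེགས་ བཀྲ་ཤིས་ ཞུ་"
wordlist = frequency_wordlist(sample)
merged = merge_wordlists(["བཀྲ་ཤིས་"], ["ཞུ་", "བདེ་ལེགས་"], removed_words=["ཞུ་"])
```

The frequency-ordered list is what would be cleaned up by hand in step 3; the merge helper only shows the set logic of step 7 on bare word lists.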

Steps for story # & #
