Playing around with ways of emulating different Chinese sentence segmentation techniques in the browser
Everyone and their mom has done some sort of Chinese NLP thing. How is this different?
It's not, really. But not everyone has a server in X or wants to use language Y (I'm looking at you, Java). How can I create the most efficient single JS file that can compete with some of the large, expansive ("OMFG, I have 98,000 bigrams in my database") tools out there? What is a reasonable list of words that I need? These kinds of questions.
- Forward Maximum Matching
- Backward Maximum Matching
- Conditional Random Fields
- Finding a better bigram list...
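To make the first two items concrete, here is a rough sketch of forward and backward maximum matching in plain JS. It is not the code from this repo: `DICTIONARY`, `MAX_WORD_LENGTH`, `forwardMaxMatch`, and `backwardMaxMatch` are made-up names, and the six-entry dictionary is a toy stand-in for a real word or bigram list.

```javascript
// Hypothetical toy dictionary; a real one would be loaded from a word or bigram list.
const DICTIONARY = new Set(["研究", "研究生", "生命", "命", "的", "起源"]);
const MAX_WORD_LENGTH = 3; // length of the longest dictionary entry, in characters

// Forward maximum matching: scan left to right and, at each position,
// take the longest dictionary word that starts there.
function forwardMaxMatch(text, dict = DICTIONARY, maxLen = MAX_WORD_LENGTH) {
  const chars = Array.from(text); // split by code point, not UTF-16 unit
  const words = [];
  let start = 0;
  while (start < chars.length) {
    let end = start + 1; // fall back to a single character if nothing matches
    for (let len = Math.min(maxLen, chars.length - start); len > 1; len--) {
      if (dict.has(chars.slice(start, start + len).join(""))) {
        end = start + len;
        break;
      }
    }
    words.push(chars.slice(start, end).join(""));
    start = end;
  }
  return words;
}

// Backward maximum matching: same idea, but scan right to left and
// take the longest dictionary word that ends at the current position.
function backwardMaxMatch(text, dict = DICTIONARY, maxLen = MAX_WORD_LENGTH) {
  const chars = Array.from(text);
  const words = [];
  let end = chars.length;
  while (end > 0) {
    let start = end - 1; // fall back to a single character if nothing matches
    for (let len = Math.min(maxLen, end); len > 1; len--) {
      if (dict.has(chars.slice(end - len, end).join(""))) {
        start = end - len;
        break;
      }
    }
    words.unshift(chars.slice(start, end).join(""));
    end = start;
  }
  return words;
}

console.log(forwardMaxMatch("研究生命的起源"));  // [ '研究生', '命', '的', '起源' ]
console.log(backwardMaxMatch("研究生命的起源")); // [ '研究', '生命', '的', '起源' ]
```

The two directions disagree on this sentence: the forward pass greedily grabs 研究生 and leaves 命 stranded, while the backward pass gets 研究 / 生命. That disagreement is the usual reason to run both and compare.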
Bigrams up till HSK Level 6
Introduction to Chinese NLP by my home-peeps (I wish), Kam-Fai Wong, Wenjie Li, Ruifeng Xu, and Zheng-sheng Zhang