Popular repositories
-
DouYin (Public, forked from Python3WebSpider/DouYin)
API of DouYin for Humans, used to crawl popular videos and music
Python
-
innovators-patent-agreement (Public, forked from twitter/innovators-patent-agreement)
Innovators Patent Agreement (IPA)
-
CNKI_Patent_SVM (Public, forked from speciallurain/CNKI_Patent_SVM)
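Text classification is the process of automatically determining a text's category from its content under a given classification scheme. First, using a Scrapy crawler that follows the URL patterns of CNKI (China National Knowledge Infrastructure), we crawled more than 700,000 invention patents published in 2014, then cleaned the data and selected more than 600,000 labeled records. We applied TF-IDF to these 600,000-plus texts to extract term frequencies, ranked the terms by frequency, and took the top 3,000 words to form a semantic dictionary, then set stop words based on observation. Next, we again used TF-IDF to select terms for each abstract and, through a Boolean model, …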
Python
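The dictionary-building and Boolean-model steps described for CNKI_Patent_SVM can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the repository's code: the function names, the toy pre-tokenized documents, and the stop-word list are hypothetical, and raw term counts stand in for the TF-IDF scoring the description mentions. Real Chinese abstracts would also need a word segmenter first, which is assumed to have run already.

```python
from collections import Counter


def build_dictionary(docs, stop_words, top_n=3000):
    """Return the top_n most frequent non-stop-word terms across all docs.

    Hypothetical helper: ranks by raw count, not by TF-IDF weight.
    """
    counts = Counter(
        term for doc in docs for term in doc if term not in stop_words
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def boolean_vector(doc, dictionary):
    """Boolean model: 1 if the dictionary term occurs in the doc, else 0."""
    present = set(doc)
    return [1 if term in present else 0 for term in dictionary]


# Toy, already-tokenized "abstracts" (made-up data for illustration only).
docs = [
    ["battery", "electrode", "the", "lithium"],
    ["engine", "the", "piston", "battery"],
]
stop_words = {"the"}

dictionary = build_dictionary(docs, stop_words, top_n=3)
vectors = [boolean_vector(doc, dictionary) for doc in docs]
```

Each resulting vector has one 0/1 entry per dictionary term, which is the kind of fixed-length feature vector a classifier such as an SVM (suggested by the repository name) could take as input.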