taiyangdixia/data

 
 

Data Collection

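  • sohu-dataset: 1,000 web pages crawled from the Sohu website, each with its title, keywords, formatted HTML body, and unformatted plain-text content, stored in XML format. Suitable for testing keyword extraction.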
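
As a rough sketch of how such a dataset might be used for keyword extraction testing, the snippet below parses XML records and scores a list of predicted keywords against the stored ones. The element names (`doc`, `title`, `keywords`, `text`), the comma-separated keyword field, and the helper names `load_records` and `precision_recall` are assumptions made for illustration; the actual sohu-dataset schema is not described here and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record: the real element names in sohu-dataset may differ.
SAMPLE = """<docs>
  <doc>
    <title>Example title</title>
    <keywords>economy,market</keywords>
    <text>The market reacted as the economy slowed.</text>
  </doc>
</docs>"""


def load_records(xml_text):
    """Parse XML text into a list of dicts with title, keywords, and text."""
    root = ET.fromstring(xml_text)
    records = []
    for doc in root.iter("doc"):
        raw_keywords = doc.findtext("keywords") or ""
        # Assumed format: keywords separated by commas.
        keywords = [k.strip() for k in raw_keywords.split(",") if k.strip()]
        records.append({
            "title": doc.findtext("title") or "",
            "keywords": keywords,
            "text": doc.findtext("text") or "",
        })
    return records


def precision_recall(predicted, gold):
    """Compare predicted keywords with the gold keywords by exact match."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


records = load_records(SAMPLE)
# "market" matches a gold keyword, "bank" does not.
p, r = precision_recall(["market", "bank"], records[0]["keywords"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Exact-match scoring is the simplest choice; a real evaluation might normalize or segment keywords before comparing them.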

About

Experimental Data
