Block or Report
Block or report jiangnanboy
Contact GitHub support about this user’s behavior. Learn more about reporting abuse.
Report abuseStars
Language: Python
Sort by: Most stars
Text Content Grapher based on keyinfo extraction by NLP method。输入一篇文档,将文档进行关键信息提取,进行结构化,并最终组织成图谱组织形式,形成对文章语义信息的图谱化展示。
利用lightgbm做(learning to rank)排序学习,包括数据处理、模型训练、模型决策可视化、模型可解释性以及预测等。Use LightGBM to learn ranking, including data processing, model training, model decision visualization, model interpretability and …
albert + lstm + crf实体识别,pytorch实现。识别的主要实体是人名、地名、机构名和时间。albert + lstm + crf (named entity recognition)
中文版面检测(Chinese layout detection),yolov8 is used to detect the layout of Chinese document images。
个性化推荐模型,主要包括als、als_wr、biaslfm、lfm、nmf、svdpp、基于内容、基于内容回归、user-cf、item-cf、slopeone、关联规则以及基于内容和cf的混合等模型。
利用sklearn和gensim中的tfidf,lsa,doc2vec进行查询与文档匹配搜索
albert-fc for RE(Relation Extraction),中文关系抽取
利用Swin-Unet(Swin Transformer Unet)实现对文档图片里表格结构的识别,Swin-unet (Swin Transformer Unet) is used to identify the document table structure
Chinese Information Extraction Toolkit。中文信息抽取工具。利用CNN各种变体进行实体抽取。
利用llm大语言模型提取卡证票据关键信息。Key Information Extraction from Image with LLM(large language model).Basically, it can extract key information from all bills and documents.
计算词间的相关性,并进行图谱化展示。calculate the relevance between words