leeguandong

KUN leeguandong

RS Image Processing | Deep Learning | CV | OCR | AIGC | LLM | LMM

220 followers · 182 following

Suning
Nanjing, China
https://liguandong.blog.csdn.net/
https://leeguandong.github.io/
https://www.zhihu.com/people/li-xin-52-81


OCRDetInternVL2 Public

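OCR Large Multimodal Model: a multimodal large model for OCR text detection, fine-tuned from InternVL2 (the internvl2-8b model) on 4 A800 GPUs. It works not only for OCR text detection but also for most object detection tasks.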

Python 3 Updated Oct 10, 2024
XrayLLama3.2Vision Public

Xray Large Multimodal Model: a multimodal large model for X-ray images, fine-tuned from llama3.2-vision (the llama3_2-11b-vision-instruct model) on 4 A800 GPUs.

Python 1 Updated Oct 7, 2024
OCRInternVL2 Public

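OCR Large Multimodal Model: a multimodal large model for OCR, fine-tuned from InternVL2 (the internvl2-8b model) on 4 A800 GPUs. internvl2-8b already performed very well in our own tests on OCR VQA scenarios; after further fine-tuning on OCR data, it achieves good results on general OCR VQA tasks.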

Python 1 Updated Oct 7, 2024
MaskControlnet Public

A ControlNet generation model conditioned on masks, trained on a massive set of e-commerce matting data (salient object detection data).

Python 1 Updated Oct 6, 2024
OCRDetPaliGemma Public

A multimodal large language model based on PaliGemma, focused on OCR text detection and conventional object detection.

Python 1 Updated Oct 5, 2024
EcommerceLLMQwen2.5 Public

An e-commerce large language model from the Qwen2.5 series, obtained by SFT on e-commerce data. It is an upgraded version of https://github.com/leeguandong/EcommerceLLM. Qwen2.5 performs very well.

Python 3 Updated Oct 4, 2024
leeguandong Public

Updated Sep 25, 2024
ComfyUI_AliControlnetInpainting Public

An inpainting method from Alimama for the e-commerce domain.

Python 3 Updated Sep 25, 2024
leeguandong.github.io Public
Forked from jindongwang/jindongwang.github.io

Personal website

JavaScript MIT License Updated Sep 25, 2024
XrayQwen2VL Public

Xray Large Multimodal Model: a large multimodal model fine-tuned from Qwen2VL for X-ray analysis, trained on 4 A800 GPUs based on the qwen2-vl-7b-instruct model.

Python 1 Updated Sep 25, 2024
EcommerceOCRBench Public

An OCR benchmark for multimodal large models on e-commerce text recognition, modeled on OCRBench but with more evaluation data.

Python 3 Updated Sep 25, 2024
ComfyUI_CompareModelWeights Public

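Compares the deviation between the weights of Stable Diffusion models that share the same architecture, mainly to visualize the differences between weights in model merging.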

Python 3 1 Updated Sep 22, 2024
ComfyUI_Diffusers Public

Model and parameter loading for diffusers, plus shared data-processing operations; will be updated continuously.

Python 3 Updated Sep 19, 2024
ComfyUI_MasaCtrl Public

Keeps the image subject fixed across multiple inference runs for consistency control; works at the QKV level.

Python 4 2 Updated Sep 1, 2024
ComfyUI_VisualAttentionMap Public

Visualizes the feature maps for the text prompt, self-attention, and cross-attention in SD.

Python 4 1 Updated Aug 26, 2024
ComfyUI_SelfGuidance Public

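Helps lock specific objects in the prompt so that they are not changed during a second edit, by applying loss guidance to the cross-attention maps of the two inference runs.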

Python 2 Updated Aug 26, 2024
ComfyUI_CrossImageAttention Public

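CrossImageAttention is a zero-shot method that, given a specified appearance image and structure image, generates an image with consistent structure and appearance; works at the QKV level.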

Python 4 1 Updated Aug 16, 2024
ComfyUI_Style_Aligned Public

style_aligned: a reference-based method that shares QKV to produce similar, style-consistent images zero-shot.

Python 3 1 Updated Aug 16, 2024
ComfyUI_M3Net Public

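M3Net plugin for ComfyUI. M3Net is a good salient object detection model that works well for matting; I have open-sourced a model trained on e-commerce data for everyone to try.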

Python 9 2 Updated Aug 16, 2024
ComfyUI_VideoEditing Public

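Video generation: ControlNet + SD for consistency control over an input video, with the QKV of the UNet self-attention referencing the first frame and the previous frame.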

Python 2 2 Updated Aug 14, 2024
ComfyUI_InternVL2 Public

InternVL2 plugin for ComfyUI. InternVL2 is currently a strong open-source multimodal large language model that performs very well on document VQA.

Python 11 1 Updated Aug 10, 2024
ComfyUI_LLaSM Public

A speech-text multimodal large model; the speech side is based on Whisper and the text side on LLaMA. General-purpose performance is good.

Python 3 1 Updated Aug 10, 2024
sd_webui_ZeST Public

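ZeST is a zero-shot material transfer model. It is essentially a combination of IP-Adapter + ControlNet + inpainting, with some changes to the image fed into the inpaint img2img step, including changes to the image + mask.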

Python 1 MIT License Updated Aug 7, 2024
sd_webui_instantid Public

InstantID plugin for Stable Diffusion WebUI. InstantID is a good choice for style transfer and face swapping, and it preserves facial ID information well.

Python 1 Updated Aug 7, 2024
EcommerceSD Public

Stable Diffusion models for e-commerce scenarios, including an e-commerce base model, LoRA components, ControlNet, and a series of related applications.

Python 1 Updated Jul 24, 2024
Awesome-Chinese-Stable-Diffusion Public

A collection of Chinese text-to-image Stable Diffusion models.

239 14 Updated Jul 8, 2024
MiniLLaMA3 Public

A mini version of LLaMA3, covering the full pipeline from scratch: data construction, tokenizer training, PT, SFT, and DPO.

Python 1 1 Updated Jun 26, 2024
sd_webui_prompt_translator_architecture Public

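Better offline translation with mBART-50, which outperforms MarianMT; supports preset translation terms and ships with a large built-in vocabulary of architecture terms.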

Python 1 Updated May 23, 2024
XrayLLaVA Public

A multimodal large model for X-ray recognition, fine-tuned from LLaVA1.6.

Python 5 Updated May 19, 2024
EcommerceLLM Public

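E-commerce large language models from the Qwen1.5 series fine-tuned on e-commerce data, including 0.5b-base, 0.5b-chat, 1.8b-base, and 7b-base, plus an SFT e-commerce model built on a llama3-chinese-sft base model.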

Python 9 Updated May 14, 2024
