
Online quantization training gets acc 0, with warnings that model parameters do not exist #5801

Closed
marsbzp opened this issue Mar 28, 2022 · 29 comments


marsbzp commented Mar 28, 2022

[screenshots]

@littletomatodonkey (Collaborator)

Hi, could you please provide the specific training script or reproduction steps?


marsbzp commented Mar 28, 2022

You can reproduce it by loading the pretrained model with the configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_enhanced_ctc_loss.yml config.

@littletomatodonkey (Collaborator)

Hi, thanks for the report. Please try this PR: #5806


marsbzp commented Mar 28, 2022

Don't you test before releasing? Has the accuracy of this online quantization ever been verified?


marsbzp commented Mar 28, 2022

Does it support multi-GPU training?


marsbzp commented Mar 28, 2022

The accuracy is back now, and multi-GPU training works.

@littletomatodonkey (Collaborator)

> Don't you test before releasing? Has the accuracy of this online quantization ever been verified?

The released quantized model was produced with this very quantization script, and the earlier logic was fine. PACT's internal logic was changed later, which broke loading by structured parameter names.


marsbzp commented Mar 29, 2022

[screenshot]

Looking at the earlier code: QAT first modifies the model structure (inserting quantization nodes, fusing BN, etc.) and only then loads the original model parameters — isn't that what causes the "parameter does not exist" errors? Loading a checkpoint produced by quantization training should be fine, so this shouldn't have much to do with PACT's internal logic.


marsbzp commented Mar 31, 2022

The Paddle-TensorRT inference results of the exported quantized model are incorrect.

@littletomatodonkey (Collaborator)

[screenshot]

> Looking at the earlier code: QAT first modifies the model structure (inserting quantization nodes, fusing BN, etc.) and only then loads the original model parameters — isn't that what causes the "parameter does not exist" errors?

The structured names used to be identical before and after node insertion, so loading worked.
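The mismatch being discussed can be illustrated with a minimal, hypothetical Python sketch (not PaddleOCR's actual loader, and the key names are modeled on the warning log later in this thread): a loader copies only the checkpoint keys that exist in the model and warns about the rest, so once QAT wrapping inserts extra sub-module levels into the structured names, nothing matches and training starts from random weights.

```python
# Hypothetical sketch of why structured-name loading breaks after QAT
# wraps layers: the wrapper inserts extra sub-module levels, so checkpoint
# keys like "conv1.weight" no longer exist in the wrapped model.

def load_pretrained(model_keys, ckpt):
    """Copy matching keys into the model, warn on the rest (ppocr-style)."""
    loaded, missing = {}, []
    for k, v in ckpt.items():
        if k in model_keys:
            loaded[k] = v
        else:
            missing.append(k)
            print(f"WARNING: The pretrained params {k} not in model")
    return loaded, missing

# Plain FP32 checkpoint: structured names before quantization.
fp32_ckpt = {"conv1.weight": 1, "conv1.bias": 2}

# After QAT inserts fake-quant nodes, the same conv is wrapped and its
# parameters are renamed (illustrative names only).
qat_model_keys = {
    "conv1._conv._layer.weight",
    "conv1._conv._layer.bias",
    "conv1._conv._layer._fake_quant_weight._scale",
}

loaded, missing = load_pretrained(qat_model_keys, fp32_ckpt)
print(loaded)   # {} -- nothing matched, so the model keeps random init
print(missing)  # every FP32 key is reported "not in model" -> acc stays 0
```

This is why the fix direction matters: either the quantizer must preserve the original structured names, or the pretrained weights must be loaded before the quantization nodes are inserted.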

@littletomatodonkey (Collaborator)

> The Paddle-TensorRT inference results of the exported quantized model are incorrect.

Are the native (non-TRT) inference results correct? Let's first confirm whether this is a TensorRT issue.


marsbzp commented Mar 31, 2022

You can reproduce it with this model: https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_rec_slim_infer.tar — I tried both the detection model and the OCR recognition model, and neither can run inference in int8 mode. Paddle tensorrt_fp32 doesn't work either; only plain Paddle does. Please take a look.

[screenshot]


marsbzp commented Mar 31, 2022

> The Paddle-TensorRT inference results of the exported quantized model are incorrect.
>
> Are the native (non-TRT) inference results correct? Let's first confirm whether this is a TensorRT issue.

Native inference without TRT works, but the whole point of quantization is to use TRT for acceleration.

@littletomatodonkey (Collaborator)

OK, we'll follow up on this.


marsbzp commented Mar 31, 2022

I've filed several issues and you're the only one who replied. Given that offline quantization works, does a model from online quantization need a further conversion step before TRT inference?

[screenshot]

@littletomatodonkey (Collaborator)

> The Paddle-TensorRT inference results of the exported quantized model are incorrect.
>
> Are the native (non-TRT) inference results correct? Let's first confirm whether this is a TensorRT issue.
>
> Native inference without TRT works, but the whole point of quantization is to use TRT for acceleration.

The earlier quantized models were mainly intended for deployment with PaddleLite. On the server side we generally use the fp32 models directly, so this case indeed hadn't been tested before.

@littletomatodonkey (Collaborator)

> https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_rec_slim_infer.tar

Could you provide both the native inference results and the TRT inference results?


marsbzp commented Mar 31, 2022

Just run the cpp demo with that model and you'll see the problem.


marsbzp commented Mar 31, 2022

I recently ran an online-quantized object detection model; I can show you the results.

paddle:
[screenshot]

trt_fp32: no results
[screenshot]

trt_int8: crashes outright
[screenshot]

@littletomatodonkey (Collaborator)

OK, I don't have a TRT environment at hand; I'll report this issue internally first.


marsbzp commented Apr 1, 2022

Found the cause: when exporting the QAT model, the softmax is dropped, which makes the results incorrect. Please help figure out what causes it.

[screenshot]

@littletomatodonkey (Collaborator)

Hi, I tried the ch_ppocr_mobile_v2.0_rec_slim_infer model above with TensorRT 7 and it works fine for me.

Specifically,

the trt+fp32 prediction script is:

d="ch_ppocr_mobile_v2.0_rec_slim_infer"
./build/ppocr rec \
    --rec_model_dir=${d} \
    --image_dir=../../doc/imgs_words/ch/word_1.jpg \
    --use_tensorrt="1" \
    --use_gpu="1" \
    --rec_batch_num="1" \
    --precision="fp32"

The trt+fp32 prediction result is:

韩国小馆	score: 0.990642

The trt+int8 prediction script is:

d="ch_ppocr_mobile_v2.0_rec_slim_infer"
./build/ppocr rec \
    --rec_model_dir=${d} \
    --image_dir=../../doc/imgs_words/ch/word_1.jpg \
    --use_tensorrt="1" \
    --use_gpu="1" \
    --rec_batch_num="1" \
    --precision="int8"

The trt+int8 prediction result is:

韩国小馆	score: 0.990642

Could you scan the QR code on the project homepage to join the WeChat group, so we can look at your model individually?


marsbzp commented Apr 1, 2022

Thanks for the reply. The official model also works for me with TensorRT 7, but my own exported model is missing the softmax op. Could you export a recognition model with the current code and check?

[screenshot]


marsbzp commented Apr 1, 2022

Found the cause: when exporting the quantized model, execution doesn't enter the if branch below, so the softmax is missing. After I commented out the if, the exported model produces results. Please track it down.

[screenshot]
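Why the missing softmax matters can be shown with a minimal pure-Python sketch (not PaddleOCR code; the logit values are made up): applying softmax to one timestep of CTC head output does not change the argmax character, but it is what turns raw logits into the probabilities that the decoder reports as confidence scores. If export drops the softmax, downstream post-processing that expects a probability distribution misbehaves.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one timestep of CTC logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical raw logits for one timestep (blank, 'a', 'b').
logits = [1.0, 4.0, 2.0]
probs = softmax(logits)

# The argmax is the same either way...
assert max(range(3), key=lambda i: logits[i]) == max(range(3), key=lambda i: probs[i])

# ...but only the softmax output sums to 1 and can serve as a confidence
# score; a raw logit is unbounded and breaks score thresholding.
print(sum(probs))  # 1.0 (up to float rounding)
print(probs[1])    # the usable confidence for the top class
```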

@littletomatodonkey (Collaborator)

You probably modified the code; running it directly on my side, the softmax is there.

[screenshot]


LDOUBLEV (Collaborator) commented Apr 8, 2022

> When exporting the quantized model, execution doesn't enter the if branch below, so the softmax is missing. After I commented out the if, the exported model produces results. Please track it down.

Fixed in: #5903

@yangy996

Won't this fix be merged into the release/v2.4 branch?

@justcodew

With the latest code, quantization-aware training of an SVTR recognition model runs into the same problem.

Eval script:

python tools/eval.py -c output/plate_rec_quant/config.yml -o Global.pretrained_model=output/plate_rec_quant/best_accuracy.pdparams Eval.dataset.data_dir=./plate_rec_data/convet_img_data Eval.dataset.label_file_list=[./plate_rec_data/convet_img_data/ppocr_test_list.txt] Eval.loader.num_workers=2

Error messages:

W0420 20:26:29.985214 213450 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.2, Runtime API Version: 10.2
W0420 20:26:29.990868 213450 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer.weight not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer._fake_quant_weight._scale not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer._fake_quant_input._scale not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer._fake_quant_input._state not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer._fake_quant_input._accum not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._layer._act_preprocess.alpha not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._ma_output_scale._scale not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._ma_output_scale._state not in model
[2023/04/20 20:26:36] ppocr WARNING: The pretrained params backbone.conv1._conv._ma_output_scale._accum not in model


[2023/04/20 20:27:05] ppocr INFO: metric eval ***************
[2023/04/20 20:27:05] ppocr INFO: acc:0.0
[2023/04/20 20:27:05] ppocr INFO: norm_edit_dis:0.000370501947975721
[2023/04/20 20:27:05] ppocr INFO: fps:861.3670144457965
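A way to diagnose warnings like the ones above is to map the QAT-wrapped checkpoint keys back to plain structured names and see which FP32 parameters they correspond to. The sketch below is purely hypothetical (the `._layer` infix is inferred from the warning log, not from any official API), and it is a diagnostic aid only; the real remedy is to apply the same quantizer to the model before loading the quantized checkpoint.

```python
# Hypothetical sketch: strip the sub-module level that QAT wrapping inserts
# (seen as "._layer" in the warning log) to recover the plain structured
# name of each real weight. Fake-quant scale/state/accum keys have no FP32
# counterpart and will still look foreign after stripping.

QAT_INFIXES = ("._layer",)  # wrapper level inserted around each conv

def strip_qat_infixes(key):
    """Remove known QAT wrapper infixes from a checkpoint key."""
    for infix in QAT_INFIXES:
        key = key.replace(infix, "")
    return key

ckpt_keys = [
    "backbone.conv1._conv._layer.weight",
    "backbone.conv1._conv._layer._fake_quant_weight._scale",
]

for k in ckpt_keys:
    print(f"{k} -> {strip_qat_infixes(k)}")
```

If the stripped names match the FP32 model's parameters, the checkpoint is a quantized one being loaded into an unquantized model — i.e., the eval script needs the quantization wrapper applied first.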

@justcodew

Exporting the quantized model raises no errors, and the eval accuracy is normal:
python deploy/slim/quantization/export_model.py -c output/plate_rec_quant/config.yml -o Global.checkpoints=./output/plate_rec_quant/best_accuracy Global.save_model_dir=./output/plate_rec_quant/infer Global.save_inference_dir=./output/plate_rec_quant/infer

[2023/04/20 20:40:16] ppocr INFO: metric eval ***************
[2023/04/20 20:40:16] ppocr INFO: acc:0.9594412819871275
[2023/04/20 20:40:16] ppocr INFO: norm_edit_dis:0.9928645333013004
[2023/04/20 20:40:16] ppocr INFO: fps:611.2247434852117
[2023/04/20 20:40:21] ppocr INFO: inference model is saved to ./output/plate_rec_quant/infer/inference
