Error when running my own dataset with the official BiseNetV1 config file #3401

Closed
3 tasks done
loxoo6 opened this issue Jul 24, 2023 · 2 comments
Labels: bug (Something isn't working)


loxoo6 commented Jul 24, 2023

Search before asking

Describe the Bug

2023-07-24 09:40:35 [INFO]
------------Environment Information-------------
platform: Linux-4.15.0-140-generic-x86_64-with-debian-stretch-sid
Python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: Tesla V100-SXM2-32GB']
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
PaddleSeg: 2.7.0
PaddlePaddle: 2.3.2
OpenCV: 4.1.1

2023-07-24 09:40:35 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 160000
loss:
  coef:
  - 1
  - 1
  - 1
  types:
  - ignore_index: 255
    type: OhemCrossEntropyLoss
  - ignore_index: 255
    type: OhemCrossEntropyLoss
  - ignore_index: 255
    type: OhemCrossEntropyLoss
lr_scheduler:
  end_lr: 0.0
  learning_rate: 0.01
  power: 0.9
  type: PolynomialDecay
model:
  backbone:
    in_channels: 3
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet18_vd_ssld_v2.tar.gz
    type: ResNet18_vd
  num_classes: 2
  type: BiseNetV1
optimizer:
  type: sgd
  weight_decay: 0.0005
train_dataset:
  dataset_root: /home/aistudio/PaddleSeg/data
  img_channels: 3
  mode: train
  num_classes: 2
  train_path: /home/aistudio/PaddleSeg/data/train_list.txt
  transforms:
  - max_scale_factor: 2.0
    min_scale_factor: 0.5
    scale_step_size: 0.25
    type: ResizeStepScaling
  - crop_size:
    - 512
    - 512
    type: RandomPaddingCrop
  - type: RandomHorizontalFlip
  - type: RandomDistort
  - type: Normalize
  type: Dataset
val_dataset:
  dataset_root: /home/aistudio/PaddleSeg/data
  img_channels: 3
  mode: val
  num_classes: 2
  transforms:
  - type: Normalize
  type: Dataset
  val_path: /home/aistudio/PaddleSeg/data/val_list.txt

W0724 09:40:35.476796 7755 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0724 09:40:35.476837 7755 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2023-07-24 09:40:36 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/resnet18_vd_ssld_v2.tar.gz
Connecting to https://bj.bcebos.com/paddleseg/dygraph/resnet18_vd_ssld_v2.tar.gz
Downloading resnet18_vd_ssld_v2.tar.gz
[==================================================] 100.00%
Uncompress resnet18_vd_ssld_v2.tar.gz
[==================================================] 100.00%
2023-07-24 09:40:38 [INFO] There are 115/115 variables loaded into ResNet_vd.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:654: UserWarning: When training, we now always track global mean and variance.
"When training, we now always track global mean and variance.")
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py:278: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.int64, the right dtype will convert to paddle.float32
format(lhs_dtype, rhs_dtype, lhs_dtype))
2023-07-24 09:40:55 [INFO] [TRAIN] epoch: 1, iter: 10/160000, loss: 3.4389, lr: 0.009999, batch_cost: 1.6020, reader_cost: 1.16270, ips: 2.4968 samples/sec | ETA 71:11:48
2023-07-24 09:41:10 [INFO] [TRAIN] epoch: 1, iter: 20/160000, loss: 5.1390, lr: 0.009999, batch_cost: 1.4837, reader_cost: 1.32117, ips: 2.6959 samples/sec | ETA 65:56:05
2023-07-24 09:41:25 [INFO] [TRAIN] epoch: 1, iter: 30/160000, loss: 2.9624, lr: 0.009998, batch_cost: 1.5491, reader_cost: 1.39400, ips: 2.5821 samples/sec | ETA 68:50:10
2023-07-24 09:41:40 [INFO] [TRAIN] epoch: 1, iter: 40/160000, loss: 2.4292, lr: 0.009998, batch_cost: 1.4487, reader_cost: 1.30202, ips: 2.7611 samples/sec | ETA 64:22:17
2023-07-24 09:41:55 [INFO] [TRAIN] epoch: 1, iter: 50/160000, loss: 2.5561, lr: 0.009997, batch_cost: 1.5167, reader_cost: 1.35702, ips: 2.6373 samples/sec | ETA 67:23:18
2023-07-24 09:41:55 [INFO] Start evaluating (total_samples: 30, total_iters: 30)...
Traceback (most recent call last):
  File "tools/train.py", line 262, in <module>
    main(args)
  File "tools/train.py", line 257, in main
    to_static_training=cfg.to_static_training)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddleseg/core/train.py", line 289, in train
    **test_config)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddleseg/core/val.py", line 165, in evaluate
    ignore_index=eval_dataset.ignore_index)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddleseg/utils/metrics.py", line 43, in calculate_area
    label.shape))
ValueError: Shape of pred and `label should be equal, but there are [1, 4032, 2272] and [1, 4032, 2268].
terminate called without an active exception


C++ Traceback (most recent call last):

No stack trace in paddle, may be caused by external reasons.


Error Message Summary:

FatalError: Process abort signal is detected by the operating system.
[TimeInfo: *** Aborted at 1690162917 (unix time) try "date -d @1690162917" if you are using GNU date ***]
[SignalInfo: *** SIGABRT (@0x3e800001e4b) received by PID 7755 (TID 0x7f04ccaae700) from PID 7755 ***]

Environment

The config file is:
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 160000

model:
  type: BiseNetV1
  backbone:
    type: ResNet18_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet18_vd_ssld_v2.tar.gz

train_dataset:
  type: Dataset
  dataset_root: /home/aistudio/PaddleSeg/data
  train_path: /home/aistudio/PaddleSeg/data/train_list.txt
  num_classes: 2
  mode: train
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
    - type: Normalize

val_dataset:
  type: Dataset
  dataset_root: /home/aistudio/PaddleSeg/data
  val_path: /home/aistudio/PaddleSeg/data/val_list.txt
  num_classes: 2
  mode: val
  transforms:
    - type: Normalize

optimizer:
  type: sgd
  weight_decay: 0.0005

loss:
  types:
    - type: OhemCrossEntropyLoss
    - type: OhemCrossEntropyLoss
    - type: OhemCrossEntropyLoss
  coef: [1, 1, 1]

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  end_lr: 0.0
  power: 0.9

Runtime environment:
aistudio
paddlepaddle==2.3.3
paddleseg==2.7.0
python3

Bug description confirmation

  • I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.

Are you willing to submit a PR?

  • I'd like to help by submitting a PR!
loxoo6 added the bug label Jul 24, 2023
@Asthestarsfalll
Contributor

It looks like the ground-truth (gt) label and the original image differ in size. Please check your dataset.
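One way to run that check is to compare the dimensions of every image/label pair in the list file. The following is a minimal sketch, not part of PaddleSeg: the function names `pil_image_size` and `find_size_mismatches` are hypothetical, and it assumes each line of the list file holds an image path and a label path separated by a single space, both relative to `dataset_root`, with no spaces inside the paths. The default size reader assumes Pillow is installed.

```python
import os
from typing import Callable, List, Tuple

Size = Tuple[int, int]  # (width, height)


def pil_image_size(path: str) -> Size:
    """Return (width, height) of an image file using Pillow."""
    # Imported lazily so the checker itself only needs the standard library.
    from PIL import Image
    with Image.open(path) as img:
        return img.size


def find_size_mismatches(
    list_path: str,
    dataset_root: str,
    get_size: Callable[[str], Size] = pil_image_size,
    separator: str = " ",
) -> List[Tuple[str, Size, str, Size]]:
    """Return (image, image_size, label, label_size) for each pair whose sizes differ."""
    mismatches = []
    with open(list_path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(separator)
            if len(parts) != 2:
                continue  # skip blank lines and entries without a label
            image_path = os.path.join(dataset_root, parts[0])
            label_path = os.path.join(dataset_root, parts[1])
            image_size = get_size(image_path)
            label_size = get_size(label_path)
            if image_size != label_size:
                mismatches.append((image_path, image_size, label_path, label_size))
    return mismatches
```

For the setup in this issue, calling `find_size_mismatches("/home/aistudio/PaddleSeg/data/val_list.txt", "/home/aistudio/PaddleSeg/data")` and printing the result should list the offending pairs. Running it on `train_list.txt` as well is worthwhile. If a label has to be resized to match its image, nearest-neighbor interpolation keeps the class IDs intact.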

@ToddBear
Collaborator

ToddBear commented Aug 7, 2023

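The reply above fully answers the question. If you have new questions, feel free to open an issue at any time or keep replying under this one.
We have launched an issue-resolution campaign for the PaddlePaddle suites, and interested developers are welcome to join: PaddlePaddle/PaddleOCR#10223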

ToddBear closed this as completed Aug 7, 2023