Dataset: https://www.kaggle.com/datasets/bulentsiyah/semantic-drone-dataset
I used the U-Net model from segmentation_models.pytorch (https://github.com/qubvel/segmentation_models.pytorch).
I'll show both the best and the worst predictions, and also explain why this happens.
This graph shows how many pixels of each class color (listed on the left) appear in the training set. As you can see, the training set is imbalanced. That is a pretty normal phenomenon, though: seen from above, people, bikes, and other small objects cover only a small area (few pixels), so it is difficult for the model to fit those tiny objects.
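
A pixel-count distribution like the one described above could be computed roughly as follows. This is only a minimal sketch, not the exact code behind the graph: it assumes each mask is already a 2D array of integer class indices (not RGB colors), and `class_pixel_counts`, `num_classes`, and the toy masks are hypothetical names and data made up for illustration.

```python
import numpy as np


def class_pixel_counts(masks, num_classes):
    """Sum per-class pixel counts over an iterable of integer label masks.

    Assumption: every mask holds non-negative class indices.
    Indices >= num_classes are silently ignored by the slice below.
    """
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        flat = np.asarray(mask).ravel()
        counts += np.bincount(flat, minlength=num_classes)[:num_classes]
    return counts


# Toy data standing in for real masks: class 0 dominates,
# classes 1 and 2 play the role of tiny objects.
toy_masks = [
    np.array([[0, 0, 0], [0, 1, 0]]),
    np.array([[0, 0, 2], [0, 0, 0]]),
]

counts = class_pixel_counts(toy_masks, num_classes=3)
fractions = counts / counts.sum()  # share of all pixels per class

for class_id, (n, frac) in enumerate(zip(counts, fractions)):
    print(f"class {class_id}: {n} pixels ({frac:.1%})")
```

Looking at the per-class fractions rather than raw counts makes it easier to see how small the share of the tiny classes is compared with the large background-like classes.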