Through a comprehensive evaluation of model complexity and number of parameters, it was determined that the overall performance of the proposed model is the best when eight group convolutions are used ...
The transformer-based semantic segmentation approaches, which divide the image into different regions by sliding windows and model the relation inside each window, have achieved outstanding success.