Semantic segmentation (pixel-level classification) is an important computer vision task with potential applications in a variety of fields, such as autonomous driving and medicine. Deep models have been applied successfully to semantic segmentation on a variety of benchmarks. Still, these models remain sensitive to domain shifts between training and test data, which is a potential obstacle to their reliability in real-world applications. Multi-dataset training may mitigate this problem. The goal of this topic is to get acquainted with deep learning frameworks and with popular image classification and semantic segmentation approaches, to train and evaluate these approaches on smaller datasets, and then to explore different approaches to multi-dataset training.
Keywords: Deep learning, Computer vision
Literature
efficient semantic segmentation: https://arxiv.org/abs/1903.08469
mask-based models for semantic segmentation: https://arxiv.org/abs/2112.01527
multi-head multi-dataset training: https://www.sciencedirect.com/science/article/pii/S0925231217306847
multi-dataset training with unified output: https://arxiv.org/abs/2212.10340
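
To make the idea of a unified output space more concrete, the following is a minimal sketch of one simple way to merge the label sets of two datasets by class name and to remap each dataset's ground-truth labels into the shared index space. It is an illustration only and not necessarily the method used in the works listed above. The dataset names, class names, the helper functions and the ignore index 255 are all hypothetical choices made for this example.

```python
# Minimal sketch (assumption: classes with the same name mean the same thing
# in every dataset, which real taxonomies often violate).

IGNORE_ID = 255  # hypothetical index for unlabeled pixels


def build_unified_classes(datasets):
    """Return the sorted union of class names over all datasets."""
    return sorted({name for classes in datasets.values() for name in classes})


def make_remap(local_classes, unified_classes):
    """Map each dataset-local class index to its unified class index."""
    index = {name: i for i, name in enumerate(unified_classes)}
    return [index[name] for name in local_classes]


def remap_labels(label_map, remap, ignore_id=IGNORE_ID):
    """Rewrite a 2D label map (list of rows) into the unified index space."""
    return [
        [v if v == ignore_id else remap[v] for v in row]
        for row in label_map
    ]


# Hypothetical datasets with partially overlapping class sets.
DATASETS = {
    "dataset_a": ["road", "car", "person"],
    "dataset_b": ["road", "building", "car"],
}

UNIFIED = build_unified_classes(DATASETS)
REMAP_A = make_remap(DATASETS["dataset_a"], UNIFIED)
REMAP_B = make_remap(DATASETS["dataset_b"], UNIFIED)

# A tiny 2x2 ground-truth map from dataset_a, using dataset-local indices.
labels_a = [[0, 1], [2, IGNORE_ID]]
unified_labels_a = remap_labels(labels_a, REMAP_A)
```

After remapping, a single model with one output channel per unified class can be trained on images from both datasets. A multi-head approach would instead keep each dataset's own label indices and attach one classifier per dataset to a shared backbone.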