Collaborating Foundation Models for Domain Generalized Semantic Segmentation

Domain Generalized Semantic Segmentation (DGSS) trains a model on a labeled source domain so that it generalizes to unseen domains at inference time. Existing DGSS methods typically obtain robust features through Domain Randomization (DR). This approach is limited because it diversifies only style, not content. In this work, we take an orthogonal approach to DGSS and propose an assembly of CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation (CLOUDS). CLOUDS is a framework that integrates Foundation Models of various kinds: (i) a CLIP backbone for its robust feature representation, (ii) a Diffusion Model to diversify the content, thereby covering various modes of the possible target distribution, and (iii) the Segment Anything Model (SAM) for iteratively refining the predictions of the segmentation model. Extensive experiments show that CLOUDS excels in adapting from synthetic to real DGSS benchmarks and under varying weather conditions, outperforming prior methods by 5.6% and 6.7% in averaged mIoU, respectively. Our code is available at https://github.com/yasserben/CLOUDS
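
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed. It is not taken from the CLOUDS code base. The function names `mean_iou` and `averaged_miou` are hypothetical, labels are given as flat lists of class indices, and "averaged mIoU" is assumed here to mean the arithmetic mean of the mIoU obtained on each target dataset.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in the prediction or the ground truth.

    Hypothetical helper for illustration: `pred` and `target` are flattened
    per-pixel class indices of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious: List[float] = []
    for c in range(num_classes):
        # Pixels labeled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one of the two.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def averaged_miou(per_dataset_miou: Sequence[float]) -> float:
    """Arithmetic mean of per-dataset mIoU values (an assumed definition)."""
    return sum(per_dataset_miou) / len(per_dataset_miou)


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5. Benchmark implementations usually accumulate intersections and unions over the whole evaluation set before dividing, rather than averaging per image.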

(2024). Collaborating Foundation Models for Domain Generalized Semantic Segmentation. Retrieved from https://hdl.handle.net/10446/311026

Benigmim, Yasser; Roy, Subhankar; Essid, Slim; Kalogeiton, Vicky; Lathuilière, Stéphane

2024-01-01


DOI: https://dx.doi.org/10.1109/CVPR52733.2024.00300
ISI identifier: WOS:001322555903048
Scopus identifier: 2-s2.0-85207311037
Publication date: 2024
Content language(s): English
Title of volume/monographic issue/online collection: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISBN of the proceedings: 979-8-3503-5301-3
ISBN of the online version: 979-8-3503-5300-6
Series (ANCE): PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
URL of the proceedings: https://ieeexplore.ieee.org/xpl/conhome/10654794/proceeding
First page: 3108
Last page: 3119
Format: print; online
Country of publication: United States
City of publication: Piscataway
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Conference name: CVPR 2024: Conference on Computer Vision and Pattern Recognition, Seattle, United States of America, 16-22 June 2024
Conference location: Seattle, USA
Conference dates: 16-22 June 2024
Conference scope: international
Type of participation: contribution
Scientific-disciplinary sector (valid from 09/05/2024): Sector IINF-05/A - Information processing systems (Sistemi di elaborazione delle informazioni)
Keywords: Foundation Models for Domain Generalized; Semantic Segmentation
Type: info:eu-repo/semantics/conferenceObject
Number of authors: 5
All authors: Benigmim, Yasser; Roy, Subhankar; Essid, Slim; Kalogeiton, Vicky; Lathuilière, Stéphane
Type: 1.4 Contributions in conference proceedings::1.4.01 Conference presentations
Full text: reserved
File description: not defined
Faculty site type: 273
Citation: (2024). Collaborating Foundation Models for Domain Generalized Semantic Segmentation. Retrieved from https://hdl.handle.net/10446/311026
In collections: 1.4.01 Conference presentations

Files attached to the record:

File	File size	Format
Benigmim_Collaborating_Foundation_Models_for_Domain_Generalized_Semantic_Segmentation_CVPR_2024_paper.pdf (access: archive managers only; version: publisher's version; license: Aisberg default license)	3.52 MB	Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/10446/311026
