Automating the semantic validation of SysML v2 models through a multi-agent framework. Olivert-Iserte, J., Cibrián, E., Mendieta, R., & Álvarez-Rodríguez, J. M. 2026.
Ensuring the semantic alignment between system models and natural language requirements is a critical bottleneck for the adoption of Model-Based Systems Engineering (MBSE). While SysML v2 offers enhanced expressiveness compared to SysML v1, including (i) first-class support for behavioral and structural integration, (ii) improved typing and reuse mechanisms, and (iii) formal semantics enabling precise model interpretation, existing validation tools primarily target syntactic correctness, leaving a gap in automated semantic assessment. The present work proposes a deterministic multi-agent framework designed to automate the semantic validation of SysML v2 models. Unlike traditional monolithic Large Language Model (LLM) approaches, the proposed framework decomposes the validation process into a specialized pipeline of five agents: model abstraction, requirement normalization, semantic judgement (grounded in the requirements quality characteristics defined in ISO/IEC/IEEE 29148:2018, "Systems and software engineering — Life cycle processes — Requirements engineering"), metric aggregation, and actionable feedback generation. The approach is evaluated through an empirical study using a dataset of 14 SysML v2 models and 303 requirements derived from official specifications. A comparative analysis is performed using two different LLM backends: a high-efficiency proprietary model (GPT-4o-mini) and an open-weight reasoning model (DeepSeek-R1). Results show that the multi-agent orchestration reliably identifies semantic mismatches, incomplete realizations, and omissions across varying levels of system complexity, ranging from small component-level models (M1–M5) to large system-of-systems architectures involving multiple interacting subsystems. The findings show that while the open-weight model exhibits a more permissive validation stance, the proprietary backend provides a conservative and highly precise assessment suitable for safety-critical domains.
This work contributes to the automation of auditable model quality assessment, providing a scalable path for iterative model refinement in modern software and systems engineering.
@misc{olivert_iserte_automating_2026,
	title = {{Automating the semantic validation of {SysML} v2 models through a multi-agent framework}},
	url = {https://www.ssrn.com/abstract=6553831},
	doi = {10.2139/ssrn.6553831},
	abstract = {Ensuring the semantic alignment between system models and natural language requirements is a critical bottleneck for the adoption of Model-Based Systems Engineering (MBSE). While SysML v2 offers enhanced expressiveness compared to SysML v1, including (i) first-class support for behavioral and structural integration, (ii) improved typing and reuse mechanisms, and (iii) formal semantics enabling precise model interpretation, existing validation tools primarily target syntactic correctness, leaving a gap in automated semantic assessment. The present work proposes a deterministic multi-agent framework designed to automate the semantic validation of SysML v2 models. Unlike traditional monolithic Large Language Model (LLM) approaches, the proposed framework decomposes the validation process into a specialized pipeline of five agents: model abstraction, requirement normalization, semantic judgement (grounded in the requirements quality characteristics defined in ISO/IEC/IEEE 29148:2018, ``Systems and software engineering — Life cycle processes — Requirements engineering''), metric aggregation, and actionable feedback generation. The approach is evaluated through an empirical study using a dataset of 14 SysML v2 models and 303 requirements derived from official specifications. A comparative analysis is performed using two different LLM backends: a high-efficiency proprietary model (GPT-4o-mini) and an open-weight reasoning model (DeepSeek-R1). Results show that the multi-agent orchestration reliably identifies semantic mismatches, incomplete realizations, and omissions across varying levels of system complexity, ranging from small component-level models (M1–M5) to large system-of-systems architectures involving multiple interacting subsystems. The findings show that while the open-weight model exhibits a more permissive validation stance, the proprietary backend provides a conservative and highly precise assessment suitable for safety-critical domains.
This work contributes to the automation of auditable model quality assessment, providing a scalable path for iterative model refinement in modern software and systems engineering.},
	urldate = {2026-05-13},
	publisher = {SSRN},
	author = {Jose Olivert-Iserte and Eduardo Cibri{\'{a}}n and Roy Mendieta and
                  Jos{\'{e}} Mar{\'{\i}}a {\'{A}}lvarez-Rodr{\'{\i}}guez},
	year = {2026},
}
