A Multimodal Intelligent Change Assessment Framework for Microservice Systems Based on Large Language Models
Published in FSE (CCF-A), 2025
Frequent changes in large-scale online service systems often lead to failures, threatening system reliability. To overcome the limitations of existing techniques in erroneous change detection, failure triage, and root cause change analysis, this paper presents SCELM, a multimodal intelligent change assessment framework based on large language models. The framework integrates retrieval-augmented generation and leverages a unified representation of multimodal data, enhanced knowledge access, and domain-specific LLMs to automate the entire change management lifecycle. Experiments on two microservice system datasets show that our method outperforms state-of-the-art approaches in accuracy and efficiency while requiring less manual intervention. Furthermore, SCELM has been operational in the real world for over 11 months, reducing response and resolution times for erroneous changes by 90% and significantly improving incident handling efficiency. This work provides a robust solution for change management and valuable insights into improving system stability and optimizing operational workflows.
