Requirements engineering in the age of artificial intelligence

By Maksim Kuzmin

Introduction

Requirements engineering is an important and often undervalued step in the software development process. A poor understanding of the software requirements often causes design flaws later in development and complicates software validation. However, software development teams often lack the resources, training, skills or tools necessary for a proper requirements engineering process. Organizations may also emphasize an implementation-first approach to software engineering, which encourages teams to cut corners in the requirements engineering process [1]. These resource constraints make it attractive to outsource at least some parts of the requirements engineering process to AI.

Artificial intelligence has a long history in requirements engineering. Many AI techniques have been used to elicit, analyze and represent software requirements, and AI also simplifies working with large-scale requirements. Recent advances in technology have created a new class of models, large language models, that could change how AI is used in requirements engineering. These models are designed to work with natural language, and requirements are often written in natural language. The significance of natural language processing for AI-enhanced requirements engineering can be seen in the word clouds from the Artificial Intelligence in Requirements Engineering workshops: natural language processing is a major component of each of the six word clouds presented [3].

In my opinion, artificial intelligence is mostly useful during the requirements elicitation stage of the requirements engineering process. It suffers from its inability to replicate domain knowledge, as knowledge relevant to a given domain constitutes only a small part of the data on which a particular model was trained. Because modern forms of AI such as large language models (LLMs) are prone to hallucinations, they also increase the amount of work needed during the requirements analysis stage.

Literature Review

Before the development of LLMs, AI was used to work with requirements gathered from stakeholders during the requirements engineering process. Techniques such as automated classification were applied to problems like separating functional from non-functional requirements or separating bug reports from feature requests [3]. For example, Maalej et al. [4] used automated classification to sort app reviews from the App Store and Google Play Store into bug reports, feature requests, user experiences and text ratings. Winkler et al. [5] used convolutional neural networks to classify content into requirements and non-requirements. Although this is a simpler classification task than the one addressed by Maalej et al., the resulting classification was much more accurate.
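To make the idea of automated classification concrete, the following is a minimal sketch of a text classifier that separates bug reports from feature requests. It is not the method used by Maalej et al. or Winkler et al.: it swaps in a small multinomial Naive Bayes model written with the Python standard library, and the training reviews, labels and class name are invented for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of texts
        self.vocabulary = set()

    def train(self, labelled_texts):
        for text, label in labelled_texts:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior of the label plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, invented for this illustration.
TRAINING_REVIEWS = [
    ("The app crashes every time I open the camera", "bug report"),
    ("Login fails with an error after the update", "bug report"),
    ("Screen freezes and the app crashes on startup", "bug report"),
    ("Please add a dark mode option", "feature request"),
    ("Would love to see support for exporting to PDF", "feature request"),
    ("Please add an option to sort my notes", "feature request"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_REVIEWS)
print(classifier.predict("The app crashes after the latest update"))
print(classifier.predict("Please add support for widgets"))
```

A realistic classifier would need far more labelled data and a proper evaluation of precision and recall; the sketch only shows the basic mechanism of learning word statistics per category and choosing the most probable label.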

While these methods can be useful, they still require users to write their requirements or reviews without guidance from the AI. Rajender Kumar Surana et al. [6] created a chatbot that converses with users in natural language and elicits formal system requirements from the interaction. The obvious advantage of this approach is that producing a list of formal system requirements no longer requires people trained in requirements engineering.

A logical development of this approach is the use of LLM-based chatbots in requirements engineering (RE). LLMs can be game-changers in the field of RE [2], as they have shown promise in several important domains: natural language processing, code generation, and program understanding. LLMs can be used to absorb domain-specific knowledge [7] and serve as a proxy for domain experts. They can also highlight areas of ambiguity within the requirements and translate complicated technical jargon into plain language. LLMs are also important for human-centric requirements engineering, where they can analyze user reviews and simulate user journeys while considering human-centric factors such as age, gender and social group.

A case for LLMs: use of LLMs for RE for automated driving

Automated driving is a novel field of software and system engineering. Safety is paramount in developing automated driving systems, as cars present many safety risks to both passengers and pedestrians. Because of that, hazard analysis and risk assessment (HARA) is an integral part of requirements engineering in the automotive industry and, thus, in the development of automated driving solutions.

HARA includes brainstorming about possible hazards, and LLMs can be of use during this brainstorming process. Nouri et al. [8] developed a prototype aimed at helping human engineers specify safety requirements during HARA. The prototype was created in the context of automated driving systems and the automotive industry, so it made use of standards such as ISO 21448 [9], which is concerned with constructing hazardous events from the hazards identified during HARA, and of previous domain-specific research such as the PEGASUS research project [10].

The prototype was designed in cycles, each consisting of design, implementation, and evaluation phases. The first cycle focused on identifying the limitations of LLMs during HARA, and the prototype created at the end of that cycle used generic industry standards for HARA [11]. That prototype was evaluated internally. In the next cycle, the developers decomposed the HARA task into a pipeline of subtasks, each of which was within the LLM’s limitations, and the HARA produced by the prototype using this pipeline was presented to external experts for evaluation. In the final cycle, the prototype was refined based on the experts’ feedback: the LLM-assisted HARA pipeline was improved, and prompt engineering was used to raise the quality of the LLM’s output. The final prototype was also integrated into a case company’s processes.

The implementation of this fully automated LLM-based pipeline for automated driving showed that LLMs can make reasonable assumptions about safety-critical software systems. In some places in the pipeline, the LLMs were augmented by rule-based Python scripts that processed data between the LLM-based steps. Prompt engineering greatly affected the performance of the pipeline. Overall, the experts considered the developed tool useful and possibly capable of replacing humans in the future.
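The general shape of such a pipeline, with narrow LLM steps connected by rule-based glue code, can be sketched as follows. This is an illustration under stated assumptions and not the pipeline built by Nouri et al.: the step boundaries, the prompts, and the names `fake_llm`, `parse_list` and `run_pipeline` are all hypothetical, and `fake_llm` returns canned text in place of a real model call so that the example runs offline.

```python
import re
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned text."""
    if prompt.startswith("List hazards"):
        # Deliberately messy output: mixed list markers and a duplicate.
        return "1. Unintended acceleration\n2. Loss of braking\n- loss of braking\n"
    hazard = prompt.split(":", 1)[1].strip()
    return f"Hazardous event: {hazard} while driving in urban traffic"


def parse_list(raw: str) -> List[str]:
    """Rule-based step: strip list markers, drop blanks and duplicates."""
    items, seen = [], set()
    for line in raw.splitlines():
        item = re.sub(r"^\s*(\d+[.)]|[-*])\s*", "", line).strip()
        if item and item.lower() not in seen:
            seen.add(item.lower())
            items.append(item)
    return items


def run_pipeline(function_description: str, llm: Callable[[str], str]) -> List[str]:
    # Step 1 (LLM): brainstorm hazards for the described vehicle function.
    raw_hazards = llm(f"List hazards for this function: {function_description}")
    # Step 2 (rules): turn the free-text answer into a clean list.
    hazards = parse_list(raw_hazards)
    # Step 3 (LLM): one narrow prompt per hazard keeps each task small.
    return [llm(f"Describe a hazardous event for: {hazard}") for hazard in hazards]


for event in run_pipeline("automatic emergency braking", fake_llm):
    print(event)
```

Splitting the work this way keeps each prompt small and lets deterministic code handle the parts, such as parsing and de-duplication, where an LLM adds no value. The output would still need review by human experts, as the evaluation phases described above suggest.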

That does not mean that LLMs have no limitations or that implementing the LLM-based safety requirements process was free of challenges. The main limitation encountered was the lack of domain knowledge in LLMs. Furthermore, the data used to train the LLMs, which usually cannot be accessed by users, clearly affected their ability to discuss technical topics related to the task at hand. Undesirable hallucinations were also a challenge: HARA requires some creativity, yet it is hard to enable that creativity in LLMs without also causing hallucinations. This challenge was not critical, as evaluating the results is an integral part of HARA under any circumstances.

Conclusion

Artificial intelligence has been used for a long time in the process of requirements engineering. New developments in the field of AI, such as LLMs, have obvious limitations such as their lack of domain knowledge and their hallucinations. Despite those limitations, the case of LLM-based safety requirements engineering for automated driving software shows that the new advances in AI have the potential to be game-changers in the requirements engineering process, even in the most safety-critical applications.

References

[1] Lamsweerde, A.V., 2009. Requirements engineering: from system goals to UML models to software specifications. John Wiley & Sons, Ltd.
[2] Arora, C., Grundy, J. and Abdelrazek, M., 2024. Advancing requirements engineering through generative AI: assessing the role of LLMs. In: Nguyen-Duc, A., Abrahamsson, P. and Khomh, F. (eds) Generative AI for Effective Software Development. Cham: Springer. https://doi.org/10.1007/978-3-031-55642-5_6
[3] Dalpiaz, F. and Niu, N., 2020. Requirements engineering in the days of artificial intelligence. IEEE Software, 37(4), pp. 7-10. https://doi.org/10.1109/MS.2020.2986047
[4] Maalej, W., Kurtanović, Z., Nabil, H. et al., 2016. On the automatic classification of app reviews. Requirements Engineering, 21, pp. 311-331. https://doi.org/10.1007/s00766-016-0251-9
[5] Winkler, J.P., Grönberg, J. and Vogelsang, A., 2019. Optimizing for recall in automatic requirements classification: an empirical study. In 2019 IEEE 27th International Requirements Engineering Conference (RE), Jeju, South Korea (pp. 40-50). https://doi.org/10.1109/RE.2019.00016
[6] Rajender Kumar Surana, C.S., Shriya, Gupta, D.B. and Shankar, S.P., 2019. Intelligent chatbot for requirements elicitation and classification. In 2019 4th International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT), Bangalore, India (pp. 866-870). https://doi.org/10.1109/RTEICT46194.2019.9016907
[7] Luitel, D., Hassani, S. and Sabetzadeh, M., 2023, April. Using language models for enhancing the completeness of natural-language requirements. In International Working Conference on Requirements Engineering: Foundation for Software Quality (pp. 87-104). Cham: Springer Nature Switzerland. https://doi.org/10.48550/arXiv.2302.04792
[8] Nouri, A., Cabrero-Daniel, B., Törner, F., Sivencrona, H. and Berger, C., 2024. Engineering safety requirements for autonomous driving with large language models. In 2024 IEEE 32nd International Requirements Engineering Conference (RE), Reykjavik, Iceland (pp. 218-228). https://doi.org/10.1109/RE59067.2024.00029
[9] International Organization for Standardization, 2019. Road vehicles - Safety of the intended functionality. ISO.
[10] Scholtes, M. et al., 2021. 6-layer model for a structured description and categorization of urban traffic and environment. IEEE Access, 9, pp. 59131-59147. https://doi.org/10.1109/ACCESS.2021.3072739
[11] International Organization for Standardization, 2018. Road vehicles - Functional safety. ISO 26262:2018.
