Program – SEE-AIT 2026 (Co-located with FSE'26)

Workshop Schedule

Time	Activity
14:00 – 15:00	Session 1 — Opening
14:00 – 14:10	Words of Welcome — Felix Dobslaw
14:10 – 14:50	Keynote: Architects of Culture: Bridging Software Engineering and HR in the GenAI Era. Dr. Magnus Lundbäck & Dr. Lucas Gren (Getinge / Chalmers University of Technology)
14:50 – 15:00	Q&A
15:00 – 16:00	Session 2 — Paper Presentations
15:00 – 15:20	Generative AI and the Future of Professional Software Development: Survey Findings from Finland. Määttä, Kelanti, and Turhan (University of Oulu)
15:20 – 15:40	Lifecycle-Aware GenAI Assistance with MCP via Context Refinement Loops. Elsisi, De Abreu Santos, and Melo (Toronto Metropolitan University / Colorado State University)
15:40 – 16:00	LLM-based Specification-Driven Test Oracle Enhancement. Fu, Zhao, Wang et al. (Tianjin University / University of Bristol)
16:00 – 16:30	Coffee Break
16:30 – 18:00	Session 3 — Interactive Fishbowl
16:30 – 16:45	Introduction: Fishbowl Format and Purpose
16:45 – 17:45	Fishbowl Discussion
17:45 – 18:00	Wrap-up and Closing

Keynote Session

Architects of Culture: Bridging Software Engineering and HR in the GenAI Era

by Dr. Magnus Lundbäck and Dr. Lucas Gren

Accepted Papers

Generative AI and the Future of Professional Software Development: Survey Findings from Finland

Samuli Määttä (University of Oulu),
Markus Kelanti (University of Oulu),
Burak Turhan (University of Oulu)

Lifecycle-Aware GenAI Assistance with MCP via Context Refinement Loops: A Reference Architecture

Omar Elsisi (Toronto Metropolitan University),
Fabio De Abreu Santos (Colorado State University),
Glaucia Melo (Toronto Metropolitan University)

LLM-based Specification-Driven Test Oracle Enhancement

Ruifeng Fu (School of Computer Software, Tianjin University),
Yingquan Zhao (School of Cybersecurity, Tianjin University),
Meng Wang (Department of Computer Science, University of Bristol),
Zan Wang (School of Artificial Intelligence, Tianjin University),
Junjie Chen (School of Computer Software, Tianjin University)

Software testing critically depends on test oracles, yet existing test oracles are often incomplete and insufficient for detecting bugs where implementations deviate from their specifications. Meanwhile, despite advances in test oracle construction, existing techniques typically rely on coarse-grained failure signals or require substantial manual effort, and thus remain inadequate for detecting specification-violation bugs. To address these limitations, we propose JavaOracle, a specification-driven approach that leverages large language models (LLMs) to reason over specifications and systematically enhance test oracles. Specifically, JavaOracle consists of three stages. First, it leverages LLMs to analyze specifications and derive additional test oracles that are not covered by existing tests, while integrating a root-cause–guided repair workflow to ensure that the enhanced test cases are syntactically and semantically valid. Second, it designs a multi-agent debate workflow to distinguish previously unknown specification-violation bugs from assertion failures caused by LLM hallucinations, thereby mitigating the impact of hallucinations. Third, unlike existing approaches that stop at bug detection, JavaOracle further automates test case minimization and bug report generation, producing submission-ready reports without manual effort.

We evaluate JavaOracle on 3,961 test cases for the Java SE API from OpenJDK. Experimental results on the latest OpenJDK standard library show that JavaOracle substantially outperforms state-of-the-art baselines, including Fuzz4All, ChatAssert, and Randoop. Cumulatively, JavaOracle discovers 45 previously unknown bugs, which have already been confirmed or fixed by developers, with many persisting since their initial implementation. In a full end-to-end pipeline run, JavaOracle reduces execution failures to an average of 6 reports per run, achieving 88.9% precision. In contrast, the baseline approaches required exhaustive manual inspection to identify only 0, 3, and 2 real bugs, respectively. Further analysis shows that test cases enhanced by JavaOracle achieve high validity, with an execution pass rate of 79.1%, compared to 35.4%, 92.2% (with 55.5% of test cases unchanged), and 77.6% for the baselines. Ablation studies further demonstrate the effectiveness of JavaOracle's components, while the automated pipeline significantly reduces manual analysis effort.