Johannes Kirmayr M.Sc.
| Phone: | +49 821 598 - 2328 |
| Email: | johannes1.kirmayr@uni-a.de |
| Room: | 2022 (N) |
| Open hours: | upon request |
| Address: | Universitätsstraße 6a, 86159 Augsburg |
Research Interests
- Large Language Models (LLMs)
- LLM Agents
- LLM Planning and Reasoning
- LLM Finetuning / Agent Training
- Human-Agent Interaction
Bachelor/Master Thesis or Project Module
Thesis Guidelines for Prospective Students
If you're interested in writing a Bachelor’s or Master’s thesis with me, please follow these guidelines:
How to Apply
1. Looking for a Topic
- Explore the Open Topics section below.
- If a topic matches your interests, please contact me by email and include:
- A motivational statement explaining why the topic fits your interests.
- Your transcript of records (current and previous, if applicable).
- A timeframe for the thesis (planned start and end dates).
- Note: Supervision depends on my capacity and topic relevance.
2. Have Your Own Topic
- If proposing your own topic:
- Include how it aligns with my research.
- If suitable, I will guide you through the next steps.
- Note: As I am an external PhD candidate with BMW Group, I cannot supervise theses conducted with other external companies.
Next Steps
- If Accepted:
- We will formalize your topic and discuss project goals.
- Ensure you meet university-specific requirements (e.g., registration, defense talks).
- If Declined:
- You may revise your topic or explore alternative supervisors.
Evaluation Criteria
Your thesis will be graded on:
- Literature review.
- Scientific approach and methodology.
- Clear structure and comprehensive documentation.
- Novelty and significance (especially for Master's students).
- Quality of implementation or study design.
Feel free to contact me for further clarification or to apply. Looking forward to working on exciting projects together!
Open Topics:
Training Reliable In-Car Assistant Agents
Current LLM agents prioritize task completion over compliance: they fabricate capabilities, act on ambiguous requests without clarifying, and violate domain policies. In our CAR-bench benchmark [1], even frontier models achieve less than 54% consistent success across these dimensions. This thesis targets one or more of these failure modes by training small language models (SLMs) within the CAR-bench environment, using paradigms such as self-evolving agents, LoRA fine-tuning, RLVR, or supervised fine-tuning. Focus area and method can be tailored to the student's interests. Familiarity with LLM fine-tuning or reinforcement learning is beneficial but not required. I am also open to self-proposed topics within the CAR-bench environment.
[1] Kirmayr, Johannes, Lukas Stappen, and Elisabeth André. "CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty." arXiv preprint arXiv:2601.22027 (2026).
Synthetic Multi-Turn Data Generation for Training Tool-Using In-Car Voice Agents
Training reliable tool-using agents requires large-scale interaction data that is expensive to collect manually. This thesis explores novel synthetic data generation approaches for the CAR-bench environment [1], building on work such as APIGen-MT [2], which used synthetic pipelines to train small models that outperform frontier LLMs on τ-bench. The student will research and develop methods to generate diverse, verifiable multi-turn trajectories across CAR-bench's 58 tools and three task types (base, hallucination, disambiguation), with a focus on data quality, verification, and downstream training effectiveness for SLM agents. Familiarity with LLMs, graph networks, and data pipelines is beneficial but not required.
[1] Kirmayr, Johannes, Lukas Stappen, and Elisabeth André. "CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty." arXiv preprint arXiv:2601.22027 (2026).
[2] Prabhakar, Akshara, et al. "APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay." arXiv preprint arXiv:2504.03601 (2025).
Human-Agent Interaction for Multi-Step LLM Assistants
As LLM agents handle increasingly complex, multi-step tasks, new HCI challenges arise: How do users maintain trust when agents execute long-running operations? How should agents communicate uncertainty, progress, or errors? This thesis investigates human-agent interaction design for multi-turn, tool-using voice assistants in the automotive domain. Possible research directions include user trust dynamics during multi-step task execution, interaction design patterns for agent transparency and error recovery, or the effect of agent communication strategies on user experience. The thesis will likely include a user study. Focus area can be tailored to the student's interests. Background in LLMs, HCI, UX research, or study design is beneficial but not required.