Hello! I’m Jinwei Hu (胡津玮), currently pursuing my Ph.D. in Computer Science at the University of Liverpool under the supervision of Prof. Xiaowei Huang and Dr. Yi Dong. My research focuses on understanding and improving the robustness, interpretability, and reliability of modern AI systems, particularly in safety-critical settings. I am especially interested in LLM-driven agents operating in dynamic or adversarial environments, the emergence of collusive behaviours among autonomous agents, and fine-grained knowledge manipulation techniques such as LLM unlearning. I also work on simulation-driven cyber defence, multi-agent coordination, and the integration of symbolic structure with generative intelligence.

Before starting my Ph.D., I completed an MSc in Applied Computational Science and Engineering with Distinction at Imperial College London, where I conducted research on AI4Science and explainable AI under the supervision of Dr. Sibo Cheng and Dr. Rossella Arcucci from the Data Science Institute. I also hold a First Class BSc (Hons) in Computer Science (Artificial Intelligence) from the University of Liverpool and a BSc in Information and Computing Science from Xi’an Jiaotong-Liverpool University.

📚 My current research interests include, but are not limited to:

  • Generative AI
  • Responsible AI
  • Agentic AI
  • AI Safety and Security
  • AI4Science

Please reach out to collaborate 😃

🔥 News

  • 2026.04:   📝 Invited to serve as a Reviewer for NeurIPS 2026.
  • 2026.04:   🎉 Our paper on multi-agent collusive attacks by LLM-based agents in open channels has been accepted at ACL 2026.
  • 2026.04:   📝 Invited to serve as a Journal Referee for IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
  • 2026.03:   📝 Invited to serve as a Journal Referee for ACM Transactions on Software Engineering and Methodology (TOSEM).
  • 2026.01:   🎤 Gave an Oral Presentation at AAAI 2026 at Singapore EXPO.
  • 2026.01:   🎉 Our paper on adversarial robustness testing has been accepted at IEEE ICASSP 2026, which will be held in Barcelona, Spain.

  • 2025.12:   🎤 Invited to present a talk on AI in Programmatic Agents at the Trustworthy AI+ Workshop, co-hosted by King’s College London and the University of Exeter.
  • 2025.11:   🎉 Our paper on Agentic AI has been accepted at AAAI 2026 and selected for an Oral Presentation.
  • 2025.10:   🎉 Our paper on LLM guardrails is now available in Artificial Intelligence Review.
  • 2025.09:   🎉 Our paper on LLM unlearning has been accepted at NeurIPS 2025.
  • 2025.09:   🎉 Our emerging research work on the responsibility of LLM-powered agents has been accepted as a poster at UKAIRS 2025.
  • 2025.08:   📝 Served as a Reviewer for NeurIPS 2025, reviewing submissions in the area of AI safety.
  • 2025.08:   📝 Served as a Programme Committee Member for AAAI 2026, contributing to the review process for papers in the main track.
  • 2025.08:   🎉 Our paper on randomized smoothing for LLM-driven multi-agent systems was accepted as a Fast Track publication in the top-tier journal CJoA.
  • 2025.07:   🎉 Our paper on adversarial testing for industrial cyber-physical systems was published in IEEE Transactions on ICPS.
  • 2025.06:   🎉 Our paper on safe pruning LoRA has been accepted at TACL.
  • 2025.06:   📝 Served as a Programme Committee Member for UKAIRS 2025.
  • 2025.02:   🎉 Our paper on social media deepfake detection has been accepted by CVPR 2025.

  • 2024.09:   🔬 Joined as a Research Associate, working mainly on the project “CRoCS: Certified Robust and Scalable Autonomous Operation in Cyber Space,” funded by the Alan Turing Institute (AICD Research Centre).
  • 2024.06:   🏆 Won the ELLIS Manchester Scholarship, which supported my attendance at the ELLIS Summer Session hosted at the University of Manchester.
  • 2024.05:   🎤 Gave a tutorial session on “How to Control LLMs’ behaviors and Design Strategy to safeguard LLMs” at the TACPS & Trust-AI Reading Group.
  • 2024.05:   📝 Served as a Reviewer for ECAI 2024.
  • 2024.05:   🎉 Our paper on LLM guardrails has been accepted by ICML 2024.
  • 2024.01:   🎉 My master’s thesis on Explainable AI and Chemistry (AI4Science) was accepted for publication in the top journal CEJ.

  • 2023.12:   🏆 Awarded a full scholarship to pursue my PhD at the University of Liverpool and joined the Trustworthy Autonomous Cyber Physical Systems (ACPS) Lab, under the supervision of Prof. Xiaowei Huang and Dr. Yi Dong.
  • 2023.10:   🎓 Graduated from Imperial College London and was honored to receive a Master of Science with Distinction, the highest academic award for the degree in the UK.

📝 Publications

Selected Conference Publications

ACL 2026

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage

Jinwei Hu, Xinmiao Huang, Youcheng Sun, Yi Dong, Xiaowei Huang

  • We identify and formalize cognitive collusion in LLM agents, and propose a multi-agent generative montage framework that manipulates beliefs using only truthful evidence, revealing a new class of reasoning-driven vulnerabilities in public channels.
AAAI 2026 Oral

Tapas Are Free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments

Jinwei Hu, Yi Dong, Youcheng Sun, Xiaowei Huang

  • We propose TAPA, a training-free framework that uses LLMs to synthesize agent actions for adaptive decision-making, shifting from policy retraining to action-level adaptation and demonstrating strong performance in cyber defense and swarm control.
NeurIPS 2025

FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model

Jinwei Hu, Zhenglin Huang, Xiangyu Yin, Wenjie Ruan, Guangliang Cheng, Yi Dong, Xiaowei Huang

  • A representation-guided unlearning framework that combines contrastive learning, gradient projection, and information-theoretic metrics to enable more precise knowledge removal in LLMs.

Selected Journal Publications

Chinese Journal of Aeronautics

Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing

Jinwei Hu, Yi Dong, Zhengtao Ding, Xiaowei Huang

  • We propose a defense framework for LLM-driven multi-agent systems that leverages randomized smoothing to provide probabilistic safety guarantees, mitigating malicious behaviors and hallucination propagation while preserving system performance.
IEEE Transactions on Industrial Cyber-Physical Systems

Hierarchical Testing With Rabbit Optimization for Industrial Cyber-Physical Systems

Jinwei Hu, Zezhi Tang, Xin Jin, Benyuan Zhang, Yi Dong, Xiaowei Huang

  • We propose HERO, a black-box adversarial testing framework that combines hierarchical analysis and optimization to efficiently generate high-quality time-series adversarial examples, enabling robust evaluation of ICPS applications.
Chemical Engineering Journal

Explainable AI models for predicting drop coalescence in microfluidics device

Jinwei Hu, Kewei Zhu, Sibo Cheng, Nina M Kovalchuk, Alfred Soulsby, Mark JH Simmons, Omar K Matar, Rossella Arcucci

  • We investigate droplet coalescence prediction in microfluidic systems using machine learning and explainable AI, revealing key physical factors that govern coalescence while ensuring interpretability in AI predictions.

Further publication details are available on Google Scholar.

💻 Involved Projects

  • Robustifying Generative AI through Human-Centric Integration of Neural and Symbolic Methods
    • Role: Research Associate
    • Funding: EU Horizon
  • CRoCS: Certified Robust and Scalable Autonomous Operation in Cyber Space
    • Role: Research Associate
    • Funding: Alan Turing Institute (AI for Cyber Defence (AICD) Research Centre)

🎖 Honors and Awards

  • 2024.06: ELLIS Manchester Scholarship, the University of Manchester.
  • 2023.12: PhD Full Scholarship, University of Liverpool.
  • 2023.10: MSc with Distinction, Imperial College London.
  • 2022.07: BSc with First Class Honours in Computer Science (Artificial Intelligence), University of Liverpool.
  • 2022.07: BSc with First Class Honours in Information and Computing Science, Xi’an Jiaotong-Liverpool University.

📖 Education

  • 2023.12 – Present: PhD in Computer Science, University of Liverpool, UK
  • 2022.10 – 2023.10: MSc in Applied Computational Science and Engineering, Imperial College London, UK
  • 2020.09 – 2022.07: BSc in Computer Science (Artificial Intelligence), University of Liverpool, UK
  • 2018.09 – 2020.06: BSc in Information and Computing Science, Xi’an Jiaotong-Liverpool University, China

💬 Invited Talks

🤝 Services

Journal Reviewing

  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  • ACM Transactions on Software Engineering and Methodology (TOSEM)
  • IEEE Internet of Things Journal (IoTJ)

Conference Program Committee / Reviewer

  • AAAI 2026
  • AAMAS 2026
  • NeurIPS 2025/2026
  • ICML 2025
  • UKAIRS 2025
  • ECAI 2024