Tobia Poppi

I'm a PhD candidate in the National PhD in Artificial Intelligence at the University of Pisa, conducting my research at AImageLab, the research laboratory of the Department of Engineering "Enzo Ferrari" at the University of Modena and Reggio Emilia. I was an Applied Scientist Intern at Amazon Prime Video in Seattle and was selected as the inaugural EMEA recipient of an Amazon post-internship research fellowship on AI Safety. My specialization areas include AI Safety, Responsible AI, and Trustworthy AI.

Email  /  CV  /  Scholar  /  GitHub

Research

My research focuses on advancing Generative AI and Multimodal Architectures, aiming to bridge the gap between cutting-edge deep learning technologies and ethical alignment with human values.

I work on safety alignment pipelines, synthetic data workflows, and representation steering methods for vision-language and generative models. My work includes a CVPR 2025 Highlight paper on safety-aware vision-language models. Some papers are highlighted.

Do Models Share Safety Representations? Cross-Model Steering for Safe Visual Generation
Tobia Poppi, Silvia Cappelletti, Sara Sarto, Florian Schiffers, Garin Kessler, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
arXiv, 2026
project page / arXiv / code

We study whether safety directions learned in one language model can transfer across representation spaces to steer heterogeneous visual generators, reducing unsafe generations through benign-only alignment while preserving generation utility.

CounterVid: Counterfactual Video Generation for Mitigating Action and Temporal Hallucinations in Video-Language Models
Tobia Poppi, Burak Uzkent, Amanmeet Garg, Lucas Porto, Garin Kessler, Yezhou Yang, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara, Florian Schiffers
arXiv, 2026
arXiv

CounterVid is a framework that combines diffusion models for image editing and video generation with the reasoning capabilities of video-language models to generate counterfactual videos that mitigate hallucinations. It was developed during an internship at Amazon Prime Video.

Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack
Silvia Cappelletti, Tobia Poppi, Samuele Poppi, Zheng-Xin Yong, Diego Garcia-Olano, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ICPR, 2026
arXiv

We propose the prefilling attack, a structured natural-language prefix prepended to the model output, to steer the model to respond with a clean, valid option in multiple-choice question answering tasks, significantly improving accuracy and calibration.

Hyperbolic Safety-Aware Vision-Language Models
Tobia Poppi, Tejaswi Kasarla, Pascal Mettes, Lorenzo Baraldi, Rita Cucchiara
CVPR, 2025   (Highlight)
project page / arXiv / code / model

HySAC, Hyperbolic Safety-Aware CLIP, models hierarchical safety relations to enable effective retrieval of unsafe content, dynamically redirecting it to safer alternatives for enhanced content moderation.

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV, 2024
project page / bibtex / arXiv / code / models

Safe-CLIP is a vision-and-language model designed to mitigate the risks associated with NSFW content in AI applications. It is fine-tuned to sever the association between unsafe linguistic and visual concepts, ensuring safer outputs in text-to-image and image-to-text retrieval and generation tasks.

Generative Data Augmentation for ArUco-Free RGB-Based 6-DoF Object Pose Estimation
Carmelo Scribano, Iacopo Ferrari, Giorgia Franchini, Elena Govi, Davide Sapienza, Tobia Poppi, Micaela Verucchi, Marko Bertogna
Journal of Imaging, 2026
paper / doi

A study of shortcut bias induced by ArUco markers in RGB-based 6-DoF object pose estimation, proposing a generative data augmentation pipeline that removes markers and synthesizes realistic backgrounds to improve robustness in ArUco-free settings.

Towards Trustworthy AI: LLM Aligning for Offensive Content Removal
Tobia Poppi, Samuele Poppi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
2023
morethesis

A study on aligning Large Language Models (LLMs) for offensive content removal, introducing the CAiA dataset and Safe-CLIP to ensure ethical and responsible AI-driven content moderation.

Deep Learning for DeepFake Detection
Tobia Poppi, Elena Govi, Fabio Marinelli
2022
pdf

A deep learning pipeline for DeepFake detection, leveraging custom datasets and an optimized Xception-based CNN to identify manipulated facial images with high accuracy.

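Il labelling: i migliori tool disponibili e la loro applicazione per il rilevamento di persone e cartelli stradali e per la pose estimation di oggetti 3D (Labelling: The Best Available Tools and Their Application to Person and Traffic Sign Detection and 3D Object Pose Estimation)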
Tobia Poppi
2021
arXiv

A comprehensive study of data labeling tools and techniques, evaluating their efficiency in person and traffic sign detection, as well as 3D pose estimation for robotic applications.

Participation in National and European Projects

ELIAS - European Lighthouse of AI for Sustainability

ELIAS aims at establishing Europe as a leader in Artificial Intelligence (AI) research that drives sustainable innovation and economic development.

FAIR - Future Artificial Intelligence Research

The FAIR project is a national-scale, multidisciplinary initiative aimed at reimagining and developing large-scale foundational models. It explores research questions, methodologies, models, and technologies, as well as ethical and legal frameworks, for creating Artificial Intelligence systems capable of interacting and collaborating with humans.

ELSA - European Lighthouse on Secure and Safe AI

ELSA is a virtual center of excellence that will spearhead efforts in foundational safe and secure artificial intelligence (AI) methodology research.

Teaching

  • Lecturer, Scalable AI and AI Safety for Generative AI, Artificial Intelligence Engineering MSc, University of Modena and Reggio Emilia 2026
  • Lecturer, Deep Learning and Computer Vision for the Smart Factory Master's program, Experis s.r.l. 2025

Selected Activities as Reviewer

  • ACL Rolling Review (ARR) January 2026
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2025
  • Computer Vision and Image Understanding (CVIU) - Journal 2024
  • Pattern Recognition (PR) - Journal 2024
  • European Conference on Computer Vision (ECCV) 2024
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2024
