Maximilian Mozes

I'm a Member of Technical Staff at Cohere and a PhD student at University College London supervised by Lewis Griffin (Department of Computer Science) and Bennett Kleinberg (Department of Security and Crime Science). My research focuses on the intersection of adversarial machine learning and natural language processing. I'm a member of the UCL Natural Language Processing research group.

I recently interned at Google Research, working with the PAIR Team on measuring dialog safety using large language models. Prior to that, I was a Research Scientist Intern at Spotify Research, where I focused on NLP-based content moderation in podcasts.

I obtained a Bachelor's degree in Computer Science (minor in Mathematics) from the Technical University of Munich (TUM) in March 2019.

During my undergraduate studies, I worked as a visiting research scholar at the Language and Information Technologies Group of the University of Michigan's Artificial Intelligence Lab and as a research intern in the Department of Psychology at the University of Amsterdam.

Twitter  /  Email  /  GitHub  /  Google Scholar  /  LinkedIn


Research
Towards Agile Text Classifiers for Everyone
Maximilian Mozes, Jessica Hoffmann, Katrin Tomanek, Muhamed Kouate, Nithum Thain, Ann Yuan, Tolga Bolukbasi, Lucas Dixon.
Findings of EMNLP 2023.
paper
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin.
arXiv pre-print, 2023.
paper
Challenges and Applications of Large Language Models
Jean Kaddour, Joshua Harris, Maximilian Mozes, Herbie Bradley, Roberta Raileanu, Robert McHardy.
arXiv pre-print, 2023.
paper
Large Language Models respond to Influence like Humans
Lewis Griffin, Bennett Kleinberg, Maximilian Mozes, Kimberly Mai, Maria Vau, Matthew Caldwell, Augustine Mavor-Parker.
First Workshop on Social Influence in Conversations (SICon), ACL 2023.
paper
Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning
Maximilian Mozes, Tolga Bolukbasi, Ann Yuan, Frederick Liu, Nithum Thain, Lucas Dixon.
arXiv pre-print, 2023.
paper
Identifying Human Strategies for Generating Word-Level Adversarial Examples
Maximilian Mozes, Bennett Kleinberg, Lewis D. Griffin.
Findings of EMNLP 2022.
paper
Textwash -- automated open-source text anonymisation
Bennett Kleinberg, Toby Davies, Maximilian Mozes.
arXiv pre-print, 2022.
paper
A repeated-measures study on emotional responses after a year in the pandemic
Maximilian Mozes, Isabelle van der Vegt, Bennett Kleinberg.
Scientific Reports, 2021.
paper
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes, Martin Schmitt, Vladimir Golkov, Hinrich Schuetze, Daniel Cremers.
Technical report.
paper
Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification
Maximilian Mozes, Max Bartolo, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin.
EMNLP 2021.
paper
No Intruder, no Validity: Evaluation Criteria for Privacy-Preserving Text Anonymization
Maximilian Mozes, Bennett Kleinberg.
arXiv pre-print, 2021.
paper
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples
Maximilian Mozes, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin.
EACL 2021.
paper
The Grievance Dictionary: Understanding Threatening Language Use
Isabelle van der Vegt, Maximilian Mozes, Bennett Kleinberg, Paul Gill.
Behavior Research Methods, 2021.
paper
Measuring Emotions in the COVID-19 Real World Worry Dataset
Bennett Kleinberg, Isabelle van der Vegt, Maximilian Mozes.
NLP COVID-19 Workshop, ACL 2020.
paper
Online Influence, Offline Violence: Linguistic Responses to the 'Unite the Right' Rally
Isabelle van der Vegt, Maximilian Mozes, Paul Gill, Bennett Kleinberg.
Journal of Computational Social Science, 2020.
paper
Uphill from Here: Sentiment Patterns in Videos from Left- and Right-Wing YouTube News Channels
Felix Soldner, Justin Chun-ting Ho, Mykola Makhortykh, Isabelle van der Vegt, Maximilian Mozes, Bennett Kleinberg.
Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science, NAACL-HLT 2019.
paper
Identifying the Sentiment Styles of YouTube's Vloggers
Bennett Kleinberg, Maximilian Mozes, Isabelle van der Vegt.
EMNLP 2018.
paper / dataset
Using Named Entities for Computer-Automated Verbal Deception Detection
Bennett Kleinberg, Maximilian Mozes, Arnoud Arntz, Bruno Verschuere.
The Journal of Forensic Sciences, 63, 3, pp. 714-723, 2017.
paper / code
Web-Based Text Anonymization with Node.js: Introducing NETANOS (Named Entity-Based Text Anonymization for Open Science)
Bennett Kleinberg, Maximilian Mozes.
The Journal of Open Source Software, 2, 14, 2017.
paper / code
NETANOS - Named Entity-Based Text Anonymization for Open Science
Bennett Kleinberg, Maximilian Mozes, Yaloe van der Toolen.
Preprint, 2017.
preprint / code


Media Coverage

LLMs for Evil.
Podcast interview with Data Skeptic, September 2023.
link

Google's Jigsaw was trying to fight toxic speech with AI. Then the AI started talking.
Fast Company, July 2023.
link


Organised workshops

9th Workshop on Representation Learning for NLP (RepL4NLP-2024)
62nd Annual Meeting of ACL, August 2024, Bangkok, Thailand.
website

8th Workshop on Representation Learning for NLP (RepL4NLP-2023)
61st Annual Meeting of ACL, July 2023, Toronto, Canada.
website

7th Workshop on Representation Learning for NLP (RepL4NLP-2022)
60th Annual Meeting of ACL, May 2022, Dublin, Ireland.
website

A gentle introduction to word embeddings for the computational social sciences
Maximilian Mozes and Bennett Kleinberg.
2019 European Symposium on Societal Challenges in Computational Social Science: Polarization and Radicalization, September 2019, Zurich, Switzerland.
website

Linguistic temporal trajectory analysis - a dynamic approach to text data
Bennett Kleinberg, Maximilian Mozes and Isabelle van der Vegt.
2018 European Symposium on Societal Challenges in Computational Social Science: Bias and Discrimination, December 2018, Cologne, Germany.
website

Teaching activities

Teaching assistant: Statistical Natural Language Processing
University College London, Academic year 2022/23.

Teaching assistant: Theory of Computation
University College London, Academic year 2021/22.

Teaching assistant: Introduction to Machine Learning
University College London, Academic year 2021/22.

Teaching assistant: Theory of Computation
University College London, Academic year 2020/21.

Teaching assistant: Introduction to Deep Learning
University College London, Academic year 2020/21.

Tutor: Analysis for Computer Science
Technical University of Munich, Winter term 2018/19.
Organised tutoring sessions for undergraduate students in Informatics/Computer Science.


Thesis supervision

Analysing the Implications of Adversarial Training for the Robustness of Models in NLP
Ziying Cheng, MSc Machine Learning (UCL), 2021.

Frequency based Statistics and Detections against Adversarial Examples
Michail Koupparis, MSc Data Science and Machine Learning (UCL), 2020.

Textual Adversarial Attack Research - Pre-processing, Sequence, Transferability and Defense
Dongdong Chen, MSc Data Science and Machine Learning (UCL), 2020.

Analysing Linguistic Features of Perturbed Emails from an Adversarial Word-level Attack
Vlad Pasca, BSc Security and Crime Science (UCL), 2019.


2023 - London, United Kingdom. Forked from jonbarron_website.