References

Here you'll find people familiar with my research. To respect their time and ensure proper context, please write to me first; do not reach out to them directly from here if you have not contacted them before.

Joar Skalse, AI Safety Researcher. Joar mentored me and co-authored my work on "Defining Agentic Preferences" at SPAR. He can attest to my mathematical rigor, my ability to construct formal proofs, and my theoretical understanding of reinforcement learning and reward learning dynamics.

Samuel Brown, Technical AI Alignment Researcher. Samuel mentored me and co-authored my work on "Self-led LLM agents" at SPAR. He can speak to my ability to design experiments using Inspect and Petri, my proficiency in behavioral evaluation of LLMs, and my capacity to drive independent research progress in ambiguous settings.

Erin Robertson, Program Lead at LASR. Erin evaluated my research agenda and technical background as part of the London safety community's mentoring and selection processes. She can provide a detailed assessment of my research potential and my fit within the UK AI safety ecosystem.

Bryce Meyer, Core Contributor at TransformerLens and Poseidon Research. I have interacted with Bryce regarding my technical contributions to TransformerLens and my work in mechanistic interpretability. He is familiar with my engineering skills and can verify my ability to work with model internals and representations.

Ben Sams, Research Scientist at AISI. Ben can evaluate the analytical depth of my work and its relevance to the current challenges in evaluating frontier AI systems.

Mikhail Seleznyov, Research Scientist at AIRI. We had in-depth discussions about the fundamental rationale for, and effective approaches to, unlearning methods for AI alignment. Mikhail can attest to my conviction and my ability to articulate and robustly defend my research perspective.