Steer LLM behavior by adding vectors to the residual stream (and learn when it breaks)
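The core trick fits in a few lines: pick a layer, and add a fixed direction to the residual stream as activations flow through it. A toy numpy sketch, with `steered_forward` and the tanh "blocks" standing in for a real transformer (illustrative names, not any model's actual code):

```python
import numpy as np

def steered_forward(x, layers, steering_vec, steer_layer, alpha=4.0):
    """Toy residual-stream forward pass: each 'block' adds its output
    back into the stream; at one chosen layer we also add a fixed
    steering direction, scaled by alpha."""
    for i, W in enumerate(layers):
        x = x + np.tanh(x @ W)            # residual update (stand-in for a block)
        if i == steer_layer:
            x = x + alpha * steering_vec  # inject the steering direction
    return x
```

The scale `alpha` is typically the knob that breaks things: push it too high and the stream drifts outside the distribution the downstream layers were trained on.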
Stabilize RL fine-tuning of LLMs with smooth, temperature-controlled gating instead of hard clipping
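The idea, roughly: instead of PPO's hard `clip(ratio, 1-eps, 1+eps)`, blend the raw and clipped importance ratios through a sigmoid gate whose temperature controls how sharply the trust region cuts in. A hedged sketch; the function name and this particular blend are my own illustration, not a specific paper's objective:

```python
import numpy as np

def smooth_ppo_term(ratio, advantage, eps=0.2, tau=0.05):
    """PPO-style surrogate term with a smooth, temperature-controlled gate
    in place of the hard clip. As tau -> 0 the gate becomes a step function
    and hard clipping is recovered."""
    # gate ~ 1 inside the trust region |ratio - 1| < eps, ~ 0 outside
    gate = 1.0 / (1.0 + np.exp((np.abs(ratio - 1.0) - eps) / tau))
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    # smooth blend of raw and clipped ratios: gradients fade instead of cutting off
    return (gate * ratio + (1.0 - gate) * clipped) * advantage
```

The payoff is a surrogate that is differentiable everywhere, so gradient magnitudes change continuously near the trust-region boundary rather than jumping to zero.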
Train LLMs from human preferences without a reward model
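This is the territory of DPO-style objectives: the policy's log-prob margin over a frozen reference model acts as an implicit reward, so no separate reward model is trained. A minimal sketch of the per-pair loss, assuming sequence log-probs are already computed (the function name and argument names are illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO-style pairwise loss: maximize the sigmoid of the policy's
    implicit-reward margin (log-prob gain over a frozen reference) between
    the preferred and dispreferred completion."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid(beta * margin)
```

When the policy matches the reference the margin is zero and the loss sits at `log 2`; widening the margin in favor of the chosen completion drives it toward zero.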
Position encoding via rotation (RoPE) – the method behind LLaMA, Qwen, and Mistral
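Rotary position embeddings rotate each (even, odd) feature pair of a query or key by an angle proportional to its position, at a per-pair frequency, so attention scores end up depending only on relative offsets. A small numpy sketch (`rotate_pairs` is an illustrative name):

```python
import numpy as np

def rotate_pairs(x, pos, base=10000.0):
    """Rotary position encoding: rotate each (even, odd) feature pair of x
    by pos * freq, with a geometrically spaced frequency per pair."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    theta = pos * freqs
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin        # 2D rotation applied pairwise
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

Because rotations compose, `rotate_pairs(q, m) @ rotate_pairs(k, n)` depends only on `m - n` (and norms are preserved), which is what makes this a relative position encoding in disguise.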