Information Theory & Methodology
Code as evidence, identity encoding, state machine compression, and signal-guided learning through dialogue.
Code as Evidence: An Information-Theoretic Framework for Automated Professional Capability Assessment
The fundamental problem of professional labor markets is one of information asymmetry: candidates possess capabilities that are imperfectly observable by employers, while job requirements are expressed in natural language that maps imprecisely to actual skill demands. This paper presents a rigorous mathematical framework for capability assessment grounded in information theory, treating code repositories as high-fidelity signals of developer competence. We formalize the matching problem as a noisy communication channel, where traditional self-reported credentials exhibit high entropy (approaching maximum disorder), while code artifacts provide lower-entropy, higher-mutual-information signals about underlying capabilities.
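The entropy/mutual-information contrast above can be made concrete with a small sketch. The joint distributions below are purely illustrative (they are not data from the paper): a self-reported credential is modeled as nearly independent of true capability, while a code-artifact signal tracks it closely.

```python
import math

def entropy(dist):
    """Shannon entropy H(X) = -sum_x p(x) log2 p(x)."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
    for a joint distribution given as a row-major table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Hypothetical p(signal, capability) tables; rows index the signal value,
# columns the underlying capability level. Numbers are invented for illustration.
self_reported = [[0.15, 0.15], [0.35, 0.35]]  # signal ~independent of capability
code_artifact = [[0.45, 0.05], [0.05, 0.45]]  # signal tracks capability

print(round(mutual_information(self_reported), 3))  # → 0.0 bits
print(round(mutual_information(code_artifact), 3))  # → 0.531 bits
```

In channel terms, the credential signal carries essentially zero information about capability, while the code-artifact signal recovers about half a bit per observation from a one-bit capability variable.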
Encoding & Compression: Synthesized Reference
Identity encoding transfers experiential knowledge into an LLM's context window. The key insight: **encoding IS training data, not representation**. An association format, pairing each trigger with its correction, activates the model's correction machinery; a lineage/sequential format fails to do so.
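The format distinction can be sketched as follows. The two builders, the field names, and the example lessons are all hypothetical illustrations of the claim, not the reference's actual encoding scheme.

```python
def sequential_format(events):
    """Lineage/sequential encoding: a chronological narrative of lessons.
    Per the claim above, this format fails to activate correction machinery."""
    return "\n".join(f"{i + 1}. {e['lesson']}" for i, e in enumerate(events))

def association_format(events):
    """Association encoding: each lesson is bound to the trigger it corrects,
    so the model can pattern-match situation -> correction at inference time."""
    return "\n".join(f"WHEN {e['trigger']} -> {e['lesson']}" for e in events)

# Hypothetical experiential events.
events = [
    {"trigger": "a test passes on the first try",
     "lesson": "re-run it against fresh state before trusting it"},
    {"trigger": "a requirement seems obvious",
     "lesson": "restate it back before implementing"},
]

print(sequential_format(events))
print(association_format(events))
```

The sequential version reads as history; the association version reads as a lookup table of condition-action pairs, which is the structure the reference claims the model can actually use.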
Socratic Fine-Tuning: Learning Through Dialogue and Internal State Measurement
We present Socratic Fine-Tuning, a novel approach to training large language models that inverts the traditional information flow of machine learning. Rather than feeding models input-output pairs and computing loss on prediction accuracy, we position the model as an active generator while a teacher system observes internal activation states to weight learning updates. We formalize this through six neurotransmitter-analog signals extracted from model activations during generation: dopamine (insight/reward), GABA (inhibition), norepinephrine (focus), acetylcholine (learning/attention), serotonin (stability), and glutamate (activation).
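A minimal sketch of the signal-extraction-and-weighting loop is below. The specific formulas mapping activation statistics to each neurotransmitter-analog scalar, and the composite gain, are assumptions invented for illustration; the abstract does not specify them.

```python
import numpy as np

def signal_proxies(activations, attention):
    """Map raw activation statistics to the six neurotransmitter-analog
    scalars. Each formula here is an illustrative stand-in, not the
    paper's definition."""
    att = attention / attention.sum()
    att_entropy = -(att * np.log(att + 1e-12)).sum() / np.log(len(att))
    return {
        "glutamate": float(np.abs(activations).mean()),        # overall activation
        "gaba": float((activations < 0).mean()),               # inhibition share
        "serotonin": float(1.0 / (1.0 + activations.std())),   # stability
        "norepinephrine": float(1.0 - att_entropy),            # low attn entropy = focus
        "acetylcholine": float(np.abs(np.diff(activations)).mean()),  # change rate
        "dopamine": float(max(0.0, activations.max() - 1.0)),  # spike above baseline
    }

def weighted_update(grad, signals):
    """Scale a gradient step by a composite teacher signal: rewarded
    (dopamine) and focused (norepinephrine) generations are amplified,
    inhibited (GABA) ones are damped."""
    gain = (1.0 + signals["dopamine"]) * (0.5 + signals["norepinephrine"]) \
        * (1.0 - 0.5 * signals["gaba"])
    return gain * grad

rng = np.random.default_rng(0)
acts = rng.normal(size=64)   # stand-in for a layer's activations
attn = rng.random(8)         # stand-in for one attention row
sig = signal_proxies(acts, attn)
step = weighted_update(rng.normal(size=4), sig)
```

The essential inversion is visible even in this toy: the loss is not computed on a target output; instead, the teacher observes the generator's internal state and decides how strongly the update counts.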