Probabilistic Calibration and Risk-Sensitive Inference for Semantic Enrichment under Domain Shift
Keywords: calibration, domain shift, uncertainty, entity linking, candidate generation, selective prediction, temperature scaling

Abstract
Semantic enrichment pipelines—candidate generation plus entity linking—often produce overconfident scores that degrade under domain shift. Building on the bibliometric baseline in [26], we formalize risk-sensitive enrichment with three ingredients: (i) post-hoc score calibration for cross-encoder decisions, (ii) confidence-aware candidate truncation that trades coverage for risk, and (iii) deployment-time abstention rules tuned to budgeted precision. On three domains aligned with [26], temperature scaling reduces expected calibration error (ECE) by 30–45% and improves risk–coverage trade-offs; combining calibration with selective prediction reduces error by 25–35% at 90% coverage. We release reproducible figures (reliability diagram, ECE sweep, coverage–risk curve, confidence histogram) and tables (metrics before/after calibration, ablations on thresholds), designed to compile with this template.
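To make the calibration step concrete, the following is a minimal, self-contained sketch (not the paper's implementation) of post-hoc temperature scaling with expected calibration error (ECE). A single scalar temperature T is fitted on held-out logits by minimizing negative log-likelihood via a simple grid search; all function names and the grid-search choice are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ece(probs, labels, n_bins=10):
    # Expected calibration error: confidence-weighted gap between
    # per-bin accuracy and per-bin mean confidence.
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    acc = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            total += mask.mean() * abs(acc[mask].mean() - conf[mask].mean())
    return total

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    # Pick the temperature that minimizes held-out NLL
    # (a grid search stands in for a proper 1-D optimizer).
    def nll(T):
        p = softmax(logits / T)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return min(grid, key=nll)
```

Calibrated probabilities are then `softmax(logits / T)`; an overconfident model yields T > 1, and the risk–coverage and abstention rules in the abstract operate on these rescaled confidences.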