Self-Supervised Contrastive Embeddings for Semantic Enrichment: Contextual Alignment and Entity Linking
Keywords:
contrastive learning, semantic enrichment, entity linking, knowledge graphs, contextual alignment, retrieval, candidate generation

Abstract
Semantic enrichment benefits from representations that respect both textual context and knowledge-graph structure. Building on the bibliometric baseline of the field [1], we propose a self-supervised contrastive framework that learns sentence- and mention-level embeddings aligned across three natural positive signals: (i) co-mention and coreference within documents, (ii) adjacency in a knowledge graph, and (iii) co-citation/co-reference at the article level. Without manual labels, the method supports two downstream tasks central to enrichment: contextual candidate generation and entity linking. On three domains (ontology/linked data, biomedical, and social streams), our approach improves candidate recall@50 by 6–12% and end-to-end linking F1 by 3–6% over strong neural baselines. Ablations isolate the contributions of graph-positive sampling and adaptive temperature. We release scripts that reproduce all figures (loss curves, PR curves, embedding scatter plots) and tables (dataset summary, ablations).
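
To make the training objective concrete, the following is a minimal sketch of a multi-positive InfoNCE loss with a learnable (adaptive) temperature, assuming PyTorch. The function name multi_positive_info_nce and the inputs pos_mask and log_temp are illustrative, not the released scripts; pos_mask stands in for positive pairs drawn from any of the three signals above (co-mention/coreference, knowledge-graph adjacency, co-citation).

    import torch
    import torch.nn.functional as F

    def multi_positive_info_nce(emb: torch.Tensor,
                                pos_mask: torch.Tensor,
                                log_temp: torch.Tensor) -> torch.Tensor:
        """InfoNCE over a batch where each anchor may have several positives."""
        emb = F.normalize(emb, dim=-1)               # unit-norm embeddings
        sim = emb @ emb.t() / log_temp.exp()         # cosine sims / adaptive temperature
        eye = torch.eye(len(emb), dtype=torch.bool, device=emb.device)
        sim = sim.masked_fill(eye, float('-inf'))    # drop self-similarity
        log_prob = sim - sim.logsumexp(dim=-1, keepdim=True)
        pos = pos_mask & ~eye
        # Mean log-likelihood over each anchor's positives, then over anchors.
        per_anchor = log_prob.masked_fill(~pos, 0.0).sum(-1) / pos.sum(-1).clamp(min=1)
        return -per_anchor[pos.any(-1)].mean()

    # Toy usage: 8 random embeddings, one symmetric positive pair.
    emb = torch.randn(8, 128)
    pos_mask = torch.zeros(8, 8, dtype=torch.bool)
    pos_mask[0, 1] = pos_mask[1, 0] = True            # e.g. a co-mention pair
    log_temp = torch.nn.Parameter(torch.tensor(0.0))  # temperature starts at 1.0
    loss = multi_positive_info_nce(emb, pos_mask, log_temp)
    loss.backward()                                   # gradient reaches log_temp
    print(float(loss))

Parameterizing the temperature as exp(log_temp) keeps it positive while letting it adapt during training, which is one plausible reading of the "adaptive temperature" ablated above; the released scripts may implement this differently.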