Edge-Ready Semantic Enrichment via Quantization, Pruning, and Distillation

Authors

  • Javad Rahebi, Department of Computer Engineering, Isfahan University, Isfahan, Iran

Keywords:

On-device NLP, edge AI, quantization, pruning, knowledge distillation, entity linking, approximate nearest neighbors, energy-efficiency

Abstract

Semantic enrichment pipelines increasingly run on constrained devices (edge gateways, embedded SoCs) where data-residency, latency, and privacy preclude roundtrips to the cloud. Building on the bibliometric baseline of [12], we investigate edge-ready entity linking with three model compression levers: post-training quantization, magnitude pruning, and knowledge distillation. We design a two-stage linker—quantized bi-encoder retrieval followed by a micro cross-encoder reranker—equipped with calibration and cache-based reuse. Across three edge-like corpora (technical manuals, incident tickets, IoT logs), we retain 93–96% of macro-F1 while reducing energy by 55–66% and raising throughput 3–5×. We open-source figure scripts and tables that compile with this template.
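The two-stage linker described in the abstract can be illustrated with a minimal sketch: symmetric int8 post-training quantization of entity embeddings for the bi-encoder retrieval stage, followed by a rerank over the top-k candidates. All names, vectors, and the scoring stub below are hypothetical stand-ins, not the paper's actual implementation; a real deployment would use an ANN index and a trained micro cross-encoder instead of the brute-force scan and callback used here.

```python
# Hypothetical sketch of a two-stage entity linker: int8 post-training
# quantization of entity embeddings for bi-encoder retrieval, then a
# (stubbed) cross-encoder rerank over the top-k candidates.

def quantize_int8(vec):
    """Symmetric post-training quantization: floats -> int8 codes plus a scale."""
    scale = max(abs(x) for x in vec) / 127.0 or 1.0
    return [round(x / scale) for x in vec], scale

def approx_dot(q, scale_q, d, scale_d):
    """Approximate dot product recovered from the int8 codes and scales."""
    return scale_q * scale_d * sum(a * b for a, b in zip(q, d))

# Toy entity table: name -> float embedding (stand-ins for encoder output).
entities = {
    "router_ac1200": [0.9, 0.1, 0.0],
    "sensor_th22":   [0.1, 0.8, 0.2],
    "gateway_gx":    [0.7, 0.3, 0.1],
}
index = {name: quantize_int8(v) for name, v in entities.items()}

def link(mention_vec, rerank, k=2):
    q, sq = quantize_int8(mention_vec)
    # Stage 1: quantized bi-encoder retrieval (brute force here; ANN in practice).
    candidates = sorted(index, key=lambda n: -approx_dot(q, sq, *index[n]))[:k]
    # Stage 2: cross-encoder rerank (stub scoring callback stands in for the model).
    return max(candidates, key=rerank)

best = link([0.85, 0.15, 0.05], rerank=len)
print(best)  # → router_ac1200
```

Storing only int8 codes and one scale per vector cuts the index footprint roughly 4x versus float32, which is what makes the retrieval stage fit on edge-class memory budgets.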

Published

2023-08-18

Section

Articles

How to Cite

Edge-Ready Semantic Enrichment via Quantization, Pruning, and Distillation. (2023). International Journal of Industrial Engineering and Construction Management (IJIECM), 1(1), 22-32. https://www.ijiecm.com/index.php/ijiecm/article/view/63
