News
Rerankers Aren't Magic Either: When the Cross-Encoder Layer Is Worth the Cost
3+ hour, 24+ min ago (1212+ words) Enterprise Document Intelligence [Vol. 1 #2bis] Why stacking a reranker on top of weak retrieval doesn't save it, what cross-encoders actually fix vs what they don't, and where the editorial position of the series lands. Same setup as the embeddings article. Two…...
Proxy-Pointer RAG: Eliminating Wasteful Entity & Relations Extraction in Knowledge Graphs
5+ hour, 24+ min ago (908+ words) In my previous article on Solving Entity and Relationship Sprawl in Knowledge Graphs, I discussed how Proxy-Pointer architecture can optimize searching for the right entities and relations. That, however, is only the second part of a larger problem in graph ingestion....
Embeddings Aren't Magic: The Predictable Failure Modes of RAG Retrieval
1+ day, 3+ hour ago (1732+ words) Enterprise Document Intelligence [Vol. 1 #2] Why the same vector search that handles synonyms and paraphrase silently fails on negation, exact identifiers, and your company's acronyms, and what to use when it does. Two scenes, both familiar. Scene 1: A RAG system over…...
Baseline Enterprise RAG, From PDF to Highlighted Answer
1+ day, 22+ hour ago (1795+ words) Enterprise Document Intelligence [Vol. 1 #1] The smallest version of RAG that actually works, on a real PDF, with grounded answers and the source lines highlighted. The fastest way to understand what RAG is is to build the smallest version that actually…...
RAG Is Burning Money: I Built a Cost Control Layer to Fix It
2+ day, 1+ hour ago (1743+ words) Most RAG systems optimize for relevance, not cost. I built a production-ready cost control layer combining semantic caching, query routing, and budget enforcement that reduces LLM costs by 85% without sacrificing answer quality. This article shows a full working implementation in…...
Emo Net: Speaker-Aware Transformers for Emotion Recognition, and What I'd Build Differently in 2026
3+ day, 1+ hour ago (1449+ words) A retrospective on my MS thesis, the leaderboard it placed on, and the LLM shift that has reshaped the field since. In March 2024, I submitted my MS thesis on Emotion Recognition in Conversation (ERC). The model, Emo Net, achieved a…...
The Infrastructure Behind Making Local LLM Agents Actually Useful
3+ day, 3+ hour ago (1163+ words) Lessons from building a fast, reliable scientific agent with local open-weight models, vLLM, and long-context infrastructure. To understand this one, it helps to know what happens inside a GPU when it generates a single token. CUDA graphs eliminate this…...
They Requested It. I Built It. Nobody Ever Used It.
4+ day, 6+ hour ago (392+ words) Why good data work gets ignored after delivery. Stakeholders came to us asking for a model. We built a proof of concept. Got the green light. Delivered the model. Weeks of work, all to hear nothing. It's a tale as old…...
Stop Using LLMs Like Giant Problem Solvers
5+ day, 4+ hour ago (1175+ words) How I turned 100 messy PDFs into structured insights by building a deterministic loop around agents. I recently worked on a feature where I had to transform 100 messy compliance PDFs into structured JSON rules. The brute force approach was obvious: give…...
Can AI Write Your Code?
5+ day, 18+ hour ago (1683+ words) What a recent study on ChatGPT, Python, R, and Stata tells us about AI-assisted coding for causal inference. What if the real question is no longer whether AI can write code, but whether we can trust the code it…...