News
Ertas AI - Fine-tune AI models fast, no ML expertise needed
9+ hour, 9+ min ago (12+ words) Prodigy + Docling + Custom Scripts: A Real Enterprise Stack Audit - Ertas AI...
Migration Guide: From Fragmented Data Tools to a Unified Pipeline
9+ hour, 8+ min ago (934+ words) You're running Docling for parsing, Label Studio for annotation, Cleanlab for quality, and custom scripts for export. Here's how to consolidate into a single platform without losing your existing work. The typical enterprise AI data preparation stack in 2026 looks like…
From PDF Archives to AI Training Data: What the Journey Actually Looks Like
9+ hour, 7+ min ago (495+ words) A practical walkthrough of the full journey from a folder of enterprise PDFs to usable AI training data, covering ingestion, cleaning, labeling, augmentation, and export. You have 50,000 PDFs in a folder. Maybe it's contracts. Maybe it's medical records. Maybe it's…
From Documents to Agent Knowledge Bases: The Complete Data Pipeline
56+ min ago (1597+ words) Enterprise AI agents are only as good as their knowledge base. Here's the end-to-end pipeline for converting unstructured documents into structured, agent-ready knowledge, from PDF ingestion to retrieval-optimized chunks. Enterprise AI agents fail for a predictable reason: the knowledge base…
AI Models for Fine-Tuning
9+ hour, 14+ min ago (515+ words) Open-source models you can fine-tune with Ertas. Code Llama: Meta's specialized code generation model family built on Llama 2, available in 7B, 13B, 34B, and 70B sizes with variants optimized for code completion, instruction following, and Python development. Command R: Cohere's enterprise-focused model family in 35B and 104B sizes, purpose-built for…