Agent Search Engine

Issue 001 / A living technical almanac


Record / unstract: Platform, Open source, Verified

unstract

LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows

About unstract

Unstract uses LLMs to extract structured JSON from documents — PDFs, images, scans, you name it. Define what you want to extract using natural language prompts, and deploy as an API or ETL pipeline.

Copy the value of ENCRYPTION_KEY from backend/.env or platform-service/.env to a secure location.
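As a rough illustration of that backup step, the sketch below reads ENCRYPTION_KEY out of dotenv-style text. It assumes the two .env files use plain KEY=value lines. The helper names read_env_value and find_encryption_key are hypothetical and are not part of Unstract.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value assigned to `key` in dotenv-style text, or None."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        # Split on the first "=" only, so "=" padding in the value survives.
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            value = value.strip()
            # Remove one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return None


def find_encryption_key(repo_root: Path) -> Optional[str]:
    """Check the two paths named in the README excerpt (assumed relative to repo_root)."""
    for rel in ("backend/.env", "platform-service/.env"):
        path = repo_root / rel
        if path.is_file():
            value = read_env_value(path.read_text(encoding="utf-8"), "ENCRYPTION_KEY")
            if value:
                return value
    return None
```

A caller would pass the checkout directory, for example find_encryption_key(Path(".")), and store the result in a password manager or secrets vault rather than printing or logging it.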

Destinations: Snowflake, Amazon Redshift, Google BigQuery, PostgreSQL, MySQL, MariaDB, SQL Server, Oracle

From the project's README

unstract is an open-source project written primarily in Python, with 6.7k stars on GitHub. It was last updated in July 2026.


unstract vs. the alternatives

All research & data agents

Agent                    Type            Stars  Pricing
unstract (this listing)  Platform        6.7k   Open source
firecrawl                Infrastructure  143k   Open source
Scrapling                Infrastructure  68k    Open source
TrendRadar               Agent           60k    Open source
BettaFish                Agent           42k    Open source
khoj                     Agent           35k    Open source