The problem
Aggregators leave organic traffic on the table.
The remote job market has dozens of aggregators. Most of them treat job listings as ephemeral data, hidden behind client-side JavaScript filters and search forms that produce no indexable URLs. The result is a search experience that works for the user who is already on the site, but loses every long-tail query that should have brought the next user in.
A senior backend role at a Series B fintech in Berlin, posted on Greenhouse three days ago, should be discoverable by someone Googling that exact intent. On most aggregators, it is not. The page either does not exist as a unique URL, or it does but is rendered client-side and never indexed.
The opportunity was straightforward. Build the aggregator that wins on organic search by treating every listing as a first-class indexable page.
Constraints
The shape of the box.
Self-funded, single engineer, infrastructure budget under one hundred dollars per month at scale. That ruled out most of the obvious approaches. No Vercel pricing tier that absorbs 100,000 ISR pages cheaply. No Algolia tier that absorbs the search cost. No managed Postgres that scales without surprise bills.
It also ruled out a flat headless setup. Every additional API hop costs latency the search engine will hold against us. The architecture had to be opinionated.
The deadline was implicit. Indexed listings compound. The longer the delay between launch and the first 10,000 indexed pages, the longer the curve takes to bend. Speed of execution was the dominant constraint.
Architecture decisions
The shape of the system.
Nuxt 3 with full SSR, not ISR or SSG. Each job listing is a server-rendered route with its own URL, its own meta tags, its own JSON-LD JobPosting schema. Google reads it as a self-contained document. Page generation happens on first request and is cached at the CDN layer.
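As a rough illustration of the per-listing structured data, the sketch below builds a schema.org JobPosting object and serializes it for a script tag. The Listing interface and both function names are assumptions made for this example, not the production code.

```typescript
// Hypothetical listing shape; field names are assumptions for illustration.
interface Listing {
  title: string;
  descriptionHtml: string;
  employerName: string;
  datePosted: string; // ISO 8601 date
  validThrough?: string; // ISO 8601 date, when the source provides one
  city?: string;
  country?: string; // ISO 3166-1 alpha-2 code
  remote: boolean;
}

// Map a listing onto schema.org JobPosting properties.
function toJobPostingJsonLd(listing: Listing): Record<string, unknown> {
  const doc: Record<string, unknown> = {
    "@context": "https://schema.org",
    "@type": "JobPosting",
    title: listing.title,
    description: listing.descriptionHtml,
    datePosted: listing.datePosted,
    hiringOrganization: {
      "@type": "Organization",
      name: listing.employerName,
    },
  };
  if (listing.validThrough) doc["validThrough"] = listing.validThrough;
  if (listing.remote) doc["jobLocationType"] = "TELECOMMUTE";
  if (listing.city || listing.country) {
    doc["jobLocation"] = {
      "@type": "Place",
      address: {
        "@type": "PostalAddress",
        addressLocality: listing.city,
        addressCountry: listing.country,
      },
    };
  }
  return doc;
}

// Escape "<" so listing text can never close the surrounding script tag.
function renderJsonLdScript(doc: Record<string, unknown>): string {
  const json = JSON.stringify(doc).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

Because the markup is produced during the server render, it is already in the HTML the crawler receives, with no client-side step in between.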
Postgres as the system of record. Listings, sources, employers, locations, and the de-duplication graph all live in Postgres. Aggressive indexing on the columns the search facets actually use. Materialized views for the home and category pages.
Meilisearch for the user-facing search box. Postgres handles the "what listings exist" question. Meilisearch handles "what listings does this user want to see right now." Index size kept under 500 MB by storing only the searchable fields, not the full record.
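One way to keep the index small is to project each Postgres record down to its searchable fields before it is sent to Meilisearch. The sketch below shows that projection only. The record shape and field names are assumptions for illustration.

```typescript
// Hypothetical full record as stored in Postgres (assumed field names).
interface ListingRecord {
  id: number;
  title: string;
  employerName: string;
  locationLabel: string;
  tags: string[];
  descriptionHtml: string; // large, not needed for the search box
  rawSourcePayload: unknown; // large, never searched
  scrapedAt: string;
}

// The slim document that would be indexed for search.
interface SearchDocument {
  id: number;
  title: string;
  employer: string;
  location: string;
  tags: string[];
}

// Keep only what the search box matches on; everything else stays in Postgres.
function toSearchDocument(record: ListingRecord): SearchDocument {
  return {
    id: record.id,
    title: record.title,
    employer: record.employerName,
    location: record.locationLabel,
    tags: record.tags,
  };
}
```

A search hit then carries just enough to render a result row, and the full listing is fetched from Postgres by id when the page is rendered.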
Playwright as the scraping layer. Some sources publish clean APIs (Greenhouse, Lever, Ashby). Others (Workable variants, custom careers pages) require headless browsing. Playwright handles both behind a single normalized output schema.
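A minimal sketch of what a single normalized output schema could look like, assuming two kinds of raw input: a structured API payload and a set of fields extracted from a rendered page. The payload shapes here are deliberately simplified and hypothetical; they do not reproduce any specific ATS API.

```typescript
// The one shape every source must be reduced to (assumed field names).
interface NormalizedListing {
  source: string;
  sourceId: string;
  title: string;
  employer: string;
  location: string | null;
  url: string;
}

// Simplified, hypothetical raw inputs for the two source families.
type RawListing =
  | {
      kind: "api";
      source: string;
      payload: {
        id: number | string;
        title: string;
        company: string;
        location?: { name: string };
        url: string;
      };
    }
  | {
      kind: "scraped";
      source: string;
      pageUrl: string;
      fields: { heading: string; company: string; locationText?: string };
    };

function collapseWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

function normalize(raw: RawListing): NormalizedListing {
  if (raw.kind === "api") {
    const loc = raw.payload.location;
    return {
      source: raw.source,
      sourceId: String(raw.payload.id),
      title: collapseWhitespace(raw.payload.title),
      employer: collapseWhitespace(raw.payload.company),
      location: loc ? collapseWhitespace(loc.name) : null,
      url: raw.payload.url,
    };
  }
  // Scraped pages have no stable id, so this sketch uses the page URL as one.
  const locText = raw.fields.locationText;
  return {
    source: raw.source,
    sourceId: raw.pageUrl,
    title: collapseWhitespace(raw.fields.heading),
    employer: collapseWhitespace(raw.fields.company),
    location: locText ? collapseWhitespace(locText) : null,
    url: raw.pageUrl,
  };
}
```

Everything downstream of normalize (storage, de-duplication, rendering) only ever sees NormalizedListing, regardless of how the listing was fetched.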
BullMQ for orchestration. Scrape jobs, normalization jobs, embedding jobs, social distribution jobs. Each runs on its own queue with its own retry policy. Failures are loud and visible in a single dashboard.
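The per-queue retry policies can be sketched as plain data. The queue names and every number below are illustrative assumptions, not the production values. The attempts and backoff fields follow the shape of BullMQ job options, so an object like this could be passed as defaultJobOptions when a Queue is constructed. The retryDelayMs helper is our own illustration of a fixed or doubling schedule, not a BullMQ function.

```typescript
type QueueName = "scrape" | "normalize" | "embed" | "distribute";

interface RetryPolicy {
  attempts: number;
  backoff: { type: "fixed" | "exponential"; delay: number }; // delay in ms
}

// Illustrative values only: flaky network work retries more, and more patiently.
const retryPolicies: Record<QueueName, RetryPolicy> = {
  scrape: { attempts: 5, backoff: { type: "exponential", delay: 30000 } },
  normalize: { attempts: 3, backoff: { type: "fixed", delay: 5000 } },
  embed: { attempts: 3, backoff: { type: "exponential", delay: 10000 } },
  distribute: { attempts: 2, backoff: { type: "fixed", delay: 60000 } },
};

// Wait before retry number `attempt` (1-based): constant for "fixed",
// doubling on each further attempt for "exponential".
function retryDelayMs(policy: RetryPolicy, attempt: number): number {
  if (policy.backoff.type === "fixed") return policy.backoff.delay;
  return policy.backoff.delay * Math.pow(2, attempt - 1);
}
```

Keeping the policies in one table makes it easy to see at a glance which kinds of work are allowed to fail quietly a few times and which should surface on the dashboard quickly.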
Self-hosted on Coolify on a Contabo VPS. This is the choice that keeps the infrastructure budget defensible at scale. Predictable monthly cost, no per-request surprise, full control over the database. Backups go to S3-compatible storage.
The full stack runs on a single VPS in production. It serves the entire 100,000-listing catalog with p95 page render times under 200 ms.
What we shipped
The visible surface.
A search experience that holds up against the source ATS platforms directly. Filters that produce indexable URLs (every facet combination is its own page). A daily content distribution pipeline that posts curated listings to LinkedIn, X, and a public RSS feed twelve times a day, automatically. A scholarship vertical, in prototype, that extends the same model to a different category of opportunity.
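For every facet combination to be its own page, the same set of filters has to map to the same URL no matter which order the user applied them in. A minimal sketch of that canonicalization follows. The facet names, the /jobs prefix, and the segment format are assumptions for illustration.

```typescript
// Hypothetical facet set; keys and URL layout are assumptions.
const FACET_ORDER = ["role", "seniority", "location", "stack"] as const;
type FacetKey = (typeof FACET_ORDER)[number];
type Facets = Partial<Record<FacetKey, string>>;

// Lowercase, strip diacritics, and reduce everything else to hyphens.
function slugify(value: string): string {
  return value
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Emit segments in a fixed order so one facet combination has exactly one URL.
function facetPath(facets: Facets): string {
  const segments: string[] = [];
  for (const key of FACET_ORDER) {
    const value = facets[key];
    if (!value) continue;
    const slug = slugify(value);
    if (slug) segments.push(`${key}-${slug}`);
  }
  return segments.length > 0 ? `/jobs/${segments.join("/")}` : "/jobs";
}

// Both calls describe the same filter set and yield the same path.
const a = facetPath({ location: "Berlin", role: "Backend Engineer" });
const b = facetPath({ role: "Backend Engineer", location: "Berlin" });
console.log(a === b, a); // true "/jobs/role-backend-engineer/location-berlin"
```

The fixed ordering is what prevents the same listing set from appearing under several competing URLs.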
Behind the surface: a normalization layer that handles the 80% of edge cases where two job postings on different ATS platforms are the same role. A daily de-duplication sweep that prunes expired listings without losing the SEO equity. A structured-data audit that runs in CI to prevent malformed JSON-LD from shipping.
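To illustrate the simplest layer of cross-platform matching, the sketch below derives a comparison key from employer, title, and location so that trivially different spellings of one role collide. The suffix and abbreviation lists are illustrative assumptions. Exact-key matching like this covers only the easy cases and is not a description of the full de-duplication graph.

```typescript
interface DedupInput {
  title: string;
  employer: string;
  location: string | null;
}

// Lowercase and reduce punctuation and whitespace runs to single spaces.
function canon(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Build a key that ignores case, punctuation, common legal suffixes,
// and one common title abbreviation (illustrative lists, not exhaustive).
function dedupKey(listing: DedupInput): string {
  const employer = canon(listing.employer)
    .replace(/\b(inc|llc|ltd|gmbh|corp)\b/g, "")
    .replace(/\s+/g, " ")
    .trim();
  const title = canon(listing.title).replace(/\bsr\b/g, "senior");
  const location = listing.location ? canon(listing.location) : "anywhere";
  return [employer, title, location].join("|");
}

// Group listings by key; any group with more than one member is a candidate duplicate set.
function groupByDedupKey<T extends DedupInput>(listings: T[]): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const listing of listings) {
    const key = dedupKey(listing);
    const bucket = groups.get(key);
    if (bucket) bucket.push(listing);
    else groups.set(key, [listing]);
  }
  return groups;
}
```

Candidate groups produced this way would still need a second pass (for example comparing descriptions or posting dates) before two listings are merged.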
The system has run continuously since launch with one infrastructure incident, resolved in under an hour, root cause documented. Cost stays inside the original budget envelope.
What we would do differently
The honest retrospective.
Choosing Nuxt 3 over Next.js was a coin flip and the coin landed fine. If we were starting today, with React Server Components stable in the App Router, the choice would lean Next.js. The Nuxt ecosystem is good but smaller, and some of the integrations (especially around analytics and observability) needed more glue than they should have.
The scraping layer was over-built initially. We started with a generic abstraction that handled every ATS through a single interface. In practice, three of the six sources need bespoke handling that fights the abstraction. The next version collapses the generic layer and accepts more code duplication for clearer semantics.
Meilisearch was the right call but should have been wired in from week one, not week six. The two-system search architecture (Postgres for facets, Meilisearch for free-text) is good. Bolting it on later required a migration that could have been avoided.
Built to be found, not to be pitched.
