Euterria

Engineering

How Euterria works under the hood.

Euterria isn't a directory of profiles. It's a system for turning the documents a community already has into a typed, reviewable, searchable graph, then ranking across that graph using structured metadata, document-level evidence, and semantic relevance. Here's how that works, end to end.

By the Engineering Team · ~9 min read

Start with the shape of the problem

Almost everything a climate community knows is already written down somewhere: in grant reports, program one-pagers, decks, and the occasional sixty-page toolkit. The trouble is that this knowledge is scattered across files and inboxes, and it is deeply relational. A program belongs to an organization. A person has skills and sits on a coalition. A report references three partners and a funding source.

Keyword search over a folder of PDFs misses all of that. It can match a filename; it can't tell you who else is working on urban heat, or which toolkit a buried paragraph belongs to. Euterria's job is to give that knowledge a structure first, and then make the entire structure searchable as a single thing.

A directory tells you who exists. Euterria is built to tell you who can help.

A typed graph, rather than a pile of docs

The foundation is a small set of connected entity types. Organizations, people, programs, resources, networks, and partnerships each have their own schema of named fields that describe what they are and how they relate to everything around them.

Organizations · People · Programs · Resources · Networks · Partnerships
Six entity types, connected. Most real questions are really questions about the edges between them.

Long documents don't just sit on a single record. They are split into sections, each given a short, grounded overview, and every entity carries a vector embedding so it can be found by meaning rather than exact wording. The graph, not the file, is the unit Euterria reasons about.
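A minimal sketch of what a typed graph can look like in code. The type names, field names, and the `neighbors` helper are illustrative assumptions, not Euterria's actual schema, and the relations in the demo data are invented for the example.

```typescript
// Hypothetical types: names and fields are illustrative, not Euterria's real schema.
type EntityType =
  | "organization" | "person" | "program"
  | "resource" | "network" | "partnership";

interface Entity {
  id: string;
  type: EntityType;
  name: string;
  embedding?: number[]; // vector used to find the entity by meaning
}

interface Edge {
  from: string;     // entity id
  to: string;       // entity id
  relation: string; // e.g. "runs" or "key_contact_for" (assumed labels)
}

interface Graph {
  entities: Entity[];
  edges: Edge[];
}

// Return every entity directly connected to `id`, in either direction.
function neighbors(graph: Graph, id: string): Entity[] {
  const linkedIds = graph.edges
    .filter((edge) => edge.from === id || edge.to === id)
    .map((edge) => (edge.from === id ? edge.to : edge.from));
  return graph.entities.filter((entity) => linkedIds.indexOf(entity.id) >= 0);
}

// Demo data: the relations between these records are assumed for illustration.
const demoGraph: Graph = {
  entities: [
    { id: "org1", type: "organization", name: "GreenLine Collaborative" },
    { id: "prog1", type: "program", name: "Urban Canopy Equity Program" },
    { id: "per1", type: "person", name: "Dr. Maya Okonkwo" },
  ],
  edges: [
    { from: "org1", to: "prog1", relation: "runs" },
    { from: "per1", to: "prog1", relation: "key_contact_for" },
  ],
};
```

Asking for `neighbors(demoGraph, "prog1")` returns the organization and the person attached to the program, which is the kind of edge question a flat folder of files cannot answer.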

Getting knowledge in

Two front doors lead into the graph. Upload a file and it is parsed into clean markdown. Point Euterria at a website and it maps the site and scrapes the pages that matter. Either way, a model then reads the content and extracts it into strict, named fields, never a free-form paragraph.

GreenLine_2025_Report.pdf, extracted into a structured record:

Title: Urban Canopy Equity Program
Type: Program
Issue areas: Urban heat, Tree equity, Air quality
Key contact: Dr. Maya Okonkwo
Timeframe: 2024 – 2026
A grant report becomes a structured record: title, type, issue areas, contacts, and dates, each one a field you can review.

The two sources fill in different shapes. A document yields a title, description, type, issue areas, geographic scope, key contacts, contributors, languages, sections, a grounded overview, and any program it describes. A website yields an organization's mission, constituency, contact details, programs, staff, partners, and networks, each tied back to the page it came from.

The structure layer. Because every extraction is constrained to a schema, the output is a reviewable record with named fields. That constraint is what makes the knowledge auditable going in and searchable coming out.
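To make the schema constraint concrete, here is a minimal sketch of a validator that accepts model output only when every named field is present with the right type. The `DraftRecord` shape and the `parseDraft` function are assumptions made for illustration; the real schemas carry more fields than this.

```typescript
// Hypothetical draft shape; the real extraction schema is larger.
interface DraftRecord {
  title: string;
  type: string;
  issueAreas: string[];
  keyContact: string;
  timeframe: string;
}

interface ParseResult {
  record: DraftRecord | null; // null when validation fails
  errors: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Accept model output only if it is an object with every named field typed correctly.
function parseDraft(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { record: null, errors: ["model output is not an object"] };
  }
  const fields = raw as { [key: string]: unknown };
  const errors: string[] = [];
  for (const name of ["title", "type", "keyContact", "timeframe"]) {
    if (typeof fields[name] !== "string") errors.push(name + " must be a string");
  }
  if (!isStringArray(fields["issueAreas"])) {
    errors.push("issueAreas must be a list of strings");
  }
  if (errors.length > 0) return { record: null, errors: errors };
  return {
    record: {
      title: fields["title"] as string,
      type: fields["type"] as string,
      issueAreas: fields["issueAreas"] as string[],
      keyContact: fields["keyContact"] as string,
      timeframe: fields["timeframe"] as string,
    },
    errors: [],
  };
}
```

A free-form paragraph fails immediately, and a partially filled object comes back with one error per missing field, which gives a reviewer something specific to correct.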

Nothing publishes itself

Extraction produces a draft. You review and edit it, correcting fields, linking or creating the programs it mentions, and confirming contacts. Only when you publish does Euterria write the final records and their relationships, and trigger the embedding and index sync that makes them findable.

A record can't go live until it passes

Has a title · Has a description · Has an overview · Publishable type · File or external link · Valid program choice
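The publish gate can be sketched as a list of named checks. Only the six check names come from this page; the `Draft` shape, the list of publishable types, and the reading of "valid program choice" (no program, or a program that already exists) are assumptions.

```typescript
// Hypothetical draft shape for a resource awaiting review.
interface Draft {
  title: string;
  description: string;
  overview: string;
  type: string;
  fileUrl?: string;
  externalLink?: string;
  programId?: string; // optional link to an existing program
}

// Assumed values; the real set of publishable types is not listed in the article.
const PUBLISHABLE_TYPES = ["report", "toolkit", "one-pager", "deck"];

function hasText(value: string | undefined): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// Returns the names of the failed checks; an empty list means the draft may publish.
function failedChecks(draft: Draft, knownProgramIds: string[]): string[] {
  const checks: Array<[string, boolean]> = [
    ["Has a title", hasText(draft.title)],
    ["Has a description", hasText(draft.description)],
    ["Has an overview", hasText(draft.overview)],
    ["Publishable type", PUBLISHABLE_TYPES.indexOf(draft.type) >= 0],
    ["File or external link", hasText(draft.fileUrl) || hasText(draft.externalLink)],
    [
      "Valid program choice",
      draft.programId === undefined || knownProgramIds.indexOf(draft.programId) >= 0,
    ],
  ];
  return checks.filter((check) => !check[1]).map((check) => check[0]);
}

function canPublish(draft: Draft, knownProgramIds: string[]): boolean {
  return failedChecks(draft, knownProgramIds).length === 0;
}
```

Returning the failed check names, rather than a bare boolean, lets a review screen show exactly what still blocks publication.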

Finding it again

Once the graph exists, the hard part is search. People don't ask database-table questions; they ask ecosystem questions. “Who's working on urban heat in Oakland?” should return a person, a program, an organization, and a report in one ranked list, even though those live in completely different indices.

Search is Algolia-first and Supabase-backed. Supabase stays the source of truth, while live retrieval runs through Algolia and then a layer of application-side ranking. Two retrieval paths run at once, keyword matching and semantic similarity, and their candidates blend into a single pool before anything is ranked.

Query: “Who's working on urban heat in Oakland?”

Keyword (Algolia): Heat & Health Report · Oakland Shade Plan
Semantic (vectors): Dr. Maya Okonkwo · GreenLine Collaborative · “…tree-canopy gaps in flatland heat”
Ranked answer after blending and reranking by meaning: 1. Dr. Maya Okonkwo · 2. Oakland Shade Plan · 3. GreenLine Collaborative

Hybrid retrieval: one question fans out to keyword and semantic search at once. The candidates, whatever their type (organization, person, program, resource, or document chunk), blend into a single pool, and a reranker decides the final order by meaning.
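The blending step can be sketched as a de-duplicating merge that remembers which path found each candidate. This is an assumption about the shape of the pool, not the production code; the function name and the ids in the usage note are hypothetical.

```typescript
type RetrievalPath = "keyword" | "semantic";

interface PooledCandidate {
  id: string;
  foundBy: RetrievalPath[]; // which retrieval paths returned this candidate
}

// Merge both candidate lists into one pool. Nothing is ranked here:
// order is simply first-seen, and duplicates collapse into one entry.
function blendCandidates(keywordIds: string[], semanticIds: string[]): PooledCandidate[] {
  const pool: PooledCandidate[] = [];
  const add = (id: string, path: RetrievalPath): void => {
    const existing = pool.filter((candidate) => candidate.id === id)[0];
    if (existing === undefined) {
      pool.push({ id: id, foundBy: [path] });
    } else if (existing.foundBy.indexOf(path) < 0) {
      existing.foundBy.push(path);
    }
  };
  keywordIds.forEach((id) => add(id, "keyword"));
  semanticIds.forEach((id) => add(id, "semantic"));
  return pool;
}
```

Calling `blendCandidates(["heat-health-report", "oakland-shade-plan"], ["maya-okonkwo", "oakland-shade-plan"])` yields three entries, with the shade plan marked as found by both paths. Keeping `foundBy` around lets later scoring treat agreement between the two paths as a signal.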

Before any of that, the query itself is read. A natural-language question is parsed into structured intent: a type, a topic, a city, and the raw keywords. Short queries skip the model entirely and stay literal, so a quick lookup is never over-thought.

“Who is working on urban heat islands in Oakland?”

type: person · topic: urban heat islands · city: Oakland · keywords
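A minimal sketch of the short-query shortcut, assuming a word-count threshold. The threshold value, the `IntentModel` type standing in for the model call, and the ASCII-only tokenizer are all illustrative assumptions.

```typescript
interface QueryIntent {
  type?: string;
  topic?: string;
  city?: string;
  keywords: string[];
}

// Hypothetical stand-in for the model call that parses a natural-language question.
type IntentModel = (query: string) => QueryIntent;

// Assumed threshold; the article does not say how short "short" is.
const SHORT_QUERY_MAX_WORDS = 3;

// ASCII-only tokenizer, kept deliberately simple for the sketch.
function tokenize(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// Short queries stay literal and never reach the model; longer ones are parsed.
function understandQuery(query: string, model: IntentModel): QueryIntent {
  const words = tokenize(query);
  if (words.length <= SHORT_QUERY_MAX_WORDS) {
    return { keywords: words };
  }
  return model(query);
}
```

Passing the model in as a parameter keeps the shortcut easy to test: a two-word lookup returns its keywords without the model ever being called.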

Retrieval also reaches inside documents. A long report isn't only searchable by its title and description; its body is split into chunks and indexed separately. A single passage buried deep in a toolkit can surface and lift the resource it belongs to.

Resource: Climate Resilience Toolkit

Passages inside the document:
Cooling centers now serve 12k residents…
Outreach partners across three CBOs…
Tree-canopy gaps mapped across flatland heat zones…
Budget, grant acknowledgements, and credits…
Appendix: survey methodology and sources…

Relevance to “urban heat”, with passage evidence added: 57

Evidence from inside a document. One relevant passage raises the relevance of its parent resource.
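Here is a rough sketch of chunk evidence, assuming paragraph-level chunks and a toy word-overlap score. The real system retrieves chunks through its search indices; the scoring function here is only a stand-in so the example can run on its own.

```typescript
interface Chunk {
  resourceId: string; // the parent resource this passage belongs to
  text: string;
}

// Split a document body into paragraph chunks tied to their parent resource.
function chunkBody(resourceId: string, body: string): Chunk[] {
  return body
    .split(/\n\s*\n/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .map((text) => ({ resourceId: resourceId, text: text }));
}

// Toy relevance: the fraction of query words that appear in the chunk.
// A stand-in for real retrieval scores, not the production signal.
function chunkScore(chunk: Chunk, query: string): number {
  const words = query.toLowerCase().split(/\s+/).filter((word) => word.length > 0);
  if (words.length === 0) return 0;
  const text = chunk.text.toLowerCase();
  const hits = words.filter((word) => text.indexOf(word) >= 0).length;
  return hits / words.length;
}

// Each resource is credited with its single best-matching passage.
function evidenceByResource(chunks: Chunk[], query: string): { [resourceId: string]: number } {
  const best: { [resourceId: string]: number } = Object.create(null);
  for (const chunk of chunks) {
    const score = chunkScore(chunk, query);
    const current = best[chunk.resourceId];
    if (current === undefined || score > current) best[chunk.resourceId] = score;
  }
  return best;
}
```

Taking the maximum rather than the average means one strong passage is enough to lift a long document, which matches the idea that a single buried paragraph can surface its parent.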

Deciding what's relevant

Relevance isn't a single score. Candidates pass through a stack of signals, each doing a specific job: fast retrieval narrows the field, structured filters and heuristics shape it, document evidence reinforces it, and a semantic reranker has the final say on order.

  1. Algolia retrieval

    Fast first-stage candidates across every index.

  2. Hard filters

    Constraints the query implies, like a city, a type, or an issue area.

  3. Filter boosts

    Soft preferences that nudge relevant matches upward.

  4. Heuristic scoring

    Application-side signals about field matches and completeness.

  5. Chunk evidence

    Body passages from long documents boost their parent resource.

  6. Semantic reranking

    A reranking model reorders the shortlist by true meaning.

That last step matters most. Each finalist is turned into a compact semantic document (its title, type, organization details, metadata, description, and resource snippets), and a reranking model reorders the shortlist by what each result means rather than by its keyword density, so the most relevant answer comes first.
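Building that compact semantic document might look roughly like this. The `Finalist` shape, the line layout, and the snippet cap are assumptions; only the list of ingredients comes from the paragraph above.

```typescript
// Hypothetical shape of a shortlisted result before reranking.
interface Finalist {
  title: string;
  type: string;
  organization?: string;
  metadata?: { [key: string]: string };
  description?: string;
  snippets?: string[]; // passages pulled from the resource body
}

// Flatten a finalist into a short text block a reranking model can read.
function toSemanticDocument(item: Finalist, maxSnippets: number = 2): string {
  const lines: string[] = [item.title + " (" + item.type + ")"];
  if (item.organization) lines.push("Organization: " + item.organization);
  const metadata = item.metadata || {};
  for (const key of Object.keys(metadata)) {
    lines.push(key + ": " + metadata[key]);
  }
  if (item.description) lines.push(item.description);
  for (const snippet of (item.snippets || []).slice(0, maxSnippets)) {
    lines.push("Snippet: " + snippet);
  }
  return lines.join("\n");
}
```

Capping the snippets keeps every document short and roughly the same size, so a long toolkit does not crowd out a one-line person record when the reranker compares them.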

Keyword shortlist (rank · candidate · score), passed to the semantic reranker:

1 · Heat & Health Report · 92
2 · Citywide Tree Census · 84
3 · Oakland Shade Plan · 78
4 · Dr. Maya Okonkwo · 70
5 · GreenLine Collaborative · 66

The reranker at work: the same shortlist, reordered. Keyword scores get candidates in the door; the reranker reads what each one means and decides the final order, so the most relevant person, program, or organization rises to the top, whatever its type.
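Putting the six stages from this section together, a simplified pipeline could look like the sketch below. The weights, the field names, and the choice to treat type as a hard filter and city as a soft boost are all assumptions made for illustration; the reranker is passed in as a function because the real one is a hosted model.

```typescript
interface Hit {
  id: string;
  type: string;
  city: string;
  baseScore: number;     // stage 1: first-stage retrieval score
  fieldMatches: number;  // stage 4: heuristic signal, e.g. matched fields
  chunkEvidence: number; // stage 5: best passage score, 0 if none
}

interface RankingIntent {
  requiredType?: string;  // hard filter
  preferredCity?: string; // soft boost
}

type Reranker = (shortlist: Hit[]) => Hit[];

// Assumed weights; the real values are not published.
const CITY_BOOST = 0.2;
const FIELD_MATCH_WEIGHT = 0.1;
const CHUNK_EVIDENCE_WEIGHT = 0.3;

function rankHits(
  hits: Hit[],
  intent: RankingIntent,
  rerank: Reranker,
  shortlistSize: number = 3
): Hit[] {
  const shortlist = hits
    // Stage 2: hard filters drop anything the query rules out.
    .filter((hit) => intent.requiredType === undefined || hit.type === intent.requiredType)
    .map((hit) => {
      let score = hit.baseScore;
      // Stage 3: filter boosts nudge preferred matches upward.
      if (intent.preferredCity !== undefined && hit.city === intent.preferredCity) {
        score += CITY_BOOST;
      }
      // Stage 4: heuristic scoring.
      score += FIELD_MATCH_WEIGHT * hit.fieldMatches;
      // Stage 5: chunk evidence lifts the parent resource.
      score += CHUNK_EVIDENCE_WEIGHT * hit.chunkEvidence;
      return { hit: hit, score: score };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, shortlistSize)
    .map((scored) => scored.hit);
  // Stage 6: the semantic reranker has the final say on order.
  return rerank(shortlist);
}
```

The earlier stages only decide who makes the shortlist. Because the reranker receives the hits without their scores, nothing upstream can override its ordering.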

The embeddings behind all of this are deliberately curated. They are built from intentionally chosen text rather than raw database rows: an organization leads with what it needs and offers, a program with its goals and the populations it serves, a person with their skills and role.
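One way to picture curated embedding text is a function that picks the leading fields per entity type. The interfaces, labels, and sample values below are hypothetical; only the choice of what each type leads with comes from the paragraph above.

```typescript
interface OrganizationInput {
  kind: "organization";
  name: string;
  needs: string[];
  offers: string[];
}

interface ProgramInput {
  kind: "program";
  name: string;
  goals: string[];
  populations: string[];
}

interface PersonInput {
  kind: "person";
  name: string;
  skills: string[];
  role: string;
}

type Embeddable = OrganizationInput | ProgramInput | PersonInput;

// Build the text that gets embedded, leading with the fields that matter most
// for each type instead of serializing the whole database row.
function embeddingText(entity: Embeddable): string {
  if (entity.kind === "organization") {
    return [
      "Needs: " + entity.needs.join(", "),
      "Offers: " + entity.offers.join(", "),
      "Organization: " + entity.name,
    ].join("\n");
  }
  if (entity.kind === "program") {
    return [
      "Goals: " + entity.goals.join(", "),
      "Serves: " + entity.populations.join(", "),
      "Program: " + entity.name,
    ].join("\n");
  }
  return [
    "Skills: " + entity.skills.join(", "),
    "Role: " + entity.role,
    "Person: " + entity.name,
  ].join("\n");
}
```

For example, an organization with `needs: ["volunteers"]` and `offers: ["tree planting"]` produces text that opens with those two lines, so a query about who can supply tree planting lands near it in vector space.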

Grounded, governed, private

Three guarantees run underneath everything. Extraction is grounded in your source material, so records reflect what the documents actually say. Publishing is gated by explicit checks, so incomplete records never go live. And row-level security governs who can read what.

The stack that runs it

Supabase

Source of truth for all content and relationships

Algolia

First-stage retrieval across every entity index

Voyage

Semantic reranking and stored embeddings

OpenAI / OpenRouter

Structured extraction and query understanding

LlamaParse

Turns uploaded documents into clean markdown

Firecrawl

Maps and scrapes organization websites

Vercel

Runs the Next.js application

Infisical

Manages production secrets

What this makes possible

All of this exists to answer the questions a flat directory can't:

Who else is working on this?

Surface the organizations, people, and programs already tackling your issue.

What already exists?

Find reports, toolkits, and resources you can reuse instead of rebuilding.

Who can help?

Locate the person with the skill, the partner, or the org that has done it before.

What can we build on?

See the networks and partnerships that connect the ecosystem together.

Each one is a question about the edges of the graph: who connects to whom, and what builds on what. Answering it well is the entire point.


A shared knowledge resource for the Bay Area climate community.

© 2026 Euterria · Supported by the TomKat Center for Sustainable Energy at Stanford

Made for climate orgs in the Bay Area