cross-source-concepts
Score Concepts by recency × frequency × source diversity × mean per-mention extraction_score. Updates Concept.confidence and Concept.mention_count, and derives MENTIONS.weight from extraction_score × recency. Reads Concept.extraction_score but never writes it (the ownership-seam invariant).

| Field | Value |
|---|---|
| Domain | knowledge-garden |
| Category | correlation |
| Version | 1.0.0 |
| Tags | correlation, knowledge-garden |
| Source | strategies/knowledge-garden/correlation/cross_source_concepts/ |
About
What this strategy does
Recomputes Concept.confidence and Concept.mention_count for every
Concept in the graph, and derives MENTIONS.weight from
MENTIONS.extraction_score × recency_factor. It works against the graph
only: it fans out no MCP calls and stages nothing in DuckDB.
This is the second stage of the Reading Garden pipeline. It folds the
per-mention extraction_score written by highlight_distill into a
graph-aware confidence score that notion_publish later maps to
Concept.epistemic_status (seedling / budding / evergreen).
When to use it
- After every highlight_distill run (the new MENTIONS edges shift mention counts and mean extraction scores).
- On a periodic cron (daily / weekly) so the recency decay term keeps drifting older concepts toward lower confidence even when no new highlights arrive.
How it works
The four scoring terms carry the weights
{freq: 0.30, recency: 0.15, diversity: 0.40, extract: 0.15}.
The diversity term is the load-bearing cross-source corroboration: a
Concept seen in N highlights across M distinct DomainSources gets
M, not N, credit. That is why highlight_distill MERGEs the
CAPTURED_FROM → DomainSource chain — without those edges this term is
always zero.
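A minimal sketch of the N-versus-M distinction, assuming mentions have already been flattened into (concept, highlight, DomainSource) tuples. The function name diversity_counts and the tuple layout are hypothetical, not the strategy's actual query shape.

```python
from collections import defaultdict


def diversity_counts(mentions):
    """Return {concept: (n_highlights, m_distinct_sources)}.

    `mentions` is an iterable of (concept, highlight_id, domain_source)
    tuples. A domain_source of None stands for a highlight whose
    CAPTURED_FROM -> DomainSource chain is missing (hypothetical encoding).
    """
    highlights = defaultdict(set)
    sources = defaultdict(set)
    for concept, highlight_id, domain_source in mentions:
        highlights[concept].add(highlight_id)
        if domain_source is not None:
            sources[concept].add(domain_source)
    return {c: (len(highlights[c]), len(sources[c])) for c in highlights}


mentions = [
    ("falsifiability", "h1", "example.org"),
    ("falsifiability", "h2", "example.org"),
    ("falsifiability", "h3", "example.net"),
    ("paradigm shift", "h4", None),  # no DomainSource edge
]
counts = diversity_counts(mentions)
# counts["falsifiability"] is (3, 2): three highlights, two sources.
# counts["paradigm shift"] is (1, 0): the diversity input is zero.
```

The second concept shows the failure mode described above: without the source edges, the diversity input stays at zero however many highlights mention it.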
MENTIONS.weight is set per edge as extraction_score × recency_factor.
The recency factor decays at a 90-day half-life from each Concept’s
last_seen_at; freshly-mentioned concepts keep their edge weight near
the raw extraction score, while older mentions naturally fade.
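The decay can be sketched as follows, assuming the half-life is applied as a plain exponential, 0.5 ** (age_days / 90). The function names and the clamping of future timestamps are illustrative assumptions, not the strategy's confirmed code.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 90.0  # half-life stated in the text


def recency_factor(last_seen_at, now):
    """Exponential decay: 1.0 when just seen, 0.5 after one half-life."""
    age_days = (now - last_seen_at).total_seconds() / 86400.0
    age_days = max(age_days, 0.0)  # assumption: future timestamps count as fresh
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def mention_weight(extraction_score, last_seen_at, now):
    """MENTIONS.weight = extraction_score x recency_factor."""
    return extraction_score * recency_factor(last_seen_at, now)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
fresh = mention_weight(0.8, now, now)                          # 0.8
one_half_life = mention_weight(0.8, now - timedelta(days=90), now)   # 0.4
two_half_lives = mention_weight(0.8, now - timedelta(days=180), now)  # 0.2
```

Because the factor depends on the Concept's last_seen_at rather than on each edge's own timestamp, a single new mention restores the weight of every edge pointing at that Concept.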
Ownership seam (read me before editing)
This strategy writes only Concept.confidence,
Concept.mention_count, Concept.last_updated, and MENTIONS.weight.
It does NOT write Concept.extraction_score — that field belongs
to highlight_distill. The two-writer authority pattern (one strategy
per property) mirrors spec-32. The verification checklist (spec §8)
checks that Concept.extraction_score values are unchanged after this
strategy runs.
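One way to express that check is a before/after snapshot comparison. The sketch below uses plain dicts in place of graph nodes; every helper name, including fake_strategy_run, is a hypothetical stand-in and not part of the spec.

```python
_MISSING = object()  # sentinel so a vanished Concept counts as a change


def snapshot_extraction_scores(concepts):
    """Capture {concept name: extraction_score} at one point in time."""
    return {c["name"]: c["extraction_score"] for c in concepts}


def changed_extraction_scores(before, after):
    """Names whose extraction_score differs, appeared, or disappeared."""
    names = before.keys() | after.keys()
    return sorted(
        n for n in names
        if before.get(n, _MISSING) != after.get(n, _MISSING)
    )


def fake_strategy_run(concepts):
    """Stand-in for the strategy: touches only the fields it owns."""
    for c in concepts:
        c["confidence"] = 0.7
        c["mention_count"] = c.get("mention_count", 0) + 1


concepts = [
    {"name": "falsifiability", "extraction_score": 0.82, "confidence": 0.5},
    {"name": "paradigm shift", "extraction_score": 0.41, "confidence": 0.5},
]
before = snapshot_extraction_scores(concepts)
fake_strategy_run(concepts)
after = snapshot_extraction_scores(concepts)
violations = changed_extraction_scores(before, after)  # empty list when the seam holds
```

Against a real graph, the two snapshots would come from the same read query issued before and after the run.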
If a future contributor wants cross_source_concepts to update
extraction_score, that is a spec change, not a tactical edit.
Caveats
- No alias merging. If popper and karl popper both exist as separate Concept nodes, this strategy scores them independently. The concept_low_extraction_high_confidence checkpoint rule (spec §3.5) is what flags such drift.
- The sigmoid output is in (0, 1), never exactly 0 or 1. A Concept with zero mentions and zero diversity still gets sigmoid(0) = 0.5. Tune the weights or shift with a bias term if a strictly increasing zero-anchored output is preferred.
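The sigmoid caveat can be illustrated with a short sketch. The bias parameter and the rescaled variant are assumptions offered for comparison; neither is confirmed to exist in the strategy's code.

```python
import math


def sigmoid(x, bias=0.0):
    """Standard logistic function, optionally shifted by a bias term."""
    return 1.0 / (1.0 + math.exp(-(x + bias)))


def zero_anchored(x):
    """Rescaled logistic for x >= 0: maps 0 to 0 and approaches 1.

    An alternative technique (equal to tanh(x / 2)), not taken from the
    strategy itself.
    """
    return 2.0 * sigmoid(x) - 1.0


baseline = sigmoid(0.0)            # 0.5: the empty-Concept floor
shifted = sigmoid(0.0, bias=-4.0)  # roughly 0.018: lower floor, still not zero
anchored = zero_anchored(0.0)      # 0.0
```

A negative bias only lowers the floor, so the output for an empty Concept stays above zero. The rescaled form is exactly zero at x = 0 and remains strictly increasing.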
Steps
| Step | Function | Depends on |
|---|---|---|
| Load per-Concept aggregates from the graph | load_concepts | — |
| Score concepts with graph-aware confidence formula | score | load_concepts |
| Write Concept.confidence and MENTIONS.weight back to the graph | update_graph | score |
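The dependency column above implies a strict execution order. A minimal sketch of a runner that honours it is shown below, using the standard-library graphlib module. The runner, the stub step bodies, and the way results are passed between steps are all assumptions; only the step names and their dependencies come from the table.

```python
from graphlib import TopologicalSorter

# Step name -> set of steps it depends on (taken from the table above).
STEP_DEPENDENCIES = {
    "load_concepts": set(),
    "score": {"load_concepts"},
    "update_graph": {"score"},
}


def run_steps(steps, dependencies):
    """Run each step after its dependencies, feeding it their results."""
    results = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        inputs = [results[dep] for dep in sorted(dependencies[name])]
        results[name] = steps[name](*inputs)
        order.append(name)
    return order, results


# Stub implementations standing in for the real step functions.
steps = {
    "load_concepts": lambda: [{"name": "falsifiability", "mention_count": 3}],
    "score": lambda concepts: [{**c, "confidence": 0.5} for c in concepts],
    "update_graph": lambda scored: len(scored),  # pretend: number of nodes written
}
order, results = run_steps(steps, STEP_DEPENDENCIES)
```

With these stubs, order comes back as load_concepts, score, update_graph, and results["update_graph"] holds the count of scored concepts.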

