What this shows
A network of similar verse lines from the Finnic runosong corpus (4.29M verses, 289,702 poems). The seed verse (green node) is at the center. Connected verses share similar wording, meaning, or translation.
How similarity is calculated
Four algorithms independently find similar verses:
- Jaccard — shared exact wordforms between two verse lines
- TF-IDF — shared lemmas (dictionary forms), weighted by rarity
- Translation — shared English translations, enabling cross-lingual Estonian↔Finnish matches
- CharBigram — character bigram overlap, capturing surface-level textual similarity
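As a rough illustration of the two surface-level measures, the sketch below computes set overlap on wordforms and on character bigrams. The tokenization (lowercased, whitespace-separated), the removal of spaces before taking bigrams, and the use of the Jaccard index for bigram overlap are assumptions made for the example, not confirmed details of the tool. The function names are hypothetical and the two verse strings are illustrative spelling variants.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def wordforms(verse: str) -> set:
    # Assumption: wordforms are lowercased, whitespace-separated tokens.
    return set(verse.lower().split())


def char_bigrams(verse: str) -> set:
    # Assumption: bigrams are taken over the lowercased text with spaces removed.
    text = "".join(verse.lower().split())
    return {text[i:i + 2] for i in range(len(text) - 1)}


def wordform_similarity(v1: str, v2: str) -> float:
    return jaccard(wordforms(v1), wordforms(v2))


def bigram_similarity(v1: str, v2: str) -> float:
    return jaccard(char_bigrams(v1), char_bigrams(v2))


a = "vaka vanha väinämöinen"
b = "vaka vanha väinämöini"   # illustrative spelling variant
print(wordform_similarity(a, b))  # two of four distinct wordforms are shared
print(bigram_similarity(a, b))    # higher: most character bigrams still match
```

The example shows why both measures are useful: a one-word spelling difference halves the wordform overlap but barely affects the bigram overlap. All four algorithms produce their own ranking of candidate verses.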
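The four rankings are combined via Reciprocal Rank Fusion (RRF): each algorithm contributes more to a verse's combined score the higher it ranks that verse, so a verse that ranks high in several algorithms receives a boosted combined score. The percentage shown is the RRF score normalized to 0–100%.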
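A minimal sketch of Reciprocal Rank Fusion, using the common formulation in which each ranked list contributes 1 / (k + rank) per item. The constant k = 60, the function names, the toy rankings, and the normalization (scaling so the best-scoring verse reads 100%) are assumptions for illustration; the text above only states that the score is normalized to 0–100%.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists (best first) into one score per verse.

    Each list adds 1 / (k + rank) for every verse it contains, with rank
    starting at 1. k = 60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranked in rankings.values():
        for rank, verse_id in enumerate(ranked, start=1):
            scores[verse_id] += 1.0 / (k + rank)
    return dict(scores)


def to_percent(scores):
    # Assumed normalization: scale so the best-scoring verse reads 100%.
    top = max(scores.values(), default=0.0)
    return {v: (100.0 * (s / top) if top else 0.0) for v, s in scores.items()}


# Toy rankings with hypothetical verse ids, one list per algorithm.
rankings = {
    "jaccard": ["v1", "v2", "v3"],
    "tfidf": ["v2", "v1"],
    "translation": ["v2", "v4"],
    "charbigram": ["v1", "v2", "v3"],
}
fused = to_percent(rrf_fuse(rankings))
ranked = sorted(fused, key=fused.get, reverse=True)
print(ranked)
```

In this toy input, `v2` appears in all four lists and ends up first, ahead of `v1`, which tops two lists but is missing from one.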
Reading the graph
- Node size = occurrence frequency: how many times this exact verse text appears across the corpus (log-scaled)
- Node color: green = seed verse, blue = Estonian, orange = Finnish
- Edge color: green = seed connection, purple = inter-neighbor link, gray = hop-2+
- Edge thickness = similarity strength
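To illustrate what log-scaling means for node size, the sketch below maps an occurrence count to a radius. The base radius, scale factor, cap, and function name are invented for the example; the tool's actual constants are not stated here.

```python
import math


def node_radius(occurrences, base=4.0, scale=3.0, max_radius=30.0):
    """Map an occurrence count to a radius that grows with log10(count)."""
    count = max(1, occurrences)  # treat anything below 1 as a single occurrence
    return min(max_radius, base + scale * math.log10(count))


for n in (1, 10, 100, 1000):
    print(n, node_radius(n))
```

With a logarithmic mapping, each tenfold increase in frequency adds the same fixed amount to the radius, so very common verses do not dwarf rare ones.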
What are hops?
- Hop-1: verses directly similar to the seed (the seed's nearest neighbors)
- Hop-2: verses similar to hop-1 nodes (neighbors of neighbors)
- Higher hops repeat this step outward: each hop level adds verses one step further removed from the seed.
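The hop levels can be read as a breadth-first expansion from the seed. The sketch below assumes a hypothetical `neighbors(verse_id)` function that returns similar verses ordered best first. The parameters `hop1`, `per_hop`, `max_hops`, and `max_nodes` mirror the controls of the same names, but the function is illustrative and not the tool's implementation.

```python
def expand(seed, neighbors, hop1=10, per_hop=3, max_hops=2, max_nodes=50):
    """Breadth-first expansion from seed.

    Returns (hop_of, edges): hop_of maps each verse id to its hop level,
    edges lists the (parent, child) pairs that introduced each new node.
    """
    hop_of = {seed: 0}
    edges = []
    frontier = [seed]
    for hop in range(1, max_hops + 1):
        limit = hop1 if hop == 1 else per_hop  # new nodes allowed per parent
        next_frontier = []
        for parent in frontier:
            added = 0
            for cand in neighbors(parent):
                if added >= limit or len(hop_of) >= max_nodes:
                    break
                if cand in hop_of:
                    continue  # already in the network at an earlier or equal hop
                hop_of[cand] = hop
                edges.append((parent, cand))
                next_frontier.append(cand)
                added += 1
        frontier = next_frontier
    return hop_of, edges


# Toy similarity lists with hypothetical verse ids.
graph = {
    "seed": ["a", "b", "c"],
    "a": ["seed", "b", "d", "e"],
    "b": ["a", "f"],
}
hop_of, edges = expand("seed", lambda v: graph.get(v, []),
                       hop1=2, per_hop=1, max_hops=2, max_nodes=10)
print(hop_of)
```

Here `c` is dropped because only two hop-1 neighbors are allowed, and each hop-1 node contributes a single new hop-2 node.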
Controls
| Control | What it does |
| --- | --- |
| Hop-1 | How many direct neighbors to show (5–20) |
| Per-hop | How many new nodes each parent adds at hop-2+ (1–5) |
| Max hops | Network depth (1–4). Default 2 for performance. |
| Min score | Hide weak connections below this RRF threshold |
| Max nodes | Cap total visible nodes (15–200) |
| ET / FI | Filter by language (Estonian / Finnish) |
| + Add hop | Extend network by one more hop level without full reload |
| Show outer hops | Toggle visibility of hop-2+ nodes |
| Show labels | Toggle verse text labels on/off |
| Zoom slider | Zoom in/out (20–500%) |
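As a sketch of how the filtering controls might interact, the function below applies Min score, the ET / FI language filter, and Max nodes to a list of nodes. The `Node` type, its field names, the language codes, and the rule that the seed always stays visible while the strongest matches win when the cap applies are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    verse_id: str
    lang: str             # assumed codes: "et" or "fi"
    score: float          # RRF score as a percentage, 0-100
    is_seed: bool = False


def visible_nodes(nodes, min_score=0.0, langs=("et", "fi"), max_nodes=200):
    """Apply Min score and the language filter, then cap at max_nodes."""
    kept = [n for n in nodes
            if n.is_seed or (n.score >= min_score and n.lang in langs)]
    # Assumption: when the cap applies, the seed stays and higher scores win.
    kept.sort(key=lambda n: (not n.is_seed, -n.score))
    return kept[:max_nodes]


nodes = [
    Node("s", "fi", 100.0, is_seed=True),
    Node("a", "et", 80.0),
    Node("b", "fi", 40.0),
    Node("c", "et", 20.0),
]
shown = visible_nodes(nodes, min_score=30.0, langs=("et",), max_nodes=15)
print([n.verse_id for n in shown])
```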
Interactions
- Click a node → navigate to that verse's network (re-centers on it)
- Click the seed node (green) → show info panel with links to Similarity Explorer and Poem Reader
- Double-click a node → expand it in-place (load its neighbors without navigating away)
- Double-click empty space → reset zoom to default
- Drag a node to reposition it; scroll to zoom; drag background to pan
- Nodes marked with a + overlay have neighbors that are not yet loaded; double-click to expand them
Exporting data
Open "More controls" and click "⬇ Export CSV" to download all visible verses and edges with per-algorithm similarity scores and shared wordforms.
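The exported file can be inspected with any CSV reader. The sketch below makes no assumption about column names and simply returns whatever header the file contains. The `load_export` helper is hypothetical, and the `utf-8-sig` encoding is an assumption chosen because it reads UTF-8 with or without a byte-order mark, which matters for characters such as ä, ö, and õ.

```python
import csv


def load_export(path):
    """Read an exported CSV; return (column names, rows as dicts)."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return list(reader.fieldnames or []), rows
```

For example, `columns, rows = load_export("export.csv")` (with a placeholder file name) lists the available columns before any further analysis.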