What this shows
Two collection places, side by side. For each place you'll see a small map, the usual corpus metrics (poems, verses, unique lemmas, collectors), a distinctive vocabulary list, and a semantic-domain radar chart. The view is built to answer one question: how does the sung vocabulary of one parish differ from another's?
How to choose places
- Pick Place A and Place B from the dropdowns — 803 parishes are available.
- Use presets for curated comparisons: Setumaa vs Narvusi (Seto vs coastal Estonian), Kuusalu vs Haljala (neighbouring North Estonian parishes), Uhtua vs Setumaa (a Karelian vs Seto cross-tradition comparison), and Suistamo vs Karkku (Karelian vs southwest Finnish).
- The swap button (↔) flips A and B without losing your selection.
Distinctive vocabulary
Words listed for each place are ones that occur disproportionately often there compared to the corpus average (a log-likelihood distinctiveness score). They're not just the most frequent words — they're the words that would make a blind classifier guess this parish. Hover a word to see its gloss.
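As a rough illustration of how such a score can be computed, here is a minimal sketch of a two-term log-likelihood (G2) keyness measure. It is not the page's actual implementation: the function names, the dictionary inputs, and the choice to compare a place against the rest of the corpus (rather than a corpus-wide average) are all assumptions made for the example.

```python
import math


def log_likelihood(count_place, total_place, count_rest, total_rest):
    """Signed G2 score for one lemma: positive if over-represented in the place.

    Hypothetical helper; compares the place with the rest of the corpus.
    """
    combined = count_place + count_rest
    total = total_place + total_rest
    # Expected counts if the lemma were equally common in both parts.
    expected_place = total_place * combined / total
    expected_rest = total_rest * combined / total
    g2 = 0.0
    for observed, expected in ((count_place, expected_place),
                               (count_rest, expected_rest)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    g2 *= 2
    over_represented = count_place / total_place >= count_rest / total_rest
    return g2 if over_represented else -g2


def distinctive_words(place_counts, corpus_counts, top_n=10):
    """Rank lemmas that are over-represented in one place (assumed inputs:
    lemma -> count dicts, with corpus_counts including the place itself)."""
    total_place = sum(place_counts.values())
    total_rest = sum(corpus_counts.values()) - total_place
    if total_place == 0 or total_rest <= 0:
        return []
    scored = []
    for lemma, count in place_counts.items():
        rest = max(corpus_counts.get(lemma, count) - count, 0)
        score = log_likelihood(count, total_place, rest, total_rest)
        if score > 0:
            scored.append((lemma, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

In this sketch a very frequent lemma that is just as common everywhere else scores near zero, which is why the list differs from a plain frequency ranking.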
Semantic radar
The radar chart shows what share of each place's verses falls into each major semantic domain (body, kinship, nature, emotions, actions, artefacts, …). A place with a large kinship lobe but a small nature lobe probably sings mostly wedding and family themes; the reverse pattern may point to hunting or work songs.
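The values behind each radar axis can be thought of as simple proportions. The sketch below assumes, purely for illustration, that every verse carries exactly one domain label; the function name and the input format are hypothetical and the page may assign domains differently.

```python
from collections import Counter


def domain_shares(verse_domains, domains):
    """Return the fraction of verses per domain, one value per radar axis.

    verse_domains: one domain label per verse (an assumed input format).
    domains: the axes to report, in display order.
    """
    counts = Counter(verse_domains)
    total = len(verse_domains)
    # Domains with no verses get 0.0 so every axis is still drawn.
    return {d: (counts[d] / total if total else 0.0) for d in domains}


shares = domain_shares(
    ["kinship", "kinship", "nature", "body"],
    ["body", "kinship", "nature", "emotions"],
)
```

Because the values are shares rather than raw counts, a place with few verses and a place with many can be drawn on the same scale.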
Caveats
- Small places (few poems) have noisier distinctive-word lists; check the poem count in the stat row before reading much into them.
- Parish boundaries and names come from the folklore archives; some aggregate multiple villages.
- Estonian (ET) and Finnish (FI) places are directly comparable on the corpus metrics, but vocabulary is computed in each language's own lemma space, so equal distinctiveness scores should not be read as identical concepts.
Select two places to compare their vocabulary and semantic profiles