Lemmas (dictionary forms) that appear in both the Estonian (ERAB) and the Finnish (SKVR + JR) runosong corpora.
These are words used across both traditions, revealing the shared poetic vocabulary of Finnic runosong.
A lemma is "shared" when it has at least one occurrence in the Estonian corpus AND at least one in a Finnish corpus.
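As a minimal sketch of this definition, assuming per-corpus lemma counts are available as simple mappings: the function name `shared_lemmas` and the toy counts are hypothetical, invented for illustration, and not taken from the site's actual pipeline. The Finnish frequency is taken as the sum of the SKVR and JR counts.

```python
from collections import Counter

def shared_lemmas(et_counts, fi_counts):
    """Return {lemma: (et_freq, fi_freq)} for lemmas attested in both corpora."""
    return {
        lemma: (et_counts[lemma], fi_counts[lemma])
        for lemma in et_counts.keys() & fi_counts.keys()
        # "shared" requires at least one occurrence on each side
        if et_counts[lemma] >= 1 and fi_counts[lemma] >= 1
    }

# Toy counts, invented for illustration only (not real corpus figures)
erab = Counter({"maa": 120, "meri": 40, "kuld": 75})
skvr = Counter({"maa": 200, "kulta": 450})
jr = Counter({"maa": 10, "meri": 310, "kulta": 50})

# Finnish frequency = SKVR + JR
finnish = skvr + jr

shared = shared_lemmas(erab, finnish)
for lemma in sorted(shared):
    print(lemma, shared[lemma])
```

In this toy data "kuld" and "kulta" each occur on one side only, so neither counts as a shared lemma, even though they could still be matched as a cognate pair.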
Reading the scatter plot
Both axes use logarithmic scale (powers of 10). Each dot is one shared lemma.
X-axis = Estonian frequency (ERAB), Y-axis = Finnish frequency (SKVR + JR).
Dashed diagonal = equal frequency in both corpora. Dots above the diagonal are Finnish-dominant; dots below it are Estonian-dominant.
Dot color = part-of-speech (NOUN green, VERB blue, ADJ orange, etc.). Dot size = total corpus frequency.
Click any dot to open that lemma in the main lexicon.
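The placement of a dot relative to the dashed diagonal can be sketched as follows. This is an illustrative sketch only: `plot_position` is a hypothetical helper, and it assumes raw frequencies of at least 1 on both sides, which holds for every shared lemma by definition.

```python
import math

def plot_position(et_freq, fi_freq):
    """Return (x, y, side) for one shared lemma on the log-log scatter plot."""
    x = math.log10(et_freq)  # X-axis: Estonian frequency (ERAB)
    y = math.log10(fi_freq)  # Y-axis: Finnish frequency (SKVR + JR)
    # Compare the raw counts to decide the side of the diagonal;
    # this avoids floating-point ties between the two logarithms.
    if fi_freq > et_freq:
        side = "Finnish-dominant"    # above the diagonal
    elif fi_freq < et_freq:
        side = "Estonian-dominant"   # below the diagonal
    else:
        side = "balanced"            # on the diagonal
    return x, y, side

# Invented example: 100 Estonian occurrences, 1000 Finnish occurrences
print(plot_position(100, 1000))
```

On a log scale, the vertical distance from the diagonal equals the log of the FI/ET ratio, so a dot one grid step above the line is ten times more frequent in Finnish.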
Cognate pairs
Estonian-Finnish word pairs with shared meaning, identified by three weighted signals: translation overlap (55%), orthographic similarity (30%), and etymological root match (15%).
Exact: identical spelling. Near-exact: 1-2 character difference. Translation-bridged: different spelling but shared English gloss. Orthographic: similar spelling pattern.
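One way the three weights could combine into a single score is sketched below. Only the weights (55/15/30) come from the description above; how each signal is actually measured is not specified here, so the sketch assumes Jaccard overlap of English gloss sets for translation overlap, a yes/no flag for the etymological root match, and `difflib.SequenceMatcher` for orthographic similarity. All function names are hypothetical.

```python
from difflib import SequenceMatcher

# Weights as stated in the text; the signal definitions below are assumptions.
WEIGHTS = {"translation": 0.55, "etymology": 0.15, "orthography": 0.30}

def translation_overlap(et_glosses, fi_glosses):
    """Assumed measure: Jaccard overlap of the two English gloss sets."""
    a, b = set(et_glosses), set(fi_glosses)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def orthographic_similarity(et_word, fi_word):
    """Assumed measure: difflib similarity ratio in [0, 1]."""
    return SequenceMatcher(None, et_word, fi_word).ratio()

def cognate_score(et_word, fi_word, et_glosses, fi_glosses, same_root):
    """Weighted sum of the three signals, in [0, 1]."""
    return (
        WEIGHTS["translation"] * translation_overlap(et_glosses, fi_glosses)
        + WEIGHTS["etymology"] * (1.0 if same_root else 0.0)
        + WEIGHTS["orthography"] * orthographic_similarity(et_word, fi_word)
    )

# Estonian "kuld" and Finnish "kulta", both glossed "gold"
print(round(cognate_score("kuld", "kulta", {"gold"}, {"gold"}, True), 3))
```

Because translation overlap carries more than half the weight, a pair with a shared gloss but dissimilar spelling can still score well, which matches the "translation-bridged" category described above.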
Frequency divergence
The divergence panel shows shared words that are far more frequent in one tradition than in the other.
Finnish-dominant: words with highest FI/ET ratio. Estonian-dominant: words with highest ET/FI ratio.
These words point to cultural and thematic differences between the Estonian and Finnish runosong traditions.
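The two rankings can be sketched as a sort by frequency ratio. This is a hedged illustration: `divergence_ranking` is a hypothetical name, the counts are invented, and the sketch assumes ratios of raw frequencies, whereas the panel may normalize by corpus size.

```python
def divergence_ranking(shared, top_n=10):
    """Rank shared lemmas by FI/ET and ET/FI ratio.

    `shared` maps lemma -> (et_freq, fi_freq); both counts are >= 1
    for a shared lemma, so neither ratio divides by zero.
    """
    def fi_over_et(lemma):
        et_freq, fi_freq = shared[lemma]
        return fi_freq / et_freq

    def et_over_fi(lemma):
        et_freq, fi_freq = shared[lemma]
        return et_freq / fi_freq

    finnish_dominant = sorted(shared, key=fi_over_et, reverse=True)[:top_n]
    estonian_dominant = sorted(shared, key=et_over_fi, reverse=True)[:top_n]
    return finnish_dominant, estonian_dominant

# Toy counts, invented for illustration only
toy = {"maa": (120, 3), "meri": (40, 310), "kuu": (50, 50)}
fi_top, et_top = divergence_ranking(toy, top_n=1)
print(fi_top, et_top)
```

With raw counts, a difference in overall corpus size shifts every ratio in the same direction, so dividing each count by its corpus total before taking the ratio would be a natural refinement.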
Estonian vs Finnish Frequency (Shared Lemmas)
Frequency Divergence (Most Unbalanced Shared Words)