RunoVerse

Wordform Disambiguation

Surface forms that map to more than one lemma — where a runosong word is genuinely ambiguous.

What is wordform ambiguity?

A wordform is the exact surface spelling seen in a poem. A lemma is its dictionary headword. In Finnic languages many wordforms are genuinely ambiguous: they can be inflected forms of two or more different lemmas. For example, the same form could be both a case form of a noun and a verb form, or an inflected form of two different nouns. This page collects wordforms that have two or more candidate lemmas and shows the candidate lemmas for each.
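As a rough illustration of that selection step, the sketch below groups (wordform, lemma) pairs and keeps only the wordforms with two or more candidate lemmas. The function name `ambiguous_wordforms` and the input format are assumptions made for this example, not the site's actual pipeline. The sample pair is a familiar standard Finnish case (tuli as the noun "fire" or as a past-tense form of tulla, "to come") and is not drawn from the runosong corpus.

```python
from collections import defaultdict


def ambiguous_wordforms(pairs):
    """Return {wordform: sorted candidate lemmas} for forms with 2+ lemmas.

    `pairs` is an iterable of (wordform, lemma) tuples, e.g. the raw
    output of a lemmatiser that lists every possible analysis.
    (Hypothetical input format, assumed for this sketch.)
    """
    candidates = defaultdict(set)
    for wordform, lemma in pairs:
        # A set ignores repeated analyses of the same wordform.
        candidates[wordform].add(lemma)
    # Keep only the wordforms that are ambiguous between lemmas.
    return {
        wordform: sorted(lemmas)
        for wordform, lemmas in candidates.items()
        if len(lemmas) >= 2
    }


# Illustrative analyses only; not real corpus data.
pairs = [
    ("tuli", "tuli"),   # noun: "fire"
    ("tuli", "tulla"),  # verb: "to come", past tense
    ("meri", "meri"),   # a single candidate, so not listed
    ("tuli", "tuli"),   # duplicate analysis, ignored
]

result = ambiguous_wordforms(pairs)
print(result)
```

Collecting candidates in a set means the result depends only on which lemmas are possible for a form, not on how many times each analysis was produced.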

Why it matters

Runosong language is archaic, dialectal, and often uses syncopated or truncated forms. Disambiguation is a real bottleneck for any frequency count or semantic analysis — this page is a way to audit where the lemmatiser hedges, and to decide whether the hedge is justified.

How to read these cards: Each wordform (bold, green) has multiple candidate lemmas. The number next to each lemma is its overall corpus frequency, that is, how common the lemma is across the whole corpus, not how often this specific wordform maps to it. Lemmas are sorted by overall frequency. More frequent candidates are statistically likelier parses, but frequency alone does not settle which lemma is correct in a given line.
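The ordering described above can be sketched as follows. This is a minimal example under stated assumptions: `rank_candidates` and the `lemma_frequency` mapping are hypothetical names, the counts are invented, and the alphabetical tie-break is a choice made here for determinism rather than something the page specifies.

```python
def rank_candidates(candidate_lemmas, lemma_frequency):
    """Pair each candidate lemma with its overall corpus frequency
    and sort from most to least frequent.

    `lemma_frequency` maps lemma -> count over the whole corpus.
    It is not a per-wordform count, so it ranks candidates without
    deciding which one is correct in a given line.
    """
    scored = [
        (lemma, lemma_frequency.get(lemma, 0))  # unknown lemmas count as 0
        for lemma in candidate_lemmas
    ]
    # Sort by descending frequency; break ties alphabetically (assumption).
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Invented counts for illustration only.
lemma_frequency = {"tulla": 1200, "tuli": 300}

ranked = rank_candidates(["tuli", "tulla"], lemma_frequency)
print(ranked)
```

Because the score is a corpus-wide lemma count, a rare lemma can still be the right reading of a particular line even when it appears last on its card.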