
Spelling conventions in late-period Czech, or, Making Google Translate less cranky

So I'm digging through some Latin/Czech dictionaries from 1579 and 1605, and I'm reminded yet again that, like most other languages, Czech in period was (a) not fully consistent in its spelling, and (b) generally spelled somewhat differently from modern Czech, including using letters that barely appear in modern Czech, or don't appear at all. This can present difficulties in translation for those of us who aren't fluent, since Google Translate, bless its algorithmic heart, only knows modern Czech.

I've spent enough time digging around in period Czech texts by this point that I make the relevant spelling substitutions fairly fluently, but I realized it might be useful to have them set down somewhere I could point others to, for use in their own research. 

I will note that while most of these are actually fairly consistent, others can be either situational (like y, as noted) or inconsistent across manuscripts or contexts (like v, as noted). For researchers who have some familiarity with Czech, this doesn't pose a problem, as it's generally clear what the word is, but for those unfamiliar with the language, this can make things more difficult.

Here's a list of the letters or digraphs (two letters for one sound) you'll find in 15th-16th century Czech texts, and the letters that generally represent the same sounds in modern Czech. Period spelling is on the left, modern equivalent is on the right.

g → j*

ij → í

w → v

v → u (but not in every manuscript; sometimes a v really is just a v)

y (standalone) → i

y (preceding an a) → j

cz → č

ie → ě

ſſ or ſs → š

rz → ř

* Except as the initial letter of a name, where it does often show up as J instead of G.

In later-period texts, the last four entries in that list (cz, ie, ſſ/ſs, and rz) do start showing up as single letters with diacritics, much as they do modernly, rather than as digraphs, in accordance with the reforms proposed in Jan Hus's De orthographia bohemica. However, while modern Czech uses a different diacritic for long vowels than it does for palatalization, period Czech appears to have used the same mark for both. Most of the time this is fine, especially to a Czech speaker, because there is only one letter that can be short (e), long (é), or palatalized (ě), and someone familiar with the language will have a clear sense of whether a mark over any given e is intended for length or palatalization. For every other letter there's no ambiguity; a mark over it can only mean one thing.

What I find particularly interesting about these changes is that almost none of these letters or letter combinations persist in modern Czech (though you do still see them in modern Polish); none of the digraphs are still used, and g and w don't exist at all outside of foreign loan-words. Given how common both g and w are in period texts, because the sounds they're standing in for are extremely common in Czech, both then and now, you can imagine how weird those words look to modern speakers.

But apart from the spelling, the language is otherwise basically still the same! Doing simple letter substitutions as above before running your text through machine translation is pretty much all that's necessary, and I love that this means that words written by people five hundred years ago are still completely and perfectly comprehensible today.
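
For anyone who would rather not do the substitutions by hand, the list above can be sketched as a small Python script. This is a rough illustration, not a tested tool: the function name modernize, the v_is_u switch, and the capitalized entries in the table are my own assumptions, and a blunt rule like g → j will mangle any word where g really is a g. Treat the output as a first pass to feed to the translator, then check anything that looks off.

```python
import re

# Period spelling -> modern spelling, taken from the list above.
# The capitalized entries are an added assumption for word-initial capitals.
SUBSTITUTIONS = {
    "ſſ": "š", "ſs": "š",
    "cz": "č", "Cz": "Č",
    "rz": "ř", "Rz": "Ř",
    "ij": "í",
    "ie": "ě",
    "g": "j", "G": "J",
    "w": "v", "W": "V",
}


def modernize(text, v_is_u=True):
    """Apply the period-to-modern substitutions in a single pass.

    v_is_u is a hypothetical switch: turn it off for manuscripts
    where v really is just v.
    """
    table = dict(SUBSTITUTIONS)
    if v_is_u:
        table.update({"v": "u", "V": "U"})
    # Longest keys first, so digraphs are tried before single letters.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"(?P<alone>\by\b)"        # y standing alone as a word
        r"|(?P<before_a>y(?=a))"   # y directly before an a
        r"|(?P<other>" + "|".join(re.escape(k) for k in keys) + ")"
    )

    def replace(match):
        if match.lastgroup == "alone":
            return "i"
        if match.lastgroup == "before_a":
            return "j"
        return table[match.group(0)]

    # One pass over the original text: a v produced from w is never
    # rescanned, so it cannot be turned into u afterwards.
    return pattern.sub(replace, text)


print(modernize("gest y wijra"))  # jest i víra
```

Doing everything in one pass matters because the rules feed into each other: applied one after another, w → v followed by v → u would turn every w into a u.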
