New DNA data reveals the origins of the Slavic peoples

Also this week: How our DNA holds the history of our language + The Cambridge dictionary adds 6,000 new words—and not everybody’s happy about it. Here’s what happened this week in language and linguistics.


Welcome to this week’s edition of Discovery Dispatch, a weekly roundup of the latest language-related news, research in linguistics, interesting reads from the week, and newest books and other media dealing with language and linguistics!

📢 Updates

Announcements and what’s new with me and Linguistic Discovery.

The front entrance of The Green Mill Cocktail Lounge in the Uptown neighborhood of Chicago

Every week the Green Mill Cocktail Lounge in Chicago, one of the oldest bars in the city (tracing back to the 1890s!!) and one of the most famous bars in the United States, hosts a live show with an awesome lineup of talented musicians and stand-up comedians, including the hilarious (and extremely tipsy) Chad the Bird (pictured below).

Chad the Bird, a little tipsy as usual

Despite being situated a mere two blocks from my apartment, I’d never gotten to patronize the Green Mill until this last weekend, when by sheer luck Chad the Bird had prepared an amazing bit about etymology and the 6,000 new words that were just added to the Cambridge dictionary. I could barely contain both my joy and laughter, so I had to share Chad the Bird’s bit with you all as well:

🆕 New from Linguistic Discovery

This week’s content from Linguistic Discovery.

🎃 Pumpkin Spice Linguistics

Starbucks has officially kicked off pumpkin spice latte season, which means it’s time for some Pumpkin Spice Linguistics!

Did you know that the word pumpkin was most likely borrowed from the Massachusett language by the Plymouth colonists? And that the word spice originally comes from a verb meaning ‘to observe’?

In this 20-minute talk given at the Edmonton Nerd Nite in 2022, I discuss the winding history of these two words and what they can teach us about both language change and indigenous history:

📰 In the News

Language and linguistics in the news.

6,000 new words added to the Cambridge dictionary—and not everybody’s happy about it

Screenshot of the definition of delulu: “believing things that are not real or true, usually because you choose to”.

The Cambridge dictionary added 6,000 new words this month, including many recent Gen Z neologisms such as skibidi, tradwife, broligarchy, and delulu. It’s important to remember that lexicographers don’t just add any old fad to the dictionary. Only words that lexicography researchers believe are likely to stick around get added, although inevitably many will eventually drop out of use and later be marked as archaic, a fate probably shared by the majority of new words in a language.

Skibidi and tradwife among words added to Cambridge Dictionary
More than 6,000 new terms feature, including some relating to remote working and rich tech giants.
We’re delulu if we think new words should be resisted
Some 6,000 words or phrases are included in the Cambridge Dictionary for the first time this week: not everyone’s happy
I’m a linguist and this is what I think of Cambridge Dictionary’s ‘playful’ new Gen-Z words
With Cambridge Dictionary adding over 6,000 new words into its dictionary this year, including ‘skibidi’, ‘tradwife’, and ‘delulu’, HELLO! spoke to a linguist from global language company Babbel to get their take on the language changes and how social media ‘dictates’ the cultural shifts…

Paraguay is fighting to preserve Guarani

Map of the areas where Guarani is spoken (Wikipedia: Guarani language)

Paraguay is fighting to preserve Guarani as fluency slips among younger generations.

Guarani is a member of the Tupian language family, and is one of the most widely spoken Native American languages. About half the rural population of Paraguay are monolingual speakers of the language.

However, fluency among younger generations is slipping. Standardization is also tricky because Guarani is a dialect continuum.

Paraguay is fighting to preserve Guaraní, a language of roots and soul
Guaraní is one of Paraguay’s two official languages alongside Spanish

🗞️ Current Linguistics

Recently published research in linguistics.

Genetic mixing correlates with language mixing

(A) Sampling admixture in populations (solid arrow) if their two largest ancestry components amount to at least 70% of genetic ancestry, of which the minor source contributes at least 5%, and if the admixture is evident through at least five different levels of globally assumed components (K, tested from 12 to 30, see also fig. S1A for results at K = 12). The two components were manually associated (dashed arrows) with source populations exhibiting the ancestry to at least 80%. Target populations were only considered if their two main ancestries were assigned to populations (source 1 and source 2) speaking languages (language 1 and language 2) from different language families (family 1 and family 2), whereby one of these families (family 1 in this illustration) should be the family of the target language spoken by the target population. Targets were associated (gray arrows) with their now spoken language (“language 1” in the figure) and the source with the phylogenetic clade that best characterizes the language varieties (“language 2”) at the time of contact. To control for shared inheritance, we only sampled pairs where target and source are from different families (family 1 and family 2). (B) One hundred twenty-six target-source language pairs. Blue, target languages; orange, source clades (centered on one language for visualization purposes only); TLI and GBI, different linguistic datasets (see the main text). (C) Languages from which features were drawn colored by geohistorical area from AUTOTYP (fig. S1B for an alternative). W N America, Western North America; E N America, Eastern North America; C America, Central America; S America, South America; W and SW Eurasia, Western and Southwestern Eurasia; N-C Asia, Northern-Central Asia; S/SE Asia, South/Southeast Asia. (Graff et al. 2025)

When speakers of different languages are in contact, they often borrow not just words, but also sounds and grammatical patterns from one another. However, most languages of the world are not well documented, and have little to no historical record, making it hard to know whether similarities across languages result from borrowing or are merely coincidental. A new study published in Science Advances aims to estimate the degree of language contact in languages of the world by looking at the genetic histories of different populations instead. Using the AUTOTYP database for mapping linguistic features, the authors find that language pairs whose speaker populations underwent genetic admixture or that are located in the same geohistorical area exhibit higher incidences of shared linguistic patterns. However, the effect varies strongly depending on the specific linguistic feature. Their analysis also challenges previous research about the borrowability of certain features of language.

Reporting

Capturing language change through the genes
Throughout human history, there have been many instances where two populations came into contact—especially in the past few thousand years because of large-scale migrations as a consequence of conquests, colonialization, and, more recently, globalization. During these encounters, not only did populations exchange genetic material, but also cultural elements.
Our DNA holds the hidden history of human language
New research reveals how human DNA preserves the story of language contact, showing when and where languages converged, diverged, and evolved.

Original Research

  • Graff et al. 2025. Patterns of genetic admixture reveal similar rates of borrowing across diverse scenarios of language contact. Science Advances 11(35). DOI: https://doi.org/10.1126/sciadv.adv7521.

The genes of ancient skeletons reveal the origin of the Slavic people

Archaeologists attribute these simple pots, buried as grave offerings in the eighth and ninth centuries C.E., to early Slavs. Archaeological museum Zadar

A new study published in Nature uses DNA evidence to connect the modern Slavic people to a wave of migration following the fall of the Roman Empire.

Today, Slavic languages are spoken from the beaches of the Baltic Sea to Russia’s Pacific coastline. But where the Slavs came from—and how their languages spread out across thousands of kilometers in Eurasia—has long perplexed scholars. Did a small number of Slavic-speaking elites impose their languages and cultures on existing populations? Or did Slavs move in from the east, replacing the previous inhabitants of what are now Poland, Germany, Bohemia, and the Balkans following the Roman Empire’s collapse?

A study of hundreds of ancient genomes published today in Nature backs the second scenario, suggesting Europe’s “Slavicization” process was linked to bands of Slavic speakers migrating west in large numbers.

📃 This Week’s Reads

Interesting articles I’ve come across this week.

  • Why Brits add an /r/ to words that end in vowels (but only sometimes) (Upworthy)
  • Why do we say “um” so much? It’s not just a meaningless filler like you might think! (NPR)
Most Languages Are Not English
Research conducted in WEIRD countries shouldn’t be seen as generalizable, whether it’s in psychology or linguistics.
    • English is not very representative of what the world’s 7,000 languages are like, yet the majority of research in linguistics focuses on English.
Speaking of Words: Can We Reconstruct the First Language?
Given the many thousands of languages spoken today, and records of hundreds of extinct ones, do we have enough evidence to reconstruct “Proto-World,” the ancestor of all natural languages?
Dear Duolingo: Why do we have capital letters?
Discover the history of capital letters and why each language uses them differently.

📚 Books & Media

New (and old) books and media touching on language and linguistics.

A really short history of words

Amazon

Many linguaphiles are already familiar with Bill Bryson’s wonderful book Mother tongue: English and how it got that way. Now Bryson has published an illustrated companion to that book, written for kids!

🗃️ Resources

Maps, databases, lists, etc. on language and linguistics.

The Indo-European Lexicon

A diagram depicting the Indo-European family of languages. Drawn from The American Heritage® Dictionary of the English Language, Fourth Edition. Copyright © 2000 by Houghton Mifflin Company. Published by the Houghton Mifflin Company. All rights reserved.

The Linguistics Research Center at the University of Texas at Austin hosts the fantastic Indo-European Lexicon, a database of the words and affixes of Proto-Indo-European compiled by Jonathan Slocum. Each entry also shows all the descendants (reflexes) of the word in any of the modern languages that still have it.

Indo-European Lexicon: PIE Etyma and IE Reflexes

❓ How do you like your Linguistic Discovery articles?

I like writing long deep-dives. It’s a great way for me to collect everything about a topic in one place, which people can come back to and reference whenever they’d like. I’m also pretty sure y’all enjoy reading these more immersive longform articles (but please reply to this and let me know if this is true!).

I’ve usually posted these longer features as a single lengthy article, like so:

2,500 words:

What alien languages can teach us about human language: The linguistics of The Three-Body Problem
Imagine if every word you thought could be heard by everyone around you. In this world, thinking would be the same as communicating. What would language—and society—be like?

3,500 words:

From counting to cuneiform: How writing was invented
The earliest version of cuneiform wasn’t used to write language at all—it was used to count! And that Sumerian system of counting still influences our counting systems today. Here’s the story of Sumerian numerals.

3,500 words:

Did Kanzi the bonobo understand language?
Kanzi the bonobo, who learned language, made stone tools, and played Minecraft, dies at age 44. What can his remarkable linguistic abilities teach us about language?

6,000 words:

The etymology of ‘one’: From Proto-Indo-European to Modern English
There are over thirty English words that derive from the Proto-Indo-European word for ‘one’. This is the story of how they came to be, and what that story teaches us about how language works.

But recently I tried serializing a long piece as a 4-part series, in my review of the linguistics of The Iron Dreamers:

Why do languages change? The linguistics of “The Iron Dreamers”, Part 1
Could a language stay frozen in time?
Are we stuck with the same grammar for life? The linguistics of “The Iron Dreamers”, Part 2
How does your grammar change over the course of your lifetime?
Are some languages more complex than others? The linguistics of “The Iron Dreamers”, Part 3
Do languages get simpler over time? Could they get more complex?
How are new languages created? The linguistics of “The Iron Dreamers”, Part 4
Pidgins, creoles, and mixed languages

If I had published this series as a single article, it would have been 15,000 words! Similarly, I’m working on a long piece right now that will likely reach about 10,000 words.

My question for you is this: Would you rather read longform pieces like these as a single essay, published less frequently, or have them broken up into parts and delivered a week apart? Let me know in the poll below. Thanks!

💡
The Amazon links on this site are affiliate links, which means that I earn a small commission from Amazon for purchases made through them (at no extra cost to you).

If you’d like to support Linguistic Discovery, purchasing through these links is a great way to do so! I greatly appreciate your support!

Check out my entire Amazon storefront here.