Overview of Türkic genetics






Anatole A. Klyosov
Overview of Türkic genetics 
Journal of Russian Academy of DNA Genealogy (ISSN 1942-7484), 2010, Vol. 3, No 1, pp. 3 - 58
The principal mystery in the relationship of Indo-European and Türkic linguistic families, 
and an attempt to solve it with the help of DNA genealogy: reflections of a non-linguist

Editorial Introduction

Whoever reads this is blessed with the first real, groundbreaking treatise on Türkic genetics, one with a vision equipped with professional tools and not burdened by a load of preconceived notions and institutional restrictions. Even if the future establishes that not all of its hypotheses and explanations are correct, its principles and vision will act as a ratchet that does not allow a slide backwards. This is a new page in Turkology, and a new page in the Indo-European Urheimat business.

For a sequel on Bell Beaker development, turn to Proceedings of the Russian Academy of DNA Genealogy, vol. 4, No. 6, 2011, p. 1127

Posting clarifications and comments are given (in blue italics) and in blue boxes. Page numbers are shown at the beginning of the page in blue. Most of the references to the author's work can be found on the author's homepage. The adjective Türkic and the noun Türk are used to denote the global world of the Türkic community, which includes Turkish and Turks as one of its constituents; Türk is the noun of which Türkic is the adjectival derivative. The distinction is needed for translation from Russian, which has four distinct designations for four phenomena. The word Türkist, a mirror of the word Iranist, is used to designate a biased advocate of Türkic studies, as opposed to a Turkologist, whose specialty is impartial Türkic studies. The semantics of this terminology in English and Russian is a result of their national histories.



Based on the data of DNA genealogy, a concept was formulated and substantiated that in ancient times, until the middle of the 1st millennium BC, two linguistic fields, the Türkic (Proto-Türkic) and the Indo-European, the languages of the haplogroups R1b and R1a respectively, dominated by turns the whole of Eurasia as far as the Atlantic Ocean. With a time difference of 1-2 thousand years, people of these haplogroups migrated in opposite directions, mostly crossing the same territories, which confused present-day linguists and archaeologists and led to the fundamentally erroneous “Kurgan” and “Anatolian” theories of the “Indo-European homeland”.

The modern Uigurs, Kazakhs, Bashkirs, and some other peoples of Siberia, Central Asia and the Urals descend in part from the ancient R1b1 branch, and have retained the same haplogroup for 16,000 years. The “Türkic-lingual” haplogroup R1b expanded from South Siberia, where it formed 16,000 years ago, across the territories of the Middle Volga, Samara, Khvalynsk (on the middle course of the river Volga) and the Ancient Pit Grave (“Kurgan”) archaeological cultures and historical-cultural complexes (8-6 thousand years ago and later; the common ancestor of the ethnic Russians with haplogroup R1b1 lived 6,775 ± 830 years ago) and northern Kazakhstan (for example the Botai culture, dated by archaeologists to 5,700 - 5,100 years before present (BP), in reality much older). It passed through the Caucasus to Anatolia (6,000 ± 800 BP by the dating of the R1b1b2 haplogroup of the modern Caucasians), and through the Middle East (Lebanon, 5,300 ± 700 BP; the ancient ancestors of the modern Jews, 5,150 ± 620 BP) and Northern Africa (Berbers of the R1b haplogroup, 3,875 ± 670 BP), crossed over to the Iberian Peninsula (around 4,800-4,500 BP; present-day Basques 3,625 ± 370 BP), and went further on to the British Isles (in Ireland 3,800 ± 380 and 3,350 ± 360 BP for different populations) and to continental Europe (Flanders, 4,150 ± 500 BP; Sweden 4,225 ± 520 BP).

The path from the Pyrenees to continental Europe is the path and period of the Beaker Culture, whose bearers were the ancestors of the Proto-Celts and Proto-Italics.


In parallel, traces of the ancient R1b carriers are found in the Balkans (4,050 ± 890 BP), separately in Slovenia (4,050 ± 540 BP), and in Italy (4,125 ± 500 BP). That was the beginning of the time of the Türkic languages in Europe, and of the disappearance there of the European “Proto-Indo-European” haplogroup R1a1, which had populated Europe since the 10th millennium BC.

Conceptual map of Kurgan westward waves with datable genetic markers (arrows)
(background R.R.Sokal et al. 1992 and M. Gimbutas 1994)

The haplogroup R1a1 was practically saved by the fact that 4,800 years ago, at the beginning of the third millennium BC, its bearers moved from Europe to the Eastern European Plain and settled the territory from the Baltic to the Black Sea. By 4,500 BP they were already in the Caucasus, and by 3,600 BP in Anatolia (according to the haplotypes of the R1a1 haplogroup in modern Anatolia). Meanwhile, they migrated across the Eastern European Plain to the southern Urals, and around 4,000 BP on to southern Siberia. At that time they founded the Andronovo archaeological culture and colonized Central Asia (4,000 - 3,500 BP), and approximately 3,500 BP a part of them went to India and Iran as Aryans, bringing along the Aryan dialects, which effectively closed the linguistic link with the Aryan languages (R1a1) and led to the emergence of the Indo-European family of languages.

4,500-4,000 years ago R1a1 disappeared from Western and Central Europe. Europe became Türkic-speaking with the arrival of the people carrying the R1b haplogroup (the beginning of the 3rd millennium BC), and that lasted until the middle of the 1st millennium BC (3,000-2,500 years BP), when the haplogroup R1a1 re-populated Western and Central Europe and the Türkic languages were in turn replaced by the Indo-European languages. The layering of linguistic and haplogroup (in terms of DNA genealogy, tribal) strata in the Eastern European Plain, in the Near East, and in Europe has led to erroneous linguistic and archaeological concepts such as the “Indo-European Kurgan Culture”. That concept has the languages transposed (it postulates an “Indo-European” language where there was a Türkic one), the wrong direction of movement (the “Proto-Indo-European” was moving eastward, not westward; the Türkic was moving westward, and that westward movement was taken by the creators and supporters of the “Kurgan Culture” for the “Indo-European” movement, which was 180 degrees wrong), and the wrong periods (the Proto-Indo-European language advanced eastward across the Eastern European Plain in the 3rd millennium BC, while the ancient Pit Grave, or “Kurgan”, culture is mainly dated to the 4th-3rd millennia BC, and its bearers were moving westward).

Something similar happened with the “Anatolian theory”, where a separate (South Caucasian) branch of the Aryans' route, the southward movement of the R1a1 haplogroup carriers across the Eastern European Plain, was mistaken for the “Indo-European homeland” in Anatolia. That led to a conceptual distortion and a misunderstanding of the fundamental role of the Türkic languages in the Eastern European Plain (at least from 10,000 years ago), and in Europe, where it continued for two and a half thousand years (from the beginning of the 3rd millennium to the middle of the 1st millennium BC).



What the article calls the “Türkic” or “ancient Türkic” language is so called only because Turkologists call it Türkic. Analyzing the ancient texts (see below), they see specifically an agglutinative Türkic language and Türkic ethnonyms in Europe. It is possible that this is a misunderstanding, and that what they see is an agglutinative language of the ancient carriers of haplogroup R1b, which can be called “Erbin” (after R1b). It could be, but need not be, a basis, a ground, a substrate for the modern Türkic languages; it could just be a related, lateral branch of the ancient Türkic language. It could be the agglutinative language of the ancient Basques. Whether it was a Türkic language or not is a matter for linguists to decide. In any case, it does not affect the discourse and conclusions of the article. Those who find the term “Türkic language” unacceptable in this context (as a pre-IE language in Europe, employed by R1b bearers 4,500-2,500 years before present, and somewhat later) may substitute the term “Erbin” and read on.


For more than a hundred years the “Iranists”, or more commonly “Indo-Europeanists”, on one side and the Turkologists on the other have completely denied the contribution of the opponent's linguistic group to the Eurasian linguistic landscape in antiquity (from the beginning of our era and earlier), asserting that Europe and Asia had either a continuous “Indo-Iranian” substrate or, conversely, a continuous Türkic substrate. They do not compromise. Examples are given below.

And the explanation is quite simple. Both sides are right, each on its own half. The two major Eurasian haplogroups, R1a and R1b, diverged (or rather, formed and diverged) 20-16 thousand years ago and evolved linguistically from the common Nostratic languages into, respectively, the Pra-Aryan (later called “Proto-Indo-European”) and the Proto-Türkic, and then Türkic. And because the paths of the carriers of haplogroups R1a and R1b in Eurasia largely traversed the same territories, often with a gap of a millennium or two (R1a migrations are older in Europe, R1b migrations are older in Asia), they left “substrates” superimposed one on another and intertwined in many ways. Since the agglutinative Türkic languages are probably less subject to change over time than the flexive Indo-European languages, the Turkologists explain with ease almost all “Iranisms” from the Türkic languages. They find in the works of the Classical writers many examples of Türkisms, in proper names, in the names of objects, and in separate terms. The Iranists in response brush them aside and cite their own versions, according to which certainly no Türkisms existed in Eurasia during the past era, and even more so before that. Or they ignore them, or undertake repressive measures in science. Any Turkologist can cite many examples of that kind.


This article introduces the problem and aims to show that many thousands of years ago there existed both the Aryan, that is, Proto-Indo-European, languages and the Proto-Türkic (or Türkic) languages. They were simply carried by different tribes, the first by the tribe R1a1, the second by R1b1, and perhaps by the kindred tribes Q and N. This concept, naturally, awaits deeper linguistic studies. But the beginning, as can be seen, has been made.

The next section relays the story of the opposition between “Iranists” and “Türkists”. Actually, the opposition does not exist literally; it is rather a figure of speech. The two sides were too unequal to call it an “opposition”. But this figure of speech reflects the essence of the problem. Ever since the beginning of the 1950s, official historical science has postulated that the Scythians were “Iranian speaking”. The issue was not to be discussed any more. Any arguments and scientific evidence on the subject were either not acknowledged by official science (and that official science exists is beyond discussion) or met with dead silence for at least 60 years.

About confrontation between “Iranists” and “Türkists”. Solely quotes.

Yu.N. Drozdov “Türkic ethnonymy of ancient European peoples” (2008):
“... presents the results of ethnonymic studies of the ancient European tribes and peoples according to the ancient and early medieval written sources. It was established that the ethnonymy of these tribes and peoples was Türkic-lingual” (annotation for the book).

Ibid: “The results give reasons to believe that a vast majority of the European population from the ancient times to the 10th-12th centuries was Türkic-lingual”.

Ibid, p. 5: “The Classical and Early Medieval written sources in Greek, Latin and Arabic cite a large number of names for the ancient European tribes and peoples. Not a single name was encountered among them that could be derived from the Greek, Latin, or any other modern European language ... The linguistic analysis of the ancient European ethnonyms shows that all of them are distorted Türkic-lingual words”.


Ibid, p. 5-6: “As the results of the studies showed, neither the Hebrew nor the Greek language had any relation to the (Christian) terminology (two millennia ago). It also was entirely Türkic-lingual”.

Ibid, p. 8-9: “In accordance with the concept of modern historical science, all of these (Scythian) tribes are considered to be Iranian speaking (more accurately, Persian speaking). Moreover, this view has acquired the status of a static axiom ... (To the contrary) a number of scientists and experts have for quite a long time stated, with proof, that all Scythian and Sarmatian peoples were Türkic-lingual”.

V.I. Abayev “Ossetian Language and Folklore” (Moscow-Leningrad, 1949, p. 239): “... We have received a certain amount of positive, solid and indisputable results which cannot be moved by any future explorations and discoveries. These results characterize the Scythian language as an Iranian language, with features of peculiar and well-defined individuality”.

Yu.N. Drozdov, p. 9: “... Modern historical science adopted this conclusion of V.I. Abaev as an axiom, with the result that the ethnogenesis of all European nations does not find an intelligible and logical explanation”.

M.Z. Zakiev “Genesis of Türks and Tartars” (Moscow, 2003, pp. 139-140): “The theory of the exclusively Iranian linguality of all tribes united under the common name of the Scythians seemed plausible while the Iranists conducted etymological studies of the Scythian written monuments by selecting only the words (ethnonyms) with solely Iranian roots. However, the circle of research on these monuments was widening. The problem was also approached by non-Iranists, in particular Turkologists and other linguists. Words with non-Iranian roots, especially Türkic roots, were introduced into scientific circulation, indicating the presence of Türkic-lingual tribes in the union of the Scythian tribes ... The result is a vicious circle: archaeologists are guided by the opinion of linguists, the archaeological culture of the Scythian and Sarmatian period is attributed to the Iranian-speaking tribes, and the linguists-Iranists refer to the findings of the archaeologists for confirmation of their theory”.

M.Z. Zakiev, ibid: “Notably, all the Turkologists who reached the Scythian materials and studied them for themselves unequivocally recognize the Türkic-linguality of the main composition of the Scythians and Sarmatians, and prove it with linguistic, ethnological, mythological, and archaeological evidence”.


I.M. Miziev “History around us” (Nalchik, 1990, cit. per T.A. Mollaev “A new perspective to the history of the Ossetian people,” 2010, p. 6):

T.A. Mollaev, “A new perspective on the history of the Ossetian people” (2010, p. 6): “This table shows an irreconcilable difference between the ethnic passport of the Scythians, as represented in the archaeological materials, and that of the Indo-European peoples ... And also a complete equivalence between the corresponding characteristics of the medieval Türkic peoples and those of the Scythian nations in antiquity”.

T.A. Mollaev, ibid, p. 9: “The “Iranists” explained the Scythian words in this way: any anthroponym, ethnonym, etc. recorded by the ancient written sources was taken, and then a phonetically more or less suitable lexical unit was sought for it at random from Ossetian or another Iranian language, or even from other Indo languages. After that it was to be held that, as a result of that comparison, the Scythian word is translated as such and such from the Iranian languages. With that method, and with the same success, the lexical units of any other language in the world could be compared with a Scythian word, and then, given some phonetic resemblance, the Scythian words could be declared translations from those languages. Thus the initial absence of an appropriate scientific methodology, or more accurately the ignoring of methodology, allowed this theory a chance to appear and penetrate historiography. The founders of the theory were three very biased Indo-Europeanists of the 19th century (J.H. Klaproth, K.V. Müllenhoff, V.F. Miller). Using the identical method, and with some desire, any word can be etymologized in any language of the world”.

T.A. Mollaev, ibid, p. 11: “That would remain immaterial if their “scientific explorations”, or more accurately fakes, were not presented at an official level as solid scientific arguments. And after that many others were duped by the “Iranists”: specialists and regular folk alike would start to believe that the Scythian tribes (ancestors of the Türkic people) indeed spoke an Iranian language”.


D. Verkhoturov (cited per T.A. Mollayev, ibid, p. 15): “If one believes the Iranian theory, it follows that around the middle of the 1st millennium AD the Türks “left” the Altai, quickly captured and Türkified a huge “Iranian world”, and did it so well that no traces or fragments of the old world remained. Meanwhile, it is perfectly clear that the formation of such a vast Türkic world took millennia. There is an absolutely definite archaeological complex of the steppe peoples, first of all kurgan burials in timber graves, burials with a horse, etc., which in the archaeological materials of the Eurasian steppe zone clearly continues in the culture of the undeniably Türkic peoples. The beginning of that continuity goes back at least to the beginning of the 1st millennium BC”.

I.M. Miziev and K.T. Laypanov “On the origin of the Türkic peoples” (Nalchik, 1993, cit. per T.A.Mollayev, p. 20): “Into the “captivity” of the Iranist linguists fell the Scythologists B.N. Grakov, M.I. Artamonov, A.P. Smirnov, I.G. Aliev, V.Y. Murzin, and many other honest archaeologists, who know from archaeological and other materials that the Andronovans, Scythians, Sakas, Massagets, and Alans were not Iranians, “but since linguists proved their Iranian-linguality”, they are forced to recognize these tribes as Iranian-speaking”.

Yu.N. Drozdov, p. 10: “... despite the large number of works produced to demonstrate that the Scythian-Sarmatian people were Türkic-speaking, the conclusions of their authors have not yet been accepted by modern historical science. Perhaps their evidentiary base was not found to be convincing, or more likely these findings do not fit the commonly accepted historical concept”.

The books of Yu. Drozdov and T. Mollayev supply a wealth of material on the Türkic ethnonymy of the European and Eurasian tribes, nations, historical figures, and mythical characters; the material was collected by the authors themselves and by their predecessors. The quoting could extend to infinity, and the following are but a few examples. To conclude the series of descriptions of the mutual jabs between Iranists and Türkists over the Scythians, here is an example cited by both authors. Herodotus lists several legends about the origin of the Scythians. According to one of them, the ancestor of the Scythians was a man named Targitai, who had three sons, Lipoksai, Arpoksai and Kolaksai. Herodotus noted that Lipoksai was the ancestor of the Avhatai Scythians, that from the middle son Arpoksai descended the Katiars and Traspians, and that from the youngest, Kolaksai, descended the Paralats. “All together they are called Scolots, Greeks call them Scythians” (Herodotus).


All these names and ethnic terms were deciphered by the Turkologist M. Zakiev on the basis of the Türkic language (described in detail in Yu.N. Drozdov, p. 15), and T. Mollaev adds that the names of both father and son are listed in a long series of 13th century Türkic names in the annals of Rashid al-Din, for example Actai, Ashiktai, Gurushtai, Buruntai, Daritai, Oiratai, Kamtai, Kutai, Kutuktai, Kyahtai, Subektai, Tubtai, Uigurtai, Usutai etc. (Mollaev, p. 52). Indeed, since tai/sai/thai is “clan” in Türkic, the etymology is totally transparent for the tribal descent: they are clans from Dari, Oirat, Kithai, Kuyan, Suvar, Tuba, Uigur, Usun.

Yu.N. Drozdov (2008) systematically examines the ancient authors, and also examines in detail virtually all regions and known tribes of ancient Europe, and finds layers of Türkic-lingual ethnonymy everywhere: among the Scythians, Sarmatians, Goths, Huns, Avars, Enets and Venets, Sklavens, Antes, Vandals, Baltic tribes, in the Dnieper area and east of the Dnieper area, among the Germans, Scandinavians, Franks, Gauls and Celts, ancient Britons, inhabitants of the Apennines, the Khazars, Burtas, Bulgars, and other tribes of the Volga and Kama.

In the conclusion of the book, Yu.N. Drozdov wrote: “The ethnonymic analysis of the ancient European tribes and peoples, and also of their names and separate terms, according to the Greek, Latin and Arabo-Persian Classical and Early Medieval written sources demonstrated that they all were Türkic-lingual. That suggests that the population of Europe in the period under consideration was Türkic-lingual ... Currently, however, virtually all European nations speak different flexive languages that have nothing in common with the Türkic languages. Only in the extreme east of Europe have a few Kama, Volga and North Caucasian peoples preserved the ancient European Türkic language. So, at some time period, the bulk of the European nations switched from the Türkic to other languages, which presently are known as the European languages” (p. 352).

Further on: “It seems that the languages of the Türkic linguistic group were spread throughout Eurasia (and, it seems, not only there) from a very distant period in time, beyond the historical memory of modern humanity” (ibid.).

And further on: “A careful analysis of available written sources in order to identify any evidence that would allow, at least in a first approximation, to understand when, how and from where the European nations received new flexive languages did not produce any results so far” (p. 353).

Further, Yu.N. Drozdov estimates that the period of change from the agglutinative Türkic to the flexive Indo-European languages in Europe, namely to French, Germanic, Danish, and Slavic falls in the period between the 9th and 13th centuries AD (p. 357). But a significant number of the Türkisms remain, though they are phonetically deformed under the influence of the modern languages (pp. 357-358).


More on the confrontation between “Iranists” and “Türkists”. A new perspective on resolving the conflict. The emergence of the flexive Aryan and agglutinative Türkic languages in Asia and Europe

As noted above, the continuing de facto opposition between Iranists and Türkists has already crossed over into the 21st century, and it leads to obvious excesses on both sides. Just as the Iranists do not give an inch of ancient Eurasia to the Türkic languages (see the previous section), so the Turkologists, in this case exemplified by Yu. Drozdov, do not see the Indo-European languages in Europe at that same time and later, including virtually the whole first millennium of our era, except for Greek and Latin (from the middle of the last millennium BC [p. 352] or the last centuries BC [p. 356]), although, as Yu. Drozdov pointed out, “to ascertain the carriers (of the Latin) was impossible so far” (p. 352).

It naturally does not happen that both sides are so mistaken. This article attempts to show that both sides are correct, each on its own half. As the famous saying, attributed variously to A. Einstein or I. Newton, states: “Nature is cunning, but not malicious.” And here nature has played a cunning joke on the linguists. It seems that the two Caucasoid brotherly lines, R1a1 and R1b1, which came to the Eastern European Plain about 50-40 thousand years ago as a single branch of R (or, rather, as its upstream haplogroup P, or even NOP), then went to Southern Siberia at least 35,000 years ago and dispersed over time and territory, as relayed below, gave rise to two languages.

One of them was a flexive Aryan language (the language of the R1a1 tribe), which later came to be called Proto-Indo-European, and the other was an agglutinative Proto-Türkic language (the language of the R1b1 tribe). Both tribes gestated in Southern Siberia.

Tribe R1b1, a carrier of the agglutinating, ancient Türkic languages.
The path from Asia to Europe, with the arrival at the turn of the 4th-3rd millenniums BC

The modern Uigurs, Kazakhs, Bashkirs, and some other peoples of Siberia, Central Asia, and the Urals, descend in part from the ancient R1b1 branch, and by now retain the same haplogroup for 16,000 years.

That tribe historically was moving from east to west, certainly leaving their descendants along the path.


These are the peoples of Siberia, the Volga, the Kama, and Central Asia; the ancient peoples of the Middle Volga, Samara, Khvalynsk, and the ancient Pit Grave or “Kurgan” archaeological cultures and cultural and cultural-historical communities; some Caucasian peoples that partially retained the haplogroup R1b1, which by 6,000 years ago had become the haplogroup R1b1b2 (mutation M269 and L23 or L49 in the modern nomenclature); and the peoples of Turkey and the Middle East, whose populations have retained much of the same haplogroup R1b1 in their DNA, see the table below (Abu-Amero et al, 2009).

Country (region).........................Proportion of R1b1b2, %
United Arab Emirates.....................................3.7
Saudi Arabia.................................................1.9

Notably, among the “indigenous” Caucasians, the descendants of the ancient tribes of haplogroup R1b, the haplogroup R1b1b2 (mutation M269) or R1b1b2a (mutation L23) shows no further fragmentation into the downstream subgroups that are typical for Western and Central Europeans. In other words, the fragmentation of course exists, but no one has yet studied it. It went another way, forming other “downstream” mutations that have not yet been identified. The Europeans, meanwhile, threw large forces of specialists into identifying the subsequent “European” mutations, and their R1b1b2a, in its subsequent move from the Caucasus to Europe, already has more than 70 subgroups (more accurately, 76 as of the middle of 2010, and the list is growing by at least a dozen a year). In an abridged version, without lateral branches, the subsequent, post-Caucasian development of the subgroups looks as follows:

R1b1b2a (L23) => R1b1b2a1 (L51) => R1b1b2a1a (L11) => R1b1b2a1a2 (P312) => R1b1b2a1a2d (U152) => R1b1b2a1a2d3 (L2) => R1b1b2a1a2d3a (L20),

in parallel R1b1b2a1a (L11) => R1b1b2a1a1 (U106) => R1b1b2a1a1a (U198), R1b1b2a1a1c (L1), R1b1b2a1a1d (L48) => R1b1b2a1a1d1 (L47) => R1b1b2a1a1d1a (L44) => R1b1b2a1a1d1a1 (L164),

and in parallel down to R1b1b2a1a2f (L21) => R1b1b2a1a2f2 (M222), R1b1b2a1a2f3 (P66), R1b1b2a1a2f4 (L226), R1b1b2a1a2f5 (L193).
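The abridged chains above can be held in a small lookup structure. The following is a minimal sketch, not the author's method: the names `SUBCLADE_SNP`, `parent_of` and `snp_path` are hypothetical, and the rule that a parent label is the child label minus its last character is an assumption read off the labels in the chains (the L20 label is spelled here so that it follows that rule).

```python
# Subclade labels and defining mutations transcribed from the chains above
# (2010 nomenclature as used in the text).
SUBCLADE_SNP = {
    "R1b1b2a": "L23",
    "R1b1b2a1": "L51",
    "R1b1b2a1a": "L11",
    "R1b1b2a1a2": "P312",
    "R1b1b2a1a2d": "U152",
    "R1b1b2a1a2d3": "L2",
    "R1b1b2a1a2d3a": "L20",
    "R1b1b2a1a1": "U106",
    "R1b1b2a1a1a": "U198",
    "R1b1b2a1a1c": "L1",
    "R1b1b2a1a1d": "L48",
    "R1b1b2a1a1d1": "L47",
    "R1b1b2a1a1d1a": "L44",
    "R1b1b2a1a1d1a1": "L164",
    "R1b1b2a1a2f": "L21",
    "R1b1b2a1a2f2": "M222",
    "R1b1b2a1a2f3": "P66",
    "R1b1b2a1a2f4": "L226",
    "R1b1b2a1a2f5": "L193",
}


def parent_of(label):
    """Return the parent label, or None if it is not in the table.

    ASSUMPTION: a parent label is the child label without its last character.
    """
    candidate = label[:-1]
    return candidate if candidate in SUBCLADE_SNP else None


def snp_path(label):
    """List the defining mutations from the most upstream listed subclade down to `label`."""
    path = []
    current = label
    while current is not None:
        path.append(SUBCLADE_SNP[current])
        current = parent_of(current)
    return list(reversed(path))


if __name__ == "__main__":
    print(" => ".join(snp_path("R1b1b2a1a2d3a")))
    print(" => ".join(snp_path("R1b1b2a1a2f2")))
```

Deriving the parent from the label keeps the table flat; if the nomenclature is revised, an explicit parent column would be the safer design.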


That, in turn, produces clear “markers” that allow tracing the migration of the R1b carriers throughout the whole of Europe, to its most remote corners.

Returning to the ancient migrations of the haplogroup R1b in Asia: eastward of Iran the share of R1b1b2 noticeably falls (8%); in Pakistan it is 2.8%, alongside 4.6% of the ancient Asian line R1b1b1. It can be said that the later DNA from the Caucasus was superimposed on the older DNA from South Siberia, and literally displaced it. Very little of that ancient Asian line is found in Anatolia, only 0.8% (Abu-Amero et al, 2009), as also elsewhere outside Central Asia. Among the Uigurs, however, there are 22% of R1a1 and 18% of the combined R1b, most of it R1b1b1. In other words, toward the west the share of the ancient R1b1b1 line falls, and that of R1b1b2 grows. But the language continued advancing to the west; the people were basically the same, only mutations were added to the DNA, and the haplogroup kept fragmenting, dividing into subgroups. The people naturally did not know of that, and continued to speak the Türkic language, which, naturally, was changing in accordance with the laws of linguistic dynamics.

From Anatolia, which the carriers of R1b1b2, together with their agglutinative language, reached 6,000 ± 800 years ago (Klyosov, 2008a, b), they continued moving westward toward Europe by two routes. One route went through the Balkans, where the haplogroup R1b1b2 is recorded at about 4,000 years ago (a formal calculation gives 4,050 ± 890 years ago). In Sardinia it dates from 5,025 ± 630 years ago, in Sicily from 4,550 ± 1,020 years ago, in Italy from 4,125 ± 500 years ago, and in Slovenia from 4,250 ± 600 years ago. The other route went through the Middle East (the common ancestor of the modern carriers of the haplogroup R1b1b2 in Lebanon dates back to 5,300 ± 700 years ago, among the modern Jews to 5,150 ± 620 years ago), then across North Africa (Algerian Berbers, 3,875 ± 670 years ago) to the Atlantic Ocean and on to the Iberian Peninsula (3,750 ± 380 years ago), and further on to Europe (Klyosov, 2009a).
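The datings quoted for the two routes are easier to compare once the margins are laid out as ranges. The sketch below is illustrative only: it reads each “±” figure as a plain lower and upper bound, which is an assumption (the text does not state what confidence level the margins represent), and the names `Estimate`, `interval` and `overlaps` are hypothetical.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    place: str
    years_ago: int  # central estimate quoted in the text
    margin: int     # the "±" figure quoted in the text


# Values transcribed from the paragraph above.
ESTIMATES = [
    Estimate("Anatolia", 6000, 800),
    Estimate("Balkans", 4050, 890),
    Estimate("Sardinia", 5025, 630),
    Estimate("Sicily", 4550, 1020),
    Estimate("Italy", 4125, 500),
    Estimate("Slovenia", 4250, 600),
    Estimate("Lebanon", 5300, 700),
    Estimate("Modern Jews", 5150, 620),
    Estimate("Algerian Berbers", 3875, 670),
    Estimate("Iberian Peninsula", 3750, 380),
]

BY_PLACE = {e.place: e for e in ESTIMATES}


def interval(e):
    """Return (youngest, oldest) bounds, ASSUMING the margin is a simple +/- range."""
    return (e.years_ago - e.margin, e.years_ago + e.margin)


def overlaps(a, b):
    """True if the two ranges share at least one year."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


if __name__ == "__main__":
    # Print the estimates from oldest to youngest central value.
    for e in sorted(ESTIMATES, key=lambda x: x.years_ago, reverse=True):
        lo, hi = interval(e)
        print(f"{e.place:<18} {e.years_ago:>5} ± {e.margin:<5} ({lo}-{hi} years ago)")
```

Overlapping ranges mean only that the quoted margins do not separate two datings; they say nothing about the direction of migration.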

It is very likely that carriers of R1b1b2 reached Iberia 4,800-4,500 years before present, but then passed through a “population bottleneck” and reappeared (through a few surviving DNA lineages) 3,750 ± 380 years ago. This is when the common ancestor of the present-day Basques lived.

Approximately 3,600 years ago that haplogroup is noted in the British Isles. This is the movement of the Beaker culture, from the Iberian Peninsula to the British Isles and onto the European continent. Overall, the peopling of Europe by the carriers of the haplogroup R1b1b2, who were speaking the ancient Türkic languages, occurred between 4,500 and 3,600 years ago. They are the ancestors of the Proto-Celts and Proto-Italics, and, probably, Proto-Picts and other “Proto”-R1b1b2 peoples in Europe.

Tribe R1a1, carrier of the flexive, Aryan languages.
The path from Asia to Europe and back east in the 3rd-2nd millenniums BC

At that time, between 4,200 and 3,600 years ago, Europe was populated (although with low density in those times) by the carriers of the haplogroups R1a1 and I1/I2. Both tribes came to Europe much earlier than R1b1b2. The ancient language of I1/I2 is still unknown. It cannot be excluded that they were the ancient bearers of the “Proto-Indo-European” language, which was acquired from them by R1a1 in Europe some 12-8 thousand years ago. It is completely unknown. All we know is that R1a1 had started to spread around Europe at about 6,000 years BP, and came to the Eastern European Plain around 4,800 years BP. They brought the “Proto-Indo-European” language, their own or acquired. Other carriers of the R1a1 haplogroup lived at the same time in South Siberia and adjacent regions, such as Northwest China, having at least 6,900 years to their common ancestor (Klyosov, 2010), and spoke Altaian languages, as they do nowadays. In this study we assume that R1a1 spoke Proto-IE languages from the beginning of the R1a1 tribe 21,000 ± 3,000 years before present, in South Siberia; otherwise we are driven to complete uncertainty. Future studies will show whether the Proto-IE flexive languages were the original languages of R1a1 or acquired languages, and if so, from whom. That would not, however, change the concept of R1b as the bearers of an agglutinative Türkic language on their way to and in Europe.


The R1a1 tribe, as stated above, arose in South Siberia. It arose in the sense that its carriers acquired a mutation in their Y-chromosome DNA, and after many generations those who had that mutation survived. It was pure coincidence; the mutation M17 that defined the haplogroup R1a1 provided no advantage for survival. The carriers of another mutation could have survived, and then we would now state that the carriers of a mutation xyz survived. But they were descendants of the same R1 tribe, and spoke the same language of the tribe. The haplogroup R1a1 appeared about 21,000 years ago (Klyosov, 2009a); in any case there is no other data. That was 4,000 years before the appearance of the R1b1 variety, and by that time the carriers of the R1a1 mutation M17 had quite possibly already left those places. Four thousand years is a huge period, within which it is conceivable to move to a new territory. We do not know the reason why the common language of R and then R1 diverged into the flexive (R1a1) and agglutinating (R1b1) branches. It might have happened 20-16 thousand years BP, or 12-10 thousand years BP (see above). In the last case it does not matter for the purpose of this study.

The base for both languages was Nostratic, or the Boreal language, or the language called Babylon; many names can be invented, and it will not change anything. The fact is that about 35-25 thousand years ago, during the lifetime of the parental haplogroup P with its common language, the haplogroups Q and R formed, and initially they also had to have that common language. A number of the Siberian peoples belong to the haplogroup Q, along with a number of Mongolic tribes and a large proportion of the American Indians. Hence, if the haplogroups of the Q and R families initially had an agglutinative language, it must be found among the Siberians and American Indians, albeit to a lesser extent. Indeed, the overwhelming majority of the American languages are in fact agglutinative; see (Wikipedia Category:Agglutinative languages).

But the haplogroup R1a1 is unique in that it formed and developed a flexive language, which became the language of that haplogroup, a Proto-Indo-European Aryan language (with some possible reservations mentioned above).

The path of the haplogroup R1a1 to Europe remains unknown, but from the data of DNA genealogy we know that the haplogroup appeared in Europe about 12,000 years ago, immediately after the melting of the glaciers. By all indications, it was located in the Balkans. That is also indicated by the linguistic data about the landscape of the Indo-European “homeland”, although this term is inherently flawed. As we observe, it was not an “ancestral home”, and they were not “Indo-Europeans” but the haplogroup R1a1; at that time they were Pra-Indo-Europeans. As will be seen later, that tribe migrated to India and Iran in the middle of the second millennium BC under the name of the Aryans, bringing along their Aryan flexive language. From that period the language acquired the status of “Indo-European”. Prior to that it was the Aryan language, a language of the haplogroup R1a1.


So, the carriers of the haplogroup R1a1 remained in the Balkans from about 12,000 years ago, and quite possibly were populating Europe. They could have had trade and other relations with southern Europe, including Anatolia and Asia Minor in general, and Greece, and could have started, alone or in association with the people of the haplogroup I, what were later named the Balkan archaeological cultures. The earliest datings of these cultures are about 8-9 thousand years ago (7th-6th millennia BC), which does not contradict the DNA genealogy dating of the haplogroup R1a1 in Europe to 12,000 years ago.

The archaeology testifies about detected material traits, and not about the time of the arrival there of an ancient tribe. The gap of 3-4 thousand years that separate the appearance in the Balkans of the R1a1 tribe from the dates of the Starcevo, Keresh, and then Tripolie-Cucuteni cultures is quite reasonable.

So, what linguistic landscape did Eurasia have 6 thousand years ago (4th millennium BC)? Answering that question requires rolling back to more ancient times and outlining, inevitably in general terms, the state of the world in respect to humanity over the previous 60-50 thousand years according to modern DNA genealogy.

State of the world in respect to humanity during the period from 60 to 6 thousand years ago according to DNA genealogy

To get started, picture the global canvas. The anatomically modern man, Homo sapiens, appeared, according to modern science (although this estimate is extremely controversial), between 200 and 35 thousand years ago. Modern science does not yet provide better precision. Many researchers agree that the date can be narrowed to a range from 160 to 100 thousand years ago; some view it as between 200 and 160 thousand years ago.

Many, however, argue that it was still an archanthrope rather than a modern man. In other words, it was an advanced Neanderthal, or a creature at the same stage of development, both anatomically and mentally. To become a sapiens man required a mental separation of oneself from the outside world, an ability to see oneself from the outside. That is why anthropologists and archaeologists attach so much importance to the ancient ornaments of primitive people, the decorations they left behind. The ornaments represent a materialized desire to be attractive, which requires an ability to see oneself from the outside. Even the fact that the transition from the Neanderthals to humans was accompanied by a broader range of food, from meat to fish and shellfish, which the Neanderthals did not have, demonstrates that man became more flexible in his means of subsistence. It was also accompanied by anatomical changes. So the mentality and the anatomy (increased brain size, volume of the skull) were changing simultaneously.


Some linguists assert that speech was inherent to the “people” a million years ago or even earlier, but that reflects the emotions of those linguists rather than any knowledge of the structure of the archanthrope's vocal apparatus, which is not known. We have no information on the degree to which, a million years ago, the vocal cords of the “people” were supplied with musculature to tension them. The apes do not have those muscles, or they are insufficiently developed, so apes are not able to modulate sound to a degree that could be called speech. Probably, the development of that anatomical feature also accompanied the mental development of primitive man, in parallel with the development of the brain and the enlarged cranium.

The first genera of a man were the African-type haplogroups A and B. According to various sources, the haplogroup A appeared 80-60 thousand years ago. Relatively few members of that haplogroup remained in Africa, particularly in Ethiopia and Sudan, and among the populations with click languages. Populations of that haplogroup are scattered in “spots” across the continent. It seems that that is all that remains of the oldest haplogroup A. The haplogroup B formed about 50,000 years ago from the combined (in those days) haplogroup BT, which appeared 55,000 years ago. On some accounts both haplogroups, A and B, are of the same “age”. The territory and frequency of the haplogroup B is approximately the same as that of the haplogroup A, with addition of the Central African Pygmies and South African Khoisans (haplogroup B2b). The carriers of the haplogroup B2a largely speak Bantu languages.

From the haplogroup B separated a combined haplogroup CF, which migrated out of Africa. This happened in the interval of 55-30 thousand years ago. 50,000 years ago from that combined haplogroup formed a haplogroup C, its carriers migrated to the east, a part of its tail remained in the south of the Arabian Peninsula, and the others through Pakistan and India, Sri Lanka, and across the rest of the Southeast Asia went to Australia. The subgroups of that haplogroup are observed in Japan (C1), in Polynesia, in Melanesia and Papua New Guinea (C2), in South-East and Central Asia (C3), and among the aborigines of Australia (almost exclusively C4).


50 thousand years ago from a combined haplogroup CF formed a combined haplogroup DE, which in turn formed a haplogroup E, which spread over the North Africa and Europe, and D, which migrated to India and then across Asia. The carriers of D1 live in Tibet, Mongolia, Central Asia, Southeast Asia, D2 is found almost exclusively in Japan. The haplogroup E apparently appeared in the North-East Africa, but the Middle Eastern region can't be eliminated, from where it could reach Africa. The fact that haplogroup F in Africa practically does not exist, but more than 90% of the people on the Earth have haplogroups descending from the F, may mean that it already formed outside of the continent, or came from the Africa with a small group of people.

The haplogroup G, which supposedly was formed 30,000 years ago in the Mesopotamia, is mainly observed in the Caucasus, Iran, Middle East and in the Mediterranean, but there is hardly any of it in the northern Europe, less than 2% of the population. In the southern Europe it stands at 8 - 10% of the total composition of Spain, Italy, Greece and Turkey. The haplogroup H was formed from F at about 40 - 30 thousand years ago, presumably in India, where it mostly stayed. That haplogroup came to Europe with Gypsies as a subgroup H1.

The combined haplogroup IJK, formed from F in the Middle East 45,000 years ago, first split into a combined haplogroup IJ and a separate haplogroup K; IJ then split into I and J, which spread in the Middle East, the Mediterranean, and further on into Europe. The haplogroup I apparently arrived in Europe first, from the Eastern European Plain, to which it had migrated from Mesopotamia through the Caucasus mountains or bypassing them.

The suggestion that the haplogroup I, apparently together with the haplogroup R or its upstream haplogroup(s), arrived in the Eastern European Plain about 50-45 thousand years ago is justified by the fact that the Eastern European Plain has a large number of archaeological sites belonging to that time. They cannot be explained by any other haplogroups. Further, the carriers of I, R1a1, and R1b1 are Europoids, and it is difficult to envisage that they became Europoids in parallel at different ends of Eurasia. Finally, Caucasoid camps dated to 45-30 thousand years ago were found in Europe, and camps dated to 24 thousand years ago were found in Asia (Baikal region), while no haplogroup I was detected in Siberia. Moreover, the haplogroup I is the oldest haplogroup in Europe; it certainly appeared there more than 30 thousand years ago, and came from the north-east. Generally, the whole complex of evidence suggests that, from the Eastern European Plain, the haplogroup I populated Europe and the haplogroup R1 populated Asia (along with other, Mongoloid haplogroups).

Now, the haplogroup I (composed of two main subgroups, I2 [“Balkan”, which should be called “haplotype of the Eastern European Plain”], and I1 [“Baltic” or “Nordic”]) covers approximately 20% of Europeans, being a second largest after the haplogroup R1b1.


The designations for these haplogroups are again conditional, and are given here just for orientation, because these territories contain the greatest proportion of those haplogroups. Outside Europe, the haplogroups I1 and I2 are practically absent (there are relatively few of their bearers in the Middle East, but with rather recent common ancestors).

The haplogroup J1 is observed mainly among the Arabs and Jews, whose genealogical lines diverged about 4,000 years ago, in a curious agreement with what is outlined in the Bible and its interpretations. Ironically, the Jews and Arabs, including the Palestinian Arabs, largely share not only the haplogroup J, but also the subgroup J1. They are close DNA-genealogical relatives. The haplogroup J2 is observed among the inhabitants of the Mediterranean, the Greeks, Italians, and many Jews, who are immigrants from the Middle East. It is numerous in India.

The combined haplogroup NOP formed from the haplogroup K approximately 40-35 thousand years ago east of the Aral Sea (this is one of the three major versions), then divided into N, which settled Siberia and territories to the south and north; O, which migrated via India to South Asia; and P, which went to southern Siberia and divided into Q and R. The same haplogroup K produced the haplogroups L and M. The former occurs mainly in India and Sri Lanka (as a subclade L1) and Pakistan (L3); the haplogroup M is mainly located in Papua New Guinea, where it accounts for a third to two thirds of the M haplogroup of the entire planet. Another version is that the combined haplogroup NOP left Mesopotamia eastward along the Iranian plateau, turned south at the forbidding Pamir, Himalaya, Tien Shan, and Hindu Kush mountains, and passed along the Indian Ocean to Southeast Asia. Neither hypothesis has any evidence in its favor. The possibility that the haplogroups NO and R migrated separately is also not ruled out, with R going from Mesopotamia to the Eastern European Plain along with the haplogroup I; apparently only that can explain the Caucasoidness of the haplogroups I and R, in contrast to the non-Caucasoid N and O (e.g., the Sakha/Yakut and the Chinese-Korean-Japanese, respectively). A rather exotic suggestion could also be offered: that the whole NOP haplogroup was “Caucasoid”, and that the NO bearers were later turned into the “Mongoloid” and/or “Oriental” races by local women. It is women who largely make human races. Men cannot pass their genes directly to men.

The haplogroup Q occurs largely among the Siberian peoples, and among the American Indians, including the descendants of the Mayan tribes. The abundance of this haplogroup among the Ashkenazi Jews is attributed to the Khazar time, since the common ancestor of this haplogroup among the Jews is not more than a thousand years old.

The haplogroup R produced three most known haplogroups, R1a1, R1b1 and R2. The R1a1 is most frequent in Russia (average 48%, and in the southern regions of Belgorod and Orel provinces and adjacent regions 62% of the total population), and in the Eastern Europe (Poland, Ukraine, Belarus, approximately the same proportion in the populations and up to 57%); in Central Europe and Scandinavia it numbers about 15 - 20%. In the Atlantic region it is almost absent, sometimes at the level of a few percentage points.


So, leaving from Africa about 60,000 years ago, the carriers of almost all haplogroups that formed in the course of migration migrated to the Middle East, stretching from the southern to the northern Mesopotamia, i.e. to the southern Caucasus foothills, to the south of the Caspian Sea, and to the west of the Iranian Plateau, except for the haplogroups A and B that remained in Africa, and the haplogroup C that trekked along the Indian Ocean to the Australia and Oceania, and partially to the north to the South-East Asia. That was about 55 - 50 - 45 thousand years ago. That was the areal of the Nostratic, or Babylon language. Its echoes would get to almost all non-African languages of the world.

As was described above, the haplogroups I and R, which in the end became Caucasoid, went north to the Eastern European Plain, where about 45-40 thousand years ago the haplogroup I partially left to Europe, and its carriers became Cro-Magnon men, Gravettians, and other first people in Europe. Their language was the same as that of the haplogroup R, i.e. with a trend to the future flexive language of the haplogroup R1a1, or to the future agglutinative language of the haplogroup R1b1 and possibly of the haplogroup Q (the future Siberian peoples and American Indians). In any case, both languages in 35 thousand years will end up in Europe.

What linguistic landscape was in Eurasia 6 thousand years ago (4th millennium BC), and the next 2,000 years?

So, returning to the subject of our study: by 6 thousand years ago the carriers of the haplogroup I, divided into the two main subgroups I1 and I2, had lived in Europe for more than 30 thousand years. They hardly ventured beyond the European continent. What language they used is unknown, but it is possible that the Basque language is an ancient language of the carriers of the haplogroup I. The Basque language is known to be non-Indo-European. Currently, it is considered to be an unclassified agglutinative language. If it turns out not to be Pra-Türkic, it is likely that it was the language of the ancient carriers of the haplogroup I. But if an unbiased study finds in it elements of the Türkic agglutinating languages, then it is the language of the ancient R1b1b2.


The carriers of the haplogroup R1b1b2, as mentioned above, arrived in the Iberian Peninsula 4,800-4,500 years ago, having come via the Caucasus, where they lived 6,000 years ago, and passed through a severe population bottleneck (for the Basques 3,625 ± 370 years ago; among the Basques R1b1b2 numbers 93%, Adams et al, 2008). In this connection it is important that some linguists attribute the language of the Basques to the Sino-Caucasian linguistic macrofamily, which includes the Caucasian, Tibetan, Yenisei, Chinese and Burushaski languages. Here we definitely see a reflection of the path of the haplogroup R1b in ancient times, from southern Siberia (Yenisei and Chinese languages) across the Caucasus (6,000 years ago) to the Pyrenees (Basques). So the conjecture that the Basque language is the ancient language of the haplogroup R1b is not devoid of a connection with the linguists' classification. Moreover, the Basque language has the same vigesimal (base-20) numeral system as the Caucasian languages, and has common elements with the Semito-Hamitic world, as well as with the Sumerian and Hurro-Urartian languages (private communication of I. Byzov). Such are the path and the environs of the haplogroup R1b on its way to Europe.

The carriers of the haplogroup R1a1 are Aryans, considering that it was they who came to India and Iran about 3,500 years ago. In the 4th millennium BC they began spreading across Europe, and 4,750 ± 500 years ago came to the Eastern European Plain. In the next few centuries they spread from the Baltic to the Caucasus; about 4,500 years ago they were recorded in the Caucasus, and about 3,600 years ago they were already in Anatolia. This is consistent with the linguistic and archaeological results, and documentary evidence. In no way can Anatolia be considered a “homeland” of the Indo-European language, not only because the notion of a “homeland” in this context is totally wrong, but also because Anatolia and the surrounding regions were among the territories through which the Aryans passed during their colonization and settling of Eurasia. From the Anatolian side, it is unlikely that the Aryans advanced far to the east, and in any case not to India and not to the eastern Iranian Plateau. These were local places of the Aryan (haplogroup R1a1) layover, and the Aryans eventually came there around 3,500 years ago from the north.

The phrase above, “consistent with the linguistic and archaeological results, and documentary evidence”, should not be understood to mean that the picture is consistent with the modern interpretation of these results by linguists and archaeologists. It is the author's synthesis of the archaeological and linguistic findings, often dispersed, and their reconciliation with the findings of DNA genealogy. Modern archaeology, as is known, has over the past decades not been inclined to consider migrations of ancient people, and its methodological arsenal is not well suited to the study of migrations. Its classic slogan, known to every archaeologist, is “The pots are not people” (read: “and we are doing the pots”). In that paradigm, cultural and material traits are passed along a “chain”, and not necessarily transmitted by migrations. But it is precisely the migrations that are visible in DNA genealogy, because the marker, in the form of DNA with certain mutation characteristics, is detected in different parts of Eurasia, together with the dating that follows from the accompanying mutations. That is, the place is detected directly, and the dating is calculated from the mutations. From that the picture of the migrations presented in this article is derived.


Archaeologists usually assert that no migrations to the east happened in those days, in the 3rd millennium BC, because the written sources did not record them and archaeology has no record of them. Let us glance at the following specific experimental evidence.

By 4,000 years ago, the carriers of the haplogroup R1a1 had already established the Andronovo archaeological culture and reached the southern Urals. Archaeological excavations in the south of the Krasnoyarsk region revealed that bone remains dating from 3,800-3,400 years ago have the characteristic mutations of the haplogroup R1a1 (Keyser et al, 2009). Moreover, the haplotypes of these remains fit easily into the haplotype tree for the modern ethnic Russians from the Ivanovo, Penza, Tver, Lipetsk, Novgorod and Ryazan regions. In other words, these remains and the modern ethnic Russians had one and the same common ancestor, who lived, as we already know, about 4,800 years ago.

Approximately 3,600 years ago a part of the Aryans (haplogroup R1a1) left the southern Ural Mountains and moved to India. At about the same time, the Aryans from Middle Asia, where they had lived for at least five hundred years, moved to Iran. The common ancestors of the Indians and Iranians with the haplogroup R1a1 lived 4,050 and 4,025 years ago respectively (Klyosov, 2009b), which is 800 years “younger” than the common ancestor of the modern ethnic Russians with the haplogroup R1a1. The haplotypes of the modern Eastern Slavs (haplogroup R1a1) are almost identical with the haplotypes of the Indians and Iranians, up to the 67-marker haplotypes, i.e. to the maximum resolution of modern DNA genealogy. In other words, the coincidence is almost absolute. On that basis it should be asserted that the Aryans of the 2nd millennium BC, the bearers of the haplogroup R1a1, are without a doubt descendants of the same ancestors as the modern ethnic Russians. At present, no fewer than 100 million men living in India are descendants of the Aryans from the Eastern European Plain, and before that from the Balkans. Up to 72% of the higher castes in India belong to the haplogroup R1a1 (Sharma et al, 2009). In terms of their DNA genealogy, they are the same people, close relatives.

These ancestors of the modern Russians, as well as of many modern Ukrainians, Belarusians, Lithuanians, Estonians, Tajiks, and Kyrgyz, i.e. the carriers of the haplogroup R1a1, brought to India and Iran their Aryan flexive language, which formed the linguistic link between Europe and India-Iran and marked the beginning of a new linguistic family, the Indo-European languages.


The following is an ancestral (“base”) 25-marker haplotype of ethnic Russians of R1a1 haplogroup:


Each number, a so-called allele, shows how many times a certain nucleotide sequence repeats itself in the male Y-chromosome. A 25-marker haplotype shows 25 of those sequences arranged in a certain standard order. Every male has a sequence of this kind, but his own. It is a part of his individual “DNA passport”. Those sequences coalesce to the haplotype of a common ancestor of a certain population, and the longer ago the common ancestor lived, the more mutations have accumulated among his descendants. Collectively, the number of those mutations can be translated into the time to the common ancestor (Klyosov, 2009a,b). The 25-marker haplotypes of Russian/Ukrainian individuals contain on average 0.292 ± 0.010 mutations per marker, which translates to approximately 4,750 years to their common ancestor (ibid.).

Among the Indians of the “Indo-European” group, the ancestral haplotype to which the individual R1a1 Indian haplotypes coalesce, is as follows (


These two ancestral haplotypes differ only in the 12th and 13th alleles from the left (marked in bold in the above haplotype, in the so-called markers DYS389-2 and DYS458), which are equal to 30 / 30.58 and 15.28 / 16.05 in the Russian and Indian ancestral haplotypes respectively. That is, the difference between them is 0.58 + 0.77 = 1.35 mutations, which corresponds to a timespan of only 750 years. Indeed, the individual Indian R1a1 “IE” haplotypes contain on average 0.255 ± 0.018 mutations per marker, which translates to approximately 4,050 years to their common ancestor (Klyosov, 2009b).
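The arithmetic behind the 1.35-mutation difference can be retraced in a few lines. The sketch below is only an illustration: it uses the allele values and the 750-year figure quoted above, while the variable names and the derived "years per mutation" ratio are assumptions of this example and not the calibration used in (Klyosov, 2009b).

```python
# Allele values of the two ancestral haplotypes at the two differing markers,
# as quoted in the text, in the order (Russian, Indian).
# The dictionary and variable names are illustrative only.
alleles = {
    "DYS389-2": (30.00, 30.58),
    "DYS458": (15.28, 16.05),
}

# Sum of the absolute allele differences over the differing markers.
total_difference = sum(abs(indian - russian) for russian, indian in alleles.values())

# Timespan that the text assigns to this difference.
quoted_timespan_years = 750

# Back-of-envelope ratio implied by the two quoted figures. This is an
# assumption of the sketch, not a published mutation-rate constant.
implied_years_per_mutation = quoted_timespan_years / total_difference

print(f"total difference: {total_difference:.2f} mutations")
print(f"implied scale: about {implied_years_per_mutation:.0f} years per mutation")
```

The 750-year figure is of the same order as the roughly 700-year gap between the 4,750 and 4,050 years quoted for the two common ancestors.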

For comparison, a typical European R1b1b2 ancestral haplotype (so-called “Atlantic Modal Haplotype”) in the same 25-marker format is as follows:

13 24 14 11 11 14 12 12 12 13 13 29 – 17 9 10 11 11 25 15 19 29 15 15 17 17

It differs from the ancestral “Indo-European” Indian haplotype by 22 mutations (!). The bearers of R1b1b2 haplotypes are practically absent among the Indian upper castes.

In fact, the closeness of the Russian and Indian haplotypes is striking even in their 67-marker format, a highest resolution in the contemporary DNA genealogy (the academic publications in the field practically never go beyond 39 marker haplotypes, and those are singular examples in the contemporary scientific literature). For instance, this is the 67-marker haplotype of the author of this study, an ethnic Russian:

13 24 16 11 11 15 12 12 10 13 11 30 – 16 9 10 11 11 24 14 20 34 15 15 16 16 – 11 11 19 23 15 16 17 21 36 41 12 11 – 11 9 17 17 8 11 10 8 10 10 12 22 22 15 10 12 12 13 8 15 23 21 12 13 11 13 11 11 12 13

And the following are three typical individual Indian haplotypes of the R1a1 haplogroup (mutational differences between them are highlighted in bold):

13 24 17 10 11 14 12 12 10 13 11 32 – 16 9 10 11 11 24 14 20 31 12 15 15 16 – 11 10 19 23 16 16 17 20 33 34 13 11 – 
11 8 17 17 8 11 10 8 11 10 12 22 22 15 10 12 12 13 8 14 23 21 13 13 11 13 11 11 12 13

13 24 16 11 11 14 12 12 10 13 11 31 – 16 9 10 11 11 24 14 20 33 12 15 15 16 – 10 12 19 23 15 17 18 18 35 41 15 11 – 
11 8 17 17 8 12 10 8 11 10 12 22 22 15 10 12 12 13 8 13 23 21 12 12 11 13 10 11 12 12

13 23 16 11 12 15 12 12 10 13 11 30 – – 9 10 11 11 24 14 20 30 12 16 16 16 – 11 12 19 23 15 16 18 21 35 39 12 11 – 
11 8 17 17 8 12 10 8 11 10 12 22 22 16 10 12 12 13 8 14 24 22 13 13 11 13 11 11 12 12

It is clear how similar they are: between the Indian haplotypes the number of mutations is in the range of 27-30 (pairwise), and between the Russian and Indian haplotypes the number of mutations is practically in the same range, 25-30. DNA-genealogically, they are the same people.
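A pairwise comparison of this kind can be sketched as follows. The text does not state which counting convention produced the 25-30 and 27-30 figures, so the hypothetical function below reports two common ones: the number of markers that differ, and the sum of the absolute allele differences. The sample data are only the first 12 markers of the Russian haplotype and of the first Indian haplotype quoted above, so the result illustrates the method and does not reproduce the 67-marker counts.

```python
from typing import Optional, Sequence, Tuple

Allele = Optional[int]  # None marks a marker with no reading


def pairwise_differences(a: Sequence[Allele], b: Sequence[Allele]) -> Tuple[int, int]:
    """Compare two haplotypes given in the same marker order.

    Returns (markers_that_differ, sum_of_absolute_allele_differences).
    Markers with a missing reading in either haplotype are skipped.
    """
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same number of markers")
    differing = 0
    steps = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip markers that cannot be compared
        if x != y:
            differing += 1
            steps += abs(x - y)
    return differing, steps


# First 12 markers of the Russian haplotype and of the first Indian haplotype.
russian_12 = [13, 24, 16, 11, 11, 15, 12, 12, 10, 13, 11, 30]
indian_12 = [13, 24, 17, 10, 11, 14, 12, 12, 10, 13, 11, 32]

differing, steps = pairwise_differences(russian_12, indian_12)
print(differing, steps)
```

On this 12-marker excerpt four markers differ, with a summed allele difference of five.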

150 years ago, A.F. Hilferding in his work “On the affinity of the Slavic with Sanskrit” (1853) wrote: “... Slavic language, taken in its entirety, does not differ from the Sanskrit by any permanent, organic phonetical change. Some distinction, found in it, like some lisping “r” of the Czechs and Poles and others, have developed already in the later, historical era, and belong to only a few of their vernaculars, I repeat that in the overall the Slavic language does not have any particular distinctions alien to the Sanskrit. The Lithuanian language shares with it this property, whereas all other Indo-European languages follow different phonetical laws, which are exclusively peculiar to each of them separately. Thus, in the lexical relation the Slavic and Lithuanian languages are very closely kindred to Sanskrit, and together in the Indo-European tribe they make up something like a separate family, outside of which stand the Persian and Western Europe languages.”

At the present time, we know that the Persian or Iranian languages also were basically brought to the eastern part of the Iranian Plateau by the Aryans, the carriers of the haplogroup R1a1, at around the same time as to India, but by Aryans who had lived for at least several hundred years (probably at least 500 years) in Middle Asia. The starting time for the ancient Iranian languages is the middle of the 2nd millennium BC. According to S.A. Starostin (1989), the modern Russian and Persian languages have 28% pairwise matches in the Swadesh 100-word list, from which, with the “rate of loss of words” factor (per Starostin) equal to 0.05, S.A. Starostin arrived at

t = sqrt[ln(100/28) / (2 × 0.05)] = 3.6

i.e. 3,600 years from the time of the divergence, or split, of these languages (S.A. Starostin, ibid.). This coincides quite precisely with the date of the Aryans' arrival in Iran and the outset of the ancient Iranian languages. S.A. Starostin believed that the value “came out too young” and wrote that he would prefer to place it in the 4th millennium BC, i.e. about 6,000 years ago, believing that that should be the time of the collapse of the Indo-European languages. But without realizing it, he did obtain a reliable dating of the divergence of the Aryan, or “Pra-IE”, and Iranian languages.
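The calculation quoted from Starostin can be checked numerically. The sketch below simply evaluates the formula as it is written above, with the result read in thousands of years; the function name and parameter names are assumptions of this example.

```python
import math


def divergence_time_millennia(shared_percent: float, loss_rate: float = 0.05) -> float:
    """Evaluate t = sqrt(ln(100 / shared_percent) / (2 * loss_rate)).

    shared_percent: percentage of pairwise matches in the 100-word list.
    loss_rate: the "rate of loss of words" factor (0.05 in the text).
    The result is read in thousands of years, as in the text.
    """
    if not 0 < shared_percent <= 100:
        raise ValueError("shared_percent must be in the interval (0, 100]")
    return math.sqrt(math.log(100 / shared_percent) / (2 * loss_rate))


# Russian and Persian: 28% pairwise matches, as quoted from Starostin.
t = divergence_time_millennia(28)
print(f"{t:.2f} thousand years")
```

With 28% matches the expression evaluates to about 3.57, which the text rounds to 3.6, i.e. 3,600 years.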

Thus, the words of Yu.N. Drozdov, “Under the linguistic science concept, the languages of the modern European nations belong to the linguistic family called “Indo-European”, although you can not find a single ancient source which would have recorded any trace of Indians or their kindred peoples in the European territory”, reflect the categorical stand, already mentioned above, of the Turkologists (and equally of the Iranists) toward the opposing science, and the clearly outdated views of the linguists, in which the “Indians” (or equally the “Iranians”) had already formed as an ethnic group and should be the primary bearers of the Indo-European languages in Europe. On the contrary, the Indians, like the Iranians, were the recipients, and not the donors, of these languages. The “Indo-European” languages at that time were generic Aryan languages.

Thus, 6,000 years ago, at the turn of the 5th and 4th millennia BC, the linguistic landscape in Europe was ancient Aryan, the language of R1a1, and perhaps to some extent the language (or languages) of the ancient European haplogroup I. The language of the haplogroup I could also have been the ancient Proto-IE, or the pra-language of the current Basques, or a tongue now unknown. The Türkic language was brought to Western Europe by the haplogroup R1b1b2 only about 4 thousand years ago, at the turn of the 3rd and 2nd millennia BC.

Approximately 4,500-4,000 years ago something happened in Europe that resulted in the haplogroup R1a1 virtually disappearing from Europe (see below). Incidentally, the haplogroup I1 and, to a large extent, the haplogroup I2 disappeared at the same time. Shortly thereafter, Europe was settled by the carriers of the Türkic R1b (mainly by its subgroup R1b1b2). Two hypothetical causes could be at the root: either an almost complete extermination of the other haplogroups by the carriers of R1b, or a major natural cataclysm that Europe suffered between 4,500 and 4,000 years ago, after which the Türkic-lingual R1b1b2 settled in an almost deserted Europe. Evidence can be found in favor of both possibilities. The first is supported by numerous finds in Scandinavia of ancient human remains with crushed skulls belonging to approximately the same time, which even received the conditional designation of the “period of crushed skulls”. Characteristically, many finds uncovered fractured skulls of women and children (Lindquist, 1992, 1993, 1994, 1997, 1998). That is echoed by a find in Germany, dating from 4,600 years ago, of a group of 13 people, most of them children and women, most (including children) with crushed skulls and stone arrowheads stuck in the bones. For two boys (aged 4-5 and 8-9 years) and a man aged 40-60 years the haplogroups were successfully identified, and all three were R1a (Haak et al, 2008). The analysis of the site revealed that the women, elderly and children were killed during an absence of the adults, apparently by a hostile tribe.


Apparently, under the standard scheme the period of the “fractured skulls” is linked with the “Indo-European invasion”, without a realization that the “Indo-Europeans” had already lived in Europe for 12,000 years, and that no “invasion” of theirs from the west ever happened. Later, from the end of the 3rd millennium BC and for the next one and a half millennia, before their migration to India and Iran, the vector of their migration was directed to the east. The so-called “Kurgan theory” had no relation whatever to the “Indo-Europeans”, i.e. to the holders of R1a1 or the Aryans, but applied to the holders of R1b, who were Türkic-speaking, and indeed were moving westward and then southward through the Caucasus to Asia Minor and Europe, as described above; moreover, a thousand years or more ahead of the Aryans. To the “Indo-” they, too, had no relation, neither linguistically nor migrationally, and it remains only to wonder how such a theory could emerge at all. The same goes, however, for the “Anatolian” theory of the Indo-European Urheimat. This will be discussed below.

In its entirety, the theory of the “Kurgan Culture” as “Indo-European” was one ceaseless mishap. It transposed the migratory flows, their directions (westward and eastward), their timing (6-5 thousand years ago and 5-3 thousand years ago), the origin (tribal affiliation) of the migrants (R1a and R1b), and their linguistic classification (Aryans and Türks). It seems that the desire of the authors and supporters of the “Indo-European” Kurgan theory to persuade others that they were right did not allow them to consider alternatives, as science requires. Naturally, the theory could not address tribal affiliation, since such information did not exist at the time.

Returning to the Europe of 4,500-4,000 years ago, the scenario of the extermination of the holders of haplogroups R1a1 and I has a historical basis. Moreover, haplogroup I1 was (then as now) particularly common in Scandinavia, so the fractured skulls in Sweden could belong primarily to its carriers. But we cannot exclude a major natural disaster in Europe between 4,500 and 4,000 years ago. The literature on it is so vast that we will not go into details here; instead, we refer to a geophysical work (Keenan, 1999) with hundreds of references on the topic. According to its author, it was probably the largest destructive event in the history of civilization since the Ice Age, and it “encompassed the greater part of the northern hemisphere” (ibid.). The cataclysm may also have induced a rise in mounted aggression.


Whatever the reason, haplogroup R1a1 virtually disappeared from Europe about 4,500-4,000 years ago, and the Türkic-speaking carriers of haplogroup R1b colonized the deserted Europe. As shown several lines below, virtually all modern branches of haplogroup R1a1 in Europe date from 2,900-2,500 years ago and later. At the same time, there is evidence that haplogroup R1a1 had been in Europe since 12,000 years ago, and archaeological excavations found haplogroup R1a1 in Europe (Germany) 4,600 years ago (see above). In other words, for R1a1 in Europe there is a gap that begins in the middle to the end of the third millennium BC (4,500-4,000 years ago) and lasts for 1,000-1,500 years. For R1b1b2 there is no such gap in Europe: their settling proceeds in a continuous stream from 4,000-4,200 years ago, without interruption.

Apparently, Europe ended up becoming Türkic-speaking. This is with the understanding that the Türkic language of the R1b invaders of the 4th millennium BC and that of the R1b invaders of the third millennium BC were significantly different languages, separated not only by time but also by their very different admixtures.

The R1a1 people remained only on the Eastern European Plain; they were the descendants of the people who had moved there about 5,000 years ago. Within a few centuries, about 3,500 years ago, these surviving descendants of the haplogroup R1a1 that had gone extinct in Europe would bring their haplotypes, and their preserved Aryan language, to the Urals and Central Asia, to India and Iran, and to Siberia. This is again with the understanding that the Aryan languages of the R1a migrants on the Eastern European Plain in the 4th millennium BC and the languages that reached their destinations in the Urals, India and the Iranian Plateau were likely different languages, separated by very different dialects.

The common ancestor of all these branches of haplogroup R1a1 lived on the Eastern European Plain 4,850 ± 500 years ago. These are again results of DNA genealogy with inevitable conclusions of a linguistic nature. It is known that the Aryan, or Proto-Indo-European, language was brought to India and Iran. It can hardly be proposed that the same R1a1 people at the same time brought some other language to the Urals and southern Siberia. In fact, the Aryan (= Proto-IE) language continued to split into its branches between 6,000 and 3,500 years before present.

The second repopulation of Europe by the carriers of R1a1 happened 3,000-2,500 years ago, that is, from the beginning to the middle of the first millennium BC, and later. Below are the timespans to the common ancestors of the 27 major European DNA-genealogical branches (adapted from Rozhanskii and Klyosov, 2009; Klyosov, Rozhanskii and Shvarev, in preparation), in years before present (YBP), in order from the “oldest” to the “youngest”:

1. Eastern European Plain (Central Eurasian parent) 4,425 ± 470 years before present
2. Western Eurasian (parent) 3,875 ± 450
3. Northern European 3,850 ± 500
4. Central Eurasian-1 3,625 ± 475
5. Central Eurasian-2 3,450 ± 450
6. Baltic-Carpathian (parent) 3,200 ± 400
7. Old Scandinavian (parent) 2,775 ± 360
8. Western Eurasian-1 2,575 ± 310
9. Western Slavic 2,550 ± 300
10. North-Western-1 (the Tenths) 2,525 ± 350
11. Central European-2 2,475 ± 300
12. Old Scandinavian-1 2,450 ± 300
13. Baltic-Carpathian-1 2,450 ± 350
14. Northern European-1 2,200 ± 300
15. Northern Carpathian 2,200 ± 300
16. Central European-1 2,175 ± 300
17. Old European 2,150 ± 450
18. Baltic-Carpathian-2 2,125 ± 275
19. Scandinavian 2,075 ± 250
20. Western Carpathian 1,975 ± 290
21. Northern Eurasian 1,925 ± 250
22. Young Scandinavian Scottish 1,875 ± 350
23. Western Eurasian-2 1,675 ± 300
24. North-Western-2 (the Tenths) 1,650 ± 200
25. North-Western-3 (the Tenths) 1,475 ± 300
26. Ashkenazi 1,175 ± 150
27. Eastern Eurasian (Kyrgyz) 900 ± 150

A consideration of base haplotypes for all 27 branches showed that the ancestral (base) haplotype for all of them is the Central Eurasian ancestral haplotype, shown here in the 76-marker format:

13 25 16 11 11 14 12 12 10 13 11 30 – 15 9 10 11 11 24 14 20 32 12 15 15 16 – 11 11 19 23 16 16 18 19 34 38 13 11 – 
11 8 17 17 8 12 10 8 11 10 12 22 22 15 10 12 12 13 8 14 23 21 12 12 11 13 11 11 12 13 – 11 11 13 23 9 13 12 30 24

Practically all European branches descend from the above base haplotype. One can see that this haplotype is identical to the base haplotype of the ethnic Russians shown above in the 25-marker format, and the two are practically identical in all other alleles. The timespan to this common ancestor of the R1a1 haplotypes in Europe, after the bearers of R1a1 arrived on the Eastern European Plain and passed a population bottleneck, equals 5,100 ± 690 years BP. This value was calculated from the base haplotypes of all 27 R1a1 branches listed above.

Except for the Eurasian (and Northern European) branches, most (if not all) of the European branches of R1a1 came back to Europe between 3,000 and 2,500 YBP and later. That was the return to Europe of the carriers of the flexive Indo-European languages. As can be seen, for some regions this happened at the end of the past era and the beginning of our era. As a result of this migration, the Türkic languages of Europe were replaced with the Indo-European languages, which tilted the balance toward the current European languages. However, the replacement left many Türkisms in personal names, in designations for objects, and in some individual terms.


It is unlikely that the displacement of the Türkic languages by the Indo-European ones in Western and Central Europe was quick and painless, or peaceful. Typically, several factors act together in such transitions, chiefly military, economic and political (ideological) ones. The military factor is not always necessary, or rather is not decisive, but the last two are mandatory. Apparently, the carriers of the Indo-European languages arriving from the east convincingly (in a broad sense) demonstrated to the Türkic-speaking population of Europe in the last millennium of the past era the benefits of their organization and the advantages of a producing, or more progressive, economy, education and culture. Only that could lead to the assimilation of a material culture alien to the then-Türkic population of Europe, and to the transition to a different language. This area still awaits its researchers.

That the branches of the R1a1 tribes were returning to Europe precisely from the Eastern European Plain is evidenced by the fact, described above, that all of these European and Eurasian branches combined point to an ancestral haplotype from the Eastern European Plain and to the same age, roughly 5,000 years ago (Rozhanskii and Klyosov, 2009). Thus, regarding the statement of Yu.N. Drozdov, “can not find a single ancient source which would have recorded any trace of Indus or their kindred peoples in the European territory”, it is worth noting that, whatever the ancient sources and their interpretations say, carriers of haplogroup R1a1 kindred to the “Indus”, with their flexive “Indo-European” language (more accurately, languages by that time), had returned to Europe from the Eastern European Plain by the beginning of our era. The subsequent intertwining of their IE languages and dialects created today’s Romance group.

The time of formation and divergence of the Türkic languages, and glottochronology

When, in the opinion of the experts, did the Türkic languages form? This is what S.E. Malov, Orientalist and corresponding member of the USSR Academy of Sciences, wrote in 1952: “Separate, the first in time, Türkic words and even a whole phrase (“Hunnish”) are in the Chinese records at the beginning of our era. And the Türkic languages in the writing monuments of the Türks are known to us from approximately 5th-6th cc. of our era.” According to other sources, the beginning of the (written) Türkic languages is associated with the appearance of the European Huns on the historical scene, i.e. the end of the past era and the beginning of ours. Glottochronology supposedly evidences the same; in fact it does not “evidence” it, but postulates it by the will of those who decided to use it to resolve this problem. Let us briefly review that.


Nearly all works on the glottochronology of the Türkic languages are in Russian, all carry the notation “preliminary analysis”, and all are connected with the name of M.T.Diyachok; that preliminary analysis was carried out recently, mostly in 2001. An example is (Diyachok, 2001, “Glottochronology of the Türkic languages (preliminary analysis)”). Let us review that work. It uses the classification of the Türkic languages suggested as early as 1922 by A.N. Samoylovich, as well as the works of N.Z. Gadjiyev (1980, 1990). According to that classification, based on phonetic and morphological principles, the Türkic linguistic group comprises six subgroups (sometimes Sakha/Yakut is defined as a separate subgroup):

1. Bulgar (Bulgar, Chuvash).
2. Uigur (Ancient Uigur, Khakas, Shor, Tuvan, Tofa, Yakut, Dolgan).
3. Kypchak (Tatar, Bashkir, Kazakh, Kirgiz, Altai, Karachai-Balkar, Kumyk, Crimean Tatar).
4. Chagatai (modern Uigur, Uzbek).
5. Kypchak-Turkmen (Western dialects of the Uzbek language).
6. Oguz (Turkish, Azeri, Gagauz, Turkmen).

The work notes that despite the large number of languages in the Türkic group, many of them are very close to each other (Tatar and Bashkir, Kazakh and Karakalpak, Tuvan and Tofa, Sakha and Dolgan). It also mentions that for the areas close to the Türkic ancestral home (Southern Siberia and northern China) the classification is insufficiently developed, and “it is quite possible that among them may be found fairly archaic elements”.

Returning now to glottochronology, it turns out that M.T.Diyachok simply postulated that the laws of linguistic dynamics are the same for flexive and agglutinative languages, and took the same constant of vocabulary preservation that is used for the Indo-European languages (even though there it floats from situation to situation, as S.A. Starostin pointed out, from the 0.14 initially used by Swadesh to the 0.03 used by other authors). But whereas S.A. Starostin, in his paper “Comparative-historical linguistics and lexicostatistics” (1989), went through many options for the preservation constant, compared the results with considerations about the suitability of each option, and discussed when the constants should be changed and adapted to real situations, M.T.Diyachok did not bother with such matters. He decided that “in accordance with the methodology of S.A. Starostin ... the factor of lexical preservation was taken to equal 91% per millennium.” And that's it.


Apparently, what is meant here is not the “coefficient of lexicon preservation” itself, but its doubled value for the intersection of two 100-word lists. A 91% preservation per millennium corresponds to a coefficient of linguistic dynamics of 0.047 when it is read as the overlap of two 100-word lists, since ln (1/0.91) = 0.094 = 2x0.047 (S.A. Starostin used 0.05 for a number of languages and remarked that “it slightly varies from 0.04 to 0.06”). For the intersection of two 100-word lists that constant should be doubled.
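
The arithmetic behind the 0.047 figure can be checked with a short sketch. It assumes first-order (exponential) loss of words and reads the 91% per millennium as the share retained in the overlap of two lists; the helper name rate_from_retention is hypothetical and is not taken from the works of Diyachok or Starostin.

```python
import math

def rate_from_retention(retained_fraction, millennia=1.0, lists=2):
    """Return the per-list loss constant k (per millennium).

    Assumption: first-order loss, retained = exp(-lists * k * t).
    lists=2 means the fraction describes the overlap of two word lists.
    """
    return math.log(1.0 / retained_fraction) / (lists * millennia)

# 91% retained per millennium, read as the overlap of two lists
k_two_lists = rate_from_retention(0.91)
# The same 91% read as retention within a single list
k_one_list = rate_from_retention(0.91, lists=1)

print(round(k_two_lists, 3), round(k_one_list, 3))
```

Read for a single list, the same 91% gives a constant about twice as large, which is why the distinction between the two readings matters.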

Be that as it may, it remains unclear on what grounds M.T.Diyachok settled on the same constant for a totally different linguistic system, agglutinative rather than flexive. No such equality of the two constants has ever been demonstrated. Moreover, Turkologists assert that the Türkic languages are much more stable than the Indo-European languages. Examples:

(Zakiev M.Z.): “In the agglutinative languages the roots of the words almost do not change over time, because in the process of application (i.e. grammatical changes) they do not lose their original phonetic form. The modern phonetic form of the words in the agglutinative languages (hence in the Türkic language) we also can find in the ancient written sources” (“Genesis of the Türks and Tatars”, Moscow, 2003. p. 79).

(B.N. Drozdov): “The root part of the Türkic words phonetically remains permanent by definition (otherwise it would be either meaningless, or another word). The affix system is also phonetically conservative and has no exceptions to the rules for its use. And on the whole this means that the phonetics of the Türkic-lingual lexicon should not significantly change over time, unless it would be influenced by other languages. This phonetic stability of the Türkic-lingual lexical units is its unique distinction...” (pp. 11-12).

(T.A. Mollaev): “Due to specifics of the grammatical structure, the Türkic languages are preserved marvelously, and remain mutually intelligible with each other” (p. 50).

And what did M.T.Diyachok achieve? One can predict in advance that if the Türkic languages and their lexemes are stable, and considerably more stable than the flexive languages, then the similarity of the Türkic 100-word lists or other comparison texts, evaluated with the same coefficients of linguistic dynamics as for the flexive languages, will inevitably be interpreted to mean that the Türkic languages are young and diverged relatively recently.


And that was the result (see table below). Data from (M.T.Diyachok, 2001) [years are rounded to the nearest century]

The main conclusion made by M.T.Diyachok is the following: “The results of the glottochronological analysis agree surprisingly well with the known history, and therefore can be regarded as reliable.”

Another conclusion: “The division of the Türkic languages into the four most ancient branches (Sakha, Tuva, Bulgar and Western) occurred almost simultaneously during the first three centuries of our era.”

All of these conclusions that include a time component rest on a shaky postulate: that the constant of linguistic dynamics, or “coefficient of language preservation”, or “word loss rate factor” (per Starostin), established for flexive languages is applicable to agglutinative languages. And they rest not only on that constant, but also on the premise that the square root equation works for agglutinative languages, which nobody has demonstrated so far either.


The catch is that if the loss of words from the 100-word list follows first-order kinetics, i.e. depends only on its “internal” laws and is not subject to outside influence, the rate equation relating the proportion of the remaining words to the constant of linguistic dynamics is as follows:

[ln (100 / N)] / k = n

N - number of words preserved in a 100-word glossary,
k - rate of replacement (word loss rate factor),
n - the time, in thousands of years, after which N words of the 100-word list remain (More details can be found at; the author of this study is one of the prime specialists on kinetics, and has authored a number of textbooks in the area).

For example, with k = 0.05, half of the words in a 100-word list will survive after

ln2/0.05 = 13.9 thousand years.

Half of the words in two 100-word lists will still be the same after

ln2/0.1 = 6.9 thousand years.

But because in reality this does not occur with the Indo-European languages, where the loss of words is much faster, a square root is introduced, without any justification, purely empirically. This achieves the desired result: a faster rate of word loss and a shorter period for the erosion of the language “core”. For the Indo-European languages in this case it is

t = sqrt (ln (100/50) / (2x0.05)) = 2.6 thousand years,

or the same thing,

t = sqrt (6.9) = 2.6

which means that half of the words in the two 100 word lists remain after 2,600 years.
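
The three figures above (13.9, 6.9 and 2.6 thousand years) can be reproduced with a small sketch. The function names linear_time and sqrt_time are illustrative assumptions; the formulas are the first-order rate equation and its empirical square-root variant as they are written in the text.

```python
import math

def linear_time(total, kept, k, lists=1):
    """First-order estimate, in thousands of years: ln(total/kept) / (lists*k)."""
    return math.log(total / kept) / (lists * k)

def sqrt_time(total, kept, k, lists=2):
    """Empirical square-root variant: sqrt of the first-order estimate."""
    return math.sqrt(linear_time(total, kept, k, lists))

one_list = linear_time(100, 50, 0.05)            # half of one list survives
two_lists = linear_time(100, 50, 0.05, lists=2)  # half of two lists still match
rooted = sqrt_time(100, 50, 0.05)                # same, with the square root

print(f"{one_list:.1f} {two_lists:.1f} {rooted:.1f}")
```

Changing k or dropping the square root shifts the result severalfold, which is the sensitivity the text is pointing at.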

Naturally, this square root is just a fix to obtain the desired result, and since the constant of linguistic dynamics can also be changed at the wish of the researcher, any desired result can be produced. Nevertheless, because the Indo-European languages are in reality fairly well known and examples are plentiful, in other words because the linguistic field is fairly well investigated, the methods of glottochronology have turned out to be useful for justifying, generally, what is already known.


But the glottochronology of the Türkic languages is an unexplored field: the time constants are unknown, and it is unknown what to calibrate them against. So the fit is made to match the main provisions of modern linguistic science on the Türkic languages. In science, this is not the best approach.

The table above shows that in the Sakha and Turkish languages 68 of the 91 words are common. If the empirical assumption about the square root is wrong (the languages are stable), and the rate of replacement is not 0.05, as for the flexive languages, but, say, only half of that, then we obtain
[ln (91/68)] / (2x0.025) = 5.8 thousand years.

That is, the Sakha and Turkish languages would then have diverged 5,800 years ago. Since the carriers of R1b began their journey from Southern Siberia 16,000 years ago and arrived in Asia Minor 5,500 years ago, a calculated divergence time of 5,800 years ago is possible.
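
How strongly the Sakha-Turkish estimate depends on the chosen constant and on the square root can be seen in a brief sketch. It uses only the numbers given in the text (68 shared words out of 91, rates 0.05 and 0.025); the function name divergence_time and the flag use_sqrt are assumptions made for illustration.

```python
import math

def divergence_time(total, shared, k, use_sqrt=False):
    """Divergence time in thousands of years for two word lists.

    Assumption: first-order loss in both lists, hence the factor 2*k;
    use_sqrt applies the empirical square-root variant instead.
    """
    t = math.log(total / shared) / (2 * k)
    return math.sqrt(t) if use_sqrt else t

for k in (0.05, 0.025):
    plain = divergence_time(91, 68, k)
    rooted = divergence_time(91, 68, k, use_sqrt=True)
    print(f"k={k}: first-order {plain:.1f}, square root {rooted:.1f} thousand years")
```

With k = 0.025 and no square root the sketch returns the 5.8 thousand years quoted above; the other combinations give considerably younger dates.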

There are publications on the relationship of the Türkic languages with the Sioux language of the American Indians, and on the affinity of the Türkic and Mayan languages (see, for example, ), but the author of this work is not a linguist and cannot properly assess the reliability of that research. If those results were confirmed, it would be worth noting that the Ural-Altaic peoples carry a significant trace of the same haplogroup Q as the majority of the American Indians. The Maya have the same haplogroup. Thus, if these findings were to be confirmed, they would have a solid foundation within the framework of DNA genealogy.

About the “Kurgan Culture” as an “Urheimat of the Indo-Europeans”, the “Anatolian theory of the Urheimat of the Indo-Europeans”, and how it could happen that not only the “Urheimats” were confused, but also the Türks (R1b) and the “Indo-Europeans” (R1a)

In the eyes of the author of this study, there were two main reasons for the confusion and the resulting incorrect postulates. The first and foremost is that when these theories were being developed, science did not operate with the concept of a tribe in DNA-genealogy terms, that is, as defined by the presence of a marker in the DNA of its carriers. These markers cannot be assimilated, whereas cultural traits, languages, ethnic characteristics, and even anthropological and morphological features of the skeleton are assimilated and become blurred over time. The analysis of these markers, called SNPs, or “snips” (Klyosov, 2009 a, b, c, d, and references therein), determines the tribe of the carrier and allows tracing the migration path of each tribe separately, with a calculation of the residence time of each tribe along the way. It quickly demonstrated both the fallacy of the “Kurgan theory” and the partial, limited significance of the Anatolian theory.


I will not dissect the pros and cons of the Anatolian theory. That could continue indefinitely, as is seen in the literature, where one uncritical “argument” is advanced against another uncritical “argument”, and so on. The analysis of haplogroups (the tribal markers, or “snips”, and the tribes themselves) and haplotypes (individual characteristics of ancestors in the ancestral line who lived thousands of years ago, and of their modern descendants) has helped to establish several important facts.

First, the carriers of haplogroup R1a were the very same “Proto-Indo-Europeans” who migrated to India and Iran about 3,500 years ago, in the middle of the 2nd millennium BC. It is they who, under the name (or self-name) “Aryans”, brought to India their Aryan language, which millennia later received the titles “Proto-Indo-European” and “Indo-European”. It is their descendants who comprise 47% of the ethnic Russians (haplogroup R1a1) in Russia in general, 62% in the southern regions of Russia, such as Orel, Belgorod and adjacent territories, and up to 72% in the highest Hindu caste. The modern ethnic Russians of haplogroup R1a1 have a common ancestor who lived on the Eastern European Plain about 4,800 years ago, and he belonged to the Aryan tribe, or to the future Aryans; that depends on the definitions, but the essence is the same.

The tribe R1a1, with its Aryan language, moved to the Eastern European Plain, presumably from the Balkans, in the early 3rd millennium BC. The vector of that migration was to the east, although the journey from the Carpathians to the southern Urals and to Central Asia took that tribe fifteen hundred years. It is clear that they were not nomads. It was a slow but steady settling of the Eastern European Plain. It was a spread of the Aryan language from the Baltic to the Caucasus, and later to the South Caucasus, to Anatolia, to the Hittites and Mitanni, though maybe only to their upper strata. Judging by the mutations in the haplotypes, the Aryan or Proto-Indo-European language arrived in those regions about 3,600 years ago. It remained in that region, and if it moved, it moved not to the east but to the south, toward the Arabian Peninsula. The proportion of R1a1-M17 in Russia, Iran, the Middle East and the Arabian Peninsula (according to Abu-Amero et al, 2009; Underhill et al, 2009) is as follows:


Country            Proportion of R1a1-M17, %
Russia, South      62
Russia average     47
Iran               10-14
Oman               9.0
UAE                7.4
Iraq               6.9
Anatolia           6.9
Qatar              6.9
Saudi Arabia       5.1
Egypt              3.0
Lebanon            2.5
Jordan             1.4

The Proto-Indo-Europeans (R1a1) did not make any discernible moves to the north or east from Asia Minor. The South Caucasus, western Azerbaijan or western Iran, and the whole of Asia Minor were just “dead-end” regions for the migrating “Proto-Indo-Europeans” 3,600-3,000 years ago. The Aryans came there once again in the first millennium BC, this time from the territory of Iran, expanding their empires to the Caucasus and Assyria. But that was already the time of the ancient Iranian languages, with the transition to the Middle Iranian languages. Recall that haplogroup R1a1 was found in the Andronovo archaeological culture, and the haplotypes were typical of the modern haplotypes of the ethnic Russians (Klyosov, 2009e, and references therein).

With that, the “Anatolian theory” is finished. In reality, it could relate to the Nostratic languages in the same region, but that was 50-45-40 thousand years ago.

The linguistic and temporal space of haplogroup R1b was different, but the territories were largely the same. This led archaeologists and linguists into complete confusion: they mistook the Türks for the “Indo-Europeans”. As already noted, the Proto-Türkic haplogroup R1b first appeared about 16,000 years ago in Southern Siberia. After a long period of time the members of that tribe expanded, bringing their language along, to the Middle Volga and the Volga-Kama region, which even now abounds with carriers of haplogroup R1b, who constitute a substantial proportion of the local ethnic groups (Lobov, 2009). Generally, the ethnic Russians (i.e., those whose families have spoken Russian for many generations and have considered themselves Russian for at least three generations), 5% of whom have haplogroup R1b1, have within that haplogroup a common ancestor who lived 6,775 ± 830 years ago, much earlier than the arrival of the “Proto-Indo-European” tribe on the Eastern European Plain.


That was the time of the Middle Volga, Samara, Khvalyn and the ancient Pit Grave or “Kurgan” cultures. Neither R1a1 nor the “Indo-Europeans” had any relation to them. Though the Kurgan Culture advanced to the west, or more accurately to the west and south, its people were not carrying the Indo-European languages. They were carrying the Proto-Türkic or Türkic languages; the term is again a matter of definitions.

The aforementioned work of S.E. Malov, corresponding member of the USSR Academy of Sciences, “Ancient and new Türkic languages” (1952), although it states that “the Türkic languages in the writing monuments of the Türks are known to us from approximately 5th-6th centuries of our era”, discusses in that passage only the monuments of writing. Indeed, writing among the Türkic peoples is held to be late. But language is not just writing, though for unwritten languages archaeology is practically helpless. So far one can rely only on common sense: if a tribe identified by its haplogroup, that is, by an ineradicable marker in the Y-chromosome, has existed for many thousands of years, in the cases considered here 20 and 16 thousand years for the “Proto-Indo-European” and “Proto-Türkic” tribes R1a and R1b respectively, there is no reason to believe a priori that their languages appeared only with the advent of writing. The same common sense dictates that the time bar for their languages can be lowered by many thousands of years, to the same 20 and 16 thousand years, unless shown otherwise. Nobody has shown that yet.

S.E. Malov also writes the same, speaking of the rock inscription monuments in the basins of the Enisei and Talas: “For that time we can conclude about the Türkic languages that before that they had quite a long history, it is not only difficult, but also impossible to admit the contrary.” And S.E. Malov continues: “The languages, judging by these monuments, are a result of a very large development, and therefore undoubtedly can be presumed that the Türkic languages, which we know and which we could easily understand, i.e. the Türkic languages in their present known to us composition and in the present constitution, existed several centuries before our era, say for five centuries! We are not allowed by our knowledge, or rather by our ignorance, to further penetrate into the depths of the centuries, into the history of the Türkic languages. Of course, the Türkic languages existed further into the depths of the centuries, but with our present knowledge we would not understand them, we would not know any phonetical transitions, special phonetic laws, and the then vocabulary, especially for specific realities of the ancient Türks.”


That is why I continue to believe that there is a strong likelihood that the Basque languages are ancient Türkic languages of the R1b haplogroup, brought to the Pyrenees about 4 thousand years ago after a long circuitous route from the Altai, through the Volga-Urals and the southern steppes, across the Caucasus, Anatolia and the Middle East, through North Africa and on to Iberia. And the fact that for many linguists the Basque language remains “unclassified” reflects the position of S.E. Malov: “with our present knowledge we would not understand them, we would not know any phonetical transitions, special phonetic laws, and the then vocabulary, especially for specific realities of the ancient Türks.”

If the scheme proposed in this paper is correct, then the answer to the question of S.E. Malov, “I have an unanswered question: who is older, the Bulgars-Chuvashes in the west (Danube and Volga), or the Uigurs in the East, in the Central Asia, or they belong to the same time”, is definite: the Uigurs in the east are much older. This is consistent with the concept that the Oguz branch is the parent branch, and that the Ogur branch of the Kangars, Tokhars and Huns formed as a local dialect of the Central Asian area.

Haplotypes of the R1b haplogroup carriers in Eurasia

A comparison of the R1b haplotypes of the Uigurs on the one hand, and of the Chuvashes, Bulgars and Hungarians on the other, shows that the Uigurs usually have the more ancient subgroup R1b1b1, which predominantly remained in Asia. The remaining R1b haplotypes belong to the more recent haplogroup R1b1b2, which originated in the Caucasus and tails off into Europe. The common ancestors of the European R1b1b2 lived, as noted above, 6,000 years ago in the Caucasus, 5,500 years ago in Anatolia, 5,300 years ago in the Middle East, and 4,500-3,600 years ago in Europe. The European haplotypes of the R1b1b2 group are so young (in terms of DNA genealogy) that many still retain the ancestral haplotype of 4,000 years ago (shown here in the 12-marker format):

13-24-14-11-11-14-12-12-12-13-13-16

It is called the “Atlantic modal haplotype”, because it was first identified in a haplotype study of the British Isles. For example, in a series of 750 haplotypes of haplogroup R1b1b2 on the Iberian Peninsula (in the 19-marker format, which is more sensitive to changes), 16 haplotypes still retain the ancestral sequence and are identical to each other. Using the same method as shown above for glottochronology, we can calculate the starting time of the divergence of these haplotypes, in other words the time when their common ancestor lived:

[ln (750/16)] / 0.0285 = 135 generations ago

which, adjusted for recurrent mutations (those that revert to the original, ancestral state), gives 156 generations, or 3,900 years to the common ancestor (Klyosov, 2009a). Here 0.0285 is the average mutation rate per haplotype per generation of 25 years (the generation length here is a mathematical parameter, not connected with the actual duration of a generation, which is a “floating parameter”). Since the same series of 750 haplotypes contains 2,796 mutations from the ancestral haplotype, a simple calculation gives 2,796/750/0.0285 = 131 generations, or 150 generations after adjustment for recurrent mutations, that is 3,750 ± 380 years to the common ancestor.
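
The two uncorrected estimates in this passage can be reproduced with a minimal sketch. The function names are illustrative assumptions. The correction for recurrent mutations (135 to 156 and 131 to 150 generations) is not specified in this section, so the corrected counts are taken from the text as given and only converted into years.

```python
import math

MUTATION_RATE = 0.0285     # per 19-marker haplotype per generation (from the text)
YEARS_PER_GENERATION = 25  # mathematical parameter used in the text

def generations_logarithmic(total, unmutated, rate=MUTATION_RATE):
    """Estimate from the share of haplotypes still identical to the base one."""
    return math.log(total / unmutated) / rate

def generations_linear(total, mutations, rate=MUTATION_RATE):
    """Estimate from the average number of mutations per haplotype."""
    return mutations / total / rate

log_estimate = generations_logarithmic(750, 16)
linear_estimate = generations_linear(750, 2796)
print(round(log_estimate), round(linear_estimate))

# Corrected generation counts quoted in the text, converted into years
for corrected in (156, 150):
    print(corrected, "generations =", corrected * YEARS_PER_GENERATION, "years")
```

That the two independent estimates agree within a few generations is what the text relies on when it treats the series as descending from a single common ancestor.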


That is the time when the carriers of the Türkic haplogroup R1b passed through a population bottleneck in the Pyrenees. In North Africa, among the Berbers, the value is only about a hundred years older (3,875 ± 670 years to the common ancestor), and the ancestral haplotype is the same. Taking into account the datings of R1b1b2 in Europe and other factors, such as the distribution of R1b1b2 across Europe, it can be concluded that the bearers of this haplogroup arrived in Iberia no later than 4,500 years before present.

In Asia, among Uigurs, and many Uzbeks, Tajiks, Tuvans, and Kazakhs the haplotype is different, and is derived from the ancestral haplotype:


Compared with the “Atlantic modal haplotype” it differs by 11 mutations, and knowing that each mutation happens on average once in a thousand years, one can already see how far back in time the common ancestors of the Asian and European carriers of the haplogroup R1b lived. More detailed calculations with extended haplotypes showed that the common ancestor of both the Asian and the European haplotypes lived in Asia 16,000 years ago. That, apparently, is the lower time limit for the Proto-Türkic languages.

Before continuing this review, note that the first allele in the ancestral haplotype, which is 13 both in the “Atlantic modal haplotype” and in the Asian core haplotype (see above), is very stable and mutates on average once in many millennia. That is, it accompanies the tribe, rarely changing, across long distances and for hundreds of generations. It turned out that in the southern steppes, or perhaps even in the Middle Volga culture of 8-7 thousand years ago, this allele became “12”, and many descendants who reached the Caucasus and advanced into Anatolia and on to the Middle East had “12” in the first allele of their haplotype. For example, among only four people from the Middle East at the site
three have “12” in the first marker. Similarly, out of 11 people from Arab countries nine have “12” in the first marker.


In other words, this secondary marker makes it possible to distinguish the descendants of the “ancient Pit Gravers”, or “Kurganians”. During the migration to Europe, at least on the route from North Africa through the Pyrenees, the “12” was replaced by “13”, which passed the bottleneck in Iberia and is observed in the “Atlantic modal haplotype”. Using this allele as a marker makes it possible to distinguish, with varying degrees of probability, separate streams of migration.

For example, out of 750 R1b1b2 haplotypes on the Iberian Peninsula (with the common ancestor 3,750 ± 380 years ago), only 41 had “12”, which is about one case in 20 (5% of the total). Among the R1b1b2 haplotypes of Central Europe (Flanders) the allele “12” is found in only 3% of the population. In contrast, among the older R1b1 haplotypes of ethnic Russians, the direct descendants of the ancient “Kurganian” Pit Gravers (common ancestor lived 6,775 ± 830 years ago), the allele “12” is encountered in as much as 37% of the population. This allele becomes more frequent toward the Caucasus, and among the Caucasian R1b1b2 haplotypes the allele “12” occurs in more than half of the cases, which determines the base haplotype


As can be seen, the first marker is the only difference between the ancient Caucasian R1b1b2 haplotypes and the European group, and it shows continuity with the ancient “Pit Grave” haplotypes. Yet this single distinction makes it possible to trace the direction of migration of the R1b1b2 carriers.

It is important to add that the R1b haplotypes in the Balkans have “12” in that marker in 50% of the cases, and in Italy in 27%. In Slovenia the figure is 20%, with the “age” of the common ancestor 4,250 ± 600 years. All these are a branch of the Türks, the “Kurganians”, or “ancient Pit Gravers”, who crossed from the Eastern European Plain either directly around the Black Sea to the Balkans and further on to the Apennines, or through Asia Minor. The others, as was noted, went to Europe via Anatolia, through the Middle East, and then across North Africa on the way to the Pyrenees. That was the Beaker Culture.

The following are typical haplotypes of the Hungarian Seklers (Szeklers), who belonged to a military service class and were recorded in a 1602 Sekler military census (Klyosov, 2009f). 18% of the Seklers have haplogroup R1b1, and 15% have haplogroup R1a1. The largest group is the purely European I2, to which 20% of the Seklers belong. The initially “Mongoloid” haplogroup N accounts for 2% of the Seklers, and another initially “Türkic” haplogroup, Q, accounts for 4%.


Thus, the typical Seklers' group R1b1 haplotypes are:

001 12 23 14 10 11 14 12 12 12 14 13 16
002 12 24 14 11 11 14 12 12 12 13 13 16
003 12 24 14 11 11 14 12 12 12 14 14 16
004 12 25 14 11 11 14 12 12 13 13 14 16
005 13 23 14 11 11 14 12 12 11 13 13 15
006 13 23 14 11 11 14 12 12 12 13 13 16
007 13 23 14 11 12 12 12 12 13 14 13 16
008 13 23 15 11 12 12 12 12 13 14 13 16
009 13 24 14 10 11 11 12 12 12 12 13 16
010 13 24 14 10 12 14 12 12 13 13 13 17
011 13 24 14 10 12 14 12 12 14 13 13 17
012 13 24 14 11   9 14 12 12 11 13 13 16
013 13 24 14 11 11 11 12 12 11 12 13 16
014 13 24 14 11 11 13 12 12 12 13 13 16
015 13 24 14 11 11 13 12 12 12 13 13 16
016 13 24 14 11 11 14 12 10 11 14 13 16
017 13 24 14 11 11 14 12 12 12 13 13 16
018 13 24 14 11 11 14 12 12 12 13 13 17
019 13 24 14 11 11 15 12 12 11 13 13 17
020 13 24 14 11 11 15 12 12 11 13 13 17
021 13 24 14 11 12 15 13 12 14 13 13 15
022 13 24 14 12 11 14 12 12 12 13 13 16
023 14 23 14 11 11 14 12 12 11 13 13 16
024 14 23 14 11 11 14 12 12 11 13 13 16
025 15 23 14 11 11 13 12 12 11 13 13 16

The haplotypes tree for all 25 people is depicted in the following diagram.

It is evident that the tree is unbalanced and uneven, and that it contains different genealogical lineages. Indeed, the first four haplotypes have allele 12 in the first marker (16% of the total, much higher than the typical European 3-5%), which corresponds to the ancient “Kurgan Culture” haplotype. Apparently, that is the starting point of the ancestral migration of the Hungarian Seklers. No haplotype has the typical Asian “19” in the second marker; all alleles are typical late European alleles. The ancestral haplotype for these markers is determined by the most frequent allele in each of the twelve columns of the whole sample. That is

13-24-14-11-11-14-12-12-12-13-13-16

i.e. exactly the “Atlantic modal haplotype” shown above. This is haplotype number 017 in the list above; it is preserved unchanged from the ancestral one, and it also stands alone at the top of the haplotype tree in the diagram above. The sample has a small admixture of the ancient “Kurgan”, or “Caucasian”, haplotypes, but they are outweighed by the more recent European haplotypes, which “pull the blanket over” to their side. As a result, the Sekler haplotypes of the R1b1b2 haplogroup already represent a younger age of these Türkic carriers of R1b, with a common ancestor expected at about 4,000 years ago. Perhaps the Sekler haplotypes are shifted a little toward more ancient times because of the admixture of the ancient haplotypes.


Let us double-check that assumption.

The most stable markers in the ancestral haplotype are the third, seventh, and eleventh counting from the left: relative to the ancestral haplotype (number 017) they produced only 1, 1, and 2 mutations respectively across all 25 haplotypes. In total, the sample of 25 haplotypes contains 82 mutations from the ancestral haplotype.

This produces 82/25/0.022 = 149 generations to the common ancestor without adjustment for recurrent mutations, or 175 generations with the tabulated corrections (Klyosov, 2009a). In turn, this results in 4,375 ± 650 years to the common ancestor. Indeed, within the margin of error it is the same 4 thousand years, but slightly aged by the presence of the older admixture.
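The check on the Sekler sample can be sketched in Python. The sketch assumes that the base haplotype is the most frequent allele in each marker column and that every one-step difference from the base counts as one mutation. The text does not spell out the counting rule, but this assumption reproduces the 82 mutations it quotes. The function names are illustrative, and the correction for recurrent mutations (Klyosov, 2009a) is not implemented.

```python
from collections import Counter

# The 25 Sekler R1b1 haplotypes in the 12-marker format, as listed above.
SEKLER_ROWS = """
12 23 14 10 11 14 12 12 12 14 13 16
12 24 14 11 11 14 12 12 12 13 13 16
12 24 14 11 11 14 12 12 12 14 14 16
12 25 14 11 11 14 12 12 13 13 14 16
13 23 14 11 11 14 12 12 11 13 13 15
13 23 14 11 11 14 12 12 12 13 13 16
13 23 14 11 12 12 12 12 13 14 13 16
13 23 15 11 12 12 12 12 13 14 13 16
13 24 14 10 11 11 12 12 12 12 13 16
13 24 14 10 12 14 12 12 13 13 13 17
13 24 14 10 12 14 12 12 14 13 13 17
13 24 14 11  9 14 12 12 11 13 13 16
13 24 14 11 11 11 12 12 11 12 13 16
13 24 14 11 11 13 12 12 12 13 13 16
13 24 14 11 11 13 12 12 12 13 13 16
13 24 14 11 11 14 12 10 11 14 13 16
13 24 14 11 11 14 12 12 12 13 13 16
13 24 14 11 11 14 12 12 12 13 13 17
13 24 14 11 11 15 12 12 11 13 13 17
13 24 14 11 11 15 12 12 11 13 13 17
13 24 14 11 12 15 13 12 14 13 13 15
13 24 14 12 11 14 12 12 12 13 13 16
14 23 14 11 11 14 12 12 11 13 13 16
14 23 14 11 11 14 12 12 11 13 13 16
15 23 14 11 11 13 12 12 11 13 13 16
"""
HAPLOTYPES = [
    [int(allele) for allele in row.split()]
    for row in SEKLER_ROWS.strip().splitlines()
]
RATE_12_MARKER = 0.022  # mutations per haplotype per generation (from the text)


def base_haplotype(haplotypes):
    """Most frequent allele in each marker column."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*haplotypes)]


def mutations_per_marker(haplotypes, base):
    """One-step differences from the base, summed per marker (assumed rule)."""
    return [sum(abs(a - b) for a in col) for col, b in zip(zip(*haplotypes), base)]


base = base_haplotype(HAPLOTYPES)
per_marker = mutations_per_marker(HAPLOTYPES, base)
mutations = sum(per_marker)
generations = mutations / len(HAPLOTYPES) / RATE_12_MARKER  # uncorrected

print(base)
print(per_marker)
print(mutations, round(generations))
```

The per-marker list also makes it easy to see which markers are the most stable in the sample, namely those with the smallest counts.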

We can also cite the haplotypes of the Bulgarian carriers of the haplogroup R1b. Unfortunately, there are not many of them, since haplotype testing has not yet reached Bulgaria. The records are also incomplete: of the twelve markers only the first four and the last three were given, so that we have only “7-marker haplotypes”. But for our purposes that will suffice.

001 13 24 14 11 13 13 16
002 13 24 14 11 13 13 16
003 13 24 14 11 12 13 16
004 13 23 14 10 13 13 16
005 12 24 14 11 13 12 16
006 14 24 14 11 13 13 16
007 12 25 14 11 13 13 16
008 12 23 14 10 14 14 15
009 12 25 14 11 14 13 16
010 13 22 15 11 13 13 16
011 13 23 15 10 13 13 16
012 12 24 15 11 13 13 16
013 13 25 15 11 13 13 16
014 13 24 15 11 13 13 17


Again, the admixture of the ancient “12” is seen in the first marker: it occurs in five out of 14 haplotypes, or 36%, which is practically the same as in the ancient Russian haplotypes. There is a typical Asian “22” in the second marker. The rest are typical European haplotypes. The ancestral, or, more precisely, “base” haplotype in this series is

13-24-14-11-X-X-X-X-X-13-13-16

where X stands for the alleles missing from the standard 12-marker haplotype format. It is evident that this is again the “Atlantic modal haplotype”. The Asian haplotypes (haplogroup R1b1b1) practically did not reach Europe. To put it differently, the “European” haplotypes are Eurasian haplotypes that reached Europe, and the “Asian” haplotypes are Eurasian haplotypes that remained in Asia.

A somewhat distorted (and again somewhat older, for the reasons described above) time to the common ancestor can be obtained from the number of mutations in the series, counted from the base haplotype. There are 29 mutations in all 14 haplotypes. This gives 29/14/0.013 = 159 generations without correction for recurrent mutations, or 188 corrected generations, that is 4,700 ± 990 years to the common ancestor. Here 0.013 is the mutation rate constant for the 7-marker haplotype (Klyosov, 2009a); the estimate of the error is given in the same reference.

Consider several haplotypes of the Gagauz, more than 90% of whom speak Gagauz Türkic:

001 12 24 14 11 13 11 16
002 12 24 14 11 13 11 16
003 12 24 14 11 13 11 17
004 12 25 14 10 14 14 15
005 12 25 14 10 14 13 16
006 13 24 14 11 13 13 16
007 13 24 14 12 13 13 16
008 13 24 14 11 14 13 16
009 13 24 14 11 14 13 16
010 13 24 14 11 15 13 16
011 13 24 14 11 15 13 16

Here we note the ancient allele “12” in almost half (45%) of the first markers, while the second allele is clearly not Asian but European. On the whole, it is again a mixture of descendants of ancient and relatively recent ancestors, which gives a base haplotype slightly shifted away from the same “Atlantic modal haplotype”. The third marker from the right may no longer be 13 but 14, which apparently reflects a more significant contribution of the ancient ancestor.



All 11 haplotypes have 26 mutations, which gives 26/11/0.013 = 182 uncorrected generations, or 222 generations corrected for reverse mutations, which already amounts to 5,550 ± 1,120 years to the common ancestor.
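How much the choice of the base haplotype matters can be illustrated with a short Python sketch. It counts mutations in the 11 Gagauz haplotypes against two candidate 7-marker bases, one with 13 and one with 14 in the third marker from the right. Both candidate bases and the counting rule (each one-step difference counts as one mutation) are assumptions made for illustration; the rate 0.013 is the 7-marker constant quoted in the text.

```python
# The 11 Gagauz haplotypes in the 7-marker format, as listed above.
GAGAUZ = [
    (12, 24, 14, 11, 13, 11, 16),
    (12, 24, 14, 11, 13, 11, 16),
    (12, 24, 14, 11, 13, 11, 17),
    (12, 25, 14, 10, 14, 14, 15),
    (12, 25, 14, 10, 14, 13, 16),
    (13, 24, 14, 11, 13, 13, 16),
    (13, 24, 14, 12, 13, 13, 16),
    (13, 24, 14, 11, 14, 13, 16),
    (13, 24, 14, 11, 14, 13, 16),
    (13, 24, 14, 11, 15, 13, 16),
    (13, 24, 14, 11, 15, 13, 16),
]
RATE_7_MARKER = 0.013  # mutations per haplotype per generation (from the text)

# Hypothetical candidate bases; they differ only in the third marker from the right.
CANDIDATE_BASES = {
    "base with 13": (13, 24, 14, 11, 13, 13, 16),
    "base with 14": (13, 24, 14, 11, 14, 13, 16),
}


def count_mutations(haplotypes, base):
    """Sum of one-step differences from the base haplotype (assumed rule)."""
    return sum(abs(a - b) for hap in haplotypes for a, b in zip(hap, base))


results = {}
for label, base in CANDIDATE_BASES.items():
    mutations = count_mutations(GAGAUZ, base)
    generations = mutations / len(GAGAUZ) / RATE_7_MARKER  # uncorrected
    results[label] = (mutations, round(generations))
    print(label, mutations, round(generations))
```

Under these assumptions the base with 14 yields the 26 mutations and 182 uncorrected generations quoted in the text, while the base with 13 yields one mutation more.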

The proportion of the haplogroup R1b1 among the Gagauz is 12% (Klyosov, 2009g). Thus, we found that the carriers of the “new era” haplotypes, with ancestors of 4-6 thousand years ago, came to Europe, including Hungary and Bulgaria, although for the latter countries only spot examples are available (in the rest of Europe thousands of R1b1b2 haplotypes are already known, and the picture there is clear). Among the ethnic Russians the ancestors are practically the same, but they date from about 7,000 years ago (6,775 ± 830 BP). It is still the same ancient Pit Grave, or “Kurgan”, Culture and its predecessors, the Türkic-lingual carriers of the haplotypes.

The Türkic-lingual Asian carriers of the R1b haplotypes remained in Asia. In northern Kazakhstan 5,700-5,100 years ago they established the Botai archaeological culture and, according to the latest data, domesticated the horse 5,500 years ago (Archaeology, Jan-Feb 2010). In addition to the Botai settlement dated 3,700-3,100 BC (definitely haplogroup R1b, since the carriers of R1a1 appeared in those places only one and a half to two thousand years later), a camp was found dated 1,200-900 BC, i.e. 3,200-2,900 years ago. Possibly that is already the “Indo-European” R1a1, after a part of them had departed to India, although they could also be the Türkic R1b1. The archaeologists, naturally, did not go into such distinctions; they simply noted that the second camp belonged to the Bronze Age. More likely they were R1b1, since the steppe of northern Kazakhstan would have been more suitable for mounted nomads with their mobile herds to survive there and be productive. The R1a1 in Central Asia had to follow the river valleys to survive.

There are some data on the proportion of the haplogroup R1b1 among the Bashkirs, which varies from population to population between 7% and 84% (Lobov, 2009, p. 15). Among the Perm and Baimak Bashkirs this proportion is 84% and 81% respectively. Among the Burzyan, Western Orenburg, and Saratov-Samara Bashkirs it is 33%, 23%, and 18% respectively. Among the Eastern Orenburg and Abzelil Bashkirs it is 9% and 7% respectively. Among the Sterlibash Bashkirs in the East Urals the haplogroup R1b1 is absent. Perhaps linguists can compare these statistics with the presence of the Türkic languages in these regions, although the link may be very indirect: in any region, both haplogroups and languages change as populations are displaced or arrive.


According to the data (Wiik, 2008), the following populations carry the haplogroup R1b1 in these proportions (on average):

Nation        Proportion R1b1, %
Bashkirs      19
Khanty        10
Komi          16
Mordovians    13
Chuvashes     12
Udmurts       9
Tatars        6-9
Mari          5
Russians      5

For comparison, the content of the haplogroup R1b1 in other countries is:

Country           Proportion R1b1, %
Hungary           13-20
Turkey            6-10 (according to other data, 14-16)
Lebanon, Syria    6-15
Georgia           10-14
Iraq              11

In Central Asia, the proportion of the haplogroup R1b in the populations is:

Nation      Proportion R1b1, %
Turkmens    37
Uzbeks      10
Uigurs      8-19
Kazakhs     6

As can be seen, the carriers of this “Türkic” haplogroup have undergone substantial displacement, with respect not only to the language but also to the presence of the haplogroup itself. Perhaps these were interrelated processes. In general, the words of S.E. Malov, “The Eastern Türkic languages ... present a more ancient picture, older than the Western Türkic languages” (1952), remain valid more than 50 years later, although he added that “initially, so to speak, they are not any less ancient than their eastern brethren languages, but in the western Türkic languages now prevail many new elements that replaced the ancient elements” (ibid.). This is certainly true, but in antiquity the western languages certainly yielded to the eastern languages.


So, as inevitably follows from the above, the bearers of the haplogroup R1a1, or the Aryans, or the “Proto-Indo-Europeans”, were trekking eastward from Europe, most probably from the Balkans, from the beginning of the 3rd millennium BC. They populated the Eastern European Plain (the age of the common ancestor of the Aryans there is about 4,850 years) and moved further on, establishing 4,000-3,200 years ago the Andronovo Culture, which overlaid the earlier habitat of the haplogroup R1b1, whose carriers had preceded R1a1 by one and a half to two millennia (Botai archaeological culture, 5,700-5,100 years ago). Subsequently they settled in the Southern and Eastern Urals, southern Siberia, and Altai, reaching northern China.

The haplogroup R1b migrated on the opposite course, but much earlier: its carriers were in the Eastern European Plain at least 8-5 thousand years ago, partially populated the Caucasus 6,000 years ago, and at about the same time crossed into Anatolia and then the Middle East. In time they practically did not intersect with the “Proto-Indo-Europeans”, the carriers of the haplogroup R1a1, but they traversed the same territories, especially those of the Middle Volga, Samara, Khvalyn, ancient Pit Grave, Timber Grave, and Andronovo Cultures. Again, this led archaeologists and linguists to mistakenly localize the Indo-European homeland in the southern steppes of Russia and, just as much, in the Northern Pontic and Anatolia.

Theories of “Indo-European Urheimat” in light of the DNA-Genealogy

Amazingly, all four main hypotheses localizing the “Indo-European homeland”, namely the “Circumpontic localization”, “Kurgan”, “Anatolian”, and “Neolithic gap” hypotheses, turned out to be wrong at their core. They could not explain the direction of the movement of the “Indo-Europeans”, including the path toward India; they could not explain the timing of that movement and what preceded it; they were unable to point to the location of the “pra-homeland” or to say where the “Pra-Indo-Europeans” had come there from, especially since the (fallacious) notion of a “primordial homeland” does not allow for any previous localization, which is fundamentally wrong; and they could not explain the prolonged contact of the “Proto-Indo-Europeans” with other language families (Kartvelian, North Caucasian, Semitic, Pra-Türkic). That contact clearly occurred in the 3rd and 2nd millenniums BC, when the carriers of the haplogroup R1a1 reached the Caucasus about 4,500 years ago, the Near East around 3,800-3,600 years ago, and the territories of the ancient Pit Grave Culture, the Andronovo Culture, and Central Asia, with their probably Türkic-lingual population (haplogroup R1b1), approximately 4,000-3,600 years ago.


1. The “Circumpontic localization” hypothesis (Merpert, 1974, 1976) erroneously places the “homeland” in the Caspian-Black Sea steppes, and also erroneously dates it to more than 5,000 years ago (second half of the 4th millennium BC). Apparently, here again the Türkic-lingual carriers of R1b1, who at that time were completing their movement across the Caucasus to Anatolia and were already present in the Middle East, are mistaken for the “Indo-Europeans”. It is no accident that this hypothesis speaks of the “pastoral cultures of the Caspian-Black Sea steppes”. In that connection, the author of the hypothesis rightly talks about “continuity and cultural integration” from the zone of the ancient Pit Grave Culture to the Caucasus region and further to the south of the Black Sea, except that this continuity in fact belongs to the Türkic-lingual R1b1. The Balkan-Carpathian region, from which the “Proto-Indo-Europeans” came, is not even considered in this hypothesis. Also not considered is the spread of the “Proto-Indo-Europeans” (haplogroup R1a1) in all directions across Europe, from the Balkans to the Atlantic, to Scandinavia, and south to Greece and the Mediterranean islands, all mainly in the 4th millennium BC. That was the spread of the carriers of the Proto-Indo-European dialects. Throughout all these contacts the existing cultural lexicon was being borrowed, which is reflected in the Indo-European languages.

2. The “Kurgan theory” (Gimbutas, 1964, 1974, 1977, 1980) interpreted the materials on the “Indo-Europeans” in a way totally opposite to the real movement of the “Indo-Europeans”, which took place millennia later, and in reality analyzed what was most likely the southwestward movement of the Türkic-lingual tribes (haplogroup R1b1). The concept of the Eurasian steppes as the homeland of the “Indo-European community” is totally counterproductive and wrong. First, the “Proto-Indo-Europeans” could not have appeared in the steppes out of nowhere, so no language “homeland” in that location can be real. And indeed they did not appear out of nowhere. The Türkic-lingual R1b1 migrated from the east, the Aryan-speaking “Proto-Indo-Europeans” migrated from the west. In the depth of ancient times, both came from Southern Siberia. The route of their arrival in Southern Siberia is also adequately worked out in terms of DNA genealogy. This does not at all mean that the factual material collected by M. Gimbutas is incorrect. On the contrary, it is accurate: the findings on the growing share of animal husbandry relative to agriculture in the region of the ancient Pit Grave Culture, the further movement of the “Kurganians”, the facts and conclusions about the type of housing and settlements, about the physical type of the population, and the terminology related to the horse. But all of that belongs to the Türkic-lingual R1b and not to the “Pra-Indo-Europeans”, which M. Gimbutas apparently did not even suspect. The same applies to the physical type of the population, because the R1a “Proto-Indo-Europeans” and the R1b “Pra-Türks” are not only both Caucasoids, but also belong to the same upstream haplogroup R1. They are easy to transpose, and that is what happened.


M. Gimbutas attributed the first “wave of the Kurgan Culture carriers” to the beginning of the 4th millennium BC, approximately 6,000 years ago, in the territory between the Volga and the Dnieper. These were certainly the Türkic-lingual R1b1, since the carriers of the haplogroup R1a1 were not there at that time; they appeared there more than a thousand years later, and it took them several hundred more years to reach the Volga. Besides, as M. Gimbutas pointed out, the “carriers of the first wave of the Kurgan Culture” developed from the Samara and Seroglazov cultures of the Volga Basin. These were definitely the Türkic-lingual R1b1. They have no relation to the “Proto-Indo-Europeans” in time, place, or origin. A recent paper (Vybornov 2008) showed that radiocarbon dating of the pottery found at the Volga-Kama Neolithic sites places the encampments of the northern Caspian area in the first half of the 6th millennium BC, that is about 8,000 years ago. The “Proto-Indo-Europeans” would appear there only 4,000 years later. The author (Vybornov 2008) notes that at the same time a Neolithic culture was forming in the south of the Volga-Ural interfluve, which is where M. Gimbutas had placed the “homeland of the Indo-Europeans”. A few centuries later (second half of the 6th millennium BC) settlements appear in the Lower Volga region (ibid.). Now we can definitely state that all of that is the area of the Türkic languages.

Finally, as is now known, the horse was domesticated in northern Kazakhstan about 5,500 years ago, in all certainty again by the carriers of R1b1, long before the arrival of the “Proto-Indo-Europeans” (Archaeology, Jan-Feb 2010), and the use of horses in the household economy of the “Kurganians” is an important point for M. Gimbutas. That is again an argument in favor of the Türkic-lingual “Kurganians”, who were expanding from the east to the west, and not vice versa, as did the “Proto-Indo-Europeans”. And that is without even mentioning that the argument of a “mountain landscape” does not work at all for the Dnieper-Volga region, although that did not bother M. Gimbutas in the least.

It is clear that the migration of the haplogroup R1a1 amply satisfies the expectation of contact continuity and cultural integration along the route of the R1a1 Proto-Indo-Europeans from the Balkans to the Eastern European Plain and beyond, to the Caucasus, Middle Asia, and the Urals, and on to India and Iran. Why that obvious factor should be applied only to the “Kurgan Culture” remains unclear. Naturally, the same provision also held for the migration of the Türkic haplogroup R1b1.


These postulates and inconsistencies can be analyzed further, but the situation has long been clear. All arguments that support the alleged migration of the “Indo-Europeans” from their “homeland” in the Circumpontic zone, in the Volga-Ural region, or between the Volga and the Dnieper are either erroneous or non-specific, and fit the Balkans just as easily.

3. The same also applies to the “Anatolian” theory of the “Indo-European homeland” (Gamkrelidze and Ivanov, 1980, 1984, 1989). The linguistic evidence on the landscape, flora, and fauna of the “Indo-European homeland” that they analyzed is perfectly suitable for the Balkans, aside from the fact that it is far from absolute. As is already known, when applied formally and indiscriminately, these “arguments” cause problems in any territory.

The point should be not an absolute and unquestioning use of these and similar “arguments”, but the optimization of the results of their application. And there, I repeat, the Balkans are optimally suited. Once we add the distribution regions and the dating of the “Proto-Indo-European” haplogroup R1a1, take their migration to India and Iran as a major argument, and note the practically identical R1a1 haplotypes in these countries and in modern Russia, where their proportion in the population reaches 62%, the question can be regarded as resolved. In contrast, the territory of Asia Minor (Anatolia) is categorically unsuitable for the role of the epicenter of the spread of the “Proto-Indo-Europeans”, and even less so for a spread to the north, as was picked up and “developed” by the followers of that hypothesis. It is incompatible with the DNA genealogy data, according to which the movement went from the Eastern European Plain to the south and to the east, to India and Iran.

The time at which T. Gamkrelidze and V. Ivanov, and after them V. Safronov and C. Renfrew, placed the “Indo-Europeans” in eastern Anatolia, the Southern Caucasus, and northern Mesopotamia (Safronov, 1989; Renfrew, 1987; Renfrew, 1998), namely the 5th-4th millennium BC, i.e. about 6,000 years ago, is also incompatible with the arrival there of the R1a1 carriers only in the first half of the 2nd millennium BC, that is, two or three thousand years later. Interestingly, to substantiate the Anatolian hypothesis, extensive materials were drawn upon from paleogeography, archaeology (in particular, on the continuity of development of the local Anatolian cultures), paleozoology, paleobotany, and linguistics, particularly the data on borrowings from individual Indo-European languages into the local non-Indo-European languages and back, obtained by the comparative historical method; but on closer inspection all these arguments work wonderfully for the Balkan homeland.


They also work in regard to the migration of the R1a1 carriers with their “Pra-Indo-European” language, or rather the Aryan language, in the 3rd and 2nd millenniums BC from the Balkans through the Eastern European Plain to the Caucasus and Eastern Anatolia, and in reality adequately explain the linguistic contacts of the R1a1 carriers in these regions.

There is also a systemic problem with these “arguments”: they could very well have been simply decoded erroneously, when the authors take approximate and ambiguous interpretations for “facts” and then absolutize them. Entirely logical is the observation of O.S. Rubin (“Problems of localization of the Indo-European pra-homeland: a critical review of the modern concepts”) that “the questionable conclusions of Gamkrelidze-Ivanov relating to the chronological framework of the existence and disintegration of the dialect groups raise doubts.” In fact, this remark addresses only a part of the problem, because not only the chronology is incorrect, but so is the very essence of the hypothesis of the “Indo-European pra-homeland” in Asia Minor. The linguistic constructs of T. Gamkrelidze and V. Ivanov that relate to the “Proto-Indo-Europeans” arriving in Asia Minor in the 2nd millennium BC apparently do not raise doubts in principle. For example, it is quite possible that the first linguistic community to separate from the Aryan (“Proto-Indo-European”) community really was the Anatolian community of the 2nd millennium BC, although at the same time the Indo-Iranian branch, the Greek-Armenian-Aryan branch, and what would in the future become the Balto-Slavic branch also began to separate.

Lately, anthropological evidence has also appeared that the craniological indicators of the Mediterranean and Asia Minor populations are not compatible with the “Indo-European” indicators in Middle Asia and South Siberia (Kozintsev, 2008). So the theory of T. Gamkrelidze and V. Ivanov, and also of V. Safronov (1989) and C. Renfrew (1987), failed the test against the new data. More specifically, the cited work of Kozintsev studied 245 cranial series of Eurasia from the Neolithic to the Early Iron Age, and showed that there is no reason to attribute any ancient group from the territory of Southern Siberia and Kazakhstan to the South Caucasoid (Mediterranean) type, just as there is no reason to posit migrations to these territories from Middle Asia and Asia Minor, or from the Southern Caucasus, at least according to physical anthropology. The most likely source of migration to Southern Siberia and Kazakhstan, including the Afanasievo and Andronovo cultures, is the population of the Bronze Age North Pontic steppes and several Late Neolithic and Bronze Age groups from Western and Central Europe.


A. Kozintsev continues that this similarity can be attributed to the migration of the Indo-Europeans from Europe to the east, up to Central Asia. A. Kozintsev reasonably proposes that the return of the descendants of one of their groups from Central Asia to Europe during the Early Iron Age was apparently the cause of the appearance of the Scythians on the historical scene. The Scythians, and their relation to the “Indo-Europeans” (I will not apply the word “Iranian”, as totally compromised in this context) and to the “Türks”, will be addressed below.

4. The hypotheses of V.A. Safronov and C. Renfrew (the “Neolithic Gap Theory”) are not fundamentally different from that of T. Gamkrelidze and V. Ivanov with respect to the “localization of the Indo-European homeland” in both its regional and temporal aspects, i.e. they are not right (references are given above). In the cultural area of Çatalhöyük 8,000 years ago (6th millennium BC) there simply could not have been any “Proto-Indo-Europeans” as a phenomenon, which does not exclude, for example, trade exchanges with the Balkans. But that does not fall under the concept of a “homeland”, much as the Amundsen expedition to the South Pole in 1911 did not mean that the Norwegians settled the South Pole, nor that the South Pole should be accepted as an “ancestral home” of the Norwegians.

In this regard, the at times completely uncritical acceptance by linguists of concepts based on idiosyncratic justifications, without due consideration of the alternatives, is utterly surprising. An example is the work of Gray and Atkinson (Gray and Atkinson, 2003). The authors did a great job, collected a wealth of material, and found that their “analysis produced an estimated age range for the initial Indo-European divergence of between 7800 and 9800 years BP”. This, as we now know, coincides fairly well with the emergence of the future “Indo-Europeans”, the carriers of the haplogroup R1a1, in the Balkans (11600±1600 years BP [Klyosov, 2009b]).

And what conclusion do Gray and Atkinson draw? That it was Anatolia. On what grounds? Because, according to the “Anatolian Theory”, the Indo-European language originated there around 8,000-9,500 years ago. As Gray and Atkinson write, this is “in striking agreement with the Anatolian hypothesis.” That's it, the matter is settled. And the authors even carried “... supports the Anatolian theory of Indo-European language” into the title of the article. What alternatives were considered? None, except for mentioning that the “Kurgan Culture” arose at the “beginning in the sixth millennium BP”, and that “these hypotheses need not be mutually exclusive”. The Balkans are not even mentioned in their paper. As we now know, the “Kurgan Culture” had no relation to the “Indo-Europeans”.

Apropos, in the article they write that “... a period of rapid divergence [that is] giving rise to the Italic, Celtic, Balto-Slavic and perhaps Indo-Iranian families is intriguingly close to the time suggested for a possible Kurgan expansion”. As we now know, everything “Indo-Iranian” in that phrase is incorrect, both the link of the Indo-Iranian linguistic family with the “Kurgan culture” and the “intriguingly close” time. The time for the “Indo-Iranian linguistic family” depends entirely on what the term “Indo-Iranian” is taken to refer to; at any rate, it has nothing to do with the “Kurgan archaeological culture”, which was a culture of the R1b1 tribes. The “Indo-Iranian” linguistic family can extend in time from as deep as around 9-12 thousand years BP in the Balkans to as recently as 3,500 years BP, the actual time when the languages of the “Indo-Iranian” territory were transformed into a linguistic family. In any case, the Aryan tribe and the Aryan language are directly related to the R1a1 tribe.

The Anatolian hypothesis and the Balkan (the Aryan, R1a1) hypothesis can, nevertheless, be brought together on one condition: that the bearers of the R1a1 haplogroup were migrating around 12 thousand years ago from the east (from South Siberia and across Middle Asia and/or the Iranian Plateau) through Asia Minor to the Balkans, where they arrived 11,600 ± 1,600 years BP. We do not have any specific data showing that their migration necessarily went through Asia Minor; however, it cannot be excluded. Obviously, in that case Anatolia cannot be considered a “homeland” of the Proto-IE languages.


The difficulties of matching archaeological evidence and DNA genealogy may also be caused by these disciplines operating with different attributes. In archaeology, it is a “chain transmission” of material and cultural traits, which archaeologists often (or even usually) do not associate with migrations, with the movement of people. As Anthony noted (Anthony, 2007), every archaeology student from the 1960s studied under the motto “The pots are not people”, and from the 1970s-1980s the concept of migration practically disappeared from archaeology. In contrast, in DNA genealogy these attributes are migrations, their directions, regions, and times. Therefore, from the standpoint of (Russian) archaeology, the “Eurasian Indo-European continuum” looks like the Catacomb-Timber Grave-Petrov-Andronovo-Sintashta cultures, and they are not migrations but a “chain transmission of the cultural traits”, together with the language. Archaeologists even note that anthropologically this chain is practically homogeneous.

But from the standpoint of DNA genealogy, that approach is fundamentally defective. Here the archaeologists have mixed up and transposed two migration counterflows of two tribes: the Türkic-lingual R1b moving from the east to the west, and the Aryan-speaking, “Proto-Indo-European” R1a moving from the west to the east. The anthropology of these two streams is in fact close or almost identical, because they are two kindred tribes, both Caucasoids, both formed from the same R1 type haplotype. As a result, the archaeologists have it that the cultures of one tribe, the Türkic-lingual R1b of the Khvalyn, Sredny Stog, ancient Pit Grave, and then Catacomb Cultures, with a general direction to the west, suddenly jumped over to the “Proto-Indo-European” Andronovo and Sintashta Cultures, formed by the movement of the future Aryans (haplogroup R1a1) to the east.

In other words, what was going on was not some “chain transmission of cultural traits” without migrations, but specifically migrations. That is evinced by the detection of the haplogroup R1a1 in excavations in Germany dated to 4,600 years ago and in the Andronovo culture in Southern Siberia dated to 3,800 - 3,400 years ago; by the same haplogroup (and the same haplotypes) to the west of the Urals, in the Eastern European Plain, dated to 4,800 years ago; and by the same haplogroup (and the same haplotypes) in India and Iran, dated only 800-1,000 years later than in the Eastern European Plain. These were precisely migrations, rather than simply a disembodied “transmission of cultural traits”. Admittedly, this is a weak spot of archaeology.
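The datings cited in the preceding paragraph can be lined up in a few lines of Python. This is only an illustrative sketch: the region labels, the variable names, and the helper functions are assumptions introduced here, and the India and Iran range is simply derived from the stated lag of 800-1,000 years after the Eastern European Plain date.

```python
# Datings quoted in the text, in years before present, stored as (youngest, oldest).
# The labels are shorthand for this sketch, not terms defined by the article.
datings_bp = {
    "Germany (excavations)": (4600, 4600),
    "Eastern European Plain": (4800, 4800),
    "Andronovo culture, Southern Siberia": (3400, 3800),
}

# India and Iran: stated as 800-1,000 years later than the Eastern European Plain.
plain_young, plain_old = datings_bp["Eastern European Plain"]
lag_min, lag_max = 800, 1000
datings_bp["India and Iran"] = (plain_young - lag_max, plain_old - lag_min)


def ordered_oldest_first(datings):
    """Sort regions by the older bound of each dating, oldest first."""
    return sorted(datings.items(), key=lambda item: item[1][1], reverse=True)


def format_range(young, old):
    """Render a dating as a single value or as an 'older-younger' range."""
    return str(old) if young == old else f"{old}-{young}"


for region, (young, old) in ordered_oldest_first(datings_bp):
    print(f"{region}: {format_range(young, old)} years ago")
```

Under these assumptions the derived India and Iran range is 4,000-3,800 years ago, and the Eastern European Plain date comes first in the ordering.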


DNA genealogy helps to solve, or at least to suggest solutions for, many questions that archaeology and linguistics have not been able to resolve. For example, questions of the “ethnogenesis of the ancient Celts, the time of whose appearance in the Western Europe, like their paths of settlement, the most “traditional” theory can not explain” (Rubin, “Localization problems of the Indo-European homeland: a critical review of the modern concepts”, pp. 84-92), despite the fact that “migration of Celtic tribes is only recorded in the direction from west to east, not vice versa” (Alinei, 2004a, b). DNA genealogy gives an immediate response: the Pra-Celts, of the haplogroup R1b1b2, arrived on the European continent by a roundabout migration path, from the Eastern European Plain across the Caucasus, the Middle East, and North Africa, on to the Iberian Peninsula, and further into continental Europe, 4,800-4,500 years ago (see above). Naturally, the movement from the Pyrenees (and from the British Isles, the next phase of R1b1b2 advancement from the Pyrenees) into Europe went from the west to the east. The Celts were carrying Türkic languages across Europe.

In conclusion, a brief pause on the Scythian issue. From the above, it is clear that the Scythians (in fact, a collective term) were both Türkic-lingual and “Iranian-lingual”, or more accurately, Aryan-lingual. They were both nomadic pastoralists (which is typical for the Türkic tribes) and farmers (which is often typical for the Aryans). They had both haplogroups, R1a1 and R1b1. They lived in felt yurts (many of those who lived in them were carriers of R1b1) and also in stationary buildings (many of those were farmers, R1a1). Unfortunately, neither the specialists in the Indo-European languages nor the Türkists are willing to recognize the duality (at least) of the Scythians, Sarmatians, and many other steppe (and not only steppe) tribes of the 1st millennium BC and the beginning of our era. Moreover, these tribes definitely had other haplogroups, in the first place G, Q, N, C. The carriers of the haplogroup G in the Scythian and Sarmatian times likely were “Iranian-speaking”, and lived on the Iranian Plateau much earlier than the Aryan times. Then, of course, they were not “Indo-Europeans”. The carriers of Q, N, and C were most likely Türkic-lingual.

The sooner both sides, the “Iranists” and the “Türkists”, recognize these facts, or at this point only considerations, the sooner linguistics will be enriched by new findings and discoveries, especially if they also add DNA genealogy to their research arsenal. I dare to hope that this article will facilitate that.



An attentive reader has definitely noticed that this article is not about the Türkic languages. It is perfectly clear that the main point of the article about the Türkic speech of the R1b carriers is based on ONE single argument: that Europe really has a Türkic substrate, and that the Indo-European language has many Türkisms, which follows from the analysis of the texts of the Classical authors put forward by a number of Turkologists. That's it. If this observation is false or exaggerated, then there are no other arguments for the Türkic speech of R1b (except, naturally, the fact that many modern Türkic-speaking people are indeed bearers of the R1b haplogroup). It should be honestly noted that the author is not a linguist, and therefore he simply relied on this argument of the Turkologists (or accepted it as a working hypothesis).

Even so, it remains that Europe had some non-Indo-European substrate, namely the language of the Basques and a Basque-like language across much of Europe. As the historian and linguist Indarbi Byzov noted, in the era before Romanization and Celtization the Iberian Peninsula was inhabited by Iberians, linguistic kin of the Basques. They also inhabited the British Isles, Ireland, and the western part of France. East of them, extending to the Rhine, lived the Ligurians, who were also identified with the Sino-Caucasian group (I. Byzov, private communication). But that again is clearly the haplogroup R1b, the Beaker culture, because it again comes from the Pyrenees into Europe, following the migration routes of the haplogroup R1b1b2 carriers. So we again come to the existence in 2nd millennium BC Europe of some agglutinative language (if it was not Türkic) peculiar to the haplogroup (the tribe) R1b1b2, and then all the provisions of the article remain correct; it only remains to replace the word “Türkic” with the “agglutinative language of the R1b carriers”. For example, the “Erbin language”, which could be an ancient language, also agglutinative, and which the Turkologists, analyzing ancient texts, could have misconstrued as a Türkic language. Maybe that would bring about a reconciliation, because both parties, the “Iranists” and the “Türkists”, were amiss to some degree.

Returning to the main topic, I reiterate that the article is not about the Türkic languages, as an open-minded reader may have noticed. The article does not cite a single Türkic word, and likewise it has no linguistic analysis. The article is about DNA genealogy. The focus of the article is that the “Kurgan” culture could not be “Indo-European”, that no “Anatolian homeland” existed, and that linguists and archaeologists have transposed two migratory flows, those of the two ancient dominant tribes of Eurasia. So, all cardinal stipulations of the article remain in force. Unless, of course, they are countered by no less compelling reasons, for example, that such tribes (R1a and R1b in modern nomenclature) did not exist, or that their migrations did not exist, or that the “homeland” of the IE language was definitely in Anatolia (and not on their way from South Siberia and/or Central Asia to Europe), or definitely in the ancient Pit Grave Culture (and apparently before that in the Khvalyn, Seroglazov, Samara, Middle Volga Cultures) [please select only one], and that the IE language in the 4th - 2nd millennia BC was positively migrating to the west rather than to the south (not true) and east toward India and Iran (might be true for R1a1, and not true for R1b1). And also that both tribes R1a and R1b spoke Indo-European languages, and that both tribes brought these languages to India and Iran (which is not true anyway, as is witnessed by their haplotypes, as described in this study). And also that both tribes R1a and R1b were advancing from Anatolia to the east as far as India. Or that both tribes were moving from the ancient Pit Grave Culture? In general, any counterargument must be SUBSTANTIATED. That is the only stipulation.

