Criticism of the Indo-European, Altaic & Afro-Asiatic theories

When dealing with language families as large as Indo-European, Altaic and Afro-Asiatic, it is obvious that the similarities inside each of the three are very limited. For example, there are few common roots between the Turkic and Tunguz branches, few between the Semitic and Omotic branches, and rather few between the Armenian and Celtic branches of Indo-European. Added to this lexical disparity are some sharp morpho-grammatical divergences.
[It is also obvious, though, that Indo-European is a rather recent phylum that disintegrated rather recently. This is shown, for example, by the reconstructed (or, more accurately, guessed) numeral system of Proto-Indo-European in comparison with the numeral system of Altaic and, to a lesser extent, by the huge differences between the numeral systems of the North Afrasian branches (Semitic/Araboabyssinic, Berber/Libyc & Egyptian) and those of the Kushomo-Tchadic group.]

1/Criticism of Indo-European

Evidence that most Indo-European Lexical reconstructions are artefacts of the linguistic method of analysis
Angela Marcantonio
University of Rome ‘La Sapienza’
A typical book on Indo-European linguistics takes for granted that the evidence for the Indo-European
family tree is ‘obvious’ or ‘compelling’. In more specialist research books, one finds a focus on how
to ‘fit’ the data to the model. The earliest example of this was Verner, who famously described
‘exceptions’ to Grimm’s Law. Verner modified the model by introducing ‘contextual specifications’,
thus converting the exceptions into ‘apparent exceptions’. Over the years, the counter-evidence to the
model has been ‘explained away’ with similar processes, so that today one can find very little direct
counter-evidence to the model. Here I examine this process critically. I ask if the linguistic method has
become circular – in other words, I ask “Have we fitted the data to the model, or have we made the
model so flexible that it can fit almost any data?”
1. Introduction
1. 1. The circularity issue
The objective of this chapter is to address the general issue of the flexibility, and related risk of
circularity which is embedded into the comparative method with the purpose of verifying whether, and
to which extent, these weaknesses may have a negative impact on the results it yields -- in this case,
specifically, the I-E comparative corpus -- and, therefore (arguably) on the I-E theory as a whole.
Following the results of the investigation, I shall argue that the great majority of the conventionally
stated I-E sound laws lack statistical significance and that, as a consequence, most of the conventionally
established correspondences (within a chosen corpus) are, in fact, not correspondences, but similarities,
most probably ‘chance resemblances’.

Among the weaknesses [note 1] embedded in the comparative method, that of circularity has long been
identified, and can be defined as follows (see also Fox [1995] and note 2). ‘Circularity’ is where we:
a) assume that a set of words are of I-E origin
b) make a reconstruction of the origin and history of the words in question through the (assumed)
sound-changes associated with this history (on the assumption that the words are of I-E origin)
c) observe that the words match the reconstruction
d) conclude that the words under comparison are of I-E origin because they match the reconstruction
Note 1: As to the issue of the scientific reliability of the Comparative Method, there is no room here to give a comprehensive account
of this long-standing debate, but see Introduction in this volume and related bibliography; see also Koerner (1989) and Salmons
& Joseph (eds, 1998).
Note 2: As Fox (1995: 63) puts it: “An inevitable problem is that of circularity: it is difficult to avoid some version of the vicious
circle that results from assuming that forms are cognate because they can be reconstructed with the same proto-phoneme, where
the proto-phoneme is itself the result of assuming that they are cognate”.
This is a circular argument because our conclusions depend on the original assumption -- it would be
different if, for example, we had assumed that the words under examination were of different origin,
belonging to different language families and then, during the reconstruction process, using the stated
methods and the relevant data, we might have come to a conclusion different from the initial
assumption. This type of circularity, in turn, stems (also) from the fact that the original choice of the
languages to compare is, by its very nature, based on intuitive and often subjective judgments --these in
turn being often influenced, or even dictated, by historical and socio-cultural factors.

That the risk in question is a real, concrete one -- and not just a theoretical possibility -- can be shown
with an equally concrete example: the issue of the well known correlations existing between Hungarian
(traditionally classified as ‘Uralic’) and Turkic (traditionally classified as ‘Altaic’). These correlations
are traditionally explained as ‘borrowed’, this being in turn the effect of long lasting, intense contacts
between the early ‘Magyar’ tribes and Turkic tribes (after the Magyars split off from the rest of the
Uralic proto-community).
However, there is no ‘independent evidence’ to support the thesis of this centuries-long
cohabitation, this ‘symbiosis’ (as it is usually referred to in the literature) and, therefore, of the ensuing
borrowing, either from history, or archaeology, or anthropology (let alone the thesis of the splitting off
from the rest of the Uralic proto-community). On the contrary, the symbiosis in question is ‘assumed’
only on the basis of the intense ‘borrowing’. Thus, the whole argument represents a classic example of
a vicious circle.
And, in fact, several scholars nowadays call into question the validity of both the Uralic (see for
example Marcantonio [2002]) and the Altaic theory (see for example Unger [1990]). A similar situation
is encountered within I-E too, with regard to the well known lexical, phonological and structural
similarities identified between Sanskrit and the other (non-I-E) languages (such as Dravidian and
Munda) that constitute the complex ‘South Asian linguistic area’ – see in this regard Masica (1976) as
well as the so-called ‘Aryan Debate’ (for which see Introduction and the chapter by Annamalai &
Steever in this volume). In particular, the identified similarities have been traditionally interpreted by
most scholars as ‘clear’ instances of borrowing (since the languages / language families in question are
not considered genetically related), despite the fact that, in practice, it is very difficult in this area (too)
to distinguish – through linguistic means only -- similarities attributable to genetic inheritance, or to
borrowing, or even to chance resemblances (as some scholars now do recognize, see for ex. Hock [1996:
Even if this ‘intrinsic circularity’ (as I would like to call it) were not considered to represent a
methodological difficulty, there is still embedded into the comparative method what one could call the
‘every-day practice circularity’, which has been effectively described by Morpurgo Davies (1998: 254) [note 3]
as follows:
A final agreement about the nature and validity of sound-laws was never reached. It was generally
accepted […] that testing any sound law against the data was bound to reveal a number of exceptions;
[….]. The neogrammarians did of course maintain that all the exceptions could be explained away by
re-defining the law, or by identifying a different starting-point, or by recognizing the interference with
analogical process, but they were immediately accused of circularity […].We can say that the sound
laws have no exceptions only because when we find an exception we eliminate it saying that there has
been analogical interference. On the other hand, we also say that the only way in which we can prove
that a form […] is analogical is by pointing out that otherwise it would be an exception to the sound [law].
Other typical means of ‘explaining away’ exceptions and items of counter evidence are: assuming
borrowing (even from unknown, extinct languages /dialects), re-arranging in a different order the stated
sequence of rules, postulating a (/another) laryngeal segment, etc. It is true to say that each of these
‘adjustable parameters’ – as these could be called -- does reflect a plausible, genuine linguistic process;
however, the problem is that the overall cumulative effect of many adjustable parameters added to
the definition of a given law may endanger the ‘cumulative effect’, that is, the ‘statistical significance’
any established ‘law’, or even ‘tendency’, should have. Thus, there is the danger that the explanatory
system ceases to be ‘explanatory’ and becomes an ‘ad hoc’ system, a system that is so flexible and
complex that the attained results are difficult to verify or, conversely, falsify. In addition to this, one
may observe that the task of the Comparative Method is to reject words that do not meet the stated
sound rules; it can therefore be no surprise that (at least in principle) one is left with only those words
that (supposedly) do meet them. This amounts to a licence to reject any items of counter-evidence to the
model, or, to put it another way, it introduces a systematic bias in the data which needs to be filtered out
(as we shall see, this bias can be filtered out using quantitative methods of analysis).
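The filtering bias described above can be made concrete with a small simulation. The sketch below is a toy illustration in Python, not a model of any real corpus: the consonant inventory, the one-to-one 'law' and the names `make_toy_law` and `fit_rates` are all invented for the purpose. It draws pairs of unrelated initial consonants at random, then keeps only the pairs that happen to satisfy the law, as a method that rejects non-conforming words would.

```python
import random

CONSONANTS = "ptkbdgmnszlr"  # hypothetical 12-consonant inventory


def make_toy_law(rng):
    """Build an arbitrary one-to-one 'sound law' from language A to language B."""
    targets = list(CONSONANTS)
    rng.shuffle(targets)
    return dict(zip(CONSONANTS, targets))


def fit_rates(n_pairs=10_000, seed=0):
    """Return (fit rate before filtering, fit rate after filtering)."""
    rng = random.Random(seed)
    law = make_toy_law(rng)
    # Unrelated word pairs: the two initials are drawn independently.
    pairs = [(rng.choice(CONSONANTS), rng.choice(CONSONANTS)) for _ in range(n_pairs)]
    fits = [law[a] == b for a, b in pairs]
    raw_rate = sum(fits) / n_pairs
    # Reject every pair that breaks the law, then re-measure on what is left.
    kept = [pair for pair, ok in zip(pairs, fits) if ok]
    kept_rate = sum(law[a] == b for a, b in kept) / len(kept)
    return raw_rate, kept_rate


raw, kept = fit_rates()
print(f"fit before filtering: {raw:.3f}; after filtering: {kept:.3f}")
```

Before filtering, the fit rate stays near 1 in 12, the chance level for this inventory. After filtering it is 1.0 by construction, which says nothing about whether the pairs are related.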
1. 2. The impact of circularity on reconstruction
The flexibility/circularity issue, however well known, has hardly ever been the object of a targeted, systematic and extensive investigation (to my knowledge)
in order to verify its possible impact on the soundness of I-E, or other traditionally established language
families. In other words, there have not been investigations whose purpose is that of assessing the
cumulative effect, the statistical significance of the traditionally established IE comparative corpus (but
see Brady & Marcantonio [2003], Marcantonio [2002] and [2003/5]).
There may be several reasons for this. It has been often claimed (see for example Nichols [1996])
that the comparative method is not a ‘heuristic method’, and therefore it can only be used ‘after’ a
language family has been already established -- through other means -- to trace back sound changes and
related correspondences. This being the case, it does not matter how many exceptional sound changes,
how many chance resemblances, ‘false matches’ there may be within a given comparative corpus.
Note 3: See Introduction for a fuller version of this quote [the Morpurgo Davies passage above].
Besides, a certain amount of items of counter evidence and /or of exceptions would not, in any case, be
sufficient to call into question the I-E model (in the same way that no theory can be rejected on the basis
of the existence of a certain amount of counter-evidence). For example, Ringe (1995:60) argues that:
“irregularity (in suitably small number) can be tolerated in working with languages whose relationship
has already been established beyond doubt, but they are potentially fatal weaknesses in an attempt to
prove a doubtful relationship”. Similarly, it has been claimed that there are linguistic areas for which we
‘know’ that the languages are related, but whose relatedness cannot be demonstrated by using the logic of
the comparative method, mainly because the correspondences are, simply, not regular. This is for
example the case of the languages of southeastern New Caledonia (Grace 1990 & 1996), whose
investigation appears to show that the languages are “obviously” genetically related, although the
method of comparison is hardly applicable. Following the same line of reasoning, other scholars have
stated that the establishment of just a few cognates, or even of one single cognate, would be enough to
demonstrate genetic relatedness. This is, for example, the position held by Harrison (2003: 217), who
also observes that when the number of putative cognates and /or correspondence sets approaches a level
that is not statistically significant (i.e., that might be attributable to chance) “the C[comparative]
M[ethod] has ceased to work”. Another reason for the lack of interest in assessing the impact of
circularity on I-E (and elsewhere) may lie in the fact that, in order to carry out this type of investigation,
one needs to apply the appropriate methodology, that is, quantitative, statistical methods of analysis.
However, most linguists appear to believe that statistics is not a meaningful tool within historical
linguistics, because the task of the historical linguist would not be that of quantifying how many items
of evidence support the model, or, conversely, how many items of evidence counter to the model can be
identified. The task of the historical linguist is instead that of finding those very factors that have
triggered the irregularities, and therefore, to justify them. In other words, counter-evidence is nothing
but irregularities that we have not been able to justify (yet). This way of reasoning ties in well with the
thesis that the comparative method is not heuristic, or that genetic relatedness cannot always be
demonstrated, being simply ‘obvious’ / ‘evident to the naked eye of the trained scholar’ (see again
Nichols 1996).
Thus, assessing the cumulative effect of sound laws and the statistical significance of a
comparative corpus is simply an irrelevant task here, a task that does not help us in our understanding of
language development and sound changes, and does not lead anywhere.
1. 3. The advantages of quantitative methods of analysis
In contrast to the points of view summarised above, there have been several linguists who have pointed
out the advantages for historical linguistics to adopt (also) quantitative methods of analysis, and have
themselves carried out several (types of) quantitative investigations. Here the major advantage would be
that of relying on methods that are ‘objective’, being by their very nature ‘replicable’ and, therefore,
truly verifiable or falsifiable; see for example Chrétien (1937); Ringe (1992, 1993, 1995, 1998, 1999);
Ringe et al. (2002); McMahon & McMahon (2003) and McMahon & McMahon (2005) [note 4]. In particular,
according to Ringe (1999: 213), it is fundamental that any observed similarity between two languages
be supported by a demonstration that the similarity in question could not be the result of sheer
chance, since providing a demonstration of this sort is “non-negotiable” for a linguistic
discipline which purports to be scientific. On several occasions (see for example [1995:65]) Ringe
also refers to what he calls “the ‘brute force’ probabilistic analysis” that would correlate a very large
number of pairs from I-E languages, as against the correspondences of poor statistical value which would
correlate the various languages from other language families. Ringe (1995: 68-71) further observes that,
since one cannot prove that languages are not related, those who propose specific correlations among
languages have to provide “objective proof” of them, otherwise “scientific historical linguistics is
simply impossible”. Similarly, McMahon & McMahon (2005:26-7) state that:
The current difficulty comparativists face is our inability to test and demonstrate family relationships,
so that these can either be proved beyond reasonable doubts, or refuted. If we cannot tell good results
from bad ones in a formal and repeatable way, we cannot hope to distinguish good methods from bad
ones either. […..] The relative informality of the [comparative] method, and the lack of testability
beyond the checks built into the method itself, mean we have to rely on the experience and integrity of
individual practitioners to do so.
Note 4: For a detailed account of the history of quantitative, historical linguistics – scope of applicability, methods and results -- see
McMahon & McMahon (2003) and (2005); see also the chapter by Drinka in this volume.
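The kind of ‘formal and repeatable’ chance baseline called for here can be sketched with a generic permutation test. This is a minimal Python sketch under stated assumptions, and it is not the procedure used by Ringe or by McMahon & McMahon: similarity is reduced to identity of the first segment, the function names are hypothetical, and the word lists are invented placeholder strings rather than real language data.

```python
import random


def initial_matches(words_a, words_b):
    """Count meaning slots where both words begin with the same segment."""
    return sum(a[0] == b[0] for a, b in zip(words_a, words_b))


def permutation_p_value(words_a, words_b, trials=5_000, seed=0):
    """Estimate how often shuffled pairings match at least as well as the real one."""
    rng = random.Random(seed)
    observed = initial_matches(words_a, words_b)
    shuffled = list(words_b)
    at_least_as_good = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the link between meaning and form
        if initial_matches(words_a, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_good + 1) / (trials + 1)


# Invented placeholder strings, not real language data.
list_a = ["pa", "ta", "ka", "ma", "na", "sa", "la", "ra"]
list_b = ["pi", "tu", "ko", "me", "ni", "su", "lo", "re"]
observed, p = permutation_p_value(list_a, list_b)
print(observed, p)
```

Because the seed and the number of trials are fixed, anyone can rerun the test and obtain the same estimate, which is the sense of ‘replicable’ at issue in this section.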
Despite programmatic (and, in my opinion, correct) statements of this kind, hardly any quantitative
investigation carried out thus far within historical linguistics has addressed the issue of the statistical
significance, or, to use Ringe’s words, “the ‘brute force’ probabilistic analysis” that, supposedly,
correlates a very large number of correspondences from the I-E languages; on the contrary, this is taken
for granted (but see Bird’s (1982) and Ringe’s (1995) analyses of Pokorny’s dictionary [note 5]). These
investigations have in fact centered on assessing the soundness of the traditional, internal sub-branching
of I-E (see Ringe et al. [2002] and McMahon & McMahon [2005]), or on comparing the I-E
comparative corpus with that of the Nostratic /Eurasiatic macro-family (Ringe 1995, 1998), in order to
show how the latter lacks statistical significance and is, therefore, not a valid family. For example Ringe
et al. (2002) attempt to recover the first order subgrouping of the I-E family using a new computational
method, and this is because, according to the Authors, through the traditional methods of historical
linguistics no consensus has ever been reached “on how those ten robust subfamilies [of I-E] are related
to one another cladistically” (p. 81). In a word, typically the I-E family, whose statistical significance
and soundness is not under scrutiny, is used as a ‘control case’ against which to compare the statistical
significance, and, therefore, the validity of other language families.
Thus, whether implicitly assumed or explicitly stated, whether considered to be relevant or
irrelevant for the purpose of evaluating the soundness of the I-E theory, the positive statistical
significance of the I-E corpus appears to be, in any case, taken for granted, and this is the point where
my personal opinion differs from that of most linguists. In other words, I believe that:
a) Assessing the statistical significance of the comparative corpus of any language family
(including I-E), does actually matter. In fact, in my opinion, when the number of putative
cognate sets approaches a level that is not statistically significant – to use Harrison’s words --
this does not mean that “the C[omparative] M[ethod] has ceased to work”, but, on the contrary,
it means that we are facing clear instances of evidence counter to the predictions of the model;
b) Even if one did not agree with this stand of mine — and, as we have seen, many historical
linguists in fact do not – I believe it is nevertheless a useful exercise to investigate the statistical
significance of the I-E comparative corpus, at least to have an empirically assessed, objective
picture of the status of I-E with regard to, for example, those language families that hardly
contain any proper correspondences (for which see discussion above);
c) If one accepts what is stated in points (a) and (b) above, then the statistical significance of I-E
has to be properly and empirically verified, instead of being simply assumed, and this
verification can only be achieved through the help of quantitative, replicable types of analyses.
Note 5: As far as I know, the only attempts to assess the statistical significance of an IE corpus have been made by Bird (1982) and
then by Ringe (1995), although the purpose of their analysis is not that of verifying the soundness of the I-E theory. The
purpose of these studies is that of comparing the distribution of cognates in Nostratic with the distribution of cognates in
“uncontroversial language families” (Ringe 1995:64), such as I-E, in order to show the difference between the two cases and
draw the necessary conclusions. More specifically, Ringe analyses the comparative corpus presented in Pokorny (1959-69),
using the statistical summary of the distribution of cognates in the dictionary as carried out by Bird (1982). The specific
purpose of the research is that of investigating whether the distribution of cognates in Pokorny does produce a curve that can be
generated randomly, that is, whether these cognates might be (at least in part) the effect of chance resemblances, and then to
compare the I-E situation with the distribution of cognates within Nostratic. Ringe (1995:65) concludes that: “Pokorny’s
evidential standards are lamentably lax; he includes a large number of items which probably should not be reconstructed for
Proto-Indo-European (PIE), either because the cognate set shows too many irregularities of one sort or another, or because
‘sound symbolism’ of various kinds can account for the similarities observed”.
Bearing these general issues in mind, I shall examine a significant sample of the conventionally
established I-E phonological / lexical and morpho-phonological comparative corpora in order to verify
whether, and to what extent, the circularity problem has affected them. In practice, I shall try to assess
the ratio between the number of rules -- and related ‘adjustable parameters’ -- needed to establish a given
correspondence and the overall number of correspondences those (more or less large) sets of rules are
capable of accounting for. Once the picture on this aspect of I-E studies is clear, then one can draw
his/her own conclusions on whether the attained results have any bearing on the validity, or otherwise,
of the theory as a whole, according to his/her stand on the general, underlying issues discussed above
and in the Introduction.
1. 4. The quantitative analysis: Summary of the methods and results
The data examined in this investigation are taken from a well defined source, the LIV dictionary of I-E
verbal roots (Rix [1998]) -- a modern and (supposedly) much more rigorous dictionary than its
predecessors, and one which represents the current, widely accepted procedures for the
reconstruction of the I-E roots and stems.
My investigation involves simply counting and categorising all the reconstructed verbal roots present in
LIV [note 6], as well as any other relevant element associated with the roots in question (such as meaning(s),
morpho-phonological alternations, presence vs absence of laryngeal segments, etc., for which see
discussion below), and then simply mapping the results of the counting into standard types of graphs.
No sophisticated statistical analysis and /or computer modelling of the type used by Ringe or McMahon
& McMahon (quoted) are in fact used here. The quantitative investigation of the verbal roots is designed
to quantify the following main instances of circularity, as indicated by the reported results:
Result 1: 32% of the roots are reconstructed on the basis of data drawn from one single branch of the I-E
family. Thus, the evidence suggests that these roots may be of local or other origin. However, the
assumption is made that they are of I-E origin and reconstructions are proposed; it is then self-evidently
circular to conclude that these words are of I-E origin just because of the establishment of a [reconstruction].
Result 2: 34% of the roots are reconstructed on the basis of data drawn from two language branches
only. As is known, reconstructions of this sort are not safe, because the possibility of chance
resemblance or borrowing is quite high in these circumstances. In spite of this, the assumption is
made that these roots are of I-E origin, and reconstructions are proposed. However, once again, it would
be circular to conclude that these words are of I-E origin because of the existence of reconstructions.
Result 3: Most reconstructions of the verbal roots consist not just of a single reconstruction, but of a set
of alternative ‘sub-reconstructions’ – a set of alternative starting-points which are called ‘alternations’.
This is justified by the observation that morpho-phonological alternations are indeed one of the features
characteristic of (several) I-E languages. Although this is undoubtedly the case, if not used carefully,
this procedure can make the method circular.
Note 6: Note that, strictly speaking, LIV does not list ‘indogermanischen Wurzeln’, but ‘indogermanischen Verben’, that is, those
provided with ‘Primärstämme’ (see below). Note also that I have not used the second edition of LIV (2001) because it was not
yet available when this research was carried out.
To take an extreme example, if an attested language A uses the word dog whilst another language B uses
the word cat to refer to what is supposed to be the same or similar reference, then if one assumes (a
priori) that the words belong to genetically related languages, one might reconstruct the proto-word *cat
~ *dog in order to justify the actual difference in sound (and often also in meaning). In the analysis
below I’ll show that the number of alternative ‘sub-reconstructions’ reconstructed for each verbal root
typically increases -- and increases linearly -- with the number of language branches
that support the root. This is exactly what one would expect if the *cat ~ *dog fallacy were being used.
No alternative explanation for this clear pattern in the data can really be found. Thus, reconstructions of
this sort are based on the assumption that the words are related, and so it would be circular to conclude
that in fact they are so related, just because of the establishment of ‘reconstructed’ -- as opposed to
actually ‘attested’ -- morpho-phonological alternations.
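Why the *cat ~ *dog procedure would produce linear growth can be seen in a toy model. The Python sketch below is an illustration under invented assumptions, not an analysis of the LIV data: each additional branch is assumed to supply a form that fits an already posited alternant with a fixed probability `p_fit` (a hypothetical parameter) and otherwise forces a new alternant.

```python
import random


def alternants_needed(n_branches, p_fit, rng):
    """Count alternants when each new branch either fits an existing one or adds its own."""
    count = 1  # the first branch always supplies one starting form
    for _ in range(n_branches - 1):
        if rng.random() >= p_fit:  # the new form fits nothing already posited
            count += 1
    return count


def mean_alternants(n_branches, p_fit, trials=20_000, seed=0):
    """Average alternant count over many simulated roots."""
    rng = random.Random(seed)
    total = sum(alternants_needed(n_branches, p_fit, rng) for _ in range(trials))
    return total / trials


def expected_alternants(n_branches, p_fit):
    """Closed form for the same toy model: linear in the number of branches."""
    return 1 + (n_branches - 1) * (1 - p_fit)


for k in range(1, 9):
    print(k, round(mean_alternants(k, p_fit=0.4), 2), expected_alternants(k, p_fit=0.4))
```

In this model the expected number of alternants is 1 + (k - 1)(1 - p_fit) for k branches, so each added branch contributes the same increment. The model shows only that the mechanism predicts a linear pattern; it does not by itself exclude other sources of such a pattern.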
The objects of my investigation are the phonological / lexical correspondences, not just because
dictionaries deal with lexical correspondences, but because at the lexical level there is at least one clear
criterion (however much disputable) one can rely upon in order to carry out the analysis and against
which to verify the attained results: the regularity principle and the sound laws (supposedly) established
on the basis of this principle. I shall not investigate morphological correspondences because at this level
the operation of the regularity principle is, admittedly, (even) much less regular than at the phonological
level, therefore it is not clear which guiding principles and criteria one can rely upon in order to
establish a match, if not an intuitive, ‘naked-eye’ observation of (a certain degree of) similarity among
the identified morphological elements. In other words, in general, it is not clear how to validate the
morphological model against the evidence, since there are not clearly specified laws and there is much
greater reliance on processes such as analogical leveling, whose conditions of operation do not appear to
be consistently defined using predictive and verifiable criteria.
However, as mentioned above, I have examined a significant sample of morphophonological
correlations, which are instead easier to analyze through quantitative methods because of the
(supposedly) rigorous correlation between sound alternation and morphological information.
As we shall see, the results of this investigation show that the lexical corpus and at least one
specific morpho-phonological corpus assembled in LIV definitively lack statistical significance, as a
consequence of which it is reasonable to assume that most of the correspondences there assembled are
‘chance resemblances’.
However, the possibility that the traditionally established comparative corpus of I-E (like that of
any other language family) might contain a (more or less) high number of chance resemblances should
not come as a total surprise. In fact, Hock (1993 & 1994), for example, considers the issue of chance
resemblances (with particular reference to the South Asia linguistic area) to be an “especially
troublesome” issue. He has therefore assembled long lists of chance resemblances among several I-E
and Dravidian languages [note 7], just to observe that (1996:37): “As the continuous controversy over remote
linguistic relationships shows, there is no generally accepted answer to the question: What is the chance
of similarities being accidental?”. Well, I hope the analysis carried out below will enable us to get a
fairly faithful picture of the level of chance similarities embedded within a major I-E lexical corpus.
Note 7: Hock (1993) presents a long list of what he considers to be chance resemblances between Tamil and English as well as Indo-Aryan
and English. Hock (1994) presents a list of Dravidian /Indo-Aryan chance similarities. He shows how difficult it can be
to tell apart proper correspondences, borrowing and chance resemblances not only within macro-families but also within well
established families.
2. The statistical significance of the Indo-European comparative corpus assembled in LIV
2. 1. Counting the verbal roots
Counting all the verbal roots listed in LIV, one observes that 200 of the 683 ‘safely reconstructed’ roots
(‘safely’ according to the dictionary itself), are reconstructed on the basis of only one daughter language
or language branch (‘witness’ to the reconstruction). In the absence of any comparison with another
language branch, it is hard to discern the basis of the claim that the roots descended from ancient I-E
rather than from some other source, much less that they have been ‘safely’ reconstructed as Indo-
European. In fact, if a word is present in only one language group then this strongly suggests quite the
opposite – that the word is of local origin.
most often false.
Likewise, an equally high number of roots (34%, as mentioned) have been reconstructed on the basis of
only two daughter language branches as ‘witnesses’. There is an unacceptably high likelihood of false
matches if one operates with only two ‘witnesses’ to support the reconstruction, because it is too easy or
tempting to pick out similar forms that may or may not be truly related. This observation is generally
accepted among linguists; for example, Meillet (1934: 41) proposes the ‘three-witnesses criterion’, whilst
Greenberg (2005) suggests four languages as a sufficient number to establish a historical relationship.
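The reasoning behind such witness thresholds can be sketched with a binomial calculation. This is a simplified Python illustration, not a method taken from Meillet or Greenberg: it assumes that each branch independently offers a chance look-alike with the same probability, and the value 0.05 is purely hypothetical. The figure of ten branches echoes the ‘ten robust subfamilies’ quoted earlier.

```python
from math import comb


def prob_at_least(k, n_branches, p):
    """P(at least k of n_branches show a chance look-alike), branches independent."""
    return sum(
        comb(n_branches, i) * p**i * (1 - p) ** (n_branches - i)
        for i in range(k, n_branches + 1)
    )


N_BRANCHES = 10   # the text mentions ten I-E subfamilies
P_CHANCE = 0.05   # hypothetical per-branch chance of a look-alike

for k in (1, 2, 3, 4):
    print(k, round(prob_at_least(k, N_BRANCHES, P_CHANCE), 4))
```

Under these assumptions the probability of reaching a threshold by chance falls steeply with each extra witness required, which is why sets resting on one or two branches are the ones most exposed to false matches.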
This simple count shows that 66% of the roots in the LIV corpus cannot be relied upon
because they are reconstructed on the basis of just one or two language / branch witnesses. They
therefore lack statistical significance, and it is hard to imagine on what quantitative basis they can be
considered as reflections of genuine I-E correlations. As a matter of fact, Campbell (1998:115),
obviously accepting the three-witnesses criterion as relevant, criticizes the ‘Nostratic’ comparative
corpus as assembled by Illic-Svityc (1971-84) for the fact that 35 % of the cognate sets have been
established on the basis of only two member families. On the basis of this and other methodological
flaws Campbell rejects the Nostratic hypothesis as not scientifically proven (correctly, in my opinion).
Thus, on the same basis, for the sake of consistency, one has to reject 66% of the verbal roots as
reconstructed in LIV. Discounting those unreliable roots whose reconstruction does not meet the three-witnesses
criterion, this leaves us with 34% of the roots whose reconstruction is supported by three or more witnesses
and which are therefore worth examining as the basis of a genuine I-E comparative corpus. Of this 34%, nearly
half have been reconstructed using laryngeal segments, whose status as phonetic/phonological I-E
segments is still disputed, despite the fact that the laryngeal theory is now widely (if not universally)
accepted. This leaves 126 roots (that is 18% of the total) that could form evidence, in principle, for
genuine linguistic correlations, because their reconstruction meets the three-witnesses criterion and does
not have recourse to laryngeal segments.
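The arithmetic of this section can be retraced in a few lines. The Python sketch below uses only the figures quoted in the text; the variable names are illustrative, and the head count for the 34% group is an approximation obtained by rounding, since the text gives that group only as a percentage.

```python
TOTAL_ROOTS = 683    # 'safely reconstructed' roots, as stated in the text
ONE_WITNESS = 200    # roots resting on a single branch, as stated in the text
PCT_ONE = 32         # share quoted for one-witness roots
PCT_TWO = 34         # share quoted for two-witness roots
LAWFUL_ROOTS = 126   # three or more witnesses and no laryngeals


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole


rejected_pct = PCT_ONE + PCT_TWO                           # one or two witnesses
kept_pct = 100 - rejected_pct                              # three or more witnesses
kept_roots = round(TOTAL_ROOTS * kept_pct / 100)           # approximate head count
lawful_pct = percent(LAWFUL_ROOTS, TOTAL_ROOTS)
laryngeal_share = percent(kept_roots - LAWFUL_ROOTS, kept_roots)
one_witness_pct = percent(ONE_WITNESS, TOTAL_ROOTS)

print(rejected_pct, kept_pct, kept_roots)
print(round(lawful_pct, 1), round(laryngeal_share, 1), round(one_witness_pct, 1))
```

The quoted 66%, 34%, 18% and ‘nearly half’ are mutually consistent on this reading. One figure does not line up exactly: 200 of 683 comes to roughly 29%, a little below the 32% quoted for one-witness roots, and the text does not say how that percentage was derived.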
The next step will be to try and count the number of linguistic rules that have been used to
reconstruct the LIV corpus. Unfortunately LIV does not state which rules are used in its reconstruction,
and therefore one must look at the general literature and assume that these rules have been adopted in
the dictionary. According to Collinge (1985), there are 54 ‘major laws’ and 11 ‘major tendencies’, giving a
total of 65 general rules.
In addition, for each rule there are a variable number of ‘contextual specifications’ or ‘adjustable
parameters’. The precise number is very difficult to ascertain, because, of course, this number depends
on single interpretations of a given Law (of which often there are several): those interpretations that
admit fewer exceptions tend to contain more adjustable parameters, since these are needed to ‘justify’
the exceptions in question and transform them into ‘apparent exceptions’. Brady & Marcantonio (2003)
have counted between 165 and 202 such adjustable parameters of a general (as opposed to
language-specific) kind for all the Laws (not major tendencies) listed in Collinge (1985). This figure is
an under-estimate, since it leaves out the ‘language-specific’ contextual specifications that are
evidently used in LIV (in the form of ‘explanatory notes’) to get the expected reflex in all the relevant
languages, but that are not mentioned by Collinge, who merely states the general Law.
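The counts cited so far can be tallied against the 126 roots identified earlier. The sketch below is only an illustration: treating every Law and every adjustable parameter as one unit of the same kind is a simplifying assumption made here, not a claim from Collinge or from Brady & Marcantonio.

```python
# Counts as quoted in the text (Collinge 1985; Brady & Marcantonio 2003).
major_laws = 54
major_tendencies = 11
general_rules = major_laws + major_tendencies  # 65 general rules

# Range of general adjustable parameters counted for the Laws only.
params_low, params_high = 165, 202

# Laws plus their adjustable parameters (simplifying assumption:
# each Law and each parameter is counted as one unit).
total_low = major_laws + params_low
total_high = major_laws + params_high

lawful_roots = 126  # roots with >= 3 witnesses and no laryngeals

print(general_rules, total_low, total_high)
print(round(total_low / lawful_roots, 2), round(total_high / lawful_roots, 2))
```

Even the lower bound of this tally exceeds the number of roots it is meant to explain.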
Whatever the overall number of Laws plus their adjustable parameters might be, their number is
at least comparable with, and almost certainly higher than, 126, the number of the roots that meet the
three-witnesses criterion and do not contain laryngeals. Brady & Marcantonio (2003) call these roots
‘lawful’ – just to pick a name – that is, they have been established according
to the conventional Laws, without recourse to laryngeal segments. The diagram below (from
Brady & Marcantonio 2003) illustrates this situation: there is only a very small number of roots that are
widely represented, being attested in 6 or more languages / language branches; this number becomes
even smaller if one does not count the roots containing laryngeal segments (or if one does not accept the
particular version of the Laryngeal theory adopted by LIV [see below]):
2. 2. Are the Indo-European morpho-phonemic alternations genuine or artefacts?
2. 2. 1. We shall see in this section that there is evidence of a strong bias in the reconstruction of the
roots themselves in the area of vowel alternation (Ablaut), and associated morpho-phonology. This bias
has helped researchers to find correlations where, according to our analysis, they are not justified and
may be an artefact of the method of analysis. As we shall see below, the bias is betrayed by the fact that
the roots whose reconstruction has been established on the basis of a large number of I-E languages /
language groups tend also to display a large number of morphophonemic alternations.
Vowel alternations, both in nouns and verbs, are widely claimed to be a distinctive feature of I-E,
and many of the vowel alternations present in the modern languages have, supposedly, been
inherited from alternations already existing in the proto-language (see the chapter by Carruba in this
volume for a modern account of this phenomenon). In particular, among the inherited alternations in the
verbs, there are claimed to be morpho-phonemic alternations of the radical vowel(s) -- ‘radical Ablaut’--
that is, alternations systematically associated with morphological information. In other words, different
grades are associated with different tenses and / or moods (at the singular active within a given
declension). Thus, most root reconstructions typically consist not just of a single reconstructed root, but
of a set of alternative reconstructions, postulated in order to represent (also) the various (morphophonemic)
alternations, supposedly already existing in the proto-language. For example, according to
Szemerényi (1996: 91), the set of the basic Ablaut variations of the disyllabic root *genē- ‘to bear, be
born, beget’ is the following: *genə-, *gnē-, *gṇ- (for the e-grade series); *gonə-, *gnō- (for the o-grade
series) and *gēnə-, *gōnə- (for the long grade series, in first syllable only). Since schwa can be lost
before vowels, also forms of the type *gen-, *gon- and *gēn- can occur.
Although gradation is a common phenomenon in languages, it is intuitively clear that, with so many
starting-point reconstructions, it can be a priori quite easy to match almost any form found across a
given linguistic area. Indeed, for example, it has been necessary to postulate up to ten alternative
reconstructions for *genē-, in order to account for the actually attested and spottily distributed forms.
This has been taken to the extreme where the number of the reconstructed alternations (10) is close to
the number of attested ones (there are 16 different forms from only 4 languages: Sanskrit, Latin, Greek
and Germanic). In other words, postulating so many gradations at the level of the proto-language makes
it an easy job to ‘prove’ that the specific, attested patterns of alternation -- rather than, say, the general
phenomenon of gradation -- have been inherited from the supposed proto-language [8]. In layman’s terms,
we are close to the hypothetical extreme discussed above, where we have two attested words, dog and
cat, and reconstruct that they are connected via the I-E form *cat ~ *dog. We shall examine this in more
detail below.
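The intuition that many starting-point reconstructions make matching easy can be put in numbers with a minimal sketch. It assumes that each postulated variant has the same small probability of matching an unrelated attested form by chance, and that the variants behave independently; both assumptions, and the value 0.05, are hypothetical and chosen only for illustration.

```python
def chance_match_probability(p_single: float, n_variants: int) -> float:
    """Probability that at least one of n_variants matches by chance.

    Assumes every variant has the same chance p_single of matching and
    that the variants are independent (illustrative assumptions only).
    """
    return 1.0 - (1.0 - p_single) ** n_variants


# Hypothetical per-variant chance of a spurious match.
P_SINGLE = 0.05

for n in (1, 2, 5, 10):
    print(n, round(chance_match_probability(P_SINGLE, n), 3))
```

Whatever value is chosen for the single-variant probability, the probability of at least one spurious match can only grow as variants are added.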
8 The fact that many alternating starting-point reconstructions can be too powerful a tool of analysis has also been pointed out
within Uralic studies, where most scholars refrain from reconstructing proto-alternations, that is, attributing them to the Uralic
proto-language. [AF: then what is vowel harmony if not an alternation?]
Let us now discuss in detail the verbal roots as reconstructed in LIV. Each reconstruction consists of two
or more alternative reconstructions (alternative forms), which I shall call ‘variants’. These variants
consist, mainly, of vowel alternations / morpho-phonemic variations, although often they also contain
affixes (infixes/suffixes). These affixes – called by LIV ‘primary affixes’ -- enlarge the original root and
are claimed to convey aspectual information and /or to modify the original meaning of the bare root,
although it is self-evident that several roots do display a bewildering array of different formations whose
origin /function cannot be traced back and, therefore, justified (for a discussion of this issue see
Clackson [2007:151 ff.]). Like the root, affixes too may alternate. In a word, by ‘variant’ I mean each
different alternating form -- consisting of Ablaut only or Ablaut + primary affix – that, together with the
other, similar forms in the set, contributes to make a reconstructed root (and, of course, the various,
reconstructed stems derived from it). These variants are regarded as different aspects of a single
reconstruction, and many of them contain laryngeal segments, either in the root or in the affixes, or both
(according to the laryngeal theory version adopted by LIV). Let us give a couple of examples: the root
*deh3 (*dō-, according to pre-laryngeal notation) ‘to give’ (LIV 1998:89, 2001:105-6) and the root
*dek’ ‘to take, perceive/notice’ (LIV 1998:93-5, 2001: 109-11):
*deh3 ‘to give’
1) Aorist: *déh3- / dh3-
2) Present: *dé-doh3 / dh3
3) Desiderative [9]: ?*di-dh3-sé
*dek’ ‘to take, notice’
1) Aorist: *dék’- / dek’-
2a) Present: *dék’- / dék’-
2b) Present: *dek’-néu / nu-
3) Perfect: *de-dók’ / dek’-
4) Causative: *dok’-éie-
5) Desiderative: *di-dk’-sé
6) Intensive: *dék’-dok’ / dek’-
7) Essive: *dek’-h1ié-
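The way variants are counted for these two entries can be sketched as a small data structure. The forms are copied from the listing above; the dictionary layout, the names DEH3 and DEK, and the helper functions are illustrative choices made here, not part of LIV.

```python
from typing import Dict, List

# Each tense/mood maps to the alternating forms listed for it above.
DEH3: Dict[str, List[str]] = {
    "Aorist": ["déh3-", "dh3-"],
    "Present": ["dé-doh3", "dh3"],
    "Desiderative": ["di-dh3-sé"],
}

DEK: Dict[str, List[str]] = {
    "Aorist": ["dék’-", "dek’-"],
    "Present (a)": ["dék’-", "dék’-"],   # both alternants as printed above
    "Present (b)": ["dek’-néu", "dek’-nu-"],
    "Perfect": ["de-dók’", "dek’-"],
    "Causative": ["dok’-éie-"],
    "Desiderative": ["di-dk’-sé"],
    "Intensive": ["dék’-dok’", "dek’-"],
    "Essive": ["dek’-h1ié-"],
}


def count_variants(root: Dict[str, List[str]]) -> int:
    """Total number of alternating forms (variants) listed for a root."""
    return sum(len(forms) for forms in root.values())


def count_categories(root: Dict[str, List[str]]) -> int:
    """Number of distinct tenses/moods; Present (a) and (b) count once."""
    return len({name.split(" (")[0] for name in root})


print(count_variants(DEH3), count_categories(DEH3))
print(count_variants(DEK), count_categories(DEK))
```

Forms are counted as listed, so the two Present (a) alternants count as two variants even though they are printed identically.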
As one can see, the root *deh3 consists of a set of five alternative forms, five variants in total, across 3
tenses / moods: two variants associated with Aorist, two with Present and only one with Desiderative,
the latter being enlarged with the suffix -sé-. Equally, the root *dek’ consists of a set of 13 variants,
across seven tenses/ moods: two variants associated with Aorist, two with type (a) and two with type (b)
Present [x]; two variants associated with Perfect; one with Causative and one with Desiderative; two again
associated with Intensive, and one with Essive. Type (2b) Present, Causative and Desiderative are also
enlarged with affixes. The suffix -néu/nu-, which enlarges the Present tense of type (b) in the root *dek’,
also, in turn, displays alternation. The fact that there is not just one single form / variant, but a sub-set of
two different, alternating variants associated with most tenses /moods is justified by the observation that
the tenses /moods in question tend to display alternations also internally, within their declension
(according to conventional reconstructions): typically, an alternation between full grade in the singular
active of the Indicative/Injunctive vs reduced or Ø-grade elsewhere. In other words, the overall set of
variants typically consists of what one could call ‘horizontal’ and ‘internal’ variants, that is,
respectively, alternating forms associated with different tenses / moods (such as Present vs Perfect), and
alternating forms associated with the internal paradigm (such as Singular vs Plural / Dual within a given
As an example of a specific (horizontal) morpho-phonemic alternation let us consider the Present
vs Perfect vs Aorist alternation in Greek, widely quoted as illustration of the phenomenon. As
conventionally stated, a full, e-grade is associated with the Present tense, an o-grade is associated with
the Perfect tense, whilst the Aorist is associated with the ø-grade (at the Singular Indicative/ Injunctive
active), as illustrated by the Greek [xi] verbs ‘to leave’ and ‘to see’ below:
Present            Perfect      Aorist
leipo ‘leave’      le-loip-a    e-lip-on
dérko-mai ‘see’    de-dor-ka    e-drak-on (< *drk-)
9 The question mark is in the text, meaning that the reconstruction is considered by LIV as unsure.
Morpho-phonemic alternations of this sort, if substantiated by a good number of items of evidence
(whatever ‘good’ might mean) across the I-E languages, could indeed be a strong clue of genetic
inheritance, because a widely attested, constant association between a given radical vowel grade and a
given piece of morphological information can hardly be the result of chance resemblance. However, the
reality is that the evidence for postulating so many variants in the reconstruction of a single root
(including many types of Present and several enlarging, primary affixes) appears to be a fabrication of the method,
as I am going to argue below, because of the following facts: A) as anticipated, roots established on the
basis of a high(er) number of I-E daughter languages / branches have been reconstructed with a high(er)
number of alternative variants, this being an indication of a strong bias (see par. 2. 2. 2. below); B) a
detailed investigation of one specific morpho-phonemic alternation, the Present vs Perfect, has revealed
that it is, indeed, poorly attested, thus confirming the suspicion that most of the postulated, reconstructed
alternations are indeed artefacts of the method (see par. 2. 2. 3. below).
2. 2. 2. Let us then start with the issue raised in point (A) above, that is, the correlation between the
number of languages / language groups that support a given reconstructed root and the number of
variants reconstructed for that very root. Here the crucial factor is that this correlation is quite strong,
and the constant of proportionality is very close to ‘one’. This suggests that, on average, one variant
(alternative form), has been added to the root each time a new daughter language (branch) is claimed as
being supporting evidence for establishing that root. There are several hundred such roots in LIV. For
each verbal root, I have counted the number of reconstructed variants (V) associated with the root, and
the number of daughter languages (language groups/branches) (G) in which at least one (supposedly)
derived verbal item is attested. These two values (V and G) should be independent of one another if the
correlations among the assumed cognates (and related distribution of radical Ablaut) are to be
considered genuine.
On the other hand, if the reconstructions were due to chance resemblances, one would expect
the two values V and G to be positively correlated, because the higher the number of variants
postulated, the higher the likelihood of establishing a match, or better, a false match among the
language groups. This is indeed what my counting has revealed. As shown in the graph in figure (II), the
average number of variants is strongly correlated with the number of daughter language groups
accounted for by the root. In fact, the data fit the formula V = 1.6 + G extremely well for the range of 1
to 5 groups:
Thus, this clear correlation, with a proportionality factor of very close to ‘one’, strongly suggests that,
on average, one new variant has been ‘appended’ to the reconstruction with each newly claimed
daughter language group; in turn this suggests that the majority of the ‘cognates’ reported in this corpus
are artefacts of the method of reconstruction. There is, nevertheless, a departure from this linearity
where G > 5 (see figure V below). In other words, there is a small number of roots -- 4% -- whose
reconstruction is supported by 6 or more daughter language groups and that are, therefore, potentially
(statistically) significant. I shall discuss this aspect below, trying to find a possible explanation for this
departure from the observed linearity.
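How a slope and an intercept of this kind are estimated can be sketched with an ordinary least-squares fit. The points used below are not the LIV counts, which are not reproduced in this text: they are synthetic values placed exactly on the line quoted above, read here as relating the average number of variants V to the number of groups G (an assumption about the notation). The fit therefore simply recovers a slope of one and an intercept of 1.6.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only: G = 1..5 and V taken from the quoted line.
groups = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_variants = [1.6 + g for g in groups]

slope, intercept = fit_line(groups, mean_variants)
print(round(slope, 3), round(intercept, 3))
```

With real counts the same function would return the empirical slope, whose closeness to one is the point at issue.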
There are two possible ways one might attempt to account for this linear relationship between the
two values (G and V). Linguists have already asked themselves why there are many fewer alternations
in the attested languages than in the reconstructions. The conventional interpretation (see for example
Szemerényi (1996:83-93); Sihler (1995: 109-111), etc.) assumes that different languages may inherit
different variants from the multiplicity of true original, I-E variants of a root. Thus, it should not be
surprising that, most often, only a few and /or different grades (out of a complete series) are found
scattered across different languages. In other words, each new language group claimed to support the
root under investigation, allegedly, uncovers more variants from the original, I-E complete set of
variants. This is indeed the thesis espoused also by LIV and by Carruba (in this volume): the fact that
most I-E languages do not, in fact, typically display the complete, or almost complete, series of a
claimed (morpho-phonemic) alternation is justified by assuming a widespread process of analogical
leveling in the daughter languages (although Carruba admits that the I-E apophonie – as he calls this
phenomenon – has not yet been fully understood and explained). As an example of how one particular
variant out of a given set of variants may be attested by one language or two, another variant by a
different language, a third alternation by yet another, different language, and so on, let us consider again
the root *dek’ (LIV 1998:93-4, 2001:109-11). The 7 reconstructed, ‘horizontal’ variants, according to LIV,
are distributed as follows:
1) the two forms of Present (a) *dék’-/dék’- and (b) *dek’-néu/nu- are attested in Indo-Iranian and
2) the Perfect *de-dók’/dek’- is attested in Vedic, Greek and Latin
3) the Aorist *dék’-/dek’- is attested in Greek and Armenian
4) the Causative *dok’-éie- is attested in Hittite, Greek and Latin
5) the Desiderative *di-dk’-sé- is attested in Vedic and Latin
6) the Intensive *dék’-dok’/dek’- is attested in Greek
7) the Essive *dek’-h1ié- is attested in Latin
The conventional explanation -- that more languages uncover more variants out of an original, complete
series -- could certainly be a plausible explanation in principle.
However, combining this explanation with the fact that the constant of proportionality is almost exactly
‘one’ leads to an important conclusion: that almost every reconstruction is based on a different starting
point reconstruction (the extreme ‘*dog ~ *cat’ situation). The conventional explanation is also
inconsistent with the fact that the ‘first’ language group is special in that it uncovers more variants than
subsequent groups, as is evident from the graph. In practice, the evidence suggests that the following
process has taken place: A reconstruction is initially introduced with about 2.3 variants on average,
based on some of the variants actually attested in several daughter languages -- mainly Sanskrit and /or
Greek -- from which comparisons typically start. If, while searching for other language groups in
support of the reconstructed root, the evidence does not quite match the initially proposed
reconstruction, then one new variant can be appended to accommodate it. Thus the linear relationship
between V (variants) and G (language group/branch), and its slope of (almost) unity, is the statistical
result of this process of introducing a large number of alternative reconstructions, which are artefacts of
the method of analysis. In other words, here is clear evidence of a bias, which appears to have arisen
because linguists, analysing the data, knew a priori which words are ‘supposed’ to share a common
2. 2. 3. Let us now discuss the issue raised in point (B), that is, the distribution of the e vs o (Present vs
Perfect) alternation, one of the major sources of variants of a root within I-E. Accordingly, in LIV, 259
roots display this type of alternation, and 114 of these are considered to be safe cases. A close scrutiny
of the actual evidence across the I-E area reveals that this alternation is rather poorly attested. In fact,
only the Indo-Iranian languages and Greek document a clear category of Perfect with o-grade (strong
form) in the singular indicative active, alternating with Ø-grade (weak form) elsewhere, with related
position of the accent on the radical syllable in the strong forms and on the desinential syllable in the
weak forms.
The situation is much more complex and unclear in the other languages, including the Germanic
languages and, even more, Latin (see below).
Furthermore, even in Indo-Iranian and Greek the evidence is not that strong after all. In fact, in Indo-
Iranian the -o- grade is attested indirectly, due to the merger of short /e/ and /o/. In Greek, the forms of
Perfect presenting o-grade vs Ø-grade are only a small number -- about a dozen verbs – that are,
however, considered to be very old; see the type: Sing. oida vs Plural id-men, corresponding to Skt.
véda vs vidmá (< *woid; ‘I /we have seen’, therefore ‘I /we know’).
Basically, in Greek, the internal alternation (in the appropriate declensions) is found only in the so-called
a-thematic formation, a small minority of cases, the absence of internal alternation being in fact
the norm, otherwise. This contradiction between the predictions of the model and the actual data is
‘explained away’ through a typical ‘rescuing’ procedure: analogy. As Sihler (1995:109) puts it: “In
Greek the inherited pattern [of Ablaut] have been analogically extended, leveled and otherwise
confused” (see also Carruba in this volume). As to Germanic and Latin, the situation is as follows. In
Germanic, the Present vs Perfect, horizontal alternation, as well as the internal alternation between full
and Ø-grade is attested for sure in one type of Perfect only, the so-called ‘Präteritopräsentia’ (see for
example Prokosch [1938: 188ff.], Di Giovine [1996 II:139 ff.] and Clackson [2007:120 ff.]), of the type:
Gothic wait ~ witum (wissa, wiss), corresponding to the above-mentioned oida ~ id-men and véda ~
vidmá (there are about ten examples of this type).
In Latin, the overall Ablaut system, including that of the Perfect, admittedly, is quite different from the
Indo-Iranian and Greek one, although opinions may vary on how to interpret such a wide difference,
such a “ruin”, as Sihler (1995:109) defines it. For example, Di Giovine (1996, II:140 -1) stresses the
basic independence of the Latin Perfect system from the I-E one, because there are no safe instances
either of o-grade in the radical syllable or of alternation between ‘strong’ ~ ‘weak’ grades within the
Nella flessione del perfectum latino ….. non si evidenziano sicuri esempi di continuazione di un
grado vocalico *-o- nella sillaba radicale ……. si deve anche osservare che non esistono esempi di
alternanza tra forme forti e forme deboli all’interno della flessione del perfectum [‘In the
conjugation of the Latin Perfect one cannot find safe examples of the continuation of an *-o-grade
in the radical syllable… and there are no examples of alternation between strong and weak forms
within the conjugation of the Perfectum’. ]
On the contrary, Sihler (ibidem) argues in favor of the basic I-E character of the Latin Perfect, although
a series of innovations would have almost completely “obscured” its original shape (see the chapter by
Di Giovine in this volume on the Perfect and the I-E verbal system in general).
In order to quantify this situation, I counted those roots whose reconstructions include both the
present and perfect tense e/o alternation. In seventeen of these reconstructed roots, none of the daughter
language groups displayed the alternation in question. In other words, here the alternation has been
reconstructed in the absence of any actual evidence, using the assumptions of the model alone. Further,
in the vast majority of the roots (73), only one daughter language group showed the alternation. This
evidence, if interpreted without the bias of the model, clearly suggests that the alternation is simply of
local origin. Figure (III) below shows the full results of the counting. The Present / Perfect alternation is
in fact supported by only eight roots that meet the ‘three witnesses’ criterion (See Table (II) below for
the list of these 8 roots). Examining these roots in more detail, one finds that even these do not provide
straightforward evidence for the alternation because: a) 6 of them contain laryngeal segments; b) 7 of
them include Sanskrit, where e and o merge into a:
To take a concrete example of how an attested verbal form in a given language can be associated with just
one out of the set of variants of its reconstructed root, without necessarily displaying any horizontal and
/ or internal gradation, let us consider the root *bhueh2- ‘to grow, originate / be created, become, be’
(LIV: 83-5), whose Present and Perfect forms are, respectively:
Pres. *bhéuh2-e- ~ *bhuh2-ié- vs Perf. *bhe-bhuóh2 / bhuh2-
This is one of the few potentially, ‘statistically good’ roots, in the sense that the Present, the Perfect and
also the Aorist (whose alternative forms are *bhuéh2 - / bhuh2-) are actually attested in at least three
languages (see lexical item (6) in Table (II) below).
Nevertheless, most attested verbal forms derive from just one of the reconstructed variants of the
assumed set of alternations. For example, associated with the Perfect there are the following,
non-alternating forms (LIV 1998: 83; 2001: 98):
• Sanskrit (Vedic): Perf. ba-bhū́-va ‘he / she is (has become)’;
• Greek: Perf. pephuasi ‘they are / have grown’;
• Oscan: Perf. fufens ‘were’;
• Old Norse: Pret. bjó ‘lived’
The Oscan Imperfect form fufans ‘were’ would still derive from just one variant of the Perfect of the
root, this being the full grade: *bhe-bhueh2 (according to footnote (24) of LIV 1998; but see also Sihler
(1995: 554-5), according to whom there are uncertainties regarding the precise shape of the original
stem). Similarly, associated with the Aorist are the following, non-alternating forms, all derived from
the single variant *bhuh2-:
• Vedic: Aor. á-bhū-t ‘he / she has become’;
• Greek: Aor. e-phū ‘he /she grew, became, originated’;
• Old Latin: Conj. fu-as ‘you should be’;
• Latin: Perf. fū-ī ‘I was’;
• Old Lithuanian: bit(i) ‘he was’;
• Old Church Slavic: by ‘he was’;
• ultimately, also Modern English be (a finite stem in Old English béom, bist etc.; see Sihler
The only alternating forms available, derived from both internal variants of the Aorist, are the Old Irish
ones: 3rd Sing. boí ‘was’ alternating with 3rd Plu. -bátar ‘were’. Furthermore, as one can see, an attested
tense / mood does not necessarily derive from the corresponding reconstructed one, as shown by the case
of the Latin Perfect fui and, ultimately, the Present forms of Modern and Old English, all derived from
one variant of the reconstructed Aorist. This type of mismatch between tense and /or mood categories
is encountered frequently in the process of reconstruction, and in LIV. Although we are dealing with a
general and common phenomenon, the fact remains that these mismatches between the reconstructed
and the actual, attested tense/mood categories further increase the flexibility of the explanatory system,
especially in connection with the variety of meanings admitted for most reconstructed roots. In the
specific case of *bhueh2, the root is associated with several basic meanings, six variants and one
laryngeal segment (see Table (II) below). All these elements, coupled with the fact that this root alone
has in the Rigveda several thousand forms -- of which there are over twenty instances of the Perfect
stem -- clearly and greatly increase the possibility of ‘picking’ the form suitable to obtain the desired
Let us now give a couple of examples of how, on many occasions, one (extra) variant (tense /
mood alternation, one of the many forms of Present, or a variant enlarged with a primary suffix), has
been ‘appended’ to the reconstruction in order to account for just one language / language branch:
*deuh2 ‘to fit together’ (1998: 106; 2001:123)
Aorist: *déuh2- / duh2-    Tocharian B tsuwa ‘fit together’
Present: *du-né/n-h2-      Greek dunamai ‘can’
*twerH ‘to grasp’ (1998:596-7; 2001:656)
Causative: *twortH-éie-    Old Church Slavic tvoriti ‘create /do’
Essive: *twrH-h1ié-        Lithuanian turiù ‘hold / have’
To conclude this paragraph, two final observations: a) quite often, the single languages for which one
more variant has been appended to the reconstruction of a root are Tocharian and Hittite;
b) the number of variants that constitute the set of alternative reconstructions in LIV can rise up to 13.
From the facts and analyses reported in points (A) and (B) above one can draw the conclusion that
the great majority of the morpho-phonemic alternations reported in LIV are not real, inherited
alternations, but just artefacts, a fabrication of the method of analysis.
2. 3. Filtering out the artefacts
One can make a very rough estimate of the effect of filtering out the artefacts described above. If a root
is reconstructed with more variants than language groups/branches to support them, then it should be
discounted as having no significance. Putting this more mathematically, one can define the ‘Intrinsic
significance’ (S) value of a root as the amount by which the number of language groups (G) exceeds the
number of variants (V), that is, S = G - V. Roots with a ‘negative’ Intrinsic Significance value are those
displaying more variants than language groups/branches – with the appropriate, attested morphophonemic
alternation -- to support them, and therefore are of doubtful significance.
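The filter defined above can be sketched as follows. The three roots and their counts are hypothetical placeholders invented for illustration, not entries from LIV; only the definition S = G - V and the three-witnesses threshold come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Root:
    label: str
    groups: int    # G: language groups/branches attesting the root
    variants: int  # V: reconstructed variants of the root

    @property
    def significance(self) -> int:
        """Intrinsic Significance S = G - V."""
        return self.groups - self.variants


# Hypothetical roots, for illustration only.
roots: List[Root] = [
    Root("root A (hypothetical)", groups=7, variants=4),
    Root("root B (hypothetical)", groups=3, variants=5),
    Root("root C (hypothetical)", groups=2, variants=1),
]

# Roots with more supporting groups than variants (S > 0).
positive = [r for r in roots if r.significance > 0]

# Of those, roots that also meet the three-witnesses criterion.
candidates = [r for r in positive if r.groups >= 3]

for r in roots:
    print(r.label, r.significance)
print([r.label for r in candidates])
```

Root C in this toy set shows why both conditions are applied: it has a positive S value but rests on too few witnesses to count as a candidate.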
Of the 357 roots listed in LIV whose reconstruction has been achieved without the aid of the Laryngeal
Theory, 39 (11%) have a ‘positive’ Intrinsic Significance value, i.e. their reconstruction is witnessed by
more languages / language branches than it has variants. Of these, 29 roots (7%)
are reconstructed on the basis of three or more witness language branches, and may therefore be
regarded as qualified candidates for statistically significant correspondences, and therefore as proper
support for the stated Laws of Indo-European. Thus, according to this analysis, there is indeed a set of
reconstructed roots that can be considered as candidates which might reflect genuine I-E correlations,
but it is only a very small portion of the total: just 7% of the roots listed in LIV that do not
contain laryngeal segments. The portion is of course much larger (as shown in the graph below) if one
accepts the laryngeal theory without reservations. (Notice however that this positive evaluation does not
take into account the number of Laws and adjustable parameters needed to reconstruct the roots in
2. 4. The Laryngeal theory: adding extra flexibility to the system?
Let us now come back to Figure (II) above (par. 2. 2. 2), whose graph suggests that, on average, one
new variant has been ‘appended’ to the reconstruction with each newly claimed daughter language
group, with the exception of a very small number of roots (4%). In other words, these roots have a
higher average Intrinsic Significance than the others (S > 0 below the dotted line). Brady &
Marcantonio (2003) have looked for a possible explanation for the departure from linearity of these 4%
of roots whose reconstruction is supported by 6 or more daughter languages / language groups. A closer
investigation has revealed that these significant roots are more likely to have been reconstructed using
laryngeal segments. This in turn suggests that ‘appending’ laryngeal segments increases the flexibility of
the model, as an alternative to ‘appending’ more variants. The laryngeal theory is now accepted, and
considered to be settled, by the great majority of scholars, and it certainly goes beyond the scope of this
paper to deal with the rights or wrongs of the theory. Here, as mentioned, I limit myself to observing
the effects the Laryngeal Theory (or at least the specific version of it adopted by LIV) may have on the
flexibility and the statistical significance of the corpus under investigation.
Matching vowel alternations across the I-E area (in both the mono-syllabic and bi-syllabic roots)
on the basis of actual evidence always proved difficult, if not impossible, before the introduction of
the laryngeal theory – as clearly illustrated in the chapter by Carruba in this volume. This is clearly
recognised by LIV (p. 4) when it states that, without such a theory, “Morphostrukturen und Bildungsregeln
des Urindogermanischen nicht verständlich sind” [‘the morphological structures and formation rules of
Proto-Indo-European are not comprehensible’]. The dictionary adopts a system with four
laryngeals: h1, h2, h3, plus a not well specified H. In particular, h1, h2, h3 (according to LIV, p. 5),
“Können fast jeden Platz einnehmen und an fast jeder Stelle zusätzlich zu den anderen Radikalen
stehen” [‘can occupy almost any position and can stand at almost any place in addition to the other
radicals’]. LIV does not divulge any information about the number of Laws and related contextual
specifications needed to govern the complex behaviour of these segments, although one can expect such
a number to be quite high. Given such an abundance of segments, coupled with their (relative) freedom
of occurrence and the increased number of rules, one may reasonably expect to find a certain amount of
apparent matches in this corpus. The possibility of ‘picking’ the appropriate form to make the desired
match is further increased by the existence of the fourth, unspecified laryngeal -- a real ‘passe-partout’ –
and by the monosyllabic character of many of the roots --mono-syllabic roots are, in general, quite easy
to match, as shown by Ringe (1999).
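The growth in candidate shapes that such freedom of occurrence implies can be illustrated with a toy enumeration. The model is an assumption made here and not LIV's own rule set: after each segment of a three-segment skeleton one may insert nothing or exactly one of the four laryngeal symbols, with no phonotactic restrictions at all.

```python
from itertools import product
from typing import List

# The four laryngeal symbols adopted by LIV, as named in the text.
LARYNGEALS = ["h1", "h2", "h3", "H"]


def candidate_shapes(segments: List[str]) -> List[str]:
    """All shapes obtained by optionally adding one laryngeal after each segment.

    Toy model only: no phonotactic constraints are applied.
    """
    options = [""] + LARYNGEALS  # "" means no laryngeal in that slot
    shapes = []
    for choice in product(options, repeat=len(segments)):
        shapes.append("".join(seg + lar for seg, lar in zip(segments, choice)))
    return shapes


# Hypothetical monosyllabic skeleton used purely for illustration.
shapes = candidate_shapes(["d", "e", "k"])
print(len(shapes))  # 5 options in each of 3 slots: 5 ** 3 = 125
```

Real reconstructions are far more constrained than this, so the count is an upper-bound style illustration of the flexibility argument rather than an estimate for LIV.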
The intuitive observation that, in these conditions, matches are extremely easy to find, is
confirmed by a proper, quantitative analysis. In fact, I have investigated the distribution of the laryngeal
segments in LIV, with the purpose of ‘quantifying’ the effects that the use of laryngeal segments brings into
the equation. The results are illustrated in the graph below, figure (V). This graph shows that the roots
displaying a higher Intrinsic Significance value -- i.e. more claimed daughter language groups than
variants (other things being equal; see figure (IV) above) -- are more likely to have been reconstructed
by making recourse to the Laryngeal Theory. This suggests that the flexibility afforded by inserting
laryngeal segments into the reconstructions makes it easier to find a fortuitous match in a new language
To conclude this section, it is worth observing that, despite the extra degree of flexibility that the
laryngeal segments grant the overall explanatory system, there are still exceptions left over
in the domain of vowel alternations. Sihler (1995:131) proposes the following justification to account
for the exceptions, and for the fact that, as we have seen, the actual evidence from most I-E languages
typically points to “non-ablauting” forms:
Even though the origins of ablaut … were necessarily phonological, by the earliest period
reachable by the comparative method the distribution of different ablaut grades in PIE had been
MORPHOLOGIZED, that is, a given form or class of forms was associated with a certain grade in
PIE. This is particularly true of the distribution of e- vs. o-grades.. … As is to be expected when
phonological alternations are captured by morphology, the system was never completely regular;
…..Thus, evidence points to a non-ablauting root *bhu- or bhuH- ‘become’
This statement, in my opinion, is a concrete testimony of how it may be very difficult, if not impossible,
to falsify the I-E theory, since one can always bridge the gaps between the predictions of the model and
the actual data through ad-hoc explanations that will then be granted the status of a (more or less)
general principle.
3. The statistically significant evidence for Indo-European
3. 1. The ‘best’ Indo-European roots
Let us now have a look at the most widely represented roots – 6 or more branches/groups – which are
listed in Table (I) below. This table shows their reconstruction, the presence vs absence of laryngeal
segments and the various meanings of the actual attested forms, as given in LIV. In other words, this
Table shows the ‘Intrinsic Significance’ value of each of these statistically relevant roots, that is,
basically, the significance a root displays when one also takes into account all those factors that may
contribute to the ease of finding a match.
At first glance these roots do display a positive Intrinsic Significance value, because they are
supported by a high number of language witnesses and are reconstructed through a (relatively) small
number of variants. These are, therefore, the ‘best’ matches for I-E. However, the high significance
value of these roots is potentially undermined by two factors: the frequent recourse to the Laryngeal
Theory, which, however much justified and correct, does add an extra degree of flexibility to the
explanatory system (as argued in the previous paragraph), and the high variety of meanings often involved.
A root whose attested forms display a high variety of meanings clearly has less significance,
because more meanings multiply the ease with which a connection can be made. The
‘Significance Estimate’ can be expressed through the following formula: ‘G - V - M’ (Groups minus
Variants minus Meanings). On the basis of this measure, for example, a simple root that meets the
three-witnesses criterion and has only one meaning and one (variant) form of the root has a significance of
‘one’. More meanings or variants, or laryngeal segments, reduce the Significance value for the
reasons expounded above. According to this measure, then, the only roots which are highly significant,
being represented in all / most language groups without having to make recourse to a high number of
variants, are the root *bher- (which, however, has three meanings) and the root *h1es- ‘to be’ (which
contains one laryngeal segment).
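The arithmetic of the ‘Significance Estimate’ can be sketched in a few lines of Python. This is only an illustration of the formula ‘G - V - M’ as stated above; the function name and the second sample root are assumptions made for the example, not figures taken from LIV, and the sketch does not model the additional penalty for laryngeal segments mentioned in the text.

```python
def significance_estimate(groups: int, variants: int, meanings: int) -> int:
    """Significance Estimate as stated in the text: G - V - M."""
    return groups - variants - meanings


# A root attested in three groups, with one variant and one meaning.
baseline = significance_estimate(groups=3, variants=1, meanings=1)

# A hypothetical root with the same attestation but three meanings scores lower.
more_meanings = significance_estimate(groups=3, variants=1, meanings=3)

print(baseline, more_meanings)
```

Under this measure every extra variant or meaning costs exactly as much as one extra language group gains.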
Whatever the case, and whatever relevance one might accord to factors such as presence (vs
absence) of laryngeals and (wide) variety of meanings, the fact remains that the really statistically
significant roots, according to this measure, are extremely few (only 14) with
respect to the overall, high number of roots listed in LIV. Last, but not least, it should be remembered
that LIV and other dictionaries often disagree on what is to be considered a (more or less) regular set of
correspondences, as shown, for example, by the root for ‘to milk’
(*h2melg’), classified by Ernout-Meillet (1959) as belonging to the ‘popular, familiar’ variety of
language and, therefore, as irregular and / or poorly represented across I-E. See also Buck (1949: 385-6)
and Belardi (2002:140ff.) for a similar evaluation of this root, and other similar roots.
Insert Table I here
3. 2. The best Present / Perfect alternations
In Table (II) below I have listed those 8 roots which, while being statistically relevant, also appear to
display what could be called a ‘proper’ Present / Perfect alternation, that is an alternation that is not only
reconstructed, not only assumed, but is actually attested in the relevant languages /language groups.
Once again, at least according to the quantitative analysis carried out here, the number of roots
displaying ‘proper’ alternations is rather low:
Insert Table II here
3. 3. The nominal roots
In this last paragraph I would like to compare the state of the verbal roots with that of the nominal roots,
or lexemes (to use a more appropriate definition), although briefly, since a detailed analysis of the latter
is outside the scope of this chapter (for a recent overview of the state of the I-E lexemes see Clackson
[2007:187 ff.]).
I have argued above that the great majority of the morphophonemic alternations reported in LIV
are not real alternations, but rather artefacts of the method of analysis, that is: most variants have been
‘appended’ to the reconstruction of a given root in order to facilitate a difficult match.
This thesis appears to be supported by the following fact, long recognised in the specialist
literature: ordinary nouns / lexemes display a much higher degree of irregularity than verbal roots or
verbal nouns (see for example Meillet [1934: 289 ff., 379-416, 396]; Benveniste [1935:175-ff.]; Belardi
[2002:141ff.], Campanile [1983]; see also Ernout-Meillet [1959] and Schlerath [1987]). Meillet
discusses at length the ordinary nouns, including those belonging to the basic, every-day vocabulary,
which he names “vocabulaire populaire”. The Author observes that most of these terms are not proper
correspondences, either because they present irregularities (at least in some of the languages where they
are attested), or because they cannot be safely traced back to a common, I-E source, the
terms being rather different in the various branches, or because of a mixture of both factors. Besides, these nouns
often occur only within three or even two contiguous languages (as already pointed out by Schmidt [1872]).
These terms include the nouns for: ‘goat’, ‘dog’, ‘fox’, ‘bee’, ‘honey’, ‘milk’ (a related milking and
feeding activity), ‘nail’, ‘spleen’, ‘bone’, ‘palm’ (of hand), ‘sheep’, ‘night’, ‘four’, ‘wolf’, ‘bull’, ‘pig’,
‘nose’, ‘tongue’, ‘shoulder’, ‘eyes’, ‘mouth’, ‘ears’, ‘brother-in-law’, ‘horse’ and several kinship terms
referring to the semantic domain of ‘the family of the woman / wife’ and that of ‘distant relations’ (for a
recent assessment of the status of reconstructed kinship terms see Clackson [2007:200 ff.]). Interestingly
enough, Benveniste (1935:175), while trying to justify this state of affairs, observes that the nominal
roots, unlike the verbal roots, “ne permettent pas de définir une racine, … n’offrent pas d’alternance
radicale”, by this meaning that the verbal roots are much more regular than ordinary nouns thanks to
their typical, alternating structure. A similar way of thinking appears to lie behind the following words
by Clackson (2007:1901):
It is possible that ….our rules for deriving affixed forms from roots may be a construct of the
comparative process. We rely on roots as the base of derivation since we can reconstruct roots with
more confidence than we can reconstruct individual lexemes. Roots may be shared across many
languages, while a particular lexical formation is only found in a small number of languages
These observations, according to the analyses presented in this chapter, could be interpreted and rephrased
as follows: because of the very fact that ordinary nouns do not, indeed, display that rich
alternation (supposedly) typical of the verbal roots, there is less room to ‘pick and choose’ among
variants, and, therefore, less opportunity to justify, or ‘explain away’, the encountered
irregularities. Moreover, the fact that it is much more difficult to reconstruct “individual lexemes”
may be an indication that the I-E roots, as conventionally reconstructed (including their assumed,
intricate and often unjustified derivational procedures) are indeed an artefact of the method of
analysis.
4. Conclusion
The quantitative analyses carried out above show that the lexical comparative corpus and the Present /
Perfect, e ~ o alternation assembled in LIV definitely lack statistical significance. It is reasonable
therefore to assume that most of the correspondences and most of the Present / Perfect alternations under
discussion are chance resemblances, artefacts of the traditional method of analysis. However, these very
same analyses have also identified those roots and those Present / Perfect alternations that do display
statistical significance and are therefore most likely to represent genuine linguistic correlations, although
the number of these ‘good’ roots and alternations is lamentably low.
What are then the conclusions to be drawn from this? As mentioned above, the conclusions to be drawn
may vary according to which position scholars hold with regard to the following, basic methodological
issues: a) how relevant, how ‘diagnostic’ are the lexical correspondences in general, and therefore within
I-E, as against the morphological correspondences? b) How many good correspondences, if any, are
required, or at least desirable, in order to establish and ‘prove beyond reasonable doubt’ a language
family? c) How relevant are the morpho-phonological correlations of the type examined above for the
task of assessing genetic relations? And, if they do rate quite high in this task, would that small bunch of
Present / Perfect, ‘proper’ alternations cutting across several (but not all) I-E languages constitute a
diagnostic clue of relatedness? Is the comparative method a heuristic method or not? Are ‘obvious’,
‘compelling’ correlations (that is, those clearly observable through the naked eye of the trained scholar)
good enough to assume, if not prove beyond doubt, genetic relatedness, etc.? Certainly, if no consensus
is attained on these issues, drawing ‘a’ conclusion from the analyses carried out here will turn out to be
Whatever the answers to these fundamental questions may be, my personal opinion is that the
circularity issue embedded in the comparative method has not (yet) been resolved and that, as a
consequence, the I-E theory, as it stands today, is unable to make clear-cut and testable predictions. On
the contrary, the I-E theory appears to be still flexible enough to be adjustable (and adjusted) to account
for almost any data. To use the words of an eminent physicist, W. Pauli, a theory of this sort may be
regarded as: “not even wrong”.
References
Baldi, P.
1987 Indo-European Languages. In Comrie, B. (ed.), The Major Languages of Western
Europe. London: Routledge. 21-58.
Belardi, W.
2002 L’Etimologia nella Storia della Cultura Occidentale, I. Roma: Il Calamo.
Bird, N.
1982 The Distribution of Indo-European Root Morphemes. Wiesbaden: Harrassowitz.
Benveniste, E.
1935 Origines de la Formation des Noms en Indo-Européen. Paris: Librairie Adrien-
Brady, R & Marcantonio, A.
2003 Evidence that most Indo-European reconstructions are artefacts of the linguistic method
of analysis. In Hajicová, E. et al. (eds), Proceedings of the 17th International Congress
of Linguists. Prague: Matfyzpress, MFF UK. CD-ROM ISBN: 80-8673221-5.
Buck, C. D.
1949 A Dictionary of Selected Synonyms in the Principal Indo-European Languages. The
University of Chicago Press.
Chrétien, C. D.
1937 Quantitative classification of Indo-European languages. Language 13:83-103.
Campanile, E.
1983 Problemi di Lingua e di Cultura nel Campo Indoeuropeo. Pisa: Giardini.
Campbell, L.
1998 Historical Linguistics. An Introduction. Edinburgh University Press.
Clackson, J.
2007 Indo-European Linguistics. An Introduction. Cambridge: Cambridge University Press
Collinge, N. E.
1985 The Laws of Indo-European. Amsterdam: Benjamins.
Di Giovine, P.
1996 Studio sul Perfetto Indo-Europeo, II. Roma: Il Calamo.
Ernout, E. & Meillet, A.
1959 Dictionnaire Etymologique de la Langue Latine. Paris: Klincksieck (4th edition).
Fox, A.
1995 Linguistic Reconstruction. An Introduction to the Theory and Method. Oxford
University Press.
Grace, G.
1990 The “aberrant” versus “exemplary” Melanesian languages. In P. Baldi (ed.), Linguistic
Change and Reconstruction Methodology. Berlin: Mouton de Gruyter. 155-173.
1996 Regularity of change in what? In M. Ross & M. Durie (eds), 157-179.
Greenberg, J. H.
2000 Indo-European and its Closest relatives: The Eurasiatic Language Family. Vol. 1:
Grammar. Stanford University Press.
2001 Indo-European and its Closest relatives: The Eurasiatic Language Family. Vol. 2:
Lexicon. Stanford University Press.
2005 Genetic Linguistics: Essays on Theory and Method. Edited by W. Croft.
Harrison, S. P.
2003 On the limits of the comparative method. In B. D. Joseph & R. D. Janda (eds), The
Handbook of Historical Linguistics. Oxford: Blackwell. 213-243.
Hock, H. H.
1993 SWALLOW TALES: Chance and the “world etymology” MALIQ’A ‘swallow, throat’.
In K. Beals et al. (eds), Papers from the 29th Regional meeting of the Chicago
Linguistic Society. 215-238.
1994 (Pre-)Rig-Vedic convergence of Indo-Aryan with Dravidian? Another look at the
evidence. Studies in the Linguistic Sciences 14/1: 89-107.
1996 Pre-Rigvedic convergence of Indo-Aryan and Dravidian? A survey of the issues and
controversies. In J. E. M. Houben (ed.), Ideology and Status of Sanskrit. Leiden: Brill.
Illic-Svityc, V. M.
1971-84 Opyt sravnenija nostraticeskikh jazykov (semitokhamitskij, kartvelskij, indoevropejskij,
ural’skij, dravidskij, altajskij), I-III. Moscow: Nauka.
Koerner, E. F. K.
1989 Comments on reconstruction in historical linguistics. In Vennemann, T. (ed.), The New
Sound of Indo-European: Essays in Phonological Reconstruction. Berlin / New York:
Mouton de Gruyter. 1-20.
LIV = Rix, H. (ed.)
1998 Lexikon der Indogermanischen Verben. Die Wurzeln und ihre Primärstammbildungen.
Wiesbaden: L. Reichert.
Macdonell, A. A.
1995 A Vedic Grammar for Students. Delhi: Motilal Banarsidass.
Marcantonio, A.
2002 The Uralic Language Family: Facts, Myths and Statistics. Transactions of the
Philological Society. Oxford/Boston: Blackwell.
2003/2005 Evidence: the missing concept in comparative studies. A preliminary comparison of
Uralic and Indo-European. In M. M. J. Fernandez-Vest (ed.), Les Langues Ouraliennes
Aujourd’hui: Approche Linguistique et Cognitive / The Uralic Languages Today: a
Linguistic and Cognitive Approach. Bibliothèque de l’École des Hautes Études:
Sciences Historiques et Philologiques 340. Paris: Librairie Honoré Champion. 117-132.
Masica, C. P.
1979 Defining a Linguistic Area: South Asia. Chicago University Press.
McMahon, A. & McMahon, R.
2003 Finding families: quantitative methods in language classification. Transactions of the
Philological Society 101/1: 7-57.
2005 Language classification by Numbers. OUP.
Meillet, A.
1934 Introduction à l’Étude Comparative des Langues Indo-Européennes. Paris: Librairie
Morpurgo Davies, A.
1978 Analogy, segmentation and early Neogrammarians. TPS. Commemorative volume: The
Neogrammarians. 36-64.
Nichols, J.
1996 The comparative method as heuristic. In Ross, M. & Durie, M. (eds), 1996. The
Comparative Method Reviewed. Regularity and Irregularity in Language Change. Oxford
University Press.
Pokorny, J.
1959-69 Indogermanisches etymologisches Wörterbuch. Bern: Francke.
Prokosch, E.
1939 A comparative Germanic Grammar. Philadelphia: University of Pennsylvania Press.
Rees, M.
1999/2000 Just six numbers. London: Phoenix.
Ringe, D.
1992 On Calculating the Factor of Chance in Language Comparison. Transactions of the
American Philosophical Society 82: 1-110. Philadelphia.
1993 A reply to Professor Greenberg. Proceedings of the American Philosophical Society
137/1: 91-109.
1995 “Nostratic” and the factor of chance. Diachronica 12 / 1: 55-74.
1998 A probabilistic evaluation of Indo-Uralic. In J. C. Salmons and B. D. Joseph (eds), 153-
1999 How hard is it to match CVC-roots? Transactions of the Philological Society. 97:213-
Ringe, D., Warnow, T. & Taylor, A.
2002 Indo-European and computational cladistics. Transactions of the Philological Society
100/1: 59-129.
Salmons, J. C. and Joseph, B. D. (eds)
1998 Nostratic: Sifting the Evidence. Amsterdam Studies in the Theory and History of
Linguistic Science. Series IV. Current Issues in Linguistic Theory. Amsterdam: J.
Schmidt, J.
1872 Die Verwandtschaftsverhältnisse der indogermanischen Sprachen. Weimar: H. Böhlau.
Schlerath, B.
1987 On the reality and status of a reconstructed language. JIES 15: 41-6.
Sihler, A. S.
1995 New Comparative Grammar of Greek and Latin. OUP.
Szemerényi, O.
1996 Introduction to Indo-European Linguistics. OUP.
Unger, J. M.
1990 Summary report of the Altaic panel. In P. Baldi (ed.), Linguistic Change and
Reconstruction Methodology. Berlin: Mouton de Gruyter.

The Aryan Hypothesis
And our objective today
The Indo-Aryan debate
• Indo-Aryans were ...?
– Indigenous
– Migrants (from the Indo-European area)
• Seems impossible to resolve the debate
• Bryant 2009
– There is nothing in Indian archaeology that supports
the assumed migration of peoples
– The entire issue is a derivative consequence of the
‘family tree’ presuppositions of historical linguistics
– Scholars have become exhausted with the polemical
and emotional element of the discussions
What textbooks say
Sanskrit    Greek    Latin   Gothic   English
Bhár-ā-mi   Phér-ō   Fer-ō   Bair-a   (I) bear
ásmi        eimí     sum     im       (I) am
ésti        est      est     ist      (he) is
pitár       patér    pater   fadar    father
tráyah      treis    trēs    θrija    three
•Examples of word similarities
•Most similarities include Greek – Sanskrit
•Many also include Latin
•Other languages also represented
Cutting the Gordian knot
• Received wisdom is that Sanskrit derived
from Indo-European
• No evidence from archaeology, palaeoanthropology
or genetics
• The only evidence is linguistic
• We will examine the linguistic theory
– it is based on mistaken evidence
– It fails Sir Isaac Newton’s criterion for an
acceptable scientific model - it is ‘unfalsifiable’
• The time has arrived to challenge the Indo-
European theory
– May cut the Gordian knot
– Allow space for alternative models for the origin
of Sanskrit
The linguistic evidence (1)
Many claimed Indo-European words
Widely cited example: ‘Father’
Sanskrit                          janaka, Pitár-, taataH
Old Avestan (modern day Iran)     Pta (later patar/pitar)
Greek                             Patḗr
Latin                             Pater
Slavic, Albanian                  (missing)
Lithuanian                        (missing)
Old Irish                         Athir
Gothic (4th century)              Atta (except once: ‘O Father’)
Old High German (9th century)     Atto ~ Fater
Modern German                     Vater
Geographical distribution of ‘father’
[Map: Indo-Greek states]
• Matches the Indo-Greek states
Conclusion on ‘father’
• Claimed as evidence for the Indo-European theory
• But an entirely different history is equally possible
– Indo-Greek states used Sanskrit and Greek as official languages
– Policy of intermarriage
– Opportunity for the word to spread
– Then Latin modelled on Greek
– Then helped to spread via church influence (‘O Father’ has
strong religious meaning)
• The Indo-European inheritance explanation is not the
only possible one!
More on word correspondences
• Many more words are like ‘father’
– Greek and Sanskrit are the bedrock of the theory
• About 1000 nouns (Pokorny 1959-1969)
– But Ringe: ‘evidential standards’ are ‘lamentably lax’
• 683 ‘safely reconstructed’ IE verbs (Rix 1998)
– But 29% are evidenced in only one language or branch
– 34% are evidenced in only 2 languages or branches
– This indicates most claimed words are of local origin,
not Indo-European
• We will return to this topic ...
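The percentages on this slide can be turned into approximate root counts. The short Python sketch below simply applies the quoted shares to the 683 verbs; the rounding and the variable names are assumptions of this illustration, not figures reported by Rix.

```python
TOTAL_VERBS = 683          # ‘safely reconstructed’ IE verbs (Rix 1998)
SHARE_ONE_BRANCH = 0.29    # evidenced in only one language or branch
SHARE_TWO_BRANCHES = 0.34  # evidenced in only two languages or branches

# Approximate counts, rounded to the nearest whole root.
one_branch = round(TOTAL_VERBS * SHARE_ONE_BRANCH)
two_branches = round(TOTAL_VERBS * SHARE_TWO_BRANCHES)
at_most_two = one_branch + two_branches
remaining = TOTAL_VERBS - at_most_two

print(f"one branch only:   {one_branch}")
print(f"two branches only: {two_branches}")
print(f"at most two:       {at_most_two} ({at_most_two / TOTAL_VERBS:.0%})")
print(f"three or more:     {remaining}")
```

On these figures, roughly 430 of the 683 verbs rest on at most two languages or branches.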
The linguistic evidence (2)
Sound laws
Introduction to sound laws
• ‘Exceptionless’ laws of sound change from IE into the
various languages
– Founding laws
Sanskrit Greek Latin Gothic English
Bhár-ā-mi Phér-ō Fer-ō Bair-a (I) Bear
• 1822: Grimm’s law
• 1876: Verner’s Law
– Today: many laws
• Verner’s “Stunning paper”
– Roger Lass, 1997
• “Critical for uncovering language relations”
– Hoenigswald, 1990
What does Verner’s Law say?
• 1822: Grimm’s Law
– Indo-European p, t, k → Germanic f, θ, h ~ χ
• 1870’s: counter-evidence to Grimm’s Law published
– Verner thought he detected a systematic pattern
– Claimed to turn exceptions into ‘apparent exceptions’
• 1876: Verner’s Law
– Germanic f, θ, h ~ χ → b, d ~ d, g
– ‘voiceless consonant from Grimm’s Law becomes voiced’
– In medial position only
– Unless the corresponding consonant in the supposed IE
ancestor word is preceded by an accent
Verner’s Law in action
• First must reconstruct the IE accent
– Not reconstructable (IE languages differ too much)
– Verner says ‘I must use Sanskrit’ (does not say why)
– But Sanskrit has tones, not a stress accent
• Verner uses the tone that is transcribed like a stress accent!
• Verner’s Law applied to ‘father’
– Sanskrit (assumed = IE)   Pitár
– (by Grimm’s law)          fiθar
– (by Verner’s law)         fidar ← θ becomes d because no Sanskrit accent before
• These laws do not consider vowels
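The two steps applied to ‘father’ on this slide can be written out as a toy Python sketch. It assumes only the simplified statements given above (p, t, k → f, θ, h; a medial shifted consonant is voiced unless an accent immediately precedes it) and leaves vowels and all other consonants untouched. The function names and the way the accent is detected (a combining acute after Unicode decomposition) are assumptions of this illustration, not a formalisation taken from Verner.

```python
import unicodedata

GRIMM = {"p": "f", "t": "θ", "k": "h"}    # slide: IE p, t, k -> Germanic f, θ, h
VERNER = {"f": "b", "θ": "d", "h": "g"}   # slide: voiceless -> voiced
ACUTE = "\u0301"                          # combining acute marks the accent


def grimm(word):
    """Apply the simplified Grimm's Law; return (char, was_shifted) pairs."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return [(GRIMM.get(ch, ch), ch in GRIMM) for ch in decomposed]


def verner(pairs):
    """Voice a medial shifted consonant unless an accent directly precedes it."""
    out = []
    last = len(pairs) - 1
    for i, (ch, shifted) in enumerate(pairs):
        medial = 0 < i < last
        accent_before = i > 0 and pairs[i - 1][0] == ACUTE
        if shifted and medial and not accent_before:
            ch = VERNER[ch]
        out.append(ch)
    return "".join(out)


def strip_accent(text):
    """Drop the accent mark once it has been used."""
    return unicodedata.normalize("NFC", text.replace(ACUTE, ""))


def derive(word):
    """Return the forms after Grimm's Law and after Verner's Law."""
    pairs = grimm(word)
    after_grimm = strip_accent("".join(ch for ch, _ in pairs))
    after_verner = strip_accent(verner(pairs))
    return after_grimm, after_verner


print(derive("pitár"))    # accent follows the t, so θ is voiced
print(derive("bhrátar"))  # accent precedes the t, so θ is kept
```

For pitár the sketch reproduces the slide’s two intermediate forms; for bhrátar it only shows that θ is kept, since vowels and bh are outside its scope.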
Verner’s evidence (1): kinship terms
Meaning         Sanskrit   Verner’s Germanic   Gothic    Old High German
Father          pitár      fadar               Atta      Atto ~ fater
Brother         bhrátar    brōθar              Brōθar    bruoder
Mother-in-law   śvaśrū-    swigar              swaihro   swigar
Father-in-law   śváśura-   swehur              swaihra   swehur
Exactly 50% do not match - expected by chance!
Verner’s ‘proof’ obtained by omitting all the evidence that does not match!
Verner’s evidence (2): conjugation of verb
Meaning            Sanskrit as cited by Verner   Verner’s Germanic
To go              bhédana                       *līθan
I go               bhédāmi                       *līθa
You go             bhédasi                       *līθis
He goes            bhédati                       *līθiθ
We go              bhédāmas                      *līθam
You go             bhédatha                      *līθiθ
They go            bhédanti                      *līθand
I went (past)      bibhéda                       *laiθ
You went (past)    bibhéditha                    *laist
He went (past)     bibhéda                       *laiθ
We went (past)     bibhidimá                     *lidum
You went (past)    bibhidá                       *liduθ
They went (past)   bibhidús                      *lidun
... Then the Germanic consonant becomes voiced
Verner’s evidence (2): conjugation of verb
• How is
• Gothic
– All voiceless
– ‘merger’
• Old English
– Does not
– ‘Analogy’
– Inverted
– Not explained
• Germanic
‘forced’ to

Verner’s evidence (2): conjugation of verb
• Verner conjugates ‘to go’ as class 1
Meaning    Sanskrit as cited by Verner
To go      bhédana
I go       bhédāmi
You go     bhédasi
He goes    bhédati
We go      bhédāmas
• But in fact it is Class 7!
• Verner ‘forced’ the Germanic to match the wrong Sanskrit!

Summary of Verner’s evidence
• Kinship terms
– Perfect match obtained by omitting all counterevidence
• Verbal conjugation
– Verner reconstructs a prehistoric Germanic
• But it does not accord with any attested ones
– Creates a ‘Perfect match’ with an erroneous Sanskrit
• Conclude Verner’s Law is invalid
– contradicted by all the evidence in the original paper
More on sound laws
• Many further laws are based on similar methods
– ‘rescuing procedures’ when the data do not match
– Able to match almost any data
• If an attested word appears to be an exception
– Add a special event in word development
• E.g. ‘father’ ‘lost initial p’ in Old Irish Athir
– Add a sub-law (‘contextual specification’)
• Today there are over 200 laws and sub-laws
– Add some intermediate rules
• Perhaps like Verner’s rules for reconstructing Gmc conjugation
• If all else fails
– call it ‘not cognate’
– Therefore still not an exception
• There are today more than 200 laws and sub-laws
– In fact there are more laws & sub-laws than words to
be explained in Rix dictionary
• This fails Newton’s test for a valid scientific model
– ‘What certainty can there be in a Philosophy which
consists in as many Hypotheses as there are
Phaenomena to be explained.’
• Because the model can flex to match any data
– Newton: ‘New Hypotheses may be devised that shall
seem to overcome new difficulties’
The Gordian knot
• Received wisdom is that Sanskrit derived from Indo-European
• No evidence from archaeology, palaeoanthropology
or genetics
• The only evidence is linguistic
• We examined the linguistic theory
– it is based on mistaken evidence
– It fails Sir Isaac Newton’s criterion for an acceptable
scientific model - it is ‘unfalsifiable’
– Many claimed IE correlations are in fact due to
• Challenge the Indo-European theory
– There is no evidence for the existence of an Indo-
European speech community
– May cut the Gordian knot
– Allow space for alternative models for the origin of Sanskrit
Verner’s 1876 proof of his founding Law of IE
• Law (simplified) says: Germanic t changes into d unless the corresponding
consonant in the Sanskrit word is preceded by an ‘accent’
– Also: p to b and k to g
– But Sanskrit has tones, not a stress accent
• Verner’s preliminary proof uses ‘father’
– Sanskrit Pitár -> Germanic fadar
• He selectively omits to mention contradictions to his law
– Atta = ‘father’ contradicts his theory
– So he uses fadar even though cited once only (‘O Father’)
– In fact, his preliminary proof uses 4 kinship terms. 50% of the attested
Germanic forms contradict his theory but are not mentioned
• Verner’s ‘backbone’ evidence is also ‘forced’
– Conjugates √bhid- (to ‘split’, ‘smash’, ‘break’) as class 1 (it’s class 7)!
– Uses this mistaken conjugation to predict t~d etc. in the assumed
corresponding Germanic Strong conjugation
– But the attested Germanic conjugations do not match this prediction
– ‘Reconstructs’ hypothetical prehistoric Germanic conjugation that precisely
matches this mistaken Sanskrit conjugation and seems to prove his law
– Today this ‘reconstruction’ is widely cited in textbooks as proof of Verner’s Law
Linguistic evidence (2): Sound laws
• regular and systematic sound-changes inside IE words,
governed by sound-laws
– Describe changes from IE -> Skt, IE->Greek etc.
– Sound changes claimed to be regular and systematic
– This enables you to transform the attested words so that they
seem to match precisely
• ‘Father’ is one of the key items of evidence (detail later)!!
• Another widely quoted example: ‘to carry, bear’
– Skt bhara-mi Greek pher-o, Latin fer-o, OHG bir-u
– Claimed similarities across root and conjugation
– (again selective choice of Sanskrit word)
• But to the naked eye these are different!
– The method seems to be able to justify this difference
– How?
Summary (1) - methodology
• Problems with the linguistic evidence for IE
– ‘forced’ similarities – do not match common sense or the
archaeological evidence
• Unfalsifiable theory
– Not actually supported by evidence, but there’s always an excuse
• The common sense approach
– Look at geographical distribution of words (‘isoglosses’)
– Look at opportunities for word to spread
– If they match, investigate why
• What if your evidence contradicts the IE theory?
– Don’t be afraid to publish your data
– The linguistic evidence does not trump yours!
‘Forced’ interpretation of a local word (Rye) as Indo-European
• Rye in three IE language groups (Russian, Germanic, Baltic) so must be IE!
• But can’t reconstruct the IE word
– Reconstruct IE god *Deiwos- caused rye fields to ripen
• But contradicted by presence in Finnish
– ‘Borrowed into Finno-Ugric at an early date’
• Unfalsifiable theory
– Not actually supported by evidence, but there’s always an excuse
To be continued

In a nutshell, here are some 20 points I wrote in criticism of Proto-Indo-European on a Wikipedia discussion page, based on various linguistic readings:

About the roughly 500 supposed (constructed, or more accurately guessed) hypothetical Proto-Indo-European roots (you can find all the Proto-Indo-European roots in Wiktionary):

1/How can the same root have such different meanings (for example, there are 4 roots "pel": 1st "pel"=flour, 2nd "pel"=gray, 3rd "pel"=skin, 4th "pel"=flat)!

2/How can synonyms exist with different roots (such as skin, which has at least 3 different listed Proto-IE roots)!

3/Many semantic shifts are so broad that, with such lax semantics, many words & roots (constructed or attested, proto or not) from different languages of the world could fit as Proto-Indo-European!

4/Many sound shifts look very unlikely and go against the sound laws!

5/Many supposed Proto-IE roots are anachronistic (door, borough, fort...) and could not have existed in the language of Bronze Age steppe hordes (according to the Pontic steppe "Urheimat" model).

6/Many supposed Proto-IE roots are most likely Semitic loans (star, three, sun, six, seven, eight, home, tree, field, pilaku [axe], barley, snow, door, corn, dher, goat, buck...), as they cannot be explained from intrinsic Indo-European phonetic-semantic paradigms.

7/Many other roots could be loans from Kartvelian, NW Caucasian, NE Caucasian, Altaic, Uralic & the pre-Indo-European languages of Europe (Vasconic, Pictish, Tartessian, Pelasgian, Iberian, Aquitanian, Ligurian, Raetian, Etruscan, Wiik's Saami substratum...).

8/Many supposed Proto-IE roots are shared Eurasiatic and Nostratic roots and thus could be loans.

9/Many supposed Proto-IE roots are supported by examples from very few Indo-European branches, sometimes by only 1 IE branch, or by only 1-2 branches with very unbelievable sound changes, ignoring innovation, loans and chance probabilities.
As an example, the supposed PIE root *h₂éǵʰ-r has only 2 given examples, both in the same Indo-Iranian branch (Avestan and Sanskrit):
h₂éǵʰ-r/n̥- day Skr. अहर् (ahar), Av. azan

10/The sound change du=>er in the Armenian erku=2, if accepted, would make many languages Indo-European; for example, Proto-Semitic thnay (2) is by far closer to Proto-Indo-European *two (2) than the Indo-European Armenian erku (2) is.

11/A great number of Proto-Indo-European roots are almost impossible to pronounce, such as dngh2wleis=tongue, whereas the roots of other proto-languages such as Turkic or Semitic are not only easy to pronounce but are at the same time real, existing words with clear Turkic & Semitic etymology and similar derivative words that contain the same consonants, which is not the case for the hypothetical Proto-Indo-European roots (as opposed to the reflex words).

12/Very often Semitic (as well as other languages') roots are closer to some Indo-European reflexes than the Proto-IE root is; for example, Proto-Semitic "lis" (tongue) is closer to Armenian "lezu" and Baltic "lesvis" than Proto-IE "dngh2weis" is.
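
One crude way to put a number on "closer" is plain edit distance between the romanized strings. The Python sketch below is only an illustration under that assumption: Levenshtein distance on spellings is my own choice of yardstick here, it ignores regular sound correspondences entirely, and the forms are taken as written in the point above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


reflexes = ["lezu", "lesvis"]        # Armenian and Baltic forms as cited above
candidates = ["lis", "dngh2weis"]    # Proto-Semitic and Proto-IE forms as cited above

for reflex in reflexes:
    for candidate in candidates:
        print(f"{candidate:>10} -> {reflex:<7} distance {levenshtein(candidate, reflex)}")
```

On this surface measure "lis" is 3 edits away from both reflexes, while "dngh2weis" is further from each.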


15/The number of common Indo-European roots shared by at least 3 Indo-European branches is very limited (106 roots) when compared, for example, with the number of roots shared by languages such as Semitic (more than a dozen thousand common shared roots, which do have their own meanings), the Malay languages, etc...

16/Since the guessed Proto-Indo-European roots are not found in any Indo-European language (i.e. there is of course no English word nebhos=cloud, for example), the reflex words in the various Indo-European languages are cut off from their roots, and their semantic-phonetic derivations and inner paradigms are rather arbitrary, lacking a clear system. In Semitic, by contrast, every speaker who knows the meaning of a root will automatically know the meaning of its derivatives (example: from the root *ktb, the derivative kVtVb (V stands for vowel) is always connected with the active form and nkVtVb with the passive form, and so on), i.e. with clear and well-defined patterns, a system that is lacking in the Indo-European daughter languages with respect to the constructed Proto-Indo-European roots.
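
The root-and-pattern idea in this point can be sketched in Python. This is a minimal illustration, assuming a template notation in which "C" stands for the next root consonant and every other character is copied as-is; the function name and the choice of "a" as a stand-in for the unspecified vowel V are hypothetical.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill each 'C' slot of the pattern with the next consonant of the root."""
    if pattern.count("C") != len(root):
        raise ValueError("pattern must have one C slot per root consonant")
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)


ROOT = "ktb"

# kVtVb (active) and nkVtVb (passive), with 'a' standing in for V.
active = apply_pattern(ROOT, "CaCaC")
passive = apply_pattern(ROOT, "nCaCaC")

print(active, passive)
```

The same two templates apply unchanged to any other three-consonant root, which is the regularity the point describes.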

17/Linguists often classify Altaic & Uralic as phonetically conservative languages (since the mono- and biconsonantal proto-roots of Uralic and Altaic are invariable), while Indo-European, Kartvelian and Afro-Asiatic (the 3 have the rare peculiarity among the world's languages of having tri- and tetraconsonantal proto-roots) are classified as apophonic, i.e. the internal structure of the proto-roots can undergo vocalic and consonantal ABLAUT. So if we reanalyzed the Proto-Indo-European roots from a Semitic-model perspective (i.e. purely consonantal roots, with vowel ablaut serving as well-established models for deriving words and various grammatical forms as well as conjugation patterns), perhaps it would open a new horizon for "clogged" Indo-European studies (albeit at the same time it would create internal problems, as in the example below):
1/ bʰer=brown, shining
Ltv. bērs; bebrs, Lith. bėras; bebras, Old Prussian bebrus, Gaul. Bibrax, Welsh befer, Eng. brūn/brown; bera/bear; beofer/beaver, Gm. brūn/braun; bero/Bär; bibar/Biber, ON brúnn; bjǫrn; bjórr, Skr. भाति (bhā́ti), Av. bawra, Lat. fiber, Russ. бобр (bobr), Toch. parno/perne; paräṁ/perne
2/bʰer=to bear / carry
Skr. भरति (bhárati), Av. baraiti, Russ. брать (brat’), Ir. berid/beir, Welsh cymmeryd, Arm. բերել (berel), Alb. bie; mbart, Gk. φέρω (pherō), Lat. ferō, Umbrian fertu, Eng. beran/bear, Gm. burde/Burde, Toch. pär/pär, Lith. berti, Ltv. bērt, Kamviri bor, Phryg. ber, Goth. 𐌱𐌰𐌹𐍂𐌰𐌽 (bairan), ON bera, OCS бьрати (bĭrati), Pers. baratuv/bār, Polish brać, Hitt. kapirt
3/bʰer=to boil
Alb. brumë, Gaul. Voberā, OIr. bréo; bruth; berbaim, Welsh brwd; brewi, Eng. brǣþ/breath; /broth; /brew; beorm/, Gm. brādam/Brodem; braten, ON bráðr, Gk. φύρδην μίγδην (phurdēn-migdēn); phréār; porphurein, Lat. fervēo; fermentum; dēfrutum, Skr. भुरति (bhuráti); bhurnih
4/bʰeres=quick
Russ. борзой (borzoj), Lith. bruzdùs, Welsh brys, MIr. bras, Polish bardzo, OCS брьзо (brĭzo), Lat. festīnō; fastenus; confestim
5/bʰergʰ=high, hill
Russ. берег (bereg), Eng. burg/borough, Arm. բարձր (barjr), Skr. बर्हयति (barháyati), Av. bərəz(ant), Gaul. Bergusia, Gm. berg/Burg, Hitt. parku; en-park, Toch. pärk/pärk, Thrac. Berga, Goth. baírgahī, ON bjarg; borg, Pers. burj, MIr. brí, Illyr. Berginium, Gk. πύργος (purgos), Welsh bre; bera, Lyc. prije; pruwa, Lat. fortis, Alb. breg
i.e. why do the first 3 "bher" roots have such different semantics when their phonetics are 100% identical, and why do the 4th and 5th "expanded roots", "bheres" and "bhergh", have no meaning similar to at least one of the first 3 "bher"!?

19/ The same problem arises when only a medial vowel differs between 2 roots that are otherwise consonantally identical, such as:
*wedʰ=to lead
Russ. веду (vedu), Lith. vesti, Ltv. vads, Ir. fedid/fedim, Av. vāðayeiti, Hitt. uwate, OCS vedǫ, Pol. wieść, Welsh arweddu, OPruss west
*wadʰ=to pledge
Lat. vas (g. vadis), Ir. fedid/, Welsh arweddu; dyweddio Goth. wadi, ON veð, Eng. weddian/wed; weotuma/-, Gm. wetti/Wette, Lith. vadas; vaduoti; vedù; vedẽklė, Ltv. vadot; vedu, OCS vedǫ; voždǫ, Gk. éedna, Hitt. huettiya, Av. vadayeiti; upāvādayeiti; vadū; vadrya, Skt. vadhú

20/ We know from other language families (Turkic, Semitic...) that certain sequences of particular consonants, of vowels, or of consonants and vowels are not possible, nor are certain morphemes or words beginning with a particular vowel and consonant. But since Proto-Indo-European is a hypothetical language, we cannot know much about, or verify, the phonemic clusters and other phonetic patterns that the intrinsic paradigms of Proto-Indo-European DO NOT ALLOW!

Last edited by Colin Wilson; 11-10-2010 at 02:44 PM.
Below is another critical review, by the linguist Arnaud Fournet (originally in French):

La reconstruction de l’indo-européen et la réalité du sémitique : convergences et perspectives (The reconstruction of Indo-European and the reality of Semitic: convergences and perspectives)
Arnaud Fournet
The creation of Indo-European
The term Indo-European is first attested in the writing of Thomas Young in 1813. But the Indo-European relationship was recognized much earlier, as early as the 16th century. The resemblance between Sanskrit, Latin and Greek was first noted in 1583 by an English Jesuit, Thomas Stephen, who lived in India from 1579 to 1619. People with more worldly interests, such as the Italian merchant Filipo Sassetti in 1585, were struck by the familiarity of Sanskrit and its similarity to Latin and Greek. Sassetti cites in particular the numbers 6 to 10, as well as the words for "God" and "snake". Nevertheless, he never formulated the hypothesis of a relationship, for his correspondence conveys a kind of bewilderment at the cultural gap separating Italy from India. Much work was done, especially in the Netherlands by Marcus Boxhorn (1640) and in France by Claude de Saumaise (1643), on the lexicon of the Indo-European languages, which did not yet bear that name, above all on Sanskrit, Greek, Latin, Persian and the Germanic languages. The approach was then strictly comparative and, in that very fixist era, the notion of a proto-language was simply unthinkable. Moreover, in the absence of precise data on Sanskrit and the Persian languages, the demonstration of a relationship remained shaky, even almost fanciful. The obvious similarities among these languages were explained within the framework of a "Scythian" origin, sometimes also called "Japhetic". The well-known Scythian people, an Iranian branch of Indo-European, was then supposed to have spread across all of Eurasia and to have branched into as many modern languages. Leibniz (1646-1716) helped propagate this hypothesis of a "Scythian" diffusion, which ultimately goes back to Boxhorn [1]:
One may conjecture that this comes from the common origin of all these peoples descended from the Scythians, who came from the Black Sea, crossed the Danube and the Vistula, and of whom one part may have gone to Greece while the other filled Germany and Gaul.
Other works, such as those of James Parsons (1767), are interesting from a comparative standpoint, especially regarding the resemblance between the Celtic languages, Irish and Welsh, but they are inextricably interwoven with biblical themes relating to Genesis, so that it is hard to tell the genuinely historical hypotheses from the more mythical elements, or even [...]
1. Leibniz (1990, p. 218).
The words Japhetic and Indo-European long remained in competition. In 1905 one could still write:
The subject of the present book is the group of languages that the Germans today call Indo-Germanic (idg.), and which is also designated by the name Indo-European (i.e.), the usual name in France and the one that will be adopted in the present translation, or Aryan, or Japhetic [3].
Before the invention of the notions of evolution and prehistory [4], differences were explained not by gradual divergence over time but by mixtures between languages in varying proportions. In this premodern framework, languages do not evolve but mix, which gives rise to other idioms. A clearly modern view of language evolution is expressed by Jakob Grimm (1785-1865) in Geschichte der Deutschen Sprache (1848, p. 833):
All dialects develop in a progressive order, and the further back one goes toward the origin of languages, the more their number decreases and the more their differences fade. Were it not so, the formation of dialects and the plurality of languages would remain inexplicable. All diversity has emerged gradually from a primitive unity. The German dialects all go back to an ancient common Germanic language, and this in turn, alongside Lithuanian, Slavic, Greek and Latin, was only one of the dialects of a still older primitive idiom.
Another modern presentation of the Indo-European relationship is formulated by Friedrich Schlegel in Ueber die Sprache und die Weisheit der Indier (1808):
Sanskrit shows a very strong kinship with Latin, Greek and Germanic, as well as Persian. The similarities are not limited to a very large number of roots that these languages have in common but also exist in structure and grammar. Consequently the resemblance is not accidental, explicable by borrowing, but fundamental, caused by a common origin.
The credit for demonstrating the relationship goes to Franz Bopp (1791-1867). Having come to Paris to study Sanskrit in 1812, he published four years later a work of colossal scope, which definitively formalized the relationship that had been in learned minds for 350 years. Rereading the Grammaire comparée des langues indo-européennes, one remains full of admiration for the work accomplished. In fact, almost everything that makes up comparative linguistics is already there!
2. For further reading, see the following scholarly works: Sergent (1995, p. 20-46), a synthesis by a historian; Mallory (1997a), a synthesis by an archaeologist.
3. Brugmann (1905, p. 2). The French translation of this book begins with this sentence.
4. This word is attested in French only from the middle of the 19th century.
A word must be said about Sir William Jones (1746-1794), who is often presented as the discoverer of Sanskrit and as the initiator of Indo-European comparative studies. In 1786, Jones, who was a judge at the Supreme Court of Calcutta, delivered an address to the Royal Asiatic Society of Bengal, from which we give an excerpt that has remained famous:
Sanskrit, whatever its antiquity, is of a wonderful structure, more perfect than Greek, more copious than Latin and more exquisitely refined than either, yet bearing to both of them so strong an affinity, in the roots of words and in the grammatical forms, that it could not have been produced by accident; so strong indeed that no philologist could examine all three without believing them to have sprung from some common source, which perhaps no longer exists. There is a similar reason, though not quite so compelling, for supposing that Gothic and Celtic, though blended with a different idiom, had the same origin as Sanskrit, and Old Persian might be added to the family.
We have underlined two expressions that show the limits of Jones's modernity. He remains within the old framework of language mixture and does not take the step of a common origin so ancient that it would be lost and in need of reconstruction. Jones is a scholar, well informed to be sure, but one who represents the end of an era, mixturist and fixist. His real contribution to Indo-European comparative studies is greatly overestimated. So are his linguistic skills, for he considered that Farsi had the same origin as Arabic, since both are written in the same alphabet. It is instructive to quote the sentence that immediately precedes the famous paragraph:
Pure Hindi, whether of Tartar or Chaldean origin, was the original language of northern India, into which Sanskrit was introduced by conquerors from other kingdoms in some very remote age, for we cannot doubt that the language of the Vedas was used across the greater part of the country for as long as the religion of the Brahmans prevailed in it.
Put plainly, Jones does not see the historical filiation between Hindi and Sanskrit. For him, Hindi would be of Altaic or Semitic origin and older than Sanskrit. It goes without saying that Jones's English-speaking eulogists hardly mention these dubious assertions of their champion. In general, only the most favorable paragraph is quoted, carefully lifted out of its context and trimmed. Moreover, dating the beginning of research on Indo-European so late, to the end of the 18th century, would make it one of the most belatedly identified families, half a century after the Uralic family for example [5]. That is simply absurd.
5. The first clear assertion of a Uralic relationship is due to Philip Johann van Strahlenberg, a Swedish officer, in 1729, who wrote in German. The first French translation, published in Amsterdam, dates from 1757: Description de l’empire russien. See volume 1, p. 148 ff.
The subfamilies recognized for Indo-European are the following:
– Celtic and Italic,
– Germanic,
– Albanian,
– Thracian, Dacian (extinct),
– Greek, Armenian, Phrygian (extinct),
– Anatolian: Hittite, Luwian, Lydian (known since 1915),
– Baltic and Slavic,
– Indo-Iranian,
– Tocharian.
Indo-European formalism
When one starts from a corpus of lexical data, the first stage of the work consists in sorting and comparing in order to bring the interesting facts to light. First, one must identify a certain number of potential cognates and posit a certain number of correspondences. This phase of analysis normally leads to a new phase called "reconstruction". This work began for Indo-European around the 16th century and gained in rigor and precision at the beginning of the 19th century.
One reconstructs etyma or proto-forms that have several characteristics:
– they are supposed to go back to the proto-language, which once existed,
– they synthesize the results of the comparative analysis,
– they are formatted so that the attested forms can be derived from them.
In the usual presentations of the comparative method, very little is said about the format into which the proto-language must be cast. In our view this notion is essential. The format used for the reconstruction of Indo-European has changed several times since the beginning of the 19th century. Thus the word for "father" has been reconstructed as:
– *pitar when Sanskrit was considered to be the proto-language,
– *pəter in an empirical formalism,
– *pə2ter in a semi-structuralist formalism,
– *pH2ter in a structuralist and laryngealist formalism.
With these different formats, one accounts for the same primary data, namely Latin pater, Greek patêr, Sanskrit pitâr, Irish aithir, etc. In our view it is important to understand that the formalism used to account for the data is not in the data, and that it is not neutral with respect to the conception one has of the proto-language and the theoretical framework in which the reconstructors work. Some wry observers have pointed out that the face of Indo-European changes much faster than that of the daughter languages. This is directly linked to the progress of comparative studies and of linguistics itself. The format of the reconstruction changes because the way we look at the proto-language changes. These different formats are no truer than one another, nor are they mutually exclusive. They do not say the same thing. Of course, one expects the most recent format to say more, and say it better, than the preceding ones.
The first age of reconstruction, *pitar, before 1840, did not yet think in terms of proto-language and prehistory. It had Sanskrit as its unsurpassable horizon. The second age, *pəter, is a first step toward the notion of a prehistoric language. It reconstructs etyma but stays very close to the data, which yields a proto-language with a jumble of roots showing very varied morphological alternations: e ~ o ~ Ø (zero), ê ~ ô ~ ə, â ~ ô ~ a, etc. The formalism of this period is, moreover, very vocalic. The two following formats are then significant refinements resulting from in-depth reflection on the phonology and root structure of Indo-European. These are the points we shall now examine more closely.
In general, Indo-European is pulled between two poles: the philologist comparatists are content with a phonetic-algebraic construction kit, whereas the linguist comparatists want a realistic, acceptable proto-language. On one side, one wants formulas and rewriting rules; on the other, one wants a possible state of language. The changes in Indo-European formalism are made to reconcile the two points of view and to meet two main objectives: improving the efficiency of the algebraic kit and strengthening the realism of the proto-language. We shall see that all the reforms to date end up bringing Indo-European closer to Semitic.
The consonant system postulated at the very beginning was modeled on Sanskrit. The perfect, rectangular structure of the Sanskrit stops had been pointed out since Antiquity by the Indians themselves, in particular by Panini, and it had appealed to the first comparatists:
                      Labials   Dentals   Retroflexes   Palatals   Velars
Voiceless             p         t         T             č          k
Voiceless aspirated   ph        th        Th            čh         kh
Voiced                b         d         D             j          g
Voiced aspirated      bh        dh        Dh            jh         gh
Nasals                m         n         N             ń
This table served in the 19th century as a sieve for sorting out legitimate cognates, and Sanskrit served as the guiding thread for setting up the algebraic kit of reconstructions. One only had to see what the sounds of Sanskrit corresponded to in the other Indo-European languages, and the trick was done.
It soon became apparent that the voiceless aspirated series was quantitatively marginal and concerned mainly Sanskrit itself. The orthodox system, once the voiceless aspirated series is removed, rests on three fundamental series: voiceless, voiced, voiced aspirated. From the point of view of the comparatists' algebraic kit, this changes nothing. But from the linguistic point of view it is awkward, for this system of features exists nowhere on the planet and is not acceptable to phonology. From 1950 onward, this prompted the search for a new, more acceptable format for the proto-phonological system: the so-called "glottalic" systems.
Orthodox system              Glottalic system
Was voiceless t              Becomes voiceless t
Was voiced d                 Becomes glottalized ṭ [t?]
Was voiced aspirated dh      Becomes voiced d
Reformatting the features changes neither the number of phonemes nor the correspondences. It proposes to change the nature of the features of the proto-system. There are several variants of the glottalic system [6] but, for our purposes, they amount in practice to aligning Indo-European with the system of Hebrew and Proto-Semitic: namely voiceless, voiced, glottalized voiceless (or emphatic voiceless). This system is more acceptable since we are certain that it exists!
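The claim that the glottalic reformatting keeps the number of phonemes and the correspondences intact can be illustrated with a small re-notation sketch. It covers only the dental series shown in the table above; the dictionary name, the function `renotate` and the ASCII spelling `t'` for the glottalized stop are assumptions made for this illustration, not notation from the article.

```python
import re

# Dental series only, as in the table above.
# "t'" is an ASCII stand-in (assumed here) for the glottalized stop.
ORTHODOX_TO_GLOTTALIC = {
    "dh": "d",   # voiced aspirated -> plain voiced
    "d": "t'",   # plain voiced     -> glottalized voiceless
    "t": "t",    # voiceless        -> unchanged
}

# Longest alternatives first, so that "dh" is never split into "d" + "h".
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, ORTHODOX_TO_GLOTTALIC), key=len, reverse=True))
)

def renotate(form: str) -> str:
    """Rewrite an orthodox reconstruction in glottalic notation in one pass."""
    return _PATTERN.sub(lambda m: ORTHODOX_TO_GLOTTALIC[m.group(0)], form)

if __name__ == "__main__":
    print(renotate("ted"))   # tet'
    print(renotate("dhe"))   # de
    # The mapping is one-to-one: three phonemes in, three phonemes out.
    print(len(set(ORTHODOX_TO_GLOTTALIC.values())) == len(ORTHODOX_TO_GLOTTALIC))
```

Doing the substitution in a single pass matters: applying the three rules one after another would first turn dh into d and then wrongly turn that d into a glottalized stop.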
Another argument in favor of the glottalic theories is the puzzling rarity of *b, for which only very few lexemes are reconstructed. It is known that the phoneme /p?/ does not exist in all systems with glottalization, because labial and glottal coarticulation is not very efficient. Reinterpreting *b as in fact */p?/ would therefore explain its rarity.
To date, this glottalic face of Indo-European is not universally accepted as valid and is among the hypotheses that split comparatists into two camps. The skeptics remain in the majority to this day. For example, Michael Meier-Brügger, who may be considered representative of orthodox German-language comparative studies, states [7]: "Die Zweifel an der Berechtigung des Ansatzes von Glottalen für das Uridg sind nach Ansicht der Verfasser des Studienbuches nicht ausgeräumt." Apart from the theoretical reservations, on the practical level, if the glottalic format were implemented in the reconstruction of Indo-European, 150 years of accumulated work would have to be rewritten. A titanic undertaking...
Other authors subscribe to it and even go so far as to specify that these glottalized consonants were pre-glottalized, for example Robert Beekes [8]: "Important is that the glottalic feature probably preceded the consonant."
The glottalized consonants show up in Slavic as preglottalized. Indeed, Slavic /d/ < *[?t] is always preceded by a long /ê/, whereas Slavic /d/ < *[dh] is always preceded by a short /e/. The explanation is that [*v+*?t] > [*v+*?d] > [*v?+*d], hence a long vowel [9]. This phenomenon bears the eponymous name of Winter's Law, after its discoverer. Some comparatists conclude from it that the whole Indo-European domain had preglottalized consonants. Our own work on the sequence glottal stop + voiceless consonant, which would take too long to report here, shows that only the central languages (Germanic, Baltic, Slavic, Greek, Armenian, Indo-Iranian) have preglottalized consonants. Celtic and Italic have postglottalized ones. In support of the glottalic theory, some comparatists, such as Kortlandt, point out that certain Indo-European languages still have glottalized consonants today, for example the varieties of English spoken in England itself. From our point of view, it is interesting to compare these glottalic hypotheses with the concrete reality of Semitic, which is the home ground of glottalized-emphatic consonants. We shall give two examples. The root of the verb "to give", √ dô-, contains an initial /d/. If one accepts that /d/ stands for */ṭ/ [t?], that is ط or ט, it is striking that the verb "to give", Arabic ‘aṭâ’ and Hebrew naṭan, contains a radical emphatic (or glottalized) consonant, in line with what is expected from the reformatting of Indo-European. Conversely, the word "to suckle, to suck" √ dheH1 corresponds to Hebrew dad, "breast", with no glottalized consonant on either side [10].
6. For an overview of the hypotheses: Collinge (1985, Appendix II, p. 259-269) or Salmons
7. Meier-Bruegger (2002, p. 126). Our translation: "The doubts about the validity of positing glottal(ized) consonants for Indo-European have not been dispelled, in the opinion of the author of the [present] handbook."
8. Beekes (1995, p. 133). Our translation: "It is important to note that the glottalic feature probably preceded the consonant."
In short, to put it simply, we would say that the glottalic reformatting of Indo-European is necessary for reasons internal to the Indo-European domain and that it offers, as a bonus, very promising macro-comparative prospects.
As for the vowels, of all the alternations possible within a consonant skeleton, it soon became apparent that the paradigm e ~ o ~ Ø was the absolute reference: the most frequent and most regular case. Comparatists call the alternation e ~ o ~ Ø apophony (ablaut). /e/ is called the e-grade, /o/ the o-grade and Ø the zero grade. The Indo-European languages contain very clear examples of this inherited e ~ o ~ Ø apophony:
– Latin ped-is ~ Greek pod-os "foot", in the genitive case;
– Latin tegô "I cover";
– Latin tegumen "tegument, skin" (< that which covers the flesh);
– Latin tect-um "roof" (< that which covers the house);
– Latin toga "toga" (< that which covers the body).
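The e ~ o ~ Ø apophony just described amounts to filling one vowel slot in a consonant skeleton in three ways. A minimal Python sketch follows; the underscore notation for the slot, the `GRADES` table and the function name `ablaut_grades` are assumptions made for this illustration.

```python
# Grade name -> vowel inserted in the slot (empty string for the zero grade)
GRADES = {"e": "e", "o": "o", "zero": ""}

def ablaut_grades(skeleton: str) -> dict[str, str]:
    """Return the three apophonic grades of a skeleton such as 'p_d',
    where '_' marks the single vowel slot."""
    if skeleton.count("_") != 1:
        raise ValueError("skeleton must contain exactly one '_' vowel slot")
    return {name: skeleton.replace("_", vowel) for name, vowel in GRADES.items()}

if __name__ == "__main__":
    print(ablaut_grades("p_d"))  # {'e': 'ped', 'o': 'pod', 'zero': 'pd'}
    print(ablaut_grades("t_g"))  # {'e': 'teg', 'o': 'tog', 'zero': 'tg'}
```

The e-grade and o-grade outputs line up with the stems in the list above (ped-is, pod-os, teg-ô, tog-a); the zero-grade strings are only the mechanical output of the sketch.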
Latin and Greek have five short vowels: a, e, i, o, u. But they do not have the same status: /i/ and /u/ are not integrated into a set of morphological alternations. The vowel /a/ exists in all the languages, but it does not enter into normal, regular alternations either. As Antoine Meillet aptly pointed out, /a/ is frequent mainly in initial position and it alternates with /o/ and never with /e/. We shall see the reasons for this below, with the structure of roots [11]. In verbs, /e/ has present value and /o/ has past value. This system is clearly visible in Greek and Germanic, including in a modern language such as English. The e ~ o alternation is often reflected in an evolved form, i (< *e) ~ a (< *o), in Germanic.
9. Beekes (1995, p. 133). "The solution is that the glottal stop /?/ lengthens the preceding vowel: a?g-t > âc-t. [...] The theory also explains why in Balto-Slavic a preceding vowel is lengthened by a voiced consonant (Winter’s Law) [...] It is now possible to make out whether a Balto-Slavic *g goes back to an aspirated [voiced] or a non-aspirated [voiced] sound."
10. The comparison with Arabic nahd is more delicate (double incrementation...), but [...]
                             e             o                  Ø
Value                        Present       Past               Derived
English to get "obtain"      get           got
English to bear "carry"      bear [ber]    bore [bor]
English to be born           born [born]   birth [bərθ]
English to drink             drink         drank              drunk
English to bring             bring         brought [bro:t]
The invention of the laryngeals
The e ~ o ~ Ø alternation works well as long as the root has a neat consonant skeleton C1_C2. It breaks down in certain roots, such as that of the verb "to give". In Latin, the dictionary entry lists the forms dô "I give", dâs "you give", dâre "to give", dedî "I gave", datum "given", and dônum "gift" must be added. It is hard to see how to fit these forms into the same paradigm as the other verbs. The same is true of the root "to stand", which in Greek has a set of morphologically opaque forms sta, stâ, stau, stû. Integrating these roots with their strange alternations into a unified model is another problem of reconstruction.
In his Mémoire sur les voyelles de l’indo-européen, written in 1870 at the age of 18 (!), Ferdinand de Saussure studied the long vowels of ancient Greek more closely and confirmed that there was only one fundamental morphological alternation: e ~ o ~ Ø. And he postulated sonantic coefficients to account for the other alternations and the long vowels. In Saussure's approach three are needed, written E, A, O:
Data           Saussurean format with coefficients [12]
ê ~ ô ~ e      eE ~ oE ~ ØE
â ~ ô ~ a      eA ~ oA ~ ØA
ô ~ ô ~ o      eO ~ oO ~ ØO
stâ ~ sta      steA ~ stA
dô ~ do        deO ~ dO
11. #a- < *H2e and #e- < *H1e are in complementary distribution.
12. See Saussure (1870, p. 145) for a similar table.
If one reasons in terms of features or phonetic effects, A colors the vowel to /a/ and lengthens it, E lengthens without coloring, and O lengthens and colors to /o/. Following Saussure, the root "to give" is reformatted as d_O and the root "to stand" as st_A. Saussure's approach is more structuralist and more abstract. It regularizes the morphology, but it raises the problem of determining what kind of phonetic reality lies behind these coefficients, postulated somewhat ex nihilo for the needs of the case. As long as one is content with a graphic construction kit, this kind of format works, but if one wants to give a linguistic meaning to these sonantic coefficients, the situation becomes more problematic.
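The coloring and lengthening effects attributed to E, A and O can be written as a tiny rewrite rule. The Python sketch below simply encodes the rows of the coefficient table (eE > ê, eA > â, eO > ô, any o + coefficient > ô, bare coefficient > short vowel); the function name `surface`, the two lookup tables and the use of a circumflex for length are conventions assumed for this illustration.

```python
import re

# e + coefficient: lengthening, with coloring by A and O
LONG_AFTER_E = {"E": "ê", "A": "â", "O": "ô"}
# coefficient alone (zero grade): short vocalic reflex
ZERO_GRADE = {"E": "e", "A": "a", "O": "o"}

def surface(form: str) -> str:
    """Turn a form written with coefficients (e.g. 'steA') into its
    attested-style vocalism (e.g. 'stâ'), following the table above."""
    def resolve(match: re.Match) -> str:
        vowel, coefficient = match.group(1), match.group(2)
        if vowel == "e":
            return LONG_AFTER_E[coefficient]
        if vowel == "o":
            return "ô"  # the o-grade gives long ô with every coefficient
        return ZERO_GRADE[coefficient]
    # Optional lowercase e/o followed by an uppercase coefficient
    return re.sub(r"([eo]?)([EAO])", resolve, form)

if __name__ == "__main__":
    for form in ("steA", "stA", "deO", "dO"):
        print(form, "->", surface(form))
```

Run on the four forms from the table, this gives stâ, sta, dô and do, i.e. the "Data" column is recovered from the "Saussurean format" column by one mechanical rule, which is the regularizing effect the paragraph above describes.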
Let us note in passing that Saussure always kept to a vocalic interpretation of the coefficients, whereas Hermann Möller was unquestionably the first to postulate a consonantal nature for them [13].
Moreover, it turned out that the so-called zero grade (Ø) often had a vocalic reflex instead of being lost and strictly equal to zero. Sanskrit then shows /i/ where the other languages have /a/. On the model of Hebrew, which has schwa ~ full vowel alternations, comparatists called the three vocalic reflexes of Saussure's sonantic coefficients schwas, written /ə/. From the standpoint of the epistemology of science and the history of comparative studies, it is surprising that no one has ever studied the importation of the Hebrew concept of schwa into the Indo-European domain. Yet it is obvious that the first comparatists knew the major languages of European Antiquity: Greek, Latin and Hebrew...
In this semi-vocalic format of Indo-European, the E A O alternations are set out as follows:
Data           Saussurean format      Empirical format
               with coefficients      with schwas [14]
ê ~ ô ~ e      eE ~ oE ~ ØE           eə1 ~ oə1 ~ ə1
â ~ ô ~ a      eA ~ oA ~ ØA           eə2 ~ oə2 ~ ə2
ô ~ ô ~ o      eO ~ oO ~ ØO           eə3 ~ oə3 ~ ə3
stâ ~ sta      steA ~ stA             steə2 ~ stə2
dô ~ do        deO ~ dO               deə3 ~ də3
The ultimate aim of Saussure's demonstration in the Mémoire sur le système primitif des voyelles dans les langues indo-européennes is summarized on page 135:
The phoneme a1 [modern *e] is the root vowel of all roots. It may alone form the vocalism of the root or be followed by a second sonant which we have called the sonantic coefficient (p. 8.) [...] The phonemes A and O [modern *H2 and *H3] are sonantic coefficients. They can appear bare only in the reduced [zero] state of the root. In the normal state of the root, they must be preceded by a1 [modern *e], and it is from the combinations a1+A, a1+O [modern *e+H2 and *e+H3] that the long vowels â and ô arise.
Likewise on page 141:
Long ê, in our theory, must not be a simple phoneme. It must decompose into two elements. Which ones? The first can only be a1 (e). The second, the sonantic coefficient, must appear bare in the reduced form (p. 135) [15].
13. Szemerényi (BSL, no. 68, 1973, p. 7): "In sharp contrast to Saussure, Möller tried from the outset to give phonetic definitions of these lost consonants."
14. The numbering 1 for [e] and 2 for [a] goes back directly to Brugmann, by way of Saussure.
From the practical point of view, these three formats (vocalic, coefficient or schwa) give the same results, but on the linguistic level they are very different. As long as one does not ask about the possible phonetic nature of these entities, one can stay with a strict graphic algebra, close to a mere bookkeeping exercise [16]:
Die Bezeichnung Laryngale für den Ansatz dreier Konsonanten (Engelaute) der idg Grundsprache ist wissenschaftsgeschichtlich bedingt. Für diese ist eine algebraistische Notierung als uridg *h1, *h2 bzw *h3 üblich geworden.
[Our translation: The designation laryngeal for three consonants of Indo-European is scientifically certain. For them, an algebraic notation *h1, *h2 and *h3 has become usual.]
The internal logic of Arabic morphology leads to the conclusion that long [â] rests on an implicit sequence *[awa] attested nowhere in the Arabic-speaking synchronies. Here */w/ is a morphological ghost, which the logic of the language forces one to postulate in order to restore regularity where synchrony sees only anomaly. In the same way, the internal logic of Indo-European forces one to postulate at least three additional phonological entities that make it possible to recover a stable and coherent working point. Saussure's intellectual approach is ultimately identical to that of the Arab grammarians, while being completely independent and applied to an entirely different corpus of data.
Besides their effects on vowel length and timbre, Saussure himself had noted as early as 1870 that these ghosts have effects on consonants. Thus the root *√ st_A in the zero grade *stA yields Sanskrit √ sth instead of √ st. The root "to drink" *√ p_O in the zero grade *pO yields a voiced /b/ in Latin bibere, whence French "boire". These coefficient-schwas have an impact on consonants: they cause aspiration or voicing [17]. They have several observable phonetic properties: lengthening, coloring, aspiration, voicing... It is understandable that they long had a whiff of heresy or scandal about them.
15. Note that Saussure does indeed postulate three sonantic coefficients for the morphology. Some authors claim that he postulated only two. They have not read the [...]
16. Meier-Bruegger (2002, p. 106, § L314).
17. Besides Saussure, the Dane Holger Pedersen (1867-1953) and Albert Cuny (1870-1947) reached identical conclusions.
The comparatists who subscribed from 1870 onward to the notion of sonantic coefficients were forced to think about the possible phonetic nature of these coefficients. Phonemes had to be found that were capable of vocalizing, of influencing the timbre of neighboring vowels, and of affecting the articulation of consonants. Hebrew and Arabic abound in phenomena of this kind, and these sonantic coefficients were gradually rechristened laryngeals. Let us quote what Martinet (1986, p. 141) says about them:
The Semitic languages illustrate well this type of action of certain consonants on neighboring vowels; since these consonants are referred to there, often wrongly for that matter, as "laryngeals", this term was quickly used to name the three different schwas. Everything that has just been set out and what follows forms what is called the "laryngeal theory". For a long time, this theory retained the largely algebraic cast that Saussure had given it. One operated with formulas such as *eə2=*â, without seeking to know what ə2 might physically be, and one continued to use the vocalic sign ə to write the "laryngeals". Then the habit took hold of using H accompanied by the same numbers, without this necessarily entailing any reflection on the nature of the sounds in question.
Oswald Szemerényi studied how comparativists took up the concept of these
coefficients-schwas-laryngeals from 1870 onward [18]. Two periods can be
distinguished: before and after the discovery of Hittite. The first protagonist
is Hermann Möller (1850-1923), a Dane, who was an ardent supporter “of the
kinship of Indo-European and Semitic” [19]. In 1879 he called the laryngeals
“glottals, in 1880, gutturals, and he can be seen wavering between the two
terms for many years”. In 1906 Möller writes [20]: “Es waren wahrscheinlich
Gutturale von der Art der Semitischen.” It is in 1911 that the term laryngeals
makes its first appearance: “Die von F. de Saussure für das
Vorindogermanische erschlossenen ‘phonèmes’ entsprechen den semitischen Laryngalen.” [21]
Oswald Szemerényi concludes:
So we may conclude that Saussure is indeed the founder of modern views
on the vocalism and the system of ablaut alternations of Indo-European,
but is at best only a precursor of laryngealism; the true founder of the
laryngeal theory is the Danish scholar Hermann Möller. (p. 11.)
18. Oswald Szemerényi, La théorie des laryngales de Saussure à Kuryłowicz et à Benveniste (BSL,
n° 68, 1973). This article was republished in Scripta Minora.
19. In a way, the twofold claim of a laryngealist interpretation and of a
relationship within a Proto-Euro-Hamito-Semitic initially made the first claim
suspect. Today, Indo-Europeanists have kept the first without the second.
20. Möller (1906, Teil I (Konsonanten), p. VI). Our translation: “They were probably
gutturals of a kind similar to the Semitic ones.”
21. Szemerényi (BSL, n° 68, 1973, p. 7). Our translation: “The phonemes inferred by
F. de Saussure for pre-Indo-European correspond to the Semitic laryngeals.”
It was therefore while looking for a kinship between Indo-European and Semitic that
Möller introduced the concept of laryngeals into the Indo-European field.
The second protagonist is Albert Cuny (1870-1947). Initially wary of
Hermann Möller's theses, he first gave them a severe review, and then:
A closer acquaintance with Möller led Cuny to a conversion: in 1912
he published a review article in which he not only reports on Möller's
theories (notably the laryngeal theory) but goes beyond his model on
several important points [22].
Nevertheless, for 45 years these coefficients-schwas-laryngeals remained an abstract
hypothesis justified by its ability to regularize Indo-European morphology.
In 1915, the decipherment of Hittite by Hrozny and the unexpected discovery
that it is an Indo-European language gave concrete legitimacy to what had
until then been a bold conjecture. Although Hittite does not preserve a clear
trace of all the postulated laryngeals, and although the writing system is
somewhat difficult [23], Hittite was interpreted as a posteriori proof that the
laryngealist formalism was the right one.
But the most spectacular stroke was surely the discovery that Hittite h
continued the IE laryngeal H2, cf. hantetsi- “first”: Latin ante, anterior.
On the other hand, H1 left no trace in Hittite, at least in initial position.
Cf. estsi “he is”, from H1es-ti [24].
With laryngeals, the format of the morphology is as follows:
Data       | Laryngealist format | Saussurean format (coefficients) | Empirical format (schwas)
ê ~ ô ~ e  | eH1 ~ oH1 ~ H1      | eE ~ oE ~ ØE                     | eə1 ~ oə1 ~ ə1
â ~ ô ~ a  | eH2 ~ oH2 ~ H2      | eA ~ oA ~ ØA                     | eə2 ~ oə2 ~ ə2
ô ~ ô ~ o  | eH3 ~ oH3 ~ H3      | eO ~ oO ~ ØO                     | eə3 ~ oə3 ~ ə3
stâ ~ sta  | steH2 ~ stH2        | steA ~ stA                       | steə2 ~ stə2
dô ~ do    | deH3 ~ dH3          | deO ~ dO                         | deə3 ~ də3
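The correspondences in this table can be read as a small rewriting rule: a vowel grade (e, o or zero) followed by H1, H2 or H3 resolves to the attested vowel in the "Data" column. The sketch below is only an illustration of that reading. The names `OUTCOMES` and `resolve` are hypothetical, and the mapping encodes nothing beyond the rows of the table.

```python
import re

# Outcomes copied from the table: for each laryngeal (1, 2, 3),
# the result of e-grade, o-grade and zero grade ("" = no vowel).
OUTCOMES = {
    1: {"e": "ê", "o": "ô", "": "e"},
    2: {"e": "â", "o": "ô", "": "a"},
    3: {"e": "ô", "o": "ô", "": "o"},
}

# An optional e/o vowel followed by H and a laryngeal index.
PATTERN = re.compile(r"([eo]?)H([123])")


def resolve(form: str) -> str:
    """Replace every (vowel +) laryngeal sequence by its tabulated outcome."""
    return PATTERN.sub(lambda m: OUTCOMES[int(m.group(2))][m.group(1)], form)


if __name__ == "__main__":
    for form in ("steH2", "stH2", "deH3", "dH3"):
        print(form, "->", resolve(form))
```

Applied to the last two rows of the table, the rule gives back the alternations stâ ~ sta and dô ~ do.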
Emile Benveniste, in Origines de la formation des noms en indo-européen, summed up
the situation rather well:
The precondition for any reconstruction of Indo-European was supplied
by F. de Saussure's brilliant discovery concerning the consonantal [25]
nature of the phoneme ə [=schwa=coefficient]. Accepted and enriched by
Möller, and by Messrs. Pedersen and Cuny, this theory can today [1935]
be regarded as established thanks to the insight of Mr. J. Kuryłowicz, who
was able to recognize in Hittite h two of the three varieties of Indo-European ə.
(p. 148.)
22. Ibidem, p. 13. Cuny systematizes Möller's intuitions and discovers new
phenomena that support the initial hypotheses.
23. See further on in the article.
24. Ibidem, p. 17, on the work of Jerzy Kuryłowicz, who is the third protagonist
alongside Möller and Cuny.
The morphology thus forces us to postulate at least three entities or laryngeals H,
whose consonantal nature is certain: “Der Primär konsonantische Character
dieser uridg Phoneme steht ausser Frage.” [26] We translate this as: “The primarily
consonantal character of these phonemes is beyond question.”
The orthodox three-laryngeal system
If these entities are taken seriously, the question arises of the linguistic
and phonological nature of these objects, which the description of the attested
languages forces us to postulate for an older stage of the language. The whole
problem of Indo-European phonetics since 1870 centers on two questions:
– given that the regular morphology of the proto-language leads us to postulate
phantom *H phonemes, how many are there?
– once the legitimacy of several H phonemes is granted, what are the best
IPA symbols for them, and what is their nature?
Several angles of attack are available for untangling these questions:
– the vocalic morphology of roots, as Saussure did,
– the alternations tied to consonant-laryngeal contact, as in √ p_h3
“to drink” > Latin bib-ere,
– the explicit testimony of certain languages, Anatolian in particular,
– theoretical, typological or phylogenetic considerations.
Several preliminary remarks are in order. The choice of the term laryngeal
rather than coefficient is already a value judgment that steers research
in an almost exclusive direction. In fact, just about any consonant can be
lost while lengthening the preceding vowel. Moreover, the laryngeals are
defined extrinsically, by their effects on vowels and consonants, rather
than by their intrinsic features. Each laryngeal is therefore a class of
phonemes rather than a single phoneme. It is above all Martinet, as a good
phonologist, who insisted on this point, and we shall see below that the
remark is well founded. Most Indo-Europeanists work as if each of H1/2/3
were a single phoneme and not a class, which in practice complicates the
possible identifications. Each phonetic hypothesis calls on the part of the
facts that suits it best, while pretending not to notice that it does not
explain all the known and legitimate facts.
There are two further effects of the laryngeals that we have not yet mentioned.
In Sanskrit, */rH/ gives a long /r:/ as opposed to simple /r/. In Lithuanian and in
Serbian, the sequence going back to *vowel + rH-C does not carry the same intonation
as the sequence going back to *vowel + r-C [27]. As a result these languages are tonal,
which is very rare in Europe.
25. In fact, according to Szemerényi, Saussure never regarded his own coefficients
as consonants but always held to a vocalic conception of the coefficients.
26. Meier-Bruegger (2002, p. 106, § L314).
In general, the most orthodox current theory postulates three laryngeals(-phonemes).
In his book, the first to focus specifically on the traces of laryngeals
in Latin, Peter Schrijver states [28]:
I essentially follow the views of what can nowadays be considered orthodox
laryngeal theory (See e.g. Mayrhofer, 1985 ; Beekes, 1988a). I shall not discuss
the rich variety of alternative proposals, e.g. Adrados’ palatalized and labialized
laryngeals, Puhvel’s nine laryngeals and Szemerényi’s one laryngeal.
[...] These laryngeals are written here *h1, *h2 and *h3 (cover-symbol H).
Their exact phonetic nature is unknown and is in fact irrelevant to their existence,
but Indo-Europeanists agree that they were consonants.
[Nous traduisons : Je suis les idées qui peuvent maintenant être considérées
comme la théorie laryngaliste orthodoxe. Je ne discuterai pas les nombreuses
propositions alternatives, comme les laryngales palatalisées et labialisées
de Adrados, les neuf laryngales de Puhvel ou la laryngale unique de
Szemerényi. Ces laryngales sont écrites ici *h1, *h2 et *h3 (symbole générique
H). Leur nature phonétique exacte est inconnue. Cela ne met d’ailleurs pas
en cause leur existence, mais les indo-européanistes s’accordent à en faire
des consonnes.]
The touchstone of the laryngeal theory is the testimony of Hittite and the
other Anatolian languages: Palaic, Cuneiform Luvian and Hieroglyphic Luvian.
Since Kuryłowicz's discovery, everyone agrees in seeing Anatolian <-h(h)->
as a direct trace of *h2: “Das Anatolische hält mit seinem
h direkte Spuren von uridg *h2 fest.” [29]
There are several canonical examples:
– ha-ap-pa “river” < *H2ep (Pokorny 51);
– ha-as-te-ir-za “star” < *H2stêr (Pokorny 1027), Greek astêr “star”;
– har-ki-iš “white” < *H2erg- (Pokorny 64), Greek argês “white”;
– pa-ah-ha-as- “to protect, guard” < *peH2-s (Pokorny 787), Latin pâs-tor
“shepherd, guardian”.
Conversely, *H1 has no graphic reflex in the following words:
– ed-mi “I eat” < *H1ed (Pokorny 287);
– eš-mi “I am” < *H1es (Pokorny 340).
27. These phenomena were described as early as Saussure and Cuny. Today's comparativist
works on heuristic foundations that are, in the end, more than a century old.
28. Schrijver (1991, p. 2).
29. Meier-Bruegger (2002, p. 123).
It is on this basis that the laryngeal theory was definitively validated in
its principles. The interpretation of Anatolian spellings is nevertheless difficult,
and even today there are few certainties. The opposition between inherited
voiceless and voiced stops is marked only between vowels, as -CC- versus -C-.
Elsewhere the opposition is either lost diachronically or not written. The
current view of a specialist in Anatolian is as follows [30]:
In general cuneiform -vt-tv- spellings are used for inherited voiceless stops
and -v-tv- for inherited voiced or voiced aspirate stops. This applies to
Palaic and CLuvian, as well as Hittite. (p. 16.)
[En général, les géminées sont utilisées pour les occlusives sourdes de
l’indo-européen et les simples pour les occlusives sonores et sonores aspirées.
Cela vaut pour le palaite, le louvite cunéiforme et le hittite.]
The Akkadian syllabary has a series of signs for a consonant conventionally
transliterated as h. The sound in Akkadian is apparently a voiceless velar
fricative. In Hittite words h reflects the PIE “laryngeals” *h2 and *h3. Orthographically,
h patterns like the stops with contrastive -hh- and -h- between
vowels, usually -h- in clusters (but occasionally geminate). Once again regular
morphophonemic alternations such as strong stem nâh- versus weak
nahh- support the assumption that -hh- versus -h- is contrastive. [...] Historically,
geminate -hh- is the regular reflex of *h2, while all clear cases of
medial -h- continue “lenited” *h2. (p. 21.)
[Le syllabaire akkadien a une série de signes pour une consonne conventionnellement
transcrite <h>. Le son akkadien est apparemment une fricative
vélaire sourde. Dans les mots hittites h reflète les « laryngales » *h2 et
*h3. Orthographiquement, h se comporte comme les occlusives : -hh- et -h- s’opposent
entre voyelles, -h- est simple dans les groupes de consonnes
(mais parfois géminée). Répétons que les alternances morphologiques telles
que la racine forte nâh- versus faible nahh- confortent l’hypothèse que -hh- s’oppose
à -h-. [...] La géminée -hh- est le reflet régulier de *h2 alors que tous
les exemples clairs de -h- intervocalique continuent *h2 « lénifiée ».]
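The spelling convention quoted above (geminate versus single writing between vowels, for the stops and for h) can be illustrated with a small sketch. It is a deliberate simplification and not Melchert's procedure: the function name `classify_medial` is hypothetical, sign boundaries are simply removed, only the vowels a, e, i, u are recognized, and a geminate is detected only when the same letter is written twice.

```python
import re

# A stop or h, optionally written double, standing between two vowels.
MEDIAL = re.compile(r"(?<=[aeiu])([ptkbdgh])(\1?)(?=[aeiu])")


def classify_medial(transliteration: str):
    """Return (spelling, 'geminate' | 'single') for each intervocalic stop or h.

    Following the quoted convention, 'geminate' corresponds to an inherited
    voiceless stop (or fortis -hh-), and 'single' to an inherited voiced or
    voiced aspirate stop (or lenis -h-).
    """
    word = transliteration.replace("-", "")  # drop sign boundaries
    results = []
    for match in MEDIAL.finditer(word):
        kind = "geminate" if match.group(2) else "single"
        results.append((match.group(0), kind))
    return results


if __name__ == "__main__":
    for form in ("ha-ap-pa", "pa-ah-ha-as", "me-hur"):
        print(form, classify_medial(form))
```

The three forms used here are taken from the examples cited in this article. Real cuneiform spellings would need a sign-by-sign treatment that this sketch does not attempt.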
My assumption of pharyngeal articulation for Proto-Anatolian and the cuneiform
languages is not crucial, and velar fricatives instead are quite possible.
(p. 22.)
[Mon hypothèse d’une articulation pharyngale pour le proto-anatolien et
les langues cunéiformes n’est pas cruciale, et des fricatives vélaires sont
tout à fait possibles.]
/*h1/ I know of no compelling evidence for the preservation of PIE /*h1/ in
Proto-Anatolian in any position. [...] In most positions it is lost without a
trace. (p. 65.)
[Je ne connais pas d’exemple définitif de la conservation de /*h1/ en proto-anatolien
en aucune position. [...] Elle disparaît sans une trace dans la plupart
des positions.]
/*h2/ PIE /*h2/ is generally preserved in Proto-Anatolian as a fortis, voiceless
fricative which I symbolize as H. (p. 68.)
[/*h2/ est généralement conservée en proto-anatolien comme une fricative
forte sourde, que je symbolise <H>.]
30. Melchert (1994). See also Kimball (1999). Their views are close, as Melchert
himself acknowledges in the preface to his book. Our translations are given in square brackets.
/*h3/ The only major controversy regarding laryngeals in Anatolian concerns
the fate of initial /*h3/. (p. 71.) […] /*h3/ is preserved initially as h- in
Hittite, Palaic and Cuneiform Luvian. I assume that initial /*h3/ was a lenis
voiced fricative /*h/ in Proto-Anatolian, distinct from the fortis, voiceless
fricative /*H/ which is the regular reflex of /*h2/. (p. 72.)
[La seule controverse majeure concernant les laryngales en anatolien
concerne le sort de /*h3/. [...] /*h3/ est conservée à l’initiale comme #h- en
hittite, palaite et louvite cunéiforme. Je suppose que /*h3/ à l’initiale était
une fricative voisée douce /*h/ en proto-anatolien, distincte de la fricative
sourde forte /H/ qui est le reflet régulier de /*h2/.]
Melchert considers that *H1 leaves no trace and that *H2 is probably a voiceless
velar fricative /x/ or pharyngeal fricative /H/, close to the <h> of Akkadian.
The following table summarizes different hypotheses for a system with three
laryngeals(-phonemes):
                       | H1  | H2         | H3
Melchert [31]          | (Ø) | /x/ or /H/ | /h/
Meier-Brügger [32]     | /h/ | /x/        | /γ(w)/
Beekes, Schrijver [33] | /?/ | /ع/        | /عw/
What emerges from this comparison is an apparent disagreement about place
(glottal, pharyngeal, velar) and mode (voiceless, voiced, glottalized, labialized).
The internal reconstruction of the laryngeals within a strictly Indo-European
framework runs into serious difficulties. In practice there are several
orthodox systems with three laryngeals(-phonemes).
Nevertheless, three points of general convergence should be noted:
1) H3 is unanimously held to be voiced.
Indeed, the root *p_H3 “to drink” supplies the corpus: Greek pô-tis < *p_H3
“drink”, Sanskrit pibati “he drinks” < *pipH3-ati, in which *p is voiced by
*H3, Latin bib-ere “to drink” < *pib- < *pipH3-.
In the table, H3 is always voiced but varies in place: glottal, pharyngeal, velar.
2) H3 is considered to share its place of articulation with H2, with an additional
labializing effect.
3) H1 is supposed to have a different place of articulation from the H2/H3 pair.
The system proposed by Melchert is somewhat different, because it concerns a stage
of the language that has already evolved from Indo-European and holds for Proto-Anatolian.
31. The values proposed by Melchert concern Proto-Anatolian. In keeping with his
cautious line, Melchert does not propose values for Indo-European. He even puts
the word “laryngeal” in quotation marks.
32. Meier-Bruegger (2002, p. 106).
33. Schrijver (1991, p. 2).
The heterodox data
The orthodoxy of three laryngeals(-phonemes) is motivated first of all by the
wish, no doubt legitimate in the absence of certainty, to limit the number of
lost phonemes. In practice, however, it unquestionably ends up mutilating
the data.
Specialists in Albanian postulate a fourth laryngeal, which is rarely
mentioned. For example, compared with Greek orkhis “testicle”, Albanian
has herðe [34]. This language therefore shows explicit traces of laryngeals
even today. Conversely, the aspirates of Armenian can be explained entirely
as a late internal development, which rules out any possibility of
relying on that language.
Furthermore, the Anatolian data are not as orthodox as is often
claimed:
– ha-ap-pa “river” < *H2ep (Pokorny 51 *ap);
– ap-pa “behind” < *H2ep (Pokorny 53 *apo) [35];
– me-hur “time” < *meH1 (Pokorny 703 *mê);
– me-e-hu-e-ni “time” < *meH1 (Pokorny 703 *mê) [36].
There are thus clear examples where H2 has no reflex and, conversely, where H1
is reflected.
Jaan Puhvel, another specialist in Hittite, postulates a double phonetic value
for each laryngeal, depending on whether or not it leaves a trace <h> in
Hittite [37]:
E1 = voiceless e-coloring laryngeal, lost in Hittite, intervocalically
lengthens preceding vowel and yields glide -y- ; E1 > a. [Equivalent to orthodox H1.]
E2 = voiced e-coloring laryngeal, Hittite h-, -h-.
A1 = voiceless a-coloring laryngeal, Hittite h-, -h(h)-. [Equivalent to orthodox H2.]
A2 = voiced a-coloring laryngeal, lost in Hittite, A2 > a.
A1w = voiceless o-coloring laryngeal, lost in Hittite, A1w > u.
A2w = voiced a-coloring laryngeal, lost in Hittite, Hittite h-, -h-.
[Equivalent to orthodox H3.]
This analysis, more complex than that of Melchert or Kimball, is thus driven
by the data, which cannot be reduced to an orthodoxy of only three
laryngeals(-phonemes).
Another example from Puhvel that is not in Pokorny is the following: Hittite
ay-, e- “to be hot” < IE *H2ei-dh “to burn, be ablaze, be hot”
(Pokorny 11 *ai-dh). Puhvel compares the Hittite word with Albanian hî “ash”.
34. Mallory (1997b, p. 10). Also given in Pokorny (p. 782) without special comment.
See Hittite ark “penis”, without #h-.
35. Cited by Jerzy Kuryłowicz, the discoverer of the laryngeals, in Kuryłowicz (1956,
p. 225).
36. Lehmann (1955, p. 26).
37. Puhvel (1984, vol. 1, p. X).
Johannes Friedrich's Hethitisches Elementarbuch was translated into English,
revised and put online by Olivier Lauffenburger. In the English translation,
Hittite Grammar (p. 16), the translator writes:
In the most common theory, P.I.E. had three laryngeals, noted H1, H2 and H3
that could “color” a neighboring vowel ‘e’. The laryngeal H1 had no coloration
effect, the laryngeal H2 colored in ‘a’ and the laryngeal H3 colored in ‘o’.
In Hittite, the laryngeal H1 vanished and the laryngeal H3 was retained only
in initial position. In median position, the fricative resulting from a laryngeal
can be lenis (written between two vowels by ‘h’) or fortis (written between
two vowels by ‘hh’). [...] It should be noted that the theory described
here is incomplete : it does not explain cases where Hittite displays a ‘h’
where there is no laryngeal, and conversely cases where Hittite does not
display a ‘h’ where a laryngeal occurred.
[Dans la théorie la plus répandue, l’indo-européen avait trois laryngales, notées
H1, H2 et H3, qui pouvaient « colorer » une voyelle « e » voisine. La laryngale
H1 n’avait pas d’effet colorant, la laryngale H2 colorait en « a » et la
laryngale H3 colorait en « o ». En hittite, la laryngale H1 a disparu et la laryngale
H3 est conservée seulement en position initiale. En position médiane, la
fricative résultant d’une laryngale peut être douce (lénis) (écrite entre
voyelles « h ») ou forte (fortis) (écrite entre voyelles « hh »). [...] Il faut noter
que la théorie décrite ici est incomplète [sic] : elle n’explique pas les cas où
le hittite présente un « h » alors qu’il n’y a pas de laryngale38 et à l’inverse
les cas où le hittite ne présente pas de « h » alors qu’une laryngale existait.]
If credit is given to the heterodox data, the opposition between H1 and H2/H3
is not an opposition of place but an opposition of mode. There are at least
two pairs of voiced fricatives H2/H3, and it follows that there are two voiceless
fricatives H1 in phonological opposition with H2/H3, that is, sharing the same place
of articulation and opposed only by the feature voiced ~ voiceless. Moreover, if
credit is given to the glottalic system, /?/ also exists, as a primitive
There is a hypothesis, going back simultaneously to Edward Sapir and Antoine
Meillet, that the laryngeals H1 and H2 could harden to give -k-. A good
example is the regular feminine of Latin imperâtor, formed with the regular suffix
-iH1-, which is imperâtri-k-s in the nominative singular (-s#). The root *gwiH3 “alive”
gives quick in English, where the final -ck is probably not a suffix
but the hardening of H3. There are therefore strong reasons to think that, alongside
the glottals and pharyngeals, there existed a velar/uvular series /x/ /γ/ /γw/.
Hittite supplies internal indications in this direction. Alongside the forms isḵisa
“back”, tetḵissar “storm” and hamesḵanza “spring”, there are variants
ishisa, tethessar and hameshanza. In addition, the personal name Giluhepa, of
Hurrian origin, is written <Krgp> in hieroglyphic Egyptian. The velar character of some
“laryngeal” realizations is therefore undeniable, on the strength of a body of
internal and external evidence.
38. We know of no cases of this type and the author gives none.
In the end, there is a convergent set of observations:
– alternations internal to Hittite,
– transcriptions in foreign languages,
– dual correspondences within the Indo-European corpus.
The orthodox theory limiting the laryngeals to three phonetic values seems
untenable to us. The data lead us to postulate richer systems. The
proto-system that we propose for Indo-European has at a minimum
the following phonological inventory:
Mode/Place  | Glottal   | Pharyngeal  | Velar/Uvular
Voiceless   |           | H1.b */H/   | H1.c */x/
Voiced      |           | H2.b */‘/   | H2.c */γ/
Glottalized | H0.a */?/ |             |
Labialized  |           | H3.b */‘w/  | H3.c */γw/
In the present state of the Indo-European dossier, it seems to us that no further
progress can be made on internal grounds. We shall now examine the external,
macro-comparative data, in particular the Semitic data.
The macro-comparative data
Let us begin with H1, which has two possible values: /ḥ/ ح and /ḫ/ خ.
IE root Pokorny 703 *mê √ m_H1 to measure (among other things, time) (moment, month,
IE root Pokorny 731 *mên-ôt √ m_H1-n month, lunation, moon.
Gothic mel [me:l] “time”
Lithuanian métas “time, year”
Hittite me-hur “time”
Hittite me-e-hu-e-ni [meḥweni] “time”
Latin mensis, Greek mênê “month”
English moon, month
Arabic maḥwat “hour, moment”
Hebrew maḥzor “cycle, revolution (of the heavenly bodies)”
Hebrew maḥzor hayareaḥ “lunation (cycle of the moon)”
This root would be √ m_H1.b [m_ḥ] with a pharyngeal.
IE root Pokorny 666 *lê(i) √ l_H1 weak, soft, slow, late
Latin lênis “gentle; calm, slow”
Lithuanian lenas “calm, slow”
Latvian léns “lazy, gentle”
Old Slavic lenu “indolent”
Latin lentus “indolent; slow, long”
Greek elînuô “to rest”
Arabic laḥḥ “slow, lazy, restive (of an animal)”
Arabic laḥlaḥ “to stay in place and not move”
Arabic laḥam “to refuse to go any further”
Arabic laḫamat “slow and lazy” (variant with [x] instead of ḥ)
Latin lentus “soft, supple”
Arabic laḫi‘ “to be slack and flabby (of flesh)”
Arabic laḫmat “languor and flaccidity of the body”
Latin lassus “tired”
Arabic laġab “to be very tired”
There would be three different roots: *√ l_H1.b *√ l_H1.c *√ l_H2.c
IE root Pokorny 661 *lei-(bh) weak, starving
Greek lîmos “hunger” < *liH-mos [39]
Arabic lataḥ “to be hungry”
Arabic latḥân, latḫân “starving”
Arabic mulḥûb “emaciated”
Arabic malḥûs “emaciated”
Arabic laḥab “to be very thin (from old age)”
Arabic laḥiq “to be thin”
There is a single root: *√ l_H1.b.
IE root Pokorny 662 *s-lei- mucus, mud
Latin lentus “viscous, tenacious”
English slime <*s-lî-m
German Schlei “tench” (cf. Balto-Slavic)
Arabic laḥḥ “to have one's eyelids stuck together”
Arabic laḫij “to be covered with viscous dirt (eyelids, eyes)” (variant with ḫ)
Arabic multaḥimat “conjunctivitis”
Hebrew laḥmît “conjunctivitis”
There is a single root: *√ l_H1.b.
IE root Pokorny 683 *lêu- song
Latin laud “praise”
German Lied “song” < *leH-ut
Arabic laḥn “sound, melody, song”
Hebrew laḥan “tune, melody”
There is a single root: *√ l_H1.b.
See further on the name of the lark.
IE root Pokorny 662 *lei- /*leu- to smear, anoint; to soil, stain
Latin lînô “to anoint”
Irish as-len-aim “I soil”
Greek lûma “filth, refuse”
Arabic laḫḫ “to anoint, soak”
Hebrew liḫlûḫ “dirt, grime”
Hebrew liḫleḫ “to dirty, soil”
There would be a single root: *√ l_H1.c.
39. This i_o ablaut pattern is unusual in Indo-European.
Root √ H1.b_d one
Slavic *od *ed “one”
Arabic ’aḥad / waḥid “one”
The examples with H2, which has three possible values: /’/ أ, /‘/ ع and /ġ/ غ.
IE root Pokorny 4 ag √ H0.a_g to lead, guide
Latin agô “to drive before oneself, to move forward”
Latin agmen “troop of men”
Arabic ’ajâ’ “to force, compel to do something”
Arabic ’ajl “herd; troop of men”
IE root Pokorny 38 anH √ H0.a_n_H1.b breath, wind, to breathe
Latin anima “breath; soul”
Greek anemos “wind”
Arabic ’anaḥ “to pant, breathe with effort; to heave a sigh”
Hebrew ’anaḥah “sigh”
IE root Pokorny aner √ H0.a_n_r man
Greek anêr “man”
Arabic ’ins “man, human being”
Hebrew ’anš “man”
IE root Pokorny 1010 s-tâ √ t_H2.b to flow, run off
Greek sta-zô, sta-lazzô “to flow” [40]
Arabic ṯa‘-b “to cause to flow, shed (tears, blood)”
Arabic ṯa‘-jar “to pour, spill”
Root s-tâ √ t_H2.c pond, pool
Latin stâgnum “pond”
Arabic ṯaġ-b “pool of stagnant water”
Additional values where H2.a is /h/
Root aw √ H2.a_w air, breath
IE aw √ H2.a_w “air”
Arabic hawa’ “air, breath”
Root ay √ H2.a_y alive
IE ay √ H2.a_y “alive”
IE ayənk- √ H2.a_y “young, full of life”
Arabic haya’ “to live”
Additional values where H2.a /h/ alternates with H2.c /ġ/.
IE root Pokorny 1053 tâ √ t_H2.a to melt, liquefy
Latin tâ-bêo “to melt (snow, wax)”
English to thaw “to melt (snow, ice)”
Arabic ṯahṯah “to melt (said of snow)”
Arabic ṯaġab “melting of the ice”
40. In several Indo-European languages (Celtic, Balto-Slavic, Germanic), this root
gives the word for urine. This root seems to be associated with the various bodily
fluids (tears, blood, etc.). It may be a very old word from the “medical” domain.
If these comparisons are accepted, the envelope of phonetic realizations that
we propose for Indo-European therefore has at a minimum the following
inventory:
Mode/Place  | Glottal   | Pharyngeal  | Velar/Uvular
Voiceless   |           | H1.b *[H]   | H1.c *[x]
Voiced      | H2.a *[h] | H2.b */‘/   | H2.c *[γ]
Glottalized | H0.a */?/ |             |
Labialized  |           | H3.b */‘w/  | H3.c */γw/
Some phonemes are not completely independent:
– H2.a *[h] and H2.c *[γ] alternate,
– H1.b *[H] and H1.c *[x] alternate.
In Semitic, and in Arabic in particular, these are different phonemes. The question
remains open whether these pairs of alternating phonemes can be reduced
to one. At a linguistic stage older than Indo-European or Proto-Semitic,
it is possible that each pair of alternating phonemes corresponds
to a single proto-phoneme, but the keys of allophonic distribution that
could validate this hypothesis of reduction to unity elude us for the moment.
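The proposed inventory and the two alternations can be written down as a small data structure, which makes one property of the table easy to check: in each alternating pair the two members share their mode and differ only in place. This is a sketch under stated assumptions. The class `Laryngeal` and the names `INVENTORY`, `ALTERNATIONS` and `differs_only_in_place` are hypothetical, and the feature values are simply read off the rows and columns of the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Laryngeal:
    label: str   # label used in the article, e.g. "H1.b"
    symbol: str  # phonetic symbol as printed in the table
    mode: str    # row of the table
    place: str   # column of the table


INVENTORY = {
    item.label: item
    for item in (
        Laryngeal("H1.b", "H", "voiceless", "pharyngeal"),
        Laryngeal("H1.c", "x", "voiceless", "velar/uvular"),
        Laryngeal("H2.a", "h", "voiced", "glottal"),
        Laryngeal("H2.b", "‘", "voiced", "pharyngeal"),
        Laryngeal("H2.c", "γ", "voiced", "velar/uvular"),
        Laryngeal("H0.a", "?", "glottalized", "glottal"),
        Laryngeal("H3.b", "‘w", "labialized", "pharyngeal"),
        Laryngeal("H3.c", "γw", "labialized", "velar/uvular"),
    )
}

# The two alternations listed in the text.
ALTERNATIONS = [("H2.a", "H2.c"), ("H1.b", "H1.c")]


def differs_only_in_place(first: str, second: str) -> bool:
    """True when both items share their mode but not their place."""
    a, b = INVENTORY[first], INVENTORY[second]
    return a.mode == b.mode and a.place != b.place


if __name__ == "__main__":
    for pair in ALTERNATIONS:
        print(pair, differs_only_in_place(*pair))
```

Both pairs pass the check, which is consistent with the idea that each pair might go back to a single proto-phoneme. The check says nothing about the conditioning of the alternation.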
The structure of the root
The influence of Semitic on the reconstruction of Indo-European does not stop
at phonology. Recall first that Emile Benveniste was born in 1902 in
Aleppo, Syria. In our view, this Near Eastern origin is the direct reason for
his interest in the form of Indo-European roots, and it also explains the
kind of format he proposed. He almost never alludes to the more or less taboo
subject of Semitic, but in practice what he proposes aligns Indo-European
with the consonantal writing of Arabic and the Semitic languages.
His aims are set out in the preface to his book Origines de la formation
des noms en indo-européen:
The essential object of comparative grammar, for some sixty years
[from 1870 to 1935], has been to establish correspondences between the
Indo-European languages and to explain, starting from the state defined
by these correspondences, the development of the attested dialects [= the
Indo-European languages themselves]. [...] Since F. de Saussure's Mémoire, the
problem of the structure of the Indo-European forms themselves has been
almost completely neglected. It seems to be commonly accepted that one can
analyze the evolution of Indo-European without concern for its origins,
that one can understand results without going back to principles. In
fact, one hardly goes beyond simply recording the facts. [...] We have aimed
above all to define structures, alternations, the formal apparatus.
This assessment, written in 1935, of the empiricism of the Indo-European field,
hugging the lexical data, remains largely true in 2007. Further on, he spells out his thinking:
What has been taught so far about the nature and modalities of the root is,
in truth, a motley assemblage of empirical notions, provisional recipes,
and archaic and recent forms, the whole of an irregularity and a
complication that defy ordering. One records monosyllabic (*bher-) or
disyllabic (*gweyə-) roots; biliteral (*dô-), triliteral (*per-),
quadriliteral (*leuk-), quinquiliteral (*sneigwh-) roots; roots with an
internal vowel (*men-) or a diphthong (*peik-); with an initial vowel
(*ar-) or a final vowel (*pô-); with lengthened grade (*sêd-) or zero grade
(*dhək-); with a long diphthong (*srêig-) or a short one (*bheudh-). (p. 147.)
In his book, Benveniste does not act as a comparativist or a reconstructor.
He speaks as a normative grammarian outraged by disorder. The model of
roots that he proposes is the following:
– theme I C1vC2(-C3-) [fa‘(-l-)]
– theme II C1C2vC3- [f‘al-]
Roots of the type *ar are reformatted as *H2er-, which explains why *a can
never alternate with *e in initial position, since they are two allophones of the same
phoneme; roots of the type *dô- are reformatted as *deH3-, and for the same
reason of allophony *ê and *ô cannot alternate.
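The two themes described above can be generated mechanically from a consonantal skeleton and a single vowel. The following is a minimal sketch of that format only. The function name `benveniste_themes` is hypothetical, the default vowel e is an assumption, and the skeleton p-r-k in the example is used purely for illustration.

```python
from typing import Optional, Tuple


def benveniste_themes(
    c1: str, c2: str, c3: Optional[str] = None, vowel: str = "e"
) -> Tuple[str, Optional[str]]:
    """Build theme I (C1vC2(-C3-)) and theme II (C1C2vC3-) from a skeleton.

    Theme II needs a third consonant, so it is None for a two-consonant root.
    """
    theme_1 = c1 + vowel + c2 + (c3 or "")
    theme_2 = (c1 + c2 + vowel + c3) if c3 is not None else None
    return theme_1, theme_2


if __name__ == "__main__":
    # Illustrative three-consonant skeleton.
    print(benveniste_themes("p", "r", "k"))
    # Roots of the type *ar and *dô- as reformatted in the text.
    print(benveniste_themes("H2", "r"))
    print(benveniste_themes("d", "H3"))
```

The last two calls give *H2er- and *deH3-, the reformatted shapes cited in the text. In every output exactly one vowel appears, which is the constraint the model imposes.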
Although he denies it [41], his thinking follows a path that is strictly
the reverse of the one leading from the Phoenician alphabet to the Greek alphabet. In practice,
Benveniste crosses the Mediterranean back in the other direction without saying so.
This definition [of the root] must be understood literally and phonetically,
and not in the sense in which Semitists use it to characterize
only the consonantal scheme of the root. (p. 171.)
Benveniste's theme I is, in Semitic terms, the consonantal root
given the vocalic scheme -v(-Ø-) [fa‘l], and theme II has the scheme -Ø-v-
[f‘al]. Curiously, Benveniste denies a conceptual filiation, which is nevertheless
This can be seen as a precaution so as not to offend comparativists. But
there is also a technical, internal reason. Benveniste does not speak of prosody,
but what distinguishes theme I [fa‘l] from theme II [f‘al] is the place of the accent at
a very early date in ancient Indo-European. The consonantal skeleton of the
Indo-European root admits only one vowel, necessarily accented,
either in the first or in the second syllable.
41. It is for the history of science to understand why.
Benveniste wants to account for two different phenomena at once: the structure
of roots, which consist of consonants only, and the position of the accent, which
conditions the insertion of the only vowel admissible in the root in early
Indo-European. Our view is that pleovocalic (multi-vowel) patterns of the type
fa‘al, fa‘il or fa‘ul are impossible in the oldest stage of Indo-European. Only
fa‘l, f‘al or fə‘la are possible. This feature identifies the genuinely old forms;
the pleovocalic patterns are innovations made by the daughter languages [42].
In this respect, it may be noted that certain Indo-European languages, such as
the Germanic group and English in particular, resist pleovocalism and even tend,
in loanwords, to purge as many full vowels as possible. English perpetuates a
phonological commandment inherited from early Indo-European: the vocalic pattern
of the root shall have one full vowel and only one; the rest is phonetic
lubricant.
Benveniste's theory is the fusion of the Semitic, consonantal reality of the
root with the monovocalic prosody specific to early Indo-European. It proposes to
fit all roots into a single format: a skeleton C1_C2_(C3) + a single vowel.
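The single format described above (a consonant skeleton plus exactly one vowel) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `themes` and the tuple encoding of the skeleton are hypothetical, and the vowel is passed in by the caller rather than derived from any accent rule.

```python
def themes(skeleton, vowel):
    """Build theme I and theme II from a consonant skeleton.

    skeleton: tuple of two or three consonants (C1, C2, optional C3).
    vowel:    the single full vowel the root is allowed to carry.
    """
    c1, c2, *rest = skeleton
    c3 = rest[0] if rest else ""
    theme_1 = c1 + vowel + c2 + c3   # C1vC2(C3)-, Semitic-style fa‘l
    theme_2 = c1 + c2 + vowel + c3   # C1C2v(C3)-, Semitic-style f‘al
    return theme_1, theme_2


# The Semitic template used in the text as a label for both themes.
print(themes(("f", "‘", "l"), "a"))
```

Each theme contains the vowel exactly once; only its slot changes, which is the point the text makes about accent position.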
At the end of this first part, it is clear that both the phonology and the
roots of Indo-European have been modelled on the implicit or explicit example of
the Semitic languages, and this would be even more glaring if Indo-European were
written in the Arabic rather than the Latin alphabet.
Incrementation in Indo-European
We now turn to our own work on Indo-European. By different paths, we have been
led to a development of the theory of the Indo-European root that brings us close
to the Roots and Increments principles proposed by Georges Bohas for Arabic.
Although the theory proposed by Benveniste is an obvious advance, long
familiarity with Indo-European studies convinced us many years ago that a reform
was needed. But it took us a very long time to find an acceptable solution.
Since we are both a (macro-)comparatist and a reconstructor, in our eyes this
type of theory makes it possible to bring fossil morphology to light. That is the
concept through which we arrived at a theory of the Roots and Increments type.
In another field, the official doxa on Arabic advocates a theory of the
three-consonant root. In practice, this doxa prevents one from perceiving the
real structure of the lexicon, which ends up atomized into a myriad of roots with
the same meaning but no supposed link. It seems to us that the starting point of
the refoundation proposed by Georges Bohas for Arabic is first and foremost a
refusal of lexical atomization, a refusal to destructure the lexicon, and the
will to bring out the innervation that runs through and organizes the lexical
corpus of Arabic.
42. Let us make clear, to avoid any ambiguity, that this diagnosis is our own.
Our approach is similar but applied to a different corpus: that of the
Indo-European languages. We share this refusal of the atomization and
destructuring of the lexicon.
The starting point of our questioning, for some twenty years now, has been a
threefold observation and a deep conviction:
– the observation that Pokorny's etymological dictionary of Indo-European
offers multiple entries close in form and in meaning, which would therefore
have to be regarded as so many synonymous "roots", often of rather vague
meaning;
– the observation that these "roots" often have no clearly stated syntactic
class;
– the observation that this dictionary relies on a concept of "extension"
(élargissement) of "roots", which gives the impression that any phoneme of
the language can be an extension and that these "extensions" are motivated
by no feature of meaning and by no consideration of syntactic class;
– the deep conviction that if Indo-European is to be regarded as a real
language, that is, as a possible synchronic state, and not as a convenient
but fictitious lexicographic Meccano set, then it is impossible that
speakers had at their disposal so many synonymous roots of vague meaning,
extendable ad libitum by any phoneme whatsoever without any semantic or
syntactic consideration. It is well known that languages are distinguished
as much by what they allow as by what they do not allow. Here, everything
seems possible.
An example of lexicographic Meccano is *(s)-ker "to cut" in Pokorny
(1959, pp. 938-947). The entry itself sets the tone: (s)ker-, (s)kerə-, (s)kre:-.
Three "basic" forms are announced before any extension or suffixation, with or
without #s-. Leaving aside the semantic coherence of the corpus gathered under
this entry, which is also problematic, the inventory, which is difficult to draw
up, is (at least) the following:
– (s)-ker-, (s)-kr-ek, (s)-ker-s,
– (s)-kr-eH1, (s)-kr-eH1-t, (s)-kr-e-n-H1-d, (s)-kr-eH1-bh, (s)-kr-eH1-m,
– (s)-kr-eH2, (s)-kr-eH2-k,
– (s)-krew, (s)-kreu-d, (s)-keru-p, (s)-kreu-p,
– (s)-ker-d, (s)-ker-t,
– (s)-ker-b, (s)-ker-bh, (s)-kr-eb, (s)-kr-ebh, (s)-ker-p, (s)-kr-ep,
– (s)-ker-i-H1, (s)-kr-iH1, (s)-kr-eH1-y-, (s)-kr-iH1-p, (s)-ker-iH1-bh,
– (s)-kr-ey-d, (s)-kr-ey-t.
In this morphophonological kaleidoscope, one no longer knows whether one is
dealing with vowels, consonants, suffixes, extensions, and so on. We are
convinced that no proto-language can ever have functioned in this way. All of
this lies well beyond the limits of linguistics.
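To make the complaint about unconstrained "extensions" concrete, here is a small Python sketch that tallies which final segments follow the base ker- in a subset of the inventory quoted above. The helper name `final_extensions` and the choice of subset are illustrative assumptions; only the six forms themselves come from the list.

```python
# Six of the forms quoted above from Pokorny's entry, all built on ker-.
FORMS = ["(s)-ker-s", "(s)-ker-d", "(s)-ker-t",
         "(s)-ker-b", "(s)-ker-bh", "(s)-ker-p"]


def final_extensions(forms, base="ker"):
    """Collect the single segment that follows `base` in each form."""
    found = set()
    for form in forms:
        # Drop the optional mobile s- marker before splitting on hyphens.
        if form.startswith("(s)-"):
            form = form[len("(s)-"):]
        segments = form.split("-")
        if len(segments) == 2 and segments[0] == base:
            found.add(segments[1])
    return sorted(found)


print(final_extensions(FORMS))
```

Even on this small subset, six different consonants appear in the same slot with no stated difference in meaning, which is the pattern the text objects to.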
It has always seemed to us necessary to rethink the theory of the root, so
that Indo-European can become a real language. But the road is long between the
first doubts, the intention and the solution.
For quasi-ideological reasons, the comparatists of the nineteenth century
ruled out the possibility that Indo-European could have had infixes or prefixes.
Their prejudices about a supposed hierarchical evolution of languages did not
allow it. Indo-European had to be inflectional and suffixing, because in their
eyes that was the noblest mode of functioning and therefore the one best suited
to the Indo-European proto-language, assumed to be "superior". It was Schleicher
who originated this hierarchical typology of languages. His successors, whatever
the purity of their intentions and of their work, did not call into question
premises of this kind, which nevertheless steered Indo-European studies in the
wrong direction. With time, the scandalous aspects of the Indo-European
(Aryan...) doxa have been erased, but the indirect consequences remain present in
contemporary work.
The only prefix tolerated is the initial mobile #s-, whose presence is too
obvious not to be seen, and the only infix tolerated is -n- with present-tense
value, which is still alive in Latin. We owe to it French rompre "to break" as
against rupture "break" (with and without the nasal). In what follows, we shall
see that everything suggests that Indo-European possessed a system of affixes
comprising prefixes, infixes and suffixes. For the sake of terminological
convergence with the TME, we shall henceforth speak [...]
At the beginning of the twentieth century, Hermann Möller had likewise
concluded that a theory of affixes was necessary. He used the term
(pre-)formative.
Even within the standard, restrictive framework with only #s- and -n-, certain
words remain isolated in classical comparatism. Take the different names of the
beetle in Greek:
Greek kârabos "beetle"
Greek kerambuks "beetle" (infix -n-)
Greek skarabeis "beetle" (prefix #s-) [43]
Surprising as it may seem, especially to anyone closely interested in the TME
for Arabic, these words are not supposed to be related to one another. Yet the
relationship seems obvious to us, even if the internal vocalic patterns are
abnormal from the standpoint of orthodox Indo-European. How can one fail to see
that these words share the same root √ k_r_b?
Our conviction, long held and today stronger than ever, is that the approach
now being applied to the lexicon of Arabic can also be applied to the
Indo-European corpus: namely, identifying roots and increments. Once the
intellectual lock that forbids looking for increments is broken, the theory of
the Indo-European root becomes luminously simple, and a great many lexemes that
are not attached to any root naturally find their place.
43. These words are often held to be non-Indo-European, but the mere fact that
a form with prefixed #s- exists is enough to make it an Indo-European root.
The increments -r- and -l- in Indo-European
Besides initial #s- and the infix -n-, two frequent increments of
Indo-European are -r- and -l-, which are always infixed. The examples are
countless and touch every semantic field. It should be noted that, unlike Arabic,
which tends to insert -r- and -l- in any position, Indo-European is restrictive
and inserts them only in infixal position. Moreover, -r- frequently carries a
pejorative nuance.
A first example is English to speak as against German sprechen. English is
the only Germanic language with no trace of -r- in this root. This point has
already been observed and discussed:
One is inclined to accept the Indo-European loss [44] of r, under certain
conditions, in the initial cluster consonant + r, for example Latin fungor :
fruor, fruges, Gothic brukjan "to need, to make use of"; Sanskrit bhanakti
"to break", Armenian bekanim "I break"; Latin frango, Gothic brika "I
break"; Greek (w)agnumi : (w)râgnumi "I break" : Dutch wrak "wreckage";
Greek poti : proti "beside"; Anglo-Saxon specan "to speak"; Old High German
sprehhan. On this question, see Brugmann, Grdr [45] I2 426 and its
references [46].
Regarding the Greek synonyms poti : proti "beside", Chantraine states in his
dictionary that they do not have "the same origin", a statement unaccompanied by
any argument.
Brugmann is more cautious:
It is less plausible to suppose that *poti of the Indo-European period came
out of *proti than to believe that it came out of *po, and that in usage it
somehow merged with *proti, from which it usurped the other [...]
In the facts, these two words are inseparable, but in the doxa, since -r- does
not have the status of an infix or increment, the only admissible theoretical
stance is to deny on principle that they could have any genetic link. Our
diagnosis is that this stance is untenable given the accumulation of data
showing that incrementation is an extremely frequent phenomenon in [...]
44. The author does not imagine that this could be a case of incrementation.
45. The abbreviation Grdr refers to Brugmann and Delbrück, Grundriss der
vergleichenden Grammatik der indogermanischen Sprachen [outline of the
comparative grammar of the Indo-European languages].
46. Boisacq (1923, p. 8).
47. Brugmann (1905, p. 504, § 612). This example is awkward for him (note the
word "usurped").
Take the following English synonyms, which all mean "to jump, to leap":
– to skip    ME skippen    origin unknown
– to hop     OE hopp-ia    PIE root keu-p
– to leap    OE leap-a     PIE root klou-p
If we analyse #s- as the mobile prefix #s- and -l- as an infix, then these
three supposedly independent forms reduce to a single root √ *k_b with three
forms:
– s-k_b whence skip
– k_b whence hop
– k-l-_b whence leap
The economy of description is obvious. Moreover, this analysis integrates to
skip into a word family, whereas it was previously isolated and of unknown
origin.
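The root-plus-increments analysis of skip, hop and leap can be written out as a tiny generator. This is a sketch only: the function `skeleton` and the lookup table `ATTESTED` are hypothetical names, and the table merely restates the three pairings given in the text; it does not derive the English words by sound law.

```python
def skeleton(c1, c2, prefix="", infix=""):
    """Format a biconsonantal root with an optional prefix and infix.

    The output follows the notation used in the text: s-k_b, k_b, k-l-_b.
    """
    head = f"{prefix}-{c1}" if prefix else c1
    if infix:
        head = f"{head}-{infix}-"
    return f"{head}_{c2}"


# Pairings stated in the text for the root k_b.
ATTESTED = {"s-k_b": "skip", "k_b": "hop", "k-l-_b": "leap"}

for prefix, infix in [("s", ""), ("", ""), ("", "l")]:
    form = skeleton("k", "b", prefix=prefix, infix=infix)
    print(form, "->", ATTESTED[form])
```

One root and two increments cover all three words, which is the descriptive economy claimed above.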
Likewise in Greek, √ *k_p:
kalupt-ô "to cover, to hide"
krupt-ô "to hide"
Likewise in Greek, √ *H2_k:
alek-sô "to protect, to defend"
alkê "defence, protection"
arke-ô "to protect"
Likewise, √ *p_k:
Greek ha-paks < *s_m + *p_k-s "once"
Latin sim-p-l-eks < *s_m + *p-l-_k-s "simple < in one go [48]"
German ein-fach < *oin + *pak "simple < in one go"
Take the word for comb in Slavic: the various forms rest on *greb- with
different suffixes. At first sight these Slavic forms are isolated within the
Indo-European area. Once -r- is recognized as an increment, we recover the root
√ *g_bh, which yields Germanic forms such as English comb and Old Norse kambr,
with a different increment.
Not only are the Indo-European languages related to one another, they are
related to a degree far greater than classical comparatism is able to describe.
Because of a deficient and ill-suited theory of the root, a great many lexemes
are wrongly held to be isolated or non-Indo-European.
Another root, √ *bh_g "to break, to strike":
PIE root bheg "to deal a blow"           Sanskrit bhanj; Armenian bek-anim
PIE root bhlag "to strike repeatedly"    Latin flagere
PIE root bhlîg "to strike repeatedly"    Latin flîgere, flictus
PIE root bhreg "to break"                Latin frangere, fractus (pejorative -r-)
48. Literally, "made in a single fold".
Another, √ *H1_p "to seize, to take in hand":
PIE root Hop "to choose"                          Latin op-târe
PIE root (H)rep "to grab for oneself, to steal"   Latin rap-ax, rap-t (pejorative -r-)
PIE root (H)leup                                  Sanskrit lop-tra "booty"
A more complex example, √ *bh_H2 "to shine, to be white or blond":
PIE root bhaH2 "to shine"               Greek phôs "light"
PIE root bhleH2-k "to shine, white"     English bleach, bleak
PIE root bhleH2-nk "to shine, white"    French blanc
PIE root bhreH-k-t "to shine"           English bright
PIE root bhleH2-nt "to shine, blond"    English blond
PIE root bhleH2-w "to shine, blond"     Latin flavus "reddish blond"
Or √ *k_w "to be very loud, to make noise":
PIE keu-              English shout < ME shoute < *s-kw-d
PIE k-l-eu-eH1        English loud < *hlûd- < *kwH1-tos
PIE k-r-eu(-H2)       musical instrument: Welsh crwth < *kruttâ
Several roots display the following incremental paradigm:
Prefix    Infix        Suffix
#s-       -r-  -l-     -s-
For example, √ *gh_bh "to hold in the hand":
bare root        English to give < OE giba < *ghebh
suffixed root    Latin hab-êo < *ghabh-eH1-
infixed root     English to grab < Dutch grabben < *gh-r-obh
infixed root     English to grasp < *gh-r-obh-s
                 (with the regular metathesis in English)
For example, √ *bh_H "to sprout, to grow; tree [49]; flower":
suffixed root    Greek phêgos "oak", English beech < *bh_H-g
suffixed root    German Baum "tree" < *bh_H-m
infixed root     English birch < *bhrH-g [50]
infixed root     Latin fraxinus "ash" < *bhrH-g-s
infixed root     German Blüte "flower"; Latin flor "flower"
49. The meaning "tree" for the bare root is attested in Uralic: cf. Finnish
puu "tree".
50. This word has been invoked time and again in the exhausting quest for the
original Indo-European homeland. It tells us that there were trees in that
homeland. But we do not really know which ones.
The increment #d < t? or ṭ
The classical theory does not account for the infixes -r- and -l-. Another gap
concerns the prefix #d- [51]. The canonical example is *dakru- "tear" alongside
*akru "tear". To date, comparatists have not brought themselves to posit a
prefix #d-. There are several very clear examples:
akru "tear"                d-akru "tear" (canonical example)
rew "to flow"              d-rew "to flow away"
ar-bor "tree"              d-oru "tree"
nebh "cloud"               d-nephos "darkness" (Greek)
ors "behind, arse"         d-ors "back" (Latin)
eigh "to be sharp"         d-eigh "to prick, to be sharp"
In our view, this increment is represented in Arabic.
We give below a few examples of its use as a prefix.
Arabic baḫar "to smoke; smoke, steam from boiling water" [52]
Arabic ṭabaḫ "to cook"
Arabic bašar "man, humankind"
Arabic ṭabš "men, humankind"
Arabic ba‘ar "to defecate, excrement"
Arabic ṭaba‘ "dirt, grime, mud"
Arabic baqar "to split, to open"
Arabic ṭabaq "to separate two parts at the hinge"
Arabic rašša "to sprinkle, to spray"
Arabic ṭaraša (F. II) "to scatter here and there (by sprinkling, by spraying)"
Arabic radda "to push back, to remove, to ward off"
Arabic ṭarada "to remove, to ward off, to push back, to drive away"
In addition, there is at least one example of infixed -ṭ-.
Root √ ḥ_m "to be angry"
Arabic ḥaṭam (F. V) "to be inflamed with anger"
Arabic muḥṭamir "wrathful"
Arabic ḥamâ "to be angry"
Arabic ḥamâ’ "to be angry"
Arabic ḥamal "to appear angry; to stir up discord and enmity"
Arabic ḥamaq (F. V) "to become angry"
Arabic ḥamar "to flare up with anger"
Arabic ḥamas (F. V) "to make angry"
Arabic ḥamašâ "to irritate"
Arabic ḥamiyat "outburst of anger, onset of anger"
Arabic ḥašmat "anger"
Arabic ḥadam (F. IV) "to flare up with anger"; (F. VIII) "to seethe with anger"
51. This prefix is attested mainly on roots with medio-passive meaning, which
makes it possible to connect this increment with the passive prefix of Berber:
#ṭ-.
52. Cf. PIE bhok "fire" > Latin focus "fire", Armenian bok’ "fire". And Arabic
nabîḫat "match".
The increment -t-
The increment -n- is not the only infix of Indo-European. There are numerous
examples in Greek of an infix -t-. Thus:
ptolemaios "the warlike one" ~ polemos "war"
ptolis "city" ~ polis
ptaiô "to knock against, to overturn" ~ paiô "to beat" [53]
Since the nineteenth century, comparatists themselves have noticed possible
examples, but they have never dared to take the step of positing increments.
Here too, isolated terms can be explained and new roots identified.
Another corpus that is unmanageable in the orthodox approach:
Greek ptelea "elm"
Latin populus "poplar"
Russian topol’ "poplar"
Examination of the Greek lexemes:
pterna "heel" ~ Skt parśnis (classic example noticed since the nineteenth century)
pteron "wing" < *pet "to fly" ~ Latin penna < *petna (here -t- belongs to the root)
ptissô "to pound, to crush" ~ Latin pis-ton ; "pestle" < *pislum ; Latin pinsô
"to pound" (infix -n-)
pto(i)a "fear, terror" ~ Latin pau-ra
ptukhs "fold" ~ German biegen "to bend"
ptûô "to spit, to vomit" ~ Latin spu-ô ; English to spew
ptôkh-eô "to beg" ~ English to beg (these two words are isolated)
kti-zô "to build a house" ~ PIE kei-m, Germanic haim "house, home"
In our view, the increment -t- also exists in Semitic.
We give a few examples below.
Root √ r_‘ "to graze"
Arabic ra‘â "to graze"
Arabic rata‘ (F. IV) "to let graze freely"
Root √ m_š "to milk"
Arabic mašša "to milk a female animal while leaving a little milk in the udder"
Arabic mataš "to squeeze gently the teats of a female animal while milking her"
Root √ l_ḫ "dirty"
Arabic lataḫ "to soil someone with filth"
Arabic laḫḫ "to anoint, to impregnate"
Arabic laḫḫatun (fem.) "dirty, unclean and foul-smelling (woman)"
Hebrew liḫlûḫ "dirt, grime"
Hebrew liḫleḫ "to dirty, to soil"
Root √ h_l "to flow, to pour down abundantly (rain, tears)"
Arabic halla (F. VII) "to be poured in torrents, to pour down (rain)"
Arabic halla (F. VII) "to be bathed in tears (eyes)"
Arabic hatal "to send down showers and downpours at intervals (sky)"
Arabic hatal "to shed abundant tears"
Arabic hatîl "sending down continual rain (sky)"
53. Cf. Meillet: "[...] in paiô, no more than in ptaiô, which cannot be
separated from it [...]", Ernoult and Meillet (1932, p. 708).
Root √ ḥ_m "black"
Arabic ḥutmat "black colour"
Arabic ḥâtim "crow, raven"
Arabic ḥamma "to be black"
Arabic ’aḥamm "black"
Arabic ’aḥtam "black"
Arabic ḥamam "charcoal"
Arabic ḥamḥam "very black"
Arabic ḥamâ (F. XII) "to be black (night, cloud)"
Arabic yaḥmûm "black"
Arabic saḥam "black"
Arabic ḥamâṯat "black blood in the heart" [54]
Root √ ḫ_n "to cut"
Arabic ḫanna "to cut (the trunk of a palm tree)"
Arabic ḫana’ "to cut (the trunk of a palm tree)"
Arabic ḫatan "to truncate, to cut short"
Arabic ḫanaf "to cut (a round fruit)"
Root √ f_q "to open, to split in two"
Arabic fataq "to split, to break, to separate what was joined by a hinge"
Arabic faqqa "to open; to disjoin, to separate in two"
Arabic faqa’ "to split; to separate"
Arabic faqaḥ "to open its eyes for the first time (puppy)"
Arabic faqa‘ (F. VII) "to split; to be split"
Arabic falaq "to split, to cut in two"
Arabic faraq "to split, to cleave in two"
Root √ k_m "to hide, to cover"
Arabic kamma "to cover (with a lid, with a wrapping)"
Arabic katam (F. IV) "to conceal, to hide, to remove from view"
Derived verb forms in Indo-European
From a triliteral base, Arabic builds several derived forms:
Form I      base form            fa‘al
Form II     internal doubling    fa‘‘al
Form III    lengthening          fâ‘al
Form IV     prefixation          ’af‘al
Form I is represented in Indo-European: it is the citation form of each root.
Form II also exists, but it is always the first consonant that is doubled, not
the second.
54. This would be an example of suffixal incrementation by -ṭ.
For example, √ p_l "full, abundant; crowd, multitude":
Latin populus "crowd, people"
English folk "people"
Arabic ḥafla "crowd"
Latin plênus "full" < *pleH1-nos
Greek pimplê-mi "I fill" (-mi: "I", 1st person)
Arabic ḥafil "full"
For reasons that remain to be clarified, form II is often associated with the
past in Indo-European. For example, in Latin: mordêo "I bite", momordi "I bit";
dô "I give", dedi "I gave".
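The two Latin perfects just cited can be reproduced with a one-line rule that copies the first consonant in front of the stem. This is a minimal sketch under loud assumptions: the function name `reduplicate` is hypothetical, the stems ("mord", "d") and the linking vowel are supplied by hand, and no claim is made that this covers Latin reduplication in general.

```python
def reduplicate(stem, vowel):
    """Prefix a copy of the stem's first consonant plus a linking vowel."""
    return stem[0] + vowel + stem


# Perfect ending -i added to the reduplicated stem.
for stem, vowel in [("mord", "o"), ("d", "e")]:
    print(stem, "->", reduplicate(stem, vowel) + "i")
```

The rule doubles the first consonant, not the second, which is the contrast with Arabic form II drawn above.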
Form III, with lengthening, is well represented. It corresponds to the
so-called long vowel grades /e:/ /o:/, whose origin is the lengthening of the
simple vowel /e/ /o/ and not the trace of a laryngeal /e+H/ or /o+H/. The
so-called class VII of Germanic strong verbs rests on form III. The English verb
to bear belonged to this class: I bear < *bêr and I bore < *bôr. The adjective
wet rests on */wêd/ with a long vowel.
Form IV is very interesting. Its existence in Indo-European is denied. The
examples are held to be aberrant formations, often labelled "popular" [55] and
devoid of any bearing on the theory of Indo-European. Another way of pushing the
examples outside the Indo-European perimeter is to treat them as loans from a
substrate. There are excellent examples, particularly in Greek, but not only
there. This formation supplies many names of animals or objects. The
morphological and semantic coherence of this form makes it possible to rule out
chance as an explanation. Moreover, the root of these words is frequently in the
zero grade, which suggests that the prefix was once accented. We quote in the
notes the view of Chantraine in his Dictionnaire étymologique de la langue
grecque. The conventional term for the increment #a- is "prothesis", a word whose
medical origin shows how far Indo-Europeanists regard this increment as a
foreign body.
Greek a-glîs [56] and gelgis "head of garlic" < *√ g_l_y (example of the
alternation gel ~ ?a-g_l)
Greek a-glaFos and glaukos "shining" < *√ g_l_w
Greek a-greiphna [57] "harrow, rake" and grîphos "net" < *√ gh_bh "to seize, to catch"
Greek aielouros < *a-wisel- [58] "wild cat", Latin vison "mink", English weasel
< *√ wis-
Greek a-i-gupios [59] and gups "vulture"
Greek a-mal-os, a-blêkhros and blêkhros "weak, gentle" < *√ m_l
Greek a-nepsios [60] "first cousin" < *√ nep-ot "nephew"
55. From the pen of comparatists, Meillet in particular, the label "popular"
amounts to being cast into the outer darkness of linguistics.
56. Cf. Chantraine: "cannot be separated from gelgis (see that word)".
57. Idem: "the α- being a prothesis not otherwise explained".
58. The final element ouros means "tail".
59. The -i- after #a- would be due to the influence of other bird names
beginning with #ai-
Greek a-pion [61] < *a-pis-on "pear" and Latin pirus < *pisos "pear"
Greek a-sp-is "asp viper" and Latin serpens "snake" < *√ s_p "to crawl, snake"
Greek a-spalaks and spalaks "mole"
Greek a-s-pharangos [62] "gullet, throat" and pharungs "gullet" < *√ bh_r-
Greek a-staphis [63] and staphis "raisin", staphulê "bunch of grapes"
Greek a-stakhus and stakhus "ear of wheat"
Greek a-stralos [64] and Latin sturnus "starling" < *√ st_r-/tr_s- "starling, thrush"
Greek a-tharê "porridge of flour or groats" < *ghrew "groats" [65]
Greek a-nthrênê "bumblebee", tenthrêne "wasp", thrônaks "bumblebee" < *√ dhren
"to buzz"
Greek a-trapos [66] "path" and trapeô "to tread, to walk on"
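For the pairs in this list where the same Greek word is attested with and without the initial vowel, the proposed analysis amounts to a simple check: remove #a- and compare. The sketch below does exactly that. The names `strip_increment`, `unmatched` and `PAIRS` are illustrative, and the hyphenated spellings are the segmentations printed in the list, not dictionary forms.

```python
# Doublets taken from the list above, segmented as printed there.
PAIRS = [
    ("a-spalaks", "spalaks"),
    ("a-staphis", "staphis"),
    ("a-stakhus", "stakhus"),
    ("a-blêkhros", "blêkhros"),
]


def strip_increment(form, prefix="a-"):
    """Remove the prefixed increment if the form carries it."""
    return form[len(prefix):] if form.startswith(prefix) else form


def unmatched(pairs):
    """Return the pairs whose prefixed form does not reduce to the bare form."""
    return [(p, b) for p, b in pairs if strip_increment(p) != b]


print(unmatched(PAIRS))                      # doublets that reduce cleanly
print(unmatched([("a-i-gupios", "gups")]))   # extra -i- needs its own account
```

The second call shows why a form like a-i-gupios needs the separate explanation given in note 59: stripping #a- alone does not yield gups.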
Note that by combining *?a- and *-t-, the following forms are given a coherent
root √ k_H:
Latin a-qui-la "eagle"
Greek a-isa-los < *?a-k-ya-los "merlin (falcon)"
Greek ik-t-înos "kite" (with infix -t-)
Common Germanic *kû-tya "kite"
The variant *?i- of the prefix *?a- is found in other words:
Irish áth "oven" < Celtic *apatinos < *?a- + kwH2-tin-os, from *√ kwH2
"to cook, to burn"
Greek ipnos "oven" < *?i- + kwH2-n-os (without -ti-)
Greek i-erâks, îrêks "falcon" < *√ H1_r "eagle, large bird"
Greek i-gn-uê "hollow of the knee" < *√ g_n(H1) "to beget" [67]
Greek gonu and gnuks "knee"
Greek (dialectal) i-kn-us "ash" < *√ k_n "ash" [68]
Greek i-khthu:s "fish" < *√ gh-dh-uH "fish" < *√ gh_H "to open the mouth,
to gape" [69]
Greek i-skh-ion "hip, neck of the femur" < *i-s_k-snos [70]
60. Cf. Chantraine: "The initial α- shows the usual ambiguity, but seems to
have to be interpreted as a prothesis (or a ə2)."
61. Idem: "A stem *piso must be assumed; the initial α causes difficulty, as
often (prothesis?)."
62. Id.: "The precise meaning of the word would lead one to connect it with
pharungs."
63. Id.: "The stem recalls that of staphulê 'bunch of grapes'. The form with
initial α- seems the older. Is it a prothesis?"
64. Id.: "With prothesis, it is obviously close to OHG stara, Lat. sturnus."
65. Assuming a metathesis *ghrew > *ghwer > θer.
66. Cf. Chantraine: "An α- copulative must be assumed, together with the root
found in trapeô 'to tread' [...] it is the trodden track."
67. This word ignus is generally considered to come from *en+g_n- by
prefixation. However, this hypothesis raises phonetic problems, since one would
expect **innus rather than ignus.
68. The normal form in Greek is konia.
69. This Greek word is extraordinary morphologically: increment *-dh- and
vocalic pattern *i_u. The other attestations lack the increment but have the
vowel *u.
70. This word has long been compared with the Sanskrit word sak-thi "thigh",
but the orthodox theory does not allow these two words to be connected
satisfactorily.
Compare also the following words:
Greek i-orkos "roe deer" (increment #i-)
Greek d-orkas "roe deer" (increment #d-)
English roe < Germanic *roik-os "roe deer" (infix -i-)
Celtic *y-ork-os "roe deer" (increment #i-)
These words are probably related to the root *√ H1_r "horned animal", which
also yields the Latin word ariês "ram".
The existence of this increment in Germanic, Italic and Celtic, alongside
Greek, allows this morphological process to be carried back to the oldest stage
of Indo-European. One cannot say with definitive certainty whether #a- (*H2e-)
goes back to *?a-, because of the multiplicity of phonetic values of H2, but it
is a priori the simplest and most fruitful hypothesis. It is the one that we [...]
Peter Schrijver has looked at several lexemes denoting, among other things,
birds in the languages of Western Europe. This corpus brings out the existence of
a prefix #a-. The usual explanation, according to which this reflects a
non-Indo-European substrate, seems to us unfounded and above all unnecessary. Our
view is that these lexemes are fully Indo-European and constitute further
examples of form IV. Once again, an adequate theory of fossil morphology makes it
possible to give meaning to the data, to organize them and to simplify the
hypotheses, in short to rationalize.
*mes_l- ~ *a-m_sl- "blackbird"
Welsh mwyalch
Latin merula > French merle
Old High German amsla, amasla, amisla, amusla
Old English o:sle
*laHw- ~ *a-laHw- "lark"
Old English la:verce > lark
Old High German le:rahha, le:rihha
Middle Dutch leewerke
Latin (of Gaulish origin) alauda
Loan into Finnish (Uralic) leivo(nen)
*raud- ~ *a-rud- "(lump of) ore"
Latin raudus "lump of ore"
Old High German aruz, ariz
Old English arut
Loan into Finnish (Uralic) rauta
Loan into Lapp (Uralic) ruowde
Conclusions and outlook
The Indo-European family is an autonomous invention going back to the
Renaissance. Since then, every significant advance in the study and theorization
of this family has been achieved whenever the linguistic modelling of
Indo-European moved closer to the reality of Semitic and Arabic. The phonological
system with laryngeals imagined by Saussure, then developed by Möller and Cuny,
and Benveniste's theory of the root are shaped on the Semitic model, almost
explicitly so.
The proto-lexicon of Indo-European, as reconstructed by comparatists, is
atomized into a myriad of roots close in form and in meaning. Numerous examples
show that the lexicon has an organization of its own which the official doxa has
failed to bring out. The only way out of this impasse is to apply a method of
analysis identical to the one Georges Bohas proposes for Arabic: identifying
roots and increments. Our analyses show that the increments valid for
Indo-European are the same as those identified in Semitic, starting with -r- and
-l-, and also -n- or #?a-. Conversely, new increments are found in Arabic, -t-
and -ṭ-, which the TME has not proposed but of which the Indo-European corpus
offers limpid examples. Searching Indo-European for the equivalent of the derived
verb forms of Arabic makes it possible to give the so-called vocalic prothesis
#a- a real linguistic status, since it corresponds to form IV.
As for the question of the relationship of Indo-European to Uralic or to
Hamito-Semitic, we are convinced that the Semitizing tropism of Indo-European
implicitly contains the answer. We have examined Uralic closely, and one will
look in vain for prefixes and for the infixes -r- and -l- in that family. The
Roots and Increments approach is equally valid and fruitful in the Uralic domain,
but it does not yield the same results as in Indo-European and in Semitic in
terms of fossil morphology. That was not the subject we wished to address here.
It should be noted that, in the comparative field, there is considerable
resistance to certain obvious facts. It took a century for the Germans to take
account of Saussure's studies dating back to 1879. Seebold, in the 1970s, was the
first to use the laryngealist formalism, three generations after Saussure, two
generations after the discovery of Hittite and one generation after Benveniste.
Let us express the wish that this time the adoption of the Roots and Increments
approach in the Indo-European field will be quicker.
BEEKES Robert, 1995, Comparative Indo-European Linguistics. An Introduction, Amsterdam,
John Benjamins.
BENVENISTE Emile, 1935, Origines de la formation des noms en indo-européen, Paris, Maisonneuve.
BOHAS Georges, 1997, Matrices, étymons, racines, Leuven, Peeters.
BOISACQ Emile, 1923, Dictionnaire étymologique de la langue grecque, Paris, Klincksieck.
BRUGMANN Karl, 1905, Abrégé de grammaire comparée des langues indo-européennes, Paris,
CHANTRAINE Pierre, 1983, Dictionnaire étymologique de la langue grecque, Paris, Klincksieck.
COLLINGE Neville E., 1985, The Laws of Indo-European, Amsterdam, John Benjamins.
ERNOULT et MEILLET, 1932, Dictionnaire étymologique de la langue latine, Paris, Klincksieck.
GRIMM Jakob, 1848, Geschichte der Deutschen Sprache, Leipzig, Herzel.
KIMBALL Sara E., 1999, Hittite Historical Phonology, Innsbrucker Beiträge zur Sprachwissenschaft,
n° 95, Innsbruck.
KURYŁOWICZ Jerzy, 1956, L’apophonie en indo-européen, Wrocław, Ossolineum.
LEHMANN Winfred P., 1955, Proto-Indo-European Phonology, Austin, Univ. of Texas Press.
LEIBNIZ G. W., 1990, Nouveaux essais sur l’entendement humain, Paris, Flammarion.
MALLORY James, 1997a, A la recherche des Indo-Européens, Paris, Le Seuil.
–, 1997b, Encyclopedia of Indo-European Culture, Fitzroy Dearborn Publishers.
MARTINET André, 1986, Des steppes aux océans, Paris, Payot.
MEIER-BRUEGGER Michael, 2002, Indogermanische Sprachwissenschaft, Berlin, Walter de Gruyter.
MELCHERT H. Craig, 1994, Anatolian Historical Phonology, Leiden Studies in Indo-European.
MÖLLER Hermann, 1906, Semitisch und Indogermanisch, Copenhague, H. Hagerup.
MUKHERJEE S. N., 1968, Sir William Jones, Cambridge University Press.
PARSONS James, 1767, The Remains of Japhet, Being Historical Enquiries into the Affinity and
Origins of the European Languages, Londres.
POKORNY Julius, 1959, Indo-Germanisches etymologisches Wörterbuch (IEW), Berne, Francke
PUHVEL Jaan, 1984, Hittite Etymological Dictionary, Berlin, Mouton Publishers.
SALMONS Joseph C., 1993. The Glottalic Theory. Survey and Synthesis, Journal of Indo-European
Studies, Monograph Series, n° 10, McLean, Institute for the Study of Man.
SCHRIJVER Peter, 1991, The Reflexes of the Proto-Indo-European Laryngeals in Latin, Amsterdam,
SAUSSURE Ferdinand de, 1870, Mémoire sur le système primitif des voyelles dans les langues indoeuropéennes,
Leipzig, Teubner.
SCHLEGEL Friedrich, 1808, Ueber die Sprache und die Weisheit der Indier, Heidelberg, Mohr und
SERGENT Bernard, 1995, Les Indo-Européens, Paris, Payot.
SZEMERENYI Oswald, 1973, La théorie des laryngales de Saussure à Kuryłowicz et à Benveniste,
BSL, n° 68.
2/Criticism of Altaic
For similar criticism, see the free paper below.

Below is some interesting material from the web about Altaic:

among the pronominal systems of Turkic, Mongolic,
Tungusic, IE, Kartvelian, etc., are due to their
all coming from the same place but not to their being
genealogically related. The mail I got focused on
the question of whether a language can borrow its
pronouns from another. The answer is apparently yes,
since there are well documented cases (such as Copper
Island Aleut from Russian, or Shelta which has both
Irish and English pronouns).

However, my point is not that borrowing is impossible
(although it is rare). My point is that borrowing has
to be demonstrated, like anything else (although Eric
Hamp has argued that borrowing is a weaker hypothesis
than relatedness because it is harder to refute). It
cannot simply be asserted, as some have done. And the
situation with the Altaic pronouns (Turkic, Mongolic,
Tungusic) is parallel to that in IE. It is just as
easy to say that the similarities between Germanic and
Indo-Aryan and Slavic, say, are due to borrowing as it
is in the Altaic case. It is just that no one would
DARE to say such a thing.

Which brings me briefly to the political issues. It is
very tempting for someone who did not live in Russia
during the Soviet period to laugh off the remarks made
by Prof. Vovin, but I think it is very true that
political analogies such as the ones he drew were
unfortunately very real. And we must remember that
Soviet Russia was where much of the fight over
Altaic took place. Clauson, a Britisher, and Rona-Tas,
a Hungarian, published their early attacks on Altaic
in a Russian journal, for example. Moreover, it seems
to me that a kind of political analysis is applicable
to the situation in the West too. We must realize that
the acceptance or rejection of an idea is not a pure
ethereal intellectual act, but a political one within
the politics of academia. Thus, it is not irrelevant
I think that in the case of Indo-European there is a very
large (numerically) body of linguists in the strict sense
of the term, whereas in the case of many other language
groups (including all the Altaic ones) there have
traditionally been a few linguists (sometimes one or maybe
two in a given country) and a whole lot of philologists
or historians. The audiences at the conferences, the reviewers
in the journals, one's colleagues in a university dept.
all make a great deal of difference, and these have been
so very different for the Altaic comparative linguist as
opposed to the Indo-Europeanist. I have no doubt that it
made a difference that people who were largely untrained
in linguistics and not very interested in its goals were
the dominant group in Altaic studies, as indeed were some
of the leading critics of Altaic. Sinor describes himself
somewhere as a nonlinguist adding that linguists will
breathe a sigh of relief to hear him say that. Now, he
was actually more of a linguist than he admitted, but certainly
not a linguist in the same sense as Poppe or Starostin or Vovin.

Finally, at present, it seems to me that politics has much to
do with attitudes towards Altaic as well. In particular,
it seems that Altaic sometimes gets knocked by people who are
really gunning after Nostratic (and after Greenberg). It is
also a political fact (or a social one, or whatever) that
there is no forum in our field (except perhaps for LINGUIST)
where such a discussion as we have just had is possible. Thus,
it is perfectly easy to disseminate harsh criticism and mis-
information about a theory like Altaic so that many linguists
who do not work in comparative ling or on these languages will
hear of and assume the worst, but there is no journal which
reaches such an audience which would be a forum for a discussion
in which the issues can be aired and both sides can be heard.
There should be (much as our friends in anthropology have one)
but there isn't. That's too bad, but at least we have LINGUIST.
Message 2: Re: 5.929 Altaic
Date: Tue, 30 Aug 94 16:55:49 ES
Subject: Re: 5.929 Altaic

Reinhard Hahn is certainly right that "*everyone* who does not reject the AH
definitely considers Turkic, Mongolic and Tungusic to be Altaic" and "not
*everyone* is totally convinced that Korean and Japanese are related to them
or are closely enough related to them to be called 'Altaic'". I had no
intention of misinterpreting Reinhard Hahn's position, and if I produced such
an impression, I beg him to accept my sincere apologies. All I wanted to say
is that this *everyone* / *not everyone* position stands on a very shaky
basis. Let me develop this a little further. First of all, as far as I know,
the majority of us who now do research on comparative Altaic proper, and not
on separate Altaic languages, and who have actively published on the subject
during the last ten years, accept the idea of a five-member Altaic family.
Sergei Starostin in his book made very solid lexicostatistical arguments in
favour of including Japanese and Korean on the same level of relationship.
There are also considerations of a phonetic nature (there are no common
phonetic innovations and/or archaisms in the "classic" Altaic triad as
opposed to Japanese and Korean, or vice versa) and of a morphological nature
(Japanese, Korean and M-Tungus share more common morphological markers than
any of them shares with Turkic or Mongolian). On the other hand, I have
never heard any solid arguments supporting the exclusion of Japanese and
Korean from Altaic; it seems to me that it is rather based on circulating
rumours, like those described previously regarding the very nature of Altaic
itself. If anyone can provide such arguments, I will be very willing to hear
them and to discuss them. Otherwise, I believe, the outdated concept of
"Classic" Altaic with only three members, as it was held in the sixties and
seventies, should not be represented as the majority's point of view. There
is, I believe, a scholarship problem, too. While it is enough to have a
reading knowledge of English, German, and Russian (and more recently
Chinese, too) in order to get access to Turkic, Mongolian, or M-Tungus
materials, a specialist in Turkic, Mongolian, or M-Tungus unfortunately had
no access to reconstructed Japanese data until S. Martin's fundamental "The
Japanese Language through Time" appeared in 1987, unless s/he knew the way
through piles of writing in modern and Classical Japanese. Using even
Kenkyuusya's Japanese-English dictionary would be as fruitful for
comparative Altaic purposes as using modern Kazakh alone to represent
Proto-Turkic. The same situation exists even today with Korean data. Though
Martin's excellent "Korean Reference Grammar" introduces Middle Korean data
to a scholar who cannot read Korean, these data are still quite far from a
Proto-Korean reconstruction. On the other hand, on the Japanese-Korean side
of Altaic, we find a very similar situation. First of all, only a few
scholars here are interested in Altaic as such, and many of the latter
cannot read "Sravnitel'nyi slovar' tunguso-man'zhurskikh iazykov", which is
the main source on comparative Manchu-Tungus. Comparing Japanese and Korean
with Manchu alone, which is the most accessible language for colleagues who
do not read Russian, will not produce even remotely the same results as
comparison with Proto-Manchu-Tungus, with a reconstruction based on all
available sources. This situation does create a little correlation between
the specialists, and "Inner" and "Outer" Altaic reflect not a linguistic
situation but a major division between specialists.
Let me now briefly dwell on why I believe that anti-Altaicist methodology
contradicts the comparative method. I believe that some of what follows was
already mentioned before by N. N. Poppe and R. A.
1) In his attack on Miller, Doerfer accuses the former of comparing
Japanese not with Altaic but with different Altaic languages. This accusation
does not make sense from the standpoint of a comparativist: an
Indo-Europeanist compares, let's say, a certain Greek form not with an IE
reconstruction but with cognate forms in various IE languages. In the same
way, a Uralist compares a Finnish word with its cognates in other Uralic
languages, not with a PU reconstruction. And so on.
2) Another odd methodological cornerstone of anti-Altaistics is the
thesis that one cannot compare more than two languages simultaneously. My
question "why?", addressed to one of them, was answered "because I think so".
Indo-Europeanists, Uralists, and other specialists in comparative
linguistics usually earn their bread by comparing many languages
simultaneously, and nobody finds it strange.
3) Anti-Altaicists, if we scrutinize their arguments attentively, tend
to reject certain examples ad hoc if they do not look alike. Therefore, the
principle of regular correspondences is completely replaced by a search for
look-alikes. Often one can hear something like: "how can you seriously
compare J isi "stone" and Turkish tash (sh stands for a hushing sibilant,
approximately like English sh in shame)?" Well, I can, since there are
regular correspondences between the two.
4) Anti-Altaicists, with their theory of omnipotent loanwords, tend to
disregard the simple fact that very often their proposals do not make any
sense from the historical point of view. Such is, for example, Doerfer's
claim that Manchu-Tungus borrowed its word *moo "tree" from Mongolian
modun. Taking into consideration the traditional habitat of the peoples
involved, the M-Tungus word has the same chance of having been borrowed from
Mongolian as from Martian, not to mention all the phonetic difficulties which
arise with such an interpretation.
5) This anti-historical approach further manifests itself in, for
example, such historically irresponsible statements as "when ancestors of
Chuvash lived in Siberia near Mongols". We have zero evidence supporting
this statement, but it is certainly necessary to place Chuvash (the
westernmost Turkic language) near the Mongols in order to justify the
anti-Altaistic interpretation of zetacism and lambdaism.
Regarding the terminology, I find Reinhard Hahn's additional arguments
for Mongolic quite acceptable. Xalxa, Chaxar, Ordos etc., however, may
also be called Central Mongolian (or East Mongolian, whichever one prefers),
as opposed to West Mongolian (Oirat & Kalmyk), North Mongolian (Buriat) and
the Mongolian Outliers (Dahur, Dongxiang, Baoan, Monguor, and Moghol). I
agree, however, that his proposal is more logical. I would, however, strongly
insist on preserving Manchu-Tungus rather than Tungusic, for the following
two reasons: 1) Manchu is farther from the Tungusic languages than Chuvash is
from Turkic (at least on the basis of lexicostatistical results); 2) people
are less aware (for the time being) that Manchu is an early offshoot from the
rest of the group than they are of the similar situation with Chuvash: there
are attempts to group Manchu together with Nanai and other Tungusic languages
of the Primor'e region, which are traditionally called "South Tungusic", but
to the best of my knowledge all classifications of Turkic except one very
confusing one by Baskakov classify Chuvash as a language standing quite
separately.
Alexander Vovin
Miami University

The part below could explain the similarities between Indo-European and Altaic (as well as Uralic and Afro-Asiatic) despite their respective urheimats being so distant (both in time and in space) and their genetic ancestry being so diverse: those pronouns were borrowed from Indo-European into Altaic and Uralic. Primitive languages often do not need pronouns (or are not complex enough to need them), and the first attested text in an Altaic or Uralic language is very recent (the Old Turkic Orhon inscriptions, 8th century). Moreover, those pronouns lack a Uralic or Altaic etymology, while PIE *ego ('I', nominative) could be connected with PIE *kwo 'who' (Afro-Asiatic ku/ki 'who, which'; see also Akkadian anaku 'I', Egyptian ink 'I', Somali anogi 'I' and Berber nek 'I'), and PIE *eme (oblique case of 'I') could come from Lislakh mV 'what' or Lislakh mn 'man'.
The mail I got focused on
the question of whether a language can borrow its
pronouns from another. The answer is apparently yes,
since there are well documented cases (such as Copper
Island Aleut from Russian, or Shelta which has both
Irish and English pronouns).
3/Criticism of Afro-Asiatic

In this paper, Fleming included information about his
unpublished computations, which «indicate that Omotic languages
never achieve more than 5% of shared retentions on
the short Swadesh list when they are compared with other
Afroasiatic languages outside Cushitic.» The percentage of
«shared retentions» is not higher than the accidental similarity
expected between any two unrelated languages, which is usually
estimated at 4%–5%, or even 7% (Campbell 1997: 229,
405). This indicates that there is no genetic relationship between
OM and AA.

From the paper below
Is Omotic Afroasiatic?
A Critical Discussion.
Rolf Theil
Department of Linguistics and Scandinavian Studies
University of Oslo, Norway
1 Introduction
Omotic, a group of 25–30 languages spoken in southwestern
Ethiopia, is regarded as a family whose interior classification
is presented in Table 1. The three main branches, South
Omotic, North Omotic, and Mao, are very distantly related.
Table 1: The branches of the Omotic language family (Hayward 2003)
Branch / subgroup                   Languages
South Omotic                        Hamar, Aari, Dime
North Omotic
  Dizoid                            Dizi, Sheko, Nayi
  Gonga                             Kafa, Shakicho (Mocha), Shinasha, Anfillo
  Gimira                            Bench, She
  Ometo-C'ara                       C'ara
  North Ometo                       Wolaitta, Gamo, Gofa, Dawro, Malo, Basketo, Oyda
  East Ometo                        Zayse, Zargulla, Harro and other lacustrine varieties, Koorete
  South Ometo                       Maale
  Yem (earlier known as 'Janjero')  Yem
Mao                                 Mao of Begi, Mao of Bambeshi, Diddesa
OM(otic)¹ is generally regarded as a branch of Afroasiatic. This
paper is a discussion of the arguments for this AA affiliation,
the OM Theory (Lamberti 1991). I claim to show that no convincing
arguments have been presented, and that OM should
be regarded as an independent language family. No closer
genetic relations have been demonstrated between OM and AA
than between OM and any other language family.
¹ Cf. list of abbreviations at the end of the paper.
2 Joseph H. Greenberg
Greenberg (1963) divided the languages of Africa into 4 families,
Niger-Kordofanian, AA, Nilo-Saharan, and Khoisan. He
divided AA into 5 branches, SE(mitic), EG(yptian), BE(rber),
CH(adic), and CU(shitic), and CU into 5 subbranches, North,
Central, East, West, and South CU. WCU corresponded to OM.
Greenberg's (1963) classification of African languages was
primarily based on mass comparison, a method described by
Campbell (1997: 210) as being based on looking at –
«many languages across a few words» rather than «at a few languages
across many words» ([Greenberg] 1987: 23), where the lexical similarity
shared «across many languages» alone is taken as evidence of genetic
relationship, with no methodological considerations deemed relevant.
A few lines later, Campbell adds that the resemblances –
detected in mass comparison must still be investigated to determine
whether they are due to inheritance from a common ancestor or
whether they result from borrowing, accident, onomatopoeia, sound
symbolism, or nursery formations … Since Greenberg’s application of
his method does not take this necessary next step, the results frequently
have proven erroneous or at best highly controversial.
Greenberg (1963) does not discuss WCU explicitly. As pointed
out by Fleming (1974), for several generations, CU had been
accepted by most scholars as a branch of AA. However, the
WCU languages «gained their membership in [AA] from a presumed
kinship with the proper Cushites.» Chapter III Afroasiatic
in Greenberg (1963) is an attempt to prove that CH is a
branch of AA, and an AA Comparative Word List is presented,
with 78 CH words claimed to have cognates in other branches
of AA. There are 14 different WCU words in the list.
3 Fleming (1969)
Fleming (1969) reclassified WCU as a sixth branch of AA –
Aari-Kafa (A-K). He used what he regarded as two methods,
lexicostatistics and grammatical comparison.
Lexicostatistics, developed by Morris Swadesh, involves
measuring the percentage of words with similar sound and
meaning in different languages, on the basis of lists of basic
vocabulary. Words with similar sound and meaning are called
cognates. The larger the percentage of cognates, the closer the
languages being compared are presumed to be related.
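The measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the True/False/None coding of judgements and the example data are all hypothetical and come from no published word list.

```python
def cognate_percentage(judgements):
    """Return the percentage of meanings judged cognate.

    `judgements` maps each meaning on the word list to True (the two
    languages' words are judged cognate), False (not cognate), or None
    (a word is missing in one language, so the meaning is skipped).
    """
    scored = [v for v in judgements.values() if v is not None]
    if not scored:
        raise ValueError("no comparable meanings")
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(scored) / len(scored)


# Hypothetical judgements for two unnamed languages (illustration only).
example = {
    "I": True, "you": True, "water": False, "stone": False,
    "eye": True, "dog": False, "blood": False, "tooth": None,
}
print(f"{cognate_percentage(example):.1f}% of comparable meanings judged cognate")
```

The figure depends entirely on how the individual judgements were made, which is why a percentage reported without the underlying word lists cannot be checked.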
Fleming's lexicostatistical argumentation has this structure:
(1) CU(shitic) is more internally differentiated than other
branches of AA; about 12% of cognates are found between
the (non-A-K) branches of CU. (2) Between A-K and the
branches of CU, the percentage of cognates falls below 10%,
which is the same level as that pertaining between families of
(non-A-K) AA. (3) Therefore, A-K is a branch of AA, not of CU.
Lamberti (1991) reminds us of the fact that Fleming adduces
no evidence but the result of his lexical statistical test,
and the data used during the enquiry has remained unknown.
Still, the OM Theory was accepted by some scholars of
African linguistics.
Fleming presented some morphological features that he
regarded as typically CU, and that were absent from A-K. A-K
either lacks gender or uses different indicators than CU m. k /
f. t.; there is no over-all correspondence in the pronominal
system between A-K and CU, except 1pl n. He added two typological
features: A-K verb roots are commonly monosyllabic
and more rigid than CU roots, and the characteristic conjugational
patterns of ordinary CU are absent.
Fleming's lexicostatistical comparisons are of little value,
since no lexical data are presented. No conclusions can be
drawn about the status of A-K. The morphological differences
pointed out between A-K and CU are differences between A-K
and all the other branches of AA. The morphological data
indicate a genetic relationship with neither CU nor AA. Fleming's
typological arguments are irrelevant; there are often
typological differences between closely related languages.
4 Fleming (1974)
Fleming (1974) replaced the name Aari-Kafa with Omotic,
«after the most prominent geographical feature of their region
– the Omo river basin.»
In this paper, Fleming included information about his
unpublished computations, which «indicate that Omotic languages
never achieve more than 5% of shared retentions on
the short Swadesh list when they are compared with other
Afroasiatic languages outside Cushitic.» The percentage of
«shared retentions» is not higher than the accidental similarity
expected between any two unrelated languages, which is usually
estimated at 4%–5%, or even 7% (Campbell 1997: 229,
405). This indicates that there is no genetic relationship between
OM and AA.
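To see why a figure of about 5% carries little weight, one can ask how often chance alone would produce at least that many matches. The sketch below assumes a 100-item list and treats each item as an independent trial with the chance rates quoted above; both the list length and the independence are simplifying assumptions of this illustration, not claims made in the paper.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items = 100  # assumed length of the word list
observed = 5   # 5% of 100 items

# Chance rates taken from the 4%-5% (or even 7%) range quoted above.
for chance_rate in (0.04, 0.05, 0.07):
    p = prob_at_least(observed, n_items, chance_rate)
    print(f"chance rate {chance_rate:.0%}: P(at least {observed} matches) = {p:.3f}")
```

Under these assumptions, five or more matches is an unremarkable outcome for unrelated languages.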
Fleming presented what he regarded as two methods to
support the OM Theory: morphological and lexical comparison.
However, these are not two methods, but mass comparison
applied to lexical and grammatical morphemes, respectively.
From a comparative point of view, the main difference
between lexical and grammatical morphemes is that the latter
tend to consist of fewer phonemes than lexical morphemes.
The shorter a morpheme, the higher the probability of finding
accidental similarities, and Fleming's morphological comparisons
are therefore even less reliable than his lexical comparisons.
As pointed out by Meillet (1967: 53), a «comparison
which rests solely on one or even two root consonants is
without value if it is not supported by very specific facts.»
This is true for grammatical as well as lexical morphemes.
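The effect of morpheme length on accidental similarity can be made concrete with a toy model. It assumes that each consonant is drawn independently from a fixed inventory; the 20-consonant inventory and both frequency distributions below are hypothetical and stand in for no particular language.

```python
def chance_match(freqs, length):
    """Probability that two independently drawn consonant strings of the
    given length are identical, when each consonant is drawn with the
    probabilities in `freqs` (which must sum to 1)."""
    per_slot = sum(f * f for f in freqs.values())
    return per_slot ** length


# Hypothetical inventory: 4 coronals plus 16 other consonants.
coronals = ["t", "s", "z", "n"]
others = [f"c{i}" for i in range(16)]

# All 20 consonants equally frequent.
uniform = {c: 1 / 20 for c in coronals + others}
# Coronals account for 60% of all consonant tokens.
skewed = {**{c: 0.15 for c in coronals}, **{c: 0.025 for c in others}}

for length in (1, 2, 3):
    print(length, chance_match(uniform, length), chance_match(skewed, length))
```

In this model every extra consonant multiplies the chance of a full match by the per-slot probability, so one-consonant morphemes match far more readily than three-consonant roots, and an inventory dominated by a few frequent consonants raises the chance further.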
4.1 Fleming's (1974) morphological comparisons
Fleming's (1974) grammatical morphemes with alleged cognates
in (other branches of) AA are presented below. Language
names are changed in accordance with Table 1. Data
from different branches of AA are separated by a dot, •.
I. CAUSATIVE -s. «Almost universal.»
II. PLURAL -n~-na in SOM AAR; *-ti; partial reduplication and change of
stem vowel.
III. GENITIVE CONNECTOR -n~-ni in NOM YE, «rare elsewhere»; -t~-ti in
SOM AAR, «rare elsewhere».
IV. CASE Acc. -m SOM /-n NOM; dat. -n SOM / -s NOM • «The /n m/
accusative is found in Semitic.»
V. MASCULINE/FEMININE Acoustically flat/sharp vowels, cf. KA m. -o / f.
-e. • «The «flat/sharp» contrast is also found widely in AA, often
associated with k/t.»
VI. FEMININE -n and n+V occur in nouns in SOM and in verbs in NOM. •
Fem. -n occurs in verbs in SE UG. • «[P]lural markers in /n/ in [MEG]
were analyzed by Gardiner [1957: 85-87] as "really pronouns" of a
neutral character which had been feminine in older stages of [AEG]. So
feminine in /n/ may also be a very archaic AA trait preserved in [OM].»
VII. 3RD PERSON PRONOMINAL BASE is-~us-~uz-~b- in NOM, «most of
which have contacts in [AA].»
VIII. 1PL PRONOUN no: (na) «almost everywhere»; «its link to [AA] is clear.»
IX. 1SG PRONOUN i- ‘my’, in ‘me’ in SOM, and perhaps some other SOM
languages • «[U]sed by Greenberg to show [CH] links to [SE].»
X. VERBAL PERSON SUFFIXES. 1sg -it, 2sg -n, 3sg Ø, 1pl -ot, 2pl -ɛt, 3pl -ɛk
– «rests heavily on Galila [dialect of AAR] which is the only SOM language
with a proper paradigm of person marking inflections. But SOM
DI has enough left of an earlier paradigm to make it plausible.»
In most cases Fleming mentions no data from other branches
of AA. No attempt is –
made to specify the grammatical morphemes in the various families of
[AA]. It is presumed that the reader knows about the common particles
of [AA] or some of its sub-divisions or that he can easily obtain
Greenberg’s famous article on [AA] [ch. III of Greenberg (1963)] which
remains the template for phylum-wide comparisons in [AA] studies.
No systematic phonological comparisons are made between
grammatical morphemes in OM and (other branches of) AA.
This weakens Fleming's argumentation.
Fleming lists grammatical morphemes that occur in one
or just a few OM language(s), without telling why they should
be regarded as retentions from POM, e.g.:
(i) -n~-na 'plural' occurs in SOM AAR. Pl. formations vary
within and among OM languages, e.g. NOM ML uses gemination
of the stem final consonant, or the suffixes -atsi and -att-
(Azeb 2001); NOM KA uses -na'ó (my field notes); KO uses -ita
(my field notes). (Pluralization through partial reduplication
and change of stem vowel are typological features, and
therefore irrelevant.)
(ii) NOM YE has the genitive connector -n~-ni, which is
«rare elsewhere».
(iii) i- ‘my’, in ‘me’ in SOM AAR and «perhaps some other
[SOM] languages».
(iv) The reconstructed verbal suffixes rest «heavily on
Galila [dialect of AAR].» The SOM reconstructions differ from
most verbal person suffixes in a NOM language like KA (my
field notes), as shown in Table 2.
Table 2: Verbal person suffixes. Fleming's SOM compared to Kafa
              Singular               Plural
              SOM     Kafa           SOM     Kafa
1st person    -it     -Ø             -ot     -on
2nd person    -n      -in            -ɛt     -otee
3rd person    -Ø      -e m, -an f    -ɛk     -eetee
One of Fleming's explicit comparisons with other branches
of AA is farfetched. An etymological relationship is proposed
between OM m. -o / f. -e and AA m. k / f. t, because -o and k
are acoustically flat, while -e and t are acoustically sharp. The
relationship is not accounted for historically.
Most OM morphemes claimed by Fleming to have AA
cognates consist of a coronal consonant (t s z n), either alone
or with a vowel that plays no role in the comparison. Coronals
are among the most frequent consonants in grammatical
morphemes in the languages of the world, and accidental
similarities between unrelated languages are easy to find.
4.2 Fleming's lexical comparisons
Below follows a summary of Fleming's (1974) presentation of
21 OM words with alleged AA cognates.
1. ALL. POM *kull «might be proposed»; the reconstruction is based on a
PSOM reconstruction *kull (cf. DI kʊll, HM, KR wull) and NOM forms KA, SH
bulli «but the correspondence is not confirmed» • SE UG kl, AM hullu.
2. ASHES. POM *b-nd- • CH Gabri búndu • CU OR ibid-da 'fire'.
3. BLOOD. PSOM *zumB/dzumʔ • BE i-damm-ən «(from [SE]?)» • CH Maha
dom, Bachama zambe • SE *dmm.
4. BONE. POM *k'us • BE i-xs, «said to be from *i-ḳs» • CH HM k'aši • CU
GL Gəs 'foot' • AEG ḳs.
5. BRIGHT, SHINY. OM DI Bɛlxən; SH p'arik' 'lighten, flash' • CH Batta
Garua baratje 'lightning' • CU KH birqa: 'lightning' • AEG brq 'to shine' • SE
HE bɔraq 'lightning'.
6. TO COME. POM y-/yiʔ/yɛɡ • CU BD ʔi • MEG ı̉w and ı̉ı̉.
7. BUILD, CREATE. OM DI bɪn • CH Bolewa bin 'house', Sokore be:ni
'build' • CU «forms with mina or mana for 'house' abound» in ECU and CCU.
8. DOG. PNOM *kan-; «kana … virtually universal in [NOM]. SGO has an
innovating form kuna:n-o but NGO has kana» • SE *kl-b «with the assumption
that -b is a suffix for animal terms».
9. EAT. PSOM *its; NOM M itsa 'crop' • BE ča • CH Bolewa ti, HS či • CU BD
tiyu 'food' • AEG tʔ 'bread' • SE AK teʔ-u.
10. EYE. POM *a: f / a • CU SI af- 'to see' «judged to be borrowed from
[OM]» • SE UG ʕpʕp-m 'eyes', presumed to be reduplicative with -m pl.
11. TO FLY. OM DA fal, GM fir • BE Shilha firri • CH Ankwa p'aar 'jump',
etc. • CU BD fa:r 'jump, hop' • AEG pʔ • SE AR farra 'flee', UG pr 'flee'.
12. GO. OM COMT b-, EOMT ba/bay • CH Dera bə 'go away'; Newman's
PCH *B- • CU BD ba:y, AF, OR ba: • SE HE bɔ, AR ba:ʔ 'return'.
13. HEART. OM K nibb-o «secondary form», AN yimb-a, SH nɨmba, AMU
libb-o; «all suspected of being borrowed from OSE *lbb. The same for YE
nib-a. However, AAR … lip'a/liBa … and BA lippe 'belly', perhaps also
COMT ulw-a/ull-o 'belly', suggest that the form goes back to [POM]. If so, cf.
Greenberg (1963) 'heart'. • The form is virtually absent from [CU], being
known only in [OR] lap'e [etc.]».
14. KNEEL. OM AAR gump-ɛr-; ML *gumB-at • CU BD gumba 'knee' «and
probably other [AA] forms [for 'knee'] cited in Greenberg (1963)» • CH
Angas kirm 'kneel', Musgu gurfa 'kneel' • BE Kabyle keref 'bend the knee'.
15. LICK. OM DI lits’, CA hals. • «Cf. Greenberg (1963) 'tongue'»: BE i-ls •
CH HS harše/halše, Angas lis • AEG ns • SE AR lisa:n.
16. MOON. POM *ʔarf-/ʔarp. «[NOM] has an innovated form agen-» • SE
UG ʕrp-t 'clouds' • «Cf. also [CCU] arba 'moon'»
17. MOUTH. PSOM *af/ap. «[NOM] has innovated forms … from *no:n- or
*ad-» • CH HS ʔafa 'throw in the mouth' • CU BD yaf, SO af • SE AK pu:.
18. NOSE. PNOM *sinD/sint’ • CH HS sunsuna: 'to smell', Sukur šin • CU SO
san • EG snsn 'to smell'.
19. TOOTH. POM *ačč/ats • BE TA added 'bite', etc. «Possibly all [BE]
forms are from [AR]» • CH Angas at 'bite' • SE AR əðð.
20. DONKEY. PNOM *kur-; PSOM *uki- • CH Bolewa koro.
21. YOUNG FEMALE. SOM DI amza 'woman, woman in prime sexual life',
AAR anza 'young woman' • SE CHA anž 'heifer', AR anj 'heifer'.
Fleming compares words from 26 OM languages with words
from all languages in the other branches of AA, that is, around
350 languages (Gordon 2005). This method gives more than
8 500 possible language pairs to compare where one of the
members of the pairs is an OM language. On this background,
21 cognates is not impressive, and one may ask whether a significantly
lower number is at all possible. Let us take a closer
look at some of Fleming's cognates.
1. ALL. POM *kull, reconstructed on the basis of SOM DI
kʊll, HM wull and possibly NOM bulli; no reasons are given for
postulating a phonological correspondence k–w–b. DI kʊll is
apparently the only occurrence of a form with k- outside SE,
and may be a loanword from SE.
3. BLOOD. Fleming compares PSOM *zumɓ/*dzumʔ to PSE
*dmm, BE i-damm-әn and CH Maha dom, etc. He presents no
other words exhibiting OM-AA phonological correspondences
z/dz–d or mɓ/mʔ–mm, and the vowels seem to play no role in
the comparison. Fleming does not mention that the words
for 'blood' in NOM are completely different, cf. ML súgútsi
(Azeb 2001), KO súutse (my field notes), and WO suutta (Lamberti
& Sottile 1997). No arguments are presented for treating
the SOM forms as more conservative than the NOM forms.
Similarity with AA is not an argument unless it is shown that
the comparison is not as farfetched as it looks.
6. COME. Fleming compares POM *y-/yiʔ-/yɛɡ to CU BD ʔi
‘come’ and MEG ı̉w and ı̉ı̉ ‘come’. ‘Come’ in AEG was ywy
(«jwj») (Loprieno 1995). Only the initial consonant resembles
OM y-/yiʔ-/yɛɡ. The CU form is not evidently similar.
8. DOG. Fleming (1974: 88) compares POM *kan- to SE
*kl-b (with the assumption that -b is a suffix for animal terms),
CH and PLECU *k-r-. He adds that «South Gonga has an innovating
form kuna:n-o but North Gonga has kana.» No reason
is given for treating SGO kuna:no as innovating. No arguments
support the analysis *kl-b. No other words are presented that
exhibit a phonological correspondence OM n–AA l/r.
9. EAT. Fleming compares PSOM *its to forms meaning
'eat' in BE, CH, CU, and 'bread' in EG. Fleming seems to assume
PAA *-t- ‘eat’, but presents no other evidence for a phonological
correspondence PAA *t – POM *ts, or PAA *t – BE
šš/čč; cf. Shilha ešš (Dray 1998) and Kabyle ečč ‘eat’ (Dallet
1982). Fleming does not discuss vowel differences or the
glottal stop in the SE and EG forms.
12. GO. The SE forms mean 'return', not 'go'. Fleming
does not discuss the plausibility of a semantic change 'go' >
'return' or 'return' > 'go'.
13. HEART. Fleming assumes that NOM K nibb-o, «secondary
form», AN yimb-a, SH nɨmba, and AMU libb-o are cognates,
and that they are not borrowed from OSE *lbb, due to
SOM words meaning 'belly': AAR lip'a/liBa, BA lippe. Fleming
does not explain in what way K nibb-o is a «secondary form»,
but the ordinary word for 'heart' in K is múllo (my field notes).
AM lïbb 'heart' would become nibbo if borrowed into K, in accordance
with general principles (Theil, in press).
14. KNEEL. Fleming does not explain how OM *gumB-at is
related to CH Angas kirm 'kneel', Musgu gurfa 'kneel' and BE
Kabyle keref 'bend the knee'. The CH and BE forms have a
liquid not found in OM. The comparison is farfetched.
19. TOOTH. Again, an example of an unparalleled phonological
correspondence, OM čč/ts – BE dd – CH t – SE ðð. The
AR form is wrong; the correct form is aḍḍ(a).
21. YOUNG FEMALE. Fleming compares OM DI amza
'woman, woman in prime sexual life' and AAR anza 'young
woman' to AR anj 'heifer'. Doniach (1972) has only one AR
word meaning 'heifer', ʕijla. Cowan (1994) has no word anj or
ʕanj 'heifer'. Elie Wardini (p.c.), professor of AR at the University
of Stockholm, does not know such a word. However, he
mentions naʕja 'ewe, female sheep' and ʕanz(a) 'goat'; the latter
resembles AAR anza, but DI amza indicates that m is the
original nasal, with a regressive assimilation in AAR anza.
There is clear evidence that the n of AR ʕanz(a) is the original
nasal, cf. the plural forms aʕnuz/ʕunūz/ʕināz (Cowan 1994).
As Wardini adds, one should be very careful with AR words
without cognates in other SE languages; the historical study of
the AR lexicon is almost totally neglected.
4.2.1 Preliminary conclusion
Comparing morphemes the way Fleming has done, it is practically
impossible not to find some look-alikes. However, to
quote Meillet (1967: 51), «an etymology is valid only if the
rules of phonological correspondences are applied in an exact
way, or in case a divergence is accepted, if this divergence is
explained by special circumstances rigorously defined.» But in
Fleming (1974) we find discussions of neither phonological
nor semantic correspondences.
Another weakness in Fleming's argumentation is that he
has not shown that OM is closer to AA than to any other language
family. In the next paragraph OM is compared to PIE.
4.3 Omotic and Proto-Indo-European
The following comparison between OM and PIE is limited to
Fleming's alleged OM/AA cognates. The comparison is also
limited in another way: With few exceptions, OM is compared
to one language, PIE, and not to all the 449 IE languages (Gordon
2005); including all languages in the comparison would
have made it even easier to find similarities.
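The point about the number of languages can be made concrete with a small calculation. What follows is only a sketch. It assumes, purely for illustration, that each language independently has the same small probability of offering a chance look-alike for a given OM morpheme. The value 0.01 and the function name are hypothetical and are not taken from the paper, and related languages are of course not independent of each other.

```python
def chance_of_some_lookalike(p_single: float, n_languages: int) -> float:
    """Probability that at least one of n languages shows a chance look-alike.

    Assumes (hypothetically) independent languages with equal probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_languages


# Hypothetical per-language probability, chosen only for illustration.
P_SINGLE = 0.01

# One language (as with PIE alone), ten languages, and all 449 IE languages.
for n in (1, 10, 449):
    print(n, round(chance_of_some_lookalike(P_SINGLE, n), 3))
```

Whatever value is chosen for the single-language probability, the result grows with the number of languages compared, which is the sense in which a wider comparison makes similarities easier to find.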
BE, CH, CU, EG, and SE forms are left out, but are found in
4.2-3. The source for IE forms is Mallory & Adams (2006),
unless other works are referred to.
I have included data from Greenberg's (2000-2002) Eurasiatic
(IE, Uralic, Altaic, Gilyak, Korean-Japanese-Ainu, Chukotian,
and Eskimo-Aleut) and Ruhlen's (1994) «global etymologies».
Fleming's methods are similar to those of Greenberg
and Ruhlen, and the EA and GE data emphasize the arbitrariness
of Fleming's results.
Most resemblances in grammatical morphology between
OM and AA are also found between OM and IE:
I. CAUSATIVE. OM -s • IE *-s (Greenberg 2000) • EA *-s.
II. PLURAL. (a) -n~-na in SOM AAR; (b) *-ti; (c) partial redupl. and change
of stem vowel • IE *-ns acc pl • EA -t.
III. GENITIVE CONNECTOR. (a) -n~-ni in NOM YE; (b) -t~-ti in SOM AAR
• IE *-n (Greenberg 2000) • EA -n.
IV. CASE. Acc -m SOM /-n NOM; dat -n SOM / -s NOM • IE acc sg *-m,
gen/abl sg *-(o)s.
V. MASC/FEM Flat/sharp, cf. K m -o / f. -e. • IE m sg nom *-os / f sg
nom *-eH2; cf adjective 'new': m *new-os, f *néw-eH2, n *néw-om.
VI. FEM -n and n+V occur in nouns in SOM and in verbs in NOM • IE
Latin -īn- in regīna 'queen' and gallīna 'hen' is a fem. suffix.
VII. 3RD PERSON PRON BASE. is-~us-~uz-~b- in NOM • IE *s-, cf. m. *so
and f. *seHa 'that one' • EA s-
VIII. 1PL PRON. no: (na) • IE *nóH1 'we two' • EA 1st person n-.
IX. 1SG PRON i- ‘my’, in ‘me’ SOM AAR • Cf. IE 1st person forms without
a nasal and with a nasal: *H1egy, *H1éme.
X. VERBAL PERSON SUFFIXES. Table 3 is a comparison between K and
PIE (2nd conj). The main difference is found in 2.sg.
Table 3. Verbal person suffixes. Kafa and Proto-Indo-European
             Singular                    Plural
             Kafa            IE          Kafa      IE
1st person   -Ø              *-oH2       -on       *-omes
2nd person   -in             *-etH2e     -otee     *-ete
3rd person   -e m., -an f.   *-ei        -eetee    *-onti
Below is a comparison of 21 OM and PIE lexical morphemes.
1. ALL. POM *kull, PSOM *kull, DI kʊll, HM, KR wull. NOM K, MO bulli •
PIE *H3el- (Bjorvand & Lindeman 2000).
2. ASHES. POM: *b-nd- • PIE *péH2ur 'fire'; *pē(n)s- 'dust' • EA pana 'ashes',
par 'fire', pa 'dry' • GE bur 'ashes, dust'.
3. BLOOD. PSOM: *zumb/dzumʔ. • PIE *gyheumn- 'libation', *gyheu- 'pour' •
EA kem. NB: The OM z – PIE gyh correspondence has a parallel in OM z –
PIE gh, cf. 21. YOUNG FEMALE.
4. BONE. POM *k'us • PIE *H2óst. • GE kati.
5. BRIGHT, SHINY. OM DI Bɛlxən; SH p'arik' 'lighten, flash' • PIE *bhreH2gy-
• EA belk.
6. TO COME. PNOM *y-/yiʔ/yɛɡ • PIE *H1ey-, *H1eyH2- 'go' (Bjorvand &
Lindeman 2000) • EA i~ya 'go'.
7. BUILD, CREATE. OM DI bɪn • IE *bhendh- 'bind'.
8. DOG. PNOM *kan-. SGO kuna:n-o • PIE *kywon-~*kyun- • EA kan~kun •
GE kuan. The PIE and EA alternations resemble OM kan-~kun-.
9. EAT. PSOM *its; NOM MJ itsa 'crop' • PIE *H1ed-, Hittite *ets- (Bjorvand
& Lindeman 2000).
10. EYE. POM *a: f /a • PIE xwekw-; cf. Greek ōps.
11. TO FLY. OM DA fal, GM fir • PIE *pl-ew-k- > Proto-Germanic *fléuh-
‘flee, fly’ • EA par • GE par.
12. GO. OM OMT b-, ba/bay • PIE *gweHa 'come'.
13. HEART. NOM K nibb-o, AN yimb-a, SH nɨmba, AMU libb-o. SOM AAR
lip'a/liBa, BA lippe 'belly' • PIE *leybh- > Proto-Germanic *leiba- ‘body, belly;
life’ (Bjorvand & Lindeman 2000).
14. KNEEL. OM AAR gump-ɛr-; ML *gumB-at • PIE *gyénu-~*gyónu-~*gynu-
'knee' (Bjorvand & Lindeman 2000).
15. LICK. OM DI lits’, CA hals • PIE *leigyh- • EA lak.
16. MOON. POM *ʔarf-/ʔarp • PIE *H3érbhis 'circle'.
17. MOUTH. PSOM *af/ap • PIE *H1ub-~*H1up- 'up' (Bjorvand & Lindeman
2000) (the origin of English root in up, open); cf. also Hittite api- 'hole
in the ground' (Greenberg 2002: 96) • EA api 'hole'.
18. NOSE. PNOM *sinD/sint’ • PIE has sound-imitative words beginning
with sn-, referring to breathing, snoring, nose, etc.; cf. English snout, snore,
snot, snuff, sniff • GE čun(g)a 'nose; to smell'.
19. TOOTH. POM *ačč/ats • IE *H1d-ént-~*H1d-ónt-~*H1d-nt- (same root
as *H1ed- 'eat').
20. DONKEY. POM *kur-; PSOM *uki- • English horse < Proto-Germanic
*hrussa- < IE *kyers- 'run' or *(s)ker- 'hop about' (Pfeifer 1995).
21. YOUNG FEMALE. OM DI amza 'woman, woman in prime sexual life',
AAR anza 'young woman' • PIE *maghwiHa- (from *magh- 'be able').
All of Fleming's OM/AA lexical cognates have parallels in PIE,
and in some cases the similarities are more striking between
OM and IE; 8. DOG is an interesting example. There are also
lots of similarities in the grammatical morphemes, and while
the OM/PIE resemblances are described explicitly, the OM/AA
similarities are left to the reader to discover.
The conclusion is not that OM is related to PIE. Rather, the
comparison shows how easily look-alikes are found. Resemblances
between OM and AA that are also found between OM
and PIE do not support the hypothesis of an AA affiliation for
OM unless regular phonological correspondences between OM and
AA are established. But such correspondences have never
been demonstrated.
Undoubtedly, many more look-alikes would have been
found if we went beyond Fleming's cognates. Some are found
in OM, PIE, and AA, like 'horn': OM KA k'áro, PIE *kyer-, AA AR
qarn, others only in OM and PIE, like 'foot, leg': OM KA baatoó,
PIE *pōd-~pod-~ped- or 'wall': OM KA duuhó, IE *dhigyh-s.
5 Later studies
To the best of my knowledge, nobody has later presented a
more convincing argumentation than Fleming (1969, 1974)
for the OM Theory. In spite of this, this theory is the received
opinion among Africanists. In this paragraph, I shall discuss
some other attempts to support it.
5.1 M. Lionel Bender
Bender (1975, 2000, 2003; Fleming & Bender 1976) has
argued for the OM Theory. Bender (2003) presents those four
(!) POM words that he regards as likely lexical retentions from
AA, that is, 2.7% of the items on Swadesh's 150-word list:
BIRD kap- • OCU kanb- ‘bird, wing’ • Se *k-n-p.
DOG kan- • OCU kar- «??»
EYE aap- • OCU ʔaykw «??» • «More likely semantic transfer from [AA]
‘mouth’, e.g. AM af.»
SEW sip- • OCU šekw- • Se š-f-y ‘sew, mend’.
None of these proposals are convincing. As I showed in the
preceding paragraph, the OM words meaning 'dog' and 'eye'
have parallels in PIE, and Bender's two new proposals, 'bird'
and 'sew', can be compared to PIE *kap- 'hawk, falcon' and
*sep- 'handle (skilfully), hold (reverently)'.
In addition, Bender (2003) presents 25 grammatical morphemes,
repeated from Bender (2000), «likely to be retentions»
from AA; cf. Table 4. Since he has found no lexical
support for the OM Theory, these 25 morphemes are his only
evidence (p. 314):
Pending further work on [AA] lexicon, I am forced to the conclusion
that lexicon alone cannot serve to establish Omotic as [AA]. Omotic has
a very innovative and mixed lexicon with many intrusions from [AA]
languages, especially Cushitic, and also from Nilo-Saharan. Morphological
retentions establish Omotic as an [AA] family.
Table 4. Bender's 25 OM grammatical morphemes with alleged AA cognates
Independent pronouns: 2sg n (1); 3sg m is (2); 1pl nu (2); 2pl int (3); 3pl ist (3)
Pronoun gender and case: absolutive n (1)
Demonstratives: near ha~ka (1)
Verbal affixes: 1sg n (2); 2sg n (1); 1pl uni (2); 2pl eti (2); 2pl to (3)
Interrogatives: Q particle ay (1); Q particle al~ar (2); Q particle am (2)
Copulas/Connectives: be k~g (2)
Nominal: nominal case i (3); genitive ka (1); genitive n (2); dative s (2)
Verbal TMA system: jussive o~u (3); perfect i~e (2); perfect a (2)
Derivations: causative s (1); pas./recip. t (2)
1: Found in all Om branches. 2: Found in all but one Om branch.
3: Found in two branches with traces in one or two others.
Bender (2003) assumes an historical stability of morphology
that cannot be taken for granted. Thomason (1980) (cited in
Campbell 1997: 222-23) showed that «morphology is by no
means so stable as to justify the assumption that lexical cognates
may vanish almost entirely while the morphology holds
firm» (1980: 360) and that «all the evidence available from
well-documented language families indicates that morphological
diversification goes along with diversification elsewhere
in the grammar» (1980: 368).
More than 50 percent of Bender's (2003) grammatical
morphemes are monophonemic, and, as mentioned earlier,
similarities are easiest to find for short morphemes, and especially
when they consist of one highly frequent phoneme,
which is in general the case with grammatical morphemes; cf.
the discussion in Campbell (1997: 221-222).
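The share of monophonemic morphemes can be checked against the forms listed in Table 4. The sketch below assumes that each letter of the transcription stands for one phoneme, and that an alternation such as k~g counts as monophonemic when every alternant is a single segment. Both are assumptions of this illustration, not criteria stated by Bender or by the author, and the function names are hypothetical.

```python
# Forms from Table 4, grouped as in the table (alternants separated by "~").
TABLE4_FORMS = [
    "n", "is", "nu", "int", "ist",   # independent pronouns
    "n",                             # pronoun gender and case
    "ha~ka",                         # demonstratives
    "n", "n", "uni", "eti", "to",    # verbal affixes
    "ay", "al~ar", "am",             # interrogatives
    "k~g",                           # copulas/connectives
    "i", "ka", "n", "s",             # nominal
    "o~u", "i~e", "a",               # verbal TMA system
    "s", "t",                        # derivations
]


def is_monophonemic(form: str) -> bool:
    """True if every alternant is one segment (assumed: one letter = one phoneme)."""
    return all(len(alternant) == 1 for alternant in form.split("~"))


def monophonemic_share(forms: list[str]) -> float:
    """Fraction of forms that are monophonemic under the assumption above."""
    return sum(is_monophonemic(f) for f in forms) / len(forms)


count = sum(is_monophonemic(f) for f in TABLE4_FORMS)
print(len(TABLE4_FORMS), count, monophonemic_share(TABLE4_FORMS))
```

Under these assumptions 13 of the 25 forms (52%) come out as monophonemic, which agrees with the figure of more than 50 percent given above.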
Finally, Bender (2003) includes 5 pronouns. Campbell
(1997: 240-52) has a detailed discussion of the controversial
use of pronouns in establishing relatedness of languages, and
concludes by agreeing with Meillet that «pronouns must be
used [only] with caution» (1997: 252). Pronouns tend to be
similar in all languages, and the consonants of pronouns are
largely those found in grammatical morphemes in general.
«The consonants that are used tend to be the ones that are
least marked … m, n, t, k, and s» (1997: 243). The OM pronouns
mentioned by Bender (2003) all contain n, t, or s.
5.2 Richard J. Hayward
Hayward (1990, 1995) supports the OM Theory, but apparently
for reasons that are incompatible with Bender's: «[C]ertain
grammatical formatives … often assumed … indispensable
hallmarks of the [AA] phylum … are simply absent from
Omotic» – while «[i]n terms of vocabulary … Omotic looks
respectably [AA]» (1995: 13). On the same page, he refers to
«Blažek (forthcoming)», who claims that in terms of shared
Omotic looks like being a reasonably nuclear member of [AA]. For
example Blažek claims that for some 80 per cent of the names for parts
of the body found among the various Omotic languages cognates can
be identified among the Chadic languages—which … is a family of
languages situated on the other side of the African Continent.
I have not had access to Blažek's work, and Bender (2003)
does not refer to it. To check Blažek's claims, I compared the
body parts terms among Bender's (2003) POM reconstructions
to Newman's (1977) and Jungraithmayr & Ibriszimow's
(1994) PCH reconstructions, and found no evidence.
5.3 Christopher Ehret
Ehret (1995) reconstructs 1024 PAA roots, and lists OM reflexes
for around 435. Not surprisingly, he writes (1995: 9):
The Omotic languages emerge from the available data as definitely
Afroasiatic. The demonstrations in Fleming (1969, 1974) and Bender
(1975) that Omotic forms a division of the family quite distinct from
Cushitic seem fully convincing.
Against the background of the discussions in earlier paragraphs of
this paper, this is surprising. It is also worth mentioning that
Ehret (1995) accepts only 9 of Fleming's (1974) 21 cognates:
1 ALL, 4 BONE, 6 COME, 7 BUILD, CREATE, 11 FLY, 12 GO, 13
Many of Ehret's proposed 435 OM–AA cognates are farfetched
– morphologically, phonologically, semantically, and
in other ways. It is impossible to show this in detail, but the
following examples give an impression of Ehret's methods:
AA ROOT 82 *-feŋ- 'to set apart, move apart (tr.)' • SE AR fann 'species,
kind, category; way, manner' • CU *fenḥ- 'to spread apart' • OM OMT GA
*penge 'door'; «semantics: move apart > open > door».
AA ROOT 140 *dîm-/*dâm- 'blood' • SE *dm (*dam) 'blood' • EG idmi 'red
linen' • CU *dîm-/*dâm- 'red' • WCH *d-m- 'blood' • OM GO *dam- 'blood'
«(MO 'damo) (contra Leslau, loan < Sem. seems implausible in this case)»
AA ROOT 367 *-ɣâp- 'to rise, arise' • SE AR ɣafw 'to float on the water' •
EG xpr 'to come into being; become; grow up; occur, happen’ • CU *ɣaap-
/*ɣuup- 'fruit' • NOM *kap- 'bird'; «semantics: rise > fly».
AA ROOT 636 *-ŋôm- 'to use the mouth (other than eating)' • SE AR namm
'breath, breeze' • EG nmi 'to shout, low' • PSCU *ŋûm ‘to pucker the lips (as
in blowing)’ • NOM *no:n- ‘mouth’; «presumed assim. *no:m- > *no:n-».
AA ROOT 637 *ŋaan 'boy' • EG nn 'child' • CU BU naw 'small boy' • some
WCU *nan 'brother' • NOM *na:m- 'son'; «stem with nasal dissim., *nVn >
*nVm».
AA ROOT 660 *-noh- or *-ŋoh- or ɲoh- 'to cry out' • SE *nhḳ 'to bray';
«stem + *k' intens. of effect» • OM YE nòon 'to murmur'; «[PRE-POM]
*nohn-, stem + *n non-fin. > *no:n-».
AA ROOT 859 *-dlǎʔ- 'to decline, become low' • SE AR ḍaʔal 'to make
oneself small' • EG ḏʔt 'remainder, deficiency' • SCU Proto-Rift *tlatlaʔ-
'afternoon' • OM MO t'à:’o 'place'; «semantics: < presumed earlier sense
"ground": ground is below one».
AA ROOT 914 *-tl’uw- 'to rise' • SE Modern South AR *ṣwr 'to stand, stay';
«stem + *r diffus.» • EG twʔ, twʔ 'to support, sustain, hold' • CU *ɬw 'meat';
«Ng. tlùwái, stem + *y deverb.; semantics: rise > grow > live, + *y deverb.
> animal (i.e., living creature) > meat» • NOM *t’umu ‘mountain’; «stem +
*m n. suffix».
Ehret’s methods are dubious, among other things in the following
ways. Roots are broken up into ad hoc roots + suffixes;
cf. root 660 OM «[PRE-POM] *nohn-, stem + *n non-fin. > *no:n-»,
and root 914 SE Modern South AR «stem + *r diffus.» and
NOM «stem + *m n. suffix». This means that the etymologies
are based upon a single consonant.
Ad hoc sound changes are «presumed»; cf. root 636 OM
«presumed assim. *no:m- > *no:n-» and root 637 OM «stem
with nasal dissim., *nVn > *nVm».
Meaning relations are often farfetched. Cf. root 914 SE 'to
stand, stay', EG 'to support, sustain, hold', CU 'meat', OM
'mountain'; the reconstructed AA meaning is 'to rise'.
Ehret rejects Fleming's 3 BLOOD etymology and instead,
cf. root 140, relates the AA form to GO *dam- 'blood'. For
some unknown reason he thinks that it is implausible that this
is a loanword from Semitic. It is tempting to quote Meillet
(1967: 51):
The risk that a word is borrowed is always great, and the etymologist of
an ancient or modern language who reasons as if the words to be
explained had a priori every chance of being native exposes himself to
frequent errors.
Root 140 cannot be used to prove a genetic relationship between
OM and AA, because it may be a loanword. KA damoó
'blood' has exactly the form to be expected if borrowed from
AM däm 'blood' (Theil, in press).
Ehret’s claim that «[t]he Omotic languages emerge from
the available data as definitely Afroasiatic» is not supported.
5.4 Marcello Lamberti
There are still scholars who argue that OM is a branch of CU. I
include a few lines about Lamberti (1991), who argues for this
view. He is of the opinion that (1991: 556)
lexical arguments do not have a great weight within the evaluation of a
genetic relationship because lexemes (also those of core vocabulary!)
can easily undergo semantic changes, can easily be replaced by new
expressions, and can always be the result of borrowing … The morphology,
on the contrary, represents the most conservative and intimate
part of a language.
He goes on to present some comparisons of grammatical
morphemes in different CU and OM languages. Some of the
morphemes resemble each other, but no attempt is made to
establish regular phonological correspondences between the
languages. I shall discuss some of his suffixes.
He postulates a noun forming suffix *-tee, which inter alia
has the modern forms -tsi (CU AW), -ti (CU SO), -tsi (OM ZA),
and -ti (OM YE), but he does not account for the phonological
variation. The ZA form is illustrated in «d'an-tsi (udder) ←
d'am- (suck)». A change *-tee > ZA *-tsi is not well founded,
and the phonemic analysis of the ZA form can be questioned.
ZA is closely related to KO, which I know from my own fieldwork.
The KO counterpart is ɗànse 'breast'; s is pronounced
[ts] after l, r, and n (Theil, forthcoming). There are no reasons
to believe that ZA -se comes from an earlier *-tee.
Surprisingly, Lamberti (1991: 556-557) analyzes the KA
suffix -cco in two different ways: as «-ec-co, e.g., shatt-ec-co (coward)»,
where -ec- is claimed to come from «the suffix for agent
nouns *-aam», and as «-ccoo, e.g. Kafi-ccoo (a Kafa man)»,
claimed to come from a singular noun suffix *-ttaa. There are
no morphological reasons for treating the KA -cco suffix as
two different suffixes, and the assumed change «*-ttaa > -ccoo»
has no basis.
Lamberti (1991: 557) claims that the same *-ttaa suffix has
become -ttsi in ZA, «e.g. akima-ttsi (traditional doctor), cf.
Amharic hakim (id.)». The analysis «akima-ttsi» is clearly wrong,
and should be akim-attsi; attsi is a noun meaning 'person'. KO
has kèm-atse 'hunter' and yèem-atse 'shepherd', which are compounds;
cf. kème 'to hunt', yèeme 'to herd', and àtse 'person'.
Finally, Lamberti (1991: 558) claims that «the numerals 1,
2, 3, 5, 10, 100, and 1,000» support the hypothesis that OM is
a branch of CU. But he does not write anything else about this.
In conclusion, Lamberti (1991) does not present any
interesting evidence in favor of a «Cushitic Theory».
6 Conclusion
My conclusion is that Omotic should be treated as an independent
language family. No convincing alternative has ever
been presented.
Hayward (1995: 11) writes that «[i]t is, of course, a relief
not to have Omotic as an isolate; we do not need a whole
family of 'Basques' on our hands!» An alternative point of
view is possible. Africa is the cradle of mankind. Why are
there no language isolates on a continent where humans have
lived since language was invented?
7 Abbreviations
AA Afroasiatic
AEG Ancient Egyptian
A-K Aari-Kafa
AK Akkadian
AM Amharic
AMU Amuru
AN Anfillo
AAR Aari
AW Awngi
BA Basketo
BD Bedawie
BE Berber
BU Burunge
CA C'ara
CCU Central Cushitic
CH Chadic
CHA Chahar
COMT Central Ometo
CU Cushitic
DA Dawro
DI Dime
EA Eurasiatic
ECU East Cushitic
EG Egyptian
GB (G) Gabri
GE Global etymologies
GL Galab
GM Gamo
GO Gonga
HM Hamar
HS Hausa
IE Indo-European
KA (K) Kafa
KH Khamir
KO Koorete
KR Karo
MJ (M) Maji
ML Maale
MEG Middle Egyptian
MO Mocha
NOM North Omotic
OCU Old Cushitic
OM Omotic
OMT Ometo
OR Oromo
OSE Old South
PAA Proto-Afroasiatic
PCH Proto-Chadic
PIE Proto-Indo-European
PLECU Proto-Lowland East Cushitic
PNOM Proto-North Omotic
POM Proto-Omotic
PSCU Proto-South Cushitic
PSE Proto-Semitic
PSOM Proto-South Omotic
SE Semitic
SGO South Gonga
SH Shakicho
SI Sidamo
SO Somali
SOM South Omotic
TA Tamashek
UG Ugaritic
WCH West Chadic
WCU West Cushitic
WO Wolaitta
YE Yem
ZA Zayse
8 References
Azeb Amha (2001). The Maale Language. (CNWS publications,
99.) Leiden: Research School of Asian, African, and Amerindian
Studies, Universiteit Leiden.
Bender, M. Lionel (1975). Omotic: a New Afroasiatic Language
Family. (Southern Illinois University Museum Series, No. 3.)
Bender, M. Lionel (2000). Comparative Morphology of the Omotic
Languages. (Lincom Europa Studies in African Linguistics, 19.)
München: LINCOM.
Bender, M. Lionel (2003). Omotic Lexicon and Phonology.
Carbondale: Southern Illinois University.
Bjorvand, Harald and Lindeman, Fredrik Otto (2000). Våre
arveord. Etymologisk ordbok. Oslo: Novus Forlag.
Campbell, Lyle (1997). American Indian Languages. The Historical
Linguistics of Native America. (Oxford Studies in Anthropological
Linguistics, 4.) Oxford: Oxford University Press.
Cowan, J. Milton (ed.) (1994). The Hans Wehr Dictionary of
Modern Written Arabic. Urbana: Spoken Language Services,
Dallet, J.-M. (1982). Dictionnaire kabyle-français. Parler des At
Mangellat, Algérie. Paris: SELAF.
Doniach, N. S. (ed.) (1972). The Oxford English-Arabic
Dictionary of Current Usage. Oxford: The Clarendon Press.
Dray, Maurice (1998). Dictionnaire français-berbère. Dialecte des
Ntifa. Paris: L'Harmattan.
Fleming, Harold C. (1969). The Classification of West
Cushitic within Hamito-Semitic. Pp. 3–27 in Daniel F. McCall
et al. (ed.) Eastern African History. New York, Washington,
London: Frederick A. Praeger.
Fleming, Harold C. (1974). Omotic as an Afroasiatic Family.
Studies in African Linguistics, Supplement 5: 81–94.
Fleming, Harold C. and Bender, M. Lionel (1976). Non-
Semitic Languages. Pp. 34–62 in M. Lionel Bender et al. (ed.)
Language in Ethiopia. London: Oxford University Press.
Gordon, Raymond G., Jr. (ed.) (2005). Ethnologue: Languages of
the World. 15th edition. Dallas, Tex.: SIL International. Online
version: http://www.ethnologue.com/.
Greenberg, Joseph H. (1963). The Languages of Africa.
International Journal of American Linguistics, Vol. 29, No. 1.
Greenberg, Joseph H. (1987). Language in the Americas.
Stanford: Stanford University Press.
Greenberg, Joseph H. (2000-2002). Indo-European and its closest
relatives. 2 vol. Stanford: Stanford University Press.
Hayward, Richard J. (1990). Introduction. Pp. vii–xix in
Richard J. Hayward (ed.) Omotic Language Studies. London:
School of Oriental and African Studies, University of London.
Hayward, Richard J. (1995). The Challenge of Omotic. An
Inaugural Lecture Delivered on 17 February 1994. London: School
of Oriental and African Studies, University of London.
Hayward, Richard J. (2003). Omotic. The ‘empty quarter’ of
Afroasiatic Linguistics. Pp. 241–261 in Jacqueline Lecarme (ed.)
Research in Afroasiatic Grammar II. Selected Papers from Fifth
Conference on Afroasiatic Language, Paris, 2000. Amsterdam/
Philadelphia: John Benjamins Publishing Company.
Jungraithmayr, Herrmann and Ibriszimow, Dymitr (1994).
Chadic Lexical Roots. 2 vol. Berlin: Dietrich Reimer Verlag.
Lindeman, Fredrik Otto (1997). Introduction to the 'Laryngeal
Theory'. (Innsbrucker Beiträge zur Sprachwissenschaft, Band
91.) Innsbruck: Institut für Sprachwissenschaft der Universität
Mallory, J. P., and Adams, D. Q. (2006). The Oxford
Introduction to Proto-Indo-European and the Proto-Indo-European
world. Oxford: Oxford University Press.
Lamberti, Marcello (1991). Cushitic and its Classification. Anthropos
86.1991: 552-61.
Lamberti, Marcello and Sottile, Roberto (1997). The Wolaytta
language. (Studia linguarum Africae orientalis, 6.) Köln: R.
Loprieno, Antonio (1995). Ancient Egyptian. A linguistic introduction.
Cambridge: Cambridge University Press.
Meillet, Antoine (1967). The Comparative Method in Historical
Linguistics. (Instituttet for sammenlignende kulturforskning.)
Paris: Librairie Honoré Champion.
Newman, Paul (1977). Chadic Classification and Reconstructions.
(Afroasiatic Linguistics, Vol. 5, Issue 1.) Malibu: Undena
Pfeifer, Wolfgang (1995). Etymologisches Wörterbuch des
Deutschen. Erarbeitet im Zentralinstitut für
Sprachwissenschaft, Berlin, unter Leitung von Wolfgang
Pfeifer. München: Deutscher Taschenbuchverlag.
Ruhlen, Merritt (1994). On the Origin of Languages. Studies in
Linguistic Taxonomy. Stanford: Stanford University Press.
Theil, Rolf (in press). Kafa phonology. Journal of African
Languages and Linguistics.
Theil, Rolf (forthcoming). Koorete phonology.
Thomason, Sara Grey (1980). Morphological instability, with
and without language contact. Pp. 359–372 in Jacek Fisiak (ed.)
Historical morphology. The Hague: Mouton.
On the other hand, it seems that Afro-Asiatic in Africa was the result of back-migrations from Western Asia; see below.

As you know, Kushitic-speaking Ethiopians of southern Ethiopia appeared to be homogeneously more than 20% south-western Asian, while many south-western Saudis ended up 100% south-western Asian; see the maps below:



4/Ethiopians (South Ethiopian Kushites and Addis Ababa Ethiosemites)

+There is an important presence of J1 hg amongst Kushitic-speaking Bejas (whose form of Kushitic could be the most archaic Kushitic) as well as endogamous Christian Egyptian Copts and isolated mountain Berbers (alongside a high south-western and western Asian input)

Last edited by Colin Wilson; 11-10-2010 at 04:18 PM.
