On the lexical level, Romani is characterised by several layers of loanwords. The so-called Indian "words of origin" and the early loans together form the pre-European or Asiatic part of the lexicon, which Boretzky (1992) calls "inherited words". The term pre-European thus covers lexemes of Indo-Aryan origin and the early loans from Persian, Armenian, and Byzantine Greek, that is, the Asian part of the Romani lexicon. The number of pre-European words is relatively low: the dictionary of Boretzky/Igla (1994) lists about 700 lexemes of Indian origin, about 70 of Persian origin, 40 of Armenian origin, and about 230 of Greek origin. No single variety has all of these more than 1,000 pre-European words in its lexicon; individual Romani varieties have at most 600 lexemes of Asiatic origin. The European part of the lexicon, comprising the later and recent loans, has its origins in the languages of the Balkans and in other European languages (see table).

The loan strata of Romani make it possible to reconstruct the migration route of the Roma to Europe. The first stop after emigration from the northeast of the Indian subcontinent was Persia, as shown by the presence of lexemes of Persian origin. However, since Romani lacks Arabic loans, it can be assumed that the Romani speakers left the Persian area before the creolisation of the Iranian and Arabian cultures, moving in the direction of Armenia and later on into the Byzantine area. This theory is supported by the existence of loans from Armenian and by the strong influence of Byzantine Greek.

The strong influence of (Byzantine) Greek on Romani is evident in the lexemes used for the cardinal numbers listed below, which, apart from words of Indian origin (ai.), show only Greek (gr.) loans:

jekh/jek ←ai. ekka- one
duj ←ai. d(u)va: two
trin ←ai. tri:ṇi three
štar ←ai. catva:ra four
pandž/panč ←ai. pañca- five
šov ←ai. śaś/śat six
efta ←gr. εφτά seven
oxto/ofto ←gr. οχτώ eight
enja ←gr. εννιά nine
deš ←ai. daśa- ten
biš ←ai. viṁśati twenty
tr(ij)anda ←gr. τριάντα thirty
saranda ←gr. σαράντα forty
šel ←ai. śata one hundred

Emigration from Asia Minor must have taken place before the Turkisation of this region: the lexicons of Romani varieties spoken by groups that moved on to central and western Europe show no Ottoman Turkish elements. Groups that remained in the southern Balkans adopted Turkish lexemes later.

In the Balkans, Romani borrowed words from South Slavic languages. Lexemes of Slavic origin form the last common layer in the Romani lexicon. Up to and including the Slavic loan praho, the lexemes in the accompanying table point to a common lexicon. The remaining lexemes are variety-specific: the Romanian loan pomána belongs to the lexical stock of Kalderaš Romani; kalápa, of Hungarian origin, is used in Burgenland Romani, among other varieties; vélto, a German loan, is characteristic of varieties of Sinte Romani.


Boretzky, Norbert (1992) Zum Erbwortschatz des Romani. In: Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 45, pp. 227-251.
Boretzky, Norbert / Igla, Birgit (1994) Wörterbuch Romani-Deutsch-Englisch für den südosteuropäischen Raum. Mit einer Grammatik der Dialektvarianten, Wiesbaden.
Heinschink, Mozes F. (1994) E Romani Čhib – Die Sprache der Roma. In: Heinschink, Mozes F. / Hemetek, Ursula (eds.) Roma. Das unbekannte Volk. Schicksal und Kultur, Wien, pp. 110-128.
[Image: Lexicon table]