Santali language

Contributors to Wikimedia projects

Santali
ᱥᱟᱱᱛᱟᱲᱤ

The word Santali in Ol Chiki script

Native to: India, Bangladesh, Nepal
Ethnicity: Santal, Mahali

Native speakers

7.6 million (2011 census[1])[2]

Austroasiatic

Dialects
  • Mahali (Mahili)
  • Kamari-Santali
  • Khole
  • Lohari-Santali
  • Manjhi
  • Paharia
Official status

Official language in

Language codes
ISO 639-2: sat
ISO 639-3: Either:
sat – Santali
mjx – Mahali
Glottolog: sant1410  Santali
maha1291  Mahali

Distribution of Santali language

Santali (ᱥᱟᱱᱛᱟᱲᱤ, Pronounced: [santaɽi], সাঁওতালি, ସାନ୍ତାଳୀ, सान्ताली) is a Kherwarian Munda language spoken natively by the Santal people of South Asia. It is the most widely-spoken language of the Munda subfamily of the Austroasiatic languages, related to Ho and Mundari, spoken mainly in the Indian states of Assam, Bihar, Jharkhand, Mizoram, Odisha, Tripura and West Bengal.[5] It is one of the constitutionally scheduled official languages of the Indian Republic and the additional official language of Jharkhand and West Bengal per the Eighth Schedule of the Indian Constitution.[6] It is spoken by around 7.6 million people in India, Bangladesh, Bhutan and Nepal, making it the third most-spoken Austroasiatic language after Vietnamese and Khmer.[5]

Santali is split into at least a northern and a southern dialect sphere, which differ in their phoneme inventories (Southern Santali has six phonemic vowels, in contrast with eight or nine in Northern Santali), in lexical items, and, to a certain degree, in morphology. Santali is recognised by linguists as being phonologically conservative within the Munda branch. Unlike many Munda languages whose vowel systems were restructured and reduced to five vowels, such as Mundari, Ho, and Kharia, Santali retains a larger system of eight phonemic cardinal vowels, which is very unusual in the South Asian linguistic area.[7][8] The language also uses vowel harmony processes in morphology and expressives, similar to Ho and Mundari.[9] Morphosyntactically, Santali, together with Sora, is considered less restructured than other Munda languages, having been less influenced by Indo-Aryan and Dravidian languages.[10] Clause structure is topic-prominent by default.[11]

Santali is primarily written in the Ol Chiki script, an indigenous alphabetic writing system developed in 1925 by the Santal writer Raghunath Murmu. It is also written in various regional Indian writing systems, such as the Bengali-Assamese script, the Odia script, and Devanagari, as well as in the Santali Latin alphabet.[7]

A Santali speaker in Assam, India

The Santals call themselves hɔɽ (lit. 'man') and their language hɔɽ rɔɽ ("language of the Santals"). It is also referred to as mãjhi bhasa ("language of the Majhis"), and the Santals, when asked about their caste, sometimes call themselves maɲjhi or mãjhi ("village headman", "chief").[12] In North Bengal, the language is known as jaŋli or pahaɽia. In Bihar it is called parsi ("foreign"). The name Santal, in turn, was derived from Sāmanta-pāla ('dwellers of the frontiers') and was used by Bengalis to refer to the Santals. L.O. Skrefsrud assumed that Santal was derived from Sãot, the name of a place in the Midnapore region of West Bengal where the Santals were supposed to have settled in remote antiquity.[13] In Nepal, the Santali language is known as Satar.[14]

According to linguist Paul Sidwell, speakers of Proto-Munda, the ancestor of Santali, probably arrived on the coast of Odisha from Indochina about 4000–3500 years ago, and spread to the Chota Nagpur Plateau and adjacent areas before the Indo-Aryan migration.[15]

Santali books in Mayurbhanj Book Fair

Santali remained non-literary until the mid-1800s, when European interest in the languages of India led to the first efforts to document it. The language was initially recorded in the Latin alphabet, and then in Bengali, Devanagari, and Odia, by European and American anthropologists, folklorists, and missionaries such as Jeremiah Phillips, A. R. Campbell, Lars Skrefsrud, and Paul Bodding.[16] Their work resulted in Santali dictionaries, collections of folk tales, and studies on the language's morphology, syntax, and phonetics. By the late 19th century, several Santal intellectuals had begun to use various writing systems to compose books, stories, and poems in their language. The first Santali weekly magazine in the Latin alphabet, the Pera Ho̠ṛ, was established in 1922, followed by the Marshal Tabon (1946); the Bihar-run Devanagari Ho̠ṛ So̠mbad (1947); the Bengali Pachim Bangla (1956); and the Jug Siriro̠l (since 1971) in Latin. There are two Bangladesh-based Santali monthly magazines, Aboak’ kurumuTureak’ Kurai and GoDet’, both written in Bengali script and published from Rangpur and Dhaka, respectively.[16]

In 1922, Sadhu Ramchand Murmu from Jhargram district of West Bengal attempted to create a Santali script called Monj Dander Ank, but it did not gain popularity. Later, in 1925, Raghunath Murmu from Mayurbhanj district of Odisha developed the Ol Chiki script, which was first publicised in 1939 and eventually became widely adopted.[17][18] The Ol Chiki script is now considered the official script for Santali literature and language across West Bengal, Odisha, and Jharkhand.[19][20] However, users in Bangladesh use the Bengali script instead.

Santali was included in the Eighth Schedule to the Constitution of India for official recognition as a scheduled language in 2003 through the 92nd Amendment Act, granting it the right to be used in government communication, education, and competitive examinations.[21] In December 2013, the UGC, the higher education regulatory body of India, introduced Santali as a subject in the National Eligibility Test (NET), enabling its use for lectureship and as a medium of instruction in colleges and universities.[22]

Geographic distribution

Distribution of Santali language in the states of India[23]

  1. Jharkhand (44.4%)
  2. West Bengal (33.0%)
  3. Odisha (11.7%)
  4. Bihar (6.20%)
  5. Assam (2.90%)
  6. Maharashtra (1.40%)
  7. Chhattisgarh (0.20%)
  8. Tripura (0.10%)
  9. Other states (0.10%)

Santali is spoken by over seven million people across India, Bangladesh, Bhutan, and Nepal, with India being its native country and having the largest number of speakers amongst the four.[5] According to the 2011 census, India has a total of 7,368,192 Santali speakers (including 358,579 Karmali and 26,399 Mahli).[24][25] The state-wise distribution is Jharkhand (2.75 million), West Bengal (2.43 million), Odisha (0.86 million), Bihar (0.46 million), Assam (0.21 million), and a few thousand each in Chhattisgarh and in the north-eastern states of Tripura, Arunachal Pradesh, and Mizoram.[26]

The highest concentrations of Santali language speakers are in Santhal Pargana division, as well as East Singhbhum and Seraikela Kharsawan districts of Jharkhand, the Jangalmahals region of West Bengal (Jhargram, Bankura and Purulia districts) and Mayurbhanj district of Odisha.

Smaller pockets of Santali language speakers are found in the northern Chota Nagpur plateau (Hazaribagh, Giridih, Ramgarh, Bokaro and Dhanbad districts), Balesore and Kendujhar districts of Odisha, and throughout western and northern West Bengal (Birbhum, Paschim Medinipur, Hooghly, Paschim Bardhaman, Purba Bardhaman, Malda, Dakshin Dinajpur, Uttar Dinajpur, Jalpaiguri and Darjeeling districts), Banka district and Purnia division of Bihar (Araria, Katihar, Purnia and Kishanganj districts), and tea-garden regions of Assam (Kokrajhar, Sonitpur, Chirang and Udalguri districts). Outside India, the language is spoken in pockets of Rangpur and Rajshahi divisions of northern Bangladesh as well as the Morang and Jhapa districts in the Terai of Koshi Province in Nepal.[27][28]

Santali is one of India's 22 scheduled languages.[6] It is also recognised as the additional official language of the states of Jharkhand and West Bengal.[29][30]

Dialects of Santali include Kamari-Santali, Khole, Lohari-Santali, Mahali, Manjhi, Paharia.[5][31][32]

Being scattered apart in many different pockets in one of the most densely-populated parts of India, Santali dialects are becoming increasingly distinct in phonology, morphology, and lexicon. Reports by R.N. Cust (1878) mentioned four or more dialects, while according to George Campbell, only two main Santali dialects are attested: Northern and Southern. Data gathered by Ghosh (1994) and Kobayashi et al. confirm Campbell's account.[33] Northern Santali speakers are concentrated in Santhal Pargana division (Godda, Deoghar, Dumka, Jamtara, Sahibganj and Pakur), Hazaribagh, throughout the North Chotanagpur Division; Purnia and Bhagalpur divisions in Bihar; Malda division, Birbhum, Bankura, Murshidabad, Cooch Behar, and Jalpaiguri districts in West Bengal. Southern Santali speakers predominantly live in Southern Bankura, Purulia, Paschim Medinipur in West Bengal; Gumla, Simdega, the Singbhum districts of Jharkhand; Balesore and Kendujhar, and Mayurbhanj district of Odisha.[34]

According to observation by Ghosh, "In the lexicon SS (Southern Santali) and NS (Northern Santali) are somewhat different, initiated by borrowing from the neighbouring languages. The local borrowings in the two dialects are so high that sometimes one appears to be unintelligible to the other. In certain cases the usage is also different."[16]

Santali has 21 consonants, not counting the 10 aspirated stops, which occur primarily, but not exclusively, in Indo-Aryan loanwords and are given in parentheses in the table below.[35]

 | Bilabial | Alveolar | Retroflex | Palatal | Velar | Glottal
Nasal | m | n | (ɳ)* | ɲ | ŋ |
Stop, voiceless | p (pʰ) | t (tʰ) | ʈ (ʈʰ) | c (cʰ) | k (kʰ) | ʔ
Stop, voiced | b (bʱ) | d (dʱ) | ɖ (ɖʱ) | ɟ (ɟʱ) | ɡ (ɡʱ) |
Fricative | | s | | | | h
Trill/Flap | | r | ɽ | | |
Approximant | | l | | j | w |
*ɳ only appears as an allophone of /n/ before /ɖ/.

In native words, the opposition between voiceless and voiced stops is neutralised in word-final position. A typical Munda feature is that word-final stops are "checked", i.e. glottalised and unreleased.

Bodding (1929) noted that at the boundary between an open syllable and a following syllable that starts with a vowel, a glide is inserted if both vowels are of the same height: the approximant /w/ between two low vowels, and /j/ between mid-high and high vowels.
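The glide rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the height classes are read off the vowel chart in this article, nasalisation is ignored, mid-low pairs are left untouched because the description gives no rule for them, and the function name and the test syllables are made up for the example rather than attested Santali forms.

```python
import unicodedata

# Vowel heights as given in the vowel chart of this article.
HEIGHT = {
    "i": "high", "u": "high",
    "e": "mid-high", "ə": "mid-high", "o": "mid-high",
    "ɛ": "mid-low", "ɔ": "mid-low",
    "a": "low",
}

def plain(text):
    """Decompose the text and drop combining marks such as the nasal tilde."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def join_with_glide(first, second):
    """Join an open syllable and a vowel-initial syllable (hypothetical helper).

    A glide is inserted only when the two adjacent vowels share a height:
    /w/ for two low vowels, /j/ for two mid-high or two high vowels.
    """
    v1 = plain(first)[-1]
    v2 = plain(second)[0]
    if v1 not in HEIGHT or v2 not in HEIGHT:
        return first + second          # not a vowel-vowel boundary
    if HEIGHT[v1] != HEIGHT[v2]:
        return first + second          # different heights: no glide
    if HEIGHT[v1] == "low":
        return first + "w" + second
    if HEIGHT[v1] in ("mid-high", "high"):
        return first + "j" + second
    return first + second              # mid-low pairs: no rule stated

# Made-up syllables, used only to exercise the rule.
print(join_with_glide("ka", "a"))   # low + low
print(join_with_glide("ki", "u"))   # high + high
print(join_with_glide("ka", "i"))   # heights differ
```

Keeping the height table separate from the rule makes it easy to swap in a dialect with a different inventory, such as the six-vowel Southern system.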

Santali has eight oral and six nasal vowel phonemes. With the exception of /e o/, all oral vowels have a nasalised counterpart.

  Front Central Back
High i ĩ   u ũ
Mid-high e ə ə̃ o
Mid-low ɛ ɛ̃   ɔ ɔ̃
Low   a ã  

The Southern Santali dialect (Singhbhum) features a smaller inventory of six vowels /a, i, e, o, u, ə/.[36][37]

The comparative method would suggest a five-vowel system for Proto-Kherwarian. Osada (1992) proposes that the open-mid vowels /ɛ, ɔ/ in Santali likely emerged under Indo-Aryan influence, notably from Bengali.[38] However, acoustic and prosodic analyses of Santali dialects and other Munda languages provide phonetic and distributional evidence that Santali and the Munda languages are undergoing, or have undergone, contraction rather than expansion of their vowel systems. Monosyllables preserve the fullest contrasts (including phonemic /ə/ and localized mid-vowel splits), but disyllables, especially in prominent second syllables, show convergence toward a symmetrical five-vowel /i e a o u/ system and progressive loss of vowel harmony. The mid-central vowel is restricted in distribution, and mid-vowel (ATR-like) contrasts are unstable and geographically uneven.[39] Overall, the trajectory is toward merger and neutralization. Other Munda languages such as Mundari and Ho already exhibit five-vowel systems, suggesting a broader areal convergence. Although contact with Bengali and Hindi may influence phonetic realization, the dominant structural pattern is reduction from the likely more conservative, complex vocalic distinctions of Proto-Munda to a pan-South Asian simple five-vowel norm, not expansion of the inventory; the Assam Santali dialect has already reached a non-harmonic five-vowel system. Vowel harmony also appears to have vanished entirely in the Odisha dialect. Northern varieties in Jharkhand and West Bengal preserve older oppositions; southern and peripheral lects show advanced leveling toward a symmetrical five-vowel system.[39]

There are numerous diphthongs and triphthongs. Longer vowel sequences can also be found, e.g. kɔeaeae, meaning 'he will ask for him', with six consecutive vowels.[40]

Note that in the level diphthongs /ea, ia, io, iu, oa, ua/, semivowels /w, j/ are usually inserted in between and dissolve the diphthong into two syllables when realised.[41]

Santali prosody exhibits iambic patterns: stress falls on the second syllable in most disyllabic words, except loanwords from Hindi, Bihari, Bengali and Assamese. In trisyllabic words, a process called V2 deletion drops the second vowel, turning the underlyingly trisyllabic word into a disyllable consisting of two heavy syllables. Even so, stress consistently falls on the second syllable, e.g. hapaɽam ('ancestor') → hapˈɽám.[42][10]
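V2 deletion as described here can be sketched as a small string operation. The sketch assumes that every vowel letter is its own syllable nucleus (so it would miscount words containing diphthongs), leaves stress marks aside, and uses a hypothetical function name; it is an illustration of the stated rule, not a full phonological model.

```python
import unicodedata

VOWELS = set("aeiouəɛɔ")

def v2_deletion(word):
    """Drop the second vowel of a trisyllabic word; return other words unchanged.

    Assumption: each vowel letter counts as one syllable.
    """
    w = unicodedata.normalize("NFD", word)
    positions = [i for i, ch in enumerate(w) if ch in VOWELS]
    if len(positions) != 3:
        return word
    start = positions[1]
    end = start + 1
    # Remove any nasalisation mark attached to the deleted vowel as well.
    while end < len(w) and unicodedata.combining(w[end]):
        end += 1
    return unicodedata.normalize("NFC", w[:start] + w[end:])

print(v2_deletion("hapaɽam"))  # trisyllabic: the second vowel is dropped
print(v2_deletion("sadɔm"))    # disyllabic: unchanged
```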

As in all Kherwarian languages, vowel harmony in Santali is a morphologically triggered process.[43] In morphology and word formation, Santali uses a vowel harmony system based on vowel height. As discussed above, vowel harmony exhibits geographic variation across Santali dialects. For example, Santali in Sonitpur district, Assam has lost vowel harmony completely, while in neighboring Udalguri district the vowel harmonic processes still remain active, per 2019 observations.[43] There are certain restrictions in a vowel harmonic sequence:[9]

1). /e/ and /o/ never co-occur with /u/ in the same stress unit (word with affixes, enclitics,...).

2). /ɛ/ and /ɔ/ never co-occur with /e/ and /o/. Thus, some suffixes and enclitics have two variants: the vowel of the instrumental suffix -tɛ, for example, is raised to /e/, giving [-te]. Note that this alternation only affects weak (harmonic) syllables and suffixes; others do not alternate. Further examples: ɛɽɛ=e → [ɛɽɛ=jɛ] (lie=3), ɛgɛr ("to scold"), gɔʈɛn ("part"), mɛrɔm ("goat"), ɛhɔp ("to begin"), hɔʈlɛˀtʃ ("cooking pot"). Trisyllabic and tetrasyllabic structures, and anything beyond the domain of the foot, do not seem to follow this pattern consistently, e.g. bɛhebajoˀt ("neglect"), bɛdɛrgɛˀtʃ ("unclean"), tʃʰɔldori ("small tent").

3). Syllables with /i/ and /u/ only co-occur with /ə/, but not /a/, e.g. busək ("to give birth"), bidə ("to dismiss"), əgu ("to bring").

4). Only /a/ can co-occur with /e o ɛ ɔ/, while /ə/ cannot, e.g. boŋga ("evil spirit"), sadɔm ("horse"), hako ("fish"), mare ("ancient").

5). /e/ may alternate with /i/ if the preceding syllable ends with /u/ or /ə/.
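Restrictions 1 to 4 are co-occurrence constraints, so they can be checked mechanically. The following Python sketch treats a word as a bag of vowel qualities and reports which restrictions it breaks. It is an assumption-laden simplification: it ignores nasalisation, does not model the stress unit or the foot, does not cover the alternation in restriction 5, and the function names are invented for the example.

```python
import unicodedata

VOWELS = set("aeiouəɛɔ")

def vowels_of(word):
    """Return the set of plain vowel qualities in a word (nasalisation ignored)."""
    decomposed = unicodedata.normalize("NFD", word)
    return {ch for ch in decomposed if ch in VOWELS}

def harmony_violations(word):
    """List the numbers of the restrictions (1-4) that the word's vowels break."""
    v = vowels_of(word)
    broken = []
    if v & {"e", "o"} and "u" in v:                 # 1: /e o/ never with /u/
        broken.append(1)
    if v & {"ɛ", "ɔ"} and v & {"e", "o"}:           # 2: /ɛ ɔ/ never with /e o/
        broken.append(2)
    if v & {"i", "u"} and "a" in v:                 # 3: /i u/ go with /ə/, not /a/
        broken.append(3)
    if "ə" in v and v & {"e", "o", "ɛ", "ɔ"}:       # 4: /ə/ never with /e o ɛ ɔ/
        broken.append(4)
    return broken

# Disyllabic examples quoted above satisfy all four restrictions.
for w in ["mɛrɔm", "busək", "boŋga", "sadɔm", "əgu"]:
    print(w, harmony_violations(w))

# Longer words quoted above as inconsistent with the pattern.
for w in ["bɛhebajoˀt", "tʃʰɔldori"]:
    print(w, harmony_violations(w))
```

Because the check is set-based, it flags any co-occurrence within the whole string; restricting it to the foot would need syllabification, which is outside this sketch.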

Santali, like all Munda languages, is a suffixing agglutinating language. Whether Santali and related languages such as Mundari and other Kherwarian lects have recognisable parts of speech (verbs, nouns, adjectives, etc.) remains a subject of intense linguistic debate. Traditional grammatical descriptions often treat lexemes that take case marking in a syntactic unit as part of the nominal system, and those that take TAM/person/number marking as verbal. However, deeper analyses by Neukom (2001), Hengeveld & Rijkhoff (2005), Peterson (2005), and Rau (2013) suggest that Santali is in fact a flexible language; that is, its lexemes are inherently underspecified for lexical category and can function in referential ("noun"), predicative ("verb"), or attributive ("modifier") roles. Evans & Osada (2005) and Croft (2005), by contrast, argue that the Kherwarian languages do possess defined, albeit fluid, word classes. According to Neukom (2001), about one-third of all Santali lexemes ("contentives") are rigid, underived verbs, which means they are syntactically restricted to the predicative function. The rest of the lexicon (nominals, proforms, adpositions, derived "nominals", etc.) is purely contentive and syntactically flexible.[44] The Oxford Handbook of Word Classes (2023) rates Santali as a Type I Flexible language.[45]

Nouns are inflected for number and case.[46]

Three numbers are distinguished: singular, dual and plural.[47]

Singular ᱥᱮᱛᱟ (seta) 'dog'
Dual ᱥᱮᱛᱟᱼᱠᱤᱱ (seta-ken) 'two dogs'
Plural ᱥᱮᱛᱟᱼᱠᱚ (seta-) 'dogs'

The case suffix follows the number suffix. The following cases are distinguished:[48]

Case | Marker | Function
Nominative | | Subject and object
Genitive | ᱼᱨᱮᱱ (-rɛn) (animate); ᱼᱟᱜ (-akˀ), ᱼᱨᱮᱭᱟᱜ (-rɛakˀ) (inanimate) | Possessor
Comitative | ᱼᱴᱷᱮᱱ (-ʈhɛn); -ᱴᱷᱮᱡ (-ʈhɛtʃ') | Goal, place
Instrumental-Locative | ᱼᱛᱮ (-tɛ) | Instrument, cause, motion
Sociative | ᱼᱥᱟᱶ (-são) | Association
Allative | ᱼᱥᱮᱱ (-sɛn); ᱼᱥᱮᱡ (-sɛtʃˀ) | Direction
Ablative | ᱼᱠᱷᱚᱱ (-khɔn); ᱼᱠᱷᱚᱡ (-khɔtʃˀ) | Source, origin
Locative | ᱼᱨᱮ (-rɛ) | Spatio-temporal location

Santali has possessive suffixes which are only used with kinship terms: 1st person , 2nd person -m, 3rd person -t. The suffixes do not distinguish possessor number.[49]

To mark nominals as definite, Santali morphology uses suffixes -tɛtˀ for nouns, and -ʈakˀ for pronouns, respectively.[50]

ɖər-tɛtˀ əgui-mɛ, dare-tɛtˀ ikə-kə-kˀ-mɛ
branch-DEF bring-2SG.IMP tree-DEF be-MOD-MID-2SG.IMP
'Bring the branch, let the tree be.'

Gender and noun class

True gender distinction marking on nominals and verbs (like in Sanskrit, Hindi, other Indo-Aryan and Dravidian languages) does not exist in Santali. Native peripheral markers such as the genitive, locative markers, and nominalizers can be used to distinguish between animate and inanimate noun classes. For lexicalized gender distinction, there are several ways to mark the contrast between female and male:

- Morphologically-marked modifiers borrowed from Indo-Aryan such as -i for feminine, and -a for masculine are found in certain lexemes:[51]

- Sex-based gender lexemes. These words are inherently gendered and cannot be inflected for gender, unlike the words listed above.[52]

- Compounded sex-based gender. The head noun is compounded with a gender-denoting modifying word. Masculine compounds go with ənɖiə, sanɖi, pɛ̄ʈhar, kuɖu, and feminine objects go with ɛŋga, bətʃhi, and pəʈhi.

The personal pronouns in Santali distinguish inclusive and exclusive first person and anaphoric and demonstrative third person.[52]

Personal pronouns
Person | Singular | Dual | Plural
1st person exclusive | ᱤᱧ (iɲ) | ᱟᱹᱞᱤᱧ (əliɲ) | ᱟᱞᱮ (alɛ)
1st person inclusive | | ᱟᱞᱟᱝ (alaŋ) | ᱟᱵᱳ (abo)
2nd person | ᱟᱢ (am) | ᱟᱵᱮᱱ (aben) | ᱟᱯᱮ (apɛ)
3rd person anaphoric | ᱟᱡ (atʃˀ) | ᱟᱹᱠᱤᱱ (əkin) | ᱟᱠᱳ (ako)
3rd person demonstrative | ᱩᱱᱤ (uni) | ᱩᱱᱠᱤᱱ (unkin) | ᱳᱱᱠᱳ (onko)

The interrogative pronouns have different forms for animate ('who?') and inanimate ('what?'), and referential ('which?') vs. non-referential.[53]

Interrogative pronouns
Animate Inanimate
Referential ᱚᱠᱚᱭ (ɔkɔe) ᱳᱠᱟ (oka)
Non-referential ᱪᱮᱹᱞᱮᱹ (tʃele) ᱪᱮᱫ (tʃetˀ)

The indefinite pronouns are:[54]

Indefinite pronouns
  Animate Inanimate
'any' ᱡᱟᱸᱦᱟᱸᱭᱟᱜ (jãheã) ᱡᱟᱸᱦᱟᱸ (jãhã)
'some' ᱟᱫᱚᱢ (adɔm) ᱟᱫᱚᱢᱟᱜ (adɔmak)
'another' ᱮᱴᱟᱜᱤᱡ (ɛʈakˀitʃ') ᱮᱴᱟᱜᱟᱜ (ɛʈakˀak')

The demonstratives distinguish three degrees of deixis (proximate, distal, remote) and simple ('this', 'that', etc.) and particular ('just this', 'just that') forms.[55]

Demonstratives
Deixis | Number | Simple, animate | Simple, inanimate | Particular, animate | Particular, inanimate
Proximate | Singular | ᱱᱩᱭ (nui) | ᱱᱚᱣᱟ (nui) | ᱱᱤ (nii) | ᱱᱤᱭᱟᱹ (niə)
Proximate | Dual | ᱱᱩᱠᱤᱱ (nukin) | ᱱᱚᱣᱟᱠᱤᱱ (noakin) | ᱱᱤᱠᱤᱱ (nikin) | ᱱᱤᱭᱟᱹᱠᱤᱱ (niəkin)
Proximate | Plural | ᱱᱳᱠᱳ / ᱱᱩᱠᱩ (noko / nuku) | ᱱᱚᱣᱟᱠᱳ (noako) | ᱱᱮᱹᱠᱳ / ᱱᱩᱠᱩ (neko / niku) | ᱱᱤᱭᱟᱹᱠᱳ (niəko)
Distal | Singular | ᱩᱱᱤ (uni) | ᱳᱱᱟ (ona) | ᱤᱱᱤ (ini) | ᱤᱱᱟᱹ (inə)
Distal | Dual | ᱳᱱᱠᱤᱱ (onkin) | ᱳᱱᱟᱠᱤᱱ (onakin) | ᱤᱱᱠᱤᱱ (inkin) | ᱤᱱᱟᱹᱠᱤᱱ (inəkin)
Distal | Plural | ᱳᱱᱠᱳ / ᱩᱱᱠᱩ (onko / unku) | ᱳᱱᱟᱠᱳ (onako) | ᱮᱹᱱᱠᱳ / ᱤᱱᱠᱩ (enko / inku) | ᱤᱱᱟᱹᱠᱳ (inəko)
Remote | Singular | ᱦᱟᱹᱱᱤ (həni) | ᱦᱟᱱᱟ (hana) | |
Remote | Dual | ᱦᱟᱹᱱᱠᱤᱱ (hənkin) | ᱦᱟᱱᱟᱠᱤᱱ (hanakin) | |
Remote | Plural | ᱦᱟᱹᱱᱠᱳ (hanko) | ᱦᱟᱱᱟᱠᱳ (hanako) | |

The basic cardinal numbers (transcribed into Latin script IPA)[56] are:

1 ᱢᱤᱫ mitˀ
2 ᱵᱟᱨ bar
3 ᱯᱮ
4 ᱯᱩᱱ pon
5 ᱢᱚᱬᱮ mɔ̃ɽɛ̃
6 ᱛᱩᱨᱩᱭ turui
7 ᱮᱭᱟᱭ ɛyae
8 ᱤᱨᱟᱹᱞ irəl
9 ᱟᱨᱮ arɛ
10 ᱜᱮᱞ gɛl
20 ᱤᱥᱤ -isi
100 ᱥᱟᱭ -sae

The numerals are used with numeral classifiers. Distributive numerals are formed by reduplicating the first consonant and vowel, e.g. babar 'two each'.

Numbers basically follow a base-10 pattern. Numbers from 11 to 19 are formed by addition, gel ('10') followed by the single-digit number (1 through 9). Multiples of ten are formed by multiplication: the single-digit number (2 through 9) is followed by gel ('10'). Some numbers are part of a base-20 number system. 20 can be bar gel or isi.

30: ᱯᱮ ᱜᱮᱞ, pe gel (3 × 10), or (ᱢᱤᱫ) ᱤᱥᱤ ᱜᱮᱞ, (mit’) isi gel ((1 ×) 20 + 10)
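The numeral rules described in this section (addition for 11 to 19, multiplication for multiples of ten, a base-20 alternative for 20 and 30, and reduplication for distributives) can be sketched as follows. The sketch makes several assumptions: it uses the spellings pe and gel from the worked example for 30, joins words with a space, covers only the cases the text actually describes, and all function names are hypothetical.

```python
import unicodedata

VOWELS = set("aeiouəɛɔ")

# Latin-script forms from the cardinal table; "pe" (3) and "gel" (10)
# follow the spelling of the worked example for 30 (an assumption).
UNITS = {1: "mitˀ", 2: "bar", 3: "pe", 4: "pon", 5: "mɔ̃ɽɛ̃",
         6: "turui", 7: "ɛyae", 8: "irəl", 9: "arɛ"}
TEN = "gel"
TWENTY = "isi"

def decimal_numeral(n):
    """Build a base-10 numeral for 1-19 and for multiples of ten up to 90."""
    if 1 <= n <= 9:
        return UNITS[n]
    if n == 10:
        return TEN
    if 11 <= n <= 19:
        return f"{TEN} {UNITS[n - 10]}"        # addition: 10 + unit
    if 20 <= n <= 90 and n % 10 == 0:
        return f"{UNITS[n // 10]} {TEN}"       # multiplication: unit x 10
    raise ValueError("not covered by the rules described in the text")

def vigesimal_alternatives(n):
    """Return the base-20 alternatives mentioned in the text (20 and 30 only)."""
    if n == 20:
        return [TWENTY]
    if n == 30:
        return [f"{TWENTY} {TEN}", f"{UNITS[1]} {TWENTY} {TEN}"]
    return []

def distributive(numeral):
    """Reduplicate the first consonant and vowel, e.g. bar -> babar."""
    w = unicodedata.normalize("NFD", numeral)
    for i, ch in enumerate(w):
        if ch in VOWELS:
            if i == 0:
                raise ValueError("vowel-initial numerals are not covered by the rule as stated")
            end = i + 1
            while end < len(w) and unicodedata.combining(w[end]):
                end += 1                       # keep a nasal tilde with its vowel
            return unicodedata.normalize("NFC", w[:end] + w)
    raise ValueError("no vowel found")

print(decimal_numeral(12), "|", decimal_numeral(30), "|", distributive("bar"))
```

Numbers such as 21 or 45 raise an error on purpose, since the text does not say how they are formed.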

Santali has a fairly large number of postpositional words that can be added either to bare nominals or to the number suffixes and the definite marker. Some of them require the genitive case. There are also complex forms that combine a postposition and a case suffix.[57]

Santali adpositions[58]
Meaning
ləgitˀ/lagatˀ + -tɛ 'for'
modre 'among'
dhəbitʃˀ 'till, until, up to'
bhitrirɛ 'inside, within'
talarɛ 'middle'
latarrɛ 'under'
lagire 'due to'
tʃetanrɛ 'above, top'
leka/leka-tɛ 'like/by any means'
atɛ 'along with'
hɔtɛtʃˀtɛ 'for, by, due to'
tulutʃˀ 'being with, association with'
iəte 'owing to, due to, on account of'
-katɛ gerund, converb
mɛntɛ 'by saying, for the purpose'

To derive new nominals, the stems of lexical verbs, adjectives, and other nouns can employ many different methods, including affixation, reduplication, and compounding.

Suffixation: Two nominalising suffixes -itʃˀ for animate, and -akˀ for inanimate noun class, are used to form referential nominals.[59]

Verbs → nouns: jɔm ('eat') > jɔmakˀ ('food')

adjectives → nouns: nɔtɛ ('this side') > nɔtɛn ('belonging to this side') > nɔtɛnakˀ ('thing of this side') / nɔtɛnitʃˀ ('one of this side')

ponɖ ('white') > ponɖakˀ ('white thing') / ponɖitʃˀ ('white one')

suffixes → nouns: ɔl-tɛ (write-INS) > ɔltɛakˀ ('that with which one writes', i.e. a pen)
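The suffixation pattern above amounts to choosing one of two nominalisers by noun class. A minimal Python sketch, assuming plain concatenation with no stem changes and using a hypothetical function name, reproduces the forms quoted in this section:

```python
# Nominalising suffixes as given in the text.
ANIMATE_NMLZ = "itʃˀ"
INANIMATE_NMLZ = "akˀ"

def nominalise(stem, animate):
    """Attach the animate or inanimate nominaliser to a stem.

    Assumption: simple concatenation; any vowel harmony or other
    stem alternation is not modelled.
    """
    return stem + (ANIMATE_NMLZ if animate else INANIMATE_NMLZ)

print(nominalise("jɔm", animate=False))    # 'food'
print(nominalise("ponɖ", animate=True))    # 'white one'
print(nominalise("ponɖ", animate=False))   # 'white thing'
```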

An entire verbal construction can be nominalised:[50]

dal-ke-d-ej-itʃˀ

fight-AOR.ACT-TR-3SG.OBJ-ANIM.NMLZ

'one who struck him/her'

Infixation is the most productive derivation method in Santali. Infixes -tV-, -nV-, -mV-, -ɽV-, and -pV- are often inserted into nouns, verbs, adjectives to derive new words.[60]

ɛhɔp ('begin') > ɛtɔhɔp ('beginning')

rakap ('rise', 'ascend') > ranakap ('development')

Prefixation in North Munda has been reduced to a very few restricted exceptions.[61]

tʃɛt ('teach') > matʃɛt ('teacher')

Despite bearing noun-like semantics, the derived forms remain precategorial and can appear in other functions, although such uses are probably seldom attested.[62]

oɽakˀ=iɲ ɛtɔhɔpˀ-akatˀ-a
house=1SG.SUBJ beginning-ACT.PRF-IND
'I have begun (to build) the house.'

Verbs in Santali inflect for tense, aspect, mood, and voice, and for the person and number of the subject and sometimes of the object.[63] However, defining parts of speech in traditional linguistic terms, such as "verbs" and "nouns", in Jharkhandi Munda languages more generally (including most Kherwarian varieties and Kharia) is highly controversial, since the evidence for discrete lexical categories like nouns, verbs, and adjectives is often extremely weak or even virtually absent, at least at the basic lexical level.[64] From this perspective, it may be nearly unfeasible to apply the conventional parts-of-speech framework to North Munda. A single element with apparently nominal semantics (possibly metonymic in nature) may function as the predicate base in one sentence (typically in clause-final position), while appearing elsewhere as an argument in the same phonological and morphological form, with zero derivation. In fact, predicates and their complements may be primarily defined by syntactic configurations rather than by inherent lexical categories. For further theoretical and empirical discussion of word classes in Mundari, see Evans & Osada (2005), Peterson (2005), Hengeveld & Rijkhoff (2005), Croft (2005); for Kharia, see Peterson (2013).

Similarly, Santali has been described as a language with a regular degree of lexical flexibility.[65] Neukom (2001) posits that "nouns" do not exist in Santali; instead there are "flexible lexemes" that can function either as arguments (a referential role) or as predicates within phrasal units, with no profound categorical distinction between these uses.[66] In everyday speech, Santali flexibility may show even more idiosyncrasies than those documented for Mundari. Rau (2013) provides attested examples showing that, within accepted usage, even proper names (cross-linguistically often treated as purely referential expressions denoting inherent properties) may frequently occur as predicates in Santali without eliciting objections.[67] For instance, the sentence unkin-dɔ Kaɽa ar Guja-wa-kin-a 'Their names were Kara and Guja' (lit. "they were Kara-and-Guja-ed") uses the second proper name directly as an active applicative predicate, while the first name precedes the conjunctive element, producing a distributive interpretation of the predication.[68]

Neukom (2001) further notes that almost any type of lexeme, including nominals, interrogatives, and indefinites, can function predicatively, but only in combination with either a light verb copula (kan "COP.IPFV" or tahɛ̃kan "COP.IMPREF") or an applicative suffix -a/-wa (often glossed as "for/to someone") plus the indicative/finite suffix. Together, these elements act as a compositional verbalising operator, yielding a structure that behaves like a nominal sentence.[69][70] Rau (2013) also notes that there are examples of zero-copula constructions.[71] A commonly cited property of lexically flexible languages is the absence or reduced productivity of lexical derivational mechanisms. While Ghosh (2008) shows that Santali does possess a productive derivational system (see the discussion of derivation under morphology), the extent to which derived forms participate in systematic, corpus-wide lexical flexibility in Santali has not yet been firmly established. For discussion of flexibility in Southern Santali, see Dash (2025).

The Santali TAM system is very complicated. Categories of tense, aspect, and voice fuse into an interlocked system consisting of a series of verbal subtemplates, so it is impossible to single out a morpheme that marks just one TAM category. TAM paradigms interact intricately with active and middle voice: active TAMs are unmarked and carry transitive, volitional, and outwardly directed senses, and are mostly employed in polyvalent predicates; middle TAMs signal intransitive, self-directed, and avolitional senses, and are mostly found in monovalent predicates. There are two subtemplates, for the imperfective and the perfective. The two recognisable tense categories are non-past and past, and the past is further divided into two tenses: anterior and aorist. The imperative and prohibitive do not have any markers of their own but possess unique verbal templates.[72]

Santali verb paradigm
 | Active | Middle
Future/Present | -∅ | -okˀ
Present Progressive | -etˀ (-kan) | -okˀ-kan
Aorist | -ketˀ | -en
Anterior | -letˀ | -len
Perfect | -akatˀ | -akan
Past perfect | -akatˀ-tahɛ̃kan | -akan-tahɛ̃kan
Past progressive | -etˀ-tahɛ̃kan | -okˀ-kan-tahɛ̃kan
Optative | -ke | -k-okˀ
Irrealis | -le | -len
Conditional | -khan

Applicative voice in Santali is expressed by adding the applicative marker -a- to four tenses (Future, Imperfective, Past 1, Perfect), with an additional and rare Past 2 tense for inanimate objects. The active set serves polyvalent predicates, while the middle set marks monovalent ones.

Santali applicative TAMs
 | Active | Middle
Future | -a | -jɔn
Present | -a-kan | -jɔn-kan
Past Animate | -atˀ | -an
Perfect | -akawatˀ | -akawan
Past Inanimate | (-lakˀ) |

Pronominal subject markers
Person | Singular | Dual | Plural
1st person exclusive | -ɲ(iɲ) | -liɲ | -lɛ
1st person inclusive | | -laŋ | -bon
2nd person | -m | -ben | -pɛ
3rd person | -e | -kin | -ko

Transitive verbs with pronominal objects take infixed object markers.

Person | Singular | Dual | Plural
1st person exclusive | -iɲ- | -liɲ- | -lɛ-
1st person inclusive | | -laŋ- | -bon-
2nd person | -me- | -ben- | -pɛ-
3rd person | -e- | -kin- | -ko-

In applicative constructions, inanimate objects are marked with a pronominal suffix, a checked -kˀ.

Possessor argument indexing

Transitive verbs may agree with non-arguments (external or indirect objects). To denote inalienable possession of the indirect object concerned, the prefix -t- is attached to the applicative forms of the pronouns; otherwise possession is marked in the noun phrase and functions as an attribute.

ako-ge=ko idi-ke-tˀ-ko-tako-a
3PL-EMPH=3PL.SUBJ take-ACT.AOR-TR-3PL.OBJ-3PL.POSS-FIN
'They took theirs away themselves.'

Dual person as honorific

In certain contexts, Santali speakers increasingly use the pronominal duals as generalised honorifics to show respect to addressees such as senior, highly regarded, or unfamiliar persons.[73][74]

Two verbs mena ("to be") and hena ("to have") have irregular templates. The subject pronominal marker, instead of being an enclitic form, appears as a suffix in the slot where the object marker normally would be placed.[75] All constructions involving these two verbs are conjugated in the middle voice to express existence, possession, and location.[70]

tʃɛtˀ dʒinis hena-kˀ-taben-a
what thing have-MID.PRES-2DU.POSS-FIN
'What thing is there of you (have)?'

noko-modre kombɽo mena-e-a
these-among thief be-3SG.SUBJ-IND
'There is a thief among these persons.'

bar ganɖa sim hopon mena-kˀ-ko-tale-a
two CLF chicken children be-MID.PRES-3PL.SUBJ-1PL.POSS-FIN
'We have two chicks.'

Santali mena seems to stem from a small number of originally middle, intransitive predicate bases that have an inverted pronominalised pattern. Some other inherently intransitive, low-agency, non-volitional verbs such as rɛnɛtʃ ("be hungry") may display irregular behaviour similar to that of mena.

rɛnɛtʃ-iɲ=a

be.hungry-1SG.OBJ=IND

'I'm hungry.'

Below is the paradigm of non-negated, non-past, fully finite existential/locative copula mena:

Non-negated, non-past, fully finite copular structure[76]
Person | Singular | Dual | Plural
1st person exclusive | mena-ɲ=a | mena-ʔ-liɲ=a | mena-ʔ-le=a
1st person inclusive | | mena-ʔ-laŋ=a | mena-ʔ-bon=a
2nd person | mena-m-a | mena-ʔ-ben=a | mena-ʔ-pe=a
3rd person animate | mena-e=a | mena-ʔ-kin=a | mena-ʔ-ko=wa
3rd person inanimate | mena-ʔ=a

Semantics and pragmatics in Santali verb indexation

In Santali, as in other Kherwarian languages, the pronominal subject markers are mobile clitics that may encompass the whole clause. In most cases, except with the stems mena and hena mentioned above, the pronominal subject clitics have two possible placements: (1) attached to the word preceding the verb stem, or (2) encliticised to the final position of the verbal complex:

(1) X=S Verb

daka=ko dʒɔm-∅-a
rice=3PL.SUBJ eat-ACT.PRES-IND
'they eat rice'

(2) X verb=S

daka dʒɔm-∅-a=ko
rice eat-ACT.PRES-IND=3PL.SUBJ
'they eat rice'

According to MacPhail (1957), (1) occurs more frequently than (2).[77]
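The two placements can be illustrated with a small Python sketch that attaches a subject clitic either to the word before the verb (pattern 1) or to the end of the verbal complex (pattern 2). The function name and its parameters are hypothetical, and the sketch only reproduces the surface strings of the two glossed examples; it says nothing about when speakers choose one pattern over the other.

```python
def place_subject_clitic(preverbal_words, verb, clitic, pattern=1):
    """Attach a subject clitic following pattern (1) X=S Verb or (2) X Verb=S."""
    words = list(preverbal_words)
    if pattern == 1:
        if not words:
            raise ValueError("pattern (1) needs a word before the verb to host the clitic")
        words[-1] = f"{words[-1]}={clitic}"      # host: the word preceding the verb
        return " ".join(words + [verb])
    if pattern == 2:
        return " ".join(words + [f"{verb}={clitic}"])  # host: the verbal complex
    raise ValueError("pattern must be 1 or 2")

print(place_subject_clitic(["daka"], "dʒɔm-∅-a", "ko", pattern=1))
print(place_subject_clitic(["daka"], "dʒɔm-∅-a", "ko", pattern=2))
```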

In complicated predicates made up of more than one lexeme, such as the glossed one below, the subject clitic follows indexation pattern (2), not (1) as expected:[78]

ale […] lelha bhucuŋ koŋka bhũiə kan-a-le
we(PL.EXCL) TOP stupid ignorant foolish Bhuya IPFV.COP-IND=1PL.EXCL
'We are foolish, stupid, witless Bhuyas.'

The placement of the subject clitic can also distinguish the type of nominal sentences (sentences with copulae). In a predicational sentence where the subject is referential and the complement is non-referential, the host of the clitic is the subject.[79]

mitˀ=dɔ=e bhut kan-a
one=TOP=3SG.SUBJ ghost COP-IND
'One was a ghost.'

In an equative sentence where both the subject and the complement are referential, the subject clitic is placed at the end of the sentence.[80]

nui

this

ma

MOD

iɲ-ren

1SG-GEN.ANIM

hɔɽ

person

kan=e

COP=3SG.SUBJ

'This is my wife.'

Argument indexation in Santali is closely tied to the animacy of the arguments. The animate/inanimate distinction is not marked on nouns at all, but is conveyed through morphosyntax, such as the genitive and locative cases and verbal agreement. That is, if an argument of the verb does not belong to the animate noun class, the verb will not index it. Inanimate entities such as flowers, trees, rice, books, and food, as well as objects that cannot move by themselves, such as vehicles (e.g. motorbike, car, aeroplane), are never indexed by the verb. However, there are some notable exceptions: inanimate objects that are significant ('sun', 'moon', 'star') or culturally important ('doll') are treated as animate in Santali:

siɲtʃãdo

sun

rakapˀ-kan-a=e

rise.MID.AUGM-IPFV-FIN=3SG.SUBJ

'The sun is rising.'

ɲiɲdətʃãdo

moon

dubutʃˀ-en-a=e

set.MID.AUGM-MID.ANT-FIN=3SG.SUBJ

'The moon set.'

Likewise, 'government' is treated as a single body of animate entities and is marked as third person singular. Even mushrooms, pricking thorns, puffballs, and earwax are perceived as animate and are indexed by pronominal markers accordingly, which shows how unpredictable the Santali animacy-based indexation system can be.[81]

In negative formations, the negation particle may show indexation of an inanimate subject, while other Kherwarian languages suppress it.

ɖɛr

tree.branch

ba=i

NEG=3SG.SUBJ.INAN

rapud-kan-a

break-IPFV-IND/FIN

'The branch isn't breaking.'

As described by Ghosh (2008), there are no dedicated markers for the imperative series. However, in the affirmative imperative, the indicative/finite marker -a is replaced by second person markers. In the negative imperative, the verb (TAM/person syntagma) takes -a, while the imperative subject marker moves to the enclitic position after the negative particle, immediately before the verb (see Negation).

daja-kate

show.mercy-CONV

ma-ge

MOD-FOC

oko-baɲtʃao-ka-ɲ-tabon-pe

hide-save-BEN-1SG.OBJ-1PL.INCL.POSS-2PL.SUBJ.IMP

'Please show kindness and hide and save me (for the sake of us)'

Every finite predicate takes -a, except in the imperative and in subordinate clauses. This suffix marks the predicate as being in the indicative (real, default, narrative) mood.[82]

noa-rɛakˀ

this-GEN

mitˀ

one

ʈaŋ

CLF

kəhəni

story

ləi-ad-iɲ-a=e

tell-ACT.APPL.PST-1SG.OBJ-FIN=3SG.SUBJ

'S/he told me a story about this.'

There are two causative markers: a- and -otʃo. -otʃo attaches to every type of verb stem, whereas a- is restricted to the two transitive verbs jɔm ('eat') and ɲu ('drink').[83]

am

You

iɲ

me

ba=m

NEG=2SG.SUBJ

ɖaɽ-otʃo-li-d-iɲ-a

run-CAUS-ACT.ANT-TR-1SG.OBJ-IND

'You didn’t make me run.'
Anderson (2018), field notes.

Although the causative and the permissive share the same suffix -otʃo, the permissive differs in that an applicative marker is combined with the causative morpheme, which shifts the person concerned from the accusative to the dative position.

sɛn-otʃo-daɽe-a-e-a=ɲ

go-CAUS-ABIL-ACT.APPL.IPFV-3SG.OBJ-FIN=1SG.SUBJ

'I let/made him/her come.'

ɲɛl-otʃo-ad-e-tahɛ̃kan-a=ko

see-PERM-ACT.APPL.PST-3SG.OBJ-IMPERF-FIN=3PL.SUBJ

'They had permitted him/her to see.'

The infix -pV- derives reciprocal stems from transitive and ditransitive verb roots, but with many verbs it also conveys that the action is performed jointly by two participants.[84]

dal ('beat') > dapal ('beat each other')

landa ('laugh') > lapanda ('laugh together')
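The two pairs above suggest a simple mechanical reading of the infix: p is inserted after the first vowel of the root, followed by a copy of that vowel. The sketch below encodes only that reading. The function name `reciprocal` and the vowel set are assumptions made for illustration, and nasalised vowels and roots beginning with a vowel are not given any special treatment.

```python
# Assumed plain-vowel inventory, used for this sketch only.
VOWELS = set("aeiouɔɛə")


def reciprocal(root):
    """Insert -pV- after the first vowel of the root, V copying that vowel."""
    for i, ch in enumerate(root):
        if ch in VOWELS:
            # root up to and including the first vowel + "p" + vowel copy + rest
            return root[: i + 1] + "p" + ch + root[i + 1 :]
    raise ValueError(f"no vowel found in {root!r}")


print(reciprocal("dal"))    # dapal
print(reciprocal("landa"))  # lapanda
```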

The benefactive for transitive and ditransitive stems is -ka in the Northern Santali dialect and -ka-k in Southern Santali. In Southern Santali, if the object is animate, the final -k is replaced by pronominal clitics. All benefactive stems are conjugated with active TAM markers.[84]

tɔl ('bind') > tɔlka ('to bind for somebody')

tɔl-ka-e-kan-a=e

bind-BEN-3SG.OBJ-IPFV-FIN=3SG.SUBJ

'S/he is binding it (the cow) for someone.'
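The dialect split in the benefactive can be summarised as a small decision rule. The following sketch is hypothetical in its naming (`benefactive_stem`, `animate_object`) and returns segmented strings only. How object markers combine with Northern -ka is not stated above, so the sketch leaves that case out.

```python
def benefactive_stem(root, dialect, animate_object=None):
    """Return a segmented benefactive stem for the given dialect.

    animate_object: pronominal object marker such as "e" (3SG), or None
                    when the object is inanimate (Southern dialect only).
    """
    if dialect == "northern":
        return f"{root}-ka"
    if dialect == "southern":
        if animate_object is None:
            return f"{root}-ka-k"
        # With an animate object the final -k gives way to the pronominal marker
        return f"{root}-ka-{animate_object}"
    raise ValueError(f"unknown dialect: {dialect!r}")


print(benefactive_stem("tɔl", "northern"))
print(benefactive_stem("tɔl", "southern"))
print(benefactive_stem("tɔl", "southern", animate_object="e"))
```

TAM marking, the finite marker, and the subject clitic would follow the stem, as in tɔl-ka-e-kan-a=e above.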

Transitive verbs and a limited number of intransitive and intransitive-transitive verb roots take -jɔn to form the mediopassive voice.[85]

Passive and Reflexive


Transitive roots, transitive-intransitive roots, and causative stems take -ok to derive passive stems. With transitive-intransitive roots, it signals the prominence of transitivity. Attached to transitive verbs, it creates reflexives.[85]

ɲɛl ('see') > ɲɛlok ('be seen') (passive)

ranotʃo ('cause to medicate') > ranotʃok ('be caused to medicate') (causative > passive)

mak ('cut') > makok ('cut oneself') (reflexive)
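The three derivations above can be reproduced with a one-line rule, sketched here under a loud assumption: that a stem-final o merges with the vowel of -ok, as in ranotʃo > ranotʃok. That generalisation rests on this single example, and the function name `ok_stem` is hypothetical. Whether the result reads as passive or reflexive depends on the root type, which the sketch does not decide.

```python
def ok_stem(stem):
    """Derive the -ok stem (passive or reflexive reading) from a verb stem."""
    if stem.endswith("o"):
        # Assumption: stem-final "o" and the "o" of -ok surface as one vowel
        return stem + "k"
    return stem + "ok"


for stem in ("ɲɛl", "ranotʃo", "mak"):
    print(stem, ">", ok_stem(stem))
```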

The intransitive applicative TAM set can also be interpreted as expressing reflexivity and is used to emphasise that the action is directed toward the subject itself.

uni

He

tupri

hat

ar

and

aŋgrɔp=e

coat=3SG.SUBJ

hɔrɔk-akawan-a

put.on-MID.APPL.PRF-IND

'S/he has put on hat and coat.'

Noun incorporation is not a feature of Santali.[86]

Nominal "verbalisation"


In everyday speech, nominal roots can be found functioning as verbs with the appropriate inflection. The verbalisation of nominals extends to interrogatives and indefinites. Adjectives derived from nominals can also take inflection and person indexation. It is said that virtually every entity-denoting lexeme can function in a predicative role in Santali.[70]

(1) "medicine"

ɔdʒɔn-ad-e-a=ɲ

medicine-ACT.APPL.PST-3SG.OBJ-IND=1SG.SUBJ

'I gave him/her medicine.' (lit. 'I medicine-ed him/her')

(2) "king"

jon

John

raajaa-en-a=e

king-MID.PST-IND=3SG

'John became king.' (lit. 'John kinged.')

(3) "orphan"

huɖiɲ

small

gidrə=i

child=3SG.SUBJ

ʈuər-oʈo-ka-d-e-a

orphan-leave-AOR.ACT-TR-3SG.OBJ-IND

'She left a child motherless.'

(4) Pronoun

uni

S/he

dɔ

TOP

am-akˀ-kan-a=e

2SG-GEN-IPFV-IND=3SG.SUBJ

'S/he is yours.' (lit. 'S/he is you-r-ing')

In example (1), the "verbalized" predicate structure of the lexeme ɔdʒɔn has the same semantics as the free lexeme itself, with an additional applicative sense (to give DATIVE). Sentence (2), with a middle TAM suffix, shows similar semantic regularity, producing the inchoative meaning to become X (where X denotes an entity, state, or property). Sentence (3) exemplifies a predicate with an active TAM suffix that uses the noun-like lexeme ʈuər ("orphan") as its semantic base; this brings about a subtle shift to the causative sense to make X/make someone be X, but the semantics remain mostly uniform (orphan–motherless).[87]

Similar "verbalization/recategorization" via zero derivation can occur in English (e.g. gun–gunned "get shot by gunfire", ice–iced "become ice", empty–emptied "become empty, make something empty").[88] However, English has both idiosyncratic verbalization (unpredictable semantic outcome) and compositional verbalization (predictable semantic outcome),[89] whereas Santali verbalization displays extreme regularity and predictability: the derived predicates correspond directly in meaning to their nominal counterparts and show very few idiosyncrasies.[67][90]

(5) "big"

dare

tree

maraŋ-en-a

big-MID.PST-IND

'The tree became big.' (Lit. 'The tree bigged.')

(6) "kind"

uni

S/he

dɔ

TOP

dajawan-kan-a=e

kind-IPFV-IND=3SG.SUBJ

'S/he is kind.'

(7) superlative comparison

hana

that.far.INAN

dare

tree

noa

this.INAN

dare-ko-khɔn

tree-PL-ABL

dɔ

TOP

sɛ̃ɽa-gɛ-a

big-FOC-IND

'That tree is bigger than this tree.'

The existence of an independent adjective class in Santali is contradicted by sentences (5), (6), and (7), since these adjective-like lexemes can occur in predicate position, take TAM/person/number marking, and behave semantically and syntactically like examples (1), (2), and (3) above.[61]

Furthermore, mimetic sounds such as ãã (animal groan) (8), complex units such as the postpositional phrase kombɽo tuluj "with thieves" (9), and even proper names (10) can function as the semantic bases of predicates. The examples below provide a compelling argument against the analysis of this flexibility as a lexical derivational process proposed by Evans & Osada (2005).[78] This perspective on "verbalization" supports the view that flexibility is not a linguistic anomaly but part of the nature of the language itself.[91][a]

(8) mimetic sound

bar

two

three

dhao=e

time=3SG.SUBJ

ãã-jen-a

groan(onomatopoetic)-MID.AOR-IND

'It (i.e. the buffalo) groaned two or three times.'

(9) phrase

alo=m

PROH=2SG.SUBJ

kombɽo

thief

tuludʒ-oˀk-a

with-MID-IND

'Don't keep company with thieves.'

(10) proper name

uni

that

buɖhi-ren

old.woman-GEN.AN

hɔpɔn-tɛtˀ

son-3.POSS

koɽa-wakˀ

boy-NMLZ.INAN

ɲutum=dɔ

name=TOP

Turtə-wa-e-a

Turta-APPL-3SG.OBJ-IND

'The old woman's son's name was Turta.' (lit. 'That woman's son's name was Turta-ed')

With proper names, an active applicative suffix expresses that x is caused to be the individual named N, which translates as being called N. In the nonpast active form, the construction ascribes to the subject the (temporal) property of being the individual named N.[93]

Serial verb constructions


Two or more verbs and modifiers can combine together to derive a compound verb. Normally they are combinations of two transitive verbs or two intransitive verbs and limited numbers of transitive+intransitive and intransitive+transitive combinations.[86]

ɲɛlɲam-led-e-tahɛ̃kan-a=ko

see.find-ACT.ANT-3SG.OBJ-IMPERF-FIN=3PL.SUBJ

'They had seen and found him/her.'

Auxiliary verb constructions


Complex predicates are pervasive in Munda clause structure. Simple verbs such as 'go', 'become', 'finish', 'come', and 'try' are often employed as auxiliary verbs (v2 in South Asian linguistics) to add or reinforce modality, aktionsart, and orientation in the predicate. In Santali, there are univerbated auxiliary constructions that mark many functions. In the example below, the verb gɔt ("pluck") is used as an auxiliary verb to denote telicity, that is, a quick, sudden, or intense action.[94] Santali auxiliary verb constructions (AVCs) exhibit a split-doubled pattern: the lexical verb may index the object argument, and the auxiliary verb may index the subject argument.[95]

ɲɛl-gɔt-ke-d-e-a=pɛ

see-TEL.AUX:pluck-ACT.AOR-TR-3SG.OBJ-FIN=2PL.SUBJ

'You guys suddenly caught sight of him/her' or 'You guys saw him/her off/said good-bye to him/her.'

Some auxiliary constructions may behave like compound verbs. The two most commonly used auxiliary verbs in Santali are daɽe ("can") and lega ("try"). The former is often combined with an active applicative suffix, while the latter is mostly found with the middle TAMs.[96]

ba=e

NEG=3SG.SUBJ

rɔɽ-daɽe-atˀ-a

speak-AUX:can-ACT.APPL.PST-FIN

'S/he could not speak'

sereɲ-lega-kˀ-mɛ

sing-AUX:try-MID-2SG.IMP

dʒut-okˀ-rɛ

succeed-MID.FUT-LOC

hɔ̃

too

baŋ-rɛ

NEG-LOC

hɔ̃

also

'Try to sing whether you will succeed or not.'

Three particles are used to express negation in Santali: baŋ, ɔhɔ and alo. baŋ and its shortened form ba are the negatives for interrogative and declarative sentences; ɔhɔ is the emphatic negative of declarative sentences; alo is the prohibitive negative used in the imperative. These negative particles attract the subject marker away from the verb.[97]

ba=ko

NEG=3PL.SUBJ

sap-le-d-e-a

catch-ACT.ANT-TR-3SG.OBJ-FIN

'They did not catch him/her.'

alo=m

PROH=2SG.SUBJ

ləi-Ø-a-e-a

tell-ACT.PRES-BEN-3SG.OBJ-FIN

'Don’t tell him/her.'
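The interaction between the negative particle and the subject marker can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary keys and the function name `negate` are hypothetical, the short form ba is used for the plain negative, and the verb is passed in already inflected without its subject clitic.

```python
# Particle choice by sentence type, following the description above.
NEGATORS = {
    "plain": "ba",         # short form of baŋ: declarative and interrogative
    "emphatic": "ɔhɔ",     # emphatic negative of declaratives
    "prohibitive": "alo",  # negative imperative
}


def negate(verb, subject_clitic, kind="plain"):
    """Host the subject clitic on the negative particle, before the verb."""
    particle = NEGATORS[kind]
    # The particle, not the verb, carries the subject marker
    return f"{particle}={subject_clitic} {verb}"


print(negate("sap-le-d-e-a", "ko"))               # ba=ko sap-le-d-e-a
print(negate("ləi-Ø-a-e-a", "m", "prohibitive"))  # alo=m ləi-Ø-a-e-a
```

The two calls rebuild the segmented forms of 'They did not catch him/her.' and the prohibitive example above.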

In existential/locative copular formations, negation differs between the present and past tenses. Below is the negative, non-past, fully finite existential/locative copula paradigm:[98]

Negated, non-past, fully finite copular structure

| | singular | dual | plural |
|---|---|---|---|
| 1st person exclusive | bən-ug-iɲ=a | ban-uʔ-liɲ=a | ban-uʔ-le=a |
| 1st person inclusive | | ban-uʔ-laŋ=a | ban-uʔ-bon=a |
| 2nd person | ban-uʔ-m=a | ban-uʔ-ben=a | ban-uʔ-pe=a |
| 3rd person animate | ban-ug-itʃˀ=a | ban-uʔ-kin=a | ban-uʔ-ko=wa |
| 3rd person inanimate | ban-uʔ=wa | | |

In negative past copular constructions, the negative particle ban encodes the subject, and the past tense is indicated by the separate copula taheken.[99]

iɲ-rin

I-GEN.ANIM

bar-ija

two-CLF

kuɽi

girl

gidrə

child

ba=kin

NEG=3DU.SUBJ

taheken=a

COP.PST=IND

"I did not have two daughters."

Expressives can arguably be treated as an independent lexical category in Santali. Echo words are formed by three processes: (1) generating a masdar in an identical form; (2) augmenting the repeated element with a consonant; (3) vowel mutation. Masdars sometimes co-occur with vowel mutation. Expressives can convey highly detailed semantics and are not constrained by syntactic rules.[96]

(1) masdars. These expressives are formed by simply reduplicating the first element.

(2) (∅VX CVX) masdars with augmenting a consonant

(3) (∅V1X CV2X) with vowel mutation

The initial and medial consonants of the first element may be alternated in masdars.[100]

The Santals categorize expressives as a form of "twisted speech" (benta katha), a discourse mode characterized by profound metaphorical depth.[101] These items occupy a central role in Santali daily communication and cultural life. Expressives are especially prevalent in performance traditions, including music, storytelling, folktales, and poetry, and have an extensive presence in oral performance genres.[102]

Simple clause structure


Santali is strictly head-final. Simple noun phrases in Santali typically have the following structure:[103]

(DEMONSTRATIVE) (QUANTIFIER) (ADJECTIVE) (ADJECTIVE) NOUN

Example: noa əɖi maraŋ bir (this very big forest) "this very big forest"
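The template can be read as a fixed left-to-right ordering of optional slots before the head noun. The sketch below assembles a phrase in that order. The function name `build_np` and its parameters are hypothetical, and treating the intensifier əɖi as part of the adjective slot is an assumption of this sketch rather than something the template states.

```python
def build_np(noun, demonstrative=None, quantifier=None, adjectives=()):
    """Linearise a simple noun phrase: (DEM) (QUANT) (ADJ) (ADJ) NOUN."""
    parts = []
    if demonstrative:
        parts.append(demonstrative)
    if quantifier:
        parts.append(quantifier)
    # Adjectives keep the order in which they are supplied
    parts.extend(adjectives)
    # The head noun always comes last
    parts.append(noun)
    return " ".join(parts)


print(build_np("bir", demonstrative="noa", adjectives=["əɖi maraŋ"]))  # noa əɖi maraŋ bir
```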

Santali person indexation clearly follows nominative-accusative alignment: the subject pronominal clitic agrees with the person/number of the nominative argument, and the object pronominal infix agrees with the person/number of the accusative argument.[70] However, NPs themselves carry no marking whatsoever to show their grammatical relations:

am

You

iɲ=m

me=2SG.SUBJ

ɖaɽ-otʃo-ki-d-iɲ-a

run-CAUS-ACT.AOR-TR-1SG.OBJ-IND

'You made me run.'[104]

Thus, word order may be used to determine which of the non-verbal constituents is the subject argument and which is the accusative/object argument. The unmarked word order is usually SOV. However, Santali word order is highly influenced by context, discourse, and pragmatics. If the S/A is considered less topical than the O/P, the word order is reduced to OV. The sentence shrinks further if no argument is deemed topical.[11] It could therefore be argued that the pronominal clitics that represent the arguments expressed by NPs should themselves be considered the arguments.

gidrə-(ko)ᵢ

child-PL

Subject

ruekˀ

fever

Object

ɲam-akatˀ-koᵢ-wa

get-ACT.PRF-3PL-IND

Verb

'The children caught fever.'

kimin

daughter.in.law

Object

dɔ

TOP

Particle

ba=m

NEG=2SG.SUBJ

Negative

əgu-∅-ko-wa

bring-ACT.FUT-3SG.OBJ-IND

Verb

'Will you not bring daughter-in-law?'

dʒɔm-le-a=e

eat-ACT.IRR-IND=3SG.SUBJ

Verb

'S/he would eat.'

The default word order of an intransitive, monovalent sentence is SV, though it can be reduced further if the subject is neither topic nor focus.

hɔpɔn=e

son=3SG.SUBJ

hetʃˀ-en-tiɲ-a

come.AUGM.PASS.MID-MID.AOR-1SG.POSS-IND

'My son has come.'

hij-oʔ-en=a=y

come-MID.PASS.AUGM-MID.AOR=IND=3SG.SUBJ

'He has come.'

Complex sentence structure


Coordinative particles are employed in Santali complex sentence structure for various conjunctive, disjunctive, and adversative functions.[105]

Complex sentence coordinators

| | Particle | Translation | Note |
|---|---|---|---|
| Conjunctive | ar | and | operates within the sentence |
| | adɔ | thereupon | operates across sentences |
| | khan | then | |
| Disjunctive | | or | |
| | baŋkhan | otherwise | |
| Adversative | mɛnkhan | but | also denotes switch reference |
| | bɔrɔŋ | rather | |
| | bitʃkom | rather (L) | |
| | hutkə | then | used in conditional sentences to introduce the apodosis, in which the protasis is supposed not to have been realized, and therefore the apodosis would not have occurred |
| Conclusive | baŋma | "that is to say, namely" | |

Note that mɛnkhan ("but") may denote switch reference in the second clause.[106]

unkin

They.DU

dɔ

TOP

din-ge

day-FOC

əɖi

very

kurumuʈu=kin

diligently=3DU.SUBJ

kəmi-ja

work-IND

mɛnkhan

but

tʃheka-katɛ=je

how-CONV=3SG.SUBJ

mitˀ

one

din

day

uni

s/he

hɔɽ-rɛn

man-GEN

orakˀ

house

boŋga

goddess

dɔ

TOP

bɔhɔkˀ

head

latʃˀ

stomach

haso

pain

ɲam-ke-d-e-a

get-ACT.AOR-TR-3SG.OBJ-IND

'They two work very hard, but one day for unknown reason the man's wife was affected by pain in stomach and head.' (They → the man's wife)

Subordinate clauses are linked to the narrative clauses by the converb katɛ, ablative khɔn, place marker ʈhɛn, temporal khan, and purposive jɛmon.[107]

ɲɛl-jɔn-katɛ

see-MPASS-CONV=1SG

tʃalao(ʔ)-kˀok-a

go.AUGM.PASS-MID.OPT-IND

'Seeing it I would go.'

Indefinite pronouns jãhã ("any") and jãhãe ("anyone") are used to link relative clauses. The choice of which particle should be used primarily depends on the semantics and animacy of the referred argument.[108]

jãhã

any

dare-rɛ=m

tree-LOC=2SG.SUBJ

detʃˀ-len-a

climb.AUGM.PASS.MID-MID.ANT-IND

on-rɛ

that-LOC

mitˀ

one

ʈaŋ

CLF

tɛrɔm

honey

tʃak

comb

mena-kˀ-a

be-MID.PRES-IND

'There is a honey-comb in the tree which you climbed.'

Pronouns, interrogative pronouns, and correlative particles jodi ("if"), tahle ("then"), tobe ("then"), dʒɛmɔn ("as"), tɛmɔn ("so") are used to form correlatives in both the main and attributive clauses.[109]

oka

which

disɔm-rɛ

country-LOC

onko

they

gaɖel

crowd

hɔɽ-ko

people=3PL.SUBJ

jarwa-akan-tahɛ̃kan-a

gather-MID.PRF-IMPERF-IND

ona

that

disɔm-rɛn

country-GEN

raj

king

dɔ

TOP

gɔj-akan-a

die-MID.PRF-IND

'The king of the country where the crowd of people had gathered has died.'

dʒɛmɔn=iɲ

as=1SG.SUBJ

mɛn-led-a

say-ACT.ANT-IND

tɛmɔn-ge

so-FOC

tʃando

Chando

in-akˀ

me-GEN

sana-e

wish=3SG.SUBJ

purəu-ke-tˀ-tiɲ-a

fulfill-ACT.AOR-TR-1SG.POSS-IND

'Chando fulfilled my wish as I had asked.' (~ As I had said, so Chando fulfilled my wish)

Combinations of indefinite pronouns with demonstratives/locatives, such as jãhã:ona, jãhãe:uni/onko, and jãhã:on-rɛ, can likewise be considered correlative conjunctions.[109]

In daily conversation, Santali speakers generally use a high proportion of words of native Austroasiatic/Munda/Santali origin compared with speakers of other Munda languages such as Kharia and Juang. The loan strata are mostly borrowed from Hindi (e.g. rəskə "joy" < Hindi rasika, haʈ "market" < Hindi hāʈ, kagodʒ/kagotʃ "paper" < Persian kāgaz via Indo-Aryan) and from regional languages: Sadri (e.g. kuʈəm/kutɨsi "hammer" < Sadri kuʈasi), Khortha, Angika, Maithili, Assamese, Bengali (e.g. rəs "heap" < Bangla raʃi, bhəgnə "nephew" < Bangla bhagina/bhagna), Nepali, and Oriya. Together with English, these may account for almost 20% of the lexemes of daily needs. Younger speakers who have the opportunity to pursue higher education tend to be more accustomed to lexical influence from neighbouring languages as well as English.[110] A good number of words that seem to be derived from earlier stages of Indo-Aryan (Vedic Sanskrit dating from 1,500-1,000 BCE, Classical Sanskrit ~500 BCE, or Middle Indo-Aryan) are also found, such as datlom "sickle" < Vedic/Classical Sanskrit d̪at̪ra-m "sickle-SG.N.NOM/ACC" (cf. Pali d̪at̪t̪a "sickle", Bengali দা d̪a "blade"). Santali is also a source of borrowings into several regional Indo-Aryan languages, namely Sadri, Khortha and Kurmali, e.g. Khortha gidʌr "child" < Santali gidrə, Kurmali nisʈai "exactly, truly" < Santali niʈsahi.[111]

A limited number of words are shared between Kuṛux and Santali, e.g. Kuṛux kʰotā "nest" and Santali tukə "nest", Kux. ura "beetle" and Sat. uru "beetle", Kux. busū "straw" and Sat. busupˀ "straw", but they are difficult to analyze because their cognates also appear across other Munda and Indo-Aryan languages.[112] A very small number of lexical items appear to be shared between Munda and Tibeto-Burman, and these likely represent the remaining traces of earlier contact between the two groups,[113] e.g. Tshobdun snəm "oil" and Santali sunum "oil", Limbu pɛːr "to fly" and Sat. apir "to fly", Lepcha pok "to throw" and Sat. tapaʔ "to throw".

As for the Austroasiatic lexicon, most Santali terms share their origins with other Austroasiatic languages, including words containing aspirated phonemes.

Text 1: Article 1 of the Universal Declaration of Human Rights


The following text is Article 1 of the Universal Declaration of Human Rights, written in Santali:

ᱡᱚᱛᱚ ᱞᱮᱠᱟᱱᱚ ᱢᱳᱱᱚ ᱟᱨᱚ ᱚᱫᱷᱤᱠᱟᱨᱚ ᱨᱮᱭᱟᱠᱚ ᱟᱫᱷᱟᱨᱚ ᱨᱮ ᱢᱩᱪᱳᱛᱚ ᱫᱷᱟᱵᱤᱪᱚ ᱥᱣᱚᱛᱚᱱᱛᱨᱚ ᱟᱨᱚ ᱥᱩᱢᱟᱱᱚ ᱠᱳ ᱦᱩᱭᱩᱠᱚᱟ ᱾ ᱩᱱᱚᱠᱳ ᱦᱳ ᱵᱩᱫᱫᱷᱤ ᱟᱨᱚ ᱵᱩᱡᱷᱚᱦᱚᱩ ᱠᱳ ᱟᱜᱩ ᱛᱳᱨᱟ ᱵᱟᱠᱟ ᱫᱟᱱᱮᱪᱚ ᱟᱨᱚ ᱢᱤᱠᱚ ᱦᱚᱰᱚ ᱟᱨᱚ ᱫᱳᱥᱟᱨᱚ ᱦᱚᱰᱚ ᱨᱚ ᱟᱯᱱᱟᱨᱚ ᱨᱮᱭᱟᱠᱚ ᱣᱭᱚᱣᱚᱦᱟᱨᱚ ᱦᱩᱭᱩᱠᱚ ᱡᱳᱨᱩᱰᱟ ᱾[115]

जत लेकान मोन आर अधिकार रेयाक आधार रे मुचोत धाबिच स्वतन्त्र आर सुमान को हुयुकआ। उनको हो बुद्धि आर बुझहौ को आगु तोरा बाका दानेच आर मिक हड आर दोसार हड र आप्नार रेयाक व्यवहार हुयुक जोरुडा॥[116]

All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

Glossed and translated


dʒɔtɔ

all

lekan

kind

mon

human

ar

and

ɔdhikar-reakˀ

rights-GEN

adhar-rɛ

dignity-LOC

mutʃot

birth

dhabitʃˀ

at

swɔtɔntrɔ

free

ar

and

suman-ko

equal-3PL

huju-ʔ=ko=a

COP-MID=3PL.SUBJ=IND

'All human beings, with respect to rights and in dignity, at birth free and equal are.'

unko

They

hɔ̃

also

buddhi

intelligence

ar

and

budʒhhɔ=ko

judgment=3PL

əgu-tor-a

bring-with-IND

baka

therefore

danetʃˀ

one.another

ar

and

mitˀ

one

hɔɽ

man

ar

and

dɔsar

other

hɔɽ

person

towards

apnar-reakˀ

own-GEN.INAN

jawɔhar

behavior

huju-kˀ

COP-MID

dʒəruɽa

should

'They also reason and understanding given-with; therefore toward one another, one person toward another person, in their own conduct should act.'

The following Santali story was narrated by a Santal man, age 40, from Jitpur village, Jamtara, Santal Parganas, Jharkhand, and was collected, translated, and annotated by Ghosh (2008).[117]

Kɔki gɔ "Stepmother"

mitˀ

one

tɛtʃˀ

CLF

tʃasa-hɔɽ=e

farmer-man=3SG.SUBJ

tahɛ̃kan-a

COP.IMPERF-IND

There was a farmer.

uni-rɛn

he-GEN

ɛra

wife

dɔ

TOP

əɖi

very

khaʈoa

diligent

hɔɽ=e

person=3SG.SUBJ

tahɛ̃kan-a

COP.IMPERF-IND

He had a diligent wife.

unkin-rɛn

they.DU-GEN

mitˀ

one

tɛtʃˀ

CLF

koɽa

boy

gidrə

child

tahɛ̃kan-takin-a

COP.IMPERF-3DU.POSS-IND

They had a son.

unkin

they.DU

dɔ

TOP

din-ge

day-FOC

əɖi

very

kurumuʈu=kin

diligently=3DU.SUBJ

kəmi-ja

work-IND

mɛnkhan

but

tʃheka-katɛ=je

how-CONV=3SG.SUBJ

mitˀ

one

din

day

uni

s/he

hɔɽ-rɛn

man-GEN

orakˀ

house

boŋga

goddess

dɔ

TOP

bɔhɔkˀ

head

latʃˀ

stomach

haso

pain

ɲam-ke-d-e-a

get-ACT.AOR-TR-3SG.OBJ-IND

They two work very hard, but one day for unknown reason the man's wife was affected by pain in stomach and head.

ar

and

atʃka

suddenly

ge

EMPH

unkin

they.DU

apa

father

hɔn

son

bəgi-atˀ-kin-a=e

leave-ACT.APPL.PST-3DU.OBJ-IND=3SG.SUBJ

Soon she passed away, leaving behind her husband and son.

khan

then

əɖi

very

duk-rɛ=kin

trouble-LOC=3DU.SUBJ

paɽao-en-a

fall-MID.AOR-IND

Thereafter these two faced lots of problems.

kəmi

work

hɔ̃

even

nit

now

dɔ

TOP

baŋ=kin

NEG=3DU.SUBJ

kəmi-daɽe-akˀ-kan-a

work-ABIL-MID-IPFV-IND

They could not even go to work.

tʃedakˀ

because

dʒe

that

majdʒiu

woman

kəmi-ko

worker-PL

dɔ

TOP

ar

and

unkin

they.DU

oɽakˀ-rɛ

house-LOC

ekkal

absolutely

bənu-kˀ-ko-wa

NEG.COP-MID-3PL.OBJ-IND

The absence of a woman to take care of household chores led to this situation.

adɔ

thereupon

ajma

much

hudis-baɽa-katɛ

think-about-CONV

mitˀ

one

tetʃˀ

CLF

kəki-gɔ=kin

aunt-mother=3DU.SUBJ

saŋgha

marry.a.widow

əgu-ke-d-e-a

bring-ACT.AOR-TR-3SG.OBJ-IND

Thereupon after a lot of contemplation, they brought a stepmother for the boy by sangha marriage.

uni

he

əgu-katɛ

bring-CONV

thora

some

din

day

dɔ

TOP

thik-ge

right-EMPH

din=ko

day=3PL.SUBJ

khema-ke-d-a

pass-ACT.AOR-TR-IND

After she arrived, things worked well for some time.

inə

just.that

tajɔm

after

dɔ

TOP

uni

that

gidrə

boy

kəki

aunt

gɔ-akˀ

mother-GEN

mɛtˀ

eye

samaŋ-rɛ

front-LOC

dɔ

TOP

əɖi

very

sikiɽ-ge

hate-EMPH

ɲɛl-e-a

see-3SG.OBJ-IND

After some time the boy was looked upon with hatred by the stepmother.

adɔ

thereupon

ona-tɛ

that-INS

mitˀ

one

din

day

uni

that

ajo

woman

dɔ

TOP

atʃ-rɛn

3SG-GEN

hɛrɛl-tɛtˀ=e

husband-DEF=3SG.SUBJ

met-a-e-kan-a

say-ACT.APPL-3SG.OBJ-IPFV-IND

nui

this

gidrə

boy

do

TOP

jãhã-sɛn

any-to

idi-oʈo-ka-e-mɛ

drive-away-BEN-3SG.OBJ-2SG.IMP

ar

and

baŋkhan

otherwise

gotʃˀ-giɖi-ka-e-me

kill-off-BEN-3SG.OBJ-2SG.IMP

One day the woman told her husband, "Drive away the child anywhere or otherwise kill him."

khan

then

hɛrɛl-tɛtˀ

husband-DEF

ona

that

katha

word

aɲdʒɔm-toraj

listen-away

tiŋgitˀ-gɔtˀ-en-a

deafen-TEL-MID.AOR-IND

Having heard that the husband felt as if he had been deafened.

ar

and

mɔnɛ

mind

mɔnɛ-tɛ

mind-LOC

mɛn-jɔŋ-an-a

say-MPASS-MID.APPL.PST-IND

dʒe

that

nui

this

ajo

woman

dɔ

TOP

tʃit

what

katha=e

word=3SG.SUBJ

met-ad-iɲ-a

say-ACT.APPL.PST-1SG.OBJ-IND

He reflected over what the woman told him.

adɔ

then

əɖi

very

hudis-rɛ=e

think-LOC=3SG.SUBJ

parao-en-a

fall-MID.AOR-IND

He then fell into deep thought.

adɔ=e

then=3SG.SUBJ

kuli-ruəɽ-ke-d-e-a

ask-return-ACT.AOR-TR-3SG.OBJ-IND

tʃedakˀ-em

why=2SG.SUBJ

əɽis-a-e-kan-a?

worry-ACT.APPL.PRES-3SG.OBJ-IPFV-IND

He asked her, "What bothers you about the boy?"

adɔ

then

uni

that

ajo=e

woman=3SG.SUBJ

rɔɽ-ruəɽ-ke-d-a

tell-return-ACT.AOR-TR-IND

'hɛ̃,

yes

iɲ

I

dɔ

TOP

əɽis-gi=ɲ

worry-EMPH=1SG.SUBJ

ɲɛl-e-kan-a'

see-3SG.OBJ-IPFV-IND

Then she replied, "I am looking at him with great fear."

khan

then

hɛrɛl-tɛtˀ=e

husband-DEF=3SG.SUBJ

mɛn-ke-d-a

say-ACT.AOR-TR-IND

'am-ge

you-EMPH

ləi-mɛ

tell-2SG.IMP

tɔbe

then

tʃikə-katɛ=ɲ

how-CONV=1SG.SUBJ

gɔdʒ-e-a'

kill-3SG.OBJ-IND

He told her, "Suggest to me how to kill him."

adɔ

then

uni

that

ajo=e

woman=3SG.SUBJ

ləi-a-e-kan-a,

tell-ACT.APPL-3SG.OBJ-IPFV-IND

'am-ak'

you-GEN

isi

plough

dɔ

TOP

dʒɔkhɔn

when

si-okˀ-ben

plough-MID=2DU.SUBJ

dʒɔɽao-idi-a

link-continue-IND

un

that

dʒɔkhen

time

gidrə

boy

dɔ

TOP

laha-ka-e-mɛ

front-BEN-3SG.OBJ-2SG.IMP

ar

and

am

you

dɔ

TOP

tajɔm-re

behind-LOC

si-mɛ

plough-2SG.IMP

ar

and

am-rɛn

you-GEN

ɖaŋ-ra

bullock

khub

very

laga

drive

laga-kin-mɛ

drive-3DU.OBJ-2SG.IMP

She suggested, "When you two go to work in the field, you should link your plough then keep the child in front of you, and you plough from behind driving your bullocks very hard."

un

that

dʒɔkhen-ge

time-EMPH

uni

that

gidrə

boy

dɔ

TOP

ona

that

isi-tɛ=e

plough-INS=3SG.SUBJ

gutu

insert

gɔdʒɔ-kˀ-a

kill-MID-IND

"Then the child will die being pierced by the yoke," she continued.

khan

then

ona

that

aɲdʒɔm-katɛ

listen-CONV

goʈa

whole

bəd

upland

bəjhar=kin

low.land=3DU.SUBJ

si-tʃaba-ke-d-a

plough-finish-ACT.AOR-TR-IND

After that the father and son ploughed up the whole high land and low land for many days.

mɛnkhan

but

uni

that

gidrə

boy

gɔdʒ-e-ləgitˀ

kill-3SG.OBJ-for

okte-ge

time-EMPH

baŋ

NEG

hɛtʃˀ-len-a

come-MID.ANT-IND

But he never got around to killing his son.

khan

then

ajo=e

woman=3SG.SUBJ

mɛn-ke-d-a,

say-ACT.AOR-TR-IND

'saman

whole

khɛt=ben

land=2DU.SUBJ

si-tʃaba-ke-d-a

plough-finish-ACT.AOR-TR-IND

adɔ

yet

ɛnrɛ

still

hɔ̃

even

nui

this

gidrə

boy

dɔ

TOP

ba=m

NEG=2SG.SUBJ

gɔtʃˀ-daɽe-ad-e-a'

kill-ABIL-ACT.APPL.PST-3SG.OBJ-IND

The woman told the farmer, "You have ploughed the whole field and still you could not kill the boy."

khan

then

ona

that

ɔkte

time

uni

that

ajo

woman

gidrə

boy

gɔdʒ-e-ləgitˀ

kill-3SG.OBJ-for

mitˀ

one

ʈɛtʃˀ

CLF

kuɽpaɽ-ke-d-a

make.suggestion-ACT.AOR-TR-IND

Then the woman made another suggestion on how to kill the boy.

met-ad-e-a=e

say-ACT.APPL.PST-3SG.OBJ-IND=3SG.SUBJ

'hana

that.far

ʈənɖi-rɛ

plains-LOC

gundli=bon

millet=1PL.SUBJ

tʃas-akatˀ

cultivate-ACT.PRF

ona-ge

that-EMPH

si-ben'

plough-2DU.IMP

She told the farmer, "In that far off plain (where) you plough for our millet cultivation."

un

that

dʒɔkhɛtʃˀ-ge

time-EMPH

uni

that

gidrə

boy

dɔ

TOP

ona

that

isi-tɛ

plough-INS

sɔb-ɔkˀ

pierce-MID

gɔtʃˀ-otʃo-je-m

die-CAUS-3SG.OBJ-2SG.IMP

"At that time when you are ploughing that field, let the boy die by being pierced by the yoke", she said.

ar

and

ona

that

dʒajga-rɛ

place-LOC

dɔ

TOP

ɛʈakˀ

any

tʃas=laŋ

cultivate=1DU.SUBJ

lagao-a

apply-IND

"And in that piece of land we will cultivate other crops."

adɔ

thereupon

uni

that

hɛrɛl-tɛtˀ

husband-DEF

dɔ

TOP

ona-ge

that-EMPH

hɛ̃-ad-a

yes-ACT.APPL.PST-IND

The farmer told her he would do as she told him.

khan

then

dɔsar

second

hilokˀ-ge

day-EMPH

setakˀ-ge

morning-LOC

gidrə=e

boy=3SG.SUBJ

met-a-e-kan-a

say-ACT.APPL-3SG.OBJ-IPFV-IND

'dɛlaŋ-ʈa

look-DEF

si-okˀ=laŋ

plough-INAN=1DU.SUBJ

idi-a'

take-IND

The next morning he told his son, "Look child, we will take the plough."

'ona

that

gundli-laŋ

millet=1DU.SUBJ

si-otʃo-g-a

plough-CAUS-MID-IND

ar

and

ɛʈakˀ

another

tʃas=bon

cultivate=1PL.SUBJ

lagao-a'

apply-IND

"We will plough up the millet and cultivate another crop."

khan

then

koɽa

boy

gidrə

child

rɔɽ-ruəɽ-ke-d-a

tell-return-ACT.AOR-TR-IND

'hɛnda

oh

baba,

father

ona

that

gundli

millet

ma

MOD

bili-dʒut-akan-a'

ripe-properly-MID.PRF-IND

Then the boy replied, "Oh father, that millet has ripened."

gapa

tomorrow

mɛaŋ

after.tomorrow

khan-ge

then-EMPH

dʒɔm-dʒut-ukˀ-a

eat-suitable-MID-IND

"Tomorrow or the day after it will be edible."

ona

that

dɔ

TOP

tʃedakˀ-laŋ

why=1DU.SUBJ

si-bəridʒ-a

plough-waste-IND

"Why shall we waste the crop by ploughing?"

khan

then

uni

that

gidrə-rɛn

boy-GEN

apa-tˀ-tetˀ=e

father-INAL-DEF=3SG.SUBJ

hudis-ke-d-a,

think-ACT.AOR-TR-IND

'səri-ge

right-EMPH

nui

this

gidrə

boy

dɔ

TOP

bhage

good

solha-ge=i

advice-EMPH=3SG.SUBJ

ləi-a-ɲ-kan-a'

tell-ACT.APPL-1SG.OBJ-IPFV-IND

The farmer thought, "The child is giving me advice in good spirit."

adɔ

thereupon

mɔnɛ

mind

mɔnɛ-tɛ

mind-LOC

hudis-dʒɔn-kan-a=e

think-MPASS-IPFV-IND=3SG.SUBJ

He then made up his mind.

hudis-katɛ

think-CONV

gidrə-rɛn

boy-GEN

apa-t-tɛtˀ

father-INAL-DEF

dɔ

TOP

atʃˀ-rɛn

3SG-GEN

ajo=e

wife=3SG.SUBJ

met-ad-e-a,

say-ACT.APPL.PST-3SG.OBJ-IND

'iɲ

I

dɔ

TOP

nui

this

gidrə

boy

dɔ

TOP

oho=ɲ

NEG.EMPH=1SG.SUBJ

gɔtʃˀ-daɽe-ke-a'

kill-ABIL-ACT.OPT-IND

He told his wife, "I can never kill this boy."

ona

that

aɲdʒɔm-sãotɛ

listen-with

uni

that

ajo

wife

dɔ

TOP

əɖi

very

raŋgao-gɔtˀ-en-a=e

get.angry-TEL-MID.AOR-IND=3SG.SUBJ

ar

and

boge-tɛ=kin

good-INS=3DU.SUBJ

dʒhogɽa-en-a

quarrel-MID.AOR-IND

On hearing that the woman became very angry and quarreled with him.

adɔ

then

uni

that

ajo

woman

dɔ=e

TOP=3SG.SUBJ

laga-giɖi-kad-e-a

drive-away-ACT.BEN.PST-3SG.OBJ-IND

Then he drove the woman away.

mutʃɛtˀ-en-a

finish-MID.AOR-IND

The story ends here.

  1. ^ Note that flexibility is mostly a North Munda/Jharkhandi phenomenon. South Munda languages such as Remo, Sora, and Gorum exhibit much less flexibility than North Munda and Kharia.[92] For instance, modifiers (i.e. "adjectives") cannot take TAM/person marking and must (in some languages optionally) be accompanied by copula verbs in predicational sentences:
    1). Remo

    ɖio

    house

    seroʔ

    dirty

    (ɖi-ta)

    COP-MID.NPST

    'The house is dirty'

    2). Sora

    anin

    he

    kuddɨb

    all

    kəndud-ən-dʒi

    frog-NSFX-PL

    sɨrɛŋ

    from

    anin

    he

    suɽa

    big

    'He is bigger than all the other frogs.'

  1. ^ "Statement 1: Abstract of speakers' strength of languages and mother tongues – 2011". www.censusindia.gov.in. Office of the Registrar General & Census Commissioner, India. Archived from the original on 16 July 2019. Retrieved 7 July 2018.
  2. ^ Santali at Ethnologue (21st ed., 2018)
    Mahali at Ethnologue (21st ed., 2018)
  3. ^ "P and AR & e-Governance Dept" (PDF). wbpar.gov.in. Retrieved 10 January 2021.
  4. ^ "Redirected". 19 November 2019. Archived from the original on 9 May 2019. Retrieved 9 May 2019.
  5. ^ a b c d Santali at Ethnologue (18th ed., 2015) (subscription required)
    Mahali at Ethnologue (18th ed., 2015) (subscription required)
  6. ^ a b "Distribution of the 22 Scheduled Languages". censusindia.gov.in. Census of India. 20 May 2013. Archived from the original on 7 February 2013. Retrieved 26 February 2018.
  7. ^ a b Zide, Norman (1999). "Three Munda scripts". Linguistics of the Tibeto-Burman Area. 22 (2): 199–232. doi:10.32655/LTBA.22.2.13.
  8. ^ Ghosh (2008), p. 20.
  9. ^ a b Ghosh (2008), p. 25.
  10. ^ a b Hildebrandt, Kristine; Anderson, Gregory D. S. (2023). "Word Prominence in Languages of Southern Asia". In Hulst, Harry van der; Bogomolets, Ksenia (eds.). Word Prominence in Languages with Complex Morphologies. Oxford University Press. pp. 520–564. doi:10.1093/oso/9780198840589.003.0017. ISBN 978-0-19-884058-9.
  11. ^ a b Ghosh (2008), p. 75.
  12. ^ Ghosh (2008), p. 13.
  13. ^ Ghosh (2008), p. 12.
  14. ^ van Driem, George (2022). Languages of the Himalayas: Volume 1. Brill. p. 275. ISBN 9789004514911.
  15. ^ Sidwell, Paul. 2018. Austroasiatic Studies: state of the art in 2018. Archived 22 May 2018 at the Wayback Machine Presentation at the Graduate Institute of Linguistics, National Tsing Hua University, Taiwan, 22 May 2018.
  16. ^ a b c Ghosh (2008), p. 19.
  17. ^ Hembram, Phatik Chandra (2002). Santhali, a Natural Language. U. Hembram. p. 165.
  18. ^ Choksi, Nishaant (2 January 2018). "Script as constellation among Munda speakers: the case of Santali". South Asian History and Culture. 9 (1): 92–115. doi:10.1080/19472498.2017.1411064. ISSN 1947-2498.
  19. ^ "Ol Chiki (Ol Cemet', Ol, Santali)". Scriptsource.org. Archived from the original on 27 November 2015. Retrieved 19 March 2015.
  20. ^ "Santali Localization". Andovar.com. Archived from the original on 17 March 2016. Retrieved 19 March 2015.
  21. ^ "When Murmu's meeting with Vajpayee ensured constitutional recognition to Santhali language". Deccan Herald. Retrieved 19 June 2025.
  22. ^ "Syllabus for UGC NET Santali, Dec 2013" (PDF). Archived (PDF) from the original on 6 November 2018. Retrieved 4 January 2020.
  23. ^ "C-16: Population by mother tongue, India - 2011". Office of the Registrar General & Census Commissioner, India.
  24. ^ "SCHEDULED LANGUAGES IN DESCENDING ORDER OF SPEAKERS' STRENGTH - 2011" (PDF). census.gov.in. Archived (PDF) from the original on 9 October 2022. Retrieved 17 December 2019.
  25. ^ "ABSTRACT OF SPEAKERS' STRENGTH OF LANGUAGES AND MOTHER TONGUES - 2011" (PDF). census.gov.in. Archived (PDF) from the original on 14 November 2018. Retrieved 17 December 2019.
  26. ^ "PART-A: DISTRIBUTION OF THE 22 SCHEDULED LANGUAGES-INDIA/STATES/UNION TERRITORIES - 2011 CENSUS" (PDF). census.gov.in. Archived (PDF) from the original on 15 April 2022. Retrieved 17 December 2019.
  27. ^ "Santhali". Ethnologue. Archived from the original on 25 May 2020. Retrieved 4 January 2020.
  28. ^ "Santhali becomes India's first tribal language to get own Wikipedia edition". Hindustan Times. 9 August 2018. Archived from the original on 22 February 2019. Retrieved 22 February 2019.
  29. ^ "Second language". India Today. 22 October 2011. Archived from the original on 14 February 2022. Retrieved 5 November 2019.
  30. ^ Roy, Anirban (27 May 2011). "West Bengal to have six more languages for official use". India Today. Archived from the original on 6 March 2023. Retrieved 5 November 2019.
  31. ^ "Glottolog 3.2 – Santali". glottolog.org. Archived from the original on 9 July 2018. Retrieved 26 February 2018.
  32. ^ "Santali: Paharia language". Global recordings network. Archived from the original on 3 December 2018. Retrieved 26 February 2018.
  33. ^ Kobayashi, Masato; Osada, Toshiki; Murmu, Ganesh (2003). "Report on a Preliminary Survey of the Dialects of Kherwarian Languages". Journal of Asian and African Studies. 66: 331–364.
  34. ^ Ghosh (2008), p. 17.
  35. ^ Anderson (2007).
  36. ^ Minegishi, Makoto (1990). "Santali–English–Japanese wordlist: A Preliminary Report". アジア・アフリカ言語文化研究 (Journal of Asian and African Studies). 39: 69–84.
  37. ^ Osada (1996), p. 246.
  38. ^ Osada (1996), p. 247.
  39. ^ a b Horo, Luke; Anderson, Gregory D. S. (2025), Outstanding issues in Santali vocalism and vowel harmony. Paper presented at the 13th International Conference on Austroasiatic Linguistics, October 29-31, 2025, University of Hawaiʻi Press
  40. ^ Anderson (2014), p. 375.
  41. ^ Ghosh (2008), p. 23.
  42. ^ Ghosh (2008), p. 30.
  43. ^ a b Horo, Luke; Anderson, Gregory D. S.; Harrison, K. David (2024). "Vowel Harmony in the Munda Languages". In Hulst, Harry van der; Ritter, Nancy A. (eds.). The Oxford Handbook of Vowel Harmony. Oxford University Press. pp. 723–728. doi:10.1093/oxfordhb/9780198826804.013.57. ISBN 978-0-19-882680-4.
  44. ^ Neukom (2001), pp. 13, 17.
  45. ^ Lier, Eva van, ed. (2023). The Oxford Handbook of Word Classes. Oxford University Press. p. 210. doi:10.1093/oxfordhb/9780198852889.001.0001. ISBN 978-0-19188-7-185.
  46. ^ Ghosh (2008), p. 32.
  47. ^ Ghosh (2008), pp. 32–33.
  48. ^ Ghosh (2008), pp. 34–38.
  49. ^ Ghosh (2008), p. 38.
  50. ^ a b Ghosh (2008), p. 39.
  51. ^ Ghosh (2008), p. 40.
  52. ^ a b Ghosh (2008), p. 41.
  53. ^ Ghosh (2008), p. 43.
  54. ^ Ghosh (2008), p. 44.
  55. ^ Ghosh (2008), p. 45.
  56. ^ "Santali". The Department of Linguistics, Max Planck Institute (Leipzig, Germany). 2001. Archived from the original on 1 December 2017. Retrieved 27 November 2017.
  57. ^ Ghosh (2008), p. 48.
  58. ^ Ghosh (2008), pp. 48–50.
  59. ^ Ghosh (2008), p. 50.
  60. ^ Ghosh (2008), p. 51.
  61. ^ a b Ghosh (2008), p. 52.
  62. ^ Neukom (2001), p. 60.
  63. ^ Ghosh (2008), p. 53ff.
  64. ^ Subbarao & Everaert (2021), pp. 122, 129.
  65. ^ Rau (2013), p. 169.
  66. ^ Neukom (2001), pp. 13–16.
  67. ^ a b Rau (2013), p. 179.
  68. ^ Rau (2013), p. 182.
  69. ^ Neukom (2001), p. 173.
  70. ^ a b c d Ghosh (2008), p. 54.
  71. ^ Rau (2013), p. 181.
  72. ^ Ghosh (2008), p. 64.
  73. ^ Ghosh (2008), p. 33.
  74. ^ Choksi, Nishaant (2021). "Structure, Ideology, Distribution: The Dual as Honorific in Santali". Linguistic Anthropology. 31 (3): 382–395. doi:10.1111/jola.12343.
  75. ^ Subbarao & Everaert (2021), pp. 111–112.
  76. ^ Neukom (2001), p. 168.
  77. ^ Neukom (2000), p. 97.
  78. ^ a b Rau (2013), p. 173.
  79. ^ Rau (2013), p. 176.
  80. ^ Rau (2013), p. 177.
  81. ^ Ghosh (2008), p. 85.
  82. ^ Ghosh (2008), p. 66.
  83. ^ Ghosh (2008), p. 68.
  84. ^ a b Ghosh (2008), p. 69.
  85. ^ a b Ghosh (2008), p. 70.
  86. ^ a b Ghosh (2008), p. 72.
  87. ^ Rau (2013), p. 178.
  88. ^ Dash (2025), p. 126.
  89. ^ Dash (2025), p. 127.
  90. ^ Dash (2025), p. 128.
  91. ^ Rau (2013), p. 184.
  92. ^ Rau (2013), p. 170.
  93. ^ Rau (2013), p. 183.
  94. ^ Anderson (2007), pp. 247–248.
  95. ^ Anderson (2007), p. 262.
  96. ^ a b Ghosh (2008), p. 73.
  97. ^ Ghosh (2008), pp. 66–67.
  98. ^ Neukom (2001), p. 170.
  99. ^ Anderson & Jora (2020), pp. 249–252.
  100. ^ Ghosh (2008), p. 74.
  101. ^ Kisku, Murmu & Choksi (2020), p. 224.
  102. ^ Kisku, Murmu & Choksi (2020), p. 225.
  103. ^ Ghosh (2008), p. 76.
  104. ^ Anderson & Jora (2020), p. 253.
  105. ^ Ghosh (2008), p. 78.
  106. ^ Ghosh (2008), p. 79.
  107. ^ Ghosh (2008), p. 81.
  108. ^ Ghosh (2008), p. 83.
  109. ^ a b Ghosh (2008), p. 84.
  110. ^ Ghosh (2008), p. 88.
  111. ^ Paudyal & Peterson (2021), p. 348.
  112. ^ Osada (1996), p. 257.
  113. ^ Anderson (2014), p. 403.
  114. ^ Sidwell, Paul (2024). "500 Proto Austroasiatic Etyma: Version 1.0". Journal of the Southeast Asian Linguistics Society. 17 (1): i–xxxiii. hdl:10524/52519.
  115. ^ "Universal Declaration of Human Rights: Santali" [ᱢᱟᱱᱣᱟ ᱟᱹᱭᱫᱟᱹᱨᱤ ᱨᱮᱭᱟᱜ ᱥᱟᱱᱟᱢ ᱡᱟᱹᱛ ᱨᱮᱭᱟᱜ ᱜᱷᱚᱥᱚᱬᱟ] (PDF) (in Santali). Office of the United Nations High Commissioner for Human Rights. p. 2. Archived (PDF) from the original on 10 January 2025. Retrieved 16 February 2026.
  116. ^ "Santali Alphabet and Language". Omniglot.com. Retrieved 16 February 2026.
  117. ^ Ghosh (2008), pp. 92–93.
  • Ghosh, Arun (2008). "Santali". In Anderson, Gregory D.S. (ed.). The Munda Languages. London: Routledge. pp. 11–98.

Linguistic journals


Comparative studies


Grammars and primers
