Corpus of Gāndhārī Texts
Stefan Baums and Andrew Glass

The foundation for our work on this site is a digital corpus of all published Gāndhārī texts. We began compiling this corpus jointly in 2002 and brought coverage to completion in 2014 (see Baums & Glass 2013 for a brief history of our work, as well as our Blog). Going forward, we continue to update the corpus as new material is discovered and published, to add improved documentation to each text, and to carry out a comprehensive linguistic analysis of the full range of Gāndhārī texts that we have assembled. The Gāndhārī manuscripts discovered in the last twenty years (most of which remain unpublished) are making especially significant contributions to the corpus, but new inscriptions and coins from Gandhāra and wooden documents from Central Asia also continue to be found.

Gāndhārī corpus
Sources: Hultzsch 1925, British Library, British Library, Stefan Baums.

The current numbers of items and word tokens in our corpus are as follows:

388 manuscripts and manuscript fragments
1,162 inscriptions
901 administrative documents
337 coin legends
=2,788 items (119,303 word tokens)
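
The total above can be checked from the four category counts. The following is a minimal Python sketch; the dictionary keys and variable names are illustrative assumptions and do not reflect how the corpus itself stores its data.

```python
# Item counts per category, copied from the list above.
# The key names are hypothetical labels chosen for this sketch.
counts = {
    "manuscripts and manuscript fragments": 388,
    "inscriptions": 1162,
    "administrative documents": 901,
    "coin legends": 337,
}
word_tokens = 119303  # total word tokens stated above

# Sum the categories and derive the average number of tokens per item.
total_items = sum(counts.values())
tokens_per_item = word_tokens / total_items

print(f"{total_items:,} items ({word_tokens:,} word tokens)")
print(f"about {tokens_per_item:.1f} word tokens per item on average")
```

The average is only a rough indicator, since manuscripts and coin legends differ greatly in length.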

As we add new material to our corpus, we correct and improve the published texts where necessary; all such changes (currently a total of 3,856) are documented and justified in footnotes (see further here). We also normalize the transliteration system and text-critical marks to the standard set out in the preface of our Dictionary of Gāndhārī. Historical editions of texts are currently presented in plain-text form, following the conventions of their original editors (see, for example, the relic inscription CKI 266). We are gradually integrating these editions more tightly with the accepted text, so that the history of individual readings and interpretations can be traced. In addition to linguistically analyzed documents and metadata, our corpus also contains images of the texts and inscribed objects (currently a total of 2,431). Following the same upgrade, these images will be linked to the transliterated texts to allow easy verification of the proposed readings.

Our corpus of Gāndhārī texts can be explored in three ways. If the assigned catalog number of an object or text is known, or other details such as its findspot or general content, the item (and related items) can be located through our Catalog of Gāndhārī Texts. If distinctive words or phrases from a text are known, these can be searched for in our Dictionary of Gāndhārī or through the full-text search function in the Catalog section. Finally, if it is known where a text was published, the citation key given in our Bibliography of Gāndhārī Studies can be searched for in the Catalog.