{"id":88,"date":"2013-04-01T17:49:48","date_gmt":"2013-04-01T15:49:48","guid":{"rendered":"http:\/\/gandhari.org\/blog\/?p=88"},"modified":"2016-10-14T04:42:37","modified_gmt":"2016-10-14T02:42:37","slug":"lemmatization","status":"publish","type":"post","link":"https:\/\/gandhari.org\/blog\/?p=88","title":{"rendered":"Lemmatization"},"content":{"rendered":"<p>Over the last few weeks, we collected all word instances in the Gandhari.org text collection that currently have Sanskrit parallels associated with them (3,404 items). We arranged these under 1,960 separate lemmata, and I associated each lemma with a standardized G\u0101ndh\u0101r\u012b spelling. The result is now available from the Dictionary \u2192 GD submenu, while the old Dictionary search interface (based on word forms rather than lemmata) continues to be available internally under Dictionary \u2192 Index. At this point I would like to solicit feedback from the users of Gandhari.org on my proposed standardized spelling for lemmata. The spelling basically uses only those graphemes that are common to the various G\u0101ndh\u0101r\u012b orthographies; applies anusv\u0101ra consistently; uses the spelling <i>sp<\/i> for the reflex of OIA sibilant + <i>m<\/i> or <i>v<\/i>; uses <i>g<\/i> for the lenition product of velar stops, but <i>y<\/i> for the lenition product of palatal stops and original <i>y<\/i>; and does not mark optional palatalizations. My aim is to provide a standardized spelling that is as central to the overall G\u0101ndh\u0101r\u012b tradition as possible, while being as helpful as possible to the users of our Dictionary. Please have a look through the list of lemmata that is now online and let me know whether you think I have come close to these goals.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the last few weeks, we collected all word instances in the Gandhari.org text collection that currently have Sanskrit parallels associated with them (3,404 items). 
We arranged these under 1,960 separate lemmata, and I associated each lemma with a standardized G\u0101ndh\u0101r\u012b spelling. The result is now available from the Dictionary \u2192 GD submenu, while the <a class=\"read-more\" href=\"https:\/\/gandhari.org\/blog\/?p=88\">[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/88"}],"collection":[{"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=88"}],"version-history":[{"count":1,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/88\/revisions"}],"predecessor-version":[{"id":318,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/88\/revisions\/318"}],"wp:attachment":[{"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=88"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=88"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gandhari.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=88"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}