Tuesday, 9 December 2014

What is so ‘scientific’ about Sanskrit? #SeriousQuestion

@zeusisdead posted this query on Twitter, and that finally prompted me to collate all the material I could find on this topic, because it has bothered me for a long time.
“Sanskrit is the ideal language for computer science” is a view that is so widespread in India, that my mother, who is 70, and knows little about Sanskrit, and even less about computer science, passionately believes this, and I can’t convince her otherwise. Indians are in love with the concept that things invented in India 2000 years ago are still better than the best that the western world can throw at us today.
A broader question is the one that ZeusIsDead asked: what is so ‘scientific’ about Sanskrit?
As far as I can tell, there are two interesting aspects to Sanskrit:
  • Sanskrit is the first language to have a formal grammar defined; and there is evidence that Pāṇini’s work in this area influenced modern linguists like de Saussure and Chomsky. (And oh, Devanagari is awesome)
  • One guy in NASA in the 80s tried to push Sanskrit as an ideal language for Artificial Intelligence applications; he was neither able to convince the AI community of this, nor was he able to make much headway in this himself. This approach is largely dead, but Indian media and the ancient-Indians-were-the-best crowd did not get the memo.
In short: Pāṇini’s Grammar for Sanskrit was a phenomenal work that probably influenced modern linguists, but it is not particularly useful in Computer Science.

Influence of Sanskrit on Modern Linguistics

From the Wikipedia page on Pāṇini:
Pāṇini’s work became known in 19th-century Europe, where it influenced modern linguistics initially through Franz Bopp, who mainly looked at Pāṇini. Subsequently, a wider body of work influenced Sanskrit scholars such as Ferdinand de Saussure, Leonard Bloomfield, and Roman Jakobson. Frits Staal (1930-2012) discussed the impact of Indian ideas on language in Europe. After outlining the various aspects of the contact, Staal notes that the idea of formal rules in language – proposed by Ferdinand de Saussure in 1894 and developed by Noam Chomsky in 1957 – has origins in the European exposure to the formal rules of Pāṇinian grammar
How exactly did this influence modern linguists?
In particular, de Saussure, who lectured on Sanskrit for three decades, may have been influenced by Pāṇini and Bhartrihari; his idea of the unity of signifier-signified in the sign somewhat resembles the notion of Sphoṭa. More importantly, the very idea that formal rules can be applied to areas outside of logic or mathematics may itself have been catalyzed by Europe’s contact with the work of Sanskrit grammarians
Here, an important connection to computer science also can be seen:
Pāṇini’s grammar is the world’s first formal system, developed well before the 19th century innovations of Gottlob Frege and the subsequent development of mathematical logic. In designing his grammar, Pāṇini used the method of “auxiliary symbols”, in which new affixes are designated to mark syntactic categories and the control of grammatical derivations. This technique, rediscovered by the logician Emil Post, became a standard method in the design of computer programming languages. Sanskritists now accept that Pāṇini’s linguistic apparatus is well-described as an “applied” Post system. Considerable evidence shows ancient mastery of context-sensitive grammars, and a general ability to solve many complex problems. Frits Staal has written that “Pāṇini is the Indian Euclid.”
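To make the “auxiliary symbols” idea a little more concrete, here is a toy sketch in Python. To be clear, these are not Pāṇini’s rules: the markers +PL and +PAST, the four English rewrite rules, and the function name derive are all made up for illustration. The only point is that a marker symbol can be attached to a form, steer which rewrite rule fires, and then vanish from the final output.

```python
# Toy string-rewriting system with auxiliary marker symbols.
# Assumption: the markers "+PL" and "+PAST" and the rules below are
# hypothetical English examples, not rules taken from Pāṇini's grammar.

# Ordered rules: earlier (more specific) rules are tried first.
RULES = [
    ("s+PL", "ses"),    # bus+PL    -> buses
    ("+PL", "s"),       # cat+PL    -> cats
    ("e+PAST", "ed"),   # bake+PAST -> baked
    ("+PAST", "ed"),    # walk+PAST -> walked
]


def derive(form):
    """Apply the first matching rule repeatedly; return every step taken."""
    steps = [form]
    while True:
        for pattern, replacement in RULES:
            if pattern in form:
                # Rewrite the leftmost occurrence, then start over from rule 1.
                form = form.replace(pattern, replacement, 1)
                steps.append(form)
                break
        else:
            # No rule matched: all markers are consumed, derivation is done.
            return steps


if __name__ == "__main__":
    for start in ["cat+PL", "bus+PL", "walk+PAST", "bake+PAST"]:
        print(" -> ".join(derive(start)))
```

Every replacement removes a marker and never introduces a new one, so the derivation always terminates. The ordering of the rules matters: the more specific rule for forms ending in “s” has to come before the general plural rule, which is a very small taste of why rule ordering and control are a big deal in a grammar of this kind.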

Sanskrit as an ideal language for AI applications

In 1985, Rick Briggs wrote a paper for the Association for the Advancement of Artificial Intelligence titled Knowledge Representation in Sanskrit and Artificial Intelligence. At that time, AI researchers were focused on trying to construct artificial languages that could be used in AI so that computers would not have to deal with the ambiguities of real languages. Briggs argued that instead of constructing artificial languages, we could simply use a highly structured language like Sanskrit.
Here is what he wrote in the abstract:
In the past twenty years, much time, effort, and money has been expended on designing an unambiguous representation of natural languages to make them accessible to computer processing. These efforts have centered around creating schemata designed to parallel logical relations with relations expressed by the syntax and semantics of natural languages, which are clearly cumbersome and ambiguous in their function as vehicles for the transmission of logical data. Understandably, there is a widespread belief that natural languages are unsuitable for the transmission of many ideas that artificial languages can render with great precision and mathematical rigor.
But this dichotomy, which has served as a premise underlying much work in the areas of linguistics and artificial intelligence, is a false one. There is at least one language, Sanskrit, which for the duration of almost 1,000 years was a living spoken language with a considerable literature of its own. Besides works of literary value, there was a long philosophical and grammatical tradition that has continued to exist with undiminished vigor until the present century. Among the accomplishments of the grammarians can be reckoned a method for paraphrasing Sanskrit in a manner that is identical not only in essence but in form with current work in Artificial Intelligence. This article demonstrates that a natural language can serve as an artificial language also, and that much work in AI has been reinventing a wheel millenia old.
The fact that someone from NASA (NASA!!!!) wrote this, and he claimed that Sanskrit is better than the efforts of modern researchers, gave the ancient-India-was-awesome crowd, and Indian media a collective orgasm. The web is full of people claiming that Sanskrit is the ideal language for computers, and if you follow the trail of references, all roads lead to this one paper by Briggs. (It is important to note that NASA itself has no official position on this; also, random rumors on the web about some “Mission Sanskrit” by NASA are hoaxes.)
Unfortunately for Briggs and for Sanskrit, this effort never did pan out. Looking at modern AI and natural language processing research, one is hard pressed to find any papers that reference Sanskrit in anything other than simple translation of Sanskrit or other Indian languages.
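For readers wondering what “the ambiguities of real languages” look like to a program, here is a minimal sketch in Python. It uses a tiny English grammar that I made up for this purpose (the rule tables BINARY and LEXICON and the function count_parses are illustrative assumptions, not taken from Briggs’ paper) and counts how many distinct parse trees a sentence has, using the standard CYK chart-parsing technique. It says nothing about Sanskrit; it only shows why researchers wanted an unambiguous representation in the first place.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (left, right) -> parents.
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # the PP modifies the verb phrase
    ("Det", "N"): ["NP"],
    ("NP", "PP"): ["NP"],   # the PP modifies the noun phrase
    ("P", "NP"): ["PP"],
}

# Hypothetical toy lexicon: word -> possible categories.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the parse trees for `words` with a CYK-style chart."""
    n = len(words)
    # chart[(i, j)][category] = number of ways words[i:j] form that category
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1)][category] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # every split point of the span
                for left, left_count in list(chart[(i, k)].items()):
                    for right, right_count in list(chart[(k, j)].items()):
                        for parent in BINARY.get((left, right), []):
                            chart[(i, j)][parent] += left_count * right_count
    return chart[(0, n)][start]


if __name__ == "__main__":
    sentence = "I saw the man with the telescope".split()
    print(count_parses(sentence))
```

Under this toy grammar the sentence gets two parses: one where the telescope is the instrument of seeing, and one where the man is holding it. A human picks the right reading from context; a program needs either an unambiguous representation or some extra machinery to choose.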

Vague Ramblings from the Internet

There’s this speech by Justice Markandey Katju titled “Sanskrit as a Language of Science”. It rambles on for pages, but makes only two semi-relevant points:
  • [Sanskrit] enabled scientific ideas to be expressed with great precision, logic and elegance.
    • This is just proof by assertion. There is no real support provided for this statement.
    • Also, this is in direct contradiction to another article by a Sanskrit lover which claims that one of the great attributes of Sanskrit is that the same sentence can have two or more completely different meanings. (Scroll down on that page to “Sanskrit is a Context based Language”, and the next section.)
  • The alphabet of Sanskrit is arranged in a very logical and scientific manner.
    • This is certainly true. I’ve blogged about it here.
    • While this fact is pretty cool, it has no relation to the use of Sanskrit as a Language of Science
The rest of the article rambles on about ancient Indian philosophy, and the achievements of our ancestors in the fields of Science and Maths and Astronomy and Medicine and Engineering – all of this, while being interesting and impressive, does not really throw any light on the topic being discussed.
Overall, the internet is full of articles like this and this which go on for pages describing the various interesting features of Sanskrit. And people somehow list this as proof that Sanskrit is the ideal language for Science. A careful reading of the articles usually shows that there is no connection between the various cool features of Sanskrit and its suitability for Science.
Many people also point out that European languages are derived from Sanskrit. That is slightly inaccurate. Linguists have hypothesised the existence of a language called Proto-Indo-European, which is the common ancestor of Sanskrit and most European languages. In any case, that has nothing to do with Sanskrit’s suitability for Science.
The best comment I got was this:
Vedas are in Sanskrit and Vedas are eternal. Hence, Sanskrit is the oldest language.
Sadly, that is the level of 90% of the discourse on this topic on the internet.

Follow-up Reading

Antariksh Bothale, who studies Computational Linguistics at the University of Washington, Seattle has this interesting answer to the question “Is Sanskrit over-rated as a language in India”. Lots of good nuggets of information.
Also, if you don’t know how awesome the Devanagari script is, check this out

Conclusion

In short, one guy thought Sanskrit might be a good language for AI applications, but that turned out to be a dead end. Sorry.
But, Pāṇini rocked!
Note: I am not an expert in this field, and this is just information I’ve collected from the internet. So if anyone is able to uncover any additional information, or even information that contradicts what I’ve said, please leave comments below. I’d love to be mistaken on this point.

30 thoughts on “What is so ‘scientific’ about Sanskrit? #SeriousQuestion”

    1. @Kshitiz, No, sorry. I’ve not read Vikram Chandra’s book. The review you posted seems to indicate that he has gone off on tangents that are not really related to computer science. But if someone who has really read the book comments, that would be great.
  1. namaste,
    Yes unfortunately there are a lot of people who are blinded by their love for Samskritam and perpetuate tall claims like “Sanskrit is the best language for computer science”. Sad.
    While, Samskritam does not need to resort to lies to establish its greatness , such false claims are very counter-productive.

    जयनगर-विभागः | बेङ्गळूरु-महानगरम्
    संस्कृतभारती
    1. Incorrect, plz read this http://www.hitxp.com/articles/sanskrit-lessons/dhatu-root-verbs-samskrit-grammar-dictionary/
      In short, for a language to be used as a programming language it needs to have context free grammar which then can be compiled into machine understandable binary code by a compiler.
      Sanskrit is the only human spoken language which has a context free grammar which means while you cannot write a compiler which can read and understand (parse) english sentences bcoz of the ambiguous nature in English sentences, you can definitely write a compiler for Sanskrit which can understand sanskrit and compile the instructions into binary.
      If all the geeks behind computer software were sanskrit speakers instead of english then all languages like java, sql etc would not have had their own syntax, instead they would be simply parsing sanskrit sentences. Bcos there would have been no need to reinvent new context free grammar for new programming languages.
      In fact the BNF notation used to develop programming languages is based on Sanskrit notations described by Panini in his Ashtadhyayi. See http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Backus%E2%80%93Naur_Form.html
  2. Guess you have not gone through this link in its entirety.
    http://www.hitxp.com/articles/sanskrit-lessons/dhatu-root-verbs-samskrit-grammar-dictionary/
    If sanskrit was the prominent language in computer science instead of english then there would be no need to invent languages like sql bcos sanskrit itself is structured like sql or c. One just has to write the compiler to parse sanskrit sentences and normal sanskrit language sentences can b used to query databases.
    Also sanskrit never needs loan words no matter what new technology or concepts get invented or discovered. Sanskrit language has in built mechanism to create meaningful new words based on attributes. See http://www.hitxp.com/articles/sanskrit-lessons/sanskrit-lesson-1-secret-science-sacred-sanskrit/
    Sanskrit is the best language to store information which is a key requirement in IT. To understand this one has to understand dhatu and how data is stored in sanskrit sentences. Any way the true power of sanskrit in AI and NLP will be appreciated the day it gets successfully applied in these fields.
  3. Thanks for doing the research on this… I’d often wondered how that claim about Sanskrit and CS came about.
    On a related note, I’m looking forward to @sidin’s new book “The Skeptical Patriot”, which examines several such beliefs about the glories of the Indian civilization. It’s not yet published, but is expected to be available in a month or two. I wouldn’t be surprised to find the Sanskrit claim among those he’s covered.
  4. Also PIE is a language born out of pure imagination of linguists who refused to probe further into the antiquity of Sanskrit, just bcoz the pointers show sanskrit somewhere at the root of indo European languages. Why is Lithuanian, for instance, filled with sanskrit words as is?
    Read about Pie below. Can anybody answer who spoke pie, any literature in pie or any reference to a language like pie in other ancient texts worldwide? http://www.hitxp.com/articles/culture/sanskrit-greek-english-latin-roman-words-derived-pie-proto-indo-european-language/
  5. Interesting article indeed! May be those who claim sanskrit as the ideal language for programming should produce a small program as a proof which maybe parses sanskrit sentences? :-)
    Also normally sanskrit has this style of not mentioning few words at all in a sentence but the reader has to assume those based on the context. Even ashtadhyayi has followed this style as far as i have read it. How can that be handled in a program, i wonder.
  6. Thanks for clearing up some of the myths. I just have one question
    “Vedas are in Sanskrit and Vedas are eternal. Hence, Sanskrit is the oldest language.”
    Isn’t that true? I don’t know about “eternal” but Vedas are the oldest scriptures in the world right? (correct me if I am wrong). So doesn’t that mean that Sanskrit is the oldest language, at least in written form?
    1. @Gaurav,
      • India wasn’t big on writing; we had more of an oral tradition. So “Vedic Sanskrit” (more accurately called Old Indo Aryan, or Old Indic) was used to compose and orally recite the Vedas for probably 1000 years before it was first written down in any form that has survived.
      • Because of the oral tradition, and the lack of writing in general, it is difficult to get reliable evidence of when this language was first used.
      • This language was rather different from what we now know as Sanskrit.
      • In the 5th century BC, Panini standardized and codified the grammar of Old Indic. It is estimated that some variation of this language was in use for 1000 years before that (i.e. the oldest evidence for the existence of “Vedic” compositions is from 1500 BC.)
      • By contrast, there is solid evidence for various forms of writing from Egypt, Sumer, and the Akkadian empire that are older than 2500 to 3000 BC. So the claim that Old Indic is the oldest language is difficult to justify.
      However, Panini’s grammar is pretty impressive for its time, and there doesn’t seem to be anything comparable to it in that time frame.
      1. My dear Navin, there are more solid evidence for sanskrit 5000BC old in our country, if u don’t know then check this website- bharathgyan.com, and these are Ramayan dating 5000BC and Mahabharath dating-3000BC with all scientific research
  7. Navin —
    Like you, I, too concluded, a long time ago, that your grandmother’s reaction, and that of a million other Indians, has more to do with a sense of nationalistic pride than dispassionate scientific inquiry. (Gurudev is an example of one who, even though he appears competent to prosecute the issue with real understanding, is swayed by that “saffronist” reverence rather than logic and knowledge.) And I, too, have described this phenomenon with exactly the same tone of sarcasm as you have. But even though I know a bit of computer science (I have practiced it for decades) and I know a bit of Sanskrit (grew up in a Brahmanic environment), I feel my rejection of the hypothesis that Sanskrit has a few things going for it as a language for use in AI (and related endeavors) is premature without deeper study. I didn’t sense, from all you’ve written above, that you addressed yourself to the Rick Briggs article with any significant vigor. Net-net: I would like to pursue this with you somehow, if a proper forum could be found.
    1. @Ajit, I would say this: A large number of people believe/claim that Sanskrit is an ideal language for use in AI/computerized processing. However, there is no evidence so far to support this claim. The one person with some background in computational linguistics who made this claim did it in one paper, has not followed it up with any further work, nor has anyone else from the field done so. Certainly, absence of evidence is not evidence of absence, so I certainly can’t claim that Sanskrit is *not* appropriate for this purpose. But, there is no justification to the claim that it *is* appropriate either.
      I don’t have enough of a background in computational linguistics to pursue this further. But I do have enough of a background in CS research to be able to confidently claim that no serious work/research is happening in this area currently. Looks like a dead end to me.
  8. From my individual experience what I draw. (BTW I dont know sanskrit at all)
    – I have issue with hyper forgetfulness, yet the few mantras that I may have learned just come out of my mouth no matter what state of mind I am in. I find this attribute just out of the world in sanskrit. and I guess none can even think of challenging this, because sustaining volumes and volumes of wisdom in oral forms for “thousands” of years proves that point.
    – Second aspect that sets it apart for me is the sheer perfection in script and how it structurally captures the entire movement of body organs coming into play (tongue, lips etc)
    1. @Ashmish, Both your points, while interesting, have nothing to do with whether or not Sanskrit is an ideal language for computer science. I have already mentioned the second aspect in my article.
  9. I have read the article and the comments too. You have presented a balanced and critical view of the idea. I think you would like the proof in support of Sanskrit as well-suited-for-computer language. I have some evidence, but I cannot give it as a proof. The reason is that the evidence I refer to here is not attested by the Sanskrit scholars. The evidence is:
    I have developed a fully rule based Grapheme to Phoneme converter as a part of a Sanskrit Speech Synthesizer. The converter program, given the rules are correct, gives perfect output (100% accuracy). The rules are extracted based on the observational study of Sanskrit speech, and on the basis of various instructions on how to correctly pronounce Sanskrit. The rules are also attested by shastraic quotes.
    It is not a proof due to the condition “given the rules are correct”. This tool can be presented as a proof only if the rules are attested by the scholars to be true, practically and traditionally. For now, their correctness is self proclaimed. If it is attested as a proof, then it will be small but significant proof of “Sanskrit is well suited language for computer”. It is notable that no other language has this module fully rule based. Most languages rely on the pronunciation lexicon largely for exceptions.
    1. @Diwakar, Fair enough. I’m willing to concede the point that Sanskrit might be a language really well suited for speech synthesis.
      However, that is not what most people think of when they think of “a language well-suited for computers”. They are thinking of natural language understanding, which is a much harder problem.
      1. I used it in speech synthesis – does not mean that it is well suited for speech synthesis (only). What I said that this is small but significant evidence. What I mentioned is a language module. Every great invention is rooted in some small thing. As invention of rocket is based on flying balloon and then on aircraft. The colourful light coming from inside the LED screen has its root of invention in Edison’s bulb.
        As you have also said that in the time of Rick Briggs, Sanskrit was thought upon to be used in place of programming languages, but now we have very effective programming languages. Therefore the efforts to use Sanskrit have been stopped (almost). That is why not enough work (though much work) has been done on Sanskrit NLP to make it understood by computers.
        Yes, Natural Language Understanding is hard, very hard task. But like most of the hard tasks, it can be achieved through smaller tasks.
    2. Kudos Mr.Diwakar Mishra! I think there are a couple of points I’d like your and Navin’s views on.
      1) What is really ‘natural’ about what we call natural languages? There is nothing natural about English (for example)- words are created by the requirements of a changing environment (and lost); Meaning languages are not natural extensions of anything a grammar. Computer languages on the other hand manifest from sets of rules – structure being an automatic flourish of that. Sanskrit, given the rigor and discipline enforced by SIMPLE grammar therefore seems more akin to such structure – structure lending to expansion, conjugation and traceability.
      2) What is stopping you from getting the attestation? How can we help?
      1. The term Natural Language does not mean a language which has naturally evolved, but it is a technical term used for Human spoken languages. What makes them different from artificial languages is that they have variations and ambiguity: one expression can have more than one meaning, and more than one expression may have one meaning.
        Attestation can be done after I present the rules, and the basis of those in front of scholars who are conscious about the correctness of speech.
  10. Navin,
    I applaud you for giving pen to the thoughts – Any effort to bust myths must be encouraged. So that we can establish whether something is truth or myth.
    But apart from providing a summary of what you searched and found in the internet, what exactly are you seeing/saying that disproves or proves anything? To be fair, shouldn’t your reasonable conclusion, if based solely on the ‘evidence’ you gathered be that you couldn’t conclude that Sanskrit is more particularly suited for it; but there is nothing you have presented here that says otherwise either.
    Now, to the point of being suited to be a computer language, what all does it take? You admit you do not know. That being the case on what basis and credence do you dismiss the published findings of the NASA scientist? Because what he said has not become reality, is that the only reason for you rejecting it? If that is sufficient, wouldn’t the plate be still flat and not having been rolled up into an imperfect ball as it has recently?
    I know 7 real programming languages, three distinct assembler levels machine interpreters and over a dozen ‘scripting’ ‘languages’. I also know English, Tamil, a wee-bit of Sanskrit and some colloquial Hindi. Instinctively I have felt and realized a few things which I will share:
    1) there is a more natural and un-halting formation in Sanskrit than the others. Because with only a few readings of Sanskrit letters and very basic grammar, I am able to generate meanings for other words; I can do that in several ‘computer’ languages too – but NOT in English (which in fact is where I have most fluency in!); to a much lesser degree in Hindi and Tamil as well even though and likely because they borrow a lot of Sanskrit words
    2) When I look at organization of data (whether traditional data architectures or the recent concepts of big data to mean unstructured and voluminous collections) it is very clear that schemata that are being used even after 50+ years of modern computer science, have still not gotten to a stable plateau.
    To me that is indication that Mr.Briggs may have been on to something
    But by all means, if I have missed any of your counter-evidence or even if it is just thoughts based on an hypothetical or instinctive premise, please highlight.
    Sans that, it is unfair on yourself and all others for you to write what sounds like a logical paper but is actually based on a subjective collection of internet missives. BTW I know you are seeking, rather than rejecting out of any malice because you have honestly stated the source and limitations of what information you used.
    Here is one final request: however ‘uneducated’ and ‘illiterate’ our grand mothers may seem like to us by our standards, it has been my experience that they are every bit as smart and questioning and seeking as we are and our children are.
    1. @Sridharan, what you’re saying is literally true. I don’t have evidence to disprove that Sanskrit is more suited for computer processing.
      However, extraordinary claims require extraordinary proof. And considering that nobody is using Sanskrit for computer processing, nobody has written papers on this topic other than one guy, who did not continue this work, nor did anyone else, I think the burden of proof is on the people who claim that Sanskrit is more suited for computer processing.
      I mean, I can make the claim that every night when he’s absolutely alone in his room, Narendra Modi converts into an Ogre, and converts back to a human in the morning, or when someone else comes into the room. Would you be able to disprove it? Would you then claim that “based solely on the ‘evidence’ gathered you couldn’t conclude that Modi converts into an ogre; but there is nothing you could gather that says otherwise either.” No. You’ll say it’s a ridiculous claim and unless I provide some evidence, my statement should not be taken seriously. The burden of proof is on me because I am making a tall claim.
  11. A simple search on google scholar also shows the below articles. Interested to know more about what gurudev and diwakar are doing / saying / adding to this debate. Is any of this conclusive or helpful towards proving either side of the debate? Hope some of the below help. Wish to be able to spend more time on getting to the bottom of this topic, at some point in life …
    http://www.jstor.org/discover/10.2307/41694883?uid=2&uid=4&sid=21104729160477
    http://www.sciencedirect.com/science/article/pii/0888613X87900077
    http://repository.ias.ac.in/61380/
    http://sanskritlibrary.org/tomcat/sl/-/pub/lies_sl.pdf
    http://parankusa.org/ComputerProcessing.pdf
  12. Navin, I have read with pleasure your balanced views on the subject. I am hoping that the debate can reach the same level of dispassionate and honest argument. regards, Ashok
    1. I believe that a lot of research is being done related to the use of Sanskrit as a computer language. Could you please check why there is so much of research and studies being done on Sanskrit in other countries especially Germany. There has to be a very good reason.
        1. I also do not find the research being done on the use of Sanskrit as a computer language, but there is much research going on for ‘processing of Sanskrit language by computer’, in other words ‘Sanskrit Computational Linguistics’.
          A few links of such works
          sanskrit.jnu.ac.in
          sanskrit.uohyd.ernet.in
          sanskrit.inria.fr
          for other works you can search with names of Hans Henrich Hock, Brendon S Gillon, Peter M Scharf, Girish Nath Jha, Amba Kulkarni, Oliver Hellwig, Wiebke Petersen.
          you can see the series of Sanskrit Computational Linguistics conference proceedings. One of them is http://www.springer.com/computer/ai/book/978-3-642-17527-5
  13. So basically what you are saying is: “I don’t know Sanskrit and I don’t know AI or computational linguistics. However my Google search has returned only a single paper written by a white man in the 80s. So you Hindus are making a big fuss about nothing. Move on there is nothing there. Oh and BTW, I will do you the honor of giving kudos to Panini”. Seriously? How are you any better than those who claim Sanskrit is the greatest language there is because they are in the Vedas. I like discussions where the author knows what they are talking about. What you have done is kill the interest of a young researcher who might have aspired to pursue this subject, especially in a country which desperately needs to reconnect with its relatively glorious past.
