Thesis on
Artificial Intelligence for Universal Networking Language (UNL)
(Perspective Bengali Language)

By
Deen Islam Muslim, ID: 200720851
Ariful Hoque Tuhin, ID: 200710698
Shohanur Rahman, ID: 200720100

Under the Supervision of
Md. Ahsan Arif, Sr. Lecturer
Dept. of Computer Science and Engineering
Asian University of Bangladesh

Abstract:
In this paper we present the computational analysis of the complex case structure of Bengali, a member of the Indo-Aryan family of languages, with a view toward interlingua-based machine translation (MT). Bengali is ranked 4th in the list of languages ordered by number of speakers. Interesting language phenomena involving morphology, case structure, word order and word senses make the processing of Bengali a worthwhile and challenging proposition. A recently proposed scheme called the Universal Networking Language (UNL) is used as the interlingua. The approach is adaptable to other members of the vast Indo-Aryan language family. The parallel development of both the analyzer and the generator leads to an insightful intra-system verification process. Our approach is rule based: it draws on authoritative treatises on Bengali grammar to develop rules for the Bengali-to-UNL conversion process.

Introduction:
About 189 million people speak Bengali, which ranks 4th in the world in terms of number of speakers (ref: http://www.harpercollege.edu/~mhealy/g101ilec/intro/clt/cltclt/top100.html). Like most languages in the Indo-Aryan family, which descended from Sanskrit, Bengali has SOV (subject-object-verb) structure with some typical characteristics.
A motivating factor for creating a system for processing Bengali is the possibility of laying the framework for processing many other related Indo-Aryan languages too.

Work on Indian language processing abounds. Project Anubaad [1], for machine translation from English to Bengali in the newspaper domain, uses the direct translation approach. The Anglabharati [2] system for English-Bengali machine translation is based on pattern-directed rules for English, which generate a pseudo-target language applicable to a group of Indian languages. In MATRA [3], a web-based MT system for English to Bengali in the newspaper domain, the input text is transformed into case-frame-like structures, and parameterized templates generate the target language. The MANTRA MT system for official documents uses Tree Adjoining Grammar (TAG) to achieve English-Bengali MT (ref: ). Project Anusaaraka [4] is a language accessor system rather than an MT system and addresses multiple Indian languages. Interlingua-based MT for English, Bengali and Marathi [5] [6], which uses the UNL, transforms the source text into the UNL representation and generates target text from this intermediate representation. References to most of these works can also be found at http://www.tdil.mit.gov.in/mat/ach-mat.htm. Other well-known MT systems are Pivot [7], Atlas [8], Kant [9], Aries [10], Geta [11], SysTran [12], etc.

The Universal Networking Language (UNL) (http://www.unl.ias.unu.edu) has been defined as a digital metalanguage for describing, summarizing, refining, storing and disseminating information in a machine-independent and human-language-neutral form. The information in a document is represented sentence by sentence. Each sentence is converted into a directed hypergraph having concepts as nodes and relations as arcs. Knowledge within a document is expressed in three dimensions:
1. Word Knowledge is expressed by Universal Words (UWs), which are language independent.
These UWs are tagged using restrictions describing the sense of the word in the current context. For example, drink(icl>liquor) denotes the noun sense of drink, restricting the sense to a type of liquor. Here, icl stands for inclusion and forms an is-a relationship, as in semantic nets.
2. Conceptual Knowledge is captured by relating UWs through a set of UNL relations [14]. For example, "Humans affect the environment" is described in the UNL as
agt(affect(icl>do).@present.@entry, human(icl>animal).@pl)
obj(affect(icl>do).@present.@entry, environment(icl>abstract thing).@pl)
Here agt means the agent and obj the object. affect(icl>do), human(icl>animal) and environment(icl>abstract thing) are the UWs denoting concepts.
3. Speaker's view, aspect, time of event, etc. are captured by UNL attributes. For instance, in the above example, the attribute @entry denotes the main predicate of the sentence, @present the present tense and @pl the plural number.

The above discussion can be summarized using the example below:
John, who is the chairman of the company, has arranged a meeting at his residence.
The UNL for the sentence is:
======================== UNL =======================
mod(chairman(icl>post).@present.@def, company(icl>institution).@def)
aoj(chairman(icl>post).@present.@def, John(icl>person))
agt(arrange(icl>do).@entry.@present.@complete, John(icl>person))
pos(residence(icl>shelter), John(icl>person))
obj(arrange(icl>do).@entry.@present.@complete, meeting(icl>event).@indef)
plc(arrange(icl>do).@entry.@present.@complete, residence(icl>shelter))
====================================================
In the expressions above, agt denotes the agent relation, obj the object relation, plc the place relation, pos the possessor relation, mod the modifier relation and aoj the attribute-of-the-object relation (used to express constructs like "A is B").
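As a rough illustration of how the binary relations above can be read mechanically, the following Python sketch parses one UNL relation line into its label, its two UWs and their attributes. It is a minimal sketch and not the EnCo or DeCo implementation: the names Node, Relation, parse_node and parse_relation are hypothetical, and it assumes each line holds exactly one relation whose two arguments are separated by a comma outside any parentheses.

```python
from typing import NamedTuple, Tuple


class Node(NamedTuple):
    """One argument of a relation: a UW plus its attributes (without '@')."""
    uw: str
    attrs: Tuple[str, ...]


class Relation(NamedTuple):
    """A binary UNL relation such as agt(source, target)."""
    label: str
    source: Node
    target: Node


def split_top_level(body: str) -> Tuple[str, str]:
    """Split 'a(b,c).@x, d(e)' at the first comma that is outside parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError("expected two comma-separated arguments: " + body)


def parse_node(text: str) -> Node:
    """Separate the UW from its trailing '.@attribute' list."""
    uw, _, rest = text.partition(".@")
    attrs = tuple(a.strip() for a in rest.split(".@")) if rest else ()
    return Node(uw.strip(), attrs)


def parse_relation(line: str) -> Relation:
    """Parse one line such as 'agt(affect(icl>do).@entry, human(icl>animal).@pl)'."""
    line = line.strip().rstrip(";")
    label, _, rest = line.partition("(")
    if not rest.endswith(")"):
        raise ValueError("not a UNL relation: " + line)
    source, target = split_top_level(rest[:-1])
    return Relation(label.strip(), parse_node(source), parse_node(target))


if __name__ == "__main__":
    lines = [
        "agt(affect(icl>do).@present.@entry, human(icl>animal).@pl)",
        "obj(affect(icl>do).@present.@entry, environment(icl>abstract thing).@pl)",
    ]
    for rel in map(parse_relation, lines):
        # The node carrying the 'entry' attribute is the main predicate.
        is_entry = "entry" in rel.source.attrs
        print(rel.label, rel.source.uw, "->", rel.target.uw, "| entry:", is_entry)
```

Keeping the attributes separate from the UW makes it easy to locate the main predicate (the node carrying @entry) or to collect all plural nodes without re-scanning the raw text.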
The detailed specification of the Universal Networking Language can be found at http://www.unl.ias.unu.edu/unlsys.
Our work is based on an authoritative treatise on Bengali grammar. The strategies for analysis and generation of linguistic phenomena have been guided by rigorous grammatical principles.

Universal Networking Language Based Analysis and Generation for Bengali

Bangla-English Dictionary
A Bangla-to-English dictionary is the source for building the Bangla-to-UNL dictionary, since Universal Words are English words, as mandated by UNL. Such dictionaries also provide all attributes along with the meaning of a word. Any entry in the dictionary is put in the following format:
[HW] {ID} “UW” (ATTRIBUTE1, ATTRIBUTE2 . . .) <FLG, FRE, PRI>
Here,
HW: Head Word (Bangla word)
ID: Identification of Head Word (omittable)
UW: Universal Word
ATTRIBUTE: Attribute of the HW
FLG: Language Flag
FRE: Frequency of Head Word
PRI: Priority of Head Word
Some example dictionary entries for Bangla are given below:
shohor {} “city(icl>region)” (N, PLACE)
prochur {} “huge(icl>big)” (ADJ)
Here the attributes are:
N stands for Noun
PLACE stands for place
ADJ stands for Adjective
The FLG field entry is B, which stands for Bangla.
A universal knowledge base is defined in the UNL specification. This knowledge base is language independent, and each native-language word should be referenced to it. The knowledge base of universal words is a hierarchy of concepts.

En-Converter and De-Converter machines
The En-Converter (henceforth called EnCo) is a language-independent parser, a multi-headed Turing machine providing a framework for morphological, syntactic and semantic analysis, performed synchronously using the UW dictionary and analysis rules. The structure of the machine is shown in Figure 1.
Fig. 1. The EnCo machine
The machine has two types of heads: processing heads and context heads. The two processing heads are called Analysis Windows (AW) and the context heads are called Condition Windows (CW).
The machine traverses the sentence back and forth, retrieves the relevant universal words from the lexicon and, depending on the attributes of the nodes under the AWs and those under the surrounding CWs, generates semantic relations between the UWs and/or attaches speech act attributes to them. The final output is a set of UNL expressions equivalent to a UNL graph.
The De-Converter (henceforth called DeCo) [18] is a language-independent generator that produces sentences from UNL graphs (Figure 2).
Fig. 2. The DeCo machine
Like EnCo, DeCo is a multi-headed Turing machine. It performs syntactic and morphological generation synchronously using the lexicon and the set of generation rules.

Existing Problems:
1) Spell Checking:
A spell checker is not included in the current UNL system. As a result, wrong output may arise during the en-conversion or de-conversion process. An example of such a situation is given below.
A simple English sentence, correctly spelled:
I live in Bangladesh
According to [13], the final UNL expression is as follows:
aoj(live(icl>inhabit>be,aoj>living_thing,plc>place).@entry.@present,i(icl>person))
plc(live(icl>inhabit>be,aoj>living_thing,plc>place).@entry.@present,bangladesh(iof>asian_country>thing))
Here “Bangladesh” is assigned the UW restriction “Asian_Country”, which tells the de-converter to look up the word “Bangladesh” in the “Asian_Country” category. But if we misspell “Bangladesh” as “Banladesh”, the sentence is converted as follows:
aoj(live(icl>inhabit>be,aoj>living_thing,plc>place).@entry.@present,i(icl>person))
plc(live(icl>inhabit>be,aoj>living_thing,plc>place).@entry.@present,banladesh)
No UW is defined for the misspelled word, so a wrong conversion can occur.
2) Maintaining Exact Grammatical Pattern:
According to [13], the system works fine for a single sentence and for some compound sentences.
But we have found some crucial problems when en-converting and de-converting multiple sentences.
For example, take this sentence:
I like rice and I play football.
En-converting this sentence into UNL gives:
aoj:01(like(icl>please>be,equ>enjoy,obj>uw,aoj>person).@entry.@present,i(icl>person):01)
obj:01(like(icl>please>be,equ>enjoy,obj>uw,aoj>person).@entry.@present,rice(icl>grain>thing))
agt:02(play(icl>compete>do,agt>thing,obj>uw,ptn>thing).@entry.@present,i(icl>person):02)
obj:02(play(icl>compete>do,agt>thing,obj>uw,ptn>thing).@entry.@present,football(icl>field_game>thing))
and(:02,:01)
De-converting this UNL form back into English gives:
I like rice and I play football.
But a problem occurs with multiple sentences such as:
We play football. We try to win every match.
En-converting these sentences into UNL gives:
agt(try(icl>attempt>do,agt>person,obj>uw).@entry.@present,we(icl>group):01.@pl)
pos:01(match(icl>contest>thing),we(icl>group):02)
fictit(try(icl>attempt>do,agt>person,obj>uw).@entry.@present,every(icl>quantity,per>thing))
obj:01(win(icl>prize>do,agt>thing,obj>thing,scn>thing).@entry,match(icl>contest>thing))
obj(try(icl>attempt>do,agt>person,obj>uw).@entry.@present,:01)
De-converting this UNL form back into English gives:
We try win we match.
Another example:
I am rahim. i am a student of asian university of bangladesh.
En-converting these sentences into UNL gives:
aoj(rahim.@entry.@present,i(icl>person):01)
fictit(rahim.@entry.@present,i(icl>person):02)
aoj(student(icl>university_student>person,obj>knowledge_domain).@indef.@present,i(icl>person):02)
mod(university(icl>body>thing),asian(icl>adj,com>asia))
obj(student(icl>university_student>person,obj>knowledge_domain).@indef.@present,university(icl>body>thing))
obj(university(icl>body>thing),bangladesh(iof>asian_country>thing))
De-converting this UNL form back into English gives:
I BE rahim.
I a student of an Asian university of Bangladesh
3) Lack of Exact Rules and Absence of AI:
Let us try a trickier sentence with UNL:
My name is Casper and I dont like to play football.
According to [13], en-conversion turns this sentence into the following UNL form:
pos(name(icl>language_unit>thing,pos>thing),i(icl>person):01)
aoj(casper(iof>city>thing).@entry.@present,name(icl>language_unit>thing,pos>thing))
and(:01,casper(iof>city>thing).@entry.@present)
aoj:01(like(icl>please>be,equ>enjoy,obj>uw,aoj>person).@entry.@not.@present,i(icl>person):02)
obj:01(like(icl>please>be,equ>enjoy,obj>uw,aoj>person).@entry.@not.@present,play(icl>compete>do,agt>thing,obj>uw,ptn>thing))
obj:01(play(icl>compete>do,agt>thing,obj>uw,ptn>thing),football(icl>field_game>thing))
Here the name “Casper” is categorized under “City”, not as a person's name. Similarly, for Bangla-to-UNL conversion we can give an example like:
Golapi ekhon aar train-e uthena.
During en-conversion, we may have difficulty identifying whether Golapi is a person's name or the name of a color.
4) Verb Representation Problem:
Let us take a look at this example:
I am a good boy.
En-converted UNL form:
aoj(boy(icl>child>person,ant>girl).@entry.@indef.@present,i(icl>person))
mod(boy(icl>child>person,ant>girl).@entry.@indef.@present,good(icl>adj,ant>bad))
De-converted English:
I Be a good boy.
Another example:
My name is Karim and I like flowers.
En-converted UNL form:
pos(name(icl>language_unit>thing,pos>thing),i(icl>person):01)
aoj(kerim(icl>name,iof>person,com>male).@entry.@present,name(icl>language_unit>thing,pos>thing))
and(i(icl>person):02,karim(icl>name,iof>person,com>male).@entry.@present)
man(kerim(icl>name,iof>person,com>male).@entry.@present,like(icl>how,obj>thing))
man(i(icl>person):02,like(icl>how,obj>thing))
obj(like(icl>how,obj>thing),flower(icl>angiosperm>thing).@pl)
De-converted English:
The name of I like is Karim and I this like the flowers.

Recommended Solution:
1) Implementation of a good spell checker
2) Rearrangement of the sentence into the exact grammatical pattern
3) Implementation of AI and exact rules
4) Determination of the exact form of verb representation

Conclusions
Systematic analysis of the case structure forms the foundation for any natural language processing system. In this paper, we have described a system for the computational analysis of the Bengali case structure for the purpose of interlingua-based MT using UNL. The complementary generator system has also been implemented, which provides the platform for intra-system verification. Verification via cross-system generation is being done using the Bengali generation system (also under development). Apart from the case structure, computational analysis based on an authoritative grammatical treatise, addressing complex phenomena involving verbs, adjectives and adverbs, is under way.

References:
[1] Dey, K.: Project Anubaad: an English-Bengali MT system. Jadavpur University, Kolkata (2001)
[2] Sinha, R.: Machine translation: The Indian context. AKSHARA’94, New Delhi (1994)
[3] Rao, D., Mohanraj, K., Hedge, J., Mehta, V., Mahadane, P.: A practical framework for syntactic transfer of compound-complex sentences for English-Bengal machine translation. (2000)
[4] Bharati, A., Chaitanya, V., Sanyal, R.: Natural Language Processing: A Paninian Perspective. Prentice Hall India Private Limited (1996)
[5] Dave, S., Bhattacharya, P., Girishbhai, P.J.: Interlingua based English-Bengal machine translation and language divergence. Journal of Machine Translation, Volume 17 (2002)
[6] Monju, M., Sachi, D., Bhattacharyya, P.: Knowledge extraction from Bengal texts. Knowledge Based Computer Systems, Proceedings of the International Conference KBCS2000 (2000)
[7] Muraki, K.: Pivot: Two-phase machine translation system. MT Summit Manuscripts and Program, pp. 81-83 (1987)
[8] Uchida, H.: Atlas. MT Summit II, pp. 152-157 (1989)
[9] Lonsdale, D.W., Franz, A.M., Leavitt, J.R.R.: Large-scale machine translation: An interlingua approach.
(www.lti.cs.cmu.edu/Research/Kant/PDF/aei94.pdf)
[10] González, J.C., Goñi, J.M., Nieto, A.F.: Aries: A ready for use platform for engineering Spanish-processing tools. Digest of the Second Language Engineering Convention, pp. 219-226 (1995)
[11] Vauquois, B., Boitet, C.: Automated translation at Grenoble University. (acl.ldc.upenn.edu/J/J85/J85-1003.pdf)
[12] Hutchins, W.J., Somers, H.L.: An Introduction to Machine Translation. London: Academic Press (1992)
[13] Russian and English Language Server, the Russian UNL Language Center, www.unl.ru.

My View of Love at University
1. What is love at university:
University is a relatively relaxed environment in which time is free and self-managed, and for exactly that reason it is the most fertile soil for cultivating the flower of love.
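Dating among university students has always been a hot topic on campus, and romance and study naturally become the two main issues students face during their time at university. A relationship that is handled well, correctly and healthily can be a catalyst for study and career, making a person study harder and earn better grades; a relationship that is handled badly and unhealthily can scatter one's energy, waste time, cause mood swings and lower grades. University students' view of love must therefore rest on a healthy foundation, and establishing a correct view of love is essential. Below I discuss my own view of love at university from several angles.
2. What is healthy love:
1) Respecting the other person, not displaying possessiveness, not putting love first, and not being excessively infatuated;
2) Understanding the other person, caring for, supporting and encouraging each other, and finding one's own satisfaction in the other's happiness;
3) A union built on the premise that both remain independent.
3. What is unhealthy love:
1) Dating blindly and neglecting one's studies;
2) Being overly infatuated and constantly demanding that the other person express their love; this kind of love often involves morbid exaggeration;
3) Lacking consideration and tenderness, and showing only one's own strong possessiveness;
4) Placing too much weight on outward appearance.
4. Three things university students should think over in handling a relationship:
1. Not affecting study: Dating at university can be called a necessary experience, while study is the basic and main task of university, and the relationship between the two is intricate. Some students, because of love, neglect their studies too much and put feelings first. When studying, study seriously and do not think about matters of love; when dating, be wholehearted, and you can also talk about your studies, encourage each other and make progress together.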
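2. Having enough energy: University life can be very busy, but it can also be relatively relaxed. Students who date must allocate their energy sensibly: while busy with study they must not be distracted by matters of the heart, and must not abandon study during study periods to pursue romance. Keep a reasonable hold on your energy and divide it well between study and the relationship.
3. Having reasonable time: Time at university can be divided into study time and living time, and striking the right balance between them is important. During study time, you should not give up study hours to arrange things for the two of you; study should come first. During living time, the two of you can date, wholeheartedly, and can also talk about your studies, encourage each other and make progress together.
5. University students need to recognize and understand love, mainly in the following respects:
(1) Be clear about a student's main task
"Whoever gives up on time will be given up by time." The university years are a period for absorbing knowledge and developing ability. Today's university students should recognize that their present task is to learn: to learn how to be a person, to learn knowledge, and to learn the skills to serve the people. Students should concentrate their energy on study and social practice, rather than wasting the precious years of youth by spending too much energy and time on romance. Therefore, be clear about your goals, plan your path of study, and give study and romance their proper places.
(2) Establish a correct view of love
Advocate love that is based on shared ideals, tacit understanding and mutual affection: the most important condition in choosing a partner should be shared ideals, with broadly matching moral character, career aspirations and tastes in life. Put love in its proper relation to study and career: university students should put study and career first, and must not spend all of their precious university time, time meant for self-improvement, on romance while slackening in their studies.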
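Mutual understanding and mutual trust are a responsibility and a commitment. Love is giving rather than taking, and having rather than possessing. The people and events around us constantly sound the alarm so that tragedies are not repeated. Life comes only once and cannot be relived, so university students must establish a correct view of love.
(3) Develop healthy dating behavior
On today's campuses, couples going about in pairs are a common sight. Suppressing dating among university students is unrealistic; students must develop healthy dating behavior. Talk more with your partner about study and work, and keep dating behavior within social norms without overstepping them, so that love develops along a healthy path. As Marx said: "In my view, true love expresses itself in the lover's reserved, modest and even shy attitude toward his idol, and never in the casual display of passion or in premature intimacy."
(4) Love is not a matter of following the crowd
Much of the love among university students is in fact the result of following the crowd: seeing that others have found love and seeing how happy they look (note: it only looks beautiful), students develop a herd mentality and spend large amounts of time and energy searching for love.
(5) Distance is the key to keeping the flower of love in bloom
How much time love really needs is a big question. Some university relationships fail not because the two spend too little time together but because they spend too much. Conversely, many university relationships succeed not because the two spend too little time together, but because they judge accurately how much time to spend together.
(6) Love is not a closed-off world for two
Many people live too much in a world of two, gradually losing contact and conversation with classmates and close friends until only the two of them remain in each other's lives. They take part in neither class activities nor extracurricular activities, and every day there is nothing but each other. This is not good for a student's healthy development: it affects not only study but also one's ability to socialize and cooperate.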
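Summary: When facing romance, men and women should first get their own attitude right and build the character of self-respect, self-love, self-improvement and self-esteem. Never pursue love blindly, and do not pursue it too hastily either; judge whether your own circumstances are ready. Establish a correct view of love and be clear about the purpose of university, putting study first. Plan your university years well and, provided study is not affected, be serious and faithful in the relationship, encouraging each other, learning from each other and making progress together. Take your view of love seriously and date in a healthy way. In short, we university students should establish a correct view of love, so that love at university becomes the most beautiful scene in our memories of youth rather than a lifelong regret!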