File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/90/c90-1014_metho.xml
Size: 12,091 bytes
Last Modified: 2025-10-06 14:12:26
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-1014"> <Title>SEMANTIC MATCHING BETWEEN JOB OFFERS AND JOB SEARCH REQUESTS COLING 90</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> SEMANTIC MATCHING BETWEEN JOB OFFERS AND JOB SEARCH REQUESTS </SectionTitle> <Paragraph position="0"> The members of the development team were: B Carden, A Chaouachi, B Euzenat, G Klintzing, M Macary, R Leborgne. We wish to thank the LE MONDE newspaper team for their collaboration during the specification phase.</Paragraph> <Paragraph position="1"> I The primary objective of this system, which was developed for the LE MONDE daily newspaper, is to offer professionals an efficient tool for rapid and intelligent job searching, in the context of the ever increasing number of advertisements in the printed press.</Paragraph> <Paragraph position="2"> Traditionally, offers of employment appear in newspapers and magazines and sometimes cover twenty pages. The person in search of employment faces the daunting task of reading lists of job offers every day.</Paragraph> <Paragraph position="3"> the system will propose to the candidate 3 categories of job offers: 1.- Project manager real time - Project manager in process control - Project manager in automation and industrial computing - Head of process control department ...</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> FUNCTIONS PROPOSED BY THE SYSTEM </SectionTitle> <Paragraph position="0"> 2.- Project manager in software engineering ...</Paragraph> <Paragraph position="1"> This system carries out an optimized comparison between the job offers in the advertisement data base and the requests and/or curriculum vitae entered by end-users at their terminals (Minitel). 
This is performed by extracting pertinent information from the input texts and comparing it, at the semantic level, with the Knowledge Base and the data of the indexed offers.</Paragraph> </Section> <Section position="3" start_page="0" end_page="68" type="metho"> <SectionTitle> 3.- Project manager in computing ARCHITECTURE </SectionTitle> <Paragraph position="0"> The End-User Interface The requirements expressed in the job offer should match the characteristics of the candidate as closely as possible, or at least be semantically close to them. More precisely, the results of the matching process are grouped according to the following three criteria: - the requirements of the position are directly fulfilled by the characteristics of the candidate; - the requirements expressed in the job offer are met only partially; - the job offers require characteristics other than those expressed in the candidate's curriculum vitae, but are in the same semantic field.</Paragraph> <Paragraph position="1"> For example, if the requested position is: (1) 'project manager real time computing', The end-user interface allows the candidate to enter his curriculum vitae in natural language 2.</Paragraph> <Paragraph position="2"> Part of the interaction with the candidate concerns the identification of unknown words (typing errors and spelling mistakes), at which point the system asks the user to correct his text. 
When a user's request does not match any requirements in the job offers data base, the system enters into a dialogue with the candidate in order to relax or modify the constraints he imposed on the job search criteria at initial request time.</Paragraph> <Paragraph position="3"> The job offers proposed are assembled into groups according to their semantic pertinence with respect to the candidate's request.</Paragraph> <Paragraph position="4"> Databases 1- This system has been in operation since September 89 and can be consulted by Minitel, using telephone number 3615 and selecting the LM / EMPLOI service.</Paragraph> <Paragraph position="5"> The system uses linguistic data: 2- The user can express his attributes freely and without constraints, contrary to SQL, for example.</Paragraph> <Paragraph position="7"> - a general dictionary containing grammatical words, verbs and a certain number of nouns; - a dictionary specialized in the universe of employment (professions, training, universities, software tools, regions, ...); - a Knowledge Base (KB) 3.</Paragraph> <Paragraph position="8"> The conceptual, semantic and pragmatic models of this application are represented in the KB. This KB describes certain facts which are universal truths and others which are only true within the context of the universe of employment.</Paragraph> <Paragraph position="9"> The job offers and the curriculum vitae of the candidates (or users) have been modelled using the object analogy. Each conceptual object has an associated attribute list (domains) with values which instantiate them. The values of a particular domain are linked together by semantic and pragmatic relations. In the same way, relations may exist between values of different domains. where POSITION is an object. Qualification, Function, ... are the domains. Marketing, computing, ... the values. 
Moreover, we can see an &quot;upper-level&quot; relation between technician and engineer, and a generic term relation between computing and data base.</Paragraph> <Paragraph position="10"> The analysers The system uses morphologic and syntactic analysers and a semantic analysis engine called the &quot;matching machine&quot; (MM).</Paragraph> <Paragraph position="11"> The rule sets (see below for further details) and grammars used by the morphologic and syntactic analysers in this application were designed to be linguistically robust and rapid in execution. Given that the application was designed for 200 simultaneous Minitel connexions 4, the response time for these analysers must be extremely short.</Paragraph> <Paragraph position="12"> 3- The combined size of the two dictionaries is approximately 30,000 words, with 3000 referential words for the KB.</Paragraph> <Paragraph position="13"> 4- This physical architecture consists of a frontend which manages the connexions and serializes the user's queries, and a backend supporting the analysers. With regard to questions of morphology, the analysers possess rule sets describing inflexion and derivation for the recovery of canonical forms of words 5 stored in the dictionary, starting from the text of the user's request or curriculum vitae.</Paragraph> <Paragraph position="14"> This analyser also possesses rules for treating initials (H.E.C. &lt;==&gt; HEC, CIA &lt;==&gt; C.I.A., ...), abbreviations (Sté &lt;==&gt; Société, m &lt;==&gt; mètre, ...), &quot;floating prefix&quot; terms (micro-informatique &lt;==&gt; microinformatique &lt;==&gt; micro informatique), concatenated or disjoint expressions (mettre en oeuvre &lt;==&gt; les mesures mises rapidement en oeuvre, ...</Paragraph> <Paragraph position="15"> pomme de terre, ...) and other morpho-lexical phenomena.</Paragraph> <Paragraph position="16"> Concerning syntactic analysis, the corresponding analyser possesses a grammar of &quot;standard&quot; French. 
However, phenomena such as anaphora, coreference (except in certain minor cases) and the scope of negations, among others, are not treated. This is a deliberate choice, since the persons using this system (through their requests or curriculum vitae) do not often use these elements of style in their texts (texts are chiefly noun phrases or verbal sentences).</Paragraph> <Paragraph position="17"> It is important to note that the analysers described above 6 are independent of the application and can be reused for other applications.</Paragraph> <Paragraph position="18"> Concerning the text comprehension phase, the MM treats the information received from the syntactic analyser in conjunction with information drawn from the Knowledge Base.</Paragraph> <Paragraph position="19"> The MM uses functions or &quot;methods&quot; which carry out specific treatments according to the type of objects under consideration.</Paragraph> <Paragraph position="20"> How the Matching Machine works The functioning of the MM is both semantic and pragmatic, and 4 distinct steps are identified. They are: - 1. Retrieval of normalized terms from the user's request or curriculum vitae; - 2. Identification of the domain and of the object concerned by these terms; - 3. Semantico-pragmatic spreading from the initial terms according to the &quot;method&quot; used for their associated object.</Paragraph> <Paragraph position="21"> 5- For us, canonical words are: singular, masculine nouns or adjectives, and roots of verbs.</Paragraph> <Paragraph position="22"> 6- except some rules used to handle special words like acronyms, the &quot;telematic language&quot;, etc. - 4. Extraction, intersection and classification of the indexed job offers according to the initial terms and those identified by the spreading process.</Paragraph> <Paragraph position="23"> Step 2 serves to unambiguously identify the objects designated by the normalized terms which were extracted from the user's request. 
For example, in the following request: (2) Expert translator of text in English the analyser will assign the term &quot;English&quot; to the domain FUNCTION of the object POSITION, since this term designates, in this context, a specialization within the profession of translator.</Paragraph> <Paragraph position="24"> In contrast, if the request is: (3) Civil engineer speaking English the term &quot;English&quot; will be considered, in this context, as a value designating an object LANGUAGE (which is one of the conceptual level objects found in the job offers) 7.</Paragraph> <Paragraph position="25"> Step 3 consists in passing from one term to another, starting at an initial term, in a tree-walk through the semantic and pragmatic network of the KB. This is performed in an outward spreading manner and is determined by the methods associated with the object types designated by the initial terms. The arcs between the nodes of the network are weighted, and the result of a spreading process is a new term Y at a distance n from a starting term X.</Paragraph> <Paragraph position="26"> The distance that a spreading process is allowed to run through the network is determined by the methods. This distance is one of the parameters necessary to calculate the final distance in the following step.</Paragraph> <Paragraph position="27"> Step 4 is charged with the ordering of the job offers by comparing initial and final terms.</Paragraph> <Paragraph position="28"> The sets of job offers then undergo set operations (boolean operations). This treatment is directed by a number of dynamically acting rules. 
That is, the actions of these set operations depend on the semantic role assigned to the terms during the second step of analysis and on the objects concerned by these terms.</Paragraph> <Paragraph position="29"> For example, for the request: (4) Computing journalist 7- Among the examples mentioned here, we could consider the following: (6) English translator Given that the model does not take nationalities into account, the system will interpret this request as in example (2).</Paragraph> <Paragraph position="30"> the job offers proposed must correspond to positions for journalists specializing in the computing domain and not to positions in the press and/or informatics. However, in the following example: (5) UNIX / C programmer the system must propose positions for specialists in UNIX / C. It will also propose job offers for software programmers in which no mention of operating systems or programming language is made, and others in which other operating systems or languages are mentioned. The classification of job offers is made as a function of the distance and the criteria fulfilled by the request. The job offers will be presented to the user according to this classification (see Example (1)).</Paragraph> </Section> </Paper>
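The spreading and classification steps described above can be illustrated with a minimal sketch. It is not the system's implementation: the network contents, the arc weights, the distance thresholds and the names `NETWORK`, `OFFERS`, `spread` and `classify` are all assumptions made for illustration, and a shortest-path search stands in for the method-driven spreading of the Matching Machine.

```python
import heapq

# Hypothetical weighted semantic network: term -> {neighbouring term: arc weight}.
# Terms and weights are invented for illustration; they are not taken from the KB.
NETWORK = {
    "project manager real time": {
        "project manager in process control": 1.0,
        "project manager in software engineering": 2.0,
    },
    "project manager in software engineering": {
        "project manager in computing": 1.5,
    },
}

# Hypothetical indexed job offers: offer id -> indexed term.
OFFERS = {
    "A": "project manager real time",
    "B": "project manager in process control",
    "C": "project manager in software engineering",
    "D": "project manager in computing",
    "E": "computing journalist",
}


def spread(network, start, max_distance):
    """Return {term: distance} for every term reachable within max_distance."""
    best = {start: 0.0}
    frontier = [(0.0, start)]
    while frontier:
        dist, term = heapq.heappop(frontier)
        if dist > best[term]:
            continue  # stale queue entry: a shorter path was already found
        for neighbour, weight in network.get(term, {}).items():
            new_dist = dist + weight
            if new_dist > max_distance:
                continue  # spreading is not allowed to run this far
            if best.get(neighbour, float("inf")) > new_dist:
                best[neighbour] = new_dist
                heapq.heappush(frontier, (new_dist, neighbour))
    return best


def classify(offers, distances, direct_limit=1.0, partial_limit=2.0):
    """Group reachable offers into three classes by distance (assumed thresholds)."""
    groups = {"direct": [], "partial": [], "same_field": []}
    reachable = [(distances[term], offer_id)
                 for offer_id, term in offers.items() if term in distances]
    for dist, offer_id in sorted(reachable):
        if direct_limit >= dist:
            groups["direct"].append(offer_id)
        elif partial_limit >= dist:
            groups["partial"].append(offer_id)
        else:
            groups["same_field"].append(offer_id)
    return groups


distances = spread(NETWORK, "project manager real time", max_distance=4.0)
print(classify(OFFERS, distances))
```

In this sketch an offer whose term is never reached by the spreading process (offer "E" here) is simply left out, and the three groups loosely mirror the three categories of example (1). Mapping distance thresholds directly onto those categories is a simplification; the paper states that classification also depends on the criteria fulfilled by the request.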