VictorTranslator

LinguaBuzz

LinguaBuzz is a by-product of VICTOR Translator. It is based on eight years of experience in natural language processing.

At last you can see what people say on the Internet

To search for “mentions” of a brand or a product in social networks, blogs, forums and elsewhere on the Internet, there is no option other than using “keywords”, as we do every day in our usual web search engine. The problem is that these searches return many irrelevant or incorrect results.

It is inevitable: language is ambiguous, inaccurate, polysemous. Most words mean several different things and often appear in sentences that have nothing to do with what one is really looking for. Current opinion-analysis tools cannot solve this problem. They are very inaccurate because they do not understand the meaning of words.

This is where LinguaBuzz and its linguistic analysis come on stage.

LinguaBuzz is a by-product of VICTOR Translator, a machine translation system that was developed over eight years. It analyzes and labels each word of each sentence, assigning it a grammatical category (noun, verb, adjective, etc.). It also identifies the syntactic relations between words: for example, which is the subject of the sentence, which is the direct object, or which adjective qualifies which noun.

After analyzing and tagging all the mentions located through their “keywords”, LinguaBuzz tells you which mentions actually talk about the “something” that interests you, discarding the “trash”, or irrelevant noise.
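
As a rough illustration of this kind of filtering, the sketch below keeps only the mentions in which the keyword carries a given grammatical tag. It is a minimal example under stated assumptions, not LinguaBuzz's actual code: the `Token` class, the tag names and the hand-tagged mentions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str  # hypothetical tag names: "PROPN", "NOUN", "VERB", ...


def mentions_keyword_as(tokens, keyword, wanted_pos="PROPN"):
    """Return True if the keyword occurs with the wanted grammatical tag."""
    return any(
        t.text.lower() == keyword.lower() and t.pos == wanted_pos
        for t in tokens
    )


# Two hand-tagged mentions that both match the plain keyword "apple".
mentions = [
    [Token("Apple", "PROPN"), Token("released", "VERB"),
     Token("a", "DET"), Token("phone", "NOUN")],
    [Token("I", "PRON"), Token("ate", "VERB"),
     Token("an", "DET"), Token("apple", "NOUN")],
]

# A keyword search returns both; the tag check keeps only the brand mention.
relevant = [m for m in mentions if mentions_keyword_as(m, "apple")]
```

A plain keyword match cannot tell the brand from the fruit; the tag attached to each word is what separates them.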

This linguistic analysis is what makes LinguaBuzz better than other similar products.

LinguaBuzz knows what people are saying, not just which words they use. This is the huge difference.

After “filtering” the results using the linguistic information, LinguaBuzz displays the results in an easy-to-use and clear graphic format.

Furthermore, when LinguaBuzz analyzes each sentence, it automatically extracts the “topics” and the “sentiment” of the opinions, which can be positive, negative or neutral. This information is incorporated into the charts, which show not only who wrote the mention and what it is about, but also whether the mention is positive or negative.

In addition, LinguaBuzz reaches a very high level of accuracy, which makes it possible to use this data to draw conclusions and make decisions about marketing strategy.
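
To give an idea of how such labelled mentions could feed a chart, here is a small sketch that counts sentiments per topic. The record layout (author, topic, sentiment), the sample data and the function name are assumptions made for illustration; they are not taken from LinguaBuzz.

```python
from collections import Counter

# Hypothetical records, as they might look after analysis:
# (author, topic, sentiment)
mentions = [
    ("user_a", "battery", "negative"),
    ("user_b", "battery", "negative"),
    ("user_c", "price", "positive"),
    ("user_a", "price", "neutral"),
]


def sentiment_by_topic(records):
    """Count positive, negative and neutral mentions for each topic."""
    counts = {}
    for _author, topic, sentiment in records:
        counts.setdefault(topic, Counter())[sentiment] += 1
    return counts


summary = sentiment_by_topic(mentions)
```

Each entry of `summary` could then be drawn as one bar group in a chart, one bar per sentiment.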

What is LinguaBuzz used for?

LinguaBuzz’s main function is to produce complete and accurate market studies automatically from the mentions published on the Internet.

But there are many other ways to use these reports on products, brands or even people.

– Analysis of products in relation to competitors.
– Finding the most appropriate sites for marketing campaigns.
– Studying the impact of a marketing campaign (traditional or online media).
– Segmentation of advertising in social networks: identifying the users (on Facebook, Twitter, etc.) to whom it is most appropriate to offer certain products.
– Identification of trolls in social networks (people who only write negative mentions in general or about a particular brand).

There are many different ways to take advantage of the huge amount of information generated in LinguaBuzz’s reports. Tell us your needs and we will do our best to satisfy them.

More information at http://www.linguabuzz.com

VICTOR in La Opinión de Granada

Source: La Opinión de Granada

VICTOR, the Most “Formal” Machine Translator

Its principal novelty lies in its capacity to “understand” the message, the fruit of a decade’s work by a professional linguist.

CARMEN BANDERA: Its name is VICTOR and though it might sound like a person, in reality it’s a revolutionary English-to-Spanish machine-translation program developed by MTC Soft, a Granada-based software company. VICTOR was commissioned by Berca Translator, a local translation firm.

Its principal novelty lies in its capacity to transcribe not only the words of a text, but also its meaning. That is to say, “to understand” the message. “Until now nobody had created a machine translator with the necessary quality for professional work, and to achieve it we had to start from zero with a fresh analysis of the problem,” says Fernando Moreno-Torres, the company’s CEO.

This fresh analysis came from a professional linguist and translator with 20 years of experience at the European Commission in Brussels. After a decade’s work developing a new system she contacted MTC Soft three and a half years ago and asked them to create the necessary computer applications. The result has just been born and has had an excellent reception among translators, academics and business people. “The contribution of the linguistic analysis was what made the whole project possible,” explained Moreno-Torres, “permitting us to create a translator which is clearly better than those which preceded it. We’re talking about results which are on the order of 95% correct, and rising.”

There are various keys to the success of VICTOR. In the first place, there’s the process of linguistic analysis which converts “plain text” into “enriched text.” That is to say, the translator submits each word to more than 180 different routines in a matter of seconds and returns the text with all of its elements analyzed and labelled with their linguistic attributes. Thanks to this process of analysis, the program not only translates words, but the sense of the words.
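
The “enriched text” described above can be pictured as a list of words, each carrying its labels. The sketch below is only an illustration under assumptions: the `EnrichedWord` fields, the label names and the hand-labelled sentence are hypothetical and do not reproduce VICTOR’s internal format or its routines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnrichedWord:
    text: str
    category: str            # noun, verb, adjective, ...
    function: Optional[str]  # subject, direct object, modifier, ...
    head: Optional[int]      # index of the word this one depends on


# Hand-labelled example for the plain text "The red car hit a wall".
sentence = [
    EnrichedWord("The", "determiner", "modifier", 2),
    EnrichedWord("red", "adjective", "modifier", 2),
    EnrichedWord("car", "noun", "subject", 3),
    EnrichedWord("hit", "verb", None, None),
    EnrichedWord("a", "determiner", "modifier", 5),
    EnrichedWord("wall", "noun", "direct object", 3),
]


def words_with_function(words, function):
    """Return the words that play a given syntactic function."""
    return [w.text for w in words if w.function == function]


def modifiers_of(words, index, category):
    """Return the words of one category that depend on the word at index."""
    return [w.text for w in words if w.head == index and w.category == category]
```

With such a structure, questions like “what is the subject?” or “which adjective qualifies this noun?” become simple lookups, for example `words_with_function(sentence, "subject")` or `modifiers_of(sentence, 2, "adjective")`.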

Another of the challenges was to achieve a correct translation into Spanish, choosing the adequate equivalents for each word. To do that we had to create a custom-made dictionary with exhaustive information on each word and how they relate to one another.

“What surprised me most about the dictionary was its capacity to learn the subtleties of the language. The program is constantly learning more words, and there is less and less to correct. It produces an almost-perfect Spanish,” says Moreno-Torres. “The program can distinguish the different senses of the words thanks to its capacity to learn, thereby eliminating the incorrect equivalences which are normally so frustrating with machine translators,” he adds.

VICTOR is not intended to translate all kinds of texts. It won’t translate colloquial language, for example, as its function is to convert formal texts written in correct English, such as manuals, contracts and technical articles, into Spanish.

This isn’t software that you can buy and install on your computer, but rather a service offered by MTC Soft. “The client sends us an English text and we translate it, revise it (only until the client’s personalized dictionary is finished) and return it to them in Spanish. In a short time it becomes a fully automatic online service,” adds MTC Soft’s CEO.

The program only translates English into Spanish, and not vice versa, “because the process is in no way symmetrical. In principle, we don’t foresee the possibility of developing another program which translates Spanish into English.”

MTC Soft, founded in Granada in the early 90’s, started out in the software business developing management applications for Spanish notaries, and then for property registrars. In recent years they have diversified their offer to include more general applications, such as document management and machine translation.

A Linguistic Solution to Perfecting Search Technology

MTC Soft Chosen to Present Search App at WWW 2009 Conference

MTC Soft, a software company based in Granada, Spain, has been selected by the WWW 2009 Conference as one of 26 developers-track participants from around the world to present their proposed contributions to the future of the World Wide Web. The 18th annual World Wide Web Conference is scheduled for April 20-24, 2009 in Madrid, Spain.

Granada, Spain, March 23, 2009 – This invitation to address the international WWW Conference is a first for Andalusian software developers.

“Parsalyzer is our new grammatical analysis application designed to perfect advanced search technology,” says Fernando Moreno-Torres, CEO of MTC Soft and promoter of their new language parser for search use. He adds, “We’re honored to be invited to present it before this forum, where we’ll be speaking from the same podium as top experts from the research departments of companies like HP, Yahoo, Google, Telefonica and Microsoft, as well as leading universities worldwide.”

The WWW annual conferences have traditionally highlighted technological advances which have gone on to make important contributions to the future of the Internet.

Parsalyzer is based on a new departure in linguistic analysis originally developed for VICTOR Translator, MTC Soft’s automatic translator application, by a veteran translator with the European Commission in Brussels. “When we began to see the results we were achieving with the beta version of the VICTOR Translator,” says Moreno-Torres, “it occurred to me that the same parsing engine might be applied to problems related to search technology.”

He adds, “The reason that search engines return so many erroneous and irrelevant replies is that they are only reading the words in the documents searched, not the sense of the words. Our Parsalyzer linguistic analysis application changes all that. Now search engines will be able to eliminate a great percentage of irrelevant search engine results pages (SERPs), making the desired relevant results much more accessible.”

MTC Soft, founded in Granada in the early 90’s, originally developed management software for Spanish notary offices, and later property registrars. In recent years they have diversified into other, more general business applications such as document management and automatic translation. They are also active internationally, deploying their software solutions for public administration in Latin America and the Maghreb.

MTC Soft to Speak at WWW2009

MTC Soft, Small Spanish Firm, Keeps Fast Company at the World Wide Web Conference 2009 in Madrid, Spain

Madrid, Monday, April 27, 2009 – Like the mouse that roared, MTC Soft, a small software company from Granada, Spain, caused a stir at the 18th annual World Wide Web Conference 2009 in Madrid last week, holding its own among researchers from universities all over the world as well as a bevy of Web heavyweights, including experts from companies like Google, Yahoo and Microsoft.

“There were moments when I found myself asking, ‘What’s a guy like you doing in a place like this?’” confesses Fernando Moreno-Torres, CEO of MTC Soft, “especially when I was on my way up to the podium for my presentation and I recognized Tim Berners-Lee, the father of the World Wide Web, in the audience.” Moreno-Torres’s paper was selected from among almost 100 proposals submitted by software developers from all over the world for the Developers’ Track of the conference, which took place in Madrid last week (April 20-24).

In his presentation on Thursday afternoon Moreno-Torres introduced Parsalyzer, MTC Soft’s new product, which the young Spanish company chief affirms could drive the next great leap in Internet searches. “Until now,” says Moreno-Torres, “searches have been executed on ‘plain text,’ with no linguistic differentiation among words. Parsalyzer automatically performs a complete linguistic analysis consisting of 140 separate processes on a text, then labels every word with its characteristics and function within the sentences. This ‘enriched text’ permits a search engine to return dramatically improved search results, on the order of a 95% improvement.”

Moreno-Torres’s presentation on Thursday afternoon saw standing room only in the conference room. So much so, affirms Moreno-Torres, that two researchers from Google’s European laboratory in Switzerland found themselves sitting on the floor in the first row.

The most gratifying moment, according to Moreno-Torres, occurred at the dinner that night, when he ran into Don José Alberto Jaén, full professor of Computer Science and Artificial Intelligence at Madrid’s Polytechnic University, who had sat in the first row of seats during the Parsalyzer presentation. At one moment in their conversation the young man from MTC Soft asked the professor: “Do you believe what I affirmed in my talk this afternoon, that we have created a tool which will substantially improve Internet searches and might well be a new standard for the World Wide Web?”

“Yes, I believe it,” replied the veteran professor.

New York Times

VICTOR Translator, a news item in the New York Times.

Interview with the translator

We’re talking with the person responsible for the original idea and the linguistic analysis for VICTOR Translator. A linguist and translator for the European Commission in Brussels for more than 25 years, she was the inspiration for this English-to-Spanish machine-translation project commissioned by the translation company, Berca Translator and developed by MTC Soft in Granada, Spain.

Question: There are many machine translators on the market, some of them developed by industry giants. Why was it necessary to invent another one? What made you think that you were the right person to undertake this adventure? How did you expect to achieve with the help of a tiny Spanish software company what some of the world’s most important corporations hadn’t achieved with unlimited resources?

Answer: I’ve got 25 years’ experience in professional translating, 10 of them translating into Spanish original texts from English, French and Italian using some of the finest machine translating systems on the market. I participated in the development of some of those systems. For that reason I am well acquainted with the problems and limitations of the existing systems.

That’s what prompted me to develop an analysis which, concentrating on the specific stumbling blocks in the process of English-to-Spanish translations, would permit programmers to overcome those problems from its very conception. That is to say, an application capable of overcoming the obstacles which machine-translation programs face every working day:

– Functional-analysis faults (frequent in the English-Spanish language pair, since the process proceeds from a Saxon language to a Latin one).
– Syntactic-analysis deficiencies (identification of a sentence’s true subject, or the semantic or syntactic links between the different parts of the sentence).
– The necessity to foresee, for a single English word, different translations according to the context.
– The necessity to give adequate treatment to idioms of the source language, avoiding literal translation and replacing them with their corresponding idioms in the target language (this both in expressions and in verbs).
– The necessity, particularly in an English-to-Spanish translation system, to obtain a syntactic reformulation in specific cases, such as the addition of linking conjunctions, prepositions or relative pronouns, the conversion of a sentence from the passive voice in English to the reflexive in Spanish, etc.

Our greatest advantage in this project is that we’re starting from a lengthy practical experience which neither the universities (because of their principally theoretical approach) nor the great computer-science companies (because of their mainly statistical approach) seem to have counted on until now. It’s our intention to use statistical information only to solve the problems which exhaustive linguistic analysis is not capable of resolving. No linguistic analysis can determine if “Harry’s bar” is the bar which belongs to Harry or the famous “Harry’s bar” in Venice. In this case we are also obliged to resort to statistics.

Presumably one doesn’t go lightly into a project which is going to last for years. What inspiration moved you to start? What final objectives did you have from the beginning?

My objective from the outset was to achieve a system which would resolve all of the “automatic” part of the translation process, which constitutes, as I know from experience, a very high percentage of my daily work. Then there is the advantage that once a machine has translated an expression correctly, or produced a correct analysis, it will always do it the same way. I think the effort we dedicate to achieving a program capable of carrying out this task must necessarily be more productive than spending day after day repeatedly resolving the same problems in successive texts to be translated.

How much time have you invested in creating the linguistic analysis necessary to create this program?

Approximately five years for the abstract conception and another five for the practical part, working with the programmers, and adapting my work to theirs, often on a trial-and-error basis as they went along developing the program.

Please explain to us a bit of your procedure in creating your analysis. What were the first issues to overcome? Why hadn’t they been solved before by the creators of earlier machine-translation programs? I mean, machine translation has been around for a long time, and a lot of money has been spent on it. What happened?

I think our program has various strong points:

– A very efficient functional and syntactic analysis, based on logic, but also on practice (compatibility, probability, tendency of the speaker to obviate ambiguity, etc.).
– The possibility for the massive, automatic creation of specific glossaries. That is to say, of groups of set expressions.
– The possibility to convert English idiomatic verbs into the corresponding Spanish idiomatic verbs. I don’t believe any other program does that today.
– The proper management of English phrasal verbs, extremely complex and subtle.
– The translation or ad hoc addition of the different prepositions according to the specific translation which Spanish assigns to the original verb: that which we refer to in VICTOR Translator as “subordinated” and “added” prepositions.
– Finally, the possibility to condition the translation of any word (from among all possible translations) to the context in which it appears, with the option of delimiting that context almost exhaustively.
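
The last point, conditioning a translation on its context, can be sketched in a few lines. This is only an illustration under assumptions: the dictionary layout, the entry for “bank” and the function name are hypothetical, and VICTOR’s real dictionaries are certainly far richer.

```python
# Hypothetical dictionary: each English word maps to a list of
# (context words, Spanish translation) pairs, tried in order.
# An empty context set marks the default translation.
DICTIONARY = {
    "bank": [
        ({"river", "shore", "water"}, "orilla"),
        (set(), "banco"),
    ],
}


def translate_word(word, context_words):
    """Pick the first translation whose context condition is satisfied."""
    context = {w.lower() for w in context_words}
    for required, translation in DICTIONARY.get(word.lower(), []):
        if not required or required & context:
            return translation
    return word  # words missing from the dictionary are left untranslated
```

For example, `translate_word("bank", ["the", "river", "bank"])` selects the river sense, while the same word next to “loan” falls back to the default entry.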

How did the contact with Fernando Moreno-Torres and MTC Soft come about? Coincidence, friendship, or a systematic search?

The origins of our cooperation are in mutual confidence, both personal and professional.

What are the usual complaints that professional translators have against machine translators?

The use of automatic translators began around 10 years ago. For a professional translator the principal problem with today’s machine translators is that sometimes it takes more work to “fix” the phrase that the machine proposes than to do the translation yourself.
Today’s machine translators, conceived, I think, at lower levels of complexity than the translation process itself requires (perhaps because their designers lacked translating experience from the beginning), cannot resolve all of the problems presented in real translation experiences. When I started working on this new translator I started out from the point of view that a program which offers functional and syntactic analysis is contributing something valuable to the human translator. If it also obtains the appropriate terminology, a correct linking of the different parts of the sentence, correct prepositions and correct idioms, the material which the automatic translator offers to the human professional translator will contribute to an important saving in time and resources.

How has your collaboration with MTC Soft gone? Was it easy for them to translate your natural-language linguistic analysis into a computer program?

No, it hasn’t been easy, but MTC Soft has put two magnificent analysts with many years of experience on the project, and they have been able to translate the linguistic logic which underlies the program into computer language. They have quickly captured the problems and contradictions which have arisen and contributed elegant computer solutions.

How does one feel seeing the VICTOR project all the way through to the beta version?

Sometimes I feel impatient, because there’s still work to do on the next version which will incorporate statistical analysis. But, seeing the really brilliant results achieved up till now, I feel a tremendous satisfaction.

Do you hope to see your translator colleagues using VICTOR in their daily work?

Yes, I hope so.

Interview with the Project Manager of VICTOR Translator

VICTOR Translator is a computerized system for the automatic translation of English into Spanish, developed by MTC Soft, a small software firm in Granada, Spain.

Q: What was your role in the development of VICTOR?
A: Project manager, as well as analyst and programmer.

How many people worked on the project?
The linguistic analysis is the work of the promoter of the idea, a professional translator from the European Commission with more than 20 years of experience.

In the development of the program, besides the project director there were two analyst-programmers working during the whole process both on the program and the dictionaries, along with the translator herself. Then there were 30 students and six professors from the Faculty of Translation and Interpretation of the University of Granada who created the first versions of the dictionaries, plus six student interns to complete and feed the dictionaries.

How long did it take you to reach the beta stage?
The linguistic analysis goes back 10 years. The first design of the application took about a year.

The creation of the dictionaries (both the program and the content) took another year. It was another year’s work after the dictionaries before we launched the beta program. We’re still working on both of those aspects. To sum up, the overall project took 10 years, the last three and a half of which were dedicated to the development of the application.

Could you explain to us a little bit about the process of creation of this machine translator? What were the most difficult problems to overcome? What were the most unpleasant surprises you came up against in the process?
The principal problem that has to be overcome when making an automatic translator is the linguistic analysis, which the translator had already resolved.

For us the biggest challenge was to convert her analysis, in natural language, into computer language, that is to say, a program. Our main work was to develop a computer platform capable of converting the sub-processes of translation into computer routines. Once that platform was created, the difficulty lay in understanding those routines in natural language (which are mental processes) and putting them down in lines of code (computational processes).
Our surprises have been the continuous changes which the translator has made in some of the phases of translation in order to improve them. The results of her work have been very valuable in checking the validity of the process, and have given rise to numerous adjustments.

What does VICTOR Translator do that the others don’t?
VICTOR does a profound linguistic analysis of the text, backed up by a set of dictionaries which are very complete, classified by glossaries (by themes), and which are continually being nurtured with new concepts, and syntactical and semantic structures.

And how did you manage that?
Thanks to the initial linguistic analysis, to the made-to-order dictionaries and to the computer platform we created to codify the translation processes.

What types of clients do you expect for VICTOR Translator? That is to say, who needs VICTOR?
I would say that, in the first phase, it will be companies which need to translate large quantities of documents related to a specific subject with its own glossaries.

Then there will be translators who require a preliminary analysis of the document they must translate, or a first draft, so as to be able to rework it a bit rather than starting from zero. Later on, when we’ve created enough glossaries, VICTOR will do the job for any person who needs rapid and trustworthy translations.

VICTOR Translator offers applications which go beyond simple machine translation, doesn’t it? What might these be?
A: Linguistic labelling of texts. Translation memories. Semantic searches. Text classification. Support for translators.

I’d like to present a challenge to VICTOR Translator right here and now. Would it be capable of translating, head to head, a text chosen at random, so that we can compare it with the translation of the same text made by one of the popular machine translators, say Google Translate?

This is the paragraph we used for the demo, and the result was very good. First the text in the original English, then the Google translation, then VICTOR’s:

Original:

“Eurostat, in its press release, stated that it was not in a position to certify the figures included in the notification of Portugal, due, among other reasons, to shortage of information on capital injections from the Portuguese government to public corporations, which had been treated as acquisition of shares and other equities with no effect on the government deficit.”

Google:
“Eurostat, en su comunicado de prensa, declaró que no estaba en condiciones de certificar las cifras consignadas en la notificación de Portugal, debido, entre otras razones, a la escasez de información sobre las aportaciones de capital de que el Gobierno portugués a las empresas públicas, que había sido tratados como la adquisición de las acciones y otras, sin efecto sobre el déficit público.”

VICTOR:

“Eurostat, en su comunicado de prensa, afirmó que no estaba en condiciones de certificar las cifras incluidas en la notificación de Portugal, debido, entre otras razones, a la escasez de la información sobre las inyecciones de capital del gobierno portugués para las empresas públicas, que había sido tratada como la adquisición de las participaciones y otros valores sin efecto alguno en el déficit público.”

Press

Questions

Q: Does the world really need another machine translator?
A: There’s no doubt about it. It needs one which functions properly, at least in the English-Spanish language pair. The world badly needs one which yields quality results and is a useful tool for translators.

How did this English-Spanish machine-translator project come about?
The idea wasn’t ours. It came from a professional translator and linguist with more than 20 years’ experience who saw the need for a product like this in her daily work. Automatic translation is still not completely resolved, especially in the English-Spanish language pair.

Machine translation is a vast field. What sections of that field does VICTOR Translator address?
For starters we’ll be focusing on companies or institutions with large numbers of documents based on specialized themes. This is necessary because we have to create a personalized dictionary of terms and expressions for each customer, which means that it would only be profitable if they translate a large number of documents. These documents do not include literary texts, general periodical articles, chats, forums, etc.

If your company is dedicating time and resources to this ambitious project you must have something new to offer. What is it exactly?
The project is revolutionary insofar as, until now, machine translators have had an underlying problem in their design and their development has been blocked for several years. Only a fresh start, with an entirely new focus, would provide the possibility to obtain notably better results.

As far as novelties in the product are concerned, it’s difficult to enumerate them in a few words. To sum it up briefly, the linguistic analysis for our new system was designed by someone who was well versed in the errors of previous machine translators and centered her work on the unsolved problems.

What would the profile of a typical customer for your automatic translator look like? Or is there more than one? How many are there?
As I mentioned before, the typical client is a company or institution which needs to translate large numbers of documents related to their central theme. As we increase the number of glossaries in our system, we will be able to accept clients with lower volumes of documents.

There is another large group of potential customers for VICTOR Translator: professional translators and translating companies who will be able to use our program to create a draft version of texts, which their human translators can quickly revise.

How do you plan to market the product?
At least initially VICTOR will be offered as a service, not a product.
The customer sends us their texts in English and we return them translated into Spanish.
It’s not a shrink-wrapped program which the customer installs on their computer. We must keep in mind that we have to create a personalized dictionary for each customer and that the databases involved are large. For those reasons it’s not as simple to install VICTOR as just another program on your computer.

Nevertheless, we look forward to the possibility of installing the program on customers’ sites if there is a large enough number of documents to translate.

Will this service be sold only in Spain and the Hispanic world, or also in English-speaking countries?
It is our intention to market it in all countries with a market for it. We must keep in mind that we have potential clients among both Spanish and English-speaking companies and institutions. Our Spanish client needs to translate English text into Spanish in order to understand and use documentation written in English. On the other hand, English customers need it to convert their English-language documentation into Spanish for the Spanish-speaking market.

Is VICTOR now available in the Spanish market?
Yes, though on a limited basis. We’re still finishing the fine tuning, so our first clients, who are also collaborators, enjoy special treatment, especially in the creation of their thematic glossaries. This means that they can contract the service, but always keeping in mind that the pace of translation will be slower in the early stages, until the glossary is fully adapted to their texts.

Can we talk about versions and prices?
There won’t be “versions” in the usual sense of the word. Yes, we will be adapting the program for each different client, but only as far as their dictionaries are concerned, not the program.

As far as prices are concerned, we must make an important stipulation: there will be two prices for two very different results. The first one is a totally automatic translation at high speed and for a very modest price, after the first phase of the creation of the glossary for each client.

The second case is a translation revised manually by a professional translator. In this case the time required to complete the translation will be longer, and the price significantly higher, though still well below the usual price, thanks to the use of the previous automatic translation which reduces the work of the translator to a brief revision of the final text.

Will there be versions of VICTOR Translator for other language pairs?
Not in the medium term. One of the keys to the success of VICTOR is its specialized adaptation to these two languages. To create a version in another language pair would be a whole new project, and very time consuming. Nevertheless, we don’t rule it out for the future.