[WISH] An idea for a future built-in translator.
forums at david-woolley.me.uk
Wed Jan 6 03:04:10 EST 2010
Kristoffer Grundström wrote:
> My idea is that Pidgin could have a built-in script or function that
> doesn't really needs to be enabled to work that translates the incoming
> text into YOUR language & the same back to the other person & the script
That pre-supposes an open source machine translator. I'm not aware of
any such project.  Machine translation, even between Western European
languages, is a difficult problem and may well be beyond the resources of
any reasonable open source development project, which would have to
re-invent a lot of the work done for proprietary programs.
> would also recognize if this text is written like a document or letter
> or just a plain conversation so that the text will turn out correct when
> translated. I know that it's a big project, but think a bit.
> If a person writes in their native language to you and you don't understand.
> If you want this text translated you have to spend time after time after
> time after.........you know the drill......to find what that particular
> word or sentence means if you DO find it.
What might be possible would be the facility to look up individual words
in a dictionary, but maintaining those dictionaries would have to be a
project in its own right. Dictionaries require a lot of work to
compile, and you cannot use existing dictionaries in the process,
because that would infringe on the dictionary compilers' copyrights.
CEDICT has existed for some time, but only gives the translation, with
no explanation on usage, and is missing many words.
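
As a rough illustration of what a word look-up facility could amount to,
here is a minimal Python sketch that indexes lines in the CEDICT text
layout (traditional form, simplified form, pinyin in square brackets,
then slash-separated glosses). The names load_cedict, lookup and SAMPLE
are hypothetical, the sample lines are made up for illustration, and
nothing here is an existing Pidgin API.

```python
import re

# CEDICT line layout: TRADITIONAL SIMPLIFIED [pin1 yin1] /gloss 1/gloss 2/
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def load_cedict(lines):
    """Build a word -> [(pinyin, [glosses])] index from CEDICT-style lines."""
    index = {}
    for line in lines:
        if line.startswith("#"):      # comment / header lines
            continue
        m = ENTRY.match(line)
        if not m:                     # skip anything that is not an entry
            continue
        trad, simp, pinyin, glosses = m.groups()
        entry = (pinyin, glosses.split("/"))
        # Index under both written forms; the set avoids a duplicate
        # entry when the two forms are identical.
        for key in {trad, simp}:
            index.setdefault(key, []).append(entry)
    return index


def lookup(index, word):
    """Return every entry for an exact headword match, or an empty list."""
    return index.get(word, [])


# Illustrative sample data only, written in the CEDICT layout.
SAMPLE = [
    "# sample lines in CEDICT layout",
    "中國 中国 [Zhong1 guo2] /China/",
    "你好 你好 [ni3 hao3] /hello/hi/",
]

if __name__ == "__main__":
    idx = load_cedict(SAMPLE)
    for pinyin, glosses in lookup(idx, "你好"):
        print(pinyin, "; ".join(glosses))
```

Even this only handles exact headword matches. It does nothing about
splitting a sentence into words or choosing between senses, which is
where the real difficulty lies.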
> Google translate doesn't translate whole sentences correctly since many
> words can mean many things.
Most words have multiple meanings!  Google and Babelfish will have cost
a lot to develop (they may be based on commercial, standalone,
products).  They do not translate word for word, but the fact that they
still produce very bad translations demonstrates how difficult the
problem is, even for someone with lots of commercial resources, including
the ability to license existing dictionary data.
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.