Connection charset, bugs and Russian translation

User Vitkovsky
Date 6/19/2008 8:13 pm
Vitkovsky

This relates to these bugs (http://www.webgui.org/bugs/tracker/import-of-packages-with-international-text-is-broken and http://www.plainblack.com/bugs/tracker/http-proxy--syndicated-content-with-international-text-corrupted) and some others.

My apologies for the long message; I simply wanted to describe all the symptoms in one place, which may help to fix these bugs.

About two years ago I wrote that unless the connection charset to the database is set explicitly, data can be written to the database not in Unicode but, for instance, in latin1. (Other variants are possible.) At the time this found no support in the community, as it was not considered pressing. Many users use the Latin alphabet, so for them it has no significant effect on functioning.

I had to recode all the databases on the servers I support, and after every upgrade I had to add the line $dbh->do("SET NAMES utf8"); after the connection to the database:

    my $dbh = DBI->connect($dsn,$user,$pass,{RaiseError=>0,AutoCommit=>1 });
    unless (defined $dbh) {
        $session->errorHandler->error("Couldn't connect to database: $dsn");
        return undef;
    }
    $dbh->do("SET NAMES utf8"); # my added line; run it only once the handle is known to be defined

There is one more thing that was inconvenient, although it was possible to work around it:

In the Russian translation I could not set the Russian locale. As soon as I set ru_RU, data in some assets were displayed in the wrong charset, so I had to set en_US.
This does not depend on whether the line $dbh->do("SET NAMES utf8"); was added or not.

Until the fall/winter, packages were imported and exported correctly, and httpProxy & SyndicatedContent worked correctly.

I do not remember exactly which version they stopped working in. Possibly it was after the migration to the second version of JSON.

 

As far as I understand, starting from version 7.5.1 it was decided to make the charset of the data written to the database consistent.

The charset is now set directly at connection time by turning on the flag
mysql_enable_utf8 => 1

With this approach the Russian translation is displayed in the wrong charset (a screenshot is attached), while the text of the site itself is displayed correctly.

To reproduce it, set the language to Russian in your own profile and in the Visitor profile.

If I use the old way ($dbh->do("SET NAMES utf8");), the translation works correctly:

    my (undef, $driver) = DBI->parse_dsn($dsn);
    my $dbh = DBI->connect($dsn,$user,$pass,{RaiseError => 0, AutoCommit => 1,
#        $driver eq 'mysql' ? (mysql_enable_utf8 => 1) : (),
    });
    unless (defined $dbh) {
        $session->errorHandler->error("Couldn't connect to database: $dsn");
        return undef;
    }
    $dbh->do("SET NAMES utf8"); # my line instead of the commented-out line; run only on a valid handle

 

I understand that these things cannot and should not depend on each other, but it is a fact, and I cannot find an explanation for it.

 

The DBD::mysql documentation says that both ways of setting the connection charset should give the same result

 (http://search.cpan.org/~capttofu/DBD-mysql-4.007/lib/DBD/mysql.pm about mysql_enable_utf8):

".......Additionally, turning on this flag tells MySQL that incoming data should be treated as UTF-8. This will only take effect if used as part of the call to connect(). If you turn the flag on after connecting, you will need to issue the command SET NAMES utf8 to get the same effect.

This option is experimental and may change in future versions....."
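
For comparison, here is a minimal sketch that puts the two approaches side by side. The helper names connect_attrs and connect_utf8 and the mode strings 'flag' and 'set_names' are invented for illustration and are not part of WebGUI or DBI. The driver is taken from the DSN with a simple regex rather than DBI->parse_dsn, so the first helper can be tried without DBI installed.

```perl
use strict;
use warnings;

# Hypothetical helper: build the connect() attributes for one of the two
# charset strategies discussed in this thread.
#   'flag'      -> pass mysql_enable_utf8 => 1 to connect()
#   'set_names' -> plain connect, SET NAMES utf8 is issued afterwards
sub connect_attrs {
    my ($dsn, $mode) = @_;

    # The driver name is the second field of a DSN such as "DBI:mysql:dbname".
    my ($driver) = $dsn =~ /^dbi:(\w+)/i;

    my %attr = (RaiseError => 0, AutoCommit => 1);
    $attr{mysql_enable_utf8} = 1
        if defined $driver && lc($driver) eq 'mysql' && $mode eq 'flag';
    return \%attr;
}

# Hypothetical wrapper: connect using the chosen strategy.
sub connect_utf8 {
    my ($dsn, $user, $pass, $mode) = @_;

    require DBI;    # loaded lazily; needs DBI and DBD::mysql to be installed
    my $dbh = DBI->connect($dsn, $user, $pass, connect_attrs($dsn, $mode));
    return unless defined $dbh;

    # Old approach: set the connection charset with an SQL statement.
    $dbh->do('SET NAMES utf8') if $mode eq 'set_names';
    return $dbh;
}
```

Under these assumptions, connect_utf8($dsn, $user, $pass, 'flag') would correspond to the 7.5 behaviour and 'set_names' to my old patch. Switching the mode string is then the only change needed to compare the symptoms listed below.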

 

So,

When using SET NAMES utf8:

  • The Russian translation works correctly.
  • httpProxy & SyndicatedContent do not work correctly.
  • Import/export of packages breaks the charset.
  • Using the ru_RU locale in the Russian translation breaks the charset.

When turning on the flag mysql_enable_utf8:

  • The Russian translation is displayed in the wrong charset.
  • httpProxy & SyndicatedContent work correctly.
  • Import/export of packages breaks the encoding in 7.5.11 (tested on WebGUIdemo), though it seems to me that it worked correctly in 7.5.10 or 7.5.9 (also tested on WebGUIdemo). Possibly this is currently a problem in the demo only.

 

Graham or JT, please upload the Russian translation to the translation server (http://www.plainblack.com/uploads/cb/MO/cbMOw8pkZUoSY2FG9BqOuw/Russian.tar.gz).

 


Vladimir Vitkovsky
http://www.webgui.uanet.biz/
http://web-octopus.com
http://www.transport.su/

 



Vitkovsky

If by WebGUI tools you are referring to translationserver.cgi, it was fixed in the last few days to deal with UTF8 characters properly.

With the latest translation tool and the latest WebGUI, these issues should be resolved.

 

Indeed, I used the old translation tools. The file format has several differences.

I tried taking the partial (1.6%) translation that is on the translation server http://i18n.webgui.org (in the new, corrected format), but got the same effect. The changes made have not solved the problems with the Russian translation.

The translation is displayed as if multibyte characters were being treated as single-byte ones.

Notably, both the old and the new translation formats are displayed correctly in some situations. Most often this happens on admin pages with the admin template.

All of this was done on version 7.5.10.

Then I upgraded to 7.5.13.

The result is the same.

If I set the connection charset the old way (SET NAMES utf8), the translation is displayed correctly, but problems appear with other assets. For instance, in the asset manager everything is displayed in the wrong charset, and there are problems with DataForm.

Possibly the key to solving the problem should be sought not in the translation file format or in the way the connection charset is set, but in a side effect of other CPAN modules. It cannot be ruled out that this is somehow connected with one of the modules setting the locale incorrectly.

I also tried starting a Ukrainian translation. (It is Cyrillic too; only a few letters differ from the Russian alphabet.) The result is the same as with Russian.

Has anyone tried to use the Bulgarian or Belarusian translation with 7.5? (Both are Cyrillic too.)

Does this problem show up only with Cyrillic, or also in other languages whose alphabets consist of multibyte characters?


 
 
Vitkovsky

It seems the translation is not, after all, displayed as if multibyte characters were treated as single-byte ones.

Rather, it is displayed as if all the text had been encoded twice.
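
To show what "encoded twice" means in practice, here is a minimal sketch using only the core Encode module. It assumes the script file is saved as UTF-8; the sample word and the variable names are chosen for illustration. It does not reproduce the WebGUI code path, only the general failure mode and how one extra decode step reverses it.

```perl
use strict;
use warnings;
use utf8;                          # assumes this file is saved as UTF-8
use Encode qw(encode decode);

my $text = "Привет";               # 6 characters

# Correct: the characters are encoded to UTF-8 bytes once (12 bytes).
my $once = encode('UTF-8', $text);

# The failure mode: the UTF-8 bytes are mistaken for Latin-1 characters
# and encoded to UTF-8 a second time (24 bytes).
my $twice = encode('UTF-8', decode('ISO-8859-1', $once));

# Repair: undo the second encoding to get the original bytes back,
# then decode those bytes into characters.
my $inner    = encode('ISO-8859-1', decode('UTF-8', $twice));
my $repaired = decode('UTF-8', $inner);

printf "chars=%d once=%d twice=%d repaired=%s\n",
    length($text), length($once), length($twice),
    ($repaired eq $text ? 'ok' : 'not ok');
```

If the byte count doubles again at some layer (template, database driver, output), a single missing or extra decode is the usual suspect.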

As an experiment I tried adding the line "utf8::decode($output);" to /data/WebGUI/lib/WebGUI/International.pm:

sub get {
    my ($self, $id, $namespace, $language) = @_;
    $namespace = $namespace || $self->{_namespace} || "WebGUI";
    $language = $language || $self->{_language} || $self->session->user->profileField("language") || "English";
    $id =~ s/$safeRe//g;
    $language =~ s/$safeRe//g;
    $namespace =~ s/$safeRe//g;
    my $cmd = "WebGUI::i18n::".$language."::".$namespace;
    WebGUI::Pluggable::load($cmd);
    our $table;
    *table = *{"$cmd\::I18N"};  ##Create alias into symbol table
    my $output = $table->{$id}->{message};
    $output = $self->get($id,$namespace,"English") if ($output eq "" && $language ne "English");
    utf8::decode($output);   # my line
    return $output;
}

With this change all the translation text is displayed correctly with mysql_enable_utf8 => 1. Nearly everything works correctly, but when I try to edit some assets (Data Form, Page Layout, etc.) I see a blank page.

It seems Perl has a lot of hidden "reefs" when working with Unicode...


 
 
Vitkovsky

As for ".....but if I try to edit some assets I see blank page ....":

I got that in IE7. I just tried in Firefox and everything works correctly. It seems to be something like: http://www.plainblack.com/bugs/tracker/wiki-page-redirects/errors-in-ie6/5


 
 
Vitkovsky

Hello Graham,

I saw your changes in translationserver.cgi and tried applying them. It seems they solved the problem.

Pleasantly, I can now set ru_RU or uk_UA in the translation file without problems.

The translation files contain partly plain Unicode text and partly text like "\x{411}\x{435}\x{437}\x{43f}\x{435}\x{43a}\x{430}". Both formats work correctly.
All the files have the line "use utf8;", and it seems this is what made the translation display correctly.
The line I added as an experiment to /data/WebGUI/lib/WebGUI/International.pm is no longer needed. Everything works both with it and without it.

What is more, I tried putting the line (use utf8;) into the old translation files, and that worked correctly too.

Is it really necessary for the text to be in the (\x{411}\x{435}) format only?
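
On that question, a minimal sketch (assuming the script file is saved as UTF-8; the variable names are invented for illustration) suggests that the two forms produce the same string once "use utf8;" is in effect. The literal below is simply the word that the escape sequence quoted above spells out.

```perl
use strict;
use warnings;

# The escaped form: seven Cyrillic characters given as code points.
my $escaped = "\x{411}\x{435}\x{437}\x{43f}\x{435}\x{43a}\x{430}";

my ($literal_chars, $literal_bytes);
{
    use utf8;    # literals in this block are decoded from UTF-8 source
    $literal_chars = "Безпека";
}
{
    no utf8;     # literals in this block stay as raw UTF-8 bytes
    $literal_bytes = "Безпека";
}

# With "use utf8" the literal equals the escaped form (7 characters);
# without it the same source text is a 14-byte string and does not match.
printf "with utf8: %d chars, equal=%s\n",
    length($literal_chars), ($literal_chars eq $escaped ? 'yes' : 'no');
printf "without utf8: %d bytes, equal=%s\n",
    length($literal_bytes), ($literal_bytes eq $escaped ? 'yes' : 'no');
```

So plain Unicode text in the translation files should be sufficient as long as every file carries "use utf8;" and is saved as UTF-8. The \x{...} form has the advantage of working regardless of how the file itself is encoded.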

Then I tried putting the line (use utf8;) into the 7.4 translation files, and the whole translation was displayed in a garbled encoding.
Then I changed the lines (from "SET NAMES utf8" to "mysql_enable_utf8 => 1") and everything was displayed correctly. What is more, the bug with HTTP Proxy (http://www.webgui.org/bugs/tracker/http-proxy--syndicated-content-with-international-text-corrupted#sduZ8mPtKU27rVnbHuYuKg) disappeared, but in 7.4 the same bug with Syndicated Content appeared that I reported for 7.5.15 (http://www.webgui.org/bugs/tracker/syndicatedcontent-brokes-cyrillic-text#eYynbirf_bSWC4lCV5XgwA).


 
 
Vitkovsky

One addition: with "mysql_enable_utf8 => 1" in 7.4 I cannot export packages. It only shows the package folder.

