IPB connection charset during the installation and after


RPG-support

By modern convention, the installer of most contemporary products sets the database connection charset automatically. IPB does not do this. During installation you can enter only the database charset.

For example, when I use the Bitrix system on my www.artsgallery.pro site, I have no problems with the database connection charset, because Bitrix sets it automatically during installation with the following directive: 'set names 'utf8';'.

IPB has no such setting during installation. It appears that IPB only asks the user for the database encoding; UTF8 is advised.

I'm not clear on what you are referring to. Our software doesn't run a SET NAMES query by default (unless you add sql_charset option to your conf_global.php file). The table definitions in your database utilize your MySQL default table character set and column collation settings, as our CREATE TABLE queries do not define any character set information.

This leads to poor ergonomics and wasted time.

Example:

My IPB runs in Russian. This means that I cannot use the MySQL fulltext search setting. The fulltext search setting was disabled in the ACP.

I noticed that searching for Russian words returned no results from IP.Downloads item descriptions. Sometimes Russian words returned proper results, but only from IP.Downloads item titles.

While investigating the issue with the help of the brave IPS tech support, I learned that I will face this problem as long as my server uses a latin1 connection to the databases. BUT I need latin1 for my other sites on this server, so I cannot simply change this setting to utf8.



And since latin1 was the connection charset during IPB installation and usage, the database now had to be converted into 'real' utf8. As a solution, IPS Tier II support advised me to convert the IPB database to 'real' utf8 with the following script.

And finally I changed the conf_global.php settings to the following:

$INFO['sql_charset'] = 'utf8';



This made Russian search work everywhere on my IPB.

I think that adding an extra field to the IPB installation would save a lot of time in this regard.

We need 2 fields during the installation:

Connection charset:
DB charset:

Thank you for reading the long text.

Link to comment
Share on other sites

$INFO['sql_charset'] was added as a small workaround for those who don't know how to configure their servers well. If you create your new database with the default collation set to utf8_general_ci (for example), and if you put "default-character-set=utf8" in my.cnf, there will be no need for $INFO['sql_charset'] at all. And actually that's the only correct way to configure your MySQL for IP.Board to work with UTF-8. Triggering "SET NAMES" on each query isn't. That's why it's hidden by default, to be used only by those who are using cheap shared hosting and can't reconfigure their servers.

Link to comment
Share on other sites

For those who are in the tank: as I said above, I have several sites on the server, and the most important of them need latin1 as the default. So I cannot use the my.cnf setting on my very expensive server. And I suppose the IPB developers were friendlier than you and did not want to force on users the encoding they should use by default on their servers, because they knew that IPB is not and will not be the only sun in the sky.

Link to comment
Share on other sites

Ivan.

conf_global.php can be manually configured for install.

That said.... latin1? How does YOUR native language even work? How does it *need* latin1? If it is some script refusing to offer UTF-8, you should be beefing them. At a general level, UTF-8 should *always* be used.

Link to comment
Share on other sites

1) I use latin1 on the xslt-based sites hosted on the same server, for example.

2) There is no installation setting advised for conf_global.php on the official Installing page, and there is no warning when using check_requirements.php. That means that the majority of users will install IPB and end up with a db that is at least not pure utf8, because their server's db connection may not be utf8, as in my case. I think that non-English-speaking users are a significant part of the IPS community. So the discrimination should be removed in every sense.

3) And finally, a professional attitude to clients' servers means that your product should not depend on the server environment at such a primitive level.

Link to comment
Share on other sites

I implore you to turn on your IPB debug to level 3 with that configuration set. It is not pretty, it is meant for shared hosting, it is the least efficient way to resolve the problem.

Triggering "SET NAMES" on each query isn't. That's why it's hidden by default, to be used only by those who are using cheap shared hosting and can't reconfigure their servers.

1) That is your decision.

2) Power user installation, not letting the pretty web installer make the file for you, thus not documented.

3) Yes, and any coder worth their salt will tell you flat you will have issues using latin1 with non-ASCII input.

Link to comment
Share on other sites

I have already converted the db, as I said in the first post above.

You simply want to put IPB at the center of the universe.

Try to find out what the proper approach to program development is.

I am.

Again, actually go look at the difference in your queries in your debug footer.

There is a reason this is hidden away, and it is highly recommended that one not use it and fix the SQL server config instead.

No, IPB is not at the center of the universe, but the only charset worth anything globally in this world is UTF-8, and you are asking for a field in the installer to support third-party scripts using latin1.

Link to comment
Share on other sites


I have just spoken with one good coder, and he confirmed that a program you are installing should have the db connection charset setting predefined. We are speaking about IPB, not about my server environment, aren't we?

Link to comment
Share on other sites

No, we are speaking of your server environment.
I am not against the field being added to the installer for posterity/shared hosting.
I would like to know why you willingly accept the additional weight instead of fixing the problem at its source.

$INFO['sql_charset'] was added as a small workaround for those who don't know how to configure their servers well. If you create your new database with the default collation set to utf8_general_ci (for example), and if you put "default-character-set=utf8" in my.cnf, there will be no need for $INFO['sql_charset'] at all. And actually that's the only correct way to configure your MySQL for IP.Board to work with UTF-8. Triggering "SET NAMES" on each query isn't. That's why it's hidden by default, to be used only by those who are using cheap shared hosting and can't reconfigure their servers.

This workaround should not be needed when you have control of the server config, and the weight can be dropped/avoided.

The funny thing is, you are making IPB work harder for the latin1 script, which is neither its problem nor its responsibility.

I told you to turn on your level 3 debug and look at the queries with that config for a reason.

Link to comment
Share on other sites

Once again, I have several sites which need latin1.

You look like a tourist who came from the USA to Russia with his own 110V iron and demands 110V in a 220V country. It is more reasonable to buy a proper 220V iron in the country you came to than to force everyone to adopt your utf8 MySQL and server connection rules.

I think it is useless to continue this with you here.

Link to comment
Share on other sites

Sir, I am asking kindly WHY they need latin1, and how typing out a sentence in your native language even works.
UTF-8 is not 'our' rule, it is a global one, it is the only charset to work on every language in existence.

If you only knew the hell charsets are, and what a relief UTF-8 is.

I am not arguing against the option in the installer. I am asking you how latin1 even works in your usage without mangling anything written in your native language, especially where ajax requests are involved. Those will always use UTF-8 regardless of the source charset, which will often result in mangled input in a configuration like this.

Link to comment
Share on other sites

I have http://www.russianpaintings.net/ - it uses latin1 connection (server default), it is in English.
I have http://www.artsgallery.pro/ - it uses utf8 connection predefined in the bitrix engine, it is in Russian.
I have IPB http://www.a108.net/ - it uses utf8 connection set in conf_global after long time of investigation (see the 1st post above), it is both in English and Russian.

From the advanced php code to the advanced php coders of IPS:

A program must not depend on environment variables to get reasonable defaults.

From: http://www.debian.org/doc/debian-policy/ch-opersys.html

I hope the logic of all this will work for IPB as well, since logic is the only universal thing in our lives.

Link to comment
Share on other sites

I do agree the option should be added to the installer. Again, I am not arguing that.
That said.

I have http://www.russianpaintings.net/ - it uses latin1 connection (server default), it is in English.

I think there are a couple of reasons that many web designers and developers still aren't using Unicode across the board.

“I don't need Unicode, because my site is in English!”
I'll bet this is the most common (and stupid) excuse. Even assuming all your content is in English, many of your visitors may not use English as their first language. If you've got any areas where users can contribute content (for example, forums, contact us, blog comments etc), things will go badly. Even if all your visitors are native English-speaking monoglots, it's more than likely that some will have characters in their name that can't be represented in Windows Latin or ASCII.

Source, only because I could not have said it so eloquently myself.
There is literally no reason to run on latin1 beyond a script insane enough to force latin1.
Everything in latin1 is representable losslessly in utf-8. The reverse is not true. Accepting user input means bad things happen to good sites/people.
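
To make that asymmetry concrete, here is a minimal Python sketch. It assumes Python's "latin-1" codec (ISO-8859-1) is a fair stand-in for the server's latin1 (MySQL's latin1 is closer to cp1252, but the point is the same), and all names in it are made up for the illustration.

```python
# Every possible latin1 byte value, 0x00 through 0xFF.
all_latin1_bytes = bytes(range(256))

# latin1 -> text -> UTF-8 -> text -> latin1 should give back the same bytes.
text = all_latin1_bytes.decode("latin-1")
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8").encode("latin-1")
latin1_to_utf8_is_lossless = round_trip == all_latin1_bytes


def fits_in_latin1(s: str) -> bool:
    """Return True if every character of s has a latin1 byte."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


english_fits = fits_in_latin1("English")  # plain ASCII, always fits
russian_fits = fits_in_latin1("Русский")  # Cyrillic has no latin1 bytes

print(latin1_to_utf8_is_lossless, english_fits, russian_fits)
```

The round trip holds for every byte, while the Cyrillic word cannot be encoded at all, which is what happens to user input on a latin1-only site.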

Also, that throws a 502 bad gateway for me.

Link to comment
Share on other sites

Triggering "SET NAMES" on each query isn't. That's why it's hidden by default, to be used only by those who are using cheap shared hosting and can't reconfigure their servers.


You do not need to trigger "SET NAMES" on each query. You should call "SET NAMES" only once, when connecting to the db.

SET NAMES indicates what character set the client will use to send SQL statements to the server. It also specifies the character set that the server should use for sending results back to the client. SET NAMES works for the connection session only.
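
A rough sketch of that "once per connection" idea, in Python. Nothing here is IP.Board code: RecordingConnection is a hypothetical stand-in that only records the SQL it is given, and open_connection and ALLOWED_CHARSETS are names invented for this example.

```python
class RecordingConnection:
    """Hypothetical stand-in for a MySQL connection; it only records SQL."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


# Charset names accepted by this sketch; checked before building the SQL string.
ALLOWED_CHARSETS = {"utf8", "utf8mb4", "latin1"}


def open_connection(charset, factory=RecordingConnection):
    """Open a connection and set its charset once, right after connecting."""
    if charset not in ALLOWED_CHARSETS:
        raise ValueError(f"unsupported charset: {charset}")
    conn = factory()
    # Sent a single time; it stays in effect for the whole session.
    conn.execute(f"SET NAMES '{charset}'")
    return conn


conn = open_connection("utf8")
for query in ("SELECT 1", "SELECT 2", "SELECT 3"):
    conn.execute(query)

set_names_count = sum(s.startswith("SET NAMES") for s in conn.statements)
print(set_names_count, len(conn.statements))
```

With this shape the cost is one extra statement per connection, not one per query.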
Link to comment
Share on other sites

There's no need for arguing here guys.

The reason IP.Board traditionally has not put character set configuration up front and center is multi-fold:

  • Frankly, it confuses 99% of our client base. Most do not understand what a character set is, and the server defaults work fine for the vast majority of our clients (most Russian users host in Russia where the server defaults account for Russian languages, for example).
  • IP.Board has been around for a very long time and users who have been running the software for a very long time are likely trapped into an older character set, typically iso-8859-1, thus making the option more difficult to implement across the board.

We are investigating a forced conversion to UTF-8 with IP.Board 4.0, which would solve the problems once and for all for all users, if successful. We will keep everyone posted with 4.0 development news as it becomes available.

Link to comment
Share on other sites

  • 2 years later...
On 10.04.2013 at 6:00 PM, bfarber said:

There's no need for arguing here guys.

 

The reason IP.Board traditionally has not put character set configuration up front and center is multi-fold:

  • Frankly, it confuses 99% of our client base. Most do not understand what a character set is, and the server defaults work fine for the vast majority of our clients (most Russian users host in Russia where the server defaults account for Russian languages, for example).
  • IP.Board has been around for a very long time and users who have been running the software for a very long time are likely trapped into an older character set, typically iso-8859-1, thus making the option more difficult to implement across the board.

 

We are investigating a forced conversion to UTF-8 with IP.Board 4.0, which would solve the problems once and for all for all users, if successful. We will keep everyone posted with 4.0 development news as it becomes available.

Same problem.

I created a ticket with IPS about this issue; unfortunately I haven't received an answer in 48 hours. :(
I upgraded from 3.2 to 4.1. I have a character set collation problem. My forum uses the Turkish character set.
The 3.2 DB has the utf8 character set. I see "Özgür" in DB, but "Özgür" in forum - this was OK.
The 4.1 upgrade tool converted it to utf-8 again. I see "Özgür" in DB, but "Özgür" in forum - this is incorrect.
Also, members whose login names contain Turkish characters cannot log in because of this issue.
If I post a new message with Turkish characters, I see my post in the DB with Turkish characters, NOT utf8. I think this is a problem. Upside down :)

How can I find the reason and fix this problem? I think there is no need to convert the tables; they must remain utf8.
Is there any parameter in the IPB config files? Or do I need to change the MySQL server client/server/connection collation and character set values?
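
One common way a mismatch like this arises is that UTF-8 bytes are read through a latin1 (cp1252) connection and then stored again as UTF-8. The Python sketch below only illustrates that mechanism using the name from the post; it is not a diagnosis of this particular upgrade, and the variable names are made up for the example.

```python
original = "Özgür"

# What a correctly configured client sends: the UTF-8 bytes of the name.
utf8_bytes = original.encode("utf-8")

# A cp1252 (MySQL "latin1") connection misreads each byte as one character.
mojibake = utf8_bytes.decode("cp1252")

# Storing that misread text in a utf8 column encodes it a second time.
double_encoded = mojibake.encode("utf-8")

# Reversing the two steps recovers the original text, provided every byte
# was a valid cp1252 character (true for this name, not for all input).
repaired = double_encoded.decode("utf-8").encode("cp1252").decode("utf-8")

print(mojibake)
print(repaired)
```

If the raw table contents look like the first printed line while the forum shows the second, the data was most likely written through a non-UTF-8 connection at some point.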

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
