How to Fix the Collation and Character Set of a MySQL Database manually

NOTE: This article only applies to BedigitCom products that can be found here.

What is Collation?

The collation determines how results are sorted and ordered. In newer versions of BedigitCom applications, collation changes may become more strict – i.e, an application requires a certain collation. You must ensure your database has the correct collation for the application it will be used with.

Collation in MySQL can be complicated because you can have a separate collation set at:

  1. The database level
  2. The table level
  3. The column level

Additionally, the information inside a column may be encoded incorrectly as well – causing the data in that column to be displayed incorrectly.

Should I use utf8mb4 instead utf8?

utf8mb4 (which provides support for 4-byte characters) handles Emoji and some Chinese characters that are missing from utf8. There are other, less common, characters. Read about the Unicode planes, realizing that the BMP is the only plane handled by MySQL’s utf8.

As for space… For characters that exist in utf8, utf8mb4 takes the same amount of space.

Setup Guides for MySQL

To setup your MySQL database correctly, see the following resources for each product:

IMPORTANT: Always back up your data before performing any modifications to the database. If possible, test any alter, insert, update, or delete SQL commands on a staging server first.

NOTE: You may wish to add all the ALTER TABLE statements to a single file for easier execution.

Dealing with Foreign Key constraints

It may be necessary to ignore foreign key constraints when making changes to a large number of columns. You can use the SET FOREIGN_KEY_CHECKS command to ignore foreign key constraints while you update the database.

SET FOREIGN_KEY_CHECKS=0;
 
-- Insert your other SQL Queries here...
 
SET FOREIGN_KEY_CHECKS=1;

Changing the database collation

In the example below, change:

  • <yourDB> to your actual database name
  • <charset> to either utf8 or utf8mb4
  • <collation> to either utf8_unicode_ci or utf8mb4_unicode_ci

To change the database collation:

ALTER DATABASE <yourDB> CHARACTER SET <charset> COLLATE <collation>

Changing table collation

WARNING: Please note, the query below will produce a series of ALTER TABLE statements, which you must then run against your database.

In the example below, change:

  • <yourDB> to your actual database name
  • <charset> to either utf8 or utf8mb4
  • <collation> to either utf8_unicode_ci or utf8mb4_unicode_ci

To change table collation:

SELECT CONCAT('ALTER TABLE `',  table_name, '` CHARACTER SET <charset> COLLATE <collation>;')
FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
WHERE C.collation_name = T.table_collation
AND T.table_schema = '<yourDB>'
AND
(
    C.CHARACTER_SET_NAME != '<charset>'
    OR
    C.COLLATION_NAME != '<collation>'
);

Changing column collation

WARNING: Please note, similar to the query above, the queries below (one for varchar columns, and one for non-varchar columns) will produce a series of ALTER TABLE statements, which you must then run against your database. 

In the examples below, change:

  • <yourDB> to your actual database name
  • <charset> to either utf8 or utf8mb4
  • <collation> to either utf8_unicode_ci or utf8mb4_unicode_ci

To change column collation for varchar columns:

SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, '(', CHARACTER_MAXIMUM_LENGTH, ') CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS 
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE = 'varchar'
AND
(
    CHARACTER_SET_NAME != '<charset>'
    OR
    COLLATION_NAME != '<collation>'
);

To change column collation for non-varchar columns:

SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, ' CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS 
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE != 'varchar'
AND
(
    CHARACTER_SET_NAME != '<charset>'
    OR
    COLLATION_NAME != '<collation>'
);

.