You are currently reviewing an older revision of this page.

DRAFT - INT-XXXX - How to update MySQL collation to support emojis

Symptom

When attempting to store text containing emojis and certain other Unicode characters in the customer's MySQL business database, they encounter the following error:

2019-08-22 06:56:39,213 [Appian Work Item - 75 - execution00 : UnattendedJavaActivityRequest] ERROR org.hibernate.util.JDBCExceptionReporter - Incorrect string value: 'blah' for column 'blah' at row 1
2019-08-22 06:56:39,214 [Appian Work Item - 75 - execution00 : UnattendedJavaActivityRequest] ERROR com.appiancorp.type.external.writetodatastore.WriteToDataStoreCore - Error when trying to write to the data store. com.appiancorp.suiteapi.process.exceptions.SmartServiceException: userMsg[error.duringWrite.userMsg=An error occurred while trying to write to the entity <entity> [id=<uuid>, type=<type> (id=<id>)] (data store: <data store>). Details: org.hibernate.exception.GenericJDBCException: could not insert: [<entity>]: java.sql.SQLException: Incorrect string value: '<value>' for column '<column>' at row <rownum> Data: TypedValue[it=<id>,v={<value>}

Cause

This is caused by the MySQL database by default not being configured to support the full set of unicode characters.

Action

On the customer's MySQL database, run the following SQL commands to manually set the default server level character set and collation:

set global character_set_server = 'utf8mb4'; set global collation_server = 'utf8mb4_unicode_ci';

Any new tables created will inherit the default charater set and collation. But any tables that are already created with the previous character set and collation will not automatically switch their character set to the new one. The customer will need to do that explicitly themselves. For each <table_name>, the customer should run:

alter table <table_name> convert to character set utf8mb4;

They need to do it themselves because something may not be compatible between two character sets in which case they will need to decide what to do. For example, latin1 characters only take one byte per character. The maximum index key length in mysql is 767 bytes. So they can have an index on a character field that is 767 bytes. But utf8mb4 can take up to 4 bytes per character. Which means that you can have only have an index on a varchar field that is 767/4=191 characters only. The column length can be more than 191 characters but the index key length cannot be more than 191 characters. They will need to decide what to do in that case:

  1. Reduce the length of the column on which there is an index to 191 characters only. Drop the index. Change the character set. Then recreate the index.
  2. Keep the column length as it is but just recreate the index with a maximum key length of 191 chars only. This will work if the index is not unique. It will not work if the index is unique.

The following query will help them identify which tables have columns that are using a character set:

select table_name, column_name, ordinal_position, column_default, is_nullable, data_type, character_maximum_length, character_set_name, collation_name, column_type from information_schema.columns where character_set_name is not null and table_schema = 'Appian' order by table_name, ordinal_position;

Affected Versions

This article applies to all versions of Appian.

Last Reviewed:September 2019