utf8mb4 instead of
utf8 in MySQL/MariaDB to have proper unicode support
This is actually a bit of an older problem: MySQL’s
utf8 isn’t UTF-8.
utf8 in MySQL only supports three bytes per character, while the real UTF-8 needs up to four bytes. This results in some characters not being supported.
Proper UTF-8 support has been introduced as a workaround in 2010 (version 5.5.3[^1]) with a new character set called
Mathias Bynens explains in detail what the implications are. In a nutshell: You won’t be able to store all unicode characters, including the (very important) 💩-emoji. I’m not suggestion that you write your own web bot, that throws piles of poop around the web just to see, who used
utf8 and has some data loss in the process.
[^1] I cannot find the actual changelog anymore, however this was confirmed by at least three independent articles (or at least they appear to be independent).
Remember to repair and optimize all databases after running any of those commands:
mysqlcheck -u root -p --auto-repair --optimize --all-databases
Run this sequence for every database “database” you need to switch over
ALTER DATABASE database CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
Convert a table
ALTER TABLE `db.table` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Switch column of a table
Convert a single column (
ALTER TABLE `db.table` MODIFY COLUMN column varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;