Is "VARCHAR(255) CHARACTER SET utf8" 255 bytes or 255 characters?

Time: 2010-08-12 11:09:52

Tags: mysql unicode utf-8 truncate varchar

I have declared a field in an InnoDB/MySQL table as

VARCHAR(255) CHARACTER SET utf8 NOT NULL

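However, when inserted, my data is truncated at 255 bytes rather than 255 characters. This can split a trailing two-byte code point in two, leaving an invalid character. Any ideas what I might be doing wrong?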

Edit:

A sample session looks like this:

mysql> update channel set comment="ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬x" where id = 1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 1

mysql> select id, channelName, comment from channel;
+----+-------------+------------------------------------------------------------------------------------------
| id | channelName | comment                                                                                                                                                                                                                                                         |
+----+-------------+-----------------------------------------------------------------------------------------
|  1 | foo         | ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩ�� |
+----+-------------+-----------------------------------------------------------------------------------------
1 row in set (0.00 sec)

Using mysql-admin, I looked at the comment field and confirmed that it is indeed VARCHAR(255) and uses "UTF-8 Unicode".

From the command

show full columns from channel

I get

+-----------------------------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| Field                       | Type             | Collation       | Null | Key | Default | Extra          | Privileges                      | Comment |
+-----------------------------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| id                          | int(11)          | NULL            | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| channelName                 | varchar(255)     | utf8_general_ci | NO   |     | NULL    |                | select,insert,update,references |         |
| comment                     | varchar(255)     | utf8_general_ci | NO   |     | NULL    |                | select,insert,update,references |         |
+-----------------------------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+

mysql> show variables like 'character_set%';

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

2 answers:

Answer 0 (score: 7)

According to the manual, you should be fine:

  

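MySQL interprets length specifications in character column definitions in character units. (Before MySQL 4.1, column lengths were interpreted in bytes.) This applies to the CHAR, VARCHAR, and TEXT types.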

Are you by any chance using a version of MySQL older than 4.1?
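
To make the character/byte distinction from the manual concrete, here is a minimal Python sketch (not MySQL itself) that compares the two counts for a runic string. The sample string is made up for illustration; inside MySQL the analogous checks would be CHAR_LENGTH() for characters and LENGTH() for bytes.

```python
# Hypothetical sample: 255 copies of the runic letter U+16A0.
sample = "\u16a0" * 255

# Number of code points: the unit a VARCHAR(255) length refers to since MySQL 4.1.
char_count = len(sample)

# Number of bytes once the string is encoded as UTF-8.
byte_count = len(sample.encode("utf-8"))

print(char_count)  # 255
print(byte_count)  # 765, because each of these runes takes 3 bytes in UTF-8
```

If the column really counts characters, all 255 runes fit even though they occupy 765 bytes.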

Answer 1 (score: 2)

This is a stab in the dark, but are you using UTF-8 as the connection and client character set? Issue SHOW VARIABLES LIKE 'character_set%' and see whether it reports UTF-8 or latin-1.

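Perhaps, if you are using the wrong connection/client character set, the UTF-8 bytes are being reinterpreted as single-byte characters and stored in the database that way.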
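
The effect described here can be simulated outside MySQL. The following is a rough Python sketch, assuming that a latin1 connection makes the server treat every incoming byte as one character; Python's latin-1 codec stands in for MySQL's latin1, and the input string is hypothetical.

```python
# Hypothetical input: 50 runes, one ASCII space, 50 more runes.
rune = "\u16a0"  # takes 3 bytes in UTF-8
text = rune * 50 + " " + rune * 50

# The client sends UTF-8 bytes: 150 + 1 + 150 = 301 bytes.
utf8_bytes = text.encode("utf-8")

# Assumption: over a latin1 connection, each byte is read as one character.
seen_by_server = utf8_bytes.decode("latin-1")

# VARCHAR(255) keeps 255 of those "characters", i.e. 255 of the original bytes.
stored = seen_by_server[:255]

# Reading back over the same latin1 connection returns the truncated bytes,
# which the terminal then tries to display as UTF-8.
returned = stored.encode("latin-1")
recovered = returned.decode("utf-8", errors="replace")

print(len(utf8_bytes), len(stored))
print(recovered.endswith("\ufffd"))
```

Under that assumption, the 255-character limit behaves like a 255-byte limit, and the cut can land in the middle of a multi-byte sequence. The comment in the question's session appears to contain one ASCII space among three-byte runes, and character_set_client is reported as latin1, which would be consistent with the broken trailing character in the SELECT output.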