Blog RSS Feed Subscribe

Jordi Boggiano

Jordi Boggiano Passionate web developer, specialized in web performance and php. Partner at Nelmio, information junkie and speaker.

Categories

MySQL's GROUP_CONCAT limitations and cascading bad luck

We had an incident today over at Teamup (where I have worked for the last 9 months by the way:) which is worth a quick blog post if it helps save anyone from having a bad day.

We are using MySQL's GROUP_CONCAT feature to fetch a list of ids to delete when cleaning up old demo calendars. You end up with a list of ids in one row, easy to fetch, split it on commas, and done. So far so good. Then we run a few DELETE ... WHERE id IN (...) queries to clean things up in a few tables. So far so good.

However if you fail to read the fine print on the MySQL docs, you might not have seen this sentence: The result is truncated to the maximum length that is given by the group_concat_max_len system variable, which has a default value of 1024. What this means is that a query that worked just fine in testing conditions, suddenly started failing in production once the data set hit a critical size. Thanks to another stroke of bad luck, it returned a list of ids truncated right after a comma (3,4,5,) so we had an empty id in our WHERE IN (3,4,5,) clause. Unfortunately combined with the fact we had optional relations in some tables (I won't bore you with details) that empty match made it wipe about 60% of the data in those.

Thankfully we have backups on top of the DB replication which let us recover the lost data pretty quickly, and it only affected a small feature in the grand scale of things, but this could have ended much worse so it is worth pointing out a few things:

  • If you use GROUP_CONCAT and expect large amounts of data returned, make sure to increase the limit before executing your query. For example this sets both the max length for the group concat and the max packet length (which caps the former) to 10MB SET SESSION group_concat_max_len = 10485760, SESSION max_allowed_packet = 10485760;. Use more if you think you need more.
  • Maybe for safety using GROUP_CONCAT should be avoided if you don't know how much data to expect, simply fetching ids and then fetching all rows at the program level does the job too.
  • Do snapshot backups even if you have replication in place, it can save your ass!

And now to hope for a more quiet rest of the week!

Edit: There is some good news, MySQL 5.8 might include a fix and turn the current warning for truncation into an error, see http://bugs.mysql.com/bug.php?id=78041

August 11, 2015 // PHP, Web

Post a comment

Subscribe to this RSS Feed Comments

2015-08-13 08:04:18

Eugene Leonovich

I run into the same problem with GROUP_CONCAT recently.

Just a note, if you need to concatenate strings instead of integer values, you can utilize a null byte as a separator:

GROUP_CONCAT(my_str_field SEPARATOR "\0")
-- or
GROUP_CONCAT(CONCAT_WS(CHAR(0), my_str_field1, my_str_field2) SEPARATOR "\0")


And then unpack it:
explode("\0", $data)
// or
array_chunk(explode("\0", $data), 2)

2015-08-13 08:06:43

Eugene Leonovich

P.S. The "Post a comment" link is broken.

2015-08-13 09:11:02

Seldaek

@Eugene: how is the link broken and if it is how did you manage to comment? :)

2015-08-13 13:14:39

Eugene Leonovich

@Seldaek I use Firefox 40 on Mac, when I click the link, I get redirected to "javascript:document.getElementById('comment-wrap').style.display = 'block'; document.getElementById('comment-toggle').style.display = 'none'; false;", so I used Firefox Developer Tools to make the form visible.

It works well in Google Chrome though.

2015-08-13 14:13:50

Seldaek

Ah ok thanks, fixed now I believe.

2015-08-18 07:55:43

Kick_the_BUCKET

Why not just use a sub-select inside the WHERE IN() clause? The only thing to remember in such a case is to only select one column in the subselect and you won't ever run into length limitations...

2016-09-14 20:57:23

Webcaresoft

A very good and Useful information you have shared with us thanks for this good stuff informative for Beginners. Thanks again