MongoDB, Ruby and UTF-8
Yesterday, I was trying to insert a bunch of records into a MongoDB instance. I kept noticing that ‘bad’ strings were passed along, despite various attempts to encode them properly as UTF-8. Finally, when I fired up a MapReduce function in the Mongo shell, it returned the following error:
map invoke failed: JS Error: Error: invalid utf8 nofile_b:1
The complete result:
When I searched for that specific erroneous record in the shell, it told me this:
Google didn’t return the answers I needed until I stumbled upon this Google Groups thread from about a week ago: http://groups.google.com/group/mongodb-user/browse_thread/thread/7ed11f212d84…
Instructions.
If you don’t want to read another blog after crawling through a dozen other sites (like I did), here are the instructions:
Just update to the Mongo 1.3 gem, released today! (What a coincidence)
and the BSON_ext gem:
The update will prevent Ruby from inserting invalid UTF-8 strings into MongoDB.
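If you want to catch the bad strings yourself before they ever reach the driver, something like the sketch below might help. It is not taken from the mongo or bson_ext gems: `to_valid_utf8` is a hypothetical helper name of my own, and it assumes Ruby 2.1 or newer, where `String#scrub` is available. It also assumes the incoming bytes are meant to be UTF-8; if your data is really Latin-1 or similar, transcoding with `String#encode` would be the better fix.

```ruby
# Hypothetical helper (not part of the mongo or bson_ext gems).
# Assumes Ruby >= 2.1 for String#scrub.
# Returns a copy of +str+ tagged as UTF-8, with any invalid byte
# sequences replaced by +replacement+.
def to_valid_utf8(str, replacement = "?")
  # Work on a copy so the caller's string is left untouched.
  utf8 = str.dup.force_encoding(Encoding::UTF_8)

  # Nothing to repair if the bytes already form valid UTF-8.
  return utf8 if utf8.valid_encoding?

  # Swap every invalid byte sequence for the replacement string.
  utf8.scrub(replacement)
end

# Example: one well-formed string and one with a stray Latin-1 byte.
good = to_valid_utf8("caf\xC3\xA9".b) # valid UTF-8 bytes for "café"
bad  = to_valid_utf8("caf\xE9".b)     # \xE9 alone is not valid UTF-8

puts good.valid_encoding?
puts bad.valid_encoding?
```

Replacing bytes loses information, so in practice you may prefer to log or reject records where `valid_encoding?` is false rather than silently scrubbing them.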