Fun with Java RegEx String Replacement


In the Pachyderm presentation publishing code, there is a step where it compiles the various versions of images for use in the final product - resizing as needed, wrapping in the .swf file and burning in the metadata. It uses an XML format, provided by JSwiff, to replace the freeze-dried content of a templated Flash file wrapper with the dynamically defined data (image and text).

We just do a simple find-and-replace, looking for special tags that we've placed in the XML version of the templated .swf - things like "{tombstoneTitleShort}", which get replaced with "My Most Excellent Photo". Seems simple. But I just came across a case where it failed. The extended text for an image included a $ - which would be fine, but it's a Magic Regex Character. In a pattern it symbolizes the end of a line of text (likewise, ^ symbolizes the beginning of a line), and in the replacement string handed to String.replaceAll() it introduces a group reference such as $1. It was unescaped, so the String.replaceAll() method was barfing appropriately. I think this is what was happening... Looked like it from the debug output, anyway...
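Here's a minimal sketch of how that failure can be reproduced. The class name, the helper and the sample strings are made up for illustration (they aren't from the Pachyderm code), and I'm assuming the tag's braces are escaped in the pattern. The second argument to String.replaceAll() treats $ as the start of a group reference, so a stray $ in the value gets rejected:

```java
class DollarInReplacement {
    // Hypothetical helper: reports whether replaceAll() rejects the replacement text.
    static boolean replaceAllFails(String template, String regex, String value) {
        try {
            template.replaceAll(regex, value);
            return false;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            // "$" followed by a non-digit is an illegal group reference;
            // "$5" with no fifth group is reported as a missing group.
            return true;
        }
    }

    public static void main(String[] args) {
        String template = "<text>{tombstoneTitleShort}</text>";
        String regex = "\\{tombstoneTitleShort\\}"; // braces escaped for the regex engine

        System.out.println(replaceAllFails(template, regex, "My Most Excellent Photo")); // false
        System.out.println(replaceAllFails(template, regex, "Bought for $ twenty"));     // true
    }
}
```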

I just switched that bit of string replacement to use the Apache Commons Lang StringUtils replacement method (org.apache.commons.lang.StringUtils.replace(String, String, String)), and all appears fine now.
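To keep the example free of extra jars, this sketch uses the JDK's own String.replace(CharSequence, CharSequence) as a stand-in for the Commons Lang call. It is not the same method, but it is also a plain literal substitution with no regex interpretation on either side. The class and helper names are hypothetical:

```java
class LiteralReplace {
    // Hypothetical helper: fills one placeholder tag, treating both
    // the tag and the value as literal text.
    static String fill(String template, String tag, String value) {
        return template.replace(tag, value);
    }

    public static void main(String[] args) {
        String template = "<text>{tombstoneTitleShort}</text>";
        // Neither the braces in the tag nor the $ and ^ in the value need escaping.
        System.out.println(fill(template, "{tombstoneTitleShort}", "Worth $5 (or ^ more)"));
        // prints <text>Worth $5 (or ^ more)</text>
    }
}
```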

As an aside, there are a lot of handy little goodnesses in the Apache Commons Lang library (as in the other Commons libraries). I need to make sure I'm taking advantage of it more, rather than writing my own utility code...

Update: Nope. That didn't work as cleanly as I'd hoped. Now it was complaining that some of the replaced characters were invalid UTF-8. So, I'm now replacing characters in my replacement string to attempt to escape them properly before feeding them to String.replaceAll( pattern, value ), using the "oldReplace" method from this tip.
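For what it's worth, here's one way the escaping step could look using only the JDK. This is a sketch, not the "oldReplace" method from the tip: it leans on Matcher.quoteReplacement() to escape $ and \ in the value, and on Pattern.quote() to make the tag literal. The class and helper names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedReplaceAll {
    // Hypothetical helper: same find-and-replace, but both arguments
    // are neutralized before they reach String.replaceAll().
    static String fill(String template, String tag, String value) {
        String pattern = Pattern.quote(tag);                 // "{" and "}" lose their regex meaning
        String safeValue = Matcher.quoteReplacement(value);  // "$" and "\" lose their replacement meaning
        return template.replaceAll(pattern, safeValue);
    }

    public static void main(String[] args) {
        String template = "<text>{tombstoneTitleShort}</text>";
        System.out.println(fill(template, "{tombstoneTitleShort}", "C:\\photos cost $5"));
        // prints <text>C:\photos cost $5</text>
    }
}
```

This only deals with the regex side of things. It does nothing about the invalid UTF-8 complaint, which would need its own handling.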

