EncodeURI
23 December 2011

Last week I ported my Diff Match Patch library to Dart. (Although the port is complete, it won't get released until after the holidays due to all my code reviewers being with their families.) One of the major stumbling blocks I ran into was Dart's lack of encodeURI/decodeURI functions to turn text into characters that are safe to use in a URL.

Obviously I had to write my own functions. No big deal. How hard can it be to look up the character code for '@', do a hexadecimal conversion and return '%40'?

Turns out that the format is much more complicated than this. Two-digit hex codes only get you to character 127 (the top-most bit must be 0). Beyond this, one needs to switch to four- and six-digit codes (two and three UTF-8 bytes) to reach 65k Unicode characters, all of which must be encoded in UTF-8's bit scheme. Beyond even this, one needs to switch to eight-digit codes (four UTF-8 bytes), with the character arriving as a UTF-16 surrogate pair, to reach an additional one million Unicode characters.
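To make the byte layout concrete, here is a minimal sketch of the scheme described above. It is written in Python for illustration and is not the Dart code from the port. The names `SAFE`, `utf8_bytes`, `combine_surrogates` and `percent_encode` are hypothetical, and the set of characters left unescaped is an assumption chosen so that '@' becomes '%40' as in the example above; a faithful encodeURI port would use its reference implementation's set instead.

```python
# Characters passed through untouched. This set is an assumption made for
# illustration, not the set used by any particular encodeURI implementation.
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.!~*'()"
)


def combine_surrogates(high, low):
    """Join a UTF-16 surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def utf8_bytes(code_point):
    """Pack one Unicode code point into UTF-8 bytes by hand."""
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("lone surrogate cannot be encoded")
    if code_point < 0x80:
        # Up to 7 bits: 0xxxxxxx (one byte, two hex digits)
        return [code_point]
    if code_point < 0x800:
        # Up to 11 bits: 110xxxxx 10xxxxxx (four hex digits)
        return [0xC0 | (code_point >> 6),
                0x80 | (code_point & 0x3F)]
    if code_point < 0x10000:
        # Up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx (six hex digits)
        return [0xE0 | (code_point >> 12),
                0x80 | ((code_point >> 6) & 0x3F),
                0x80 | (code_point & 0x3F)]
    # Up to 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (eight hex digits)
    return [0xF0 | (code_point >> 18),
            0x80 | ((code_point >> 12) & 0x3F),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F)]


def percent_encode(text):
    """Percent-encode every character of text that is not in SAFE."""
    out = []
    for ch in text:
        if ch in SAFE:
            out.append(ch)
        else:
            out.extend("%%%02X" % b for b in utf8_bytes(ord(ch)))
    return "".join(out)
```

Python strings already hold whole code points, so `combine_surrogates` is only needed when the input comes from a UTF-16 language such as Dart or JavaScript, where a character above U+FFFF shows up as two 16-bit units.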
Here's the resulting code and the unit tests:

After I showed my solution to the Dart team, they asked me to submit it to their core library so that nobody else has to go through what I went through. Dart is shaping up to be a really great language. Although it's not ready for serious use right now, it is great to be able to help shape a language that's going to be one of the pillars of web programming within a few years.

Update: encodeUri/decodeUri are now included with Dart. Use the Uri class.