{"id":21,"date":"2006-06-18T23:05:31","date_gmt":"2006-06-18T11:05:31","guid":{"rendered":"http:\/\/craig.dubculture.co.nz\/blog\/2006\/06\/18\/asshat-space\/"},"modified":"2015-09-27T16:55:28","modified_gmt":"2015-09-27T15:55:28","slug":"asshat-space","status":"publish","type":"post","link":"http:\/\/craig.dubculture.co.nz\/blog\/2006\/06\/18\/asshat-space\/","title":{"rendered":"Asshat space (or wordpress c2 a0, for search-fu)"},"content":{"rendered":"<p>Somehow, WordPress is inserting C2 A0 characters in my feed, which means that Planet NZTech can't parse them, so my posts don't show up until I find them manually and fix them.<\/p>\n<p>C2 A0 is the UTF-8 encoding of the Unicode non-breaking space (U+00A0). It could be because of my habit of hitting Space twice after a sentence, so that it realises one of them has to be non-breaking. Whatever it is, it's irritating.<\/p>\n<p>It doesn't happen in the output under ISO-8859-1. It's only on Windows, doing a diff of the feed as downloaded on my UTF-8 Linux server, that I actually see the problem.<\/p>\n<p>Badly configured UTF-8 systems often end up with the symbol A-with-circumflex (\u00c2) before the character. In #wlug, we lovingly call this character \"the asshat\". I had thought that putting it in would stop this post from being picked up, but it seems there's an \u00e2 in HTML just for my asshat character.<\/p>\n<p>I've also found I can see them with <em>LANG=iso-8859-1 less index.html<\/em>. 
This explains why I couldn't find them to start with - less runs in UTF-8 by default, which draws it as a space!<\/p>\n<p>Unfortunately, it works fine on <a href=\"http:\/\/planet.wlug.org.nz\/\">Planet WLUG<\/a>, so it's fixed in newer planetplanet, which <a href=\"http:\/\/words.rancidbacon.com\/planetyak-2006-06-18-06-55.html\">doesn't work for Follower<\/a> at the moment.<\/p>\n<p>Not much can really be fixed at this point, so this writeup can act as a \"this is the problem\" in case anyone Googles for \"wordpress c2 a0\".<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Somehow, WordPress is inserting C2 A0 characters in my feed, which means that Planet NZTech can't parse them, so my posts don't show up until I find them manually and fix them. C2 A0 is a Unicode non-breaking space. It could be because of my habit of hitting Space twice after a sentence, that it [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[66],"tags":[22,21],"_links":{"self":[{"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/posts\/21"}],"collection":[{"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/comments?post=21"}],"version-history":[{"count":2,"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/posts\/21\/revisions"}],"predecessor-version":[{"id":693,"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/posts\/21\/revisions\/693"}],"wp:attachment":[{"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/media?parent=21"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/craig
.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/categories?post=21"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/craig.dubculture.co.nz\/blog\/wp-json\/wp\/v2\/tags?post=21"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}