Background | |
The UCS defines far more characters than fit in 1 byte, so 1 byte per character is not enough. The UCS was originally designed as a 31-bit code space, so the natural fixed size for each character is 4 bytes (32 bits); Unicode has since been restricted to code points U+0000 through U+10FFFF, which still needs more than 2 bytes. The problem with such wide characters is that they only pay off if your use of the roughly 2 billion possible code points is evenly distributed. Most people use no more than 256 of those characters in their documents, so a lot of space is wasted. The UTF-8 encoding transforms each 4-byte wide character into a sequence of 1 to 4 bytes (the original definition allowed up to 6 bytes; RFC 3629 limits it to 4). It is backwards compatible with ASCII, meaning that text encoded in ASCII is automatically valid UTF-8 as well. All other characters (including the non-ASCII characters of Latin-1) take two or more bytes each. That's why 'æ', 'ø', and 'å' turn into two-letter combinations when a UTF-8 encoded text is viewed as Latin-1. All of the above is dealt with in much more detail in the UTF-8 and Unicode FAQ for Unix/Linux, which is useful for a lot more than just Unix/Linux. |
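The variable byte length and the two-letter effect can be seen in a few lines of Python. This is only a minimal sketch for illustration: the helper names `utf8_length` and `view_as_latin1` are made up here and are not part of Tiki or of any library.

```python
# Illustrative sketch only; the helper names below are hypothetical.

def utf8_length(ch: str) -> int:
    """Return how many bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def view_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as Latin-1.

    This mimics a viewer that assumes Latin-1 for a UTF-8 document.
    """
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # ASCII stays at 1 byte; other characters take 2, 3, or 4 bytes.
    for ch in ["a", "æ", "ø", "å", "日", "😊"]:
        print(repr(ch), utf8_length(ch), "byte(s)")

    # Each of these letters is 2 bytes in UTF-8, so a Latin-1 viewer
    # shows two characters per letter.
    print(view_as_latin1("æøå"))
```

Plain ASCII text gives the same bytes under both encodings, which is why ASCII documents need no conversion.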
If Tiki on your server doesn't look fine: | |
Some servers (like Apache with the default Debian config) add a To solve this with Apache, either check that there is no option Another solution is to move the |
Test on this page itself | |
(pl) Czy polski ogónki funkcjonują tutaj, pisany przez mozilli? Np, tu trochę treść. (ja) 日本語. (en) Good, this looks OK 😊. |
Editing the translations | |
|