The Unicode BOM (byte order mark) @ 工作小錦囊 - 隨意窩
A Byte Order Mark (BOM) is the character at code point U+FEFF ("zero-width no-break space"), when that character is used to denote the endianness of a string of UCS/Unicode characters encoded in UTF-16 or UTF-32 and/or as a marker to indicate that text is encoded in UTF-8, UTF-16 or UTF-32.
http://en.wikipedia.org/wiki/Byte_Order_Mark

I ran into this problem when a script output two static UTF-8 encoded files and the HTML layout broke in IE.

    readfile("template1");
    ... html code
    readfile("template2");

Both template1 and template2 were saved as ucs-bom (the leading bytes are ef bb bf).

    Bytes          Encoding Form
    00 00 FE FF    UTF-32, big-endian
    FF FE 00 00    UTF-32, little-endian
    FE FF          UTF-16, big-endian
    FF FE          UTF-16, little-endian
    EF BB BF       UTF-8

For output performance, readfile cannot be replaced with file_get_contents, so there are a few ways to handle it:

1. Add an HTML remark (comment):

    echo "real contents.....

2. Strip the BOM bytes out of the output buffer:

    // Removes every occurrence of the UTF-8 BOM, not only a leading one.
    function strip_u8_bom($str) {
        return str_replace("\xEF\xBB\xBF", "", $str);
    }

    // Removes the UTF-8 BOM only when it is the first three bytes.
    function StripUtf8Bom($data) {
        if (substr($data, 0, 3) === "\xEF\xBB\xBF")
            return substr_replace($data, '', 0, 3);
        return $data;
    }

3. Remove the BOM from the templates themselves:

    if ($_SERVER["argc"] < 2) {
        printf("Usage: %s file [file...]\n", $_SERVER['argv'][0]);
        exit;
    }
    $argc = $_SERVER["argc"];
    for ($i = 1; $i
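
The BOM table and option 2 above could be combined roughly as follows. This is only a sketch: the function names `detect_bom` and `readfile_without_utf8_bom` are hypothetical, not from the original post. It assumes the whole template fits comfortably in memory, since capturing `readfile` output in a buffer gives up some of the streaming benefit mentioned above. The 4-byte signatures are checked before the 2-byte ones because FF FE 00 00 (UTF-32, little-endian) starts with FF FE (UTF-16, little-endian).

```php
<?php
// Hypothetical helpers for illustration; the signatures come from the table above.

/**
 * Return a label for the BOM found at the start of $data, or null if none.
 * Note: UTF-16LE text whose first character is U+0000 would be reported
 * as UTF-32LE, because the leading bytes are identical.
 */
function detect_bom(string $data): ?string
{
    // Longest signatures first so FF FE 00 00 is not mistaken for FF FE.
    $signatures = [
        ["\x00\x00\xFE\xFF", 'UTF-32BE'],
        ["\xFF\xFE\x00\x00", 'UTF-32LE'],
        ["\xEF\xBB\xBF",     'UTF-8'],
        ["\xFE\xFF",         'UTF-16BE'],
        ["\xFF\xFE",         'UTF-16LE'],
    ];
    foreach ($signatures as [$bytes, $name]) {
        // strncmp is binary safe, so NUL bytes in the signature are fine.
        if (strncmp($data, $bytes, strlen($bytes)) === 0) {
            return $name;
        }
    }
    return null;
}

/**
 * Output a file like readfile(), but drop a leading UTF-8 BOM.
 * The file is captured in an output buffer first, then echoed.
 */
function readfile_without_utf8_bom(string $path): void
{
    ob_start();
    readfile($path);
    $contents = (string) ob_get_clean();

    if (detect_bom($contents) === 'UTF-8') {
        // Skip the three BOM bytes EF BB BF.
        $contents = (string) substr($contents, 3);
    }
    echo $contents;
}
```

With these in place, the two `readfile("template1")` and `readfile("template2")` calls would become `readfile_without_utf8_bom("template1")` and `readfile_without_utf8_bom("template2")`. Only a leading BOM is removed, so any EF BB BF sequence later in the file is left untouched.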