Unicode spaces - Jukka K. Korpela


This document lists the various space characters in Unicode. For a description, consult chapter 6 Writing Systems and Punctuation and the block description General Punctuation in the Unicode standard. This document also lists three characters that have no width and can thus be described as no-width spaces.

The third column of the following table shows the appearance of the space character, in the sense that the cell contains the words “foo” and “bar” separated by that character. It is possible that your browser does not present all the space characters properly. This depends on the font used, on the browser, and on the fonts available in the system.

Space characters and “zero-width spaces” in Unicode

| Code | Name of the character | Sample | Width of the character |
|---|---|---|---|
| U+0020 | SPACE | foo bar | Depends on font, typically 1/4 em, often adjusted |
| U+00A0 | NO-BREAK SPACE | foo bar | As a space, but often not adjusted |
| U+1680 | OGHAM SPACE MARK | foo bar | Unspecified; usually not really a space but a dash |
| U+180E | MONGOLIAN VOWEL SEPARATOR | foo᠎bar | 0 |
| U+2000 | EN QUAD | foo bar | 1 en (= 1/2 em) |
| U+2001 | EM QUAD | foo bar | 1 em (nominally, the height of the font) |
| U+2002 | EN SPACE (nut) | foo bar | 1 en (= 1/2 em) |
| U+2003 | EM SPACE (mutton) | foo bar | 1 em |
| U+2004 | THREE-PER-EM SPACE (thick space) | foo bar | 1/3 em |
| U+2005 | FOUR-PER-EM SPACE (mid space) | foo bar | 1/4 em |
| U+2006 | SIX-PER-EM SPACE | foo bar | 1/6 em |
| U+2007 | FIGURE SPACE | foo bar | “Tabular width”, the width of digits |
| U+2008 | PUNCTUATION SPACE | foo bar | The width of a period “.” |
| U+2009 | THIN SPACE | foo bar | 1/5 em (or sometimes 1/6 em) |
| U+200A | HAIR SPACE | foo bar | Narrower than THIN SPACE |
| U+200B | ZERO WIDTH SPACE | foo​bar | 0 |
| U+202F | NARROW NO-BREAK SPACE | foo bar | Narrower than NO-BREAK SPACE (or SPACE), “typically the width of a thin space or a mid space” |
| U+205F | MEDIUM MATHEMATICAL SPACE | foo bar | 4/18 em |
| U+3000 | IDEOGRAPHIC SPACE | foo　bar | The width of ideographic (CJK) characters |
| U+FEFF | ZERO WIDTH NO-BREAK SPACE | foo﻿bar | 0 |

“Zero-width spaces”

Previously, MONGOLIAN VOWEL SEPARATOR (U+180E) was classified as a space character; it is now classified as a formatting character (with no width).
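The classification shown in the table can be checked against the Unicode Character Database. The following is a minimal sketch in Python, assuming the standard `unicodedata` module; the helper names `describe` and `is_space_separator` are made up for this illustration. The reported categories depend on the Unicode version bundled with the Python build (see `unicodedata.unidata_version`), so older builds may classify U+180E differently.

```python
import unicodedata

# Code points from the table above, in the same order.
CODE_POINTS = [
    0x0020, 0x00A0, 0x1680, 0x180E,
    *range(0x2000, 0x200C),  # U+2000..U+200B inclusive
    0x202F, 0x205F, 0x3000, 0xFEFF,
]


def describe(cp):
    """Return (code, name, general category) for one code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))


def is_space_separator(cp):
    """True if the code point has general category Zs (space separator)."""
    return unicodedata.category(chr(cp)) == "Zs"


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in CODE_POINTS:
        code, name, category = describe(cp)
        print(f"{code}  {category}  {name}")
```

Characters reported as `Zs` are space characters in the Unicode sense; the zero-width characters are reported as `Cf` (format characters) by recent versions of the database.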
The characters ZERO WIDTH SPACE (U+200B) and ZERO WIDTH NO-BREAK SPACE (U+FEFF) were never classified as space characters in Unicode, despite their names. ZERO WIDTH SPACE, when supported, can be used to indicate a line breaking opportunity within a string. Similarly, ZERO WIDTH NO-BREAK SPACE can be used between two characters to “glue” them together, so that no line break appears between them even if normal processing rules would allow that.

Widths of space characters

The characters U+2000…U+2006, when implemented in a font, usually have the specific width defined for them, though small deviations exist. Their widths are defined in terms of the em unit, i.e. the size of the font. The characters U+2007…U+200A and U+202F have no exact width assigned to them in the standard, and implementations may deviate considerably even from the suggested widths. Moreover, when concepts with the same names, such as “thin space”, are used in publishing software, the meanings can be rather different. For example, in InDesign, “thin space” is now 1/8 em (i.e. 0.125 em, as opposed to the suggested 0.2 em) and “hair space” only 1/24 em (i.e. about 0.042 em, whereas the width of a THIN SPACE glyph typically varies between 0.1 em and 0.2 em).

Notes on support in browsers and other software

Web browsers and other programs may fail to render all space characters according to their definitions or descriptions. Many commonly used fonts lack some of the space characters. The situation has improved over the years, but caution is still needed, especially when text data may need to be transferred from one program to another or may be viewed using different fonts.

Modern browsers can usually find a glyph for a character if some of the fonts in the system contain it. This does not always take place, however; see Guide to using special characters in HTML. Moreover, font substitution may cause undesired effects, since the widths of characters vary by font.

The use of various space characters of specific width, such as THIN SPACE, is often an unnecessary risk. Consider using other methods, such as the features of a text processing program or (on Web pages) CSS properties like padding, margin, word-spacing, and letter-spacing.
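To make the em-based figures from “Widths of space characters” concrete, the following minimal sketch converts them to absolute widths at a given font size. It is plain arithmetic on the nominal values quoted in this document; the dictionary names and the function `width_at` are made up for this illustration, and actual glyph widths in a font may differ.

```python
from fractions import Fraction

# Nominal widths in em, as quoted in this document.
NOMINAL_EM = {
    "EN QUAD / EN SPACE": Fraction(1, 2),
    "EM QUAD / EM SPACE": Fraction(1, 1),
    "THREE-PER-EM SPACE": Fraction(1, 3),
    "FOUR-PER-EM SPACE": Fraction(1, 4),
    "SIX-PER-EM SPACE": Fraction(1, 6),
    "THIN SPACE (suggested)": Fraction(1, 5),
    "MEDIUM MATHEMATICAL SPACE": Fraction(4, 18),
}

# Values the document reports for InDesign, for comparison.
INDESIGN_EM = {
    "thin space": Fraction(1, 8),
    "hair space": Fraction(1, 24),
}


def width_at(em_fraction, font_size):
    """Width of a space of em_fraction em, in the same unit as font_size."""
    return float(em_fraction * font_size)


if __name__ == "__main__":
    size_pt = 12  # example font size in points
    for name, em in NOMINAL_EM.items():
        print(f"{name}: {float(em):.3f} em = {width_at(em, size_pt):.2f} pt")
    for name, em in INDESIGN_EM.items():
        print(f"InDesign {name}: {float(em):.3f} em = {width_at(em, size_pt):.2f} pt")
```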
Width adjustments

In text processing, Web page display, and other contexts, space characters are often “adjustable” in the sense that they are presented in different widths, especially to satisfy justification requirements. You might see this in effect in this paragraph. Justification often just makes spaces wider, though it may shrink them, too, especially in typesetting.

No-break spaces are defined in Unicode as having the same width as spaces. This does not specify what should happen to them in justification. The common practice has been to treat them as having fixed width (in each font), which means that in adjusted text, spaces and no-break spaces have different effects. On web browsers, no-break spaces tended to be non-adjustable, but modern browsers generally stretch them on justification.

Within justified text on web pages, authors may have used no-break spaces instead of normal spaces to prevent stretching (e.g., writing “5 m” with a NO-BREAK SPACE instead of a normal SPACE). Due to changes in browser behavior, it is better to use fixed-width spaces instead. Among them, the four-per-em space (U+2005, e.g., as in “5 m”) usually best corresponds to the width of a normal unstretched space. However, the fixed-width spaces act as normal spaces in line breaking, so you may wish to use some technique to prevent undesired line breaks around them.

Alternatively, consider using NARROW NO-BREAK SPACE, which is generally treated as non-stretchable in web browsers. It might be adequate in contexts where strings belong together so that they should not be split on two lines and could well be rendered with decreased spacing between them, e.g. in expressions like “10 kg” and “C. S. Lewis”.

The change in the treatment of no-break spaces, though inconvenient, is consistent with changes in CSS specifications. For example, clause 7 Spacing of CSS Text Module Level 3 (Editor’s Draft 24 Jan. 2019) defines the no-break space, but not the fixed-width spaces, as a word-separator character, stretchable on justification.
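The suggestion to use NARROW NO-BREAK SPACE in expressions like “10 kg” could be applied to existing text mechanically. The following is a minimal sketch, not a complete solution: the unit list `UNITS` and the function `glue_units` are hypothetical, and the pattern only handles a digit followed by a SPACE or NO-BREAK SPACE and one of the listed unit symbols.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# Hypothetical, deliberately short list of unit symbols; extend as needed.
UNITS = ("kg", "km", "cm", "mm", "g", "m", "s")

# A digit, then SPACE or NO-BREAK SPACE, then a unit symbol ending at a word boundary.
_NUMBER_UNIT = re.compile(r"(\d)[ \u00a0](" + "|".join(UNITS) + r")\b")


def glue_units(text):
    """Replace the space between a number and a listed unit with U+202F."""
    return _NUMBER_UNIT.sub(lambda m: m.group(1) + NNBSP + m.group(2), text)


if __name__ == "__main__":
    sample = "The parcel weighs 10 kg and is 5 m long; delivery takes 5 min."
    # repr() makes the inserted U+202F visible as an escape sequence.
    print(repr(glue_units(sample)))
```

The word boundary in the pattern keeps “5 min” untouched, since “min” is not in the assumed unit list. Whether the narrower spacing is acceptable remains a typographic decision for each text.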
The Unicode standard describes the adjustment process and the intended role of specific-width space characters as follows:

“The fixed-width space characters (U+2000..U+200A) are derived from conventional (hot lead) typography. Algorithmic kerning and justification in computerized typography do not use these characters. However, where they are used (for example, in typesetting mathematical formulae), their width is generally font-specified, and they typically do not expand during justification. The exception is U+2009 THIN SPACE, which sometimes gets adjusted.”

The EM QUAD character is canonically equivalent to EM SPACE. The intended difference seems to be in the code chart note for the latter: “may scale by the condensation factor of a font”. There is no such note for EN SPACE to make it any different from EN QUAD. It is not clear what “condensation factor” means here.

Other notes

The MEDIUM MATHEMATICAL SPACE character was added in Unicode version 3.2.

Regarding the non-breaking property of no-break space and other characters, see Unicode line breaking rules: explanations and criticism.

Microsoft’s page Space Characters Design Standards says: “In digital fonts there are only two kinds of space characters supported by most computers, the space and the no-break space.” This is somewhat misleading, since the support depends on fonts rather than computers, except for no-break space support, which depends on programs.

Alan Wood’s excellent Unicode resources contain a page on the General Punctuation block, with widths of space characters illustrated graphically.

See also: Styling spaces in CSS.

Demonstration

This paragraph is here for demonstration purposes only, and it contains normal SPACE characters between words.

This paragraph is here for demonstration purposes only, and it contains SIX-PER-EM SPACE characters instead of normal SPACE characters between words.

Visible spaces

There are some graphic characters that can be used as symbols for a space. Though sometimes called visible spaces, they are not spaces at all but visible notations used to indicate the appearance of spaces in instruction manuals and descriptions of texts.
The following table lists some symbols, in decreasing order by practical usefulness. Their shapes vary by font; especially the last one varies a lot.

| Symbol | Code | Name |
|---|---|---|
| ␣ | U+2423 | OPEN BOX |
| ␢ | U+2422 | BLANK SYMBOL |
| ␠ | U+2420 | SYMBOL FOR SPACE |
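As an illustration of how such a symbol might be used when documenting text, the following minimal sketch replaces each normal SPACE with OPEN BOX for display. The function name `show_spaces` is made up for this example, and only U+0020 is handled; other space characters are left as they are.

```python
OPEN_BOX = "\u2423"  # U+2423 OPEN BOX, the first symbol in the table above


def show_spaces(text, marker=OPEN_BOX):
    """Return text with every U+0020 SPACE replaced by a visible marker."""
    return text.replace(" ", marker)


if __name__ == "__main__":
    print(show_spaces("foo bar  baz"))
```

The result is meant for display only, since the markers are graphic characters and do not behave as spaces in line breaking or searching.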


