Convert Unicode code point and character to each other (chr, ord)
Posted: 2021-09-21 / Tags: Python, String
In Python, the built-in functions chr() and ord() are used to convert between Unicode code points and characters.
Built-in Functions - chr() - Python 3.9.7 documentation
Built-in Functions - ord() - Python 3.9.7 documentation
A character can also be represented by writing a hexadecimal Unicode code point with \x, \u, or \U in a string literal.
Unicode HOWTO - Python's Unicode Support - Python 3.9.7 documentation
This article describes the following contents.
Convert character to Unicode code point: ord()
Convert Unicode code point to character: chr()
Use Unicode code points in strings: \x, \u, \U
Convert character to Unicode code point: ord()
If you pass a one-character string to ord(), the Unicode code point of that character is returned as an integer (int).
i = ord('A')
print(i)
# 65
print(type(i))
# <class 'int'>
source: chr_ord.py
An error is raised if you specify a string of two or more characters.
# ord('abc')
# TypeError: ord() expected a character, but string of length 3 found
source: chr_ord.py
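To get the code points of a longer string, one option is to call ord() on each character in turn. The following is a minimal sketch; the variable names are only illustrative.

```python
# ord() accepts exactly one character, so iterate over the string
# and convert each character separately.
text = 'abc'
code_points = [ord(c) for c in text]
print(code_points)
# [97, 98, 99]

# ord() raises TypeError for any string whose length is not 1,
# including the empty string.
try:
    ord('')
except TypeError as e:
    error_name = type(e).__name__
    print(error_name)
# TypeError
```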
Unicode code points are often written in hexadecimal notation. Use the built-in function hex() to convert an integer to a hexadecimal string.
s = hex(i)
print(s)
# 0x41
print(type(s))
# <class 'str'>
source: chr_ord.py
The built-in function format() can be used to specify more detailed formatting, such as zero-filling and the prefix 0x.
print(format(i, '04x'))
# 0041
print(format(i, '#06x'))
# 0x0041
source: chr_ord.py
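The same format specifications can also be used in f-strings and in str.format(). As a small sketch (the variable names are assumptions for illustration), an uppercase X in the specification produces uppercase hexadecimal digits.

```python
i = ord('A')

# The format specification after the colon is the same one
# that format() takes as its second argument.
s_plain = f'{i:04x}'
s_prefixed = f'{i:#06x}'
print(s_plain)
# 0041
print(s_prefixed)
# 0x0041

# 'X' instead of 'x' gives uppercase hexadecimal digits.
s_upper = '{:04X}'.format(ord('z'))
print(s_upper)
# 007A
```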
In summary, the hexadecimal Unicode code point for a particular character can be obtained as follows.
print(format(ord('X'), '#08x'))
# 0x000058
print(format(ord('💯'), '#08x'))
# 0x01f4af
source: chr_ord.py
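Code points are conventionally written as U+ followed by at least four uppercase hexadecimal digits. A helper along the following lines could produce that notation; the name to_u_notation is hypothetical and not part of the standard library.

```python
def to_u_notation(char):
    """Return the code point of a single character as a 'U+XXXX' string.

    Hypothetical helper for illustration only.
    """
    # Zero-fill to at least four uppercase hexadecimal digits.
    return f'U+{ord(char):04X}'


print(to_u_notation('X'))
# U+0058
print(to_u_notation('\U0001f4af'))
# U+1F4AF
```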
Flags and some other emoji are represented by a sequence of multiple Unicode code points.
https://unicode.org/Public/emoji/4.0/emoji-sequences.txt
Note that ord() does not support such emoji and raises an error (confirmed as of Python 3.7.3). If you check the number of characters in such an emoji with the built-in function len(), the number of Unicode code points is returned.
# ord('🇯🇵')
# TypeError: ord() expected a character, but string of length 2 found
print(len('🇯🇵'))
# 2
source: chr_ord.py
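To see which code points make up such an emoji, you can loop over the string and call ord() on each element. The sketch below builds the flag from \U escapes to avoid depending on the editor's encoding, and uses unicodedata.name() from the standard library to show the name of each code point.

```python
import unicodedata

# The flag of Japan, written as two regional indicator symbols.
flag = '\U0001F1EF\U0001F1F5'

for c in flag:
    print(f'U+{ord(c):04X}', unicodedata.name(c))
# U+1F1EF REGIONAL INDICATOR SYMBOL LETTER J
# U+1F1F5 REGIONAL INDICATOR SYMBOL LETTER P

# Collect the code points as a list of integers.
code_points = [ord(c) for c in flag]
```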
Convert Unicode code point to character: chr()
chr() returns a string (str) representing the character whose Unicode code point is the specified integer (int).
print(chr(65))
# A
print(type(chr(65)))
# <class 'str'>
source: chr_ord.py
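For a single code point, chr() and ord() are inverses of each other. A quick way to convince yourself of this is a round trip like the following sketch.

```python
# Converting a character to its code point and back
# returns the original character.
for c in ['A', 'あ', '\U0001f4af']:
    assert chr(ord(c)) == c

# The same round trip applied to every character of a string.
round_trip = [chr(ord(c)) for c in 'Python']
print(''.join(round_trip))
# Python
```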
In Python, an integer can be written in hexadecimal with the prefix 0x, so you can pass a hexadecimal literal as the argument of chr(). Leading zeros make no difference.
print(65 == 0x41)
# True
print(chr(0x41))
# A
print(chr(0x000041))
# A
source: chr_ord.py
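The argument of chr() must be in the range 0 to 0x10FFFF (1,114,111 in decimal). As a short sketch, a value outside that range raises ValueError.

```python
# The largest valid code point is 0x10FFFF.
last = chr(0x10FFFF)
print(len(last))
# 1

# Values outside range(0x110000) raise ValueError.
try:
    chr(0x110000)
except ValueError as e:
    error_name = type(e).__name__
    print(error_name)
# ValueError
```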
If you want to convert a hexadecimal string representing a Unicode code point to a character, convert the string to an integer and then pass it to chr().
Use int() to convert a hexadecimal string into an integer. Specify the radix 16 as the second argument.
s = '0x0041'
print(int(s, 16))
# 65
print(chr(int(s, 16)))
# A
source: chr_ord.py
The second argument can be 0 if the string is prefixed with 0x. See the following article for more details on the handling of hexadecimal numbers and strings.
Convert binary, octal, decimal, and hexadecimal in Python
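The difference between the radix 16 and the radix 0 can be sketched as follows. With 0, the prefix in the string decides the radix, so a string without a prefix is parsed like a decimal literal, and leading zeros are then rejected.

```python
# With base 0, the 0x prefix in the string selects hexadecimal.
n_auto = int('0x0041', 0)
print(n_auto)
# 65
print(chr(n_auto))
# A

# With base 16, the 0x prefix is optional.
n_hex = int('0041', 16)
print(n_hex)
# 65

# With base 0 and no prefix, the string is parsed like a decimal
# literal, which does not allow leading zeros.
try:
    int('0041', 0)
except ValueError as e:
    error_name = type(e).__name__
    print(error_name)
# ValueError
```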
Unicode code points are often written in the form U+XXXX. To convert such a string to the character of that code point, extract the hexadecimal part of the string with a slice.
How to slice a list, string, tuple in Python
s = 'U+0041'
print(s[2:])
# 0041
print(chr(int(s[2:], 16)))
# A
source: chr_ord.py
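If you need this conversion repeatedly, the slice and the int() call could be wrapped in a small function. The following is only a sketch; from_u_notation is a hypothetical name, and the check for the U+ prefix is an added assumption about what input should be rejected.

```python
def from_u_notation(s):
    """Convert a string such as 'U+0041' to the corresponding character.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    if not s.upper().startswith('U+'):
        raise ValueError(f'expected a string starting with U+: {s!r}')
    # Drop the 'U+' prefix and parse the rest as hexadecimal.
    return chr(int(s[2:], 16))


print(from_u_notation('U+0041'))
# A
print(from_u_notation('u+3042'))
# あ

# Several code points separated by spaces.
joined = ''.join(from_u_notation(p) for p in 'U+0041 U+0042'.split())
print(joined)
# AB
```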
Use Unicode code points in strings: \x, \u, \U
If you write \x, \u, or \U followed by a hexadecimal Unicode code point in a string literal, it is treated as that character.
The escape must have exactly 2, 4, or 8 hexadecimal digits, as in \xXX, \uXXXX, and \UXXXXXXXX, respectively. An error is raised if the number of digits is not correct.
print('\x41')
# A
print('\u0041')
# A
print('\U00000041')
# A
print('\U0001f4af')
# 💯
# print('\u041')
# SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-4: truncated \uXXXX escape
# print('\U0000041')
# SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-8: truncated \UXXXXXXXX escape
source: chr_ord.py
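Because the number of digits is fixed, the escape form that can represent a character depends on the size of its code point: \x covers up to U+00FF, \u up to U+FFFF, and \U everything else. The sketch below picks the shortest form for a given character; to_escape is a hypothetical helper, not a standard function.

```python
def to_escape(char):
    """Return the shortest \\x, \\u, or \\U escape for a single character.

    Hypothetical helper for illustration only.
    """
    cp = ord(char)
    if cp <= 0xFF:
        return f'\\x{cp:02x}'  # 2 hexadecimal digits
    if cp <= 0xFFFF:
        return f'\\u{cp:04x}'  # 4 hexadecimal digits
    return f'\\U{cp:08x}'      # 8 hexadecimal digits


print(to_escape('A'))
# \x41
print(to_escape('あ'))
# \u3042
print(to_escape('\U0001f4af'))
# \U0001f4af
```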
Each escape sequence is treated as one character. You can check this with the built-in function len(), which returns the number of characters.
print('\u0041\u0042\u0043')
# ABC
print(len('\u0041\u0042\u0043'))
# 3
source: chr_ord.py
Note that in raw strings, where escape sequences are disabled, the string is treated as is.
Raw strings in Python
print(r'\u0041\u0042\u0043')
# \u0041\u0042\u0043
print(len(r'\u0041\u0042\u0043'))
# 18
source: chr_ord.py
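If you receive such text with literal backslashes (for example, read from a file) and want the actual characters, one possible approach is to combine a regular expression with chr() and int() as described above. This is a minimal sketch that handles only the four-digit \uXXXX form; it is an assumption about the input, not a general-purpose decoder.

```python
import re

raw = r'\u0041\u0042\u0043'

# Replace each literal \uXXXX sequence with the character
# for that code point.
decoded = re.sub(
    r'\\u([0-9a-fA-F]{4})',
    lambda m: chr(int(m.group(1), 16)),
    raw,
)

print(decoded)
# ABC
print(len(decoded))
# 3
```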