How to convert a string to utf-8 in Python
I have a browser which sends utf-8 characters to my Python server, but when I retrieve them from the query string, the encoding that Python returns is ASCII. How can I convert the plain string to utf-8?
NOTE: The string passed from the web is already UTF-8 encoded, I just want to make Python treat it as UTF-8, not ASCII.
python python-2.7 unicode utf-8
edited May 3, 2018 at 22:12 by Batman
asked Nov 15, 2010 at 8:26 by BinChen
Try this link http://evanjones.ca/python-utf8.html
– Mudassir
Nov 15, 2010 at 8:33
I think a better title would be How to coerce a string to unicode without translation?
– boatcoder
Aug 11, 2016 at 22:05
In 2018, python 3 if you get ascii decode error do "some_string".encode('utf-8').decode('utf-8')
– devssh
Sep 26, 2018 at 8:40
13 Answers
In Python 2
>>> plain_string = "Hi!"
>>> unicode_string = u"Hi!"
>>> type(plain_string), type(unicode_string)
(<type 'str'>, <type 'unicode'>)
^ This is the difference between a byte string (plain_string) and a unicode string.
>>> s = "Hello!"
>>> u = unicode(s, "utf-8")
^ Converting to unicode and specifying the encoding.
In Python 3
All strings are unicode. The unicode function does not exist anymore. See the comment from @Noumenon.
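For readers on Python 3, a rough counterpart to the Python 2 session above might look like the following sketch. It assumes the incoming data is a bytes object holding valid UTF-8; the variable names are illustrative and not part of the answer.

```python
# Python 3: bytes hold raw data, str holds Unicode text.
raw_bytes = b"Hi! \xc3\xa9"  # UTF-8 bytes; \xc3\xa9 encodes the letter e with an acute accent

# Two equivalent ways to decode bytes into a str.
text_a = raw_bytes.decode("utf-8")
text_b = str(raw_bytes, "utf-8")

print(type(raw_bytes).__name__, type(text_a).__name__)
print(text_a == text_b)
```

Calling str() on bytes without an encoding argument does not decode; it returns the repr-style text "b'...'", so the encoding should always be passed.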
edited Aug 31, 2020 at 1:00 by Maxime
answered Nov 15, 2010 at 8:31 by user225312
I am getting the following error: UnicodeDecodeError: 'utf8' codec can't decode byte 0xb0 in position 2: invalid start byte. This is my code:
ret = []
for line in csvReader:
    cline = []
    for elm in line:
        unicodestr = unicode(elm, 'utf-8')
        cline.append(unicodestr)
    ret.append(cline)
– GopakumarNG
Oct 22, 2013 at 6:56
None of this applies in Python 3, all strings are unicode and unicode() doesn't exist.
– Noumenon
Aug 28, 2015 at 12:00
How do you convert u back to a str format (convert u back to s)?
– Tanguy
Aug 25, 2017 at 13:25
This code will only work as long as the text does not contain non-ascii characters; a simple accented character in the string will make it fail.
– Haroldo_OK
Feb 16, 2018 at 10:31
Hi, if you have "2340" in a string variable, and you want to print the unicode character U+2340 (⍀), is there any way to do that?
– Sha2b
Nov 5, 2019 at 3:36
If the methods above don't work, you can also tell Python to ignore the portions of a byte string that it can't decode as utf-8:
stringnamehere.decode('utf-8', 'ignore')
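In Python 3, str has no decode method, so the same idea applies to bytes objects. The sketch below uses a made-up byte string containing one invalid byte (0xb0) to compare the standard error handlers. Note that 'ignore' silently drops data, so 'replace' may be preferable when you need to see that something was lost.

```python
# Made-up input: valid UTF-8 for "café " followed by a stray 0xb0 byte.
raw = b"caf\xc3\xa9 \xb0"

ignored = raw.decode("utf-8", errors="ignore")    # the invalid byte is dropped
replaced = raw.decode("utf-8", errors="replace")  # the invalid byte becomes U+FFFD

try:
    raw.decode("utf-8")  # the default is errors="strict"
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(ignored), repr(replaced), strict_failed)
```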
answered Oct 7, 2013 at 17:00 by duhaime
Got AttributeError: 'str' object has no attribute 'decode'
– saran3h
Aug 6, 2018 at 14:06
@saran3h it sounds like you're using Python 3, in which case Python should handle encoding issues for you. Have you tried reading your document without specifying an encoding?
– duhaime
Aug 6, 2018 at 14:56
Python by default picks the system encoding. In Windows 10 it's cp1252, which is different from utf-8. I wasted a few hours on it while using codecs.open() in py3.8
– VisheshMangla
Jul 1, 2020 at 15:15
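The comment above about the platform default encoding suggests naming the encoding explicitly whenever a text file is opened. A minimal Python 3 sketch, using a temporary directory and a hypothetical file name so it can run anywhere:

```python
import os
import tempfile

text = "Ribeir\u00e3o Preto"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name

    # Name the encoding instead of relying on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

    # Reading in binary mode shows the UTF-8 bytes actually stored on disk.
    with open(path, "rb") as f:
        raw = f.read()

print(round_tripped == text, raw)
```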
Might be a bit overkill, but when I work with ascii and unicode in the same files, repeating decode can be a pain; this is what I use:
def make_unicode(inp):
    if type(inp) != unicode:
        inp = inp.decode('utf-8')
    return inp
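The helper above relies on the Python 2 unicode type. A Python 3 analogue might look like the sketch below; the name to_text and its encoding parameter are hypothetical, chosen only for illustration.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes-like."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return value


print(to_text(b"caf\xc3\xa9"))  # decoded from bytes
print(to_text("caf\u00e9"))     # already text, returned unchanged
```

Using isinstance rather than comparing type() also accepts subclasses, which is usually what such a helper wants.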
edited May 26, 2021 at 0:17 by ThavasAntonio
answered Nov 29, 2014 at 19:13 by Blueswannabe
This no longer works as written... the unicode type doesn't exist in Python 3
– MikePennington
Dec 26, 2021 at 15:14
Adding the following line to the top of your .py file:
# -*- coding: utf-8 -*-
allows you to encode strings directly in your script, like this:
utfstr = "ボールト"
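In Python 3, source files are read as UTF-8 by default and such a literal is already a Unicode str, so producing UTF-8 bytes is a separate, explicit step. A small sketch reusing the literal from the answer, written with escapes so the code point count is unambiguous:

```python
utfstr = "\u30dc\u30fc\u30eb\u30c8"  # the four katakana characters from the answer
utf8_bytes = utfstr.encode("utf-8")  # each of these characters takes 3 bytes in UTF-8

print(len(utfstr), len(utf8_bytes))
print(utf8_bytes.decode("utf-8") == utfstr)
```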
edited Apr 25, 2015 at 5:17 by famousgarkin
answered May 22, 2014 at 15:15 by Ken
It is not what OP asks. But avoid such string literals anyway. It creates a Unicode string in Python 3 (good) but it is a bytestring in Python 2 (bad). Either add from __future__ import unicode_literals at the top or use the u'' prefix. Don't use non-ascii characters in bytes literals. To get utf-8 bytes, you could use utf8bytes = unicode_text.encode('utf-8') later if it is necessary.
– jfs
Apr 26, 2015 at 1:26
@jfs how will from __future__ import unicode_literals help me to convert a string with non-ascii characters to utf-8?
– OrtalTurgeman
Nov 29, 2018 at 17:30
@OrtalTurgeman I'm not answering the question. Look, it is a comment, not an answer. My comment addresses the issue with the code in the answer. It tries to create a bytestring with non-ascii characters on Python 2 (it is a SyntaxError on Python 3, where bytes literals forbid that).
– jfs
Nov 29, 2018 at 17:34
If I understand you correctly, you have a utf-8 encoded byte-string in your code.
Converting a byte-string to a unicode string is known as decoding (unicode -> byte-string is encoding).
You do that by using the unicode function or the decode method. Either:
unicodestr = unicode(bytestr, encoding)
unicodestr = unicode(bytestr, "utf-8")
Or:
unicodestr = bytestr.decode(encoding)
unicodestr = bytestr.decode("utf-8")
answered Nov 15, 2010 at 8:55 by codeape
city = 'Ribeir\xc3\xa3o Preto'
print city.decode('cp1252').encode('utf-8')
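The snippet above is Python 2. As a hedged illustration of what the two codecs do with those bytes, the Python 3 sketch below decodes the same byte sequence both ways and then undoes the wrong decoding. The variable names are invented, and whether cp1252 is the right codec depends on where the data actually came from.

```python
raw = b"Ribeir\xc3\xa3o Preto"

as_utf8 = raw.decode("utf-8")      # the two bytes C3 A3 become one character
as_cp1252 = raw.decode("cp1252")   # each byte becomes its own character (mojibake)

# Text that was wrongly decoded as cp1252 can be repaired by reversing that
# step and decoding the recovered bytes as UTF-8.
repaired = as_cp1252.encode("cp1252").decode("utf-8")

print(as_utf8)
print(as_cp1252)
print(repaired == as_utf8)
```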
answered Jul 26, 2017 at 20:31 by Willem
In Python 3.6, there is no built-in unicode() function.
Strings are already stored as unicode by default and no conversion is required. Example:
my_str = "\u221a25"
print(my_str)
>>> √25
edited Feb 20, 2019 at 10:23 by PradeepR
answered Apr 20, 2017 at 15:53 by ZldProductions
Translate with ord() and unichr().
Every unicode char has a number associated with it, something like an index, so Python has a few functions to translate between a char and its number. Below is an example with ñ. Hope it helps.
>>> C = 'ñ'
>>> U = C.decode('utf8')
>>> U
u'\xf1'
>>> ord(U)
241
>>> unichr(241)
u'\xf1'
>>> print unichr(241).encode('utf8')
ñ
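In Python 3, unichr is gone and chr covers every code point. The sketch below mirrors the session above and also shows one way to go from a hexadecimal code point held in a string, such as "2340" (the case raised in a comment on another answer), to the character itself. Variable names are illustrative.

```python
c = "\u00f1"                          # the letter n with tilde
code_point = ord(c)                   # character -> integer code point
same_char = chr(code_point)           # integer code point -> character
utf8_bytes = same_char.encode("utf-8")

hex_text = "2340"
symbol = chr(int(hex_text, 16))       # parse the hex digits, then look up U+2340

print(code_point, same_char, utf8_bytes, symbol)
```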
answered Nov 9, 2017 at 17:24 by Joe9008
First, str in Python is represented in Unicode.
Second, UTF-8 is an encoding standard for encoding a Unicode string to bytes. There are many encoding standards out there (e.g. UTF-16, ASCII, SHIFT-JIS, etc.).
When the client sends data to your server and they are using UTF-8, they are sending a bunch of bytes, not a str.
You received a str because the "library" or "framework" that you are using has implicitly converted some random bytes to str.
Under the hood, there is just a bunch of bytes. You just need to ask the "library" to give you the request content in bytes and you will handle the decoding yourself (if the library can't give you that, then it is trying to do black magic and you shouldn't use it).
Decode UTF-8 encoded bytes to str: bs.decode('utf-8')
Encode str to UTF-8 bytes: s.encode('utf-8')
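To make the bytes/str distinction concrete, the sketch below encodes one character with several codecs and then decodes bytes of the kind a client might send. The sample values are invented for illustration.

```python
s = "\u00f1"  # one character: n with tilde

# The same text becomes different bytes under different encodings.
print(s.encode("utf-8"))
print(s.encode("utf-16-le"))
print(s.encode("latin-1"))

# ASCII has no representation for this character at all.
try:
    s.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Bytes received from a client must be decoded with the encoding the client used.
body = b"name=Jos\xc3\xa9"
decoded = body.decode("utf-8")

print(ascii_ok, decoded)
```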
edited Oct 6, 2020 at 9:23
answered Aug 7, 2020 at 0:11 by shioko
You can also do this:
from unidecode import unidecode
unidecode(yourStringtoDecode)
answered Jul 19, 2021 at 16:25 by Kevin
What is unidecode? Is it this pypi.org/project/Unidecode? Please provide info if it's a 3rd-party package, and how to install/use it.
– GinoMempin
Jul 19, 2021 at 23:27
You can use Python's standard library codecs module.
import codecs
codecs.decode(b'Decode me', 'utf-8')
answered Sep 20, 2021 at 22:26 by haccks
The URL is translated to ASCII, and to the Python server it is just a Unicode string, e.g.:
"T%C3%A9st%C3%A3o"
Python sees "é" and "ã" as the literal percent-escapes %C3%A9 and %C3%A3.
You can decode a URL just like this:
import urllib.parse
url = "T%C3%A9st%C3%A3o"
print(urllib.parse.unquote(url))
>> Téstão
See https://www.adamsmith.haus/python/answers/how-to-decode-a-utf-8-url-in-python for details.
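Since the original question is about a query string, the standard library's urllib.parse.parse_qs may be closer to what is needed than unquoting by hand. A minimal Python 3 sketch, with an invented sample query:

```python
from urllib.parse import parse_qs, unquote, unquote_to_bytes

query = "name=T%C3%A9st%C3%A3o&city=Ribeir%C3%A3o+Preto"  # invented sample

# parse_qs splits the pairs, turns "+" into a space, and decodes the
# percent-escapes using the given encoding (UTF-8 is also the default).
params = parse_qs(query, encoding="utf-8")

# The percent-escapes are just bytes; unquote_to_bytes exposes them undecoded.
raw = unquote_to_bytes("T%C3%A9st%C3%A3o")
text = unquote("T%C3%A9st%C3%A3o", encoding="utf-8")

print(params)
print(raw, text)
```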
answered Sep 1 at 10:20 by GeorgeFonseca
Yes, you can add
# -*- coding: utf-8 -*-
as the first line of your source code.
You can read more details here: https://www.python.org/dev/peps/pep-0263/
edited Apr 26, 2020 at 11:44 by DavidBuck
answered Apr 26, 2020 at 11:05 by David-Star