python encoding utf-8
I am writing some scripts in Python. I create a string that I save in a file. This string holds a lot of data, coming from the directory tree and filenames of a directory.
According to convmv, my whole directory tree is in UTF-8.
I want to keep everything in UTF-8 because I will save it in MySQL afterwards.
For now, in MySQL, which is in UTF-8, I have a problem with some characters (like é or è; I am French).
I want Python to always treat strings as UTF-8. I read some information on the internet and did it like this.
My script begins with this:
    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    def createIndex():
        import codecs
        toUtf8 = codecs.getencoder('UTF8')
        # lot of operations & building indexSTR the string who matter
        findex = open('config/index/music_vibration_' + date + '.index', 'a')
        findex.write(codecs.BOM_UTF8)
        findex.write(toUtf8(indexSTR))  # this bugs!
And when I execute it, here is the answer: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2171: ordinal not in range(128)
Edit:
I see that, in my file, the accents are written correctly. After creating this file, I read it and write it into MySQL.
But I still have a problem with encoding, and I don't understand why.
My MySQL database is in utf8, or seems to be: the SQL query SHOW variables LIKE 'char%' returns only utf8 or binary.
My function looks like this:
    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    def saveIndex(index, date):
        import MySQLdb as mdb
        import codecs

        sql = mdb.connect('localhost', 'admin', '*******', 'music_vibration')
        sql.charset = "utf8"
        findex = open('config/index/' + index, 'r')
        lines = findex.readlines()
        for line in lines:
            if line.find('#artiste') != -1:
                artiste = line.split('[:::]')
                artiste = artiste[1].replace('\n', '')

                c = sql.cursor()
                c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom="' + artiste + '"')
                nbr = c.fetchone()
                if nbr[0] == 0:
                    c = sql.cursor()
                    iArt += 1
                    c.execute('INSERT INTO artistes(nom,status,path) VALUES("' + artiste + '",99,"' + artiste + '/")'.encode('utf8')
And artists that are displayed correctly in the file are written badly into the database.
What is the problem?
asked Feb 26, 2013 at 15:06 by vekah; edited Dec 3, 2015 at 19:37 by irajjelodari
Your Python sample code is invalid; there are syntax errors in at least 2 places. Can you fix those first, please? – Martijn Pieters, Feb 26, 2013 at 15:09
Are you saving the file as UTF-8 and not as an ASCII file? – QuentinUK, Feb 26, 2013 at 15:10
2 Answers
You don't need to encode data that is already encoded. When you try to do that, Python 2 will first try to implicitly decode it to unicode, using the default ASCII codec, before it can encode it back to UTF-8. That implicit decode is what is failing here:
    >>> data = u'\u00c3'  # Unicode data
    >>> data = data.encode('utf8')  # encoded to UTF-8
    >>> data
    '\xc3\x83'
    >>> data.encode('utf8')  # Try to *re*-encode it
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
Just write your data directly to the file; there is no need to encode already-encoded data.
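To make the hidden step visible, here is a small sketch that spells out what Python 2 does behind the scenes when `.encode()` is called on a byte string: it first decodes with the default ASCII codec. The helper name `reencode_like_python2` is hypothetical and exists only for illustration. The snippet is written so that it also runs on Python 3, where byte strings have no `.encode()` method at all.

```python
# UTF-8 bytes for U+00C3, the same value used in the session above.
encoded = u'\u00c3'.encode('utf8')  # two bytes: 0xC3 0x83


def reencode_like_python2(byte_string):
    """Hypothetical helper: mimic Python 2's implicit ASCII decode before encoding."""
    return byte_string.decode('ascii').encode('utf8')


try:
    reencode_like_python2(encoded)
    error = None
except UnicodeDecodeError as exc:
    # 0xC3 is outside the ASCII range, so the hidden decode step fails.
    error = exc

# Decoding with the codec the bytes were actually written in works fine.
decoded = encoded.decode('utf8')
```

The failure comes from the decode half of the round trip, which is why the error message mentions the 'ascii' codec even though the code only asked for UTF-8.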
If you build up unicode values instead, you would indeed have to encode those to be writable to a file. You'd want to use codecs.open(), which returns a file object that will encode unicode values to UTF-8 for you.
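A minimal sketch of that approach, assuming the index is built as a unicode value. The sample line and the temporary file path are made up for illustration and are not taken from the question's real data.

```python
import codecs
import os
import tempfile

# Hypothetical sample content; the real index string comes from the directory scan.
index_str = u'#artiste[:::]Beyonc\u00e9\n'

# Temporary location so the sketch does not touch a real config directory.
path = os.path.join(tempfile.mkdtemp(), 'example.index')

# codecs.open() returns a file object that encodes unicode values on write.
with codecs.open(path, 'a', 'utf8') as findex:
    findex.write(index_str)

# Read the raw bytes back to confirm what ended up on disk.
with open(path, 'rb') as raw:
    raw_bytes = raw.read()
```

With this, the script never calls an encoder by hand, so there is no chance of encoding the same data twice.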
You also really don't want to write out the UTF-8 BOM, unless you have to support Microsoft tools that cannot read UTF-8 otherwise (such as MS Notepad).
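As a rough illustration of why the BOM tends to get in the way: it is just three extra bytes at the start of the data, and a plain UTF-8 decode keeps it as a leading U+FEFF character. The sample payload here is made up. If a file with a BOM has to be read anyway, the standard `utf-8-sig` codec drops it.

```python
import codecs

# Hypothetical payload: the word "été" encoded as UTF-8.
payload = u'\u00e9t\u00e9'.encode('utf8')
with_bom = codecs.BOM_UTF8 + payload

# A plain UTF-8 decode keeps the BOM as an invisible first character.
plain = with_bom.decode('utf8')

# The 'utf-8-sig' codec strips a leading BOM if one is present.
stripped = with_bom.decode('utf-8-sig')
```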
For your MySQL insert problem, you need to do two things:
- Add charset='utf8' to your MySQLdb.connect() call.
- Use unicode objects, not str objects, when querying or inserting, but use SQL parameters so the MySQL connector can do the right thing for you:
    artiste = artiste.decode('utf8')  # it is already UTF8, decode to unicode

    c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))

    # ...

    c.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))
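The same parameter-binding pattern can be tried without a MySQL server. The sketch below swaps in the standard-library `sqlite3` module purely as a stand-in, so it is not the MySQLdb API: `sqlite3` uses `?` placeholders where MySQLdb uses `%s`, and the table layout is an assumption inferred from the column names in the queries above. The artist name is made up and deliberately contains both an accent and double quotes.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server (assumption for illustration).
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE artistes ('
    'id INTEGER PRIMARY KEY, nom TEXT, status INTEGER, path TEXT)'
)

# Bytes as they would be read from the UTF-8 index file, decoded to text first.
artiste = b'C\xc3\xa9line "Live"'.decode('utf8')

cur = conn.cursor()
# Values are passed separately from the SQL text, so no manual quoting is needed.
cur.execute('SELECT COUNT(id) FROM artistes WHERE nom = ?', (artiste,))
if cur.fetchone()[0] == 0:
    cur.execute(
        'INSERT INTO artistes (nom, status, path) VALUES (?, 99, ?)',
        (artiste, artiste + u'/'),
    )
conn.commit()

stored = conn.execute('SELECT nom, status, path FROM artistes').fetchall()
```

Building the query by string concatenation, as in the question, would break on the double quotes in this name; passing the value as a parameter leaves quoting and encoding to the driver.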
It may actually work better if you used codecs.open() to decode the contents automatically instead:
    import codecs

    sql = mdb.connect('localhost', 'admin', '*******', 'music_vibration', charset='utf8')

    with codecs.open('config/index/' + index, 'r', 'utf8') as findex:
        for line in findex:
            if u'#artiste' not in line:
                continue

            artiste = line.split(u'[:::]')[1].strip()

            cursor = sql.cursor()
            cursor.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))
            if not cursor.fetchone()[0]:
                cursor = sql.cursor()
                cursor.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))
                artists_inserted += 1
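The file-reading half of that loop can be checked on its own, without a database. This is only a sketch: the function name `read_artists`, the sample lines, and the assumption that each artist line looks like `#artiste[:::]Name` are illustrative guesses based on the parsing code above.

```python
import codecs
import os
import tempfile


def read_artists(path):
    """Hypothetical helper: return artist names from a UTF-8 index file as text."""
    artists = []
    with codecs.open(path, 'r', 'utf8') as findex:
        for line in findex:  # each line is already decoded to unicode
            if u'#artiste' not in line:
                continue
            artists.append(line.split(u'[:::]')[1].strip())
    return artists


# Write a small sample file as raw UTF-8 bytes (made-up content).
sample = (
    u'#artiste[:::]\u00c9dith Piaf\n'
    u'#album[:::]Ignored\n'
    u'#artiste[:::]Noir D\u00e9sir\n'
)
path = os.path.join(tempfile.mkdtemp(), 'sample.index')
with open(path, 'wb') as raw:
    raw.write(sample.encode('utf8'))

artists = read_artists(path)
```

Because decoding happens once, at the file boundary, every value handed to the database cursor afterwards is already unicode.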
You may want to brush up on Unicode and UTF-8 and encodings. I can recommend the following articles:
- The Python Unicode HOWTO
- Pragmatic Unicode by Ned Batchelder
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
answered Feb 26, 2013 at 15:10 by Martijn Pieters; edited Feb 26, 2013 at 16:21
@vekah: Did you follow the instructions in Writing UTF-8 String to MySQL with Python? – Martijn Pieters, Feb 26, 2013 at 16:05
Unfortunately, the string.encode() method is not always reliable. Check out this thread for more information: What is the foolproof way to convert some string (utf-8 or else) to a simple ASCII string in python
answered Feb 3, 2014 at 16:50 by EvHaus; edited May 23, 2017 at 12:18 by Community Bot