python encoding utf-8 - unicode - Stack Overflow


Asked 9 years, 7 months ago. Modified 5 years, 3 months ago. Viewed 396k times.

Question (score: 52)

I am writing some scripts in Python. I create a string that I save in a file. This string holds a lot of data, coming from the tree structure (arborescence) and file names of a directory. According to convmv, my whole tree is in UTF-8.

I want to keep everything in UTF-8 because I will save it in MySQL afterwards. For now, in MySQL, which is in UTF-8, I have problems with some characters (like é or è - I am French).

I want Python to always treat strings as UTF-8. I read some information on the internet and did it like this.

My script begins with this:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    def createIndex():
        import codecs
        toUtf8=codecs.getencoder('UTF8')
        # lot of operations & building indexSTR, the string that matters
        findex=open('config/index/music_vibration_'+date+'.index','a')
        findex.write(codecs.BOM_UTF8)
        findex.write(toUtf8(indexSTR)) # this bugs!

And when I execute it, here is the answer:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2171: ordinal not in range(128)

Edit: I see that, in my file, the accents are written correctly. After creating this file, I read it and write it into MySQL. But I don't understand why I still get encoding problems. My MySQL database is in utf8, or seems to be: the SQL query SHOW variables LIKE 'char%' returns only utf8 or binary.
My function looks like this:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    def saveIndex(index,date):
        import MySQLdb as mdb
        import codecs

        sql=mdb.connect('localhost','admin','*******','music_vibration')
        sql.charset="utf8"
        findex=open('config/index/'+index,'r')
        lines=findex.readlines()
        for line in lines:
            if line.find('#artiste')!=-1:
                artiste=line.split('[:::]')
                artiste=artiste[1].replace('\n','')

                c=sql.cursor()
                c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom="'+artiste+'"')
                nbr=c.fetchone()
                if nbr[0]==0:
                    c=sql.cursor()
                    iArt+=1
                    c.execute('INSERT INTO artistes(nom,status,path) VALUES("'+artiste+'",99,"'+artiste+'/")'.encode('utf8')

Artists that are displayed correctly in the file are written incorrectly into the database (BDD). What is the problem?

Tags: python, unicode, encoding, utf-8

edited Dec 3, 2015 at 19:37 by iraj jelodari; asked Feb 26, 2013 at 15:06 by vekah

Comments:

- Your python sample code is invalid; there are syntax errors in at least 2 places. Can you fix those first, please? - Martijn Pieters, Feb 26, 2013 at 15:09
- Are you saving the file as utf-8 and not an ascii file? - QuentinUK, Feb 26, 2013 at 15:10

2 Answers

Answer (score: 64)

You don't need to encode data that is already encoded. When you try to do that, Python will first try to decode it to unicode before it can encode it back to UTF-8. That is what is failing here:

    >>> data = u'\u00c3'  # Unicode data
    >>> data = data.encode('utf8')  # encoded to UTF-8
    >>> data
    '\xc3\x83'
    >>> data.encode('utf8')  # Try to *re*-encode it
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

Just write your data directly to the file; there is no need to encode already-encoded data.

If you build up unicode values instead, you would indeed have to encode those to be writable to a file. You'd want to use codecs.open() in that case, which returns a file object that will encode unicode values to UTF-8 for you.
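The re-encoding failure and the "let the file object encode" advice above can be sketched in a form that runs on both Python 2 and Python 3. This is only an illustration under a few assumptions: `io.open` is swapped in for `codecs.open()` because it behaves the same on both versions, the sample text and the temporary `demo.index` path are hypothetical, and the explicit ASCII decode stands in for the implicit step that Python 2 performs when `.encode()` is called on a byte string.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

text = u"\u00e9t\u00e9"          # unicode text: "été" (hypothetical sample)
encoded = text.encode("utf-8")   # UTF-8 bytes, encoded exactly once

# Python 2 implicitly decodes a byte string as ASCII before re-encoding it.
# Doing that decode step explicitly reproduces the error from the question.
try:
    encoded.decode("ascii")
    implicit_decode_failed = False
except UnicodeDecodeError:
    implicit_decode_failed = True

# For unicode text, let the file object do the encoding.
# The path is a throwaway temp file, not the asker's real index file.
path = os.path.join(tempfile.mkdtemp(), "demo.index")
with io.open(path, "a", encoding="utf-8") as findex:
    findex.write(text)

# Read the raw bytes back to check what actually landed on disk.
with io.open(path, "rb") as raw:
    written = raw.read()

print(repr(written))
```

The point of the design is that encoding happens in one place only: either the data is already UTF-8 bytes and is written as-is, or it is unicode text and the file object encodes it on the way out.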
You also really don't want to write out the UTF-8 BOM, unless you have to support Microsoft tools that cannot read UTF-8 otherwise (such as MS Notepad).

For your MySQL insert problem, you need to do two things:

- Add charset='utf8' to your MySQLdb.connect() call.
- Use unicode objects, not str objects, when querying or inserting, but use SQL parameters so the MySQL connector can do the right thing for you:

    artiste = artiste.decode('utf8')  # it is already UTF8, decode to unicode

    c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))

    # ...

    c.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))

It may actually work better if you used codecs.open() to decode the contents automatically instead:

    import codecs
    import MySQLdb as mdb

    sql = mdb.connect('localhost', 'admin', '*******', 'music_vibration', charset='utf8')

    # codecs.open() decodes each line from UTF-8 to unicode while reading
    with codecs.open('config/index/' + index, 'r', 'utf8') as findex:
        for line in findex:
            if u'#artiste' not in line:
                continue

            artiste = line.split(u'[:::]')[1].strip()

            cursor = sql.cursor()
            cursor.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom=%s', (artiste,))
            if not cursor.fetchone()[0]:
                cursor = sql.cursor()
                cursor.execute('INSERT INTO artistes(nom,status,path) VALUES(%s, 99, %s)', (artiste, artiste + u'/'))
                artists_inserted += 1

You may want to brush up on Unicode, UTF-8 and encodings. I can recommend the following articles:

- The Python Unicode HOWTO
- Pragmatic Unicode by Ned Batchelder
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky

edited Feb 26, 2013 at 16:21; answered Feb 26, 2013 at 15:10 by Martijn Pieters

Comments:

- @vekah: Did you follow the instructions in "Writing UTF-8 String to MySQL with Python"? - Martijn Pieters, Feb 26, 2013 at 16:05

Answer (score: 3)

Unfortunately, the string.encode() method is not always reliable. Check out this thread for more information: "What is the fool proof way to convert some string (utf-8 or else) to a simple ASCII string in python"

edited May 23, 2017 at 12:18 by Community Bot; answered Feb 3, 2014 at 16:50
by EvHaus
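The parameterized-query advice from the first answer can be tried without a MySQL server. The sketch below is an illustration, not the answerer's code: it assumes an in-memory SQLite database as a stand-in for MySQL, so the placeholders are `?` rather than MySQLdb's `%s`, and the `insert_if_missing` helper and the sample index lines are hypothetical. Only the `artistes` table columns and the `[:::]` line format are taken from the thread.

```python
# -*- coding: utf-8 -*-
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
# sqlite3 uses "?" placeholders where MySQLdb uses "%s".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE artistes "
    "(id INTEGER PRIMARY KEY, nom TEXT, status INTEGER, path TEXT)"
)


def insert_if_missing(conn, artiste):
    """Insert the artist unless the name already exists (hypothetical helper)."""
    cursor = conn.cursor()
    # Values travel as parameters, so the driver handles quoting and encoding.
    cursor.execute("SELECT COUNT(id) AS nbr FROM artistes WHERE nom = ?", (artiste,))
    if cursor.fetchone()[0]:
        return False
    cursor.execute(
        "INSERT INTO artistes (nom, status, path) VALUES (?, 99, ?)",
        (artiste, artiste + u"/"),
    )
    conn.commit()
    return True


# Hypothetical index lines, already decoded to unicode text.
lines = [
    u"#artiste[:::]B\u00e9n\u00e9dicte\n",
    u"#album[:::]Some album\n",
    u"#artiste[:::]O'Brien\n",
    u"#artiste[:::]B\u00e9n\u00e9dicte\n",  # duplicate, should be skipped
]

inserted = [
    insert_if_missing(conn, line.split(u"[:::]")[1].strip())
    for line in lines
    if u"#artiste" in line
]
names = [row[0] for row in conn.execute("SELECT nom FROM artistes ORDER BY id")]
```

Because the names are passed as parameters rather than concatenated into the SQL string, accented characters and quote characters both reach the database unchanged, which is the behaviour the answer relies on from the MySQL connector.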