Python codecs encoding not working
Asked 6 years ago. Modified 6 years ago. Viewed 8k times.
I have this code:

    import collections
    import csv
    import sys
    import codecs
    from xml.dom.minidom import parse
    import xml.dom.minidom

    String = collections.namedtuple("String", ["tag", "text"])

    def read_translations(filename):  # Reads a csv file with rows made up of 2 columns: the string tag, and the translated tag
        with codecs.open(filename, "r", encoding='utf-8') as csvfile:
            csv_reader = csv.reader(csvfile, delimiter=",")
            result = [String(tag=row[0], text=row[1]) for row in csv_reader]
        return result
The CSV file I'm reading contains Brazilian Portuguese characters. When I try to run this, I get an error:

    'utf8' codec can't decode byte 0x88 in position 21: invalid start byte

I'm using Python 2.7. As you can see, I'm encoding with codecs, but it doesn't work.
Any ideas?

python python-2.7
asked Sep 22, 2016 at 21:50 by Nacho321
Perhaps your file is not saved as UTF-8?
– zvone, Sep 22, 2016 at 21:52

Try and change encoding='utf-8' to encoding='cp1252'. We can't tell much without seeing the data.
– wim, Sep 22, 2016 at 21:53

What those guys said. Windows doesn't use UTF-8 unless you force it to; any random file that you open will most likely be encoded with the current Windows code page. You can use encoding='mbcs' to get that without knowing specifically what it is.
– Mark Ransom, Sep 22, 2016 at 21:59

Forgot to add that I'm using a Mac on this. I've opened the files using Sublime and saved with encoding UTF-8. I tried cp1252 but it returns this error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 31: character maps to
– Nacho321, Sep 22, 2016 at 22:00

You need to find out which encoding was used to generate the file. If you open the file in an editor do you see the proper characters?
– Mark Ransom, Sep 22, 2016 at 22:19
1 Answer
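The idea of this line:

    with codecs.open(filename, "r", encoding='utf-8') as csvfile:

is to say "This file was saved as utf-8. Please make appropriate conversions when reading from it."
That works fine if the file was actually saved as utf-8. If some other encoding was used, decoding either fails with an error, as it does here, or silently produces the wrong characters.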
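To see why the mismatch produces an "invalid start byte" error, here is a minimal sketch written for Python 3. The sample word and the choice of Mac OS Roman are assumptions made for illustration, not facts about the asker's file.

```python
# Hypothetical sample: the word "acao" with cedilla and tilde,
# encoded as Mac OS Roman rather than UTF-8.
raw = u"a\u00e7\u00e3o".encode("mac_roman")

try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # The byte for the cedilla cannot begin a UTF-8 sequence,
    # so the utf-8 codec gives up at that position.
    error_message = str(exc)

print(error_message)
```

The error has the same shape as the one in the question: the codec names the first byte it could not interpret and where that byte sits in the input.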
What then?
Determine which encoding was used. Assuming the information cannot be obtained from the software which created the file - guess.
Open the file normally and print each line:

    with open(filename, 'rt') as f:
        for line in f:
            print repr(line)

Then look for a character which is not ASCII, e.g. ñ - this letter will be printed as some code, e.g.:

    'espa\xc3\xb1ol'

Above, ñ is represented as \xc3\xb1, because that is the utf-8 sequence for it.
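Scanning long output by eye is error-prone, so the non-ASCII bytes can also be collected programmatically. This is a small sketch for Python 3; the helper name `non_ascii_bytes` is made up for this example.

```python
def non_ascii_bytes(data):
    """Return the sorted, distinct byte values >= 0x80 found in a bytes object."""
    return sorted(set(b for b in bytearray(data) if b >= 0x80))

# Same bytes as the 'espa\xc3\xb1ol' example.
sample = b"espa\xc3\xb1ol"
found = non_ascii_bytes(sample)
print([hex(b) for b in found])
```

For a real file, the input could come from reading it in binary mode, e.g. `open(filename, 'rb').read()`, so that no decoding is attempted before the bytes are inspected.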
Now, you can check what various encodings would give and see which is right:

    >>> ntilde = u'\N{LATIN SMALL LETTER N WITH TILDE}'
    >>>
    >>> print repr(ntilde.encode('utf-8'))
    '\xc3\xb1'
    >>> print repr(ntilde.encode('windows-1252'))
    '\xf1'
    >>> print repr(ntilde.encode('iso-8859-1'))
    '\xf1'
    >>> print repr(ntilde.encode('macroman'))
    '\x96'
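Or print all of them:

    import encodings.aliases

    for c in encodings.aliases.aliases:
        try:
            encoded = ntilde.encode(c)
            print c, repr(encoded)
        except Exception:
            # Skip codecs that are unavailable or cannot encode this character
            pass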
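The comparison also works in the other direction: take bytes that are known to be in the file and see what each candidate encoding turns them into. The sketch below, for Python 3, uses the two byte values quoted in the question's error messages (0x88 and 0x8d). The candidate list and the helper `decode_or_none` are assumptions chosen for illustration.

```python
candidates = ["utf-8", "cp1252", "iso-8859-1", "mac_roman"]
observed = [b"\x88", b"\x8d"]  # byte values taken from the two error messages

def decode_or_none(raw, encoding):
    """Decode raw with encoding, or return None if the bytes are invalid in it."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

table = {enc: [decode_or_none(raw, enc) for raw in observed] for enc in candidates}
for enc in candidates:
    # ascii() keeps the output printable on any terminal
    print(enc, ascii(table[enc]))
```

An encoding that maps the observed bytes to letters plausible for the file's language (here, accented letters used in Portuguese) is a stronger candidate than one that yields None or control characters. This is still a guess, not proof.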
Then, when you have guessed which encoding it is, use that, e.g.:

    with codecs.open(filename, "r", encoding='iso-8859-1') as csvfile:
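Once an encoding has been chosen, reading the CSV might look like the following sketch. It targets Python 3, where the csv module accepts decoded text; the helper `read_rows`, the temporary file, and the Mac OS Roman sample row are all hypothetical.

```python
import csv
import io
import os
import tempfile

def read_rows(filename, encoding):
    """Read all CSV rows from filename, decoding with the given encoding."""
    with io.open(filename, "r", encoding=encoding, newline="") as csvfile:
        return list(csv.reader(csvfile, delimiter=","))

# Hypothetical test file: one row saved in Mac OS Roman.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(u"greeting,Ol\u00e1\n".encode("mac_roman"))
    path = tmp.name

rows = read_rows(path, "mac_roman")
os.remove(path)
print(ascii(rows))
```

Passing the encoding as a parameter keeps the guess in one place, so trying a different candidate means changing a single argument.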
answered Sep 22, 2016 at 22:22 by zvone