Go to: Na-Rae Han's home page
Python 3 Notes
File Reading and Writing Methods
On this page: open(), file.read(), file.readlines(), file.write(), file.writelines().
Before proceeding, make sure you understand the concepts of file path and CWD. If you run into problems, visit the Common Pitfalls section at the bottom of this page.
Opening and Closing a "File Object"
As seen in Tutorials #12 and #13, file IO (input/output) operations are done through a file data object. It typically proceeds as follows:
1. Create a file object using the open() function. Along with the file name, specify:
   'r' for reading in an existing file (default; can be dropped),
   'w' for creating a new file for writing,
   'a' for appending new content to an existing file.
2. Do something with the file object (reading, writing).
3. Close the file object by calling the .close() method on the file object.
Below, myfile is the file data object we're creating for reading. 'alice.txt' is a pre-existing text file in the same directory as the foo.py script. After the file content is read in, .close() is called on myfile, closing the file object.
myfile = open('alice.txt', 'r')    # Reading. 'r' can be omitted
# ... read from myfile ...
myfile.close()                     # Closing file
foo.py
Below, myfile is opened for writing. In the second instance, the 'a' switch makes sure that the new content is tacked on at the end of the existing text file. Had you used 'w' instead, the original file would have been overwritten.
myfile = open('results.txt', 'w')    # The file is newly created where foo.py is
# ... write to myfile ...
myfile.close()                       # Closing file. VERY IMPORTANT!
myfile = open('results.txt', 'a')    # 'a': appending instead of overwriting.
# ... add text to the file ...
myfile.close()                       # Closing file. DON'T FORGET!
foo.py
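The difference between 'w' and 'a' can be checked with a short experiment. The sketch below is only an illustration: it works in a temporary directory, and the text written to the file is made up for the demo.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is touched (demo assumption).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'results.txt')

myfile = open(path, 'w')          # 'w' creates the file (or empties an existing one)
myfile.write('first run\n')
myfile.close()

myfile = open(path, 'a')          # 'a' keeps what is there and adds to the end
myfile.write('second run\n')
myfile.close()

myfile = open(path)               # default mode is 'r'
after_append = myfile.read()
myfile.close()

myfile = open(path, 'w')          # opening with 'w' again wipes the earlier content
myfile.write('third run\n')
myfile.close()

myfile = open(path)
after_overwrite = myfile.read()
myfile.close()

print(repr(after_append))
print(repr(after_overwrite))
```

After the 'a' step the file holds both lines; after the final 'w' step only the last line remains.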
There is one more piece of crucial information: encoding. Some files may have to be read as a particular encoding type, and sometimes you need to write out a file in a specific encoding system. For such cases, the open() statement should include an encoding specification, with the encoding='xxx' switch:
myfile = open('alice.txt', encoding='utf-8')           # Reading a UTF-8 file; 'r' is omitted
myfile = open('results.txt', 'w', encoding='utf-8')    # File will be written in UTF-8
foo.py
Mostly, you will need 'utf-8' (8-bit Unicode), 'utf-16' (16-bit Unicode), or 'utf-32' (32-bit), but it may be something different, especially if you are dealing with a foreign language text. Here is a full list of encodings.
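To see that the encoding switch really changes what ends up on disk, you can write the same string in all three encodings and compare file sizes. This is a rough sketch under assumptions: the sample word and the file names are invented for the demo, and the files go into a temporary directory.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()      # throwaway directory for the demo
text = 'café'                     # sample text with one non-ASCII character

sizes = {}
roundtrip_ok = {}
for enc in ['utf-8', 'utf-16', 'utf-32']:
    path = os.path.join(workdir, 'sample-' + enc + '.txt')

    fout = open(path, 'w', encoding=enc)    # write using this encoding
    fout.write(text)
    fout.close()
    sizes[enc] = os.path.getsize(path)      # size of the file in bytes

    fin = open(path, encoding=enc)          # read back with the SAME encoding
    roundtrip_ok[enc] = (fin.read() == text)
    fin.close()

print(sizes)
print(roundtrip_ok)
```

The text comes back intact each time because the reading encoding matches the writing encoding; the byte counts differ because each encoding stores characters differently.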
Reading from a File
OK, we know how to open and close a file object. But what are the actual commands for reading? There are multiple methods.
First off, .read() reads in the entire text content of the file as a single string. Below, the file is read into a variable named marytxt, which ends up being a string-type object. Download mary-short.txt and try it out yourself.
>>> f = open('mary-short.txt')
>>> marytxt = f.read()          # Using .read()
>>> f.close()
>>> marytxt
'Mary had a little lamb,\nHis fleece was white as snow,\nAnd everywhere that Mary
went,\nThe lamb was sure to go.\n'
>>> type(marytxt)               # marytxt is string type
<class 'str'>
>>> len(marytxt)                # marytxt has 110 characters
110
>>> print(marytxt[0])
M
Next, .readlines() reads in the entire text content of the file as a list of lines, each terminating with a line break. Below, you can see marylines is a list of strings, where each string is a line from mary-short.txt.
>>> f = open('mary-short.txt')
>>> marylines = f.readlines()   # Using .readlines()
>>> f.close()
>>> marylines
['Mary had a little lamb,\n', 'His fleece was white as snow,\n', 'And everywhere
that Mary went,\n', 'The lamb was sure to go.\n']
>>> type(marylines)             # marylines is list type
<class 'list'>
>>> len(marylines)              # marylines has 4 lines
4
>>> print(marylines[0])
Mary had a little lamb,
Lastly, rather than loading the entire file content into memory, you can iterate through the file object line by line using the for ... in loop. This method is more memory-efficient and therefore recommended when dealing with a very large file. Below, bible-kjv.txt is opened, and any line containing smite is printed out. Download bible-kjv.txt and try it out yourself.
f = open('bible-kjv.txt')        # This is a big file
for line in f:                   # Using 'for ... in' on file object
    if 'smite' in line:
        print(line, end='')      # end='' keeps print from adding a line break
f.close()
foo.py
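If you do not have bible-kjv.txt at hand, the same line-by-line pattern can be tried on a small stand-in file. In the sketch below the file name and its three lines are invented for the demo; matching lines are collected into a list instead of being printed right away.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()                     # throwaway directory
path = os.path.join(workdir, 'sample.txt')       # stand-in for a big file

# Create a tiny sample file (made-up content).
fout = open(path, 'w')
fout.writelines(['do not smite the rock\n',
                 'the lamb was sure to go\n',
                 'smite is a rare word\n'])
fout.close()

hits = []
f = open(path)
for line in f:                                   # one line at a time, never the whole file
    if 'smite' in line:
        hits.append(line.rstrip('\n'))           # drop the trailing line break
f.close()

print(hits)
```

Only one line is held in memory at a time, which is why this approach scales to very large files.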
Writing to a File
Writing methods also come in a pair: .write() and .writelines(). Like the corresponding reading methods, .write() handles a single string, while .writelines() handles a list of strings.
Below, .write() writes a single string each time to the designated output file:
>>> fout = open('hello.txt', 'w')
>>> fout.write('Hello, world!\n')                # .write(str)
>>> fout.write('My name is Homer.\n')
>>> fout.write("What a beautiful day we're having.\n")
>>> fout.close()
This time, we have tobuy, a list of strings, which .writelines() writes out at once:
>>> tobuy = ['milk\n', 'butter\n', 'coffee beans\n', 'arugula\n']
>>> fout = open('grocerylist.txt', 'w')
>>> fout.writelines(tobuy)                       # .writelines(list)
>>> fout.close()
Note that all strings in the examples have the line break '\n' at the end. Without it, all strings will be written out on the same line, which is what was happening in Tutorial 13. Unlike the print() function, which prints out a string on its own new line, writing methods will not tack on a newline character. You must remember to supply '\n' if you wish a string to occupy its own line.
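When your list items do not already end in '\n', one possible way to add the line breaks is a list comprehension just before calling .writelines(). The sketch below writes the grocery list twice, once without and once with line breaks, into a temporary directory (an assumption made for the demo).

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'grocerylist.txt')

tobuy = ['milk', 'butter', 'coffee beans', 'arugula']    # no line breaks yet

fout = open(path, 'w')
fout.writelines(tobuy)                                   # nothing separates the items
fout.close()
fin = open(path)
run_together = fin.read()
fin.close()

fout = open(path, 'w')
fout.writelines([item + '\n' for item in tobuy])         # supply '\n' ourselves
fout.close()
fin = open(path)
one_per_line = fin.read()
fin.close()

print(repr(run_together))
print(repr(one_per_line))
```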
Common Pitfalls
File I/O is notoriously fraught with stumbling blocks for beginning programmers. Below are the most common ones.
"No such file or directory" error
>>> f = open('mary-short.txt')
Traceback (most recent call last):
  File "", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'mary-short.txt'
You are getting this error because Python failed to locate the file for reading. Make sure you are supplying the correct file path and name. First read File Path and CWD. Also, refer to this, this and this FAQ.
Issues with encoding
>>> f = open('mary-short.txt')        # need encoding='utf-8'
>>> marytxt = f.read()
Traceback (most recent call last):
  File "", line 1, in <module>
    marytxt = f.read()
  File "C:\Program Files (x86)\Python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 36593: character
maps to <undefined>
"UnicodeDecodeError" means you have a file encoding issue. Each computer has its own system-wide default encoding, and the file you are trying to open is encoded in something different, most likely some version of Unicode. If this happens, you should specify the encoding using the encoding='xxx' switch while opening the file. If you are not sure which encoding to use, try 'utf-8', 'utf-16', and 'utf-32'.
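The "try one encoding after another" advice can be turned into a small helper. Treat this as a sketch only: read_with_fallback is a hypothetical function written for this page, not something built into Python, and the sample file is created in a temporary directory just for the demo. Keep in mind that a wrong encoding does not always raise an error; sometimes it silently produces garbled text, so always eyeball the result.

```python
import os
import tempfile

def read_with_fallback(path, encodings=('utf-8', 'utf-16', 'utf-32')):
    """Hypothetical helper: return (text, encoding) for the first encoding that decodes."""
    for enc in encodings:
        try:
            f = open(path, encoding=enc)
            try:
                text = f.read()          # decoding errors surface here
            finally:
                f.close()                # close even if reading failed
            return text, enc
        except UnicodeError:             # UnicodeDecodeError is a kind of UnicodeError
            continue
    raise ValueError('none of the candidate encodings worked')

# Demo: create a UTF-16 file, then let the helper figure it out.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'mary-utf16.txt')
fout = open(path, 'w', encoding='utf-16')
fout.write('Mary had a little lamb,\n')
fout.close()

text, used = read_with_fallback(path)
print(used)
print(repr(text))
```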
Entire file content can be read in only ONCE per opening
>>> f = open('mary-short.txt')
>>> marytxt = f.read()            # Reads in entire file content
>>> marylines = f.readlines()     # Nothing left to read, returns nothing
>>> f.close()
>>> len(marytxt)
110
>>> len(marylines)                # marylines is empty!
0
Both .read() and .readlines() come with the concept of a cursor. After either command is executed, the cursor moves to the end of the file, leaving nothing more to read in. Therefore, once a file's content has been read in, another attempt to read from the file object will produce an empty data object. If for some reason you must read the file content again, either close and re-open the file or move the cursor back to the beginning with f.seek(0).
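The cursor behavior is easy to observe directly. The sketch below recreates the four-line Mary text in a temporary file (a stand-in for mary-short.txt, assumed here for the demo), reads it twice in a row, and then uses the file object's .seek(0) method to move the cursor back to the start as one alternative to closing and re-opening.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'mary-short.txt')   # stand-in copy for the demo

fout = open(path, 'w')
fout.write('Mary had a little lamb,\nHis fleece was white as snow,\n'
           'And everywhere that Mary went,\nThe lamb was sure to go.\n')
fout.close()

f = open(path)
marytxt = f.read()            # cursor is now at the end of the file
second_try = f.read()         # nothing left: empty string
f.seek(0)                     # move the cursor back to the beginning
marylines = f.readlines()     # reading works again
f.close()

print(len(marytxt), repr(second_try), len(marylines))
```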
Only the string type can be written
>>> pi = 3.141592
>>> fout = open('math.txt', 'w')
>>> fout.write("Pi's value is ")
>>> fout.write(pi)                # trying to write float, doesn't work
Traceback (most recent call last):
  File "", line 1, in <module>
TypeError: write() argument must be str, not float
>>> fout.write(str(pi))           # turn number into string using str()
>>>
Writing methods only work with strings: .write() takes a single string, and .writelines() takes a list which contains strings only. Non-string type data must first be coerced into the string type by using the str() function.
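The same rule applies to .writelines(): every item in the list has to be a string. Here is a minimal sketch, with a made-up list of numbers and a temporary output file, that catches the TypeError and then converts with str() before writing.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'math.txt')

pi = 3.141592
scores = [90, 85, 77]                 # made-up numbers for the demo

fout = open(path, 'w')
try:
    fout.write(pi)                    # a float is rejected
except TypeError as err:
    error_message = str(err)
fout.write("Pi's value is ")
fout.write(str(pi))                   # coerce to string first
fout.write('\n')
fout.writelines([str(n) + '\n' for n in scores])   # convert every list item
fout.close()

fin = open(path)
content = fin.read()
fin.close()
print(content)
```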
Your output file is empty
This happens to everyone: you write something out, open up the file to view, only to find it empty. At other times, the file content may be incomplete. Curious, isn't it? Well, the cause is simple: YOU FORGOT .close(). Writing out happens in buffers; flushing out the last writing buffer does not happen until you close your file object. ALWAYS REMEMBER TO CLOSE YOUR FILE OBJECT.
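If you keep forgetting .close(), Python's with statement is one way to have the file closed for you. It is not used elsewhere on this page, so take the sketch below as an optional alternative rather than the method taught here; the output goes to a temporary directory for the demo.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'output.txt')

with open(path, 'w') as fout:         # the file is closed when the block ends
    fout.write('Hello, world!\n')
    fout.write('My name is Homer.\n')

closed_after_block = fout.closed      # True: no explicit .close() was needed

fin = open(path)
content = fin.read()
fin.close()
print(closed_after_block)
print(repr(content))
```

Because the file is closed as soon as the indented block finishes, the last write buffer is flushed and the output file is complete.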
(Windows) Line breaks do not show up
If you open up your text file in the Notepad app in Windows and see everything in one line, don't be alarmed. Open the same text file in WordPad or, even better, Notepad++, and you will see that the line breaks are there after all. See this FAQ for details.