Reading and Writing CSV Files in Python - Real Python
by Jon Fincher
Table of Contents
What Is a CSV File?
Where Do CSV Files Come From?
Parsing CSV Files With Python’s Built-in CSV Library
Reading CSV Files With csv
Reading CSV Files Into a Dictionary With csv
Optional Python CSV reader Parameters
Writing CSV Files With csv
Writing CSV File From a Dictionary With csv
Parsing CSV Files With the pandas Library
Reading CSV Files With pandas
Writing CSV Files With pandas
Conclusion
Let’s face it: you need to get information into and out of your programs through more than just the keyboard and console. Exchanging information through text files is a common way to share info between programs. One of the most popular formats for exchanging data is the CSV format. But how do you use it?
Let’s get one thing clear: you don’t have to (and you won’t) build your own CSV parser from scratch. There are several perfectly acceptable libraries you can use. The Python csv library will work for most cases. If your work requires lots of data or numerical analysis, the pandas library has CSV parsing capabilities as well, which should handle the rest.
In this article, you’ll learn how to read, process, and parse CSV from text files using Python. You’ll see how CSV files work, learn the all-important csv library built into Python, and see how CSV parsing works using the pandas library.
So let’s get started!
What Is a CSV File?
A CSV file (Comma Separated Values file) is a type of plain text file that uses specific structuring to arrange tabular data. Because it’s a plain text file, it can contain only actual text data: in other words, printable ASCII or Unicode characters.
The structure of a CSV file is given away by its name. Normally, CSV files use a comma to separate each specific data value. Here’s what that structure looks like:
column 1 name,column 2 name,column 3 name
first row data 1,first row data 2,first row data 3
second row data 1,second row data 2,second row data 3
...
Notice how each piece of data is separated by a comma. Normally, the first line identifies each piece of data: in other words, the name of a data column. Every subsequent line after that is actual data and is limited only by file size constraints.
In general, the separator character is called a delimiter, and the comma is not the only one used. Other popular delimiters include the tab (\t), colon (:) and semi-colon (;) characters. Properly parsing a CSV file requires us to know which delimiter is being used.
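To make the point about delimiters concrete, here is a minimal sketch that parses the same two rows written with three different delimiters, using the csv library discussed later in this article. The sample strings and the names samples, delimiters, parsed, and wrong are made up for illustration, and io.StringIO stands in for a real file:

```python
import csv
import io

# Hypothetical sample data: the same two rows, written with three delimiters.
samples = {
    "comma": "name,department\nJohn Smith,Accounting\n",
    "tab": "name\tdepartment\nJohn Smith\tAccounting\n",
    "semicolon": "name;department\nJohn Smith;Accounting\n",
}
delimiters = {"comma": ",", "tab": "\t", "semicolon": ";"}

parsed = {}
for label, text in samples.items():
    # io.StringIO stands in for a file object returned by open().
    reader = csv.reader(io.StringIO(text), delimiter=delimiters[label])
    parsed[label] = list(reader)

# With the right delimiter, every variant yields the same rows.
print(parsed["comma"] == parsed["tab"] == parsed["semicolon"])

# With the wrong delimiter (the default comma), each line is read as one field.
wrong = list(csv.reader(io.StringIO(samples["tab"])))
print(wrong)
```

The last call shows why the delimiter has to be known up front: nothing fails, but each row silently comes back as a single field.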
Where Do CSV Files Come From?
CSV files are normally created by programs that handle large amounts of data. They are a convenient way to export data from spreadsheets and databases as well as import or use it in other programs. For example, you might export the results of a data mining program to a CSV file and then import that into a spreadsheet to analyze the data, generate graphs for a presentation, or prepare a report for publication.
CSV files are very easy to work with programmatically. Any language that supports text file input and string manipulation (like Python) can work with CSV files directly.
Parsing CSV Files With Python’s Built-in CSV Library
The csv library provides functionality to both read from and write to CSV files. Designed to work out of the box with Excel-generated CSV files, it is easily adapted to work with a variety of CSV formats. The csv library contains objects and other code to read, write, and process data from and to CSV files.
Reading CSV Files With csv
Reading from a CSV file is done using the reader object. The CSV file is opened as a text file with Python’s built-in open() function, which returns a file object. This is then passed to the reader, which does the heavy lifting.
Here’s the employee_birthday.txt file:
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
Here’s code to read it:
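import csv

# newline='' is what the csv documentation recommends when opening CSV files
with open('employee_birthday.txt', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            # Every later row is a list of strings, one per field
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')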
This results in the following output:
Column names are name, department, birthday month
    John Smith works in the Accounting department, and was born in November.
    Erica Meyers works in the IT department, and was born in March.
Processed 3 lines.
Each row returned by the reader is a list of String elements containing the data found by removing the delimiters. The first row returned contains the column names, which is handled in a special way.
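As an alternative to counting lines, the header row can be pulled off the reader with next() before looping over the data. The following is a minimal sketch, not the approach used in the code above; io.StringIO stands in for employee_birthday.txt so the example is self-contained, and the names header and rows are illustrative:

```python
import csv
import io

# Stand-in for employee_birthday.txt so the sketch runs without a file on disk.
data = io.StringIO(
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)

csv_reader = csv.reader(data)
header = next(csv_reader)  # consume the column names first
rows = list(csv_reader)    # everything left is data

print(f'Column names are {", ".join(header)}')
for row in rows:
    print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
print(f'Processed {len(rows) + 1} lines.')
```

This keeps the loop body free of the special case for the first row.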
Reading CSV Files Into a Dictionary With csv
Rather than deal with a list of individual String elements, you can read CSV data directly into a dictionary as well. (Technically, each row is an OrderedDict in Python 3.6 and 3.7, and a regular dict from Python 3.8 onward.)
Again, our input file, employee_birthday.txt, is as follows:
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
Here’s the code to read it in as a dictionary this time:
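import csv

# newline='' is what the csv documentation recommends when opening CSV files
with open('employee_birthday.txt', mode='r', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # DictReader already consumed the header line; joining a row
            # joins its keys, which are the column names
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        print(f'\t{row["name"]} works in the {row["department"]} department, and was born in {row["birthday month"]}.')
        line_count += 1
    print(f'Processed {line_count} lines.')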
This results in the same output as before:
Column names are name, department, birthday month
    John Smith works in the Accounting department, and was born in November.
    Erica Meyers works in the IT department, and was born in March.
Processed 3 lines.
Where did the dictionary keys come from? The first line of the CSV file is assumed to contain the keys to use to build the dictionary. If you don’t have these in your CSV file, you should specify your own keys by setting the fieldnames optional parameter to a list containing them.
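Here is a minimal sketch of the fieldnames case, assuming a version of the employee data with no header line. The in-memory headerless buffer and the records list are illustrative stand-ins for a real file and real processing:

```python
import csv
import io

# Assumed input: the same employee data, but without a header line.
headerless = io.StringIO(
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)

# Supplying fieldnames tells DictReader to treat the first line as data.
fieldnames = ["name", "department", "birthday month"]
csv_reader = csv.DictReader(headerless, fieldnames=fieldnames)

records = [dict(row) for row in csv_reader]
for record in records:
    print(f'{record["name"]} works in the {record["department"]} department.')
```

Note that when fieldnames is given for a file that does have a header line, that line comes back as an ordinary data row, so only pass it for headerless files.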
Optional Python CSV reader Parameters
The reader object can handle different styles of CSV files by specifying additional parameters, some of which are shown below:
delimiter specifies the character used to separate each field. The default is the comma (',').
quotechar specifies the character used to surround fields that contain the delimiter character. The default is a double quote ('"').
escapechar specifies the character used to escape the delimiter character, in case quotes aren’t used. The default is no escape character.
These parameters deserve some more explanation. Suppose you’re working with the following employee_addresses.txt file:
name,address,date joined
john smith,1132 Anywhere Lane Hoboken NJ, 07030,Jan 4
erica meyers,1234 Smith Lane Hoboken NJ, 07030,March 2
This CSV file contains three fields: name, address, and date joined, which are delimited by commas. The problem is that the data for the address field also contains a comma to signify the zip code.
There are three different ways to handle this situation:
Use a different delimiter
That way, the comma can safely be used in the data itself. You use the delimiter optional parameter to specify the new delimiter.
Wrap the data in quotes
The special nature of your chosen delimiter is ignored in quoted strings. Therefore, you can specify the character used for quoting with the quotechar optional parameter. As long as that character also doesn’t appear in the data, you’re fine.
Escape the delimiter characters in the data
Escape characters work just as they do in format strings, nullifying the interpretation of the character being escaped (in this case, the delimiter). If an escape character is used, it must be specified using the escapechar optional parameter.
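The three options can be sketched side by side. The sample lines below are hypothetical rewrites of the first address row (using a pipe delimiter, double quotes, and a backslash escape respectively), not the contents of the actual employee_addresses.txt file:

```python
import csv
import io

# The row we want to recover in every case.
expected = ["john smith", "1132 Anywhere Lane Hoboken NJ, 07030", "Jan 4"]

# 1. Use a different delimiter: the comma is now ordinary data.
pipe_text = "john smith|1132 Anywhere Lane Hoboken NJ, 07030|Jan 4\n"
pipe_row = next(csv.reader(io.StringIO(pipe_text), delimiter="|"))

# 2. Wrap the data in quotes: delimiters inside quotes are not split on.
quoted_text = 'john smith,"1132 Anywhere Lane Hoboken NJ, 07030",Jan 4\n'
quoted_row = next(csv.reader(io.StringIO(quoted_text), quotechar='"'))

# 3. Escape the delimiter: the backslash marks the next comma as data.
escaped_text = "john smith,1132 Anywhere Lane Hoboken NJ\\, 07030,Jan 4\n"
escaped_row = next(csv.reader(io.StringIO(escaped_text), escapechar="\\"))

print(pipe_row == quoted_row == escaped_row == expected)
```

All three readers return the same three fields, with the comma preserved inside the address.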
Writing CSV Files With csv
You can also write to a CSV file using a writer object and the .writerow() method:
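import csv

# newline='' lets the csv module control line endings itself
with open('employee_file.csv', mode='w', newline='') as employee_file:
    employee_writer = csv.writer(employee_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

    # Each call writes one list of values as one line of the file
    employee_writer.writerow(['John Smith', 'Accounting', 'November'])
    employee_writer.writerow(['Erica Meyers', 'IT', 'March'])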
The quotechar optional parameter tells the writer which character to use to quote fields when writing. Whether quoting is used or not, however, is determined by the quoting optional parameter:
If quoting is set to csv.QUOTE_MINIMAL, then .writerow() will quote fields only if they contain special characters such as the delimiter, the quotechar, or a line terminator character. This is the default case.
If quoting is set to csv.QUOTE_ALL, then .writerow() will quote all fields.
If quoting is set to csv.QUOTE_NONNUMERIC, then .writerow() will quote all fields containing text data and leave numeric fields unquoted. When a reader is given the same setting, it converts all unquoted fields to the float data type.
If quoting is set to csv.QUOTE_NONE, then .writerow() will escape delimiters instead of quoting them. In this case, you also must provide a value for the escapechar optional parameter.
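To see what each quoting setting does, the sketch below writes one row under all four settings and captures the result as a string. The row contents and the render() helper are invented for this illustration, and lineterminator is set to "\n" only to keep the captured strings easy to compare:

```python
import csv
import io

# Illustrative row: plain text, text containing the delimiter, and a number.
row = ["Erica Meyers", "IT, Support", 42]

def render(**fmtparams):
    """Write the sample row to an in-memory buffer and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **fmtparams).writerow(row)
    return buffer.getvalue()

minimal = render(quoting=csv.QUOTE_MINIMAL)
quote_all = render(quoting=csv.QUOTE_ALL)
nonnumeric = render(quoting=csv.QUOTE_NONNUMERIC)
none = render(quoting=csv.QUOTE_NONE, escapechar="\\")

for label, text in [("MINIMAL", minimal), ("ALL", quote_all),
                    ("NONNUMERIC", nonnumeric), ("NONE", none)]:
    print(f"{label:>10}: {text}", end="")

# Reading the QUOTE_NONNUMERIC line back with the same setting
# turns the unquoted field into a float.
read_back = next(csv.reader(io.StringIO(nonnumeric), quoting=csv.QUOTE_NONNUMERIC))
print(read_back)
```

Only the second field is quoted under QUOTE_MINIMAL, because it is the only one containing the delimiter.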
Reading the file back in plain text shows that the file is created as follows:
John Smith,Accounting,November
Erica Meyers,IT,March
Writing CSV File From a Dictionary With csv
Since you can read our data into a dictionary, it’s only fair that you should be able to write it out from a dictionary as well:
import csv

# newline='' lets the csv module control line endings itself
with open('employee_file2.csv', mode='w', newline='') as csv_file:
    fieldnames = ['emp_name', 'dept', 'birth_month']
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)

    # Write the column names, then one line per dictionary
    writer.writeheader()
    writer.writerow({'emp_name': 'John Smith', 'dept': 'Accounting', 'birth_month': 'November'})
    writer.writerow({'emp_name': 'Erica Meyers', 'dept': 'IT', 'birth_month': 'March'})
Unlike DictReader, the fieldnames parameter is required when writing a dictionary. This makes sense, when you think about it: without a list of fieldnames, the DictWriter can’t know which keys to use to retrieve values from your dictionaries. It also uses the keys in fieldnames to write out the first row as column names.
The code above generates the following output file:
emp_name,dept,birth_month
John Smith,Accounting,November
Erica Meyers,IT,March
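As a quick check that DictWriter and DictReader are mirror images, the sketch below writes the same two employees to an in-memory buffer and reads them back. The buffer, the employees list, and the use of .writerows() to write all rows in one call are choices made for this illustration; with a real file you would open it with open() instead of using io.StringIO:

```python
import csv
import io

fieldnames = ["emp_name", "dept", "birth_month"]
employees = [
    {"emp_name": "John Smith", "dept": "Accounting", "birth_month": "November"},
    {"emp_name": "Erica Meyers", "dept": "IT", "birth_month": "March"},
]

# Write the header and all rows into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(employees)
written = buffer.getvalue()

# Rewind and read the same text back into dictionaries.
buffer.seek(0)
round_trip = [dict(row) for row in csv.DictReader(buffer)]

print(written, end="")
print(round_trip == employees)
```

The comparison holds here because every value is already a string; numbers would come back from the reader as strings.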
Parsing CSV Files With the pandas Library
Of course, the Python CSV library isn’t the only game in town. Reading CSV files is possible in pandas as well. It is highly recommended if you have a lot of data to analyze.
pandas is an open-source Python library that provides high performance data analysis tools and easy to use data structures. pandas is available for all Python installations, but it is a key part of the Anaconda distribution and works extremely well in Jupyter notebooks to share data, code, analysis results, visualizations, and narrative text.
Installing pandas and its dependencies in Anaconda is easily done:
$ conda install pandas
As is using pip/pipenv for other Python installations:
$ pip install pandas
We won’t delve into the specifics of how pandas works or how to use it. For an in-depth treatment on using pandas to read and analyze large data sets, check out Shantnu Tiwari’s superb article on working with large Excel files in pandas.
Reading CSV Files With pandas
To show some of the power of pandas CSV capabilities, I’ve created a slightly more complicated file to read, called hrdata.csv. It contains data on company employees:
Name,Hire Date,Salary,Sick Days remaining
Graham Chapman,03/15/14,50000.00,10
John Cleese,06/01/15,65000.00,8
Eric Idle,05/12/14,45000.00,10
Terry Jones,11/01/13,70000.00,3
Terry Gilliam,08/12/14,48000.00,7
Michael Palin,05/23/13,66000.00,8
Reading the CSV into a pandas DataFrame is quick and straightforward:
import pandas

df = pandas.read_csv('hrdata.csv')
print(df)
That’s it: three lines of code, and only one of them is doing the actual work. pandas.read_csv() opens, analyzes, and reads the CSV file provided, and stores the data in a DataFrame. Printing the DataFrame results in the following output:
             Name Hire Date   Salary  Sick Days remaining
0  Graham Chapman  03/15/14  50000.0                   10
1     John Cleese  06/01/15  65000.0                    8
2       Eric Idle  05/12/14  45000.0                   10
3     Terry Jones  11/01/13  70000.0                    3
4   Terry Gilliam  08/12/14  48000.0                    7
5   Michael Palin  05/23/13  66000.0                    8
Here are a few points worth noting:
First, pandas recognized that the first line of the CSV contained column names, and used them automatically. I call this Goodness.
However, pandas is also using zero-based integer indices in the DataFrame. That’s because we didn’t tell it what our index should be.
Further, if you look at the data types of our columns, you’ll see pandas has properly converted the Salary and Sick Days remaining columns to numbers, but the Hire Date column is still a String. This is easily confirmed in interactive mode:
>>> print(type(df['Hire Date'][0]))