Python: Read a CSV file line by line with or without header
In this article we will discuss how to read a CSV file line by line, with or without its header. We will also select specific columns while iterating over a CSV file line by line.

Suppose we have a CSV file students.csv with the following contents,

    Id,Name,Course,City,Session
    21,Mark,Python,London,Morning
    22,John,Python,Tokyo,Evening
    23,Sam,Python,Paris,Morning
    32,Shaun,Java,Tokyo,Morning

We want to read all the rows of this CSV file line by line and process each line one at a time.

Also note that we don't want to read all lines into a list of lists and then iterate over it, because that is not an efficient solution for a large CSV file, i.e. a file whose size is in GBs. We are looking for solutions where we read and process only one line at a time while iterating through all rows of the CSV, so that minimum memory is used.

Let's see how to do this,

Python has a csv module, which provides two different ways to read the contents of a CSV file, i.e. the csv.reader() function and the csv.DictReader class. Let's discuss and use them one by one to read a CSV file line by line,

Read a CSV file line by line using csv.reader

With the csv module's reader object we can iterate over the lines of a CSV file as lists of values, where each value in the list is a cell value. Let's understand with an example,

    from csv import reader

    # open file in read mode; the csv module documentation recommends newline=''
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to reader() to get the reader object
        csv_reader = reader(read_obj)
        # Iterate over each row in the csv using reader object
        for row in csv_reader:
            # row variable is a list that represents a row in csv
            print(row)

Output:

    ['Id', 'Name', 'Course', 'City', 'Session']
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']

It iterates over all the rows of the students.csv file. For each row it fetches the contents of that row as a list and prints that list.

How did it work?

It performed the following steps,

1. Open the file 'students.csv' in read mode and create a file object.
2. Create a reader object (iterator) by passing the file object to the csv.reader() function.
3. Once we have this reader object, which is an iterator, use it with a for loop to read individual rows of the CSV as lists of values, where each value in the list represents an individual cell.

This way only one line is in memory at a time while iterating through the CSV file, which makes it a memory-efficient solution.

Read csv file without header

In the previous example we iterated through all the rows of the CSV file, including the header. But suppose we want to skip the header and iterate over the remaining rows of the CSV file.

Let's see how to do that,

    from csv import reader

    # skip first line i.e. read header first and then iterate over each row of csv as a list
    with open('students.csv', 'r', newline='') as read_obj:
        csv_reader = reader(read_obj)
        # next() with a default returns None instead of raising StopIteration on an empty file
        header = next(csv_reader, None)
        # Check that the file is not empty
        if header is not None:
            # Iterate over each row after the header in the csv
            for row in csv_reader:
                # row variable is a list that represents a row in csv
                print(row)
            print('Header was:')
            print(header)

Output:

    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    Header was:
    ['Id', 'Name', 'Course', 'City', 'Session']

It skipped the header row of the CSV file and iterated over all the remaining rows of the students.csv file. For each row it fetched the contents of that row as a list and printed that list. It initially saved the header row in a separate variable and printed it at the end.

How did it work?

The reader() function returns an iterator object, which we can use with a Python for loop to iterate over the rows. But in the above example we first called the next() function on this iterator object, which returned the first row of the CSV. After that we used the iterator object with a for loop to iterate over the remaining rows of the CSV file.

Read csv file line by line using csv module DictReader object

With the csv module's DictReader class we can iterate over the lines of a CSV file as dictionaries, i.e. for each row a dictionary is returned, which contains pairs of column names and cell values for that row.
Let's understand with an example,

    from csv import DictReader

    # open file in read mode
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to DictReader() to get the DictReader object
        csv_dict_reader = DictReader(read_obj)
        # iterate over each line as a dictionary
        for row in csv_dict_reader:
            # row variable is a dictionary that represents a row in csv
            print(row)

Output:

    {'Id': '21', 'Name': 'Mark', 'Course': 'Python', 'City': 'London', 'Session': 'Morning'}
    {'Id': '22', 'Name': 'John', 'Course': 'Python', 'City': 'Tokyo', 'Session': 'Evening'}
    {'Id': '23', 'Name': 'Sam', 'Course': 'Python', 'City': 'Paris', 'Session': 'Morning'}
    {'Id': '32', 'Name': 'Shaun', 'Course': 'Java', 'City': 'Tokyo', 'Session': 'Morning'}

It iterates over all the rows of the students.csv file. For each row it fetches the contents of that row as a dictionary and prints that dictionary.

How did it work?

It performed the following steps,

1. Open the file 'students.csv' in read mode and create a file object.
2. Create a DictReader object (iterator) by passing the file object to csv.DictReader().
3. Once we have this DictReader object, which is an iterator, use it with a for loop to read individual rows of the CSV as dictionaries, where each pair in the dictionary contains the column name and the column value for that row.

It is a memory-efficient solution, because only one line is in memory at a time.

Get column names from header in csv file

The DictReader class has a fieldnames attribute that returns the column names of the CSV file as a list. Let's see how to use it,

    from csv import DictReader

    # open file in read mode
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to DictReader() to get the DictReader object
        csv_dict_reader = DictReader(read_obj)
        # get column names from a csv file
        column_names = csv_dict_reader.fieldnames
        print(column_names)

Output:

    ['Id', 'Name', 'Course', 'City', 'Session']

Read specific columns from a csv file while iterating line by line

Read specific columns (by column name) in a csv file while iterating row by row

Iterate over all the rows of the students.csv file line by line, but print only two columns for each row,

    from csv import DictReader

    # iterate over each line as a dictionary and print only a few columns by column name
    with open('students.csv', 'r', newline='') as read_obj:
        csv_dict_reader = DictReader(read_obj)
        for row in csv_dict_reader:
            print(row['Id'], row['Name'])

Output:

    21 Mark
    22 John
    23 Sam
    32 Shaun

DictReader returns a dictionary for each line during iteration. In this dictionary the keys are column names and the values are the cell values for those columns. So, to select specific columns in every row, we used the column name with the dictionary object.

Read specific columns (by column number) in a csv file while iterating row by row

Iterate over all rows of students.csv and for each row print the contents of the 2nd and 3rd column,

    from csv import reader

    # iterate over each line as a list and print only a few columns by column number
    with open('students.csv', 'r', newline='') as read_obj:
        csv_reader = reader(read_obj)
        for row in csv_reader:
            print(row[1], row[2])

Output:

    Name Course
    Mark Python
    John Python
    Sam Python
    Shaun Java

With csv.reader each row of the CSV file is fetched as a list of values, where each value represents a column value. So, to select the 2nd and 3rd column for each row, select the elements at index 1 and 2 from the list. Note that the header row is printed as well, because this example does not skip it.
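The index-based example above prints the header row along with the data. As a rough sketch of one way to combine header skipping with column selection, the block below wraps both in a generator, so rows are still produced one at a time. The helper name iter_columns and its parameters are made up for illustration and are not part of the csv module. io.StringIO with the sample data stands in for the real file so that the snippet runs on its own.

```python
import csv
import io

# Sample data copied from students.csv above; io.StringIO stands in for
# open('students.csv', newline='') so this sketch runs without the file.
SAMPLE = """Id,Name,Course,City,Session
21,Mark,Python,London,Morning
22,John,Python,Tokyo,Evening
23,Sam,Python,Paris,Morning
32,Shaun,Java,Tokyo,Morning
"""


def iter_columns(file_obj, indexes, skip_header=True):
    """Lazily yield a tuple of the requested columns for each row.

    Hypothetical helper for illustration, not part of the csv module.
    """
    csv_reader = csv.reader(file_obj)
    if skip_header:
        # The default of None avoids StopIteration on an empty file
        next(csv_reader, None)
    for row in csv_reader:
        # Pick only the cells at the requested positions
        yield tuple(row[i] for i in indexes)


# Print the 2nd and 3rd column of every data row (header skipped)
for name, course in iter_columns(io.StringIO(SAMPLE), [1, 2]):
    print(name, course)
```

With the real file, the same helper could be handed the file object from `with open('students.csv', newline='') as read_obj:`. A row with fewer cells than the requested indexes would raise IndexError, which this sketch does not handle.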
The complete example is as follows,

    from csv import reader
    from csv import DictReader


    def main():
        print('*** Read csv file line by line using csv module reader object ***')
        print('*** Iterate over each row of a csv file as list using reader object ***')
        # open file in read mode
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to reader() to get the reader object
            csv_reader = reader(read_obj)
            # Iterate over each row in the csv using reader object
            for row in csv_reader:
                # row variable is a list that represents a row in csv
                print(row)

        print('*** Read csv line by line without header ***')
        # skip first line i.e. read header first and then iterate over each row of csv as a list
        with open('students.csv', 'r', newline='') as read_obj:
            csv_reader = reader(read_obj)
            # next() with a default returns None instead of raising StopIteration on an empty file
            header = next(csv_reader, None)
            # Check that the file is not empty
            if header is not None:
                # Iterate over each row after the header in the csv
                for row in csv_reader:
                    # row variable is a list that represents a row in csv
                    print(row)
                print('Header was:')
                print(header)

        print('*** Read csv file line by line using csv module DictReader object ***')
        # open file in read mode
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to DictReader() to get the DictReader object
            csv_dict_reader = DictReader(read_obj)
            # iterate over each line as a dictionary
            for row in csv_dict_reader:
                # row variable is a dictionary that represents a row in csv
                print(row)

        print('*** select elements by column name while reading csv file line by line ***')
        # open file in read mode
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to DictReader() to get the DictReader object
            csv_dict_reader = DictReader(read_obj)
            # iterate over each line as a dictionary
            for row in csv_dict_reader:
                # row variable is a dictionary that represents a row in csv
                print(row['Name'], 'is from', row['City'], 'and he is studying', row['Course'])

        print('*** Get column names from header in csv file ***')
        # open file in read mode
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to DictReader() to get the DictReader object
            csv_dict_reader = DictReader(read_obj)
            # get column names from a csv file
            column_names = csv_dict_reader.fieldnames
            print(column_names)

        print('*** Read specific columns from a csv file while iterating line by line ***')
        print('*** Read specific columns (by column name) in a csv file while iterating row by row ***')
        # iterate over each line as a dictionary and print only a few columns by column name
        with open('students.csv', 'r', newline='') as read_obj:
            csv_dict_reader = DictReader(read_obj)
            for row in csv_dict_reader:
                print(row['Id'], row['Name'])

        print('*** Read specific columns (by column Number) in a csv file while iterating row by row ***')
        # iterate over each line as a list and print only a few columns by column number
        with open('students.csv', 'r', newline='') as read_obj:
            csv_reader = reader(read_obj)
            for row in csv_reader:
                print(row[1], row[2])


    if __name__ == '__main__':
        main()

Output:

    *** Read csv file line by line using csv module reader object ***
    *** Iterate over each row of a csv file as list using reader object ***
    ['Id', 'Name', 'Course', 'City', 'Session']
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    *** Read csv line by line without header ***
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    Header was:
    ['Id', 'Name', 'Course', 'City', 'Session']
    *** Read csv file line by line using csv module DictReader object ***
    {'Id': '21', 'Name': 'Mark', 'Course': 'Python', 'City': 'London', 'Session': 'Morning'}
    {'Id': '22', 'Name': 'John', 'Course': 'Python', 'City': 'Tokyo', 'Session': 'Evening'}
    {'Id': '23', 'Name': 'Sam', 'Course': 'Python', 'City': 'Paris', 'Session': 'Morning'}
    {'Id': '32', 'Name': 'Shaun', 'Course': 'Java', 'City': 'Tokyo', 'Session': 'Morning'}
    *** select elements by column name while reading csv file line by line ***
    Mark is from London and he is studying Python
    John is from Tokyo and he is studying Python
    Sam is from Paris and he is studying Python
    Shaun is from Tokyo and he is studying Java
    *** Get column names from header in csv file ***
    ['Id', 'Name', 'Course', 'City', 'Session']
    *** Read specific columns from a csv file while iterating line by line ***
    *** Read specific columns (by column name) in a csv file while iterating row by row ***
    21 Mark
    22 John
    23 Sam
    32 Shaun
    *** Read specific columns (by column Number) in a csv file while iterating row by row ***
    Name Course
    Mark Python
    John Python
    Sam Python
    Shaun Java
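To illustrate why reading one row at a time matters, here is a small sketch that counts rows per value of a column without holding the whole file in memory. Only the running counts are kept, so memory use depends on the number of distinct values rather than on the file size. The function name count_by_column is hypothetical, and io.StringIO with the sample rows is used in place of students.csv so that the code is self-contained.

```python
import csv
import io
from collections import Counter

# Sample rows copied from students.csv; io.StringIO stands in for
# open('students.csv', newline='') so this sketch runs without the file.
SAMPLE = """Id,Name,Course,City,Session
21,Mark,Python,London,Morning
22,John,Python,Tokyo,Evening
23,Sam,Python,Paris,Morning
32,Shaun,Java,Tokyo,Morning
"""


def count_by_column(file_obj, column):
    """Count how often each value occurs in the given column.

    Hypothetical helper for illustration, not part of the csv module.
    """
    counts = Counter()
    # DictReader yields one dictionary per row, so only the current row
    # and the running counts are in memory at any time
    for row in csv.DictReader(file_obj):
        counts[row[column]] += 1
    return counts


course_counts = count_by_column(io.StringIO(SAMPLE), 'Course')
print(course_counts)  # Counter({'Python': 3, 'Java': 1})
```

Passing a column name that is not in the header would raise KeyError on the first data row; checking the name against the reader's fieldnames first is one possible safeguard.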