Reading CSV files in Python - Programiz
In this tutorial, we will learn to read CSV files with different formats in Python with the help of examples.

We are going to exclusively use the csv module built into Python for this task. But first, we will have to import the module as:

    import csv

We have already covered the basics of how to use the csv module to read and write into CSV files. If you are not familiar with the csv module, check out our tutorial on Python CSV: Read and Write CSV files.

Basic Usage of csv.reader()

Let's look at a basic example of using csv.reader() to refresh your existing knowledge.
Example 1: Read CSV files with csv.reader()

Suppose we have a CSV file with the following entries:

    SN,Name,Contribution
    1,Linus Torvalds,Linux Kernel
    2,Tim Berners-Lee,World Wide Web
    3,Guido van Rossum,Python Programming

We can read the contents of the file with the following program:

    import csv

    with open('innovators.csv', 'r') as file:
        reader = csv.reader(file)
        for row in reader:
            print(row)

Output

    ['SN', 'Name', 'Contribution']
    ['1', 'Linus Torvalds', 'Linux Kernel']
    ['2', 'Tim Berners-Lee', 'World Wide Web']
    ['3', 'Guido van Rossum', 'Python Programming']

Here, we have opened the innovators.csv file in reading mode using the open() function. The csv documentation also recommends passing newline='' to open() so that newlines embedded in quoted fields are handled correctly.

To learn more about opening files in Python, visit: Python File Input/Output

Then, csv.reader() is used to read the file; it returns an iterable reader object.

The reader object is then iterated over using a for loop to print the contents of each row.

Now, we will look at CSV files with different formats. We will then learn how to customize the csv.reader() function to read them.

CSV files with Custom Delimiters

By default, a comma is used as the delimiter in a CSV file. However, some CSV files use delimiters other than a comma. A few popular ones are | and \t.

Suppose the innovators.csv file in Example 1 used a tab as its delimiter. To read the file, we can pass an additional delimiter parameter to the csv.reader() function.

Let's take an example.

Example 2: Read CSV file Having Tab Delimiter

    import csv

    with open('innovators.csv', 'r') as file:
        reader = csv.reader(file, delimiter='\t')
        for row in reader:
            print(row)

Output

    ['SN', 'Name', 'Contribution']
    ['1', 'Linus Torvalds', 'Linux Kernel']
    ['2', 'Tim Berners-Lee', 'World Wide Web']
    ['3', 'Guido van Rossum', 'Python Programming']

As we can see, the optional parameter delimiter='\t' tells the reader object that the CSV file we are reading from uses tabs as the delimiter.

CSV files with initial spaces

Some CSV files have a space character after each delimiter. When we use the default csv.reader() function to read these CSV files, we will get the spaces in the output as well.
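As a minimal sketch of that default behaviour, the snippet below parses a small in-memory sample with no extra parameters. The sample text and the variable names are hypothetical, and io.StringIO stands in for a file on disk so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample: every comma is followed by a single space.
sample = "SN, Name, City\n1, John, Washington\n"

# Default settings: the space after each delimiter is kept as part of the field.
rows = list(csv.reader(io.StringIO(sample)))

print(rows[0])  # ['SN', ' Name', ' City']
print(rows[1])  # ['1', ' John', ' Washington']
```

Every field after the first starts with a space, which is usually not what we want when comparing or looking up values.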
To remove these initial spaces, we need to pass an additional parameter called skipinitialspace. Let us look at an example:

Example 3: Read CSV files with initial spaces

Suppose we have a CSV file called people.csv with the following content:

    SN, Name, City
    1, John, Washington
    2, Eric, Los Angeles
    3, Brad, Texas

We can read the CSV file as follows:

    import csv

    with open('people.csv', 'r') as csvfile:
        reader = csv.reader(csvfile, skipinitialspace=True)
        for row in reader:
            print(row)

Output

    ['SN', 'Name', 'City']
    ['1', 'John', 'Washington']
    ['2', 'Eric', 'Los Angeles']
    ['3', 'Brad', 'Texas']

The program is similar to the other examples but has an additional skipinitialspace parameter, which is set to True.

This tells the reader object that the entries have initial whitespace. As a result, the initial spaces that were present after each delimiter are removed.

CSV files with quotes

Some CSV files have quotes around all or some of the entries.

Let's take quotes.csv as an example, with the following entries:

    "SN", "Name", "Quotes"
    1, Buddha, "What we think we become"
    2, Mark Twain, "Never regret anything that made you smile"
    3, Oscar Wilde, "Be yourself everyone else is already taken"

Reading this file with the default csv.reader() settings leaves the quotation marks in the output, because each opening quote follows a space rather than the delimiter itself and is therefore treated as ordinary text.

To remove them, we combine skipinitialspace with another optional parameter called quoting.

Let's look at an example of how to read the above file.

Example 4: Read CSV files with quotes

    import csv

    with open('quotes.csv', 'r') as file:
        reader = csv.reader(file, quoting=csv.QUOTE_ALL, skipinitialspace=True)
        for row in reader:
            print(row)

Output

    ['SN', 'Name', 'Quotes']
    ['1', 'Buddha', 'What we think we become']
    ['2', 'Mark Twain', 'Never regret anything that made you smile']
    ['3', 'Oscar Wilde', 'Be yourself everyone else is already taken']

As you can see, we have passed csv.QUOTE_ALL to the quoting parameter. It is a constant defined by the csv module.

csv.QUOTE_ALL tells the reader object that all the values in the CSV file are inside quotation marks. When reading, it is skipinitialspace=True that lets the parser find each quote at the start of its field; the quoting constants have their main effect when writing.
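To see how a space before an opening quote affects parsing, here is a minimal sketch that reads the same line twice, once with the default settings and once with skipinitialspace=True. The one-line sample and the variable names are hypothetical, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical sample: a space follows each comma, so the opening quote
# is not the first character of its field.
sample = '1, Buddha, "What we think we become"\n'

# Default settings: the space and the quotation marks stay in the field.
default_row = next(csv.reader(io.StringIO(sample)))

# skipinitialspace=True: the space is dropped, so the quotes are recognised
# as field quoting and removed.
trimmed_row = next(csv.reader(io.StringIO(sample), skipinitialspace=True))

print(default_row)  # ['1', ' Buddha', ' "What we think we become"']
print(trimmed_row)  # ['1', 'Buddha', 'What we think we become']
```

If the file had no space after the delimiters, the default reader would already strip the surrounding quotes.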
There are three other predefined constants you can pass to the quoting parameter:

csv.QUOTE_MINIMAL - Tells the reader object that the CSV file has quotes around those entries which contain special characters such as the delimiter, the quotechar or any of the characters in the lineterminator. This is the default.
csv.QUOTE_NONNUMERIC - Tells the reader object that the CSV file has quotes around the non-numeric entries. When reading, all unquoted fields are converted to float.
csv.QUOTE_NONE - Tells the reader object that none of the entries have quotes around them.

Dialects in CSV module

Notice in Example 4 that we have passed multiple parameters (quoting and skipinitialspace) to the csv.reader() function.

This practice is acceptable when dealing with one or two files. But it makes the code repetitive and harder to read once we start working with multiple CSV files that share the same format.

As a solution to this, the csv module offers dialect as an optional parameter.

A dialect groups together many specific formatting patterns like delimiter, skipinitialspace, quoting and escapechar under a single dialect name.

It can then be passed as a parameter to multiple writer or reader instances.

Example 5: Read CSV files using dialect

Suppose we have a CSV file (office.csv) with the following content:

    "ID"| "Name"| "Email"
    "A878"| "Alfonso K. Hamby"| "[email protected]"
    "F854"| "Susanne Briard"| "[email protected]"
    "E833"| "Katja Mauer"| "[email protected]"

The CSV file has initial spaces, quotes around each entry, and uses a | delimiter.

Instead of passing three individual formatting patterns, let's look at how to use dialects to read this file.
    import csv

    csv.register_dialect('myDialect',
                         delimiter='|',
                         skipinitialspace=True,
                         quoting=csv.QUOTE_ALL)

    with open('office.csv', 'r') as csvfile:
        reader = csv.reader(csvfile, dialect='myDialect')
        for row in reader:
            print(row)

Output

    ['ID', 'Name', 'Email']
    ['A878', 'Alfonso K. Hamby', '[email protected]']
    ['F854', 'Susanne Briard', '[email protected]']
    ['E833', 'Katja Mauer', '[email protected]']

From this example, we can see that the csv.register_dialect() function is used to define a custom dialect. It has the following syntax:

    csv.register_dialect(name[, dialect[, **fmtparams]])

The custom dialect requires a name in the form of a string. The other specifications can be given either by passing a subclass of the Dialect class, or as individual formatting patterns as shown in the example.

While creating the reader object, we pass dialect='myDialect' to specify that the reader instance must use that particular dialect.

The advantage of using a dialect is that it makes the program more modular. Notice that we can reuse 'myDialect' to open other files without having to re-specify the CSV format.

Read CSV files with csv.DictReader()

Objects of the csv.DictReader() class can be used to read a CSV file as a dictionary.

Example 6: Python csv.DictReader()

Suppose we have a CSV file (people.csv) with the following entries (shown here as a table; the file itself is comma-separated):

    Name    Age  Profession
    Jack    23   Doctor
    Miller  22   Engineer

Let's see how csv.DictReader() can be used.

    import csv

    with open("people.csv", 'r') as file:
        csv_file = csv.DictReader(file)
        for row in csv_file:
            print(dict(row))

Output

    {'Name': 'Jack', 'Age': '23', 'Profession': 'Doctor'}
    {'Name': 'Miller', 'Age': '22', 'Profession': 'Engineer'}

As we can see, the entries of the first row are the dictionary keys, and the entries in the other rows are the dictionary values.

Here, csv_file is a csv.DictReader() object. The object can be iterated over using a for loop. In Python 3.6 and 3.7, csv.DictReader() returned an OrderedDict for each row. That's why we explicitly called dict() inside the for loop to convert each row to a plain dictionary:

    print(dict(row))

Note: Starting from Python 3.8, csv.DictReader() returns a dictionary for each row, and we do not need to use dict() explicitly.
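As a minimal sketch of the Python 3.8+ behaviour described in the note, the loop below uses each row directly and looks up values by column name. The in-memory sample and the variable names are hypothetical; io.StringIO stands in for people.csv.

```python
import csv
import io

# Hypothetical comma-separated sample standing in for people.csv.
sample = "Name,Age,Profession\nJack,23,Doctor\nMiller,22,Engineer\n"

# Each row maps the header names to that row's values; no dict() call needed.
rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    # Values are always strings; convert them yourself where needed.
    print(row["Name"], int(row["Age"]) + 1)
```

On Python 3.6 and 3.7 the same code still works, because an OrderedDict supports the same key lookups.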
The full syntax of the csv.DictReader() class is:

    csv.DictReader(file, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

To learn more about it in detail, visit: Python csv.DictReader() class

Using csv.Sniffer class

The Sniffer class is used to deduce the format of a CSV file.

The Sniffer class offers two methods:

sniff(sample, delimiters=None) - This method analyses a given sample of the CSV text and returns a Dialect subclass that contains all the deduced parameters. An optional delimiters parameter can be passed as a string containing the possible valid delimiter characters.
has_header(sample) - This method analyses the sample and returns True if the first row appears to be a row of column headers, and False otherwise.

Let's look at an example of using these methods:

Example 7: Using csv.Sniffer() to deduce the dialect of CSV files

Suppose we have a CSV file (office.csv) with the following content:

    "ID"| "Name"| "Email"
    A878| "Alfonso K. Hamby"| "[email protected]"
    F854| "Susanne Briard"| "[email protected]"
    E833| "Katja Mauer"| "[email protected]"

Let's look at how we can deduce the format of this file using the csv.Sniffer() class:

    import csv

    with open('office.csv', 'r') as csvfile:
        sample = csvfile.read(64)
        has_header = csv.Sniffer().has_header(sample)
        print(has_header)

        deduced_dialect = csv.Sniffer().sniff(sample)

    with open('office.csv', 'r') as csvfile:
        reader = csv.reader(csvfile, deduced_dialect)

        for row in reader:
            print(row)

Output

    True
    ['ID', 'Name', 'Email']
    ['A878', 'Alfonso K. Hamby', '[email protected]']
    ['F854', 'Susanne Briard', '[email protected]']
    ['E833', 'Katja Mauer', '[email protected]']

As you can see, we read only 64 characters of office.csv and stored them in the sample variable.

This sample was then passed as a parameter to the Sniffer().has_header() method. It deduced that the first row must contain column headers. Thus, it returned True, which was then printed out.

Similarly, sample was also passed to the Sniffer().sniff() method. It returned all the deduced parameters as a Dialect subclass, which was then stored in the deduced_dialect variable.

Later, we re-opened the CSV file and passed the deduced_dialect variable as a parameter to csv.reader().
It was able to correctly predict the delimiter, quoting and skipinitialspace parameters of the office.csv file without us explicitly mentioning them.

Note: The csv module can also be used for other file extensions (like .txt) as long as their contents are properly structured.

Recommended Reading: Write to CSV Files in Python
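As a variation on the Sniffer example, the file does not have to be opened twice: after reading the sample we can rewind with seek(0) and reuse the same file object. The sketch below is only an illustration under stated assumptions. The sample data, the 1024-character sample size and the candidate delimiter string are all hypothetical, and io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical pipe-delimited data with a space after each delimiter.
data = (
    '"ID"| "Name"| "Dept"\n'
    '"A878"| "Alfonso"| "Sales"\n'
    '"F854"| "Susanne"| "Support"\n'
)

csvfile = io.StringIO(data)  # stands in for open('office.csv', newline='')

# Read a sample, then restrict the guess to a few candidate delimiters.
sample = csvfile.read(1024)
dialect = csv.Sniffer().sniff(sample, delimiters="|,;\t")

# Rewind so the reader starts from the first row again.
csvfile.seek(0)
rows = list(csv.reader(csvfile, dialect))

print(dialect.delimiter)
for row in rows:
    print(row)
```

Sniffer works heuristically and raises csv.Error when it cannot determine a delimiter, so for files whose format is known in advance a registered dialect is the more predictable choice.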