Import CSV file as a Pandas DataFrame - Stack Overflow


Asked 9 years, 9 months ago. Modified 2 months ago. Viewed 232k times.

Question (score: 123)

How do I read the following CSV file into a Pandas DataFrame?

    Date,"price","factor_1","factor_2"
    2012-06-11,1600.20,1.255,1.548
    2012-06-12,1610.02,1.258,1.554
    2012-06-13,1618.07,1.249,1.552
    2012-06-14,1624.40,1.253,1.556
    2012-06-15,1626.15,1.258,1.552
    2012-06-16,1626.15,1.263,1.558
    2012-06-17,1626.15,1.264,1.572

Tags: python, pandas, csv, dataframe

asked Jan 16, 2013 at 18:50 by mazlor; edited Jul 29 at 7:43 by MateenUlhaq

6 Answers

Answer (score: 194)

pandas.read_csv to the rescue:

    import pandas as pd
    df = pd.read_csv("data.csv")
    print(df)

This outputs a pandas DataFrame:

             Date    price  factor_1  factor_2
    0  2012-06-11  1600.20     1.255     1.548
    1  2012-06-12  1610.02     1.258     1.554
    2  2012-06-13  1618.07     1.249     1.552
    3  2012-06-14  1624.40     1.253     1.556
    4  2012-06-15  1626.15     1.258     1.552
    5  2012-06-16  1626.15     1.263     1.558
    6  2012-06-17  1626.15     1.264     1.572

answered Jan 16, 2013 at 18:56 by root; edited Jul 29 at 7:26 by MateenUlhaq
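To try the approach above without creating data.csv first, here is a minimal sketch that feeds the question's sample rows to pd.read_csv through io.StringIO. The CSV_TEXT name and the parse_dates=["Date"] argument are additions for illustration and are not part of the answer above; parse_dates asks pandas to convert the Date column to datetimes instead of leaving it as strings.

```python
import io

import pandas as pd

# The sample rows from the question, held in memory instead of "data.csv".
# CSV_TEXT is an illustrative name, not something from the original post.
CSV_TEXT = '''Date,"price","factor_1","factor_2"
2012-06-11,1600.20,1.255,1.548
2012-06-12,1610.02,1.258,1.554
2012-06-13,1618.07,1.249,1.552
2012-06-14,1624.40,1.253,1.556
2012-06-15,1626.15,1.258,1.552
2012-06-16,1626.15,1.263,1.558
2012-06-17,1626.15,1.264,1.572
'''

# read_csv accepts file-like objects, so StringIO stands in for a real file.
# parse_dates is an optional extra: it converts "Date" to a datetime column.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"])

print(df.shape)   # number of rows and columns
print(df.dtypes)  # one dtype per column
```

With a real file, replacing io.StringIO(CSV_TEXT) by the file path should give the same DataFrame.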
Answer (score: 20)

To read a CSV file as a pandas DataFrame, you'll need to use pd.read_csv.

But this isn't where the story ends; data exists in many different formats and is stored in different ways, so you will often need to pass additional parameters to read_csv to ensure your data is read in properly.

Here's a table listing common scenarios encountered with CSV files along with the appropriate argument you will need to use. You will usually need all or some combination of the arguments below to read in your data.

    ┌───────────────────────────────────────────────────────┬───────────────────────┬─────────────────────────────────────────────────────┐
    │ pandas Implementation                                 │ Argument              │ Description                                         │
    ├───────────────────────────────────────────────────────┼───────────────────────┼─────────────────────────────────────────────────────┤
    │ pd.read_csv(..., sep=';')                             │ sep/delimiter         │ Read CSV with different separator¹                  │
    │ pd.read_csv(..., delim_whitespace=True)               │ delim_whitespace      │ Read CSV with tab/whitespace separator              │
    │ pd.read_csv(..., encoding='latin-1')                  │ encoding              │ Fix UnicodeDecodeError while reading²               │
    │ pd.read_csv(..., header=None, names=['x', 'y', 'z'])  │ header and names      │ Read CSV without headers³                           │
    │ pd.read_csv(..., index_col=[0])                       │ index_col             │ Specify which column to set as the index⁴           │
    │ pd.read_csv(..., usecols=['x', 'y'])                  │ usecols               │ Read subset of columns                              │
    │ pd.read_csv(..., thousands='.', decimal=',')          │ thousands and decimal │ Numeric data is in European format (e.g., 1.234,56) │
    └───────────────────────────────────────────────────────┴───────────────────────┴─────────────────────────────────────────────────────┘

Footnotes

1. By default, read_csv uses a C parser engine for performance. The C parser can only handle single character separators. If your CSV has a multi-character separator, you will need to modify your code to use the 'python' engine. You can also pass regular expressions:

       df = pd.read_csv(..., sep=r'\s*\|\s*', engine='python')

2. UnicodeDecodeError occurs when the data was stored in one encoding format but read in a different, incompatible one. The most common encoding schemes are 'utf-8' and 'latin-1'; your data is likely to fit into one of these.
3. header=None specifies that the first row in the CSV is a data row rather than a header row, and names=[...] allows you to specify a list of column names to assign to the DataFrame when it is created.

4. "Unnamed: 0" occurs when a DataFrame with an un-named index is saved to CSV and then re-read after. Instead of having to fix the issue while reading, you can also fix the issue when writing by using

       df.to_csv(..., index=False)

There are other arguments I've not mentioned here, but these are the ones you'll encounter most frequently.

answered May 21, 2019 at 5:33 by cs95; edited Jun 9 at 15:44 by Trenton McKinney

Answer (score: 11)

Here's an alternative to the pandas library using Python's built-in csv module.

    import csv
    from pprint import pprint

    # Python 3: open in text mode with newline='' (the original used 'rb')
    with open('foo.csv', newline='') as f:
        reader = csv.reader(f)
        headers = next(reader)  # first row holds the column names
        column = {h: [] for h in headers}
        for row in reader:
            for h, v in zip(headers, row):
                column[h].append(v)

    pprint(column)  # Pretty printer

will print

    {'Date': ['2012-06-11',
              '2012-06-12',
              '2012-06-13',
              '2012-06-14',
              '2012-06-15',
              '2012-06-16',
              '2012-06-17'],
     'factor_1': ['1.255', '1.258', '1.249', '1.253', '1.258', '1.263', '1.264'],
     'factor_2': ['1.548', '1.554', '1.552', '1.556', '1.552', '1.558', '1.572'],
     'price': ['1600.20',
               '1610.02',
               '1618.07',
               '1624.40',
               '1626.15',
               '1626.15',
               '1626.15']}

answered Jan 16, 2013 at 19:20 by siddharthlatest; edited Jan 16, 2013 at 19:34

Answer (score: 7)

    import pandas as pd
    df = pd.read_csv('/PathToFile.txt', sep=',')

This will import your .txt or .csv file into a DataFrame.

answered Sep 7, 2019 at 16:09 by Rishabh

Answer (score: -1)

You can use the csv module found in the Python standard library to manipulate CSV files.
Example:

    import csv

    # Python 3: text mode with newline='', and print() as a function
    with open('some.csv', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

answered Jan 16, 2013 at 19:03 by KurzedMetal

Comments:

-0. Coming from R, mazlor wouldn't be looking for the csv module as it is too low level. pandas provides the requested level of abstraction. – Steven Rumbalski, Jan 16, 2013 at 19:10

...in addition it does read the data into a useful Python object such as a numpy array... – Paul Hiemstra, Jan 16, 2013 at 19:16

Answer (score: -1)

Not quite as clean, but:

    import csv

    with open("value.txt", "r", newline="") as f:
        csv_reader = csv.reader(f)  # the original called reader(f), which is undefined
        num = ''
        for row in csv_reader:
            print(num, '\t'.join(row))  # the header row gets an empty number
            if num == '':
                num = 0
            num = num + 1

Not as compact, but it does the job:

      Date        price    factor_1  factor_2
    1 2012-06-11  1600.20  1.255     1.548
    2 2012-06-12  1610.02  1.258     1.554
    3 2012-06-13  1618.07  1.249     1.552
    4 2012-06-14  1624.40  1.253     1.556
    5 2012-06-15  1626.15  1.258     1.552
    6 2012-06-16  1626.15  1.263     1.558
    7 2012-06-17  1626.15  1.264     1.572

answered Jan 16, 2013 at 19:12 by Lee-Man

Comments:

This does not answer the OP's question as it does not read the csv data into a Python object. – Paul Hiemstra, Jan 16, 2013 at 19:15
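As a rough illustration of how several of the read_csv arguments from the table in cs95's answer combine, the sketch below parses a small made-up input that is semicolon-separated, has no header row, and uses European number formatting. The RAW text, its values, and the column names are hypothetical and chosen only for this example; io.StringIO again stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical input: ';' separator, no header row, '.' as thousands
# separator and ',' as decimal mark (e.g. 1.600,20 means 1600.20).
RAW = (
    "2012-06-11;1.600,20;1,255\n"
    "2012-06-12;1.610,02;1,258\n"
    "2012-06-13;1.618,07;1,249\n"
)

df = pd.read_csv(
    io.StringIO(RAW),
    sep=";",                              # non-default separator
    header=None,                          # first row is data, not column names
    names=["Date", "price", "factor_1"],  # so supply the names explicitly
    thousands=".",                        # European thousands separator
    decimal=",",                          # European decimal mark
    index_col=0,                          # use the first column as the index
)

print(df)
print(df.dtypes)
```

If the numeric columns come back as object (string) dtype on your own data, that usually suggests the thousands or decimal settings do not match the file.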


