How to read correctly Japanese characters from a file without (escape sequences) "\ufeff" and "\u3000" values in strings?
Asked 2 years, 3 months ago. Modified 2 years, 3 months ago. Viewed 518 times.

Question (score 0, tagged: python)

I have the following Japanese text, which I have to separate into strings by line ('\n'). The file is called 'sonnet.txt':

    さよなら夜の教室

I open the file and split the text into a list of lines:

    file = open('sonnet.txt', encoding="utf-8")
    jP = file.read().split('\n')

This is the result I get for the list at the Python prompt:

    >>> jP
    ['\ufeffさよなら\u3000夜の教室',]

Is there a way to get rid of the "\ufeff" and "\u3000" parts, not just for this stored value, but in general for other kinds of words? Thank you.

asked Jun 17, 2020 at 12:23, edited Jun 17, 2020 at 13:21
AbyssBrandon

Comments

You're only seeing the escape sequences because you're looking at a list, the str() of which is built out of the repr() of its elements (you couldn't tell exactly what was in the list otherwise). If you printed out an individual element from that list, or .join()-ed them into a single string, you would see only the Japanese text.
– jasonharper Jun 17, 2020 at 12:27

    jP = file.read().replace('\ufeff', '').replace('\u3000', '').split('\n')

– PeymanMajidi Jun 17, 2020 at 12:28

Just replace those characters with nothing before splitting.
– PeymanMajidi Jun 17, 2020 at 12:29

I will read about escape sequences. I needed the strings for comparison, so those escape sequences do get in the way. Thank you! @jasonharper
– AbyssBrandon Jun 17, 2020 at 12:33

Hopefully this method is valid for every word. Many thanks @PeymanMajidi.
– AbyssBrandon Jun 17, 2020 at 12:36

1 Answer

Answer (score 0)

Actually, I ran your code against a sonnet.txt text file that I created, but I didn't get the same result. My output was:

    ['さよなら夜の教室']

By the way, I suggest doing it like this:

    file = open('sonnet.txt', encoding="utf-8")
    jP = file.read().replace('\ufeff', '').replace('\u3000', '').split('\n')
    print(jP)

More info:
Eliminate the "\u3000" error
Unicode Character 'IDEOGRAPHIC SPACE' (U+3000)

answered Jun 17, 2020 at 12:44
PeymanMajidi
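As a possible alternative to chaining replace() calls, here is a minimal sketch. It assumes the file is UTF-8 with a byte order mark, which is what the leading '\ufeff' in the question suggests. Python's built-in "utf-8-sig" codec drops a leading BOM while decoding, and str.splitlines() avoids the empty trailing element that split('\n') leaves when the file ends with a newline. The helper name read_lines and its drop_ideographic_space parameter are made up for illustration, and the temporary file only stands in for sonnet.txt.

```python
import os
import tempfile


def read_lines(path, drop_ideographic_space=True):
    """Hypothetical helper: read a UTF-8 file (with or without BOM) into lines."""
    # "utf-8-sig" decodes UTF-8 and skips a leading BOM (U+FEFF) if present.
    with open(path, encoding="utf-8-sig") as f:
        lines = f.read().splitlines()
    if drop_ideographic_space:
        # Assumption: U+3000 (IDEOGRAPHIC SPACE) is not wanted for comparison,
        # as in the answer above. It is a real character in the text, not an
        # encoding error, so dropping it is a choice.
        lines = [line.replace("\u3000", "") for line in lines]
    return lines


# Stand-in for sonnet.txt: writing with "utf-8-sig" puts a BOM at the start.
path = os.path.join(tempfile.mkdtemp(), "sonnet.txt")
with open(path, "w", encoding="utf-8-sig") as f:
    f.write("さよなら\u3000夜の教室\n")

# Plain "utf-8" keeps the BOM as the first character of the first line.
with open(path, encoding="utf-8") as f:
    raw = f.read().split("\n")

jP = read_lines(path)

print(raw)    # list display uses repr(), so escape sequences are shown
print(jP[0])  # printing one element shows the text itself
```

Whether U+3000 should be removed, replaced with an ordinary space, or kept depends on what the strings are compared against; passing drop_ideographic_space=False keeps it.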