Batch convert encoding in files - Super User


Question (score 60)

Asked 13 years, 1 month ago. Modified 2 months ago. Viewed 103k times.

How can I batch-convert files in a directory for their encoding (e.g. ANSI → UTF-8) with a command or tool?

For single files, an editor helps, but how can I do the mass files job?

Tags: linux, windows, macos, batch, encoding

Asked Aug 21, 2009 at 9:12 by desolat; edited Aug 29, 2020 at 11:45 by Peter Mortensen

Comments:
related: stackoverflow.com/questions/724083/… – user4358, Aug 21, 2009 at 9:18
stackoverflow.com/a/24713621/242933 – ma11hew28, Jul 12, 2014 at 13:56

14 Answers

Answer (score 44)

Cygwin or GnuWin32 provide Unix tools like iconv and dos2unix (and unix2dos). Under Unix/Linux/Cygwin, you'll want to use "windows-1252" as the encoding instead of ANSI (see below). (Unless you know your system is using a codepage other than 1252 as its default codepage, in which case you'll need to tell iconv the right codepage to translate from.)

Convert from one (-f) to the other (-t) with:

    $ iconv -f windows-1252 -t utf-8 infile > outfile

Or in a find-all-and-conquer form:

    ## this will clobber the original files!
    $ find . -name '*.txt' -exec iconv --verbose -f windows-1252 -t utf-8 {} \> {} \;

Alternatively:

    ## this will clobber the original files!
    $ find . -name '*.txt' -exec iconv --verbose -f windows-1252 -t utf-8 -o {} {} \;

This question has been asked many times on this site, so here's some additional information about "ANSI". In an answer to a related question, CesarB mentions:

> There are several encodings which are called "ANSI" in Windows. In fact, ANSI is a misnomer. iconv has no way of guessing which you want.
>
> The ANSI encoding is the encoding used by the "A" functions in the Windows API (the "W" functions use UTF-16). Which encoding it corresponds to usually depends on your Windows system language. The most common is CP 1252 (also known as Windows-1252). So, when your editor says ANSI, it means "whatever the API functions use as the default ANSI encoding", which is the default non-Unicode encoding used in your system (and thus usually the one which is used for text files).

The page he links to gives this historical tidbit (quoted from a Microsoft PDF) on the origins of CP 1252 and ISO-8859-1, another oft-used encoding:

> [...] this comes from the fact that the Windows code page 1252 was originally based on an ANSI draft, which became ISO Standard 8859-1. However, in adding code points to the range reserved for control codes in the ISO standard, the Windows code page 1252 and subsequent Windows code pages originally based on the ISO 8859-x series deviated from ISO. To this day, it is not uncommon to have the development community, both within and outside of Microsoft, confuse the 8859-1 code page with Windows 1252, as well as see "ANSI" or "A" used to signify Windows code page support.

Answered Sep 30, 2009 at 18:17 by quack quixote; edited Aug 25, 2020 at 15:59 by phuclv

Comments:
(+7) Don't use the same file name as input and output! iconv seems to truncate files to 32,768 bytes if they exceed this size. Since it writes to the file it is reading from, it gets the job done if the file is small enough; otherwise it truncates the file without any warning...
– sylbru, Sep 11, 2014 at 7:32
(+3) FYI: this question is tagged with osx, and it doesn't look like either of the convert-all commands works on Yosemite or El Cap. The iconv version Apple ships doesn't support --verbose or -o, and the other syntax redirecting stdout doesn't work for some reason and just sends it to regular stdout. – Scott McIntyre, May 9, 2016 at 13:02

Answer (score 40)

With PowerShell you can do something like this:

    Get-Content IN.txt | Out-File -encoding ENC -filepath OUT.txt

where ENC is something like unicode, ascii, utf8, or utf32. Check out 'help out-file'.

To convert all the *.txt files in a directory to UTF-8, do something like this:

    foreach($i in ls -name DIR/*.txt) { \
        Get-Content DIR/$i | \
        Out-File -encoding utf8 -filepath DIR2/$i \
    }

which creates a converted version of each .txt file in DIR2.

To replace the files in all subdirectories, use:

    foreach($i in ls -recurse -filter "*.java") {
        $temp = Get-Content $i.fullname
        Out-File -filepath $i.fullname -inputobject $temp -encoding utf8 -force
    }

Answered Feb 26, 2010 at 6:31 by akira; edited Aug 29, 2020 at 12:22 by Peter Mortensen

Comments:
Converting from ANSI to UTF via your first proposal does erase the whole content of my text file... – Orsinus, May 9, 2015 at 7:24
@Acroneos: then you made a mistake: the in-file is IN.txt, the out-file is OUT.txt... this way it is impossible to overwrite the original. If you used the same file name for IN.txt and OUT.txt then you overwrite the file you are reading from, obviously. – akira, May 10, 2015 at 6:06
(+1) PowerShell will convert to UTF with BOM. find and iconv might be much easier.
– pparas, Aug 23, 2017 at 14:25
@pparas that's wrong. Commands related to text files like Out-File, Get-Content, Set-Content... all have an -Encoding parameter which allows utf8BOM or utf8NoBOM. iconv is much worse in this regard because it never supports UTF-8 with BOM – phuclv, Aug 21, 2020 at 23:17
(+1) \ is not an escape character in PowerShell, so putting it at the end of each line won't work – phuclv, Aug 25, 2020 at 16:00

Answer (score 7)

One-liner using find, with automatic detection

The character encoding of all matching text files gets detected automatically and all matching text files are converted to UTF-8 encoding:

    $ find . -type f -iname '*.txt' -exec sh -c 'iconv -f $(file -bi "$1" |sed -e "s/.*[ ]charset=//") -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;

To perform these steps, a subshell sh is used with -exec, running a one-liner with the -c flag, and passing the file name as the positional argument "$1" with -- {}. In between, the UTF-8 output file is temporarily named converted.

The find command is very useful for such file management automation.

Answered Aug 28, 2016 at 19:53 by Serge Stroobandt; edited Aug 30, 2020 at 19:48

Comments:
(+2) This works on Mac: find . -type f -iname "*.txt" -exec sh -c 'iconv -f windows-1252 -t utf-8 "$1" > converted && mv converted "$1"' -- "{}" \; (to convert from ANSI) – djjeck, Feb 17, 2021 at 20:05
(+2) My iconv command from git bash has no -o option so I use file redirection (>): find . -type f -name '*.txt' -exec sh -c 'iconv -f $(file -bi "$1" |sed -e "s/.*[ ]charset=//") -t utf-8 > /tmp/converted "$1" && mv /tmp/converted "$1"' -- {} \; Any advantage of using this syntax (--) as opposed to passing {} directly? find . -type f -name '*.txt' -exec sh -c 'iconv -f $(file -bi {} |sed -e "s/.*[ ]charset=//") -t utf-8 > /tmp/converted {} && mv /tmp/converted {}' \; – Sybuser, Dec 30, 2021 at 14:31

Answer (score 5)

The Wikipedia page on newlines has a section on conversion utilities.
This seems your best bet for a conversion using only tools Windows ships with:

    TYPE unix_file | FIND "" /V > dos_file

Answered Aug 21, 2009 at 9:21 by user4358

Comments:
(+1) this is newline conversion and has nothing to do with encoding conversion – phuclv, Aug 21, 2020 at 23:18

Answer (score 3)

UTFCast is a Unicode converter for Windows which supports batch mode. I'm using the paid version and am quite comfortable with it.

> UTFCast is a Unicode converter that lets you batch convert all text files to UTF encodings with just a click of your mouse. You can use it to convert a directory full of text files to UTF encodings including UTF-8, UTF-16 and UTF-32 to an output directory, while maintaining the directory structure of the original files. It doesn't even matter if your text file has a different extension, UTFCast can automatically detect text files and convert them.

Answered Dec 6, 2011 at 18:48 by Tiler; edited Dec 7, 2011 at 2:16 by Gaff

Comments:
Seems they cannot convert into the same folder, only into another destination folder. – Uwe Keim, Aug 9, 2016 at 19:49
The pro version allows in-place conversion. $20/3 months. rotatingscrew.com/utfcast-version-comparison.aspx – SherylHohman, Jan 30, 2019 at 19:19
Oh, the express (free) version is useless: it only "detects" UTF-8 WITH BOM!! (everyone can do that). Only the Pro version, which auto-renews every 3 months at $20 a pop, will auto-detect. The price is steep for a non-enterprise user. AND beware: if you try the basic version and your file is already UTF-8 (without BOM), this converter will detect it as ASCII, then (re-)"convert" it to UTF-8, which could result in gibberish. Be aware of this before trying the express version! They have a demo version for the Pro that produces no output, which is pointless IMHO because you can't verify results before buying! – SherylHohman, Jan 30, 2019 at 19:38

Answer (score 3)

There is a free and open-source batch encoding converter named CPConverter.
Answered Mar 28, 2020 at 19:15 by MSS; edited Aug 29, 2020 at 12:32 by Peter Mortensen

Comments:
This looked like exactly what I was hoping for (a GUI), though there is no drag-and-drop (you have to use the File menu) and no ANSI option (that I could find). Nothing I did showed up later as UTF-8 in Notepad++. If this had had just a little more development, this tool would have been nearly perfect. – John, Feb 24, 2021 at 11:23

Answer (score 2)

In my use case, I needed automatic input encoding detection, and there were a lot of files with Windows-1250 encoding, for which the command file -bi returns charset=unknown-8bit. This is not a valid parameter for iconv.

I have had the best results with enca.

Convert all files with the txt extension to UTF-8:

    find . -type f -iname '*.txt' -exec sh -c 'echo "$1" && enca "$1" -x utf-8' -- {} \;

Answered Sep 16, 2018 at 17:40 by Bedla; edited Aug 29, 2020 at 12:32 by Peter Mortensen

Comments:
Dang... I wish your answer wasn't that deeply buried at the bottom! enca is really useful, and way easier to use... when it works. Then again, other solutions fail, too... – Gwyneth Llewelyn, May 13, 2020 at 0:36

Answer (score 1)

Use this Python script: https://github.com/goerz/convert_encoding.py It works on any platform. Requires Python 2.7.

Answered Jul 1, 2018 at 10:18 by kinORnirvana

Comments:
What about Python 3? – Peter Mortensen, Aug 29, 2020 at 12:28

Answer (score 1)

    iconv -f original_charset -t utf-8 originalfile > newfile

Run the above command in a for loop.

Answered Jun 6, 2014 at 14:47 by Aneesh Garg; edited Aug 29, 2020 at 12:26 by Peter Mortensen

Comments:
(+1) I'm guessing original_charset is just a placeholder here, not actually the magical "detect my encoding" feature we all might hope for.
– mwfearnley, Feb 26, 2020 at 9:11
This has the advantage of not requiring the -o option, which is not available on some flavours of iconv (namely macOS, and I suspect FreeBSD as well). On the other hand, the for loop is non-trivial to create if you require it to traverse a deep tree structure of directories... – Gwyneth Llewelyn, May 13, 2020 at 0:38

Answer (score 1)

I made a tool for this finally: https://github.com/gonejack/transcode

Install:

    go get -u github.com/gonejack/transcode

Usage:

    > transcode source.txt
    > transcode -s gbk -t utf8 source.txt

Answered Nov 28, 2020 at 7:50 by igonejack

Answer (score 1)

--------------- Solution 1 -----------------------------

There are two flaws in @akira's answer.

- Your original file would be zeroed if any failure is encountered.
- If your path contains any non-ASCII character, it will throw this error:

    Set-Content : An object at the specified path ...txt does not exist, or has been filtered by the -Include or -Exclude parameter.

This is an improved version, adding -LiteralPath and if ($?):

    foreach($i in ls -name *.txt) {
        $relativePath = Resolve-Path -Relative -LiteralPath "$i"
        $temp = Get-Content -LiteralPath "$relativePath"
        # Only overwrite the file if reading it succeeded.
        if ($?) {
            # $temp is passed unquoted so the lines are not joined into one string.
            Out-File -LiteralPath "$i" -inputobject $temp -encoding utf8 -force
        }
    }

---------------- Solution 2 (Better) ----------------

PowerShell can convert only a limited set of encodings; gb2312 and Shift-JIS, for example, are not among them.

Notepad++ has a Python plugin that can do a better job than PowerShell and is relatively safer, since you can review what you are about to convert.

- Use Everything to find the files you want to convert. Download link: https://www.voidtools.com/
- Notepad++ Menu->Plugins->Python Script->New Scripts
- Copy one of the two scripts (see below), modify it to your needs, and save it to the default location.
- Drag all files from Everything into Notepad++
- Run the Python script with the Python plugin in Notepad++ from Menu->Plugins->Python Script->Scripts
- Done

There are two scripts; the bottom one can convert and save opened tabs into UTF-8.

Script 1

https://gist.github.com/bjverde/88bbc418e79f016a57539c2d5043c445

Script 2

    for filename, bufferID, index, view in notepad.getFiles():
        console.write(filename + "\r\n")
        notepad.activateIndex(view, index)
        # UTF8 (without BOM)
        notepad.menuCommand(MENUCOMMAND.FORMAT_CONV2_AS_UTF_8)
        notepad.save()
        notepad.reloadCurrentDocument()

Answered Dec 26, 2021 at 11:42 by MissingTwins; edited Jul 21 at 8:44

Comments:
The method 5 is quite good and nice, thanks for sharing. – ollydbg23, Jul 20 at 10:23

Answer (score 0)

ConvertZ is another Windows GUI tool for batch conversion.

- Convert file (plain text) or clipboard content among the following encodings: big5, gbk, hz, shift-jis, jis, euc-jp, unicode big-endian, unicode little-endian, and utf-8.
- Batch file conversion.
- Preview file content and converted result before actual conversion.
- Auto-update the charset in the meta tag, if specified in HTML docs.
- Auto-fix mis-mapped Big5/GBK characters after conversion.
- Change a file name's encoding among big5, gbk, shift-jis and unicode.
- Convert MP3's ID3 or APE among big5, gbk, shift-jis, unicode and utf-8 encoding.
- Convert Ogg tag between Traditional and Simplified Chinese in utf-8.

Alternative download link: https://www.softking.com.tw/download/1763/

Answered Aug 19, 2020 at 7:54 by phuclv

Answer (score 0)

There is dos2unix on Unix. There was another similar tool for Windows (another reference is here).

"How do I convert between Unix and Windows text files?" has some more tricks.

Answered Aug 21, 2009 at 9:14 by nik; edited Aug 29, 2020 at 12:18 by Peter Mortensen

Comments:
(+5) dos2unix is useful to convert line breaks, but the OP is looking for converting character encodings.
– Sony Santos, Apr 17, 2014 at 3:01

Answer (score -1)

I have created an online tool for that: https://encoding-converter.netlify.app

You can upload a bunch of files at once to be converted. Use it in this order:

- enter the encodings
- select/drag & drop your files

Upload will start automatically.

Answered Jul 15, 2021 at 14:25 by Zoldyck
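As a cross-platform alternative to the shell and PowerShell loops above, here is a minimal Python 3 sketch of the same idea. It is not taken from any of the answers: the function names convert_file and convert_tree are hypothetical, and the default source encoding windows-1252 is an assumption that has to match your files. It writes each converted copy to a temporary file and only then swaps it in, which avoids the truncation problem that comes from reading and writing the same file, and it skips files that already decode as UTF-8 so that running it twice does not double-encode them.

```python
import os
import tempfile
from pathlib import Path


def is_valid_utf8(data):
    """Return True if the bytes already decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_file(path, src_encoding="windows-1252"):
    """Re-encode one file to UTF-8 (no BOM). Return False if it was skipped."""
    path = Path(path)
    data = path.read_bytes()
    # Heuristic: files that are already valid UTF-8 (including plain ASCII)
    # are left untouched.
    if is_valid_utf8(data):
        return False
    # Strict decoding: a wrong source-encoding guess raises an error
    # here, before anything has been written.
    text = data.decode(src_encoding)
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(text.encode("utf-8"))
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise
    return True


def convert_tree(root, pattern="*.txt", src_encoding="windows-1252"):
    """Convert matching files under root recursively; return converted paths."""
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file() and convert_file(path, src_encoding):
            converted.append(path)
    return converted
```

A call such as convert_tree(".", "*.txt", "windows-1252") would walk the current directory. The UTF-8 check is only a heuristic (a legacy-encoded file can happen to be valid UTF-8), so it is still worth trying this on a copy of the data first.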


