PowerShell/PowerShell
Invoke-WebRequest and Invoke-RestMethod do not decode content in accordance with BOM/Content-Type #11547
Open · he852100 opened this issue on Jan 10, 2020 · 11 comments
Labels
Hacktoberfest: Potential candidate to participate in Hacktoberfest
Issue-Question: ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a
Up-for-Grabs: Up-for-grabs issues are not high priorities, and may be opportunities for external contributors
WG-Cmdlets-Utility: cmdlets in the Microsoft.PowerShell.Utility module
he852100 commented on Jan 10, 2020 (edited)
The response content is not recognized and comes out garbled.
Example
$url = 'https://storage.live.com/items/A78ACCAEBB24EDD7!37945?&authkey=!APfFKTYtceWCfG0'
$g = './xmltest'
$reg = 'pN|utf'
((irm $URL) -split "[`r`n]+") -match $reg
irm $URL -OutFile $g
(Get-Content $g) -match $reg
Expected
PS /sh> irm $URL

xml                            Folder
---                            ------
version="1.0" encoding="utf-8" Folder

PS /sh> (irm $URL).Folder.Items.Document

ItemType ResourceID             RelationshipName
-------- ----------             ----------------
Document A78ACCAEBB24EDD7!37948 测试.json
Results
PS /sh> (iwr $URL).Headers.'Content-Type'
text/xml
PS /sh> ((irm $URL) -split "[`r`n]+") -match $reg

æµè¯.json
BingClients
Reading the saved file seems to work without problems.
PS /s> (Get-Content ../aa/irm) -match 'pN|utf'
测试.json
BingClients
PS /sdcard/Documents/sh>
curl
PS /sdcard/Documents/sh> ((curl $URL) -split "[`r`n]+") -match $reg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2693  100  2693    0     0   2170      0  0:00:01  0:00:01 --:--:--  2170
测试.json
BingClients
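One possible way to sidestep the garbling, sketched here under assumptions, is to hand the raw bytes (for example from a file saved with -OutFile) to the XML parser and let it work out the encoding from the BOM and XML declaration itself. The XML below is a made-up stand-in for the real response, and the variable names are illustrative only.

```powershell
# Hypothetical stand-in for the response body: UTF-8 BOM + XML declaration + content.
$utf8    = [System.Text.Encoding]::UTF8
$name    = "$([char]0x6D4B)$([char]0x8BD5).json"   # the file name from the report
$xmlText = "<?xml version=`"1.0`" encoding=`"utf-8`"?>" +
           "<Folder><Items><Document><RelationshipName>$name</RelationshipName></Document></Items></Folder>"
[byte[]] $bytes = $utf8.GetPreamble() + $utf8.GetBytes($xmlText)

# XmlDocument.Load(Stream) detects the encoding from the byte stream itself,
# so no Content-Type header is needed to decode it.
$stream = [System.IO.MemoryStream]::new($bytes)
try {
    $doc = [System.Xml.XmlDocument]::new()
    $doc.Load($stream)
}
finally {
    $stream.Dispose()
}

$doc.Folder.Items.Document.RelationshipName
```

For a file on disk, `[System.IO.File]::ReadAllBytes($path)` would supply the bytes in place of the hand-built array.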
he852100 added the Issue-Question label on Jan 10, 2020
he852100 (Author) commented on Jan 10, 2020 (edited)
The problem is that live.com is not returning the encoding it's using in its headers. PowerShell obeys the standard by assuming ISO-8859-1, but unfortunately the site is using UTF-8.
Stack Overflow: "Powershell Invoke-WebRequest and character encoding"
I am trying to get information from the Spotify database through their Web API.
However, I'm facing issues with accented vowels (ä, ö, ü etc.)
Let's take Tiësto as an example.
Spotify's API Browser can ...
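The mismatch described above can be reproduced locally without any network access. The following is only an illustrative sketch: it encodes the file name from the report as UTF-8, decodes those bytes as ISO-8859-1 to get the garbled form, and then reverses the mistake. The variable names are made up for the example.

```powershell
# The file name from the report, built from code points so the script itself stays ASCII.
$original = "$([char]0x6D4B)$([char]0x8BD5).json"

$utf8   = [System.Text.Encoding]::UTF8
$latin1 = [System.Text.Encoding]::GetEncoding('ISO-8859-1')

# What the server sends: UTF-8 bytes (3 bytes per CJK character here).
$bytes = $utf8.GetBytes($original)

# What a client that assumes ISO-8859-1 makes of those bytes: one character per byte.
$garbled = $latin1.GetString($bytes)

# ISO-8859-1 maps every byte to exactly one character, so the wrong decoding loses
# nothing and can be undone by re-encoding and then decoding as UTF-8.
$repaired = $utf8.GetString($latin1.GetBytes($garbled))

'{0} -> {1} -> {2}' -f $original, $garbled, $repaired
```

The same round trip might serve as a stopgap for an already-garbled string, with the caveat that a BOM decoded as data would survive as a leading U+FEFF character.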
iSazonov (Collaborator) commented on Jan 11, 2020
@he852100 Please add info about your PowerShell version. Can you repro with the latest PowerShell Core build?
he852100 (Author) commented on Jan 12, 2020 (edited)
PSVersion                      7.0.0-daily.20200110
PSEdition                      Core
GitCommitId                    7.0.0-daily.20200110
OS                             Linux 3.10.0-1062.9.1.el7.x86_64…
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
sh> Invoke-WebRequest 'https://pscoretestdata.blob.core.windows.net/v7-0-0-daily-20200110/powershell-7.0.0-daily.20200110-linux-arm64.tar.gz' -O ~/powershell.tar.gz -Resume
StatusCode        : 416
StatusDescription : Requested Range Not Satisfiable
Content           : InvalidRange
                    The range specified is invalid for the current size of the resource.
                    RequestId:e8b88225-401e-0127-7cdc-c866f8000000
PS /root> $a.headers.GetEnumerator()

Key             Value
---             -----
Server          {Windows-Azure-Blob/1.0, Microsoft-HTTPAPI/2.0}
x-ms-request-id {322455bd-301e-008d-77e3-c8f642000000}
x-ms-version    {2014-02-14}
Date            {Sun, 12 Jan 2020 00:56:33 GMT}
Content-Length  {249}
Content-Type    {application/xml}
Content-Range   {bytes */46486387}
Windows.net
PowerShell obeys the standard by assuming ISO-8859-1, but unfortunately the site is using UTF-8.
he852100 (Author) commented on Jan 12, 2020
@iSazonov It can be determined that PowerShell does not recognize the UTF-8 BOM.
iSazonov (Collaborator) commented on Jan 12, 2020
@he852100 I guess it comes from .NET Core.
scriptingstudio commented on Jan 13, 2020
> @he852100 I guess it comes from .NET Core.
That comes from PS 5 and older. If a website is saying "I'm UTF-8", why does iwr return ASCII?
he852100 mentioned this issue on May 30, 2020: "Powershell 7.0 Seems have some encoding problem" #12107 (Closed)
iSazonov added the WG-Cmdlets-Utility label on May 31, 2020
mklement0 (Contributor) commented on May 31, 2020 (edited)
Note: I don't know what the intended behavior is, but here is what seems to be happening:
Because the response doesn't indicate a character encoding (charset) in its Content-Type header field (text/xml rather than text/xml; charset=utf-8), PowerShell defaults to ISO-8859-1, in accordance with RFC 2616 (obsolete since 2014).
Because it blindly assumes ISO-8859-1, the UTF-8 BOM is read as data, and the payload is therefore not recognized as XML, which falls back to a(n incorrectly decoded) string instead of returning an XmlDocument instance.
Note that the current RFC, RFC 7231, no longer mandates an overall default and instead defers to the default encoding of the given media type.
For XML, RFC 7303 mandates looking at the BOM first and, if there is none, at the charset attribute in the Content-Type header. If that isn't present either, respect the encoding specified in the XML declaration, and if there is none, default to UTF-8.
Given that HTML5 now also defaults to UTF-8 and given that RFC 2616 is obsolete, we should consider implementing the following logic in both Invoke-WebRequest and Invoke-RestMethod:
- respect a BOM, if present
- if there is no BOM, respect a charset attribute in Content-Type
- otherwise, for XML and HTML, respect the encoding specified in the XML declaration (e.g. <?xml version="1.0" encoding="utf-8"?>) / HTML <meta> element, if present (green-lit in "Web cmdlets should parse the attribute for the correct encoding if not in http header" #3267)
- if none of the above applies, default to UTF-8.
👍 2 (vexx32 and scriptingstudio reacted with thumbs up)
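The precedence proposed in the comment above could be sketched roughly as follows. This is not how the web cmdlets are implemented. `Get-ResponseEncoding` and its parameters are hypothetical names, only UTF-8 and UTF-16 BOMs are checked, and the HTML `<meta>` case is left out to keep the sketch short.

```powershell
function Get-ResponseEncoding {
    # Hypothetical helper: picks an encoding for a response body.
    param(
        [byte[]] $Bytes = @(),
        [string] $ContentType = ''
    )

    # 1. In-band: a BOM wins (UTF-8 and UTF-16 only in this sketch).
    if ($Bytes.Length -ge 3 -and $Bytes[0] -eq 0xEF -and $Bytes[1] -eq 0xBB -and $Bytes[2] -eq 0xBF) {
        return [System.Text.Encoding]::UTF8
    }
    if ($Bytes.Length -ge 2 -and $Bytes[0] -eq 0xFF -and $Bytes[1] -eq 0xFE) {
        return [System.Text.Encoding]::Unicode
    }
    if ($Bytes.Length -ge 2 -and $Bytes[0] -eq 0xFE -and $Bytes[1] -eq 0xFF) {
        return [System.Text.Encoding]::BigEndianUnicode
    }

    # 2. Out-of-band: the charset attribute of Content-Type.
    #    GetEncoding throws for names that .NET does not know.
    if ($ContentType -match 'charset\s*=\s*"?([^";\s]+)') {
        return [System.Text.Encoding]::GetEncoding($Matches[1])
    }

    # 3. The encoding named in an XML declaration at the start of the payload.
    $head = [System.Text.Encoding]::ASCII.GetString($Bytes, 0, [Math]::Min($Bytes.Length, 200))
    if ($head -match '^<\?xml[^>]*encoding\s*=\s*["'']([^"'']+)["'']') {
        return [System.Text.Encoding]::GetEncoding($Matches[1])
    }

    # 4. Nothing found: default to UTF-8.
    return [System.Text.Encoding]::UTF8
}

# Usage: a BOM-prefixed payload with a conflicting charset attribute.
$utf8    = [System.Text.Encoding]::UTF8
$withBom = [byte[]]($utf8.GetPreamble() + $utf8.GetBytes('<r/>'))
(Get-ResponseEncoding -Bytes $withBom -ContentType 'text/xml; charset=iso-8859-1').WebName
```

The returned encoding's `GetString` method would then decode the payload. A caller would still need to skip the BOM bytes themselves when one is present.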
iSazonov (Collaborator) commented on Jun 1, 2020 (edited)
Currently we have many workarounds. I guess they come from PS 5.0.
Now we could use the HttpContent.ReadAsStringAsync() method. It seems it already has the decoding logic:
https://github.com/dotnet/runtime/blob/bd6cbe3642f51d70839912a6a666e5de747ad581/src/libraries/System.Net.Http/src/System/Net/Http/HttpContent.cs#L182
iSazonov added the Up-for-Grabs label on Jun 1, 2020
mklement0 (Contributor) commented on Jun 1, 2020 (edited)
That's promising, @iSazonov, but it looks like the referenced method gives precedence to the charset attribute over the payload's BOM, correct?
This is the reverse of how XML data is supposed to be handled according to RFC 7303 (leaving the additional need to respect an encoding in the XML declaration aside), and, arguably, for all textual media types, according to section "5. Security Considerations" of RFC 6657:
> this document recommends the use of charset information that is more likely to be correct (for example, in-band over out-of-band).
A BOM is an instance of in-band information, whereas the charset header-field attribute is out-of-band information; therefore, the BOM should take precedence.
Therefore, the method you link to wouldn't solve the problem described in #12861, for instance.
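To illustrate BOM-over-charset precedence, here is a small sketch that uses the .NET StreamReader constructor with BOM detection switched on. The "declared" encoding stands in for a hypothetical, deliberately wrong charset attribute. The test string is the artist name from the quoted Stack Overflow question.

```powershell
$utf8 = [System.Text.Encoding]::UTF8
$text = "Ti$([char]0xEB)sto"   # "Tiësto"

# In-band information: the payload starts with a UTF-8 BOM.
[byte[]] $payload = $utf8.GetPreamble() + $utf8.GetBytes($text)

# Out-of-band information (assumed for this example): charset=iso-8859-1.
$declared = [System.Text.Encoding]::GetEncoding('ISO-8859-1')

# Trusting only the declared charset decodes the BOM and the multi-byte
# character as individual ISO-8859-1 characters.
$blind = $declared.GetString($payload)

# With detectEncodingFromByteOrderMarks = $true, the reader switches to the
# encoding indicated by the BOM and does not return the BOM as data.
$stream = [System.IO.MemoryStream]::new($payload)
$reader = [System.IO.StreamReader]::new($stream, $declared, $true)
try {
    $decoded = $reader.ReadToEnd()
}
finally {
    $reader.Dispose()
}

'blind:   {0} characters' -f $blind.Length
'decoded: {0} characters, equal to original: {1}' -f $decoded.Length, ($decoded -eq $text)
```

When no BOM is present, the reader falls back to the encoding passed in, which matches the "BOM first, then charset" order argued for above.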
mklement0 mentioned this issue on Jun 1, 2020: "Unrecognized encoding" #12861 (Closed)
iSazonov (Collaborator) commented on Jun 2, 2020
> the BOM should take precedence
It looks like a .NET bug. You could open a new issue in the .NET Runtime repo.
In general, I guess we could simplify the PowerShell code if we followed the .NET API.
👍 2 (mklement0 and rjmholt reacted with thumbs up)
rjmholt changed the title on Dec 11, 2020 from "[My bug report] irm, iwr get xml Problem" to "Invoke-WebRequest and Invoke-RestMethod do not decode content in accordance with BOM/Content-Type"
astelmachonak mentioned this issue on May 9, 2022: "Winfetch No Longer Working on Windows 11" kiedtl/winfetch#136 (Closed)
CarloToso mentioned this issue on Oct 4, 2022: "Web cmdlets set default charset encoding to UTF8" #18219 (Open, 22 tasks)
SteveL-MSFT (Member) commented on Oct 5, 2022
@PowerShell/wg-powershell-cmdlets reviewed this. We agree that the BOM should take precedence and, where it makes sense, the web cmdlets should have the same behavior as curl. We're explicitly not making any statement about implementation.
SteveL-MSFT added the Hacktoberfest label on Oct 5, 2022