
How do I escape ampersands in XML so they are rendered as entities in HTML?

I have some XML text that I wish to render in an HTML page. This text contains an ampersand, which I want to render in its entity representation: &amp;.

How do I escape this ampersand in the source XML? I tried &amp;, but this is decoded as the actual ampersand character (&), which is invalid in HTML.

So I want to escape it in such a way that it will be rendered as &amp; in the web page that uses the XML output.

The claim in the latest revision of this question that "the actual ampersand character (&) ... is invalid in HTML." is false. Indeed, even the accepted answer to the linked question provided as justification states "HTML5 allows you to leave it unescaped, but only when the data that follows does not look like a valid character reference".

CodeCaster

When your XML contains &amp;amp;, this will result in the text &amp;.

When you use that in HTML, that will be rendered as &.
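The two decoding steps can be sketched in Python as a rough illustration. The `<note>` element and its text are made up for the example, and `html.unescape` stands in for what an HTML parser does with character references.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical document: the ampersand is escaped twice in the XML source.
xml_source = "<note>Fish &amp;amp; chips</note>"

# Step 1: the XML parser removes one level of escaping.
text_from_xml = ET.fromstring(xml_source).text

# Step 2: an HTML parser removes the second level when the text is used as markup.
rendered = html.unescape(text_from_xml)

print(text_from_xml)  # Fish &amp; chips
print(rendered)       # Fish & chips
```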


How does that answer the question?
John Feminella

As per §2.4 of the XML 1.0 spec, you should be able to use &amp;.

I tried &amp; but this isn't allowed.

Are you sure it isn't a different issue? XML explicitly defines this as the way to escape ampersands.


This was perfectly reasonable when posted, but changes (or perhaps clarifications) to the question since have made it seem nonsensical as an answer. For one thing, the quoted passage is no longer present in the question.
Martin Schneider

The & character is itself an escape character in XML, so the solution is to follow it with the Unicode decimal code for &, which ensures that there are no XML parsing errors. That is, replace the character & with &#38;.
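As a quick check of this idea, the sketch below parses the named entity and the decimal and hexadecimal character references with Python's standard `xml.etree.ElementTree`. The `<v>` wrapper element is only an assumption for the example. All three forms decode to a plain ampersand, while a bare ampersand is rejected as not well-formed.

```python
import xml.etree.ElementTree as ET

# Named entity, decimal reference, hexadecimal reference.
candidates = ["&amp;", "&#38;", "&#x26;"]
decoded = [ET.fromstring(f"<v>{c}</v>").text for c in candidates]
print(decoded)  # ['&', '&', '&']

# A bare ampersand starts a reference that never ends, so parsing fails.
try:
    ET.fromstring("<v>R&D</v>")
    bare_ampersand_ok = True
except ET.ParseError:
    bare_ampersand_ok = False
print(bare_ampersand_ok)  # False
```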


I really prefer this solution! Should also be possible to use the hexadecimal notation: &#x26;
Logically, why would this work? Both strings have an ampersand, including the one with the character code on the end...
@sijpkes Because the ampersand here tells the parser that the following characters are used to represent another character, which in this case would be an ampersand. An ampersand isn't "illegal" in XML-- it just has a special meaning. It means "all of the characters after this until you hit a semicolon should be translated to something else". When you have an ampersand normally, without the descriptive characters and trailing semicolon, the parser gets confused.
This is the answer for me. Adding &#38; in the Location of my Response Header fixed it and is not showing the Ampersand on the Response Header. :D
Stack Overflow is so great. Here is an almost 11 year old post that solves my problem. And it has been viewed over 690,000 times.
Patrick Hofman

Use CDATA tags:

 <![CDATA[
   This is some text with ampersands & other funny characters. >>
 ]]>
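A minimal sketch of how a parser treats a CDATA section, using Python's standard library. The `<msg>` element and its content are invented for the example. The second part illustrates that CDATA markers are not available inside attribute values, where the ampersand still has to be escaped.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, & and < are taken literally; only "]]>" ends the section.
xml_source = "<msg><![CDATA[Tom & Jerry >> <b>]]></msg>"
text = ET.fromstring(xml_source).text
print(text)  # Tom & Jerry >> <b>

# CDATA is markup for element content only; in an attribute value the
# leading "<" makes the document not well-formed.
try:
    ET.fromstring('<msg title="<![CDATA[a & b]]>"/>')
    cdata_in_attribute_ok = True
except ET.ParseError:
    cdata_in_attribute_ok = False
print(cdata_in_attribute_ok)  # False
```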

This is a guess rather than an answer.
It might be a guess; it is correct though. CDATA markers allow raw ampersands to be used.
The original post never made clear where the & was to be used. CDATA tags cannot be used for attribute values, only for the actual content of the tags, hence the reason I included the '?'.
This is also great for carrying XML data as plain characters, and this answer is helpful in many other scenarios concerning XML rendering. For me, it really helped in Camel XML DSL: when I needed to set the body or some header with some XML data, the Camel XML parser ignored the CDATA contents, reading them as a stream of characters. Without this, the Camel engine throws invalid XML structure exceptions.
This is exactly the answer I needed, because in my case I'm not sure what characters might be coming in the XML, so I need to escape everything in that section.
Peter Mortensen

In my case I had to change it to %26.

I needed to escape & in a URL. So &amp; did not work out for me. The urlencode function changes & to %26. This way neither XML nor the browser URL mechanism complained about the URL.
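The answer does not say which environment its urlencode function comes from. As an assumed Python analogue, `urllib.parse.quote` percent-encodes an ampersand that is part of a value, while an ampersand that separates query parameters still needs XML escaping when the URL is placed in an XML document. The URL and parameter names below are hypothetical.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

value = "Tom & Jerry"

# An ampersand inside a parameter value is percent-encoded as %26.
url = "https://example.com/search?q=" + quote(value, safe="")
print(url)  # https://example.com/search?q=Tom%20%26%20Jerry

# An ampersand that separates parameters stays in the URL and must be
# escaped as &amp; once the URL is embedded in XML.
url_with_page = url + "&page=2"
xml_ready = escape(url_with_page)
print(xml_ready)  # https://example.com/search?q=Tom%20%26%20Jerry&amp;page=2
```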


Yes. Note though that the OP was about escaping in XML. Escaping in a URL is a different issue. The real fun begins when you have URLs in XML, or XML-fragments in URLs...
urlencode() in what environment? In PHP?
Community

&amp; is the way to represent an ampersand in most sections of an XML document.

If you want to have XML displayed within HTML, you need to first create properly encoded XML (which involves changing & to &amp;) and then use that to create properly encoded HTML (which involves again changing & to &amp;). That results in:

&amp;amp;
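The two encoding passes can be sketched with Python's standard library, assuming `xml.sax.saxutils.escape` for the XML step and `html.escape` for the HTML step. The sample text is made up.

```python
import html
from xml.sax.saxutils import escape

raw = "Fish & chips"

# Pass 1: make the text safe as XML character data.
xml_text = escape(raw)
print(xml_text)   # Fish &amp; chips

# Pass 2: make that XML source safe to display inside an HTML page.
html_text = html.escape(xml_text)
print(html_text)  # Fish &amp;amp; chips
```

A browser showing `html_text` displays `Fish &amp; chips`, which is the XML source as written.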

For a more thorough explanation of XML encoding, see:

What characters do I need to escape in XML documents?


Peter Mortensen

I have tried &amp, but it didn't work. Based on Wim ten Brink's answer I tried &amp;amp and it worked.

One of my fellow developers suggested that I use &#x26;, and that worked regardless of how many times it may be rendered.


What about the semicolons? Code formatting can be used to work around formatting problems here (but it is also possible without - using "ironic" formatting).
Isaac Truett

<xsl:text disable-output-escaping="yes">&amp;&nbsp;</xsl:text> will do the trick.


Peter Mortensen

Consider if your XML looks like below.

<Employees Id="1" Name="ABC">
  <Query>
    SELECT * FROM EMP WHERE ID=1 AND RES<>'GCF'
  </Query>
</Employees>

You cannot use <> directly, because the parser treats < as the start of a tag and reports an error. In that case, use &#60;&#62; (or the named entities &lt;&gt;) in its place.

<Employees Id="1" Name="ABC">
  <Query>
    SELECT * FROM EMP WHERE ID=1 AND RES &#60;&#62; 'GCF'
  </Query>
</Employees>
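Rather than replacing the characters by hand, the query text can be escaped programmatically. The sketch below is one possible approach in Python, reusing the element names from the example above. `xml.sax.saxutils.escape` emits the named entities `&lt;` and `&gt;`, which a parser decodes the same way as the numeric `&#60;` and `&#62;`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

query = "SELECT * FROM EMP WHERE ID=1 AND RES<>'GCF'"

# Escape &, < and > before placing the text inside the element.
doc = f'<Employees Id="1" Name="ABC"><Query>{escape(query)}</Query></Employees>'
parsed = ET.fromstring(doc).find("Query").text
print(parsed == query)  # True

# The numeric references from the answer decode to the same characters.
numeric = ET.fromstring("<Query>RES &#60;&#62; 'GCF'</Query>").text
print(numeric)  # RES <> 'GCF'
```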

14.1 How to use special characters in XML has all the codes.


I think this ground was well-covered in the 7 years prior to this answer being posted.