
What does the 'standalone' directive mean in XML?

xml

What does the 'standalone' directive mean in an XML document?


nullability

The standalone declaration is a way of telling the parser to ignore any markup declarations in the DTD. The DTD is thereafter used for validation only.

As an example, consider the humble <img> tag. If you look at the XHTML 1.0 DTD, you see a markup declaration telling the parser that <img> tags must be EMPTY and possess src and alt attributes. When a browser is going through an XHTML 1.0 document and finds an <img> tag, it should notice that the DTD requires src and alt attributes and add them if they are not present. It will also self-close the <img> tag since it is supposed to be EMPTY. This is what the XML specification means by "markup declarations can affect the content of the document." You can then use the standalone declaration to tell the parser to ignore these rules.

Whether or not your parser actually does this is another question, but a standards-compliant validating parser (like a browser) should.

Note that if you do not specify a DTD, then the standalone declaration "has no meaning," so there's no reason to use it unless you also specify a DTD.


The example needs further modification. Having "standalone='no'" would not normally close unclosed XML tags (this is a feature of SGML, but not of XML). Validation will fail. It won't provide values for attributes which are REQUIRED, either.
"The standalone declaration is a way of telling the parser to ignore any markup declarations in the DTD. " That's not correct. With standalone=yes, markup declarations are not ignored, instead they cause the document to be invalid XML. Would you mind if I edit that into the answer?
@sleske Please just make the change. The author can always revert it if he/she's not happy.
@Stephan: Thanks for the encouragement. Unfortunately, I am no longer current on the whole XML stuff, so I cannot confidently edit right now. Feel free to edit yourself if you have up-to-date knowledge :-).
Cong Ma

The standalone directive is an optional attribute on the XML declaration.

Valid values are yes and no, where no is the default value.
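To see what a parser actually reports for this attribute, here is a minimal sketch using Python's built-in expat binding (xml.parsers.expat). The helper name read_standalone and the sample documents are made up for illustration. expat passes the value to XmlDeclHandler as 1 (yes), 0 (no), or -1 (attribute omitted).

```python
from xml.parsers import expat


def read_standalone(document):
    """Return 'yes', 'no', or 'omitted' for the document's XML declaration.

    read_standalone is a hypothetical helper, not part of any library.
    """
    seen = []

    def on_xml_decl(version, encoding, standalone):
        # expat reports 1 for "yes", 0 for "no", -1 if the attribute is absent
        seen.append(standalone)

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(document, True)
    if not seen:
        return "omitted"  # the document has no XML declaration at all
    return {1: "yes", 0: "no", -1: "omitted"}[seen[0]]


print(read_standalone('<?xml version="1.0" standalone="yes"?><doc/>'))
print(read_standalone('<?xml version="1.0" standalone="no"?><doc/>'))
print(read_standalone('<?xml version="1.0"?><doc/>'))
```

Note that an omitted attribute is reported separately from an explicit "no", even though both are treated the same way.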

The attribute is only relevant when a DTD is used. (The attribute is irrelevant when using a schema instead of a DTD.)

standalone="yes" means that the XML processor must use the DTD for validation only. In that case it will not be used for:

default values for attributes

entity declarations

normalization

Note that standalone="yes" may add validity constraints if the document uses an external DTD. If the document contains things that would require modification of the XML, such as default values for attributes, and standalone="yes" is used, then the document is invalid.

standalone="yes" may help to optimize performance of document processing.

Source: The standalone pseudo-attribute is only relevant if a DTD is used


Using standalone="yes" causes additional validity constraints (i.e. may cause an XML document to be invalid). I edited this into the answer, hope that's ok.
@sleske Thanks for your contribution. I tried to simplify your edit while still stating your point clearly. Feel free to edit again if I got it wrong.
Stefan Gehrig

standalone describes whether the current XML document depends on external markup declarations.

W3C describes its purpose in "Extensible Markup Language (XML) 1.0 (Fifth Edition)":

2.9 Standalone Document Declaration


user657267

The intent of the standalone=yes declaration is to guarantee that the information inside the document can be faithfully retrieved based only on the internal DTD, i.e. the document can "stand alone" with no external references. Validating a standalone document ensures that non-validating processors will have all of the information available to correctly parse the document.

The standalone declaration serves no purpose if a document has no external DTD, and the internal DTD has no parameter entity references, as these documents are already implicitly standalone.

The following are the actual effects of using standalone=yes.

Forces processors to throw an error when parsing documents with an external DTD or parameter entity references, if the document contains references to entities not declared in the internal DTD (with the exception of replacement text of parameter entities as non-validating processors are not required to parse this); amp, lt, gt, apos, and quot are the only exceptions

When parsing a document not declared as standalone, a non-validating processor is free to stop parsing the internal DTD as soon as it encounters a parameter entity reference. Declaring a document as standalone forces non-validating processors to parse markup declarations in the internal DTD even after they ignore one or more parameter entity references.

Forces validating processors to throw an error if any of the following are found in the document, and their respective declarations are in the external DTD or in parameter entity replacement text:

attributes with default values, if they do not have their value explicitly provided

entity references (other than amp, lt, gt, apos, and quot)

attributes with tokenized types, if the value of the attribute would be modified by normalization

elements with element content, if any white space occurs in their content

A non-validating processor might consider retrieving the external DTD and expanding all parameter entity references for documents that are not standalone, even though it is under no obligation to do so, i.e. setting standalone=yes could theoretically improve performance for non-validating processors (spoiler alert: it probably won't make a difference).
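The first effect in the list above can be observed with a non-validating parser. This is only a sketch, using Python's built-in expat binding; the file name note.dtd, the entity name company, and the helper undeclared_entity_outcome are invented for illustration. The external DTD is never fetched here, since expat does not load it unless asked to. Other processors may behave differently.

```python
from xml.parsers import expat


def undeclared_entity_outcome(standalone):
    """Parse a document that references an entity it never declares.

    The document points at a hypothetical external DTD ("note.dtd"),
    which expat does not read in its default configuration.
    """
    document = (
        '<?xml version="1.0" standalone="%s"?>\n'
        '<!DOCTYPE note SYSTEM "note.dtd">\n'
        '<note>&company;</note>' % standalone
    )
    skipped = []
    parser = expat.ParserCreate()
    # Called for references to entities the parser has no declaration for
    parser.SkippedEntityHandler = lambda name, is_param: skipped.append(name)
    try:
        parser.Parse(document, True)
    except expat.ExpatError as exc:
        return "error: " + expat.ErrorString(exc.code)
    return "skipped: " + ", ".join(skipped)


print(undeclared_entity_outcome("no"))
print(undeclared_entity_outcome("yes"))
```

With standalone="no", the parser has to assume the entity might be declared in the DTD it did not read, so it skips the reference. With standalone="yes", the same reference is treated as a well-formedness error.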

The other answers here are either incomplete or incorrect. The main misconception is that:

The standalone declaration is a way of telling the parser to ignore any markup declarations in the DTD. The DTD is thereafter used for validation only.

standalone="yes" means that the XML processor must use the DTD for validation only.

Quite the opposite, declaring a document as standalone will actually force a non-validating processor to parse internal declarations it must normally ignore (i.e. those after an ignored parameter entity reference). Non-validating processors must still use the info in the internal DTD to provide default attribute values and normalize tokenized attributes, as this is independent of validation.
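A quick way to check this with a non-validating parser is the sketch below, again using Python's built-in expat binding. The element, attribute, and entity names (note, priority, company) and the helper parse_note are invented for illustration.

```python
from xml.parsers import expat

# Hypothetical document: declared standalone, with an internal DTD only
DOC = '''<?xml version="1.0" standalone="yes"?>
<!DOCTYPE note [
  <!ATTLIST note priority CDATA "normal">
  <!ENTITY company "Example Corp">
]>
<note>&company;</note>'''


def parse_note(document):
    """Return (attributes, text) as reported by expat for the root element."""
    attributes = {}
    chunks = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: attributes.update(attrs)
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return attributes, "".join(chunks)


attributes, text = parse_note(DOC)
print(attributes)
print(text)
```

Even with standalone="yes", the default value of priority and the replacement text of company come from the internal DTD, so the declarations are clearly not being ignored.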


The best answer
Chris Diver

Markup declarations can affect the content of the document, as passed from an XML processor to an application; examples are attribute defaults and entity declarations. The standalone document declaration, which may appear as a component of the XML declaration, signals whether or not there are such declarations which appear external to the document entity or in parameter entities. [Definition: An external markup declaration is defined as a markup declaration occurring in the external subset or in a parameter entity (external or internal, the latter being included because non-validating processors are not required to read them).]

http://www.w3.org/TR/xml/#sec-rmd


I downvoted since this should be put in a way that is understandable to normal human beings with a medium IQ.