Well-Formed versus Valid
There are two levels of "goodness" for an XML document. Well-formedness refers to mandatory syntactic constraints. Validity refers to optional structural and semantic constraints. There's a tendency to use the word valid in its common English usage to describe any correct document. However, in XML it has a much more specific meaning. Documents can be correct and processable yet still not valid.
Well-formedness is the minimum requirement for an XML document. It comprises various syntactic constraints, such as the rules that every start-tag must have a matching end-tag and that the document must have exactly one root element. If a document is not well formed, it is not an XML document. Parsers that encounter a malformed document are required to report the error and stop parsing. They may not attempt to guess what the document author intended. They may not fix the error and continue. They have to drop the document on the floor.
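As a rough illustration of this all-or-nothing behavior, the following sketch uses Python's standard xml.etree.ElementTree module, whose underlying parser raises ParseError on the first well-formedness error instead of trying to recover. The helper name is_well_formed and the sample documents are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it.

    Hypothetical helper for illustration only. The parser does not
    repair or guess: the first well-formedness error ends the parse.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Sample documents invented for this sketch.
good = "<order><item>widget</item></order>"
missing_end_tag = "<order><item>widget</order>"  # <item> is never closed
two_roots = "<order/><order/>"                   # more than one root element

for label, doc in [("good", good),
                   ("missing end-tag", missing_end_tag),
                   ("two roots", two_roots)]:
    print(label, "->", is_well_formed(doc))
```

Only the first document is accepted. For the other two, the caller gets an error and no partial tree to work with.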
Validity is a stronger constraint than well-formedness, but it's not required in order to process XML documents. Validity determines which elements and attributes are allowed to appear where. It indicates whether a document adheres to the constraints listed in the document type definition (DTD) and the document type declaration (DOCTYPE). Even if a document does not adhere to these constraints, it may still be usefully processed in some cases. The decision of whether and how to reject invalid documents is made by the client application, not by the parser.
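To make the division of labor concrete, here is a minimal sketch in Python. It assumes a non-validating parser (the standard xml.etree.ElementTree module, which reads the document type declaration but does not enforce it) and a document that is well formed but invalid: its DTD declares that greeting contains only text, yet the instance contains a b child element. The functions violates_declared_model and handle, and the strict flag, are hypothetical stand-ins for an application's own policy. They are not real validation against the DTD.

```python
import xml.etree.ElementTree as ET

# Well formed, but invalid: the DTD allows only text inside <greeting>.
DOCUMENT = """<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<greeting>Hello, <b>world</b></greeting>"""

# A non-validating parser accepts the document without complaint.
root = ET.fromstring(DOCUMENT)


def violates_declared_model(element):
    """Hand-written stand-in for the (#PCDATA) content model.

    Assumption for this sketch: any child element counts as a violation.
    """
    return len(element) > 0


def handle(element, strict):
    """Application-level policy: reject invalid input only when strict."""
    if strict and violates_declared_model(element):
        return None
    return "".join(element.itertext())


print(handle(root, strict=False))  # lenient client still extracts the text
print(handle(root, strict=True))   # strict client rejects the document
```

The same parsed document is usable by the lenient client and rejected by the strict one, which is the sense in which the decision belongs to the application.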
The word valid is also sometimes used to refer to validity with respect to a schema rather than a DTD. In cases where this seems likely to be confusing, particularly where one is likely to want to validate a document against a DTD and against some other schema, the term schema-valid is used. As with DTD validity, whether and how to handle a schema-invalid document is a decision for the client application. A schema-validating parser will inform the client application that a document is invalid but will continue to parse it. The client application gets to decide whether or not to accept the document.