XML DTD

Define the structure and rules for XML documents

📋 What is DTD?

DTD (Document Type Definition) defines the legal structure of an XML document. It specifies which elements and attributes are allowed, their order, and how they can be nested together.


<!DOCTYPE note [
    <!ELEMENT note (message)>
    <!ELEMENT message (#PCDATA)>
]>
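
As a small illustration, the declaration above could be attached to a document like the one below. The message text is invented for the example. A validating parser should accept it because the root element matches the DOCTYPE name and contains exactly one message element.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
    <!ELEMENT note (message)>
    <!ELEMENT message (#PCDATA)>
]>
<!-- Valid: the root is <note> and it holds exactly one <message> -->
<note>
    <message>Hello, world!</message>
</note>
```

Adding a second message element, or any element the DTD does not declare, would make the document invalid even though it stays well-formed.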

Key DTD Concepts

📦

Elements

Define allowed XML elements

<!ELEMENT name (#PCDATA)>
🏷️

Attributes

Define element attributes

<!ATTLIST book id CDATA #REQUIRED>
🔢

Entities

Define reusable content

<!ENTITY copy "&#169;">
📝

Notation

Declare the format of non-XML data

<!NOTATION jpg SYSTEM "image/jpeg">
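
Notations are the least familiar of the four concepts, so here is a minimal sketch of how one is typically used together with an unparsed entity. The element names, the file name cover.jpg, and the "image/jpeg" identifier are assumptions made up for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE catalog [
    <!-- Notation: names the format of non-XML data (identifier is an assumption) -->
    <!NOTATION jpg SYSTEM "image/jpeg">

    <!-- Unparsed entity: points at a binary file and tags it with the notation -->
    <!ENTITY cover SYSTEM "cover.jpg" NDATA jpg>

    <!ELEMENT catalog (image)>
    <!ELEMENT image EMPTY>
    <!-- An ENTITY-typed attribute must name a declared unparsed entity -->
    <!ATTLIST image src ENTITY #REQUIRED>
]>
<catalog>
    <image src="cover"/>
</catalog>
```

The parser does not read cover.jpg itself. It only passes the entity and notation names to the application, which decides how to handle the data.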

🔹 Internal DTD

Internal DTD is declared inside the XML document within the DOCTYPE declaration. It's useful for simple documents where the structure definition doesn't need to be shared across multiple files. The DTD rules are embedded directly in the document.

<?xml version="1.0"?>
<!DOCTYPE note [
    <!ELEMENT note (to, from, heading, body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
]>

<note>
    <to>Alice</to>
    <from>Bob</from>
    <heading>Reminder</heading>
    <body>Don't forget the meeting!</body>
</note>

Explanation:

<!DOCTYPE note [...]> - Document type declaration; note names the root element

<!ELEMENT> - Defines element structure

#PCDATA - Parsed character data (text)

🔹 External DTD

External DTD is stored in a separate file that can be referenced by multiple XML documents. This promotes reusability and makes it easier to maintain consistent structure across many documents. Use SYSTEM followed by a URI (a local path or a remote address) to point at the DTD, or PUBLIC followed by a public identifier and a URI for published, standard DTDs.

DTD File (note.dtd):

<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

XML Document:

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">

<note>
    <to>Alice</to>
    <from>Bob</from>
    <heading>Reminder</heading>
    <body>Don't forget the meeting!</body>
</note>
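
The different ways of referencing a DTD can be sketched side by side. Each DOCTYPE below is an alternative, since a document carries only one. The example.com address and the sender entity are hypothetical; the XHTML identifiers are the ones published by the W3C.

```xml
<!-- SYSTEM: the quoted string is a URI, relative or absolute -->
<!DOCTYPE note SYSTEM "note.dtd">
<!DOCTYPE note SYSTEM "https://example.com/dtd/note.dtd">

<!-- PUBLIC: a public identifier followed by a system URI used as fallback -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- External DTD combined with an internal subset that adds a local entity -->
<!DOCTYPE note SYSTEM "note.dtd" [
    <!ENTITY sender "Bob">
]>
```

When both are present, the internal subset is read first, so its entity and attribute-list declarations take precedence over matching ones in the external file.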

🔹 Element Declarations

Element declarations define what content an element can contain. You can specify text content, child elements, empty elements, or any content. Operators like +, *, and ? control how many times elements can appear.

<!-- Text content only -->
<!ELEMENT name (#PCDATA)>

<!-- Child elements (sequence) -->
<!ELEMENT book (title, author, price)>

<!-- Choice (one of) -->
<!ELEMENT contact (email | phone)>

<!-- Empty element -->
<!ELEMENT br EMPTY>

<!-- Any content -->
<!ELEMENT div ANY>

<!-- Occurrence indicators (alternatives: a DTD may declare each element only once) -->
<!ELEMENT library (book+)>        <!-- One or more -->
<!ELEMENT library (book*)>        <!-- Zero or more -->
<!ELEMENT library (book?)>        <!-- Zero or one -->
<!ELEMENT library (book)>         <!-- Exactly one -->
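
Two content models that come up often are mixed content and nested groups. The following sketch uses invented element names (para, em, strong, chapter, list, item) to show how they might be declared.

```xml
<!-- Mixed content: text interleaved with child elements.
     #PCDATA must come first and the group must end with * -->
<!ELEMENT para (#PCDATA | em | strong)*>
<!ELEMENT em (#PCDATA)>
<!ELEMENT strong (#PCDATA)>

<!-- Nested groups: a title, then one or more paragraphs or lists in any mix -->
<!ELEMENT chapter (title, (para | list)+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT list (item+)>
<!ELEMENT item (#PCDATA)>

<!-- Operators apply per child: one title, any number of authors, optional price -->
<!ELEMENT book (title, author*, price?)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>
```

With mixed content a DTD can restrict which child elements appear, but not their order or how many times they occur.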

🔹 Attribute Declarations

Attributes provide additional information about elements. DTD attribute declarations specify the attribute name, data type, and whether it's required or optional. You can also set default values and restrict values to specific choices.

<!-- Basic attribute -->
<!ATTLIST book id CDATA #REQUIRED>

<!-- Multiple attributes -->
<!ATTLIST book
    id CDATA #REQUIRED
    category CDATA #IMPLIED
    lang CDATA "en">

<!-- Attribute types: CDATA (character data), ID (unique identifier),
     IDREF (reference to an ID), enumerated (one of the listed values) -->
<!ATTLIST element
    text CDATA #REQUIRED
    id ID #REQUIRED
    ref IDREF #IMPLIED
    type (fiction|non-fiction) "fiction">

<!-- Default values: "yes" is used when available is omitted;
     status is fixed, so it can only ever be "active" -->
<!ATTLIST book
    available (yes|no) "yes"
    status CDATA #FIXED "active">
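
To see these attribute types applied to real data, here is a small self-contained sketch. The team and member elements and all attribute names are made up for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE team [
    <!ELEMENT team (member+)>
    <!ELEMENT member (#PCDATA)>
    <!ATTLIST member
        id      ID            #REQUIRED
        manager IDREF         #IMPLIED
        role    (dev|qa|ops)  "dev"
        tags    NMTOKENS      #IMPLIED>
]>
<team>
    <!-- id values must be unique and must be valid XML names (no leading digit) -->
    <member id="m1" role="ops" tags="oncall lead">Alice</member>
    <!-- manager must match some id in the document; role falls back to "dev" -->
    <member id="m2" manager="m1">Bob</member>
</team>
```

A validating parser would reject a second member with id="m1", a manager value that matches no id, or a role outside the three listed choices.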

🔹 Entities in DTD

Entities are shortcuts for frequently used text or special characters. They help avoid repetition and make documents easier to maintain. Entities can be internal (defined in DTD) or external (referencing external files).

<!-- Internal entities -->
<!ENTITY company "TechCorp Inc.">
<!ENTITY copyright "&#169; 2025">
<!ENTITY email "[email protected]">

<!-- Using entities in XML -->
<document>
    <name>&company;</name>
    <footer>&copyright; &company;</footer>
    <contact>&email;</contact>
</document>

<!-- External entity -->
<!ENTITY terms SYSTEM "terms.xml">

<!-- Parameter entity (for DTD reuse; inside a declaration it can only be referenced from an external DTD) -->
<!ENTITY % common "id CDATA #REQUIRED">
<!ATTLIST book %common;>
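
Putting the declarations and the document together makes it easier to see where entities get expanded. This is a minimal sketch that reuses the company and copyright names from above; the element declarations and the footer wording are assumptions added for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE document [
    <!ELEMENT document (name, footer)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT footer (#PCDATA)>

    <!ENTITY company "TechCorp Inc.">
    <!-- An entity value may reference another entity; it is expanded on use -->
    <!ENTITY copyright "&#169; 2025 &company;">
]>
<document>
    <name>&company;</name>
    <!-- The five predefined entities need no declaration: &lt; &gt; &amp; &apos; &quot; -->
    <footer>&copyright; &lt;all rights reserved&gt;</footer>
</document>
```

After parsing, the footer text reads "© 2025 TechCorp Inc. <all rights reserved>". Changing the company entity in one place updates every reference to it.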

🔹 Complete DTD Example

This comprehensive example shows a complete DTD for a library catalog system. It demonstrates elements, attributes, entities, and various content models working together to define a robust document structure.

<!-- library.dtd -->
<!ENTITY % bookType "(fiction|non-fiction|reference)">

<!ELEMENT library (book+)>
<!ATTLIST library
    name CDATA #REQUIRED
    location CDATA #IMPLIED>

<!ELEMENT book (title, author+, isbn, price?, description?)>
<!ATTLIST book
    id ID #REQUIRED
    category %bookType; "fiction"
    available (yes|no) "yes">

<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST price currency CDATA "USD">
<!ELEMENT description (#PCDATA)>

Valid XML Document:

<?xml version="1.0"?>
<!DOCTYPE library SYSTEM "library.dtd">
<library name="City Library" location="Downtown">
    <book id="b001" category="fiction" available="yes">
        <title>The Great Adventure</title>
        <author>John Smith</author>
        <isbn>978-0-123456-78-9</isbn>
        <price currency="USD">29.99</price>
        <description>An exciting tale of adventure.</description>
    </book>
</library>
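
For contrast, here is a sketch of a document that is well-formed but should fail validation against library.dtd. Each comment marks one rule it breaks. The values are invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE library SYSTEM "library.dtd">
<!-- Invalid: <library> is missing its required name attribute -->
<library location="Downtown">
    <!-- Invalid: "001" is not a legal ID (cannot start with a digit);
         "poetry" is not one of fiction, non-fiction, reference -->
    <book id="001" category="poetry">
        <title>The Great Adventure</title>
        <!-- Invalid: at least one <author> is required before <isbn> -->
        <isbn>978-0-123456-78-9</isbn>
        <!-- Invalid: <description> appears before <price>, breaking the declared order -->
        <description>An exciting tale of adventure.</description>
        <price>29.99</price>
    </book>
</library>
```

A validating parser reports errors like these, while a non-validating parser accepts the file because the markup itself is well-formed. One common way to check, assuming libxml2 is installed and the file is saved as library.xml, is to run xmllint --valid --noout library.xml.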

🧠 Test Your Knowledge

What does #PCDATA mean in DTD?