Abstract: This standard defines the format and content of the electronic file set that comprises a digital talking book (DTB). It uses established and new specifications to delineate the structure of DTBs whose content can range from XML text only, to text with corresponding spoken audio, to audio with little or no text. DTBs are designed to make print material accessible and navigable for blind or otherwise print-disabled persons.
Copyright © 2001 by National Information Standards Organization
(This foreword is not a part of the American National Standard for Digital Talking Books... . It is included for information only.)
This standard presents the file specifications for digital talking books (DTBs) for blind, visually impaired, physically handicapped, learning-disabled, or otherwise print-disabled readers. For many years, "talking books" have been made available to print-disabled readers on analog media such as phonograph records and audiocassettes. These media serve their users well in providing human-speech recordings of a wide array of print material in increasingly robust and cost-effective formats. However, analog media are limited in several respects when compared to a print book. First, they are by their nature linear presentations, which leaves much to be desired when reading reference works, textbooks, magazines, and other materials which are often accessed randomly. Digital media offer readers the ability to move around a book or magazine as freely as (and more efficiently than) a sighted reader flips through a print book. Second, analog recordings do not allow users to interact with the book, placing bookmarks, highlighting material, and so forth. A DTB offers this capability, storing the bookmarks and highlights separate from, but associated with, the DTB itself. Third, talking book users have long complained that they do not have access to the spelling of the words they hear. As will be explained below, some DTBs will include a file containing the full text of the work, synchronized with the audio presentation, thereby allowing readers to locate specific words and hear them spelled. Finally, analog audio offers readers only one version of the document. If, for example, a book contains footnotes, they are either read where referenced, which burdens the casual reader with unwanted interruptions, or grouped at a location out of the flow of the text, making them difficult for interested readers to access. A DTB allows the user to easily skip over or read footnotes. 
The Digital Talking Book offers the print-disabled user a significantly enhanced reading experience -- one that is much closer to that of the sighted reader using a print book. This standard describes the various files that make up a DTB and specifies how each must be formatted.
The DTB goes far beyond the limits imposed on analog audio books because it can include not just the audio rendition of the work, but the full textual content and images as well. Because the textual content file is synchronized with the audio file, a DTB offers multiple sensory inputs to readers, a great benefit to learning-disabled readers, for example. Some visually impaired readers may choose to listen to most of the book, but find that inspecting the images provides information not available in the narrative flow. Others may opt to skip the audio presentation altogether and instead view the text file via screen-enlarging software. Braille readers may prefer to read some or all of the document via a refreshable Braille display device connected to their DTB player and accessing the textual content file.
Digital Talking Books are not tied to a single distribution medium. CD-ROMs will be used first, but DTBs will be portable to any digital distribution medium capable of handling the large files associated with digital audio recordings. Regardless of how a DTB is distributed, however, it will normally be in the context of a digital rights management system.
The initiative behind this document grew from a desire to standardize DTB file structures, in the hope that it might prevent a recurrence of the multiple formats currently used for talking books throughout the world. This document benefited greatly from the work of the DAISY Consortium, whose members had broken much of the ground covered in this standard and who contributed enormously to the solution of the many problems encountered.
To be added by NISO.
To be added by NISO.
Standards Committee AQ on Digital Talking Books had the following members at the time this standard was approved:
Standards Committee AQ gratefully acknowledges the contributions made by the DAISY Consortium (www.daisy.org) to this work. The Consortium created a series of open international specifications (DAISY 2.0 ©1998, DAISY 2.01 ©1999, and DAISY 2.02 ©2001) that formed the foundation on which this standard is built. DAISY representatives have served on Committee AQ since its inception, and knowledge gained in their work on DAISY projects greatly informed the complex discussions and decisions leading to the creation of this document. In addition, they hosted several list-servs on which many issues critical to DTB work in general, and to this standard specifically, were discussed and resolved. It is no exaggeration to state that without their groundbreaking efforts and their ongoing contributions to Committee work, this standard would not exist in anything like its current level of sophistication.
In addition, the Committee wishes to thank the following individuals for their substantial assistance to the process of creating the standard: Robert Berkovitz, Sensimetrics Corporation; Harvey Bingham; Mike Brown; John Churchill, Recording for the Blind and Dyslexic; Manon Gaudet, VisuAide, Inc.; Al Gilman; Markus Gylling, Swedish Library of Talking Books and Braille; Steve Jacobs, NCR Corporation; Lynn Leith, Canadian National Institute for the Blind; Tatsu Nishizawa, Plextor Corporation; Dave Pawson, Royal National Institute for the Blind; James Pritchett, Recording for the Blind and Dyslexic; Dr. Gregg Vanderheiden, TRACE Research and Development Center, University of Wisconsin; Mr. Paul Vassallo, National Institute of Standards & Technology; with special thanks to members of the DAISY Consortium's Specifications and Guidelines Work Team and DTD Work Team. Thanks also to these members of the W3C Synchronized Multimedia (SYMM) Working Group: Dick Bulterman, Oratrix; Wo Chang, NIST; Lloyd Rutledge, CWI; Patrick Schmitz, Microsoft.
This standard establishes file specifications for digital talking books (DTBs) for blind, visually impaired, physically handicapped, learning-disabled, or otherwise print-disabled readers. Its purpose is to ensure interoperability across service organizations and vendors providing content and playback systems to the target population.
This standard provides specifications applicable to all aspects of digital talking book production and rendering, including authoring tools for DTBs, hardware- or software-based playback devices, and compliance-testing software.
The following abbreviations, acronyms, phrases, and terms are used in this standard as defined below. In the following definitions and throughout the standard, bracketed items correspond to entries in section 17, "References to Other Specifications/Documents," where the full URL is provided for each reference.
This standard is based primarily on a variety of widely used standards and specifications, including several from the World Wide Web Consortium and the Open eBook Forum. Wherever applicable and appropriate standards or specifications existed, they were used. The use of these specifications and technologies is intended to promote a fast and consistent adoption of this standard for the target population, while encouraging its extension into mainstream use.
Digital Talking Book files, streams, transformation processes and players have been designed to present their content to people with a wide range of abilities and disabilities. They are designed to allow presentation in forms other than conventional print, due to the inaccessibility of printed documents to these users. To the greatest extent possible, files, streams, transformation processes and players should make information available in as many presentation modes as practical, including human-narrated audio, Braille, synthesized speech and, for players with visual display, large print with user-specifiable size and text re-wrapping, as well as text and audio synchronization and other enhancements for persons with learning disabilities. The controls of players should be easily used by people with a wide range of manual dexterity. Further, tools for producing DTBs should be designed from the outset to be usable by people who are blind, visually impaired, or have other reading disabilities.
During the development of this standard, an advisory document, DTB Playback Device Features List, was created. Although it is not a normative part of this standard, player developers may find useful accessibility concepts embodied in it.
In addition to the provisions of this standard, valuable supplemental information is available from the guidelines and techniques produced by the World Wide Web Consortium's Web Accessibility Initiative. At this time, these documents include:
It is not expected that all modes of presentation will be available in all players and documents, but it is strongly recommended that multiple equivalent presentations be made available to users whenever possible. Historically, products marketed to specific user groups with disabilities have sometimes proven unusable. Not all players need to be accessible to all target groups, but any device compliant with this standard must be accessible to the target group for which it is advertised. It is also strongly recommended that DTB production tools and processes be made accessible to persons with disabilities.
This standard is based on the specific versions of the standards and specifications referenced herein, which are used as defined, except as noted by this document. Any refinement or replacement of a referenced specification by a newer or different version is not directly applicable to this standard. Conformance to this standard is based on the versions of the standards and specifications in effect at the time of this writing.
Playback systems must support at least UTF-8 and UTF-16 encodings.
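As an illustration of the encoding requirement above, the following is a minimal sketch in Python of how a software player might decode the bytes of a DTB XML file. It relies on the XML 1.0 rule that UTF-16 entities begin with a byte order mark; the function names sniff_xml_encoding and decode_dtb_text are hypothetical and are not defined by this standard.

```python
import codecs


def sniff_xml_encoding(data: bytes) -> str:
    """Guess a Python codec name for raw XML bytes from the byte order mark.

    Hypothetical helper: only the two encodings required of playback
    systems (UTF-8 and UTF-16) are considered.
    """
    # XML 1.0 requires UTF-16 entities to begin with a byte order mark.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # "utf-8-sig" decodes UTF-8 with or without a leading BOM.
    return "utf-8-sig"


def decode_dtb_text(data: bytes) -> str:
    """Decode the bytes of a DTB XML file to text (illustrative only)."""
    return data.decode(sniff_xml_encoding(data))
```

A real player would more likely hand the raw bytes to its XML parser, which performs this detection itself; the sketch only makes the two required cases explicit.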
It is possible that compliance with this standard may require the use of one or more inventions covered by patent rights. It is believed that all companies claiming such rights have agreed to grant a license under such rights that they hold on reasonable and nondiscriminatory terms and conditions to any applicant.
Producers of DTB systems or any component thereof are responsible for obtaining the appropriate licenses for any and all technology defined by the relevant standards and specifications referenced by this standard.
Issues surrounding the protection of intellectual property embodied in the works distributed as digital talking books are discussed in section 14, Digital Rights Management.
The maintenance agency designated in Appendix 7 will be responsible for reviewing and acting upon suggestions for modifications to this standard. Questions concerning the implementation of this standard and requests for information should be sent to the maintenance agency.
A list of errata relating to this standard will be maintained at http://www.loc.gov/nls/z3986/v100/errata.html.
A digital talking book (DTB) is a collection of electronic files arranged to present information to the target population via alternative media, namely, human or synthetic speech, refreshable Braille, or visual display, e.g., large print. When these files are created and assembled into a DTB in accordance with this standard, they make possible a wide range of features such as rapid, flexible navigation; bookmarking and highlighting; keyword searching; spelling of words on demand; and user control over the presentation of selected items (e.g., footnotes and page numbers). Such features enable readers with visual and physical disabilities to access the information in DTBs flexibly and efficiently, and allow sighted users with learning or reading disabilities to receive the information through multiple senses. For a full discussion of these capabilities, see the "Document Navigation Features List" [Navigation Features], developed as the user requirements document on which this standard was based. A document written during the development of this standard, Theory Behind the DTBook DTD [DTBook Theory], also describes the navigational capabilities of a DTB in some detail. The content of DTBs will range from audio alone, through a combination of audio, text, and images, to text alone. Section 13 describes these various types of DTB.
DTB players will also be produced with a variety of capabilities. The simplest might be portable devices with audio-only capabilities. More complex portable players could include text-to-speech capabilities as well as audio output for recorded human speech. The most comprehensive playback systems are expected to be PC-based, supporting visual and audio output, text-to-speech capability, and output to a Braille display. The Playback Device Features List [Player Features] mentioned above presents the committee's priorities for a range of functions across three types of playback device.
The files comprising a DTB fall into ten categories, as described below:
A DTB conforming to this standard must include exactly one Package File, which must be a valid XML 1.0 document conforming to the Open eBook Forum™ (OEBF) 1.0.1 package DTD (oebpkg101.dtd) and its associated entity reference (oeb1.ent). The full specification, DTD, and entity reference for the OEBF package file are available for download from the OEBF site [OEBF]. The Package File must be named with the extension ".opf".
A Package File conforming to this standard must comply with all aspects of section 2 of the OEBF Publication Structure 1.0.1, with the following two exceptions:
- In OEBF, the spine element may refer only to item elements of media type text/x-oeb1-document. In DTB applications, the spine must only reference items of media type application/smil.
The Package File, drawn from the OEBF Publication Structure 1.0.1, contains administrative information about the DTB, the files that comprise it, and how these files interrelate. This section, drawn largely from the Publication Structure, provides only a brief summary of the function of each part of the Package File, with an example illustrating how it is applied to the DTB. Please see section 2 of the full OEBF Publication Structure 1.0.1 for complete details on the Package File.
The Publication Structure describes the major parts of the Package File as follows:
- PACKAGE IDENTITY - a unique identifier for the OEB publication as a whole.
- METADATA - Publication metadata (title, author, publisher, etc.).
- MANIFEST - A list of files (documents, images, style sheets, etc.) that make up the publication. The manifest also includes fallback declarations for files of types not supported by this specification.
- SPINE - An arrangement of documents providing a linear reading order.
- TOURS - A set of alternate reading sequences through the publication, such as selective views for various reading purposes, reader expertise levels, etc.
- GUIDE - A set of references to fundamental structural features of the publication, such as table of contents, foreword, bibliography, etc.
Here is an informal outline of the package file:
<?xml version="1.0"?>
<!DOCTYPE package PUBLIC "+//ISBN 0-9673008-1-9//DTD OEB 1.0.1 Package//EN"
  "http://openebook.org/dtds/oeb-1.0.1/oebpkg101.dtd">
<package>
  <metadata>...</metadata>
  <manifest>...</manifest>
  <spine>...</spine>
  <tours>...</tours>
  <guide>...</guide>
</package>
The package must include a value for its unique-identifier attribute. This is required because more than one dc:Identifier may be present in a DTB's Package File metadata and the unique-identifier specifies which dc:Identifier element provides the package's primary identifier. The value of unique-identifier must match the id attribute of one and only one dc:Identifier element which is a descendant of the package element.
The primary identifier of the DTB must be globally unique.
Example 3.1:
<package unique-identifier="uid">
  <metadata>
    <dc-metadata...>
      <dc:Identifier id="uid" scheme="DTB">uk-rnib-db02006</dc:Identifier>
      ...
</package>
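The matching rule for unique-identifier can be checked mechanically. The following is a minimal sketch in Python using the standard library's ElementTree; the function name primary_identifier is hypothetical, and the Dublin Core namespace URI is the one the OEBF Publication Structure requires on the metadata. It assumes the Package File declares that namespace for the dc: prefix.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace required by the OEBF Publication Structure 1.0.1.
DC_NS = "http://purl.org/dc/elements/1.0/"


def primary_identifier(package_xml: str) -> str:
    """Return the text of the dc:Identifier named by unique-identifier.

    Hypothetical helper. Raises ValueError if the attribute is missing or
    does not match the id of exactly one dc:Identifier element.
    """
    root = ET.fromstring(package_xml)
    uid = root.get("unique-identifier")
    if not uid:
        raise ValueError("package lacks a unique-identifier attribute")
    # Collect every dc:Identifier descendant whose id equals the attribute.
    matches = [
        el for el in root.iter(f"{{{DC_NS}}}Identifier") if el.get("id") == uid
    ]
    if len(matches) != 1:
        raise ValueError(
            f"unique-identifier {uid!r} matches {len(matches)} "
            "dc:Identifier elements; exactly one is required"
        )
    return (matches[0].text or "").strip()
```

Applied to a complete package equivalent to Example 3.1, the function would return the string uk-rnib-db02006. Global uniqueness of that value cannot be verified from the file alone.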
This portion of the Package File contains the information about a DTB that would normally be found in a library catalog record. It includes data about the DTB itself (e.g., title, author, producer, format, and narrator) as well as information about the source publication (usually a print book) such as publisher, edition, copyright statement, etc.
The Package File must contain exactly one metadata element which must contain one and only one dc-metadata element holding Dublin Core [DC] metadata and must contain supplemental metadata in an x-metadata element.
The x-metadata element must contain at least one instance of the meta element, which uses name and content attributes to define its value (see section 3.2.3, "X-Metadata").
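The structural requirements on the metadata element stated above lend themselves to a simple automated check. The following Python sketch is illustrative only; the function name metadata_problems and the wording of its messages are hypothetical, and it tests only the rules given in the two preceding paragraphs.

```python
import xml.etree.ElementTree as ET


def metadata_problems(package_xml: str) -> list[str]:
    """List violations of the metadata structure rules (hypothetical helper)."""
    root = ET.fromstring(package_xml)
    blocks = root.findall("metadata")
    if len(blocks) != 1:
        return [f"expected exactly one metadata element, found {len(blocks)}"]
    metadata = blocks[0]
    problems = []
    dc_count = len(metadata.findall("dc-metadata"))
    if dc_count != 1:
        problems.append(f"expected exactly one dc-metadata, found {dc_count}")
    x_metadata = metadata.find("x-metadata")
    if x_metadata is None:
        problems.append("x-metadata element is missing")
        return problems
    metas = x_metadata.findall("meta")
    if not metas:
        problems.append("x-metadata contains no meta element")
    for meta in metas:
        # Each meta carries its value in the name and content attributes.
        if not meta.get("name") or meta.get("content") is None:
            problems.append("meta element lacks a name or content attribute")
    return problems
```

An empty list means that none of these particular rules is violated; it does not establish validity against the OEBF package DTD.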
The use of Dublin Core metadata within a compliant DTB must conform to the following description from the OEBF Publication Structure 1.0.1:
The dc-metadata element contains specific publication-level metadata as defined by the Dublin Core initiative (http://purl.org/dc/). The descriptions below are included for convenience, and the Dublin Core's own definitions take precedence (see http://www.ietf.org/rfc/rfc2413.txt).
The dc-metadata element can contain any number of instances of any Dublin Core elements. Dublin Core element names begin with the "dc:" prefix followed by a leading uppercase letter. Dublin Core metadata elements may occur in any order; in fact, multiple instances of the same element type (multiple dc:Creator elements, for example) can be interspersed with other metadata elements without change of meaning.
For upwards-compatibility, the element metadata in an OEB package is required to have an attribute of xmlns:dc="http://purl.org/dc/elements/1.0/" and xmlns:oebpackage="http://openebook.org/namespaces/oeb-package/1.0/".
Following are brief definitions of the Dublin Core elements. See the Publication Structure and the Dublin Core itself for more complete descriptions.
The attributes "xml:lang" and "id" can be applied to all "dc:..." elements. Additional attributes can be used with several elements as detailed below. Note that all Dublin Core element types may be repeated (occur more than once) within dc-metadata.
package unique-identifier attribute, must include an id.
Various schemes are available for identifying digital publications. In the DTB domain, the requirements for an identifier are simply to identify the publication in a manner that is highly likely to be globally unique. A major purpose of the uniqueness requirement is to prevent filename collisions among bookmark files.
To meet this base requirement, a simple DTB id scheme may be used. A DTB identifier under this scheme consists of a hyphen-separated string consisting of a two-letter country code drawn from [ISO 3166], an agency code unique within its country, and an identifier unique within the agency. For example, us-afb-x12345.
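The shape of an identifier under this simple scheme can be illustrated with a short Python sketch. The names DtbId and parse_dtb_id are hypothetical. The sketch assumes that the agency code itself contains no hyphen, and it checks only that the country code is two ASCII letters, not that it actually appears in [ISO 3166].

```python
from typing import NamedTuple


class DtbId(NamedTuple):
    """Parts of an identifier under the simple DTB id scheme (illustrative)."""
    country: str
    agency: str
    local_id: str


def parse_dtb_id(identifier: str) -> DtbId:
    """Split 'cc-agency-localid' into its parts, or raise ValueError."""
    # Split on the first two hyphens only, so the final part may itself
    # contain hyphens (an assumption of this sketch).
    parts = identifier.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected country-agency-id, got {identifier!r}")
    country, agency, local_id = parts
    if len(country) != 2 or not country.isascii() or not country.isalpha():
        raise ValueError(f"country code must be two letters: {country!r}")
    return DtbId(country.lower(), agency, local_id)
```

For the example above, parse_dtb_id("us-afb-x12345") yields the country "us", the agency "afb", and the local identifier "x12345".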
This scheme will provide a simple solution to the uniqueness requirement that will serve DTB-publishers' needs in the short term. In the longer term, as the requirements of a global library of alternative format materials become more important, other more sophisticated mechanisms should certainly be employed.
The following names were developed for the DTB application to supply information that the Dublin Core element set does not cover. These names may appear only within the x-metadata containing element, as values of the name attribute on the meta element. Each x-metadata name below is shown as either "Repeatable" (it may be used more than once) or "Not repeatable."
Example 3.2:
...
<metadata>
  <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.0/"
               xmlns:oebpackage="http://openebook.org/namespaces/oeb-package/1.0/">
    <dc:Title>Revised Standards and Guidelines of Service for the Library of Congress Network of Libraries for the Blind and Physically Handicapped 1995</dc:Title>
    <dc:Subject>library information networks</dc:Subject>
    <dc:Subject>libraries and the physically handicapped--standards--U.S.</dc:Subject>
    <dc:Subject>libraries and the blind--standards--U.S.</dc:Subject>
    <dc:Identifier id="uid" scheme="DTB">us-nls-db00001</dc:Identifier>
    <dc:Identifier scheme="DOI">10.1000/DX44998</dc:Identifier>
    <dc:Creator role="aut">American Library Association. Association of Specialized and Cooperative Library Agencies</dc:Creator>
    <dc:Publisher>National Library Service for the Blind and Physically Handicapped, Library of Congress</dc:Publisher>
    <dc:Date>2000-06-22</dc:Date>
    <dc:Source>0-8389-7797-9</dc:Source>
    <dc:Language>en</dc:Language>
    <dc:Format>ANSI/NISO Z39.86-200x v1.0.0</dc:Format>
    <dc:Description>A document developed to improve library service for blind and physically disabled persons by providing a tool for assessing the current status of those services and for developing long-range plans.</dc:Description>
  </dc-metadata>
  <x-metadata>
    <meta name="dtb:sourceDate" content="1995" />
    <meta name="dtb:sourcePublisher" content="American Library Association" />
    <meta name="dtb:sourceRights" content="copyright 1995, American Library Association" />
    <meta name="dtb:narrator" content="Lowenstein, Ralph" />
    <meta name="dtb:producer" content="American Foundation for the Blind" />
    <meta name="dtb:multimediaType" content="audioNcx" />
    <meta name="dtb:totalTime" content="06:22:34.143" />
  </x-metadata>
</metadata>
...
The manifest, which is a child of the package element, must contain a complete list of all of the files (documents, audio files, images, style sheets, etc.) that make up a given DTB, including the package file itself. The distInfo file and any associated audio changeMsgs are not considered part of the DTB and thus shall not be listed (see section 11, "Packaging Files for Distribution"). Each file is referenced by an item element. Each item must have an href attribute which is the URI of the referenced file and is unique within the manifest. This URI must not include fragment identifiers; if relative, it is interpreted as relative to the package file itself. Further, any relative URIs contained within an XML file listed in the manifest are considered to be relative to the referring file.
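The resolution rule for relative URIs can be illustrated with Python's standard urllib.parse module. This is a minimal sketch; the function name resolve_manifest_href is hypothetical, and the example.org location mentioned below is an illustrative placeholder.

```python
from urllib.parse import urldefrag, urljoin


def resolve_manifest_href(package_uri: str, href: str) -> str:
    """Resolve a manifest item's href against the Package File's own URI.

    Hypothetical helper: rejects hrefs that carry a fragment identifier,
    which the manifest rules above do not permit.
    """
    if urldefrag(href).fragment:
        raise ValueError(f"href must not include a fragment identifier: {href!r}")
    # A relative href is interpreted relative to the package file itself;
    # an absolute href is returned unchanged by urljoin.
    return urljoin(package_uri, href)
```

For a Package File at http://example.org/dtb/rs.opf, the href "audio/rs_fwdx.mp3" resolves to http://example.org/dtb/audio/rs_fwdx.mp3. The same call, with the referring XML file's URI as the base, applies to relative URIs found inside files listed in the manifest.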
In addition, each item must have a media-type attribute containing the MIME media type of the file, and an id attribute. The id is utilized primarily when a manifest item is referenced by the spine.
The manifest also includes fallback declarations for files of types not supported by this standard (see OEBF Publication Structure for details). Support for the fallback mechanism is not required by this standard.
The NCX entry in the Package File manifest must have an id value equal to "ncx". The Resource File entry in the Package File manifest must have an id value equal to "resource". The item elements listing SMIL files in the manifest must have a media-type attribute of "application/smil". The item elements for the NCX, textual content file(s), Package File, and Resource File must have media-type attribute values of "text/xml". The order of item elements within the manifest is not significant.
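Several of the manifest rules above can be tested by a production tool. The following Python sketch is one possible arrangement, not a complete conformance test. The function name manifest_problems is hypothetical, and recognizing SMIL files by a ".smil" filename extension is an assumption of the sketch, not a rule of this standard.

```python
import xml.etree.ElementTree as ET


def manifest_problems(manifest_xml: str) -> list[str]:
    """List violations of the manifest rules found in a manifest fragment."""
    problems = []
    seen_hrefs = set()
    seen_ids = set()
    for item in ET.fromstring(manifest_xml).findall("item"):
        item_id = item.get("id")
        href = item.get("href")
        media = item.get("media-type")
        if not (item_id and href and media):
            problems.append(f"item lacks id, href, or media-type: {item.attrib}")
            continue
        seen_ids.add(item_id)
        if "#" in href:
            problems.append(f"{item_id}: href contains a fragment identifier")
        if href in seen_hrefs:
            problems.append(f"{item_id}: href {href!r} is not unique")
        seen_hrefs.add(href)
        # Assumption: SMIL files are recognized by their ".smil" extension.
        if href.lower().endswith(".smil") and media != "application/smil":
            problems.append(f"{item_id}: SMIL file must be application/smil")
        # The NCX and Resource File entries are identified by fixed id values.
        if item_id in ("ncx", "resource") and media != "text/xml":
            problems.append(f"{item_id}: media-type must be text/xml")
    if "ncx" not in seen_ids:
        problems.append('no item with id "ncx"')
    return problems
```

Because the order of item elements is not significant, the sketch never compares positions; it only inspects each item and the set of values seen so far.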
A sample manifest for a DTB with audio, structure, and text follows (multimediaType=audioFullText):
Example 3.3:
...
<manifest>
  <item id="opf" href="rs.opf" media-type="text/xml" />
  <item id="text" href="rs.xml" media-type="text/xml" />
  <item id="text_style" href="dtbbase.css" media-type="text/css2" />
  <item id="ncx" href="rs.ncx" media-type="text/xml" />
  <item id="ncx_style" href="ncx16.css" media-type="text/css2" />
  <item id="SMIL" href="rs.smil" media-type="application/smil" />
  <item id="foreword" href="rs_fwdx.mp3" media-type="audio/mp3" />
  <item id="standards" href="rs_stdx.mp3" media-type="audio/mp3" />
  <item id="appendices" href="rs_app.mp3" media-type="audio/mp3" />
  <item id="index" href="rs_index.mp3" media-type="audio/mp3" />
  <item id="fig_01" href="fig1.png" media-type="image/png" />
  <item id="resource" href="rs.res" media-type="text/xml" />
  <item id="resource_audio" href="res.mp3" media-type="audio/mp3" />
</manifest>
...
Here is a manifest for an audio-only version of the above DTB (multimediaType=audioNcx), where separate SMIL files were created for each segment of the book.
Example 3.4:
...
<manifest>
  <item id="opf" href="rs.opf" media-type="text/xml" />
  <item id="ncx" href="rs.ncx" media-type="text/xml" />
  <item id="foreword" href="rs_fwdx.mp3" media-type="audio/mp3" />
  <item id="standards" href="rs_stdx.mp3" media-type="audio/mp3" />
  <item id="appendices" href="rs_app.mp3" media-type="audio/mp3" />
  <item id="index" href="rs_index.mp3" media-type="audio/mp3" />
  <item id="SMIL1" href="rsfwd.smil" media-type="application/smil" />
  <item id="SMIL3" href="rsapp.smil" media-type="application/smil" />
  <item id="SMIL4" href="rsind.smil" media-type="application/smil" />
  <item id="SMIL2" href="rsstd.smil" media-type="application/smil" />
</manifest>
...
The spine, a child of the package element, shall consist of a list of one or more itemref elements whose order defines the default linear reading order for the DTB. Each itemref must contain an idref which points to the id of a SMIL file listed in the manifest. Only SMIL files can be referenced by itemrefs in the spine. The itemrefs must be listed in the spine in the order in which the SMIL files are to be presented. A player must consult the spine when it reaches the end of a SMIL file to determine which file to render next.
The first of the following examples shows the spine that corresponds to the first of the two manifest examples above:
Example 3.5:
<spine>
  <itemref idref="SMIL" />
</spine>
The following spine matches the second manifest example above. The correct reading order is presented here. Note that it does not match the order of files in the manifest, where order is not significant.
Example 3.6:
<spine>
  <itemref idref="SMIL1" />
  <itemref idref="SMIL2" />
  <itemref idref="SMIL3" />
  <itemref idref="SMIL4" />
</spine>
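The way a player might derive the reading order from the spine and manifest can be sketched as follows. This Python example is illustrative only; the function names reading_order and next_smil are hypothetical, and the sketch assumes the manifest and spine are direct children of the package element, as described above.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def reading_order(package_xml: str) -> list[str]:
    """Return the SMIL file hrefs in the order given by the spine."""
    root = ET.fromstring(package_xml)
    # Index manifest items by id so each itemref can be resolved.
    items = {item.get("id"): item for item in root.findall("manifest/item")}
    order = []
    for itemref in root.findall("spine/itemref"):
        idref = itemref.get("idref")
        item = items.get(idref)
        if item is None:
            raise ValueError(f"itemref {idref!r} matches no manifest item")
        if item.get("media-type") != "application/smil":
            raise ValueError(f"itemref {idref!r} does not point to a SMIL file")
        order.append(item.get("href"))
    if not order:
        raise ValueError("spine must contain at least one itemref")
    return order


def next_smil(order: list[str], current: str) -> Optional[str]:
    """Return the SMIL file to render after `current`, or None at the end."""
    position = order.index(current)
    return order[position + 1] if position + 1 < len(order) else None
```

With the manifest of Example 3.4 and the spine of Example 3.6, the resulting order is rsfwd.smil, rsstd.smil, rsapp.smil, rsind.smil, regardless of the order of the items in the manifest.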
Compliant players are not required to support tours.
The tours element is an optional child of the package element. The OEBF Publication Structure describes tours as follows: "Much as a tour guide might assemble points of interest into a set of sightseers' tours, a content provider may assemble selected parts of a publication into a set of tours to enable convenient navigation. ... Reading systems may use tours to provide various access sequences to parts of the publication, such as selective views for various reading purposes, reader expertise levels, etc." Because of inherent differences between the structures of a DTB and the OEBF tours, it is not feasible to implement tours in a DTB prepared in accordance with this standard. If a producer wishes to provide the functionality described above, it may partially achieve it by producing customized navLists in the NCX.
Compliant players are not required to support guides.
As specified in the OEBF Publication Structure, the guide, a child of the package element, lists the key structural features of the DTB, such as the table of contents, introduction, bibliography, etc. to enable playback devices to provide convenient access to them. Because DTBs include a mandatory NCX that satisfies a more rigorous and detailed access requirement, the guide is not expected to be used in DTBs.
This standard defines an XML 1.0 Document Type Definition -- DTBook -- for markup of the textual content files of books and other publications presented in digital talking book format. To be compliant with this standard, a textual content file of a DTB must be a valid XML file conforming to dtbook100.dtd, which can be found in Appendix 1, "DTBook DTD." The version attribute on the dtbook element must be present and contain the value drawn from the above-named DTD. Parsers will not enforce the presence of this attribute, so other mechanisms must.
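One such mechanism could be an explicit check in a production or validation tool. The following Python sketch shows the idea; the function name check_dtbook_version is hypothetical, and the expected version string is passed in by the caller because its value must be taken from dtbook100.dtd and is not assumed here.

```python
import xml.etree.ElementTree as ET


def check_dtbook_version(dtbook_xml: str, expected_version: str) -> None:
    """Raise ValueError unless the dtbook root carries the expected version.

    Hypothetical helper. `expected_version` must be the value defined in
    dtbook100.dtd; this sketch does not read the DTD itself.
    """
    root = ET.fromstring(dtbook_xml)
    if root.tag != "dtbook":
        raise ValueError(f"root element is {root.tag!r}, not 'dtbook'")
    version = root.get("version")
    if version is None:
        raise ValueError("dtbook element has no version attribute")
    if version != expected_version:
        raise ValueError(
            f"version is {version!r}, expected {expected_version!r}"
        )
```

ElementTree does not load the external DTD, so an attribute default declared there is not applied; the check therefore reports a file that omits the attribute, which is the case a validating parser would not flag.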
A DTB that includes textual content will, in most cases, contain only one textual content file. However, when necessary (with a very large book, for example), a DTB can contain multiple textual content files, each of which must be valid to the DTBook DTD.
DTB content producers may extend the base DTD by including one or more new elements or full modules for special situations. To remain conformant with this standard, such extensions of the DTD must employ the mechanisms specified by XML 1.0. See section 4.2.2, "Modular Extension of the DTD."
A document developed during the creation of this standard, Theory Behind the DTBook DTD [DTBook Theory], discusses the rationale underlying the DTBook element set and the benefits it provides to digital talking book applications.
An alphabetical listing of the DTBook elements, with definitions, is included in section 4.3. Two documents external to this standard provide detailed information on the use of the element set. First, an expanded version of the DTD, in HTML format, (see [DTBook HTML]) provides full detail on each element, describing where it can be used and which elements can be used within it, along with an expanded list of attributes.
Second, a comprehensive set of guidelines for applying DTBook markup is available from the DAISY Consortium. These Structure Guidelines [StructGuide] describe the correct application of the DTBook element set, emphasize the importance of capturing the structure of the text content, and provide detailed examples of the use of all DTBook elements.
The DTBook element set has considerable application outside of the digital talking book as well. It was designed to enable the production of documents in a variety of accessible formats. At least one U.S. Braille translation software package has implemented a facility that imports DTBook documents and automatically translates and formats them in Grade 2 Braille. It is expected that similar automated processes will be developed for converting properly marked-up documents into large print and for rendering DTBook documents in Braille, synthetic speech, and large print "on the fly." Finally, an attribute called "showin" is incorporated in the DTBook element set to control the display of selected segments of a DTBook document. For example, descriptions of a graph might vary between Braille and large print editions; "showin" could allow only the appropriate version to show in each edition, although both would be present in the DTBook document.
This standard does not mandate the degree of markup to be applied to a textual content file. However, the richer the markup, the greater the functionality available to the reader.
For more information on XML 1.0 markup and DTD usage, see the W3C XML site [XML].
To ensure efficient player operation with DTBs containing textual content files, the smilref attribute must be present and non-empty for each element in the textual content file referenced by a SMIL file. The smilref value shall normally be the URI of the SMIL time container (par or seq) containing the media object that references a given element. However, in a text-only DTB consisting of a sequence of text media objects, smilref contains the URI of the media object that references the element. The smilref attribute permits the DTB player to resume SMIL-based playback following text-based navigation, full-text searches, etc.
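The smilref requirement can be verified once the set of element ids referenced from the SMIL files is known. The following Python sketch assumes that set has already been collected by some other means; the function name missing_smilrefs and the referenced_ids parameter are hypothetical.

```python
import xml.etree.ElementTree as ET


def missing_smilrefs(dtbook_xml: str, referenced_ids: set[str]) -> list[str]:
    """Return the referenced ids whose elements lack a non-empty smilref.

    Hypothetical helper. `referenced_ids` holds the ids of textual content
    elements that SMIL files point to; ids not found in the document at
    all are reported as well.
    """
    root = ET.fromstring(dtbook_xml)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    missing = []
    for element_id in sorted(referenced_ids):
        element = by_id.get(element_id)
        # Test for None explicitly: an Element without children is falsy.
        if element is None or not (element.get("smilref") or "").strip():
            missing.append(element_id)
    return missing
```

The sketch checks only presence and non-emptiness. It does not confirm that the smilref URI points at the par, seq, or media object that actually references the element.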
The DTBook DTD includes a base set of elements for use in marking up a broad range of material. Additional modules containing elements for specialized applications such as poetry, plays, dictionaries, mathematics, etc. can be "invoked" from within a DTBook document when needed, as described below.
A DTBook document is an XML application. It should therefore begin with the XML declaration, which identifies the version of XML and, optionally, the character encoding (see Appendix 1, "DTBook DTD," for more information):
<?xml version="1.0" encoding="UTF-8" ?>
This is followed by the document type declaration:
<!DOCTYPE dtbook SYSTEM "dtbook100.dtd" >
For discussion of other ways of expressing the DOCTYPE, see section 2.2 of Appendix 1, "DTBook DTD."
A book can invoke other DTDs or modules to augment the DTBook DTD by adding instructions in square brackets before the concluding ">" of the document type declaration. Such instructions in square brackets are called the "internal subset of declarations." For example:
<!DOCTYPE dtbook SYSTEM "dtbook100.dtd" [
  <!ENTITY % dramaModule SYSTEM "drama.dtd" >
  %dramaModule;
  <!ENTITY % externalblock "| drama">
  <!ENTITY % externalinline "| stagedir">
]>
The first line of the internal subset declares an entity named "dramaModule" and provides the URI where that module can be found. The second line invokes this entity, that is, "brings it into" the current document, just as the DOCTYPE declaration invoked the base DTD (dtbook100.dtd). The third line declares the entity "%externalblock" and gives it the value "| drama". Since dtbook100.dtd contains an entity of the same name, and the internal subset takes precedence over the base (external) DTD in areas of conflict, everywhere in dtbook100.dtd where %externalblock; appears (that is, wherever block elements are allowed), "drama" is added as a permitted alternative. Since drama is the root element in the drama module, the full drama module can be used there. Similarly, the last line effectively allows the element stagedir to be used anywhere %externalinline; is allowed in dtbook100.dtd (wherever inline elements can be used).
More than one module may be needed and included in a book. In the following example, both a poetry and a drama module are invoked, as well as one inline element (stagedir) from the drama module.
[
  <!ENTITY % poemModule SYSTEM "http://www.xyz.org/poem.dtd" >
  %poemModule;
  <!ENTITY % dramaModule SYSTEM "http://www.xyz.org/drama.dtd" >
  %dramaModule;
  <!ENTITY % externalblock "| poem | drama" >
  <!ENTITY % externalinline "| stagedir">
]>
See section 3 of Appendix 1, "DTBook DTD" for a more detailed discussion of this issue.
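A production tool could assemble such a document type declaration mechanically. The sketch below is only an illustration under stated assumptions: the function name doctype_with_modules and its parameters are hypothetical, and it simply follows the pattern of the two examples (one external parameter entity per module, followed by the externalblock and externalinline overrides).

```python
def doctype_with_modules(modules, block_elements, inline_elements,
                         base_dtd="dtbook100.dtd"):
    """Build a DOCTYPE declaration whose internal subset invokes modules.

    modules         -- list of (entity_name, uri) pairs, one per module
    block_elements  -- element names to add wherever block elements occur
    inline_elements -- element names to add wherever inline elements occur
    """
    lines = [f'<!DOCTYPE dtbook SYSTEM "{base_dtd}" [']
    for name, uri in modules:
        # Declare the external parameter entity, then invoke it.
        lines.append(f'  <!ENTITY % {name} SYSTEM "{uri}" >')
        lines.append(f'  %{name};')
    if block_elements:
        alternatives = " | ".join(block_elements)
        lines.append(f'  <!ENTITY % externalblock "| {alternatives}" >')
    if inline_elements:
        alternatives = " | ".join(inline_elements)
        lines.append(f'  <!ENTITY % externalinline "| {alternatives}" >')
    lines.append("]>")
    return "\n".join(lines)


declaration = doctype_with_modules(
    [("poemModule", "http://www.xyz.org/poem.dtd"),
     ("dramaModule", "http://www.xyz.org/drama.dtd")],
    block_elements=["poem", "drama"],
    inline_elements=["stagedir"],
)
print(declaration)
```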
The element names from DTBook are listed below in alphabetical order. The description provided for each element is taken directly from the DTBook DTD.
A set of audio file formats is listed below. A compliant audio player must be capable of decoding at least one of the formats listed. It is strongly recommended that players be able to decode all listed formats. Content compliant with this standard must be delivered in one of the formats below, or any mixture of them. The file extensions shown for each format must be utilized in audio filenames in compliant DTBs. Values are not case-sensitive.
It is permissible for parts of a single book to be encoded in different audio formats. For example, a producer may choose to encode a lengthy bibliography at a lower bitrate or with a different codec than the main body of the book. Players must transition smoothly between differently encoded sections. There is no restriction on the granularity of these parts, i.e., a change of encoding may occur at any point in the SMIL presentation.
Support for multi-channel rendering is not required. Stereo signals must be recognized and rendered at least in monaural format.
A compliant DTB player that provides audio output should be capable of decoding the following audio formats:
While the ISO standards for MP3 and AAC require support for variable bitrate playback, players compliant with this standard are only required to support constant bitrate playback.
Players must support sample rates of 44.1, 22.05, and 11.025 kHz at a depth of 16 bits per sample. Compressed audio must be encoded such that the output sampling rate is restricted to one of the above three rates.
Audio players capable of recording and exporting audio notes for bookmarks and highlights must support encoding in the following format or one of the formats specified in section 5.1. Audio players capable of importing bookmarks and highlights must support decoding of the following format.
Images included in DTBs must be presented in one or more of the following formats. Compliant playback devices that support image display must be capable of displaying the following image formats: JPEG (JFIF V 1.02) [JPEG] and PNG [RFC 2083]. Support for Scalable Vector Graphics [SVG] is recommended. Appendix 8 of the SVG specification addresses accessibility issues.
The Synchronized Multimedia Integration Language (SMIL 2.0) [SMIL] was developed by the World Wide Web Consortium as a standard for definition and playback of multimedia presentations over the Internet. SMIL defines the sequence of playback for one or more media objects. In the case of DTBs, the primary media objects are audio and textual content files; SMIL provides for their parallel and synchronized presentation. Any DTB constructed using SMIL, and utilizing content encoded in standard text and audio media types, is playable on any device or platform which has implemented a SMIL-conformant player of the same or later SMIL version, so long as the necessary audio and textual rendering decoders are present.
What distinguishes a DTB playback system from a basic SMIL player is the inclusion of specific navigation and presentational capabilities set out in the user requirements for DTBs ([Navigation Features]). These capabilities can utilize information from an NCX file, from the textual content, and/or from the SMIL file itself. The key to this information is the inclusion of unique identifiers within the textual content (when present) and SMIL files. Audio files are indexed by time-based positions and in themselves contain no embedded semantic structure. To provide semantic structure to audio content, it is necessary to associate time-points in the audio file with the corresponding position within the textual content. This is achieved using SMIL through the pairing of a pointer to a specific position within a textual content file (referenced by a URI) with its corresponding time position in the audio content. In the case of the DTB SMIL application, each synchronization point within the SMIL file is assigned a unique identifier. The presence of these identifiers within both the textual content and the SMIL allows navigation to occur by several different methods, as determined by the playback system.
SMIL incorporates a control structure called customTests, which allows SMIL authors to identify by class selected elements of a document (e.g., notes, page numbers, line numbers). The playback device can then expose to the user the presence of these classes and allow the user to select whether a given class of elements is to be read or skipped over during sequential playback.
The DTB producer determines granularity of the synchronization events. Synchronization events may be limited to the primary structural elements (those indicated in the NCX) or may be augmented in books with full textual content to include synchronization down to paragraph, sentence, or even word level. The requirement for this level of synchronization is that the textual content include mark-up tags for the desired elements, and that those elements include unique identifiers that can be referenced in the SMIL files.
The SMIL file for a DTB typically will consist of a sequence of parallel events (e.g., text and audio (and possibly image) events occurring simultaneously). SMIL represents this structure through the use of the "time containers" seq (sequence of media objects) and par (parallel time grouping in which multiple media objects play back at the same time). A simple form of DTB SMIL file would be as follows, where the three pars shown are played one after the other, and the text and audio content referenced in each par are rendered simultaneously:
<smil>
  ...
  <seq>
    <par><text.../><audio.../></par>
    <par><text.../><audio.../></par>
    <par><text.../><audio.../><img.../></par>
  </seq>
  ...
</smil>
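The seq/par structure can be read as an ordered list of "steps," each step being the set of media objects rendered together. The following sketch shows one way a player might derive that list. It is an illustration only: the function name playback_steps and the sample file names are assumptions, and it handles only the simple form described above (pars whose children are media objects), not pars that themselves contain a seq.

```python
import xml.etree.ElementTree as ET

MEDIA_TAGS = ("text", "audio", "img")


def playback_steps(container):
    """Return [(par_id, [src, ...]), ...] in sequential playback order."""
    steps = []
    for child in container:
        if child.tag == "seq":
            # Children of a seq play one after the other.
            steps.extend(playback_steps(child))
        elif child.tag == "par":
            # Media objects directly inside a par are rendered together.
            sources = [m.get("src") for m in child if m.tag in MEDIA_TAGS]
            steps.append((child.get("id"), sources))
    return steps


# Hypothetical sample mirroring the three pars of the simple form.
SAMPLE = """
<smil><body><seq id="baseseq">
  <par id="p1"><text src="rs.xml#pg_1"/><audio src="rs_fwdx.mp3"/></par>
  <par id="h1"><text src="rs.xml#h1_1"/><audio src="rs_fwdx.mp3"/></par>
  <par id="fig1"><text src="rs.xml#caption_1"/><audio src="rs_fwdx.mp3"/><img src="fig1.png"/></par>
</seq></body></smil>
"""

body = ET.fromstring(SAMPLE).find("body")
for par_id, sources in playback_steps(body):
    print(par_id, sources)
```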
Synchronization of media objects in this standard is based on the SMIL 2.0 Specification. Developers are requested to reference SMIL 2.0 [SMIL] for complete background and details. Only a small subset of the SMIL specification is utilized in this implementation, drawing from the following modules, which are grouped by functional area. Modules marked with asterisks are used in whole or in part in this application; the others are not utilized but are included because they are part of a core set of modules required for host language conformance under W3C modularization guidelines.
The modules mentioned above can be combined, using W3C modularization guidelines, to form a profile specific to DTB applications. Section 2 of the SMIL specification, "The SMIL 2.0 Modules," describes this process in detail.
To simplify validation using commonly available parsers and to lessen the complexity of determining content models and applicable attribute lists, a DTB-Specific SMIL DTD is included in this standard in Appendix 2. This DTD includes only those elements and attributes from the modules listed above that are required for the DTB application. In addition, it is more restrictive than the SMIL modules in that id attributes are often required in the DTB application when they are implied in the SMIL modules.
A compliant DTB must contain at least one SMIL file. All SMIL files included in a DTB must be valid XML documents conforming to dtbsmil100.dtd. The version attribute on the smil element must be present and contain the value drawn from the above-named DTD. Parsers will not enforce the presence of this attribute, so other mechanisms must.
Time containers (seqs or pars) within SMIL files must contain ids. Media objects (audio, text, and img) may also contain ids, although this practice will generally be limited to single-medium DTBs. See section 7.4.11, "Text-Only DTBs."
In the textual content file, each segment to be synchronized (e.g., heading, paragraph, list item, etc.) must be contained within an element carrying a unique id to which the corresponding SMIL segment points. In addition, any textual content file element referenced by a SMIL file must include a smilref attribute specifying the URI of the time container or media object that references it. The smilref value shall normally be the URI of the SMIL time container containing the media object that references a given element. However, in a text-only DTB consisting of a sequence of text media objects, smilref shall contain the URI of the referencing media object itself. See section 4.2.1, "DTBook Markup Related to SMIL."
It is strongly recommended that the SMIL file(s) have a level of granularity matching that of the textual content file. That is, if the textual content file is marked up to the paragraph level, the SMIL file(s) should include synchronization to the paragraph level.
All time offsets in SMIL files (and all other applicable DTB files, e.g., NCX clipBegin/clipEnd, bookmark timeOffsets, etc.) are based on normal play speed. In order to maintain synchronization, a player must process time offsets independently of actual playback speed.
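The distinction between book time (offsets at normal play speed) and wall-clock time can be sketched as follows. This is a minimal illustration under the assumption that playback speed is expressed as a simple multiplier (1.0 for normal speed); the function names are hypothetical.

```python
def wall_clock_seconds(book_seconds, speed):
    """Real time needed to render book_seconds of audio at the given speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return book_seconds / speed


def book_position(clip_begin, elapsed_wall_seconds, speed):
    """Offset in the audio file, in normal-speed seconds, after playing for
    elapsed_wall_seconds of real time starting at clip_begin."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return clip_begin + elapsed_wall_seconds * speed


# A 90-second clip takes 60 seconds of real time at 1.5x speed.
print(wall_clock_seconds(90.0, 1.5))   # 60.0
# After 5 real seconds at 2x, a clip that began at 10 s is at the 20 s offset.
print(book_position(10.0, 5.0, 2.0))   # 20.0
```

Because bookmarks and clipBegin/clipEnd values are stored in book time, a position computed this way remains valid when the same book is later played at a different speed.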
As mentioned above, the DTB application utilizes only a portion of the elements and attributes that make up the modules in the DTB SMIL Profile. Playback devices compliant with this standard need support only the following SMIL elements and attributes, which make up the DTB-Specific SMIL DTD.
smil. The smil element contains zero or one head and zero or one body.
<!ELEMENT smil (head, body) >
<smil> ...content... </smil>
meta element), optional layout, and optional customAttributes, in that order.
<!ELEMENT head ((meta)*, (layout)?, (customAttributes)?) >
<head> ...content... </head>
<smil>
<meta ...attributes... />
<!ELEMENT meta EMPTY >
<head>
region elements it contains) where on a visual, audio, or tactile rendering space various producer-defined elements, e.g., figures, text, footnotes, etc. are displayed.
<!ELEMENT layout (region)+ >
<layout> ...content... </layout>
<head>
<!ELEMENT region EMPTY >
<region ...attributes... />
region attribute on media object references the id on appropriate region element.
region in display space. See SMIL 2.0 for details.
region display space. See SMIL 2.0 for details.
region in display space. See SMIL 2.0 for details.
region in display space. See SMIL 2.0 for details.
region in display space. See SMIL 2.0 for details.
region in display space. See SMIL 2.0 for details.
region in which it is displayed. See SMIL 2.0 for definitions of attribute values.
region that is not covered by the media object(s) being displayed.
backgroundColor of a region is shown when no media is being rendered to the region. See SMIL 2.0 for definitions of attribute values.
<layout>
region attribute references the id on a given region element will be displayed in that region.
customTests which allow the producer to specify kinds of structures that the user can choose to have automatically rendered or skipped.
<!ELEMENT customAttributes (customTest)+ >
<customAttributes> ...content... </customAttributes>
<head>
customTest attribute for par and seq below.
<!ELEMENT customTest EMPTY >
<customTest ...attributes... />
customTest attribute on par or seq in body of SMIL.
value = true) or skip (value = false) the structure during sequential playback. If no value is present, the default is false and the content is skipped.
value = "visible") or hide from (value = "hidden") the reader the ability to override the setting of defaultState. See section 7.4.3, "'Skippable' Structures" for normative content.
<customAttributes>
<!ELEMENT body (par|seq|text|audio|img|a)+ >
<body> ...content... </body>
<smil>
body contains zero or more seqs or pars and may also directly contain zero or more media objects (text, audio, img), or links (a).
pars and seqs.
<!ELEMENT seq (par|seq|text|audio|img|a)+ >
<seq> ...content... </seq>
seq with matching customTest element in head.
begin.
body, seq, par
seqs.
<!ELEMENT par (seq|text|audio|img|a)+ >
<par> ...content... </par>
customTest element in head.
body, seq
pars" for normative content.
<!ELEMENT text EMPTY >
<text ...attributes... />
region (defined in layout in document head) in which the text will be presented. References the id of the appropriate region. All types of text objects which are to appear in the same rendering space should be assigned the same value for region. For example, page numbers and producer's notes might both be displayed in the main text area of a screen (region="text"), while notes (e.g., footnotes) might be displayed in a separate area at the bottom of the screen (region="notes").
body, par, seq
<!ELEMENT audio EMPTY >
<audio ...attributes... />
clipBegin.
region (defined in layout in document head) in which the audio object will be presented. References the id of the appropriate region.
body, par, seq
<!ELEMENT img EMPTY >
<img ...attributes... />
region (defined in layout in document head) in which the image will be presented. References the id of the appropriate region.
body, par, seq
<!ELEMENT a (text|audio|img)* >
<a> ...content... </a>
body, par, seq
The following attributes are allowed when the entity %Core.attrib; is listed above:
DTB players should provide the functionality to allow readers to escape from the DTB rendition of specific structures (at a minimum tables, lists, producer's notes, annotations, and notes) with a single action. To support this functionality, any such structure consisting of multiple time containers (i.e., seqs and pars) must be wrapped in a seq. In addition, a class attribute must be applied to the seq or par containing a table, list, producer's note, annotation, or note, using element names drawn from the DTBook DTD (i.e., "table," "list," "prodnote," "annotation," and "note").
DTB player developers may choose to automatically invoke special player navigation modes when the reader enters a table or list. (See "Document Navigation Features List" [Navigation Features].) To support this functionality, a class attribute must be included on the seq or par containing a table or list, using element names drawn from the DTBook DTD (that is, "table" and "list"). DTBs and players may also support this functionality for other structures using the same mechanism.
Players should offer the user the option to "turn off" certain structures in a DTB, that is, select structures such as notes or line numbers that the player will then automatically skip over during sequential playback. To support this capability, compliant DTBs must include customTest attributes on seqs or pars containing those structures. In addition, customAttributes, as well as a customTest element for each "skippable" structure, must be present in the head of each SMIL file and contain content. At a minimum, customTest attributes must be applied to time containers for linenum, note, noteref, annotation, pagenum, optional prodnote, and sidebar. Attribute values (for customTest attributes on seqs or pars and for the id attribute on customTest elements) shall be the names of the "skippable" elements, drawn from the DTBook DTD (e.g., "linenum", "note", etc.) except as noted in the following paragraph.
Different customTest attributes may be applied to a single DTBook element, depending on the element's attributes. For example, <prodnote render="optional"> might be assigned the customTest "prodnote_opt", while <prodnote render="required"> would not need to be assigned a customTest, as the user should not have the option of turning it off.
In DTB applications, the element customTest will only be used when the producer wishes to allow the reader to turn a class of structures on or off, so the override attribute on the customTest element must always be set to "visible." See description of <customTest> above. The SMIL specification chose to make "hidden" the default value, so it is critical that the override="visible" attribute always be present when the customTest element is used.
When a user navigates to a skippable element that has been turned off, the player must render the content of that element.
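The interaction between customTest elements in head and customTest attributes on time containers can be sketched as follows. This is an illustration under stated assumptions, not a prescribed player design: the function names are hypothetical, a container whose customTest has no matching element in head is treated as skipped, and only sequential playback is modeled (direct navigation to a turned-off structure, which must still be rendered, is outside the sketch).

```python
import xml.etree.ElementTree as ET


def default_states(smil_root):
    """Map each customTest id to True (render) or False (skip)."""
    return {ct.get("id"): ct.get("defaultState") == "true"
            for ct in smil_root.iter("customTest")}


def sequential_containers(smil_root, states):
    """Ids of time containers rendered during sequential playback."""
    rendered = []

    def walk(node):
        for child in node:
            if child.tag not in ("seq", "par"):
                continue
            test = child.get("customTest")
            if test is not None and not states.get(test, False):
                continue  # the whole structure is skipped
            rendered.append(child.get("id"))
            walk(child)

    walk(smil_root.find("body"))
    return rendered


# Hypothetical sample using the customTest values of Example 7.1.
SAMPLE = """
<smil><head><customAttributes>
  <customTest id="pagenum" defaultState="false" override="visible"/>
  <customTest id="note" defaultState="true" override="visible"/>
</customAttributes></head>
<body><seq id="baseseq">
  <par id="p1" customTest="pagenum"><text src="rs.xml#pg_1"/></par>
  <par id="para1"><text src="rs.xml#para_1"/></par>
  <par id="ftn_1" customTest="note"><text src="rs.xml#ftn_1"/></par>
</seq></body></smil>
"""

root = ET.fromstring(SAMPLE)
states = default_states(root)
print(sequential_containers(root, states))  # ['baseseq', 'para1', 'ftn_1']

states["pagenum"] = True  # the user turns page numbers on
print(sequential_containers(root, states))  # ['baseseq', 'p1', 'para1', 'ftn_1']
```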
Section 8.5, "How the NCX Works," describes how information on skippable structures can be gathered in the NCX for efficient presentation to the user.
When a DTB spans several media units (e.g., CD-ROM discs), all files required to render any given SMIL file must be present on the same media unit as that SMIL file. This requirement ensures that players need only track the location of SMIL files in order to provide a complete DTB presentation.
If links (i.e., <a> tags with href attributes) are present in the textual content file of a DTB, they must also be included in the corresponding SMIL file(s). Related links in textual content and SMIL files must point to the same information in the textual content and audio files, when audio is present. The default behavior of a link is to be active for at least the duration of the media object it contains. Players may establish other behaviors (e.g., maintaining links in the active state for a preset period of time (possibly modifiable by the user) or until the next link is encountered).
This standard allows only SMIL 2.0 Basic Layout syntax (i.e., CSS2 syntax and others are not permitted).
Each par can contain no more than one each of text, audio, img, and seq. See section 7.4.11, "Text-Only DTBs" for further discussion of this issue.
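A production tool might check this constraint along the following lines. The sketch is illustrative only: the function name overfull_pars is hypothetical, and counting media objects wrapped in a link (a) toward the limit of the enclosing par is an assumption of this sketch rather than something the text states.

```python
from collections import Counter
import xml.etree.ElementTree as ET

LIMITED = ("text", "audio", "img", "seq")


def overfull_pars(smil_root):
    """Return ids of pars holding more than one text, audio, img, or seq."""
    bad = []
    for par in smil_root.iter("par"):
        counts = Counter()
        for child in par:
            if child.tag == "a":
                # Assumption: media objects inside a link count as well.
                counts.update(grandchild.tag for grandchild in child)
            else:
                counts[child.tag] += 1
        if any(counts[tag] > 1 for tag in LIMITED):
            bad.append(par.get("id"))
    return bad


# Hypothetical sample: the second par holds two audio objects.
SAMPLE = """
<smil><body><seq id="baseseq">
  <par id="ok"><text src="rs.xml#para_1"/><audio src="a.mp3"/></par>
  <par id="two_audio"><audio src="a.mp3"/><a href="x.smil#y"><audio src="b.mp3"/></a></par>
</seq></body></smil>
"""

print(overfull_pars(ET.fromstring(SAMPLE)))  # ['two_audio']
```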
When both textual content and audio files are present, text and audio objects within the same par must both represent the same body of material (e.g., the same paragraph).
Because of resource limitations on portable DTB players, SMIL presentations should not be created such that multiple audio media objects are rendered simultaneously. Reading systems are not required to support simultaneous rendering of multiple audio files.
If the clipBegin attribute is not present in an instance of the <audio> element, the audio file referenced must be played from its beginning. If the clipEnd attribute is not present, the audio file must be played to its end. If the value of the clipEnd attribute exceeds the duration of the audio file, the value must be ignored, and the audio file played to its end.
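These three rules amount to a small normalization step before playback. A minimal sketch, assuming clip values have already been converted to seconds and the duration of the audio file is known (the function name resolve_clip is hypothetical):

```python
def resolve_clip(duration, clip_begin=None, clip_end=None):
    """Return the (begin, end) range in seconds that should be played."""
    # A missing clipBegin means "play from the beginning of the file".
    begin = 0.0 if clip_begin is None else clip_begin
    # A missing clipEnd, or one past the end of the file, means
    # "play to the end of the file".
    if clip_end is None or clip_end > duration:
        end = duration
    else:
        end = clip_end
    return begin, end


print(resolve_clip(60.0))              # (0.0, 60.0)
print(resolve_clip(60.0, 5.0, 30.0))   # (5.0, 30.0)
print(resolve_clip(60.0, 5.0, 90.0))   # (5.0, 60.0)
```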
It is strongly recommended that links (<a> tags) be applied to media objects (normally audio) for all noterefs and annorefs, with the corresponding notes and annotations as the targets. The presence of the links will enable key player functionality, such as easy access to notes when noterefs are turned on and notes turned off.
It is recommended that noterefs and notes be implemented in SMIL such that the default, linear presentation (on a simple player) of the noterefs and notes is in the order and location appropriate to the producing agency's policy for rendering note references and notes.
Duration of image display will be equal to that of the longest media object or time container contained within the same par. Example 7.2 below shows a sample implementation of SMIL for an image and its associated caption and producer's note.
Text-only DTBs must include SMIL files. This will ensure user access to the many features enabled by SMIL. As mentioned above, it is strongly recommended that the SMIL file(s) have a level of granularity matching that of the textual content file.
In a DTB which contains no audio material, the duration of text media objects is controlled either by the user (i.e., the player renders the next text object on command) or the player (e.g., a text-to-speech engine or a pacing algorithm for a large-print or Braille display triggers the next media object).
Metadata is included in the <head> element using the <meta> tag. Content producers may introduce other metadata besides those listed below, if needed. However, additional metadata names must include the prefix "prod:". Players must not fail when encountering unknown metadata but must, at a minimum, ignore it.
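The tolerant handling of unknown metadata might look like the sketch below. It is illustrative only: the set KNOWN_NAMES contains just the three names that appear in Example 7.1 and is not the complete list defined by this standard, the "prod:" name in the sample is invented, and read_meta is a hypothetical function name.

```python
import xml.etree.ElementTree as ET

# Assumption: only the names used in Example 7.1; not the full list.
KNOWN_NAMES = {"dtb:uid", "dtb:generator", "dtb:totalElapsedTime"}


def read_meta(smil_root):
    """Sort meta elements into known, producer-defined, and ignored."""
    known, producer, ignored = {}, {}, []
    for meta in smil_root.iter("meta"):
        name, content = meta.get("name"), meta.get("content")
        if name in KNOWN_NAMES:
            known[name] = content
        elif name and name.startswith("prod:"):
            producer[name] = content
        else:
            ignored.append(name)  # unknown metadata is skipped, not fatal
    return known, producer, ignored


SAMPLE = """
<smil><head>
  <meta name="dtb:uid" content="dk-dbb-4z0065" />
  <meta name="prod:batch" content="17" />
  <meta name="mystery" content="?" />
</head><body/></smil>
"""

known, producer, ignored = read_meta(ET.fromstring(SAMPLE))
print(known)     # {'dtb:uid': 'dk-dbb-4z0065'}
print(producer)  # {'prod:batch': '17'}
print(ignored)   # ['mystery']
```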
The following example illustrates the use of head and its contents.
The meta elements contain the unique id of the DTB, the tool that generated this SMIL file, and the elapsed time to the start of the file. The visual display location of any text elements with region="text" or region="notes" is specified by the region elements within layout. The text region occupies most of the screen (the bottom edge of the "text" region is 15% from the bottom of the overall rendering window), while the notes region occupies only the bottom 15%. The customAttributes indicate that any time containers with customTest="pagenum" will be skipped by default, while time containers with customTest="note" will automatically be played. If the user interface of the playback device supports it, the user may change these settings.
Example 7.1:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE smil SYSTEM "dtbsmil100.dtd">
<smil version="1.0.0">
  <head>
    <meta name="dtb:uid" content="dk-dbb-4z0065" />
    <meta name="dtb:generator" content="smilgen2.4" />
    <meta name="dtb:totalElapsedTime" content="01:33:56.233" />
    <layout>
      <region id="text" top="0%" left="0%" right="0%" bottom="15%"/>
      <region id="notes" top="85%" left="0%" right="0%" bottom="0%"/>
    </layout>
    <customAttributes>
      <customTest id="pagenum" defaultState="false" override="visible"/>
      <customTest id="note" defaultState="true" override="visible"/>
    </customAttributes>
  </head>
  <body>
    ...
  </body>
</smil>
Example 7.2 shows the use of SMIL elements within body. The initial seq includes the attribute "dur" which specifies that the entire SMIL file is one hour, three minutes, 24.9 seconds long. Each par (a page number, a heading, two paragraphs, and a figure are shown) includes the segment of text, the image (if applicable), and the corresponding audio clip that are to be rendered simultaneously. The figure falls between the two paragraphs.
The image file is presented in parallel with text and audio versions of the figure caption and a producer's note describing the figure. The entire group is wrapped in a par, with the image file rendered contemporaneously with a sequence of two pars.
The par for the second paragraph includes a link that "wraps" the audio element.
Example 7.2:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE smil SYSTEM "dtbsmil100.dtd">
<smil version="1.0.0">
  <head>
    ...
  </head>
  <body>
    <seq id="baseseq" dur="1:03:24.9">
      <par id="p1" customTest="pagenum">
        <text region="text" src="rs.xml#pg_1" />
        <audio src="rs_fwdx.mp3" clipBegin="00:00:00" clipEnd="00:00:00.91" />
      </par>
      <par id="h1">
        <text region="text" src="rs.xml#h1_1" />
        <audio src="rs_fwdx.mp3" clipBegin="00:00:01.62" clipEnd="00:00:02.53" />
      </par>
      <par id="para1">
        <text region="text" src="rs.xml#para_1" />
        <audio src="rs_fwdx.mp3" clipBegin="00:00:03.51" clipEnd="00:01:45.36" />
      </par>
      <par id="img1">
        <img region="image" src="fig1.png" />
        <seq id="icap1">
          <par id="cap1">
            <text region="caption" src="rs.xml#caption_1" />
            <audio src="rs_fwdx.mp3" clipBegin="00:01:45.98" clipEnd="00:01:52.66" />
          </par>
          <par id="pnote1" customTest="prodnote">
            <text region="text" src="rs.xml#prodnote_1" />
            <audio src="rs_fwdx.mp3" clipBegin="00:01:53.08" clipEnd="00:02:55.34" />
          </par>
        </seq>
      </par>
      <par id="para2">
        <text region="text" src="rs.xml#para_2" />
        <a href="rs12.smil#h2_9"><audio src="rs_fwdx.mp3" clipBegin="00:02:56.21" clipEnd="00:04:03.75" /></a>
      </par>
      ...
    </seq>
  </body>
</smil>
Notes or sidebars containing multiple paragraphs will need to be represented as a series of pars wrapped in a seq, so that a customTest can be applied to the seq, permitting the user to skip the entire sequence. The first part of Example 7.3 illustrates this situation. In addition, note references occurring in the middle of a paragraph will require this special syntax so that the playback device can properly render the content with or without either the note reference or the note.
In the second half of Example 7.3, the first par contains the portion of paragraph 120 preceding the note reference (identified with a span tag in the textual content file as described in section 4.2.1). The second par holds the note reference itself (e.g., "footnote 1"). The third par contains the contents of footnote 1 and the last holds the remainder of paragraph 120. Note that the seq and each par contain a unique id. The region attribute on text will control whether each segment is displayed in the text or notes region.
Example 7.3:
...
<body>
  <seq id="baseseq" dur="02:14:34.156">
    ... (a series of pars)
    <seq id="sidebar_1" customTest="sidebar">
      <par id="para_9">
        <text region="text" src="rs.xml#para_9" />
        <audio src="rs_fwdx.mp3" clipBegin="02:02.711" clipEnd="02:14.678" />
      </par>
      <par id="para_10">
        <text region="text" src="rs.xml#para_10" />
        <audio src="rs_fwdx.mp3" clipBegin="02:15.545" clipEnd="02:44.612" />
      </par>
    </seq>
    ... (a series of pars)
    <seq id="para_120">
      <par id="span_3">
        <text region="text" src="rs.xml#span_3" />
        <audio src="rs_fwdx.mp3" clipBegin="46:58.744" clipEnd="47:21.659" />
      </par>
      <par id="nref_1" customTest="noteref">
        <text region="text" src="rs.xml#nref_1" />
        <audio src="rs_fwdx.mp3" clipBegin="47:22.610" clipEnd="47:23.555" />
      </par>
      <par id="ftn_1" customTest="note" class="note">
        <text region="notes" src="rs.xml#ftn_1" />
        <audio src="rs_notes.mp3" clipBegin="00:00.091" clipEnd="00:34.754" />
      </par>
      <par id="span_4">
        <text region="text" src="rs.xml#span_4" />
        <audio src="rs_fwdx.mp3" clipBegin="47:24.057" clipEnd="47:582" />
      </par>
    </seq>
    ... (a series of pars)
  </seq>
</body>
...
The SMIL 2.0 Timing and Synchronization Module describes several different formats in which "clock values" (timing) may be represented. See Clock Values [SMILclock] in that module. Playback devices must support all of these formats. The three formats are:
Full-clock-val (hours, minutes, seconds, and fractions of seconds): 3:22:55.91
Partial-clock-val (minutes, seconds, and fractions of seconds): 43:15.044
Timecount-val (one or more digits, plus an optional fraction and unit of measurement -- h=hours, min=minutes, s=seconds, ms=milliseconds): 34.6s or 356ms or 58.2. (For Timecount values, if no unit is shown, the default is "s" (for seconds).)
If either of the first two formats is used, leading zeroes must be added to single-digit values for minutes and seconds to ensure mm:ss format.
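A player needs to reduce all three formats to a single numeric value. The following sketch converts a clock value to seconds. It is an illustration rather than a complete implementation of the SMIL grammar: the function name clock_to_seconds is hypothetical, and the regular expressions encode only the three forms described above, with two-digit minutes and seconds as required by the leading-zero rule.

```python
import re

# hours:mm:ss[.fraction], mm:ss[.fraction], and digits[.fraction][unit]
_FULL = re.compile(r"(\d+):([0-5]\d):([0-5]\d)(\.\d+)?")
_PARTIAL = re.compile(r"([0-5]\d):([0-5]\d)(\.\d+)?")
_COUNT = re.compile(r"(\d+)(\.\d+)?(h|min|s|ms)?")
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}


def clock_to_seconds(value):
    """Convert a clock value in any of the three formats to seconds."""
    value = value.strip()
    m = _FULL.fullmatch(value)
    if m:
        hours, minutes, seconds, frac = m.groups()
        return (int(hours) * 3600 + int(minutes) * 60 + int(seconds)
                + float("0" + (frac or "")))
    m = _PARTIAL.fullmatch(value)
    if m:
        minutes, seconds, frac = m.groups()
        return int(minutes) * 60 + int(seconds) + float("0" + (frac or ""))
    m = _COUNT.fullmatch(value)
    if m:
        digits, frac, unit = m.groups()
        # With no unit shown, the default is seconds.
        return float(digits + (frac or "")) * _UNIT_SECONDS[unit or "s"]
    raise ValueError(f"not a recognized clock value: {value!r}")


for sample in ("3:22:55.91", "43:15.044", "34.6s", "356ms", "58.2"):
    print(sample, "->", clock_to_seconds(sample))
```

Values that do not fit any of the three forms, such as a seconds field with three digits, raise ValueError instead of being silently reinterpreted.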
The Navigation Control file for XML applications (NCX) exposes the hierarchical structure of a DTB to allow the user to navigate through it. The NCX is similar to a table of contents in that it enables the reader to jump directly to any of the major structural elements of the document, i.e. part, chapter, or section, but it will often contain more elements of the document than the publisher chooses to include in the original print table of contents. It can be visualized as a collapsible tree familiar to PC users. Its development was motivated by the need to provide quick access to the main structural elements of the document without the need to parse the entire marked-up text file, which in many cases may not be present at all. Other elements such as pages, footnotes, figures, tables, etc. can be included in separate, nonhierarchical lists and may be accessed by the user as well.
It is important to emphasize that these navigation features are intended as a convenience for users who want them, and not a burden to those who do not. The alternative of a simple linear playback of the book will be available for those users not requiring the navigation features of the NCX.
Every DTB must contain exactly one NCX file. The NCX file must be a valid XML document conforming to ncx100.dtd (see Appendix 3, "NCX DTD") and comply with the additional normative requirements of section 8.4. The version attribute on the ncx element of the NCX file must be explicitly given with the value drawn from the above-named DTD. The NCX file itself must be named with the extension ".ncx".
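The "exactly one NCX file" rule is easy to check against the list of file names in a DTB. A minimal sketch, assuming the file set is available as a list of names and that the extension comparison may ignore case (an assumption of this sketch); find_ncx and the sample names are hypothetical. Validity against ncx100.dtd is not checked here.

```python
def find_ncx(filenames):
    """Return the single NCX file name, or raise if there is not exactly one."""
    matches = [name for name in filenames if name.lower().endswith(".ncx")]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one .ncx file, found {len(matches)}")
    return matches[0]


print(find_ncx(["rs.xml", "rs.smil", "rs.ncx", "rs_fwdx.mp3"]))  # rs.ncx
```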
Brief descriptions of the NCX elements follow. Each includes the element declaration extracted from the NCX DTD, along with descriptions of any applicable attributes.
<!ELEMENT ncx (head, docTitle, docAuthor*, navMap, navList*)>
<ncx ...attributes...> ...content... </ncx>
<!ELEMENT head (smilCustomTest | meta)+>
<head> ...content... </head>
<ncx>
<!ELEMENT smilCustomTest EMPTY>
<smilCustomTest ...attributes... />
<head>
<!ELEMENT meta EMPTY>
<meta ...attributes... />
<head>
<!ELEMENT docTitle (text, audio?)>
<docTitle ...attributes...> ...content... </docTitle>
<ncx>
<!ELEMENT docAuthor (text, audio?)>
<docAuthor ...attributes...> ...content... </docAuthor>
<ncx>
<!ELEMENT text (#PCDATA)>
<text ...attributes...> ...content... </text>
<navLabel>, <docTitle>, <docAuthor>
<!ELEMENT audio EMPTY>
<audio ...attributes... />
<navLabel>, <docTitle>, <docAuthor>
<!ELEMENT navMap (navLabel*, navPoint+)>
<navMap
...attributes...>
...content...</navMap>
<ncx>
The navMap element contains the primary navigation information, pointing to each of the major structural elements of the document. Secondary navigation elements such as page numbers, etc. are not included in navMap, but are contained in navLists.
<!ELEMENT navPoint (navLabel+, content, navPoint*)>
<navPoint ...attributes...>...content...</navPoint>
type attribute equals "ncx," whose elementRef attribute value is navPoint, and whose classRef references the class of the current navPoint.
<navMap>, <navPoint>
The navPoint element contains one or more navLabels, representing the referenced part of the document, e.g. chapter title or section number, along with a pointer to content. navPoints may be nested to represent the hierarchical structure of a document.
<navMap>, <navPoint>, <navList>, or <navTarget> in various media for presentation to the user. Can be repeated so descriptions can be provided in multiple languages.
<!ELEMENT navLabel ((text, (audio?, img?)) | ((text?), audio, (img?)))>
<navLabel ...attributes...>...content...</navLabel>
<navMap>, <navPoint>, <navList>, <navTarget>
<!ELEMENT img EMPTY>
<img ...attributes... />
<navLabel>
navPoint or navTarget.
<!ELEMENT content EMPTY>
<content ...attributes... />
<navPoint>, <navTarget>
<!ELEMENT navList (navLabel+, navTarget+)>
<navList ...attributes...>...content...</navList>
navList, described by their dtbook element name, e.g. pagenum, note.
<ncx>
The navList element contains secondary navigation information within navTargets. It is similar to navMap except navTargets may not nest, whereas navPoints can. Used for lists of elements such as page numbers, footnotes, figures, tables, etc. that the user may want to access directly but would clutter up the primary navigation information.
<!ELEMENT navTarget (navLabel+, content)>
<navTarget ...attributes...>...content...</navTarget>
navTarget on a visual display of the NCX.
navTarget on a visual display.
navTarget, described by its dtbook element name, e.g., pagenum, note. It may be used to select a presentation from the resource file. Player will locate the resource whose type attribute equals "ncx," whose elementRef attribute value is navTarget, and whose classRef references the class of the current navTarget.
navPoint that contains this navTarget.
The navTarget element contains one or more navLabels representing the referenced part of the document, e.g., a page or footnote, along with a pointer to content.
The mapRef attribute is used to synchronize the navList and navMap.
This section collects other normative requirements for the NCX file that cannot be enforced by the DTD.
Metadata shall be included in the head element of the NCX using the meta tag. Content producers may introduce other metadata besides those listed below, if needed; such additional metadata must be prefixed by "prod:". Players must not fail when encountering unknown metadata but must, at a minimum, ignore it.
dc:identifier element referenced by the unique-identifier attribute on the package file's package element. See section 3.1, "Package Identity."
<pagenum> in the DTB.
When a DTB spans several distribution media (e.g., multiple CD-ROMs), the full NCX, along with all audio clips referenced by it, must be included on every media unit. This ensures that the entire NCX will function properly on each media unit.
The mapRef attribute on each navTarget must reference the innermost navPoint that contains the element referenced by the navTarget.
Each unique customTest element that appears in one or more SMIL files must have its attributes duplicated in a smilCustomTest element in the head of the NCX.
Upon opening a DTB, a player will ordinarily use the NCX navMap to define the user's choices for navigation. The navMap contains nested navPoints that represent the major divisions of the document. For example, the structure of the book whose NCX is shown in Section 8.6, Example 8.1 would look like this:
Foreword
  History
  Development of Standards
Standards
  1 Core Services
    1.1
      a.
    1.2
Foreword and Standards are at the same level, in this case the highest level, level 1. The nesting of navPoints allows the user to move directly between these objects without passing through the lower level divisions in between. From Foreword, the user can move to level 2 and step to any of the sections of Foreword. Since there is no level 3 under Foreword, no smaller divisions can be accessed from the NCX. Such smaller divisions may be present, but they can only be reached through local navigation. The division of Standards marked 'a.' is at level 4, and can be reached by stepping through 1 Core Services and 1.1.
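As an illustration of how a player might derive these levels, the following is a minimal Python sketch that walks the nested navPoints of a navMap and records each label with its depth. The NCX fragment is abbreviated from Example 8.1 (audio and content elements are omitted, so it would not validate against the DTD), and the function name outline is hypothetical, not part of this standard.

```python
import xml.etree.ElementTree as ET

# Abbreviated navMap based on Example 8.1; audio and content elements
# are left out for brevity, so this fragment is not DTD-valid.
NCX_SAMPLE = """
<ncx version="1.0.0">
  <navMap>
    <navPoint class="chapter" id="lvl1_3">
      <navLabel><text>Foreword</text></navLabel>
      <navPoint class="section" id="lvl2_1">
        <navLabel><text>History</text></navLabel>
      </navPoint>
      <navPoint class="section" id="lvl2_2">
        <navLabel><text>Development of Standards</text></navLabel>
      </navPoint>
    </navPoint>
    <navPoint class="chapter" id="lvl1_7">
      <navLabel><text>Standards</text></navLabel>
      <navPoint class="section" id="lvl2_11">
        <navLabel><text>1 Core Services</text></navLabel>
        <navPoint class="subsection" id="lvl3_1">
          <navLabel><text>1.1</text></navLabel>
          <navPoint class="sub-subsection" id="lvl4_1">
            <navLabel><text>a.</text></navLabel>
          </navPoint>
        </navPoint>
        <navPoint class="subsection" id="lvl3_2">
          <navLabel><text>1.2</text></navLabel>
        </navPoint>
      </navPoint>
    </navPoint>
  </navMap>
</ncx>
"""


def outline(nav_map):
    """Return (level, label) pairs for every navPoint, in document order."""
    rows = []

    def walk(parent, level):
        # Only direct children are visited here; recursion handles nesting.
        for point in parent.findall("navPoint"):
            label = point.findtext("navLabel/text", default="")
            rows.append((level, label))
            walk(point, level + 1)

    walk(nav_map, 1)
    return rows


rows = outline(ET.fromstring(NCX_SAMPLE).find("navMap"))
for level, label in rows:
    print("  " * (level - 1) + label)
```

The nesting depth of each navPoint is the only input used to assign a level; the class attribute is ignored in this sketch.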
The user will also have the option of navigating to items that do not fit easily into the hierarchical structure of a document, e.g. pages, footnotes or sidebars. This function is provided by navLists. Unlike navMap, navLists do not represent the structure of the book by nesting navTargets. In Example 8.1, there are two navLists: the first contains three navTargets representing page numbers, and the second contains three navTargets representing notes.
Each navPoint or navTarget provides navigation information about one piece of the document, e.g. a chapter heading, section number, page number, figure, etc. The text element contains the actual heading, page number, etc. for visual or text-to-speech presentation; the audio element uses SMIL 2.0 syntax to point to a clip containing the audio presentation of the same information. One or both should be used to give location feedback to the user. The content element provides a pointer to an ID within a SMIL file that marks the beginning of the referenced portion of the DTB.
The required mapRef attribute of navTarget allows synchronization of navLists with the navMap. mapRef points to the innermost navPoint that contains the page number, note, or other element referenced by the navTarget. Similarly, the pageRef attribute of navPoint points to the navTarget representing the page on which the navPoint begins.
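The cross-references described above lend themselves to a simple consistency check. The Python sketch below is one possible approach, not a normative procedure: it only verifies that each mapRef names an existing navPoint id and each pageRef names an existing navTarget id. It cannot confirm that the referenced navPoint is the innermost one, since that depends on the SMIL and content files. The function name dangling_refs and the deliberately broken navTarget "p9" are illustrative assumptions, and pageRef is treated as optional here.

```python
import xml.etree.ElementTree as ET

# Abbreviated NCX based on Example 8.1. The navTarget with id "p9" is a
# hypothetical, intentionally broken entry added to exercise the check.
NCX_SAMPLE = """
<ncx version="1.0.0">
  <navMap>
    <navPoint class="chapter" id="lvl1_3" pageRef="p1">
      <navLabel><text>Foreword</text></navLabel>
      <navPoint class="section" id="lvl2_2" pageRef="p2">
        <navLabel><text>Development of Standards</text></navLabel>
      </navPoint>
    </navPoint>
  </navMap>
  <navList id="pages" class="pagenum">
    <navLabel><text>Pages</text></navLabel>
    <navTarget class="pagenum" id="p1" value="1" mapRef="lvl1_3">
      <navLabel><text>1</text></navLabel>
    </navTarget>
    <navTarget class="pagenum" id="p2" value="2" mapRef="lvl2_2">
      <navLabel><text>2</text></navLabel>
    </navTarget>
    <navTarget class="pagenum" id="p9" value="9" mapRef="lvl9_9">
      <navLabel><text>9</text></navLabel>
    </navTarget>
  </navList>
</ncx>
"""


def dangling_refs(ncx_root):
    """List (attribute, owner id, missing id) for unresolved references."""
    point_ids = {p.get("id") for p in ncx_root.iter("navPoint")}
    target_ids = {t.get("id") for t in ncx_root.iter("navTarget")}
    problems = []
    # Every navTarget's mapRef should name an existing navPoint.
    for target in ncx_root.iter("navTarget"):
        if target.get("mapRef") not in point_ids:
            problems.append(("mapRef", target.get("id"), target.get("mapRef")))
    # A navPoint's pageRef, when present, should name an existing navTarget.
    for point in ncx_root.iter("navPoint"):
        page_ref = point.get("pageRef")
        if page_ref is not None and page_ref not in target_ids:
            problems.append(("pageRef", point.get("id"), page_ref))
    return problems


problems = dangling_refs(ET.fromstring(NCX_SAMPLE))
print(problems)
```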
This standard offers producers the ability to gather in the head of the NCX information on all skippable elements from the SMIL file(s) (see section 7.4.3, "'Skippable' Structures"). The smilCustomTest element may be repeated to list all skippable elements and their defaultStates. Playback systems may utilize this information to inform users of their options and current settings for skippable structures.
Example 8.1:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/ncx100.dtd">
<ncx version="1.0.0">
  <head>
    <smilCustomTest id="pagenum" defaultState="false" override="visible"/>
    <smilCustomTest id="note" defaultState="true" override="visible"/>
    <meta name="dtb:uid" content="us-nls-00001"/>
    <meta name="dtb:depth" content="6"/>
    <meta name="dtb:generator" content="NLSv001"/>
    <meta name="dtb:pageNormal" content="47"/>
    <meta name="dtb:pageSpecial" content="0"/>
    <meta name="dtb:pageFront" content="5"/>
    <meta name="dtb:skippable" content="pagenum"/>
  </head>
  <docTitle>
    <text>Revised Standards and Guidelines of Service for the Library of Congress Network of Libraries for the Blind and Physically Handicapped 1995</text>
    <audio src="rs_title.mp3" />
  </docTitle>
  <docAuthor>
    <text>Association of Specialized and Cooperative Library Agencies</text>
    <audio src="rs_title.mp3" />
  </docAuthor>
  <navMap>
    <navPoint class="chapter" id="lvl1_3" pageRef="p1">
      <navLabel>
        <text>Foreword</text>
        <audio src="rs_fwdx.mp3" clipBegin="00:01.5" clipEnd="00:02.0" />
      </navLabel>
      <content src="sample.smil#h1_3" />
      <navPoint class="section" id="lvl2_1" pageRef="p1">
        <navLabel>
          <text>History</text>
          <audio src="rs_fwdx.mp3" clipBegin="00:03.4" clipEnd="00:03.9" />
        </navLabel>
        <content src="sample.smil#h2_1" />
      </navPoint>
      <navPoint class="section" id="lvl2_2" pageRef="p2">
        <navLabel>
          <text>Development of Standards</text>
          <audio src="rs_fwdx.mp3" clipBegin="00:56.3" clipEnd="00:57.7" />
        </navLabel>
        <content src="sample.smil#h2_2" />
      </navPoint>
    </navPoint>
    <navPoint class="chapter" id="lvl1_7" pageRef="p16">
      <navLabel>
        <text>Standards</text>
        <audio src="rs_stdx.mp3" clipBegin="00:01.3" clipEnd="00:02.1" />
      </navLabel>
      <content src="sample.smil#h1_7" />
      <navPoint class="section" id="lvl2_11" pageRef="p16">
        <navLabel>
          <text>1 Core Services</text>
          <audio src="rs_stdx.mp3" clipBegin="00:02.9" clipEnd="00:04.9" />
        </navLabel>
        <content src="sample.smil#h2_10" />
        <navPoint class="subsection" id="lvl3_1" pageRef="p16">
          <navLabel>
            <text>1.1</text>
            <audio src="rs_stdx.mp3" clipBegin="00:05.7" clipEnd="00:06.7" />
          </navLabel>
          <content src="sample.smil#h3_1" />
          <navPoint class="sub-subsection" id="lvl4_1" pageRef="p16">
            <navLabel>
              <text>a.</text>
              <audio src="rs_stdx.mp3" clipBegin="00:18.7" clipEnd="00:19.1" />
            </navLabel>
            <content src="sample.smil#h4_1" />
          </navPoint>
        </navPoint>
        <navPoint class="subsection" id="lvl3_2" pageRef="p16">
          <navLabel>
            <text>1.2</text>
            <audio src="rs_stdx.mp3" clipBegin="00:50.5" clipEnd="00:51.4" />
          </navLabel>
          <content src="sample.smil#h3_2" />
        </navPoint>
      </navPoint>
    </navPoint>
  </navMap>
  <navList id="pages" class="pagenum">
    <navLabel>
      <text>Pages</text>
    </navLabel>
    <navTarget class="pagenum" id="p1" value="1" mapRef="lvl1_3">
      <navLabel>
        <text>1</text>
        <audio src="rs_fwdx.mp3" clipBegin="00:00" clipEnd="00:00.9" />
      </navLabel>
      <content src="sample.smil#p1" />
    </navTarget>
    <navTarget class="pagenum" id="p2" value="2" mapRef="lvl2_2">
      <navLabel>
        <text>2</text>
        <audio src="rs_fwdx.mp3" clipBegin="00:53.9" clipEnd="00:54.6" />
      </navLabel>
      <content src="sample.smil#p2" />
    </navTarget>
    <navTarget class="pagenum" id="p16" value="16" mapRef="lvl1_7">
      <navLabel>
        <text>16</text>
        <audio src="rs_stdx.mp3" clipBegin="00:00.0" clipEnd="00:00.7" />
      </navLabel>
      <content src="sample.smil#p3" />
    </navTarget>
  </navList>
  <navList id="notes" class="note">
    <navLabel>
      <text>Notes</text>
    </navLabel>
    <navTarget class="note" id="nref_1" mapRef="lvl2_2">
      <navLabel>
        <text>1</text>
        <audio src="rs_fwdx.mp3" clipBegin="01:22.6" clipEnd="01:23.5" />
      </navLabel>
      <content src="sample.smil#nref_1" />
    </navTarget>
    <navTarget class="note" id="nref_2" mapRef="lvl2_2">
      <navLabel>
        <text>2</text>
        <audio src="rs_fwdx.mp3" clipBegin="02:00.6" clipEnd="02:01.4" />
      </navLabel>
      <content src="sample.smil#nref_2" />
    </navTarget>
    <navTarget class="note" id="nref_3" mapRef="lvl2_2">
      <navLabel>
        <text>3</text>
        <audio src="rs_fwdx.mp3" clipBegin="03:13.3" clipEnd="03:14.1" />
      </navLabel>
      <content src="sample.smil#nref_3" />
    </navTarget>
  </navList>
</ncx>
This standard establishes a specific XML file format to support bookmark and highlight export and import. A playback system may allow readers to set bookmarks and to highlight passages in a document, label the marked sections with text or audio notes, and export the resulting collection of marks and notes to other compliant playback devices.
This standard does not require that compliant players support all of the functionality described above. In addition, this standard places no constraints on a playback system's internal system for storing or manipulating the information in the bookmark file. However, if a player supports the export of bookmarks and highlights and their associated notes, the player must format the information as a valid XML file conforming to bookmark100.dtd, the DTD for Portable Bookmarks/Highlights found in Appendix 4. Similarly, a player with bookmark/highlight import capabilities must correctly process bookmarks and highlights and associated notes that are formatted in accordance with bookmark100.dtd.
Export-capable players must be able to set bookmarks and highlight starts and ends at any point in a DTB, whether based on the audio file or the textual content file. That is, players shall not be limited to capturing location information only at element boundaries. Offsets from element boundaries in audio files shall be identified by <timeOffset> in fractional seconds (Seconds = DIGIT+, Fraction = 3DIGIT). Offsets from element boundaries in textual content files shall be identified by <charOffset>, measured in characters, counting from the nearest previous tag with an id; white space is normalized (collapsed to one character) and tags are not counted.
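The two offset rules can be sketched in a few lines of Python. This is an informal illustration under stated assumptions: the function names are hypothetical, tag removal uses a simple regular expression that ignores comments, CDATA sections, and entity references, and the markup passed in is assumed to be the content lying between the nearest preceding tag with an id and the marked point. Whether a leading run of white space counts as one character is not spelled out in the text; here it does.

```python
import re


def char_offset(markup_before_point):
    """Count characters up to the marked point.

    Tags are not counted, and each run of white space is collapsed to a
    single character before counting.
    """
    text = re.sub(r"<[^>]*>", "", markup_before_point)  # drop tags
    text = re.sub(r"\s+", " ", text)                    # normalize white space
    return len(text)


def format_time_offset(seconds):
    """Render a time offset as DIGIT+ '.' 3DIGIT, e.g. 22.5 -> '22.500'."""
    if seconds < 0:
        raise ValueError("time offsets cannot be negative")
    return f"{seconds:.3f}"


print(char_offset("Call <em>me</em>\n   Ishmael"))
print(format_time_offset(22.5))
```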
If a playback device supports user-recording of audio notes on bookmarks or highlights that may be exported, the recording may be in any format supported by the standard. When generating the filename for a note, the playback device must generate a filename extension appropriate to the recording format (See section 5, "Audio File Formats.")
Bookmark files (which may include highlights) shall be named, by default, with the value from the bookmark element uid and the extension ".bmk". For example: "se-tpb-14339.bmk". Players may allow users to apply their own filenames to accommodate character limitations in other filesystems and to avoid filename collisions. To accommodate user-supplied names, players with bookmark import capabilities must be able to open bookmark files and read the uid value to match the correct bookmark file with a DTB. In addition, players offering import functionality must, at a minimum, automatically match an imported file with the currently loaded DTB. It is recommended that if more than one bookmark file is present for a given DTB, players allow the user to choose among them.
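A rough Python sketch of the naming and matching rules follows. It assumes bookmark files are available as XML strings keyed by filename; the helper names, the placeholder title "Sample", the user-chosen filename "my notes.bmk", and the second uid are all hypothetical. Matching relies on the uid element inside each file rather than on the filename, as the text requires.

```python
import xml.etree.ElementTree as ET


def default_bookmark_filename(uid):
    """Default name: the uid value plus the '.bmk' extension."""
    return f"{uid}.bmk"


def bookmark_uid(bookmark_xml):
    """Read the uid element from a bookmark file's XML text."""
    uid = ET.fromstring(bookmark_xml).findtext("uid")
    if uid is None:
        raise ValueError("bookmark file has no uid element")
    return uid.strip()


def matching_files(bookmark_files, dtb_uid):
    """Return the filenames whose uid matches the loaded DTB, in input order."""
    return [name for name, xml_text in bookmark_files.items()
            if bookmark_uid(xml_text) == dtb_uid]


def make_sample(uid):
    # Minimal bookmarkSet with a placeholder title (hypothetical content).
    return (f"<bookmarkSet><title><text>Sample</text></title>"
            f"<uid>{uid}</uid></bookmarkSet>")


files = {
    "se-tpb-14339.bmk": make_sample("se-tpb-14339"),
    "my notes.bmk": make_sample("se-tpb-14339"),   # user-renamed file
    "other.bmk": make_sample("se-tpb-99999"),      # belongs to another DTB
}
candidates = matching_files(files, "se-tpb-14339")
print(candidates)
```

When more than one candidate is returned, a player following the recommendation above would let the user choose among them.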
Players may implement a variety of systems for numbering or otherwise identifying bookmarks or highlighted sections so the user can step through and choose from a group of them. However, when preparing a bookmark file for export, players must sort the bookmarks and highlights into document order and write them in that order.
Brief descriptions of the Bookmark/Highlight elements follow. Each includes the element declaration extracted from the Bookmark DTD found in Appendix 4, along with descriptions of any applicable attributes.
<!ELEMENT bookmarkSet (title, uid, lastmark?, (bookmark | hilite)*) >
<bookmarkSet>...content...</bookmarkSet>
<!ELEMENT title (text, audio?) >
<title>...content...</title>
<bookmarkSet>
<!ELEMENT text (#PCDATA)>
<text>...content...</text>
<title>, <note>
<!ELEMENT audio EMPTY >
<audio ...attributes... />
clipBegin attribute specifies the beginning of a segment of a continuous media object as a time offset from the start of the media object. The value syntax is defined by the SMIL 2.0 Timing and Synchronization Module [SMILCLOCK].
clipEnd attribute specifies the end of a segment of a continuous media object as a time offset from the start of the media object. It uses the same attribute value syntax as clipBegin.
<title>, <note>
package element. See section 3.1, "Package Identity."
<uid>...content...</uid>
<bookmarkSet>
<par> or <seq> element in the SMIL file that contains the lastmark, plus a time offset or character offset to the exact point.
<!ELEMENT lastmark (ncxRef, uri, (timeOffset | charOffset)) >
<lastmark>...content...</lastmark>
<bookmarkSet>
<lastmark> is set automatically by the playback device.
navPoint) at time lastmark, bookmark, or highlight is set. Ensures that current location in NCX and SMIL are synchronized after moving to a lastmark, bookmark, or highlight so that any global navigation commands issued by the user will start from the current location.
<!ELEMENT ncxRef (#PCDATA)>
<ncxRef>...content...</ncxRef>
<lastmark>, <bookmark>, <hiliteStart>, <hiliteEnd>
<par> or <seq> in SMIL, to id in text-only file, or to audio file that contains the <lastmark>, <bookmark>, <hiliteStart>, or <hiliteEnd>.
<!ELEMENT uri (#PCDATA)>
<uri>...content...</uri>
<lastmark>, <bookmark>, <hiliteStart>, <hiliteEnd>
<uri> points to the id of the <par> or <seq> in the SMIL file that contains the <lastmark>, <bookmark>, <hiliteStart>, or <hiliteEnd>.
<lastmark>, <bookmark>, <hiliteStart> or <hiliteEnd> in audio file referenced (via the SMIL file) by the uri; in seconds, measured from beginning of audio file.
<!ELEMENT timeOffset (#PCDATA) >
<timeOffset>...seconds.fraction...</timeOffset>
<lastmark>, <bookmark>, <hiliteStart>, <hiliteEnd>
bookmark, lastmark, hiliteStart, or hiliteEnd in textual content file referenced by the uri.
<!ELEMENT charOffset (#PCDATA) >
<charOffset>...content...</charOffset>
<lastmark>, <bookmark>, <hiliteStart>, <hiliteEnd>
<par> or <seq> element in the SMIL file that contains the bookmark, plus a time offset or character offset to the exact point.
<!ELEMENT bookmark (ncxRef, uri, (timeOffset | charOffset), note?) >
<bookmark>...content...</bookmark>
<bookmarkSet>
<!ELEMENT note (text?, audio?) >
<note>...content...</note>
<hilite>, <bookmark>
<notes> need not support recording in all of the codecs allowed by this standard.
<!ELEMENT hilite (hiliteStart, hiliteEnd, note?) >
<hilite>...content...</hilite>
<bookmarkSet>
<par> or <seq> element in the SMIL file that contains the beginning of the highlighted section, plus a time offset or character offset to the exact point.
<!ELEMENT hiliteStart (ncxRef, uri, (timeOffset | charOffset)) >
<hiliteStart>...content...</hiliteStart>
<hilite>
<par> or <seq> element in the SMIL file that contains the end of the highlighted section, plus a time offset or character offset to the exact point.
<!ELEMENT hiliteEnd (ncxRef, uri, (timeOffset | charOffset)) >
<hiliteEnd>...content...</hiliteEnd>
<hilite>
In Example 9.1, the reader has set two bookmarks, one in chapter 1, 22 seconds from the start of paragraph 8 and the other in chapter 3, 88 seconds from the start of paragraph 12. The reader has added the text note "Atlanta burns" to the second bookmark. The reader has also highlighted a passage in chapter 4 beginning at the start of paragraph 1 and ending 246 seconds after the start of paragraph 6, labeling it with a ten-second audio comment. The reader last stopped reading (as indicated by the <lastmark>) in chapter 5, paragraph 23. The default filename for this bookmark file would be "us-rfbd-JT065.bmk".
Example 9.1:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE bookmarkSet SYSTEM "bookmark100.dtd">
<bookmarkSet>
  <title>
    <text>Gone with the Wind</text>
    <audio src="gwtw_title.mp3" />
  </title>
  <uid>us-rfbd-JT065</uid>
  <lastmark>
    <ncxRef>gwtw.ncx#lvl1_5</ncxRef>
    <uri>gwtw_ch5.smil#para023</uri>
    <timeOffset>173</timeOffset>
  </lastmark>
  <bookmark>
    <ncxRef>gwtw.ncx#lvl1_1</ncxRef>
    <uri>gwtw_ch1.smil#para008</uri>
    <timeOffset>22</timeOffset>
  </bookmark>
  <bookmark>
    <ncxRef>gwtw.ncx#lvl1_3</ncxRef>
    <uri>gwtw_ch3.smil#para012</uri>
    <timeOffset>88</timeOffset>
    <note>
      <text>Atlanta burns.</text>
    </note>
  </bookmark>
  <hilite>
    <hiliteStart>
      <ncxRef>gwtw.ncx#lvl1_4</ncxRef>
      <uri>gwtw_ch4.smil#para001</uri>
      <timeOffset>0</timeOffset>
    </hiliteStart>
    <hiliteEnd>
      <ncxRef>gwtw.ncx#lvl1_4</ncxRef>
      <uri>gwtw_ch4.smil#para006</uri>
      <timeOffset>246</timeOffset>
    </hiliteEnd>
    <note>
      <audio src="us-rfbd-JT065.wav" clipBegin="00:00.00" clipEnd="00:10.00" />
    </note>
  </hilite>
</bookmarkSet>
Example 9.2 shows a text-only file in which the reader last stopped reading 130 characters after the start of paragraph 297.
Example 9.2:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE bookmarkSet SYSTEM "bookmark100.dtd">
<bookmarkSet>
  <title>
    <text>Chemistry Today</text>
  </title>
  <uid>uk-rnib-MM499</uid>
  <lastmark>
    <ncxRef>chemtd.ncx#lvl1_3</ncxRef>
    <uri>chemtd.xml#para297</uri>
    <charOffset>130</charOffset>
  </lastmark>
</bookmarkSet>
The optional Resource File supplies text segments and pointers to audio clips or images that may assist the reader in using a DTB. These media objects or "resources" provide information missing from a document or present only in a form inaccessible to the reader. Some examples of applications are:
The Resource File, then, can contain three types of information:
1. Resources tailored to a given document, for use primarily during global navigation. As the user navigates via the NCX, the player, when necessary, will look in the Resource File to locate the resource whose type attribute equals "ncx," whose elementRef attribute value is navPoint or navTarget as appropriate, and whose classRef references the class of the current NCX navPoint or navTarget.
2. Generic representations of the names of elements from the DTBook DTD, for use during local navigation. As the reader issues local navigation commands referencing the textual content file, the player will use the name of the current element in the textual content file to locate the resource with that element name in its elementRef attribute. For example, encountering a paragraph (tagged with <p>...</p>) would call the resource with elementRef equal to "p". In addition, the classRef attribute on resource allows the DTB producer to create resources tailored to elements with specific class names. For example, different resources could be created for <w class="reservedword">...</w> and <w class="variablename">...</w>.
3. Representations of skippable structures listed in the head of the NCX. The player will locate the resource whose type attribute equals "ncx," whose elementRef attribute value is smilCustomTest and whose idRef attribute references the id of the current smilCustomTest element. For example, the smilCustomTest element tagged <smilCustomTest id="prodnote" /> would call the resource with idRef equal to "prodnote".
The text, audio, and image alternatives allow a resource to be presented in a medium appropriate to the playback system's capabilities and the user's preferences. Images are conceived as holding iconic representations of heading types. The lang attribute on the resource element allows alternative representations to be supplied in multiple languages.
Resources would be called only when appropriate; that is, in response to clear user requirements and when needed. For example, a resource with type="ncx" and classRef="chapter" would not be called if a chapter heading with textual and audio content was already present.
If a Resource File is implemented, it must meet the following requirements.
The Resource File is a valid XML 1.0 file conforming to the Document Type Definition resource100.dtd. See Appendix 5, "DTD for Resource File." The version attribute on the resources element of any compliant Resource File must be present and contain the value drawn from the above-named DTD. Parsers cannot enforce the presence of this attribute, so other mechanisms must. The Resource File shall be named with the extension ".res". Identical copies of the Resource File shall be distributed on each media unit of the DTB.
Brief descriptions of the elements follow. Each includes the element declaration extracted from the Resource DTD, along with descriptions of any applicable attributes.
<!ELEMENT resources (head?, resource+) >
<resources ...attribute...>...content...</resources>
<!ELEMENT head (meta*) >
<head>...content...</head>
<!ELEMENT meta EMPTY >
<meta ...attributes... />
head
<!ELEMENT resource ((text, audio?, img?) | (text?, audio, img?)) >
<resource ...attributes...>...content...</resource>
dtbook) or the NCX (ncx).
resource is to be supplied.
resource is to be supplied. See section 10.3, "Resource File Requirements" for normative content.
smilCustomTest element in NCX for which the resource is to be supplied.
<resources>
<!ELEMENT text (#PCDATA) >
<text>...content...</text>
<resource>
<!ELEMENT audio EMPTY >
<audio ...attributes... />
clipBegin.
<resource>
<!ELEMENT img EMPTY >
<img ...attributes... />
<resource>
If a player implementing resource functionality for DTBook elements encounters an element in the textual content file that includes a class attribute, the player must present the associated resource with the corresponding classRef, if one exists. Otherwise, if the appropriate resource without a classRef exists, the player must present it.
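The fallback rule above can be sketched as a two-pass lookup. The Python below is an illustrative assumption about how a player might implement it, not a required algorithm: find_resource is a hypothetical name, the lang attribute is ignored for brevity, and the first resource in the sample (a "code" resource without classRef) is invented to demonstrate the fallback. The other two entries follow Example 10.1.

```python
import xml.etree.ElementTree as ET

# The classRef-less "code" resource is hypothetical; the "javascript" and
# "p" resources mirror Example 10.1 with audio elements omitted.
RES_SAMPLE = """
<resources version="1.0.0">
  <resource type="dtbook" elementRef="code" lang="en">
    <text>code</text>
  </resource>
  <resource type="dtbook" elementRef="code" classRef="javascript" lang="en">
    <text>javascript</text>
  </resource>
  <resource type="dtbook" elementRef="p" lang="en">
    <text>paragraph</text>
  </resource>
</resources>
"""


def find_resource(resources, element_name, class_name=None, res_type="dtbook"):
    """Prefer a resource with a matching classRef; else one without classRef."""
    candidates = [r for r in resources.findall("resource")
                  if r.get("type") == res_type
                  and r.get("elementRef") == element_name]
    if class_name is not None:
        for res in candidates:
            if res.get("classRef") == class_name:
                return res
    # Fall back to the generic resource for this element, if any.
    for res in candidates:
        if res.get("classRef") is None:
            return res
    return None


root = ET.fromstring(RES_SAMPLE)
print(find_resource(root, "code", "javascript").findtext("text"))
print(find_resource(root, "code", "python").findtext("text"))
```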
In Example 10.1, the Resource File contains a resource for the word "chapter" to be presented when encountering navPoints of this class in the NCX. Resources are supplied for four selected DTBook elements; the last of these resources uses the classRef attribute to specify a given class of the element code. Finally, a resource is provided for a smilCustomTest with an id of "prodnote."
Example 10.1:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE resources PUBLIC "-//NISO//DTD resource v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/resource100.dtd">
<resources version="1.0.0">
  <resource type="ncx" elementRef="navPoint" classRef="chapter" lang="en">
    <text>Chapter</text>
    <audio src="chapter.mp3" />
    <img src="chapter.png" />
  </resource>
  <resource type="dtbook" elementRef="li" lang="en">
    <text>list item</text>
    <audio src="elemres.mp3" clipBegin="00:36" clipEnd="00:38.14" />
  </resource>
  <resource type="dtbook" elementRef="p" lang="en">
    <text>paragraph</text>
    <audio src="elemres.mp3" clipBegin="00:47.51" clipEnd="00:49.34" />
  </resource>
  <resource type="dtbook" elementRef="td" lang="en">
    <text>table cell</text>
    <audio src="elemres.mp3" clipBegin="01:22.12" clipEnd="01:24.01" />
  </resource>
  <resource type="dtbook" elementRef="code" classRef="javascript" lang="en">
    <text>javascript</text>
    <audio src="elemres.mp3" clipBegin="01:45.15" clipEnd="01:47.01" />
  </resource>
  <resource type="ncx" elementRef="smilCustomTest" idRef="prodnote" lang="en">
    <text>producer's note</text>
    <audio src="elemres.mp3" clipBegin="01:54.17" clipEnd="01:56.44" />
  </resource>
  ...
</resources>
In Example 10.2, resources are supplied in both English and Danish for a book whose NCX carries English class names on navPoints (e.g., "chapter"). The lang attribute on resource controls which will be presented to the reader.
Example 10.2:
...
<resource type="ncx" elementRef="navPoint" classRef="chapter" lang="da">
  <text>Kapitel</text>
  <audio src="kapitel.mp3" clipBegin="00:00" clipEnd="00:02.23" />
  <img src="Kapitel.png" />
</resource>
<resource type="ncx" elementRef="navPoint" classRef="chapter" lang="en">
  <text>Chapter</text>
  <audio src="chapter.mp3" clipBegin="00:00" clipEnd="00:02.01" />
  <img src="chapter.png" />
</resource>
...
If DTBs are distributed on a physical medium such as CD-ROM, producers will sometimes put more than one book on a disk and sometimes use more than one disk to hold a single book. When multiple DTBs are included on a single distribution medium ("media unit"), a simple method of storing this information for easy access by the player is needed, to present to the reader a "bookshelf" of books. When a single DTB spans several media, the player needs access to specific information so that it can provide correct instructions to the reader, e.g., "Insert disk 2," when required. The "Distribution Information File" (or "distInfo File") stores the data needed for these purposes.
In the following scenarios, the player would need accurate "distribution information" to respond appropriately:
Lastmark" will point to another disk.
A distInfo File would normally be created for each type of distribution medium, whereas other DTB files would be unchanged regardless of how a DTB is distributed.
When distributing one DTB per media unit, the Package File must be placed in the root of the media unit's file system. When distributing multiple DTBs per media unit, the distInfo File alone must be placed in the root of the media unit's file system. These restrictions do not apply when a DTB is contained on a non-removable storage medium such as a hard drive.
The distInfo File is required on all media units for a given DTB when that DTB spans more than one media unit or when multiple DTBs are contained on one media unit. Otherwise, a distInfo File is optional. There shall be no more than one distInfo File per media unit (e.g., CD-ROM disk).
The distInfo File, if present, must be a valid XML 1.0 file conforming to distinfo100.dtd (see Appendix 6, "Distribution Information DTD"), and shall be named "distInfo.dinf". The version attribute on the distInfo element must be present and contain the value drawn from the above-named DTD. Parsers will not enforce the presence of this attribute, so other mechanisms must.
Distribution on multiple media units has implications for the production of the NCX and SMIL. For the NCX, see section 8.4.2, "DTBs Spanning Multiple Media Units." For SMIL, see section 7.4.4, "Packaging Files across Several Media Units."
Optional changeMsgs may be used to supply customized messages instructing users on how to proceed when another media unit is needed to continue reading. Such changeMsgs enable presentation of messages in either text or audio. If no changeMsg is present when required, the player must render a default audio or text message (e.g., "please insert disk 2").
Values for the attribute media on the element <book> and for the attribute mediaRef on the elements smilRef and changeMsg shall be in the format "x:y", where x is the sequence number of this media unit, and y is the total number of media units in the distribution of this book.
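A small Python sketch of parsing the "x:y" format is shown below. The function name parse_media is hypothetical, and the range check (sequence number between 1 and the total) is an assumption that follows from the definitions of x and y rather than a rule stated explicitly in the text.

```python
def parse_media(value):
    """Split an 'x:y' media value into (sequence number, total units)."""
    seq_text, sep, total_text = value.partition(":")
    if not sep or not seq_text.isdigit() or not total_text.isdigit():
        raise ValueError(f"not in x:y format: {value!r}")
    seq, total = int(seq_text), int(total_text)
    # Assumed sanity check: unit x must be one of the y units of the book.
    if not 1 <= seq <= total:
        raise ValueError(f"sequence number out of range: {value!r}")
    return seq, total


print(parse_media("1:3"))
```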
<!ELEMENT distInfo (book+) >
<distInfo ...attribute...>...content...</distInfo>
<!ELEMENT book (distMap?, changeMsg*)>
<book ...attributes...>...content...</book>
package element. See section 3.1, "Package Identity."
media attribute identifies the media unit in hand, in the format "x:y", where x is the sequence number of this media unit, and y is the total number of media pieces in the distribution of this book.
<distInfo>
distMaps and zero or more changeMsgs.
<!ELEMENT distMap (smilRef+) >
<distMap>...content...</distMap>
<book>
smilRefs.
<!ELEMENT smilRef EMPTY >
<smilRef ...attributes... />
<distMap>
<!ELEMENT changeMsg ((text, audio?) | (text?, audio)) >
<changeMsg ...attributes...>...content...</changeMsg>
<changeMsg> by matching its mediaRef attribute to the mediaRef attribute of the selected <smilRef>. In the format "x:y", where x is the sequence number of the specified media unit, and y is the total number of media pieces in the distribution of this book.
<book>
<!ELEMENT text (#PCDATA) >
<text>...content...</text>
<changeMsg>
<!ELEMENT audio EMPTY>
<audio ...attributes... />
<changeMsg>
Example 11.1 shows the distInfo File for the first disk of a book that spans three CD-ROMs. The book element identifies the book through the uid attribute, points to the package file via pkgRef and indicates in the media attribute that this disk is the first of three. Players would parse the package file to obtain book metadata, etc. The distMap element contains a smilRef for each SMIL file in the book (there are 10 in this particular case). The file attribute gives the name of each individual SMIL file. The mediaRef attribute indicates which disk that particular SMIL file (and all audio/text/image files referenced by it) resides upon.
Players would refer to this map when a particular SMIL file is targeted for playback; if the file is not present on the current disk, the changeMsg whose mediaRef attribute matches that of the selected smilRef element would be played.
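The lookup just described might be sketched in Python as follows. This is an illustrative approximation: media_prompt is a hypothetical name, the distInfo fragment is abbreviated from Example 11.1 with only one changeMsg kept, only text messages are handled (audio-only changeMsgs fall through to the default), and the wording of the default message is an assumption modeled on the "please insert disk 2" example.

```python
import xml.etree.ElementTree as ET

# Abbreviated from Example 11.1: three smilRefs and a single changeMsg.
DISTINFO_SAMPLE = """
<distInfo version="1.0.0">
  <book uid="us-rfbd-tbfz284" pkgRef="./FZ284.opf" media="1:3">
    <distMap>
      <smilRef file="FZ284_0001d.smil" mediaRef="1:3"/>
      <smilRef file="FZ284_0005d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0010d.smil" mediaRef="3:3"/>
    </distMap>
    <changeMsg mediaRef="2:3"><text>Insert disc two.</text></changeMsg>
  </book>
</distInfo>
"""


def media_prompt(book, smil_file):
    """Return None if smil_file is on the current unit, else a prompt text."""
    needed = None
    for ref in book.iter("smilRef"):
        if ref.get("file") == smil_file:
            needed = ref.get("mediaRef")
            break
    if needed is None:
        raise KeyError(smil_file)
    if needed == book.get("media"):
        return None  # the file is on the media unit in hand
    # Look for a producer-supplied message for the required unit.
    for msg in book.findall("changeMsg"):
        if msg.get("mediaRef") == needed:
            text = msg.findtext("text")
            if text:
                return text
    # No usable changeMsg: fall back to a default message (assumed wording).
    return f"Please insert disk {needed.split(':')[0]}."


book = ET.fromstring(DISTINFO_SAMPLE).find("book")
print(media_prompt(book, "FZ284_0005d.smil"))
```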
Example 11.1:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE distInfo SYSTEM "distInfo100.dtd">
<distInfo version="1.0.0">
  <book uid="us-rfbd-tbfz284" pkgRef="./FZ284.opf" media="1:3">
    <distMap>
      <smilRef file="FZ284_0001d.smil" mediaRef="1:3"/>
      <smilRef file="FZ284_0002d.smil" mediaRef="1:3"/>
      <smilRef file="FZ284_0003d.smil" mediaRef="1:3"/>
      <smilRef file="FZ284_0004d.smil" mediaRef="1:3"/>
      <smilRef file="FZ284_0005d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0006d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0007d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0008d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0009d.smil" mediaRef="2:3"/>
      <smilRef file="FZ284_0010d.smil" mediaRef="3:3"/>
    </distMap>
    <changeMsg mediaRef="1:3">
      <text>Insert disc one.</text>
      <audio src="insert1.wav"/>
    </changeMsg>
    <changeMsg mediaRef="2:3">
      <text>Insert disc two.</text>
      <audio src="insert2.wav"/>
    </changeMsg>
    <changeMsg mediaRef="3:3">
      <text>Insert disc three.</text>
      <audio src="insert3.wav"/>
    </changeMsg>
  </book>
</distInfo>
In Example 11.2, a sample distInfo File is presented for a case where two books are included on one CD-ROM. The file contains pointers to two book package files. Both books are complete on this one media unit so the media attribute is omitted. Players would parse the package files to obtain book metadata, etc.
Example 11.2:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE distInfo SYSTEM "distInfo100.dtd">
<distInfo version="1.0.0">
  <book uid="us-nls-db00001" pkgRef="./book1/AllAboutDogs.opf" />
  <book uid="us-nls-db98765" pkgRef="./book2/AllAboutCats.opf" />
</distInfo>
The W3C has defined two mechanisms for separating content from presentation: Cascading Style Sheets [CSS] and the Extensible Stylesheet Language [XSL]. CSS (for which two levels of functionality are currently defined, Level 1 [CSS1] and Level 2 [CSS2]) and XSL allow specific formatting rules for mark-up to be defined and stored independently of the actual content. Default rules are normally applied by the specific playback or rendering system. The CSS cascade provides a defined mechanism by which style rules may also be applied by the content producer and by the user. Producer-supplied style sheets are particularly important for complex documents with formatting or presentational requirements that would not be met by a player's or user's default styles.
CSS or XSL files may be provided by the content producer to control visual formatting of textual content when a DTB is played on a system that incorporates a visual display and supports CSS or XSL.
If a refreshable Braille display is connected to a DTB player, a Braille style sheet can control formatting so that the document is more easily navigable.
Audio CSS (ACSS, part of CSS2) and XSL also support the aural equivalent of visual formatting, and allow for audio cues to be associated with textual content mark-up. For example, chapter starts or page breaks may be indicated with a specific audio cue.
Style sheets are optional components of DTBs and DTB distribution systems. DTB producers may choose to supply default style sheets for any of the above three categories.
Style sheets must not be written in such a way as to prevent users from overriding them. DTBs referencing style sheets must do so using standard W3C mechanisms to link an XML source to its style sheet (see [XML-Style]).
All style sheet processing instructions must include the media attribute specifying which medium the style sheet applies to. Acceptable values are: all (for all media), aural (for audio presentations), braille (for refreshable Braille displays), embossed (for embossed Braille), handheld (for devices with small monochrome screens), print (for visual formatting of printed output), and screen (for color computer screens). For example:
<?xml-stylesheet href="brstyle.css" type="text/css"
media="braille"?>
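One way a player might pick the style sheets that apply to its output medium is sketched below in Python. It is an illustration only: the function name `stylesheets_for`, the regular expression for pseudo-attributes, and the second style sheet `screen.css` in the sample are assumptions, not part of the standard. The sketch treats the media value as a comma-separated list and also accepts a style sheet declared for all media.

```python
import re
from xml.dom import minidom

# Matches name="value" or name='value' pairs inside the processing instruction.
PSEUDO_ATTR = re.compile(r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

# Hypothetical document prolog with two style sheet processing instructions.
SAMPLE = (
    '<?xml-stylesheet href="brstyle.css" type="text/css" media="braille"?>\n'
    '<?xml-stylesheet href="screen.css" type="text/css" media="screen"?>\n'
    "<dtbook/>"
)


def stylesheets_for(xml_text, medium):
    """Return the href of every xml-stylesheet PI that applies to `medium`."""
    doc = minidom.parseString(xml_text)
    hrefs = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = {name: dq or sq
                     for name, dq, sq in PSEUDO_ATTR.findall(node.data)}
            media = [m.strip().lower()
                     for m in attrs.get("media", "").split(",")]
            if medium in media or "all" in media:
                hrefs.append(attrs.get("href"))
    return hrefs


print(stylesheets_for(SAMPLE, "braille"))
```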
Playback systems that utilize common PC-based browsers should support presentation styles at least to the extent the browser itself does. However, it is strongly recommended that any DTB player incorporating a visual display implement at least CSS1. Portable players will not generally provide full support for style sheets but may implement a subset of CSS or XSL sufficient for DTB use and the media presented on the player. For example, an audio-only player that is aware of the textual content might support only the audio styles described above.
Developers of playback systems may implement user interface features that support local control of style sheets, thereby allowing the user to define styles that supersede default player- or producer-defined styles. It is strongly recommended that players implementing style sheets support user control of presentation styles.
When multiple style sheets are present for the content being rendered, user-defined styles, if present, shall take precedence, followed by producer-defined and player-defined styles, in that order.
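The precedence rule can be illustrated with a deliberately simplified Python sketch. It assumes each style source has already been reduced to a flat mapping of property names to values, which ignores selectors, specificity, and the rest of the CSS cascade; the function name `resolve_styles` and the sample properties are hypothetical.

```python
def resolve_styles(player=None, producer=None, user=None):
    """Merge three property mappings so that user-defined styles win over
    producer-defined styles, which in turn win over player defaults."""
    resolved = {}
    # Apply the lowest-precedence layer first so later layers overwrite it.
    for layer in (player, producer, user):
        resolved.update(layer or {})
    return resolved


styles = resolve_styles(
    player={"font-size": "12pt", "color": "black"},
    producer={"font-size": "14pt"},
    user={"color": "yellow"},
)
print(styles)
```

Here the producer's font size replaces the player default, and the user's color replaces both.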
Digital talking books produced in compliance with this standard fall into six types representing the proportions in which six key files are present. In all six types, the Package File spine defines the linear reading order of the DTB. A DTB which incorporates audio and textual content files for the full contents of the document, as well as a structured navigation control file (NCX) for efficient navigation (type 4 - audioFullText), offers the most features to a reader.
<audio> elements in sequence. This type of DTB will be represented primarily by "legacy" titles transferred from analog to digital form. Direct navigation to points within the DTB is not possible for the reader.
<audio> elements in sequence. The reader can navigate directly only to items included in the NCX.
<audio> elements in sequence.
<text> (and any synchronized <image>) elements in sequence. This type includes books such as dictionaries, where the full text is present but the only audio contains human speech recordings of word pronunciations.
<text> elements in sequence, synchronized with any images present. There are no audio files.
The following table shows the six types of DTB and whether each of six files is required (R), optional (O), or not applicable (N/A) for each. Note that the Open eBook Package File (OPF), the navigation control file (NCX), and the SMIL file(s) are required for all types, although the latter two may serve merely as pointers to other files in some cases.
DTB Type | OPF | NCX | Audio | Text | SMIL | Image |
---|---|---|---|---|---|---|
audioOnly (Full audio only) | R | R | R | N/A | R | N/A |
audioNCX (Full audio+structure) | R | R | R | O | R | N/A |
audioPartText (Audio+structure+partial text) | R | R | R | R | R | O |
audioFullText (Audio+structure+full text) | R | R | R | R | R | O |
textPartAudio (Full text+structure+partial audio) | R | R | R | R | R | O |
textNCX (Full text+structure, no audio) | R | R | N/A | R | R | O |
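A production or validation tool could encode the table directly and check a file set against it. The Python sketch below transcribes the R/O/N/A values from the table above; the names `REQUIREMENTS` and `missing_required` are illustrative assumptions, not identifiers defined by the standard.

```python
# Rows and columns transcribed from the table above.
COLUMNS = ("OPF", "NCX", "Audio", "Text", "SMIL", "Image")
REQUIREMENTS = {
    "audioOnly":     ("R", "R", "R",   "N/A", "R", "N/A"),
    "audioNCX":      ("R", "R", "R",   "O",   "R", "N/A"),
    "audioPartText": ("R", "R", "R",   "R",   "R", "O"),
    "audioFullText": ("R", "R", "R",   "R",   "R", "O"),
    "textPartAudio": ("R", "R", "R",   "R",   "R", "O"),
    "textNCX":       ("R", "R", "N/A", "R",   "R", "O"),
}


def missing_required(dtb_type, present):
    """Return the sorted names of required file kinds absent from `present`."""
    row = dict(zip(COLUMNS, REQUIREMENTS[dtb_type]))
    return sorted(name for name, status in row.items()
                  if status == "R" and name not in present)


print(missing_required("textNCX", {"OPF", "NCX", "SMIL"}))
```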
Players must determine how to render content from the types of files present. If only a textual content file is found, a synthetic speech rendering and output to a braille display and/or screen may be presented, according to the user's preferences and the features provided on the playback system. If only an audio file is present, straight audio playback shall be initiated. A player that supports only a subset of the media included in DTBs must, when encountering an unsupported medium, ignore the unsupported files and correctly render those it does support. In addition, if the playback system cannot render the DTB in any way, based on the value of dtb:multimediaType in the package file metadata, it must report this fact to the user. Further, a playback system should inform the user when unable to render in any way specific content it encounters in the DTB.
Protection of intellectual property will continue to be an important issue for national libraries and other agencies serving people with print disabilities. How this responsibility is met in Digital Talking Book distribution programs, however, will vary from country to country due to differences in the legal environment surrounding the distribution of alternative format materials. It will also vary by item, depending on whether the material is under copyright or in the public domain. When applicable, however, it is critical that agencies utilize reasonable administrative and technical measures to protect copyright holders' rights. It is equally important, though, that agencies ensure access to alternative format materials by their target populations. Thus, DTB producers and distributors that implement DRM systems must do so in a manner that does not limit or prevent access to compliant DTBs by eligible users.
It is strongly recommended that playback systems implement Time-Scale Modification (TSM) to enable user control of playback speed with or without pitch correction. Playback rates continuously variable from one-third to three times normal speed are recommended.
All time offsets in a DTB (e.g., SMIL and NCX clipBegin/clipEnd, bookmark timeOffsets, etc.) are based on normal play speed. In order to maintain synchronization, a player must process time offsets independently of actual playback speed.
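The relationship between wall-clock time and normal-speed offsets can be sketched as follows. This Python fragment is an illustration under the assumption that offsets are held as seconds in floating point; the function names are hypothetical and a real player would parse the SMIL clock values first.

```python
def media_offset(start_offset, wall_clock_elapsed, rate):
    """Normal-speed offset reached after playing for `wall_clock_elapsed`
    seconds at playback rate `rate`, starting from `start_offset`."""
    return start_offset + wall_clock_elapsed * rate


def wall_clock_duration(clip_begin, clip_end, rate):
    """Real time needed to play a clip whose bounds are normal-speed offsets."""
    return (clip_end - clip_begin) / rate


# Four seconds of listening at double speed advance the position by eight
# seconds of normal-speed time, so a bookmark stored here stays valid at any rate.
print(media_offset(10.0, 4.0, 2.0))
print(wall_clock_duration(0.0, 30.0, 3.0))
```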
This standard defines two kinds of conformance: file conformance and player conformance. Conformant Digital Talking Books and DTB playback systems must meet all of the applicable requirements specified in the normative sections of this standard. Requirements will vary depending on the media included in a DTB and the functions supported by a DTB player. It should be noted that while many aspects of the standard can be enforced through the DTDs included in this standard, others cannot, and must be enforced through other means.
The following standards, recommendations, and guidelines are referenced by this standard:
<!-- DTBook DTD V1.0.0 2001-09-28 Harvey Bingham -->
<!-- file: dtbook100.dtd -->
<!--dtbook XML Document Type Definition
Implementing the NISO Digital Talking Book V1.0.0
Harvey Bingham <hbingham@acm.org>
George Kerscher <kerscher@montana.com>
Michael Moodie <mmoo@loc.gov>
David Pawson <dpawson@rnib.org.uk>
Assisted by DAISY Consortium and NISO DTB Committee work teams.

1. Purpose
The Digital Talking Book Document Type Definition (DTD) provides the means to mark up the text of a published book to permit support for the combination of professional narration and navigation into that narration. The markup tags in the book convey its content in structure, and contain some metadata about the book content and its structure. The Document Type Definition (DTD) names and defines the allowable element types, their allowable content, and their attributes.
Correct markup of the text of the book permits the textual material to be synchronized using SMIL files with the professionally narrated version of that book. The synchronization can permit concurrent display of the text being narrated. The textual content can be searched in context to locate material desired for narration.
More detailed documentation of this dtbook dtd [DTBOOKDTD] is available as an html document. See [DTBOOKDOC].

1.1. Prior Related Work
The DAISY (Digital Audio-based Information SYstem) Consortium contributed substantially to the development of this DTD. This application of XML is the next generation after several DAISY versions of 2.X specifications, see [DAISY202]. Its Navigation Control Center (NCC) provided for synchronizing document structure with narration. The NCC evolved into an XML application called the "Navigation Control File for XML applications" (NCX). Its content is derived from the markup of documents tagged using the dtbook DTD. Richer structuring capability is one of the objectives of that DTD.
The Synchronized Multimedia Integration Language [SMIL 2.0] is used to provide synchronized narrations and text. The NCX provides navigation using the identified elements of documents tagged to this DTD. The dtbook DTD includes many, but not all, of the element types found in both the [HTML401STRICT] and [XHTML11STRICT] strict DTDs. HTML authoring tools permit those additional element tags, and may ignore the additional tags that are dtbook-specific.

1.2. Evolution from HTML
Dtbook100 has 79 element types. It shares 47 element types with the HTML4.0 Strict DTD [HTML401STRICT] (and the XHTML Strict DTD [XHTML11STRICT]), omits 30 element types from them, and has 32 unique element types.
Endtag markup is sometimes optional in HTML. It is required for use with xhtml and dtbook. Any XML application [XML12] requires endtags, or their abbreviated form for empty elements, such as "<br />". The benefit of including endtags is that the tagged document has dependable structure that can be validated against the dtbook dtd.
Some tools available for browsing HTML may be used with dtbook material, at the expense of their discarding or ignoring some specific tagging and attributes that are not part of HTML 4.0. A CSS-based stylesheet [CSS1] or [CSS2] that identifies the presentation expectations for the HTML and non-HTML tags, or a filter to map those tags onto suitable HTML tags, can provide appropriate visual presentation.

2. Document Tagging Content
A Digital Talking Book document is an XML application. Therefore it must begin with the XML processing instruction, followed by the DOCTYPE.

2.1. XML Processing Instruction
The XML Processing Instruction identifies the version of XML, and the optional character set encoding:
  <?xml version="1.0" encoding="UTF-8" ?>
Some alternative encodings to "UTF-8" are "UTF-16", "ISO-10646-UCS-2", or "ISO-10646-UCS-4" that could be used for the various encodings and transformations of Unicode/ISO/IEC 10646.
All XML applications are expected to support Unicode. Other alternatives are also acceptable, including "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-9" for parts of ISO 8859. See [ISO8859]. Also the values "ISO-2022-JP", "Shift_JIS", or "EUC-JP" can be used for the various Japanese encoded forms of JIS X-0208-1997.

2.2. DOCTYPE Declaration
The document type declaration, the DOCTYPE, follows. It has several forms. The simpler form assumes that the proper version of the dtbook DTD is in the same directory as the dtbook file itself.
  <!DOCTYPE dtbook SYSTEM "dtbook100.dtd">
A more general form provides the PUBLIC URI from which the SYSTEM filename can be substituted, should that system copy be missing:
  <!DOCTYPE dtbook PUBLIC
    "http://www.loc.gov/nls/z3986/v100/dtbook100.dtd"
    "dtbook100.dtd">
That assumes the URI can be reached, which may not be true for portable dtbook players. The still more general form recommended for xml applications [XML12] is:
  <!DOCTYPE dtbook PUBLIC
    "-//NISO//DTD dtbook v1.0.0//EN"
    "dtbook100.dtd">
where the Formal Public Identifier (FPI) on the second line is converted to the URI where it may be publicly found:
  http://www.loc.gov/nls/z3986/v100/dtbook100.dtd
using the [OASIS-TR9401] Entity Management Catalog to resolve that indirect mapping from FPI to the dtd. That catalog is more generally useful to provide the mapping from any external entity names (such as modules) to URIs where they may be found.
Note that the reference above is to a particular version of the DTD, distinguished by the "V100" from subsequent versions.

2.3. Digital Talking Book File MIME Type
A Digital Talking Book document is tagged to the dtbook XML application. Its MIME media-type is "text/xml". The tagged book filename should have suffix ".xml". See [RFC2045].

3. Modular Extension to the DTD
The dtbook DTD has two parameter entities defined that provide means to allow an individual book to modularly extend the content models for its block and inline parameter entities:
  <!ENTITY % externalblock "">
  <!ENTITY % externalinline "">
These parameter entities appear in corresponding block and inline content models. With this "" content they have no effect on books tagged to the dtbook DTD. In a book that needs a modular extension, values are given by redefinition in the internal subset of that book. This extends the dtbook DTD without having to change it.
A book can augment the dtbook DTD by including other declarations or parameter entity references in the internal subset of declarations (in square brackets following the ExternalID and before the concluding ">") of the initial DOCTYPE declaration that identifies the dtbook DTD. Those additional markup declarations in the internal subset take precedence over any in the dtbook DTD. The effective DTD is thereby augmented by the parameter entity values and any other declarations of the book's internal subset. For example:
  <!DOCTYPE dtbook SYSTEM "dtbook.dtd" [
    <!ENTITY % dramaModule SYSTEM "drama.dtd">
    %dramaModule;
    <!ENTITY % externalblock "| drama">
    <!ENTITY % externalinline "| stagedir">
  ]>
The "%dramaModule;" invocation causes all declarations made within dramaModule to become the initial part of the dtbook DTD. Within the book, the empty entity declarations for both % externalblock and for % externalinline are replaced by these new definitions. Thus the block element drama can appear wherever block elements may occur in dtbook. Similarly any actual content needed for %externalinline; (" | stagedir" is shown above) can appear in that extension to wherever %inline; appears in the DTD.
More than one module may be needed and included in a book, for example both poem and drama can appear in the internal subset of the book.
For example, the internal subset of the book could contain:
  <!DOCTYPE dtbook SYSTEM "dtbook.dtd" [
    <!ENTITY % poemModule SYSTEM "poem.dtd">
    %poemModule;
    <!ENTITY % dramaModule SYSTEM "drama.dtd">
    %dramaModule;
    <!ENTITY % externalblock "| poem | stanza | verse | drama">
    <!ENTITY % externalinline "| stagedir">
  ]>
Such external modules need to replicate any parameter entity definitions that are used therein since their definitions are needed before they can be expanded in their references. They cannot depend on parameter entities in the SystemLiteral or PubidLiteral that provides this dtbook100.dtd.
Note that arbitrary external modules from other sources may not have all the needed attributes. XML allows augmentation of ATTLISTs in the internal subset. For each module, some accommodation to its use in dtbook may be required. Any parameter entities needed in the content from the internal subset must be declared in those modules. Parameter entities in the dtbook dtd are not available when the internal subset is recognized.
Also note that element name collisions may be possible, with names in those modules and associated content models overriding those in dtbook. For modules under control of dtbook design, such collisions can be avoided. A more general solution uses namespace prefixes to element and attribute names to clearly indicate the module source.
The fully marked-up document follows, including tags from the external modules in the internal subset.
Declarations in the internal subset or in external entity references (such as %dramaModule;) referenced therein take precedence over like-named ones from the external entity containing the base DTD (that is, dtbook100.dtd). Thus the declarations from the module containing the drama and poem tags are included along with the tags in the base DTD (that is, dtbook100.dtd) that are not duplicated or redefined in the drama module.
So if a <p> tag is defined in the drama module, its definition overrides that of the <p> tag in dtbook. There is an exception: an ATTLIST for elementname that adds attributes from the internal subset augments the ATTLIST attributes with different attribute names in the ATTLIST of the same elementname in the dtbook100.dtd.
Note that tools and players processing any extended markup that affects navigation structure will need to know of those modular extensions.
The form above for augmenting the dtbook dtd through the document's internal subset does not require the XML namespace mechanism, with its namespace-specific prefix on element and attribute names to disambiguate any potential name collisions. Use of namespaces is not precluded.

4. References
[CSS1] Cascading Style Sheets, Level 1
  http://www.w3.org/TR/REC-CSS1
[CSS2] Cascading Style Sheets, Level 2
  http://www.w3.org/TR/REC-CSS2
[DAISY202] The DAISY 2.02 Specification for the DAISY Digital Talking Book (DTB) format is the predecessor of this dtbook.
  http://www.daisy.org/dtbook/spec/2/final/d202/daisy_202.html
[DTBOOKDTD] The dtbook DTD v1.0.0 is available at
  http://www.loc.gov/nls/z3986/v100/dtbook100.dtd
  Note that some browsers may prevent downloading a file with suffix dtd.
[DTBOOKDOC] Expanded DTD documentation of this DTD is available as an HTML 4.0 document:
  http://www.loc.gov/nls/z3986/v100/dtbook100doc.htm
  Should revisions occur, a new directory named "vxxx" indicating the revision level will contain the revisions. Any prior specific version of the dtbook dtd and its documentation will persist.
[DTBOOK3] The last public beta version was dtbook3-07.dtd.
  http://www.loc.gov/nls/niso/dtbk3-07.dtd
  http://www.loc.gov/nls/niso/dtbk3-07doc.html
  Those and prior versions are available at:
  http://www.loc.gov/nls/z3986/background/
  The history of changes prior to this version, including those in internal drafts through dtbk3-10.dtd and before, is in:
  http://www.loc.gov/nls/z3986/background/dtbk3-dtd-changes.txt
  In that directory also are the old dtdbk3 dtds and their documentation. See its index.html for the list.
  http://www.loc.gov/nls/z3986/background/index.html
[SMIL 2.0] The Synchronized Multimedia Integration Language SMIL 2.0 W3C Recommendation 07 August 2001 is available at:
  http://www.w3.org/TR/2001/REC-smil20-20010807/smil20.html
[HTML401STRICT] "HTML 4.0 Strict DTD", 1999-12-24, Dave Raggett, Arnaud Le Hors, and Ian Jacobs. Dtbook100 was originally based on this HTML 4.0 Strict DTD, with design adaptation for dtbook100. For the description of HTML 4, see [HTML401].
  http://www.w3.org/TR/1999/REC-html401-19991224/strict.dtd
[HTML401] "HTML 4.01 Specification" W3C Recommendation 24 December 1999. Documentation of the element types that come from that DTD is available at:
  http://www.w3.org/TR/1999/REC-html401-19991224/
  Dtbook100 is now partially harmonized with [XHTML11STRICT] DTD, using its camelCase parameter entity names and comments and references included following those parameter entities in explanatory comments, and extended table content model.
[ISO10646] "Information Technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane", ISO/IEC 10646-1:1993. The current specification also takes into consideration the first five amendments to ISO/IEC 10646-1:1993.
[ISO8859] "Information Processing - 8-bit single-byte coded graphic character sets - Part 1: Latin alphabet No. 1", ISO 8859-1:1987. Other suffixes "-2 through -9" correspond to other character sets in the family.
[OASIS-TR9401] Entity Management, OASIS Technical Resolution 9401:1997 (Amendment 2 to TR 9401).
  Paul Grosso, 1997 September 10.
  http://www.oasis-open.org/specs/tr9401.html
[RFC1556] "Handling of Bi-directional Texts in MIME", H. Nussbacher, December 1993.
  http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1556.html
[RFC1766] "Tags for the Identification of Languages", H. Alvestrand, March 1995.
  http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1766.html
[RFC2045] "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", N. Freed and N. Borenstein, November 1996. The %ContentType; and %ContentTypes; media types and the %Charset; and %Charsets; character encodings are from RFC2045. Note that this RFC obsoletes RFC1521, RFC1522, and RFC1590.
  http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2045.html
[RFC2046] "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", N. Freed, November 1996.
  http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2046.html
[RFC2396] "Uniform Resource Identifiers (URI): Generic Syntax", T. Berners-Lee, R. Fielding, L. Masinter, August 1998. Note that this RFC revises and replaces the generic definitions in RFC 1738 and RFC 1808.
  http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2396.html
[XHTML11] "XHTML (tm) 1.0: The Extensible HyperText Markup Language", W3C Recommendation 26 January 2000. This reformulation of HTML4 in XML 1.0 includes case-sensitive names, lower-case for elements and their attributes (but not parameter entity names) and in some cases equivalent content models that do not require SGML inclusions and exclusion exceptions (as occurred in the HTML4.0 strict dtd [HTML401STRICT]). It is available at:
  http://www.w3.org/TR/xhtml1/
[XHTML11STRICT] Expanded documentation of the element types that come from that XHTML11 strict.dtd and its other DTDs is available within the zip file:
  http://www.w3.org/TR/xhtml1/DTD/xhtml1/xhtml1.zip
[XML12] This dtbook100.dtd is an application of the Extensible Markup Language XML 1.0 (Second Edition) W3C Recommendation 6 October 2000.
  It is available at: http://www.w3.org/TR/2000/REC-xml-20001006.
-->
<!--hb: change record:
  1998-10-08 original by Harvey Bingham
  1999-01-23 revision 3-01
  1999-06-25 revision 3-02
  1999-07-20 revision 3-03
  1999-09-16 revision 3-04
  1999-09-24 revision 3-05
  1999-11-05 revision 3-06
  2001-01-31 revision 3-07
  2001-03-08 revision 3-08
  2001-03-30 revision 3-09 basis for dtbook100.dtd
  2001-09-07 revision 3-10 version 1.0.0 first draft
  2001-09-21 revision 3-11 version 1.0.0 second draft
  2001-09-26 revision 3-12 V1.0.0 third draft
  The record of evolution for this dtd may be found in the archives. See [DTBOOK3].
-->
<!-- Comment Classification Conventions
  Some comments start with a pattern followed by a colon:
    Use:  element type and its use.
    Ause: attribute use for associated element type.
    hb:   Explanation by Harvey Bingham.
  Other comments without such a pattern are dividing lines, details about the DTD structure, or about dtbook objects.
-->
<!--===================== Character Entities =============================-->
<!-- Character entities for interoperability. A few characters may have special markup meaning, so are expressed as character entities in text, preceded by "&" and followed by ";". The notation below, #xHHHH (or #xHH) where H is a hexadecimal-number (formed from 0-9 and A-F), indicates the character code position, for Unicode/ISO-10646. Note that the "<" and "&" characters in the declarations of "lt" and "amp" are doubly escaped to meet the requirement that entity replacement be well-formed. -->
<!ENTITY lt "&#38;#x003C;"> <!-- "&#38;#60;" < Less than -->
<!ENTITY gt "&#x003E;"> <!-- "&#62;" > Greater than -->
<!ENTITY amp "&#38;#x0026;"> <!-- "&#38;#38;" & Ampersand -->
<!ENTITY apos "&#x0027;"> <!-- "&#39;" ' Neutral Quote, Apostrophe -->
<!ENTITY quot "&#x0022;"> <!-- "&#34;" " Quotation mark -->
<!-- Three larger character sets included in HTML 4.0 are omitted here: HTMLlat1.ent, HTMLsymbol.ent, and HTMLspecial.ent. Unicode is available to XML applications, so these characters are available.
The initial processing instruction that identifies dtbook as an XML application should use a more inclusive encoding, as described at the start of section 2. -->
<!--================ Imported Parameter Entity Names =====================-->
<!--Many parameter entities come from the [XHTML11STRICT] strict DTD.-->
<!ENTITY % Character "CDATA">
  <!-- a single character from [ISO10646] -->
<!ENTITY % Charset "CDATA">
  <!-- a character encoding, as per [RFC2045] -->
<!ENTITY % Charsets "CDATA">
  <!-- a space separated list of character encodings, as per [RFC2045] -->
<!ENTITY % ContentType "CDATA">
  <!-- media type, as per [RFC2046] -->
<!ENTITY % ContentTypes "CDATA">
  <!-- comma-separated list of media types, as per [RFC2046] -->
<!ENTITY % Datetime "CDATA">
  <!-- date and time information. ISO date format -->
<!ENTITY % LanguageCode "NMTOKEN">
  <!-- a language code, as per [RFC1766] -->
<!ENTITY % Number "CDATA">
  <!-- one or more digits -->
<!ENTITY % LinkTypes "CDATA">
  <!-- space-separated list of link types -->
<!ENTITY % MediaDesc "CDATA">
  <!-- single or comma-separated list of media descriptors; possible values include BRAILLE, PRINT, PROJECTION, SPEECH, ALL, or the default SCREEN -->
<!ENTITY % StyleSheet "CDATA">
  <!-- style sheet data -->
<!ENTITY % Text "CDATA">
  <!-- used for titles etc.
  -->
<!ENTITY % URI "CDATA">
  <!-- a Uniform Resource Identifier, see [RFC2396] -->
<!--================== dtbook External Module Inclusion =================-->
<!ENTITY % externalblock "">
  <!-- placeholder for block element expansion from external modules, if changed, string in external subset begins " | blockelementname" -->
<!ENTITY % externalinline "">
  <!-- placeholder for inline element expansion from external modules, if changed, string in external subset begins " | inlineelementname" -->
<!--================== dtbook100 Content Models ============================-->
<!ENTITY % list "list">
  <!-- list container for ordered or unordered lists -->
<!ENTITY % dtbookblock "author | notice | prodnote | sidebar | note | annotation %externalblock;">
  <!-- block elements unique to dtbook -->
<!ENTITY % inlineinblock "a | cite | img | caption | samp | kbd | pagenum">
  <!-- inlines that may alternatively be in block elements -->
<!ENTITY % block "p | %list; | dl | div | blockquote | hr | imggroup | table | address | line | %dtbookblock;">
  <!-- block elements from html 4.0 augmented by dtbook-unique elements -->
<!--================ Character mnemonic entities =========================-->
<!-- Omitted as XML uses Unicode, so doesn't need them. May need character entities if the encoding is more restrictive.
-->
<!--=================== Generic Attributes ===============================-->
<!ENTITY % coreattrs
  "id     ID            #IMPLIED
   class  CDATA         #IMPLIED
   style  %StyleSheet;  #IMPLIED
   title  %Text;        #IMPLIED">
<!-- coreattrs are attributes permissible for most elements
  id     document-wide unique id
  class  space separated list of classes used for rendering
  style  associated style info
  title  advisory title/amplification
-->
<!ENTITY % i18n
  "lang      %LanguageCode;  #IMPLIED
   xml:lang  %LanguageCode;  #IMPLIED
   dir       (ltr|rtl)       #IMPLIED">
<!-- i18n internationalization attributes
  lang      language code (backwards compatible)
  xml:lang  language code (as per XML 1.0 spec)
  dir       direction for weak/neutral text
            ltr=left to right
            rtl=right to left
  xhtml recommendation: use both lang and xml:lang, with same value, such as "en-US", on the major containing block, to provide source for the #IMPLIED values of its descendent elements. See [RFC1556]. Should the values differ, the xml:lang takes precedence. See [RFC1556] for handling bi-directional text in MIME.
-->
<!ENTITY % showin "showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED">
<!--showin attribute applies for text elements to permit identification of the kinds of display appropriate for the element, so presentation choice by the reader among alternative readings can be provided, when appropriate. Values of showin are coded with three letters in order: "b"=Braille, "l"=Largeprint, and "p"=Print; or "x"=inappropriate:
  Value  Braille  Largeprint  Print  Interpretation
  "xxx"                              hide
  "xxp"                       p      print only
  "xlx"           l                  largeprint only
  "xlp"           l           p      largeprint and print
  "bxx"  b                           braille only
  "bxp"  b                    p      braille and print
  "blx"  b        l                  braille and largeprint
  "blp"  b        l           p      braille, largeprint, and print
There is no default value; this attribute value is implied from the most immediate ancestor that specifies a value. If only one showin value is needed it should be included with <book>. Different showin content meeting different needs are possible.
Both largeprint and print are appropriate for screen rendering as well as printing. Different corresponding styles may be appropriate. It is possible to include equivalent content from any major structure below <book> to provide the different content suitable for different media. These would be independent, sharing no direct content, possibly having common references to images, with different accompanying text descriptions.
-->
<!ENTITY % attrs
  "%coreattrs;
   %i18n;
   smilref  CDATA  #IMPLIED
   %showin;">
<!-- %attrs; is part of most attribute lists. It includes
  %coreattrs; from which come the four #IMPLIED attributes: id, class, style, and title.
  %i18n; from which come the implied attributes: lang, xml:lang, and dir
  smilref is a pointer to a [SMIL 2.0] file, normally to the time container (SMIL <par> or <seq>) containing the media object that references this element. However, in a text-only DTB consisting of a sequence of text media objects, <smilref> points to the media object that references this element. <smilref> allows resumption of SMIL presentation at the proper location after navigation via dtbook file. All <smilref> values are expected to be added to an augmented version of the <dtbook> during production.
  %showin; which value (three letters, from 'x'=ignore, 'b'=braille, 'l'=largeprint, or 'p'=print) indicates positionally the explicit applicability of this element tag to the various media.
-->
<!ENTITY % attrsrqd
  "id       ID            #REQUIRED
   class    CDATA         #IMPLIED
   style    %StyleSheet;  #IMPLIED
   title    %Text;        #IMPLIED
   smilref  CDATA         #IMPLIED
   %i18n;
   %showin; ">
<!-- %attrsrqd; includes required id and implied class, style, and title.
  %i18n; from which come the implied attributes: lang, xml:lang, and dir
  smilref is a pointer to a [SMIL 2.0] file, normally to the time container (SMIL <par> or <seq>) containing the media object that references this element.
  However, in a text-only DTB consisting of a sequence of text media objects, <smilref> points to the media object that references this element. <smilref> allows resumption of SMIL presentation at the proper location after navigation via dtbook file. All <smilref> values are expected to be added to an augmented version of the <dtbook> during production.
  %showin; which value (three letters, from 'x'=ignore, 'b'=braille, 'l'=largeprint, or 'p'=print) indicates positionally the explicit applicability of this element tag to the various media.
-->
<!--================ Document Structure ==================================-->
<!ENTITY % dtbook.content "head, book">
<!-- dtbook.content designates that each dtbook has a <head> of metainformation preceding the <book> content. -->
<!--Use: dtbook is the root element in the Digital Talking Book DTD. <dtbook> contains metadata in <head> and the contents itself in <book>. -->
<!ELEMENT dtbook (%dtbook.content;)>
<!--Ause: dtbook The required version attribute contains the specific version of the dtd, so that the dtd version for any dtbook can be recognized. The internationalization (%i18n;) attributes characterize the <book>. -->
<!ATTLIST dtbook
  version  CDATA  #FIXED '1.0.0'
  %i18n; >
<!--==================== Book Content ====================================-->
<!--Use: book surrounds the actual content of the document, which is divided into <frontmatter>, <bodymatter>, and <rearmatter>. <head>, which contains metadata, precedes <book>. -->
<!ELEMENT book (frontmatter?, bodymatter?, rearmatter?)>
<!ATTLIST book %attrs; >
<!--======================= Book Major Structures ========================-->
<!--Use: frontmatter contains preliminary material enclosed in appropriate <level> or <level1>. Content may include <doctitle>, <docauthor>, copyright notice, foreword, acknowledgments, table of contents, etc. <frontmatter> serves as a guide to the content and nature of a <book>.
--> <!ELEMENT frontmatter (doctitle | docauthor | level | level1 | %block;)+> <!ATTLIST frontmatter %attrs; > <!--Use: bodymatter consists of the text proper of a book, as opposed to preliminary material <frontmatter> or supplementary information <rearmatter>. --> <!ELEMENT bodymatter (level | level1 | %block;)+> <!ATTLIST bodymatter %attrs; > <!--Use: rearmatter contains supplementary material such as appendices, glossaries, bibliographies, and indices. It follows the <bodymatter> of the book. --> <!ELEMENT rearmatter (level | level1 | %block;)+> <!ATTLIST rearmatter %attrs; > <!--================== dtbook Recursive Structure level ================--> <!--Use: level is an alternative tag for marking the major structures in a book. It may be used recursively, i.e., repeated indefinitely with each successive occurrence nesting within the previous. It may also be included in a subsequent higher level. Subordinate levels have greater depth. Contrast with the explicit <level1>...<level6> elements, which may not be intermixed with <level>. --> <!ELEMENT level (levelhd | %block; | %inlineinblock; | level)*> <!--Ause: level The class attribute identifies the actual name (e.g., part, chapter, section, subsection) of the structure it marks. The depth attribute indicates the nesting depth, starting at 1. --> <!ATTLIST level %attrs; depth CDATA #IMPLIED > <!--============ dtbook Hierarchic Structure level1 ... level6 ==========--> <!--Use: level1 is the highest level container of major divisions of a book. Used in <frontmatter>, <bodymatter>, and <rearmatter> to mark the largest divisions of the book (usually parts or chapters), inside which level2 subdivisions (often sections) may nest. The class attribute identifies the actual name (e.g., part, chapter) of the structure it marks. Contrast with <level>. --> <!ELEMENT level1 (h1 | level2 | %block; | %inlineinblock;)*> <!ATTLIST level1 %attrs; > <!--Use: level2 contains subdivisions that nest within <level1> divisions. 
The class attribute identifies the actual name (e.g., subpart, chapter, subsection) of the structure it marks. --> <!ELEMENT level2 (h2 | level3 | %block; | %inlineinblock;)*> <!ATTLIST level2 %attrs; > <!--Use: level3 contains sub-subdivisions that nest within <level2> subdivisions (e.g., sub-subsections within subsections). The class attribute identifies the actual name (e.g., section, subpart, subsubsection) of the subordinate structure it marks. --> <!ELEMENT level3 (h3 | level4 | %block; | %inlineinblock;)*> <!ATTLIST level3 %attrs; > <!--Use: level4 contains further subdivisions that nest within <level3> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks. --> <!ELEMENT level4 (h4 | level5 | %block; | %inlineinblock;)*> <!ATTLIST level4 %attrs; > <!--Use: level5 contains further subdivisions that nest within <level4> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks. --> <!ELEMENT level5 (h5 | level6 | %block; | %inlineinblock;)*> <!ATTLIST level5 %attrs; > <!--Use: level6 contains further subdivisions that nest within <level5> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks. --> <!ELEMENT level6 (h6 | %block; | %inlineinblock;)*> <!ATTLIST level6 %attrs; > <!--=================== Text Markup ======================================--> <!ENTITY % phrase "em | strong | dfn | code | samp | kbd | cite | abbr | acronym"> <!-- inline text elements --> <!ENTITY % special "a | img | br | q | sub | sup | span | bdo | linenum"> <!-- special inline text elements. --> <!ENTITY % specialnoa "img | br | q | sub | sup | span | bdo | linenum"> <!-- specialnoa inline text elements for anchor <a>. 
--> <!--========================= Inline Entities ============================--> <!ENTITY % dtbookinline "sent | w | pagenum | prodnote | noteref %externalinline;"> <!-- dtbook added inline text elements --> <!ENTITY % inline "#PCDATA | %phrase; | %special; | %dtbookinline;"> <!-- inline text elements --> <!ENTITY % inlinenoa "#PCDATA | %phrase; | %specialnoa; %externalinline;"> <!-- inlinenoa excludes nested <a>. --> <!ENTITY % inlines "#PCDATA | %phrase; | %special; | pagenum | w | prodnote | noteref %externalinline;"> <!-- inlines excludes direct nesting of sentences <sent>. --> <!ENTITY % inlinew "#PCDATA | %phrase; | %special; %externalinline;"> <!-- inlinew for word <w> excludes any of the %dtbookinline;. --> <!ENTITY % inlinenopagenum "#PCDATA | %phrase; | %special; | sent | w | noteref %externalinline;"> <!-- inlinenopagenum excludes direct <pagenum> in <table> <th> and <td>. --> <!ENTITY % inlinenoprodnote "#PCDATA | %phrase; | %special; | sent | w | pagenum | noteref %externalinline;"> <!-- inlinenoprodnote excludes direct <prodnote>, as they shouldn't nest.--> <!ENTITY % inlinenoanoprodnote "#PCDATA | %phrase; | %specialnoa; | sent | w | pagenum | noteref %externalinline;"> <!-- inlinenoanoprodnote excludes <a> and <prodnote> directly. --> <!--==================== Flow (Block or Inline) Entities =================--> <!ENTITY % flow "%inlinenoprodnote; | %block;"> <!-- flow elements add inlinenoprodnote to block --> <!ENTITY % flownopagenum "%inlinenopagenum; | %block;"> <!-- flownopagenum ideally excludes <pagenum>, though <pagenum> can still get in through %block; --> <!--============= Br, Linenum, Address, and Div Content Models ===========--> <!--Use: br marks a forced line break. --> <!ELEMENT br EMPTY> <!ATTLIST br %coreattrs; > <!--Use: linenum contains a line number, for example in legal text. --> <!ELEMENT linenum (#PCDATA)> <!ATTLIST linenum %attrs; > <!--Use: address contains a location at which a person or agency may be contacted. 
By use of <line> to contain content of the individual lines, the class attribute can be used to identify the content of that <line>. For example, class values might include: name, address, region: (state, province, etc.), country, location code: (zipcode, provincial code, etc.), phone, fax, email, etc. --> <!ELEMENT address (%inline; | line)*> <!ATTLIST address %attrs; > <!--Use: div is a generic container for subdivisions of a book. The <level1> ... <level6> hierarchy, or the <level> tag used recursively, should mark the major hierarchical structures of a book, while <div> is used in less formal circumstances or when for production purposes it is desired that a structure should be treated differently. The class attribute value identifies the actual name (e.g., part, chapter, letter) of the structure it marks. Compare with <span> which is used in inline settings. --> <!ELEMENT div (%block; | %inlineinblock;)*> <!--Ause: div The level attribute may extend or augment explicit levels to indicate nesting level; its values are the positive integers, with "1" corresponding to <h1>. --> <!ATTLIST div %attrs; level CDATA #IMPLIED > <!--====== dtbook Block Elements Author, Notice, Prodnote, Sidebar ======--> <!--Use: author identifies the writer of a work other than this one. Contrast with <docauthor> which identifies the author of this work. <author> typically occurs within <blockquote> or <cite>. --> <!ELEMENT author (%inline;)*> <!ATTLIST author %attrs; > <!--Use: notice contains a warning, caution, or other type of admonition normally found in the margin of a book. In contrast with <sidebar> a <notice> must be presented at a specific location within the text. Its presentation is not optional. --> <!ELEMENT notice (%inline;)*> <!ATTLIST notice %attrs; > <!--Use: prodnote contains language added to the alternative-format version by the producer; commonly used to: 1) provide descriptions of one or more visual elements such as charts, graphs, etc. 
2) supply operating instructions 3) describe differences between the print book and the audio version. --> <!ELEMENT prodnote (%flow;)*> <!--Ause: prodnote The imgref identifies the space-separated id value(s) on pertinent images <img>. Rendering of <prodnote> uses the render attribute to indicate the content is required or optional for the user. If optional, some user preference may allow skipping over the content. But <prodnote render="required"> is essential content for the user. An audible cue could announce the presence of the <prodnote>. --> <!ATTLIST prodnote %attrs; imgref IDREFS #IMPLIED render (required | optional) #IMPLIED > <!--Use: sidebar contains information supplementary to the main text and/or narrative flow and is often boxed and printed apart from the main text block on a page. It may have a heading <hd>. --> <!ELEMENT sidebar (%flow; | hd)*> <!ATTLIST sidebar %attrs; > <!--Use: note marks a footnote, endnote, etc. Any local reference to <note id="yyy"> is by <noteref idref="#yyy">. --> <!ELEMENT note (%block; | %inlineinblock;)+> <!ATTLIST note %attrsrqd; > <!--Use: annotation is a comment on or explanation of a portion of a printed book. It differs from <note> in that an <annotation> is usually set in the margin or on a facing page, often with no explicit reference to it inserted in the text. Any local reference to <annotation id="xxx"> is by <annoref idref="#xxx">. --> <!ELEMENT annotation (%block; | %inlineinblock;)+> <!ATTLIST annotation %attrsrqd; > <!--Use: line marks a single logical line of text. Often used in conjunction with <linenum> in documents with numbered lines. --> <!ELEMENT line (%inline;)*> <!ATTLIST line %attrs; > <!--================== The Anchor Element ================================--> <!--Use: a contains an anchor, which is used to reference another location, within the same or another <dtbook>. 
--> <!ELEMENT a (%inlinenoa;)*> <!--Ause: a The href attribute value may have three forms: 1) "#idref", a reference to the element having the referenced id value in this document; 2) "uri", a uniform resource identifier to a resource, typically a document, see [RFC2396], restricted to work with only a <dtbook> document; 3) "uri#xxx", a reference to the element with id="xxx" in the resource uri. Uses of the remaining attributes other than %attrs; are: "type" is advisory content MIME type of the target, see [RFC1556]; "hreflang" is language code of the href target, see [RFC1766]; "rel" is a list of forward link type(s), the relationship(s) expressed by the href value to the target, space-separated if multiple; "rev" is a list of reverse link types, the relationship(s) to this location from the href target, space-separated if multiple; "accesskey"=accessibility key character shortcut; "tabindex"=tabbing order. --> <!ATTLIST a %attrs; type %ContentType; #IMPLIED href %URI; #IMPLIED hreflang %LanguageCode; #IMPLIED rel %LinkTypes; #IMPLIED rev %LinkTypes; #IMPLIED accesskey %Character; #IMPLIED tabindex %Number; #IMPLIED > <!--========================= Inline Elements ============================--> <!--Use: em indicates emphasis. Usually <em> is rendered in italics. Compare with <strong>. --> <!ELEMENT em (%inline;)*> <!ATTLIST em %attrs; > <!--Use: strong marks stronger emphasis than <em>. Visually <strong> is usually rendered bold. --> <!ELEMENT strong (%inline;)*> <!ATTLIST strong %attrs; > <!--Use: dfn marks the first occurrence of a word or term that is defined or explained there or elsewhere in <book>. Often <dfn> is rendered in italics, sometimes in parentheses. --> <!ELEMENT dfn (%inline;)*> <!ATTLIST dfn %attrs; > <!--Use: kbd designates information that the reader is to input directly into a computer using the keyboard. --> <!ELEMENT kbd (%inline;)*> <!ATTLIST kbd %attrs; > <!--Use: code designates a fragment of computer code. 
--> <!ELEMENT code (%inline;)*> <!--Ause: code The attribute xml:space='preserve' preserves whitespace therein (except that an XML parser strips leading and trailing whitespace before passing the internal content including its original whitespace to the application.) The value xml:space='default' leaves the whitespace handling to the application. --> <!ATTLIST code %attrs; xml:space (default | preserve) 'preserve' > <!--Use: samp contains a sample of work created by the author for use as an example or template. For example, a sample business letter, resume, computer program output, or form. --> <!ELEMENT samp (%inline;)*> <!--Ause: samp The xml:space='preserve' preserves whitespace therein (except that an XML parser strips leading and trailing whitespace before passing the internal content including its original whitespace to the application.) The value xml:space='default' leaves the whitespace handling to the application. --> <!ATTLIST samp %attrs; xml:space (default | preserve) 'preserve' > <!--Use: cite marks a reference (or citation) to another document. <cite> may occur within an <a href="URL">...</a> should that other document be available in the same dtbook distribution. --> <!ELEMENT cite (%inline;)*> <!ATTLIST cite %attrs; > <!--Use: abbr designates an abbreviation, a shortened form of a word. For example: Mr., approx., lbs., rec'd. --> <!ELEMENT abbr (%inline;)*> <!--Ause: abbr The title attribute value may expand that abbreviation. --> <!ATTLIST abbr %attrs; > <!--Use: acronym marks a word formed from key letters (usually initials) of a group of words. For example: UNESCO, NATO, XML. --> <!ELEMENT acronym (%inline;)*> <!--Ause: acronym The title attribute value may expand that acronym. The pronounce attribute value "yes" indicates that the acronym is pronounceable as a word (for example, NATO); "no" that the acronym is best presented as a sequence of letters (for example "US"). 
--> <!ATTLIST acronym %attrs; pronounce (yes | no) #IMPLIED > <!--Use: sub indicates a subscript character (printed below a character's normal baseline). Can be used recursively and/or intermixed with <sup>. --> <!ELEMENT sub (%inline;)*> <!ATTLIST sub %attrs; > <!--Use: sup marks a superscript character (printed above a character's normal baseline). Can be used recursively and/or intermixed with <sub>. --> <!ELEMENT sup (%inline;)*> <!ATTLIST sup %attrs; > <!--Use: span is a generic container for use in inline settings when no specific tag exists for a given situation. The class attribute may describe the nature of the text it marks (e.g., a typographical error). May be used to mark a class of items to which styles are to be applied. Compare with <div> which is used in block settings. #PCDATA following an inline can be given an id for resumed playing by putting it in a <span>. --> <!ELEMENT span (%inline;)*> <!ATTLIST span %attrs; > <!--Use: bdo is used in special cases where the automatic actions of the bi-directional algorithm would result in incorrect display. --> <!ELEMENT bdo (%inline;)*> <!--Ause: bdo The lang attribute indicates the language of the content. The dir attribute indicates the writing direction: "ltr" is left-to-right, "rtl" is right-to-left. --> <!ATTLIST bdo %coreattrs; lang %LanguageCode; #IMPLIED dir (ltr | rtl) #REQUIRED > <!--==================== dtbook Inline Sentence and Word ================--> <!--Use: sent marks a sentence. --> <!ELEMENT sent (%inlines;)*> <!ATTLIST sent %attrs; > <!--Use: w marks a word. --> <!ELEMENT w (%inlinew;)*> <!ATTLIST w %attrs; > <!--==== Inline Page Number, Footnote and Annotation Reference =========--> <!--Use: pagenum contains one page number as it appears from the print document, usually inserted at the point within the file immediately preceding the first item of content on a new page. 
--> <!ELEMENT pagenum (#PCDATA)> <!--Ause: pagenum The "page" attribute allows three kinds of page numbering schemes to be identified: "normal" Arabic numbering in the body of the book is the default, "front" pages (from the <frontmatter>, often roman numbering), "special" pagination schemes such as letter prefix hyphen Arabic number in appendices. Each pagenum needs a unique id value, which by convention is derived from the actual page number. For multi-page continuous content, such as large <img> or <table>, put the sequence of <pagenum> on the page where that content starts. --> <!ATTLIST pagenum %attrsrqd; page (front | normal | special) "normal" > <!--Use: noteref marks one or more characters that reference a footnote or endnote <note>. Contrast with <annoref>. Either may be independently skippable. --> <!ELEMENT noteref (#PCDATA)> <!--Ause: noteref idref relates to the note, for example: <noteref idref="yyy"> refers to <note id="yyy">. The type attribute provides advisory content MIME type of the target, see [RFC1556]. --> <!ATTLIST noteref %attrs; idref CDATA #REQUIRED type %ContentType; #IMPLIED > <!--Use: annoref marks a text segment that references an <annotation>. Each <annoref> is usually a word, phrase, or whole line that is part of the surrounding text (identified in the original print book by bolding, italics, etc.). It should not normally be allowed to be turned off in a DTB application. --> <!ELEMENT annoref (#PCDATA)> <!--Ause: annoref The idref attribute refers to the target id of an <annotation>. The type attribute provides advisory content MIME type of the targeted id, see [RFC1556]. --> <!ATTLIST annoref %attrs; idref CDATA #REQUIRED type %ContentType; #IMPLIED > <!--===================== Inline Quotes ==================================--> <!--Use: q contains a short, inline quotation. Compare with <blockquote> which marks a longer quotation set off from the surrounding text. 
--> <!ELEMENT q (%inline;)*> <!--Ause: q The cite attribute may provide a URI reference. --> <!ATTLIST q %attrs; cite %URI; #IMPLIED > <!--============================ Images ==================================--> <!-- Image <img> comes from HTML. An <img> may be grouped using <imggroup>, with <caption>, and special usage instructions with <prodnote>. The <imggroup> element may contain one or more <img> and any associated <caption> and <prodnote>. Multiple <img> may share a single caption, or multiple <caption> may apply if several captions refer to a single <img>. Multiple <prodnote> may apply if different versions are needed for different media. --> <!ENTITY % Length "CDATA"> <!-- measured in pixels. --> <!ENTITY % MultiLength "CDATA"> <!-- measured in integer pixels "n", percent "n%" of display width, or "0*" indicating minimum appropriate width. Multiple Lengths are separated by white-space. --> <!ENTITY % Pixels "CDATA"> <!-- 0 for no <table> border, positive integer for <table> border width in pixels. --> <!--Use: img marks a visual image. An <img> will generally contain a longdesc, a pointer to the related <prodnote>. The referencing is typically of the form <caption imgref="yyy">The Caption</caption> for the printed caption of the <img id="yyy">. --> <!ELEMENT img EMPTY> <!--Ause: img The "src" attribute specifies the location of the image file. The required "alt" attribute supplies a short description of the <img>. The attributes height and width provide visual sizing information, measured in pixels. --> <!ATTLIST img %attrs; src %URI; #REQUIRED alt %Text; #REQUIRED longdesc %URI; #IMPLIED height %Length; #IMPLIED width %Length; #IMPLIED > <!--Use: imggroup provides a container for one or more <img> and associated <caption> and <prodnote>. <prodnote> may contain a description of the image. 
The content model allows: 1) multiple <img> if they share a caption, with the ids of each <img> in the <caption imgref="id1 id2 ...">, 2) multiple <caption> if several captions refer to a single <img id="xxx"> where each caption has the same <caption imgref="xxx">, 3) multiple <prodnote> if different versions are needed for different media (e.g., large print, braille, or print). --> <!ELEMENT imggroup (prodnote | img | caption)+> <!ATTLIST imggroup %attrs; > <!--=================== Horizontal Rule ==================================--> <!--Use: hr is an empty element indicating a horizontal rule. May be used to indicate a break in the text where only blank lines, a row of asterisks, a horizontal line, etc. are used in the print book. --> <!ELEMENT hr EMPTY> <!ATTLIST hr %coreattrs; > <!--======================= Paragraphs ===================================--> <!--Use: p contains a paragraph, which may contain subsidiary <list> or <dl>. --> <!ELEMENT p (%inline; | %list; | dl)*> <!ATTLIST p %attrs; > <!--================ Doctitle, Docauthor and Headings ===================--> <!--Use: doctitle marks the title of the book within <frontmatter>. By convention it should appear only once, usually first. Within <head> is <title> whose contents are generally the same. --> <!ELEMENT doctitle (%inline;)*> <!ATTLIST doctitle %attrs; > <!--Use: docauthor marks each author or editor of this work. Compare with <author>, used to mark the author of another work, within <blockquote> or <cite>. --> <!ELEMENT docauthor (%inline;)*> <!ATTLIST docauthor %attrs; > <!--Use: levelhd contains the text of a heading within <level>. Corresponds to <h1> through <h6> used in <level1> through <level6>. --> <!--Ause: levelhd The depth value is a positive integer, corresponding to the <h1>...<h6> levelN, though not limited to just six levels. Any depth value, "n", should match that on the enclosing <level depth="n">. 
--> <!ELEMENT levelhd (%inline;)*> <!ATTLIST levelhd %attrs; depth CDATA #IMPLIED > <!--Use: h1 contains the text of the heading for a <level1> structure. --> <!ELEMENT h1 (%inline;)*> <!ATTLIST h1 %attrs; > <!--Use: h2 contains the text of the heading for a <level2> structure. --> <!ELEMENT h2 (%inline;)*> <!ATTLIST h2 %attrs; > <!--Use: h3 contains the text of the heading for a <level3> structure. --> <!ELEMENT h3 (%inline;)*> <!ATTLIST h3 %attrs; > <!--Use: h4 contains the text of the heading for a <level4> structure. --> <!ELEMENT h4 (%inline;)*> <!ATTLIST h4 %attrs; > <!--Use: h5 contains the text of the heading for a <level5> structure. --> <!ELEMENT h5 (%inline;)*> <!ATTLIST h5 %attrs; > <!--Use: h6 contains the text of the heading for a <level6> structure. --> <!ELEMENT h6 (%inline;)*> <!ATTLIST h6 %attrs; > <!--Use: hd marks the text of a heading in a <list> or <sidebar>. --> <!ELEMENT hd (%inline;)*> <!ATTLIST hd %attrs; > <!--=================== Preformatted Text ================================--> <!-- HTML or XHTML preformatted text is omitted, as inappropriate for narrated material. --> <!--=================== Block-like Quotes ================================--> <!--Use: blockquote indicates a block of quoted content that is set off from the surrounding text by paragraph breaks. Compare with <q> which marks short, inline quotations. --> <!ELEMENT blockquote (%block;)*> <!--Ause: blockquote The cite attribute permits inclusion of the URI from which the <blockquote> came. --> <!ATTLIST blockquote %attrs; cite %URI; #IMPLIED > <!--================= Definition List, and Other Lists ====================--> <!--Use: dl contains a definition list, usually consisting of pairs of terms <dt> and definitions <dd>. Any definition can contain another definition list. --> <!ELEMENT dl (dt | dd | pagenum)+> <!ATTLIST dl %attrs; > <!--Use: dt marks a term in a definition list. 
--> <!ELEMENT dt (%inline;)*> <!ATTLIST dt %attrs; > <!--Use: dd marks a definition of a term within a definition list. --> <!ELEMENT dd (%flow;)*> <!ATTLIST dd %attrs; > <!--Use: list contains some form of list, ordered or unordered. The list may have intermixed heading <hd> (generally only one, possibly with <prodnote>) and an intermixture of list items <li> and <pagenum>. If bullets and outline enumerations are part of the print content, they are expected to prefix those list items in content, rather than be implicitly generated. Note: XHTML has explicit list element types: ol for ordered, and ul for unordered. --> <!ELEMENT list (hd | prodnote | li | pagenum)+> <!--Ause: list The "type" attribute indicates whether the list items <li> are ordered "ol" or unordered "ul". The depth indicates nesting depth, starting at 1. The enum value indicates: 1=integer, a=lowercase, U=uppercase, i=lowercase Roman, or X=uppercase Roman. The bullet value can come from Unicode, using the character reference form &#xdddd; --> <!ATTLIST list %attrs; type (ol | ul) #IMPLIED depth CDATA #IMPLIED enum (1 | a | U | i | X) #IMPLIED bullet CDATA #IMPLIED > <!--Use: li marks each list item in a <list>. <li> content may be either inline or block and may include other nested lists. Alternatively it may contain a sequence of list item components, <lic>, that identify regularly occurring content, such as the heading and page number of each entry in a table of contents. --> <!ELEMENT li (%flow; | lic)*> <!ATTLIST li %attrs; > <!--Use: lic ("list item component") allows ordered substructure within a list item <li>. Used when a list item is made up of two or more components, as in a table of contents entry. The same number of <lic> should occur in each <li>. If not, correspondence of <lic> in different <li> is in order of occurrence for the current writing direction of the <li>. 
--> <!ELEMENT lic (%inline;)*> <!--lic class attribute may be used to identify the particular component of a list item <li>. For example, in a table of contents class values might include "section", and "pagenumber". --> <!ATTLIST lic %attrs; > <!--======================= Tables =======================================--> <!--hb: the XHTML <table> model is used, including the presentational attributes that have little meaning in Digital Talking Books, but may be useful for concurrent display in different media. Note: The XHTML <table> model has been enhanced from HTML to allow a <table> of just rows <tr>. --> <!ENTITY % Scope "(row | col | rowgroup | colgroup)"> <!-- Scope specifies a set of data cells for which the <th> provides header information. --> <!ENTITY % TFrame "(void | above | below | hsides | lhs | rhs | vsides | box | border)"> <!-- TFrame identifies the sides that are visually framed. --> <!ENTITY % TRules "(none | groups | rows | cols | all)"> <!-- %TRules identifies where visual rulings appear. --> <!ENTITY % cellhalign "align (left|center|right|justify|char) #IMPLIED char %Character; #IMPLIED charoff %Length; #IMPLIED "> <!-- % cellhalign align sets horizontal alignment of content in a table cell. char indicates a character expected in each table cell of a column that text should align on, charoff sets the alignment offset of the first character to align on, as specified with char. cellalign value inheritance if unspecified is (high to low) <th>|<td> <col> <tr> <thead>|<tbody>|<tfoot> --> <!ENTITY % cellvalign "valign (top|middle|bottom|baseline) #IMPLIED"> <!-- % cellvalign valign sets vertical alignment of content in a table cell. valign value inheritance if unspecified is (high to low) <th>|<td> <col> <colgroup> <thead>|<tbody>|<tfoot> --> <!--Use: table contains cells of tabular data arranged in rows and columns. A <table> may have a <caption>. It may have descriptions of the columns in <col>s or groupings of several <col> in <colgroup>. 
A simple <table> may be made up of just rows <tr>. A long table crossing several pages of the print book should have separate <pagenum> values for each of the pages containing that <table> indicated on the page where it starts. Note the logical order of optional <thead>, optional <tfoot>, then one or more of either <tbody> or just rows <tr>. This order accommodates simple or large, complex tables. The <thead> and <tfoot> information usually helps identify content of the <tbody> rows. For a multiple-page print <table> the <thead> and <tfoot> are repeated on each page, but not redundantly tagged. --> <!ELEMENT table (caption?, (col* | colgroup*), thead?, tfoot?, (tbody+| tr+))> <!--Ause: table The summary attribute value provides a textual summary. The attributes: width, border, frame, rules, cellspacing, and cellpadding provide visual presentation guidance. See their explanation in the comment following those parameter entity declarations. --> <!ATTLIST table %attrs; summary %Text; #IMPLIED width %Length; #IMPLIED border %Pixels; #IMPLIED frame %TFrame; #IMPLIED rules %TRules; #IMPLIED cellspacing %Length; #IMPLIED cellpadding %Length; #IMPLIED > <!--Use: caption describes a <table> or <img>. If used with <table> it must follow immediately after the <table> start tag. If used with <img> or <imggroup> it is not so constrained. --> <!ELEMENT caption (%inline;)*> <!--Ause: caption The imgref attribute value (or space-separated id values) identifies the <img>s to which it applies. --> <!ATTLIST caption %attrs; imgref IDREFS #IMPLIED > <!--Use: thead marks header information in a <table>, consisting of one or more rows <tr> of <th> cells. On multiple-page printed tables, <thead> rows are repeated at the top of the <table> and on top of its continuation on other pages. --> <!ELEMENT thead (tr)+> <!ATTLIST thead %attrs; %cellhalign; %cellvalign; > <!--Use: tfoot marks footer information in a <table>, consisting of one or more rows <tr>, usually of <th> cells. 
On multiple-page printed tables, <tfoot> rows are repeated at the bottom of the first page of the <table> and its continuation on other pages. --> <!ELEMENT tfoot (tr)+> <!ATTLIST tfoot %attrs; %cellhalign; %cellvalign; > <!--Use: tbody marks a group of rows in the main body of a <table>. If the <table> is divided into several sections, each consisting of a number of rows, each section would be separately tagged with <tbody>. The same <thead> and <tfoot> apply to every <tbody> section. --> <!ELEMENT tbody (tr)+> <!ATTLIST tbody %attrs; %cellhalign; %cellvalign; > <!--Use: colgroup groups adjacent columns <col> that are semantically related. The <col> in a <colgroup> may inherit attribute values from it, or an enclosing parent, such as <thead>, <tfoot>, or <tbody>, or within a <table>. --> <!ELEMENT colgroup (col)*> <!--Ause: colgroup The span attribute indicates how many columns are being spanned, unless overridden by a span attribute value on one of those <col>. The width may contain a space-separated list of pixel widths for each <col>, or percentages if values end in "%", or "0*" to indicate minimal acceptable width based on column content. --> <!ATTLIST colgroup %attrs; span NMTOKEN "1" width %MultiLength; #IMPLIED %cellhalign; %cellvalign; > <!--Use: col is a means to apply attribute values to a column of a <table>. --> <!ELEMENT col EMPTY> <!--Ause: col The span value indicates how many columns the <col> extends, in the writing direction of the <table>. The attribute values apply to <th> and <td> that start in the column, even if they extend into the next column(s), by span value more than 1, and that next <col> may have different attribute values. Attribute values from the enclosing row <tr> may override those from the <col> as source for implied values for <th> and <td> therein. --> <!ATTLIST col %attrs; span NMTOKEN "1" width %MultiLength; #IMPLIED %cellhalign; %cellvalign; > <!--Use: tr marks one row of a <table> containing <th> or <td> cells. 
The values for %cellhalign; and %cellvalign; provide default values for <th> and <td> in the row, overriding any from <col>. --> <!ELEMENT tr (th | td)+> <!ATTLIST tr %attrs; %cellhalign; %cellvalign; > <!--Use: th indicates a table cell containing header information. --> <!ELEMENT th (%flownopagenum;)*> <!--Ause: th The uses of attributes other than %attrs;, %cellvalign; and %cellhalign; are: abbr provides an abbreviated name for a <th> cell that can be used when referring to that <th> cell. Its default value is the cell content. axis usually applied only to <th> cells. It gives a short name for that header content. headers provides the id value(s), used with <td> cells, to reference one or more cells with <th id="xxx"> that contain headings that collectively describe or qualify the content of the cell, for example <td headers="id1 id2">. scope value identifies one of (row | col | rowgroup | colgroup) to which the header information applies. rowspan indicates the total number of rows below that the cell extends, by default 1. colspan indicates the total number of columns the cell extends, by default 1, in the writing direction of the table. --> <!ATTLIST th %attrs; abbr %Text; #IMPLIED axis CDATA #IMPLIED headers IDREFS #IMPLIED scope %Scope; #IMPLIED rowspan NMTOKEN "1" colspan NMTOKEN "1" %cellhalign; %cellvalign; > <!--Use: td indicates a table cell containing data. --> <!ELEMENT td (%flownopagenum;)*> <!--Ause: td The uses of attributes other than %attrs;, %cellhalign; and %cellvalign; are: abbr provides an abbreviated name for a <th> cell that can be used when referring to that <th> cell. Its default value is the cell content. axis usually applied only to <th> cells. It gives a short name for that header content. headers provides the id value(s), used with <td> cells, to reference one or more cells with <th id="xxx"> that contain headings that collectively describe or qualify the content of the cell, for example <td headers="id1 id2">. 
scope value identifies one of (row | rowgroup | column | colgroup) to which the header information applies. rowspan indicates the total number of rows below that the cell extends, by default 1. colspan indicates the total number of columns the cell extends, by default 1, in the writing direction of the table. --> <!ATTLIST td %attrs; abbr %Text; #IMPLIED axis CDATA #IMPLIED headers IDREFS #IMPLIED scope %Scope; #IMPLIED rowspan NMTOKEN "1" colspan NMTOKEN "1" %cellhalign; %cellvalign; > <!--================ Document Head =======================================--> <!ENTITY % head.misc "style | meta | link"> <!--Use: head contains metainformation about the book but no actual content of the book itself, which is placed in <book>. This information is consonant with the <head> information in xhtml, see [XHTML11STRICT]. Other miscellaneous elements can occur before and after the required <title>. By convention <title> should occur first. --> <!ELEMENT head ((%head.misc;)*,title,(%head.misc;)*)> <!--Ause: head The profile attribute gives one or more whitespace-separated profile URI targets that may provide additional information about the current document. --> <!ATTLIST head %i18n; profile %URI; #IMPLIED > <!--Use: title contains the title of the book but is used only as metainformation in <head>. Use <doctitle> within <book> for the actual book title, which will usually be the same. --> <!ELEMENT title (#PCDATA)> <!ATTLIST title %i18n; > <!--Use: link is an empty element appearing in the <head> section of a document that establishes a connection between the current document and another document. The <link> element conveys relationship information (for example, "next" and "previous") that may be rendered by user agents in a variety of ways. --> <!ELEMENT link EMPTY> <!--Ause: link Each attribute use indicated by a parameter entity is defined in the comment following its definition. 
--> <!ATTLIST link %attrs; charset %Charset; #IMPLIED href %URI; #IMPLIED hreflang %LanguageCode; #IMPLIED type %ContentType; #IMPLIED rel %LinkTypes; #IMPLIED rev %LinkTypes; #IMPLIED media %MediaDesc; #IMPLIED > <!--Use: meta indicates metadata about the book. It is an empty element that may appear repeatedly only in <head>. --> <!ELEMENT meta EMPTY> <!--Ause: meta The http-equiv attribute connects the content attribute value to an http header field. The name attribute value identifies the specific kind of content value. The scheme value indicates a predetermined format for interpreting the content value. --> <!ATTLIST meta %i18n; http-equiv NMTOKEN #IMPLIED name NMTOKEN #IMPLIED content CDATA #REQUIRED scheme CDATA #IMPLIED > <!--Use: style provides the means to include styling information that applies to the book. It may appear only in <head>. It may include CDATA sections. --> <!ELEMENT style (#PCDATA)> <!--Ause: style The type attribute indicates the MIME-Type [RFC2045]. Type value should be "text/css", rather than "text/javascript". The media attribute value indicates the media for stylesheet definition(s); if multiple, separated by commas. The title value can provide menu choice among alternative stylesheets. The xml:space value indicates that whitespace in the <style> content is preserved without need to include its value in each <style>. --> <!ATTLIST style %i18n; type %ContentType; #REQUIRED media %MediaDesc; #IMPLIED title %Text; #IMPLIED xml:space (default | preserve) 'preserve' >
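As an informal illustration of the headers attribute described in the <th> and <td> comments above (not part of the DTD), the following Python sketch resolves each <td> to the text of the <th> cells it references. The table fragment, its id values, and the function names are hypothetical; only the element and attribute names are taken from the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical table fragment; only element/attribute names come from the DTD.
TABLE = """
<table>
  <thead>
    <tr><th id="c1">City</th><th id="c2">Population</th></tr>
  </thead>
  <tbody>
    <tr><th id="r1">Oslo</th><td headers="r1 c2">500000</td></tr>
    <tr><th id="r2">Bergen</th><td headers="r2 c2">250000</td></tr>
  </tbody>
</table>
"""


def cell_text(cell):
    """Return the text content of a cell with inline markup flattened."""
    return "".join(cell.itertext()).strip()


def describe_cells(table_xml):
    """Pair each <td> with the text of the <th> cells named in its headers."""
    root = ET.fromstring(table_xml)
    # Map every <th id="..."> to its text so the IDREFS can be resolved.
    header_text = {th.get("id"): cell_text(th)
                   for th in root.iter("th") if th.get("id")}
    described = []
    for td in root.iter("td"):
        ids = (td.get("headers") or "").split()  # IDREFS are space separated
        labels = [header_text[i] for i in ids if i in header_text]
        described.append((labels, cell_text(td)))
    return described


for labels, value in describe_cells(TABLE):
    print(", ".join(labels), "=", value)
```

A reading system could use a lookup of this kind to announce the row and column headings before speaking a data cell; how headings are actually presented is left to the implementation.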
<!--SMIL 2.0 DTB-specific DTD Version 1.0.0 2001-09-27 file: dtbsmil100.dtd Authors: Michael Moodie, Tom McLaughlin, Lloyd Rasmussen Description: This DTD is intended for use only with DTB applications. Documents valid to this DTD will also be valid to the DTB SMIL Profile, but not necessarily vice versa, as this DTD contains only a subset of the elements and attributes present in the DTB SMIL Profile. This DTD is in some areas more restrictive than the Profile (e.g., requiring IDs on some elements), to enforce structure critical to the DTB application. The following identifiers apply to this DTD: "-//NISO//DTD dtbsmil v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/dtbsmil100.dtd" --> <!ENTITY % Core.attrib "id ID #IMPLIED class CDATA #IMPLIED title CDATA #IMPLIED" > <!ENTITY % URI "CDATA"> <!-- a Uniform Resource Identifier, see [RFC2396] --> <!ELEMENT smil (head, body) > <!ATTLIST smil %Core.attrib; version CDATA #FIXED "1.0.0" xml:lang NMTOKEN #IMPLIED > <!ELEMENT head ((meta)*, (layout)?, (customAttributes)? ) > <!ATTLIST head %Core.attrib; xml:lang NMTOKEN #IMPLIED > <!ELEMENT meta EMPTY > <!ATTLIST meta name CDATA #REQUIRED content CDATA #IMPLIED > <!-- only smil basic layout allowed; not CSS2. root-layout not included, is implementation dependent. 
--> <!ELEMENT layout (region)+ > <!ATTLIST layout %Core.attrib; xml:lang NMTOKEN #IMPLIED > <!ELEMENT region EMPTY > <!ATTLIST region id ID #REQUIRED height CDATA 'auto' width CDATA 'auto' bottom CDATA 'auto' top CDATA 'auto' left CDATA 'auto' right CDATA 'auto' fit (hidden|fill|meet|scroll|slice) 'hidden' z-index CDATA #IMPLIED backgroundColor CDATA #IMPLIED showBackground (always|whenActive) 'always' > <!ELEMENT customAttributes (customTest)+ > <!ATTLIST customAttributes %Core.attrib; xml:lang NMTOKEN #IMPLIED > <!ELEMENT customTest EMPTY > <!ATTLIST customTest id ID #REQUIRED class CDATA #IMPLIED defaultState (true|false) 'false' title CDATA #IMPLIED xml:lang NMTOKEN #IMPLIED override (visible|hidden) 'hidden' > <!-- Even though body functions as a seq, and you don't need a base set of seqs wrapping the whole presentation, for DTB applications a base set of seqs should be used. The dur attribute on the first seq is used by the player to determine the length of the SMIL presentation. --> <!ELEMENT body (par|seq|text|audio|img|a)+ > <!ATTLIST body %Core.attrib; xml:lang NMTOKEN #IMPLIED > <!ELEMENT seq (par|seq|text|audio|img|a)+ > <!ATTLIST seq id ID #REQUIRED class CDATA #IMPLIED customTest IDREF #IMPLIED dur CDATA #IMPLIED > <!-- pars are not allowed to nest. --> <!ELEMENT par (seq|text|audio|img|a)+ > <!ATTLIST par id ID #REQUIRED class CDATA #IMPLIED customTest IDREF #IMPLIED > <!ELEMENT text EMPTY > <!ATTLIST text id ID #IMPLIED region CDATA #IMPLIED src CDATA #REQUIRED type CDATA #IMPLIED > <!ELEMENT audio EMPTY > <!ATTLIST audio id ID #IMPLIED src CDATA #REQUIRED type CDATA #IMPLIED clipBegin CDATA #IMPLIED clipEnd CDATA #IMPLIED region CDATA #IMPLIED > <!ELEMENT img EMPTY > <!ATTLIST img id ID #IMPLIED region CDATA #IMPLIED src CDATA #REQUIRED type CDATA #IMPLIED > <!ELEMENT a (text|audio|img)* > <!ATTLIST a href %URI; #REQUIRED xml:lang NMTOKEN #IMPLIED %Core.attrib; >
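The following Python sketch, offered only as an illustration, shows how the clipBegin and clipEnd attributes on <audio> might be used to total the audio time referenced by a SMIL file. The DTD types these attributes as CDATA; the clock-value forms handled here (hh:mm:ss.fff, mm:ss.fff, and timecounts with an optional h, min, s, or ms suffix) are an assumption based on SMIL 2.0 clock values. The sample document, its file names, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical SMIL instance; element and attribute names follow the DTD above.
SMIL = """
<smil>
  <head>
    <meta name="example" content="hypothetical"/>
  </head>
  <body>
    <seq id="s1">
      <par id="p1">
        <text src="book.xml#h1"/>
        <audio src="ch1.mp3" clipBegin="0:00:00.000" clipEnd="0:00:04.500"/>
      </par>
      <par id="p2">
        <text src="book.xml#para1"/>
        <audio src="ch1.mp3" clipBegin="4.5s" clipEnd="1:30.250"/>
      </par>
    </seq>
  </body>
</smil>
"""


def parse_clock(value):
    """Convert an assumed subset of SMIL 2.0 clock values to seconds."""
    value = value.strip()
    if ":" in value:
        # Full (hh:mm:ss) or partial (mm:ss) clock value.
        seconds = 0.0
        for field in value.split(":"):
            seconds = seconds * 60 + float(field)
        return seconds
    # Timecount value; "ms" must be tested before "s".
    for suffix, factor in (("ms", 0.001), ("min", 60.0), ("h", 3600.0), ("s", 1.0)):
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * factor
    return float(value)  # a bare number is taken as seconds


def total_audio_seconds(smil_xml):
    """Sum clipEnd - clipBegin over every <audio> that carries both attributes."""
    root = ET.fromstring(smil_xml)
    total = 0.0
    for audio in root.iter("audio"):
        begin, end = audio.get("clipBegin"), audio.get("clipEnd")
        if begin is not None and end is not None:
            total += parse_clock(end) - parse_clock(begin)
    return total


print(total_audio_seconds(SMIL))
```

Clips that omit clipBegin or clipEnd are skipped here because their length cannot be known without inspecting the audio file itself.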
<!-- NCX 1.0.0 DTD 2001-09-27 file: ncx100.dtd Authors: Mark Hakkinen, George Kerscher, Tom McLaughlin, James Pritchett, and Michael Moodie Description: NCX (Navigation Control for XML applications) is a generalised navigation definition DTD for application to Digital Talking Books, eBooks, and general web content models. This DTD is an XML application that layers navigation functionality on top of SMIL 2.0 content. The NCX defines a navigation path/model which may be applied upon existing publications, without modification of the existing publication source, so long as the navigation targets within the source publication can be directly referenced via a URI. --> <!-- The following identifiers apply to this DTD: "-//NISO//DTD ncx v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/ncx100.dtd" --> <!-- Basic Entities --> <!ENTITY % i18n "lang NMTOKEN #IMPLIED dir (ltr|rtl) #IMPLIED" > <!ENTITY % SMILtimeVal "CDATA" > <!ENTITY % uri "CDATA" > <!ENTITY % script "CDATA" > <!-- ELEMENTS --> <!-- Top Level NCX Container. --> <!ELEMENT ncx (head, docTitle, docAuthor*, navMap, navList*)> <!ATTLIST ncx version CDATA #FIXED "1.0.0" %i18n; > <!-- Document Head - Contains all NCX metadata. --> <!ELEMENT head (smilCustomTest | meta)+> <!-- smilCustomTest - Duplicates customTest data found in SMIL files. Each unique customTest element that appears in one or more SMIL files must have its attributes duplicated in a smilCustomTest element in the NCX. The NCX thus gathers in one place all customTest elements used in the SMIL files, for presentation to the user. --> <!ELEMENT smilCustomTest EMPTY> <!ATTLIST smilCustomTest id ID #REQUIRED defaultState (true|false) 'false' override (visible|hidden) 'hidden'> <!-- Meta Element - metadata about this NCX --> <!ELEMENT meta EMPTY> <!ATTLIST meta name CDATA #REQUIRED content CDATA #REQUIRED scheme CDATA #IMPLIED > <!-- DocTitle - the title of the document, required and must immediately follow head. 
--> <!ELEMENT docTitle (text, audio?)> <!ATTLIST docTitle id ID #IMPLIED %i18n; > <!-- DocAuthor - the author of the document, immediately follows docTitle. --> <!ELEMENT docAuthor (text, audio?)> <!ATTLIST docAuthor id ID #IMPLIED %i18n; > <!-- Navigation Structure - container for all of the NCX objects that are part of the hierarchical structure of the document. --> <!ELEMENT navMap (navLabel*, navPoint+)> <!ATTLIST navMap id ID #IMPLIED > <!-- Navigation Point - contains description(s) of target, as well as a pointer to entire content of target. Hierarchy is represented by nesting navPoints. "class" attribute describes the kind of structural unit this object represents (e.g., "chapter", "section"). "value" attribute is a numerical representation of the text content of the label if this is a purely numerical (integer only) label (e.g., a page number). "pageRef" is the id of the page navTarget on which this structure target begins. --> <!ELEMENT navPoint (navLabel+, content, navPoint*)> <!ATTLIST navPoint id ID #REQUIRED onFocus %script; #IMPLIED onBlur %script; #IMPLIED class CDATA #IMPLIED value CDATA #IMPLIED pageRef IDREF #IMPLIED > <!-- Navigation List - container for distinct, flat sets of navigable elements, e.g. page numbers, notes, figures, tables, etc. Essentially a flat version of navMap. The "class" attribute describes the type of object contained in this navList, using dtbook element names, e.g., pagenum, note. --> <!ELEMENT navList (navLabel+, navTarget+) > <!ATTLIST navList id ID #IMPLIED class CDATA #IMPLIED > <!-- Navigation Target - contains description(s) of target, as well as a pointer to entire content of target. navTargets are the equivalent of navPoints for use in navLists. "mapRef" is the id of another navPoint within this NCX that contains this navTarget. "class" attribute describes the kind of structure this target represents, using its dtbook element name, e.g., pagenum, note. 
--> <!ELEMENT navTarget (navLabel+, content) > <!ATTLIST navTarget id ID #REQUIRED onFocus %script; #IMPLIED onBlur %script; #IMPLIED class CDATA #IMPLIED value CDATA #IMPLIED mapRef IDREF #REQUIRED > <!-- Navigation Label - Contains a description of a given <navMap>, <navPoint>, <navList>, or <navTarget> in various media for presentation to the user. Can be repeated so descriptions can be provided in multiple languages. --> <!ELEMENT navLabel ((text,(audio?, img?))|((text?), audio, (img?))) > <!ATTLIST navLabel %i18n; > <!-- Content Element - pointer into SMIL to beginning of navPoint. --> <!ELEMENT content EMPTY> <!ATTLIST content id ID #IMPLIED src %uri; #REQUIRED > <!-- Text Element - Contains text of docTitle, navPoint heading, navTarget (e.g., page number), or label for navMap or navList. --> <!ELEMENT text (#PCDATA)> <!ATTLIST text id ID #IMPLIED class CDATA #IMPLIED > <!-- Audio Element - audio clip of navPoint heading. --> <!ELEMENT audio EMPTY> <!ATTLIST audio id ID #IMPLIED class CDATA #IMPLIED src %uri; #REQUIRED clipBegin %SMILtimeVal; #IMPLIED clipEnd %SMILtimeVal; #IMPLIED > <!-- Image Element - image that may accompany heading. --> <!ELEMENT img EMPTY> <!ATTLIST img id ID #IMPLIED class CDATA #IMPLIED src %uri; #REQUIRED >
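To illustrate how nesting of navPoint elements expresses hierarchy, the following Python sketch flattens a navMap into a list of (depth, label, target) entries. It is a minimal example rather than a reference implementation: the sample NCX, its id values, metadata, and SMIL file names are hypothetical, and only the first text label of each navPoint is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical NCX instance; element and attribute names follow the DTD above.
NCX = """
<ncx version="1.0.0">
  <head>
    <meta name="example" content="hypothetical"/>
  </head>
  <docTitle><text>Sample Book</text></docTitle>
  <navMap>
    <navPoint id="np1" class="chapter">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="ch1.smil#p1"/>
      <navPoint id="np2" class="section">
        <navLabel><text>Section 1.1</text></navLabel>
        <content src="ch1.smil#p7"/>
      </navPoint>
    </navPoint>
    <navPoint id="np3" class="chapter">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="ch2.smil#p1"/>
    </navPoint>
  </navMap>
</ncx>
"""


def walk(nav_point, depth=1):
    """Yield (depth, label, src) for a navPoint and all nested navPoints."""
    label = nav_point.findtext("navLabel/text")  # first text label only
    src = nav_point.find("content").get("src")
    yield depth, label, src
    for child in nav_point.findall("navPoint"):
        yield from walk(child, depth + 1)


def table_of_contents(ncx_xml):
    """Flatten the navMap into document order, keeping the nesting depth."""
    nav_map = ET.fromstring(ncx_xml).find("navMap")
    entries = []
    for top in nav_map.findall("navPoint"):
        entries.extend(walk(top))
    return entries


for depth, label, src in table_of_contents(NCX):
    print("  " * (depth - 1) + label, "->", src)
```

Because each content element points into a SMIL file, a player can move the reader to the selected heading by resolving the src value of the chosen entry.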
<!-- bookmark 1.0.0 DTD 2001-09-27 file: bookmark100.dtd Authors: Tom McLaughlin and Michael Moodie The following identifiers apply to this DTD: "-//NISO//DTD bookmark v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/bookmark100.dtd" --> <!-- ********************* Entities ******************* --> <!ENTITY % uri "CDATA"> <!-- ********************* Elements ********************* --> <!-- BookmarkSet: The set of bookmarks for a book consists of the title, a unique identifier of the book, the last place the reader left off and zero or more bookmarks, highlights, and associated audio or textual notes. This set is intended for export of bookmarks, highlights and notes to another player; the markup is not required for a player's internal representation of bookmarks. --> <!ELEMENT bookmarkSet (title, uid, lastmark?, (bookmark | hilite)*) > <!-- Title: The book's title in text and an optional audio clip. --> <!ELEMENT title (text, audio?) > <!-- uid: A globally unique identifier for the book. --> <!ELEMENT uid (#PCDATA) > <!-- Bookmark: Location and optional note. Location consists of a uri pointing to the id attribute of the <par> element in the SMIL file that contains the bookmark plus a time offset in seconds (or character offset) to the exact place. Player should by default automatically number bookmarks in the order in which they fall in the book. --> <!ELEMENT bookmark (ncxRef, uri, (timeOffset | charOffset), note?) > <!ATTLIST bookmark label CDATA #IMPLIED > <!-- NcxRef: Captures current location in NCX (the id of the current navPoint) at the time a lastmark, bookmark, or highlight is set. Ensures that current location in NCX and SMIL are synchronized after moving to a lastmark, etc., so that any global navigation commands issued by the user will start from the current location. --> <!ELEMENT ncxRef (#PCDATA)> <!-- Lastmark: Location where reader left off and where player will resume play when restarted. 
--> <!ELEMENT lastmark (ncxRef, uri, (timeOffset | charOffset)) > <!-- Hilite: A block of text with an optional note attached. --> <!ELEMENT hilite (hiliteStart, hiliteEnd, note?) > <!ATTLIST hilite label CDATA #IMPLIED > <!-- HiliteStart: Starting point of highlighted block. --> <!ELEMENT hiliteStart (ncxRef, uri, (timeOffset | charOffset)) > <!-- HiliteEnd: End point of highlighted block. --> <!ELEMENT hiliteEnd (ncxRef, uri, (timeOffset | charOffset)) > <!-- Uri: pointer to id of <par> or <seq> in SMIL, to id in text-only file, or to audio file that contains the bookmark. --> <!ELEMENT uri (#PCDATA) > <!-- TimeOffset: Exact position of bookmark in SMIL file or audio-only file referenced by the uri; in seconds.fraction (seconds=DIGIT+, fraction=3DIGIT). --> <!ELEMENT timeOffset (#PCDATA) > <!-- CharOffset: Exact position of bookmark in text-only file referenced by the uri; in characters, counting from nearest previous tag with an id. White space is normalized (collapsed to one character) and tags are not counted. --> <!ELEMENT charOffset (#PCDATA) > <!-- Note: The note is for the user's input, random thoughts, musings, etc. It can be text or audio or both. --> <!ELEMENT note (text?, audio?) > <!-- Text: Text of title or note. --> <!ELEMENT text (#PCDATA) > <!-- Audio: Audio clip of user-recorded note, in any format supported by standard. --> <!ELEMENT audio EMPTY > <!ATTLIST audio src %uri; #REQUIRED clipBegin CDATA #IMPLIED clipEnd CDATA #IMPLIED >
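As a sketch of how a player might export a bookmark in the form defined above, the following Python example builds a <bookmark> element whose timeOffset uses the seconds.fraction form (one or more digits for seconds, exactly three for the fraction). Holding the position as an integer number of milliseconds is an assumption made here to avoid rounding problems; the function names, the ncxRef value, and the SMIL URI are hypothetical.

```python
import xml.etree.ElementTree as ET


def format_time_offset(milliseconds):
    """Render a position as seconds.fraction with a three-digit fraction."""
    seconds, fraction = divmod(int(milliseconds), 1000)
    return f"{seconds}.{fraction:03d}"


def make_bookmark(ncx_ref, uri, offset_ms, label=None):
    """Build a <bookmark> with children in the order the DTD requires."""
    bookmark = ET.Element("bookmark")
    if label is not None:
        bookmark.set("label", label)
    ET.SubElement(bookmark, "ncxRef").text = ncx_ref
    ET.SubElement(bookmark, "uri").text = uri
    ET.SubElement(bookmark, "timeOffset").text = format_time_offset(offset_ms)
    return bookmark


# Hypothetical values: navPoint "np2", 12.34 seconds into <par id="p7">.
example = make_bookmark("np2", "ch1.smil#p7", 12340, label="1")
print(ET.tostring(example, encoding="unicode"))
```

A text-only book would carry a charOffset element in place of timeOffset; that case is not shown.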
<!-- Resource File 1.0.0 DTD 2001-09-27 file: resource100.dtd Authors: Tom McLaughlin, Michael Moodie, Thomas Kjellberg Christensen The following identifiers apply to this DTD: "-//NISO//DTD resource v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/resource100.dtd" --> <!-- ********** Attribute Types *********** --> <!-- languagecode: An RFC1766 language code. --> <!ENTITY % languagecode "NMTOKEN"> <!-- SMILtimeVal: SMIL 2.0 clock value. --> <!ENTITY % SMILtimeVal "CDATA"> <!ENTITY % URI "CDATA"> <!-- **************** Resource Elements ********** --> <!-- Resources: Root element of DTD. --> <!ELEMENT resources (head?, (resource)+) > <!ATTLIST resources version CDATA #FIXED "1.0.0" > <!-- Document Head - Contains metadata. --> <!ELEMENT head (meta*)> <!-- Resource element contains information about the alternative representations of an element present in the NCX or the textual content file. An alternative representation can be used to convey navigational information, e.g., provide a descriptive name for the kind of segment (part, chapter, section, etc.) the user is encountering. In addition, it can supply accessible versions of dtbook element names and names of skippable structures listed in the head of the NCX. Text can be used for screen or braille display, audio for digital talking book players, and image for screen display. Attribute use: type - Specifies whether the resource applies to the textual content file (dtbook) or the NCX (ncx). elementRef - Specifies the name of the element for which the resource is to be supplied. classRef - Specifies the class attribute value of the element for which the resource is to be supplied. idRef - Specifies the name of the id attribute on the smilCustomTest element in NCX for which the resource is to be supplied. lang - Specifies the language of the resource item, using an RFC 1766 language code. --> <!ELEMENT resource ((text, audio?, img?) 
| (text?, audio, img?)) > <!ATTLIST resource type (ncx | dtbook) #REQUIRED elementRef CDATA #REQUIRED classRef CDATA #IMPLIED idRef CDATA #IMPLIED lang %languagecode; #IMPLIED > <!ELEMENT text (#PCDATA) > <!ELEMENT audio EMPTY > <!ATTLIST audio src %URI; #REQUIRED clipBegin %SMILtimeVal; #IMPLIED clipEnd %SMILtimeVal; #IMPLIED > <!-- If the clipBegin attribute is not present in an instance of the audio element, the audio file referenced must be played from its beginning. If the clipEnd attribute is not present, the audio file must be played to its end. If the value of the clipEnd attribute exceeds the duration of the audio file, the value must be ignored, and the audio file played to its end. --> <!ELEMENT img EMPTY > <!ATTLIST img src %URI; #REQUIRED > <!-- Meta Element - producer-defined metadata about this resource file. --> <!ELEMENT meta EMPTY> <!ATTLIST meta name CDATA #REQUIRED content CDATA #REQUIRED scheme CDATA #IMPLIED >
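The clipBegin and clipEnd rules stated in the comment above can be expressed as a small function. The Python sketch below assumes that the clock values have already been converted to seconds and that the duration of the audio file is known; the function name and parameter names are hypothetical. Cases the comment does not address, such as a clipBegin beyond the end of the file, are deliberately left unhandled.

```python
def effective_clip(duration, clip_begin=None, clip_end=None):
    """Return the (start, end) in seconds that should actually be played.

    duration   -- length of the referenced audio file, in seconds
    clip_begin -- value of clipBegin in seconds, or None if absent
    clip_end   -- value of clipEnd in seconds, or None if absent
    """
    # No clipBegin: play from the beginning of the file.
    start = 0.0 if clip_begin is None else clip_begin
    # No clipEnd, or a clipEnd past the end of the file: play to the end.
    if clip_end is None or clip_end > duration:
        end = duration
    else:
        end = clip_end
    return start, end


print(effective_clip(10.0))             # whole file
print(effective_clip(10.0, 2.0, 15.0))  # clipEnd exceeds the duration
```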
<!-- distInfo 1.0.0 DTD 2001-09-27 file: distInfo100.dtd Author: James Pritchett Description: An XML application to describe the contents of a single piece of DTB distribution media. It consists of a list of books to be found on the media. For each book, distInfo identifies the location of each book within the media filesystem. If the book is being distributed on multiple distribution media (media units), the distInfo book element also includes: 1) the sequence id of this media unit 2) a distribution map for the book, telling where to find all the SMIL files for a book The following identifiers apply to this DTD: "-//NISO//DTD distInfo v1.0.0//EN" "http://www.loc.gov/nls/z3986/v100/distInfo100.dtd" --> <!-- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * --> <!ENTITY % URI "CDATA"> <!ENTITY % SMILtimeVal "CDATA"> <!-- distInfo: Root element, consists of one or more books. "version" specifies the version of this DTD used in this instance. Three digits, with decimal point separators; digits one, two and three will reflect major, moderate and minor changes, respectively. This attribute must be present but parsers will not enforce its presence, just its value. --> <!ELEMENT distInfo (book+)> <!ATTLIST distInfo version CDATA #FIXED "1.0.0" > <!-- book: a DTB that is present, in part or whole, on this piece of distribution media. The uid and pkgRef attributes are required. "uid" matches the package unique-identifier. "pkgRef" is a URI that locates the book's package file on this media unit. If this is a book fragment, then the "media" attribute identifies which fragment is stored on this media unit, and a single distMap element is present to describe which SMIL files are present on which media units. The media attribute is in the format "x:y", where x is the sequence number of this media unit, and y is the total number of media units in the distribution of this book. 
In the case of a book fragment, <book> should contain exactly one <distMap> and optionally one or more <changeMsg> elements. --> <!ELEMENT book (distMap?, changeMsg*)> <!ATTLIST book uid CDATA #REQUIRED pkgRef CDATA #REQUIRED media CDATA #IMPLIED > <!-- distMap: a map identifying which media the various SMIL files reside upon. This consists of one or more smilRef elements. The distMap smilRef's should match one-to-one those of the book package spine. --> <!ELEMENT distMap (smilRef+)> <!-- smilRef: a reference to a DTB SMIL file. These are referenced by file name. The mediaRef attribute of each smilRef identifies the piece of media that the file resides upon, and is in the format "x:y" (see above). --> <!ELEMENT smilRef EMPTY> <!ATTLIST smilRef file CDATA #REQUIRED mediaRef CDATA #REQUIRED > <!-- changeMsg: A pointer to a custom message to be read when a new disk is requested by the reading system. "mediaRef" identifies the media unit which this message (e.g.,"Insert disc 2") specifies. Player invokes the correct <changeMsg> by matching its "mediaRef" attribute to the "mediaRef" attribute of the selected <smilRef>. "mediaRef" is in the format "x:y", where x is the sequence number of the specified media unit, and y is the total number of media pieces in the distribution of this book. --> <!ELEMENT changeMsg ((text, audio?) | (text?, audio))> <!ATTLIST changeMsg mediaRef CDATA #REQUIRED lang NMTOKEN #IMPLIED > <!-- text: Contains text of media change message. --> <!ELEMENT text (#PCDATA)> <!-- audio: Pointer to audio content of media change message. --> <!ELEMENT audio EMPTY> <!ATTLIST audio src %URI; #REQUIRED clipBegin %SMILtimeVal; #IMPLIED clipEnd %SMILtimeVal; #IMPLIED >
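The matching of <changeMsg> to <smilRef> by mediaRef, as described above, might be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed algorithm: the sample distInfo instance, its uid, package path, and file names are hypothetical, and the choice to return no message when the requested file is already on the current media unit is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical distInfo instance for disc 1 of a two-disc book.
DIST_INFO = """
<distInfo version="1.0.0">
  <book uid="example-uid" pkgRef="book/package.opf" media="1:2">
    <distMap>
      <smilRef file="ch1.smil" mediaRef="1:2"/>
      <smilRef file="ch2.smil" mediaRef="2:2"/>
    </distMap>
    <changeMsg mediaRef="1:2"><text>Insert disc 1</text></changeMsg>
    <changeMsg mediaRef="2:2"><text>Insert disc 2</text></changeMsg>
  </book>
</distInfo>
"""


def parse_media(value):
    """Split an "x:y" value into (sequence number, total media units)."""
    sequence, total = value.split(":")
    return int(sequence), int(total)


def change_message_for(book, smil_file):
    """Return the change-message text needed to reach smil_file, if any."""
    current = book.get("media")
    for ref in book.iter("smilRef"):
        if ref.get("file") != smil_file:
            continue
        wanted = ref.get("mediaRef")
        if wanted == current:
            return None  # file is on the media unit already in use
        for message in book.findall("changeMsg"):
            if message.get("mediaRef") == wanted:
                return message.findtext("text")
        return None  # no custom message supplied for that media unit
    raise KeyError(smil_file)  # file is not listed in the distMap


book = ET.fromstring(DIST_INFO).find("book")
print(parse_media(book.get("media")))
print(change_message_for(book, "ch2.smil"))
```

An audio rendition of the message would be located in the same way, by reading the <audio> child of the matching <changeMsg> instead of its <text> child.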
(This Appendix is not part of American National Standard Z39.86-200x v1.0.0, File Specifications for the Digital Talking Book. It is included for information only.)
The functions assigned to the maintenance agency as specified in section 1.7 will be administered by the National Library Service for the Blind and Physically Handicapped, the Library of Congress. Questions concerning the implementation of this standard and requests for information should be sent to the Research and Development Officer, National Library Service for the Blind and Physically Handicapped, Library of Congress, Washington, DC 20542, or nls@loc.gov, including "Z3986" in the subject line.