Automated Workflow: Transforming OpenDocument Text into DAISY DTB

How to Create a DAISY DTB from ODT Files

Overview

Convert an ODT (OpenDocument Text) into a DAISY DTB (Digital Talking Book) by preparing a well-structured source, exporting to an intermediate format (usually HTML or XML), and packaging the DAISY components (audio, navigation, text) into a DTB.

1) Prepare the ODT

  • Use styles (Heading 1, Heading 2, Normal) consistently for logical structure.
  • Add semantic elements: alt text for images, captions, lists, tables with headers.
  • Remove manual formatting that breaks structure (e.g., repeated blank lines for spacing).
  • Ensure chapter and section breaks correspond to headings.
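
The heading checks above can be partly automated. The sketch below reads content.xml from an ODT archive and lists every heading with its outline level, then flags headings that skip a level (for example Heading 1 followed directly by Heading 3). The function names are illustrative, and it assumes headings were applied as real heading styles, which ODT stores as text:h elements.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the OpenDocument "text" vocabulary used in content.xml.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def read_content_xml(odt_path):
    """Return the raw content.xml bytes from an ODT file (a ZIP archive)."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read("content.xml")


def list_headings(content_xml):
    """Return (outline_level, text) for each text:h element, in document order."""
    root = ET.fromstring(content_xml)
    headings = []
    for h in root.iter(f"{{{TEXT_NS}}}h"):
        level = int(h.get(f"{{{TEXT_NS}}}outline-level", "1"))
        headings.append((level, "".join(h.itertext()).strip()))
    return headings


def find_skipped_levels(headings):
    """Return headings that sit more than one level below the previous heading.

    A document whose first heading is deeper than level 1 is flagged as well.
    """
    problems = []
    previous = 0
    for level, text in headings:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems
```

Typical use would be find_skipped_levels(list_headings(read_content_xml("book.odt"))), where book.odt is a placeholder path; an empty result means the heading hierarchy has no gaps.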

2) Export or convert ODT → HTML/XML

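  • Export ODT to HTML from the editor (LibreOffice Writer, for example, offers HTML under “Save As” and XHTML under “Export”), and expect to clean up the generated markup afterwards.
  • Alternatively, convert ODT → HTML with a command-line tool such as Pandoc: pandoc file.odt -s -t html -o file.html. The -s (standalone) flag writes a complete document with a head section rather than a bare fragment.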
  • Verify headings, links, image references, and character encoding (UTF-8).
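
As a rough illustration of the conversion and verification steps, the sketch below builds a Pandoc command line and then inspects the resulting HTML for headings, image references, and a declared charset. The helper names and the media directory are assumptions made for this example; the Pandoc options used (-f, -t, -s, --extract-media, -o) are standard ones, and Pandoc must be installed for convert() to work.

```python
import re
import subprocess
from html.parser import HTMLParser


def build_pandoc_command(src, dst, media_dir="media"):
    """Return the argument list for a standalone ODT -> HTML conversion."""
    return [
        "pandoc", str(src),
        "-f", "odt",
        "-t", "html",
        "-s",                              # full document, not a fragment
        f"--extract-media={media_dir}",    # write embedded images to disk
        "-o", str(dst),
    ]


def convert(src, dst):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_pandoc_command(src, dst), check=True)


class StructureCheck(HTMLParser):
    """Collect heading tags, image sources, and the declared charset."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.images = []
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if re.fullmatch(r"h[1-6]", tag):
            self.headings.append(tag)
        elif tag == "img":
            self.images.append(attrs.get("src"))
        elif tag == "meta" and "charset" in attrs:
            self.charset = attrs["charset"]


def check_html(html_text):
    """Parse HTML text and return the populated StructureCheck instance."""
    checker = StructureCheck()
    checker.feed(html_text)
    return checker
```

After conversion, compare check_html(...).headings with the heading outline of the ODT, and confirm that every path in .images exists on disk.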

3) Generate or obtain audio

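  • Create narrated audio files for the text. MP3 is the most widely supported choice; confirm AAC support in your target DAISY version and players before using it:
    • Human narration (best accessibility/usability).
    • Or high-quality TTS (make sure the output format is one DAISY accepts and that timing data is available for synchronization).
  • Save one audio file per logical unit (chapter or section) and name the files consistently.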
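
One way to keep audio names consistent is to derive them from the section order and title. The sketch below is a hypothetical naming scheme, not something DAISY prescribes: a zero-padded sequence number followed by an ASCII-only slug of the heading text.

```python
import re
import unicodedata


def slugify(title):
    """Reduce a heading to lowercase ASCII letters, digits, and underscores."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_text.lower()).strip("_")
    return slug or "section"  # fallback when nothing usable remains


def audio_filename(index, title, ext="mp3"):
    """Build a sortable file name such as 001_introduction.mp3."""
    return f"{index:03d}_{slugify(title)}.{ext}"


print(audio_filename(1, "Chapter 1: Café & Tea"))  # 001_chapter_1_cafe_tea.mp3
```

The numeric prefix keeps files in reading order when sorted by name, which makes later SMIL generation simpler.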

4) Create DAISY navigation and sync files

  • Two main DAISY flavors: DAISY 2.02 (SMIL + ncc.html) and DAISY 3, standardized as ANSI/NISO Z39.86 (DTBook XML + SMIL, with an NCX navigation file and an OPF package file). For a new full-text DTB, produce DAISY 3.
  • Tools/process:
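    • Use an authoring tool that maps text to DTBook and generates SMIL (syncs text with audio). Examples: Tobi (the DAISY Consortium's full-text, full-audio tool), DAISY Pipeline, commercial DAISY producers, or scripts that create SMIL from timestamps. Obi, which is often mentioned in this context, is aimed at audio books with navigation rather than synchronized full text.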
    • If manual: convert HTML → DTBook XML (DTBook schema) preserving hierarchy and IDs for sync targets.
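    • Create SMIL files that pair audio fragments with text locations to enable read-along: each <par> contains a <text src="book.xml#id"/> pointing at a DTBook element ID and an <audio/> element whose src, clipBegin, and clipEnd attributes select the matching clip.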
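
For the script-based route, the sketch below shows how a list of timestamps could be turned into SMIL par elements. It is deliberately minimal: the function names and the segment tuple layout are assumptions for this example, and a conforming DAISY 3 SMIL file also needs a DOCTYPE and head metadata, which are omitted here.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/2001/SMIL20/"


def clock(seconds):
    """Format seconds as a SMIL full clock value, h:mm:ss.mmm."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_smil(dtbook_file, audio_file, segments):
    """Return SMIL markup as a string.

    segments is a list of (element_id, start_seconds, end_seconds) tuples,
    one per DTBook element that should be highlighted while its clip plays.
    """
    ET.register_namespace("", SMIL_NS)  # serialize without a prefix
    smil = ET.Element(f"{{{SMIL_NS}}}smil")
    ET.SubElement(smil, f"{{{SMIL_NS}}}head")  # metadata omitted in this sketch
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    seq = ET.SubElement(body, f"{{{SMIL_NS}}}seq", id="seq_main")
    for n, (element_id, start, end) in enumerate(segments, 1):
        par = ET.SubElement(seq, f"{{{SMIL_NS}}}par", id=f"par_{n:04d}")
        ET.SubElement(par, f"{{{SMIL_NS}}}text",
                      src=f"{dtbook_file}#{element_id}")
        ET.SubElement(par, f"{{{SMIL_NS}}}audio", src=audio_file,
                      clipBegin=clock(start), clipEnd=clock(end))
    return ET.tostring(smil, encoding="unicode")
```

For example, build_smil("book.xml", "001_intro.mp3", [("h1_1", 0, 3.5), ("p_1", 3.5, 10)]) yields two par elements, the second with clipBegin="0:00:03.500".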

5) Validate the DTB

  • Validate DTBook XML against the DAISY/DTBook schema.
  • Validate SMIL files for correct references and timings.
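  • Check the package structure: DAISY 3 uses an OPF package file that lists the book's files, alongside an NCX navigation file. Confirm that both are present and that every file they reference exists.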
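
Schema validation does not catch a SMIL reference that points at a missing DTBook ID. The sketch below is a simple cross-reference check under the assumption that every SMIL text element should target a single DTBook file; the function names are illustrative, and it complements a schema validator rather than replacing one.

```python
import xml.etree.ElementTree as ET


def collect_ids(dtbook_xml):
    """Return the set of id attribute values found in a DTBook document."""
    root = ET.fromstring(dtbook_xml)
    return {el.get("id") for el in root.iter() if el.get("id")}


def broken_text_refs(smil_xml, dtbook_name, known_ids):
    """Return the src values of SMIL text elements that do not resolve.

    A reference is broken if it names a different file than dtbook_name
    or if its fragment is not among known_ids.
    """
    root = ET.fromstring(smil_xml)
    broken = []
    for el in root.iter():
        # Compare local names so the check works with or without a namespace.
        if el.tag.rsplit("}", 1)[-1] != "text":
            continue
        src = el.get("src", "")
        file_part, _, fragment = src.partition("#")
        if file_part != dtbook_name or fragment not in known_ids:
            broken.append(src)
    return broken
```

An empty list from broken_text_refs means every read-along target exists in the text.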

6) Package into a DTB

  • Arrange the files in one folder with the required structure: DTBook XML, SMIL files, audio files, images, and the navigation and package files.
  • Create the distribution package. This is often just the folder itself or a .zip archive of it; follow whatever distribution format your target players or platforms require.
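
If a zip archive is the chosen distribution format, the packaging step is easy to script. The sketch below assumes the book folder is already complete and simply archives it with paths relative to the folder root; package_dtb is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def package_dtb(book_dir, archive_path):
    """Zip every file under book_dir and return the archived names."""
    book_dir = Path(book_dir)
    archive_path = Path(archive_path)
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(book_dir.rglob("*")):
            if not path.is_file():
                continue
            # Skip the archive itself if it is written inside book_dir.
            if path.resolve() == archive_path.resolve():
                continue
            name = path.relative_to(book_dir).as_posix()
            archive.write(path, name)
            names.append(name)
    return names
```

The returned list is a convenient manifest to compare against the files named in the package file.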

7) Test in DAISY players

  • Test navigation, read-along sync, skipping by heading/phrase, image descriptions, and bookmarking using multiple DAISY-compatible players or devices.

Tools & Commands (examples)

  • LibreOffice: export ODT → HTML/XHTML.
  • Pandoc: convert ODT to standalone HTML with pandoc input.odt -s -t html -o output.html.
  • Tobi, DAISY Pipeline, or a commercial DAISY creator: authoring and SMIL generation.
  • Custom scripts or libraries to convert HTML → DTBook and generate SMIL.
  • Validators: a DTBook validator and a DAISY 3 fileset validator, such as the validation tools in DAISY Pipeline or ZedVal.

Practical tips

  • Start with well-styled documents — conversion quality depends on source structure.
  • Break long audio into smaller files matching sections for easier navigation.
  • Use unique IDs on headings and other elements so that SMIL <text src="..."/> references have stable targets.
  • Keep filenames ASCII and consistent; avoid spaces and special characters.
  • Automate repeated conversions with a script/toolchain once workflow is stable.
