python-pptx

A Python library for creating, reading, and updating Microsoft PowerPoint 2007+ (.pptx) files.

This repository is a fork of python-pptx by Steve Canny. It builds on that original work with additional PowerPoint capabilities beyond upstream 1.0.2:

  • Full-fidelity slide duplication and cross-presentation merging.
  • Animations and transitions; presentation sections.
  • Modern chart types (3D, combo, secondary axis, error bars, chartEx passthrough).
  • Shape effects (shadow, glow, reflection, soft-edge).
  • Password-protected saves, slide comments, math equations, SmartArt scaffolding, custom document properties, slide-level tags, and accessibility alt text.
  • Free-function layout helpers: pptx.layout.layout_horizontal / layout_vertical / layout_grid.
  • A higher-level relative / anchor positioning kit: pptx.kit.layout.below / right_of / centered / stack_horizontal / stack_vertical / align_left / align_top.
  • Visual baseline alignment across sibling shapes (pptx.kit.alignment.align_baselines / align_top_text / align_centers), for KPI rows and icon-plus-label rows where bounding-box alignment leaves text baselines wobbling.
  • A 12-column, designer-familiar grid system, pptx.kit.grid.Grid, with grid.place(shape, col, span, row, row_span) and a named-region API.
  • Text-box presets (SlideShapes.add_caption / .add_callout_number / .add_section_title).
  • Group templating via GroupShape.clone_with_labels(...).
  • Atomic template-and-substitute slide cloning (Slides.from_template(template, substitutions={...})) that pairs Slide.clone_shapes_from with exact-match tokenised text replacement.
  • A first-class single-slide PNG render (Presentation.render_png()) via LibreOffice.
  • A deck-level slide-thumbnail preview: Presentation.preview() returns a list[bytes] with one PNG per slide via LibreOffice, with a graceful embedded-thumbnail fallback when soffice is not on PATH; pass save_dir=... to also write slide-<NN>.png to disk.
  • An in-process per-shape PNG thumbnail (BaseShape.thumbnail(width_px, height_px, background)) via Pillow for pictures, auto-shapes, groups, and connectors; no LibreOffice is needed for icon-index previews.
  • Introspection helpers for snapshot tests, AI-agent pipelines, and pre-save authoring checks:
      • Slide.to_dict, .diagnostics, .find, .labels, .explicit_labels, .diff, .changeset_since, .describe, .find_by_tag.
      • Presentation.save_and_reload and .save_strict (save plus round-trip verify; raises pptx.exc.FidelityError listing every drift if the saved file doesn't match the in-memory state).
      • Presentation.diff, a semantic deck compare returning a DeckDiff with .summary, iterable .changes, and .to_markdown() / .to_html() exporters for PR comments and review UIs. It is the agentic-editing review primitive that mirrors Document.diff() and Workbook.diff().
      • Presentation.describe, and .lint for deck-scope style-drift checks (title font size, palette, spacing, footer presence, title case), the cross-slide companion to the per-slide Slide.diagnostics layout checker.
      • Presentation.consistency_report / .consistency_majority (inverse-lint: report the per-slide patterns the deck exhibits so an agent can self-identify drift without being told what the style SHOULD be).
      • Presentation.enable_journal / .disable_journal / .clear_journal / .journal_entries plus save(journal=True) for a mutation-audit sidecar JSON that records every high-level shape / text / slide mutation an agent made (post-save reconciliation and debug replay without scrolling the full tool-call transcript).
      • BaseShape.inspect_xml / .diff_xml for one-call XML readout and diffing between two shapes.
      • BaseShape.tag / .tags / .untag for durable agent-intent metadata that round-trips through PowerPoint, Google Slides, and Keynote.
      • BaseShape.is_label_for / .labeled_by for resolving the label↔anchor pairing that add_label now records automatically.
  • A typed reference-icon index (pptx.reference.IconIndex.from_deck(...).find(...).clone_onto(...)) that codifies the find-by-intent → clone protocol for agents authoring architecture decks from a reference icon pack.
  • An authoring-time Theme DSL (pptx.theme.Theme) for consistent agent-driven deck authoring: prs.theme = Theme(palette=…, fonts=…, sizes=…, spacing=…); add_label / add_caption / add_callout_number / add_section_title accept "@token" strings resolved against the attached theme; drift diagnostics catch off-palette / off-theme choices pre-save.
  • A pptx.templates package whose load_theme(name) reads bundled TOML preset files (corporate, architecture, report) and returns a blank 16:9 deck with a purpose-fit Theme already attached, so agents can skip the "default-style dance" on every new deck. Custom brand TOML files are loaded the same way: load_theme("/path/to/brand.toml").

Credit for the foundational library goes to the original author.
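To make the 12-column grid idea concrete, here is a minimal sketch of the arithmetic such a grid performs. It is not the fork's pptx.kit.grid.Grid: the ColumnGrid class, its margin and gutter defaults, and the 0-based column index are all assumptions made for illustration. The only outside facts it relies on are that 1 inch is 914,400 EMU and that a 16:9 slide is 12,192,000 EMU wide.

```python
from dataclasses import dataclass

EMU_PER_INCH = 914_400


@dataclass(frozen=True)
class ColumnGrid:
    """Hypothetical 12-column grid; NOT the fork's pptx.kit.grid.Grid."""

    slide_width: int = 12_192_000   # 16:9 slide width (13.333 in) in EMU
    margin: int = EMU_PER_INCH // 2  # assumed 0.5 in left/right margin
    gutter: int = EMU_PER_INCH // 6  # assumed 1/6 in gap between columns
    columns: int = 12

    def column_width(self) -> float:
        """Width of one column after margins and gutters are subtracted."""
        usable = self.slide_width - 2 * self.margin - (self.columns - 1) * self.gutter
        return usable / self.columns

    def span(self, col: int, span: int) -> tuple[int, int]:
        """Return (left, width) in EMU for `span` columns starting at 0-based `col`."""
        if col < 0 or span < 1 or col + span > self.columns:
            raise ValueError("column span falls outside the grid")
        width_one = self.column_width()
        left = self.margin + col * (width_one + self.gutter)
        width = span * width_one + (span - 1) * self.gutter
        return round(left), round(width)


if __name__ == "__main__":
    grid = ColumnGrid()
    print(grid.span(0, 12))  # full-width region
    print(grid.span(6, 6))   # right half
```

The resulting (left, width) pair can be assigned to any shape's left and width attributes. A span always absorbs the gutters it crosses, which is why two adjacent half-width regions line up with the full-width one.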

Installation

pip install git+https://github.com/loadfix/python-pptx.git

Requires Python 3.9+. The default install includes the full author surface (charts, SmartArt, math equations, modern threaded comments, SVG rasterizer, custom document properties).

Read-only / slim install

For callers that only need to introspect a deck (text extraction, alt-text dumps, pptx.kit.lint, pptx.kit.accessibility, CI gates, agent tools that read but never author), the default install pulls nine sibling shared packages plus their transitive dependencies, not all of which are exercised on the read path. The [read] extra documents the slim runtime list. Because pip cannot subtract packages from a default dependency set, the slim install is a two-step recipe:

# 1. install the slim runtime deps directly
pip install \
  'Pillow>=10.3.0' 'lxml>=4.9.4' 'XlsxWriter>=0.5.7' \
  'typing_extensions>=4.9.0' \
  'python-ooxml-opc @ git+https://github.com/loadfix/python-ooxml-opc@master' \
  'python-ooxml-xmlchemy @ git+https://github.com/loadfix/python-ooxml-xmlchemy@master' \
  'python-ooxml-docprops @ git+https://github.com/loadfix/python-ooxml-docprops@master' \
  'python-ooxml-shared-drawingml @ git+https://github.com/loadfix/python-ooxml-shared-drawingml@master' \
  'python-ooxml-vml @ git+https://github.com/loadfix/python-ooxml-vml@master' \
  'python-ooxml-comments @ git+https://github.com/loadfix/python-ooxml-comments@master'

# 2. install python-pptx without re-resolving its deps
pip install --no-deps git+https://github.com/loadfix/python-pptx.git

This skips python-ooxml-chart, python-ooxml-math, python-ooxml-smartart, cairosvg (and its cairocffi / tinycss2 / cssselect2 / webencodings transitives) — about half the wheels of a default install. Decks containing chart / math / SmartArt content read fine on the slim install for most introspection paths; operations that manipulate those surfaces will raise ImportError with a pointer to the matching extra.
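The ImportError behaviour described above follows a common optional-dependency pattern. The sketch below shows that pattern in general terms; require_extra is a hypothetical helper, not a function exported by this fork, and the error wording is an assumption.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional module, or fail with a hint naming the extra.

    Hypothetical helper for illustration; the fork's own guard may differ.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not installed; this operation needs the "
            f"[{extra}] extra of python-pptx"
        ) from exc


if __name__ == "__main__":
    try:
        # cairosvg is the package behind the [svg] extra.
        require_extra("cairosvg", "svg")
        print("SVG rasterizer available")
    except ImportError as err:
        print(err)
```

Deferring the import to the call site is what lets a slim install open decks that contain charts or SmartArt: the missing package is only touched when an operation actually needs it.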

Per-feature extras

Extra         Adds
[chart]       python-ooxml-chart + XlsxWriter
[math]        python-ooxml-math
[smartart]    python-ooxml-smartart
[comments]    python-ooxml-comments
[docprops]    python-ooxml-docprops
[svg]         cairosvg
[full]        every per-feature extra above

These are useful for explicit version pinning and for slim-install workflows that opt back in to one authoring surface. All of them are already in the default dependencies, so a bare pip install python-pptx (no extras) remains a full install, which preserves backward compatibility (issue #637).

Usage

from pptx import Presentation

prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[5])
slide.shapes.title.text = "It was a dark and stormy night."
prs.save("dark-and-stormy.pptx")

prs = Presentation("dark-and-stormy.pptx")
print(prs.slides[0].shapes.title.text)
# It was a dark and stormy night.

The package is imported as pptx, matching upstream. Existing upstream code runs unchanged against this fork.

API

Full API and user-guide documentation lives under docs/ and builds with Sphinx (theme: Furo).

pip install Sphinx furo
python -m sphinx -b html docs docs/_build/html

A single-page catalogue of every public capability (with fork-era additions marked [Added in <version>.dev0]) lives in FEATURES.md.

Reproducible builds

Presentation.save(path, reproducible=True) produces a byte-identical .pptx for byte-identical inputs across machines and runs. Use it when you need source-control-friendly diffs, fixture regeneration in test suites, or content-addressable artefact pipelines:

prs = Presentation()
prs.slides.add_slide(prs.slide_layouts[1])
prs.save("deck.pptx", reproducible=True)

The flag stamps every zip member with the fixed 1980-01-01 timestamp, emits members in sorted order, and normalises external file attributes. Together these cover the three sources of cross-machine nondeterminism; a plain zip_date_time= keyword addresses only the timestamp. For a custom timestamp without sort or attribute normalisation, pass zip_date_time= directly. The two keywords are orthogonal. The matching reproducible= keyword is also accepted by python-docx, python-xlsx, and python-vsdx, so cross-format build pipelines share the same idiom (issue #150).
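The three normalisations can be illustrated with the standard library alone. This is a sketch of the general technique, not the fork's implementation: write_reproducible_zip, the 0o644 permission value, and the pinned create_system value are assumptions chosen for the example.

```python
import io
import zipfile

# 1980-01-01 is the earliest timestamp the zip format can represent.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def write_reproducible_zip(members: dict[str, bytes]) -> bytes:
    """Serialise `members` so equal inputs give equal bytes (illustrative only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(members):              # 1. sorted member order
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE_TIME)  # 2. fixed timestamp
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3                # pin the "created on" OS field
            info.external_attr = 0o644 << 16      # 3. fixed file attributes
            zf.writestr(info, members[name])
    return buf.getvalue()


if __name__ == "__main__":
    first = write_reproducible_zip({"b.xml": b"<b/>", "a.xml": b"<a/>"})
    second = write_reproducible_zip({"a.xml": b"<a/>", "b.xml": b"<b/>"})
    print(first == second)  # insertion order no longer matters
```

Pinning create_system matters because zipfile otherwise records a value that depends on the platform the archive was written on.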

Round-trip support

Editing real-world .pptx decks is the killer use case for this fork: load a corporate deck with theme + masters + native PowerPoint charts + SmartArt + animations + modern threaded comments, mutate a slide, save, and have nothing else change.

The cross-monorepo round-trip gate lives at tests/round_trip/ and runs as the round-trip-fidelity CI job. The per-feature support matrix (what's "fully preserved" / "preserved with caveats" / "lossy") across all four parent formats lives at docs/round-trip-fidelity.md. The structural diff and round-trip helpers (Slide.diff, Slide.changeset_since, Presentation.save_and_reload, Presentation.save_strict) are documented in FEATURES.md.

Status

Actively maintained on loadfix/python-pptx. This fork is under continuous development: upstream issues from scanny/python-pptx are triaged and addressed here, and the fork ships 340+ audited-issue capabilities layered on top of upstream 1.0.2. See FEATURES.md for the single-page capability catalogue (fork-era additions are marked [Added in <version>.dev0]) and HISTORY.rst for the user-visible changelog. The remaining upstream issues that do not represent a code gap in this fork (usage questions, out-of-scope requests, environment bugs, corrupt source files, duplicates) are catalogued in docs/community/issue-triage.rst with a one-line disposition each.

Unstable. Not yet published to PyPI — install from source only. Current version: 2026.05.3; versioning is CalVer (YYYY.MM.patch). The public API follows the upstream pptx surface, so upstream code runs unchanged; fork-era additions may still shift before a tagged release.
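Because the month is zero-padded and the patch number is not, CalVer strings of this shape should be compared as integer tuples rather than as text. A minimal sketch, assuming the plain three-part YYYY.MM.patch form only; parse_calver is a hypothetical helper and does not handle suffixes such as .dev0.

```python
def parse_calver(version: str) -> tuple[int, int, int]:
    """Parse 'YYYY.MM.patch' into a (year, month, patch) tuple of ints."""
    year, month, patch = (int(part) for part in version.split("."))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {version!r}")
    return year, month, patch


if __name__ == "__main__":
    print(parse_calver("2026.05.3"))
    # Tuple comparison orders patch 10 after patch 3; string comparison would not.
    print(parse_calver("2026.05.10") > parse_calver("2026.05.3"))
```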

Contributing

Issues and pull requests are tracked at https://github.com/loadfix/python-pptx/issues. Please file issues against this fork; upstream's tracker is for upstream-shared concerns only.

For a monorepo checkout (siblings python-ooxml-xmlchemy, python-ooxml-opc, python-ooxml-docprops, python-ooxml-crypto next to this repo), run make dev-install to editable-install all sibling OOXML packages alongside this one — the released wheels are not yet on PyPI.

Run the test suite with pytest (unit) and make accept (behave acceptance tests) before opening a PR. See CLAUDE.md for the broader development workflow.

License

MIT. See LICENSE. Inherited from upstream scanny/python-pptx.

Related projects

Part of a family of document-rendering libraries:

  • docxjs — browser-side DOCX → HTML renderer (TypeScript)
  • pptxjs — browser-side PPTX → HTML renderer (TypeScript)
  • xlsxjs — browser-side XLSX → HTML renderer (TypeScript)
  • python-docx — Python DOCX parser/generator
  • python-xlsx — Python XLSX parser/generator
  • ooxml-validate — Python/.NET OOXML validator (wraps Microsoft Open XML SDK + LibreOffice)
