Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[1.7.0] - 2026-03-16
Added
- Persistent Cache System: New file-based caching to speed up repeated exports
- --cache flag to enable caching of API responses and SEO crawl results
- --cache-ttl to set cache expiration (default: 24h, use 0 for unlimited)
- --cache-dir to specify custom cache directory (default: ~/.wpexporter/cache)
- --cache-clear to clear cache before export
- Caches all WordPress REST API calls: posts, pages, media, categories, tags, users
- Caches SEO crawl results (assisted-crawl, crawl-content)
- Site-isolated caching using URL hash (different sites don’t share cache)
- Significant performance improvement for repeated exports (media list from ~30-60s to <1s)
- Environment variables: WPEXPORT_CACHE, WPEXPORT_CACHE_TTL, WPEXPORT_CACHE_DIR, WPEXPORT_CACHE_CLEAR
- Preserve HTML Elements: New flags to keep specific elements intact during HTML processing
- --preserve-classes - preserve elements by CSS class (comma-separated)
- --preserve-ids - preserve elements by ID (comma-separated)
- Wildcard support: klaviyo-form-* matches klaviyo-form-XL7uTf
- Works with both --flat-html and --basic-html options
- Useful for newsletter forms (Klaviyo, Mailchimp), embedded widgets, custom elements
- Environment variables: WPEXPORT_PRESERVE_CLASSES, WPEXPORT_PRESERVE_IDS
- Configurable via config file: preserve_classes, preserve_ids
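
The wildcard rule for --preserve-classes can be illustrated with a minimal Go sketch. It assumes glob-style matching via the standard library's path.Match; the tool's actual matcher is not described here, and parsePatterns and shouldPreserve are hypothetical helper names.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// parsePatterns splits a comma-separated flag value into trimmed,
// non-empty patterns. (Hypothetical helper, not the tool's API.)
func parsePatterns(flagValue string) []string {
	var patterns []string
	for _, p := range strings.Split(flagValue, ",") {
		if p = strings.TrimSpace(p); p != "" {
			patterns = append(patterns, p)
		}
	}
	return patterns
}

// shouldPreserve reports whether any class in an element's class attribute
// matches any pattern. Assumption: "*" behaves like a shell glob.
func shouldPreserve(classAttr string, patterns []string) bool {
	for _, class := range strings.Fields(classAttr) {
		for _, pattern := range patterns {
			if ok, err := path.Match(pattern, class); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns := parsePatterns("klaviyo-form-*, newsletter")
	fmt.Println(shouldPreserve("klaviyo-form-XL7uTf wide", patterns)) // true
	fmt.Println(shouldPreserve("brxe-heading", patterns))             // false
}
```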
[1.6.1] - 2026-03-11
Added
- Basic HTML Sanitization: New --basic-html flag to clean HTML to basic elements for Shopify/ecommerce
- Preserves: tables, lists, links, images, headers, paragraphs, basic formatting
- Removes: Bricks Builder divs, Elementor widgets, custom classes, style/script tags
- Strips dangerous attributes (onclick, style, data-*) while keeping safe ones (href, src, alt)
- Mutually exclusive with --flat-html (use one or the other)
- Keep Original URLs: New --keep-original-urls flag to preserve WordPress URLs in content
- Prevents conversion of media URLs to local paths (e.g., media/images/...)
- Useful when exporting markdown for import to Shopify or other cloud platforms
- Works with all formats but most useful with --format markdown
Fixed
- Missing Featured Images: Featured images are now properly fetched even when not returned by paginated media API
- WordPress REST API /media endpoint may not return all media items (WPML language contexts, restricted access)
- Featured image IDs from posts/pages are now fetched individually using GetMediaByID() when missing
- Ensures featured_image URL is populated in frontmatter for all posts with featured images
- Relevant Media Over-Matching: Fixed --relevant-media-only downloading too many files
- Changed from filename-only matching to path-suffix matching (e.g., 2024/01/document.pdf)
- Prevents false positives when multiple files have the same name in different directories
- Shopify Media URLs: Fixed images not displaying in Shopify export
- Media paths are now preserved as original WordPress URLs for Shopify format
- Only json and markdown formats convert URLs to local paths
- Shopify/Magento/other cloud platforms need full URLs, not relative paths
[1.6.0] - 2026-02-11
Changed
- Markdown Excerpt Handling: Excerpt moved from body section to frontmatter metadata field
- Excerpt is now included as excerpt: "..." in YAML frontmatter
- Removed separate ## Excerpt section from content body
- Content follows directly after frontmatter without section headers
- Improves parser compatibility for posts with or without content sections
Added
- Media Subfolder Organization: Downloaded media now organized into type-based subfolders
- media/images/ - All image files (jpg, png, gif, webp, svg, etc.)
- media/videos/ - Video files (mp4, avi, webm, mkv, mov, etc.)
- media/audio/ - Audio files (mp3, wav, ogg, flac, aac, etc.)
- media/documents/ - Documents (pdf, docx, xlsx, pptx, odt, etc.)
- media/archives/ - Archives (zip, rar, 7z, tar, gz, etc.)
- media/code/ - Code files (html, css, js, json, xml)
- media/other/ - Unrecognized file types
- Content paths automatically updated to reference subfolder locations
- Exclude Media Types Option: New --exclude-media-types flag to skip specific media types from download
- By category: --exclude-media-types 'documents,videos,archives'
- By extension: --exclude-media-types 'pdf,gif,html'
- By MIME type: --exclude-media-types 'image,video'
- Configurable via config file: exclude_media_types: ["documents", "pdf"]
- Extended Media MIME Type Support: Added support for 80+ file types in media downloads
- Documents: docx, doc, xlsx, xls, pptx, ppt, odt, ods, odp, epub
- Archives: rar, 7z, tar, gz, bz2, xz
- Video: mkv, m4v, 3gp, 3g2, ogv, mpeg
- Audio: flac, aac, m4a, weba, wma, midi, aiff
- Images: ico, avif, heic, heif
- Code: html, css, js, json, xml, csv, md
- Exclude SEO Tags Option: New --exclude-tags flag to skip specific meta tags during extraction
- Usage: --exclude-tags 'meta:description,og:title'
- Supported tags: title, meta:description, meta:keywords, og:title, og:description, og:image, canonical, lang, hreflangs
- Configurable via config file: exclude_tags: ["meta:description", "og:title"]
- Duplicate Meta Tag Detection: SEO crawler now detects and reports duplicate meta tags
- Logs warning when duplicate tags found: Detected duplicate tags on URL: tag1, tag2
- Uses first occurrence value (standard SEO behavior)
- Detects duplicates for: title, meta:description, meta:keywords, og:title, og:description, og:image, canonical
- Export Size Reporting: Export summary now displays file sizes
- Total export directory size
- Media folder size (when media is downloaded)
- ZIP archive size (when using --zip)
- Human-readable format (B, KB, MB, GB)
- Shopify Export Metadata: Shopify export now includes post metadata in the HTML body content
- ID, slug, date, modified, status, type
- Link to original URL
- Author name
- Featured media ID
- Categories and tags
- Hreflang alternate links (when using --assisted-crawl)
- Styled metadata section matching Markdown frontmatter fields
- Hreflang Extraction: --assisted-crawl now extracts hreflang alternate links
- Captures language codes and URLs from <link rel="alternate" hreflang="..."> tags
- Included in both Markdown frontmatter and Shopify HTML metadata
- Language Extraction: --assisted-crawl now extracts content language
- Captures language from <html lang="..."> or <meta http-equiv="Content-Language">
- Included in both Markdown frontmatter (lang: "en-gb") and Shopify HTML metadata
- Human-Readable Names in Frontmatter: Markdown export now includes human-readable names
- categories: ["Category Name", "Another"] with category_ids: [152, 156] as fallback
- tags: ["Tag Name"] with tag_ids: [42] as fallback
- author: "Author Name" with author_id: 5 as fallback
- Names resolved from WordPress categories, tags, and users data
- Use --no-ids to exclude numeric ID fields (keep only human-readable names)
- Featured Image in Frontmatter: Markdown export now includes featured image URL
- featured_image: "https://example.com/image.jpg" with URL from Media Library
- featured_image_id: 100 for numeric ID (unless --no-ids is used)
- Previously only exported featured_media: 100 (numeric ID only)
- Consistent with author/author_id and categories/category_ids pattern
- Relevant Media Link Extraction: --relevant-media-only now extracts linked documents and videos
- Scans <a href> links in content for media files (PDF, DOCX, MP4, ZIP, etc.)
- Previously only extracted <img src> images from content
- Supported file types: documents (pdf, docx, xlsx, pptx, odt, epub), videos (mp4, webm, avi, mkv), audio (mp3, wav, flac), archives (zip, rar, 7z), and images
- Path-based matching: handles CDN URLs by comparing path suffix after uploads/ (e.g., 2024/01/file.pdf)
- More precise than filename-only matching - avoids downloading unrelated files with same filename
- Query string tolerance: file.pdf?v=1.0 matches file.pdf in Media Library
- Ensures all referenced media files are downloaded when using --relevant-media-only
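
The path-based matching and query string tolerance described above can be sketched as follows. This is an illustration only, assuming the comparison key is the URL path after the last uploads/ segment; uploadsSuffix and sameMedia are hypothetical names, and the example hosts are made up.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// uploadsSuffix returns the part of the URL path after the last "uploads/"
// segment. Query strings and fragments are dropped by url.Parse, which is
// what makes "file.pdf?v=1.0" comparable to "file.pdf".
func uploadsSuffix(rawURL string) (string, bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", false
	}
	const marker = "uploads/"
	i := strings.LastIndex(u.Path, marker)
	if i < 0 {
		return "", false
	}
	suffix := u.Path[i+len(marker):]
	return suffix, suffix != ""
}

// sameMedia reports whether a URL found in content and a Media Library URL
// share the same path suffix, even when their hosts differ (e.g. a CDN).
func sameMedia(contentURL, libraryURL string) bool {
	a, okA := uploadsSuffix(contentURL)
	b, okB := uploadsSuffix(libraryURL)
	return okA && okB && a == b
}

func main() {
	library := "https://example.com/wp-content/uploads/2024/01/file.pdf"
	fmt.Println(sameMedia("https://cdn.example.net/wp-content/uploads/2024/01/file.pdf?v=1.0", library)) // true
	fmt.Println(sameMedia("https://example.com/wp-content/uploads/2023/07/file.pdf", library))           // false
}
```

Comparing the full suffix instead of the bare filename is what keeps two different file.pdf uploads from matching each other.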
Fixed
- Relevant Media Filter with FlatHTML: Fixed --relevant-media-only not finding linked documents when used with --flat-html and --crawl-content
- Media filtering now happens BEFORE HTML to Markdown conversion
- Previously, FlatHTML converted <a href="file.pdf"> to [link](file.pdf) before the filter could extract URLs
- Now correctly finds PDFs, videos, and other linked media in crawled Bricks/Elementor content
- Relevant Media Over-Matching: Fixed --relevant-media-only downloading too many files
- Previously matched by filename only (e.g., document.pdf would match ALL files named document.pdf)
- Now matches by path suffix after uploads/ (e.g., 2024/01/document.pdf)
- Significantly reduces ZIP file sizes by avoiding false positive matches
[1.5.0] - 2026-02-05
Added
- MCP Server (wpmcp): New Model Context Protocol server for AI assistant integration
- Enables Claude and other MCP-compatible AI assistants to interact with WordPress sites
- 8 tools: list_formats, get_site_info, list_posts, list_pages, export_site, get_post, list_categories, list_media
- JSON-RPC 2.0 protocol over stdio
- Basic Auth and Bearer token authentication support
- Optimized timeouts and retries for fast AI interactions
- Makefile Updates: Added wpmcp to build, install, and release targets
- Comprehensive MCP Tests: 51 unit tests for protocol, server, and tools
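
For orientation, a client-side request to one of the tools listed above might be framed like this. The sketch assumes the standard MCP tools/call method on top of JSON-RPC 2.0; the struct and function names are illustrative and are not taken from wpmcp, and list_formats is called with no arguments because tool parameters are not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a JSON-RPC 2.0 request envelope.
type request struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// toolCall is the params payload of an MCP "tools/call" request.
type toolCall struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// newToolCall encodes a tools/call request as a single JSON line, the shape
// a client would write to the server's stdin.
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	if args == nil {
		args = map[string]any{}
	}
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCall{Name: tool, Arguments: args},
	})
}

func main() {
	line, err := newToolCall(1, "list_formats", nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(line))
}
```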
[1.4.0] - 2026-02-04
Added
- Quiet Mode: New --quiet / -q flag suppresses all output and returns only an exit code (useful for scripting and automation)
- HTML to Markdown Conversion: New --flat-html flag converts HTML content to clean Markdown format
- Built-in support for standard HTML elements (h1-h6, p, strong, em, a, img, ul, ol, blockquote, code, pre, hr)
- Built-in support for Bricks Builder CSS classes (brxe-heading, brxe-text, brxe-list, brxe-image)
- Custom conversion rules via flat_html_rules in config.yaml for site-specific HTML class mappings
- Skip Tags Export: New --no-tags flag to skip exporting tags
- Page Builder Configuration Examples: Added comprehensive config examples in documentation for:
- Bricks Builder
- Elementor
- Divi Builder
- Oxygen Builder
- GenerateBlocks
- Combined multi-builder configurations
Fixed
- Fixed --crawl-content HTML extraction bug where closing tags were disappearing due to non-greedy regex matching. Implemented balanced tag extraction algorithm that properly handles nested HTML elements.
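
To show why a non-greedy regex loses closing tags on nested markup, and what balanced extraction does differently, here is a minimal depth-counting sketch. It is not the project's implementation: extractBalanced and its helpers are hypothetical, and the scan is case-sensitive and ignores comments and script bodies.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// indexFrom is strings.Index restricted to s[from:], returning an absolute index.
func indexFrom(s, substr string, from int) int {
	i := strings.Index(s[from:], substr)
	if i < 0 {
		return -1
	}
	return from + i
}

// nextOpenTag finds the next "<tag" followed by whitespace, '/' or '>',
// so that "<div" does not match "<divider>".
func nextOpenTag(s, tag string, from int) int {
	needle := "<" + tag
	for {
		i := indexFrom(s, needle, from)
		if i < 0 {
			return -1
		}
		end := i + len(needle)
		if end < len(s) && strings.ContainsRune(" \t\r\n/>", rune(s[end])) {
			return i
		}
		from = end
	}
}

// extractBalanced returns the whole element that starts at the first
// occurrence of openPrefix, counting nested <tag>...</tag> pairs so the
// result ends at the matching closing tag rather than the first one.
func extractBalanced(html, tag, openPrefix string) (string, bool) {
	start := strings.Index(html, openPrefix)
	if start < 0 || !strings.HasPrefix(openPrefix, "<"+tag) {
		return "", false
	}
	closeTag := "</" + tag + ">"
	depth, pos := 0, start
	for {
		open := nextOpenTag(html, tag, pos)
		closeAt := indexFrom(html, closeTag, pos)
		if closeAt < 0 {
			return "", false // unbalanced markup
		}
		if open >= 0 && open < closeAt {
			depth++
			pos = open + len(tag) + 1
			continue
		}
		depth--
		pos = closeAt + len(closeTag)
		if depth == 0 {
			return html[start:pos], true
		}
	}
}

func main() {
	page := `<p>x</p><div id="content"><div>a</div><div>b</div></div><footer></footer>`

	// Non-greedy regex stops at the first closing tag and truncates the element.
	re := regexp.MustCompile(`(?s)<div id="content">.*?</div>`)
	fmt.Println(re.FindString(page)) // <div id="content"><div>a</div>

	full, _ := extractBalanced(page, "div", `<div id="content"`)
	fmt.Println(full) // <div id="content"><div>a</div><div>b</div></div>
}
```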
Changed
- Optimized Crawling: When both --assisted-crawl and --crawl-content are enabled, pages are now fetched only once to extract both SEO metadata and content (previously required two separate HTTP requests per page)
[1.3.8] - 2026-02-04
Added
- Content Crawling: New --crawl-content flag to extract content from pages built with page builders (Bricks, Elementor, etc.) that store content outside the standard WordPress content field
- Skip Empty Content: New --skip-empty-content flag to exclude posts/pages with empty content from export
- Version Flag: Added --version flag to display application version
- Tests for new content crawling and filtering functionality
Changed
- Cleaned up CLI help output - removed verbose examples, added feature summary
- Updated golangci-lint complexity threshold for main export function
[1.3.7] - 2026-02-04
Added
- Output directory permission check before starting export (fails fast if no write permissions)
[1.3.6] - 2026-02-04
Fixed
- --no-media flag now properly skips media fetching from API (previously only skipped downloading)
- Removed redundant “(default 30)” from --timeout flag description
[1.3.5] - 2026-02-02
Added
- 8 New Export Formats for popular CMS and e-commerce platforms:
- Wix: JSON export for Wix blog migration
- Squarespace: WXR-compatible XML for Squarespace import
- Webflow: CSV files for Webflow CMS collections
- Weebly: XML and JSON dual export for Weebly
- PrestaShop: Semicolon-delimited CSV for PrestaShop products
- Ghost: JSON export for Ghost CMS migration
- Strapi: JSON export for Strapi v4 headless CMS with separate collection files
- Contentful: JSON export for Contentful with content types and assets
- Comprehensive test coverage for all new exporters (93.7% for export package)
- Platform links in documentation for all supported export formats
Changed
- Updated --format flag to support 14 formats: json, markdown, shopify, magento, wordpress, drupal, wix, squarespace, webflow, weebly, prestashop, ghost, strapi, contentful
- Updated golangci-lint configuration for better code quality enforcement
- Enhanced documentation with links to all supported platforms
[1.3.4] - 2026-02-02
Added
- WordPress WXR Export Format: New wordpress export format generating WXR (WordPress eXtended RSS) XML files
- Compatible with WordPress import/export system
- Exports posts, pages, media, categories, tags, and authors
- Includes featured images and SEO metadata as post meta
- Full WXR 1.2 specification support
- Drupal Export Format: New drupal export format generating Drupal-compatible JSON files
- Compatible with Drupal’s Migrate module and migrate_source_json plugin
- Exports nodes (articles and pages), taxonomy terms, users, and media
- Generates separate JSON files for each content type for flexible migration
- Supports Drupal 8/9/10 field structure
Changed
- Updated --format flag to support additional formats: wordpress and drupal
[1.3.3] - 2026-02-02
Added
- Skip Users: New --no-users flag to skip exporting users
- Timeout Flag: New --timeout flag to configure HTTP request timeout in seconds (default 30)
Fixed
- Fixed JSON parsing error for users/categories/tags when WordPress returns meta field as object instead of array
- Users fetching is now graceful - errors don’t stop the export, just warn and continue
[1.3.0] - 2026-01-30
Added
- Resume/Checkpoint: New --resume flag to save progress and resume interrupted exports. Checkpoint file (.wpexport_checkpoint.json) is saved after each API page fetch and deleted on successful completion
- Rate Limiting: New --rate-limit flag to add delay between API requests (in milliseconds) to prevent server overload and rate limiting timeouts
- Media Filtering: New --no-media alias for --download-media=false and --relevant-media-only flag to download only featured images and images embedded in content
- URL Path Filtering: New --path-filter flag to filter posts/pages by URL path pattern (e.g., --path-filter=/fr/arts/)
- SEO Metadata Extraction: New --assisted-crawl flag to crawl URLs and extract SEO metadata including:
- Page titles from <title> tags
- Meta descriptions and keywords
- Open Graph tags (og:title, og:description, og:image)
- Canonical URLs
- SEO fields are included in JSON export and Markdown frontmatter when using --assisted-crawl
- Interactive Password Prompt: When --auth-user is provided without --auth-pass, the tool now prompts for password input securely (hidden input)
- New Makefile targets: vet, sec, check, test-coverage
- Restored and updated golangci-lint configuration
- Added test coverage for cmd packages (wpexportjson: 26.1%, wpxmlrpc: 15.7%)
Changed
- Updated README with new CLI options and SEO documentation
- Enhanced test coverage for new features
- Overall test coverage improved to 80.2%
Fixed
- Clarified --download-media flag behavior in documentation
- Added validation for --path-filter to detect when value looks like a flag (prevents accidental --path-filter --zip confusion)
- Fixed Windows build compatibility for term.ReadPassword (syscall.Stdin type conversion)
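
The --path-filter validation mentioned above amounts to rejecting a value that starts with a dash. A minimal sketch, assuming that is the whole check; validatePathFilter is a hypothetical name and the error wording is invented for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// validatePathFilter rejects values that look like another flag, which
// usually means the pattern was forgotten (e.g. "--path-filter --zip").
func validatePathFilter(value string) error {
	if strings.HasPrefix(value, "-") {
		return fmt.Errorf("--path-filter value %q looks like a flag; a URL path pattern such as /fr/arts/ is expected", value)
	}
	return nil
}

func main() {
	fmt.Println(validatePathFilter("/fr/arts/")) // <nil>
	fmt.Println(validatePathFilter("--zip"))     // error: value looks like a flag
}
```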
[1.2.1] - 2026-01-21
Added
- Authentication Support: Export content from password-protected WordPress sites/APIs (Basic Auth and Bearer Token)
- --auth-user / --auth-pass: Credentials for Basic Authentication
- --auth-token: Bearer token for API authentication
- Handles 401 Unauthorized responses gracefully by retrying with provided credentials
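
As a rough picture of how the two schemes differ on the wire, the sketch below attaches either header to a request using Go's net/http. The credentials type, the precedence of token over user/password, and the example URL are assumptions made for illustration, not the tool's documented behavior.

```go
package main

import (
	"fmt"
	"net/http"
)

// credentials mirrors the three flags: --auth-user, --auth-pass, --auth-token.
type credentials struct {
	User, Pass, Token string
}

// applyAuth sets the Authorization header on req. Assumption: a bearer
// token, when present, takes precedence over user/password.
func applyAuth(req *http.Request, c credentials) {
	switch {
	case c.Token != "":
		req.Header.Set("Authorization", "Bearer "+c.Token)
	case c.User != "":
		req.SetBasicAuth(c.User, c.Pass)
	}
}

// authHeaderFor builds a request to a placeholder URL and returns the
// Authorization header that applyAuth produced.
func authHeaderFor(c credentials) (string, error) {
	req, err := http.NewRequest(http.MethodGet, "https://example.com/wp-json/wp/v2/posts", nil)
	if err != nil {
		return "", err
	}
	applyAuth(req, c)
	return req.Header.Get("Authorization"), nil
}

func main() {
	basic, _ := authHeaderFor(credentials{User: "admin", Pass: "secret"})
	bearer, _ := authHeaderFor(credentials{Token: "abc123"})
	fmt.Println(basic)  // Basic YWRtaW46c2VjcmV0
	fmt.Println(bearer) // Bearer abc123
}
```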
[1.2.0] - 2026-01-21
Added
- Shopify CSV Export Format: New shopify export format for migrating WordPress content to Shopify
- Generates Shopify-compatible product CSV files
- Converts WordPress posts/pages to Shopify products
- Maps categories to product types, tags to Shopify tags
- Includes SEO fields (title, description)
- Supports featured images and additional content images
- Creates separate CSV files for posts, pages, and combined products
- Exports site metadata for reference
- Magento 2 CSV Export Format: New magento export format for migrating WordPress content to Magento
- Generates Magento 2-compatible product CSV files (57 columns)
- Converts WordPress posts/pages to simple products
- Maps categories to Magento category paths (Default Category/Name)
- Uses tags as meta keywords for SEO
- Supports featured images and additional content images
- Includes URL key generation for SEO-friendly URLs
- Creates separate CSV files for posts, pages, and combined products
- Exports site metadata for reference
- WooCommerce Support: Automatically detects and exports WooCommerce products (requires wc/v3 API) and maps them to Shopify/Magento product fields including price, stock, and variations
- Content Filtering: New flags to control export scope:
- --no-posts: Skip blog posts
- --no-pages: Skip pages
- --no-products: Skip WooCommerce products
- --zip flag to create ZIP archive of export
- --no-files flag to remove export files after creating ZIP (requires --zip)
- Dual licensing under MIT and BSD 3-Clause (see LICENSE)
[1.0.0] - 2025-12-05
Added
- Initial stable release with Go 1.24
- WordPress REST API client for content discovery
- Brute force content ID enumeration
- JSON and Markdown export formats
- Media download functionality (images and videos)
- CLI interface with Cobra
- Configuration management with Viper
- Progress tracking with progress bars
- Concurrent processing support
- Comprehensive documentation and README
- Makefile with development automation
- Cross-platform build support (Linux, macOS, Windows, FreeBSD)
- GitHub Actions CI/CD pipeline with auto-versioning
- Docker support with multi-arch builds
- XML-RPC export tool (wpxmlrpc)
Fixed
- Fixed "media directory path must be absolute" error
Security
- Fixed G301 security issues: Changed directory permissions from 0755 to 0750 for better security
- Fixed G306 security issues: Changed file permissions from 0644 to 0600 for better security
- Fixed G304 security issue: Added comprehensive path validation to prevent directory traversal attacks
- Added file path sanitization and validation in media downloader
- Enhanced security by ensuring all file operations are contained within designated directories
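
The containment and permission changes above can be pictured with a short sketch. It assumes a filepath.Rel-based check and uses the 0750/0600 modes named in this section; safeJoin and writeContained are hypothetical helpers, not functions from the project.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// safeJoin joins name onto baseDir and rejects any result that would
// resolve outside baseDir (for example via "../" segments).
func safeJoin(baseDir, name string) (string, error) {
	target := filepath.Join(baseDir, name) // Join also cleans the path
	rel, err := filepath.Rel(baseDir, target)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes %q", name, baseDir)
	}
	return target, nil
}

// writeContained writes data under baseDir with restrictive permissions:
// 0750 for directories and 0600 for files.
func writeContained(baseDir, name string, data []byte) error {
	target, err := safeJoin(baseDir, name)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(filepath.Dir(target), 0o750); err != nil {
		return err
	}
	return os.WriteFile(target, data, 0o600)
}

func main() {
	base, err := os.MkdirTemp("", "export-*")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base)

	fmt.Println(writeContained(base, "media/images/logo.png", []byte("x")))   // <nil>
	fmt.Println(writeContained(base, "../../etc/passwd", []byte("x")) != nil) // true
}
```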
[0.1.0] - 2024-01-07
Added
- Initial release
- Basic WordPress content export functionality