Converting .msg to PDF using Python on Windows
I recently needed to convert a bunch of .msg files to PDF. I didn’t want to install any third-party software other than the Python libraries I needed. As the code would run on Windows servers, I thought I’d use the excellent win32com Python package (part of pywin32) and rely only on native Windows applications, here Outlook and Word, for the actual conversion.
I didn’t find a way to convert .msg to PDF in one step, so I solved it in two: first using Outlook to convert the .msg file to a .doc file, and then using Word to convert the .doc file to a .pdf file. In case it may be of use to others, I thought I’d share a code snippet showing how I approached the problem. Here it goes:
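import os.path
import tempfile

from win32com import client

# Outlook and Word resolve relative paths against their own working
# directories, so pass absolute paths to both.
destination_file = os.path.abspath("email.pdf")
source_file = os.path.abspath("email.msg")

# Step 1: open the .msg file in Outlook.
outlook_app = client.Dispatch("Outlook.Application")
outlook_instance = outlook_app.GetNamespace("MAPI")
msg = outlook_instance.OpenSharedItem(source_file)

with tempfile.TemporaryDirectory() as tmp_dir_name:
    tmp_word_file_path = os.path.join(tmp_dir_name, "dummy.doc")
    # 4 is olDoc (Word document).
    # See https://docs.microsoft.com/en-us/dotnet/api/microsoft.office.interop.outlook.olsaveastype?view=outlook-pia
    msg.SaveAs(tmp_word_file_path, 4)
    outlook_app.Quit()

    # Step 2: let Word save the .doc file as a PDF. This has to happen inside
    # the "with" block, before the temporary directory is deleted.
    pdf_file_format_code = 17  # wdFormatPDF, see https://stackoverflow.com/a/6018039
    doc = None
    word = None
    try:
        word = client.Dispatch("Word.Application")
        doc = word.Documents.Open(tmp_word_file_path)
        doc.SaveAs(destination_file, FileFormat=pdf_file_format_code)
    except Exception as error:
        raise RuntimeError(f"Failed to convert {source_file} to PDF. Error: {error}") from error
    finally:
        # Only close what was actually opened; calling Close() on None would
        # raise an AttributeError that hides the original error.
        if doc is not None:
            doc.Close()
        if word is not None:
            word.Quit()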
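For a whole folder of .msg files, the same two steps could be wrapped in functions and called in a loop. What follows is only a sketch under assumptions, not a tested production script: the names plan_conversions, convert_msg_to_pdf and convert_folder are made up for illustration, and it assumes all .msg files sit directly in one folder and that the output folder already exists. It uses the same win32com calls as the snippet above.

```python
from pathlib import Path
import tempfile

OL_DOC = 4          # Outlook OlSaveAsType.olDoc
WD_FORMAT_PDF = 17  # Word WdSaveFormat.wdFormatPDF


def plan_conversions(source_dir, dest_dir):
    """Pair every .msg file in source_dir with an absolute .pdf target path."""
    source_dir = Path(source_dir).resolve()
    dest_dir = Path(dest_dir).resolve()
    return [
        (msg_path, dest_dir / (msg_path.stem + ".pdf"))
        for msg_path in sorted(source_dir.glob("*.msg"))
    ]


def convert_msg_to_pdf(msg_path, pdf_path, outlook_namespace, word):
    """Convert one .msg file using Outlook and Word COM objects passed in."""
    with tempfile.TemporaryDirectory() as tmp_dir_name:
        tmp_doc_path = str(Path(tmp_dir_name) / "dummy.doc")
        msg = outlook_namespace.OpenSharedItem(str(msg_path))
        msg.SaveAs(tmp_doc_path, OL_DOC)
        doc = word.Documents.Open(tmp_doc_path)
        try:
            doc.SaveAs(str(pdf_path), FileFormat=WD_FORMAT_PDF)
        finally:
            # Close the document before the temporary directory is removed.
            doc.Close()


def convert_folder(source_dir, dest_dir):
    """Convert all .msg files in source_dir, reusing one Outlook and one Word."""
    # Imported here so the path helper above can be used without pywin32.
    from win32com import client

    outlook_app = client.Dispatch("Outlook.Application")
    word = client.Dispatch("Word.Application")
    try:
        namespace = outlook_app.GetNamespace("MAPI")
        for msg_path, pdf_path in plan_conversions(source_dir, dest_dir):
            convert_msg_to_pdf(msg_path, pdf_path, namespace, word)
    finally:
        word.Quit()
        outlook_app.Quit()
```

A call might look like convert_folder(r"C:\mail\in", r"C:\mail\out"), where both paths are placeholders. Starting Outlook and Word once and reusing them for every file avoids paying the application start-up cost per message.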