Apache POI - Encryption support

Overview

Apache POI supports reading a few variants of encrypted office files:

  • Binary formats (.xls, .ppt, .doc, ...)
    Encryption is format-dependent and has to be implemented separately for each format.
    Use Biff8EncryptionKey.setCurrentUserPassword(String password) to specify the decryption password before opening the file or (where applicable) before saving. Setting a null password before saving removes the password protection.
    The password is set in a thread local variable. Do not forget to reset it to null after text extraction.
  • XML-based formats (.xlsx, .pptx, .docx, ...)
    These use the same encryption logic across all formats. When encrypted, the zipped files are stored within an OLE file in the EncryptedPackage stream.
    If you plan to use POI to generate encrypted documents, do not use anything weaker than agile encryption, because RC4 is not secure. You also need to make sure that your clients can read the documents; the various free Excel, PowerPoint and Word viewers have limitations in the cipher or hashing parameters they support.
    If you want to use high encryption parameters, you need to install the "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files" for your JRE version (Oracle JDK6, JDK7, JDK8).
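
Whether the policy files mentioned above are needed can be checked at runtime. The following is a minimal sketch that uses only the JDK's javax.crypto.Cipher.getMaxAllowedKeyLength; the class and method names (JcePolicyCheck, supportsAesKeySize) are made up for this illustration and are not part of POI. With a restricted policy the JDK reports 128 bits for AES; recent JDKs usually ship with the unlimited policy already enabled.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

final class JcePolicyCheck {

    /** Returns true if the running JRE permits AES keys of at least keyBits bits. */
    static boolean supportsAesKeySize(int keyBits) {
        try {
            // Restricted policy files cap AES at 128 bits;
            // an unlimited policy reports Integer.MAX_VALUE.
            return Cipher.getMaxAllowedKeyLength("AES") >= keyBits;
        } catch (NoSuchAlgorithmException e) {
            // No provider knows AES at all, so no key size is usable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("AES-192 usable: " + supportsAesKeySize(192));
        System.out.println("AES-256 usable: " + supportsAesKeySize(256));
    }
}
```

If supportsAesKeySize(256) returns false, stay with 128-bit cipher settings or install the policy files before asking for stronger parameters.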

Some "write-protected" files are encrypted with the built-in password "VelvetSweatshop". POI can read those files too.

Supported feature matrix

Feature                                          HSSF  HSLF  HWPF  XSSF  XSLF  XWPF
XOR obfuscation                                  Read  N/A   No    N/A   N/A   N/A
40-bit RC4 encryption                            Read  N/A   No    N/A   N/A   N/A
Office Binary Document RC4 CryptoAPI Encryption  No    Yes   No    N/A   N/A   N/A
Office Binary Document RC4 Encryption *)         N/A   N/A   N/A   Yes   Yes   Yes
ECMA-376 Standard Encryption                     N/A   N/A   N/A   Yes   Yes   Yes
ECMA-376 Agile Encryption                        N/A   N/A   N/A   Yes   Yes   Yes
ECMA-376 XML Signature                           N/A   N/A   N/A   Yes   Yes   Yes

*) The MS-OFFCRYPTO documentation only mentions RC4 (without CryptoAPI) encryption as an "in place" encryption, but apparently there is also a container-based method that uses the same key generation logic.

Binary formats

As mentioned above, use Biff8EncryptionKey.setCurrentUserPassword(String password) to specify the password.

// XOR/RC4 decryption for xls
Biff8EncryptionKey.setCurrentUserPassword("pass");
NPOIFSFileSystem fs = new NPOIFSFileSystem(new File("file.xls"), true);
HSSFWorkbook hwb = new HSSFWorkbook(fs.getRoot(), true);
Biff8EncryptionKey.setCurrentUserPassword(null);
        
// RC4 CryptoApi support ppt - decryption
Biff8EncryptionKey.setCurrentUserPassword("pass");
NPOIFSFileSystem fs = new NPOIFSFileSystem(new File("file.ppt"), true);
HSLFSlideShow hss = new HSLFSlideShow(fs);
...
// Option 1: remove password
Biff8EncryptionKey.setCurrentUserPassword(null);
OutputStream os = new FileOutputStream("decrypted.ppt");
hss.write(os);
os.close();
...
// Option 2: change encryption settings (experimental)
// need to cache data (i.e. read all data) before changing the key size
PictureData picsExpected[] = hss.getPictures();
hss.getDocumentSummaryInformation();
EncryptionInfo ei = hss.getDocumentEncryptionAtom().getEncryptionInfo();
((CryptoAPIEncryptionHeader)ei.getHeader()).setKeySize(0x78);
OutputStream os = new FileOutputStream("file_120bit.ppt");
hss.write(os);
os.close();
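
Because the password lives in a thread-local variable, a value that is not cleared stays visible to whatever runs next on the same thread, which matters with thread pools. The sketch below illustrates the set / try / finally-reset pattern with a plain java.lang.ThreadLocal as a stand-in; PASSWORD, withPassword and passwordSeenByNextTask are hypothetical names for this illustration and not POI API. With POI, the reset in the finally block would be the Biff8EncryptionKey.setCurrentUserPassword(null) call shown above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class ThreadLocalResetDemo {

    // Hypothetical stand-in for the thread-local slot that holds the password.
    private static final ThreadLocal<String> PASSWORD = new ThreadLocal<>();

    /** Sets the password, runs the task, and always clears the slot afterwards. */
    static <T> T withPassword(String password, Supplier<T> task) {
        PASSWORD.set(password);
        try {
            return task.get();
        } finally {
            // With POI: Biff8EncryptionKey.setCurrentUserPassword(null);
            PASSWORD.remove();
        }
    }

    /** Runs two jobs on the same pooled thread and returns what the second job sees. */
    static String passwordSeenByNextTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> first = () -> withPassword("pass", PASSWORD::get);
            Callable<String> second = PASSWORD::get;
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Here passwordSeenByNextTask() returns null, because the finally block cleared the slot before the thread was reused. Without the reset, the second job would still see "pass".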
        

XML-based formats - Decryption

Encrypted XML-based formats are stored in the "EncryptedPackage" stream of an OLE container. Use org.apache.poi.poifs.crypt.Decryptor to decrypt the file:

EncryptionInfo info = new EncryptionInfo(filesystem);
Decryptor d = Decryptor.getInstance(info);

try {
    if (!d.verifyPassword(password)) {
        throw new RuntimeException("Unable to process: the supplied password is incorrect");
    }

    InputStream dataStream = d.getDataStream(filesystem);

    // parse dataStream

} catch (GeneralSecurityException ex) {
    throw new RuntimeException("Unable to process encrypted document", ex);
}
	

If you want to read a file encrypted with the built-in password, use Decryptor.DEFAULT_PASSWORD.
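
Since an encrypted XML-based document is an OLE container rather than a plain ZIP archive, a look at the first bytes of a file gives a hint whether the Decryptor route is needed. The following is a minimal JDK-only sketch (Java 9+ for readNBytes); ContainerSniffer and its methods are names invented for this illustration. An OLE signature on a .xlsx, .pptx or .docx file usually, though not necessarily, means the file is encrypted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class ContainerSniffer {

    // Signature at the start of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header signature of a ZIP archive ("PK\3\4").
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    /** Classifies the leading bytes of a file as "OLE2", "ZIP" or "UNKNOWN". */
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return "OLE2";
        if (startsWith(head, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    /** Reads up to eight bytes from the file and classifies them. */
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[OLE2_MAGIC.length];
            int read = in.readNBytes(head, 0, head.length);
            return sniff(Arrays.copyOf(head, read));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A result of "ZIP" means the package can be opened directly, while "OLE2" suggests opening it as a POIFS filesystem and checking for encryption first.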

XML-based formats - Encryption

Encrypting a file is similar to the decryption process above. You need to choose between binaryRC4, standard and agile encryption; the cryptoAPI mode is used internally, and its direct use would result in an incomplete file. Apart from the EncryptionMode, the EncryptionInfo class provides further parameters to specify the cipher and hashing algorithm to be used.

POIFSFileSystem fs = new POIFSFileSystem();
EncryptionInfo info = new EncryptionInfo(EncryptionMode.agile);
// EncryptionInfo info = new EncryptionInfo(EncryptionMode.agile, CipherAlgorithm.aes192, HashAlgorithm.sha384, -1, -1, null);

Encryptor enc = info.getEncryptor();
enc.confirmPassword("foobaa");

OPCPackage opc = OPCPackage.open(new File("..."), PackageAccess.READ_WRITE);
OutputStream os = enc.getDataStream(fs);
opc.save(os);
opc.close();

FileOutputStream fos = new FileOutputStream("...");
fs.writeFilesystem(fos);
fos.close();     
     

XML-based formats - Signing (XML Signature)

An Office document can be digitally signed with an XML Signature to protect it from unauthorized modifications, i.e. modifications made without having the original certificate. The current implementation is based on the eID Applet, which is dual-licensed to ASF/POI. Instead of using the internal JDK API, this version is based on Apache Santuario.

The classes have been tested against the following libraries, which need to be included additionally to the default dependencies:

  • BouncyCastle bcpkix and bcprov (tested against 1.51)
  • Apache Santuario "xmlsec" (tested against 2.0.1)
  • and slf4j-api (tested against 1.7.7)

Depending on the configuration and the activated facets, various XAdES levels are supported. Support for higher levels (XAdES-T+) depends on supporting services, and although the code has been adopted, the integration is not well tested. Please support us with integration testing against timestamp and revocation (OCSP) services.

Further test examples can be found in the corresponding test class.

Validating a signed office document

OPCPackage pkg = OPCPackage.open(..., PackageAccess.READ);
SignatureConfig sic = new SignatureConfig();
sic.setOpcPackage(pkg);
SignatureInfo si = new SignatureInfo();
si.setSignatureConfig(sic);
boolean isValid = si.verifySignature();
...
     

Signing an office document

// loading the keystore - pkcs12 is used here, but of course jks & co are also valid
// the keystore needs to contain a private key and its certificate having a
// 'digitalSignature' key usage
char password[] = "test".toCharArray();
File file = new File("test.pfx");
KeyStore keystore = KeyStore.getInstance("PKCS12");
FileInputStream fis = new FileInputStream(file);
keystore.load(fis, password);
fis.close();

// extracting private key and certificate
String alias = "xyz"; // alias of the keystore entry
Key key = keystore.getKey(alias, password);
X509Certificate x509 = (X509Certificate)keystore.getCertificate(alias);

// filling the SignatureConfig entries (minimum fields, more options are available ...)
SignatureConfig signatureConfig = new SignatureConfig();
signatureConfig.setKey((PrivateKey) key); // the key loaded from the keystore above
signatureConfig.setSigningCertificateChain(Collections.singletonList(x509));
OPCPackage pkg = OPCPackage.open(..., PackageAccess.READ_WRITE);
signatureConfig.setOpcPackage(pkg);

// adding the signature document to the package
SignatureInfo si = new SignatureInfo();
si.setSignatureConfig(signatureConfig);
si.confirmSignature();
// optionally verify the generated signature
boolean b = si.verifySignature();
assert (b);
// write the changes back to disc
pkg.close();
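
The signing example requires a certificate with the 'digitalSignature' key usage. A quick pre-check can be sketched with the JDK's X509Certificate.getKeyUsage(), whose first array element represents digitalSignature. KeyUsageCheck and hasDigitalSignature are hypothetical helper names, and treating a certificate without a KeyUsage extension (getKeyUsage() returning null) as unsuitable is an assumption of this sketch, not documented POI behaviour.

```java
import java.security.cert.X509Certificate;

final class KeyUsageCheck {

    // Index of the digitalSignature bit in the KeyUsage boolean array.
    private static final int DIGITAL_SIGNATURE_BIT = 0;

    /** Returns true only if the KeyUsage bits explicitly include digitalSignature. */
    static boolean hasDigitalSignature(boolean[] keyUsage) {
        // Assumption: a missing KeyUsage extension (null) is rejected.
        return keyUsage != null
            && keyUsage.length > DIGITAL_SIGNATURE_BIT
            && keyUsage[DIGITAL_SIGNATURE_BIT];
    }

    /** Convenience overload for a certificate taken from the keystore. */
    static boolean hasDigitalSignature(X509Certificate certificate) {
        return hasDigitalSignature(certificate.getKeyUsage());
    }
}
```

Calling KeyUsageCheck.hasDigitalSignature(x509) before filling the SignatureConfig reports an unsuitable certificate early instead of producing a signature that consumers may reject.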
     
by Maxim Valyanskiy