Find Maven Information in JAR Files

By | May 19, 2014

In this post I will let the code do most of the talking: it is a Groovy script that extracts information from the Maven pom.xml files embedded in JAR files.
I wrote it because I wanted a list of the third-party libraries, and their versions, that ship with a Mule server. Since this list needs to be updated for each new version of Mule, I wanted to automate the process at least to some extent.
The script is of course applicable to other scenarios as well.

package se.ivankrizsan.groovy

import java.util.jar.*

/**
 * This Groovy script finds information from Maven pom-files in
 * JAR files.
 * Given the path to a root directory, the directory and all sub-directories
 * are searched for JAR files.
 * Each JAR file found is then searched for Maven pom-files.
 * Information on version and id of the artifact and the parent artifact,
 * if any, is written to a file.
 *
 * @author Ivan Krizsan
 */
def theLibPath = "/Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4"
def theOutputPath = "library-version.txt"

/* Output file is overwritten with each run. */
def theOutputFile = new File(theOutputPath)
theOutputFile.delete()
theOutputFile.createNewFile()

def theJarFiles = findJarFiles(theLibPath)

theJarFiles.each { theJarFile ->
    println "Processing JAR file $theJarFile..."
    def thePomFileContents = extractPomContents(theJarFile)
    if (thePomFileContents) {
        theOutputFile.append("--- From JAR file: $theJarFile\n")
        processMavenPomFileContents(thePomFileContents, theOutputFile)
    } else {
        println "   No Maven pom.xml found in JAR file"
    }
}

println "Done!"

/**
 * Finds JAR files in the directory at supplied path or in sub-directories.
 *
 * @param inJarFilesRootPath Path to root of directory hierarchy to search
 * for JAR files.
 * @return List of {@code File} objects referring to JAR files.
 */
def List findJarFiles(final String inJarFilesRootPath) {
    def theJarDir = new File(inJarFilesRootPath)
    def theJarFileList = []

    theJarDir.eachFileRecurse { theFileDir ->
        if (theFileDir.isFile() && theFileDir.getName().endsWith(".jar")) {
            theJarFileList << theFileDir
        }
    }
    theJarFileList
}

/**
 * Extracts the contents of the first found Maven pom.xml file in the
 * JAR file specified by supplied {@code File} object.
 *
 * @param inJarFile Specifying which JAR file to search in.
 * @return Contents of first pom.xml file in the JAR file, or null if
 * no pom.xml file found.
 */
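def extractPomContents(final File inJarFile) {
    def thePomFileName = "pom.xml"
    def theJarFile = new JarFile(inJarFile)

    try {
        /* Compare the file name exactly, so that e.g. "my-pom.xml.bak" is skipped. */
        def thePomEntry = theJarFile.entries().toList().find { theJarFileEntry ->
            theJarFileEntry.name == thePomFileName ||
                theJarFileEntry.name.endsWith("/" + thePomFileName)
        }
        /* Null is returned if the JAR file contains no pom.xml file. */
        return thePomEntry ? theJarFile.getInputStream(thePomEntry).text : null
    } finally {
        /* Always release the file handle held by the JarFile. */
        theJarFile.close()
    }
}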

/**
 * Processes the supplied Maven pom-file contents, appending extracted
 * information to the supplied output file.
 *
 * @param inPomFileContents Maven pom-file contents.
 * @param inOutputFile Output file to append information to.
 */
def processMavenPomFileContents(final String inPomFileContents, final File inOutputFile) {
    def thePomXml = new XmlSlurper().parseText(inPomFileContents)

    def theParentArtifactId = thePomXml.parent.artifactId.text()
    def theParentGroupId = thePomXml.parent.groupId.text()
    def theParentVersion = thePomXml.parent.version.text()

    inOutputFile.append("Parent group id: $theParentGroupId\n")
    inOutputFile.append("Parent artifact id: $theParentArtifactId\n")
    inOutputFile.append("Parent version: $theParentVersion\n\n")

    def theArtifactId = thePomXml.artifactId.text()
    def theGroupId = thePomXml.groupId.text()
    def theVersion = thePomXml.version.text()

    inOutputFile.append("Library group id: $theGroupId\n")
    inOutputFile.append("Library artifact id: $theArtifactId\n")
    inOutputFile.append("Library version: $theVersion\n\n")
}
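
The pom.xml lookup can also be tried out in isolation. What follows is a minimal sketch, not part of the script above. It assumes the usual Maven packaging layout, in which the POM is stored as META-INF/maven/<groupId>/<artifactId>/pom.xml inside the JAR file. The helper name findPomEntryNames and the com.example coordinates are made up for illustration.

```groovy
import java.util.jar.JarEntry
import java.util.jar.JarFile
import java.util.jar.JarOutputStream

/* Hypothetical helper: lists the names of all Maven pom.xml entries in a JAR file. */
def findPomEntryNames(final File inJarFile) {
    def theJarFile = new JarFile(inJarFile)
    try {
        return theJarFile.entries().toList().findAll { theEntry ->
            theEntry.name.startsWith("META-INF/maven/") &&
                theEntry.name.endsWith("/pom.xml")
        }*.name
    } finally {
        theJarFile.close()
    }
}

/* Build a small throwaway JAR file so that the sketch is self-contained. */
def theTempJar = File.createTempFile("demo", ".jar")
theTempJar.deleteOnExit()
theTempJar.withOutputStream { theStream ->
    def theJarOut = new JarOutputStream(theStream)
    /* An entry in the assumed Maven location. */
    theJarOut.putNextEntry(new JarEntry("META-INF/maven/com.example/demo/pom.xml"))
    theJarOut.write("<project><artifactId>demo</artifactId></project>".getBytes("UTF-8"))
    theJarOut.closeEntry()
    /* An entry that merely contains "pom.xml" in its name and must be ignored. */
    theJarOut.putNextEntry(new JarEntry("com/example/not-a-pom.xml.txt"))
    theJarOut.write("ignored".getBytes("UTF-8"))
    theJarOut.closeEntry()
    /* Write the archive directory; withOutputStream closes the stream itself. */
    theJarOut.finish()
}

findPomEntryNames(theTempJar).each { println it }
```

Only the entry under META-INF/maven is reported. A JAR file that was not built with Maven, or that was repackaged, may not follow this layout, in which case the list is simply empty.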

 

When I run the script, pointing it at the Mule 3.5.0-M4 lib directory, part of the console output looks like this:

Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/boot/wrapper-3.2.3.jar...
 No Maven pom.xml found in JAR file
Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/endorsed/xalan-2.7.1.jar...
 No Maven pom.xml found in JAR file
Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/endorsed/xercesImpl-2.9.1.jar...
 No Maven pom.xml found in JAR file
Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/endorsed/xml-apis-1.3.04.jar...
 No Maven pom.xml found in JAR file
Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/endorsed/xml-serializer-2.7.1.jar...
 No Maven pom.xml found in JAR file
Processing JAR file /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/mule/mule-common-3.5.0-M4.jar...
...
Done!

The console output is just a progress indicator – the interesting output is written to a file where each JAR file is represented by an entry in this format:

--- From JAR file: /Volumes/BigHD/DEVELOPMENT/mule-standalone-3.5.0-M4/lib/opt/wss4j-1.6.9.jar
Parent group id: org.apache
Parent artifact id: apache
Parent version: 11

Library group id: org.apache.ws.security
Library artifact id: wss4j
Library version: 1.6.9

In the above example, we can see that the wss4j-1.6.9 JAR file contains a Maven pom.xml file which specifies a parent with the group id org.apache, the artifact id apache and the version 11. The library group id is org.apache.ws.security, the library artifact id is wss4j and, finally, the library version is 1.6.9.
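
Not every pom.xml states all three library values itself: Maven lets a module inherit its group id and version from the parent, in which case the corresponding lines in the output file come out empty. The following is a minimal sketch of one way to fall back to the parent's values. It is not part of the script above, the sample POM is made up for illustration, and the import assumes Groovy 3 or later (on Groovy 2.x the class is groovy.util.XmlSlurper and needs no import).

```groovy
import groovy.xml.XmlSlurper

/*
 * A made-up POM: it declares only an artifact id of its own and
 * relies on the parent for group id and version.
 */
def thePomText = '''<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>demo-parent</artifactId>
    <version>1.2.3</version>
  </parent>
  <artifactId>demo-module</artifactId>
</project>'''

def thePomXml = new XmlSlurper().parseText(thePomText)

/*
 * text() returns an empty string for a missing element, and an empty
 * string is false in Groovy, so ?: falls through to the parent's value.
 */
def theGroupId = thePomXml.groupId.text() ?: thePomXml.parent.groupId.text()
def theVersion = thePomXml.version.text() ?: thePomXml.parent.version.text()
def theArtifactId = thePomXml.artifactId.text()

println "$theGroupId:$theArtifactId:$theVersion"
// prints "com.example:demo-module:1.2.3"
```

The same two fallback expressions could replace the plain text() calls for the library group id and version in processMavenPomFileContents, should the empty values be a problem.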
