Downloading Jenkins Logs

Recently, I ran into a problem with one of the integration tests run by Jenkins. This particular test was failing intermittently: the Selenium integration timed out because a page was too slow, but it was hard to tell which part of the test was the one failing. I needed some statistics on how the test was running, but the Jenkins setup we have didn’t expose that information. As an alternative to changing the Jenkins configuration, I could analyze the Jenkins test logs. Jenkins provides an RSS feed with information on every run, including a URL to a gzip-ed file containing the logs I needed. In this article, I describe the code I wrote to download these log files for further analysis locally.

First, I needed the simplest library I could find to make a GET request for the RSS feed and another one for the gzip-ed log file. In the past, I have used scalaj-http to consume simple RESTful services from scripts. It is a simple blocking wrapper around the good old Java HttpURLConnection, which is enough for this problem. The following is the build.sbt I created for the project:

	name := "jenkins-rss"

	version := "1.0"

	scalaVersion := "2.11.7"

	libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.4"

	libraryDependencies += "org.scalaj" %% "scalaj-http" % "1.1.5"

	libraryDependencies += "com.github.nscala-time" %% "nscala-time" % "2.0.0"

	libraryDependencies += "com.typesafe" % "config" % "1.3.0"

	mainClass in (Compile, run) := Some("JenkinsReader")


The last line of build.sbt sets the main class used by sbt run. The project is packaged with the sbt-assembly plugin, which needs to be configured by adding the file project/assembly.sbt as described in the plugin documentation.
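As a rough sketch, project/assembly.sbt can be a single line. The plugin version below is my assumption, not necessarily the one used in the original project; pick the release that matches your sbt version.

```scala
// project/assembly.sbt
// The version number is an assumption made for illustration.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
```

With the plugin in place, running sbt assembly builds a single jar that bundles the project with its dependencies.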

To simplify the HTTP communication, I created an http.Utils object containing functions for Basic authentication and GET operations:

	package http

	import scala.util.{Success, Failure, Try}
	import scalaj.http.{HttpResponse, HttpRequest, Base64, Http}

	object Utils {

	  // GET the URL with Basic authentication and return the body as a String
	  def authenticatedGetString(username: String, password: String)(url: String) : Try[String] =
	    (addAuthentication(username, password) _ andThen getString)(Http(url))

	  // GET the URL with Basic authentication and return the body as raw bytes
	  def authenticatedGetBytes(username: String, password: String)(url: String) : Try[Array[Byte]] =
	    (addAuthentication(username, password) _ andThen getBytes)(Http(url))

	  // Add the Basic authentication header to the request
	  def addAuthentication(username: String, password: String)(req: HttpRequest) : HttpRequest =
	    req.header("Authorization", "Basic " + Base64.encodeString(username + ":" + password))

	  def getString = get[HttpResponse[String],String](req => req.asString)(_)
	  def getBytes = get[HttpResponse[Array[Byte]],Array[Byte]](req => req.asBytes)(_)

	  // Run the request; an HTTP error status or a thrown exception becomes a Failure
	  def get[T <: HttpResponse[U], U]( requester: HttpRequest => T )( req: HttpRequest ) : Try[U] =
	    Try( requester( req ) ) match {
	      case Success( res ) => {
	        val body = res.body
	        if ( res.isError ) {
	          Failure( new Exception( "code : " + res.code + ", body : " + body ))
	        } else {
	          Success(body)
	        }
	      }
	      case Failure(e) => Failure(e)
	    }
	}

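The authenticated functions are built by partially applying a curried method and chaining the result with andThen. Here is a minimal sketch of that composition without any network access. Request and send are hypothetical stand-ins for scalaj's HttpRequest and the getString function, and java.util.Base64 replaces the scalaj Base64 helper.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64
import scala.util.{Failure, Success, Try}

object CompositionSketch {
  // Hypothetical stand-in for scalaj's HttpRequest: a URL plus its headers.
  final case class Request(url: String, headers: Map[String, String] = Map.empty)

  // Same shape as addAuthentication: credentials first, request second.
  def addAuthentication(username: String, password: String)(req: Request): Request = {
    val token = Base64.getEncoder.encodeToString(
      s"$username:$password".getBytes(StandardCharsets.UTF_8))
    req.copy(headers = req.headers + ("Authorization" -> s"Basic $token"))
  }

  // Stand-in for getString: fails unless the request carries credentials.
  val send: Request => Try[String] = req =>
    req.headers.get("Authorization") match {
      case Some(value) => Success(s"GET ${req.url} with $value")
      case None        => Failure(new Exception("code : 401, body : unauthorized"))
    }

  // The trailing `_` turns the partially applied method into a function value,
  // so andThen can chain it with send.
  def authenticatedSend(username: String, password: String)(url: String): Try[String] =
    (addAuthentication(username, password) _ andThen send)(Request(url))
}
```

Calling CompositionSketch.authenticatedSend("user", "pass") _ gives a String => Try[String] with the credentials already baked in, which is the same trick the main object relies on.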

Using these functions, I created the main object JenkinsReader:

	import java.io.{BufferedWriter, ByteArrayInputStream, FileWriter}
	import java.nio.file.{Paths, Files}
	import java.util.zip.GZIPInputStream

	import com.typesafe.config.ConfigFactory
	import http.Utils
	import scala.io.Source
	import scala.xml.XML

	object JenkinsReader {

	  def main(args: Array[String]): Unit = {
	    val conf = ConfigFactory.load()
	    val url = conf.getString("jenkins-reader.jenkins.rss-url")
	    val fileUri = conf.getString("jenkins-reader.jenkins.file-uri")
	    val runIdPattern = conf.getString("jenkins-reader.jenkins.run-id-extraction-pattern").r
	    val username = args(0)
	    val password = args(1)
	    val outputDir = args(2)
	    println(s"Jenkins username: $username") // the password is deliberately not printed
	    println(s"Jenkins rss url: $url")
	    println(s"Jenkins log file uri: $fileUri")
	    println(s"Run id and fail/success extraction runIdPattern: $runIdPattern")
	    println(s"Output directory: $outputDir")
	    println()

	    val myGet = Utils.authenticatedGetString(username, password)(_)
	    val myGetBytes = Utils.authenticatedGetBytes(username, password)(_)

	    myGet(url)
	      .map ( data => {
	        val xml = XML.loadString(data)
	        val entries = xml \\ "feed" \\ "entry"
	        entries.map( node => ( (node \\ "title").text, (node \\ "link").\@("href") ) )
	      } )
	      .getOrElse(Seq())
	      .collect {
	        // Entries whose title does not match the pattern are skipped;
	        // a plain map with a single case would throw a MatchError on them.
	        case ( runIdPattern(entryRun, rest), url ) =>
	          ( entryRun, rest.contains("fail"), url )
	      }
	      .filter( s => s._2)
	      .map( entry => {
	        val ( entryNumber, isFail, url ) = entry
	        val runState = if ( isFail ) "fail" else "stable"
	        val fileName = s"$outputDir/$entryNumber-log-$runState.log"
	        ( fileName, url + fileUri)
	      })
	      .filter( file => !Files.exists(Paths.get(file._1)) )
	      .foreach( fileUrl => {
	        val ( file, url ) = fileUrl
	        myGetBytes(url) map( data => {
	          val text = Source
	            .fromInputStream( new GZIPInputStream(new ByteArrayInputStream(data)))
	            .getLines.toList
	          println( s"Writing file: $file" )
	          val writer = new BufferedWriter(new FileWriter(file))
	          try text.foreach( line => {
	            writer.write(line)
	            writer.newLine()
	          })
	          finally writer.close() // close the file even if a write fails
	        })
	      })
	  }
	}


The code expects the Jenkins username and password and the output directory as parameters (lines 17 - 19). It uses Lightbend (Typesafe) Config to configure the following data:

Jenkins RSS URL (line 14).
The file URI pattern used to find the gzip-ed log file from the URL in the RSS feed. Ex. artifact/theFile.gz. (line 15)
A regular expression to find the Jenkins run identifier. Ex. “SomeJob #([0-9]+) (.*)”. (line 16)
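To make the regular expression concrete, here is a small sketch that applies the example pattern to a few titles. The titles are made-up samples, and parseTitle is a hypothetical helper that is not part of the project.

```scala
object RunIdSketch {
  // The example pattern from the list above.
  private val runIdPattern = "SomeJob #([0-9]+) (.*)".r

  // Returns (run id, failed?) when the whole title matches the pattern, None otherwise.
  def parseTitle(title: String): Option[(String, Boolean)] = title match {
    case runIdPattern(runId, rest) => Some((runId, rest.contains("fail")))
    case _                         => None
  }
}
```

The first group captures the run number and the second one captures the rest of the title, which is then searched for the word "fail". A Scala Regex used as a pattern must match the whole string, so titles from other jobs fall through to None.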

The code is straightforward. Here is a description of what it does:

Function currying to get a function with authentication already configured. (lines 27 - 28)
Get RSS feed as a string. (line 30)
Map the string to a Seq of string pairs, where the first element is the title of the run and the second is the URL of the run. (lines 31 - 36)
Map the pairs to tuples, using the regular expression to extract the run id and whether the run failed. (lines 37 - 42)
Keep only the failures. (line 43)
Build the output file name and the log URL for each failure, and skip the files that already exist in the output directory. (lines 44 - 50)
Iterate over the URLs. (line 51)
Get the gzip-ed logs. (line 53)
Unzip the files and store them in the desired location. (lines 54 - 63)
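The unzip step can be tried in isolation, without Jenkins. In the sketch below, gzip only produces test data in memory and stands in for the downloaded bytes, while gunzipLines mirrors the decompression done in the main object. Both names are hypothetical, and reading the log as UTF-8 is my assumption.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

object GzipSketch {
  // Compresses text in memory; stands in for the bytes downloaded from Jenkins.
  def gzip(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(buffer)
    try out.write(text.getBytes(StandardCharsets.UTF_8)) finally out.close()
    buffer.toByteArray
  }

  // Decompresses the bytes and returns the log lines (UTF-8 is an assumption).
  def gunzipLines(data: Array[Byte]): List[String] = {
    val source = Source.fromInputStream(
      new GZIPInputStream(new ByteArrayInputStream(data)), "UTF-8")
    try source.getLines().toList finally source.close()
  }
}
```

Holding the whole file in memory is fine for logs of moderate size. For very large logs, streaming from the GZIPInputStream straight to the output file would avoid keeping every line in a List.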

This code could be improved by handling more errors (for example, when the code cannot write to a particular path), but I hope it will be as useful to somebody else as it was to me.
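As one possible direction, the write step could return a Try so that a failure on one file is reported instead of stopping the whole run. This is only a sketch: writeLines is a hypothetical helper, and UTF-8 is my assumption for the output encoding.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.util.Try

object SafeWriteSketch {
  // Writes the lines to `path`; an exception while opening or writing
  // the file is captured as a Failure instead of being thrown.
  def writeLines(path: Path, lines: Seq[String]): Try[Unit] = Try {
    val writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8)
    try lines.foreach { line =>
      writer.write(line)
      writer.newLine()
    } finally writer.close()
  }
}
```

The caller can then pattern match on the result and print the files that could not be written before moving on to the next one.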

Happy coding!
