Skip to content

Latest commit

 

History

History
383 lines (307 loc) · 22.1 KB

README.adoc

File metadata and controls

383 lines (307 loc) · 22.1 KB

JEP-229: Continuous Delivery of Jenkins Components and Plugins

Abstract

Maintainers of Jenkins component or plugin repositories on GitHub, can opt into continuous delivery (CD). In this mode, every successful build of the default branch by ci.jenkins.io results in a new release of the artifact. GitHub Actions are used to rebuild the code and deploy it to Artifactory, without the need for maintainers to use personal credentials or local builds. An extension of JEP-305 is used to pick deterministic version numbers; maven-release-plugin (MRP) is not used.

Specification

A great deal of infrastructure from JEP-305 is reused, with some refinements at the component/plugin build level and new work at the organization level.

Component setup

Setting up automated plugin release describes how to configure a repository to be deploy’able to Jenkins Artifactory without using maven-release-plugin (MRP).

Then GitHub Actions can be used to do this automatically after master builds succeed in Jenkins. As an example, log-cli plugin version 40.934670f4a9b8 was deployed by a GitHub action after a Jenkins build passed triggered by a master push. The important parts of source setup are the version definition with a custom format and a GitHub Action workflow running mvn deploy with a bot token for Artifactory.

Note that the deployment skips tests and all other Maven mojos not necessary to produce artifacts, because the GitHub Action first verifies that the Jenkins build passed whether the deployment was triggered automatically or manually. It is the responsibility of the Jenkinsfile to ensure that an appropriate set of tests, static analyses, etc. are run on appropriate platforms.

Release Drafter is configured as an Action as well, but always publishing the draft release under the placeholder name next rather than a real version. Before an attempt is made to deploy a release to Artifactory, Release Drafter updates a draft release, then the deployment script (if successful) creates a (lightweight) Git tag and publishes the release. Release Drafter categories are used to skip automatic releases if there are no “interesting” changes.

Organization setup

To avoid the need for each repository maintainer to manually create a bot account and perform similar setup, tooling at the organization level automates this aspect.

Repository Permissions Updater (RPU) processing recognizes a flag in a repo’s YAML entry (cd.enabled set to true). If that flag is set, a group whose name is derived from the GitHub repository name is created in Artifactory. That group is granted the same permissions as regular maintainers, but contains no users. Instead, RPU periodically generates a temporary Artifactory access token that’s a member of that group and saves it to the GitHub secrets MAVEN_USERNAME and MAVEN_TOKEN in the corresponding repository.

Motivation

Traditionally, all Jenkins components and plugins have been released using the Maven Release plugin (MRP). This is typically invoked by a developer on a local clone of the repository in the master branch:

mvn --batch-mode release:prepare release:perform

MRP presumes that the <version> in the POM is something like 1.23-SNAPSHOT. It will create a commit [maven-release-plugin] prepare release mycomponent-1.23 with a tag and a release version. It will then create a commit [maven-release-plugin] prepare for next development iteration going to 1.24-SNAPSHOT, push both commits to the master branch, build the artifact(s) from the release tag, and deploy that binary to Artifactory.

MRP has numerous intrinsic flaws that render it poorly suited to a project like Jenkins with a huge number of interrelated artifacts (mainly, though not exclusively, plugins) and many maintainers:

  • Every maintainer must obtain (and keep secure) personal credentials to Artifactory in order to perform releases. Each maintainer also needs to be listed in repository-permissions-updater, even if they have full rights to the GitHub repository.

  • A release involves running tests locally (e.g., on a laptop). CPU-intensive test runs can interfere with other work; flaky tests might behave differently than in on ci.jenkins.io.

  • It is impossible to verify that released binaries actually correspond to the purported tagged sources.

  • Developers need to pick version numbers whether or not the numbers have any meaning to humans.

  • Every release must have two machine-generated Git commits, since the version number is encoded in the version-controlled pom.xml.

  • It is tricky to git bisect a problem reproduced only by a dependency on a artifact since even local builds (mvn install) get various version numbers.

  • The perform phase cannot be simulated in advance, and involves procedures never used in other circumstances, so often things go wrong and leave a “botched release” which must be cleaned up and retried. Novice and even experienced Jenkins developers often have to ask for help with release problems.

  • Pull requests are sometimes approved and merged but left unreleased for months on end, merely because the PR’s value did not seem justified by the overhead of performing a release.

The concept of continuous delivery (CD) is that “releases” should be cheap and frequent. Rather than forcing a maintainer to run a slow and perilous command every time anything of value is merged, harness automation to publish the newest source changes right away and without intervention.

Reasoning

This JEP is inspired by JEP-221, and shares similar motivation, but diverges from that proposal in various technical aspects.

Version format

The use of JEP-305-based version numbers is attractive in that it requires no maintenance whatsoever: merely pushing a commit to master (including merging a pull request) suffices to trigger a deployment, and the version number will uniquely and securely identify that commit, with no need to create a redundant Git tag. Component maintainers who wish to follow SemVer principles, encoding some semantics into version numbers, can still do so by appending the generated number as a “micro” component to a manually maintained major.minor. prefix.

GitHub Actions

GitHub Actions are attractive in this context because they define a trust boundary naturally scoped to the repository: a given bot token is defined in only repository, useful in only that repository, and used only for a containerized build of that repository. A system using a trusted Jenkins server, as proposed in JEP-221, would add more infrastructure complexity and maintenance, and the flexibility and visualization of Jenkins is not needed or wanted for this very limited operation: running a Maven build and deployment with no test code.

Deployment authentication

The deployment system used for JEP-305, of the standard Jenkins server plus an incrementals-publisher microservice, solves a similar problem but is not suitable here. On the one hand, this JEP involves deploying from master (or perhaps another trusted origin branch), so there is no need for the precautions used in JEP-305 to check that the deployed bits match expected metadata, or the split between CI build and deployment needed to guard a single Artifactory token from malicious (especially forked) PRs. And on the flip side, the requirement for a secure execution environment is more stringent: if ci.jenkins.io were to be compromised, malicious binaries could be deployed to the user-facing update center, not merely an experimental repository used mostly by other CI builds for prerelease testing.

Frequency of deployments

The whole point of this JEP is to encourage automatic and frequent deployments. If it is widely adopted, there are some risks to this frequency. (These are not blockers to experimentation on a few repositories.)

Artifactory might not be able to handle the traffic. This is already a concern generally with our hosted Artifactory, but the Jenkins project is looking into what precisely the limits are.

Jenkins administrators might tire of constantly seeing entries in the Plugin Manager » Updates tab. In many cases, there may be few or no behavioral changes in a release, just code cleanups or POM tidying. While having these releases is sometimes valuable for PCT, they are not valuable to administrators. We could slow down the frequency at which the update center is automatically checked, currently one day, but this would also slow down notifications of security updates, which we certainly do not want; perhaps very recent updates could be hidden unless specifically requested or can be identified as security updates.

Releases of development-time components (plugin-pom, bom, jenkins-test-harness, etc.) or of widely used API plugins (workflow-step-api-plugin, credentials-plugin, etc.) might create “Dependabot storms” whereby one minor change in a base component/plugin triggers a release, followed by PRs to intermediate-level components/plugins which are then merged and trigger releases, followed by PRs to higher-level components/plugins with their own releases. Excessive updates could consume a lot of CI time and exacerbate the previously mentioned risks. By default, no such storm will occur, since 📦 updates from Dependabot do not by themselves trigger releases.

For any such issues, or for maintainers who prefer to do manual sanity checks prior to release rather than when merging PRs, there is another option: manual triggers can be used to deploy from a given branch on demand, rather than automatically upon push. This is also likely to be the preferred trigger for backport branches. Compared to running MRP locally, this is still much less effort for maintainers, though such a trigger does require that there is a passing Jenkins CI check before proceeding. (This validation is part of the Action definition, not manual, so we can be sure that deployed releases pass official test suites. If there are outages on ci.jenkins.io, the maintainer can wait for a fix, or Re-run the build.)

Push vs. pull for Artifactory tokens

Rather than having the RPU build push Artifactory tokens into repository secrets, which introduces questions of token expiry and possible theft by repository owners, we might want to have the deployment Action retrieve a short-lived Artifactory token on demand.

For this to be possible, we would need to run a new microservice in the Jenkins cluster which had broad Artifactory permissions (sufficient to create bot users and tokens) and which could read RPU configuration. The Action would need to transmit its temporary ${{ secrets.GITHUB_TOKEN }} to the service, as well as some $GITHUB_SHA from the repository. The service would then validate this token was in fact an App installation token, and determine the repository on which it is valid:

repo=$(curl --silent --header "Authorization: Bearer $TOKEN" https://api.github.com/installation/repositories | jq --raw-output '.repositories[0].full_name')

It can then (with difficulty) verify that the App has write permission to the repository, as an Action token will (to prevent spoofing from low-privileged Apps):

tag=permcheck-$RANDOM
curl --header "Authorization: Bearer $TOKEN" --data '{"ref":"refs/tags/'$tag'","sha":"'$sha'"}' https://api.github.com/repos/$repo/git/refs
curl --header "Authorization: Bearer $TOKEN" --request DELETE https://api.github.com/repos/$repo/git/refs/tags/$tag

Now knowing that it has been called from an App with write permissions, such as the deployment Action, it can create a new Artifactory token with a short expiry (say one hour) granted permission only to upload to the paths defined for this repository in RPU and return that token in its response. The Action would then bind this token to an environment variable for use from settings.xml.

On balance this “pull” approach seems worse than the currently proposed “push” approach:

  • It would require a new service to be maintained—the chief obstacle to JEP-221.

  • A publicly accessible service holding high-level Artifactory administrative permissions is a major attack target.

  • A push approach can also expire and rotate tokens, with some care. If the RPU batch job runs at least daily (not only on master push), and generates fresh tokens for all enrolled repositories, then it would be fairly safe for tokens to expire after a week, for example.

  • Theft of tokens prior to expiry by malicious maintainers is a possibility under either system; the window of opportunity would differ, as would the sort of audit trail produced.

Configuring Release Drafter to publish releases

As of version 5.13.0 it is possible to run Release Drafter as a step in the release workflow. Therefore the last few lines of the deployment script could be omitted if this Action were configured to do the publishing itself. However such a configuration would be more convoluted in cd.yaml and not really any easier to understand.

Single Action for release CI check and actual release

It would be more convenient to have a single Action which encapsulates the entire logic of a CD workflow. Unfortunately pending actions/runner #438 (part of actions/runner #646) two distinct Actions would be needed, one for the Jenkins CI status check, and one to perform the release after sources have been checked out. These could be combined at the cost of a slower and more expensive short-circuit status check (which is run several times as Jenkins sets pending CI statuses), but the CD workflow would still need to define steps to check out sources and set up the JDK.

Another easier-to-use variant would be for the CD workflow to wait for Jenkins to set a final CI status, as for example the wait-for-commit-statuses and await-status-action Actions let you do generically. However, this would waste Action minutes waiting for a Jenkins build that could take hours to complete in the worst case.

Backwards Compatibility

The Jenkins update center generator requires no modifications: releases deployed by this JEP’s mechanism appear in the regular Artifactory releases repository, using unusual but perfectly legal release version numbers. (It might make sense to ignore specific user names of deployers here, such as runner. As this is the fallback behavior when no maintainers are defined in the pom.xml, ignoring such uploader user names from the manifest file might result in plugin-site problems, though it is unlikely.)

The Jenkins plugin manager should require no modifications since it will be merely presented with valid-looking releases from the update center generator. The mechanism by which those releases were built and deployed is irrelevant.

The stock Pipeline library can be used as is, or with arbitrary modifications: customizations to how tests are run and so on would affect whether and how quickly ci.jenkins.io produces a passing commit status, without any interaction with the subsequent deployment. (The library already tolerates incremental versions from JEP-305; infra.maybeDeployIncrementals could be amended to skip deployment from master when changelist.format is defined, to avoid redundantly deploying the same bits to incrementals as would anyway be deployed to releases.)

The Plugin Compatibility Tester (PCT) should require no modifications to test plugins deployed by this JEP’s mechanism, or plugins depending on such releases: it has long since been fixed to tolerate incremental versions and JEP-305’s use of flatten-maven-plugin.

Security

The GitHub runner is solely responsible for rebuilding binary artifacts (such as plugin *.hpi) from sources. This defends against certain supply-chain attacks: if ci.jenkins.io were compromised, at worst this could result in components and plugins with test failures being deployed. The deployed binaries would still have been built from the source files stored on GitHub.

Currently we presume that maintainers are not maliciously inserting backdoors into manually deployed binaries. So long as maintainers are granted direct access to Artifactory as well as the option to use CD, trusting them is unavoidable, and it is desirable to offer this option to maintainers in order for example to produce backport releases—unless the proposed system can be used also for non-master pushes. If a maintainer were not given Artifactory credentials, they would not be able to deploy unauthorized binaries except by stealing the bot access token, which should only be possible by actually running a GitHub action that would at least leave an audit trail.

(Originally RPU meant only that a person’s Artifactory account should be allowed to deploy an artifact. It has since been overloaded to track component and plugin maintainers, including GitHub repository ownership. Making an artifact only be deployable via this CD system would imply that we need to split up the metadata in RPU, or at least add more metadata indicating that its original function should be suppressed.)

Infrastructure Requirements

RPU needs to be enhanced to generate and maintain bot accounts and tokens, which has some implications for the security of the RPU CI job itself.

Testing

Due to the number of moving parts and authentication, testing is manual. We can use this system for a while on a few canary plugins to flush out any problems with Dependabot, PCT, etc. The new system can also be tried out on non-plugin components (jenkins-test-harness, bom, etc.) since there is no immediate user impact of a new release appearing of such a component.

Prototype Implementation

The log-cli plugin implements basic aspects of this proposal from the developer side.

plugin-pom #375 and pom #147 add a POM profile to block MRP in CD mode.

The verify-ci-status Action and the jenkins-maven-cd Action implement most of the logic of the CD process.