Machine Learning Plugin project - Coding Phase 2 blog post

Loghi Perinpanayagam July 27, 2020

Welcome back folks!

This blog post is about my coding phase 2 in Jenkins Machine Learning Plugin for this GSoC 2020. After successfully passing the evaluation and demo in the phase 1, our team went ahead for facing the challenges in phase 2.

Summary

This phase of coding was well spent by documentation and by fixing many bugs. As the main feature of connecting to an IPython Kernel is done in phase 1, we were able to focus on fixing minor/major bugs and documenting for the users. According to the JENKINS-62927 issue, a Docker agent was built to facilitate users without concerning plugin dependencies in python. In the act of deprecation of Python 2, we ported our plugin to support Python 3. We have tested our plugin in Conda, venv and Windows environments. Machine learning plugin has successfully passed the end to end test. A feature for a code editor is needed for further discussion/analysis as we have done a simple editor that may be useful in other ways in the future. PR#35

Main features of Machine Learning plugin

Run Jupyter notebook, (Zeppelin) JSON and Python files
Run Python code directly
Convert Jupyter Notebooks to Python and JSON
Configure IPython kernel properties
Support to execute Notebooks/Python on Agent
Support for Windows and Linux

Upcoming features

Extract graph/map/images from the code
Save artifacts according to the step name
Generate reports for corresponding build

Future improvements

Usage of JupyterRestClient
Support for multiple language kernels
- Note : There is no commitment on future improvements during GSoC period

Docker agent

The following Dockerfile can be used to build the Docker container as an agent for the Machine Learning plugin. This docker agent can be used to run notebooks or python scripts.

Dockerfile

FROM jenkins/agent:latest

MAINTAINER Loghi <loghijiaha@gmail.com>

USER root

RUN apt update && apt install --no-install-recommends python3 -y \
    python3-pip \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /requirements.txt

RUN pip3 install --upgrade pip setuptools && \
    pip3 install --no-cache-dir -r /requirements.txt && \
    ln -sf /usr/bin/python3 /usr/bin/python && \
    ln -sf /usr/bin/pip3 /usr/bin/pip

USER jenkins

Ported to Python 3

As discussed in the previous meeting, we concluded that the plugin should support Python 3 as Python 2.7+ has been deprecated since the beginning of 2020. Pull request for docker agent should be also ported to Python 3 support.

Jupyter Rest Client API

The Jupyter Notebook server API seemed to be promising that it can be also used to run notebooks and codes. There were 3 api implementations that were merged in the master. But we had to focus on what was proposed in the design document and had to finish all must-have issues/works. Jupyter REST client was left for future implementation. It is also a good start to contribute to the plugin from the community.

Fixed bugs for running in agent

There were a few bugs related to the file path of notebooks while building a job. The major problem was caused by the python dependencies needed to connect to a IPython kernel. All issues/bugs were fixed before the timeline given.

R support as a future improvement

This is what we tried to give a glimpse of knowledge that this plugin can be extended for multi language support in the future. There was a conclusion that the kernel should be selected dynamically using extension of the script file(like eval_model.rb or train_model.r), instead of scripting the same code for each kernel.

Documentation and End to End testing

A well explained documentation was published in the repository. A guided tutorial to run a notebook checked out from a git repo in an agent was included in the docs page. Mentors helped to test our plugin in both Linux and Windows.

Code editor with rebuild feature

Code editor was filtered as a nice to have feature in the design document. After grabbing the idea of Jenkinsfile replay editor, I could do the same for the code. At the same time, when we are getting the source code from git, it is not an elegant way of editing code in the original code. After the discussion, we had to leave the PR open that may have use cases in the future if needed.

Jenkins LTS update

The plugin has been updated to support Jenkins LTS 2.204.1 as 2.164.3 had some problems with installing pipeline supported API/plugin

Installation for experimental version

Enable the experimental update center
Search for Machine Learning Plugin and check the box along it.
Click on Install without restart

The plugin should now be installed on your system.

Resources

About the author

Loghi Perinpanayagam

He was a GSoC 2020 student, contributor and maintainer for Machine Learning for Jenkins project. Highly interested and contributing on open source projects.