An Introduction to Workflow Automation (CI, CT and CD)


Continuous Integration: CI is a development practice where members of a team integrate their work frequently, with each integration being verified by an automated build to detect errors as quickly as possible.
Continuous Testing: CT means that every time an integration is done, predefined test cases are run to ensure that the new code doesn’t break the existing system.
Continuous Deployment: CD means that you deliver software in a continuous, incremental fashion and deploy frequently. Here, deployment could be to a test environment or to pre-production environments.
So, for an organization to be continuous, it should be CI, CT and CD driven, and these practices must be built into the Software Development Lifecycle (SDLC). The diagram given below shows the phases of the SDLC and the areas covered by CI, CT and CD.
So, what are the benefits of being continuous? If implemented correctly and practiced regularly, being continuous reduces integration problems, thereby allowing you to deliver jobs, code and software more rapidly.
[Figure: Continuous Integration Development Cycle]
Also, by integrating regularly, you can detect errors quickly and locate them more easily. With the right tools, you get fewer conflicts and easier conflict resolution while integrating the code. Most importantly, you are less likely to break what already exists, and even if something breaks, it is easier to fix or recover.

Continuous Workflows Overview

Continuous integration needs a repository in which the code can be saved, retrieved and maintained. The repository must give developers a powerful version control system. Though there are a number of version control tools available, I suggest trying out Git. Git is a version control system (VCS) for tracking code changes and coordinating work on the code among multiple people. It is primarily used for source code management in software development, but it can be used to keep track of changes in any set of files. It supports a few common workflow models:

Centralized Workflow

 This flow uses a central repository to serve as the single point-of-entry for all changes to the project. The default development branch is called master and all changes are committed into this branch. This workflow doesn’t require any other branches besides master. A typical centralized workflow life cycle would be as follows:
  • Developers start by cloning the central repository to create their own local copies of the project. They edit jobs and commit changes locally. Once the changes are tested, the developer pushes their local master branch to the central repository.
  • Managing Conflicts: the central repository represents the official project, so if local job changes conflict with upstream commits, Git pauses the process and gives the developer a chance to resolve the conflicts manually. This makes it easy for developers to manage the merges.
You might have noticed that the centralized workflow is more like SVN with a few Git features. This makes it a great fit for transitioning teams off SVN; however, it doesn’t use the distributed nature of Git.

Feature Branch Workflow

  • The core idea behind the Feature Branch workflow is that all feature development should take place in a dedicated branch instead of the master branch. Git makes no technical distinction between the master branch and feature branches, so developers can edit, stage and commit just as they did in the Centralized workflow. Here a typical workflow would look like:
    • Developers create a new branch every time they start work on a new feature.
    • Feature branches should have descriptive names like issue-#1061, Jira-190. The idea is to give a clear, highly focused purpose to each branch.

Gitflow Workflow

This defines a strict branching model designed around the project release, which provides a robust framework for managing larger projects. It is similar to the Feature Branch workflow, except that it assigns very specific roles to different branches and defines how and when they should interact. It also uses individual branches for preparing, maintaining, and recording releases. As in the previous workflow, developers work locally and push branches to the central repo. The only difference is the branch structure of the project: you define Historical Branches, Feature Branches, Release Branches and Maintenance Branches.

Forking Workflow

This is fundamentally different from the other workflows. Instead of using a single server-side central repository, it gives every developer a server-side repository. This means that each contributor has not one, but two Git repositories: a private local one and a public server-side one. Note that the Forking workflow is not supported by Talend. A typical workflow would look like:
  • The Forking workflow begins with an official public repository stored on a server. When a new developer wants to start working on the project, they do not clone the official repository directly. Instead, they fork the official repository to create a copy of it on the server.
  • This new copy serves as their personal public repository – no other developers are allowed to push to it, but they can pull changes from it.
  • When they are ready to publish a local commit, they push the commit to their public repository – not the official one. Then they file a pull request with the main repository, which lets the project maintainer know that an update is ready to be integrated.
  • The main advantage of this workflow is that contributions can be integrated without the need for everybody to push to the official repository.
  • Developers push to their own server-side repositories and only the project maintainer can push to the official repository.


A Day in the Life of an SDLC Developer

As the industry moves towards agility, speed is becoming the need of the hour. A systematic daily routine, if practiced, helps a developer and the project achieve more in less time. Though the time to develop the code remains the same, there would be a drastic improvement in code integration, time to test and time to deploy. Let’s go through the recommended steps for the smooth day-to-day activities of a Talend developer.

Development & Tests:

The steps in the development phase are:
Step 1: In Talend Studio, the developer pulls a local copy of the Master branch from the centralized Git repository
Step 2: He or she then performs all the development/coding/job modifications in the local copy
Step 3: The changes are unit tested
Step 4: Changes are committed to the local copy (note that the changes are still in the local repository and are not yet committed to the Master branch)
Step 5: When the developer is ready to commit the changes to the Master branch, they perform a "Pull and Merge"
Step 6: If no conflicts appear, the code is committed to the Master branch
Step 7: If there are conflicts, the developer resolves them manually, repeats steps 2 to 5 if required, and then commits to the Master branch
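For readers who think in terms of the Git command line, here is a rough sketch of what the steps above correspond to. Talend Studio drives Git through its own interface, so this mapping is my assumption, and the function name, the remote name origin and the commit message are purely illustrative. The function only builds the command list; it does not execute anything.

```python
import shlex


def daily_cycle_commands(branch="master", remote="origin", message="Update job"):
    """Return the Git commands that roughly mirror the development steps.

    Illustrative mapping only: Talend Studio performs these actions through
    its own UI, and the remote/branch names here are assumptions.
    """
    return [
        ["git", "pull", remote, branch],    # Step 1: refresh the local copy
        ["git", "add", "--all"],            # Steps 2-3: stage the tested changes
        ["git", "commit", "-m", message],   # Step 4: commit to the local repository
        ["git", "pull", remote, branch],    # Step 5: pull and merge upstream work
        ["git", "push", remote, branch],    # Step 6: publish when there are no conflicts
    ]


if __name__ == "__main__":
    # Print the commands in a copy-paste friendly form instead of running them.
    for command in daily_cycle_commands(message="Fix key columns in dedup job"):
        print(shlex.join(command))
```

If the second pull reports conflicts (Step 7), Git stops before the push so that the conflicting files can be fixed and committed first.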

QA Tests:

Once the developer commits the changes to Master branch, the Quality Assurance (QA) phase starts. Here's what that would look like, step-by-step.
Step 1: Beyond unit testing, perform functional testing of the newly modified job and of all the other jobs that are part of the requirement/module/project
Step 2: Perform the non-functional testing for the whole requirement/module/project.

Go Live:

If the job passes all the tests performed by the QA team, it is built and deployed to a centralized repository.
Step 1: Once the code is in a centralized repository, it can be moved into other environments like Test, Stage or Production.
Step 2: Ensure the code is stable, since it will be used as the base for developing the next requirement
Step 3: If the job fails at QA phase, it is sent back to the developer for correction.

Automating with Talend CI Builder

You may have noticed that, most of the time, QA teams perform the same set of tests, and since most of these steps are repetitive, it makes sense to automate the work. Talend CI Builder helps automate this whole process by making use of Jenkins configuration. So, when using Talend CI Builder or opting for Continuous Testing, the traditional QA steps would be modified to look something like this:
Step 1: The development and test phases are completed by the developer in Talend Studio.
Step 2: QA tests are automated. You can schedule the build and choose where the job is built and tested, using various modules from both Talend and third parties.
Step 3: If the job passes the QA tests, the job build is automated. If not, an email notification is sent to the developer about the job not passing the QA tests, allowing them to correct the defects in the job.
Step 4: Once the job is built, it is automatically deployed to the central repository where the job could be released to different environments.

Continuous Testing with Talend: Step-by-Step

Now let’s look at the detailed process of automating the testing with Talend. Continuous testing starts within the development process, where the developer can use Talend Studio to test the functionality of their code. Tools such as GitHub or other repositories can be used to store and version the test cases together with the code. The same test cases can then be reused for integration or QA tests.
A test case has a set of test data, preconditions, expected results and postconditions, developed for a particular test scenario. Talend Studio comes with a test framework that allows you to create test cases, which helps keep your application ready and deployable at any point in time. It also enables developers to create test cases for different parts of the integration job. Test cases can be created by right-clicking on the component you want to test and selecting ‘Create Test Case’ from the menu.
Talend enables developers to add many instances of test cases, which means that you can run as many test cases as you need with different input and reference/comparison files. These test cases can then be automated using Jenkins.
Once you create a test case, a default test skeleton is created, depending on the component selected. In my example below, I am testing the tUniqRow component, and hence my default skeleton would look like the one given below.
Note that the skeleton generated depends on the component(s) selected in the job to create the test. Here, the test case aims at:
  • Reading input data files using tFileInputDelimited components,
  • Transforming data with an immutable set of INPUT and OUTPUT components based on the initial Job,
  • Writing the output data to a file using a tFileOutputDelimited component,
  • Comparing the temporary output file, created by a tCreateTemporaryFile component, to a reference file you need to define, using a tFileCompare component,
  • Generating the test execution status, OK if it succeeds or Fail if it fails, using a tAssert component.
A test case is successfully executed only when the output file and the reference file (the expected result, provided by the developer) are identical. The test panel displays the test case execution results such as status, % of success and duration of the test case execution. It also shows the execution history.
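To make the skeleton more concrete, here is a minimal sketch of the same idea in plain Python. It is not what Talend generates (Talend generates Java from the job design); the function names, the semicolon delimiter and the choice of the first column as the key are all assumptions made for illustration. It reads a delimited input file, removes duplicate rows the way a tUniqRow-style step would, writes the result to a temporary file, compares that file to a reference file and reports OK or Fail.

```python
import csv
import filecmp
import os
import tempfile


def unique_rows(rows, key_index=0):
    """Keep the first row seen for each key value (a rough stand-in for tUniqRow)."""
    seen = set()
    result = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


def run_test_case(input_path, reference_path, delimiter=";"):
    """Read input, transform, write a temporary output, compare it to the reference."""
    with open(input_path, newline="") as source:
        rows = list(csv.reader(source, delimiter=delimiter))

    # Temporary output file, removed again once the comparison is done.
    fd, output_path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as target:
            writer = csv.writer(target, delimiter=delimiter, lineterminator="\n")
            writer.writerows(unique_rows(rows))
        # Byte-for-byte comparison of the output and the reference file.
        identical = filecmp.cmp(output_path, reference_path, shallow=False)
    finally:
        os.remove(output_path)
    return "OK" if identical else "Fail"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        input_file = os.path.join(workdir, "input.csv")
        reference_file = os.path.join(workdir, "reference.csv")
        with open(input_file, "w", newline="") as f:
            f.write("1;alice\n2;bob\n1;alice-duplicate\n")
        with open(reference_file, "w", newline="") as f:
            f.write("1;alice\n2;bob\n")
        print(run_test_case(input_file, reference_file))
```

Because the comparison is byte-for-byte, the reference file has to match the output exactly, including delimiters and line endings. That is usually what you want for a regression test of this kind.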
If all the test cases are successful, the code is promoted to the next (higher) environment. Continuous Deployment automatically deploys this code to Nexus using tools like Jenkins or Bamboo.
To set up the automated environment, the following tools are needed:
  • A Jenkins server configured with JDK, Maven and Git plugins
  • A dedicated Talend command line (a separate one apart from the one already dedicated to TAC)
  • The Talend CI Builder plugin installed in the local Maven repository
  • Access to Git and Nexus. Jenkins would pull the code from Git and store the binaries in Nexus.
Let’s look at the automated steps of the CI process with Talend Jobs.
Step 1: All Jenkins jobs should be configured to trigger the CI process. Ideally, the process is started when code is committed from Talend Studio into the Git repository master branch (or any other branch specified in the job configuration). Jenkins allows you to specify various conditions to automatically trigger the jobs. The code generated by Talend consists of XML files containing the items and the job properties.
Step 2: Once the Jenkins workflow is triggered, it checks out the source code (.xml files, custom Java code and routines) from the Git repository. The Jenkins jobs should be configured accordingly to check out the source code.
Step 3: Once the jobs are checked out to the local workspace, the CommandLine service generates the Java code from the XML files.
Step 4: The Java source code is then compiled as directed by a Maven POM file. A Project Object Model or POM is the fundamental unit of work in Maven. It is an XML file that contains information about the project and configuration details used by Maven to build the project. It also contains default values for most projects. Some of the configuration that can be specified in the POM is the project dependencies, the plugins or goals that can be executed, the build profiles, and so on. Other information such as the project version, description, developers, mailing lists and such can also be specified.
Step 5: Once the code is compiled and binaries are created, this step will run any unit tests created in Talend (we ran through this test case creation earlier in the blog).
Step 6: If all the test cases pass, Jenkins creates a package and publishes the jobs. Packaging is the step that creates a zip file consisting of scripts, contexts, JVM parameters, and Java libraries. This zip archive is then published to an artifact repository. If the test cases do not pass, the Jenkins job aborts and the code is sent back to the developer for issue resolution.
Step 7: Once the jobs are published to Nexus, the binaries can be taken from Nexus, either manually or via the MetaServlet, then deployed and scheduled to run on the Talend job server.
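The control flow of the steps above boils down to "run each stage in order and stop at the first failure". Here is a small sketch of that logic in Python. It is only a simulation: the stage names are my own labels for Steps 2 to 6, each stage is a placeholder function returning True or False, and nothing here calls Jenkins, Maven or Talend.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PipelineResult:
    """Outcome of one pipeline run."""
    completed: List[str] = field(default_factory=list)
    failed_stage: str = ""

    @property
    def succeeded(self) -> bool:
        return self.failed_stage == ""


def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> PipelineResult:
    """Run (name, action) stages in order and stop at the first one returning False."""
    result = PipelineResult()
    for name, action in stages:
        if not action():
            result.failed_stage = name
            break
        result.completed.append(name)
    return result


if __name__ == "__main__":
    # Hypothetical stand-ins for the real stages; the test stage is made to fail.
    stages = [
        ("checkout", lambda: True),
        ("generate-java", lambda: True),
        ("compile", lambda: True),
        ("run-tests", lambda: False),
        ("package", lambda: True),
        ("publish", lambda: True),
    ]
    outcome = run_pipeline(stages)
    if outcome.succeeded:
        print("Published:", ", ".join(outcome.completed))
    else:
        print(f"Aborted at '{outcome.failed_stage}', notify the developer")
```

Stopping at the first failed stage is what keeps a job with failing test cases from ever being packaged or published.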
Well, I hope this blog gives some clarity on how to use continuous integration and testing with Talend CI Builder.
