An Overview of DevOps

Introduction

DevOps is a relatively recent concept that is being adopted more widely in business with each passing year. The term combines ‘development’ and ‘operations’. The main idea behind DevOps is that development and IT operations engineers should work together throughout the entire service lifecycle, from planning to product support. DevOps is also characterized by development and operations using the same techniques for their systems work, such as version control and testing.

DevOps is closely related to the Agile methodology for software development. It can be seen as an extension of Agile that covers not just the development process but the entire service delivery process.

 

Origins

Although software development is a young industry compared with manufacturing, agriculture or construction, there has been an unprecedented amount of churn in the methods proposed for developing and maintaining enterprise systems. In decades past, these systems were simple enough that they did not require a well-defined process to be built. As development teams grew and systems became more complex, many methods were developed. These include Waterfall, Incremental and Iterative, Agile, Scrum, DevOps and many more. [1]

DevOps inherits from both the Agile Systems Administration (ASA) movement and the Enterprise Systems Management (ESM) movement. ESM came about in the early 2000s, when John Willis and Mark Hinkle got together and first asked how operations methodologies could be improved. O’Reilly’s Velocity Conference in 2009 featured some important presentations describing development/operations collaborations that promoted safe, rapid change in web environments. [1]

At around the same time, the Agile methodology was becoming more widely adopted in the development space. Some forward-thinkers began to discuss ASA, which takes ideas from Kanban and lean manufacturing processes and brings them into the context of IT systems administration. [1]

In 2009, Patrick Debois and Andrew Shafer first met and began talking about DevOps (after they coined the term). They held the first DevOpsDays event in Ghent. The ball was well and truly rolling at this stage. DevOps’ success came about due to a number of combined forces: a growing automation and tool chain approach, more monitoring and provisioning tools, the need for agile processes and dev/ops collaboration. These came together to provide the DevOps movement with the principles, processes and practices that we see today. More recently, some thought-leaders in the field have expanded their definition of DevOps to also include Lean principles. [5]

 

DevOps in Detail

DevOps’ main goal is to bring together the development and maintenance teams so that operations employees are capable of doing development work when necessary and vice versa. This is a somewhat radical idea considering how heavily responsibilities are compartmentalised in large enterprise system development projects. The resulting lack of communication and integration between teams working on one project is the source of many problems. [2]

When people are only responsible for a small portion of a large project, they naturally tend to feel less responsibility for the project as a whole. As Maxime wrote in a blog post titled “Why I Quit my Dream Job at Ubisoft”, big teams lead to specialization. [3] When people specialize, they develop a kind of ‘tunnel vision’: they view their own area of expertise as the most important, which greatly complicates decision making. To use the example of an AAA game development team, if you’re responsible for the design of a lamppost in the game, you’re not going to feel a whole lot of responsibility for the game as a whole. DevOps aims to address this lack of responsibility by integrating and sharing work between team members.

 

Values

The values of the DevOps methodology are essentially the same as those outlined in The Agile Manifesto:

  • Individuals and interactions over processes and tools.
  • Working software over comprehensive documentation.
  • Customer collaboration over contract negotiation.
  • Responding to change over following a plan. [4]

 

Principles

According to John Willis, the principles of DevOps can be boiled down to four main concepts, known by the acronym CAMS.

Culture – this means that people and process have to be prioritised. Without a solid culture, attempts to collaborate and automate will be fruitless endeavours.

Automation – this follows an understanding of the culture. When team members know and understand each other’s strengths and weaknesses, decisions that support the team can be made. The ‘fabric’ of DevOps can be woven by selecting suitable tools for release management, configuration management, systems integration, monitoring and control.

Measurement – without measuring performance over time, improvement is unlikely. A successful DevOps implementation will measure everything it can, from process metrics to people metrics.

Sharing – acting as a feedback loop, this principle completes the CAMS cycle. When development and operations come together to discuss problems, they view the problem as the enemy as opposed to playing the blame game between departments.

 

 

Methods

A lot of the methods used in DevOps are an extension of those that can be used in the agile methodology. Methods such as Scrum with operations or Kanban with operations can be used, although there will be a greater emphasis on integrating ops with dev, QA and the product.

In keeping with the principle of automation, builds, versioning, testing, configuration management, deployment, middleware, monitoring, ticketing and provisioning should all be automated.

 

Patterns

Scripted environments – by fully automating environment provisioning, the risk of encountering environment-specific deployment errors is greatly reduced. Using scripted environments also verifies the integrity of a particular version of the software in target environments. Infrastructure automation tools such as Puppet support this pattern through manifest scripts, which are committed (like application code) to version-control repositories.
There are numerous benefits to following this pattern: environments are always in a known state; self-service environments and deployments become possible; deployments are less likely to behave differently depending on the environment type; environments are part of a single path to production; knowledge is less likely to be held only in team members’ heads; and, most importantly, deployments are more predictable and repeatable.
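
As a rough illustration of the ‘known state’ benefit, the sketch below models a scripted environment as a desired state that a script converges towards. It is not how Puppet works internally; the dictionary layout and the `converge` function are hypothetical names invented for this example.

```python
# Hypothetical desired state for one environment; the keys are illustrative only.
DESIRED = {
    "packages": {"nginx", "openjdk"},
    "services": {"nginx": "running"},
}


def converge(current, desired):
    """Apply whatever is missing from `current` and return the actions taken.

    `current` is updated in place, so a second call finds nothing left to do.
    """
    actions = []
    for pkg in sorted(desired["packages"] - current["packages"]):
        current["packages"].add(pkg)
        actions.append(f"install {pkg}")
    for name, state in sorted(desired["services"].items()):
        if current["services"].get(name) != state:
            current["services"][name] = state
            actions.append(f"set {name} {state}")
    return actions


fresh = {"packages": {"openjdk"}, "services": {}}
first_run = converge(fresh, DESIRED)
second_run = converge(fresh, DESIRED)
print(first_run)   # the actions needed on a fresh machine
print(second_run)  # empty, because the environment is already in the known state
```

Running the script twice is the point: the second run changes nothing, which is what makes deployments predictable and repeatable.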

Test-driven infrastructure – a premise of DevOps is the application of patterns from development to operations and vice versa. The test-driven approach of writing tests before writing functional code comes from software development but also lends itself to infrastructure automation. As tools for infrastructure automation become more popular, engineers are beginning to apply test-driven practices to their infrastructure. As with environment provisioning, infrastructure testing can be done with scripts. One tool for this is Cucumber, in which tests are described as scenarios and handled in a behaviour-driven manner (when I do X, I should see Y).
A benefit of this pattern is that problems manifest earlier as infrastructure changes are integrated with the rest of the software system using continuous integration. Also worth noting is that the tests and scripted infrastructure become the documentation.
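
The ‘when I do X, I should see Y’ idea can be sketched without any particular tool. The example below is plain Python rather than Cucumber, and the environment description, scenario names and `build_environment` function are all assumptions made up for illustration. The scenarios are written first and the environment script is then written to satisfy them.

```python
def check_scenario(environment, when, expect):
    """Run one 'when I do X, I should see Y' scenario against an environment."""
    observed = when(environment)
    return observed == expect


# Scenarios come first, before the environment script exists.
scenarios = [
    ("web port is open", lambda env: 443 in env["open_ports"], True),
    ("root login is disabled", lambda env: env["root_login"], False),
]


def build_environment():
    """Hypothetical scripted environment, written to make the scenarios pass."""
    return {"open_ports": [80, 443], "root_login": False}


env = build_environment()
results = {name: check_scenario(env, when, expect) for name, when, expect in scenarios}
print(results)
```

Because the scenarios state what the environment must look like, they double as the documentation mentioned above.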

Chaos Monkey – the Chaos Monkey is a continuous testing tool that was developed by the Netflix tech team. The tool intentionally and randomly terminates processes in the Netflix production infrastructure to ensure that systems continue to function in the event of failure. By constantly testing their infrastructure’s ability to succeed despite failure, they’re preparing for the inevitable unexpected outages. By following a principle of “everything fails, all the time”, they are prepared for the worst. [7]
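
The following toy sketch shows the idea behind this pattern, not Netflix’s actual tool. The `Service` class, the replica names and the `unleash_monkey` function are hypothetical: one randomly chosen replica is terminated and the service is then checked to see that it still answers.

```python
import random


class Service:
    """Toy service backed by several replicas; all names here are illustrative."""

    def __init__(self, replicas):
        self.alive = set(replicas)

    def handle_request(self):
        # The service stays up as long as at least one replica is alive.
        if not self.alive:
            raise RuntimeError("no replicas left")
        return "ok"


def unleash_monkey(service, rng):
    """Terminate one randomly chosen live replica and return its name."""
    victim = rng.choice(sorted(service.alive))
    service.alive.discard(victim)
    return victim


rng = random.Random(0)  # seeded so the run is repeatable
service = Service(["a", "b", "c"])
killed = unleash_monkey(service, rng)
status = service.handle_request()  # should still succeed with two replicas left
print(killed, status)
```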

Version everything – it’s still rare to find a team that versions all of the artefacts required to create the software system. The purpose of versioning everything is to establish a single source of truth (a canonical version). Software should be treated as a holistic unit. When everything is versioned, nobody is left unsure which version is current or has to navigate a mess of versions of the software system. A new team member joining a project should be able to perform a single-command checkout and be left with a complete working software system.

Delivery/deployment pipeline – this is a process in which different types of jobs are run based on the success of the previous job. Using a continuous integration server (such as Jenkins), various stages can be configured, including a commit stage, an acceptance stage, etc.
The visibility provided by a deployment pipeline ensures that all aspects of the delivery system, including building, deployment, testing and releasing, are visible to every member of the team. By fully automating the process, it’s possible to deploy any version of the software to any environment automatically. [6]
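
The core rule of a pipeline, that each job runs only if the previous one succeeded, can be sketched in a few lines. This is a minimal illustration and not Jenkins configuration; `run_pipeline` and the stage callables are hypothetical stand-ins for real build, test and deploy jobs.

```python
def run_pipeline(stages):
    """Run stages in order and stop at the first failure.

    `stages` is a list of (name, callable) pairs, where each callable
    returns True on success. Returns the names of the stages that ran
    and whether the whole pipeline passed.
    """
    ran = []
    for name, job in stages:
        ran.append(name)
        if not job():
            return ran, False
    return ran, True


stages = [
    ("commit", lambda: True),
    ("acceptance", lambda: False),  # simulate a failing acceptance stage
    ("deploy", lambda: True),
]
ran, passed = run_pipeline(stages)
print(ran, passed)  # ['commit', 'acceptance'] False
```

Because the acceptance stage fails, the deploy stage never runs, which is the behaviour that keeps a broken build away from production.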

 

Collaboration

Collaboration is fundamental to DevOps. Traditional development and operations teams often work in silos and have limited inter-team communication until software release time. As bad as it sounds, it’s almost expected that most software will not meet its release deadlines, and a lack of collaboration is at least partially to blame for this. There are a number of ways to increase collaboration and break down the traditional barriers that prevent software from being delivered regularly.

Collective ownership – this practice dates back to the extreme programming (XP) methodology and is also associated with agile methodologies. In the context of continuous delivery, the emphasis is on ensuring that all types of source files that make up the system are available for any authorized team member to modify. If everything is scripted and everything can be modified, then these source files should include application code, configuration, data and even infrastructure.

Cross-functional teams – this means having teams consisting of representatives from all relevant disciplines. As opposed to treating each discipline as a separate centralized service organization, the delivery team becomes the primary organizational structure. The anti-pattern of this is siloed teams who have their own scripts and processes. Populating a delivery team with business analysts, customer representatives, DBAs, developers, project managers, and QA and release engineers greatly reduces the ‘it’s not my job’ syndrome which plagues so many organizations not adopting a DevOps or similar methodology.

Polyskilled engineers – these are team members who are skilled in all areas of the software delivery process. In general, team members should first and foremost be capable of performing their specialized task, but they should also be capable of carrying out duties in other parts of the delivery process. This includes project managers writing tests, developers modifying database code, DBAs writing functional tests etc. Although a polyskilled engineer is not necessarily a jack of all trades, having such engineers removes the need to rely on a few key individuals to get software released.

 

Tools

Infrastructure automation: Bcfg2, CFEngine, Chef, CloudFormation, IBM Tivoli, Puppet.
Deployment automation: Capistrano, ControlTier, Func, Glu, RunDeck.
Infrastructure as a Service: Amazon Web Services, CloudStack, IBM SmartCloud, OpenStack, Rackspace.
Build automation: Ant, Maven, Rake, Gradle.
Test automation: JUnit, Selenium, Cucumber, easyb.
Version control: Subversion, Git, IBM Rational ClearCase.
Continuous Integration: Jenkins, CruiseControl, IBM Rational BuildForge.

As the above illustrative (and very incomplete) list shows, there is a plethora of tools to choose from when automating operations at every stage of the delivery process. The tools you choose will vary with the requirements of the project; however, each tool should be able to run from the command line. This enables the pipeline to run in headless mode or with just one click (running bash scripts).

Observation & Critique

In an article entitled ‘How DevOps is Killing the Developer’, Jeff Knupp makes the point that the DevOps movement, with its reliance on cross-functional profiles (jacks of all trades), is better suited to startups than to big enterprises. [8]

Knupp writes that the scarcity of resources in a startup environment warrants the jack of all trades profile: in a team of 7, it makes sense that the DBA can write functional code too. He argues that developers are busy enough dealing with problems within the realm of development without also taking on other, ‘easier’ work.

I think he has missed the point a little. DevOps is not solely about hiring people who are polyskilled; its primary goal is to improve software delivery through culture, automation, measurement and feedback (sharing). The benefit of polyskilled team members is that they can share work when there is a backlog, an approaching release or an absent colleague. This prevents the whole project from being delayed by a change in team dynamic.

It’s important to note that DevOps is a set of principles that should be viewed as guidelines. Each implementation of DevOps will differ from the next, depending on the software being developed, the skillset of the team members, the timeframe for development and the culture of the team. To go back to Knupp’s article, it would appear to me that his experience of DevOps was of a poor implementation, one that possibly hardly followed the principles. It often happens that the Buzz Word of the Year™ gets thrown around superficially by people who don’t understand its meaning at all.

 

Conclusion

In order to fully appreciate what could be gained by adopting a DevOps methodology, one must dig deep into the story behind the term. Understanding where the term came from and what problems it was trying to solve is essential for anyone who is considering a DevOps implementation.

The manual method of managing operations simply became outdated, and those who were left to deal with the frustration of moving slowly in a rapidly changing world looked to other fields for inspiration. DevOps came about in an attempt to bring the principles of the Agile software development methodology into the realm of IT operations.

The principles of the DevOps methodology are culture, automation, measurement and sharing. Growing the culture of your team means each team member knows their teammates. They know each other’s strengths and weaknesses, which helps them to work together more effectively and intelligently. If a culture of ‘all in this together’ is nurtured, team members will seek to improve processes at every stage of the service delivery system. Measurement (and sharing of measurements on dashboards!) makes it possible to improve the performance of operations.

As discussed above, there are numerous design patterns associated with the DevOps culture. These range from automated environment provisioning to version control systems. There are also countless tools out there to aid in implementation of these patterns. The tools you choose to use are entirely project/team specific.

A good way of understanding DevOps is as follows: Agile aims to improve communication and collaboration between developers and their clients, while DevOps aims to improve communication and collaboration between developers and operations teams.

 

References

 

 

Web Scraping with VBA

My first hands-on experience with web scraping was one of uncertainty and a significant amount of ‘on-the-job’ learning. Initially I was working as a tech support agent, but once the operations manager caught wind of the fact that I’m a programmer, I was moved to the offline team and tasked with writing a script that would scrape a relatively large amount of data from one of the company’s sites and store it in a spreadsheet for easier analysis.

I knew that this was definitely possible; it was just a matter of finding out how. I was given a week to determine if I was up to the task. I started where any sane person would: the Google search bar.

The first potential candidate I found was a Chrome plugin called webscraper.io. The examples they include on their site are very helpful and things were looking hopeful after a few hours of testing. The issue that led to this idea being scrapped was that it is only designed to scrape data from a website and has no functionality that would allow it to populate search fields and submit a form before commencing a scraping session. Although this tool didn’t suit my specific requirements, it could be of great help to someone looking to scrape an online store of all its products, for example.

Digging deeper and deeper through pages of Stack Overflow, I kept coming to the same conclusion: my best bet for automatically navigating through webpages, populating search fields and trawling through pages of results, documenting the contents as I went, was to use Visual Basic for Applications (VBA).

I had no prior experience with VBA, but nothing about it or its syntax caused me any trouble. I simply added references to ‘Microsoft Internet Controls’ and ‘Microsoft HTML Object Library’ and I had everything I needed within Excel and its IDE for VBA. The following code shows how to create an Internet Explorer object, set it to visible and navigate to a URL.


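' Requires a reference to 'Microsoft Internet Controls'
Dim IE As InternetExplorerMedium
Set IE = New InternetExplorerMedium
IE.Visible = True
IE.navigate "" ' the target URL goes between the quotes
Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE: DoEvents: Loop ' wait for the page to load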

Now that we’ve covered how to get to the site we want to scrape, we need to figure out how to interact with elements within the page. What better way to showcase this than to attempt to log in? The first thing to consider is how to address a particular element within a webpage. To do this we get the document object, the root of the Document Object Model (DOM), and from there we can drill down to identify specific elements. How complex this is depends on the website in question and the choices made by its developer(s). The following code snippet assumes that each element has been given a unique id. Sometimes this isn’t the case, but I’ll talk about that in a moment. First, the easy way:


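' The document object is the entry point to the page's DOM
Dim doc: Set doc = IE.document

' Object references must be assigned with Set
Dim passwordTextField: Set passwordTextField = doc.getElementById("password")
passwordTextField.Value = "hunter2"

Dim submitBtn: Set submitBtn = doc.getElementById("submit")
submitBtn.Click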

The image below shows how elements in a webpage are related. Each nested element is considered a child of its parent, meaning we can access any element by traversing this tree structure.

Figure: Example DOM structure

Let’s imagine the same case as above, but the submit button has not been given an id. The first place you ought to look is the parent of that element. Let’s say that element is a div with an id of “submitBtnDiv”. In this instance we can still access the submit button in a number of ways, but it’s a little trickier:


' We can get the first element in the set of children elements
Dim submitBtn: set submitBtn = IE.document.getElementById("submitBtnDiv").Children(0)

If we can guarantee that the element will always be the nth instance of a given tag, then we can do the following as well:


' If we know that the element in question is the first instance of an input tag
Dim submitBtn: set submitBtn = IE.document.getElementsByTagName("input")(0)

Note: using IE9 or later, you can also use the method getElementsByClassName().

Now that we can access any individual element within a webpage, writing a web scraper just comes down to the individual site in question. For this example, let’s use IrishRail.ie. Scanning through the source (using Google Chrome’s nifty inspect element feature), we can quickly determine how difficult it’s going to be to access the table of data we want.

Figure: HTML source of the page (html1)

The first thing that jumps out at me here is that there’s a div with an id of ‘content’, that’s a starting point. The next thing I see is that the table tag is its 11th child. Already, we can access the table:


Dim table
' To get the 11th child element:
Set table = IE.document.getElementById("content").Children(10)

Now that we have the table, it’s just a matter of accessing each individual td and placing its value into the corresponding cell in your spreadsheet. The following code shows a simple loop that will extract all data from the above table:


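Dim trs, tds, tbody, r, c
' Object references must be assigned with Set
Set tbody = table.getElementsByTagName("tbody")(0)
Set trs = tbody.getElementsByTagName("tr")

For r = 0 To trs.Length - 1
    Set tds = trs(r).getElementsByTagName("td")

    For c = 0 To tds.Length - 1
        ' Cells is 1-based, the DOM collections are 0-based
        ThisWorkbook.Sheets(1).Cells(r + 1, c + 1).Value = tds(c).innerText
    Next c
Next r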

In the above snippet, the getElementsByTagName method is used in two different ways. Firstly, we request the very first instance of tbody within the table tag. Secondly, we request a collection of elements of type tr. This gives us an ordered list of all tr tags nested within tbody. Looping through each row and then doing the same for each td, we can quickly grab all of the table’s contents without specifically referencing each individual piece of data.

And that’s basically all there is to it. There are other aspects one must think about when writing a web scraper that has to work reliably and consistently but this post should give you a basic understanding of the fundamental concepts used in web scraping.