Faster releases with cloud infrastructure platforms
Many companies are shifting to a DevOps approach, where software deployment
and the ongoing operations of production environments are both incorporated into
the software development lifecycle. Rather than throwing completed code
over the wall from development to operations when software is ready for
release into production environments, DevOps incorporates the release into
production into the overall release process. This typically relies on
having a programmable infrastructure, such as cloud and
software-defined networking (SDN), so that the whole environment is defined
as code. Cloud infrastructure platforms are especially well suited to this requirement.
By using DevOps, companies can move faster than ever. With the dynamic,
fast-changing nature of the cloud, it’s difficult for key stakeholders in
IT, operations, and security to have assurances that corporate policies
for compliance are being met. Applying a discovery-focused, event-driven automation approach can give organizations the configuration controls they need to stay secure in the cloud.
Quick overview of DevOps
DevOps is an approach that deploys software quickly when new releases
become available, with the primary goal of helping organizations move
fast. New releases are exposed to real-world
users, customers, and usage as soon as possible and are tested for
functionality and value to increase the benefit of the application. Most
organizations use a cloud infrastructure as the operating environment for
their DevOps teams. A simplified process might include the following steps:
- Commit code.
- Automatically package the application for deployment.
- Run a set of regression tests for code checking and smoke tests for
production environment readiness, along with any other tests the
release requires.
- Depending on the test results:
- Failed tests: Generate alerts and logs for development,
operations, and security teams. Return to coding.
- Passed tests: Automatically launch into production.
- Run in production until the next release.
This whole process, from end to end, is often referred to as a DevOps
routine or playbook.
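The playbook steps above can be sketched as a simple decision function. This is a minimal illustration, not any real pipeline; test names and the return format are invented for the example:

```python
def run_playbook(test_results):
    """Decide the next playbook step from a map of test name -> pass/fail.

    Mirrors the steps above: any failed test triggers alerts for the
    development, operations, and security teams and a return to coding;
    an all-green run launches automatically into production.
    """
    failed = sorted(name for name, passed in test_results.items() if not passed)
    if failed:
        return {"action": "return_to_coding", "alert_on": failed}
    return {"action": "launch_to_production", "alert_on": []}

# Example runs (test names are illustrative):
print(run_playbook({"regression": True, "smoke": True}))
print(run_playbook({"regression": True, "smoke": False}))
```

Real pipelines implement this branching in CI/CD tooling, but the decision logic is the same: the test results alone determine whether the release proceeds.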
But not all DevOps teams know how to provision infrastructure correctly to
keep in line with the organization’s security and compliance requirements.
And to compound the complexity, most DevOps approaches include the
following mitigation mechanism for any problems: Revert the whole
environment. DevOps playbooks almost always include rollback procedures.
Figure 1. DevOps playbook
By Medrecs (Own work) [CC BY-SA 4.0
(http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons
The cloud for DevOps
The cloud is uniquely positioned to work as the infrastructure for a DevOps
approach for many reasons, including:
- The cloud is software defined; it is programmable and flexible.
- Cloud is software driven and has many different layers, each of which
the customer is responsible for designing and configuring, from network
configuration to service accounts and credentials to data protection
and encryption settings.
- Cloud infrastructure is typically billed on a utility-style,
pay-as-you-go model. This means that total application running costs
can be discretely determined, and companies can do accurate ROI
measurements to determine the value of an application.
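Because cloud billing is metered, per-application running cost and ROI can be computed directly. A minimal sketch; the rates and usage figures below are hypothetical, not any provider's pricing:

```python
def monthly_run_cost(hourly_rate, hours, storage_gb, gb_month_rate):
    """Utility-style billing: cost is metered, so it can be computed exactly."""
    return hourly_rate * hours + storage_gb * gb_month_rate

def roi(monthly_value, monthly_cost):
    """Simple ROI: net benefit divided by cost."""
    return (monthly_value - monthly_cost) / monthly_cost

# Hypothetical application: one instance at $0.10/hour running all month
# (720 hours), plus 500 GB of storage at $0.023/GB-month.
cost = monthly_run_cost(hourly_rate=0.10, hours=720, storage_gb=500, gb_month_rate=0.023)
print(f"monthly cost: ${cost:.2f}")  # $72.00 compute + $11.50 storage = $83.50
print(f"ROI at $1,000/month of delivered value: {roi(1000, cost):.0%}")
```

This kind of discrete per-application accounting is difficult with shared on-premises infrastructure, where costs are amortized across many workloads.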
Most organizations assume that they can use cloud infrastructure the same
way they use traditional IT infrastructure, namely through a centralized IT team. However, approaches like service
catalogs, provisioning templates, and limited console access fail at scale
because the needs of different teams are too diverse, and IT struggles to
keep up with them. Eventually, serious cloud adoption becomes
decentralized, and many teams access cloud infrastructure directly. This
raises a fundamental problem: Not all of these teams are experts in
infrastructure. And many simply don’t know or don’t understand the
security requirements of the organization or how to implement them in
software-defined cloud environments. When coupled with the complexity of
large enterprises, business units, distributed teams, and various
compliance regimes, this problem gets more and more difficult to manage at
scale.
The DevOps approach to fixing problems
Many DevOps teams automate both the deployment and roll-back of new applications and application releases. So if they identify a problem, the entire environment is torn down.
This complete teardown often implies downtime for the environment,
followed by an exhaustive root-cause analysis. While this exercise supports
long-term organizational improvement, it challenges immediate application
availability.
A secondary challenge is that test coverage might not include every
condition that should be considered to make a keep versus rollback decision.
Tests often focus on application behavior, code correctness, and sometimes
security assessment. Development teams tend to dominate DevOps groups, yet
they might lack the background to check for infrastructure optimization or security configuration.
Putting a set of security and configuration controls (sometimes
referred to as guard rails) in place can give key executive stakeholders the
protection and peace of mind that they need to let DevOps teams operate
quickly, without putting the organization at risk.
Security perception of DevOps
The main concern that large organizations have about DevOps is the lack of security focus. This is often a conscious trade-off: Let the
DevOps teams move quickly because they are working on critical new
ventures for the organization. As a result, security is often sacrificed so
that the teams and their environments are not constrained.
One of the reasons that the security perception of DevOps is so negative is
that with cloud infrastructure, it can be hard to manage the number of security controls and configurations that need to be programmed. In a simple
cloud application architecture design, there are at least ten unique
controls to implement and configure correctly. A few examples:
- Are cloud audit logs enabled for all the infrastructure components?
- Are firewall rules configured correctly?
- Are SSL certificates used at the right points?
- Are the SSL certificates valid and immune from known vulnerabilities?
- Are service accounts correctly used and secured for the services that use them?
- Do service accounts have the appropriate permissions, policies, and roles?
- Are subnets sized appropriately?
- Are network Access Control Lists (ACLs) being used and correctly configured?
- Is encryption used on data at rest where needed?
- Is encryption in transit used and correctly configured where required?
- Is communication between application tiers properly secured with encryption?
- Are static assets properly secured with appropriate permission settings?
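Controls like these can be expressed as small predicate checks over a resource's configuration. A minimal sketch; the configuration schema is invented for illustration, and real tools read this data from the cloud provider's APIs:

```python
# Each control pairs a name with a predicate over a resource's configuration.
CONTROLS = [
    ("audit_logs_enabled", lambda cfg: cfg.get("audit_logs") is True),
    ("no_open_ssh", lambda cfg: 22 not in cfg.get("open_ports", [])),
    ("encryption_at_rest", lambda cfg: cfg.get("encrypted") is True),
]

def evaluate(cfg):
    """Return the names of controls this resource configuration violates."""
    return [name for name, check in CONTROLS if not check(cfg)]

# A hypothetical virtual machine with SSH exposed and no disk encryption:
vm = {"audit_logs": True, "open_ports": [443, 22], "encrypted": False}
print(evaluate(vm))  # ['no_open_ssh', 'encryption_at_rest']
```

Keeping each control as an independent, named check makes it easy to report exactly which policies a resource violates rather than a single pass/fail.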
Figure 2. Sample cloud architecture
Image credit: DivvyCloud; used with permission
What controls are needed?
One of the key challenges for organizations in cloud adoption is defining
the policies and standards that they want in place for their cloud
environments. These requirements can vary by application, and they often
come from multiple stakeholders. For instance:
- Production environments can have the highest level of security requirements:
- Require MFA for all admin accounts
- Only firewall port 443 is allowed to be open
- Audit trail must be enabled
- Test and development environments might focus on controlling costs while
not allowing any access from outside the organization:
- No ACLs on cloud networks, but no public web access
- No instances over eight cores (hybrid cost control and
enforced focus on scaling out instead of scaling up)
- Maximum age of 30 days for any compute-based instance,
database, or storage volume
- Back office applications likely need to connect back to corporate systems:
- SSH access must be open to the corporate network IP range
- Databases backed up four times a day
- All IP traffic routed over VPN
Customers can put these configuration controls in place with different
tools, ranging from those available from each cloud provider to open
source inspection tools or commercial software. These tools can inspect
and take action for the problems identified. Organizations can use tags,
naming conventions, cloud region, or account placement to determine which
sets of configuration controls apply to each application or workload, in
addition to global policies that apply universally.
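Tag-driven control selection can be sketched as a lookup that merges the global policies with the environment-specific set. The tag keys and control names below are hypothetical, standing in for whatever convention an organization adopts:

```python
# Global policies apply universally; environment-specific sets are
# selected by the resource's tags.
GLOBAL_CONTROLS = ["audit_trail_enabled"]

ENV_CONTROLS = {
    "production": ["mfa_for_admins", "only_port_443_open"],
    "test": ["no_public_web_access", "max_eight_cores", "max_age_30_days"],
    "back_office": ["ssh_from_corp_range_only", "db_backup_4x_daily",
                    "all_traffic_over_vpn"],
}

def applicable_controls(tags):
    """Merge universal policies with the set chosen by the resource's env tag.

    Untagged resources default to the strictest (production) set.
    """
    env = tags.get("env", "production")
    return GLOBAL_CONTROLS + ENV_CONTROLS.get(env, ENV_CONTROLS["production"])

print(applicable_controls({"env": "test"}))
print(applicable_controls({}))  # no tag -> production controls apply
```

Defaulting unrecognized or missing tags to the strictest policy set is a deliberate fail-safe choice: a mislabeled workload gets more scrutiny, not less.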
Security requirements likely come from different sources, as well. In
the previous examples, for instance, the likely stakeholders would be:
|Control|Likely stakeholders|
|---|---|
|Port 443 traffic only|Security team, CISO, SecOps|
|No instances over eight cores|Finance, CFO, CIO|
|SSH access to corporate IT range|Operations team, IT, TechOps|
|Database backups|Operations team, TechOps, CIO, DR team|
Figure 3. Corporate-wide policies overlaid with business unit
policies and application-, project-, or product-specific policies
Image by DivvyCloud; used with permission
A commonly successful approach in cloud automation is to start with
visibility. Having a tool that continuously monitors and takes inventory of
the cloud environment is not only useful but necessary to ensure that all
cloud infrastructure is evaluated against those controls.
Some organizations rely on gathering this inventory from various
deployment tools that should “check in” when new applications launch.
Another approach is to use server agents to report back to a central
inventory. Both of these approaches rely on the agents or deployment tools
being installed, properly configured, and accessible.
An alternative approach is to use the cloud API layers to perform an
exhaustive and repeated query of the environments. While this consistently
leads to full infrastructure visibility, it has the trade-off of not being
able to query deeply into the operating system or application tiers. As
such, many organizations employ a combined approach and use APIs to
consolidate the data.
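The combined approach can treat the cloud API as the authoritative list of resources and enrich each entry with agent-reported detail where an agent exists. A minimal sketch; the resource IDs and record fields are invented for illustration:

```python
def consolidate(api_inventory, agent_reports):
    """Merge API-discovered resources with optional agent-reported OS detail.

    The API layer guarantees full coverage of the infrastructure; agents,
    where installed, add operating-system and application-tier depth.
    """
    merged = {}
    for resource_id, infra in api_inventory.items():
        record = dict(infra)
        record["os_detail"] = agent_reports.get(resource_id)  # None if no agent
        merged[resource_id] = record
    return merged

api = {"vm-1": {"type": "instance"}, "vm-2": {"type": "instance"}}
agents = {"vm-1": {"kernel": "5.15", "packages": 412}}
inventory = consolidate(api, agents)
print(inventory["vm-1"]["os_detail"])  # agent data present
print(inventory["vm-2"]["os_detail"])  # None: discovered via API only
```

Because the merge starts from the API side, a resource with no agent still appears in the inventory, which is exactly the coverage guarantee the agent-only approach lacks.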
Once the infrastructure is discovered, the applicable configuration
controls are identified and evaluated for each virtual machine,
application, network, storage asset, and workload. When problems are identified, the customer has two options: receive a notification and react or set predetermined automations to take action in near real-time.
A simple example is:
- The customer puts a control mechanism in place that disallows SSH
access to production virtual machines.
- A new application is launched into production. The application defines
a cloud network environment that inadvertently opens port 22 to the public internet.
- The cloud automation tools discover the new application and determine
that it is a production environment.
- The tools compare the new application to the applicable production
policy and determine that there is a problem.
- The customer has a predefined response that creates a notification,
logs an event to the audit trail, and closes the firewall port.
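The steps above amount to an event-driven remediation loop. A minimal sketch, not any vendor's API; the data shapes and the notification hook are invented for illustration:

```python
def remediate(resource, policy, audit_log, notify):
    """Compare a discovered resource to its policy; notify, log, and close ports."""
    violations = [p for p in resource["open_ports"] if p in policy["forbidden_ports"]]
    for port in violations:
        notify(f"{resource['id']}: closing forbidden port {port}")
        audit_log.append({"resource": resource["id"], "event": f"closed port {port}"})
        resource["open_ports"].remove(port)
    return violations

log, alerts = [], []
vm = {"id": "web-1", "env": "production", "open_ports": [443, 22]}
policy = {"forbidden_ports": [22]}  # production disallows SSH
remediate(vm, policy, log, alerts.append)
print(vm["open_ports"])  # [443]: port 22 closed automatically
print(log)
```

The key property is that notification, audit logging, and the corrective action happen together, so the environment is fixed in near real time while humans still get a full record of what changed and why.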
This gives key executive stakeholders both security controls and peace
of mind when applying DevOps and broader cloud adoption. These two factors
together can help the organization be agile and move fast.
Figure 4. Cloud configuration control process diagram
Image credit: DivvyCloud; used with permission
The cloud enables organizations to move rapidly. The true value, aside from
potential cost savings, is the agility that the cloud allows. Customers
can build, modify, and tear down environments at unprecedented speeds.
By combining software-defined infrastructure with application release pipelines, companies can enable a truly agile DevOps approach to releases. This dramatically shortens the time between releases and between the release and receiving real-world user data that leads to continuous application improvement.
While cloud is potentially transformative, the real-time nature and broader
access to infrastructure brings a new set of challenges to key stakeholders
for security, cost controls, and policy compliance. Using a
discovery-focused monitoring tool that can check and enforce controls can
help remove key concerns for cloud adoption. Organizations can safely move
forward faster with the cloud.