Category Archives: docker

Piping AWS output to Ansible Inventory

piping

I’ve had the opportunity to work with a few different infrastructure automation tools such as Puppet, Chef, Heat and CloudFormation but Ansible just has a simplicity to it that I like, although I admit I do have a strong preference for Puppet because i’ve used it extensively and have had good success with it.

In one of my previous project I was creating a repeatable solution to create a Docker Swarm cluster (before SwarmKit) with Consul and Flocker. I wanted this to be completely scripted to I climbed on the shoulders of AWS, Ansible and Docker Machine.

The script would do 4 things.

Initialize a security group in an existing VPC and create rules for the given setup.
Create machines using Docker-Machine of Consul and Swarm
Use AWS CLI to output the machines and pipe them to a python script that processes the JSON output and creates an Ansible inventory.
Use the inventory to call Ansible to run something.

This flow can actually be used fairly reliable not only for what I used it for but to automate a lot of things, even expand an existing deployment.

An example of this workflow can be found here.

I’m going to focus on steps #3 and #4 here. First, we use the AWS CLI to output machine information and pass it to a script.

# List only running my-prefix* nodes
$ aws ec2 describe-instances \
   --filter Name=tag:Name,Values=my-prefix* \
   Name=instance-state-code,Values=16 --output=json | \
   python create_flocker_inventory.py

We use the instance-state-code of 16 as it corresponds with Running instances. You can find more codes here: http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_InstanceState.html. Then we choose JSON output using –output=json.

Next, the important piece is the pipe ( `|` ). This signifies we pass the output from the command on the left of the | to the command on the right which is create_flocker_inventory.py so that the output is used as input to the script.

So what does the python script do with the output? Below is the script that I used to process the JSON output. It first setups up an _AGENT_YML variable that contains YAML for a configuration then the main() function takes the JSON from json.loads() in the script initialization and creates an array of dictionaries that represent instances and opens a file and writes each instance to the Ansible inventory file called “ansible_inventory”. After that the “agent.yml” is written to a file along with some secrets from the environment.

import os
import json
import sys


_AGENT_YML = """
version: 1
control-service:
  hostname: %s
  port: 4524
dataset:
  backend: aws
  access_key_id: %s
  secret_access_key: %s
  region: %s
  zone: %s
"""

def main(input_data):
    instances = [
        {
            u'ip': i[u'Instances'][0][u'PublicIpAddress'],
            u'name': i[u'Instances'][0][u'KeyName']
        }
        for i in input_data[u'Reservations']
    ]

    with open('./ansible_inventory', 'w') as inventory_output:
        inventory_output.write('[flocker_control_service]\n')
        inventory_output.write(instances[0][u'ip'] + '\n')
        inventory_output.write('\n')
        inventory_output.write('[flocker_agents]\n')
        for instance in instances:
            inventory_output.write(instance[u'ip'] + '\n')
        inventory_output.write('\n')
        inventory_output.write('[flocker_docker_plugin]\n')
        for instance in instances:
            inventory_output.write(instance[u'ip'] + '\n')
        inventory_output.write('\n')
        inventory_output.write('[nodes:children]\n')
        inventory_output.write('flocker_control_service\n')
        inventory_output.write('flocker_agents\n')
        inventory_output.write('flocker_docker_plugin\n')

    with open('./agent.yml', 'w') as agent_yml:
        agent_yml.write(_AGENT_YML % (instances[0][u'ip'], os.environ['AWS_ACCESS_KEY_ID'], os.environ['AWS_SECRET_ACCESS_KEY'], os.environ['MY_AWS_DEFAULT_REGION'], os.environ['MY_AWS_DEFAULT_REGION'] + os.environ['MY_AWS_ZONE']))


if __name__ == '__main__':
    if sys.stdin.isatty():
        raise SystemExit("Must pipe input into this script.")
    stdin_json = json.load(sys.stdin)
    main(stdin_json)

After this processes the JSON from the AWS CLI, all that remains is to run Ansible with our newly created Ansible inventory. In this case, we pass the inventory and configuration along with the ansible playbook we want for our installation.

$ ANSIBLE_HOST_KEY_CHECKING=false ansible-playbook \
 --key-file ${AWS_SSH_KEYPATH} \
 -i ./ansible_inventory \
 ./aws-flocker-installer.yml \
 --extra-vars "flocker_agent_yml_path=${PWD}/agent.yml"

Conclusion

Overall this flow can be used along with other Cloud CLI tools such as Azure, GCE etc that can output instance state that you can pipe to a script for more processing. It may not be the most effective way but if you want to get a semi complex environment up and running in a repeatable fashion for development needs it has worked pretty well to follow the “pre-setup_get-output_prcocess-output_install_config” flow outlined above.

Docker-based FIO I/O benchmarking

2 Replies

687474703a2f2f692e696d6775722e636f6d2f336f46443358502e706e67

What is FIO?

fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user. The typical use of fio is to write a job file matching the I/O load one wants to simulate. – (https://linux.die.net/man/1/fio)

fio can be a great tool for helping to measure workload I/O of a specific application workload on a particular device or file. Fio proves to be a detailed benchmarking tool used for workloads today with many options. I personally came across the tool while working at EMC when needing to benchmark Disk I/O of application running in different Linux container runtimes. This leads me to my next topic.

Why Docker based fio-tools

One of the projects I was working on was using Docker on AWS and various private cloud deployments and we wanted to see how workloads performed on these different cloud environments inside Docker container with various CPU, Memory, Disk I/O limits with various block, flash, or DAS based storage devices.

One way to wanted to do this was to containerize fio and allow users to pass the workload configuration and disk to the container that was doing the testing.

The first part of this was to containerize fio with the option to pass in JOB files by pathname or by a URL such as a raw Github Gist.

The Dockerfile (below) is based on Ubuntu 14 which admittedly can be smaller but we can easily install fio and pass a CMD script called run.sh.

FROM ubuntu:14.10
MAINTAINER <Ryan Wallner ryan.wallner@clusterhq.com>

RUN sed -i -e 's/archive.ubuntu.com/old-releases.ubuntu.com/g' /etc/apt/sources.list
RUN apt-get -y update && apt-get -y install fio wget

VOLUME /tmp/fio-data
ADD run.sh /opt/run.sh
RUN chmod +x /opt/run.sh
WORKDIR /tmp/fio-data
CMD ["/opt/run.sh"]

What does run.sh do? This script does a few things, is checked that you are passing a JOBFILE name (fio job) which without REMOTEFILES will expect it to exist in `/tmp/fio-data` it also cleans up the fio-data directory by copying the contents which may be jobs files out and then back in while removing any old graphs or output. If the user passes in REMOTEFILES it will be downloaded from the internet with wget before being used.

#!/bin/bash

[ -z "$JOBFILES" ] && echo "Need to set JOBFILES" && exit 1;
echo "Running $JOBFILES"

# We really want no old data in here except the fio script
mv /tmp/fio-data/*.fio /tmp/
rm -rf /tmp/fio-data/*
mv /tmp/*fio /tmp/fio-data/

if [ ! -z "$REMOTEFILES" ]; then
 # We really want no old data in here
 rm -rf /tmp/fio-data/*
 IFS=' '
 echo "Gathering remote files..."
 for file in $REMOTEFILES; do
   wget --directory-prefix=/tmp/fio-data/ "$file"
 done 
fi

fio $JOBFILES

There are two other Dockerfiles that are aimed at doing two other operations. 1. Producing graphs of the output data with fio2gnuplot and serving the graphs and output from a python SimpleHTTPServer on port 8000.

All Dockerfiles and examples can be found here (https://github.com/wallnerryan/fio-tools) and it also includes an All-In-One image that will run the job, generate the graphs and serve them all in one which is called fiotools-aio.

How to use it

Build the images or use the public images
Create a Fio Jobfile
Run the fio-tool image

docker run -v /tmp/fio-data:/tmp/fio-data \
-e JOBFILES= \
wallnerryan/fio-tool

If your file is a remote raw text file, you can use REMOTEFILES

docker run -v /tmp/fio-data:/tmp/fio-data \
-e REMOTEFILES="http://url.com/.fio" \
-e JOBFILES= wallnerryan/fio-tool

Run the fio-genplots script

docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
<fio2gnuplot options>

Serve your Graph Images and Log Files

docker run -p 8000:8000 -d -v /tmp/fio-data:/tmp/fio-data \
wallnerryan/fio-plotserve

Easiest Way, run the “all in one” image. (Will auto produce IOPS and BW graphs and serve them)

docker run -p 8000:8000 -v /tmp/fio-data \
-e REMOTEFILES="http://url.com/.fio" \
-e JOBFILES=<your-fio-jobfile> \
-e PLOTNAME=MyTest \
-d --name MyFioTest wallnerryan/fiotools-aio

Other Examples

Important

Your fio job file should reference a mount or disk that you would like to run the job file against. In the job fil it will look something like: directory=/my/mounted/volume to test against docker volumes
If you want to run more than one all-in-one job, just use -v /tmp/fio-data instead of -v /tmp/fio-data:/tmp/fio-data This is only needed when you run the individual tool images separately

To use with docker and docker volumes

docker run \
-e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/cdd8de476abbecb5fb5c56239ab9b6eb3cec3ed5/job.fio" \
-v /tmp/fio-data:/tmp/fio-data \
--volume-driver flocker \
-v myvol1:/myvol \
-e JOBFILES=job.fio wallnerryan/fio-tool

To produce graphs, run the fio-genplots container with -t <name of your graph> -p <pattern of your log files>

Produce Bandwidth Graphs

docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
-t My16kAWSRandomReadTest -b -g -p *_bw*

Produce IOPS graphs

docker run -v /tmp/fio-data:/tmp/fio-data wallnerryan/fio-genplots \
-t My16kAWSRandomReadTest -i -g -p *_iops*

Simply serve them on port 8000

docker run -p 8000:8000 -d \
-v /tmp/fio-data:/tmp/fio-data \
wallnerryan/fio-plotserve

To use the all-in-one image

docker run \
-p 8000:8000 \
-v /tmp/fio-data \
-e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/006ff707bc1a4aae570b33f4f4cd7729f7d88f43/job.fio" \
-e JOBFILES=job.fio \
-e PLOTNAME=MyTest \
—volume-driver flocker \
-v myvol1:/myvol \
-d \
—name MyTest wallnerryan/fiotools-aio

To use with docker-machine/boot2docker/DockerForMac

You can use a remote fit configuration file using the REMOTEFILES env variable.

docker run \
-e REMOTEFILES="https://gist.githubusercontent.com/wallnerryan/fd0146ee3122278d7b5f/raw/d089b6321746fe2928ce3f89fe64b437d1f669df/job.fio" \
-e JOBFILES=job.fio \
-v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
wallnerryan/fio-tool

(or) If you have a directory that already has them in it. *NOTE*: you must be using a shared folder such as Docker > Preferences > File Sharing.

docker run -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
-e JOBFILES=job.fio wallnerryan/fio-tool

To produce graphs, run the genplots container, -p

docker run \
-v /Users/wallnerryan/Desktop/fio:/tmp/fio-data wallnerryan/fio-genplots \
-t My16kAWSRandomReadTest -b -g -p *_bw*

Simply serve them on port 8000

docker run -v /Users/wallnerryan/Desktop/fio:/tmp/fio-data \
-d -p 8000:8000 wallnerryan/fio-plotserve

Notes

The fio-tools container will clean up the /tmp/fio-data volume by default when you re-run it.
If you want to save any data, copy this data out or save the files locally.

How to get graphs

When you serve on port 8000, you will have a list of all logs created and plots created, click on the .png files to see graph (see below for example screen)

687474703a2f2f692e696d6775722e636f6d2f6e6b73516b5a692e706e67

Testing and building with codefresh

As a side note, I recently added this repository to build on Codefresh. Right now, it builds the fiotools-aio Dockerfile which I find most useful and moves on but it was an easy experience that I wanted to add to the end of this post.

Navigate to https://g.codefresh.io/repositories? or create a free account by logging into codefresh with your Github account. By logging in with Github it will have access to your repositories you gave access to and this is where the fio-tools images are.

I added the repository as a build and configured it like so.

screen-shot-2016-12-29-at-2-45-46-pm

This will automatically build my Dockerfile and run any integration tests and unit tests I may have configured in codefresh, thought right now I have none but will soon add some simple job to run against a file as an integration test with a codefresh composition.

Conclusion

I found over my time using both native linux tools and docker-based or containerized tools that there is need for both sometimes and in fact when testing container-native application workloads sometimes it is best to get metrics or benchmarks from the point of view of the application which is why we chose to run fio as a microservice itself.

Hopefully this was an enjoyable read and thanks for stopping by!

Ryan

Service Oriented Architecture vs Modern Microservices: Whats the difference?

Leave a reply

Images thanks to http://martinfowler.com/articles/microservices.html and https://en.wikipedia.org/wiki/Service-oriented_architecture

I’ve been researching and working in the area of modern microservices for the past ~3 to 4 years and have always seen a strong relationship between Modern Microservices with tools and cultures like Docker and DevOps back to Service-Oriented Architecture (SOA) and design. I traced SOA roots back to Gartner Research in 1996 [2] or at least this is what I could find, feel free to correct me here if I haven’t pegged this. More importantly for this post I will briefly explore SOA concepts and design and how they relate to Modern Microservices.

Microservice Architectures (MSA) (credit to meetups and conversations with folks at meetups), are typically RESTful and based on HTTP/JSON. MSA is an architectural style not a “thing” to conform exactly to. In other words, I view it as more of a guideline. MSAs are derived from multiple code bases and each microservice (MS) has or can have its own language it’s written in. Because of this, MSAs typically have better readability and simpler deployments for each MS deployed which in turn leads to better release cycles as long as the organization surrounding the MS teams is put together effectively (more on that later). An MSA doesn’t NEED to be a polyglot of but will often naturally become one because teams may be more familiar with one language over the other which helps delivery time especially if the interfaces between microservices are defined correctly, it truly doesn’t matter most of the time. It also enables scale at a finer level instead of worrying about the whole monolith which is more agile. Scaling a 100 lines of Golang that does one thing well can be achieved much easier when you dont have to worry about other parts of your application that dont need or you dont want to scale in the monolith. In most modern MSAs, the REST interfaces mentioned earlier can be considered the “contract” between microservices in an MSA. These contracts should self describing as they can be, meaning using formats like JSON which is human readable and well-organized.

Overall an MSA doesn’t just have technical benefits but could also mean fewer reviews and approvals because of smaller context boundaries for each microservice team. Better acquisition and on-boarding because you dont have to be so strict about language preference, instead of retooling, you can ingest using polyglot.

Motivations for SOA, from what I have learned, is typically business transformation oriented which shouldn’t be surprising. The enterprise based SOA transformation on a large budget but the motivation is different now with modern MSA, now its quick ROI and better technology to help scale using DevOps practices and platforms.

Some things to consider while designing your modern MSA that I’ve heard and stuck with me:

Do not create too many services/microservices
Try not to manage your own infrastructure if you can
Dont make too many dependencies, (e.g. 1 calls 2 calls 3 calls 4 calls 5 ……)
Circuit Breaker Pattern, a control point between microservices.
Bulkhead, do not allow 1 problem affect the entire boat, each microservice has its own data service / database / connection pool, 1 service does not take down the whole system or other microservices.
Chaos testing (Add it to your test suite!) Example: Chaos Monkey
You can do microservices with or without service discovery / catalog. Does it over complicate things?

The referenced text[1] that I use for a comparison or similar concepts and differences in this post talk about a vast number of important topics related to Service-Oriented Architecture. Such topics include the overall challenges of SOA, service reuse, deployment efficiency, integration of application and data, agility, flexibility, alignment, reference architectures, common semantics, semantic pitfalls, legacy application integration, governance, security, service discovery, inventory and registration, best practices and more. This post does not go into depth of each individual part but instead this post aims at looking at some of the similarities and differences of SOA and modern microservices.

Service-Oriented Architecture:

Some of concepts of SOA that I’d like to mention (not fully encompassing):

Technologies widely used were SOAP, XML, WSDL, XSD and lots of Java
SOAs typically had a Service Bus or ESB (Enterprise Service Bus) a complex middleware aimed at providing access and masking of interfaces.
Identification and Inventory
Value chain and business model is more about changing the entire business process

Modern Microservces:

Technologies widely used are JSON, REST/HTTP and Polyglot services.
Communication is done over HTTP and the interfaces are abstracted using RESTful contracts.
Service Discovery
Value chain and business model is about efficiencies, small teams and DevOps practices while eliminating cilos.

The Bulkhead Analogy

I want to spend a little bit of time on one of the analogies that stuck with me about modern microservices. This was the Bulkhead analogy which I cannot for the life of me remember where I heard it or seem to google a successful author so credit to who or whom ever you are.

The bulkhead analogy is pretty simple actually but has a powerful statement for microservice design. The analogy is such that a MSA, like a large ship is made up of many containers (or in the ships case, bulkheads) that have boundaries between them and hold different component of the ships such as engines, cargo, pumps etc. In MSA, these containers hold different functions or processes that do something wether its handle auth requests, connection to a DB, service a lookup or transformation mechanism it doesn’t matter, just that in both cases you want all containers to be un-damaged for everything to be running the best it can.

The bulkhead analogy goes further to say that if a container gets damaged and takes on water then the entire ship should not sink due to one or few failures. In MSA this can be applied by saying that a few broken microservices should not be designed in a way where there failure would take down your entire application or business process. It essence designs the bulkheads or containers to take damage and remain afloat or “running”.

Again, this analogy is quite simple, but when designing your MSA it’s important to think about these details and is why doing things like proper RESTful design and Chaos testing is worth your time in the long run.

Similarities and Differences or the two architectures / architecture styles:

Given the little glimpse of information I’ve provided above about service oriented architectures and microservices architectures I want to spend a little time talking about the obvious similarities and differences.

Similarities

Both SOA and MSA do the following:

Code or service reuse
Loose coupling of services
Extensibility of the system as a whole
Well-defined, self-contained services or functions that overall help the business process or system
Services Registries/Catalogs to discover services

Differences

Some of the differences that stick out to me are:

Focus on business process, instead of the focus of many services making one important business process MSA focuses on allowing one thing (containerized process) to do one thing and do it well. This allows tighter context boundaries for microservices.
SOA tailors towards SOAP, XML, WSDL while MSA favors JSON, REST and Polyglot. This is one of the major differences to me, even though its just a tech difference this RESTful polyglot paradigm enables MSAs to thrive with todays developers.
The value chain and business model is more DevOps centric allowing the focus to be on loosely coupled teams that break down cilos and can focus on faster release cycles and CI/CD of their services rather than with SOA teams typically still had one monolithic view of the ESB and services without the DevOps focus.

Conclusion

Overall this post was mainly a complete high-level overview of what I think are some of the concepts and major differences between traditional SOA and Modern Microservices that stemmed from a course I took during my masters that explored SOA while I was in the industry working on Microservices. The main point I would say I have is that SOA and MSA are very similar but MSA being SOA’s offspring in a way using modern tooling and architecture approaches to todays scaleable data center.

Note* by no means did I cover SOA or MSA to do them any real justice, so I suggest looking into some of the topics talked about here or reading through some of the references below if your interested.

Cheers!

[1] Rosen, Michael “Applied SOA: Service-Oriented Design Architecture and Design Strategies” Wiley, Publishing Inc. 2008

[2] Gartner Research “Service Oriented” Architectures, Part 1:” – //www.gartner.com/doc/code/29201

[3] “SOA fundamentals in a nutshell” Aka Sniv February 2015 http://www.ibm.com/developerworks/webservices/tutorials/ws-soa-ibmcertified/ws-soa-ibmcertified.html

Container Data Management and Production Use-cases

Leave a reply

In my last few jobs I have had the pleasure to work on three main focus areas, Software Defined Networking, OpenStack, and Linux Containers. More recently I have been focused on container data management and what it means for persistence and data management to be a first class citizen to containerized applications and microservices. This has done a few things for me, give me an opportunity to work on interesting (and hard) problems, hack, create and apply new solutions and technologies into proof of concept and production environments. In my current role as a Technical Evangelist at ClusterHQ we have been hard at work continuing the Flocker project, creating the Volume Hub and spinning out dvol (git workflows for Docker volumes / data). All of this is aimed at one major thing, helping your team move from development on your laptop to test and Q/A and finally into production seamlessly with persistence.

We’re just at the beginning of the container and data revolution and if your interested in learning more about some of these topics, click the links below and vote for some upcoming talks at OpenStack Austin 2016.

Three Critical Concepts for Containerized Storage Management

https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/8506

Lessons learned running database containers on OpenStack

https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/8501

A special shoutout to Andrew Sullivan and Sumit Kumar for their CFP on “Data Mobility for Docker containers with Flocker”! Please help them spread the word on stateful containers by voting for their talk as well! https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7859

Cheers!

Ryan – @RyanWallner

Microservices: An architecture from scratch using Docker, Swarm, Compose, Consul, Facter and Flocker

3 Replies

‘

You might be asking yourself about the information going around about microservices, containers, and the many different tools around building a flexible architecture or “made-of-many-parts” architecture. Well you’re not alone, and there are many tools out there helping (or confusing) you to do so. In this post I’ll talk about some of the different options available like Mesos, Docker Engine, Docker Swarm, Consul, Plugins and more out there. The various different layers involved in a modern microservices architecture have various responsibilities and deciding how you can go about choosing the right pieces to build out those layers can be tough. This post is by far not the only way you can put the layers together and in fact this is MY opinion on the subject given the experience I have had in the ecosystem, it also does not reflect ideas of my employer.

Typically I define modern microservices architecture as having the following layers and responsibilities:

Applications / Frameworks / App Manifests
Scheduling
Orchestration
Monitoring / Logging / Auditing
Service Discovery
Cluster Management / Distributed Systems State
Data Services and Intelligence

To give you an idea of the tools available and technologies that fall into these categories, here is the list again, but with some of the projects, products and technologies in the ecosystem added. keep in mind this is not an exhaustive list.

*UPDATE: here is separate post aimed at making this a more exhaustive list

Applications / Frameworks / App Manifests
- Docker Compose YML
- Mesos Frameworks (There are many, see Mesos Application Frameworks)
- Cloud Foundry Application Manifest
Scheduling
- Fleet
- Marathon (with Mesos)
- Docker Swarm
- Kubernetes Scheduler
Orchestration (Orchestration is sometimes an overlapping topic)
- Openstack Magnum
- Kuberenetes
- Docker Compose
- Mesos
- TerraForm
Monitoring / Logging / Auditing
- Loggly
- Fluentd
- Sysdig
Service Discovery
- Consul
- Etcd
- Zookeeper
- Eureka
- Doozer
Cluster Management / Distributed Systems State
- Flocker Control Service / Agents
- Mesos Slaves
- Kubernetes Nodes/Masters
- Docker Swarm Agents + Master
Networking
- Docker Libnetwork
- Weaveworks
- Flannel
- Calico
Container Runtimes
- Docker
- Runc
- Systemd-nspawn
- LXD
- Garden
- Rocket (rkt)
Data Services and Intelligence
- Flocker
- Rexray
- Openshift persistent volumes
- Kubernetes Persistent Volumes with GCE
- Docker Volume Plugins

So lets choose a few components, we choose components apart from networking which we leave out here and just use host-only networking with vagrant, but we could add in libnetwork support in Docker directly. For logging and monitoring, we also just spin up DockerUI for this but could also add in loggly, fluentd, sysdig and others.

Consul – Service Discovery, DNS, K/V Storage
Docker Engine (Runtime)
Docker Swarm (Scheduler / Cluster / Distributed State )
Docker Compose (Orchestration)
Docker Plugins (Volume integration)
Flocker (Data Services / Orchestration)

So what do these layers look like all together? We can represent the layers I mentioned above the the following way, including the applications as a logical mapping to the containers running programs and processes above.

Above, we can put together a microservices architecture with the tools defined above, atop of this we can create applications from manifests and schedule containers onto the architecture once this is all running. I want to pinpoint a few specific areas in this architecture because we can add some extra logic to this to make things a little more interesting.

Service Discovery and Registration:

The registration layer can serve many purposes, it is mostly used to register and allow discovery of services (microservices/containers) that are running on a system/cluster. We can use consul to do this type of registration within its key / value mechanisms. You can use Consul’s built-in service mechanism or there are other ways to talk to consul key/value like registrator. In our example we can use the registry layer for something a little more interesting, in this case we can use consul’s locking mechanisms to lock resources we put in them, allowing schedulers to tap into the registry layer instead of talking to every node in the cluster for updates on CPU, Memory etc.

We can add resource updating scripts to our consul services by adding a service to consul’s service mechanism, these services will import keys and values from Facter and other resources then upload them to the K/V store, consul will also health check these services for us as a added benefit.

Below we can see how Consul registers services, in this case we register an “update service” which updates system resources into the registry layer.

We now have the ability to add many system resources, but we also have the flexibility to upload custom resources (facts) like system overload and memory swap free, the below “fact” is a sample of how we can do so giving us a system overload.

Doing so, we can enable the swarm scheduler to use them for scheduling containers. Swarm does not do this today, instead it has only static labels added when the docker engine is started, in this example our swarm scheduler utilizes dynamic labels by getting up-to-date realtime labels that mean something to the system which allows us to schedule containers a little better. What this looks like in swarm is below. See “how to use” section below for how this actually gets used.

Data Services Layer:

Docker also allows us to plug into its ecosystem for volumes with volume plugins which enable containers to add data services like RexRay and Flocker. This part of the ecosystem is rapidly expanding and today we can provision fairly basic volumes with a size, and basic attributes. Docker 1.9 has a volume API which introduces options (opts) for more advanced features and metadata passed to data systems. As you will see in the usage examples below, this helps with more interesting workflows for the developer, tester, etc. As a note, the ecosystem around containers will continue to grow fast, use cases around different types of applications with more data needs will help drive this part of ecosystem.

If your wondering what the block diagram may actually looks like on a per node (server/vm) basis, with what services installed where etc, look no more, see below.

From the above picture, you should get a good idea of what tools sit where, and where they need to be installed. First, every server participating in the cluster needs a Swarm Agent, Flocker-Dataset-Agent, Docker Plugin for Docker, A Consul server (or agent depending how big the cluster, at least 3 servers), and a Docker Engine. The components we talked about above like the custom resource registration has custom facter facts on each node as well so consul and facter can import them appropriately. Now, setting this up by hand is sure to be a pain in the a** if your cluster is large, so in reality we should think about the DevOps pipeline and the roll of puppet, or chef to automate the deployment of a lot of this. For my example I packed everything into a Vagrantfile and vagrant shell scripts to do the install and configuration, so a simple “vagrant up” would do, given I have about 20-30 minutes to watch the cluster come up 🙂

How do I use it!!?

Okay, lets get to actually using this microservices cluster now that we have it all set. This section of the blog post should give you an idea of the use cases and types of applications you can deploy to your microservices architecture and what tooling to use given then above examples and layers we introduced.

Using the Docker CLI with Swarm to schedule new resources via constraints and volume profiles.

This will look for specific load between 0-25% because we have a custom registration layer. *(note, some of the profiles work was in collaboration Mahuri from CHQ and Sean Dell for the Docker Global Hackday #3)

docker run -d -e constraint:system_overload_10min==/[0-2][0-5]/ -e constraint:architecture==x86_64 -e constraint:virtual=virtualbox -e constraint:selinux_enforced=false -v myVol@gold:/data/ redis

Using Docker Compose with Flocker volume driver that supports storage profiles:

This will schedule a redis database container using the flocker volume driver with a “gold” volume, meaning we will get a better IOPS, Bandwidth and other “features” that are considered of more performance and value.

Using Docker compose with swarm to guarantee a server has a specific volume driver and to schedule a container to a server with a specific CPU Overload:

In this example we also use the overload percentage resource in the scheduler but we also take advantage of the registry layer knowing which nodes are running certain volume plugins and we can schedule to Swarm knowing we will get nodes with a specific driver and hypervisor making sure that it will support the profile we want.

A use case where a developer wants to schedule to specific resources, start a web service, snapshot an entire database container and its data and view that data all using the Docker CLI.

This example shows how providing the right infrastructure tooling to the Docker CLI and tools allows for a more seamless developer / test workflow. This example allows us to get everything we need via the docker CLI while never leaving our one terminal to run commands on a storage system or separate node.

Start MySQL Container with constraints:

docker run -ti -e constraint:is_virtual==True -e constraint:system_overload_10min==/[0-2][0-5]/ -v demoVol@gold:/var/lib/mysql --volume-driver=flocker -p 3306:3306 --name MySQL wallnerryan/mysql

Start a simple TODO List application:

docker run -it -e constraint:is_virtual==True --rm -e DATABASE_IP=192.168.50.13 -e DATABASE=mysql -p 8005:8080 wallnerryan/todolist

We want to snapshot our original MySQL database, so lets pull the dataset-ID from its mount point and input that into the snapshot profile.

docker inspect --format='{{.Mounts}}' MySQL
[{demoVol@gold /flocker/f23ca986-cb43-43c1-865c-b57b1023ab7e /var/lib/mysql flocker z true}]

Here we place a substring of the the Dataset ID into the snapshot profile in our MySQL-snap container creation, this will create a second MySQL container with a snapshot of production data.

docker run -ti -e affinity:container==MySQL -v demoVol-snap@snapshot-f23ca986:/var/lib/mysql --volume-driver=flocker -p 3307:3306 --name MySQL-snap wallnerryan/mysql

At this point we have pretty much been able to snapshot the entire MySQL application (container and data) and all of its data at a point in time while scheduling the point in time data and container to a specific node in one terminal and few CLI commands where our other MySQL database was running. In the future we could see the option to have dev/test clusters and have the ability to schedule different operations and workflows across shared production data in a microservices architecture that would help streamline teams in an organization.

Cheers, Thanks for reading!

What Would Microservices do!?

1 Reply

Image Credit for the Googling of images

Private/Public IaaS, and PaaS environments are some of the fastest moving technology domains of their kind right now. I give credit to the speed of change and adoption to the communities that surround them, open-source or within the enterprise. As someone who is in the enterprise but contributes to open-source, it is surprising to many to find out that within the enterprise there is a whole separate community around these technologies that is thriving. That being said, I have been working with os-level virtualization technologies now for the past (almost) 3 years and in the past 2 years most people are familiar with the “Docker-boom” and now with the resurgence or SoA/Microservices I think its worth while exploring what modern technologies are involved, what drivers for change and how it affects applications and your business alike.

Containers and Microservices are compelling technologies and architectures, however, exposing the benefits and understanding where they come from is a harder subject to catch onto. Deciding to create a microservice architecture for your business application or understanding which contexts are bound to which functionality can seems like the complexity isn’t worth it in the long wrong. So here I explore some of the knowledge of microservices that is already out there in understanding when and why to turn to microservices and figuring out why, as in many cases, there really isn’t a need to.

In this post I will hopefully talk about some of major drivers and topics of microservices and how I see them in relationship to data-center technologies and applications. These are my own words and solely my opinion, however I hope this post can help those to understand this space a little better. I will talk briefly about, Conway’s Law, what it means to Break Down the Silos, why it is important to Continuous delivery, the importance of the unix philosophy, how to define a microservice, their relationship to SOA, the complexity involved in the architecture, what changes in the organization must happen, what companies and products are involved int this space, how to write a microservice, the importance of APIs and service discovery, and layers of persistence.

Microservices

The best definition in my opinion is “A Microservice fits in your head”. There are other definitions involving an amount of pizza, or a specific amount for lines of code, but I don’t like putting these boundaries on what a microservices is. In the simplest, a microservice is something that is small enough to conceptually fit in your head without really having to think to much about it. You can argue about how much someone can fit in their head etc, but then thats just rubbish and un-important to me.

I like to bring up the unix principle here as ice heard from folks at Joyent and other inn he field, this is the design that programs are designed to do one thing and do that one thing well. Like “ls” or “cat” for instance, typically if you design a microservice this way, you can limit its internal failure domain because it does one thing and exposes and API to do so. Now, microservices is a loaded term, and just like SoA there are similarities in these two architectures. But they are just that, architectures and I will add that you can find similarities in many of their parts but the some of the main differences is that SoA used XML, SOAP, typically a Single Message Bus for communication and a shared data source for services. Microservices uses more modern lightweight protocols like RESTful APIs, JSON, HTTP, RPC and typically a single microservice is attached to its own data source, whether is a copy, shard or a its own distributed database. This helps with multi-tenancy, flexibility and context boundaries that help scale such an architecture like microservices. One of the first things people start to realize when deep divining into microservices is the amount of complexity that comes out of slicing up the monolith because inherently you need to orchestrate, monitor, audit, and log many more processes, containers, services etc than you did with a typical monolithic application. The fact that these architectures are much more “elastic and ephemeral” than others forces technical changes that center around the smallest unit of business logic that helps deliver business value when combine with other services to deliver the end goal. This way each smaller unit can have its own change lifecycle, scale independently and be developed free of other dependencies within the typical monolith.

This drives the necessity to adopt a DevOps culture and change organizationally as each service should be developed by independent, smaller teams that can each release code within their own cycles. Teams still need to adhere to the invisible contracts that are between the services, these contracts are the APIs themselves between the services which talk to one another. I could spend an entire post on this topic but there is a great book called “Migrating to Cloud-Native Application Architectures” by Matt Stine of Pivotal (Which is free, download here) that talks about organizational changes, api-based collaboration, microservices and more. There is also a great post by Martin Fowler (here) that talks about microservices and the way Conways law affects the organization.

Importance of APIs

I want to briefly talk about the importance of orchestration, choreography and the important of the APIs that exist within a microservices architecture. A small note on choreography, this is another terms that may be new but its related to orchestration. Choreography is orchestration turned on its head, instead of an orchestration unit signaling when things happen, the intelligence is pushed to the endpoints and those endpoints react to events of changing environments, therefore each service known its own job. A great comparison of this is (here) in the book “Building Microservices” by Sam Newman. Rest APIs are at the heart of this communication, if an event is received from a customer of user, a choreography chain is then initialized and each endpoint talks to each other via these APIs, therefore, these APIs must remain robust, backward compatible and act as contracts between how services interact. A great post of the Netflix microservices work (here) explains this in a little more detail.

If the last few paragraphs and resources make some sense, you end up with a combination of loosely coupled services, strict boundaries, APIs (contracts), robust choreography and vital health and monitoring for all services deployed. These services can be scaled, monitored and moved independently without risk and react well to failures. Some of the exa plea of tools to hel you do this can be found at http://netflix.github.io. This all sounds great, but without taking the approach of “design for the integrations not the infrastructure/platform” (which I’ve heard a lot but can’t quite figure out who the quote belongs to, coudos to who you are :] ) this can fail pretty easily. There is a lot of detail I didnt cover in the above and I suggest looking into the sources I listed for a start on getting into the details of each part. For now I am going to turn to a few topics within microservices, Service Discovery & Registration and Data Persistence layers in the stack.

Service Discovery and Registration

Distributed systems at scale using microservices need a way to registry and discover what services and endpoints are available, enter Service Discovery tools like Consul, Etcd, Zookeepr, Eureka, and Doozerd. (others not listed) These tools make it easy for services to call this layer and find out a way to consume what else is available. Typically this helps one service find out how to connect and use another service. There are three main processes IMO for applications to use this layer:

Registration
- when a service gets installed or “comes up” it needs to initially be registered with the discovery layer. An example of this is Registrator (https://github.com/gliderlabs/registrator) which reacts to docker containers starting and sends key/value pairs of data to a tool like Consul or Etcd to keep current discovery data about the service. Such information could be IP Endpoint, Port, API URL/Path, Resources, etc that can be used by the service.
Discovery
- Discovery is the other end of Registration, when a service wants to use a (for instance) “proxy”, how will it know where the proxy lives or how to access it? Typically in applications this information is in a configuration file or hard coded into the app, whith service discovery all the app needs to do is know how to implement the information owned by the registration mechanism. For instance, an app can start and immediately say “Where is ‘Proxy'” and the discovery mechanism can respond by saying “Here is the Proxy thats closest to you” or “Here is the first Proxy available” along with the IP and Port of that proxy, the app can then just use those values, typically given in JSON or XML and use them inside the application thus not ever hardcoding any configuration anywhere.
Consume
- Last but smallest is when the applications received the response back from the discovery mechanism it must know how to process and the the data. E.g. if your asking for a proxy or asking for a database the information given back would be different for the proxy versus how you would actually access the database.

Persistence and Backing Services

Today most applications are stateless applications, which means that they do not own any persistence themselves. You can think of a stateless application as a web-server, this web-server processes requests and talks to a database, but the database is another microsevice and this is where all the state lives. We can scale the web-server as much as we want and even actively load balance those endpoints without ever worrying about any state. However, that database I mentioned in the above example is something we should worry about, because we want our data to be available, and protected at all times. Though, you can’t (today) spin up your entire application stack (e.g. MongoDB, Express.js, Angular.js and Node.js) all in different services (containers) and not worry about how your data is stored, if you do this today you need persistent volumes that can be flexible enough to move with your apps container, which is hard to do today, the data container is just not as flexible as we need it be in todays architectures like Mesos and Cloud Foundry. Today persistence is added via Backing Services (http://12factor.net/backing-services) which are persistence / data layers that exist outside of the normal application lifecycle. This mean that in order to use a database one must first create the backing service then bind it to the application. Cloud Foundry does this today via “cf create-service and cf bind-service APPLICATION SERVICE_INSTANCE” where SERVICE_INSTANCE is the backing store, you can see more about that here. I won’t dig into this anymore other to say that this is problem that needs to be solved, and making your data services as flexible as the rest of your microservices architecture is not easy feat. The below link is a great article by Luke Marsden of ClusterHQ that talks about this very issue. http://www.infoq.com/articles/microservices-revolution

I also wanted to mention an interesting note on persistence in the way Netflix deploys Cassandra. All the data that Netflix uses is deployed on Amazon on EC2 instance and they use ephemeral storage! Which means when the node dies all their data is gone. But alas! they don’t worry about this type of issue anymore because Cassandra’s distributed, self healing architecture allows Netflix to move around their persistence layers and automatically scale them out when needed. I found out they do run incremental backups to S3 by briefly speaking with Adrian Cockcroft at offices hours at the Oreilly Software Architecture conference. I found this to be a pretty interesting point to how Netflix runs its operations for its data layers with Cassandra showing that these cloud-native, flexible, and de-coupled applications are actually working in production and remain reliable and resilient.

Major Players

Some of the major players in the field today include: (I may have missed some)

Pivotal
Apache Mesos
Joyent (Manta)
IBM Bluemix
Cloud Foundry
Amazon Elastic Container Service
Openshift by RedHat
OpenStack Magnum
CoreOS (with Kubernetes) see (Project Tectonic)
Docker (and Assorted Tools/Binaries)
Cononical’s LXD
Tutum
Giant Swarm

There are many other open source tools at work like Docker Swarm, Consul, Registrator, Powerstrip, Socketplane.io (Now owned by Docker Inc), Docker Compose, Fleet, Weave. Flocker and many more. This is just a token to how this field of technology is booming and were going to see many fast changes in the near future. Its clear that future importance of deploying a service and not caring about the “right layers” or infrastructure will be key. Enabling data flexibility without tight couplings to the service is part of an the overall application design or the data service. These architectures can be powerful for your applications and for your data itself. Ecosystems and communities alike are clearly coming together to help and try to solve problems for these architectures, I’m sure some things are coming so keep posted.

Microservices on your laptop

One way to get some experience with these tools is to run some examples on your laptop, checkout Lattice (https://github.com/cloudfoundry-incubator/lattice/) from Cloud Foundry which allows you to run some microservice-like containerized workloads. This post is more about the high-level thinking, and I hope to have some more technical posts about some of the technologies like Lattice, Swarm, Registrator and others in the future.

Exploring Powerstrip from ClusterHQ: A Socketplane Adapter for Docker

Leave a reply

sources: http://socketplane.io, https://github.com/ClusterHQ/powerstrip, http://clusterhq.com

Over the past few months one of the areas worth exploring within the container ecosystem is how it works with external services and applications. I currently work in EMC CTO Advanced Development so naturally my interest level is more about data services, but because my background working with SDN controllers and architectures is still one of my highest interests I figured I would get to know what Powerstrip was by working with Socketplane’s Tech Release.

*Disclaimer:

This is not the official integration for powerstrip with sockeplane merged over the last week or so, I was working on this in a rat hole and it works a little differently than the one that Socketplane merged recently.

What is Powerstrip?

Powerstrip is a simple proxy for docker requests and responses to and from the docker client/daemon that allows you to plugin “adapters” that can ingest a docker request, perform an action, modification, service setup etc, and output a response that is then returned to Docker. There is a good explaination on ClusterHQ’s Github page for the project.

Powerstrip is really a prototype tool for Docker Plugins, and a more formal discussion , issues, and hopefully future implementation of Docker Plugins will come out of such efforts and streamline the development of new plugins and services for the container ecosystem.

Using a plugin or adapter architecture, one could imagine plugging storage services, networking services, metadata services, and much more. This is exactly what is happening, Weave, Flocker both had adapters, as well as Socketplane support recently.

Example Implementation in GOlang

I decided to explore using Golang, because at the time I did not see an implementation of the PowerStripProtocol in Go. What is the PowerStripProtocol?

The Powerstrip protocol is a JSON schema that Powerstrip understands so that it can hook in it’s adapters with Docker. There are a few basic objects within the schema that Powerstrip needs to understand and it varies slightly for PreHook and PostHook requests and responses.

Pre-Hook

The below scheme is what PowerStripProtocolVersion: 1 implements, and it needs to have the pre-hook Type as well as a ClientRequest.

{
    PowerstripProtocolVersion: 1,
    Type: "pre-hook",
    ClientRequest: {
        Method: "POST",
        Request: "/v1.16/container/create",
        Body: "{ ... }" or null
    }
}

Below is what your adapter should respond with, a ModifiedClientRequest

{
    PowerstripProtocolVersion: 1,
    ModifiedClientRequest: {
        Method: "POST",
        Request: "/v1.16/container/create",
        Body: "{ ... }" or null
    }
}

Post-Hook

The below scheme is what PowerStripProtocolVersion: 1 implements, and it needs to have the post-hook Type as well as a ClientRequest and a Server Response. We add ServerResponse here because post hooks are already processed by Docker, therefore they already have a response.

{
    PowerstripProtocolVersion: 1,
    Type: "post-hook",
    ClientRequest: {
        Method: "POST",
        Request: "/v1.16/containers/create",
        Body: "{ ... }"
    }
    ServerResponse: {
        ContentType: "text/plain",
        Body: "{ ... }" response string
                        or null (if it was a GET request),
        Code: 404
    }
}

Below is what your adapter should respond with, a ModifiedServerResponse

{
    PowerstripProtocolVersion: 1,
    ModifiedServerResponse: {
        ContentType: "application/json",
        Body: "{ ... }",
        Code: 200
    }
}

Golang Implementation of the PowerStripProtocol

What this looks like in Golang is the below. (I’ll try and have this open-source soon, but it’s pretty basic :] ). Notice we implement the main PowerStripProtocol in a Go struct, but the JSON tag and options likes contain an omitempty for certain fields, particularly the ServerResponse. This is because we always get a ClientRequest in pre or post hooks but now a ServerResponse.

We can implement these Go structs to create Builders, which may be Generic, or serve a certain purpose like catching pre-hook Container/Create Calls from Docker and setting up socketplane networks, this you will see later. Below are generall function heads that return an Marshaled []byte Go Struct to gorest.ResponseBuilder.Write()

Putting it all together

Powerstrip suggests that adapters be created as Docker containers themselves, so the first step was to create a Dockerfile that built an environment that could run the Go adapter.

Dockerfile Snippets

First, we need a Go environment inside the container, this can be set up like the following. We also need a couple of packages so we include the “go get” lines for these.

Next we need to enable our scipt (ADD’ed earlier in the Dockerfile) to be runnable and use it as an ENTRYPOINT. This script takes commands like run, launch, version, etc

Our Go-based socketplane adapter is laid out like the below. (Mind the certs directory, this was something extra to get it working with a firewall).

“powerstrip/” owns the protocol code, actions are Create.go and Start.go (for pre-hook create and post-hook Start, these get the ClientRequests from:

```
POST /*/containers/create
```

And

```
POST /*/containers/*/start
```

“adapter/” is the main adapter that processes the top level request and figures out whether it is a prehook or posthook and what URL it matches, it uses a switch function on Type to do this, then sends it on its way to the correct Action within “action/”

“actions” contains the Start and Create actions that process the two pre hook and post hook calls mentioned above. The create hook does most of the work, and I’ll explain a little further down in the post.

Now we can run “docker buid -t powerstrip-socketplane .” in this directory to build the image. Then we use this image to start the adapter like below. Keep in mind the script is actually using the “unattended nopowerstrip” options for socketplane, since were using our own separate one here.

docker run -d --name powerstrip-socketplane \
 --expose 80 \
 --privileged \ 
 --net=host \
 -e BOOTSTRAP=true \
 -v /var/run/:/var/run/ \
 -v /usr/bin/docker:/usr/bin/docker \
 powerstrip-socketplane launch

Once it is up an running, we can use a simple ping REST URL to test if its up: It should respond “pong” if everything is running.

$curl http://localhost/v1/ping
pong

Now we need to create our YAML file for PowerStrip and start our Powerstrip container.

If all is well, you should see a few containers running and look somthing like this

dddd151d4076        socketplane/socketplane:latest   "socketplane --iface   About an hour ago   Up About an hour                             romantic_babbage

6b7a63ce419a        clusterhq/powerstrip:v0.0.1      "twistd -noy powerst   About an hour ago   Up About an hour    0.0.0.0:2375->2375/tcp   powerstrip

d698047800b1        powerstrip-socketplane:latest    "/opt/run.sh launch"   2 hours ago         Up About an hour                             powerstrip-socketplane

The adapter will automatically spawn off a socketplane/socketplane:latest container because it installs socketplane brings up the socketplane software.

Once this is up, we need to update our DOCKER_HOST environment variable and then we are ready to go to start issuing commands to docker and our adapter will catch the requests. Few examples below.

export DOCKER_HOST=tcp://127.0.0.1:2375

Next we create some containers with a SOCKETPLANE_CIDR env vairable, the adapter will automatically catch this and process the networking information for you.

docker create --name powerstrip-test1 -e SOCKETPLANE_CIDR="10.0.6.1/24" ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"

docker create --name powerstrip-test2 -e SOCKETPLANE_CIDR="10.0.6.1/24" ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done”

Next, start the containers.

docker start powerstrip-test1

docker start powerstrip-test2

If you issue an ifconfig on either one of these containers, you will see that it owns an ovs<uuid> port that connects it to the virtual network.

sudo docker exec powerstrip-test2 ifconfig

ovs23b79cb Link encap:Ethernet  HWaddr 02:42:0a:00:06:02

          inet addr:10.0.6.2  Bcast:0.0.0.0  Mask:255.255.255.0

          inet6 addr: fe80::a433:95ff:fe8f:c8d6/64 Scope:Link

          UP BROADCAST RUNNING  MTU:1440  Metric:1

          RX packets:12 errors:0 dropped:0 overruns:0 frame:0

          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:956 (956.0 B)  TX bytes:726 (726.0 B)

We can issue a ping to test connectivity over the newly created VXLAN networks. (powerstrip-test1=10.0.6.2, and powerstrip-test2=10.0.6.3)

$sudo docker exec powerstrip-test2 ping 10.0.6.2

PING 10.0.6.2 (10.0.6.2) 56(84) bytes of data.

64 bytes from 10.0.6.2: icmp_seq=1 ttl=64 time=0.566 ms

64 bytes from 10.0.6.2: icmp_seq=2 ttl=64 time=0.058 ms

64 bytes from 10.0.6.2: icmp_seq=3 ttl=64 time=0.054 ms

So what’s really going on under the covers?

In my implementation of the powerstrip adapater, the adapter does the following things

Adapter recognizes a Pre-Hook POST /containers/create call and forwards it to PreHookContainersCreate
PreHookContainersCreate checks the client request Body foe the ENV variable SOCKETPLANE_CIDR, if it doesn’t have it it returns like a normal docker request. If it does then it will probe socketplane to see if the network exists or not, if it doesn’t it creates it.
In either case, there will be a “network-only-container” created connected to the OVS VXLAN L2 domain, it will then modify the response body in the ModifiedClientRequest so that the NetworkMode gets changed to –net:container:<new-network-only-container>.
Then upon start the network is up and the container boots likes normal with the correct network namespace connected to the socketplane network.

Here is a brief architecture to how it works.

Thanks for reading, please comment or email me with any questions.

Cheers!

Docker Virtual Networking with Socketplane.io

5 Replies

ImgCred: http://openvswitch.org/, http://socketplane.io/, https://consul.io/, https://docker.com/

Containers have no doubt been a hyped technology in 2014 and now moving into 2015. Containers have been around for a while now (See my other post on a high-level overview of the timeline) and will be a major technology to think about for the developer as well as within the datacenter moving forward.

Today I want to take the time to go over Socketplane.io’s first preview of the technology they have been working on and since announcing their company in mid-october. Socketplane.io is “driving DevOps Defined Networking by enabling distributed security, application services and orchestration for Docker and Linux containers.” and is backed by some great tech talent, Brent Salisbury, Dave Tucker, Madhu Venugopal, John M. Willis who all bring leading edge network and ops skills into one great effort which is socketplane.io. I have had the pleasure to meet up with Brent and Madhu at ONS last year and have done some work with Brent way back when I was working on Floodlight, and am very excited for the future of Socketplane.io.

What’s behind Socketplane.io and What is the current preview technology?

The current tech preview released on github allows you to get a taste of multi-host networking between Dockerhosts using Open vSwitch and Consul as core enablers by building VXLAN tunnels between hosts to connect docker containers on the same virtual(logical) network with no remote/external SDN controller needed. The flows are programmed via OVSDB into the software switch so the user experience and maintenance is smooth with the least amount of moving parts. Users will interact with a wrapper CLI called “socketplane” for docker that also controls how socketplane’s virtual networks are created, deleted and manipulated. Socketplane’s preview uses this wrapper but if your following Docker’s plugin trend then you know they hope to provide virtual network services this way in the future (keep posted on this). I’d also love to see this tech be portable to other container technologies such as LXD or Rocket in the future. Enough text, lets get into the use of Socketplane.io

Walkthrough

First lets look at the components of what we will be setting up in this post. Below you will see 2 nodes: socketplane node1 and socketplane node2, we will be setting up these using Vagrant and Virtualbox using Socketplane’s included Vagrantfile. In these two nodes, when socketplane starts up we it will install OVS and Docker and start a socketplane container that runs Consul for managing network state. (one socketplane container will be the master, I’ll show more on this later as well). Then we can create networks, create containers and play with some applications. I will cover this in detail as well as show how the hosts are connected via VXLAN and demo a sample web application across hosts.

Setup Socketplane.io’s preview.

First install Virtualbox and Vagrant (I dont cover this, but use the links), then lets Checkout the repo

Set an environment variable named SOCKETPLANE_NODES that tells the installation file how many nodes to setup on your local environment. I chose 3. Then run “vagrant up” in the source directory.

After a few or ten minutes you should be all set to test out socketplane thanks to the easy vagrant setup provided by the socketplane guys. (There are also manual install instructions on their github page if you fancy setting this on on bare-metal or something) You should see 3 nodes in virtualbox after this. Or you can run “vagrant status”

Now we can SSH into one of our socketplane nodes. Lets SSH into node1.

Now you SSHed into one of the socketplane nodes. We can issues a “sudo socketplane” command and see the available options the CLI tool gives us.

Socketplane sets up a “default” network that (for me) has a 10.1.0.0/16 subnet address and if you run “socketplane network list” you should see this network. To see how we can create virtual networks (vnets) we can issue a command pictures below “socketplane network create foo4 10.6.0.0/16”

This will create a vnet named foo4 along with a vlan for vxlan and default gateway at the .1 address. Now we can see both our default network and our “foo4” network in the list command.

If we look at our Open vSwitch configuration now using the “ovs-vsctl show” command we will also see a new port named foo4 that acts as our gateway so we can talk to the rest of the nodes on this virtual network. You should also see the vxlan endpoints that aligns with your eth1 interfaces on the sockeplane node.

Great, now we are all set up so run some containers that connect over the virtual network we just created. So on socketplane-1 issue a “sudo socketplane run -n foo4 -it ubuntu:14.10 /bin/bash”, this will start a ubuntu container on socketplane-1 and connect it to the foo4 network.

You can Ctrl-Q + Ctrl-P to exit the container and leave the tty open. If you issue a ovs-vsctl show command again you will see a ovs<uuid> port added to the docker0-ovs bridge. This connects the container to the bridge allowing it to communicate over the vnet. Lets create another container, but this time on our socketplane-2 host. So exit out and ssh into socketplane-2 and issue the same command. We should then be able to ping between our two containers on different hosts using ths same vnet.

Awsome, we can ping out first container from our second without having to setup any network information on the second host. This is because the network state it propagated across the cluster so when we reference “foo4” on any of the nodes it will use the same network information. If you Ctrl-Q + Ctrl-P while running ping, we can also see the flows that are in our switch. We just need to use appctl and reference our docker0-ovs integration bridge.

As we can see our flows indicate the VXLAN flows thatheader and forward it to the destination vxlan endpoint and pop (action:pop_vlan) the vlan off the encap in ingress to our containers.

To show a more useful example we can start a simple web application on socketplane-2 and access it over our vnet on socketplane-1 without having to use the Dockerhost IP or NAT. See blow.

First start an image named tutum/hello-world and add it to the foo4 network and expose port 80 at runtime and give it a name “Web”. Use the “web” name with the socketplane info command to get the IP Address.

Next, logout and SSH to socketplane-1 and run an image called tutm/curl (simple curl tool) and run a curl <IP-Address> and you should get back a response from the simple “Web” container we just setup.

This is great! No more accessing pages based on host addresses and NAT. Although a simple use-case, this shows some of the benefit of running a virtual network across many docker hosts.

A little extra

So i mentioned before that socketplane runs Consul in a separate container, you can see the logs of consul by issuing “sudo socketplane agent logs” on any node. But for some more fun and to poke around at some things we are going to use nsenter. First find the socketplane docker container, then follow the commands to get into the socketplane container.

Now your in the socketplane container, we can issue an ip link see that socketplane uses HOST networking to attach consul and get consul running on the host network so the Consul cluster can communicate. We can confirm this by looking at the script used to start the service container.

See line:5 of this snippet, docker runs socketplane with host networking.

You can issue this command on the socketplane-* hosts or in the socketplane container and you should receive a response back from Consul showing you that is listening on 8500.

You can issue “consul members” on the socketplane hosts to see the cluster as well.

You can also play around with consul via the python-consul library to see information stored in Consul.

Conclusion

Overall this a great upgrade to the docker ecosystem, we have seen other software products like Weave, Flannel, Flocker and others i’m probably missing address a clustered Docker setup with some type of networking overlay or communications proxy to connect multi-hosted containers. Socketplane’s preview is completely opensource and is developed on github, if your interested in walking through the code, working on bugs or possibly suggesting or adding features visit the git page. Overall I like the OVS integration a lot mainly because I am a proponent of the software switch and pushing intelligence to the edge. I’d love to see some optional DPDK integration for performance in the near future as well as more features that enable fire-walling between vnets and others. I’m sure its probably on the horizon and am eagerly looking forward to see what Socketplane.io has for containers in the future.

cheers!

Docker Remote Host Management with Openstack

Leave a reply

So I decided to participate in #DockerGlobalHackday and oh boy was it a learning experience. First off, the hackday started off with great presentations from some of the hackers and docker contributors. One that caught my eye was Host Management (https://www.youtube.com/watch?v=lZGmvGw-mWc) and (https://github.com/docker/docker/issues/8681)

Ben Firshman and contributors thought up and created this feature for Docker that lets you provision remote daemons on demand given cloud providers. It had me thinking that maybe I should hack on a driver for a Local Openstack Deployment. So I did, and this is my DockerHackDayHack.

https://github.com/wallnerryan/docker/tree/host-managment-openstack

https://github.com/bfirsh/docker/pull/13

*Note, the code is raw, very raw, I haven’t coded in Go until this Hackday 🙂 Which is what I guess it is good for.

*Note this code was developed using Devstack with Flat Network orignaly, so there is some rough edged code for supporting out of the box devstack with nova network but it probably won’t work 🙂 I’ll make an update on this soon.

*Note the working example was testing on Openstack Icehouse with Neutron Networking. Neutron has one public and one private network. The public network is where the floating ip comes from for the docker daemon.

Here are the options now for host-management:

./bundles/1.3.0-dev/binary/docker-1.3.0-dev hosts create

Notice the areas with “–openstack-” prefix, this is what was added. If your using neutron network then the network for floating ips is needed. The image can be a Ubuntu or Debian based cloud image, but must support Cloud-Init / Metadata Service. This is how the docker installation is injected.Below is an example of how to kickoff a new Docker OpenStack Daemon: (beware the command is quite long with openstack options, replace X.X.X.X with your keystone endpoint, as well as UUIDs of any openstack resources.) It also includes –openstack-nameserver, this is not required but in my case I was, and will inject a nameserver line into the resolve.conf of the image using Cloud-Init / Metadata Service

In the future I plan on making this so we don’t need as many UUIDs. but rather the driver will take text as input and search for the relevant UUIDs to use. ( limited time to hack on this )

#./bundles/1.3.0-dev/binary/docker-1.3.0-dev hosts create -d openstack

–openstack-image-id=”d4f62660-3f03-45b7-9f63-165814fea55e” \

–openstack-auth-endpoint=”http://X.X.X.X:5000/v2.0” \

–openstack-floating-net=”4a3beafb-2ecf-42ca-8de3-232e0d137931″ \

–openstack-username=”admin” \

–openstack-password=”danger” \

–openstack-tenant-id=”daad3fe7f60e42ea9a4e881c7343daef” \

–openstack-keypair=”keypair1″ \

–openstack-region-name=”regionOne” \

–openstack-net-id=”1664ddb9-8a14-48cd-9bee-a3d4f2fe16a0″ \

–openstack-flavor=”2″ \

–-openstack-nameserver=”10.254.66.23″ \

–openstack-secgroup-id=“e3eb2dc6-4e67-4421-bce2-7d97e3fda356” \

openstack-dockerhost-1

The result you will see is: (with maybe some errors if your security groups are already setup)

#[2014-11-03T08:31:51.524125904-08:00] [info] Creating server.

#[2014-11-03T08:32:16.698970304-08:00] [info] Server created successfully.

#%!(EXTRA string=63323227-1c1e-40f6-9c25-78196010936b)[2014-11-03T08:32:17.888292214-08:00] [info] Created Floating Ip

#[2014-11-03T08:32:18.439105902-08:00] [info] “openstack-dockerhost-1” has been created and is now the active host. Docker commands #will now run against that host

If you specified a ubuntu image (I just downloaded the Ubuntu Cloud Img from here)The Docker Daemon will get deployed in a Ubuntu Server in openstack and the Daemon will start on port 2375

The Instance will look somthing like this in the dashboard of openstack.

Here is the floating ip association with the Daemon Host.

And it’s fully functional, see some other Docker Hosts commands below. Here is a video of the deployment (https://www.youtube.com/watch?v=aBG3uL8g124)

(View Hosts)

./bundles/1.3.0-dev/binary/docker-1.3.0-dev hosts

You can make either the local unix socket or the openstack node the active Daemon and you can use it like any other docker client. This “hosts” command can run locally on your laptop but your containers and daemon run in OpenStack. One could see this feature replacing something like Boot2Docker.

(docker ps) – Shows containers running in your openstack deployed docker daemon

./bundles/1.3.0-dev/binary/docker-1.3.0-dev ps -a

(View Remote Info)

./bundles/1.3.0-dev/binary/docker-1.3.0-dev info

Thanks for the Docker community for putting these events together! Pretty cool! Happy Monday and Happy Dockering.

P.S. I also used a Packer/VirtualBox Setup for DevStack in the begining. Here is the Packer Config and the Preseed.cfg. Just download devstack and run it on there.

{
 "variables": {
 "ssh_name": "yourname",
 "ssh_pass": "password",
 "hostname": "packer-ubuntu-1204"
 },

 "builders": [{
 "type": "virtualbox-iso",
 "guest_os_type": "Ubuntu_64",

 "vboxmanage": [
 ["modifyvm", "{{.Name}}", "--vram", "32"],
 ["modifyvm", "{{.Name}}", "--memory", "2048"],
 ["modifyvm", "{{.Name}}","--natpf1", "web,tcp,,8080,,80"],
 ["modifyvm", "{{.Name}}","--natpf1", "fivethousand,tcp,,5000,,5000"],
 ["modifyvm", "{{.Name}}","--natpf1", "ninesixninesix,tcp,,9696,,9696"],
 ["modifyvm", "{{.Name}}","--natpf1", "eightsevensevenfour,tcp,,8774,,8774"],
 ["modifyvm", "{{.Name}}","--natpf1", "threefivethreefiveseven,tcp,,35357,,35357"]
 ],

 "disk_size" : 10000,

 "iso_url": "http://releases.ubuntu.com/precise/ubuntu-12.04.4-server-amd64.iso",
 "iso_checksum": "e83adb9af4ec0a039e6a5c6e145a34de",
 "iso_checksum_type": "md5",

 "http_directory" : "ubuntu_64",
 "http_port_min" : 9001,
 "http_port_max" : 9001,

 "ssh_username": "{{user `ssh_name`}}",
 "ssh_password": "{{user `ssh_pass`}}",
 "ssh_wait_timeout": "20m",

 "shutdown_command": "echo {{user `ssh_pass`}} | sudo -S shutdown -P now",

 "boot_command" : [
 "<esc><esc><enter><wait>",
 "/install/vmlinuz noapic ",
 "preseed/url=http://{{ .HTTPIP }}:{{ .HTTPPort }}/preseed.cfg ",
 "debian-installer=en_US auto locale=en_US kbd-chooser/method=us ",
 "hostname={{user `hostname`}} ",
 "fb=false debconf/frontend=noninteractive ",
 "keyboard-configuration/modelcode=SKIP keyboard-configuration/layout=USA ",
 "keyboard-configuration/variant=USA console-setup/ask_detect=false ",
 "initrd=/install/initrd.gz -- <enter>"
 ]
 }]
}


(Preseed.cfg Starts HERE)
# Some inspiration:
# * https://github.com/chrisroberts/vagrant-boxes/blob/master/definitions/precise-64/preseed.cfg
# * https://github.com/cal/vagrant-ubuntu-precise-64/blob/master/preseed.cfg

# English plx
d-i debian-installer/language string en
d-i debian-installer/locale string en_US.UTF-8
d-i localechooser/preferred-locale string en_US.UTF-8
d-i localechooser/supported-locales en_US.UTF-8

# Including keyboards
d-i console-setup/ask_detect boolean false
d-i keyboard-configuration/layout select USA
d-i keyboard-configuration/variant select USA
d-i keyboard-configuration/modelcode string pc105


# Just roll with it
d-i netcfg/get_hostname string this-host
d-i netcfg/get_domain string this-host

d-i time/zone string UTC
d-i clock-setup/utc-auto boolean true
d-i clock-setup/utc boolean true


# Choices: Dialog, Readline, Gnome, Kde, Editor, Noninteractive
d-i debconf debconf/frontend select Noninteractive

d-i pkgsel/install-language-support boolean false
tasksel tasksel/first multiselect standard, ubuntu-server


# Stuck between a rock and a HDD place
d-i partman-auto/method string lvm
d-i partman-lvm/confirm boolean true
d-i partman-lvm/device_remove_lvm boolean true
d-i partman-auto/choose_recipe select atomic

d-i partman/confirm_write_new_label boolean true
d-i partman/confirm_nooverwrite boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true

# Write the changes to disks and configure LVM?
d-i partman-lvm/confirm boolean true
d-i partman-lvm/confirm_nooverwrite boolean true
d-i partman-auto-lvm/guided_size string max

# No proxy, plx
d-i mirror/http/proxy string

# Default user, change
d-i passwd/user-fullname string yourname
d-i passwd/username string yourname
d-i passwd/user-password password password
d-i passwd/user-password-again password password
d-i user-setup/encrypt-home boolean false
d-i user-setup/allow-password-weak boolean true

# No language support packages.
d-i pkgsel/install-language-support boolean false

# Individual additional packages to install
d-i pkgsel/include string build-essential ssh

#For the update
d-i pkgsel/update-policy select none

# Whether to upgrade packages after debootstrap.
# Allowed values: none, safe-upgrade, full-upgrade
d-i pkgsel/upgrade select safe-upgrade

# Go grub, go!
d-i grub-installer/only_debian boolean true

d-i finish-install/reboot_in_progress note