DevOps (Part 4): Configuration Management for Big Data Projects

Configuration Management Tools

Popular configuration management tools include Ansible, CFEngine, Chef, Puppet, RANCID, SaltStack, and Ubuntu Juju.

Key Considerations

  • A DevOps engineer should have an idea of how Big Data projects are implemented and of the underlying technology platforms
  • A decision on the right CM tool will have to be made depending on the project requirements
  • A DevOps engineer should have some experience working with the chosen CM tool

Eg: Chef

What is Chef?

Chef is an open source configuration management and infrastructure automation platform. It gives you a way to automate your infrastructure and processes. It helps in managing your IT infrastructure and applications as code. Since your infrastructure is managed with code, it can be automated, tested and reproduced with ease.

More about Chef: https://docs.getchef.com/chef_overview.html

Chef Architecture in a Nutshell

Chef typically runs in a client-server mode. The Chef server can be of two types: Hosted Chef, which is a SaaS offering, and Private Chef, which is an organization-specific Chef server. Private Chef can be either open source or licensed.

The chef-client runs on the VM or machine that you want to manage/automate; this machine is called the chef node. Chef is based on a "pull" mechanism, where the chef node requests any updates from the Chef server.

Chef can also run in a standalone mode using chef-solo or chef-zero (local mode). This mode is typically used for development and testing.

Configuration management is done using Chef cookbooks. Cookbooks contain recipes, which are added to a chef node's run list. These recipes define the behavior of the node, eg: which node will run the Apache web server, which will host a DB server, and so on.
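For illustration, a minimal recipe that turns a node into an Apache web server could look like the sketch below. This is only a sketch: it assumes a RHEL-style platform where both the package and the service are named httpd.

# recipes/default.rb -- install and start the Apache web server
package 'httpd'

service 'httpd' do
  action [:enable, :start]
end

# drop a simple landing page so the node has something to serve
file '/var/www/html/index.html' do
  content '<h1>Hello from Chef</h1>'
  mode '0644'
end

Adding this recipe to a node's run list makes chef-client converge the node to this state on its next run.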

Chef supports various other configuration constructs: roles, environments, data bags, etc.
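A role, for example, is just another small piece of Ruby (or JSON). The sketch below is hypothetical; the apache2 cookbook and its listen_ports attribute are used purely as an illustration.

# roles/webserver.rb -- group the web-tier recipes and settings in one place
name 'webserver'
description 'Nodes that run the Apache web server'
run_list 'recipe[apache2]'
default_attributes 'apache' => { 'listen_ports' => %w(80 443) }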

More can be learnt at: https://www.getchef.com/chef/

Using Chef to Deploy Hadoop, Hive, Pig, HBase

A Chef cookbook is available that can install and configure Hadoop, HBase, Hive, Pig, and other Hadoop ecosystem components.

The cookbook available at https://supermarket.getchef.com/cookbooks/hadoop/versions/1.0.4 would have to be configured as per project requirements. Most likely a few changes would have to be made to the cookbook so that it fits the existing project design; a common pattern is the thin wrapper cookbook sketched below.
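The wrapper overrides the community cookbook's attributes and then includes its recipes. The recipe and attribute names here are assumptions based on the cookbook's documentation and should be checked against the version you pin; the hostname is a placeholder.

# recipes/default.rb of a wrapper cookbook, e.g. 'mycompany_hadoop'
# Override community-cookbook attributes before including its recipes.
node.default['hadoop']['distribution'] = 'hdp'  # assumed attribute name
node.default['hadoop']['core_site']['fs.defaultFS'] = 'hdfs://namenode.example.com:8020'  # assumed attribute name

include_recipe 'hadoop::default'
include_recipe 'hadoop::hive'  # assumed recipe name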

Using Chef to setup Azkaban Job Scheduler

A couple of cookbooks are available with which Azkaban can be set up and configured.

https://github.com/yieldbot/chef-azkaban2

https://github.com/RiotGames/azkaban-cookbook

These cookbooks can be extended to fit the project’s requirements.
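As with the Hadoop cookbook, a thin wrapper keeps project-specific settings out of the community code. The sketch below assumes the upstream cookbook is named 'azkaban' and exposes attributes under node['azkaban']; both are assumptions to verify against whichever of the two cookbooks you pick.

# metadata.rb of the wrapper cookbook
name 'mycompany_azkaban'
version '0.1.0'
depends 'azkaban'

# recipes/default.rb of the wrapper cookbook
node.default['azkaban']['version'] = '2.5.0'  # assumed attribute name
include_recipe 'azkaban::default'             # assumed recipe name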

More reads

Chef – https://www.getchef.com/chef/

Puppet – http://puppetlabs.com/

Orchestrating HBase cluster deployment using Chef –  http://www.slideshare.net/rberger/orchestrating-hbase-cluster-deployment-with-ironfan-and-chef

Talk by John Martin about building and managing Hadoop cluster with Chef – https://www.getchef.com/blog/chefconf-talks/building-and-managing-hadoop-with-chef-john-martin/

DevOps (Part 3) – Continuous Delivery

Continuous Delivery includes automated testing, CI, and continuous deployment, resulting in the ability to rapidly, reliably, and repeatedly push out enhancements and bug fixes to customers at low risk and with minimal manual overhead.

Continuous Deployment Vs Continuous Delivery

I read a tweet once upon a time which sums up the difference – “Continuous Delivery doesn’t mean every change is deployed to production ASAP. It means every change is proven to be deployable at any time”.


Deployment

Whether triggered automatically or with a manual "click", the deployment itself can be automated. The deployment process varies largely depending on the project. This brings us to what is called Configuration Management.

DevOps (Part 2): Continuous Integration

Code Review, Build and Test can be automated to achieve Continuous Integration.

Code Review Tools

Every project's repository is usually managed by some versioning tool. A choice of versioning tool has to be made; we can assume Git for now since it is the most popular. When a developer pushes a change, a build is triggered. If the build is successful, a test job is triggered. Only after the tests pass should the commit be merged into the central repository. Typically the developer's commit would also need manual review.

As a design principle for DevOps, developers should not have direct push/write access to the central repo. Instead, a developer commits to an "authoritative repository", from which reviewed changes are merged.

Key Considerations

  • A decision from the DevOps perspective must be made on the right versioning tool, the handling of users' commits to an authoritative repo, and a code review tool. This will depend on the project's requirements. Popular tools to be evaluated are Git, Gerrit, and TeamCity.
  • A DevOps engineer will have to set up and configure the tools. Eg – Git-Gerrit integration needs to be installed, set up, and configured

Eg: Gerrit

Gerrit is a web-based code review tool built on top of the Git version control system. It is intended to provide a lightweight framework for reviewing every commit before it is accepted into the code base. Changes are uploaded to Gerrit but don't actually become a part of the project until they've been reviewed and accepted. In many ways this is simply tooling to support the standard open source process of submitting patches, which are then reviewed by the project members before being applied to the code base. However, Gerrit goes a step further, making it simple for all committers on a project to ensure that changes are checked over before they're actually applied.

Gerrit can be integrated with several build automation tools like Jenkins. It can also be integrated with issue tracking systems like RedMine. Eg: when a user commits a change for bug #123 in RedMine, the corresponding bug in RedMine gets updated.

More Reads

What is Gerrit? https://review.openstack.org/Documentation/intro-quick.html

Git-Gerrit configuration: http://www.vogella.com/tutorials/Gerrit/article.html

Implementing Gitflow with TeamForge and Gerrit – http://blogs.collab.net/teamforge/implementing-gitflow-with-teamforge-and-gerrit-part-i-configuration-and-usage

Build Automation

Generically, Build Automation refers to scripting the automation of tasks like compiling, packaging, running automated tests, deploying to production, and creating documentation. This section, however, talks about simply building your code.

Key Considerations

  • Most projects already have build tools in place: Ant, Maven, Gradle.
  • There might be a need for distributed builds. A build automation tool must be able to manage the build dependencies in order to perform distributed builds.
  • A DevOps engineer may have to write configuration scripts to build artifacts

Eg: Gradle

Gradle can be integrated with GitHub: GitHub recognizes Gradle build scripts and provides nice syntax highlighting.

Gradle can be integrated with any CI server. There is a good Jenkins plugin for Gradle, and it can also be integrated with TeamCity, an extensible build server. So essentially what we achieve is: a user's commit triggers a job in Jenkins, which uses Gradle to build the repository.

Gradle can be integrated with repository managers like Nexus. If Gradle builds an artifact successfully, the artifact has to be transferred to some remote location, artifacts of older builds need to be retained, common binaries need to be shared across different environments, and access to the artifacts must be secured. This is the role of a Repository Manager.

More Reads

Integrating Gradle with Jenkins – https://wiki.jenkins-ci.org/display/JENKINS/Gradle+Plugin

Integrating Gradle with TeamCity – http://confluence.jetbrains.com/display/TCD8/Gradle

What is Nexus? http://www.sonatype.com/nexus/why-nexus/why-use-a-repo-manager

What is a Repository Manager – http://blog.sonatype.com/2009/04/what-is-a-repository/#.VFY5s_mUd8E

Test Automation

When a user commits code and it is successfully built and deployed to a test environment, the actual test jobs need to be started in that environment. The test jobs include unit tests as well as integration tests. The testing will most likely involve creating test VMs and cleaning them up after every test run. The test results have to be relayed back to the developers and other stakeholders.

Key Considerations

  • From a DevOps perspective we don't have a single "test automation tool". What we have is an automation framework, of which test automation is one part. Hence this is one of the most important aspects when deciding on a DevOps automation tool.
  • There are several CI servers, the most popular being Jenkins. Travis and BuildHive are hosted services offering some additional options. The choice of a CI server will have to be made depending on several factors.
  • The frequency of commits needs to be estimated. Will you run tests after each commit?
  • Some tests would run nightly
  • A DevOps engineer will have to write configuration scripts which trigger test jobs, create VMs, report feedback, etc.

Eg: CI Server – Jenkins

Jenkins can be configured to trigger jobs that run tests. It can spawn the VMs and clusters where the tests run. Depending on the tests and data volumes, you may have to consider using open source Jenkins or the Enterprise version.

Jenkins can be integrated with Git/Gerrit. So every push can trigger a build & test job.

Jenkins can be integrated with Code Analysis tools like Checkmarx.

Jenkins can be integrated with Repository Managers like Nexus.

Jenkins can be integrated with Issue Tracking tools like RedMine.

More Reads

DevOps and Test Automation – http://www.scriptrock.com/blog/devops-test-automation

Case Study: Deutsche Telekom with Jenkins & Chef – http://www.ravellosystems.com/customer-case-studies/deutsche-telekom

Git, Gerrit Review, Jenkins Setup – http://www.infoq.com/articles/Gerrit-jenkins-hudson

Gerrit Jenkins Git – https://wiki.openstack.org/wiki/GerritJenkinsGit

Checkmarx – https://www.checkmarx.com/glossary/code-analysis-tools-boosting-application-security/

DevOps (Part 1): Introduction to DevOps

What Is DevOps?

The answer to this question can be given philosophically and technically. It is important to know the philosophy behind DevOps, but you will find that explanation on plenty of sites, so I will skip this part.

Today almost everything is getting "automated". Repetitive tasks are being replaced with machines and code, and methods are being devised to minimise defects, bugs, and human error in any system. The issues in the software development cycle are being addressed as well; one major issue in the SDLC was the process by which developed code moved to production. DevOps addresses these issues. DevOps is an umbrella concept that covers anything needed to smooth out the process from development to deployment into production.

DevOps Objectives

  • Introduce Code Review System
  • Automate Build
  • Automate Testing
  • Automate Deployment
  • Automate Monitoring
  • Automate Issue Tracking
  • Automate Feedbacks

These objectives can be achieved by setting up a Continuous Integration pipeline and a Continuous Deployment/Delivery process. Post delivery, a process for Continuous Monitoring is set up.

Points to be considered while setting up a CI/CD pipeline

  • Developers push a lot of code (many commits)
  • With every commit a build would be triggered
  • Automated tests should be run in a production-clone environment
  • Builds should be fast
  • Generate feedback (to developers, testers, admins, the open source community, etc.)
  • Publish the latest distributable
  • The distributable should be deployable on various environments
  • The deployment process should be reliable and secure
  • One-click deploys

A Simplified CI/CD Pipeline

[Diagram: a simplified CI/CD pipeline]

Cloud Attribute when using Chef’s knife-ssh in VPC mode

Error: FATAL: 1 node found, but does not have the required attribute to establish the connection. Try setting another attribute to open the connection using --attribute.

You need to set --attribute to the *name* of the cloud attribute. In the AWS case, most likely you would be looking for "ipaddress".

So your knife ssh command would look like this:

knife ssh <name> -x <user> "sudo chef-client" -a ipaddress
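If you do not want to pass -a on every invocation, the attribute can also be set in your knife.rb; a minimal sketch, assuming your knife version honours the ssh_attribute configuration option:

# knife.rb -- make knife ssh use the node's 'ipaddress' attribute by default
knife[:ssh_attribute] = 'ipaddress'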

Facebook

15 Jan 2014

This morning, while I rushed to Faasos' to pick up a quick wrap, I overheard two young girls discussing FB updates over their breakfast. On the way to office, I saw so many people buried in their mobiles, surfing FB. I got to office and saw a colleague busy editing a picture that he said he wanted to post as his profile pic on FB. I realized, yet again, that FB has touched each of our lives and has become an important engagement activity!

Facebook has over 900 million active users. Now that's quite huge: if Facebook were a country, it would be the third largest in the world, after China and India! A quick glance at the FB stock shows that it is trading today at $57.74, which is 3.27% up from yesterday.

That’s great news. But ever wondered how FB is making money?
A quick Google search later, I learnt that FB's major revenue is from its ads! The ads that show up on the site make money for FB! FB's 2013 Q3 revenue increased by 60% on strong ad sales.
A smaller part of FB's revenue comes from other payments, such as games (like those made by Zynga), from which FB takes a 30% share, as well as the FB gift shop and FB credits.

But the internet world is gradually shifting from PC to mobile. More than 50% of Facebook users access FB through their mobiles. If you have noticed, over the last couple of months some sponsored ads have started appearing in the mobile app. These are far fewer than the ads you see on the web page, but FB is gradually coming up with different monetization strategies for mobile users.

The money-making model that Facebook adopts is called the "Advertising Based Revenue Model". There are two types of advertisements: direct and contextual. Direct ads provide more revenue – there are fixed places for these ads (like hoardings you see on the road, or the "sponsored ads" that FB shows in the form of "suggested pages"). Contextual ads are those that come up depending on the user's profile: the user's search history, the pages he likes, the posts he reads, his friend connections – all this is analyzed programmatically, and ads relevant to the user come up for him. A third party is involved in contextual ads, which makes the profit margins on this type of ad lower.

Another giant that gets its revenue from ads is Google. Wordstream, an internet marketing software site, found that Google ads are 10 times more likely to be clicked by users than FB ads. This means FB still lags behind Google in generating ad revenue. Advertisers prefer selling their ads on a site that promises better sales for them. Hence, companies based on an ad-based revenue model need to keep coming up with innovative strategies to stay competitive in the market. Will Facebook come up with better ad-marketing strategies to overtake Google, or will it fall behind? Time will tell 🙂

NetApp Clustered Data ONTAP

NetApp Inc. is an American company in the field of computer storage and data management. It is a Fortune 500 company.

NetApp was the first to reach the market with enterprise-ready, unified scale-out storage in the form of NetApp Clustered Data ONTAP. This essentially lets you take NetApp storage arrays and cluster them together to form a virtualized shared storage pool.

Scale-Up Vs Scale-Out Storage

All storage controllers have a physical limit to which they can be expanded, eg: number of CPUs, memory slots, and space for disk shelves. Once the limit has been reached and the controller is completely populated, the only option is to acquire one or more additional controllers. This is called "scale-up": you either replace your existing controller with one based on newer technology or you run multiple controllers side by side. Scaling up increases the operational burden as the environment grows. Eg – when you replace a controller with a new one, data migration is needed, which is time consuming, often disruptive, and complex. If you use two or more controllers, you need tools for load balancing them.

With "scale-out" you can seamlessly add controllers to a resource pool that resides on virtualized shared storage. NetApp provides this storage virtualization, and using scale-out you can move host and client connections and data stores non-disruptively anywhere in the resource pool. All of this can be achieved while your environment is online and serving data, without any downtime!

Clustered Data ONTAP Architecture

[Diagram: Clustered Data ONTAP architecture]

The basic building block is a high-availability (HA) pair, which consists of two identical nodes. The nodes are connected using redundant cabled network paths. When one node goes down, the other takes over its storage and maintains access to its data. When the downed system rejoins the cluster, the partner node gives back the storage resources. The minimum cluster size starts with two nodes in an HA pair. Through non-disruptive technology refreshes, the cluster can grow to the largest supported cluster size and to more powerful hardware.

VServer / Virtual Array

A vserver is the virtual storage array that resides on top of the available hardware. It maps to the IP addresses, hostnames, and/or fibre channel addresses that clients attach to. Mapping a LUN is done through a vserver, and a file share or export is mapped or mounted via the vserver.

LIFs

These are logical interfaces or virtual adapters that can share the same physical interfaces or be placed on dedicated interfaces.

Cluster Virtualization

A cluster is composed of physical hardware: storage controllers with attached disk shelves, network interface cards, and, optionally, Flash Cache cards. Together these components create a physical resource pool that is virtualized as logical cluster resources to provide data access. Abstracting and virtualizing physical assets into logical resources provides the flexibility and potential for multi-tenancy in clustered Data ONTAP, as well as the object mobility capabilities that are at the heart of nondisruptive operations.

Storage Efficiency and Data Protection

Storage efficiency built into clustered Data ONTAP offers substantial space savings, allowing more data to be stored at lower cost. Data protection provides replication services, so valuable data is backed up and recoverable.

Thin Provisioning

Volumes are created using virtual sizing; the storage is not all pre-allocated upfront. Space remains unused until data is written to the volume, and only as much space is consumed as the data actually needs. The unused storage is shared across all volumes.

Deduplication

Deduplication removes duplicate data blocks in primary and secondary storage, storing only unique blocks. Deduplication runs can be customized.

Compression

Compression is achieved using an algorithm that replaces repeating patterns within a subset of a file.

Knife-cloud Gem: Introduction & Knife Plugin Development Using It

Reposted from – Clogeny, An Msys Company

Chef Software, Inc. has released the knife-cloud gem. This article talks about what the knife-cloud gem is and how you can use it to develop your custom knife cloud plugin.

Knife is a CLI tool used for communication between the local chef-repo and the Chef Server. There are a number of knife subcommands supported by Chef, e.g., knife bootstrap, knife cookbook, knife node, knife client, knife ssh, etc. A knife plugin is an extension of the knife commands that supports additional functionality. There are about 11 knife plugins managed by Chef and a lot more managed by the community.

The concept of knife-cloud came up because we have a growing number of cloud vendors, and therefore a growing number of knife plugins to support cloud-specific operations. The knife-cloud plugins use cloud-specific APIs to provision a VM and bootstrap it with Chef. These plugins perform a number of common tasks, such as connecting to the node using SSH or WinRM and bootstrapping the node with Chef. The knife-cloud gem has been designed to integrate the common tasks of all knife cloud plugins. As a developer of a knife cloud plugin, you will not have to worry about writing this generic code in your plugin. More importantly, if there is any bug or change in the generic code, the fix can be made in knife-cloud itself; today such changes have to be applied across every knife plugin that exists.

Knife-cloud is open source available at: https://github.com/opscode/knife-cloud.
You may refer to https://github.com/opscode/knife-cloud#writing-your-custom-plugin for the steps to write your custom knife cloud plugin.

Clogeny Technologies has written a knife-cloud scaffolder (https://github.com/ClogenyTechnologies/knife-cloud-scaffolder) to make your job even simpler. The scaffolder generates the stub code for you with appropriate TODO comments to guide you in writing your cloud specific code.

To use the knife-cloud-scaffolder:
– git clone https://github.com/ClogenyTechnologies/knife-cloud-scaffolder
– Update properties.json
– Run: ruby knifecloudgen.rb <plugin path> <properties.json path>, e.g., ruby knifecloudgen.rb ./knife-myplugin ./properties.json

Your knife-myplugin stub will be ready. Just add your cloud specific code to it and you should be ready to use your custom plugin.

Windows Remote Management and Kerberos Authentication

Windows Remote Management (WinRM) is used for communication between computers and secures that communication using different methods of authentication and message encryption.

The following types of authentication are supported:

Basic Authentication:

Least secure
User name & Password is used for authentication
Can be used for HTTP or HTTPS transport
Used in a domain or workgroup

Negotiate Authentication:

Also called WIA (Windows Integrated Authentication)
Negotiated, single sign-on
SPNEGO – Simple and Protected GSSAPI Negotiation Mechanism
SPNEGO determines whether to use Kerberos or NTLM
Kerberos is preferred

Digest Authentication:

Client request -> server -> authentication server (domain controller)
If the client is authenticated, the server gets a digest session key for subsequent requests from the client

Kerberos Authentication:

Mutual authentication using encrypted keys between client and server
The client account must be a domain account in the same domain as the server

SPNEGO:

An authentication mechanism used by the client or server receiving requests for data through WinRM in an Active Directory context
SPNEGO is based on a Request For Comments (RFC) protocol produced by the Internet Engineering Task Force (IETF)

Difference between Kerberos and Negotiate:

Kerberos is the default method of authentication when the client is in a domain and the remote destination string is not one of the following: localhost, 127.0.0.1, or [::1].
Negotiate is the default method when the client is in a domain but the remote destination string is one of the following: localhost, 127.0.0.1, or [::1].

CredSSP Authentication:

CredSSP authentication is intended for environments where Kerberos delegation cannot be used.
It was originally developed to support Remote Desktop Services single sign-on; however, it can also be leveraged by other technologies such as PowerShell remoting.
CredSSP provides a non-Kerberos mechanism to delegate a session's local credentials to a remote resource.

Setting up client and server for Kerberos Authentication

First steps

Create the client and server machines.
Add them to the same domain.

Configure Server

You can use the winrm command-line tool. Run the following commands:

winrm quickconfig -q
winrm set winrm/config/winrs @{MaxMemoryPerShellMB="300"}
winrm set winrm/config @{MaxTimeoutms="1800000"}
winrm set winrm/config/service @{AllowUnencrypted="true"}
winrm set winrm/config/service/auth @{Basic="false"}
netsh advfirewall firewall set rule name="Windows Remote Management (HTTP-In)" profile=public protocol=tcp localport=5985 remoteip=localsubnet new remoteip=any
winrm set winrm/config/client/auth @{Digest="false"}
winrm set winrm/config/service/auth @{Kerberos="true"}
winrm set winrm/config/client @{TrustedHosts="*"}

To ensure that Kerberos authentication is enabled on WinRM service:

winrm get winrm/config/service/auth

Connect to server from client

Using winrm –

winrm identify -r:servername.domain.com -auth:Kerberos -u:username@domain.com -p:password

Using winrs –

winrs /r:servername.domain.com /u:domain\username /p:password dir

Using Chef’s knife-windows –

knife winrm -m servername.domain.com -x username -P password -R krb5_realm dir

knife winrm -m servername.domain.com -x username -P password -t kerberos-keytab -R krb5_realm -S kerberos-service dir

Creating Nested Hash in Ruby

Single line code!

block = lambda { |hash, key| hash[key] = Hash.new(&block) }

nested_hash = Hash.new(&block)

nested_hash[:one] = "1"
nested_hash[:two][:sub1] = "2.1"
nested_hash[:two][:sub2] = "2.2"
nested_hash[:three][:sub1][:subsub1] = "3.1.1"
nested_hash[:three][:sub1][:subsub2] = "3.1.2"
nested_hash[:three][:sub2] = "3.2"

Result Hash:

{:one=>"1", :two=>{:sub1=>"2.1", :sub2=>"2.2"}, :three=>{:sub1=>{:subsub1=>"3.1.1", :subsub2=>"3.1.2"}, :sub2=>"3.2"}}
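An equivalent formulation avoids the separate lambda variable by reusing the hash's own default_proc:

# Same behaviour: every missing key gets a new hash carrying the same default_proc
nested_hash = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

nested_hash[:three][:sub1][:subsub1] = "3.1.1"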