On our Puppet module design

Most puppeteers seem to work for either a single company or only deploy one type of service to different companies. You can see this in the way they build their modules: most people create one module that contains everything needed for a service or application, since they only ever need it to be (mostly) the same. We at Kumina work for a lot of different companies and try to be as flexible as possible. This makes our module design a fair bit different from most others, in my experience. This post tries to explain some of the choices we’ve made with regard to module design.

We work at three different levels, which we gave separate names:

  • The first one is the most generic and we call it… generic.
  • The second one builds on top of a generic module and implements our best practices. We call it “Kumina Best Practices”, or kbp for short.
  • The final one is the customer specific layer, which actually implements our kbp modules with the correct variables for a specific customer.

When designing our modules, we keep several guidelines in mind:

  • Prefer include over inheritance. Inheritance is no longer needed, since we can pass parameters to classes.
  • In generic, the module’s name should be the name of the application. So if you’re creating a module for Apache, call it “gen_apache”.
  • In generic, you should only add resources that are needed for that exact application, nothing else.
  • In generic, you can only rely on functions that are part of Puppet or the gen_common module. Nothing else.
  • In kbp, you can only rely on functions in gen_common and any included class from generic or kbp.
  • In kbp, make sure the module is totally self-contained. This means that if it uses defined types or functions from other modules, it should include those modules explicitly.
  • The kbp module should set up monitoring and trending explicitly, using the kbp_icinga and kbp_munin classes (or whichever modules we use by default for those); see the sketch after this list.
  • The customer specific modules are only allowed to include kbp modules or customer specific modules from the same customer (actually, each customer has her own environment, so they cannot access specific modules from other customers).
  • The customer specific modules should be divided by actual service, where each class includes everything that’s needed for that service. So if you have multiple PHP websites, each site has a separate class and each class includes everything it needs, even if that means that both site classes will include the MySQL server, for example.
  • Anything slightly generic should be built in kbp, not in the customer specific modules.
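
To give an idea, here is a minimal sketch of a kbp module that follows these guidelines. It’s a sketch only: the kbp_icinga and kbp_munin subclass names are illustrative, not our actual code.

class kbp_apache {
  # The generic module manages the application itself
  include gen_apache

  # Monitoring and trending are wired up explicitly, so every node
  # that includes kbp_apache is watched automatically
  include kbp_icinga::apache
  include kbp_munin::apache
}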

It’s a fairly long list, but most of it feels rather logical once you’re working with it. These guidelines make sure that our entire team can quickly and easily work on each customer’s setup, where needed. Keeping the environments separate from each other also allows us to easily see the impact certain changes will make. In practice, most of the resources will be defined in the kbp layer. But the generic layer is still important, because we try to create an API-like approach for applications of the same service type. I described how to do that a while ago on the Puppet mailing list (I should probably write a blog post about that too, but not today). The main advantage is that you should be able to easily replace, for example, Apache with nginx.
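
As a small sketch of that idea (the type names are illustrative): if every generic webserver module exposes a vhost type with the same parameters, the layers above don’t need to care which implementation sits behind it.

define gen_apache::vhost($documentroot, $ensure = 'present') {
  # Apache-specific vhost resources go here
}

define gen_nginx::vhost($documentroot, $ensure = 'present') {
  # Nginx-specific resources; same interface, different implementation
}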

Using the above guidelines, setting up our webserver is simply a block like this:

node 'web.kumina.nl' inherits 'kumina_default' {
  include site::www_kumina_nl
  include site::www_twenty_five_nl
  include site::blog_kumina_nl
  include mail::incoming
}

And everything is set up as we need it. The class itself looks as follows (for example):

class site::www_kumina_nl {
  include kbp_httpd
  include site::common

  kbp_httpd::simple_site { "www.kumina.nl":
    ensure => 'present',
    documentroot => '/srv/www/www.kumina.nl/',
  }
}



class site::www_twenty_five_nl {
  include kbp_httpd
  include site::common
  include kbp_httpd::php
  include kbp_mysql

  kbp_httpd::simple_site { "www.twenty-five.nl":
    ensure => 'present',
    documentroot => '/srv/www/www.twenty-five.nl/',
  }

  kbp_mysql::db_with_user { "tf_interface":
    password_hash => 'very_secret',
  }
}



class site::blog_kumina_nl {
  include kbp_httpd
  include site::common
  include kbp_httpd::php
  include kbp_mysql

  kbp_httpd::simple_site { "blog.kumina.nl":
    ensure => 'present',
    documentroot => '/srv/www/blog.kumina.nl/',
  }

  kbp_mysql::db_with_user { "kumiblog":
    password_hash => 'very_secret',
  }
}

If we decide to ever move the blog to a separate server, we can simply do:

node 'web.kumina.nl' inherits 'kumina_default' {
  include site::www_kumina_nl
  include site::www_twenty_five_nl
  include mail::incoming
}



node 'blog.kumina.nl' inherits 'kumina_default' {
  include site::blog_kumina_nl
}

Aside from manually moving the data in the database, everything should work as expected. This allows us to easily move sites (or other applications) from one machine to another.

This way of building your modules either appeals to you because of its flexibility or seems like a horribly inefficient use of your time. We find it’s a nice way to keep some order without losing too much flexibility.

ANNOUNCEMENT: Microsoft Windows support coming soon!

Kumina is happy to announce it’s in the process of switching all its managed servers from Debian Linux to Microsoft Windows. Due to high demand, it was decided last January that this would be a good time to switch from Linux to Windows.

The last few months we’ve been working hard to migrate our best practices over to this new and exciting platform. All of our employees have followed a week-long course to learn about all the licensing issues surrounding the Microsoft platform, and by now we’re comfortable advising you on all the benefits Microsoft licensing can offer you!

Most of our best practices have been migrated by now, so we feel confident that we can provide you with the same or better service level than you’re used to from us.

We will inform our customers separately of the intended migration schedule as it affects them. We’re happy and proud to finally be part of the Microsoft world!

If you have any questions, feel free to contact us!

PS
Yes! We’re so excited!

HowTo: Reset a cryptostick

We use this cryptostick a lot and always thought that there was no way to reset it once you had entered the admin PIN incorrectly three times. Well, there is a way to reset it! Found it here and I’m describing it below for future reference.

Create a file with the following contents:

/hex
scd serialno
scd apdu 00 20 00 81 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 81 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 81 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 81 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 83 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 83 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 83 08 40 40 40 40 40 40 40 40
scd apdu 00 20 00 83 08 40 40 40 40 40 40 40 40
scd apdu 00 e6 00 00
scd apdu 00 44 00 00
/echo card has been reset to factory defaults

And make the card accept those commands by feeding the file to gpg-connect-agent (replace yourfile with whatever you called the file above):

gpg-connect-agent < yourfile

In short: the first eight APDUs deliberately fail PIN verification often enough to block both the normal PIN (the 81 lines) and the admin PIN (the 83 lines), after which the last two APDUs terminate and re-activate the card, restoring factory defaults.

That's it!

Job opening

Kumina is looking for a new full-time junior systems administrator, starting April 2011. Are you the person we’re looking for?

We’re looking for someone who…

  • … doesn’t quit when the going gets tough
  • … has an interest in system maintenance
  • … is comfortable with responsibility
  • … wants to go the extra mile if that results in higher quality of the end-product
  • … is versatile and wants to learn new things all the time
  • … can work with a team, but not necessarily in a team

Linux knowledge is not strictly required if you’re willing to learn fast. You can find more info about what we do on our website.

We’re looking for a full-time employee starting April 2011. Interested? Send your résumé and an introductory letter to jobs@kumina.nl!

Hetzner Failover IP OCF script

At Hetzner you can get very cheap servers. If your application stack can handle failovers and the like, it’s a cheap way to set up a fairly large environment. One thing that’s a bit different from most other colocation providers I know is their network setup: they route all traffic via managed switches to your machine, so every machine is in its own network. That can be a problem if you want to do cool stuff like moving an IP address on the fly.

Luckily, they provide “Failover IP” addresses, which you can allocate to a server and later switch to another server, but only via a web interface. The web interface also has an API, which makes things a bit easier. For one of our customers, we wrote an OCF script that can perform the failover, so we can use heartbeat and pacemaker over there.


Because pacemaker expects all variables to be the same on both machines, we need to use several data sources. We’ve set it up as follows:

  • An OCF script that calls a Python script for assigning the failover IP
  • The aforementioned Python script, which reads some variables from a local file (defaults to /etc/hetzner.cfg) and which actually talks to the API to switch the IP address or check if the IP address is currently assigned to this host
  • A local config file which is read by the Python script and contains the Hetzner API credentials and the local machine IP address.
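
The Python script’s exact commands aren’t reproduced here, so here is only a rough sketch of the shape of such an OCF agent (the --switch and --check flags are hypothetical; check the script’s -h output for the real ones):

#!/bin/sh
# Pacemaker passes the params from the CRM configuration as OCF_RESKEY_*
# environment variables, so "ip" and "script" arrive as shown below.
case "$1" in
  start)
    # Point the failover IP at this machine via the Hetzner API
    "$OCF_RESKEY_script" --switch "$OCF_RESKEY_ip" || exit 1
    exit 0 ;;
  monitor)
    # Exit 0 if the failover IP points at this host, 7 (not running) otherwise
    "$OCF_RESKEY_script" --check "$OCF_RESKEY_ip" && exit 0 || exit 7 ;;
  stop)
    # Nothing to tear down locally; the IP moves when another node runs start
    exit 0 ;;
  *)
    exit 3 ;; # OCF: unimplemented action
esac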

The local IP address in the configuration file is needed because we run all important stuff in VMs, while the API expects the IP address of the physical machine (“the iron”) to which you want the failover IP to point. From within a VM you usually cannot determine that address, which is why we simply set it in the configuration file. The Python script is fairly simple; you can run it with -h to see the possible commands you can give it. The config file probably requires some explanation:

[dummy]
user = #12345+RaNdM
pass = sEcReT
local_ip = 1.2.3.4

The user and pass can be generated from the Hetzner Robot interface: once you have selected the server to which the failover IP is assigned, select the Admin option and request new credentials. These are specific to that machine and all resources assigned to it, as a safety measure. The local IP is the primary IP address of the local machine. So if you want to be able to switch the failover IP address to the machine with the local IP address 2.3.4.5, that machine will have local_ip = 2.3.4.5 in its /etc/hetzner.cfg file. Are you still following this? Good!

Now, using the OCF script is simple. Add it as /usr/lib/ocf/resource.d/kumina/hetzner-failover-ip and set up your CRM configuration as follows:

primitive IP_mysql ocf:kumina:hetzner-failover-ip \
    op start interval="0" timeout="300s" \
    op monitor interval="60s" timeout="300s" \
    params ip="1.1.1.1" script="/usr/local/sbin/parse-hetzner-json.py"

The 1.1.1.1 should be replaced with your failover IP, of course, and the script parameter should point to wherever you installed the Python script. If you want to use another configuration file, you can change it to /usr/local/sbin/parse-hetzner-json.py -c /etc/myconfig.hetz or something that suits your fancy. The long timeout is needed because the Hetzner API is a slow beast. (On a related note, I think it’s possible to change the OCF script to use this as a default, but I couldn’t find it quickly.)

Do let us know if you have questions or if this helped you!


Update: Added a monitor statement to the CRM configuration, to handle scenarios where failover addresses are modified manually.

Puppet on puppetmaster, some tips

We often run a puppet client on the puppetmaster itself, which connects to the local puppetmaster. I’ve run into some problems with that in the past, so I thought it best to write down a couple of tips that have helped me out when setting this up:

  • Have a separate SSL dir for the puppetmaster and the client. The following snippet shows how to do that:
    [puppetd]
    ssldir = /var/lib/puppet/ssl

    [puppetmasterd]
    ssldir = /var/lib/puppet-server/ssl

    [puppetca]
    ssldir = /var/lib/puppet-server/ssl

    The addition to puppetca is needed because it needs to know where to sign the certificates. Of course, if you run 2.6 or higher, you need to replace puppetd with agent, puppetmasterd with master and puppetca with… ca, I think (see the example after this list).

  • Explicitly set the certname and the certdnsnames for the puppetmaster, as follows:
    [puppetmasterd]
    certname = puppet
    certdnsnames = puppet.my.domain
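
For 2.6, the same settings would look something like this (sections renamed as described above, with the same slight uncertainty about the ca name):

[agent]
ssldir = /var/lib/puppet/ssl

[master]
ssldir = /var/lib/puppet-server/ssl
certname = puppet
certdnsnames = puppet.my.domain

[ca]
ssldir = /var/lib/puppet-server/ssl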

That’s it. Hope it helps someone. You’re going to need to remove all old SSL dirs after you change this and regenerate the certificates.

[Tomcat] java.net.BindException: Cannot assign requested address:17180

More as a note to myself, since I spent 30 minutes finding this. If your Tomcat throws this error and, for the life of you, you cannot see anything bound to the port that's mentioned, check the IP address in server.xml. I just had a Tomcat with an incorrect IP address configured (an IP address from another machine, that is) and it threw this error. Very confusing.
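
For reference, the address in question lives on the Connector element in server.xml; Tomcat only binds to that address, so it has to exist on the local machine (values here are illustrative):

<!-- server.xml: the connector binds to this address, which must be local -->
<Connector port="8080" address="1.2.3.4" />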

Maximum allocated memory per process

A process in Linux can allocate memory without actually using it. This can create situations in which you have far more memory allocated than you physically have in the machine. We had one process that kept allocating memory without using it, until it ran into a barrier: our Munin graph showed the allocated memory climbing steadily, then flatlining at about 250GB.

Now I was puzzled by the 250GB limit. A 2.6 kernel should be able to allocate 1TB of memory on a machine, if it's available. So why would it run out of allocatable space at 250GB? It took me a little while, but after a tip from jtopper on IRC, I looked at the vm sysctl documentation and found the overcommit_ratio setting, which happened to be 50. And the machine just happened to have about 5GB of RAM. Well, look there: 5 times 50 is 250GB... We found the reason why the graph stops increasing at about 250GB!
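
If you want to check these knobs yourself, the sysctl values and the kernel's view of the commit limit are easy to inspect:

# Current overcommit mode and ratio
sysctl vm.overcommit_memory vm.overcommit_ratio
# CommitLimit is the ceiling, Committed_AS what is currently allocated
grep -i commit /proc/meminfo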

[Ubuntu] Right-click on Intel Macbook

Finally found out how I can right-click on my Intel Macbook in Ubuntu via the touchpad. And it's very simple, too, just double-tap the touchpad and it's a right-click. Probably needs the pommed service from the MacTel Support PPA, but there you have it. A simple Google search gave me the answer, though. No idea why I didn't just search for it before.

Puppet Tips&Tricks: Running apt-get update only when needed

A small example of how you can make apt-get update run only if a) the machine rebooted or b) something changed in /etc/apt. We use cron-apt to run an update every night, to keep the machine up-to-date, so this is really all we need. If you need to add a repository before you can install a package (say, you want to install a package from the Kumina Debian Repository), you can now do it in one puppet run, as long as you make sure your package resource depends on the apt-get update exec (an example follows the code). This is the code:

# Run apt-get update when anything beneath /etc/apt/ changes
exec { "apt-get update":
  command => "/usr/bin/apt-get update && touch /tmp/apt.update",
  # Run if the marker file is missing (e.g. /tmp was cleared by a reboot) or
  # if anything under /etc/apt changed more recently than the marker
  onlyif  => "/bin/sh -c '[ ! -f /tmp/apt.update ] || /usr/bin/find /etc/apt -cnewer /tmp/apt.update | /bin/grep . > /dev/null'",
}
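
If you then need a package from a freshly added repository, hanging the package off the exec makes everything happen in a single run (package name illustrative):

# The repository index is updated before the package is installed
package { "some-package":
  ensure  => installed,
  require => Exec["apt-get update"],
}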