Concurrent and Queued SaltStack State Runs

The function "state.highstate" is running as PID 17587 and was started at 2014, Aug 29 23:21:46.540749 with jid 20140829232146540749

Ever get an error like that? Salt doesn’t allow more than a single state run to occur at a time, to ensure that multiple state runs can’t interfere with each other. This is really important, for instance, if you run highstate on a schedule, since a second highstate might be called before the previous one had finished.

What if you use states as simple function calls, for frequent actions, though? Or what if you’re using Salt for orchestration and want to run multiple salt-calls for a number of different salt state files concurrently?
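One possible answer, offered here as an assumption rather than something the excerpt above spells out: Salt's state functions accept a queue=True argument, which makes a run wait for the one in progress instead of failing with the error shown. The sketch below only builds the command lines and does not execute them. The helper name salt_call_args and the state file names are hypothetical.

```python
def salt_call_args(state_file, queue=True):
    """Build a salt-call command line for one state file.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    args = ["salt-call", "state.sls", state_file]
    if queue:
        # queue=True asks Salt to wait for a running state run to finish
        # rather than refusing to start.
        args.append("queue=True")
    return args


# Hypothetical state files used as frequent "function call" style actions.
commands = [salt_call_args(name) for name in ("deploy.web", "deploy.worker")]
for command in commands:
    print(" ".join(command))
```

Queued runs still execute one after another, so this trades the error for waiting time rather than giving true parallelism.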

SaltStack Patterns: Grain/State

It’s occasionally necessary to do actions in configuration management that aren’t easy to define in an idempotent way. For instance, sometimes you need to do an action only the first time your configuration management runs, or you need to fetch some static information from an external source, or you want to put instances in a specific state for a temporary period of time.

In SaltStack (Salt) a common pattern for handling this is what I call the Grain/State pattern. Salt’s grains are relatively static, but it’s possible to add, update, or delete custom grains during a state run, or outside of a state run either by salt-call locally or through remote execution. Grains can be used for conditionals inside of state runs to control the state of the system dynamically.
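The control flow behind this pattern can be sketched in plain Python, with a dict standing in for the minion's grains. This is only a model of the idea, not Salt code: the grain name first_boot_done and the function run_first_boot_action are hypothetical, and in a real setup the flag would be a custom grain checked from a conditional in a state file.

```python
def run_first_boot_action(grains, action):
    """Run `action` only if the marker grain is not yet set.

    `grains` is a plain dict standing in for Salt's grains (assumption).
    Returns True if the action ran, False if it was skipped.
    """
    if grains.get("first_boot_done"):
        return False
    action()
    # Record that the one-time action happened, so later runs skip it.
    grains["first_boot_done"] = True
    return True


grains = {}
calls = []
first = run_first_boot_action(grains, lambda: calls.append("bootstrap"))
second = run_first_boot_action(grains, lambda: calls.append("bootstrap"))
print(first, second, calls)
```

The action itself is not idempotent, but guarding it with a persistent flag makes the overall run safe to repeat.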

A SaltStack Highstate Killswitch

On rare occasion it’s necessary to debug a condition on a system by making temporary changes to the running system. If you’re using config management, especially as part of your deployment process, it’s necessary to disable it so that your temporary changes won’t be reset. salt-call doesn’t natively have a mechanism for this like Puppet does (puppet agent --disable; puppet agent --enable). It’s possible to do this yourself, though.

This requires the failhard option in your configuration and the 2014.7 (Helium) release or above. It also assumes you have some base state that is always included, and always included first.

SaltStack AWS Orchestration and Masterless Bootstrapping

In my last post, I mentioned that we’re using SaltStack (Salt) without a master. Without a master, how are we bootstrapping our instances? How are we updating the code that’s managing the instances? For this, we’re using python virtualenvs, S3, autoscaling groups with IAM roles, cloud-init and an artifact-based deployer that stores artifacts in S3 and pulls them onto the instances. Let’s start with how we’re creating the AWS resources.

Moving away from Puppet: SaltStack or Ansible?

Over the past month at Lyft we’ve been working on porting our infrastructure code away from Puppet. We had some difficulty coming to agreement on whether we wanted to use SaltStack (Salt) or Ansible. We were already using Salt for AWS orchestration, but we were divided on whether Salt or Ansible would be better for configuration management. We decided to settle it the thorough way by implementing the port in both Salt and Ansible, comparing them over multiple criteria.

Truly ordered execution using SaltStack

SaltStack’s documentation implies that by default, since the Hydrogen (2014.1.x) release, execution of states is ordered as defined. In practice, however, this isn’t true. SaltStack supports a feature called requisites, which includes require, watch, onchanges, and others. Some requisites, like watch, are basically impossible to live without. For instance, if you want to conditionally restart a service when a configuration file changes, you need watch. If you use requisites, you can’t ensure the state run will execute in order.

Per-project users and groups (aka service groups)

In Wikimedia Labs, we’re using OpenStack with heavy LDAP integration. With this integration we have a concept of global groups and users. When a user registers with Labs, the user’s account immediately becomes a global user, usable in all of Labs and its related infrastructure. When the user is added to an OpenStack project it’s also added to a global group, which is usable throughout the infrastructure.

Global users and groups are really useful for handling authentication and authorization at a global level, especially when interacting with things like Gerrit and other global services. Global users can also be used as service accounts within instances, between instances or between projects. There are a number of downsides to global users though:

OpenStack wiki migration

On Feb 15th we migrated the MoinMoin-powered OpenStack wiki to a new wiki powered by MediaWiki. Overall the migration went well. A large amount of cleanup needed to get done, so we followed up the migration with a doc cleanup sprint. The wiki should be in a mostly good state. If you happen to find any articles that need cleanup, be bold!

So, what’s new with the wiki?

  1. All articles now have discussion pages
  2. It’s possible to make PDFs out of individual pages or to create a book (as a PDF or an actual physical book) from collections of articles

Extending a flatdhcp network the hard way

The title may make you think there’s an easy way. No such luck. Nova has no facility for extending a flatdhcp network, and as far as I can tell Quantum also has no facility for doing so.

Extending the flatdhcp network can be kind of a pain in the ass, so here’s how I handled it:

Assumptions

  • Network before extension:
    • Network CIDR: 10.4.0.0/24
    • Broadcast: 10.4.0.255
    • Netmask: 255.255.255.0
    • Network ID: 2
  • Network after extension:
    • Network CIDR: 10.4.0.0/21
    • Broadcast: 10.4.7.255
    • Netmask: 255.255.248.0
    • Network ID: 2
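The broadcast and netmask values above can be double-checked with Python's standard ipaddress module before touching anything. This is just a sanity check on the arithmetic, not part of the procedure itself.

```python
import ipaddress

# CIDRs taken from the assumptions listed above.
before = ipaddress.ip_network("10.4.0.0/24")
after = ipaddress.ip_network("10.4.0.0/21")

for label, net in (("before", before), ("after", after)):
    print(label, net.with_prefixlen, "netmask", net.netmask,
          "broadcast", net.broadcast_address)

# The extended network must fully contain the original one, so existing
# addresses stay valid after the change.
assert before.subnet_of(after)
```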

Modify the network

First modify the network via the database:

OpenStack Foundation Board Candidacy

Voting has started for the OpenStack board and I’m one of the 39 candidates. Many of the candidates have posted answers to a set of questions asked of all candidates. You can read my responses at the candidate site. Rather than reiterating those answers, I’d like to bring up some of the specific things I’d like to do as a board member.

Fight for the users

Being an OpenStack user is difficult, currently. Unless you have an OpenStack developer on your team, it’s difficult to even run OpenStack, let alone migrate between versions. Many deployments will run into bugs in the stable version of OpenStack and getting those bugs fixed and moved into the stable branch is difficult.