SaltStack Development: Behavior of Exceptions in Modules

The SaltStack developer docs are missing information about exceptions that can be thrown and how the state system and the CLI behave when they are thrown.

Thankfully this is easy to test and is actually a pretty good development exercise. So, let’s write an execution module, a state module, and an sls file, then run them to determine the behavior.

A simple example execution module


from salt.exceptions import CommandExecutionError

def example(name):
    # Return True or False for the two known inputs; raise for anything else.
    if name == 'succeed':
        return True
    elif name == 'fail':
        return False
    else:
        raise CommandExecutionError('Example function failed due to unexpected input.')

A simple example state module


Dealing with splunkforwarder via Config Management

The splunkforwarder package is very poorly written, at least for Debian/Ubuntu. There are a number of things it does that make it difficult to use:

  1. It installs a splunk user and group, but doesn’t install them as system users/groups, so they’ll conflict with your uids/gids.
  2. It requires manual interaction the first time you start the daemon, on every single system it’s installed on.
  3. It modifies its configuration files when the daemon restarts.

The first is an honest mistake, but the last two put me into a blind rage. There’s not great documentation about how to work around this, so to avoid other opsen going into rages, here’s how you can handle this shitty package:

Concurrent and Queued SaltStack State Runs

The function "state.highstate" is running as PID 17587 and was started at 2014, Aug 29 23:21:46.540749 with jid 20140829232146540749

Ever get an error like that? Salt doesn’t allow more than a single state run to occur at a time, to ensure that multiple state runs can’t interfere with each other. This is really important, for instance, if you run highstate on a schedule, since a second highstate might be called before the previous one had finished.

What if you use states as simple function calls, for frequent actions, though? Or what if you’re using Salt for orchestration and want to run multiple salt-calls for a number of different salt state files concurrently?
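
As a rough sketch of the two options: Salt's state functions accept queue=True (wait for the running state run to finish) and state.sls accepts concurrent=True (run alongside it). The helper below only builds the salt-call command line; the helper name and the mystate sls name are made up for illustration.

```python
def salt_call_command(function, *args, queue=False, concurrent=False):
    '''
    Build a salt-call command line as a list of arguments.

    queue=True asks Salt to wait for a running state run to finish;
    concurrent=True asks Salt to run even if another state run is active.
    '''
    cmd = ['salt-call', function]
    cmd.extend(args)
    if queue:
        cmd.append('queue=True')
    if concurrent:
        cmd.append('concurrent=True')
    return cmd


# Queue a highstate behind whatever state run is currently active.
queued = salt_call_command('state.highstate', queue=True)

# Run a single (hypothetical) sls file alongside other state runs.
parallel = salt_call_command('state.sls', 'mystate', concurrent=True)
```

Either list can be handed to subprocess.run(cmd, check=True). Concurrent runs are only safe if the states involved can't step on each other, which is exactly the interference the lock exists to prevent.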

SaltStack Patterns: Grain/State

It’s occasionally necessary to do actions in configuration management that aren’t easy to define in an idempotent way. For instance, sometimes you need to do an action only the first time your configuration management runs, or you need to fetch some static information from an external source, or you want to put instances in a specific state for a temporary period of time.

In SaltStack (Salt) a common pattern for handling this is what I call the Grain/State pattern. Salt’s grains are relatively static, but it’s possible to add, update, or delete custom grains during a state run, or outside of a state run either by salt-call locally or through remote execution. Grains can be used for conditionals inside of state runs to control the state of the system dynamically.

A SaltStack Highstate Killswitch

On rare occasion it’s necessary to debug a condition on a system by making temporary changes to the running system. If you’re using config management, especially as part of your deployment process, it’s necessary to disable it so that your temporary changes won’t be reset. salt-call doesn’t natively have a mechanism for this like Puppet does (puppet agent --disable; puppet agent --enable). It’s possible to do this yourself, though.

This requires that you’re using the failhard option in your configuration, that you’re using the 2014.7 (Helium) or above release, and also assumes you have some base state that is always included and is always included first.

SaltStack AWS Orchestration and Masterless Bootstrapping

In my last post, I mentioned that we’re using SaltStack (Salt) without a master. Without a master, how are we bootstrapping our instances? How are we updating the code that’s managing the instances? For this, we’re using python virtualenvs, S3, autoscaling groups with IAM roles, cloud-init and an artifact-based deployer that stores artifacts in S3 and pulls them onto the instances. Let’s start with how we’re creating the AWS resources.

Moving away from Puppet: SaltStack or Ansible?

Over the past month at Lyft we’ve been working on porting our infrastructure code away from Puppet. We had some difficulty coming to agreement on whether we wanted to use SaltStack (Salt) or Ansible. We were already using Salt for AWS orchestration, but we were divided on whether Salt or Ansible would be better for configuration management. We decided to settle it the thorough way by implementing the port in both Salt and Ansible, comparing them over multiple criteria.

Truly ordered execution using SaltStack

SaltStack’s documentation implies that by default, since the Hydrogen (2014.1.x) release, execution of states is ordered as defined. In practice, however, this isn’t true. SaltStack supports a feature called requisites, which provides require, watch, onchanges, etc. Some requisites, like watch, are basically impossible to live without. For instance, if you want to conditionally restart a service when a configuration file changes you need watch. If you use requisites you can’t ensure the state run will execute in order.

Per-project users and groups (aka service groups)

In Wikimedia Labs, we’re using OpenStack with heavy LDAP integration. With this integration we have a concept of global groups and users. When a user registers with Labs, the user’s account immediately becomes a global user, usable in all of Labs and its related infrastructure. When the user is added to an OpenStack project it’s also added to a global group, which is usable throughout the infrastructure.

Global users and groups are really useful for handling authentication and authorization at a global level, especially when interacting with things like Gerrit and other global services. Global users can also be used as service accounts within instances, between instances or between projects. There are a number of downsides to global users though:

OpenStack wiki migration

On Feb 15th we migrated the MoinMoin powered OpenStack wiki to a new wiki powered by MediaWiki. Overall the migration went well. There was a large amount of cleanup that needed to get done, but we followed up the migration with a doc cleanup sprint. The wiki should be in a mostly good state. If you happen to find any articles that need cleanup, be bold!

So, what’s new with the wiki?

  1. All articles now have discussion pages
  2. It’s possible to make PDFs out of individual pages or to create a book (as a PDF or an actual physical book) from collections of articles