I’m actually on time for this update this year! Here are my goals from last year; I’ll give feedback inline:
- Continue with the Labs project. Finish setup of test/dev Labs, then begin work and make major progress on Tool Labs.
- Partial success: Test/dev Labs is going really well. At the time of this writing we have 99 projects, 174 instances, and 446 users. We have per-project nagios, ganglia, puppet, and sudo. We also have an all-in-one MediaWiki puppet configuration. We currently have one zone with 5 compute nodes, and will roughly triple its capacity in the next month. We have another zone coming up in another datacenter that will have 8 large compute nodes. Stability is still a concern, though, and we haven’t come out of closed beta yet. Also, work on Tool Labs has mostly not started. We do have a community-managed bots cluster, but we don’t have database replication or a simple way for tool authors to contribute.
- Hire a devops contractor for work on Labs.
- Success: Not only did we hire a devops contractor, we built a larger team. We now have Andrew Bogott (developer), Sara Smollett (operations), Faidon Liambotis (operations) and myself (operations).
- Build a devops community around the Wikimedia architecture.
- Success: We had roughly 700 changes pushed into the operations/puppet repository from people who are not operations team members. A number of our larger Labs projects were built by volunteers (bots, deployment-prep, nagios, for instance). Volunteers are members of most of the projects that exist in Labs.
- Finish the HTTPS project. This will hopefully be complete from the ops perspective by the end of this year.
- Partial success: HTTPS is fully enabled on all sites, for both IPv4 and IPv6. I’ve listed this as a partial success, because I’d like the default for logged-in users to be HTTPS. Also, I wanted secure.wikimedia.org to redirect properly to HTTPS by now, and haven’t found time to do so.
- On-board new employees.
- Success: We brought on a lot of new Operations Engineers last year and I helped on-board nearly all of them. That said, I wish I had written more documentation on the process as I was doing it.
- Enable OpenID as a provider and OAuth on Wikimedia (this goal still needs consensus).
- Partial failure, again. That said, I’ve been pushing very strongly for OAuth internally, and it looks like it is now a stated goal for next year! OAuth is crucial to the success of Labs, so I’m very happy this is happening.
What did I accomplish that was outside of my stated goals?
- Installed Gerrit, moved our operations repositories from SVN to Git and released our puppet repository as open source and cloneable to the world.
- Assisted the core services team with the migration from SVN to Git.
- Launched Labs (in October 2011 at the New Orleans MediaWiki hackathon).
- Wrote the OpenStackManager and OATHAuth MediaWiki extensions.
- Massively refactored the LdapAuthentication MediaWiki extension.
- Rewrote a couple of IRC bots (ircecho and adminbot).
- Wrote a new deployment system that may replace our production deployment system.
- Did the operations portion of the SOPA blackout.
- Organized the New Orleans MediaWiki hackathon.
- Organized an OpenStack meetup held at the Wikimedia Foundation offices.
- Pushed 790 changes into Gerrit.
- Made 1,100 edits to labsconsole (those edits include project creations, modification of projects, creation/deletion of instances and actual writing of documentation).
- Got the 100,000th revision in Wikimedia SVN, much to the dismay of others!
What are my goals for next year?
- Stabilize Labs.
- Add a second Labs zone in eqiad.
- Make major progress on Tool Labs.
- Add a real queue to the Wikimedia infrastructure, for jobs and other needs.
- Continue building a solid community around Labs.
- Continue to improve the HTTPS infrastructure.