asimov 0.4.0

I’m very pleased to announce the first release of the 0.4 development and review cycle for asimov!

Asimov 0.4.0 makes it much easier to run gravitational wave analyses than previous versions did. Asimov has graduated from being the tool which we use to produce gravitational wave catalogues to a much more powerful system for managing batches of analyses on any scale, whether you want to run a single parameter estimation analysis or produce your own catalogue.

You can find the technical details of what’s changed in the release notes, which also have some discussion of how to install the new version, but in this post I want to talk a bit about the design decisions made when writing this new version, why we made them, and how I see asimov developing in the near future.

Before I go on though, I’d like to express my deep thanks to the review team and everyone who reported bugs and provided answers and solutions to problems during what has turned out to be a very long development period!

A little bit of history

Asimov started life as a collection of bash and python scripts which helped us to produce configuration files for an analysis pipeline called lalinference. We were working on GWTC-2 at the time, which was the first really large gravitational wave catalogue that the LIGO and Virgo collaborations had put together. For previous publications we’d only had a handful of events to work with, and it was still feasible to set all of the analyses up “by hand”, adding in the appropriate settings for each event in their own configuration files. When we started doing this for GWTC-2 it very quickly became clear that this would be too slow and too error-prone for that much larger project. I ended up writing a series of scripts to check the various different data sources which needed to be consulted, and partially automated the process of creating the configurations for each analysis.

I was trying to do this in a hurry, and unfortunately didn’t have time to develop any form of “nice” user interface, so I came up with a rather hacky solution: I used an issue tracker on our collaboration’s gitlab instance to hold the information about the state of each analysis. This worked surprisingly well, with the scripts interacting with gitlab via its RESTful API.

When we came to start work on the GWTC-2.1 and GWTC-3 analyses I tidied things up quite a lot and released asimov 0.3.0; the 0.3 series was what we ended up using for both of these analyses. Analysing around 100 events really stretched this code to its limits, especially the way that we were using gitlab. It also made it difficult to have multiple people running analyses, leaving me in the slightly inconvenient position of being the only person capable of running the analyses for these papers.

After the main analyses for these had wrapped up I started on the development of the version we’d use for the next observing run, O4, which is due to start its 18-month run later in 2023.

A new asimov

This new release of asimov is designed to do everything (and more!) that previous versions could do, but do it a lot better. It’s no longer necessary to set up a project on a gitlab server in order to start running analyses with asimov. Instead, this process has been rolled into a single command:

asimov init --name "My project"

which turns the working directory into an asimov project. (I recommend running this in an empty directory though, as it creates a directory structure and several auxiliary files to keep track of things.) All of the information about runs is kept locally in this project directory, so you no longer need to worry about connecting to an external server.
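
For example, a minimal sketch of starting a project in a fresh directory might look like the following. The directory name `my-project` is just a placeholder, and the check for the `asimov` executable is only there so that the snippet fails gracefully if asimov isn't installed.

```shell
# Create a fresh, empty directory for the project
# ("my-project" is a placeholder name, not something asimov requires).
mkdir -p my-project
cd my-project

# Initialise the asimov project here, if asimov is available on the PATH.
if command -v asimov >/dev/null 2>&1; then
    asimov init --name "My project"
else
    echo "asimov is not installed; skipping project initialisation" >&2
fi
```

Keeping each project in its own directory also makes it easy to archive or delete a project without touching anything else.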

Something asimov was previously bad at was the process of actually loading in information about events. The new version has lots of tools for fetching information from a variety of collaboration data sources, but if you want to analyse something we’ve already published then you can easily fetch the settings from our catalogue analyses. Asimov loads this information from a “blueprint” file in YAML format, and we’ve published these for (almost) all of the previously published events in their own repository. You can learn more about using these in the getting started guide, and I’ll put together some more blog posts soon exploring some more advanced things which we can do with these settings.

Asimov 0.4.0 takes the rather complicated process of making production-quality PSDs and analyses, and reduces it to a few command line arguments. Not only does this make it easier for us to set up new analyses on new events, but it also means that it’s much easier to repeat the entire workflow required to reproduce our results. Right now asimov can accurately repeat the main analyses, but we’ve got a little more work to do on including all of the post-processing. That will follow soon; this is one of the more complicated parts of producing a catalogue, and it’s important that we get it right.

The final new feature I want to highlight here is that asimov is now able to work with more pipelines than before. Out of the box it’s able to run the bayeswave (which is used to estimate the noise levels in data) and bilby (which is used for parameter estimation) pipelines. Soon I’ll finish adding support for a second parameter estimation code, RIFT (I expect this to drop in asimov 0.5.0). There’s also legacy support for lalinference, which is no longer actively developed, and which I can’t guarantee will always work as intended. You can also add support for almost any analysis code to asimov through its plugin ecosystem, and I’ll write a post about this soon. I’ve already added support for one of my own experimental codes, and I’ll put together some tutorials.

What’s next

There’s still a lot to do! Before O4 starts in the summer I need to get things working with our entire post-processing workflow. In O3 this was much less automated than the parameter estimation pipeline, so I had less of a head start on this, but things are slowly coming together, and support for this should be in v0.6.0 if things go according to plan (but I might need to add some releases in before then if other things come up; this close to an observing run starting there are a lot of pieces on the move!).

Looking a bit further into the future I want asimov to be more useful for things other than gravitational wave analyses. Over the next few releases you’ll see some things getting renamed to make them less gravitational wave specific to help with this. The problem of coordinating very large numbers of analyses on computing clusters is not unique to gravitational wave astronomy, and I’d like our solutions to be more generally useful.
