Multiplex filtering #362

Open
amyjaynethompson wants to merge 2 commits into main from multiplex_filtering


amyjaynethompson commented Apr 28, 2026

xia2.multiplex has a built-in filtering option which can greatly improve data reduction quality. VMXm, in particular, always manually reprocesses datasets with xia2.multiplex to turn on these filtering parameters. It would therefore be useful to include this as part of the auto-processing infrastructure.

The issue has always been that the filtering can be slow, which can impede rapid feedback. In xia2, we recently added a new command line program, xia2.multiplex_filtering, which runs the same filtering algorithms on a completed multiplex job. Splitting the work into two separate programs allows rapid feedback from multiplex while still providing a filtered mtz later. This PR attempts to provide the trigger and wrapper for such a filtering pipeline.

The cluster number is passed through from multiplex to multiplex_filtering to ensure that filtering is not triggered on clusters. Filtering of clusters is a possible future addition, but it would need slightly different triggering requirements.

As this pipeline relies on a finished multiplex directory (it needs specific files that are not of interest to users), checks are done to make sure the data is available where expected. These checks use the same delay multipliers as multiplex.
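As a rough illustration of the availability check described above, here is a minimal sketch of the idea. The function names, the default delay and the multiplier are hypothetical and are not taken from this PR or from the multiplex trigger; the real list of required files is whatever xia2.multiplex_filtering needs from the multiplex directory.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(directory: Path, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that are not yet present in `directory`."""
    return [name for name in required if not (directory / name).is_file()]


def retry_delay(attempt: int, base_delay: float = 60.0, multiplier: float = 2.0) -> float:
    """Seconds to wait before checking again; grows with each 0-based attempt.

    `base_delay` and `multiplier` are placeholder values, not the ones
    used by the multiplex trigger.
    """
    return base_delay * multiplier ** attempt
```

A trigger built this way would re-queue itself with `retry_delay(attempt)` while `missing_files(...)` is non-empty, and only start filtering once the list comes back empty.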

The sample group information is also passed through from multiplex. This is important, as there can be multiple sample groups related to a single DCID. Multiplex also passes through the actual DCIDs it used in processing. This matters because the stored list of related DCIDs can include rotation scans, grid scans or other datasets that should not be used. Given that all the relevant queries are already done in the multiplex trigger, it seemed easiest to pass the results through rather than repeat the queries.
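To make the pass-through concrete, the information handed from multiplex to multiplex_filtering could be pictured roughly as below. This is only a sketch: the class, field and function names are hypothetical, and it assumes that a cluster number of `None` marks a non-cluster multiplex job, which this PR does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FilteringRequest:
    """Values already resolved by the multiplex trigger (hypothetical layout)."""

    dcid: int                             # DCID that triggered multiplex
    used_dcids: Tuple[int, ...]           # DCIDs multiplex actually processed
    sample_group_id: int                  # sample group this job belongs to
    cluster_number: Optional[int] = None  # assumed None for non-cluster jobs
    group_size: int = 50                  # images per filtering group


def should_trigger_filtering(request: FilteringRequest) -> bool:
    """Skip cluster jobs and jobs with no datasets to filter."""
    return request.cluster_number is None and len(request.used_dcids) > 0
```

Carrying `used_dcids` and `sample_group_id` in the request means the filtering trigger does not have to repeat the database queries to work out which datasets belong together.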

The filtering itself is set to image_group mode: the images are grouped into batches, and a deltacchalf algorithm is used to check whether any of these batches correlate poorly with the rest of the data. A group size of 50 is set as the default, as this corresponds to 5 deg of rotation (assuming the standard 0.1 deg fine slicing). However, VMXm have had success using a group size of 10, so this value is specified for their beamline.
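The group size arithmetic can be checked with a short sketch. The helper names are hypothetical, and the way a trailing partial group is handled here (kept as a shorter final group) is an assumption for illustration, not a statement about how the deltacchalf implementation treats it.

```python
from typing import List, Tuple


def rotation_per_group(group_size: int, oscillation_width_deg: float = 0.1) -> float:
    """Degrees of rotation covered by one group of images."""
    return group_size * oscillation_width_deg


def image_groups(n_images: int, group_size: int = 50) -> List[Tuple[int, int]]:
    """Split images 1..n_images into consecutive (first, last) ranges.

    Ranges are 1-based and inclusive; the final range may be shorter
    than `group_size` (an assumption made for this sketch).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [
        (first, min(first + group_size - 1, n_images))
        for first in range(1, n_images + 1, group_size)
    ]
```

With 0.1 deg images, `rotation_per_group(50)` gives about 5 deg per group and `rotation_per_group(10)` about 1 deg, so the smaller VMXm setting tests much finer wedges of data at the cost of fewer reflections per group.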

The general intent is to test on VMXm first via staging, then roll it out live for VMXm only. This will provide useful stress testing prior to deployment on other beamlines. Eventually, the expectation is that this will be triggered after multiplex on all beamlines.

NOTE: this will need dials/latest to run, as that includes xia2.multiplex_filtering bug fixes which are not in the latest release.
