11 Dec 2025

Attendees:

Petrovic, D. McDonagh, D. Waterman, E. Krissinel

Results

Developed a fitting object that reads an hkl file, bins the spots by resolution, and then fits the intensity correction by resolution (as in the original paper [Clabbers et al., 2019]). The fitter then applies the resulting scaling correction to the original data. The only problem is that the corrected datasets have lower \(R_1\) when processed with `shelxt`.
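A minimal sketch of the bin-and-scale step, for discussion only. It assumes the d-spacings are already known for each reflection and fits a single least-squares scale factor per resolution bin against a reference intensity. This functional form is an illustrative assumption and is not the correction model of the paper; the function names `fit_binned_scales` and `apply_scales` are hypothetical and are not the names used in our fitting object.

```python
from collections import defaultdict


def fit_binned_scales(d_spacings, i_obs, i_ref, n_bins=10):
    """Assign each reflection to a resolution bin and fit one scale per bin.

    Bins have equal width in 1/d^2. For each bin, the scale k minimises
    sum((k * I_obs - I_ref)^2), i.e. k = sum(I_obs * I_ref) / sum(I_obs^2).
    NOTE: a per-bin linear scale is an assumed, simplified model.
    """
    s2 = [1.0 / (d * d) for d in d_spacings]
    lo, hi = min(s2), max(s2)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all d are equal
    bins = [min(int((s - lo) / width), n_bins - 1) for s in s2]

    # Accumulate the numerator and denominator of k for every bin.
    sums = defaultdict(lambda: [0.0, 0.0])
    for b, io, ir in zip(bins, i_obs, i_ref):
        sums[b][0] += io * ir
        sums[b][1] += io * io
    scales = {b: (num / den if den > 0 else 1.0) for b, (num, den) in sums.items()}
    return bins, scales


def apply_scales(i_obs, sigmas, bins, scales):
    """Scale each intensity and its sigma by the factor of its resolution bin."""
    return [(io * scales[b], sg * scales[b]) for io, sg, b in zip(i_obs, sigmas, bins)]
```

The corrected intensity/sigma pairs would then be written back to an hkl file before rerunning `shelxt`; the file reading and writing is left out here.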


Figure panels: (1) Fitting the intensity correction by resolution. (2) Original \(F_{\rm o}\) replaced with \(F_c\). (3) Correction in `shelxt` \(R_1\) for datasets with intensities computed directly from Gemmi.

Replaced the original data for paracetamol with scaled \(|F_c|^2\) (computed using Gemmi) and compared the resulting \(R_1\) factors obtained by processing with `shelxt`. In the majority of cases, the new \(R_1\) factors are lower than the original ones.
Trained a gradient boosted regressor on a single dataset (input parameters: H, K, L, intensity, sigma, image index, resolution, global scaling parameter).
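For reference, the \(R_1\) factor compared above is conventionally defined as follows, with the sums running over the reflections used and assuming \(F_c\) has been put on the same scale as \(F_{\rm o}\). A lower value means closer agreement between observed and calculated structure factor amplitudes.

\[
R_1 = \frac{\sum_{hkl} \bigl|\, |F_{\rm o}| - |F_c| \,\bigr|}{\sum_{hkl} |F_{\rm o}|}
\]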


Figure panels: (1) Computed \(F_c\) on the training data. (2) Computed \(F_c\) on the new data.

Next steps

Find the original data for the J. P. Abrahams paper and try to reproduce their results to make sure our processing pipeline is correct.
Compute \(R_1\) per image to identify images that are highly impacted by dynamical effects.
Retrain the gradient boosted regressor (GBR) to include more information about the spot environment (e.g. maximum and minimum intensity on an image, average intensity on an image, Miller indices of the neighbouring spots). Train on more datasets.
Read about the PointNet architecture.
Test how well the GBR model reproduces the scaled training data.
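The per-image \(R_1\) and the image-level intensity features mentioned above could be computed along the following lines. This is a sketch under stated assumptions: each reflection is a dict with the hypothetical keys `image`, `f_obs`, `f_calc` and `intensity`, \(F_c\) is already on the same scale as \(F_{\rm o}\), and the flagging threshold is an arbitrary placeholder, not a recommended value.

```python
from collections import defaultdict


def per_image_stats(reflections):
    """Group reflections by image and compute R1 plus simple intensity features.

    Each reflection is a dict with (assumed) keys:
    'image', 'f_obs', 'f_calc', 'intensity'.
    """
    groups = defaultdict(list)
    for r in reflections:
        groups[r["image"]].append(r)

    stats = {}
    for image, refl in sorted(groups.items()):
        # R1 = sum(||Fo| - |Fc||) / sum(|Fo|), restricted to this image.
        num = sum(abs(abs(r["f_obs"]) - abs(r["f_calc"])) for r in refl)
        den = sum(abs(r["f_obs"]) for r in refl)
        intensities = [r["intensity"] for r in refl]
        stats[image] = {
            "n": len(refl),
            "r1": num / den if den > 0 else float("nan"),
            "i_max": max(intensities),
            "i_min": min(intensities),
            "i_mean": sum(intensities) / len(intensities),
        }
    return stats


def flag_images(stats, r1_threshold):
    """Return the image indices whose per-image R1 exceeds the threshold."""
    return [image for image, s in stats.items() if s["r1"] > r1_threshold]
```

The `i_max`, `i_min` and `i_mean` values can be attached to every spot on the image as extra GBR input features, while the flagged images are candidates for a closer look at dynamical effects.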