Multi-crystal symmetry analysis and scaling with DIALS
Introduction
Recent additions to DIALS and xia2 have enabled multi-crystal analysis to be
performed following integration. These tools are particularly relevant
for analysis of many partial datasets, which may be the only practical way of
performing data collections for certain crystals. After integration, the
space group symmetry can be investigated by testing for the presence of symmetry
operations relating the integrated intensities of groups of reflections. The
program that performs this analysis is dials.symmetry (with algorithms
similar to those of the program Pointless).
For certain space groups (polar space groups), there is also an inherent
ambiguity in the way that the diffraction pattern can be indexed. In order to
combine multiple datasets from these space groups, one must reindex all data to
a consistent setting, which can be done with the program dials.cosym (see
Gildea and Winter for details).
Finally, the data must be scaled to correct for experimental effects such as
differences in crystal size/illuminated volume and radiation damage. This can
be done with the program dials.scale (with algorithms similar to those
of the program Aimless). After the data have been scaled, a resolution limit
can be chosen to exclude regions of the data that may be negatively affected
by radiation damage.
In this tutorial, we shall investigate a multi-crystal dataset collected on
the VMXi beamline, Diamond’s automated facility for data collection from
crystallisation experiments in-situ. The dataset consists of four repeats of
a 60-degree rotation measurement on a crystal of Proteinase K, taken at different
locations on the crystal. We shall start with the integrated reflections and
experiments files generated by running the automated processing software
xia2 with pipeline=dials.
Have a look at the Processing in Detail tutorial if you
want to know more about the different processing steps up to this point.
Note
To obtain the data for this tutorial you can run
dials.data get vmxi_proteinase_k_sweeps. If you are at Diamond
Light Source on BAG training then the data are already available.
After typing module load bagtraining you’ll be moved to a working
folder, with the data already located in the tutorial-data/ccp4/integrated_files
subdirectory. The processing in this tutorial will produce quite a few files,
so it’s recommended to make and move to a new directory:
mkdir multi_crystal
cd multi_crystal
xia2.multiplex
The easiest way to run these tools for a multi-dataset analysis is through the
program xia2.multiplex.
This runs several DIALS programs, including the programs described above, while
producing useful plots and output files.
To run xia2.multiplex, we must provide the path to the input integrated files from
dials.integrate:
xia2.multiplex experiments_0.expt experiments_1.expt experiments_2.expt experiments_3.expt reflections_0.refl reflections_1.refl reflections_2.refl reflections_3.refl
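With more than a handful of datasets, typing every file name by hand becomes tedious. The following is a minimal Python sketch, assuming the files follow the experiments_N.expt / reflections_N.refl naming used above. The helper name multiplex_command is hypothetical and not part of DIALS or xia2; it only assembles the command line shown above and prints it.

```python
import shlex


def multiplex_command(n_datasets, program="xia2.multiplex"):
    """Build the argument list for n_datasets integrated datasets.

    Assumes files are named experiments_<i>.expt and reflections_<i>.refl,
    as in this tutorial. Illustrative helper, not a DIALS/xia2 API.
    """
    expts = [f"experiments_{i}.expt" for i in range(n_datasets)]
    refls = [f"reflections_{i}.refl" for i in range(n_datasets)]
    # Same ordering as the command above: all experiments, then all reflections.
    return [program, *expts, *refls]


command = multiplex_command(4)
# Print the command as a single shell-safe string.
print(shlex.join(command))
```

In an environment where DIALS is available, the resulting list could be handed to subprocess.run(command, check=True) instead of being printed.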
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
input {
experiments = experiments_0.expt
experiments = experiments_1.expt
experiments = experiments_2.expt
experiments = experiments_3.expt
reflections = reflections_0.refl
reflections = reflections_1.refl
reflections = reflections_2.refl
reflections = reflections_3.refl
}
Selecting 4 experiments with profile-fitted reflections
Selecting 4 experiments with refined reflections
0 singletons:
Point group a b c alpha beta gamma
1 cluster
Cluster_id N_xtals Med_a Med_b Med_c Med_alpha Med_beta Med_gamma Delta(deg)
4 in P 4 2 2.
cluster_1 4 68.36 (0.01 ) 68.36 (0.01 ) 103.95(0.02 ) 90.00 (0.00) 90.00 (0.00) 90.00 (0.00)
P 4/m m m (No. 123) 68.36 68.36 103.95 90.00 90.00 90.00 0.0
Standard deviations are in brackets.
Each cluster:
Input lattice count, with integration Bravais setting space group.
Cluster median with Niggli cell parameters (std dev in brackets).
Highest possible metric symmetry and unit cell using LePage (J Appl Cryst 1982, 15:255) method, maximum delta 3deg.
Using all data sets for subsequent analysis
Laue group determined by dials.cosym: P 4 2 2
Resolution limit: 1.78 (cc_half > 0.3)
Space group determined by dials.symmetry: P 41 21 2
Overall merging statistics:
+--------------------+--------------+------------------+-------------------+
| | Overall | Low resolution | High resolution |
|--------------------+--------------+------------------+-------------------|
| Resolution (Å) | 68.37 - 1.78 | 68.42 - 4.83 | 1.81 - 1.78 |
| Observations | 216892 | 21172 | 137 |
| Unique reflections | 20799 | 1380 | 126 |
| Multiplicity | 10.4 | 15.3 | 1.1 |
| Completeness | 85.67% | 100.00% | 10.61% |
| Mean I/σ(I) | 30.3 | 57.3 | 2.4 |
| Rmerge | 0.057 | 0.047 | 0.155 |
| Rmeas | 0.059 | 0.049 | 0.212 |
| Rpim | 0.016 | 0.012 | 0.145 |
| CC½ | 0.999 | 0.999 | 0.958 |
+--------------------+--------------+------------------+-------------------+
Resolution shells:
+------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------+
| Resolution (Å) | N(obs) | N(unique) | Multiplicity | Completeness | Mean I | Mean I/σ(I) | Rmerge | Rmeas | Rpim | Ranom | CC½ | CCano |
|------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------|
| 68.42 - 4.83 | 21172 | 1380 | 15.34 | 100 | 286.4 | 57.3 | 0.047 | 0.049 | 0.012 | 0.025 | 0.999* | -0.162 |
| 4.83 - 3.84 | 20606 | 1269 | 16.24 | 100 | 420.5 | 61 | 0.045 | 0.047 | 0.011 | 0.024 | 0.999* | 0.014 |
| 3.84 - 3.35 | 20449 | 1258 | 16.26 | 100 | 318.8 | 56.5 | 0.048 | 0.05 | 0.012 | 0.026 | 0.999* | -0.171 |
| 3.35 - 3.05 | 20533 | 1226 | 16.75 | 100 | 219.6 | 51.6 | 0.054 | 0.055 | 0.013 | 0.028 | 0.999* | -0.077 |
| 3.05 - 2.83 | 20413 | 1221 | 16.72 | 100 | 151.2 | 45 | 0.06 | 0.062 | 0.015 | 0.032 | 0.999* | -0.074 |
| 2.83 - 2.66 | 20944 | 1230 | 17.03 | 100 | 125.4 | 42.5 | 0.066 | 0.068 | 0.016 | 0.036 | 0.998* | -0.168 |
| 2.66 - 2.53 | 20254 | 1200 | 16.88 | 100 | 99.2 | 36.2 | 0.075 | 0.077 | 0.019 | 0.041 | 0.998* | 0.062 |
| 2.53 - 2.42 | 16541 | 1212 | 13.65 | 100 | 88.4 | 30.9 | 0.081 | 0.084 | 0.023 | 0.048 | 0.997* | -0.046 |
| 2.42 - 2.32 | 12326 | 1208 | 10.2 | 100 | 79.2 | 26.3 | 0.083 | 0.088 | 0.027 | 0.059 | 0.996* | -0.001 |
| 2.32 - 2.24 | 10344 | 1209 | 8.56 | 100 | 75.8 | 23 | 0.085 | 0.091 | 0.031 | 0.064 | 0.994* | -0.077 |
| 2.24 - 2.17 | 8654 | 1193 | 7.25 | 99.92 | 66.9 | 19.5 | 0.09 | 0.097 | 0.036 | 0.071 | 0.992* | -0.146 |
| 2.17 - 2.11 | 6964 | 1189 | 5.86 | 99.41 | 56.9 | 15.5 | 0.101 | 0.111 | 0.044 | 0.093 | 0.991* | -0.125 |
| 2.11 - 2.06 | 5481 | 1148 | 4.77 | 96.15 | 51.6 | 13 | 0.102 | 0.114 | 0.05 | 0.107 | 0.990* | -0.01 |
| 2.06 - 2.01 | 4333 | 1108 | 3.91 | 92.72 | 44.5 | 10.7 | 0.114 | 0.13 | 0.062 | 0.13 | 0.983* | -0.019 |
| 2.01 - 1.96 | 3142 | 1028 | 3.06 | 86.46 | 38.1 | 8.4 | 0.126 | 0.15 | 0.08 | 0.17 | 0.974* | -0.031 |
| 1.96 - 1.92 | 2119 | 928 | 2.28 | 77.46 | 34.7 | 6.9 | 0.125 | 0.155 | 0.09 | 0.174 | 0.963* | -0.418 |
| 1.92 - 1.88 | 1174 | 719 | 1.63 | 60.27 | 29.2 | 4.9 | 0.142 | 0.188 | 0.122 | 0.252 | 0.947* | 0.947 |
| 1.88 - 1.84 | 819 | 566 | 1.45 | 49 | 25.7 | 4.2 | 0.162 | 0.22 | 0.148 | 0.368 | 0.937* | 0 |
| 1.84 - 1.81 | 487 | 381 | 1.28 | 31.54 | 23.3 | 3.6 | 0.196 | 0.269 | 0.184 | 0.325 | 0.903* | 0 |
| 1.81 - 1.78 | 137 | 126 | 1.09 | 10.61 | 15.8 | 2.4 | 0.155 | 0.212 | 0.145 | 9.5 | 0.958* | 0 |
+------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------+
Intensity correlation clustering summary:
========= ============== ========== ======== ============== ==============
Cluster No. datasets Datasets Height Multiplicity Completeness
========= ============== ========== ======== ============== ==============
1 2 0 3 0.0021 5.6 0.81
2 2 1 2 0.0033 5.7 0.79
3 4 0 1 2 3 0.004 10.4 0.86
========= ============== ========== ======== ============== ==============
Cos(angle) clustering summary:
========= ============== ========== ======== ============== ==============
Cluster No. datasets Datasets Height Multiplicity Completeness
========= ============== ========== ======== ============== ==============
1 2 1 2 0.00087 5.7 0.79
2 2 0 3 0.014 5.6 0.81
3 4 0 1 2 3 0.048 10.4 0.86
========= ============== ========== ======== ============== ==============
xia2.multiplex used... dials, dials.cosym, dials.scale, xia2.multiplex
Here are the appropriate citations (BIBTeX in xia2-citations.bib.)
Beilsten-Edmands, J. et al. (2020) Acta Cryst. D76, 385-399.
Gildea, R. J. and Winter, G. (2018) Acta Cryst. D74, 405-410.
Gildea, R. J. et al. (2022) Acta Cryst. D78, 752-769.
Winter, G. et al. (2018) Acta Cryst. D74, 85-97.
xia2.multiplex runs dials.cosym to analyse the Laue symmetry and reindex all datasets
consistently, scales the data with dials.scale,
calculates a resolution limit with dials.estimate_resolution and reruns
dials.scale with the determined resolution cutoff. The
final dataset is exported to an unmerged MTZ file and an HTML report
is generated. The easiest way to see the results is to open the
HTML report in your browser of choice, e.g.:
firefox xia2.multiplex.html
The report provides a summary of the merging statistics as well as several plots; please explore these for a few minutes now! This dataset results in good merging statistics. However, if you navigate to the “Analysis by batch” tab in “All data”, you will see that the fourth dataset has poorer statistics than the others. Let’s repeat the processing manually to explore the different steps and address this issue.
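As a quick sanity check when reading the merging statistics, note that the multiplicity in each column is the number of observations divided by the number of unique reflections. The sketch below recomputes it from the values printed in the "Overall merging statistics" table of the log above; the variable and function names are illustrative only.

```python
# (observations, unique reflections, reported multiplicity), copied from
# the "Overall merging statistics" table in the xia2.multiplex log.
merging_stats = {
    "Overall": (216892, 20799, 10.4),
    "Low resolution": (21172, 1380, 15.3),
    "High resolution": (137, 126, 1.1),
}


def multiplicity(n_obs, n_unique):
    """Average number of measurements per unique reflection."""
    return n_obs / n_unique


for label, (n_obs, n_unique, reported) in merging_stats.items():
    value = multiplicity(n_obs, n_unique)
    print(f"{label:16s} computed {value:6.2f}  reported {reported}")
```

A multiplicity close to 1 in the outermost shell, as here, means that most reflections in that shell were measured only once.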
Manual reprocessing
The first step is Laue/Patterson group analysis using dials.cosym:
dials.cosym experiments_0.expt experiments_1.expt experiments_2.expt experiments_3.expt reflections_0.refl reflections_1.refl reflections_2.refl reflections_3.refl
Scoring all possible sub-groups
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
| Patterson group | | Likelihood | NetZcc | Zcc+ | Zcc- | delta | Reindex operator |
|-------------------+-----+--------------+----------+--------+--------+---------+--------------------|
| P 4/m m m | *** | 1 | 9.71 | 9.71 | 0 | 0 | -a,-b,c |
| C m m m | | 0 | 0.03 | 9.72 | 9.69 | 0 | a+b,-a+b,c |
| P 4/m | | 0 | 0.01 | 9.71 | 9.7 | 0 | -a,-b,c |
| P m m m | | 0 | -0 | 9.71 | 9.71 | 0 | -a,-b,c |
| P 1 2/m 1 | | 0 | 0.04 | 9.74 | 9.7 | 0 | -a,-c,-b |
| C 1 2/m 1 | | 0 | 0.04 | 9.74 | 9.7 | 0 | a-b,a+b,c |
| P 1 2/m 1 | | 0 | -0 | 9.7 | 9.71 | 0 | -b,-a,-c |
| C 1 2/m 1 | | 0 | -0.01 | 9.7 | 9.71 | 0 | a+b,-a+b,c |
| P 1 2/m 1 | | 0 | -0.04 | 9.68 | 9.71 | 0 | -a,-b,c |
| P -1 | | 0 | -9.71 | 0 | 9.71 | 0 | -a,-b,c |
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
Best solution: P 4/m m m
Unit cell: 68.360, 68.360, 103.953, 90.000, 90.000, 90.000
Reindex operator: -a,-b,c
Laue group probability: 1.000
Laue group confidence: 1.000
Reindexing operators:
x,y,z: [0, 1, 2, 3]
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
input {
experiments = experiments_0.expt
experiments = experiments_1.expt
experiments = experiments_2.expt
experiments = experiments_3.expt
reflections = reflections_0.refl
reflections = reflections_1.refl
reflections = reflections_2.refl
reflections = reflections_3.refl
}
Using Andrews-Bernstein distance from Andrews & Bernstein J Appl Cryst 47:346 (2014)
Distances have been calculated
0 singletons:
Point group a b c alpha beta gamma
1 cluster
Cluster_id N_xtals Med_a Med_b Med_c Med_alpha Med_beta Med_gamma Delta(deg)
4 in P 4 2 2.
cluster_1 4 68.36 (0.01 ) 68.36 (0.01 ) 103.95(0.02 ) 90.00 (0.00) 90.00 (0.00) 90.00 (0.00)
P 4/m m m (No. 123) 68.36 68.36 103.95 90.00 90.00 90.00 0.0
Standard deviations are in brackets.
Each cluster:
Input lattice count, with integration Bravais setting space group.
Cluster median with Niggli cell parameters (std dev in brackets).
Highest possible metric symmetry and unit cell using LePage (J Appl Cryst 1982, 15:255) method, maximum delta 3deg.
Mapping all input cells to a common minimum cell
Filtering reflections for dataset 0
Read 76079 predicted reflections
Selected 54367 reflections integrated by profile and summation methods
Combined 1127 partial reflections with other partial reflections
Removed 20 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 1
Read 75607 predicted reflections
Selected 54845 reflections integrated by profile and summation methods
Combined 1284 partial reflections with other partial reflections
Removed 50 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 2
Read 77983 predicted reflections
Selected 54461 reflections integrated by profile and summation methods
Combined 1404 partial reflections with other partial reflections
Removed 38 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 8 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 3
Read 76468 predicted reflections
Selected 53877 reflections integrated by profile and summation methods
Combined 1062 partial reflections with other partial reflections
Removed 8 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 5 intensity.prf.value reflections with I/Sig(I) < -5
Patterson group: P 4/m m m
--------------------------------------------------------------------------------
Normalising intensities for dataset 1
ML estimate of overall B value:
13.52 A**2
ML estimate of -log of scale factor:
-3.04
--------------------------------------------------------------------------------
Normalising intensities for dataset 2
ML estimate of overall B value:
11.06 A**2
ML estimate of -log of scale factor:
-3.50
--------------------------------------------------------------------------------
Normalising intensities for dataset 3
ML estimate of overall B value:
11.38 A**2
ML estimate of -log of scale factor:
-2.96
--------------------------------------------------------------------------------
Normalising intensities for dataset 4
ML estimate of overall B value:
12.14 A**2
ML estimate of -log of scale factor:
-2.67
--------------------------------------------------------------------------------
Estimation of resolution for Laue group analysis
Removing 3 Wilson outliers with E^2 >= 16.0
Resolution estimate from <I>/<σ(I)> > 4.0 : 2.16
Resolution estimate from CC½ > 0.60: 1.80
High resolution limit set to: 1.80
Selecting 148799 reflections with d > 1.80
================================================================================
Automatic determination of number of dimensions for analysis
+--------------+--------------+
| Dimensions | Functional |
|--------------+--------------|
| 1 | 14.7478 |
| 2 | 15.8545 |
| 3 | 14.5975 |
| 4 | 14.9222 |
| 5 | 15.3241 |
| 6 | 14.8896 |
| 7 | 15.2194 |
| 8 | 14.8325 |
+--------------+--------------+
Best number of dimensions: 7
Using 7 dimensions for analysis
Principal component analysis:
Explained variance: 0.0065, 0.0052, 0.0043, 0.0042, 0.0039, 0.0036, 2.8e-05
Explained variance ratio: 0.23, 0.19, 0.15, 0.15, 0.14, 0.13, 0.001
Scoring individual symmetry elements
+--------------+--------+------+-----+-----------------+
| likelihood | Z-CC | CC | | Operator |
|--------------+--------+------+-----+-----------------|
| 0.933 | 9.69 | 0.97 | *** | 4 |(0, 0, 1) |
| 0.934 | 9.7 | 0.97 | *** | 4^-1 |(0, 0, 1) |
| 0.934 | 9.7 | 0.97 | *** | 2 |(1, 0, 0) |
| 0.932 | 9.68 | 0.97 | *** | 2 |(0, 1, 0) |
| 0.936 | 9.74 | 0.97 | *** | 2 |(0, 0, 1) |
| 0.936 | 9.74 | 0.97 | *** | 2 |(1, 1, 0) |
| 0.934 | 9.7 | 0.97 | *** | 2 |(-1, 1, 0) |
+--------------+--------+------+-----+-----------------+
Scoring all possible sub-groups
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
| Patterson group | | Likelihood | NetZcc | Zcc+ | Zcc- | delta | Reindex operator |
|-------------------+-----+--------------+----------+--------+--------+---------+--------------------|
| P 4/m m m | *** | 1 | 9.71 | 9.71 | 0 | 0 | -a,-b,c |
| C m m m | | 0 | 0.03 | 9.72 | 9.69 | 0 | a+b,-a+b,c |
| P 4/m | | 0 | 0.01 | 9.71 | 9.7 | 0 | -a,-b,c |
| P m m m | | 0 | -0 | 9.71 | 9.71 | 0 | -a,-b,c |
| P 1 2/m 1 | | 0 | 0.04 | 9.74 | 9.7 | 0 | -a,-c,-b |
| C 1 2/m 1 | | 0 | 0.04 | 9.74 | 9.7 | 0 | a-b,a+b,c |
| P 1 2/m 1 | | 0 | -0 | 9.7 | 9.71 | 0 | -b,-a,-c |
| C 1 2/m 1 | | 0 | -0.01 | 9.7 | 9.71 | 0 | a+b,-a+b,c |
| P 1 2/m 1 | | 0 | -0.04 | 9.68 | 9.71 | 0 | -a,-b,c |
| P -1 | | 0 | -9.71 | 0 | 9.71 | 0 | -a,-b,c |
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
Best solution: P 4/m m m
Unit cell: 68.360, 68.360, 103.953, 90.000, 90.000, 90.000
Reindex operator: -a,-b,c
Laue group probability: 1.000
Laue group confidence: 1.000
Reindexing operators:
x,y,z: [0, 1, 2, 3]
Writing html report to: dials.cosym.html
Writing json to: dials.cosym.json
Saving reindexed experiments to symmetrized.expt
Saving reindexed reflections to symmetrized.refl
As you can see, the \(P\,4/m\,m\,m\) Patterson group is found with the highest confidence. For the corresponding space group, the mirror symmetries are removed to give \(P\,4\,2\,2\), as the chiral nature of macromolecules means we have a restricted choice of space groups. In this example, all datasets were indexed consistently, but this is not the case in general.
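When reading the sub-group table above, it helps to know how the score columns relate. Following the convention used by Pointless, NetZcc is the difference between Zcc+ (the score for symmetry elements present in the sub-group) and Zcc- (the score for elements absent from it). The sketch below checks this against the printed values; it is only a reading aid, since dials.cosym ranks solutions by likelihood rather than by NetZcc alone, and the names used are illustrative.

```python
# (Patterson group, NetZcc, Zcc+, Zcc-) as printed by dials.cosym above.
rows = [
    ("P 4/m m m", 9.71, 9.71, 0.0),
    ("C m m m", 0.03, 9.72, 9.69),
    ("P 4/m", 0.01, 9.71, 9.7),
    ("P m m m", -0.0, 9.71, 9.71),
    ("P 1 2/m 1", 0.04, 9.74, 9.7),
    ("C 1 2/m 1", 0.04, 9.74, 9.7),
    ("P 1 2/m 1", -0.0, 9.7, 9.71),
    ("C 1 2/m 1", -0.01, 9.7, 9.71),
    ("P 1 2/m 1", -0.04, 9.68, 9.71),
    ("P -1", -9.71, 0.0, 9.71),
]

# Three printed values, each rounded to two decimals, can differ by up to 0.015.
TOLERANCE = 0.015


def net_zcc(zcc_plus, zcc_minus):
    """NetZcc as the difference of the two Z-scores."""
    return zcc_plus - zcc_minus


for group, reported, zcc_plus, zcc_minus in rows:
    computed = net_zcc(zcc_plus, zcc_minus)
    consistent = abs(computed - reported) <= TOLERANCE
    print(f"{group:10s} reported {reported:6.2f} computed {computed:6.2f} ok={consistent}")

# The group with the largest NetZcc coincides with the reported best solution.
best = max(rows, key=lambda row: net_zcc(row[2], row[3]))
print("Largest NetZcc:", best[0])
```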
Next, the data can be scaled:
dials.scale symmetrized.expt symmetrized.refl
From the merging statistics it is clear that the data quality is good out to the furthest resolution (\(CC_{1/2} > 0.3\)), which can be confirmed by a resolution analysis:
dials.estimate_resolution scaled.expt scaled.refl
Resolution cc_half: 1.78
The following parameters have been modified:
input {
experiments = scaled.expt
reflections = scaled.refl
}
DIALS 3.dev.1158-ga2f8f714a
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Read 74952 predicted reflections
Selected 54371 scaled reflections
Combined 2 partial reflections with other partial reflections
Read 74323 predicted reflections
Selected 54522 scaled reflections
Combined 3 partial reflections with other partial reflections
Read 76579 predicted reflections
Selected 54045 scaled reflections
Combined 4 partial reflections with other partial reflections
Read 75406 predicted reflections
Selected 53974 scaled reflections
Combined 2 partial reflections with other partial reflections
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Removing 9 Wilson outliers with E^2 >= 16.0
Resolution cc_half: 1.78
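To illustrate what a CC½ cutoff means in practice, the sketch below scans the per-shell CC½ values from the xia2.multiplex log earlier on this page and reports the high-resolution edge of the last shell before CC½ falls to the cutoff or below. This is a deliberate simplification and the helper name is hypothetical: dials.estimate_resolution is more sophisticated than a raw shell-by-shell scan, so treat this only as a conceptual aid.

```python
# (d_max, d_min, CC½) per resolution shell, copied from the
# "Resolution shells" table in the xia2.multiplex log.
shells = [
    (68.42, 4.83, 0.999), (4.83, 3.84, 0.999), (3.84, 3.35, 0.999),
    (3.35, 3.05, 0.999), (3.05, 2.83, 0.999), (2.83, 2.66, 0.998),
    (2.66, 2.53, 0.998), (2.53, 2.42, 0.997), (2.42, 2.32, 0.996),
    (2.32, 2.24, 0.994), (2.24, 2.17, 0.992), (2.17, 2.11, 0.991),
    (2.11, 2.06, 0.990), (2.06, 2.01, 0.983), (2.01, 1.96, 0.974),
    (1.96, 1.92, 0.963), (1.92, 1.88, 0.947), (1.88, 1.84, 0.937),
    (1.84, 1.81, 0.903), (1.81, 1.78, 0.958),
]


def naive_resolution_limit(shells, cc_half_min=0.3):
    """Return d_min of the last shell before CC½ first reaches the cutoff.

    Shells must be ordered from low to high resolution. Returns None if
    even the first shell fails the cutoff. Simplified illustration only.
    """
    limit = None
    for _d_max, d_min, cc_half in shells:
        if cc_half <= cc_half_min:
            break
        limit = d_min
    return limit


print(naive_resolution_limit(shells))
```

Because every shell in this dataset has CC½ well above 0.3, the scan runs to the edge of the data, which matches the reported limit of 1.78.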
If the estimated resolution limit were lower in resolution than the full extent of the data (that is, a larger d_min), scaling would be rerun with the new limit, for example:
dials.scale scaled.expt scaled.refl d_min=1.78
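The log from this command reports an error model of the form σ'² = a²(σ² + (bI)²), together with an "estimated I/sigma asymptotic limit". For very strong reflections the (bI)² term dominates, so σ' approaches a·b·I and I/σ' approaches 1/(a·b). The sketch below applies the formula and compares this limit with the values printed in the log; the function names are illustrative and are not part of the DIALS API.

```python
import math


def corrected_sigma(intensity, sigma, a, b):
    """Apply the basic error model: sigma'^2 = a^2 (sigma^2 + (b I)^2)."""
    return a * math.sqrt(sigma**2 + (b * intensity) ** 2)


def asymptotic_i_over_sigma(a, b):
    """Limit of I/sigma' as I grows large: 1 / (a * b)."""
    return 1.0 / (a * b)


# (a, b, reported asymptotic limit) triples copied from the scaling log.
reported_models = [
    (0.79703, 0.05782, 21.700),
    (0.76062, 0.06355, 20.689),
    (0.76108, 0.06326, 20.769),
]

for a, b, reported in reported_models:
    limit = asymptotic_i_over_sigma(a, b)
    # Small differences are expected because a and b are printed rounded.
    print(f"a={a}, b={b}: computed {limit:.3f}, reported {reported:.3f}")
```

The practical consequence is that, under this model, no reflection can have I/σ above roughly 21, however strong it is.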
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
cut_data {
d_min = 1.78
}
input {
experiments = scaled.expt
reflections = scaled.refl
}
Checking for the existence of a reflection table
containing multiple datasets
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Found 4 reflection tables & 4 experiments in total.
Dataset ids are: 0,1,2,3
Space group being used during scaling is P 4 2 2
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Scaling models have been initialised for all experiments.
================================================================================
The experiment id for this dataset is 0.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 74952 predicted reflections
Selected 57758 reflections integrated by profile or summation methods
Removed 1774 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 20037/74952 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12546
criterion: excluded for scaling, reflections: 20037
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.79703, b = 0.05782
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.700
Using previously determined optimal intensity choice: 9999.7432
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 1.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 74323 predicted reflections
Selected 58232 reflections integrated by profile or summation methods
Removed 1917 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 19172/74323 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 11927
criterion: excluded for scaling, reflections: 19172
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.79703, b = 0.05782
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.700
Using previously determined optimal intensity choice: 9999.7432
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 2.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 76579 predicted reflections
Selected 57792 reflections integrated by profile or summation methods
Removed 1869 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 8 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 21795/76579 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 14121
criterion: excluded for scaling, reflections: 21795
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.79703, b = 0.05782
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.700
Using previously determined optimal intensity choice: 9999.7432
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 3.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 75406 predicted reflections
Selected 57424 reflections integrated by profile or summation methods
Removed 1772 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 5 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 20789/75406 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12917
criterion: excluded for scaling, reflections: 20789
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.79703, b = 0.05782
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.700
Using previously determined optimal intensity choice: 9999.7432
Completed preprocessing and initialisation for this dataset.
================================================================================
Configuring a MultiScaler to handle the individual Scalers.
Determining symmetry equivalent reflections across datasets.
Using quasi-random reflection selection. Selecting from 17132 symmetry groups
with <I/sI> > 1.0 (203702 reflections)). Selection target of 716.56 reflections
from each dataset, with a total number between 11465.02 and 13758.03.
Summary of cross-dataset reflection groups chosen (1277 groups, 13392 reflections):
+---------------+------------+----------+-----+-----+-----+-----+
| d-range | n_groups | n_refl | 0 | 1 | 2 | 3 |
|---------------+------------+----------+-----+-----+-----+-----|
| 68.43 - 7.961 | 42 | 689 | 105 | 203 | 178 | 203 |
| 7.961 - 5.649 | 31 | 727 | 70 | 241 | 199 | 217 |
| 5.649 - 4.617 | 33 | 831 | 72 | 277 | 231 | 251 |
| 4.617 - 4.001 | 34 | 912 | 70 | 304 | 262 | 276 |
| 4.001 - 3.58 | 37 | 982 | 72 | 324 | 285 | 301 |
| 3.58 - 3.269 | 31 | 885 | 72 | 296 | 242 | 275 |
| 3.269 - 3.027 | 34 | 846 | 70 | 277 | 228 | 271 |
| 3.027 - 2.832 | 31 | 775 | 70 | 253 | 209 | 243 |
| 2.832 - 2.67 | 34 | 818 | 71 | 263 | 220 | 264 |
| 2.67 - 2.533 | 34 | 773 | 71 | 248 | 207 | 247 |
| 2.533 - 2.415 | 31 | 560 | 70 | 197 | 190 | 103 |
| 2.415 - 2.313 | 38 | 551 | 104 | 177 | 200 | 70 |
| 2.313 - 2.222 | 41 | 523 | 121 | 165 | 167 | 70 |
| 2.222 - 2.141 | 46 | 505 | 133 | 164 | 137 | 71 |
| 2.141 - 2.069 | 77 | 594 | 153 | 176 | 105 | 160 |
| 2.069 - 2.003 | 104 | 582 | 150 | 150 | 105 | 177 |
| 2.003 - 1.943 | 137 | 657 | 159 | 217 | 141 | 140 |
| 1.943 - 1.889 | 205 | 614 | 141 | 172 | 161 | 140 |
| 1.889 - 1.838 | 213 | 474 | 83 | 125 | 132 | 134 |
| 1.838 - 1.792 | 44 | 94 | 19 | 28 | 32 | 15 |
+---------------+------------+----------+-----+-----+-----+-----+
Summary of reflections chosen for minimisation from each dataset (52727 total):
+--------------+------------------+----------------------+----------------------+--------------------+
| Dataset id | reflections | randomly selected | randomly selected | combined number |
| | connected to | reflections | reflections | of reflections |
| | other datasets | within dataset | across datasets | |
|--------------+------------------+----------------------+----------------------+--------------------|
| 0 | 1876 | 4603 | 6250 | 11840 |
| 1 | 4257 | 4759 | 6416 | 13937 |
| 2 | 3631 | 4673 | 6395 | 13354 |
| 3 | 3628 | 4801 | 6405 | 13596 |
| total | 13392 | 18836 | 25466 | 52727 |
+--------------+------------------+----------------------+----------------------+--------------------+
Completed configuration of MultiScaler.
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 1.57
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52727 | 1.1251 |
| 1 | 52727 | 1.125 |
| 2 | 52727 | 1.1247 |
| 3 | 52727 | 1.1245 |
| 4 | 52727 | 1.1243 |
| 5 | 52727 | 1.1242 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
2923 outliers have been identified.
Performing multi-dataset profile/summation intensity optimisation.
+-----------------+---------+---------+
| Combination | CC1/2 | Rmeas |
|-----------------+---------+---------|
| prf only | 0.99898 | 0.05535 |
| sum only | 0.99904 | 0.06074 |
| Imid = 871.79 | 0.9992 | 0.05405 |
| Imid = 11484.08 | 0.99921 | 0.05422 |
| Imid = 1148.41 | 0.99921 | 0.05394 |
| Imid = 114.84 | 0.99915 | 0.0572 |
+-----------------+---------+---------+
Combined intensities with Imid = 1148.41 determined to be best for scaling.
Combined outlier rejection has been performed across multiple datasets,
337 outliers have been identified.
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 1.17
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52646 | 1.0458 |
| 1 | 52646 | 1.0456 |
| 2 | 52646 | 1.0454 |
| 3 | 52646 | 1.0453 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
345 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.76062, b = 0.06355
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.689
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10329.29 - 1835.92 | 1427 | 31.885 | 1.098 |
| 1835.92 - 1406.75 | 1427 | 14.75 | 0.924 |
| 1406.75 - 1184.41 | 1427 | 12.067 | 0.936 |
| 1184.41 - 921.70 | 2824 | 10.039 | 0.956 |
| 921.70 - 503.75 | 12372 | 6.036 | 0.975 |
| 503.75 - 275.33 | 21116 | 3.555 | 1.058 |
| 275.33 - 150.48 | 29134 | 2.182 | 1.169 |
| 150.48 - 82.24 | 31555 | 1.475 | 1.208 |
| 82.24 - 44.95 | 25894 | 1.126 | 1.225 |
| 44.95 - 24.56 | 15545 | 0.845 | 1.098 |
+--------------------------+----------+------------------------+----------------------+
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 10.91
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52644 | 1.0481 |
| 1 | 52644 | 1.046 |
| 2 | 52644 | 1.0455 |
| 3 | 52644 | 1.0453 |
| 4 | 52644 | 1.0452 |
| 5 | 52644 | 1.0451 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 3.54
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52644 | 1.0451 |
| 1 | 52644 | 1.0451 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Calculating error estimates of inverse scale factors.
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
330 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.76108, b = 0.06326
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.769
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10217.88 - 1833.83 | 1427 | 31.279 | 1.049 |
| 1833.83 - 1407.62 | 1427 | 14.861 | 0.925 |
| 1407.62 - 1184.50 | 1427 | 12.311 | 0.965 |
| 1184.50 - 915.76 | 2940 | 10.261 | 0.97 |
| 915.76 - 501.06 | 12444 | 5.977 | 0.978 |
| 501.06 - 274.15 | 21170 | 3.511 | 1.062 |
| 274.15 - 150.00 | 29104 | 2.17 | 1.169 |
| 150.00 - 82.07 | 31420 | 1.468 | 1.206 |
| 82.07 - 44.91 | 25840 | 1.128 | 1.228 |
| 44.91 - 24.56 | 15525 | 0.843 | 1.098 |
+--------------------------+----------+------------------------+----------------------+
The reflection table variances have been adjusted to account for the
uncertainty in the scaling models for all datasets
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Total time taken: 23.5571s
================================================================================
50.00% of model parameters have significant uncertainty
(sigma/abs(parameter) > 0.5)
Summary of dataset partialities
+------------------+----------+
| Partiality (p) | n_refl |
|------------------+----------|
| all reflections | 301260 |
| p > 0.99 | 246904 |
| 0.5 < p < 0.99 | 2533 |
| 0.01 < p < 0.5 | 5463 |
| p < 0.01 | 46360 |
+------------------+----------+
Reflections below a partiality_cutoff of 0.4 are not considered for any
part of the scaling analysis or for the reporting of merging statistics.
Additionally, if applicable, only reflections with a min_partiality > 0.95
were considered for use when refining the scaling model.
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21416 1380 15.52 100.00 286.9 57.4 0.047 0.049 0.012 0.025 0.999* 0.013
4.83 3.84 20916 1271 16.46 100.00 421.5 61.0 0.045 0.047 0.011 0.024 0.999* 0.001
3.84 3.35 20681 1256 16.47 100.00 322.5 57.1 0.048 0.050 0.012 0.026 0.999* -0.052
3.35 3.05 20830 1226 16.99 100.00 222.6 52.7 0.053 0.055 0.013 0.028 0.999* -0.022
3.05 2.83 20700 1221 16.95 100.00 153.7 46.3 0.060 0.062 0.015 0.031 0.999* -0.104
2.83 2.66 21300 1233 17.27 100.00 127.8 44.0 0.065 0.067 0.016 0.034 0.998* -0.151
2.66 2.53 20395 1197 17.04 100.00 101.1 37.5 0.073 0.076 0.018 0.040 0.999* 0.004
2.53 2.42 16677 1212 13.76 100.00 90.1 32.0 0.079 0.082 0.022 0.048 0.998* -0.009
2.42 2.32 12527 1213 10.33 99.92 81.4 27.5 0.081 0.085 0.026 0.058 0.997* 0.035
2.32 2.24 10415 1207 8.63 99.92 76.9 23.9 0.083 0.088 0.030 0.066 0.996* -0.055
2.24 2.17 8636 1191 7.25 99.75 68.4 20.2 0.086 0.093 0.034 0.068 0.994* -0.185
2.17 2.11 6956 1189 5.85 99.33 58.0 16.1 0.096 0.106 0.042 0.093 0.991* -0.108
2.11 2.06 5477 1147 4.78 95.98 52.7 13.5 0.096 0.108 0.047 0.106 0.992* -0.002
2.06 2.01 4340 1110 3.91 92.65 45.4 11.1 0.108 0.124 0.059 0.127 0.986* -0.021
2.01 1.96 3144 1029 3.06 86.40 38.9 8.7 0.119 0.142 0.075 0.169 0.977* -0.013
1.96 1.92 2115 929 2.28 77.35 35.0 7.1 0.120 0.149 0.086 0.168 0.967* -0.316
1.92 1.88 1171 716 1.64 60.17 29.7 5.2 0.136 0.181 0.117 0.250 0.951* 0.951
1.88 1.84 816 564 1.45 48.83 26.1 4.4 0.155 0.212 0.142 0.365 0.942* 0.000
1.84 1.81 489 383 1.28 31.63 23.4 3.8 0.183 0.251 0.171 0.292 0.910* 0.000
1.81 1.78 136 125 1.09 10.52 16.3 2.5 0.140 0.194 0.133 9.559 0.962* 0.000
68.36 1.78 219137 20799 10.54 85.44 132.6 31.0 0.056 0.059 0.016 0.040 0.999* -0.046
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.4 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 31.0 57.4 2.5
Rmerge(I) 0.056 0.047 0.140
Rmerge(I+/-) 0.055 0.046 0.117
Rmeas(I) 0.059 0.049 0.194
Rmeas(I+/-) 0.059 0.049 0.161
Rpim(I) 0.016 0.012 0.133
Rpim(I+/-) 0.020 0.016 0.110
CC half 0.999 0.999 0.962
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.046 0.013 0.000
Anomalous slope 1.034
dF/F 0.037
dI/s(dI) 0.846
Total observations 219137 21416 136
Total unique 20799 1380 125
Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options
For exploring the scaling results, a wide variety of scaling and merging plots
can be found in the dials.scale.html report generated by dials.scale.
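The error model formula printed in the scaling log can be checked by hand. As the intensity grows, the (bI)² term dominates σ'², so I/σ' tends towards 1/(ab). The short Python sketch below is not part of DIALS: the function names are our own and the example reflection is invented for illustration. It evaluates this limit for the parameters a = 0.76108, b = 0.06326 taken from the log, and the result agrees with the logged limit of 20.769 to within the rounding of the printed parameters.

```python
import math


def corrected_sigma(intensity, sigma, a, b):
    """Apply the basic error model: sigma'^2 = a^2 * (sigma^2 + (b*I)^2)."""
    return a * math.sqrt(sigma**2 + (b * intensity) ** 2)


def asymptotic_i_over_sigma(a, b):
    """Limit of I/sigma' for large I, where the (b*I)^2 term dominates."""
    return 1.0 / (a * b)


# Parameters copied from the error model reported in the scaling log.
a, b = 0.76108, 0.06326
limit = asymptotic_i_over_sigma(a, b)
print(f"asymptotic I/sigma limit: {limit:.2f}")

# Hypothetical strong reflection (values invented for illustration only).
intensity, sigma = 10000.0, 100.0
i_over_sigma_before = intensity / sigma
i_over_sigma_after = intensity / corrected_sigma(intensity, sigma, a, b)
print(f"I/sigma before correction: {i_over_sigma_before:.1f}")
print(f"I/sigma after correction:  {i_over_sigma_after:.1f}")
```

However strong a reflection is, its corrected I/σ' cannot exceed this limit, which is why the limit is a useful one-number summary of the error model.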
Almost there¶
As mentioned previously, the fourth dataset is giving significantly higher
R-merge values and much lower I/sigma.
The question, therefore, is whether it is better to exclude this dataset.
We can get some useful information about the agreement between datasets by
running the program dials.compute_delta_cchalf. This program implements
a version of the algorithms described in Assmann et al.:
dials.compute_delta_cchalf scaled.refl scaled.expt
# Datasets: 4
# Groups: 4
# Reflections: 216930
# Unique reflections: 20793
CC 1/2 mean: 99.297
CC 1/2 excluding group 0: 99.343
CC 1/2 excluding group 1: 99.314
CC 1/2 excluding group 2: 99.277
CC 1/2 excluding group 3: 99.203
Dataset: 0, ΔCC½: -0.046
Dataset: 1, ΔCC½: -0.017
Dataset: 2, ΔCC½: 0.020
Dataset: 3, ΔCC½: 0.094
mean delta_cc_half: 0.013
stddev delta_cc_half: 0.052
cutoff value: -0.196
Full log:
Read 301260 predicted reflections
Selected 219137 scaled reflections
Combined 18 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 20 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Removed 2127 reflections below partiality threshold
Resolution bins
0: 68.363, 5.616
1: 5.616, 3.978
2: 3.978, 3.250
3: 3.250, 2.815
4: 2.815, 2.518
5: 2.518, 2.299
6: 2.299, 2.129
7: 2.129, 1.991
8: 1.991, 1.878
9: 1.878, 1.781
Summary of input data:
# Datasets: 4
# Groups: 4
# Reflections: 216930
# Unique reflections: 20793
CC 1/2 mean: 99.297
CC 1/2 excluding group 0: 99.343
CC 1/2 excluding group 1: 99.314
CC 1/2 excluding group 2: 99.277
CC 1/2 excluding group 3: 99.203
Dataset: 0, ΔCC½: -0.046
Dataset: 1, ΔCC½: -0.017
Dataset: 2, ΔCC½: 0.020
Dataset: 3, ΔCC½: 0.094
mean delta_cc_half: 0.013
stddev delta_cc_half: 0.052
cutoff value: -0.196
Writing table to delta_cchalf.dat
Saving 301260 reflections to filtered.refl
Saving the experiments to filtered.expt
Writing html report to: compute_delta_cchalf.html
A negative \(\Delta CC_{1/2}\) means that the overall \(CC_{1/2}\) rises when that dataset is left out. Here datasets 0 and 1 are slightly negative (-0.046 and -0.017), while the final dataset has the largest positive value (0.094), and none falls below the cutoff of -0.196, so this output does not single out any dataset for exclusion. But how bad is too bad that it warrants exclusion? Unfortunately this is a difficult question to answer, and one may need to refine several structures with different data excluded to address it properly. If we had many datasets and only a small fraction had a very large negative \(\Delta CC_{1/2}\), then one could argue that these measurements are not drawn from the same population as the rest of the data and should be excluded.
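The numbers in the dials.compute_delta_cchalf output can be reproduced with a few lines of arithmetic. The Python sketch below is not DIALS code. It takes the printed \(CC_{1/2}\) values, forms \(\Delta CC_{1/2}\) for each dataset as the overall value minus the value obtained with that dataset excluded, and then derives a cutoff. The cutoff rule used here (mean minus four population standard deviations) is an assumption on our part that reproduces the logged value of -0.196 to within rounding of the printed inputs.

```python
from statistics import mean, pstdev

# CC1/2 values (in percent) copied from the dials.compute_delta_cchalf output.
cc_half_all = 99.297
cc_half_excluding = {0: 99.343, 1: 99.314, 2: 99.277, 3: 99.203}

# Delta CC1/2 = CC1/2 of all data minus CC1/2 with that dataset left out.
delta = {i: cc_half_all - cc for i, cc in cc_half_excluding.items()}

# Assumption: cutoff = mean - N_SIGMA * (population standard deviation).
N_SIGMA = 4.0
values = list(delta.values())
mu = mean(values)
sd = pstdev(values)
cutoff = mu - N_SIGMA * sd

for i, d in delta.items():
    status = "below cutoff" if d < cutoff else "kept"
    print(f"Dataset {i}: delta CC1/2 = {d:+.3f} ({status})")
print(f"mean = {mu:.3f}, stddev = {sd:.3f}, cutoff = {cutoff:.3f}")
```

With only four datasets the standard deviation is poorly determined, which is one reason the cutoff is more informative when many datasets are available.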
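To see the effect of removing the last dataset (dataset ‘3’), we can rerun
dials.scale (note that this will overwrite the previous scaled files). The
exclusion itself is requested with the dials.scale option exclude_datasets=3;
the run logged below sets only the resolution limit d_min=1.78, and its log
still reports all four dataset ids: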
dials.scale scaled.expt scaled.refl d_min=1.78
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
cut_data {
d_min = 1.78
}
input {
experiments = scaled.expt
reflections = scaled.refl
}
Checking for the existence of a reflection table
containing multiple datasets
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Found 4 reflection tables & 4 experiments in total.
Dataset ids are: 0,1,2,3
Space group being used during scaling is P 4 2 2
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Scaling models have been initialised for all experiments.
================================================================================
The experiment id for this dataset is 0.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 74952 predicted reflections
Selected 57758 reflections integrated by profile or summation methods
Removed 1774 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 20037/74952 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12546
criterion: excluded for scaling, reflections: 20037
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.76108, b = 0.06326
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.769
Using previously determined optimal intensity choice: 1148.4077
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 1.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 74323 predicted reflections
Selected 58232 reflections integrated by profile or summation methods
Removed 1917 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 19172/74323 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 11927
criterion: excluded for scaling, reflections: 19172
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.76108, b = 0.06326
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.769
Using previously determined optimal intensity choice: 1148.4077
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 2.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 76579 predicted reflections
Selected 57792 reflections integrated by profile or summation methods
Removed 1869 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 8 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 21795/76579 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 14121
criterion: excluded for scaling, reflections: 21795
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.76108, b = 0.06326
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.769
Using previously determined optimal intensity choice: 1148.4077
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 3.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Read 75406 predicted reflections
Selected 57424 reflections integrated by profile or summation methods
Removed 1772 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5.0
Removed 5 intensity.prf.value reflections with I/Sig(I) < -5.0
Excluding 20789/75406 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12917
criterion: excluded for scaling, reflections: 20789
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.76108, b = 0.06326
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.769
Using previously determined optimal intensity choice: 1148.4077
Completed preprocessing and initialisation for this dataset.
================================================================================
Configuring a MultiScaler to handle the individual Scalers.
Determining symmetry equivalent reflections across datasets.
Using quasi-random reflection selection. Selecting from 17133 symmetry groups
with <I/sI> > 1.0 (203731 reflections)). Selection target of 716.56 reflections
from each dataset, with a total number between 11465.02 and 13758.03.
Summary of cross-dataset reflection groups chosen (1277 groups, 13392 reflections):
+---------------+------------+----------+-----+-----+-----+-----+
| d-range | n_groups | n_refl | 0 | 1 | 2 | 3 |
|---------------+------------+----------+-----+-----+-----+-----|
| 68.43 - 7.961 | 42 | 689 | 105 | 203 | 178 | 203 |
| 7.961 - 5.649 | 31 | 727 | 70 | 241 | 199 | 217 |
| 5.649 - 4.617 | 33 | 831 | 72 | 277 | 231 | 251 |
| 4.617 - 4.001 | 34 | 912 | 70 | 304 | 262 | 276 |
| 4.001 - 3.58 | 37 | 982 | 72 | 324 | 285 | 301 |
| 3.58 - 3.269 | 31 | 885 | 72 | 296 | 242 | 275 |
| 3.269 - 3.027 | 34 | 846 | 70 | 277 | 228 | 271 |
| 3.027 - 2.832 | 31 | 775 | 70 | 253 | 209 | 243 |
| 2.832 - 2.67 | 34 | 818 | 71 | 263 | 220 | 264 |
| 2.67 - 2.533 | 34 | 773 | 71 | 248 | 207 | 247 |
| 2.533 - 2.415 | 31 | 560 | 70 | 197 | 190 | 103 |
| 2.415 - 2.313 | 38 | 551 | 104 | 177 | 200 | 70 |
| 2.313 - 2.222 | 41 | 523 | 121 | 165 | 167 | 70 |
| 2.222 - 2.141 | 46 | 505 | 133 | 164 | 137 | 71 |
| 2.141 - 2.069 | 77 | 594 | 153 | 176 | 105 | 160 |
| 2.069 - 2.003 | 104 | 582 | 150 | 150 | 105 | 177 |
| 2.003 - 1.943 | 137 | 657 | 159 | 217 | 141 | 140 |
| 1.943 - 1.889 | 205 | 614 | 141 | 172 | 161 | 140 |
| 1.889 - 1.838 | 213 | 474 | 83 | 125 | 132 | 134 |
| 1.838 - 1.792 | 44 | 94 | 19 | 28 | 32 | 15 |
+---------------+------------+----------+-----+-----+-----+-----+
Summary of reflections chosen for minimisation from each dataset (52237 total):
+--------------+------------------+----------------------+----------------------+--------------------+
| Dataset id | reflections | randomly selected | randomly selected | combined number |
| | connected to | reflections | reflections | of reflections |
| | other datasets | within dataset | across datasets | |
|--------------+------------------+----------------------+----------------------+--------------------|
| 0 | 1876 | 4675 | 6304 | 11892 |
| 1 | 4257 | 4706 | 6334 | 13802 |
| 2 | 3631 | 4707 | 6376 | 13260 |
| 3 | 3628 | 4831 | 6341 | 13283 |
| total | 13392 | 18919 | 25355 | 52237 |
+--------------+------------------+----------------------+----------------------+--------------------+
Completed configuration of MultiScaler.
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 0.78
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52237 | 1.1153 |
| 1 | 52237 | 1.1152 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
2347 outliers have been identified.
Performing multi-dataset profile/summation intensity optimisation.
+-----------------+---------+---------+
| Combination | CC1/2 | Rmeas |
|-----------------+---------+---------|
| prf only | 0.99875 | 0.05684 |
| sum only | 0.99904 | 0.06057 |
| Imid = 878.53 | 0.99926 | 0.05389 |
| Imid = 11484.08 | 0.99909 | 0.05536 |
| Imid = 1148.41 | 0.99926 | 0.05382 |
| Imid = 114.84 | 0.99916 | 0.05703 |
+-----------------+---------+---------+
Combined intensities with Imid = 1148.41 determined to be best for scaling.
Combined outlier rejection has been performed across multiple datasets,
328 outliers have been identified.
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 0.78
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52158 | 1.0422 |
| 1 | 52158 | 1.0422 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
331 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.76206, b = 0.06314
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.783
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10219.66 - 1833.81 | 1427 | 31.335 | 1.051 |
| 1833.81 - 1407.39 | 1427 | 14.895 | 0.926 |
| 1407.39 - 1183.79 | 1427 | 12.306 | 0.965 |
| 1183.79 - 916.44 | 2925 | 10.253 | 0.968 |
| 916.44 - 501.50 | 12431 | 5.996 | 0.98 |
| 501.50 - 274.43 | 21158 | 3.521 | 1.064 |
| 274.43 - 150.18 | 29088 | 2.173 | 1.17 |
| 150.18 - 82.18 | 31441 | 1.469 | 1.205 |
| 82.18 - 44.97 | 25840 | 1.128 | 1.227 |
| 44.97 - 24.60 | 15569 | 0.843 | 1.095 |
+--------------------------+----------+------------------------+----------------------+
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 8.99
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52158 | 1.0418 |
| 1 | 52158 | 1.0414 |
| 2 | 52158 | 1.0412 |
| 3 | 52158 | 1.041 |
| 4 | 52158 | 1.0409 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 3.50
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52158 | 1.0409 |
| 1 | 52158 | 1.0409 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Calculating error estimates of inverse scale factors.
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
333 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.76340, b = 0.06293
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 20.816
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10186.59 - 1836.69 | 1427 | 30.915 | 1.048 |
| 1836.69 - 1407.47 | 1427 | 14.714 | 0.923 |
| 1407.47 - 1183.95 | 1427 | 12.389 | 0.979 |
| 1183.95 - 914.39 | 2959 | 10.113 | 0.963 |
| 914.39 - 500.50 | 12444 | 5.902 | 0.976 |
| 500.50 - 273.96 | 21172 | 3.495 | 1.062 |
| 273.96 - 149.95 | 29102 | 2.174 | 1.172 |
| 149.95 - 82.08 | 31415 | 1.467 | 1.205 |
| 82.08 - 44.93 | 25798 | 1.129 | 1.225 |
| 44.93 - 24.58 | 15557 | 0.842 | 1.093 |
+--------------------------+----------+------------------------+----------------------+
The reflection table variances have been adjusted to account for the
uncertainty in the scaling models for all datasets
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Total time taken: 20.2349s
================================================================================
Warning: Over half (50.60%) of model parameters have significant
uncertainty (sigma/abs(parameter) > 0.5), which could indicate a
poorly-determined scaling problem or overparameterisation.
Summary of dataset partialities
+------------------+----------+
| Partiality (p) | n_refl |
|------------------+----------|
| all reflections | 301260 |
| p > 0.99 | 246904 |
| 0.5 < p < 0.99 | 2533 |
| 0.01 < p < 0.5 | 5463 |
| p < 0.01 | 46360 |
+------------------+----------+
Reflections below a partiality_cutoff of 0.4 are not considered for any
part of the scaling analysis or for the reporting of merging statistics.
Additionally, if applicable, only reflections with a min_partiality > 0.95
were considered for use when refining the scaling model.
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21417 1380 15.52 100.00 287.2 57.4 0.047 0.049 0.012 0.025 0.999* 0.017
4.83 3.84 20916 1271 16.46 100.00 421.9 61.1 0.045 0.047 0.011 0.024 0.999* -0.008
3.84 3.35 20681 1256 16.47 100.00 322.8 57.2 0.048 0.050 0.012 0.026 0.999* -0.056
3.35 3.05 20830 1226 16.99 100.00 222.8 52.7 0.053 0.055 0.013 0.028 0.999* -0.016
3.05 2.83 20700 1221 16.95 100.00 153.9 46.3 0.060 0.062 0.015 0.031 0.999* -0.103
2.83 2.66 21299 1233 17.27 100.00 127.9 44.0 0.065 0.067 0.016 0.034 0.998* -0.160
2.66 2.53 20393 1197 17.04 100.00 101.2 37.5 0.073 0.076 0.018 0.039 0.999* 0.011
2.53 2.42 16675 1212 13.76 100.00 90.2 32.0 0.079 0.083 0.022 0.049 0.998* -0.010
2.42 2.32 12527 1213 10.33 99.92 81.5 27.5 0.081 0.085 0.026 0.058 0.997* 0.028
2.32 2.24 10416 1207 8.63 99.92 76.9 23.8 0.083 0.088 0.030 0.066 0.996* -0.050
2.24 2.17 8636 1191 7.25 99.75 68.4 20.1 0.087 0.093 0.034 0.069 0.993* -0.195
2.17 2.11 6956 1189 5.85 99.33 58.0 16.0 0.097 0.106 0.043 0.093 0.991* -0.101
2.11 2.06 5477 1147 4.78 95.98 52.7 13.4 0.097 0.108 0.047 0.106 0.992* -0.005
2.06 2.01 4340 1110 3.91 92.65 45.4 11.1 0.109 0.125 0.059 0.128 0.986* -0.034
2.01 1.96 3144 1029 3.06 86.40 38.9 8.7 0.119 0.142 0.075 0.169 0.977* -0.025
1.96 1.92 2115 929 2.28 77.35 35.0 7.1 0.120 0.149 0.086 0.167 0.967* -0.328
1.92 1.88 1171 716 1.64 60.17 29.7 5.2 0.136 0.181 0.117 0.249 0.951* 0.949
1.88 1.84 816 564 1.45 48.83 26.1 4.4 0.156 0.212 0.143 0.363 0.941* 0.000
1.84 1.81 489 383 1.28 31.63 23.5 3.8 0.182 0.250 0.170 0.289 0.910* 0.000
1.81 1.78 136 125 1.09 10.52 16.3 2.5 0.141 0.194 0.133 9.449 0.962* 0.000
68.36 1.78 219134 20799 10.54 85.44 132.7 31.0 0.056 0.059 0.016 0.040 0.999* -0.049
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.4 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 31.0 57.4 2.5
Rmerge(I) 0.056 0.047 0.141
Rmerge(I+/-) 0.055 0.046 0.117
Rmeas(I) 0.059 0.049 0.194
Rmeas(I+/-) 0.059 0.049 0.161
Rpim(I) 0.016 0.012 0.133
Rpim(I+/-) 0.020 0.016 0.110
CC half 0.999 0.999 0.962
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.049 0.017 0.000
Anomalous slope 1.033
dF/F 0.037
dI/s(dI) 0.846
Total observations 219134 21417 136
Total unique 20799 1380 125
Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options
With dataset 3 excluded, the overall merging statistics look significantly improved, and therefore one would probably proceed with the first three datasets:
                       All four datasets   First three datasets
Resolution:            68.40 - 1.78        68.40 - 1.79
Observations:          222563              166095
Unique reflections:    16534               16285
Redundancy:            13.5                10.2
Completeness:          68.18%              67.56%
Mean intensity:        45.3                46.0
Mean I/sigma(I):       25.0                26.1
R-merge:               0.132               0.059
R-meas:                0.136               0.062
R-pim:                 0.033               0.017
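To make the three R-factors in this comparison concrete, the sketch below computes them for a handful of invented intensities using the standard textbook definitions. R-meas multiplies each group's absolute deviations by sqrt(n/(n-1)) and R-pim by sqrt(1/(n-1)), where n is the multiplicity of the group. This is a minimal illustration in Python, not the code used by dials.scale, which may differ in detail (for example in how outliers or partial reflections are handled). The function name and the toy data are our own.

```python
import math


def merging_r_factors(groups):
    """Return (Rmerge, Rmeas, Rpim) for groups of symmetry-equivalent intensities."""
    num_merge = num_meas = num_pim = denom = 0.0
    for intensities in groups:
        n = len(intensities)
        if n < 2:
            continue  # a single observation carries no agreement information
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_meas += math.sqrt(n / (n - 1)) * abs_dev
        num_pim += math.sqrt(1 / (n - 1)) * abs_dev
        denom += sum(intensities)
    return num_merge / denom, num_meas / denom, num_pim / denom


# Invented intensities: two unique reflections, each measured four times.
toy_groups = [[100.0, 104.0, 96.0, 100.0], [50.0, 55.0, 45.0, 50.0]]
rmerge, rmeas, rpim = merging_r_factors(toy_groups)
print(f"Rmerge = {rmerge:.4f}, Rmeas = {rmeas:.4f}, Rpim = {rpim:.4f}")
```

Because of the 1/(n-1) factor, R-pim falls as multiplicity rises even when the scatter of individual measurements is unchanged, so it is best read alongside R-meas rather than on its own.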
We could also have excluded a subset of images, for example using the option
exclude_images=3:301:600 to exclude the last 300 images of dataset 3.
This option could be used to exclude the end of a dataset that was showing
significant radiation damage, or if the crystal had moved out of the beam part-way
through the measurement.
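The value 3:301:600 reads as dataset id, first image and last image, with both ends included (which is how 301 to 600 covers 300 images). When preparing such options for many datasets it can help to check the ranges before running dials.scale. The helper below is a hypothetical Python sketch of our own and not part of DIALS. It only splits the string and counts the images covered.

```python
def parse_exclude_images(spec):
    """Split 'dataset:first:last' into integers and count the images covered."""
    dataset, first, last = (int(part) for part in spec.split(":"))
    if first > last:
        raise ValueError(f"first image {first} is after last image {last}")
    n_images = last - first + 1  # the range is inclusive at both ends
    return dataset, first, last, n_images


dataset, first, last, n_images = parse_exclude_images("3:301:600")
print(f"dataset {dataset}: excluding images {first} to {last} ({n_images} images)")
```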
It is also worth checking the assigned space group using dials.symmetry.
In dials.cosym, only the Laue/Patterson group was tested, giving a space
group of \(P\,4\,2\,2\). However, several other MX space groups are possible for this
Laue group (due to the possibility of screw axes), such as \(P\,4\,2_1\,2\),
\(P\,4_1\,2\,2\) etc. The screw-axis tests are performed by dials.symmetry, and we can disable the
Laue group testing as we are already confident about this:
dials.symmetry scaled.expt scaled.refl laue_group=None
Read 74952 predicted reflections
Selected 54879 scaled reflections
Combined 18 partial reflections with other partial reflections
Read 74323 predicted reflections
Selected 55060 scaled reflections
Combined 21 partial reflections with other partial reflections
Read 76579 predicted reflections
Selected 54689 scaled reflections
Combined 20 partial reflections with other partial reflections
Read 75406 predicted reflections
Selected 54506 scaled reflections
Combined 21 partial reflections with other partial reflections
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Removing 9 Wilson outliers with E^2 >= 16.0
Resolution estimate from <I>/<σ(I)> > 4.0 : 1.83
Resolution estimate from CC½ > 0.60: 1.78
Performing systematic absence checks on scaled data
Read 301260 predicted reflections
Selected 219134 scaled reflections
Removed 1 reflections with d <= 1.78
Combined 18 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 20 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Laue group: P 4/m m m
Scoring method: direct
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis | Score | No. present | No. absent | <I> present | <I> absent | <I/sig> present | <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 41c | 1 | 12 | 34 | 601.056 | 0.086 | 32.104 | 0.49 |
| 21a | 1 | 14 | 14 | 846.734 | 0.457 | 23.214 | 1.19 |
| 42c | 1 | 24 | 22 | 300.354 | 0.322 | 16.437 | 0.337 |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group | score |
|---------------+---------|
| P 4 2 2 | 0 |
| P 4 21 2 | 0 |
| P 41 2 2 | 0 |
| P 42 2 2 | 0 |
| P 41 21 2 | 1 |
| P 42 21 2 | 0 |
+---------------+---------+
Recommended space group: P 41 21 2
Space group with equivalent score (enantiomorphic pair): P 43 21 2
Full log:
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
laue_group = None
input {
experiments = scaled.expt
reflections = scaled.refl
}
Detected existence of a multi-dataset reflection table
containing 4 datasets.
================================================================================
Analysing systematic absences
Laue group: P 4/m m m
Read 74952 predicted reflections
Selected 54879 scaled reflections
Combined 18 partial reflections with other partial reflections
Read 74323 predicted reflections
Selected 55060 scaled reflections
Combined 21 partial reflections with other partial reflections
Read 76579 predicted reflections
Selected 54689 scaled reflections
Combined 20 partial reflections with other partial reflections
Read 75406 predicted reflections
Selected 54506 scaled reflections
Combined 21 partial reflections with other partial reflections
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Removing 9 Wilson outliers with E^2 >= 16.0
Resolution estimate from <I>/<σ(I)> > 4.0 : 1.83
Resolution estimate from CC½ > 0.60: 1.78
Performing systematic absence checks on scaled data
Read 301260 predicted reflections
Selected 219134 scaled reflections
Removed 1 reflections with d <= 1.78
Combined 18 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 20 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Laue group: P 4/m m m
Scoring method: direct
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis | Score | No. present | No. absent | <I> present | <I> absent | <I/sig> present | <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 41c | 1 | 12 | 34 | 601.056 | 0.086 | 32.104 | 0.49 |
| 21a | 1 | 14 | 14 | 846.734 | 0.457 | 23.214 | 1.19 |
| 42c | 1 | 24 | 22 | 300.354 | 0.322 | 16.437 | 0.337 |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group | score |
|---------------+---------|
| P 4 2 2 | 0 |
| P 4 21 2 | 0 |
| P 41 2 2 | 0 |
| P 42 2 2 | 0 |
| P 41 21 2 | 1 |
| P 42 21 2 | 0 |
+---------------+---------+
Recommended space group: P 41 21 2
Space group with equivalent score (enantiomorphic pair): P 43 21 2
Saving reindexed experiments to symmetrized.expt in space group P 41 21 2
Saving 301260 reindexed reflections to symmetrized.refl
By analysing the sets of reflections we expect to be present and absent, the existence of the \(4_1\) and \(2_1\) screw axes is confirmed, hence the space group is assigned as \(P\,4_1\,2_1\,2\). Systematic absences alone cannot distinguish this from its enantiomorph \(P\,4_3\,2_1\,2\), which is why the log reports that space group with an equivalent score. This analysis can be run before or after scaling, as scaling only requires the Laue group; however, it is preferable to run it after scaling, as outliers may have been removed by scaling.
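To make the idea of "expected present" and "expected absent" reflections concrete, the sketch below applies the standard reflection conditions for the three screw axes listed in the table above (\(00l\) with \(l = 4n\) for \(4_1\), \(00l\) with \(l = 2n\) for \(4_2\), and \(h00\) with \(h = 2n\) for \(2_1\) along a). It is only an illustration: the names `SCREW_AXES` and `split_axial` are hypothetical, the intensities are made up, and this is not the scoring code used by DIALS.

```python
from statistics import mean

# Reflection conditions for the three screw axes tested in the log above.
# Each entry: (test for lying on the axis, condition for being *present*).
SCREW_AXES = {
    "41c": (lambda h, k, l: h == 0 and k == 0, lambda h, k, l: l % 4 == 0),
    "42c": (lambda h, k, l: h == 0 and k == 0, lambda h, k, l: l % 2 == 0),
    "21a": (lambda h, k, l: k == 0 and l == 0, lambda h, k, l: h % 2 == 0),
}


def split_axial(reflections, axis):
    """Split (hkl, intensity) pairs into expected-present and expected-absent intensities.

    Reflections that do not lie on the axis (and the origin) are ignored.
    """
    on_axis, is_present = SCREW_AXES[axis]
    present, absent = [], []
    for hkl, intensity in reflections:
        if hkl == (0, 0, 0) or not on_axis(*hkl):
            continue
        (present if is_present(*hkl) else absent).append(intensity)
    return present, absent


# Made-up intensities for 00l reflections: strong only when l = 4n.
toy = [((0, 0, l), 500.0 if l % 4 == 0 else 0.2) for l in range(1, 13)]
present, absent = split_axial(toy, "41c")
print(len(present), len(absent), mean(present), mean(absent))
```

A large gap between the mean intensity of the expected-present and expected-absent groups, as in the `41c` and `21a` rows of the table, is what supports the presence of a screw axis.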
Finally, we must merge the data and produce an MTZ file for downstream structure solution:
dials.merge symmetrized.expt symmetrized.refl
DIALS 3.dev.1158-ga2f8f714a
The following parameters have been modified:
input {
experiments = symmetrized.expt
reflections = symmetrized.refl
}
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Merging scaled reflection data
Read 301260 predicted reflections
Selected 219134 scaled reflections
Combined 18 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 20 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Running systematic absences check
Laue group: P 4/m m m
Scoring method: direct
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis | Score | No. present | No. absent | <I> present | <I> absent | <I/sig> present | <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 41c | 1 | 12 | 34 | 601.056 | 0.086 | 32.104 | 0.49 |
| 21a | 1 | 14 | 14 | 846.734 | 0.457 | 23.214 | 1.19 |
| 42c | 1 | 24 | 22 | 300.354 | 0.322 | 16.437 | 0.337 |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group | score |
|---------------+---------|
| P 4 2 2 | 0 |
| P 4 21 2 | 0 |
| P 41 2 2 | 0 |
| P 42 2 2 | 0 |
| P 41 21 2 | 1 |
| P 42 21 2 | 0 |
+---------------+---------+
Recommended space group: P 41 21 2
Space group with equivalent score (enantiomorphic pair): P 43 21 2
Performing French-Wilson treatment of scaled intensities
Total number of rejected intensities 2
===================== Absolute scaling and Wilson analysis ====================
----------Maximum likelihood isotropic Wilson scaling----------
ML estimate of overall B value:
12.40 A**2
Estimated -log of scale factor:
-2.90
The overall B value ("Wilson B-factor", derived from the Wilson plot) gives
an isotropic approximation for the falloff of intensity as a function of
resolution. Note that this approximation may be misleading for anisotropic
data (where the crystal is poorly ordered along an axis). The Wilson B is
strongly correlated with refined atomic B-factors but these may differ by
a significant amount, especially if anisotropy is present.
----------Maximum likelihood anisotropic Wilson scaling----------
ML estimate of overall B_cart value:
11.63, 0.00, 0.00
11.63, 0.00
13.70
Equivalent representation as U_cif:
0.15, -0.00, -0.00
0.15, 0.00
0.17
Eigen analyses of B-cart:
-------------------------------------------------
| Eigenvector | Value | Vector |
-------------------------------------------------
| 1 | 13.699 | ( 0.00, 0.00, 1.00) |
| 2 | 11.632 | (-0.71, 0.71, -0.00) |
| 3 | 11.632 | ( 0.71, 0.71, -0.00) |
-------------------------------------------------
ML estimate of -log of scale factor:
-2.90
----------Anisotropy analyses----------
For the resolution shell spanning between 1.93 - 1.78 Angstrom,
the mean I/sigI is equal to 4.88. 58.7 % of these intensities have
an I/sigI > 3. When sorting these intensities by their anisotropic
correction factor and analysing the I/sigI behavior for this ordered
list, we can gauge the presence of 'anisotropy induced noise amplification'
in reciprocal space.
The quarter of Intensities *least* affected by the anisotropy correction show
<I/sigI> : 5.25e+00
Fraction of I/sigI > 3 : 6.27e-01 ( Z = 1.80 )
The quarter of Intensities *most* affected by the anisotropy correction show
<I/sigI> : 3.79e+00
Fraction of I/sigI > 3 : 4.68e-01 ( Z = 5.40 )
Z-scores are computed on the basis of a Bernoulli model assuming independence
of weak reflections with respect to anisotropy.
----------Wilson plot----------
The Wilson plot shows the falloff in intensity as a function in resolution;
this is used to calculate the overall B-factor ("Wilson B-factor") for the
data shown above. The expected plot is calculated based on analysis of
macromolecule structures in the PDB, and the distinctive appearance is due to
the non-random arrangement of atoms in the crystal. Some variation is
natural, but major deviations from the expected plot may indicate pathological
data (including ice rings, detector problems, or processing errors).
----------Mean intensity analyses----------
Inspired by: Morris et al. (2004). J. Synch. Rad.11, 56-59.
The following resolution shells are worrisome:
-----------------------------------------------------------------
| Mean intensity by shell (outliers) |
|---------------------------------------------------------------|
| d_spacing | z_score | completeness | <Iobs>/<Iexp> |
|---------------------------------------------------------------|
| 2.033 | 4.86 | 0.76 | 0.820 |
-----------------------------------------------------------------
Possible reasons for the presence of the reported unexpected low or elevated
mean intensity in a given resolution bin are :
- missing overloaded or weak reflections
- suboptimal data processing
- satellite (ice) crystals
- NCS
- translational pseudo symmetry (detected elsewhere)
- outliers (detected elsewhere)
- ice rings (detected elsewhere)
- other problems
Note that the presence of abnormalities in a certain region of reciprocal
space might confuse the data validation algorithm throughout a large region
of reciprocal space, even though the data are acceptable in those areas.
----------Possible outliers----------
Inspired by: Read, Acta Cryst. (1999). D55, 1759-1764
Acentric reflections:
None
Centric reflections:
-----------------------------------------------------------------------------------------------------
| Centric reflections |
|---------------------------------------------------------------------------------------------------|
| d_spacing | H K L | |E| | p(wilson) | p(extreme) |
|---------------------------------------------------------------------------------------------------|
| 2.628 | 26, 0, 1 | 4.17 | 3.07e-05 | 8.22e-02 |
-----------------------------------------------------------------------------------------------------
p(wilson) : 1-(erf[|E|/sqrt(2)])
p(extreme) : 1-(erf[|E|/sqrt(2)])^(n_acentrics)
p(wilson) is the probability that an E-value of the specified
value would be observed when it would selected at random from
the given data set.
p(extreme) is the probability that the largest |E| value is
larger or equal than the observed largest |E| value.
Both measures can be used for outlier detection. p(extreme)
takes into account the size of the dataset.
----------Ice ring related problems----------
The following statistics were obtained from ice-ring insensitive resolution
ranges:
mean bin z_score : 1.41
( rms deviation : 1.13 )
mean bin completeness : 0.86
( rms deviation : 0.29 )
The following table shows the Wilson plot Z-scores and completeness for
observed data in ice-ring sensitive areas. The expected relative intensity
is the theoretical intensity of crystalline ice at the given resolution.
Large z-scores and high completeness in these resolution ranges might
be a reason to re-assess your data processsing if ice rings were present.
-------------------------------------------------------------
| d_spacing | Expected rel. I | Data Z-score | Completeness |
-------------------------------------------------------------
| 3.897 | 1.000 | 2.81 | 1.00 |
| 3.669 | 0.750 | 1.09 | 1.00 |
| 3.441 | 0.530 | 3.37 | 0.99 |
| 2.671 | 0.170 | 1.12 | 0.99 |
| 2.249 | 0.390 | 1.45 | 0.98 |
| 2.072 | 0.300 | 0.91 | 0.81 |
| 1.948 | 0.040 | 1.73 | 0.51 |
| 1.918 | 0.180 | 0.00 | 0.37 |
| 1.883 | 0.030 | 0.49 | 0.30 |
-------------------------------------------------------------
Abnormalities in mean intensity or completeness at resolution ranges with a
relative ice ring intensity lower than 0.10 will be ignored.
No ice ring related problems detected.
If ice rings were present, the data does not look worse at ice ring related
d_spacings as compared to the rest of the data set.
Size of anomalous differences
+---------+---------+----------------+
| d_max | d_min | <|ΔF|/σ(ΔF)> |
|---------+---------+----------------|
| 68.36 | 5.59 | 0.964 |
| 5.59 | 4.3 | 0.889 |
| 4.3 | 3.71 | 0.941 |
| 3.71 | 3.34 | 0.964 |
| 3.34 | 3.08 | 0.929 |
| 3.08 | 2.88 | 0.932 |
| 2.88 | 2.72 | 0.884 |
| 2.72 | 2.6 | 0.924 |
| 2.6 | 2.49 | 0.886 |
| 2.49 | 2.4 | 0.906 |
| 2.4 | 2.31 | 0.904 |
| 2.31 | 2.24 | 0.882 |
| 2.24 | 2.18 | 0.767 |
| 2.18 | 2.12 | 0.743 |
| 2.12 | 2.07 | 0.745 |
| 2.07 | 2.02 | 0.729 |
| 2.02 | 1.97 | 0.754 |
| 1.97 | 1.92 | 0.73 |
| 1.92 | 1.87 | 0.709 |
| 1.87 | 1.78 | 0.647 |
+---------+---------+----------------+
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21411 1380 15.52 100.00 287.2 57.4 0.047 0.048 0.012 0.025 0.999* -0.130
4.83 3.84 20907 1271 16.45 100.00 421.9 61.1 0.045 0.047 0.011 0.024 0.999* -0.014
3.84 3.35 20672 1256 16.46 100.00 322.8 57.2 0.048 0.050 0.012 0.026 0.999* -0.045
3.35 3.05 20823 1226 16.98 100.00 222.8 52.7 0.053 0.055 0.013 0.028 0.999* -0.004
3.05 2.83 20684 1221 16.94 100.00 153.9 46.3 0.059 0.061 0.015 0.031 0.999* -0.080
2.83 2.66 21289 1233 17.27 100.00 127.9 44.0 0.065 0.067 0.016 0.034 0.998* -0.166
2.66 2.53 20381 1197 17.03 100.00 101.2 37.5 0.073 0.076 0.018 0.039 0.999* -0.008
2.53 2.42 16671 1212 13.75 100.00 90.2 32.0 0.079 0.082 0.022 0.049 0.998* -0.009
2.42 2.32 12524 1213 10.32 100.00 81.5 27.5 0.081 0.085 0.026 0.058 0.997* 0.036
2.32 2.24 10412 1207 8.63 100.00 76.9 23.8 0.083 0.088 0.030 0.066 0.996* -0.049
2.24 2.17 8636 1191 7.25 99.92 68.4 20.1 0.087 0.093 0.034 0.069 0.993* -0.195
2.17 2.11 6956 1189 5.85 99.41 58.0 16.0 0.097 0.106 0.043 0.093 0.991* -0.101
2.11 2.06 5477 1147 4.78 96.14 52.7 13.4 0.097 0.108 0.047 0.106 0.992* -0.005
2.06 2.01 4340 1110 3.91 92.73 45.4 11.1 0.109 0.125 0.059 0.128 0.986* -0.034
2.01 1.96 3144 1029 3.06 86.47 38.9 8.7 0.119 0.142 0.075 0.169 0.977* -0.025
1.96 1.92 2115 929 2.28 77.48 35.0 7.1 0.120 0.149 0.086 0.167 0.967* -0.328
1.92 1.88 1171 716 1.64 60.22 29.7 5.2 0.136 0.181 0.117 0.249 0.951* 0.949
1.88 1.84 816 564 1.45 48.87 26.1 4.4 0.156 0.212 0.143 0.363 0.941* 0.000
1.84 1.81 489 383 1.28 31.65 23.5 3.8 0.182 0.250 0.170 0.289 0.910* 0.000
1.81 1.78 136 125 1.09 10.53 16.3 2.5 0.141 0.194 0.133 9.449 0.962* 0.000
68.36 1.78 219054 20799 10.53 85.66 132.7 31.0 0.056 0.059 0.016 0.040 0.999* -0.050
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.7 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 31.0 57.4 2.5
Rmerge(I) 0.056 0.047 0.141
Rmerge(I+/-) 0.055 0.046 0.117
Rmeas(I) 0.059 0.048 0.194
Rmeas(I+/-) 0.059 0.049 0.161
Rpim(I) 0.016 0.012 0.133
Rpim(I+/-) 0.020 0.016 0.110
CC half 0.999 0.999 0.962
Wilson B factor 12.400
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.050 -0.130 0.000
Anomalous slope 1.033
dF/F 0.037
dI/s(dI) 0.846
Total observations 219054 21411 136
Total unique 20799 1380 125
Writing reflections to merged.mtz
Title: From dials.merge
Total Number of Datasets = 2
Dataset 0 HKL_base > HKL_base > HKL_base:
cell 68.3603 68.3603 103.953 90 90 90
wavelength 0
Dataset 1 AUTOMATIC > XTAL > NATIVE:
cell 68.3603 68.3603 103.953 90 90 90
wavelength 0.979493
Number of Columns = 21
Number of Reflections = 20799
Number of Batches = 0
Missing values marked as: nan
Global Cell (obsolete): 68.3603 68.3603 103.953 90 90 90
Resolution: 1.78 - 68.36 A
Sort Order: 0 0 0 0 0
Space Group: P 41 21 2
Space Group Number: 92
Header info:
Column Type Dataset Min Max
H H 0 0 37
K H 0 0 24
L H 0 0 56
FreeR_flag I 0 0 19
IMEAN J 1 -26.2354 5531.73
SIGIMEAN Q 1 0.115465 189.582
N I 1 1 38
F F 1 0.252334 74.0742
SIGF Q 1 0.0394528 1.4001
I(+) K 1 -26.2354 5531.73
SIGI(+) M 1 0.115465 189.582
I(-) K 1 -30.3239 3633.08
SIGI(-) M 1 0.338149 75.147
N(+) I 1 1 21
N(-) I 1 1 20
F(+) G 1 0.252334 74.0742
SIGF(+) L 1 0.0430216 1.4001
F(-) G 1 0.353365 60.2233
SIGF(-) L 1 0.0556057 1.32954
DANO D 1 -3.81385 3.46364
SIGDANO Q 1 0.0799557 1.60514
History (1 lines):
From DIALS 3.dev.1158-ga2f8f714a, run on 2024-09-14 at 08:20:38 BST (2024-09-14 at 07:20:38 GMT)
Writing html report to dials.merge.html
This merges the data and performs a truncation procedure (the French-Wilson treatment reported in the log), giving a merged MTZ file, merged.mtz, that contains intensities and strictly positive structure factor amplitudes (Fs).
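To see why the amplitudes are strictly positive even though some merged intensities are negative (the minimum IMEAN in the MTZ header above is below zero), it can help to look at a toy version of the French-Wilson idea for an acentric reflection. The sketch below assumes a Wilson prior on the true intensity and a Gaussian measurement error, and evaluates the posterior mean of \(F = \sqrt{J}\) by simple numerical integration. The function name `french_wilson_acentric` and its parameters are hypothetical, and this is not the implementation used by dials.merge.

```python
import math


def french_wilson_acentric(i_obs, sigma, mean_intensity, n_steps=20000):
    """Posterior mean of F = sqrt(J) for an acentric reflection.

    Prior on the true intensity J: exp(-J / S) / S for J >= 0, with S = mean_intensity.
    Likelihood: Gaussian in (i_obs - J) with standard deviation sigma.
    """
    # The posterior is a Gaussian truncated at J = 0; integrate well past its peak.
    peak = max(0.0, i_obs - sigma**2 / mean_intensity)
    upper = peak + 10.0 * sigma
    step = upper / n_steps
    # Midpoint rule, using log-weights so that tiny probabilities do not underflow.
    js = [(n + 0.5) * step for n in range(n_steps)]
    log_w = [-j / mean_intensity - (i_obs - j) ** 2 / (2.0 * sigma**2) for j in js]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    norm = sum(weights)
    return sum(math.sqrt(j) * w for j, w in zip(js, weights)) / norm


# Made-up measurements: a negative, a zero, a weak and a strong intensity.
for i_obs, sigma in [(-5.0, 3.0), (0.0, 3.0), (5.0, 3.0), (10000.0, 10.0)]:
    print(i_obs, sigma, french_wilson_acentric(i_obs, sigma, mean_intensity=100.0))
```

Because the prior excludes negative true intensities, the estimate is positive for any measured value, and for strong reflections it approaches the square root of the measured intensity.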