Click here to go to the corresponding page for the latest version of DIALS
dials.algorithms.symmetry¶
Methods for symmetry determination.
This module provides a base class for symmetry determination algorithms.
-
class
dials.algorithms.symmetry.
symmetry_base
(intensities, normalisation='ml_aniso', lattice_symmetry_max_delta=2.0, d_min=<libtbx.AutoType object>, min_i_mean_over_sigma_mean=4, min_cc_half=0.6, relative_length_tolerance=None, absolute_angle_tolerance=None)[source]¶ Bases:
object
Base class for symmetry analysis.
-
static
kernel_normalisation
(intensities)[source]¶ Kernel normalisation of the input intensities.
Parameters: intensities (cctbx.miller.array) – The intensities to be normalised. Returns: The normalised intensities. Return type: cctbx.miller.array
-
static
ml_aniso_normalisation
(intensities)[source]¶ Anisotropic maximum-likelihood normalisation of the input intensities.
Parameters: intensities (cctbx.miller.array) – The intensities to be normalised. Returns: The normalised intensities. Return type: cctbx.miller.array
-
static
ml_iso_normalisation
(intensities)[source]¶ Isotropic maximum-likelihood normalisation of the input intensities.
Parameters: intensities (cctbx.miller.array) – The intensities to be normalised. Returns: The normalised intensities. Return type: cctbx.miller.array
-
static
quasi_normalisation
(intensities)[source]¶ Quasi-normalisation of the input intensities.
Parameters: intensities (cctbx.miller.array) – The intensities to be normalised. Returns: The normalised intensities. Return type: cctbx.miller.array
-
static
Algorithms for determination of Laue group symmetry.
-
class
dials.algorithms.symmetry.determine_space_group.
CorrelationCoefficientAccumulator
(x=None, y=None)[source]¶ Bases:
object
Class for incremental computation of correlation coefficients.
- Uses the single-pass formula for Pearson correlation coefficient:
- https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#For_a_sample
-
accumulate
(x, y)[source]¶ Accumulate the x and y values provided.
Parameters: - x (list) – The list of x values to accumulate.
- y (list) – The list of y values to accumulate.
-
coefficient
()[source]¶ Calculate the correlation coefficient.
Returns: The correlation coefficient. Return type: float
-
denominator
()[source]¶ “Calculate the denominator of the correlation coefficient formula.
\[\sqrt{n \sum{x^2} - \sum{x}^2} \sqrt{n \sum{y^2} - \sum{y}^2}\]Returns: The value of the denominator. Return type: float
-
class
dials.algorithms.symmetry.determine_space_group.
ScoreSubGroup
(subgroup, sym_op_scores)[source]¶ Bases:
object
Score the probability of a given subgroup being the true subgroup.
- Calculates the combined correlation coefficients for symmetry operations present/absent from the subgroup.
- Calculates overall Zcc scores for symmetry elements present/absent from the subgroup.
- Calculates the overall likelihood for this subgroup.
See appendix A2 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.
-
as_dict
()[source]¶ Return a dictionary representation of the subgroup scoring.
- The dictionary will contain the following keys:
- patterson_group: The current subgroup
- likelihood: The likelihood of the subgroup being correct
- confidence: The confidence of the subgroup being correct
- z_cc_for: The combined Z-scores for all symmetry elements present in the subgroup
- z_cc_against: The combined Z-scores for all symmetry elements present in the lattice group but not in the subgroup
- z_cc_net: The net Z-score, i.e. z_cc_for - z_cc_against
- cc_for: The overall correlation coefficient for all symmetry elements present in the subgroup
- cc_against: The overall correlation coefficient for all symmetry elements present in the lattice group but not in the subgroup
- max_angular_difference: The maximum angular difference between the symmetrised unit cell and the P1 unit cell.
- cb_op: The change of basis operation from the input unit cell to the ‘best’ unit cell.
Returns: Return type: dict
-
class
dials.algorithms.symmetry.determine_space_group.
ScoreSymmetryElement
(intensities, sym_op, cc_true, cc_sig_fac)[source]¶ Bases:
object
Analyse intensities for presence of a given symmetry operation.
- Calculate the correlation coefficient, CC, for the given sym op.
- Calculate the probability of observing this CC if the sym op is present, p(CC; S), modelled by a Cauchy distribution centred on cc_true and width gamma = sigma_cc.
- Calculate the probability of observing this CC if the sym op is NOT present, p(CC; !S).
- Calculate the likelihood of symmetry element being present, p(S; CC) = p(CC; S) / (p(CC; S) + p(CC; !S))
See appendix A1 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.
-
as_dict
()[source]¶ Return a dictionary representation of the symmetry element scoring.
- The dictionary will contain the following keys:
- likelihood: The likelihood of the symmetry element being present
- z_cc: The Z-score for the correlation coefficent
- cc: The correlation coefficient for the symmetry element
- n_ref: The number of reflections contributing to the correlation coefficient
- operator: The xyz representation of the symmetry element
Returns: Return type: dict
-
class
dials.algorithms.symmetry.determine_space_group.
determine_space_group
(intensities, normalisation='ml_aniso', lattice_symmetry_max_delta=2.0, d_min=<libtbx.AutoType object>, min_i_mean_over_sigma_mean=4, min_cc_half=0.6, relative_length_tolerance=None, absolute_angle_tolerance=None)[source]¶ Bases:
dials.algorithms.symmetry.symmetry_base
Determination of Laue group symmetry using algorithms similar to POINTLESS.
See also
Evans, P. (2006). Acta Cryst. D62, 72-82 and Evans, P. R. (2011). Acta Cryst. D67, 282-292.
-
dials.algorithms.symmetry.determine_space_group.
trunccauchy_pdf
(x, a, b, loc=0, scale=1)[source]¶ Calculate a truncated Cauchy probability density function.
Parameters: - x (float) – The point at which to calculate the PDF.
- a (float) – The lower bound of the truncated distribution.
- b (float) – The upper bound of the truncated distribution.
- loc (float) – The location parameter for the Cauchy distribution.
- scale (float) – The scale parameter for the Cauchy distribution.
Returns: The value of the probability density function.
Return type:
Functions to help with reindexing against a reference dataset.
-
dials.algorithms.symmetry.reindex_to_reference.
determine_reindex_operator_against_reference
(test_miller_set, reference_miller_set)[source]¶ Reindex a miller set to match a reference miller set.
This function takes two miller arrays, a reference and a test array. The space group is checked to see if any reindexing may be required to give consistent indexing between both datasets. If possible twin operators exist, the different indexing options are tested against the reference set, using the correlation between datasets as the test.
Parameters: - test_miller_set (cctbx.miller.array) – The input miller set to be reindexed.
- reference_miller_set (cctbx.miller.array) – The reference miller set.
Returns: The change of basis operator which should be applied to the test dataset to give consistent indexing with the reference.
Return type: cctbx.sgtbx.change_of_basis_op
dials.algorithms.symmetry.cosym¶
Methods for symmetry determination from partial datasets.
This module implements the methods of Gildea, R. J. & Winter, G. (2018). Acta Cryst. D74, 405-410 for determination of Patterson group symmetry from sparse multi-crystal data sets in the presence of an indexing ambiguity.
-
class
dials.algorithms.symmetry.cosym.
analyse_datasets
(intensities, params)[source]¶ Bases:
dials.algorithms.symmetry.symmetry_base
Peform cosym analysis.
Peform cosym analysis on the input intensities using the methods of Gildea, R. J. & Winter, G. (2018). Acta Cryst. D74, 405-410 for determination of Patterson group symmetry from sparse multi-crystal data sets in the presence of an indexing ambiguity.
-
dials.algorithms.symmetry.cosym.
plot_dendrogram
(linkage_matrix, file_name, labels=None, color_threshold=0.05)[source]¶ Plot dendrogram from a linkage matrix.
Parameters: - linkage_matrix (numpy.ndarray) – The hierarchical clustering of centroids of
the initial clustering as produced by
scipy.cluster.hierarchy.linkage()
. - file_name (str) – The output file name.
- labels (list) – Optional labels for the leaves of the dendrogram.
- color_threshold (float) – The color threshold passed to the :func:scipy.cluster.hierarchy.dendrogram` function.
- linkage_matrix (numpy.ndarray) – The hierarchical clustering of centroids of
the initial clustering as produced by
-
dials.algorithms.symmetry.cosym.
plot_matrix
(correlation_matrix, linkage_matrix, file_name, labels=None, color_threshold=0.05)[source]¶ Plot correlation and linkage matrices.
Parameters: - correlation_matrix (numpy.ndarray) – The distance matrix used to generate
the
linkage_matrix
. - linkage_matrix (numpy.ndarray) – The hierarchical clustering of centroids of
the initial clustering as produced by
scipy.cluster.hierarchy.linkage()
. - file_name (str) – The output file name.
- labels (list) – Optional labels for the leaves of the dendrogram.
- color_threshold (float) – The color threshold passed to the :func:scipy.cluster.hierarchy.dendrogram` function.
- correlation_matrix (numpy.ndarray) – The distance matrix used to generate
the
Target function for cosym analysis.
-
class
dials.algorithms.symmetry.cosym.target.
Target
(intensities, lattice_ids, weights=None, min_pairs=None, lattice_group=None, dimensions=None, nproc=1)[source]¶ Bases:
object
Target function for cosym analysis.
-
dim
¶ int – The number of dimensions used in the analysis.
-
compute_functional
(x)[source]¶ Compute the target function at coordinates x.
Parameters: x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc. Returns: The value of the target function at coordinates x. Return type: f (float)
-
compute_functional_and_gradients
(x)[source]¶ Compute the target function and gradients at coordinates x.
Parameters: x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc. Returns: f: The value of the target function at coordinates x. grad: The gradients of the target function with respect to the parameters. Return type: Tuple[float, scitbx.array_family.flex.double]
-
compute_gradients_fd
(x, eps=1e-06)[source]¶ Compute the gradients at coordinates x using finite differences.
Parameters: - x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
- eps (float) – The value of epsilon to use in finite difference calculations.
Returns: The gradients of the target function with respect to the parameters.
Return type:
-
curvatures
(x)[source]¶ Compute the curvature of the target function.
Parameters: x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc. Returns: The curvature of the target function with respect to the parameters. Return type: curvs (scitbx.array_family.flex.double)
-
curvatures_fd
(x, eps=1e-06)[source]¶ Compute the curvatures at coordinates x using finite differences.
Parameters: - x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
- eps (float) – The value of epsilon to use in finite difference calculations.
Returns: The curvature of the target function with respect to the parameters.
Return type: curvs (scitbx.array_family.flex.double)
-
get_sym_ops
()[source]¶ Get the list of symmetry operations used in the analysis.
Returns: The list of symmetry operations. Return type: List[cctbx.sgtbx.rt_mx]
-
plot_rij_cumulative_frequency
(plot_name=None)[source]¶ Plot the cumulative frequency of the rij values.
Parameters: plot_name (str) – The file name to save the plot to. If this is not defined then the plot is displayed in interactive mode.
-
plot_rij_histogram
(plot_name=None)[source]¶ Plot a histogram of the rij values.
Parameters: plot_name (str) – The file name to save the plot to. If this is not defined then the plot is displayed in interactive mode.
-
plot_rij_matrix
(plot_name=None)[source]¶ Plot the matrix of rij values.
Parameters: plot_name (str) – The file name to save the plot to. If this is not defined then the plot is displayed in interactive mode.
-
plot_wij_cumulative_frequency
(plot_name=None)[source]¶ Plot the cumulative frequency of the wij values.
Parameters: plot_name (str) – The file name to save the plot to. If this is not defined then the plot is displayed in interactive mode.
-
plot_wij_histogram
(plot_name=None)[source]¶ Plot a histogram of the wij values.
Parameters: plot_name (str) – The file name to save the plot to. If this is not defined then the plot is displayed in interactive mode.
-
LBFGS refinement engine for cosym analysis.
-
class
dials.algorithms.symmetry.cosym.engine.
lbfgs_with_curvs
(target, coords, use_curvatures=True, termination_params=None)[source]¶ Bases:
object
Minimise a target function using the LBFGS minimiser.
Implementation of an LBFGS minimiser using curvature information, according to the interface defined by
scitbx.lbfgs
.-
callback_after_step
(minimizer)[source]¶ Log progress after each successful step of the minimisation.
-
compute_functional_and_gradients
()[source]¶ Compute the functional and gradients.
Returns: A tuple of the functional and gradients. Return type: tuple
-
compute_functional_gradients_and_curvatures
()[source]¶ Compute the functional, gradients and curvatures.
Returns: A tuple of the functional, gradients and curvatures. Return type: tuple
-
Seed clustering method for cosym analysis.
-
class
dials.algorithms.symmetry.cosym.seed_clustering.
seed_clustering
(coordinates, n_datasets, n_sym_ops, min_silhouette_score, n_clusters=<libtbx.AutoType object>, plot_prefix=None)[source]¶ Bases:
object
Perform seed clustering of coordinates.
Labels points into clusters such that cluster contains exactly one copy of each dataset, then performs silhouettete analysis on the resulting clusters to determine the true number of clusters present, under the constraint that only equal-sized clusterings are valid, i.e. each dataset should appear an equal number of times in each cluster.
See also
https://en.wikipedia.org/wiki/Silhouette_(clustering) http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html
-
cluster_labels
¶ scitbx.array_family.flex.int – A label for each coordinate.
-