DIALS core meeting 2020-09-17
Previous Actions
- ND: rewrite contribution guidelines (as PR) to include towncrier/newsfragments
- DWP: have a look over contributing guidelines and email comments to ND (20200903)
- MG+ND: write a proposal for ‘overall architecture discussion’ and ‘new installer’
- ND: update CONTRIBUTING in a (separate) pull request to include a soft-imposition of stable master, dials#1353.
Agenda
In this meeting we will only discuss the agenda items on typing and dxtbx json/msgpack performance.
Typing: mypy
This is a pre-commit hook together with a GitHub action (#1357). Mypy only checks explicit type declarations. To deploy this we need to do minor cleanups, as some existing type declarations are wrong. At an absolute minimum this would guarantee that all type declarations we make are correct.
The mypy action checks the entire codebase, but only flags issues where we have conflicting type declaration/usage. The pre-commit hook only checks modified files.
This can be added relatively quickly, but existing warnings must be addressed before enabling mypy type checking.
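As a minimal illustration of the kind of conflict mypy reports, consider the sketch below. The functions are hypothetical and not taken from the DIALS codebase. The first declaration promises an int, but one code path returns None; the code runs, yet mypy would report an incompatible return value because declaration and usage disagree. The second version has a declaration that matches the behaviour.

```python
from typing import List, Optional


def first_index_wrong(values: List[float], threshold: float) -> int:
    """Hypothetical example: the declared return type is wrong."""
    for i, v in enumerate(values):
        if v > threshold:
            return i
    return None  # conflicts with the declared "-> int"


def first_index(values: List[float], threshold: float) -> Optional[int]:
    """Same logic, with a declaration that matches what is returned."""
    for i, v in enumerate(values):
        if v > threshold:
            return i
    return None
```

Cleanups of this sort (correcting the declaration rather than changing the logic) are what would be needed before the check can be enabled.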
Discussion outcome:
- Let’s not enable this straight away
- Have an introductory lecture on typing
- Allow people to run this stand-alone and outside of a pre-commit
- Add links to PEP articles
Typing: pytype
Separately from mypy we can also add a pytype action (#1364). Unlike mypy, this does do type inference, so it can go far beyond existing type declarations. However, it only covers a smaller part of the codebase, as it will not process any file that depends directly or indirectly on any file that contains pytype warnings/errors. In DIALS this currently excludes 75% of all Python files, but as we start addressing issues this percentage is expected to fall.
Pytype is also much more expensive to run, so I would not recommend running this as a pre-commit hook. I believe this could be very useful as a pull request GitHub action though. I have implemented a proof-of-concept action that only fails whenever new errors are introduced.
This can be added straight away; no code changes are required.
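The "only fail on new errors" behaviour can be sketched as a set difference between the errors known on the base branch and those found on the pull request. This is an illustrative sketch only: the function names and the "file:error-class" identifier format are assumptions, and the proof-of-concept action in #1364 may work differently.

```python
from typing import Iterable, Set


def new_errors(baseline: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Return errors present in the current run but not in the baseline."""
    return set(current) - set(baseline)


def should_fail(baseline: Iterable[str], current: Iterable[str]) -> bool:
    """Fail only if the pull request introduces errors not already known."""
    return bool(new_errors(baseline, current))


# Hypothetical error identifiers of the form "<file>:<error class>".
# Line numbers are left out so that unrelated edits do not shift them.
baseline = {"a.py:attribute-error", "b.py:import-error"}
pull_request = {"a.py:attribute-error", "c.py:wrong-arg-types"}

print(sorted(new_errors(baseline, pull_request)))  # ['c.py:wrong-arg-types']
print(should_fail(baseline, pull_request))  # True
```

Note that fixing an existing error (here b.py) never causes a failure; only errors absent from the baseline do.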
Discussion outcome:
- Sounds useful
- Have to monitor the output on pull requests; we don’t want this to annoy us by regularly taking away the green tick
dxtbx json/msgpack performance
ASB reports json being problematic when importing e.g. 50,000 experiments, as everything is stored in a single file, so everything needs to be read at once.
- GW: cf. Load before heat death (dxtbx#118). At the core of it are bad design choices in dxtbx.
- Future discussion topics:
- treatment of large serial datasets
- using an HDF5 backend for reflection files
- dials#1407 specific proposal
- ASB: I want check_format to go away.
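The single-file problem ASB reports can be illustrated with the standard library alone. The sketch below is not the dxtbx file format: the records are hypothetical stand-ins for experiments, and the one-record-per-line layout is shown only as a contrast, not as a proposal from the meeting. With a single JSON document the whole list must be parsed before any record is usable, whereas a record-per-line layout allows records to be parsed lazily.

```python
import io
import json

# Hypothetical stand-in for experiment records; not the dxtbx format.
records = [{"id": i, "wavelength": 1.0} for i in range(5)]

# Single document: json.load parses the entire list in one go.
single = io.StringIO(json.dumps(records))
all_at_once = json.load(single)


def iter_records(stream):
    """Yield one parsed record per non-empty line of the stream."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# One record per line: only the lines consumed so far are parsed.
per_line = io.StringIO("\n".join(json.dumps(r) for r in records))
first = next(iter_records(per_line))
```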
Discussion outcomes:
- ASB:
- General consensus that this is needed.
- Make models and store them
- Load them in parallel with minimal memory requirements
- Version all models and rewind them
- This needs a lot of thought and at this time we don’t have agreement here.
- No need to copy data like shoeboxes when they aren’t being changed; operate by reference
DIALS documentation build
(to be discussed in the following meeting)
Should the DIALS documentation build outside a cctbx environment?
- Currently you can only build the DIALS documentation inside a cctbx environment. This means a remote documentation build needs to build cctbx first, so we can’t easily run the documentation build on every commit. When things break they are difficult to fix. We can’t use readthedocs.
- MG: I believe the main obstacles are three cctbx Sphinx plugins that may need to be extracted or removed from DIALS documentation, and some phil parsing logic.
Renaming master branch → main
(to be discussed in the following meeting)
- Nothing much will happen on this any time soon - either way we’ll wait for GitHub support on this one
- General assent that this is useful
- From 1st of October new GitHub repositories will have main as the default
- MG: Suggest turning this into an issue and removing from agenda until GitHub provides a migration path for existing repositories.
cbflib_adaptbx dependency
(to be discussed in the following meeting)
We are only referencing the compress and uncompress functions, and cbflib_adaptbx has an MSVC-ancient-dependency. We could copy the functions into dxtbx_ext and remove the dependency.
Any other business
(to be discussed in the following meeting)
Next meeting
October 1st, 4pm UK time, 8am PDT.