Welcome to SMORES!#
GitHub: https://github.com/austin-mroz/SMORES
Attention
This introduction is under construction but will be filled in by Austin + Lukas.
Things to cover:
What does smores do?
Why would the user want to do this?
What are steric metrics, why are they useful?
What can you do with steric metrics?
Why is using electric field based stuff cool?
Link to STREUSEL
A cool graphic never hurt anyone
Talk about morfeus
Installation#
pip install smores
Getting help#
We want to make sure you’re able to use smores to the
fullest, and we recognize that not everyone who wishes to use
smores is a confident programmer. As such, if you get
stuck using our tool we encourage you to get in touch with us on
Discord (invite link), or by asking us in the Q&A
section. We’re happy to help!
If, on the other hand, you find an issue or bug with
smores, please let us know by opening an issue.
Quickstart#
Getting started with smores is really simple!
You start with
>>> import smores
and then you load a molecule and calculate the steric parameters
>>> molecule = smores.Molecule.from_smiles("CC", dummy_index=0, attached_index=1)
>>> molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
This calculates the parameters using the STREUSEL radii of the atoms.
Tip
The radii which are used can be modified by the user! See
the documentation of Molecule.get_steric_parameters()
for more details.
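The result printed in the quickstart has three named fields, L, B1 and B5, and the machine learning example later on this page converts it to a list. Below is a minimal sketch of working with such a result, assuming it behaves like a named tuple. The StericParameters class here is a stand-in defined for illustration, not the smores class itself:

```python
from typing import NamedTuple


class StericParameters(NamedTuple):
    """Stand-in for the object returned by get_steric_parameters()."""

    L: float
    B1: float
    B5: float


# Values copied from the quickstart output.
params = StericParameters(
    L=3.57164113574581,
    B1=1.9730970556668774,
    B5=2.320611610648539,
)

# Access a single parameter by name.
print(f"B5 = {params.B5:.3f}")

# Unpack all three at once, or turn them into a flat feature vector.
length, b1, b5 = params
features = list(params)
```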
See also
Molecule: For additional documentation and examples.
Molecule.from_xyz_file(): For loading molecules from .xyz files. Other file types are supported too!
Molecule.get_steric_parameters(): For configuration options.
Integration with machine learning workflows#
It doesn’t take a lot of code to get smores working with a great
library like sklearn! Here we calculate the steric parameters for a
bunch of molecules with varying chain lengths and use them to predict
their UFF energy.
import rdkit.Chem.AllChem as rdkit
import smores
from sklearn.linear_model import LinearRegression

def uff_energy(molecule):
    rdkit.SanitizeMol(molecule)
    return rdkit.UFFGetMoleculeForceField(molecule).CalcEnergy()

cores = [smores.rdkit_from_smiles("CBr")]
chains = ["C" * chain_length for chain_length in range(1, 50)]
substituents = [smores.rdkit_from_smiles("Br" + chain) for chain in chains]

X = []
y = []
for combo in smores.combine(cores, substituents):
    molecule = smores.Molecule.from_combination(combo)
    X.append(list(molecule.get_steric_parameters()))
    y.append(uff_energy(combo.product))

reg = LinearRegression()
reg.fit(X, y)
>>> reg.score(X, y)
0.8067966147933283
We hope that’s a useful jumping-off point for some quick prototyping! We expect there are many more useful and interesting properties you can target.
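Note that LinearRegression.score() returns the coefficient of determination R², and in the example it is evaluated on the same data the model was fitted to. To get a rough idea of how well a model generalizes, the same quantity can be computed on molecules held out from fitting. Below is a minimal standard-library sketch. The r_squared helper and the numbers are made up for illustration and are not part of smores or sklearn:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out energies and the model's predictions for them.
y_test = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

score = r_squared(y_test, y_pred)
print(f"Held-out R^2: {score:.2f}")
```

With sklearn, train_test_split from sklearn.model_selection can produce the held-out set, and reg.score(X_test, y_test) then reports the same quantity.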
Plays nice with rdkit#
smores molecules can easily be created from RDKit molecules
import smores
import rdkit.Chem.AllChem as rdkit
rdkit_molecule = rdkit.AddHs(rdkit.MolFromSmiles("CBr"))
rdkit.EmbedMolecule(rdkit_molecule) # Generate a 3-D structure.
smores_molecule = smores.Molecule.from_rdkit(rdkit_molecule, dummy_index=0, attached_index=1)
and we provide a handy function for creating rdkit molecules from SMILES
rdkit_molecule = smores.rdkit_from_smiles("CC")
which takes care of adding hydrogen atoms and generating a 3-D structure for you, unlike RDKit’s own rdkit.MolFromSmiles function.
See also
Molecule.from_rdkit(): For additional configuration options.
EspMolecule.from_rdkit(): For additional configuration options.
rdkit_from_smiles(): For additional documentation.
Quick comparison of substituents#
A very common workflow is to try different substituents on a molecule and compare their steric parameters, so we wrote some code that lets you do this quickly
import smores
import rdkit.Chem as rdkit

cores = [
    smores.rdkit_from_smiles("c1ccccc1Br"),
]
substituents = [
    smores.rdkit_from_smiles("BrCCC"),
    smores.rdkit_from_smiles("BrCC(C)(C)C"),
]
for combo in smores.combine(cores, substituents):
    molecule = smores.Molecule.from_combination(combo)
    params = molecule.get_steric_parameters()
    print(
        f"Combination of {rdkit.MolToSmiles(rdkit.RemoveHs(combo.core))} and "
        f"{rdkit.MolToSmiles(rdkit.RemoveHs(combo.substituent))} "
        f"has SMORES parameters of {params}."
    )
Combination of Brc1ccccc1 and CCCBr has SMORES parameters of StericParameters(L=5.6397512133212935, B1=1.7820154803719914, B5=3.4938688496917782).
Combination of Brc1ccccc1 and CC(C)(C)CBr has SMORES parameters of StericParameters(L=5.668756954899209, B1=1.747631456476209, B5=4.532148595320116).
See also
combine(): For additional examples and configuration options.
rdkit_from_smiles(): For additional configuration options.
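Once the parameters are collected, comparing substituents comes down to sorting. Below is a small sketch using the two results printed in this section. The dictionary layout is just one convenient choice for illustration, not something smores produces:

```python
# Steric parameters copied from the printed output, keyed by substituent SMILES.
results = {
    "CCCBr": {
        "L": 5.6397512133212935,
        "B1": 1.7820154803719914,
        "B5": 3.4938688496917782,
    },
    "CC(C)(C)CBr": {
        "L": 5.668756954899209,
        "B1": 1.747631456476209,
        "B5": 4.532148595320116,
    },
}

# Rank substituents from largest to smallest B5.
ranked = sorted(results, key=lambda smiles: results[smiles]["B5"], reverse=True)

for smiles in ranked:
    print(f"{smiles}: B5 = {results[smiles]['B5']:.2f}")
```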
Using electrostatic potentials#
smores can also calculate the steric parameters using electrostatic
potentials defined on a voxel grid
>>> import smores
>>> molecule = smores.EspMolecule.from_cube_file("HBr.cube", dummy_index=0, attached_index=1)
>>> molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
See also
EspMolecule: For additional documentation and examples.
EspMolecule.get_steric_parameters(): For configuration options.
smores.psi4: For using Psi4 to make .cube files.
Calculating electrostatic potentials#
Attention
Psi4 is a big dependency and we therefore do not install it automatically.
If you wish to use the functions in smores.psi4 you will have to install
Psi4 yourself. We find
conda install -c psi4 psi4
tends to work for us, but we recommend you check out Psi4’s official documentation for up-to-date information.
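Because Psi4 is optional, it can help to check that it is available before calling anything in smores.psi4. Below is a minimal sketch using only the standard library. It assumes that smores.psi4 relies on the psi4 Python package being importable, and the helper name psi4_available is hypothetical:

```python
import importlib.util


def psi4_available() -> bool:
    """Return True if the psi4 Python package can be imported."""
    return importlib.util.find_spec("psi4") is not None


if not psi4_available():
    print("Psi4 was not found; install it before using smores.psi4.")
```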
import smores
import smores.psi4

cube_path = smores.psi4.calculate_electrostatic_potential(
    molecule=smores.rdkit_from_smiles("Br"),
    output_directory="outdir",
    grid_origin=(-3., -3., -3.),
    grid_length=10.,
    num_voxels_per_dimension=20,
)
esp_molecule = smores.EspMolecule.from_cube_file(cube_path, dummy_index=0, attached_index=1)
>>> esp_molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
See also
psi4.calculate_electrostatic_potential(): For configuration options.
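The grid arguments in the example determine where the potential is sampled. Below is a rough sketch of the arithmetic. It rests on an assumption that is not confirmed here, so check the documentation of psi4.calculate_electrostatic_potential(): that the grid is a cube with one corner at grid_origin, a side of grid_length, and num_voxels_per_dimension voxels along each axis.

```python
# Values taken from the example in this section.
grid_origin = (-3.0, -3.0, -3.0)
grid_length = 10.0
num_voxels_per_dimension = 20

# Edge length of one voxel, under the assumption stated above.
voxel_size = grid_length / num_voxels_per_dimension

# Corner of the grid opposite to grid_origin.
grid_end = tuple(start + grid_length for start in grid_origin)

# Total number of voxels in the cubic grid.
num_voxels = num_voxels_per_dimension**3

print(f"voxel size: {voxel_size}, grid end: {grid_end}, voxels: {num_voxels}")
```

Under this reading, doubling num_voxels_per_dimension halves the voxel size but multiplies the number of voxels by eight, so finer grids get expensive quickly.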
Optimizing molecules with xtb#
The values of the calculated steric parameters depend on the
geometry of the molecule. There are many different ways to
determine the geometry of a molecule. It may well be that the
structure returned by rdkit_from_smiles() is suitable
for your purposes.
However, if that’s not the case, you may want to optimize the
geometry yourself. To make this smoother, we provide a helper
function that optimizes your structures
using xtb
>>> import smores
>>> molecule = smores.rdkit_from_smiles("CBr")
>>> optimized = smores.xtb.optimize_geometry(molecule, "xtb_output")
>>> smores_molecule = smores.Molecule.from_rdkit(optimized, dummy_index=0, attached_index=1)
>>> smores_molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
Whether this is necessary or desirable, we leave to your judgement, but it’s here if you need it.
Note
This example assumes that you have xtb available in your PATH.
If that’s not the case, you can set the location of the
xtb binary using the xtb_path parameter. For more details,
see xtb.optimize_geometry().
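To find out whether xtb is on your PATH before running the example, the standard library's shutil.which() can be used. This is a small sketch that is independent of smores. Only the xtb_path parameter name comes from the note in this section:

```python
import shutil

# Look for an executable named "xtb" on PATH; None means it was not found.
xtb_location = shutil.which("xtb")

if xtb_location is None:
    print("xtb is not on PATH; pass its location via the xtb_path parameter.")
else:
    print(f"Found xtb at {xtb_location}")
```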
See also
xtb.optimize_geometry(): For configuration options.