Welcome to SMORES!

GitHub: https://github.com/austin-mroz/SMORES

Attention

This introduction is under construction but will be filled in by Austin + Lukas.

Things to cover:

  • What does smores do?

  • Why would the user want to do this?

  • What are steric metrics, why are they useful?

  • What can you do with steric metrics?

  • Why is using electric field based stuff cool?

  • Link to STREUSEL

  • A cool graphic never hurt anyone

  • Talk about morfeus

Installation

pip install smores

Getting help

We want to make sure you’re able to use smores to the fullest, and we recognize that not everyone who wishes to use smores is a confident programmer. If you get stuck using our tool, we encourage you to get in touch with us on Discord (invite link) or to ask in the Q&A section. We’re happy to help!

If, on the other hand, you find a bug in smores, please let us know by opening an issue.

Quickstart

Getting started with smores is really simple!

You start with

>>> import smores

and then you load a molecule and calculate the steric parameters

>>> molecule = smores.Molecule.from_smiles("CC", dummy_index=0, attached_index=1)
>>> molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)

This calculates the parameters using the STREUSEL radii of the atoms.

Tip

The radii which are used can be modified by the user! See the documentation of Molecule.get_steric_parameters() for more details.
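
The machine learning example further down this page calls list() on the returned parameters, so the result can be treated as a sequence of three numbers. The sketch below uses a stand-in NamedTuple (the hypothetical StericParametersSketch, which mirrors the printed output above but is not the smores class) to show how such a result could be unpacked or turned into a feature row.

```python
from typing import NamedTuple


class StericParametersSketch(NamedTuple):
    """Stand-in that mirrors the printed result; NOT the smores class."""

    L: float
    B1: float
    B5: float


# Values copied from the ethane output shown above.
params = StericParametersSketch(
    L=3.57164113574581,
    B1=1.9730970556668774,
    B5=2.320611610648539,
)

# Unpack into individual values ...
L, B1, B5 = params
# ... or build a plain list, for example one row of a feature matrix.
feature_row = list(params)
print(feature_row)
```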

Integration with machine learning workflows

It doesn’t take a lot of code to get smores working with a great library like sklearn! Here we calculate the steric parameters for a bunch of molecules with varying chain lengths and use them to predict their UFF energy.

import rdkit.Chem.AllChem as rdkit
import smores
from sklearn.linear_model import LinearRegression

def uff_energy(molecule):
    rdkit.SanitizeMol(molecule)
    return rdkit.UFFGetMoleculeForceField(molecule).CalcEnergy()

cores = [smores.rdkit_from_smiles("CBr")]
chains = ["C" * chain_length for chain_length in range(1, 50)]
substituents = [smores.rdkit_from_smiles("Br" + chain) for chain in chains]

X = []
y = []
for combo in smores.combine(cores, substituents):
    molecule = smores.Molecule.from_combination(combo)
    X.append(list(molecule.get_steric_parameters()))
    y.append(uff_energy(combo.product))

reg = LinearRegression()
reg.fit(X, y)
>>> reg.score(X, y)
0.8067966147933283

We hope that’s a useful jumping off point for some quick prototyping! We expect there are much more useful and interesting properties you can target.
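
For reference, the number returned by reg.score() for a LinearRegression is the coefficient of determination, R². In the example above it is evaluated on the same X and y that were used for fitting, so it describes how well the line fits those molecules rather than how well it would predict unseen ones. The following is a minimal pure-Python sketch of the formula. The numbers are made up for illustration and are not smores or UFF output.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: how far predictions are from the targets.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far targets are from their own mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, purely for illustration.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y_true, y_pred), 3))  # prints 0.98
```

A score of 1.0 means the predictions match the targets exactly, while a model that always predicts the mean of y scores 0.0.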

Plays nice with rdkit

smores molecules can easily be created from RDKit molecules

import smores
import rdkit.Chem.AllChem as rdkit

rdkit_molecule = rdkit.AddHs(rdkit.MolFromSmiles("CBr"))
rdkit.EmbedMolecule(rdkit_molecule)  # Generate a 3-D structure.
smores_molecule = smores.Molecule.from_rdkit(rdkit_molecule, dummy_index=0, attached_index=1)

and we provide a handy function for creating rdkit molecules from SMILES

rdkit_molecule = smores.rdkit_from_smiles("CC")

which takes care of adding hydrogen atoms and generating a 3-D structure for you, unlike RDKit’s own rdkit.MolFromSmiles function.

Quick comparison of substituents

A very common workflow is to try different substituents on a molecule and compare their steric parameters, so we wrote some code that lets you do this quickly

import smores
import rdkit.Chem as rdkit

cores = [
    smores.rdkit_from_smiles("c1ccccc1Br"),
]
substituents = [
    smores.rdkit_from_smiles("BrCCC"),
    smores.rdkit_from_smiles("BrCC(C)(C)C"),
]
for combo in smores.combine(cores, substituents):
    molecule = smores.Molecule.from_combination(combo)
    params = molecule.get_steric_parameters()
    print(
        f"Combination of {rdkit.MolToSmiles(rdkit.RemoveHs(combo.core))} and "
        f"{rdkit.MolToSmiles(rdkit.RemoveHs(combo.substituent))} "
        f"has SMORES parameters of {params}."
    )
Combination of Brc1ccccc1 and CCCBr has SMORES parameters of StericParameters(L=5.6397512133212935, B1=1.7820154803719914, B5=3.4938688496917782).
Combination of Brc1ccccc1 and CC(C)(C)CBr has SMORES parameters of StericParameters(L=5.668756954899209, B1=1.747631456476209, B5=4.532148595320116).

Using electrostatic potentials

smores can also calculate the steric parameters using electrostatic potentials defined on a voxel grid

>>> import smores
>>> molecule = smores.EspMolecule.from_cube_file("HBr.cube", dummy_index=0, attached_index=1)
>>> molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
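
When choosing dummy_index and attached_index for a cube file, it can help to see which atoms the file contains and in what order. The sketch below reads only the header of a cube file, using the standard cube layout: two comment lines, one line with the atom count and grid origin, three lines with voxel counts and axis vectors, then one line per atom. It assumes the indices count atoms in file order, which is an assumption worth checking against the EspMolecule documentation. The helper read_cube_atoms and the header text are hypothetical and not part of smores.

```python
from typing import List, Tuple


def read_cube_atoms(lines: List[str]) -> Tuple[Tuple[int, ...], List[int]]:
    """Return (voxel counts per axis, atomic numbers in file order)."""
    # Line 3 starts with the atom count (may be negative in some cube files).
    num_atoms = abs(int(lines[2].split()[0]))
    # Lines 4-6 each start with the number of voxels along one axis.
    voxel_counts = tuple(abs(int(lines[i].split()[0])) for i in (3, 4, 5))
    # Each atom line starts with the atomic number.
    atom_lines = lines[6 : 6 + num_atoms]
    atomic_numbers = [int(line.split()[0]) for line in atom_lines]
    return voxel_counts, atomic_numbers


# Made-up header for illustration only; a real file continues with grid data.
example = """\
made-up cube header for illustration
second comment line
    2   -3.000000   -3.000000   -3.000000
   20    0.500000    0.000000    0.000000
   20    0.000000    0.500000    0.000000
   20    0.000000    0.000000    0.500000
    1    1.000000    0.000000    0.000000    0.000000
   35   35.000000    0.000000    0.000000    2.670000
""".splitlines()

voxel_counts, atomic_numbers = read_cube_atoms(example)
for index, atomic_number in enumerate(atomic_numbers):
    print(index, atomic_number)
```

For a real file, the lines can be obtained with open("HBr.cube").read().splitlines().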

Calculating electrostatic potentials

Attention

Psi4 is a big dependency and we therefore do not install it automatically. If you wish to use the functions in smores.psi4 you will have to install Psi4 yourself. We find

conda install -c psi4 psi4

tends to work for us, but we recommend you check out Psi4’s official documentation for up-to-date information.
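
Because Psi4 is not installed automatically, a script may want to check for it up front and fail with a clear message. The following is a minimal sketch using only the Python standard library. The helper name is_installed is hypothetical and not part of smores.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("psi4"):
    print("Psi4 not found; install it before using smores.psi4.")
```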

import smores
import smores.psi4

cube_path = smores.psi4.calculate_electrostatic_potential(
    molecule=smores.rdkit_from_smiles("Br"),
    output_directory="outdir",
    grid_origin=(-3., -3., -3.),
    grid_length=10.,
    num_voxels_per_dimension=20,
)
esp_molecule = smores.EspMolecule.from_cube_file(cube_path, dummy_index=0, attached_index=1)
>>> esp_molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)
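
The grid arguments decide which region of space the potential is sampled in. The sketch below works through the arithmetic for the values used above, under two assumptions that are not confirmed by this page: that grid_length is the edge length of a cube starting at grid_origin, and that the voxel size is grid_length divided by num_voxels_per_dimension. The helpers grid_extent and inside_grid are hypothetical and not part of smores.

```python
def grid_extent(grid_origin, grid_length, num_voxels_per_dimension):
    """Return (upper corner, voxel size) for a cubic grid.

    Assumes the grid spans from grid_origin to grid_origin + grid_length
    along every axis; this is an assumption about the meaning of the
    arguments, not documented smores behaviour.
    """
    upper_corner = tuple(o + grid_length for o in grid_origin)
    voxel_size = grid_length / num_voxels_per_dimension
    return upper_corner, voxel_size


def inside_grid(point, grid_origin, upper_corner):
    """Check that a point lies within the box on every axis."""
    return all(
        low <= coord <= high
        for low, coord, high in zip(grid_origin, point, upper_corner)
    )


origin = (-3.0, -3.0, -3.0)
upper, size = grid_extent(origin, 10.0, 20)
print(upper, size)
print(inside_grid((0.0, 0.0, 0.0), origin, upper))
```

Under these assumptions the box runs from -3 to 7 along each axis, so it is not centered on the coordinate origin. A check like inside_grid can confirm that every atom, plus some margin, falls inside the sampled region.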

Optimizing molecules with xtb

The values of the calculated steric parameters depend on the geometry of the molecule. There are many different ways to determine the geometry of a molecule. It may well be that the structure returned by rdkit_from_smiles() is suitable for your purposes.

However, if that’s not the case, you may want to optimize the geometry yourself. To make your experience using smores a little smoother, we provide a helper function in case you want to optimize your structures using xtb

>>> import smores
>>> molecule = smores.rdkit_from_smiles("CBr")
>>> optimized = smores.xtb.optimize_geometry(molecule, "xtb_output")
>>> smores_molecule = smores.Molecule.from_rdkit(optimized, dummy_index=0, attached_index=1)
>>> smores_molecule.get_steric_parameters()
StericParameters(L=3.57164113574581, B1=1.9730970556668774, B5=2.320611610648539)

Whether this is necessary or desirable, we leave to your judgement, but it’s here if you need it.

Note

This example assumes that you have xtb available in your PATH. If that’s not the case, you can set the location of the xtb binary using the xtb_path parameter. For more details, see xtb.optimize_geometry().
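
To find out whether xtb is discoverable before running an optimization, the standard library's shutil.which can be used. The following is a minimal sketch. The helper name find_executable is hypothetical and not part of smores.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


xtb_location = find_executable("xtb")
if xtb_location is None:
    print("xtb not found on PATH; pass its location via xtb_path instead.")
else:
    print(f"xtb found at {xtb_location}")
```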
