Manipulate BigDFT input files
The goal of this notebook is to show how the MyBigDFT package can be used to manipulate BigDFT input files. Running a BigDFT calculation requires an initial geometry and, generally, a set of input parameters (default parameters are used if none are given).
The Posinp and Atom classes are used to create input geometries, while the InputParams class specifies the input parameters of a BigDFT calculation. All three are presented in this notebook.
[1]:
from mybigdft import Posinp, Atom, InputParams
import numpy as np
The Posinp class
This class is used to manipulate BigDFT input geometries in the xyz format:
2 angstroem # Number of atoms, units
free # Boundary conditions
N 0.0 0.0 0.0 # Atom type and Cartesian coordinates of each atom
N 0.0 0.0 1.1
[2]:
atoms = [Atom('N', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])]
pos = Posinp(atoms, units="angstroem", boundary_conditions="free")
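To make the file layout shown above more concrete, here is a minimal, standard-library-only sketch that parses it. The function name parse_xyz and its return value are hypothetical and chosen for illustration; this is not how Posinp is implemented.

```python
from typing import List, Optional, Tuple

ParsedAtom = Tuple[str, List[float]]


def parse_xyz(text: str) -> Tuple[str, str, Optional[List[str]], List[ParsedAtom]]:
    """Parse the xyz layout shown above (hypothetical helper, not MyBigDFT code)."""
    # Drop trailing comments and empty lines
    lines = [line.split("#")[0].strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    # First line: number of atoms and units
    n_atoms, units = lines[0].split()
    # Second line: boundary conditions, optionally followed by the cell
    boundary_conditions, *cell = lines[1].split()
    # Remaining lines: atom type followed by three coordinates
    parsed = []
    for line in lines[2:2 + int(n_atoms)]:
        atom_type, *coords = line.split()
        parsed.append((atom_type, [float(c) for c in coords]))
    return units, boundary_conditions, (cell or None), parsed


example = """\
2 angstroem  # Number of atoms, units
free         # Boundary conditions
N 0.0 0.0 0.0
N 0.0 0.0 1.1
"""
parsed_units, parsed_bc, parsed_cell, parsed_atoms = parse_xyz(example)
print(parsed_units, parsed_bc, parsed_cell, parsed_atoms)
```

The cell is returned as None when the second line only contains the boundary conditions, which mirrors the free boundary conditions case.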
The Atom class is mostly used to store the data related to a given atom. It requires an atom type and Cartesian coordinates, but is not aware of the units used. It also provides some extra functionality, such as a translate method, which takes a three-component vector as its argument and returns another Atom instance whose position is that of the original atom translated by the vector:
[3]:
Atom('N', [0, 0, 0]).translate([0, 0, 1])
[3]:
Atom('N', [0.0, 0.0, 1.0])
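The idea of a translate method that returns a new object instead of modifying the existing one can be sketched without MyBigDFT. The SimpleAtom class below is a hypothetical stand-in written for illustration only; the real Atom class may be implemented differently.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class SimpleAtom:
    """Hypothetical stand-in for Atom: a type and Cartesian coordinates."""
    type: str
    position: Tuple[float, ...]

    def translate(self, vector: Sequence[float]) -> "SimpleAtom":
        # Build a new instance; the original atom is left untouched
        new_position = tuple(x + dx for x, dx in zip(self.position, vector))
        return SimpleAtom(self.type, new_position)


origin = SimpleAtom("N", (0.0, 0.0, 0.0))
moved = origin.translate([0, 0, 1])
print(origin, moved)
```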
Main attributes
A Posinp instance has the following attributes:
[4]:
assert pos.atoms == atoms
assert pos.units == "angstroem"
assert pos.boundary_conditions == "free"
assert pos.cell is None
They cannot be set afterwards:
[5]:
try:
    pos.atoms = [Atom('C', [0, 0, 0])]
except AttributeError as e:
    print(repr(e))
AttributeError("can't set attribute")
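One common way of obtaining attributes that cannot be reassigned in Python is to expose them through properties without a setter. The sketch below assumes nothing about the actual Posinp implementation; ReadOnlyGeometry is a hypothetical class used only to show the mechanism. The exact wording of the error message depends on the Python version.

```python
class ReadOnlyGeometry:
    """Hypothetical class exposing its data through getter-only properties."""

    def __init__(self, atoms, units):
        self._atoms = list(atoms)
        self._units = units

    @property
    def atoms(self):
        return self._atoms

    @property
    def units(self):
        return self._units


geometry = ReadOnlyGeometry(["N", "N"], "angstroem")
try:
    geometry.units = "atomic"  # No setter is defined for this property
    assignment_failed = False
except AttributeError as e:
    assignment_failed = True
    print(repr(e))
```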
Representation of a Posinp instance
Printing a Posinp instance gives a string representation in the xyz format:
[6]:
print(pos)
2 angstroem
free
N 0.0 0.0 0.0
N 0.0 0.0 1.1
The actual representation of a Posinp instance is the following:
[7]:
print(repr(pos))
Posinp([Atom('N', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])], 'angstroem', 'free', cell=None)
Note that the optional cell argument is set to None here, because there is no need to define a cell for free boundary conditions.
Equality of Posinp instances
The order of the atoms is not relevant: changing the order of the atoms in the list does not make the Posinp instances different:
[8]:
shuffled_atoms = [Atom('N', [0.0, 0.0, 1.1]), Atom('N', [0.0, 0.0, 0.0])]
shuffled_pos = Posinp(shuffled_atoms, units="angstroem", boundary_conditions="free")
print(shuffled_pos)
assert shuffled_pos == pos  # If the two were different, an AssertionError would be raised
2 angstroem
free
N 0.0 0.0 1.1
N 0.0 0.0 0.0
The comparison behaves as expected if, for instance, the number of atoms or the atomic types differ:
[9]:
# One atom is missing
assert pos != Posinp([Atom('N', [0.0, 0.0, 1.1])],
                     units="angstroem", boundary_conditions="free")
# One atom has a different type
assert pos != Posinp([Atom('C', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])],
                     units="angstroem", boundary_conditions="free")
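An order-insensitive comparison of this kind can be sketched by sorting both atom lists into a canonical order before comparing them. The helper same_geometry is hypothetical and works on plain (type, position) pairs; it compares coordinates exactly, whereas the real Posinp comparison may differ in its details.

```python
def _sort_key(atom):
    # Sort by atom type first, then by coordinates
    atom_type, position = atom
    return (atom_type, tuple(position))


def same_geometry(atoms_a, atoms_b):
    """Compare two lists of (type, position) pairs, ignoring their order."""
    if len(atoms_a) != len(atoms_b):
        return False
    return sorted(atoms_a, key=_sort_key) == sorted(atoms_b, key=_sort_key)


n2 = [("N", [0.0, 0.0, 0.0]), ("N", [0.0, 0.0, 1.1])]
n2_shuffled = [("N", [0.0, 0.0, 1.1]), ("N", [0.0, 0.0, 0.0])]
cn = [("C", [0.0, 0.0, 0.0]), ("N", [0.0, 0.0, 1.1])]

print(same_geometry(n2, n2_shuffled))  # same atoms, different order
print(same_geometry(n2, n2[:1]))       # one atom is missing
print(same_geometry(n2, cn))           # one atom has a different type
```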
Iterating over a Posinp instance
You can easily iterate over the atoms of a Posinp instance:
[10]:
for atom in pos:
    print(f"'{atom.type}': {atom.position}")
'N': [0. 0. 0.]
'N': [0. 0. 1.1]
Class methods to initialize a Posinp instance
Other ways of initializing a Posinp instance are provided:
The from_file class method
It reads an xyz file stored on disk, given the path to this input file:
[11]:
pos = Posinp.from_file("../../../tests/free.xyz")
print(pos)
4 atomic
free
C 0.6661284109 0.0 1.153768252
C 3.330642055 0.0 1.153768252
C 4.662898877 0.0 3.461304757
C 7.327412521 0.0 3.461304757
The from_string class method
This method is mainly useful when the string representation of a posinp is built by string formatting:
[12]:
pos_str = """\
4 reduced
surface {x} inf {z}
C 0.08333333333 0.5 0.25
C 0.41666666666 0.5 0.25
C 0.58333333333 0.5 0.75
C 0.91666666666 0.5 0.75"""
for aCC in [2.65, 2.7]:
    new_str = pos_str.format(x=3*aCC, z=np.sqrt(3)*aCC)
    pos = Posinp.from_string(new_str)
    print(f"cell size for aCC={aCC:.2f}: {pos.cell}")
cell size for aCC=2.65: [7.949999999999999, 'inf', 4.589934640057525]
cell size for aCC=2.70: [8.100000000000001, 'inf', 4.676537180435969]
It would actually be possible to achieve the same thing without having to go through the string formatting. The following example should be the preferred way:
[13]:
atoms = [
    Atom('C', [0.08333333333, 0.5, 0.25]),
    Atom('C', [0.41666666666, 0.5, 0.25]),
    Atom('C', [0.58333333333, 0.5, 0.75]),
    Atom('C', [0.91666666666, 0.5, 0.75]),
]
for aCC in [2.65, 2.7]:
    cell = [3*aCC, 'inf', np.sqrt(3)*aCC]
    pos = Posinp(atoms, "reduced", "surface", cell=cell)
    print(f"cell size for aCC={aCC:.2f}: {pos.cell}")
cell size for aCC=2.65: [7.949999999999999, 'inf', 4.589934640057525]
cell size for aCC=2.70: [8.100000000000001, 'inf', 4.676537180435969]
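The cell sizes printed above, such as 7.949999999999999 instead of 7.95, are a consequence of binary floating-point arithmetic rather than of the Posinp class. The short sketch below, which uses only the standard library and a hypothetical helper named cell_lengths, suggests comparing such values with a tolerance rather than with ==.

```python
import math


def cell_lengths(aCC):
    """Return the x and z cell lengths used in the examples above."""
    return 3 * aCC, math.sqrt(3) * aCC


x, z = cell_lengths(2.65)
# The product is not exactly 7.95 in binary floating point
exactly_equal = (x == 7.95)
# A tolerance-based comparison is the safer check
close_enough = math.isclose(x, 7.95)
print(exactly_equal, close_enough)
```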
The from_dict class method
This last class method initializes a Posinp instance from a dictionary. You can use it to create your input files, but note that it leads to more verbose code than the usual initialization (presented at the beginning of the notebook). It was actually implemented to retrieve the posinp from a valid input parameters file (when the posinp is defined in it) or from a valid logfile (the output file of a BigDFT calculation).
Also, there is no key to specify the boundary conditions: they have to be inferred from the value of the "cell" key. If there is no such key, free boundary conditions are used. When it exists, however, you must be careful with its values. For instance, to use surface boundary conditions, you must set the second element of the "cell" value to "inf" or ".inf".
[14]:
pos_dict = {
    "units": "reduced",
    "cell": [8.07007483423, 'inf', 4.65925987792],
    "positions": [
        {'C': [0.08333333333, 0.5, 0.25]},
        {'C': [0.41666666666, 0.5, 0.25]},
        {'C': [0.58333333333, 0.5, 0.75]},
        {'C': [0.91666666666, 0.5, 0.75]},
    ]
}
pos = Posinp.from_dict(pos_dict)
assert pos.boundary_conditions == "surface"
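The inference rule described above can be sketched as a small function. The name infer_boundary_conditions is hypothetical, and only the two cases mentioned in the text (no "cell" key, and an infinite second element) come from this notebook; treating every other cell as periodic is an assumption made for the sake of the example, and other boundary conditions are deliberately left out.

```python
def infer_boundary_conditions(pos_dict):
    """Guess the boundary conditions from the "cell" key (illustrative sketch)."""
    cell = pos_dict.get("cell")
    if cell is None:
        # No cell given: free boundary conditions
        return "free"
    if str(cell[1]) in ("inf", ".inf"):
        # Infinite second cell element: surface boundary conditions
        return "surface"
    # Assumption: any other cell is treated as fully periodic
    return "periodic"


print(infer_boundary_conditions({"units": "angstroem"}))
print(infer_boundary_conditions({"cell": [8.07, "inf", 4.66]}))
print(infer_boundary_conditions({"cell": [8.07, ".inf", 4.66]}))
```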
See the documentation to check the extra possibilities offered by the Posinp class.
The InputParams class
This class manages the BigDFT input parameters, in the yaml format:
dft:
  hgrids: [0.35, 0.35, 0.35]
It is therefore convenient to initialize this class via a dictionary representing the input parameters:
[15]:
inp = InputParams({"dft": {"hgrids": [0.35]*3}})
print(inp)
{'dft': {'hgrids': [0.35, 0.35, 0.35]}}
If the given value of a parameter corresponds to its default value, it is as if nothing was given:
[16]:
InputParams({"dft": {"hgrids": [0.45]*3}})
[16]:
{}
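Dropping parameters that are equal to their default value can be sketched with a recursive comparison against a dictionary of defaults. In the sketch below, strip_defaults and DEFAULTS are hypothetical names; the only default value taken from this notebook is hgrids = [0.45, 0.45, 0.45], and the real InputParams class may proceed differently.

```python
# Hypothetical table of defaults; only this hgrids value appears in the notebook
DEFAULTS = {"dft": {"hgrids": [0.45, 0.45, 0.45]}}


def strip_defaults(params, defaults):
    """Return a copy of params without the values equal to their default."""
    cleaned = {}
    for key, value in params.items():
        if key in defaults and isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections and keep them only if non-empty
            nested = strip_defaults(value, defaults[key])
            if nested:
                cleaned[key] = nested
        elif key not in defaults or value != defaults[key]:
            cleaned[key] = value
    return cleaned


print(strip_defaults({"dft": {"hgrids": [0.45] * 3}}, DEFAULTS))
print(strip_defaults({"dft": {"hgrids": [0.35] * 3}}, DEFAULTS))
```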
The validity of the input dictionary is also checked:
[17]:
try:
    InputParams({'dfpt': {'hgrids': [0.35]*3}})
except KeyError as e:
    print(repr(e))
KeyError("Unknown key 'dfpt'")
[18]:
try:
    InputParams({'dft': {'hgrid': [0.35]*3}})
except KeyError as e:
    print(repr(e))
KeyError("Unknown key 'hgrid' in 'dft'")
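A two-level key check producing messages of this kind can be sketched as follows. VALID_KEYS is a tiny hypothetical subset of keys written for this example (the real list of valid BigDFT keys is much longer), and check_keys and error_message are illustrative names, not MyBigDFT functions.

```python
# Hypothetical, deliberately tiny table of valid keys
VALID_KEYS = {"dft": {"hgrids", "rmult"}}


def check_keys(params):
    """Raise a KeyError for the first unknown section or parameter name."""
    for section, content in params.items():
        if section not in VALID_KEYS:
            raise KeyError(f"Unknown key '{section}'")
        for key in content:
            if key not in VALID_KEYS[section]:
                raise KeyError(f"Unknown key '{key}' in '{section}'")


def error_message(params):
    """Return the validation error message, or None if params is valid."""
    try:
        check_keys(params)
    except KeyError as e:
        return e.args[0]
    return None


print(error_message({"dft": {"hgrids": [0.35] * 3}}))
print(error_message({"dfpt": {"hgrids": [0.35] * 3}}))
print(error_message({"dft": {"hgrid": [0.35] * 3}}))
```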
Main attributes
The input parameters may contain the input positions under the "posinp" key (whose content must be a dictionary from which a Posinp can be created via the from_dict method, see the example above). Here, no input positions were given:
[19]:
assert inp.posinp is None
The dictionary of parameters is actually stored in the params attribute:
[20]:
inp.params
[20]:
{'dft': {'hgrids': [0.35, 0.35, 0.35]}}
An InputParams instance behaves like a dictionary
[21]:
inp["dft"]
[21]:
{'hgrids': [0.35, 0.35, 0.35]}
[22]:
inp['dft']['hgrids']
[22]:
[0.35, 0.35, 0.35]
You can modify the content of a key afterwards; the validity of the keys is checked in that case as well:
[23]:
# This modification is valid, and therefore taken into account
inp["dft"] = {"rmult": [6, 8]}
inp
[23]:
{'dft': {'rmult': [6, 8]}}
[24]:
try:
    # hgrid is not a valid key: an error is raised!
    inp['dft'] = {'hgrid': [0.35]*3}
except KeyError as e:
    print(repr(e))
KeyError("Unknown key 'hgrid' in 'dft'")
However, modifications made in the following fashion are not checked, so you must be careful when using them:
[25]:
inp['dft']["hgrid"] = [0.45]*3
One way of making sure that the modified input parameters are still valid is to clean them:
[26]:
from mybigdft.iofiles.inputparams import clean
try:
    inp = clean(inp)
except KeyError as e:
    print(repr(e))
    del inp["dft"]["hgrid"]  # Delete the bad key
KeyError("Unknown key 'hgrid' in 'dft'")
This is what is actually done when initializing or updating the input parameters. It is also performed before writing the input parameters to a file on disk, so that bad keys added on the fly are still caught before running a BigDFT calculation.
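A plausible explanation of why the nested assignment escapes the check is that indexing first returns the inner dictionary, which is then modified directly, so the validating code of the outer object is never called. The sketch below illustrates this with a hypothetical CheckedParams mapping that validates only in its own __setitem__ method; it is an assumption about the general mechanism, not a copy of the InputParams implementation.

```python
from collections.abc import MutableMapping


class CheckedParams(MutableMapping):
    """Hypothetical mapping that validates values only in __setitem__."""

    VALID = {"dft": {"hgrids", "rmult"}}  # tiny illustrative subset

    def __init__(self):
        self._data = {}

    def __setitem__(self, section, content):
        unknown = set(content) - self.VALID.get(section, set())
        if unknown:
            raise KeyError(f"Unknown key {sorted(unknown)[0]!r} in {section!r}")
        self._data[section] = content

    def __getitem__(self, section):
        return self._data[section]  # Returns the plain inner dict

    def __delitem__(self, section):
        del self._data[section]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


params = CheckedParams()
params["dft"] = {"rmult": [6, 8]}    # Goes through __setitem__: checked
params["dft"]["hgrid"] = [0.45] * 3  # Mutates the inner dict: not checked
try:
    params["dft"] = {"hgrid": [0.35] * 3}
    rejected = False
except KeyError:
    rejected = True
print(rejected, dict(params))
```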
You can also add initial positions to the input parameters by using their dict representation:
[27]:
inp["posinp"] = {
    "units": "angstroem",
    "positions": [
        {'N': [0.0, 0.0, 0.0]},
        {'N': [0.0, 0.0, 1.1]},
    ],
}
This is not reflected in the content of the input parameters:
[28]:
inp
[28]:
{'dft': {'rmult': [6, 8]}}
However, the posinp attribute is not None anymore:
[29]:
print(inp.posinp)
2 angstroem
free
N 0.0 0.0 0.0
N 0.0 0.0 1.1
A much simpler way is to directly set the posinp attribute:
[30]:
inp.posinp = pos
print(inp.posinp)
4 reduced
surface 8.07007483423 inf 4.65925987792
C 0.08333333333 0.5 0.25
C 0.41666666666 0.5 0.25
C 0.58333333333 0.5 0.75
C 0.91666666666 0.5 0.75
Class methods to initialize an InputParams instance
Other ways of initializing an InputParams instance are provided. They are very similar to the ones of the Posinp class. The from_dict method is missing, however: it would be redundant with the basic way of initializing an InputParams instance.
The from_string method
This initializes an InputParams instance from a string written in the yaml format:
[32]:
# You can even format that string to modify it according to your needs
base_inp = """\
dft:
  rmult: {}
  hgrids: [0.35, 0.35, 0.35]"""
for i, rm in enumerate([[5, 7], [6, 8], [7, 9]]):
    inp = InputParams.from_string(base_inp.format(rm))
    print(f"Input parameters n°{i+1}: {inp}")
Input parameters n°1: {'dft': {'rmult': [5, 7], 'hgrids': [0.35, 0.35, 0.35]}}
Input parameters n°2: {'dft': {'rmult': [6, 8], 'hgrids': [0.35, 0.35, 0.35]}}
Input parameters n°3: {'dft': {'rmult': [7, 9], 'hgrids': [0.35, 0.35, 0.35]}}
However, the same result can be achieved by using the basic initialization procedure. The following code should be preferred:
[33]:
for i, rm in enumerate([[5, 7], [6, 8], [7, 9]]):
    inp = InputParams({"dft": {"rmult": rm, "hgrids": [0.35]*3}})
    print(f"Input parameters n°{i+1}: {inp}")
Input parameters n°1: {'dft': {'rmult': [5, 7], 'hgrids': [0.35, 0.35, 0.35]}}
Input parameters n°2: {'dft': {'rmult': [6, 8], 'hgrids': [0.35, 0.35, 0.35]}}
Input parameters n°3: {'dft': {'rmult': [7, 9], 'hgrids': [0.35, 0.35, 0.35]}}
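For the from_string variant above, it may help to look at the formatted string before it is parsed. The sketch below uses only str.format and shows that a Python list is rendered in a form that is also a valid yaml flow sequence. The two-space indentation of the template is an assumption, and any literal brace in such a template would have to be doubled to survive the call to format.

```python
base_inp = """\
dft:
  rmult: {}
  hgrids: [0.35, 0.35, 0.35]"""

# The list is inserted using its string form, e.g. "[5, 7]"
rendered = base_inp.format([5, 7])
print(rendered)
```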