{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Manipulate BigDFT input files\n", "\n", "This notebook presents how the [MyBigDFT](https://mmoriniere.gitlab.io/MyBigDFT/) package lets you manipulate BigDFT input files. Running a BigDFT calculation requires an initial geometry and, generally, a set of input parameters (default parameters are used if none are given).\n", "\n", "The [Posinp](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp) and [Atom](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Atom) classes are used to create input geometries, while the [InputParams](https://mmoriniere.gitlab.io/MyBigDFT/inputparams.html#mybigdft.iofiles.inputparams.InputParams) class specifies the input parameters of a BigDFT calculation. All of them are presented in this notebook." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from mybigdft import Posinp, Atom, InputParams\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The [Posinp](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp) class\n", "\n", "This class lets you manipulate BigDFT input geometries in the xyz format:\n", "\n", "    2 angstroem # Number of atoms, units\n", "    free # Boundary conditions\n", "    N 0.0 0.0 0.0 # Atom type and cartesian coordinates of each atom\n", "    N 0.0 0.0 1.1" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "atoms = [Atom('N', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])]\n", "pos = Posinp(atoms, units=\"angstroem\", boundary_conditions=\"free\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [Atom](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Atom) class is mostly used to store the data related to a given atom.
It requires an atom type and cartesian coordinates, but it is agnostic of the units used. It also provides a few extra features, such as a [translate](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Atom.translate) method, which takes a three-component vector as argument and returns a new `Atom` instance whose position is that of the original atom translated by the vector:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Atom('N', [0.0, 0.0, 1.0])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Atom('N', [0, 0, 0]).translate([0, 0, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Main attributes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `Posinp` instance has the following attributes:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "assert pos.atoms == atoms\n", "assert pos.units == \"angstroem\"\n", "assert pos.boundary_conditions == \"free\"\n", "assert pos.cell is None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These attributes are read-only and cannot be set afterwards:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AttributeError(\"can't set attribute\")\n" ] } ], "source": [ "try:\n", "    pos.atoms = [Atom('C', [0, 0, 0])]\n", "except AttributeError as e:\n", "    print(repr(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Representation of a Posinp instance\n", "\n", "Printing a `Posinp` instance gives its string representation in the xyz format:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 angstroem\n", "free\n", "N 0.0 0.0 0.0\n", "N 0.0 0.0 1.1\n", "\n" ] } ], "source": [ "print(pos)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The
actual representation of a `Posinp` instance is the following:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Posinp([Atom('N', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])], 'angstroem', 'free', cell=None)\n" ] } ], "source": [ "print(repr(pos))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the optional `cell` argument is set to `None` here, because no cell needs to be defined for free boundary conditions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Equality of Posinp instances\n", "\n", "The order of the atoms is not relevant: changing the order of the atoms in the list does not make the `Posinp` instances different:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 angstroem\n", "free\n", "N 0.0 0.0 1.1\n", "N 0.0 0.0 0.0\n", "\n" ] } ], "source": [ "shuffled_atoms = [Atom('N', [0.0, 0.0, 1.1]), Atom('N', [0.0, 0.0, 0.0])]\n", "shuffled_pos = Posinp(shuffled_atoms, units=\"angstroem\", boundary_conditions=\"free\")\n", "print(shuffled_pos)\n", "assert shuffled_pos == pos  # If the two were different, an AssertionError would be raised" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, two `Posinp` instances differ if, for instance, they do not have the same number of atoms or if the atomic types differ:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# One atom is missing\n", "assert pos != Posinp([Atom('N', [0.0, 0.0, 1.1])],\n", "                     units=\"angstroem\", boundary_conditions=\"free\")\n", "# One atom has a different type\n", "assert pos != Posinp([Atom('C', [0.0, 0.0, 0.0]), Atom('N', [0.0, 0.0, 1.1])],\n", "                     units=\"angstroem\", boundary_conditions=\"free\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Iterating over a Posinp instance" ] }, { "cell_type": "markdown", "metadata": {},
"source": [ "You can easily iterate over the atoms of a `Posinp` instance:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "'N': [0. 0. 0.]\n", "'N': [0. 0. 1.1]\n" ] } ], "source": [ "for atom in pos:\n", " print(f\"'{atom.type}': {atom.position}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Class methods to intialize a Posinp instance\n", "\n", "Other ways of initializing a `Posinp` instance are provided:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The [from_file](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp.from_file) class method\n", "\n", "It allows to read an xyz file written on disk, given a path to this input file:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 atomic\n", "free\n", "C 0.6661284109 0.0 1.153768252\n", "C 3.330642055 0.0 1.153768252\n", "C 4.662898877 0.0 3.461304757\n", "C 7.327412521 0.0 3.461304757\n", "\n" ] } ], "source": [ "pos = Posinp.from_file(\"../../../tests/free.xyz\")\n", "print(pos)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The [from_string](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp.from_string) class method\n", "\n", "This method is mostly meant to allow the formatting of the string representation of a posinp:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cell size for aCC=2.65: [7.949999999999999, 'inf', 4.589934640057525]\n", "cell size for aCC=2.70: [8.100000000000001, 'inf', 4.676537180435969]\n" ] } ], "source": [ "pos_str = \"\"\"\\\n", "4 reduced\n", "surface {x} inf {z}\n", "C 0.08333333333 0.5 0.25\n", "C 0.41666666666 0.5 0.25\n", "C 0.58333333333 0.5 0.75\n", "C 0.91666666666 0.5 0.75\"\"\"\n", "for aCC in 
[2.65, 2.7]:\n", "    new_str = pos_str.format(x=3*aCC, z=np.sqrt(3)*aCC)\n", "    pos = Posinp.from_string(new_str)\n", "    print(f\"cell size for aCC={aCC:.2f}: {pos.cell}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same result can be achieved without going through string formatting. The following approach should be preferred:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cell size for aCC=2.65: [7.949999999999999, 'inf', 4.589934640057525]\n", "cell size for aCC=2.70: [8.100000000000001, 'inf', 4.676537180435969]\n" ] } ], "source": [ "atoms = [\n", "    Atom('C', [0.08333333333, 0.5, 0.25]),\n", "    Atom('C', [0.41666666666, 0.5, 0.25]),\n", "    Atom('C', [0.58333333333, 0.5, 0.75]),\n", "    Atom('C', [0.91666666666, 0.5, 0.75]),\n", "]\n", "for aCC in [2.65, 2.7]:\n", "    cell = [3*aCC, 'inf', np.sqrt(3)*aCC]\n", "    pos = Posinp(atoms, \"reduced\", \"surface\", cell=cell)\n", "    print(f\"cell size for aCC={aCC:.2f}: {pos.cell}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The [from_dict](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp.from_dict) class method\n", "\n", "This last class method initializes a `Posinp` instance from a dictionary. You can use it to initialize your input files, but note that it produces more verbose code than the usual initialization (presented at the beginning of the notebook). It was actually implemented to retrieve the posinp from a valid input parameters file (when the posinp is defined in it) or from a valid logfile (output file of a BigDFT calculation).\n", "\n", "Also, there is no key to specify the boundary conditions: they have to be inferred from the value of the `\"cell\"` key. If there is no such key, free boundary conditions are used. However, when it exists, you must be careful with its values.
For instance, if you want to use surface boundary conditions, you must set the second element of the `\"cell\"` key to `\"inf\"` or `\".inf\"`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "pos_dict = {\n", "    \"units\": \"reduced\",\n", "    \"cell\": [8.07007483423, 'inf', 4.65925987792],\n", "    \"positions\": [\n", "        {'C': [0.08333333333, 0.5, 0.25]},\n", "        {'C': [0.41666666666, 0.5, 0.25]},\n", "        {'C': [0.58333333333, 0.5, 0.75]},\n", "        {'C': [0.91666666666, 0.5, 0.75]},\n", "    ]\n", "}\n", "pos = Posinp.from_dict(pos_dict)\n", "assert pos.boundary_conditions == \"surface\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See the [documentation](https://mmoriniere.gitlab.io/MyBigDFT/posinp.html#mybigdft.iofiles.posinp.Posinp) for the other features offered by the `Posinp` class." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The [InputParams](https://mmoriniere.gitlab.io/MyBigDFT/inputparams.html#inputparams) class\n", "\n", "This class manages the BigDFT input parameters, in the YAML format:\n", "\n", "    dft:\n", "        hgrids: [0.35, 0.35, 0.35]\n", "\n", "It is therefore convenient to initialize this class via a dictionary representing the input parameters:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'dft': {'hgrids': [0.35, 0.35, 0.35]}}\n" ] } ], "source": [ "inp = InputParams({\"dft\": {\"hgrids\": [0.35]*3}})\n", "print(inp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the value given for a parameter equals its default value, it is as if nothing had been given:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "InputParams({\"dft\": {\"hgrids\": [0.45]*3}})" ] }, { "cell_type": "markdown", "metadata":
{}, "source": [ "The validity of the input dictionary is also checked:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "KeyError(\"Unknown key 'dfpt'\")\n" ] } ], "source": [ "try:\n", "    InputParams({'dfpt': {'hgrids': [0.35]*3}})\n", "except KeyError as e:\n", "    print(repr(e))" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "KeyError(\"Unknown key 'hgrid' in 'dft'\")\n" ] } ], "source": [ "try:\n", "    InputParams({'dft': {'hgrid': [0.35]*3}})\n", "except KeyError as e:\n", "    print(repr(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Main attributes\n", "\n", "The input parameters may contain the input positions under the `\"posinp\"` key (whose content must be a dictionary from which a `Posinp` instance can be created via the `from_dict` class method, see the example above). Here, no input positions were given:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "assert inp.posinp is None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dictionary of parameters is actually stored in the `params` attribute:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'dft': {'hgrids': [0.35, 0.35, 0.35]}}" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inp.params" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### An InputParams instance behaves like a dictionary" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'hgrids': [0.35, 0.35, 0.35]}" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inp[\"dft\"]" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0.35, 0.35,
0.35]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inp['dft']['hgrids']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can modify the content of a key afterwards; the validity of the keys is checked as well:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'dft': {'rmult': [6, 8]}}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This modification is valid, and therefore taken into account\n", "inp[\"dft\"] = {\"rmult\": [6, 8]}\n", "inp" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "KeyError(\"Unknown key 'hgrid' in 'dft'\")\n" ] } ], "source": [ "try:\n", "    # hgrid is not a valid key: an error is raised!\n", "    inp['dft'] = {'hgrid': [0.35]*3}\n", "except KeyError as e:\n", "    print(repr(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, setting a nested key directly, as below, is not checked, so you must be careful when doing so:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "inp['dft'][\"hgrid\"] = [0.45]*3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One way of making sure that the modified input parameters are still valid is to clean them:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "KeyError(\"Unknown key 'hgrid' in 'dft'\")\n" ] } ], "source": [ "from mybigdft.iofiles.inputparams import clean\n", "try:\n", "    inp = clean(inp)\n", "except KeyError as e:\n", "    print(repr(e))\n", "    del inp[\"dft\"][\"hgrid\"]  # Delete the bad key" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is what is actually done when initializing or updating the input parameters.
It is also performed before writing the input parameters to a file on disk, so that bad keys set on the fly will still be caught before running a BigDFT calculation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also add initial positions to the input parameters by using their `dict` representation:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "inp[\"posinp\"] = {\n", "    \"units\": \"angstroem\",\n", "    \"positions\": [\n", "        {'N': [0.0, 0.0, 0.0]},\n", "        {'N': [0.0, 0.0, 1.1]},\n", "    ],\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is not reflected in the content of the input parameters:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'dft': {'rmult': [6, 8]}}" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, the `posinp` attribute is not `None` anymore:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 angstroem\n", "free\n", "N 0.0 0.0 0.0\n", "N 0.0 0.0 1.1\n", "\n" ] } ], "source": [ "print(inp.posinp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A much simpler way is to set the `posinp` attribute directly:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 reduced\n", "surface 8.07007483423 inf 4.65925987792\n", "C 0.08333333333 0.5 0.25\n", "C 0.41666666666 0.5 0.25\n", "C 0.58333333333 0.5 0.75\n", "C 0.91666666666 0.5 0.75\n", "\n" ] } ], "source": [ "inp.posinp = pos\n", "print(inp.posinp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Class methods to initialize an InputParams instance\n", "\n", "Other ways of initializing an `InputParams` instance are provided.
They are very similar to the ones of the `Posinp` class. The `from_dict` method is missing, however: it would be redundant with the basic way of initializing an `InputParams` instance." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The [from_file](https://mmoriniere.gitlab.io/MyBigDFT/inputparams.html#mybigdft.iofiles.inputparams.InputParams.from_file) method" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{}\n" ] } ], "source": [ "inp = InputParams.from_file(\"../../../tests/test.yaml\")\n", "print(inp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The [from_string](https://mmoriniere.gitlab.io/MyBigDFT/inputparams.html#mybigdft.iofiles.inputparams.InputParams.from_string) method\n", "\n", "This method initializes an `InputParams` instance from a string written in the YAML format:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Input parameters n°1: {'dft': {'rmult': [5, 7], 'hgrids': [0.35, 0.35, 0.35]}}\n", "Input parameters n°2: {'dft': {'rmult': [6, 8], 'hgrids': [0.35, 0.35, 0.35]}}\n", "Input parameters n°3: {'dft': {'rmult': [7, 9], 'hgrids': [0.35, 0.35, 0.35]}}\n" ] } ], "source": [ "# You can even format that string to modify it according to your needs\n", "base_inp = \"\"\"\\\n", "dft:\n", " rmult: {}\n", " hgrids: [0.35, 0.35, 0.35]\"\"\"\n", "for i, rm in enumerate([[5, 7], [6, 8], [7, 9]]):\n", "    inp = InputParams.from_string(base_inp.format(rm))\n", "    print(f\"Input parameters n°{i+1}: {inp}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, the same result can be achieved by using the basic initialization procedure.
The following code should be preferred:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Input parameters n°1: {'dft': {'rmult': [5, 7], 'hgrids': [0.35, 0.35, 0.35]}}\n", "Input parameters n°2: {'dft': {'rmult': [6, 8], 'hgrids': [0.35, 0.35, 0.35]}}\n", "Input parameters n°3: {'dft': {'rmult': [7, 9], 'hgrids': [0.35, 0.35, 0.35]}}\n" ] } ], "source": [ "for i, rm in enumerate([[5, 7], [6, 8], [7, 9]]):\n", "    inp = InputParams({\"dft\": {\"rmult\": rm, \"hgrids\": [0.35]*3}})\n", "    print(f\"Input parameters n°{i+1}: {inp}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }