Molssi Driver Interface Library
Tutorials

User Tutorial

In this tutorial we will use an MDI driver to perform a simple Ab Initio Molecular Dynamics (AIMD) simulation, using Quantum ESPRESSO (QE) to calculate forces and LAMMPS to update the atomic coordinates each time step. LAMMPS is a code that simulates chemical systems using molecular mechanics (MM) force fields, which are computationally very efficient to evaluate and can be used for lengthy dynamics simulations of large systems. Quantum ESPRESSO is a code that runs quantum mechanics (QM) calculations, which are much more computationally expensive, but can be applied much more straigtforwardly to chemically reactive systems and other specific use cases. MM codes like LAMMPS often offer time integration features that are not widely available in QM codes, so it can be desirable to be able to mix the time integration functionality of an MM code with the force evaluation functionality of a QM code.

Although we will be using QE and LAMMPS as the MDI engines in this tutorial, you could do similar calculations using other MDI engines. A list of codes that support MDI can be found here, along with instructions on how to compile them with MDI support.

Compiling the Driver and Engines

The hardest part of getting an MDI calculation working is often getting all the necessary codes compiled. Computational chemistry software packages tend to be complex, often with numerous dependencies that can themselves be challenging to compile. You could try to compile the codes manually by following the specific compile process for each code, and the next section will offer some guidance for this process; however, this tutorial will also describe an easier way to get the codes compiled using containers.

The Hard Way: Manually Compile the Engines

As mention earlier, manually compiling computational chemistry software can be challenging. Depending on your computational environment, you can easily spend days trying to get certain codes operating properly. This section will offer an overview of the manual compile process for the codes used in this tutorial, but you may wish to consider skipping to the next section, which describes an easier way to get started with your first MDI calculations.

First, compile both QE and LAMMPS (or your preferred QM and MM packages) with MDI support by following the instructions provided.

After you've finished compiling the engines, you can download and compile the AIMD driver with the following commands:

git clone git@github.com:MolSSI/MDI_AIMD_Driver.git
mkdir MDI_AIMD_Driver/build
cd MDI_AIMD_Driver/build
cmake ..
make

In MDI_AIMD_Driver/tests/locations there are several single-line text files, each of which is intended to indicate the location of one of the codes you installed above. Edit each of them so that they provide the absolute path to the corresponding code's executable.

The Easy(er) Way: Compile Using MDI Mechanic

MDI provides a support tool, called MDI Mechanic, which can help you avoid the need to figure out how to compile various QM and MM codes yourself. MDI Mechanic automatically builds Docker images from recipes you provide it, allowing you to compile and run all the codes you need for this tutorial in a self-contained, portable environment. Before continuing, you'll need to install and launch Docker Desktop.

You can now install MDI Mechanic with:

pip install mdimechanic

You can download the MDI Mechanic recipe you need with:

git clone https://github.com/MolSSI-MDI/MDI_AIMD_Driver_container.git

This recipe contains instructions for building QE, LAMMPS, and the MDI_AIMD_Driver code within a self-contained Docker environment. To build the recipes, do:

cd MDI_AIMD_Driver_container
mdimechanic build

This is likely to take a significant amount of time to complete, as QE and LAMMPS are both large codes. Once it has successfully finished, you can start working in the environment it has created using:

mdimechanic interactive

This will start an interactive container session. If you're not familiar with Docker or containerization, you may wish to read a bit about these tools. Otherwise, you can think about the "mdimechanic interactive" session as being similar to logging onto a remote computer via ssh - except that in the case of containers, the computer you are "logging into" is actually running as a self-contained environment on your local machine. While you are in the interactive session, you can run tests and other calculations within the containerized environment. Any changes you make to the environment while within the interactive session are transient, and will be discarded upon exiting the session (exception: any changes to files or subdirectories within the directory where mdimechanic.yml is located will be preserved upon exiting the interactive session).

For more information about MDI Mechanic, please see the MDI Mechanic Getting Started Guide.

While in the interactive session, you can find the compiled codes in subdirectories of the "/repo/build" directory. Go ahead and check that now:

cd /repo/build
ls

You should see subdirectories for LAMMPS, QE, and the MDI_AIMD_Driver code.

Running the Simulation

We will now run a simple AIMD simulation.

There is a simple test calculation in MDI_AIMD_Driver/tests/water. Within this directory, the data subdirectory contains standard input files for LAMMPS and QE. The input files correspond to a small water box of 8 water molecules. They are exactly like any non-MDI input files for their respective codes, except for data/lammps.in. At the end of data/lammps.in is a command that tells LAMMPS it should begin running as an MDI engine:

...
mdi engine

There are two scripts associated with the water test: tcp.sh and mpi.sh. The tcp.sh script will run the simulation using the TCP/IP method for communicating between the driver and the engines. It consists of the following code:

#location of required codes
DRIVER_LOC=$(cat ../locations/MDI_AIMD_Driver)
LAMMPS_LOC=$(cat ../locations/LAMMPS)
QE_LOC=$(cat ../locations/QE)
#remove old files
if [ -d work ]; then
rm -r work
fi
#create work directory
cp -r data work
cd work
cd work
${QE_LOC} -mdi "-role ENGINE -name QM -method TCP -port 8021 -hostname localhost" -in qe.in > qe.out &
#launch LAMMPS
${LAMMPS_LOC} -mdi "-role ENGINE -name MM -method TCP -port 8021 -hostname localhost" -in lammps.in > lammps.out &
#launch driver
${DRIVER_LOC} -mdi "-role DRIVER -name driver -method TCP -port 8021" &
wait

The first few lines simply read the location of the codes from the files in MDI_AIMD_Driver/tests/locations and copy the LAMMPS and QE input files into a subdirectory called work. Then the script launches QE. The QE launch command is exactly the same as a normal run of QE, except for the addition of a -mdi option. This option provides information to the MDI Library regarding how the codes will communicate with one another. Its argument is a string, which itself consists of a series of options that are read by the MDI Library. In this case, it provides the following options:

  • role: Indicates whether this code will run as an ENGINE or a DRIVER.
  • name: Identifies the purpose of this engine. Each MDI driver will expect its engines to be named according to a particular standard. The MDI_AIMD_Driver engine expects one of the engines to be named "QM" and the other to be named "MM".
  • method: The communication method used to transfer information between the driver and the engines.
  • port: The port number that the driver will be listening over. Only used if the method is TCP.
  • hostname: The host name of the driver. Only used by engines and if the method is TCP.

Note the presence of an ampersand ("&") at the end of each run command, which causes the codes to run in the background. Without doing this, the script would launch QE and would wait until QE terminated before ever launching LAMMPS and the driver, leading to an indefinite hang. Finally, the wait command at the end ensures that the driver and the engines are permitted to terminate before the script terminates.

You can now execute the tcp.sh script, which should print out the energies of LAMMPS and QE for several timesteps of a short AIMD simulation.

If you compiled everything (QE, LAMMPS, MDI_AIMD_Driver, and the internal copies of the MDI Library) with MPI or are using the MDI Mechanic build, you can also execute the mpi.sh script, which will run the same simulation using MPI for the inter-code communication. The mpi.sh script redirects the driver's output to a file called work/driver.out . The output produced by QE and LAMMPS can be found in work/qe.out and work/lammps.out, respectively.

Driver Development Tutorial

Please complete the User Tutorial before starting this tutorial.

In this tutorial we will set up a simple driver for running Ab Initio Molecular Dynamics (AIMD) simulations. We will use a quantum chemistry (QM) code to compute the forces, while using a molecular mechanics (MM) code to propagate the atomic coordinates each time step.

Creating a New Project

The easiest way to start work on a new driver is to use the MDI Driver Cookiecutter, which will automatically do most of the preparatory work for you. Using the cookiecutter will require that you have Python and Cookiecutter on your machine. If you do not already have Python installed, follow the directions on the Python Website and install both Python and the pip installer tool. To install Cookiecutter, type:

pip install cookiecutter

If you are running on an external machine and do not have write permission to the system directories, you may need to add the –user option:

pip install --user cookiecutter

Now use Cookiecutter to create a new driver project:

cookiecutter git@github.com:MolSSI/MDI_Driver_Cookiecutter.git

When prompted for the repo_name, type aimd, and then when prompted to select a language, type 1 for C++. This will create a new directory called aimd and populate it with some of the things you will need for the driver, including a copy of the MDI Library. The overall directory structure is:

.
└── aimd
├── CMakeLists.txt
├── aimd
├── CMakeLists.txt
├── STUBS_MPI
│   └── mpi.h
└── aimd.cpp
└── mdi
...

Writing the Driver

Open the file called aimd.cpp. It contains the following code:

#include <iostream>
#include <mpi.h>
#include <stdexcept>
#include <string.h>
#include "mdi.h"
using namespace std;
int main(int argc, char **argv) {
// Initialize the MPI environment
MPI_Comm world_comm;
MPI_Init(&argc, &argv);
// Initialize MDI
if ( MDI_Init(&argc, &argv) ) {
throw std::runtime_error("The MDI library was not initialized correctly.");
}
// Confirm that MDI was initialized successfully
int initialized_mdi;
if ( MDI_Initialized(&initialized_mdi) ) {
throw std::runtime_error("MDI_Initialized failed.");
}
if ( ! initialized_mdi ) {
throw std::runtime_error("MDI not initialized: did you provide the -mdi option?.");
}
// Get the correct MPI intra-communicator for this code
if ( MDI_MPI_get_world_comm(&world_comm) ) {
throw std::runtime_error("MDI_MPI_get_world_comm failed.");
}
// Connect to the engines
// <YOUR CODE GOES HERE>
// Perform the simulation
// <YOUR CODE GOES HERE>
// Send the "EXIT" command to each of the engines
// <YOUR CODE GOES HERE>
// Synchronize all MPI ranks
MPI_Barrier(world_comm);
MPI_Finalize();
return 0;
}

The first few lines of code simply initialize both MPI and and MDI. Don't worry if you don't have access to an MPI library - the code will just fall back to a set of dummy MPI functions provided in STUBS_MPI. The call to the MDI_MPI_get_world_comm() function obtains from MDI an MPI intra-communicator that spans all MPI ranks associated with the driver. This call is important when using the MPI protocol for MDI communication, because in that context, MPI_COMM_WORLD spans all ranks associated with both the driver and the engine(s). In general, an MDI-enabled code should call MDI_MPI_get_world_comm() immediately after calling MDI_Init(), and the resulting MPI intra-communicator should be used instead of MPI_COMM_WORLD throughout the remainder of the code.

After initializing MDI, we need to connect the driver to its engines. For this particular tutorial, we will use a QM engine to calculate forces and an MM engine to update the atomic coordinates each timestep. In the MDI standard, engines request a connection to a driver. The driver will need to call the MDI_Accept_communicator() function to accept each connection. The general format of this functional call looks like this:

MDI_Comm comm;

This will assign a new MDI communicator to comm, which functions very similarly to an MPI communicator and is used in certain MDI function calls to identify which engine is expected to send/receive data to/from the driver. Our AIMD driver will connect to two different engines, so we will be calling MDI_Accept_communicator() twice. We don't know the order in which the engines will request a connection to the driver, so we will need some way to determine which engine is the QM code and which engine is the MM code.

This can be accomplished through the use of the "<NAME" command. The entire MDI standard is built around the idea that drivers can send "commands" to engines, each of which is defined by the MDI standard and has a specific outcome. The "<NAME" command tells the engine to send the driver a string of length MDI_NAME_LENGTH that identifies the engine. The end-user is responsible for indicating the name of each engine at runtime, using the -name command line argument that was described in the User Tutorial. As authors of a driver, we will need to decide what we expect each of the engines to be named and clearly document that decision for the benefit of the end-users. For the purpose of this tutorial, we will expect the quantum mechanics code to be called "QM" and the molecular mechanics code to be named "MM".

Sending the "<NAME" command to an engine and then receiving the engine's name can be done as follows:

char* code_name = new char[MDI_NAME_LENGTH];
MDI_Send_command("<NAME", comm);
MDI_Recv(code_name, MDI_NAME_LENGTH, MDI_CHAR, comm);

The following code accepts connections from two engines and assigns the communicator from the engine named "MM" to mm_comm and assigns the communicator from the engine named "QM" to qm_comm. Replace the comment that reads "// Connect to the engines" with:

// Connect to the engines
MDI_Comm mm_comm = MDI_COMM_NULL;
MDI_Comm qm_comm = MDI_COMM_NULL;
int nengines = 2;
for (int iengine=0; iengine < nengines; iengine++) {
MDI_Comm comm;
// Determine the name of this engine
char* engine_name = new char[MDI_NAME_LENGTH];
MDI_Send_command("<NAME", comm);
MDI_Recv(engine_name, MDI_NAME_LENGTH, MDI_CHAR, comm);
cout << "Engine name: " << engine_name << endl;
if ( strcmp(engine_name, "MM") == 0 ) {
if ( mm_comm != MDI_COMM_NULL ) {
throw runtime_error("Accepted a communicator from a second MM engine.");
}
mm_comm = comm;
}
else if ( strcmp(engine_name, "QM") == 0 ) {
if ( qm_comm != MDI_COMM_NULL ) {
throw runtime_error("Accepted a communicator from a second QM engine.");
}
qm_comm = comm;
}
else {
throw runtime_error("Unrecognized engine name.");
}
delete[] engine_name;
}

We are now ready to use MDI to orchestrate an AIMD simulation. In broad terms, during each dynamics iteration we will send a set of atomic coordinates from the MM engine to the QM engine, order the QM engine to send the driver the corresponding atomic forces, send those forces to the MM engine, and order the MM engine to perform a time step.

Keep in mind that the driver doesn't know anything about the simulated system beyond what it can query by sending MDI commands. MDI engines initialize basic information about the system using whatever input file format the engine's developer chose. When the MM and QM engines are launched, they will each read their standard input files, perform basic initialization tasks, and then request a connection to the driver. The driver can request information about the engine's system by sending certain MDI commands. For example, to determine the number of atoms in the MM engine's system, you could do:

// Receive the number of atoms from the MM engine
int natoms;
MDI_Send_command("<NATOMS", mm_comm);
MDI_Recv(&natoms, 1, MDI_INT, mm_comm);

Similarly, to learn the coordinates of the atoms in the MM engine's system, you could do:

// Receive the coordinates from the MM engine
double coords[3*natoms];
MDI_Send_command("<COORDS", mm_comm);
MDI_Recv(&coords, 3*natoms, MDI_DOUBLE, mm_comm);

You could then update the coordinates of the QM engine's system:

// Send the coordinates to the QM engine
MDI_Send_command(">COORDS", qm_comm);
MDI_Send(&coords, 3*natoms, MDI_DOUBLE, qm_comm);
Note
As we will discuss later in Using the Driver, this driver assumes that the engines have both been initialized with the same number of atoms, the same atom types and ordering of atom types, and the same cell dimensions.

The following code will handle all of the work associated with driving an AIMD simulation Replace the comment that reads "// Perform the simulation" with:

// Perform the simulation
int niterations = 10; // Number of MD iterations
int natoms;
double qm_energy;
double mm_energy;
// Receive the number of atoms from the MM engine
MDI_Send_command("<NATOMS", mm_comm);
MDI_Recv(&natoms, 1, MDI_INT, mm_comm);
// Allocate the arrays for the coordinates and forces
double coords[3*natoms];
double forces[3*natoms];
// Have the MM engine initialize a new MD simulation
MDI_Send_command("@INIT_MD", mm_comm);
// Perform each iteration of the simulation
for (int iiteration = 0; iiteration < niterations; iiteration++) {
// Receive the coordinates from the MM engine
MDI_Send_command("<COORDS", mm_comm);
MDI_Recv(&coords, 3*natoms, MDI_DOUBLE, mm_comm);
// Send the coordinates to the QM engine
MDI_Send_command(">COORDS", qm_comm);
MDI_Send(&coords, 3*natoms, MDI_DOUBLE, qm_comm);
// Have the MM engine proceed to the @FORCES node
MDI_Send_command("@FORCES", mm_comm);
// Get the QM energy
MDI_Send_command("<ENERGY", qm_comm);
MDI_Recv(&qm_energy, 1, MDI_DOUBLE, qm_comm);
// Get the MM energy
MDI_Send_command("<ENERGY", mm_comm);
MDI_Recv(&mm_energy, 1, MDI_DOUBLE, mm_comm);
// Receive the forces from the QM engine
MDI_Send_command("<FORCES", qm_comm);
MDI_Recv(&forces, 3*natoms, MDI_DOUBLE, qm_comm);
// Send the forces to the MM engine
MDI_Send_command(">FORCES", mm_comm);
MDI_Send(&forces, 3*natoms, MDI_DOUBLE, mm_comm);
// Have the MM engine proceed to the @COORDS node, which completes the timestep
MDI_Send_command("@COORDS", mm_comm);
cout << "timestep: " << iiteration << " " << mm_energy << " " << qm_energy << endl;
}

The above code does the following:

  1. Queries the number of atoms from the MM engine and allocates appropriately sized arrays for the coordinates and forces
  2. Orders the MM engine to initialize a new MD simulation
  3. Begins an iterative loop over MD iterations
    1. Receives the atomic coordinates from the MM engine and updates the QM engine with them
    2. Receives the energy of the QM and MM systems
    3. Receives the atomic forces from the QM engine and sends them to the MM engine
    4. Orders the MM engine update the atomic coordinates

Finally, we should send an MDI command to cause the engines to exit. Replace the comment that reads "// Send the "EXIT" command to each of the engines" with the following:

// Send the "EXIT" command to each of the engines
MDI_Send_command("EXIT", mm_comm);
MDI_Send_command("EXIT", qm_comm);

Driver Compilation

The cookiecutter came with everything you need to build with CMake. To compile, simply navigate to the directory where you want the driver to be built, then do the following, replacing "<driver_top_directory>" with the path to top directory of your driver repository.

cmake <driver_top_directory>
make

Using the Driver

To test the driver, you will need access to a QM engine and an MM engine. The User Tutorial describes how to compile QE and LAMMPS for this purpose.

You will also need appropriate input files for a test simulation. The test files in the MDI_AIMD_Driver repository (see the User Tutorial) will work fine, as will the scripts. You can simply edit MDI_AIMD_Driver/tests/locations/MDI_AIMD_Driver to point to your new repository and run the scripts in the manner described in the User Tutorial.

Final Notes

Note that although we used QE as the QM code and LAMMPS as the MM code, we swap out QE for any QM engine with MDI support or LAMMPS for any MM engine with MDI support. Furthermore, nothing about our AIMD driver strictly requires that the code named "QM" actually corresponds to a quantum mechanics code. You could, for example, use LAMMPS as the QM code, while another instance of LAMMPS serves as the MM code. Doing so would allow you to run a calculation that will produce to same results as simply running an MD simulation entirely within a single instance of LAMMPS. Although not generally useful for production runs, this can be a good way to benchmark the cost of the computational overhead introduced by using our driver as a middleman between two codes.

In other cases, it may be desirable to use two different MM codes as the engines if, for example, you wish to use a force field from one MM code and a thermostat from another MM code. The only requirement on the engines is that the "QM" code supports all of the MDI commands sent by the AIMD driver to the "QM" engine, and that the "MM" code supports all of the MDI commands sent by the AIMD driver to the "MM" engine.

MDI_MPI_get_world_comm
int MDI_MPI_get_world_comm(void *world_comm)
Obtain the MPI communicator that spans the single code corresponding to the calling rank.
Definition: mdi.c:2010
MDI_Initialized
int MDI_Initialized(int *flag)
Indicates whether MDI_Init has been called.
Definition: mdi.c:265
MDI_COMM_NULL
const MDI_Comm MDI_COMM_NULL
value of a null communicator
Definition: mdi.c:56
MDI_Send
int MDI_Send(const void *buf, int count, MDI_Datatype datatype, MDI_Comm comm)
Send data through the MDI connection.
Definition: mdi.c:326
MDI_DOUBLE
const int MDI_DOUBLE
double precision float data type
Definition: mdi.c:78
MDI_NAME_LENGTH
const int MDI_NAME_LENGTH
length of an MDI name in characters
Definition: mdi.c:50
MDI_Recv
int MDI_Recv(void *buf, int count, MDI_Datatype datatype, MDI_Comm comm)
Receive data through the MDI connection.
Definition: mdi.c:359
MDI_Send_command
int MDI_Send_command(const char *buf, MDI_Comm comm)
Send a command of length MDI_COMMAND_LENGTH through the MDI connection.
Definition: mdi.c:404
MDI_Init
int MDI_Init(int *argc, char ***argv)
Initialize communication through the MDI library.
Definition: mdi.c:115
MDI_CHAR
const int MDI_CHAR
character data type
Definition: mdi.c:80
MDI_INT
const int MDI_INT
integer data type
Definition: mdi.c:60
MDI_Accept_communicator
int MDI_Accept_communicator(MDI_Comm *comm)
Accept a new MDI communicator.
Definition: mdi.c:292