From: Konrad Hinsen <research-TMPnbQDBoqj3uN+rDzSBzuTW4wlIGRCZ <at> public.gmane.org>
Subject: Re: Tuples and topology, the return
Newsgroups: gmane.science.simulation.h5md.user
Date: Wednesday 9th July 2014 08:32:27 UTC
Pierre de Buyl writes:

 > > 1) The term "topology" is used in at least two very different
 > > meanings in molecular simulations [(1) bond connectivity, (2)
 > > universe topology], so it's perhaps better to avoid it in a
 > > specification.
 > Right, I forgot your point about that. bond_connectivity,
 > bonded_interactions, ?

Both are fine, but they mean different things. Which is exactly my point 2:

 > > Point 2 also suggests that any specific module should state its
 > > intended field of application, with its limitations, which yours
 > > currently doesn't.
 > The actual field of application is coarse-grained molecular
 > models. In practice, there should be a lot in common with atomistic
 > force fields but I don't currently use them.

That's not so much the distinction I had in mind. As you say, they are
similar, because CG force fields were inspired by the atomistic ones.

What I see as lacking is a clear statement of *what* is being encoded
in the connectivity data. Put differently, imagine someone who gets a
trajectory with connectivity information but nothing else - no paper,
no README. How should that person interpret the connectivity data?

Some possibilities:

1) A connectivity table which is part of the molecular model.
   That's chemical bonds for an atomistic model, many CG models
   use a similar concept.

2) A list of interaction terms derived from a force field.

3) An annotation of the data for visualization or analysis.

4) Chemical bonds obtained as the result of a quantum-chemical computation.

1) and 3) are almost the same, but 1) and 2) definitely not. As an
illustration, take the TIP3P water model: it has two bonds (O-H), but
three bond-type interactions (H-H in addition to O-H). 4) differs from
the others in being a result and not an input.
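
To make the TIP3P illustration concrete, here is a minimal sketch in
plain Python. The variable names and the index convention (0 = O,
1 and 2 = H) are assumptions chosen for this example only; nothing
here is H5MD syntax.

```python
# Illustrative only: particle indices for one water molecule.
# 0 = O, 1 = H, 2 = H (a hypothetical convention for this sketch).
particles = ["O", "H", "H"]

# Case 1: connectivity table of the molecular model (chemical bonds).
model_bonds = [(0, 1), (0, 2)]

# Case 2: bond-type interaction terms as listed for the force field.
# TIP3P has an H-H term in addition to the two O-H terms.
forcefield_bond_terms = [(0, 1), (0, 2), (1, 2)]

# The terms present in case 2 but absent from case 1.
extra_terms = sorted(set(forcefield_bond_terms) - set(model_bonds))
print(len(model_bonds), len(forcefield_bond_terms), extra_terms)
```

The two lists look alike as tuple lists, which is why a reader needs
to be told which of the two is stored.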

Some of these cases would require additional annotation to allow for
the right interpretation. In the case of 1), a reference to the underlying
model. In the case of 2), a reference to the forcefield, plus perhaps
some of its parameters.
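
One way to picture such an annotation, as a sketch only: the class
names, the field names and the four kind labels below are hypothetical
and are not a proposal for the actual H5MD layout. The idea is simply
that a tuple list carries its interpretation along with it.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Hypothetical labels for the four interpretations listed above."""
    MODEL = "model"            # 1) part of the molecular model
    FORCEFIELD = "forcefield"  # 2) interaction terms from a force field
    ANNOTATION = "annotation"  # 3) visualization or analysis aid
    COMPUTED = "computed"      # 4) result of a quantum-chemical computation


@dataclass(frozen=True)
class Connectivity:
    kind: Kind
    pairs: tuple
    reference: str = ""  # name of the model or force field, free text here

    def is_self_describing(self):
        # Cases 1) and 2) need a reference to be interpretable on their own.
        needs_reference = self.kind in (Kind.MODEL, Kind.FORCEFIELD)
        return bool(self.reference) or not needs_reference


pairs = ((0, 1), (0, 2), (1, 2))
annotated = Connectivity(Kind.FORCEFIELD, pairs, reference="TIP3P")
bare = Connectivity(Kind.FORCEFIELD, pairs)
print(annotated.is_self_describing(), bare.is_self_describing())
```

Force field parameters, mentioned above as a possible addition for
case 2), are left out of this sketch.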

A specific problem of 2) is that for non-trivial forcefields (proteins
etc.), a simple bond list is not of much use. What you want is *all*
forcefield terms. I can't think of any practical use for just the

 > That's fine for randomly placed tuple lists but in a structure like
 > /clever_name_for_connectivity//FENE
 > it would be redundant, a problem that we have succeeded in avoiding
 > until now, unless I am just being too optimistic with the state of
 > H5MD :-)

I'd look at this from the other end: if a tuple already has a
reference to the particle group it refers to, why have this
information encoded a second time in the connectivity group?

Felix Höfling writes:

 > From Olaf (whom I consider an expert in PDB) I got the information
 > that PDB only stores connections, no bond types.

The PDB file format only stores connections. The PDB database also
stores bond types, and makes them available in the mmCIF and PDBML
files. The bond types are 'sing', 'doub', 'trip', 'quad', and 'arom',
which all make sense only at the atomic scale.

The PDB file format is not a good reference. It's completely outdated
and insufficient. Even the PDB doesn't want it any more.

Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
ORCID: http://orcid.org/0000-0003-0330-9428
Twitter: @khinsen