-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elastic network #54
Elastic network #54
Conversation
The discrepancy in the distance matrix is likely due to #36. Because we do not have weights in the mapping, yet, the BB bead positions are wrong by a bit. Hence the distance matrix does not match martinize1. |
There are discrepancies in the mappings between martinize1 and backward. This, plus the fact that I was not using the masses to compute the centre of mass -_- was causing the errors in the elastic network since the beads were not located where they should have. I also calculate the weights from the mapping as I thought it was the problem before realizing the mapping discrepancy. So this PR fixes #36. Now, the elastic network is good, and the only differences are due to rounding. Note that it can lead to bonds being added or removed compare to martinize1 when the bond length is very close to the cutoff. |
Rebased against master after renaming. |
Also for this one, I'll review it once the other two are merged to prevent duplication of effort. |
Rebased against master. |
* Add a `AttachMass` processor. * Compute mapping weights when reading the mapping files. The weights are attached to the bead node, alongside the underlying graphs. * `AverageBeads` uses the mapping-weighted COM instead of the unweighted COG
Histidine is still different than martinize1 because of an hydrogen. This should be fixed with protonation as PTMs.
Bonds are excluded for residue separated by 2 or less residues. The previous try only accounted for separations of 1.
During the mapping step, attributes from the original molecules are passed on to the mapped ones. This leads to the mapped molecules to have wrong attributes, such as atomistic masses on a coarse grained bead. This commit allows to keep only the attributes listed as argument. The attributes required to identify a residue are passed on in any case. These default attributes are chain, resid, and resname.
Some node attributes may be set to None when they are unknown, instead of being undefined. This means that `node.get(attr, 0)` can return `None` even if there is no known valid value. This breaks the PDB writer is some situations. This commit makes sure that the PDB writer uses its default value wjen an attribute is set to `None`.
Rebased (again) against master. I'll be less responsive for 8 hours or so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor things. In addition, there's docstrings missing, and I expect well find a dragon in do_mapping later on. Can't find it now though.
|
||
|
||
def compute_decay(distance_matrix, shift, rate, power): | ||
return np.exp(-rate * ((distance_matrix - shift) ** power)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call it just distance instead of distance_matrix, and let numpy deal with the broadcasting
decay_factor, decay_power, base_constant, | ||
minimum_force): | ||
constants = compute_decay(distance_matrix, lower_bound, decay_factor, decay_power) | ||
np.fill_diagonal(constants, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But do note that this should be distance_matrix here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. The point of that line is to make sure that the force constant for the self pair is 0, which could not be the case depending of the parameters for the decay; but may especially not be the case because of rounding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant at line 33 ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! It was related to the comment above!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jup :) You're too fast
np.fill_diagonal(constants, 0) | ||
constants *= base_constant | ||
constants[constants < minimum_force] = 0 | ||
constants[distance_matrix > upper_bound] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constants[(constants < minimun_force) | (distance_matrix > upper_bound)] = 0
is also an option. Yours is more readable. Do you think it's worth profiling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would work indeed. I'll let optimization for latter, though. We need a general profiling pass anyway.
if separation <= 0: | ||
raise ValueError('Separation has to be strictly positive.') | ||
if separation == 1: | ||
return nx.to_numpy_matrix(graph, nodelist=selection).astype(bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a comment
subgraph = graph.subgraph(selection) | ||
connectivity = np.zeros((len(subgraph), len(subgraph)), dtype=bool) | ||
for (i, key_i), (j, key_j) in itertools.combinations(enumerate(subgraph.nodes), 2): | ||
shortest_path = len(nx.shortest_path(subgraph, key_i, key_j)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would've done it the other way around: make a complete distance matrix, and do numpy magic to find the node_keys between which the distance is what you want. There's not really a convenient method in networkx for this though. Maybe networkx.algorithms.shortest_paths.dense.floyd_warshall_numpy
, but it would require profiling.
upper_bound, decay_factor, decay_power, | ||
base_constant, minimum_force) | ||
connectivity = build_connectivity_matrix(molecule, 2, selection=selection) | ||
constants *= ~connectivity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one needs a comment
from_key = selection[from_idx] | ||
to_key = selection[to_idx] | ||
force_constant = constants[from_idx, to_idx] | ||
length = round(distance_matrix[from_idx, to_idx], 5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not do distance_matrix = distance_matrix.round(5)
before the loop?
to_key = selection[to_idx] | ||
force_constant = constants[from_idx, to_idx] | ||
length = round(distance_matrix[from_idx, to_idx], 5) | ||
if force_constant > minimum_force: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this check not be part of compute_force_constants
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compute_force_constants
compute the constants, what we do with them is not its responsibility. Also, filtering the force constants in compute_force_constants
will only change the condition of the test, it will not get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In make_force_constants
there's already a check that does
constants[constants < minimum_force] = 0
So maybe just if force_constant
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. It is possible, but then the way it is here may be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmn. Or set it to np.nan in compute_force_constants, and check for that here. Now you're doing the same thing twice. And set a comment that it's set to nan in said function
if from_idxs: | ||
self.weights[to_idx][from_idxs[0]] = ( | ||
weights[(res_to, name_to)][(res_from, name_from)] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only from_idxs[0]
? I'm lost in my own code :s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not completely sure either from the top of my head. But I'd rather have some tests in place before messing around with some refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm ought to do a massive run of docstringing soon.
decay_factor, decay_power, base_constant, | ||
minimum_force): | ||
constants = compute_decay(distance_matrix, lower_bound, decay_factor, decay_power) | ||
np.fill_diagonal(constants, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. The point of that line is to make sure that the force constant for the self pair is 0, which could not be the case depending of the parameters for the decay; but may especially not be the case because of rounding.
np.fill_diagonal(constants, 0) | ||
constants *= base_constant | ||
constants[constants < minimum_force] = 0 | ||
constants[distance_matrix > upper_bound] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would work indeed. I'll let optimization for latter, though. We need a general profiling pass anyway.
to_key = selection[to_idx] | ||
force_constant = constants[from_idx, to_idx] | ||
length = round(distance_matrix[from_idx, to_idx], 5) | ||
if force_constant > minimum_force: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compute_force_constants
compute the constants, what we do with them is not its responsibility. Also, filtering the force constants in compute_force_constants
will only change the condition of the test, it will not get rid of it.
if from_idxs: | ||
self.weights[to_idx][from_idxs[0]] = ( | ||
weights[(res_to, name_to)][(res_from, name_from)] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not completely sure either from the top of my head. But I'd rather have some tests in place before messing around with some refactoring.
By the way, is the WIP label/tag still relevant? |
Hum... In a sense... everything is always a work in progress? |
What is still blocking in this PR? |
IIRC the missing docstrings. |
I added docstring! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work! 2 minor comments and it's good to go. If you disagree, feel free to merge as is.
|
||
|
||
def compute_decay(distance, shift, rate, power): | ||
""" | ||
Compute the decay function of the force constant as function to the distance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too vague imho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will see with @Tsjerk if he can explain the parameters to me, or I will try to plot the function... This is an obscure feature that I never seen used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already better if you describe the functional form. No need for shiny.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, little need.
The base force constant for the bonds in kJ/mol/nm^2. If 'decay_factor' | ||
or 'decay_power' is set to 0, then it will be the used force constant. | ||
minimum_force: float | ||
Minimum force constat in kJ/mol/nm^2 under which bonds are not kept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:math:? Or TeX formatting for the units? I'm pretty sure sphinx/rest can deal with it somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Units look ugly in latex. It would be something like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably write something liek this: $$\frac{kJ}{nm}^2$$
; but I take your point. Is there no middle ground?
I applied a compromise on the units. |
A first go at the rubber band elastic network. I have to figure out why I have discrepancies in my distance matrix compared to martinize1.
This is a step forward for #33. It requires #41 and #53 to be merged.