Commit

added short feature description
berylgithub committed Dec 23, 2019
1 parent db48451 commit 03c2f70
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions visualization.ipynb
@@ -12,6 +12,38 @@
"https://github.com/berylgithub"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature Extraction\n",
"\n",
"The data is pre-processed to obtain the dataset $D=\\{(y^{(n)}, \\vec{x}^{(n)})\\}^N_{n=1}$. Each feature is computed with the Heaviside step function, which counts the atom contacts between two protein chains that lie within the cutoff distance $d_{cutoff}$ of 12 Angstrom. The number of contacts between atoms of type $j$ and atoms of type $i$ is given by:\n",
"<br>\n",
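"$x_{Z(P_X(j)),Z(P_Y(i))}\\equiv\\sum^{K_j}_{k=1}\\sum^{L_i}_{l=1} \\Theta (d_{cutoff} - d_{kl})$\n",
"<br>\n",
"where $\\Theta$ is the Heaviside step function and $d_{kl}$ is the distance between atoms $k$ and $l$.\n",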
"\n",
"Here $P_X(j)$ corresponds to the protein atoms in the first chain of a chain pair, and $P_Y(i)$ to those in the second chain. Four atom types are considered for each chain:\n",
"<br>\n",
"$\\{P_N(j)\\}_{j=1}^{4} = \\{C,N,O,S\\}$, $N \\in \\mathbb{Z}^+$.<br>\n",
"Hydrophobic-patch and acid-patch interactions are also considered:\n",
"<br>\n",
"- hydrophobic patch : $\\{C\\alpha_H\\}$\n",
"<br>\n",
"- acid patch : $\\{C\\alpha_A\\}$,\n",
"<br>\n",
"\n",
"where $C\\alpha_H$ and $C\\alpha_A$ are the total numbers of interactions between alpha carbons on hydrophobic and acid patches, respectively.\n",
"\n",
"The total number of atomic interaction features is therefore $|\\{P_N(j)\\}_{j=1}^{4}|^2 + |\\{C\\alpha_H\\}| + |\\{C\\alpha_A\\}| = 4^2 + 1 + 1 = 18$.\n",
"\n",
"Suppose a complex has four protein chains, $S = \\{P_1, P_2, P_3, P_4 \\}$. Interactions are evaluated for every unordered pair of chains, so the number of chain pairs in this case is $\\binom{|S|}{2} = 6$. The total interaction is the sum over these pairs:\n",
"<br>\n",
"$\\sum interaction = interaction(P_1, P_2) + interaction(P_1,P_3) + interaction(P_1,P_4) + interaction(P_2,P_3) + interaction(P_2,P_4) + interaction(P_3,P_4)$\n",
"\n",
"The script used for data preprocessing and feature extraction is available at https://github.com/berylgithub/ppbap."
]
},
{
"cell_type": "code",
"execution_count": 1,
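The contact-counting scheme described in the added markdown cell can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the script from the ppbap repository: the chain format (a list of `(element, (x, y, z))` tuples), the function names `heaviside`, `contact_counts`, and `complex_features`, and the convention that the step function equals 1 at zero are all assumptions made here. The two patch features ($C\alpha_H$, $C\alpha_A$) are left out because the cell does not say how patches are identified, so the sketch yields only the $4^2 = 16$ atom-type features.

```python
from itertools import combinations
from math import dist

ATOM_TYPES = ("C", "N", "O", "S")
D_CUTOFF = 12.0  # Angstrom, the cutoff stated in the text


def heaviside(x):
    """Step function; returning 1 at x == 0 is an assumed convention."""
    return 1 if x >= 0 else 0


def contact_counts(chain_x, chain_y, d_cutoff=D_CUTOFF):
    """Count contacts for each (type in chain_x, type in chain_y) pair.

    Each chain is a list of (element, (x, y, z)) tuples (hypothetical format).
    """
    counts = {(a, b): 0 for a in ATOM_TYPES for b in ATOM_TYPES}
    for elem_k, pos_k in chain_x:
        for elem_l, pos_l in chain_y:
            if (elem_k, elem_l) in counts:
                counts[(elem_k, elem_l)] += heaviside(d_cutoff - dist(pos_k, pos_l))
    return counts


def complex_features(chains, d_cutoff=D_CUTOFF):
    """Sum the per-pair counts over every unordered pair of chains."""
    total = {(a, b): 0 for a in ATOM_TYPES for b in ATOM_TYPES}
    for chain_x, chain_y in combinations(chains, 2):
        for key, value in contact_counts(chain_x, chain_y, d_cutoff).items():
            total[key] += value
    return total


# Toy complex with four chains (made-up coordinates, for illustration only).
toy_chains = [
    [("C", (0.0, 0.0, 0.0)), ("N", (1.0, 0.0, 0.0))],
    [("O", (5.0, 0.0, 0.0))],
    [("S", (20.0, 0.0, 0.0))],
    [("C", (100.0, 0.0, 0.0))],
]
features = complex_features(toy_chains)
```

`itertools.combinations(chains, 2)` enumerates the six chain pairs for a four-chain complex. The feature keys are ordered (first chain type, second chain type), which is why `("C", "O")` and `("O", "C")` are counted separately and the atom-type block has 16 entries rather than 10.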
