The following graph models the similarities between agents, based on a special subset of sentences that prescribe medications (e.g. "oxycodone 5 mg tablet 1 tablet by mouth every 4 hours as needed"). Each agent is represented by a node, and the edge connecting two nodes carries the similarity score of the sentence pair in which the two agents occur. An agent pair may occur more than once; in that case, the average similarity score is used. To extract the agents from the sentences, the MedEx-UMA system is used, followed by additional postprocessing steps.
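The averaging of repeated agent pairs can be sketched as follows. This is a minimal illustration, not the code behind the graph: the function name `build_agent_graph`, the input format (triples of two agent names and a score) and the example values are assumptions, and the agent extraction step is not shown.

```python
from collections import defaultdict


def build_agent_graph(scored_pairs):
    """Return {(agent_a, agent_b): mean score} for an undirected graph.

    scored_pairs is an iterable of (agent_a, agent_b, score) triples
    (hypothetical input format). Repeated pairs are averaged.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for agent_a, agent_b, score in scored_pairs:
        # Sort the names so that (a, b) and (b, a) map to the same edge.
        edge = tuple(sorted((agent_a, agent_b)))
        totals[edge] += score
        counts[edge] += 1
    return {edge: totals[edge] / counts[edge] for edge in totals}


# Made-up scores, for illustration only.
example_pairs = [
    ("oxycodone", "hydrocodone", 4.0),
    ("hydrocodone", "oxycodone", 3.0),  # same pair again, reversed order
    ("oxycodone", "ibuprofen", 1.5),
]
graph = build_agent_graph(example_pairs)
```

Here the pair oxycodone/hydrocodone occurs twice, so its edge receives the mean of 4.0 and 3.0, i.e. 3.5.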
The general idea is that the similarity information of the agents is also useful for predicting the similarity of a new sentence pair. For this, we look up the corresponding agent nodes, find the shortest path between them and aggregate the information on the edges into a final similarity score. To see this calculation in detail, mark two arbitrary nodes in the graph as start- and endpoint and take a look at the example below.
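The path search could look roughly like the sketch below. The text does not say whether "shortest" refers to the number of edges or to the edge scores, so this sketch assumes the fewest edges and uses a plain breadth-first search; the function name `shortest_path` and the edge dictionary format are likewise assumptions.

```python
from collections import deque


def shortest_path(edges, start, end):
    """Return the nodes on a path with the fewest edges, or None.

    edges maps (node_a, node_b) tuples to scores (hypothetical format);
    the scores are ignored here because hop count is assumed.
    """
    adjacency = {}
    for node_a, node_b in edges:
        adjacency.setdefault(node_a, set()).add(node_b)
        adjacency.setdefault(node_b, set()).add(node_a)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk back from the endpoint to the start node.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # the two nodes are not connected


# Made-up graph, for illustration only.
example_edges = {("a", "b"): 4.0, ("b", "c"): 2.0, ("a", "d"): 1.0}
path = shortest_path(example_edges, "a", "c")
```

For unconnected nodes the function returns `None`, in which case the graph offers no prediction for that pair.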
To calculate the similarity score between two agents, we can use the scores on the edges of the shortest path. Before we aggregate this information, though, one important observation: the final score cannot be higher than the lowest score on the path, because a dissimilarity introduced at one step cannot be undone by later steps. For example, if one edge has a score of 1 (low similarity), additional connections cannot raise the result above this value. They can, however, influence how much further this value is decreased.
To account for this observation, we use the formula for the equivalent resistance of a parallel circuit
\begin{equation*} \label{eq:ResistanceParallelCircuits} \frac{1}{R_{eq}} = \frac{1}{R_1} + \frac{1}{R_2} + \ldots \end{equation*} with the equivalent resistance \(R_{eq}\) of the circuit and the resistances \(R_i\) of the individual branches. The nice thing about this equation is that, for positive resistances, the equivalent resistance \(R_{eq}\) is never larger than any of the individual resistances \(R_i\), i.e. \(R_{eq} \leq \min(R_1, R_2, \ldots)\). This is exactly what we need to predict scores in the graph. In our case, \(R_{eq}\) is the final score \(s\) obtained from the graph for a sentence pair and the \(R_i\) are the scores on the edges along the path.
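As a rough sketch, the aggregation can be written in a few lines. The function name `aggregate_scores` is hypothetical, and the handling of a zero score (returning 0, the limit of the formula as one score approaches 0) is an assumption that the text does not cover.

```python
def aggregate_scores(edge_scores):
    """Combine edge scores like resistances in a parallel circuit.

    Implements 1 / s = 1 / s_1 + 1 / s_2 + ... for non-negative scores.
    """
    scores = list(edge_scores)
    if not scores:
        raise ValueError("at least one edge score is required")
    if any(score < 0 for score in scores):
        raise ValueError("scores must be non-negative")
    if any(score == 0 for score in scores):
        # Assumption: a zero score forces the result to zero (limit case).
        return 0.0
    return 1.0 / sum(1.0 / score for score in scores)


# A path with two edges scored 4 and 2 (made-up values).
s = aggregate_scores([4.0, 2.0])
```

With edge scores 4 and 2, the result is \(1 / (1/4 + 1/2) = 4/3 \approx 1.33\), which lies below the lowest score on the path, as required.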
To see a concrete example, select two (connected) nodes in the graph and mark them as start- and endpoint.