# On the Applicability of Network Science in Personalized Medicine

##### Nushi, Elio (2020)

Nushi, Elio

Åbo Akademi

2020

This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.

**The permanent address of the publication is**

https://urn.fi/URN:NBN:fi-fe2020051838213

##### Abstract

The availability of big data on genetic information, together with growing knowledge about the behavior of and interactions between genes and proteins, has drawn the interest of computational scientists to the field of biology. In particular, systems biology is the science that aims to study the complex communication between objects in biological environments in order to gain a holistic understanding of living systems, as opposed to the reductionist approach, which studies the components separately. Mathematical models are usually built to analyze complex biological systems such as protein-protein interaction (PPI) networks and disease networks. The ultimate goal of understanding a system is being able to manipulate it into a desired state, which is referred to as the controllability of the system. System controllability is a strong background motivation for the concept of personalized medicine, or precision medicine, which aims to identify treatment lines based on the individual characteristics of patients in order to find the most appropriate drug(s) that can move the biological system from a sick state to a healthy state while minimizing side effects.

The full controllability problem for a network can be solved in polynomial time, but the solutions it offers, especially in the case of complex systems, are large, which makes it inappropriate for personalized medicine. In the case of cancer networks, which are considerably large and complex, full controllability is not useful given the large number of nodes that have to be changed by external controllers (e.g., drugs). Since full controllability offers solutions that are infeasible in practice, a more realistic goal is target controllability: being able to transform the network from an initial state to a new state in which only some of the nodes (i.e., the target nodes) have the desired values. Target controllability has been proven to be an NP-complete problem, and many approximate computational techniques have been tried to solve it.

In this thesis we focus on the core intuition behind some of the approximate techniques for solving target controllability, whose aim is to keep the set of drug target nodes as small as possible. We exploit several network science methods based on centrality measures to gain information about biological networks in terms of their topological structure and to identify important nodes. Numerous studies have analyzed the concept of centrality in the context of social networks. However, the possible applicability of these measures to biological networks has not received equal attention. We consider it relevant and necessary to provide a comprehensive summary of the commonly used centrality methods and to see how well they predict important proteins and genes in cancer networks. Furthermore, we review the topic of random graphs, discuss their properties, and describe different models for generating them. Afterwards, we identify common properties that real multiple myeloma (MM) cancer networks share with random graph models. Finally, we apply different centrality methods to MM networks and compare the outcomes with what is already known from clinical medicine and supported by research papers in the field. Our final goal is to identify the significant genes and proteins that play a crucial role in the development of this disease based merely on topological attributes.
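
As a rough illustration of ranking nodes by topology alone, the following minimal sketch computes two common centrality measures, degree and closeness, on a small toy graph. The graph, the node labels `A` to `E`, and the helper names are hypothetical and are not taken from the thesis or from any real MM network; the sketch also assumes an undirected, connected graph with at least two nodes.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the sum of shortest-path distances."""
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        reachable = len(dist) - 1
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


def rank(scores):
    """Nodes sorted by descending score; ties broken by label."""
    return sorted(scores, key=lambda node: (-scores[node], node))


# Hypothetical toy network (adjacency sets), NOT real biological data.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)
print("by degree:   ", rank(degree))
print("by closeness:", rank(closeness))
```

In this toy graph both measures place `A` first, but they disagree about `D`: it has the same degree as `B` and `C`, yet a higher closeness because it bridges to `E`. Differences of this kind are why comparing several centrality measures on the same network can be informative.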
