Curating Glioma Tumors for Classification Methods in Machine Learning
Sjöberg, Joel (2021)
All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2021052431387
Abstract
Glioma is a cancer that begins in the glial cells of the brain. Glioma tumors can cause a wide variety of symptoms, e.g. seizures, headaches and nausea. Analyzing these tumors is a long and tedious process, yet it is essential for determining the correct line of treatment for patients whose life expectancy ranges from a matter of months to a couple of years. Optimizing the process of diagnosing these tumors is therefore of great interest to the medical field. Furthermore, recent studies present new ways of categorizing these tumors by methylation type rather than by grade and glial cell of origin. One possible way of analyzing glioma tumors is Raman spectroscopy. The Raman spectra of the tumors contain information about the vibrations of the molecules in the tumor material.
Machine learning is a technique that has helped in analyzing non-trivial patterns and in automating tasks previously thought impossible to perform with computers. The technique uses data to form models that have been successful to the point of outperforming their human counterparts. These models perform well when tremendous amounts of data are available. Within cancer research, however, the focus is on qualitative rather than quantitative data gathering, because data gathering is often expensive and limited by the small number of patients available for study.
In this thesis, we explain the process of analyzing the spectra extracted from the tumors of glioma patients. The analysis is performed by sorting the spectra into groups that maintain minimal variance among the different frequencies within each group. This is done to separate tumor spectra from spectra of non-tumor material that may have been present during scanning, e.g. plastic, blood or necrotic tissue. We utilize machine learning methods to group and examine the samples in detail. We also validate the analysis and pre-processing by creating a model that attempts to classify the tumors according to the new categories based on methylation type. We conclude the thesis with our findings and with suggestions for how these methods can be utilized further for improved results.
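The grouping step described above can be illustrated with a small sketch. The abstract does not name the clustering algorithm used in the thesis, so the example below assumes plain k-means, which assigns each spectrum to the nearest group mean and thereby reduces the variance within each group. The function names, the farthest-point initialisation and the synthetic four-point "spectra" are all hypothetical and chosen only for illustration; they are not the thesis's data or method.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two spectra of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_spectrum(spectra):
    """Element-wise mean over a non-empty list of spectra."""
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def init_centroids(spectra, k):
    """Farthest-point initialisation (an assumption, used for determinism)."""
    centroids = [list(spectra[0])]
    while len(centroids) < k:
        farthest = max(
            spectra,
            key=lambda s: min(squared_distance(s, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(spectra, k, max_iterations=100):
    """Group spectra into k clusters; returns (labels, centroids)."""
    centroids = init_centroids(spectra, k)
    labels = None
    for _ in range(max_iterations):
        # Assign every spectrum to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(s, centroids[j]))
            for s in spectra
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [s for s, label in zip(spectra, labels) if label == j]
            if members:
                centroids[j] = mean_spectrum(members)
    return labels, centroids


def within_group_variance(spectra, labels, centroids):
    """Mean squared distance of each spectrum to its group centroid."""
    total = sum(
        squared_distance(s, centroids[label])
        for s, label in zip(spectra, labels)
    )
    return total / len(spectra)


# Synthetic intensities at four hypothetical frequencies (not real Raman data).
tissue_like = [[1.0 + 0.01 * i, 5.0, 1.0, 0.5] for i in range(4)]
contaminant_like = [[4.0, 0.5, 0.2 + 0.01 * i, 6.0] for i in range(4)]
spectra = tissue_like + contaminant_like

labels, centroids = kmeans(spectra, k=2)
print(labels)
print(within_group_variance(spectra, labels, centroids))
```

With real spectra, groups whose mean spectrum resembles plastic, blood or necrotic tissue could then be set aside before training a classifier. How such groups are identified is not specified in the abstract and is left open here.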