Authors: Siddhant Jain, Jalal Ziauddin, Paul Leonchyk, Joseph Geraci
ArXiv: 1810.11959
Abstract URL: http://arxiv.org/abs/1810.11959v1
The ability to accurately classify disease subtypes is of vital importance,
especially in oncology, where this capability could have a life-saving impact.
Here we report a classification between two subtypes of non-small cell lung
cancer, namely adenocarcinoma and squamous cell carcinoma. The data consist
of approximately 20,000 gene expression values for each of 104 patients. The
data were curated from [1] [2]. We used a combination of classical and
quantum machine learning models to successfully classify these patients. We
utilized feature selection methods based on univariate statistics in addition
to XGBoost [3]. We also used QCrush, a novel and proprietary data
representation method developed by one of the authors, which was designed to
incorporate a maximal amount of information under the size constraints of the
D-Wave quantum annealing computer. The machine learning itself was performed
by a Quantum Boltzmann Machine. This paper reports our results, the various
classical methods, and the quantum machine learning approach we utilized.
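
To make the univariate feature selection step concrete, here is a minimal sketch in plain Python. The abstract does not say which univariate statistic was used, so the choice of Welch's t-statistic is an assumption, as are the function names `welch_t` and `top_k_features` and the synthetic data (104 simulated patients, 200 simulated genes, two of them shifted between classes). It does not reproduce XGBoost, QCrush, or the Quantum Boltzmann Machine.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (assumed statistic)."""
    denom = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0.0:
        return 0.0  # constant feature in both groups: no evidence of a difference
    return (mean(a) - mean(b)) / denom


def top_k_features(X, y, k):
    """Rank columns of X by |t| between classes 0 and 1; return the top-k indices."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        group0 = [row[j] for row, label in zip(X, y) if label == 0]
        group1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(group0, group1)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Synthetic stand-in for expression data (hypothetical, not the paper's data).
rng = random.Random(0)
n_patients, n_genes = 104, 200
y = [i % 2 for i in range(n_patients)]  # two balanced classes
informative = {3, 17}                   # genes whose mean differs by class
X = [
    [rng.gauss(5.0 if (j in informative and label == 1) else 0.0, 1.0)
     for j in range(n_genes)]
    for label in y
]

selected = top_k_features(X, y, k=2)
print(sorted(selected))
```

Ranking each gene independently keeps the cost linear in the number of genes, which matters when roughly 20,000 features must be reduced to a set small enough for the annealer's size constraints.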