Identification of Biomarkers and Drug Target in Human Cancers


  1. Introduction
    1. Cancer
      1.    Burden of Cancer in World & India
      2.    Types of Cancer
      3.    Pathogenesis of Cancer
      4.    Stages of Cancer
      5.    Screening Methods
      6.    Early diagnosis of Cancer is essential
    2. Cancer Biomarkers
      1.    Properties of an ideal cancer biomarker
      2.    Advantages of cancer biomarkers
      3.    Historical overview of cancer biomarkers
      4.    Current applications of cancer biomarkers and their clinical utility
      5.    Currently available cancer biomarkers: clinical utility and limitations
    3. Mechanism of biomarker dysregulation
      1.    Gene expression
      2.    MicroRNA
    4. Technologies
      1.    Microarray
      2.    Next-generation sequencing
    5. Obstacles delaying the clinical translation of biomarkers
    6. Role of systematic review and meta-analysis in biomarker development
    7. Project aim and objectives


    1.  Cancer

Cells are the fundamental units of life. Human cells grow and divide to form new cells as the body needs them. When cells grow old or become damaged, they die, and new cells take their place. Sometimes this orderly process breaks down and cells grow abnormally: old or damaged cells survive when they should die, and new cells form when they are not needed. These extra cells can divide without stopping and may form growths called tumors. Tumors may be benign or malignant; malignant tumors are cancerous growths [1]. A malignant tumor, or cancer, has the potential to invade nearby tissue and to spread from one part of the body to another through the blood or the lymphatic system.

Cancers are generally classified by the type of cell or organ from which they originate. Since malignant growth can occur in virtually all locations of the body, there are over 100 different types of cancer. Cancer is an immensely complex and diverse disease; however, a set of characteristics is shared among almost all malignancies. These characteristics, named the hallmarks of cancer, are a unified set of capabilities that are acquired during tumorigenesis.

Figure 1.1: The Hallmarks of Cancer. A: The set of hallmarks of cancer proposed in year 2000. B: Extended Emerging Hallmarks and Enabling Characteristics.

Source: Hanahan D, Weinberg RA (2011) Hallmarks of cancer: the next generation. Cell 144: 646–674.

The originally proposed hallmarks of cancer are self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of programmed cell death, limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis [2]. The list has since been extended with two emerging hallmarks, deregulating cellular energetics and avoiding immune destruction. Additionally, two enabling characteristics were proposed: tumor-promoting inflammation, and genome instability and mutation [3].

  1.      Burden of Cancer in world & India

According to the World Health Organization (WHO) fact sheet 2012, of the approximately 57 million deaths that occurred worldwide in 2008, 7.6 million (13%) were attributed to cancer [4]. In addition, more than 14.1 million people were diagnosed with cancer in 2012, and there were 8.2 million cancer deaths in the same year. Table 1 shows cases, deaths and 5-year prevalence of cancer by region [5]. Around 8 million (57%) new cancer cases, 5.3 million (65%) cancer deaths and 15.6 million (48%) 5-year prevalent cancer cases occurred in the less developed regions. In India, 1.015 million new cases were reported, which constitutes ~7.2% of all cancer cases worldwide, along with 0.68 million deaths due to cancer. The estimated number of new cases in India is higher in women than in men, as shown in Table 1.

Table 1: Estimated cases, deaths and 5-year prevalence of cancer. Source: Globocan 2012 [5].

Estimated numbers (thousands)
Population | Men: cases | Men: deaths | Men: 5-year prev. | Women: cases | Women: deaths | Women: 5-year prev. | Both sexes: cases | Both sexes: deaths | Both sexes: 5-year prev.
World | 7410 | 4653 | 15296 | 6658 | 3548 | 17159 | 14068 | 8202 | 32455
More developed regions | 3227 | 1592 | 8550 | 2827 | 1287 | 8274 | 6054 | 2878 | 16823
Less developed regions | 4184 | 3062 | 6747 | 3831 | 2261 | 8885 | 8014 | 5323 | 15632
WHO Africa region (AFRO) | 265 | 205 | 468 | 381 | 250 | 895 | 645 | 456 | 1363
WHO Americas region (PAHO) | 1454 | 677 | 3843 | 1429 | 618 | 4115 | 2882 | 1295 | 7958
WHO East Mediterranean region (EMRO) | 263 | 191 | 461 | 293 | 176 | 733 | 555 | 367 | 1194
WHO Europe region (EURO) | 1970 | 1081 | 4791 | 1744 | 852 | 4910 | 3715 | 1933 | 9701
WHO South-East Asia region (SEARO) | 816 | 616 | 1237 | 908 | 555 | 2041 | 1724 | 1171 | 3278
WHO Western Pacific region (WPRO) | 2642 | 1882 | 4493 | 1902 | 1096 | 4464 | 4543 | 2978 | 8956
IARC membership (24 countries) | 3689 | 1900 | 9193 | 3349 | 1570 | 9402 | 7038 | 3470 | 18595
United States of America | 825 | 324 | 2402 | 779 | 293 | 2373 | 1604 | 617 | 4775
China | 1823 | 1429 | 2496 | 1243 | 776 | 2549 | 3065 | 2206 | 5045
India | 477 | 357 | 665 | 537 | 326 | 1126 | 1015 | 683 | 1790
European Union (EU-28) | 1430 | 716 | 3693 | 1206 | 561 | 3464 | 2635 | 1276 | 7157
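
The percentage shares quoted in the text can be recomputed from the "both sexes" columns of Table 1. The following is a minimal sketch; the dictionary layout and the function name `share_of_world` are illustrative choices, not part of the GLOBOCAN data release.

```python
# Both-sexes figures copied from Table 1 (GLOBOCAN 2012), in thousands.
TABLE1_BOTH_SEXES = {
    "World": {"cases": 14068, "deaths": 8202, "prev5": 32455},
    "Less developed regions": {"cases": 8014, "deaths": 5323, "prev5": 15632},
    "India": {"cases": 1015, "deaths": 683, "prev5": 1790},
}


def share_of_world(region, measure):
    """Return a region's percentage share of the world total for one measure."""
    world_total = TABLE1_BOTH_SEXES["World"][measure]
    return 100.0 * TABLE1_BOTH_SEXES[region][measure] / world_total


for measure in ("cases", "deaths", "prev5"):
    pct = share_of_world("Less developed regions", measure)
    print(f"Less developed regions, {measure}: {pct:.1f}%")

print(f"India, cases: {share_of_world('India', 'cases'):.1f}%")
```

Rounded to whole percentages, the results agree with the 57%, 65% and 48% given for the less developed regions and with the ~7.2% share of cases given for India.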

The most commonly diagnosed cancers worldwide are lung (1.82 million, 13% of the total), breast (1.67 million, 11.9%) and colorectal cancers (1.36 million, 9.7%). The most common causes of cancer death are lung (1.58 million, 19.4% of the total), liver (0.74 million, 9.1%) and colorectal cancer (0.69 million, 8.5%) [5]. Cancer is a major health burden in both developed and developing countries. In India, about 850,000 new cancer cases are diagnosed and about 580,000 cancer-related deaths occur every year. India has the highest number of oral and throat cancer cases in the world [6]. The number of cancer deaths continues to rise: an estimated nine million people were projected to die of cancer in 2015, and more than 11 million in 2030 [7]. Lung, liver, stomach, colorectal and breast cancers cause the most cancer deaths each year. In India, the largest numbers of cases are due to breast, cervical, lip, oral cavity, lung and colorectal cancers. Breast and cervical cancers are among the most frequently reported cancers in Indian women, while cancers of the lip, oral cavity and lung are the most frequently reported in men.

  2.      Types of Cancer

According to the National Cancer Institute (NCI), cancers can be grouped according to the type of cell in which they start. There are five main categories:

  • Carcinoma – cancer that begins in the skin or in tissues that line or cover internal organs. There are a number of subtypes, including adenocarcinoma, basal cell carcinoma, squamous cell carcinoma, and transitional cell carcinoma
  • Sarcoma – cancer that begins in the connective or supportive tissues such as bone, cartilage, fat, muscle, or blood vessels
  • Leukaemia – cancer that starts in blood forming tissue such as the bone marrow and causes large numbers of abnormal blood cells to be produced and go into the blood
  • Lymphoma and myeloma – cancers that begin in the cells of the immune system
  • Brain and spinal cord cancers – these are known as central nervous system cancers

  3.      Pathogenesis of Cancer

Cancer is often described as a disease of the genome because it acquires the hallmarks of cancer through the accumulation of DNA mutations and genome instability [2]. To date, cancer remains incurable in many cases, and many theories have been proposed to explain its cause. These theories attribute cancer to viruses [8], chromosomal abnormalities [9,10], somatic mutations [11], accumulated multiple mutations [12], failure of immunological surveillance [13,14], non-healing wounds [15], non-mutagenic mechanisms [16], disrupted tissue organization (the tissue organization field theory) [17] and the wound-oncogene-wound healing process [18]. Currently prevalent theories hold that cancer is an uncontrolled proliferation of somatic cells caused by genetic alterations in critical genes that control cell growth and differentiation [19,20,21,22,23].

Alterations in three types of genes are responsible for tumorigenesis: oncogenes, tumor-suppressor genes and stability genes (Table 2). Oncogene and tumor-suppressor gene mutations operate similarly at the physiologic level: they drive the neoplastic process by increasing tumor cell number through the stimulation of cell birth or the inhibition of cell death or cell-cycle arrest. The increase can be caused by activating genes that drive the cell cycle, by inhibiting normal apoptotic processes or by facilitating the provision of nutrients through enhanced angiogenesis. A third class of cancer genes, called stability genes or caretakers, promotes tumorigenesis in a completely different way when mutated. This class includes the mismatch repair (MMR), nucleotide-excision repair (NER) and base-excision repair (BER) genes, which are responsible for repairing subtle mistakes made during normal DNA replication or induced by exposure to mutagens. Stability genes keep genetic alterations to a minimum; thus, when they are inactivated, mutations in other genes occur at a higher rate [23].

More recently, various microRNA genes have also been reported to be involved in the initiation and progression of cancer. These microRNA genes alter the expression of genes that have roles in cell growth and differentiation [24,25,26].

Table 2: Examples of oncogenes, tumor-suppressor genes and stability genes that are associated with cancers.

Tumor-suppressor genes Oncogenes Stability genes

It is well established that genetic alterations in these cancer-related molecules are associated with the initiation and progression of cancer (Figure 2) [19,20,21,22,23,27].

Figure 2: Initiation & Progression of cancer.

These alterations may be transmitted through the germline and result in susceptibility to cancer, or they can arise by somatic mutation. A single genetic change is rarely sufficient for the development of a malignant tumor. Most evidence points to a multistep process of sequential alterations in several, often many, oncogenes, tumor-suppressor genes or microRNA genes in cancer cells [27]. These genetic variations may be referred to as “drivers” or “passengers”. Drivers are the genomic alterations that cause or promote cancer, whereas passengers are alterations that are present in the cancer genome but conferred no obvious advantage on the cancerous cells when they occurred [28]. The major known alterations in the cancer genome include amplifications, frameshift mutations, germline mutations, large deletions, missense mutations, nonsense mutations, somatic mutations, splicing mutations and translocations (Figure 3) [29,30,31].

The cancer-related molecules may be transcription factors, chromatin remodelers, growth factors, growth factor receptors, signal transducers or apoptosis regulators, all of which can help the cell to undergo uncontrolled growth and differentiation. They may be activated by various mechanisms, mainly chromosomal rearrangements, mutations and gene amplification [27]. These genetic alterations are most likely reflected in the altered expression of sets of genes or pathways, rather than of individual genes [32]. In addition to somatic and germline mutations, epigenetic alterations that regulate gene expression are also involved in most cancers [33,34]. All of these acquired changes occur against a background of germline variation in copy number and nucleotide sequence, which may influence the rate of occurrence and/or the effects of somatic genetic alterations [35].

Figure 3: Cancer Gene Census (Source: )

  4.      Stages of Cancer

Cancer staging describes the extent of tumor growth and spread. The stage is determined by means of X-rays, biopsy and other laboratory tests and procedures. A staging system records the location, size and grade of the tumor, the cell type (for example, adenocarcinoma or squamous cell carcinoma) and the spread of the cancer to lymph nodes or other parts of the body [36].

TNM Staging System

The TNM system is the most widely used cancer staging system. It describes the extent of the primary tumor (T), the spread of the tumor to regional lymph nodes (N) and the presence of distant metastasis (M). The TNM system is developed and maintained by the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC). It gives clinicians a set of standardized criteria for staging different types of cancer.

The “T” category designates the original tumor:

Category | Implication
TX | Primary tumor cannot be detected
T0 | No evidence of primary tumor
Tis | Carcinoma in situ (early cancer that has not spread to neighboring tissue)
T1 – T4 | Size and/or extent of the primary tumor

The category “N” describes whether or not the cancer has reached nearby lymph nodes:

Category | Implication
NX | Regional lymph nodes cannot be evaluated
N0 | No regional lymph node involvement
N1 – N3 | Involvement of regional lymph nodes

The M category tells whether there are distant metastases (spread of cancer to other parts of the body):

Category | Implication
M0 | No distant metastasis
M1 | Distant metastasis

Once the T, N, and M are determined, they are combined, and an overall stage of 0, I, II, III, IV is assigned (Table 3).

Table 3: Cancer Stages

Stage | What it means
Stage 0 | Abnormal cells are present but have not spread to nearby tissue. Also called carcinoma in situ, or CIS. CIS is not cancer, but it may become cancer.
Stage I, Stage II, and Stage III | Cancer is present. The higher the number, the larger the cancer tumor and the more it has spread into nearby tissues.
Stage IV | The cancer has spread to distant parts of the body.
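
How T, N and M categories combine into an overall stage can be sketched in code. The example below is a deliberately coarse illustration: it parses a compact TNM string and applies only the two rules that follow directly from Table 3 (distant metastasis gives Stage IV; carcinoma in situ without nodal involvement gives Stage 0). The actual grouping into Stages I to III depends on the cancer site and on the edition of the staging manual, so it is not attempted here. The function names and the compact string format (for example `T2N1M0`) are assumptions made for this sketch, and subcategories such as T1a are not handled.

```python
import re

# Compact TNM string such as "T2N1M0" or "TisN0M0" (assumed input format).
TNM_PATTERN = re.compile(r"^T(X|0|is|[1-4])N(X|[0-3])M([01])$")


def parse_tnm(code):
    """Split a compact TNM string into its T, N and M categories."""
    match = TNM_PATTERN.match(code.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised TNM code: {code!r}")
    t, n, m = match.groups()
    return {"T": t, "N": n, "M": m}


def coarse_stage(code):
    """Apply only the stage rules that follow directly from Table 3."""
    tnm = parse_tnm(code)
    if tnm["M"] == "1":
        return "Stage IV"  # distant metastasis
    if tnm["T"] == "is" and tnm["N"] == "0":
        return "Stage 0"  # carcinoma in situ
    if "X" in (tnm["T"], tnm["N"]):
        return "Not assessable"  # T or N could not be evaluated
    return "Stage I-III (site-specific grouping needed)"


for example in ("TisN0M0", "T2N1M0", "T3N2M1"):
    print(example, "->", coarse_stage(example))
```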


Another staging system that is used for all types of cancer groups the cancer into one of five main categories.

  • In situ—abnormal cells are present but have not spread to nearby tissue.
  • Localized—Cancer is limited to the place where it started, with no sign that it has spread.
  • Regional—Cancer has spread to nearby lymph nodes, tissues, or organs.
  • Distant—Cancer has spread to distant parts of the body.
  • Unknown—there is not enough information to figure out the stage.

  5.      Screening Methods

The primary goal of screening is to prevent lethal, progressive disease by detecting cancer at an earlier, more treatable stage or by detecting precursor lesions that can be removed before they develop into invasive cancers. Screening is a presumptive identification of disease; it is not a diagnostic tool. A positive screening result alerts the individual to the need for further testing. The main kinds of screening tests are:

  • Physical exam and history: An exam of the body to check general signs of health, including checking for signs of disease, such as lumps or anything else that seems unusual. A history of the patient’s health habits and past illnesses and treatments will also be taken.
  • Laboratory tests: Medical procedures that test samples of tissue, blood, urine, or other substances in the body.
  • Imaging procedures: Procedures that make pictures of areas inside the body.
  • Genetic tests: Tests that look for certain gene mutations (changes) that are linked to some types of cancer.

Table 4: Commonly available screening methods [37].

Test | Cancer
Colonoscopy, sigmoidoscopy and high-sensitivity fecal occult blood tests (FOBTs) | Colorectal cancer
Low-dose helical computed tomography | Lung cancer
Mammography | Breast cancer
Pap test and human papillomavirus (HPV) testing | Cervical cancer
Alpha-fetoprotein blood test | Liver cancer
Breast MRI | Breast cancer
CA-125 test | Ovarian cancer
PSA test | Prostate cancer
Skin exams | Skin cancer

  6.      Early diagnosis of Cancer is essential

The concept of detecting cancers early, before they spread and become incurable, has attracted physicians and research scientists for decades [38]. Most cancers have already spread regionally or to distant sites at the time of diagnosis [39]. Moreover, survival rates are lower for people diagnosed at an advanced stage than for those diagnosed at an early stage, as shown in Table 5. Shifting diagnosis to earlier stages would therefore be expected to have a profound impact on overall mortality and economic burden. At present, very few screening tests are suitable for the early detection of cancer. This is because high sensitivity (the probability that the test is positive in individuals with the disease) and high specificity (the probability that the test is negative in individuals without the disease) are rarely attributes of the same test; an increase in sensitivity tends to reduce specificity, and vice versa. Newer diagnostic methods with improved sensitivity and specificity are clearly needed to identify early-stage cancers. Effective early detection must meet four criteria. First, the disease must be common and have a high mortality rate. Second, the screening test must accurately detect early-stage disease. Third, treatment after detection through screening must demonstrably improve prognosis. Finally, the potential benefits must outweigh the potential harms and costs of screening [38]. One of the most promising ways to achieve this is through the use of cancer biomarkers.
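
The trade-off between sensitivity and specificity described above can be illustrated with a small sketch. The marker levels and cutoffs below are invented purely for illustration (they are not data from any cited study), and the function name `sensitivity_specificity` is hypothetical. A test is called positive when the marker level is at or above the cutoff.

```python
def sensitivity_specificity(diseased, healthy, cutoff):
    """Return (sensitivity, specificity) for a 'level >= cutoff' positive call."""
    true_pos = sum(1 for level in diseased if level >= cutoff)
    true_neg = sum(1 for level in healthy if level < cutoff)
    return true_pos / len(diseased), true_neg / len(healthy)


# Hypothetical marker levels in arbitrary units, invented for illustration.
diseased = [12, 18, 25, 31, 40, 44, 52, 60]
healthy = [5, 8, 10, 14, 16, 20, 22, 35]

for cutoff in (10, 20, 30, 40):
    sens, spec = sensitivity_specificity(diseased, healthy, cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With these invented values, raising the cutoff lowers sensitivity while raising specificity, which is the behaviour the paragraph describes for a single test.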

Table 5: 5-year relative survival rates of commonly occurring cancers by stage at detection.

Stage | Breast Cancer | Colon Cancer | Rectal Cancer | Liver Cancer | Cervical Cancer | Non-small cell Lung Cancer
1 | 100% | 92% | 87% | 31% | 80-93% | 45-49%
2 | 93% | 63-87% | 49-80% | 31% | 58-63% | 30-31%
3 | 72% | 53-89% | 58-84% | 11% | 32-35% | 5-14%
4 | 22% | 11% | 12% | 3% | 15-16% | 1%

Source: American Cancer Society

  2.      Cancer Biomarkers

According to the NCI, a biomarker is “a biological molecule found in blood, other body fluids, or tissues as a sign of a normal or abnormal process or of a condition or disease like cancer”. Biomarkers typically distinguish an affected patient from a healthy person. The differences can arise from a number of factors, such as germline or somatic mutations, transcriptional changes and post-translational modifications. Generally, biomarkers are used in three primary ways [40]:

  • To help diagnose conditions, as in case of identifying early stage cancers (Diagnostic)
  • To forecast how aggressive a condition is, as in the case of determining a patient’s ability to fare in the absence of treatment (Prognostic)
  • To predict how well a patient will respond to treatment (Predictive)

There is a great variety of biomarkers, including:

  • Genetic biomarkers (e.g., gene mutations)
  • Transcriptional biomarkers (e.g., altered gene or microRNA expression)
  • Proteomic biomarkers (e.g., altered protein status)
  • Metabolic biomarkers (e.g., altered metabolites)

These biomarkers can be identified during the process of carcinogenesis as shown in Figure 4.

Figure 4: Cancer biomarkers which can be identified during the cancer progression.

A biomarker can also be a collection of alterations, such as gene expression, proteomic, and metabolomic signatures. Biomarkers can be detected in the circulation (whole blood, serum, or plasma) or excretions or secretions (stool, urine, sputum, or nipple discharge), and thus easily assessed non-invasively and serially, or can be tissue-derived, and require either biopsy or special imaging for evaluation. Genetic biomarkers can be inherited, and detected as sequence variations in germ line DNA isolated from whole blood, sputum, or buccal cells, or can be somatic, and identified as mutations in DNA derived from tumor tissue [41].

Tumor markers are chemical substances produced by cancerous cells, or by other cells of the body in response to cancer and related conditions. Most tumor markers are produced by both normal and cancer cells; in cancer, their levels become much higher and can be measured in the blood, urine, stool, other body fluids, tumor tissue or other tissues of the patient. Most tumor markers are proteins. More recently, gene expression patterns and changes in genetic material have also begun to be used as cancer markers. Many different tumor markers have been characterized and are in clinical use. Some are associated with only one type of cancer, whereas others are associated with two or more cancer types. No universal tumor marker that detects all types of cancer is available yet [42].

  1.      Properties of an ideal cancer biomarker [43,44]

  • Easily measured
  • Reliable
  • Cost-effective
  • High analytical sensitivity and specificity
  • Specific
  • Present in detectable (or higher than normal) quantities at early or preclinical stages, with quantitative levels that reflect the tumor burden
  • High diagnostic sensitivity (few false negatives) and specificity (few false positives)

  2.      Advantages of cancer biomarkers

  • Help in detecting, diagnosing and managing many cancer types. An elevated level of a tumor marker suggests the presence of cancer; this measurement is therefore combined with other examinations, such as biopsy, to confirm the diagnosis.
  • Help doctors to plan the appropriate therapy.
  • Reflect the stage of the cancer and its prognosis.
  • Can be measured periodically during cancer therapy. A decrease in the level of the tumor marker indicates that the cancer is responding to treatment, whereas no change or an increase suggests that it is not [45].
  • Can be measured after treatment has ended to check for recurrence of the cancer [46].
  • Aid in distinguishing malignant from benign tissue.
  • Can be used to determine an individual’s risk of developing cancer. For example, a woman with a strong family history of ovarian cancer can undergo genetic testing to determine whether she carries a germline mutation, such as a mutation in BRCA1, that increases her risk of developing breast and/or ovarian cancer [47].
  • In a patient with an abnormality, biomarkers can also be used to distinguish between different possibilities that are in the differential diagnosis [41].

  3.      Historical overview of cancer biomarkers

The first cancer marker ever reported was the light chain of immunoglobulin, found in the urine of 75% of myeloma patients [48]. Although it was discovered in 1847, the test is still used by clinicians today, now with modern quantification techniques. Between 1930 and 1960, scientists identified numerous hormones, enzymes and other proteins whose concentrations were altered in biological fluids from cancer patients. The modern era of monitoring malignant disease, however, began in the 1960s with the discovery of alpha-fetoprotein (AFP) and carcinoembryonic antigen (CEA), which was facilitated by the introduction of immunological techniques such as the radioimmunoassay [49,50]. In the 1980s, hybridoma technology enabled the development of the ovarian epithelial cancer marker carbohydrate antigen 125 (CA 125) [51]. In 1980, prostate-specific antigen (PSA), considered one of the best cancer markers, was discovered [52].

  4.      Current applications of tumor markers and their clinical utility

One application of a tumor marker is population screening. A screening test should have very high sensitivity and exceptional specificity, to avoid too many false positives in populations with low cancer prevalence. Furthermore, the test must demonstrate a benefit in terms of clinical outcome. Unfortunately, the diagnostic sensitivity and specificity of current biomarkers are too low for them to serve as screening markers. With the exception of PSA, current tumor markers are more frequently elevated at late stages of disease. Hence, the current clinical utility of any marker as a screening tool is limited. Another application is diagnosis; here too, current biomarkers are limited by low diagnostic sensitivity and specificity.

A further application is prognosis. Most cancer markers have some prognostic value; however, specific therapeutic interventions cannot be based on them because their accuracy of prediction is rather poor. In addition, some markers can serve as predictive indicators of therapeutic response. Very few markers have predictive power (exceptions include steroid hormone receptors and HER-2 amplification in breast cancer), but where they do, the information helps in therapy selection. Yet another application is tumor staging. Apart from AFP and human chorionic gonadotropin-β (hCG) in the staging of testicular cancer, the accuracy of markers in determining tumor stage is poor.

Two more applications are detecting early tumor recurrence and monitoring the effectiveness of cancer therapy. The usefulness of current markers for detecting recurrence is controversial, because the lead time is short and does not significantly affect outcome. In addition, therapies for recurrent disease are not usually effective, clinical relapse can occur without biomarker elevation, and biomarker elevation can be non-specific. For monitoring the effectiveness of therapy, current biomarkers provide information on therapeutic response (effective or non-effective) that is readily interpretable and more economical than imaging modalities. Hence, current markers play an essential clinical role in this application.
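
The requirement for exceptional specificity in low-prevalence populations follows from Bayes' rule: the positive predictive value (PPV), the probability that a person with a positive test actually has the disease, depends strongly on prevalence. The sketch below uses an assumed test with 90% sensitivity and 95% specificity; these figures and the prevalence values are illustrative assumptions, not properties of any marker discussed here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed test characteristics, chosen only for illustration.
SENSITIVITY = 0.90
SPECIFICITY = 0.95

for prevalence in (0.001, 0.01, 0.1):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

Under these assumptions, most positive results in a population with 0.1% prevalence are false positives, which is why a marker that performs acceptably in symptomatic patients can still be unsuitable for population screening.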

  5.      Currently available cancer biomarkers: clinical utility and limitations

Cancer biomarkers, whether diagnostic or prognostic, are quantifiable traits that help clinical oncologists from their first interaction with a patient suspected of having cancer. In particular, they aid in (i) identifying who is at risk, (ii) diagnosing at an early stage, (iii) selecting the best treatment modality, and (iv) monitoring response to treatment [54]. These biomarkers exist in many different forms. Traditional biomarkers include those that can be assessed with radiological techniques, such as mammograms, and circulating levels of tumor-specific (or tumor-related) antigens, for example prostate-specific antigen (PSA). With the availability of the complete human genome sequence and advances in key technologies such as high-throughput DNA sequencing, microarrays and mass spectrometry, the range of potentially informative cancer biomarkers has expanded dramatically to include the sequence and expression levels of DNA, RNA and protein, as well as metabolites [55]. Advances in imaging technologies open up the possibility that pertinent molecular biomarkers (e.g., those marking response to therapy) can be monitored non-invasively in cancer patients. The currently available cancer biomarkers are described below.

A number of tumor markers are currently used for a wide range of cancer types. Although most of these can be tested in laboratories that meet the standards set by the Clinical Laboratory Improvement Amendments, some cannot and may therefore be considered experimental. Tumor markers that are currently in common use are shown in Table 6. These markers are widely used, but they have limitations in terms of sensitivity and specificity. For example, the key problems in using the CA-125 test as a screening tool are its lack of sensitivity and its inability to detect early-stage cancers. Increased levels of CA 19-9 can also be found in patients with nonmalignant inflammatory diseases, such as cholecystitis, obstructive icterus, cholelithiasis, cholecystolithiasis, acute cholangitis, toxic hepatitis and other liver diseases, and the marker should therefore be used with caution [57,58]. Elevated levels of hCG are also found in the blood and urine of pregnant women, so hCG may not be useful as a marker in pregnancy.

Table 6: Currently available cancer biomarkers [56].

S.No. | Tumor marker | Type of cancer | Tissue analyzed | Use
1. | ALK gene rearrangements and overexpression | Non-small cell lung cancer and anaplastic large cell lymphoma | Tumor | To help determine treatment and prognosis
2. | Alpha-fetoprotein (AFP) | Liver cancer and germ cell tumors | Blood | To help diagnose liver cancer and follow response to treatment; to assess stage, prognosis, and response to treatment of germ cell tumors
3. | Beta-2-microglobulin (B2M) | Multiple myeloma, chronic lymphocytic leukemia, and some lymphomas | Blood, urine or cerebrospinal fluid | To determine prognosis and follow response to treatment
4. | Beta-human chorionic gonadotropin (Beta-hCG) | Choriocarcinoma and germ cell tumors | Urine or blood | To assess stage, prognosis, and response to treatment
5. | BRCA1 and BRCA2 gene mutations | Ovarian cancer | Blood | To determine whether treatment with a particular type of targeted therapy is appropriate
6. | BCR-ABL fusion gene (Philadelphia chromosome) | Chronic myeloid leukemia, acute lymphoblastic leukemia, and acute myelogenous leukemia | Blood and/or bone marrow | To confirm diagnosis, predict response to targeted therapy, and monitor disease status
7. | BRAF V600 mutations | Cutaneous melanoma and colorectal cancer | Tumor | To select patients who are most likely to benefit from treatment with certain targeted therapies
8. | C-kit/CD117 | Gastrointestinal stromal tumor and mucosal melanoma | Tumor | To help in diagnosing and determining treatment
9. | CA15-3/CA27.29 | Breast cancer | Blood | To assess whether treatment is working or disease has recurred
10. | CA19-9 | Pancreatic cancer, gallbladder cancer, bile duct cancer, and gastric cancer | Blood | To assess whether treatment is working
11. | CA-125 | Ovarian cancer | Blood | To help in diagnosis, assessment of response to treatment, and evaluation of recurrence
12. | Calcitonin | Medullary thyroid cancer | Blood | To aid in diagnosis, check whether treatment is working, and assess recurrence
13. | Carcinoembryonic antigen (CEA) | Colorectal cancer and some other cancers | Blood | To keep track of how well cancer treatments are working or check if cancer has come back
14. | CD20 | Non-Hodgkin lymphoma | Blood | To determine whether treatment with a targeted therapy is appropriate
15. | Chromogranin A (CgA) | Neuroendocrine tumors | Blood | To help in diagnosis, assessment of treatment response, and evaluation of recurrence
16. | Chromosomes 3, 7, 17, and 9p21 | Bladder cancer | Urine | To help in monitoring for tumor recurrence
17. | Circulating tumor cells of epithelial origin (CELLSEARCH®) | Metastatic breast, prostate, and colorectal cancers | Blood | To inform clinical decision making, and to assess prognosis
18. | Cytokeratin fragment 21-1 | Lung cancer | Blood | To help in monitoring for recurrence
19. | EGFR gene mutation analysis | Non-small cell lung cancer | Tumor | To help determine treatment and prognosis
20. | Estrogen receptor (ER)/progesterone receptor (PR) | Breast cancer | Tumor | To determine whether treatment with hormone therapy and some targeted therapies is appropriate
21. | Fibrin/fibrinogen | Bladder cancer | Urine | To monitor progression and response to treatment
22. | HE4 | Ovarian cancer | Blood | To plan cancer treatment, assess disease progression, and monitor for recurrence
23. | HER2/neu gene amplification or protein overexpression | Breast cancer, gastric cancer, and gastroesophageal junction adenocarcinoma | Tumor | To determine whether treatment with certain targeted therapies is appropriate
24. | Immunoglobulins | Multiple myeloma and Waldenström macroglobulinemia | Blood and urine | To help diagnose disease, assess response to treatment, and look for recurrence
25. | KRAS gene mutation analysis | Colorectal cancer and non-small cell lung cancer | Tumor | To determine whether treatment with a particular type of targeted therapy is appropriate
26. | Lactate dehydrogenase | Germ cell tumors, lymphoma, leukemia, melanoma, and neuroblastoma | Blood | To assess stage, prognosis, and response to treatment
27. | Neuron-specific enolase (NSE) | Small cell lung cancer and neuroblastoma | Blood | To help in diagnosis and to assess response to treatment
28. | Nuclear matrix protein 22 | Bladder cancer | Urine | To monitor response to treatment
29. | Programmed death ligand 1 (PD-L1) | Non-small cell lung cancer | Tumor | To determine whether treatment with a particular type of targeted therapy is appropriate
30. | Prostate-specific antigen (PSA) | Prostate cancer | Blood | To help in diagnosis, assess response to treatment, and look for recurrence
31. | Thyroglobulin | Thyroid cancer | Blood | To evaluate response to treatment and look for recurrence
32. | Urokinase plasminogen activator (uPA) and plasminogen activator inhibitor (PAI-1) | Breast cancer | Tumor | To determine aggressiveness of cancer and guide treatment
33. | 5-Protein signature (OVA1®) | Ovarian cancer | Blood | To pre-operatively assess pelvic mass for suspected ovarian cancer
34. | 21-Gene signature (Oncotype DX®) | Breast cancer | Tumor | To evaluate risk of recurrence
35. | 70-Gene signature (Mammaprint®) | Breast cancer | Tumor | To evaluate risk of recurrence
  1.      Mechanism of biomarker dysregulation
    1.      Gene expression

The protein encoded by a gene can be expressed in increased quantities because of an increase in gene or chromosome copy number (i.e. gene amplification) or through increased transcriptional activity. The latter can result from an imbalance between gene repressors and activators. Epigenetic changes, such as DNA methylation, are also known to affect gene expression. On a larger scale, chromosomal translocations can place a gene under the control of promoters that are sometimes enhanced by steroid hormones; transposons can play a similar role [59]. An example of a putative biomarker is human epididymis protein 4 (HE4), which is overexpressed in ovarian carcinoma. In a cDNA microarray screen for genes overexpressed in ovarian carcinoma, 101 transcripts were overexpressed in ovarian cancers compared with normal tissues [60, 61]. Real-time polymerase chain reaction (PCR) on an independent set of benign and malignant tissues confirmed that 12 of these transcripts were indeed overexpressed in ovarian cancers. Two of them, WFDC2 (also known as HE4) and MSLN, appeared to have the highest selectivity. Quantification of HE4 protein levels in serum showed that it is a potential biomarker for ovarian cancer [62], although clinical evaluation is pending. However, a survey of HE4 gene and protein expression in a large series of normal and malignant adult tissues showed that HE4 is also present in pulmonary, endometrial and breast adenocarcinomas, in addition to ovarian carcinoma [63].

  1.      MicroRNA

MicroRNAs (miRNAs) are small (18-22 nucleotides in length) RNAs that form one class of small non-coding RNA. Other classes include small interfering RNA (siRNA) and PIWI-interacting RNA (piRNA), which have diverse functions. miRNAs play important roles in tumor suppression, apoptosis, metabolism, cellular growth and differentiation, and they negatively regulate gene expression at the post-transcriptional level. Deregulated miRNA expression, caused by miRNA gene mutation (Calin et al., 2002) or epigenetic modification (Toyota et al., 2008), plays a crucial role in carcinogenesis (Xi, 2013; Cattaneo et al., 2014; Reddy, 2015). miR-29b and miRNA-30-5p have tumor suppressor capability (Rossi et al., 2012; Amodio et al., 2013), whereas the miR-17-92 cluster has oncogenic properties. miRNAs are present in various biological fluids, such as cerebrospinal fluid, plasma, serum, saliva and breast milk (D’Angelo et al., 2016). Studies suggest a role for circulating miRNAs in cancer-related pathology. The presence of miR-22, miR-24 and miR-30a in blood is related to the pathophysiology of non-small cell lung cancer (Franchina et al., 2014). A high level of miR-19a is a favorable prognostic biomarker in metastatic HER2+ inflammatory breast cancer (Anfossi et al., 2014), while miR-429, miR-205, miR-200b, miR-203, miR-125b and miR-34b are used as diagnostic tools in lung cancer and gastric cancer pathology (Qui et al., 2016).

At present, miRNA profiling has been proposed for patients participating in clinical trials for glioma, breast cancer, hepatocellular carcinoma and pediatric cancer, in order to diagnose the cancer type and to improve prognosis (D’Angelo et al., 2016).

miRNAs owe these diagnostic properties to their non-cellular nature, their resistance to degradation and their presence in bodily fluids (D’Angelo et al., 2016).

Despite these properties, miRNAs have certain limitations as cancer biomarkers. One is that some miRNAs are also present under normal, healthy conditions. For example, miR-141 has been reported both in healthy pregnant women and in patients with prostate cancer (Chim et al., 2008; Mitchell et al., 2008). Such overlap complicates miRNA profiling. To resolve this problem, researchers have suggested tracing only certain miRNAs rather than the whole pool of circulating miRNAs. Circulating miRNAs can be divided into two types: miRNAs complexed with Argonaute-2 proteins and miRNAs packed in vesicles. Because miRNAs encapsulated in exosomes take part in intercellular signaling, attention should be given only to vesicle-packed miRNAs as biomarkers (Turchinovich et al., 2011).

  1.      Technologies
    1.      Microarray

Microarrays have been the most commonly used method for measuring targeted loci or genes of interest for almost 20 years (Lashkari et al., 1997). They allow simultaneous profiling of thousands of genetic features, such as SNPs, CNVs, mRNAs and miRNAs. Typically, an mRNA array measures the expression of over 20,000 genes, and a miRNA array measures the expression of ~1,000 microRNAs. However, problems remain in the preprocessing of the data. In addition to the true signal, raw microarray data may exhibit systematic differences between samples due to bias introduced by technical factors. Proper normalization is therefore a critical step for making downstream comparative analysis reliable, that is, for minimizing false negative and false positive results. Because array technology is based on the mutual and specific affinity of complementary DNA strands, a known reference genome and transcriptome are required before a microarray platform can be designed. The technology also has limited probe density. This matters especially when detecting SNPs or CNVs: the number of targeted loci can reach a few million, but this is still very low considering that the human genome contains 3 billion base pairs.
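As an illustration of the normalization step, the sketch below applies quantile normalization, one widely used technique for making intensity distributions comparable across arrays. It is not necessarily the method used in this work; the function name and the toy intensities are hypothetical, and ties are handled naively (tied values receive neighbouring rank means rather than an average).

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (arrays).

    Each sample is a list of probe intensities; all samples must have the
    same length. After normalization every sample has the same distribution.
    """
    n_probes = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    # Mean intensity at each rank, averaged across all samples.
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_probes)
    ]
    normalized = []
    for sample in samples:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: sample[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical raw intensities: three arrays, four probes each.
raw = [[5, 2, 3, 4], [4, 1, 4, 2], [3, 4, 6, 8]]
normalized = quantile_normalize(raw)
```

After this step the three arrays share identical sorted values, so remaining differences between samples reflect probe ranks rather than array-wide intensity shifts.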

  1.      Next-generation sequencing

The use of next-generation sequencing (NGS) has grown rapidly during the last decade. This technology permits global measurement of the whole genome or transcriptome, producing large numbers of sequencing reads in a single run within a short time frame and in a cost-effective manner relative to traditional Sanger sequencing (Metzker, 2010). NGS allows a DNA fragment to be sequenced repeatedly (a procedure known as deep sequencing), delivers greatly increased sensitivity and accuracy, and has revolutionized the world of genomics. The technique has more recently been extended to the analysis of the transcriptome, in what is known as RNA-Seq. Current commercial NGS systems include Illumina, the Applied Biosystems Supported Oligonucleotide Ligation Detection System (SOLiD), the Roche 454 and others. NGS has the potential to measure all known mutations and structural variants in the genome, and genes, isoforms and miRNAs in the transcriptome, and furthermore to discover novel variants.

To facilitate and accelerate the identification of genetic variation at the population level, the 1000 Genomes Project performed whole-genome sequencing of a large number of individuals at great effort. To characterize disease-specific alterations in cancer genomes, the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) sequenced over 20,000 cancer genomes in at least 50 types of cancer.

In sequencing experiments, millions of reads are generated and are typically stored in FASTQ format. The aligned reads are usually saved in Binary Alignment/Map (BAM) format, from which read count information can be computed for downstream estimation and analysis. However, there are some technical problems in processing and analysing NGS data. Firstly, because of polymerase chain reaction (PCR) amplification bias and sequencing error, the initial raw reads must be preprocessed and filtered properly. Secondly, annotation is still incomplete, and inaccurate annotation of gene-isoform structure may bias the estimation of isoform-level expression. Thirdly, non-uniform read coverage is an important issue, especially in RNA-sequencing experiments, because coverage not only determines whether there is adequate information but is also related to the expression level to be estimated. Several further issues may complicate the use of sequencing technology, for example counting reads that span more than one region, reads that map to multiple locations, and the challenge of handling paired-end as compared with single-end reads.

In this thesis, whole-genome sequencing data have not been analysed; instead, we analysed exome sequencing (Exome-seq) data, a less expensive alternative that sequences only the exon regions of the genome. Exons are known to comprise roughly 1% of the genome (Gilissen et al., 2011), so this compromise reduces the sequenced region by 99% while retaining the most informative sources of genetic variation. An important project for identifying genetic variants in coding regions is the Exome Sequencing Project (ESP), a multi-cohort project on heart, lung and blood disorders that aims to discover novel genes and mechanisms contributing to the diverse phenotypes.
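The difficulty of counting reads that span more than one region can be made concrete with a small sketch. The code below assumes that aligned reads and annotated features are already available as simple (chromosome, start, end) intervals with half-open coordinates; the function name, the gene names and the policy of setting aside ambiguous reads are illustrative assumptions, not the behaviour of any particular tool.

```python
def count_reads(reads, features):
    """Count aligned reads per feature.

    reads:    list of (chrom, start, end) intervals, half-open coordinates.
    features: dict mapping a feature name to (chrom, start, end).
    A read overlapping exactly one feature is counted for that feature;
    reads overlapping several features or none are tallied separately.
    """
    counts = {name: 0 for name in features}
    ambiguous = 0
    no_feature = 0
    for chrom, start, end in reads:
        hits = [
            name
            for name, (f_chrom, f_start, f_end) in features.items()
            if f_chrom == chrom and start < f_end and f_start < end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif hits:
            ambiguous += 1
        else:
            no_feature += 1
    return counts, ambiguous, no_feature


# Hypothetical annotation with two overlapping genes, and four aligned reads.
features = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 160),  # inside geneA only
    ("chr1", 170, 220),  # overlaps geneA and geneB
    ("chr1", 250, 300),  # inside geneB only
    ("chr2", 100, 150),  # no annotated feature
]
counts, ambiguous, no_feature = count_reads(reads, features)
```

How ambiguous reads are treated (discarded, split, or assigned probabilistically) changes the resulting counts, which is one reason the choice has to be reported alongside any expression estimate.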

  1.      Obstacles in delaying biomarker’s clinical translation

Basic research aimed at clinical translation is a priority at both academic and industrial levels (178). Investment in clinical research is limited, and for this reason translational research is gaining prominence and importance. The genetic revolution is considered to have delivered its benefits slowly. For every type of cancer, the number of candidate diagnostic markers is huge, but the clinical application of such biomarkers is comparatively low (162).

The major factors behind the delay in clinical translation of candidate diagnostic biomarkers under investigation can be grouped under six heads. They apply to almost all cancer types and play a vital role in the identification and validation of biomarkers. The six reasons are:

  • Lack of quantification and synthesis of the existing evidence

Clinical translational research includes, as a first step, the identification of existing diagnostic biomarkers that have been investigated in order to address a clinical problem (157, 179). In some cases, a single biomarker has been investigated in more than one study (174, 180). Such studies have shown significant results in differentiating two tumor types (179) and in differentiating subtypes of the same tumor (157). More importantly, biomarkers differentiate malignant tumors from benign ones (132, 166). The literature contains very few studies that synthesize and quantify biomarkers and their performance, so there is a need for systematic reviews, followed by biomarker meta-analyses, that deal specifically with the diagnosis of clinical problems. This approach leads to the identification of suitable biomarker candidates and allows a single biomarker, first used in different investigational studies, to be selected for a potential validation study. It also provides a firm basis for developing better biomarkers in a single setting.

  • Sample size

An inadequate sample size is most common in pilot studies of novel biomarkers (142, 181), because a large tissue source is rarely available for investigating a new biomarker. In many such studies, even appropriate statistical tools cannot overcome the problem of sample size. Biomarkers investigated in large samples should be considered for meta-analysis, yet such studies often lack performance measures such as sensitivity and specificity. In addition, biomarker expression is not homogeneous across tissues taken from different patients (182, 183). With a large sample size, inter-tumor heterogeneity in biomarker expression is easier to detect, which helps improve biomarkers for diagnostic purposes. Biomarkers that show inter-tumor heterogeneity are less sensitive and less accurate in disease diagnosis. Another major problem is the distribution of samples between the disease and normal groups (142), since a potent biomarker must differentiate between benign and malignant tumors. Ideally, the samples should be distributed equally between the benign and malignant groups.
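To give a rough sense of what an adequate sample size means here, the sketch below uses the standard normal-approximation formula for estimating a proportion, n = z² p (1 − p) / d², where p is the expected sensitivity (or specificity) and d is the acceptable half-width of its confidence interval. This formula is an assumption brought in for illustration and is not taken from the cited studies; the expected values of 0.90 and 0.80 are hypothetical.

```python
import math
from statistics import NormalDist


def samples_needed(expected_proportion, half_width, confidence=0.95):
    """Samples needed so that the confidence interval around an expected
    sensitivity or specificity has roughly the requested half-width
    (normal approximation for a single proportion)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    variance = expected_proportion * (1 - expected_proportion)
    return math.ceil(z ** 2 * variance / half_width ** 2)


# Hypothetical targets: sensitivity 0.90 and specificity 0.80, each +/- 0.05.
malignant_needed = samples_needed(0.90, 0.05)  # malignant samples
benign_needed = samples_needed(0.80, 0.05)     # benign samples
```

Sensitivity is estimated from the malignant samples alone and specificity from the benign samples alone, so each group needs to be large enough on its own; this is why the split between groups matters as much as the total.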

  • Lack of an optimal scoring system and threshold

A robust and inclusive scoring system is essential when interpreting IHC in order to quantify the extent of biomarker expression. Scoring systems and thresholds are used to assign patients to diagnostic categories such as benign and malignant. Researchers use a wide array of scoring systems (157, 184-186), based on staining intensity, the percentage of positively stained cells, semi-quantitative histoscores, or a combination of staining intensity and positive cells (160, 171, 187, 188). A semi-quantitative histoscore comprises both the intensity and the proportion of staining; it therefore serves as a standard scoring system for quantifying biomarker expression and for calculating cut-offs for diagnostic purposes. Choosing an appropriate, easy-to-use cut-off after scoring is a challenge for pathologists, who want an optimal cut-off that is reliable and reproducible. Receiver operating characteristic (ROC) curve analysis allows researchers to choose an optimal cut-off with high diagnostic sensitivity and specificity (189, 190). This cut-off can then be used in future validation studies and to observe variation between different scorers.
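The scoring and cut-off selection described above can be sketched as follows. The histoscore is computed with one common definition (1 × % weakly stained + 2 × % moderately stained + 3 × % strongly stained cells, giving 0 to 300), and the cut-off is chosen by maximizing Youden's J (sensitivity + specificity − 1) over the points of the ROC curve. Both choices are assumptions made for illustration, and the scores and diagnoses are hypothetical.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """Semi-quantitative histoscore (0-300) from the percentage of cells
    at each staining intensity."""
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def best_cutoff(scores, is_malignant):
    """Scan every observed score as a candidate cut-off (score >= cut-off
    is called malignant) and return (cutoff, sensitivity, specificity, J)
    for the cut-off with the highest Youden's J."""
    positives = sum(is_malignant)
    negatives = len(is_malignant) - positives
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(s >= cutoff and m for s, m in zip(scores, is_malignant))
        true_neg = sum(s < cutoff and not m for s, m in zip(scores, is_malignant))
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        youden_j = sensitivity + specificity - 1
        # Strict comparison keeps the lowest cut-off when J is tied.
        if best is None or youden_j > best[3]:
            best = (cutoff, sensitivity, specificity, youden_j)
    return best


# Hypothetical histoscores for eight samples with known diagnoses.
scores = [20, 40, 60, 90, 150, 180, 210, 270]
is_malignant = [False, False, False, True, False, True, True, True]
cutoff, sensitivity, specificity, youden_j = best_cutoff(scores, is_malignant)
example_score = h_score(pct_weak=10, pct_moderate=20, pct_strong=30)
```

Other criteria, such as fixing a minimum sensitivity, are equally defensible; whichever is used, a cut-off derived from one cohort still has to be confirmed in an independent one.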

  • Limited use of biomarker panels

An ideal diagnostic biomarker would be expressed homogeneously within and between tumor tissues of the same cancer type, but in practice biomarker expression shows both intra-tumor and inter-tumor heterogeneity (191). A single biomarker is very unlikely to be perfectly sensitive and specific in all patients (122, 192). A possible solution is to use a panel of biomarkers that addresses both inter- and intra-tumor heterogeneity. Most researchers have investigated biomarkers singly or with a restricted panel approach, and studies that use more than one biomarker are very rare (Jhala et al., 2006). Using a panel approach within a single experiment is a powerful way to identify suitable biomarkers and explore their diagnostic performance, because it allows individual biomarkers to be compared with one another and with panels of biomarkers. This comparison then determines a suitable panel for clinical translational studies and future validation. Different biomarkers stain different cell compartments, whereas a panel of biomarkers stains all major sub-cellular compartments. In this way, the approach gives a pathologist more confidence in identifying and reporting a disease.
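A minimal sketch of comparing single biomarkers with a panel is given below. It assumes each biomarker has already been reduced to a positive or negative call per sample and that the panel is called positive when any of its markers is positive; the combination rule, the marker names and the calls are hypothetical.

```python
def sensitivity_specificity(calls, is_malignant):
    """Return (sensitivity, specificity) of positive/negative calls
    against the known diagnoses."""
    true_pos = sum(c and m for c, m in zip(calls, is_malignant))
    true_neg = sum((not c) and (not m) for c, m in zip(calls, is_malignant))
    positives = sum(is_malignant)
    negatives = len(is_malignant) - positives
    return true_pos / positives, true_neg / negatives


def panel_calls(marker_calls):
    """Combine markers with an 'any marker positive' rule."""
    return [any(sample) for sample in zip(*marker_calls.values())]


# Hypothetical calls for two markers on four malignant and four benign samples.
is_malignant = [True, True, True, True, False, False, False, False]
marker_calls = {
    "marker_a": [True, True, False, False, False, False, False, False],
    "marker_b": [False, True, True, True, True, False, False, False],
}
single = {
    name: sensitivity_specificity(calls, is_malignant)
    for name, calls in marker_calls.items()
}
panel = sensitivity_specificity(panel_calls(marker_calls), is_malignant)
```

With an "any positive" rule the panel can only gain sensitivity and can only lose specificity relative to its members, so the rule itself is part of what a validation study has to fix in advance.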

  • Technical differences in IHC between laboratories

Optimizing IHC is important in biomarker research for proper staining of tissue. Manufacturers suggest protocols for IHC, and most research laboratories additionally optimize their antibodies. Optimization is used to increase the strength and specificity of the signal while suppressing background signal and artefacts (O’Hurley et al., 2014; Taylor, 2006). Research laboratories also use different IHC experimental conditions, such as different clones of primary antibodies, heat-induced epitope retrieval or enzymatic antigen retrieval, and different dilutions of primary antibodies (Anagnostou et al., 2010), which contribute to the different sensitivity and specificity values reported for the same biomarker in different studies. Studies that addressed this issue compared different antibody clones, antigen retrieval buffers of different pH and different dilutions of the primary antibodies (Emoto et al., 2005; Vassallo et al., 2004; Hermansen et al., 2011; McCabe et al., 2005), and all led to better optimization of antibodies. This technical heterogeneity must be dealt with for the systematic identification of biomarkers with good sensitivity and specificity. Assay development is essential to a quality biomarker, and biomarkers sometimes fail to reach the list of potential candidates for lack of it (Phillips et al., 2006; Carden et al., 2010).

  • Lack of well-designed validation studies

Clinical translation is also delayed because potent IHC biomarkers have to be validated in different tissue cohorts (Issaq et al., 2011). Many promising biomarkers have been investigated without subsequent, well-designed validation studies, although validation studies improve the chances of delivering better biomarkers for clinical purposes (Rifai et al., 2006; Drucker and Krapfenbauer, 2013). The steps in validation studies are as follows:

  • The same IHC methods, scoring system and cut-offs should be used to validate biomarkers in different laboratories and patient cohorts.
  • Expression levels, and the resulting diagnostic sensitivity and specificity, must be similar across validation studies in order to establish the reproducibility of the cut-offs and the IHC methodology for diagnosis.
  • A multi-institutional group should be established to carry out validation studies and address technical and other problems (Wagner and Srivastava, 2012). One such group is the European Study Group for Pancreatic Cancer (ESPAC) (Neoptolemos et al., 2001).

In this way, pathologists and translational research scientists can investigate and optimize biomarker panels and facilitate academic-industry collaborations. Thus, pooling of existing data, synthesis and analysis of evidence, adequate sample size, scoring systems, cut-offs and validation of biomarkers all play an important role in biomarker development.

  1.      Role of systematic review and meta-analysis in biomarker development

Omics technologies have produced a long list of candidate biomarkers, which then have to pass through a lengthy, tedious and costly screening process that singles out the promising ones. Overexpression in cancerous tissue, specificity, little or no expression in normal tissue and an appropriate assay all play a significant role in identifying a good biomarker (Chaing et al., 2013). Validation by IHC addresses the problem of identifying biomarkers for clinical use, and suitable biomarkers are filtered using their sensitivity and specificity values. Systematic review and meta-analysis were performed to identify IHC diagnostic biomarkers. The biomarker literature is scattered, so there is a need to collect sufficient existing evidence on IHC biomarkers for PDAC. Researchers compare a newly reported biomarker with existing ones to investigate its clinical importance, but results from a single institution cannot be relied upon. This problem is addressed by systematic review and meta-analysis, an approach commonly applied to randomized clinical trials. A systematic review identifies the most relevant research evidence and critically appraises it, both to collect evidence for clinical trials and to identify papers for meta-analysis (Sauerland and Seiler, 2005; Yuan and Hunt, 2009). Most reported studies in PDAC diagnostic biomarker research are cohort or case-control studies. Systematic review and meta-analysis qualify and rank IHC biomarkers and ensure their quality. They also aid in study design, sample size determination, distribution of samples, choice of scoring system and cut-offs, single or panel investigation of biomarkers, and assessment of biomarker heterogeneity.
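The idea of pooling evidence across institutions can be illustrated with a deliberately simple sketch: the 2×2 counts of each study are summed and a pooled sensitivity and specificity are computed from the totals. This naive pooling is an assumption chosen for brevity; published diagnostic meta-analyses generally use random-effects models that account for between-study heterogeneity. The study counts are hypothetical.

```python
def pooled_accuracy(studies):
    """Pool diagnostic accuracy by summing 2x2 counts across studies.

    Each study is a dict with keys 'tp', 'fn', 'fp', 'tn'.
    Returns (pooled_sensitivity, pooled_specificity).
    """
    tp = sum(study["tp"] for study in studies)
    fn = sum(study["fn"] for study in studies)
    fp = sum(study["fp"] for study in studies)
    tn = sum(study["tn"] for study in studies)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts from three single-institution studies of one biomarker.
studies = [
    {"tp": 45, "fn": 5, "fp": 10, "tn": 40},
    {"tp": 18, "fn": 12, "fp": 3, "tn": 27},
    {"tp": 27, "fn": 3, "fp": 7, "tn": 33},
]
pooled_sensitivity, pooled_specificity = pooled_accuracy(studies)
```

In this toy data the second study reports a much lower sensitivity (18 of 30) than the other two, which is the kind of between-study variation a systematic review has to examine before any pooled figure is trusted.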

  1.      Project aim and objectives

The aim of the present work is the identification of biomarkers and drug targets in human cancers.

In order to achieve the aim, the following objectives have been set. 


Objective 1: Database development:

  1. Identification of colorectal cancer-related genes and creation of a web-based composite retrieval tool.
  2. Identification of colorectal cancer-related miRNAs and creation of a web-based composite retrieval tool.
  3. Creation of a database of cancer biomarkers and drug targets.

Objective 2: Identification of cancer responsive genes and pathways in different stages of colorectal cancer and cervical cancer.

Objective 3: Stage-wise comparison of different cancer types.

Objective 4: Identification of cancer gene signatures using machine learning approaches.

