Endometrial Cancer Prediction

Endometrial cancer Prediction Dataset 💻

Welcome to the Endometrial cancer Prediction Dataset! 🎉 This dataset contains information about endometrial cancer, also known as endometrial carcinoma, which is a type of cancer that starts in the cells of the inner lining of the uterus (the endometrium). Endometrial carcinomas can be categorized into different types based on cellular characteristics observed under a microscope.

Cause of Uterine Cancer

The exact cause of uterine cancer is not fully understood. However, it is believed that mutations occur in the cells of the uterus, causing them to grow and multiply uncontrollably, leading to the formation of tumors.

Attribute Description 📊

1. Patient ID: Unique identifier for each patient.

2. Sample ID: Unique identifier for each sample.

3. Cancer Type Detailed: Detailed description of the cancer type.

4. Overall Survival Status: Patient's overall survival status ("1:DECEASED" indicates deceased; "0:LIVING" indicates living).

5. Disease Free Status: Patient's disease-free status (e.g., "0:DiseaseFree" indicates disease-free).

6. Disease-specific Survival Status: Patient's disease-specific survival status.

7. Mutation Count: Number of mutations detected.

8. Fraction Genome Altered: Fraction of the genome that is altered.

9. Diagnosis Age: Age of the patient at diagnosis.

10. MSI MANTIS Score: MSI (Microsatellite Instability) score calculated using the MANTIS algorithm.

11. MSIsensor Score: MSI (Microsatellite Instability) score calculated using MSIsensor.

12. Race Category: Patient's race or ethnicity category.

13. Subtype: Subtype of the cancer.

14. Tumor Type: Type of tumor (e.g., "Serous Endometrial Adenocarcinoma").
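The status columns pack a numeric code and a label into one string. Below is a minimal sketch of splitting them apart, assuming every non-missing value follows the `code:LABEL` pattern shown in the list above. The helper name `parse_status` is hypothetical and not part of the dataset or any library.

```python
def parse_status(value):
    """Split a status string such as "1:DECEASED" into (code, label).

    Returns None for empty or malformed values; this handling of
    missing entries is an assumption, not something the dataset defines.
    """
    if not value or ":" not in value:
        return None
    code, _, label = value.strip().partition(":")
    return int(code), label.strip()


# Example strings in the format described in the attribute list.
print(parse_status("1:DECEASED"))     # (1, 'DECEASED')
print(parse_status("0:DiseaseFree"))  # (0, 'DiseaseFree')
```

Keeping the integer code separate makes the column usable directly as a binary target for a classifier.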

Overall Survival Status Distribution:

The overall survival status is imbalanced: a higher proportion of patients are labeled "1:DECEASED" (deceased) than "0:LIVING" (living).
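The imbalance can be checked by computing the share of each label. The sketch below uses only the standard library. The function name `class_proportions` is hypothetical, and the labels in the example are toy values, not rows from the dataset.

```python
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy labels for illustration only; they are not taken from the dataset.
labels = ["1:DECEASED", "0:LIVING", "1:DECEASED", "1:DECEASED"]
print(class_proportions(labels))  # {'1:DECEASED': 0.75, '0:LIVING': 0.25}
```

If the proportions turn out to be far from even, stratified splitting or class weighting may be worth considering before training a model.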

Disease Free Status Distribution:

The distribution of disease-free status indicates the proportion of patients who are disease-free at a certain point in time. Further analysis may reveal trends in disease recurrence or remission.

Mutation Count and Fraction Genome Altered:

The distribution of mutation count and fraction genome altered suggests variability among patients in terms of genetic alterations. Some patients may have a higher number of mutations or a larger fraction of the genome altered, which could impact disease progression and treatment response.
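A quick way to see this variability is a small numeric summary. The sketch below reports the median rather than the mean, since a few very large values would otherwise dominate. The function name `summarize` is hypothetical and the numbers are toy values, not dataset rows.

```python
import statistics


def summarize(values):
    """Return count, min, median and max, ignoring missing (None) entries."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return {
        "n": len(present),
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
    }


# Toy numbers for illustration only; they are not taken from the dataset.
mutation_counts = [42, 55, None, 61, 9800]
print(summarize(mutation_counts))
# {'n': 4, 'min': 42, 'median': 58.0, 'max': 9800}
```

The same helper applies to Fraction Genome Altered, whose values lie between 0 and 1.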

MSI MANTIS Score

The MANTIS score is a predictive score for a patient's MSI (Microsatellite Instability) status. A higher MANTIS score indicates a higher likelihood of MSI-H (high microsatellite instability) status.
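Turning the score into a yes/no call requires a cutoff. The sketch below assumes a threshold of 0.4, which is a commonly cited default for MANTIS but is not stated anywhere in this README, so treat it as an assumption and adjust it to the convention of your data source. The function name `is_msi_high` is hypothetical.

```python
def is_msi_high(mantis_score, threshold=0.4):
    """Return True when the MANTIS score exceeds the cutoff.

    threshold=0.4 is an assumed default, not a value given in this README.
    """
    return mantis_score > threshold


print(is_msi_high(0.9))  # True
print(is_msi_high(0.1))  # False
```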

MSIsensor Score

MSIsensor is a tool used for microsatellite instability detection using paired tumor-normal sequence data. The resulting MSIsensor score is a value between 0 and 100, indicating the percentage of mutated microsatellite loci.
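The percentage definition above can be written out as simple arithmetic. This sketch only illustrates the definition: the real tool derives the counts from sequencing reads, and the function and parameter names here are hypothetical.

```python
def msisensor_score(unstable_loci, total_loci):
    """Percentage (0 to 100) of evaluated microsatellite loci that are unstable."""
    if total_loci <= 0:
        raise ValueError("total_loci must be positive")
    return 100.0 * unstable_loci / total_loci


# Made-up counts for illustration only.
print(msisensor_score(12, 400))  # 3.0
```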

Diagnosis Age Distribution:

The distribution of diagnosis age shows the age range at which patients are diagnosed with cancer. Understanding the age distribution can provide insights into the demographics of the patient population and potential age-related factors influencing cancer development.
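One simple way to inspect the age distribution is to count patients per decade. The sketch below assumes ages are numeric with missing values represented as `None`. The function name `age_histogram` is hypothetical and the ages are toy values, not dataset rows.

```python
from collections import Counter


def age_histogram(ages, width=10):
    """Count ages per bin; each key is the lower edge of a bin of the given width."""
    counts = Counter(
        (int(age) // width) * width for age in ages if age is not None
    )
    return dict(sorted(counts.items()))


# Toy ages for illustration only; they are not taken from the dataset.
ages = [45, 52, 58, 61, 67, 73]
print(age_histogram(ages))  # {40: 1, 50: 2, 60: 2, 70: 1}
```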

Race Category and Ethnicity:

Exploring the race category and ethnicity of patients can shed light on disparities in cancer incidence, treatment outcomes, and access to healthcare services among different demographic groups.

Cancer Subtypes and Tumor Types:

The detailed descriptions of cancer subtypes and tumor types provide valuable information about the specific characteristics of the cancer cases included in the dataset. Analyzing these attributes can help identify patterns and associations with clinical outcomes.
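To look for associations between subtype and outcome, one option is the share of deceased patients per group. The sketch below assumes rows are dictionaries keyed by the attribute names listed earlier (the actual column headers in the file may differ) and that a deceased status starts with `1:`. The function name and the group labels "A" and "B" are hypothetical placeholders.

```python
from collections import defaultdict


def deceased_rate_by_group(rows, group_key="Subtype",
                           status_key="Overall Survival Status"):
    """Return the fraction of deceased patients in each group.

    Rows with a missing group or status are skipped.
    """
    totals = defaultdict(int)
    deceased = defaultdict(int)
    for row in rows:
        group, status = row.get(group_key), row.get(status_key)
        if not group or not status:
            continue
        totals[group] += 1
        if status.startswith("1:"):
            deceased[group] += 1
    return {group: deceased[group] / totals[group] for group in totals}


# Toy rows for illustration only; "A" and "B" are placeholder subtypes.
rows = [
    {"Subtype": "A", "Overall Survival Status": "1:DECEASED"},
    {"Subtype": "A", "Overall Survival Status": "0:LIVING"},
    {"Subtype": "B", "Overall Survival Status": "0:LIVING"},
]
print(deceased_rate_by_group(rows))  # {'A': 0.5, 'B': 0.0}
```

Passing `group_key="Tumor Type"` gives the same breakdown by tumor type.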

Usage

  • Researchers and healthcare professionals can utilize this dataset for studying endometrial cancer and its various types.
  • The dataset should be used only for educational purposes.

Conclusion:

The exploratory data analysis (EDA) of the Endometrial cancer Prediction Dataset gave insight into the demographic and clinical characteristics of the patient population. Overall survival status, disease-free status, mutation count, MSI scores, diagnosis age, race category, and tumor subtype all showed patterns and associations relevant to cancer prognosis and treatment. The findings underscore how heterogeneous the disease is and how that heterogeneity affects patient outcomes. Further research building on these observations could support more personalized approaches to cancer care and better treatment strategies.