Overview
Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 94 |
| Missing cells | 68 |
| Missing cells (%) | 6.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 69.6 KiB |
| Average record size in memory | 757.7 B |
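The missing-cell percentage follows from the other figures in this table: 68 missing cells out of 94 × 11 cells in total. A minimal sketch of that arithmetic, with variable names chosen here purely for illustration:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 94
missing_cells = 68

# Every observation has one cell per variable.
total_cells = n_variables * n_observations  # 1034

missing_pct = 100 * missing_cells / total_cells
print(f"{missing_pct:.1f}%")  # 6.6%
```

All 68 missing cells come from the EFO column, which is why the per-column missing rate for EFO (68 of 94 rows) is far higher than this dataset-wide figure.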
Variable types
| Text | 1 |
|---|---|
| Categorical | 10 |
Alerts
| Alert | Type |
|---|---|
| SITE_SUBTYPE_3 has constant value "NS" | Constant |
| PRIMARY_HISTOLOGY has constant value "gastrointestinal_stromal_tumour" | Constant |
| HISTOLOGY_SUBTYPE_3 has constant value "NS" | Constant |
| EFO is highly overall correlated with HISTOLOGY_SUBTYPE_1 and 2 other fields | High correlation |
| HISTOLOGY_SUBTYPE_1 is highly overall correlated with EFO and 1 other field | High correlation |
| HISTOLOGY_SUBTYPE_2 is highly overall correlated with EFO | High correlation |
| NCI_CODE is highly overall correlated with EFO and 3 other fields | High correlation |
| PRIMARY_SITE is highly overall correlated with NCI_CODE | High correlation |
| SITE_SUBTYPE_1 is highly overall correlated with NCI_CODE | High correlation |
| PRIMARY_SITE is highly imbalanced (91.5%) | Imbalance |
| SITE_SUBTYPE_1 is highly imbalanced (91.5%) | Imbalance |
| HISTOLOGY_SUBTYPE_2 is highly imbalanced (71.8%) | Imbalance |
| EFO is highly imbalanced (60.9%) | Imbalance |
| EFO has 68 (72.3%) missing values | Missing |
| COSMIC_PHENOTYPE_ID has unique values | Unique |
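The imbalance percentages reported above are consistent with one minus the normalized Shannon entropy of each column's category counts. The sketch below rests on that assumption; it is not code taken from ydata-profiling, and the function name `imbalance_score` is hypothetical. The counts come from the Common Values tables later in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts.

    H is the Shannon entropy (in bits) of the observed distribution and
    k is the number of categories. 0 means perfectly balanced classes;
    values near 1 mean a single class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no imbalance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


primary_site = [93, 1]                                  # soft_tissue, large_intestine
histology_subtype_2 = [82, 3, 2, 1, 1, 1, 1, 1, 1, 1]   # NS plus nine rarer values
efo_non_missing = [24, 2]                               # missing rows excluded

print(f"{imbalance_score(primary_site):.1%}")          # 91.5%
print(f"{imbalance_score(histology_subtype_2):.1%}")   # 71.8%
print(f"{imbalance_score(efo_non_missing):.1%}")       # 60.9%
```

Under this reading, the EFO score is computed over the 26 non-missing values only, which matches the reported 60.9%.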
Reproduction
| Analysis started | 2025-07-15 00:43:24.358706 |
|---|---|
| Analysis finished | 2025-07-15 00:43:25.005719 |
| Duration | 0.65 seconds |
| Software version | ydata-profiling v4.16.1 |
Variables
COSMIC_PHENOTYPE_ID
Text
Unique 
| Distinct | 94 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 12 |
| Mean length | 12.202128 |
| Min length | 12 |
Unique
| Unique | 94 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | COSO318721653 |
|---|---|
| 2nd row | COSO97595385 |
| 3rd row | COSO287721665 |
| 4th row | COSO36605385 |
| 5th row | COSO60695381 |
| Value | Count | Frequency (%) |
| coso318721653 | 1 | 1.1% |
| coso97595385 | 1 | 1.1% |
| coso287721665 | 1 | 1.1% |
| coso36605385 | 1 | 1.1% |
| coso60695381 | 1 | 1.1% |
| coso34435546 | 1 | 1.1% |
| coso35205546 | 1 | 1.1% |
| coso31875763 | 1 | 1.1% |
| coso28815381 | 1 | 1.1% |
| coso36845546 | 1 | 1.1% |
| Other values (84) | 84 | 89.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| O | 188 | 16.4% |
| 5 | 147 | 12.8% |
| 3 | 141 | 12.3% |
| 8 | 96 | 8.4% |
| C | 94 | 8.2% |
| S | 94 | 8.2% |
| 6 | 93 | 8.1% |
| 7 | 89 | 7.8% |
| 1 | 76 | 6.6% |
| 4 | 49 | 4.3% |
| Other values (3) | 80 | 7.0% |
PRIMARY_SITE
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.0 KiB |
| soft_tissue | 93 |
|---|---|
| large_intestine | 1 |
Length
| Max length | 15 |
|---|---|
| Median length | 11 |
| Mean length | 11.042553 |
| Min length | 11 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 1.1% |
Sample
| 1st row | soft_tissue |
|---|---|
| 2nd row | soft_tissue |
| 3rd row | soft_tissue |
| 4th row | soft_tissue |
| 5th row | soft_tissue |
Common Values
| Value | Count | Frequency (%) |
| soft_tissue | 93 | 98.9% |
| large_intestine | 1 | 1.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| soft_tissue | 93 | 98.9% |
| large_intestine | 1 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 280 | 27.0% |
| t | 188 | 18.1% |
| e | 96 | 9.2% |
| i | 95 | 9.2% |
| _ | 94 | 9.1% |
| o | 93 | 9.0% |
| f | 93 | 9.0% |
| u | 93 | 9.0% |
| n | 2 | 0.2% |
| l | 1 | 0.1% |
| Other values (3) | 3 | 0.3% |
SITE_SUBTYPE_1
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 KiB |
| fibrous_tissue_and_uncertain_origin | 93 |
|---|---|
| rectum | 1 |
Length
| Max length | 35 |
|---|---|
| Median length | 35 |
| Mean length | 34.691489 |
| Min length | 6 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 1.1% |
Sample
| 1st row | fibrous_tissue_and_uncertain_origin |
|---|---|
| 2nd row | fibrous_tissue_and_uncertain_origin |
| 3rd row | fibrous_tissue_and_uncertain_origin |
| 4th row | fibrous_tissue_and_uncertain_origin |
| 5th row | fibrous_tissue_and_uncertain_origin |
Common Values
| Value | Count | Frequency (%) |
| fibrous_tissue_and_uncertain_origin | 93 | 98.9% |
| rectum | 1 | 1.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| fibrous_tissue_and_uncertain_origin | 93 | 98.9% |
| rectum | 1 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 465 | 14.3% |
| _ | 372 | 11.4% |
| n | 372 | 11.4% |
| u | 280 | 8.6% |
| r | 280 | 8.6% |
| s | 279 | 8.6% |
| t | 187 | 5.7% |
| e | 187 | 5.7% |
| o | 186 | 5.7% |
| a | 186 | 5.7% |
| Other values (6) | 467 | 14.3% |
SITE_SUBTYPE_2
Categorical
| Distinct | 28 |
|---|---|
| Distinct (%) | 29.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
| small_intestine | 10 |
|---|---|
| stomach | 9 |
| large_intestine | 8 |
| retroperitoneum | 6 |
| peritoneum | 5 |
| Other values (23) | 56 |
Length
| Max length | 43 |
|---|---|
| Median length | 20 |
| Mean length | 12.702128 |
| Min length | 2 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | 9.6% |
Sample
| 1st row | retroperitoneum |
|---|---|
| 2nd row | extra-gastrointestinal_site |
| 3rd row | large_intestine |
| 4th row | gastrointestinal_tract_(site_indeterminate) |
| 5th row | mesocolon |
Common Values
| Value | Count | Frequency (%) |
| small_intestine | 10 | 10.6% |
| stomach | 9 | 9.6% |
| large_intestine | 8 | 8.5% |
| retroperitoneum | 6 | 6.4% |
| peritoneum | 5 | 5.3% |
| gastrointestinal_tract_(site_indeterminate) | 5 | 5.3% |
| NS | 5 | 5.3% |
| extra-gastrointestinal_site | 4 | 4.3% |
| oesophagus | 4 | 4.3% |
| omentum | 4 | 4.3% |
| Other values (18) | 34 | 36.2% |
Length
| Value | Count | Frequency (%) |
| small_intestine | 10 | 10.6% |
| stomach | 9 | 9.6% |
| large_intestine | 8 | 8.5% |
| retroperitoneum | 6 | 6.4% |
| peritoneum | 5 | 5.3% |
| gastrointestinal_tract_(site_indeterminate | 5 | 5.3% |
| ns | 5 | 5.3% |
| extra-gastrointestinal_site | 4 | 4.3% |
| oesophagus | 4 | 4.3% |
| omentum | 4 | 4.3% |
| Other values (18) | 34 | 36.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 153 | 12.8% |
| t | 147 | 12.3% |
| i | 109 | 9.1% |
| a | 101 | 8.5% |
| n | 97 | 8.1% |
| s | 88 | 7.4% |
| r | 71 | 5.9% |
| o | 63 | 5.3% |
| m | 59 | 4.9% |
| l | 59 | 4.9% |
| Other values (19) | 247 | 20.7% |
SITE_SUBTYPE_3
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.2 KiB |
| NS | 94 |
|---|---|
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | NS |
|---|---|
| 2nd row | NS |
| 3rd row | NS |
| 4th row | NS |
| 5th row | NS |
Common Values
| Value | Count | Frequency (%) |
| NS | 94 | 100.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| ns | 94 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 94 | 50.0% |
| S | 94 | 50.0% |
PRIMARY_HISTOLOGY
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.8 KiB |
| gastrointestinal_stromal_tumour | 94 |
|---|---|
Length
| Max length | 31 |
|---|---|
| Median length | 31 |
| Mean length | 31 |
| Min length | 31 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gastrointestinal_stromal_tumour |
|---|---|
| 2nd row | gastrointestinal_stromal_tumour |
| 3rd row | gastrointestinal_stromal_tumour |
| 4th row | gastrointestinal_stromal_tumour |
| 5th row | gastrointestinal_stromal_tumour |
Common Values
| Value | Count | Frequency (%) |
| gastrointestinal_stromal_tumour | 94 | 100.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| gastrointestinal_stromal_tumour | 94 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 470 | 16.1% |
| s | 282 | 9.7% |
| a | 282 | 9.7% |
| r | 282 | 9.7% |
| o | 282 | 9.7% |
| m | 188 | 6.5% |
| i | 188 | 6.5% |
| n | 188 | 6.5% |
| l | 188 | 6.5% |
| u | 188 | 6.5% |
| Other values (3) | 376 | 12.9% |
HISTOLOGY_SUBTYPE_1
Categorical
High correlation 
| Distinct | 9 |
|---|---|
| Distinct (%) | 9.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.1 KiB |
| NS | 24 |
|---|---|
| spindle | 21 |
| spindle_and_epithelioid | 17 |
| epithelioid | 14 |
| dedifferentiated | 8 |
| Other values (4) | 10 |
Length
| Max length | 46 |
|---|---|
| Median length | 23 |
| Mean length | 11.968085 |
| Min length | 2 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | 2.1% |
Sample
| 1st row | dedifferentiated |
|---|---|
| 2nd row | spindle |
| 3rd row | dedifferentiated |
| 4th row | spindle |
| 5th row | NS |
Common Values
| Value | Count | Frequency (%) |
| NS | 24 | 25.5% |
| spindle | 21 | 22.3% |
| spindle_and_epithelioid | 17 | 18.1% |
| epithelioid | 14 | 14.9% |
| dedifferentiated | 8 | 8.5% |
| transdifferentiated | 6 | 6.4% |
| diffuse_interstitial_cell_of Cajal_hyperplasia | 2 | 2.1% |
| spindle_and_epithelial_and_rhabdoid | 1 | 1.1% |
| unusual_sub-type | 1 | 1.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| ns | 24 | 25.0% |
| spindle | 21 | 21.9% |
| spindle_and_epithelioid | 17 | 17.7% |
| epithelioid | 14 | 14.6% |
| dedifferentiated | 8 | 8.3% |
| transdifferentiated | 6 | 6.2% |
| diffuse_interstitial_cell_of | 2 | 2.1% |
| cajal_hyperplasia | 2 | 2.1% |
| spindle_and_epithelial_and_rhabdoid | 1 | 1.0% |
| unusual_sub-type | 1 | 1.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 173 | 15.4% |
| e | 162 | 14.4% |
| d | 129 | 11.5% |
| l | 83 | 7.4% |
| n | 81 | 7.2% |
| p | 76 | 6.8% |
| t | 73 | 6.5% |
| s | 53 | 4.7% |
| a | 52 | 4.6% |
| _ | 47 | 4.2% |
| Other values (14) | 196 | 17.4% |
HISTOLOGY_SUBTYPE_2
Categorical
High correlation  Imbalance 
| Distinct | 10 |
|---|---|
| Distinct (%) | 10.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.5 KiB |
| NS | 82 |
|---|---|
| anaplastic_and_spindle | 3 |
| rhabdomyoblastic_and_spindle | 2 |
| anaplastic_and_epithelioid | 1 |
| rhabdomyoblastic_and_epithelioid | 1 |
| Other values (5) | 5 |
Length
| Max length | 51 |
|---|---|
| Median length | 2 |
| Mean length | 5.5 |
| Min length | 2 |
Unique
| Unique | 7 |
|---|---|
| Unique (%) | 7.4% |
Sample
| 1st row | NS |
|---|---|
| 2nd row | NS |
| 3rd row | anaplastic_and_epithelioid |
| 4th row | NS |
| 5th row | NS |
Common Values
| Value | Count | Frequency (%) |
| NS | 82 | 87.2% |
| anaplastic_and_spindle | 3 | 3.2% |
| rhabdomyoblastic_and_spindle | 2 | 2.1% |
| anaplastic_and_epithelioid | 1 | 1.1% |
| rhabdomyoblastic_and_epithelioid | 1 | 1.1% |
| plexiform | 1 | 1.1% |
| anaplastic_and_spindle_and_epithelioid | 1 | 1.1% |
| rhabdomyoblastic_and_anaplastic | 1 | 1.1% |
| rhabdomyoblastic_and_epithelioid_and_spindle | 1 | 1.1% |
| rhabdomyoblastic_and_chondrosarcomatous_and_spindle | 1 | 1.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| ns | 82 | 87.2% |
| anaplastic_and_spindle | 3 | 3.2% |
| rhabdomyoblastic_and_spindle | 2 | 2.1% |
| anaplastic_and_epithelioid | 1 | 1.1% |
| rhabdomyoblastic_and_epithelioid | 1 | 1.1% |
| plexiform | 1 | 1.1% |
| anaplastic_and_spindle_and_epithelioid | 1 | 1.1% |
| rhabdomyoblastic_and_anaplastic | 1 | 1.1% |
| rhabdomyoblastic_and_epithelioid_and_spindle | 1 | 1.1% |
| rhabdomyoblastic_and_chondrosarcomatous_and_spindle | 1 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 82 | 15.9% |
| S | 82 | 15.9% |
| a | 46 | 8.9% |
| i | 33 | 6.4% |
| d | 33 | 6.4% |
| n | 29 | 5.6% |
| _ | 28 | 5.4% |
| l | 25 | 4.8% |
| s | 22 | 4.3% |
| o | 21 | 4.1% |
| Other values (12) | 116 | 22.4% |
HISTOLOGY_SUBTYPE_3
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.2 KiB |
| NS | 94 |
|---|---|
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | NS |
|---|---|
| 2nd row | NS |
| 3rd row | NS |
| 4th row | NS |
| 5th row | NS |
Common Values
| Value | Count | Frequency (%) |
| NS | 94 | 100.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| ns | 94 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 94 | 50.0% |
| S | 94 | 50.0% |
NCI_CODE
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | 7.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.5 KiB |
| C3868 | 24 |
|---|---|
| C27792 | 20 |
| C27793 | 17 |
| C179932 | 14 |
| C3486 | 14 |
| Other values (2) | 5 |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 5.712766 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | C179932 |
|---|---|
| 2nd row | C27792 |
| 3rd row | C179932 |
| 4th row | C27792 |
| 5th row | C3868 |
Common Values
| Value | Count | Frequency (%) |
| C3868 | 24 | 25.5% |
| C27792 | 20 | 21.3% |
| C27793 | 17 | 18.1% |
| C179932 | 14 | 14.9% |
| C3486 | 14 | 14.9% |
| C5811 | 3 | 3.2% |
| C27735 | 2 | 2.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| c3868 | 24 | 25.5% |
| c27792 | 20 | 21.3% |
| c27793 | 17 | 18.1% |
| c179932 | 14 | 14.9% |
| c3486 | 14 | 14.9% |
| c5811 | 3 | 3.2% |
| c27735 | 2 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 94 | 17.5% |
| 7 | 92 | 17.1% |
| 2 | 73 | 13.6% |
| 3 | 71 | 13.2% |
| 8 | 65 | 12.1% |
| 9 | 65 | 12.1% |
| 6 | 38 | 7.1% |
| 1 | 20 | 3.7% |
| 4 | 14 | 2.6% |
| 5 | 5 | 0.9% |
EFO
Categorical
High correlation  Imbalance  Missing 
| Distinct | 2 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 68 |
| Missing (%) | 72.3% |
| Memory size | 7.5 KiB |
| http://purl.obolibrary.org/obo/MONDO_0011719 | 24 |
|---|---|
| http://www.ebi.ac.uk/efo/EFO_1000192 | 2 |
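The EFO percentages use two different denominators, which is easy to miss. Distinct (%) appears to be taken over the 26 non-missing values, while the Common Values frequencies are taken over all 94 rows. The sketch below checks that reading against the reported figures; the variable names are illustrative and the choice of denominators is inferred from the numbers, not from the tool's documentation.

```python
# Counts copied from the EFO section of this report.
n_rows = 94
n_missing = 68
value_counts = {
    "http://purl.obolibrary.org/obo/MONDO_0011719": 24,
    "http://www.ebi.ac.uk/efo/EFO_1000192": 2,
}

n_present = n_rows - n_missing  # 26 rows actually carry an EFO value

# Distinct (%) relative to non-missing rows.
distinct_pct = 100 * len(value_counts) / n_present
print(f"Distinct: {distinct_pct:.1f}%")  # Distinct: 7.7%

# Common Values frequencies relative to all rows, missing included.
for value, count in value_counts.items():
    print(f"{value}: {100 * count / n_rows:.1f}%")

print(f"Missing: {100 * n_missing / n_rows:.1f}%")  # Missing: 72.3%
```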
Length
| Max length | 44 |
|---|---|
| Median length | 44 |
| Mean length | 43.384615 |
| Min length | 36 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | http://purl.obolibrary.org/obo/MONDO_0011719 |
|---|---|
| 2nd row | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 3rd row | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 4th row | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 5th row | http://purl.obolibrary.org/obo/MONDO_0011719 |
Common Values
| Value | Count | Frequency (%) |
| http://purl.obolibrary.org/obo/MONDO_0011719 | 24 | 25.5% |
| http://www.ebi.ac.uk/efo/EFO_1000192 | 2 | 2.1% |
| (Missing) | 68 | 72.3% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| http://purl.obolibrary.org/obo/mondo_0011719 | 24 | 92.3% |
| http://www.ebi.ac.uk/efo/efo_1000192 | 2 | 7.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 122 | 10.8% |
| / | 104 | 9.2% |
| r | 96 | 8.5% |
| 1 | 76 | 6.7% |
| b | 74 | 6.6% |
| . | 54 | 4.8% |
| 0 | 54 | 4.8% |
| t | 52 | 4.6% |
| p | 50 | 4.4% |
| O | 50 | 4.4% |
| Other values (22) | 396 | 35.1% |
Correlations
| | EFO | HISTOLOGY_SUBTYPE_1 | HISTOLOGY_SUBTYPE_2 | NCI_CODE | PRIMARY_SITE | SITE_SUBTYPE_1 | SITE_SUBTYPE_2 |
|---|---|---|---|---|---|---|---|
| EFO | 1.000 | 0.979 | 1.000 | 0.716 | 0.252 | 0.252 | 0.000 |
| HISTOLOGY_SUBTYPE_1 | 0.979 | 1.000 | 0.346 | 0.900 | 0.000 | 0.000 | 0.000 |
| HISTOLOGY_SUBTYPE_2 | 1.000 | 0.346 | 1.000 | 0.200 | 0.000 | 0.000 | 0.000 |
| NCI_CODE | 0.716 | 0.900 | 0.200 | 1.000 | 0.659 | 0.659 | 0.000 |
| PRIMARY_SITE | 0.252 | 0.000 | 0.000 | 0.659 | 1.000 | 0.486 | 0.000 |
| SITE_SUBTYPE_1 | 0.252 | 0.000 | 0.000 | 0.659 | 0.486 | 1.000 | 0.000 |
| SITE_SUBTYPE_2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
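The matrix is symmetric with values between 0 and 1, which is what one would expect from Cramér's V on pairs of categorical columns. The sketch below is an assumption about how such a value could be computed, not code from ydata-profiling: it uses the bias-corrected form of Cramér's V with Yates' continuity correction on 2×2 tables. The function name and the contingency table are hypothetical; the table assumes the single large_intestine record is also the single rectum record. Under those assumptions it reproduces the 0.486 reported for PRIMARY_SITE against SITE_SUBTYPE_1.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Yates' continuity correction is applied when the table is 2x2.
    Assumes every row and column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    use_yates = r == 2 and k == 2

    # Pearson chi-squared statistic, optionally continuity-corrected.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if use_yates:
                diff -= min(0.5, diff)
            chi2 += diff * diff / expected

    # Bias correction for small samples.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Rows: PRIMARY_SITE (soft_tissue, large_intestine)
# Columns: SITE_SUBTYPE_1 (fibrous_tissue_and_uncertain_origin, rectum)
site_vs_subtype = [
    [93, 0],
    [0, 1],
]
print(round(cramers_v_corrected(site_vs_subtype), 3))  # 0.486
```

The two columns are perfectly associated in this table, yet the score is well below 1. With only one record in the minority class, the continuity and bias corrections pull the estimate down sharply, so a moderate value here should not be read as weak association.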
Missing values
Sample
First rows
| | COSMIC_PHENOTYPE_ID | PRIMARY_SITE | SITE_SUBTYPE_1 | SITE_SUBTYPE_2 | SITE_SUBTYPE_3 | PRIMARY_HISTOLOGY | HISTOLOGY_SUBTYPE_1 | HISTOLOGY_SUBTYPE_2 | HISTOLOGY_SUBTYPE_3 | NCI_CODE | EFO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 170 | COSO318721653 | soft_tissue | fibrous_tissue_and_uncertain_origin | retroperitoneum | NS | gastrointestinal_stromal_tumour | dedifferentiated | NS | NS | C179932 | NaN |
| 317 | COSO97595385 | soft_tissue | fibrous_tissue_and_uncertain_origin | extra-gastrointestinal_site | NS | gastrointestinal_stromal_tumour | spindle | NS | NS | C27792 | NaN |
| 322 | COSO287721665 | soft_tissue | fibrous_tissue_and_uncertain_origin | large_intestine | NS | gastrointestinal_stromal_tumour | dedifferentiated | anaplastic_and_epithelioid | NS | C179932 | NaN |
| 365 | COSO36605385 | soft_tissue | fibrous_tissue_and_uncertain_origin | gastrointestinal_tract_(site_indeterminate) | NS | gastrointestinal_stromal_tumour | spindle | NS | NS | C27792 | NaN |
| 394 | COSO60695381 | soft_tissue | fibrous_tissue_and_uncertain_origin | mesocolon | NS | gastrointestinal_stromal_tumour | NS | NS | NS | C3868 | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 640 | COSO34435546 | soft_tissue | fibrous_tissue_and_uncertain_origin | mediastinum | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
| 709 | COSO35205546 | soft_tissue | fibrous_tissue_and_uncertain_origin | mesentery | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
| 855 | COSO31875763 | soft_tissue | fibrous_tissue_and_uncertain_origin | retroperitoneum | NS | gastrointestinal_stromal_tumour | spindle_and_epithelioid | NS | NS | C27793 | NaN |
| 911 | COSO28815381 | soft_tissue | fibrous_tissue_and_uncertain_origin | small_intestine | NS | gastrointestinal_stromal_tumour | NS | NS | NS | C3868 | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 932 | COSO36845546 | soft_tissue | fibrous_tissue_and_uncertain_origin | pelvic_cavity | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
Last rows
| | COSMIC_PHENOTYPE_ID | PRIMARY_SITE | SITE_SUBTYPE_1 | SITE_SUBTYPE_2 | SITE_SUBTYPE_3 | PRIMARY_HISTOLOGY | HISTOLOGY_SUBTYPE_1 | HISTOLOGY_SUBTYPE_2 | HISTOLOGY_SUBTYPE_3 | NCI_CODE | EFO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 6680 | COSO36075546 | soft_tissue | fibrous_tissue_and_uncertain_origin | stomach | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
| 6690 | COSO36605381 | soft_tissue | fibrous_tissue_and_uncertain_origin | gastrointestinal_tract_(site_indeterminate) | NS | gastrointestinal_stromal_tumour | NS | NS | NS | C3868 | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 6736 | COSO57285385 | soft_tissue | fibrous_tissue_and_uncertain_origin | hip | NS | gastrointestinal_stromal_tumour | spindle | NS | NS | C27792 | NaN |
| 6778 | COSO318721673 | soft_tissue | fibrous_tissue_and_uncertain_origin | retroperitoneum | NS | gastrointestinal_stromal_tumour | transdifferentiated | rhabdomyoblastic_and_chondrosarcomatous_and_spindle | NS | C179932 | NaN |
| 6822 | COSO37735546 | soft_tissue | fibrous_tissue_and_uncertain_origin | abdomen | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
| 6839 | COSO37565381 | soft_tissue | fibrous_tissue_and_uncertain_origin | liver | NS | gastrointestinal_stromal_tumour | NS | NS | NS | C3868 | http://purl.obolibrary.org/obo/MONDO_0011719 |
| 6941 | COSO31355546 | soft_tissue | fibrous_tissue_and_uncertain_origin | NS | NS | gastrointestinal_stromal_tumour | epithelioid | NS | NS | C3486 | NaN |
| 6957 | COSO34395385 | soft_tissue | fibrous_tissue_and_uncertain_origin | abdominal_wall | NS | gastrointestinal_stromal_tumour | spindle | NS | NS | C27792 | NaN |
| 7052 | COSO37735763 | soft_tissue | fibrous_tissue_and_uncertain_origin | abdomen | NS | gastrointestinal_stromal_tumour | spindle_and_epithelioid | NS | NS | C27793 | NaN |
| 7124 | COSO287721654 | soft_tissue | fibrous_tissue_and_uncertain_origin | large_intestine | NS | gastrointestinal_stromal_tumour | dedifferentiated | anaplastic_and_spindle | NS | C179932 | NaN |