create_case_lists
genie.create_case_lists
Creates case lists per cancer type
Attributes
CASE_LIST_TEXT_TEMPLATE = 'cancer_study_identifier: {study_id}\nstable_id: {stable_id}\ncase_list_name: {case_list_name}\ncase_list_description: {case_list_description}\ncase_list_ids: {case_list_ids}'
module-attribute
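As a minimal sketch of how this template might be filled in, the snippet below copies the template string and formats it with `str.format`. The study id, stable id, names, and sample ids are made-up illustrative values. Joining the sample ids with tabs is an assumption based on the cBioPortal case list format, not something this page states.

```python
# Template copied from the module attribute documented above.
CASE_LIST_TEXT_TEMPLATE = (
    "cancer_study_identifier: {study_id}\n"
    "stable_id: {stable_id}\n"
    "case_list_name: {case_list_name}\n"
    "case_list_description: {case_list_description}\n"
    "case_list_ids: {case_list_ids}"
)

# Illustrative values only; none of these come from a real GENIE release.
sample_ids = ["GENIE-TEST-0001", "GENIE-TEST-0002"]
case_list_text = CASE_LIST_TEXT_TEMPLATE.format(
    study_id="genie_example",
    stable_id="genie_example_all",
    case_list_name="All samples",
    case_list_description="All samples",
    # Assumption: sample ids are tab-separated, as cBioPortal case lists expect.
    case_list_ids="\t".join(sample_ids),
)
print(case_list_text)
```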
Functions
create_case_lists_map(clinical_file_name)
Creates the case list dictionary
| PARAMETER | DESCRIPTION |
|---|---|
| clinical_file_name | clinical file path |

| RETURNS | DESCRIPTION |
|---|---|
| dict | key = cancer_type, value = list of sample ids |
| dict | key = seq_assay_id, value = list of sample ids |
| list | Clinical samples |
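The three return values can be illustrated with a small sketch. `build_case_list_maps` is a hypothetical stand-in, not the module's implementation, and it assumes a tab-delimited clinical file with a single header row containing `SAMPLE_ID`, `CANCER_TYPE`, and `SEQ_ASSAY_ID` columns. Those column names are assumptions made for illustration.

```python
import csv
import os
import tempfile
from collections import defaultdict


def build_case_list_maps(clinical_file_name):
    """Hypothetical stand-in for create_case_lists_map.

    Assumes a tab-delimited file whose header row contains the
    SAMPLE_ID, CANCER_TYPE and SEQ_ASSAY_ID columns.
    """
    cancer_type_map = defaultdict(list)
    seq_assay_map = defaultdict(list)
    clinical_samples = []
    with open(clinical_file_name, newline="") as clinical_file:
        reader = csv.DictReader(clinical_file, delimiter="\t")
        for row in reader:
            # Group each sample id under its cancer type and its assay id.
            cancer_type_map[row["CANCER_TYPE"]].append(row["SAMPLE_ID"])
            seq_assay_map[row["SEQ_ASSAY_ID"]].append(row["SAMPLE_ID"])
            clinical_samples.append(row["SAMPLE_ID"])
    return dict(cancer_type_map), dict(seq_assay_map), clinical_samples


# Tiny made-up clinical file to exercise the sketch.
rows = (
    "SAMPLE_ID\tCANCER_TYPE\tSEQ_ASSAY_ID\n"
    "S1\tBreast Cancer\tASSAY-A\n"
    "S2\tBreast Cancer\tASSAY-B\n"
    "S3\tGlioma\tASSAY-A\n"
)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "clinical.txt")
    with open(path, "w") as handle:
        handle.write(rows)
    cancer_types, seq_assays, samples = build_case_list_maps(path)
```

Here `cancer_types` maps each cancer type to its sample ids, `seq_assays` does the same per assay id, and `samples` lists every sample id in file order.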
_write_single_oncotree_case_list(cancer_type, ids, study_id, output_directory)
Writes one oncotree case list. Python versions below 3.6 do not preserve dictionary insertion order, which causes tests to fail.
| PARAMETER | DESCRIPTION |
|---|---|
| cancer_type | Oncotree code cancer type |
| ids | GENIE sample ids |
| study_id | cBioPortal study id |
| output_directory | case list output directory |

| RETURNS | DESCRIPTION |
|---|---|
| | case list file path |
write_case_list_files(clinical_file_map, output_directory, study_id)
Writes one case list file per cancer type to the case_lists directory
| PARAMETER | DESCRIPTION |
|---|---|
| clinical_file_map | cancer type to sample id mapping from create_case_lists_map |
| output_directory | Directory to write case lists |
| study_id | cBioPortal study id |

| RETURNS | DESCRIPTION |
|---|---|
| list | oncotree code case list files |
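A rough sketch of writing one file per cancer type is shown below. `write_cancer_type_case_lists` is a hypothetical stand-in, and the file naming, stable id scheme, and name/description wording are all assumptions chosen for illustration. Only the template fields come from this page.

```python
import os
import tempfile

# Template copied from the module attribute documented on this page.
CASE_LIST_TEXT_TEMPLATE = (
    "cancer_study_identifier: {study_id}\n"
    "stable_id: {stable_id}\n"
    "case_list_name: {case_list_name}\n"
    "case_list_description: {case_list_description}\n"
    "case_list_ids: {case_list_ids}"
)


def write_cancer_type_case_lists(cancer_type_map, output_directory, study_id):
    """Hypothetical stand-in for write_case_list_files."""
    case_list_paths = []
    for cancer_type, ids in cancer_type_map.items():
        # Assumption: derive a file-safe token from the cancer type label.
        token = "".join(ch if ch.isalnum() else "_" for ch in cancer_type)
        text = CASE_LIST_TEXT_TEMPLATE.format(
            study_id=study_id,
            stable_id=f"{study_id}_{token}",
            case_list_name=f"Tumor Type: {cancer_type}",
            case_list_description=f"All tumors with cancer type {cancer_type}",
            case_list_ids="\t".join(ids),
        )
        path = os.path.join(output_directory, f"case_list_{token}.txt")
        with open(path, "w") as handle:
            handle.write(text + "\n")
        case_list_paths.append(path)
    return case_list_paths


with tempfile.TemporaryDirectory() as out_dir:
    paths = write_cancer_type_case_lists(
        {"Breast Cancer": ["S1", "S2"], "Glioma": ["S3"]}, out_dir, "genie_example"
    )
    file_names = [os.path.basename(p) for p in paths]
    with open(paths[0]) as handle:
        first_file_text = handle.read()
```

The returned paths follow the iteration order of the input dictionary, so on Python 3.7 and later the output order matches the order in which cancer types were inserted.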
create_sequenced_samples(seq_assay_map, assay_info_file_name)
Gets samples sequenced
| PARAMETER | DESCRIPTION |
|---|---|
| seq_assay_map | dictionary containing lists of samples per seq_assay_id |
| assay_info_file_name | Assay information file name |

| RETURNS | DESCRIPTION |
|---|---|
| | lists of cna and sv samples |
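One way such a selection could work is sketched below. This is not the module's implementation: `select_cna_and_sv_samples` is a hypothetical helper, and the `SEQ_ASSAY_ID` and `alteration_types` column names, the semicolon separator, and the `cna` and `structural_variants` labels are all assumptions about the assay information file.

```python
import csv
import io


def select_cna_and_sv_samples(seq_assay_map, assay_info_rows):
    """Hypothetical stand-in for create_sequenced_samples.

    Assumes each assay row has a SEQ_ASSAY_ID and a semicolon-separated
    alteration_types field; both column names are illustrative.
    """
    cna_samples = []
    sv_samples = []
    for row in assay_info_rows:
        alteration_types = {v.strip() for v in row["alteration_types"].split(";")}
        samples = seq_assay_map.get(row["SEQ_ASSAY_ID"], [])
        # A sample counts as CNA or SV profiled if its assay reports that type.
        if "cna" in alteration_types:
            cna_samples.extend(samples)
        if "structural_variants" in alteration_types:
            sv_samples.extend(samples)
    return cna_samples, sv_samples


# Made-up assay information, parsed from an in-memory tab-delimited string.
assay_info_text = (
    "SEQ_ASSAY_ID\talteration_types\n"
    "ASSAY-A\tsnv;cna\n"
    "ASSAY-B\tsnv;structural_variants;cna\n"
    "ASSAY-C\tsnv\n"
)
assay_rows = list(csv.DictReader(io.StringIO(assay_info_text), delimiter="\t"))
seq_assay_map = {"ASSAY-A": ["S1", "S3"], "ASSAY-B": ["S2"], "ASSAY-C": ["S4"]}
cna_samples, sv_samples = select_cna_and_sv_samples(seq_assay_map, assay_rows)
```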
write_case_list_sequenced(clinical_samples, output_directory, study_id)
Writes the GENIE sequenced and all-samples case lists. Since all samples are sequenced, the _all and _sequenced case lists contain the same samples.
| PARAMETER | DESCRIPTION |
|---|---|
| clinical_samples | List of clinical samples |
| output_directory | Directory to write case lists |
| study_id | cBioPortal study id |

| RETURNS | DESCRIPTION |
|---|---|
| list | case list sequenced and all |
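To illustrate why the two case lists end up with the same samples, the sketch below renders both texts from one sample list. `render_sequenced_and_all` is a hypothetical helper, and the `_sequenced` and `_all` stable id suffixes and the display labels are assumptions. It only builds the text and leaves file writing aside.

```python
# Template copied from the module attribute documented on this page.
CASE_LIST_TEXT_TEMPLATE = (
    "cancer_study_identifier: {study_id}\n"
    "stable_id: {stable_id}\n"
    "case_list_name: {case_list_name}\n"
    "case_list_description: {case_list_description}\n"
    "case_list_ids: {case_list_ids}"
)


def render_sequenced_and_all(clinical_samples, study_id):
    """Hypothetical helper: render the _sequenced and _all case list texts."""
    case_list_ids = "\t".join(clinical_samples)
    texts = {}
    for suffix, label in (("sequenced", "Sequenced Tumors"), ("all", "All samples")):
        # Both case lists share the same sample ids; only the metadata differs.
        texts[suffix] = CASE_LIST_TEXT_TEMPLATE.format(
            study_id=study_id,
            stable_id=f"{study_id}_{suffix}",
            case_list_name=label,
            case_list_description=label,
            case_list_ids=case_list_ids,
        )
    return texts


texts = render_sequenced_and_all(["S1", "S2", "S3"], "genie_example")
```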
write_case_list_cna(cna_samples, output_directory, study_id)
Writes the cna sequenced samples
| PARAMETER | DESCRIPTION |
|---|---|
| cna_samples | List of cna samples |
| output_directory | Directory to write case lists |
| study_id | cBioPortal study id |

| RETURNS | DESCRIPTION |
|---|---|
| | cna caselist path |
write_case_list_sv(samples, output_directory, study_id)
Writes the structural variant (sv) sequenced samples
| PARAMETER | DESCRIPTION |
|---|---|
| samples | List of sv samples |
| output_directory | Directory to write case lists. TYPE: str |
| study_id | cBioPortal study id |

| RETURNS | DESCRIPTION |
|---|---|
| str | sv caselist path |
write_case_list_cnaseq(cna_samples, output_directory, study_id)
Writes both cna and mutation samples (just the _cna file for now)
| PARAMETER | DESCRIPTION |
|---|---|
| cna_samples | List of cna samples |
| output_directory | Directory to write case lists |
| study_id | cBioPortal study id |

| RETURNS | DESCRIPTION |
|---|---|
| | cnaseq path |
main(clinical_file_name, assay_info_file_name, output_directory, study_id)
Reads the clinical file and the assay information file and processes them to produce the case list files
| PARAMETER | DESCRIPTION |
|---|---|
| clinical_file_name | Clinical file path |
| assay_info_file_name | Assay information file name |
| output_directory | Output directory of case list files |
| study_id | cBioPortal study id |