genie.extract
This module contains the functions that extract data from Synapse.
Attributes
logger = logging.getLogger(__name__)
module-attribute
stdout_handler = logging.StreamHandler(stream=(sys.stdout))
module-attribute
Functions
get_center_input_files(syn, synid, center, process='main', downloadFile=True)
Walks through each center's input directory to get a list of tuples of center files
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `synid` | Synapse Id of a folder |
| `center` | GENIE center name |
| `process` | Process type; options include "main" and "mutation". Defaults to "main". |
| `downloadFile` | Whether to download the file. Defaults to True. |

| RETURNS | DESCRIPTION |
|---|---|
| `list` | List of Synapse entities |
_map_name_to_filetype(name)
Maps file name to filetype
| PARAMETER | DESCRIPTION |
|---|---|
| `name` | File name |

| RETURNS | DESCRIPTION |
|---|---|
| `str` | filetype |
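
As an illustration of what a name-to-filetype mapping can look like, here is a minimal sketch. The prefix rules and the fallback are invented for this example and are not the module's actual rules; `map_name_to_filetype_sketch` is a hypothetical name.

```python
from typing import Tuple

# Hypothetical (prefix, filetype) rules, checked in order.
# These are made up for illustration; the real rules may differ.
RULES: Tuple[Tuple[str, str], ...] = (
    ("data_clinical", "clinical"),
    ("data_mutations", "mutation"),
)


def map_name_to_filetype_sketch(name: str) -> str:
    """Return the filetype of the first rule whose prefix matches `name`."""
    for prefix, filetype in RULES:
        if name.startswith(prefix):
            return filetype
    # Fallback chosen for this sketch only: return the name unchanged.
    return name


clinical_type = map_name_to_filetype_sketch("data_clinical_supp.txt")
unknown_type = map_name_to_filetype_sketch("notes.txt")
```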
get_file_mapping(syn, synid)
Get mapping between Synapse entity name and Synapse ids of all entities in a folder
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `synid` | Synapse Id of folder |

| RETURNS | DESCRIPTION |
|---|---|
| `dict` | mapping between Synapse Entity name and Id |
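
A minimal sketch of how such a name-to-Id mapping could be built, assuming the folder's children are listed with `syn.getChildren`, which yields one metadata dictionary per child with `name` and `id` keys. `FakeSynapse`, `file_mapping_sketch`, and the Synapse Ids below are hypothetical stand-ins so the example runs without a Synapse login; this is not necessarily how `get_file_mapping` is implemented.

```python
from typing import Dict, Iterator, List


class FakeSynapse:
    """Hypothetical stand-in for a logged-in synapseclient.Synapse object."""

    def __init__(self, children: Dict[str, List[dict]]):
        self._children = children

    def getChildren(self, parent: str) -> Iterator[dict]:
        # Yield one metadata dict per child entity of `parent`.
        yield from self._children.get(parent, [])


def file_mapping_sketch(syn, synid: str) -> Dict[str, str]:
    """Map each child entity's name to its Synapse Id."""
    return {child["name"]: child["id"] for child in syn.getChildren(synid)}


# Made-up folder contents, for illustration only.
fake_syn = FakeSynapse(
    {
        "syn000": [
            {"name": "data_clinical.txt", "id": "syn111"},
            {"name": "data_CNA.txt", "id": "syn222"},
        ]
    }
)
mapping = file_mapping_sketch(fake_syn, "syn000")
```

With a mapping like this, a caller can look up an entity by file name without listing the folder again.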
get_public_to_consortium_synid_mapping(syn, release_synid)
Gets the mapping between potential public release names and the consortium release folder
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `release_synid` | Release folder fileview |

| RETURNS | DESCRIPTION |
|---|---|
| `dict` | Mapping between potential public release and consortium release synapse id |
get_syntabledf(syn, query_string)
Get dataframe from Synapse Table query
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `query_string` | Table query |

| RETURNS | DESCRIPTION |
|---|---|
| `pd.DataFrame` | Query results in a dataframe |
_get_synid_database_mappingdf(syn, project_id)
Get database to synapse id mapping dataframe
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse object |
| `project_id` | Synapse Project ID with a 'dbMapping' annotation. |

| RETURNS | DESCRIPTION |
|---|---|
| | database to synapse id mapping dataframe |
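
The lookup described above can be sketched as follows, assuming the 'dbMapping' annotation holds the Synapse ID of the mapping table as the first element of a list. `FakeProject`, `FakeSynapse`, `mapping_table_query`, and the IDs are hypothetical; the sketch stops at building the table query rather than downloading a dataframe.

```python
class FakeProject:
    """Hypothetical project entity exposing an `annotations` mapping."""

    def __init__(self, annotations):
        self.annotations = annotations


class FakeSynapse:
    """Hypothetical stand-in for a logged-in synapseclient.Synapse object."""

    def __init__(self, entities):
        self._entities = entities

    def get(self, synid):
        return self._entities[synid]


def mapping_table_query(syn, project_id: str) -> str:
    """Build the query for the table referenced by the 'dbMapping' annotation."""
    project = syn.get(project_id)
    values = project.annotations.get("dbMapping")
    if not values:
        raise ValueError(f"{project_id} has no 'dbMapping' annotation")
    # Assumption: the first annotation value is the mapping table's Synapse ID.
    return f"SELECT * FROM {values[0]}"


fake_syn = FakeSynapse({"syn100": FakeProject({"dbMapping": ["syn200"]})})
query = mapping_table_query(fake_syn, "syn100")
```

A query string of this form could then be passed to `get_syntabledf` to obtain the mapping dataframe.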
getDatabaseSynId(syn, tableName, project_id=None, databaseToSynIdMappingDf=None)
Get database synapse id from database to synapse id mapping table
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse object |
| `project_id` | Synapse Project ID with a database mapping table. DEFAULT: `None` |
| `tableName` | Name of synapse table |
| `databaseToSynIdMappingDf` | Already-downloaded mapping table. Passing it avoids a REST call to download the table again. DEFAULT: `None` |

| RETURNS | DESCRIPTION |
|---|---|
| `str` | Synapse id of wanted database |
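
The sketch below illustrates why passing an already-downloaded mapping is useful when several Ids are needed. It assumes the mapping table has `Database` and `Id` columns and represents it as a list of dictionaries instead of a dataframe; the column names, table names, Ids, and helper functions are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumed column names; the real mapping table may use different ones.
MAPPING_ROWS: List[Dict[str, str]] = [
    {"Database": "sample", "Id": "syn301"},
    {"Database": "patient", "Id": "syn302"},
]

download_count = 0


def load_mapping_rows() -> List[Dict[str, str]]:
    """Hypothetical stand-in for the REST call that downloads the mapping table."""
    global download_count
    download_count += 1
    return MAPPING_ROWS


def database_synid_sketch(
    table_name: str, mapping_rows: Optional[List[Dict[str, str]]] = None
) -> str:
    """Return the Synapse id registered for `table_name`."""
    if mapping_rows is None:
        # Only download when the caller did not supply the mapping.
        mapping_rows = load_mapping_rows()
    matches = [row["Id"] for row in mapping_rows if row["Database"] == table_name]
    if not matches:
        raise KeyError(f"No mapping found for table '{table_name}'")
    return matches[0]


rows = load_mapping_rows()  # one download, reused for both lookups below
sample_id = database_synid_sketch("sample", mapping_rows=rows)
patient_id = database_synid_sketch("patient", mapping_rows=rows)
```

Both lookups reuse `rows`, so the stand-in download runs only once.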
_get_database_mapping_config(syn, synid)
Gets the Synapse database-to-table mapping as a dict
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `synid` | Synapse id of database mapping table |

| RETURNS | DESCRIPTION |
|---|---|
| `dict` | {'databasename': 'synid'} |
get_genie_config(syn, project_id)
Get configurations needed for the GENIE codebase
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `project_id` | Synapse project id |

| RETURNS | DESCRIPTION |
|---|---|
| `dict` | GENIE table type/name to Synapse Id |
_get_oncotreelink(syn, genie_config, oncotree_link=None)
Gets oncotree link unless a link is specified by the user
| PARAMETER | DESCRIPTION |
|---|---|
| `syn` | Synapse connection |
| `genie_config` | database name to synid mapping |
| `oncotree_link` | link to oncotree. Default is None |

| RETURNS | DESCRIPTION |
|---|---|
| `str` | oncotree link |
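
The "use the user's link, otherwise look it up" behaviour can be sketched as below. The sketch assumes the configuration stores the Id of a link entity under an `oncotreeLink` key and that the entity exposes its URL as `externalURL`; both are assumptions, and `FakeLinkEntity`, `FakeSynapse`, `oncotree_link_sketch`, and the URLs are hypothetical.

```python
from typing import Dict, Optional


class FakeLinkEntity:
    """Hypothetical entity that stores a URL in `externalURL`."""

    def __init__(self, external_url: str):
        self.externalURL = external_url


class FakeSynapse:
    """Hypothetical stand-in for a logged-in synapseclient.Synapse object."""

    def __init__(self, entities):
        self._entities = entities

    def get(self, synid):
        return self._entities[synid]


def oncotree_link_sketch(
    syn, genie_config: Dict[str, str], oncotree_link: Optional[str] = None
) -> str:
    """Return the user's link if given, otherwise the configured one."""
    if oncotree_link is None:
        # Assumption: the config stores the link entity's Id under "oncotreeLink".
        link_entity = syn.get(genie_config["oncotreeLink"])
        oncotree_link = link_entity.externalURL
    return oncotree_link


fake_syn = FakeSynapse({"syn400": FakeLinkEntity("https://example.org/oncotree")})
config = {"oncotreeLink": "syn400"}
default_link = oncotree_link_sketch(fake_syn, config)
custom_link = oncotree_link_sketch(fake_syn, config, "https://example.org/custom")
```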