load

genie.load

This module contains all the functions that store data to Synapse.

Attributes

__version__ = '17.1.0' module-attribute

logger = logging.getLogger(__name__) module-attribute

Functions

store_file(syn, filepath, parentid, name=None, annotations=None, used=None, version_comment=None)

Stores a file in Synapse

PARAMETER DESCRIPTION
syn

Synapse connection

TYPE: Synapse

filepath

Path to file

TYPE: str

parentid

Synapse Id of a folder or project

TYPE: str

name

Name of entity. Defaults to basename of your file path.

TYPE: str DEFAULT: None

annotations

Synapse annotations to the File Entity. Defaults to None.

TYPE: Dict DEFAULT: None

used

Entities used to generate file. Defaults to None.

TYPE: List[str] DEFAULT: None

version_comment

File Entity version comment. Defaults to None.

TYPE: str DEFAULT: None

RETURNS DESCRIPTION
File

synapseclient.File: Synapse File entity
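The default for name can be illustrated with a minimal sketch. The helper resolve_entity_name below is hypothetical and not part of genie.load; it only shows what "defaults to basename of your file path" means, assuming the basename is taken with os.path.basename.

```python
import os
from typing import Optional


def resolve_entity_name(filepath: str, name: Optional[str] = None) -> str:
    """Hypothetical helper: use `name` if given, else the file's basename."""
    if name is not None:
        return name
    return os.path.basename(filepath)


# No name given: the entity name falls back to the file's basename.
print(resolve_entity_name("/data/genie/data_clinical.txt"))
# Explicit name given: it is used unchanged.
print(resolve_entity_name("/data/genie/data_clinical.txt", name="clinical"))
```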

store_files(syn, filepaths, parentid)

Stores a list of files

PARAMETER DESCRIPTION
syn

Synapse connection

TYPE: Synapse

filepaths

List of filepaths

TYPE: List[str]

parentid

Synapse Id of a folder or project

TYPE: str

RETURNS DESCRIPTION
List[File]

List[synapseclient.File]: List of Synapse File entities

store_table(syn, filepath, tableid)

Stores a TSV in a Synapse Table. This can append, update, and delete rows in a Synapse Table depending on how the TSV file is formatted.

PARAMETER DESCRIPTION
syn

Synapse connection

TYPE: Synapse

filepath

Path to a TSV

TYPE: str

tableid

Synapse Id of a Synapse Table

TYPE: str

update_process_trackingdf(syn, process_trackerdb_synid, center, process_type, start=True)

Updates the processing tracking database

PARAMETER DESCRIPTION
syn

Synapse connection

TYPE: Synapse

process_trackerdb_synid

Synapse Id of Process Tracking Table

TYPE: str

center

GENIE center (e.g. SAGE)

TYPE: str

process_type

Processing type (dbToStage or public)

TYPE: str

start

Whether this marks the start or the end of processing. Defaults to True (start).

TYPE: bool DEFAULT: True

update_table(syn, databaseSynId, newData, filterBy, filterByColumn='CENTER', col=None, toDelete=False)

Update Synapse table given a new dataframe

PARAMETER DESCRIPTION
syn

Synapse connection

TYPE: Synapse

databaseSynId

Synapse Id of Synapse Table

TYPE: str

newData

New data in a dataframe

TYPE: DataFrame

filterBy

Value to filter new data by

TYPE: str

filterByColumn

Column to filter values by. Defaults to "CENTER".

TYPE: str DEFAULT: 'CENTER'

col

List of columns to ingest. Defaults to None.

TYPE: List[str] DEFAULT: None

toDelete

Delete rows given the primary key. Defaults to False.

TYPE: bool DEFAULT: False

_update_table(syn, database, new_dataset, database_synid, primary_key_cols, to_delete=False)

A helper function that compares the new dataset with the existing data and stores any changes that need to be made to the database

_get_col_order(orig_database_cols)

Get column order

PARAMETER DESCRIPTION
orig_database_cols

A list of column names of the original database

TYPE: Index

RETURNS DESCRIPTION
List[str]

The list of re-ordered column names

_reorder_new_dataset(orig_database_cols, new_dataset)

Reorder new dataset based on the original database

PARAMETER DESCRIPTION
orig_database_cols

A list of column names of the original database

TYPE: Index

new_dataset

New Data

TYPE: DataFrame

RETURNS DESCRIPTION
DataFrame

The re-ordered new dataset
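The idea of reordering new data to follow the original column order can be sketched with plain Python structures standing in for the DataFrame. The function reorder_rows is hypothetical, and the sketch assumes every original column is present in the new data; what the real helper does with missing or extra columns is not stated here.

```python
from typing import Dict, List


def reorder_rows(orig_cols: List[str], new_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper: rebuild each row with keys in the original column order."""
    # Dicts keep insertion order, so iterating orig_cols fixes the column order.
    return [{col: row[col] for col in orig_cols} for row in new_rows]


orig_cols = ["CENTER", "PATIENT_ID", "SEX"]
new_rows = [{"SEX": "F", "CENTER": "SAGE", "PATIENT_ID": "P-001"}]

reordered = reorder_rows(orig_cols, new_rows)
print(list(reordered[0].keys()))
```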

_generate_primary_key(dataset, primary_key_cols, primary_key)

Generate a primary key column for a dataframe

PARAMETER DESCRIPTION
dataset

A dataframe

TYPE: DataFrame

primary_key_cols

Column(s) that make up the primary key

TYPE: List[str]

primary_key

The column name of the primary_key

TYPE: str

RETURNS DESCRIPTION
DataFrame

pd.DataFrame: The dataframe with primary_key column added
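A composite primary key of this kind can be sketched as follows, with a list of dicts standing in for the DataFrame. The function add_primary_key, the space separator, and the column name UNIQUE_KEY are assumptions made for the example; the page does not say how the real helper joins the values.

```python
from typing import Dict, List


def add_primary_key(
    rows: List[Dict[str, str]],
    primary_key_cols: List[str],
    primary_key: str,
    sep: str = " ",  # assumed separator, not confirmed by the source
) -> List[Dict[str, str]]:
    """Hypothetical helper: add a key column built from the given columns."""
    keyed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row[primary_key] = sep.join(str(row[col]) for col in primary_key_cols)
        keyed.append(new_row)
    return keyed


rows = [{"CENTER": "SAGE", "PATIENT_ID": "P-001", "SEX": "F"}]
keyed = add_primary_key(rows, ["CENTER", "PATIENT_ID"], "UNIQUE_KEY")
print(keyed[0]["UNIQUE_KEY"])
```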

check_database_changes(database, new_dataset, primary_key_cols, to_delete=False)

Determine which rows need to be appended, updated, or deleted in the database by comparing it with the new data

PARAMETER DESCRIPTION
database

Original Data

TYPE: DataFrame

new_dataset

New Data

TYPE: DataFrame

primary_key_cols

Column(s) that make up the primary key

TYPE: list

to_delete

Delete rows. Defaults to False

TYPE: bool DEFAULT: False
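The comparison described above can be sketched with dicts keyed by primary key in place of DataFrames. The function diff_by_key and its return shape are hypothetical; the sketch only illustrates the append/update/delete classification and is not the library's implementation.

```python
from typing import Dict, List


def diff_by_key(
    database: Dict[str, Dict[str, str]],
    new_dataset: Dict[str, Dict[str, str]],
    to_delete: bool = False,
) -> Dict[str, List[str]]:
    """Hypothetical helper: classify primary keys as append, update, or delete."""
    # Keys only in the new data are appended.
    appends = [key for key in new_dataset if key not in database]
    # Keys in both whose row content differs are updated.
    updates = [
        key for key in new_dataset
        if key in database and new_dataset[key] != database[key]
    ]
    # Keys only in the database are deleted, but only when to_delete is set.
    deletes = [key for key in database if key not in new_dataset] if to_delete else []
    return {"append": appends, "update": updates, "delete": deletes}


database = {"a": {"val": "1"}, "b": {"val": "2"}, "c": {"val": "3"}}
new_dataset = {"a": {"val": "1"}, "b": {"val": "20"}, "d": {"val": "4"}}
print(diff_by_key(database, new_dataset, to_delete=True))
```

With to_delete left at False, rows missing from the new data are kept in the database rather than flagged for deletion.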

store_database(syn, database_synid, col_order, all_updates, to_delete_rows)

Store changes to the database

PARAMETER DESCRIPTION
syn

Synapse object

TYPE: Synapse

database_synid

Synapse Id of the Synapse table

TYPE: str

col_order

The ordered column names to be saved

TYPE: List[str]

all_updates

Rows to be appended and/or updated

TYPE: DataFrame

to_delete_rows

Rows to be deleted

TYPE: DataFrame

_copyRecursive(syn, entity, destinationId, mapping=None, skipCopyAnnotations=False, **kwargs)

NOTE: This is a copy of the function found here: https://github.com/Sage-Bionetworks/synapsePythonClient/blob/develop/synapseutils/copy_functions.py#L409. This was copied because there is a restriction that doesn't allow copying entities with access requirements.

Recursively copies Synapse entities, but does not copy the wikis

PARAMETER DESCRIPTION
syn

A Synapse object with user's login

TYPE: Synapse

entity

A Synapse entity ID

TYPE: str

destinationId

Synapse ID of a folder/project that the copied entity is being copied to

TYPE: str

mapping

A mapping of the old entities to the new entities

TYPE: Dict[str, str] DEFAULT: None

skipCopyAnnotations

Skips copying the annotations. Defaults to False.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Dict[str, str]

a mapping between the original and copied entity: {'syn1234':'syn33455'}
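How a recursive copy accumulates such a mapping can be sketched on an in-memory tree, with no Synapse connection involved. The function copy_tree, the children dict, and the generated IDs are all hypothetical; the sketch only shows the shape of the recursion and of the returned mapping.

```python
from itertools import count
from typing import Dict, Iterator, List, Optional


def copy_tree(
    children: Dict[str, List[str]],
    entity: str,
    destination_id: str,
    id_source: Iterator[int],
    new_children: Dict[str, List[str]],
    mapping: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical sketch: copy `entity` and its descendants under `destination_id`."""
    if mapping is None:
        mapping = {}
    # Stand-in for creating the copy in Synapse: mint a new ID and place it.
    new_id = f"syn{next(id_source)}"
    new_children.setdefault(destination_id, []).append(new_id)
    mapping[entity] = new_id
    # Each child is copied into the newly created copy of its parent.
    for child in children.get(entity, []):
        copy_tree(children, child, new_id, id_source, new_children, mapping)
    return mapping


# syn1 contains syn2 and syn3; syn3 contains syn4.
children = {"syn1": ["syn2", "syn3"], "syn3": ["syn4"]}
new_children: Dict[str, List[str]] = {}
mapping = copy_tree(children, "syn1", "syn9", count(100), new_children)
print(mapping)
```

Passing the same mapping dict down through the recursion is what lets a single call return the old-to-new IDs for the whole subtree.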