synapse_io

Module: synapse_io

Input/output functions to read and write Synapse tables.

Authors:

Copyright 2015, Sage Bionetworks (http://sagebase.org), Apache v2.0 License

Functions

mhealthx.synapse_io.copy_synapse_table(synapse_table_id, synapse_project_id, table_name='', remove_columns=[], username='', password='')

Copy Synapse table to another Synapse project.

Parameters:

synapse_table_id : string
    Synapse ID of the table to copy
synapse_project_id : string
    Synapse ID of the project to copy the table to
table_name : string
    schema name of table
remove_columns : list of strings
    headers of the columns to remove
username : string
    Synapse username (only needed once on a given machine)
password : string
    Synapse password (only needed once on a given machine)

Returns:

table_data : Pandas DataFrame
    Synapse table contents
table_name : string
    schema name of table
synapse_project_id : string
    Synapse ID of the project within which the table is written

Examples:

>>> from mhealthx.synapse_io import copy_synapse_table
>>> synapse_table_id = 'syn4590865'
>>> synapse_project_id = 'syn4899451'
>>> table_name = 'Copy of ' + synapse_table_id
>>> remove_columns = ['audio_audio.m4a', 'audio_countdown.m4a']
>>> username = ''
>>> password = ''
>>> table_data, table_name, synapse_project_id = copy_synapse_table(synapse_table_id, synapse_project_id, table_name, remove_columns, username, password)

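
The effect of remove_columns can be pictured locally with pandas. The following is only a sketch, not the mhealthx implementation: it assumes the table contents are already in a DataFrame, and the helper name drop_columns and the sample values are hypothetical.

```python
import pandas as pd


def drop_columns(table_data, remove_columns):
    """Return a copy of table_data without the listed columns.

    Hypothetical helper for illustration only; headers that are not
    present in the table are ignored rather than raising an error.
    """
    return table_data.drop(columns=list(remove_columns), errors='ignore')


# Made-up stand-in for the contents of a Synapse table.
table_data = pd.DataFrame({
    'recordId': ['a1', 'b2'],
    'audio_audio.m4a': [101, 102],
    'audio_countdown.m4a': [201, 202],
})
remove_columns = ['audio_audio.m4a', 'audio_countdown.m4a']

trimmed = drop_columns(table_data, remove_columns)
print(list(trimmed.columns))
```

Because drop returns a new DataFrame, the original table_data keeps all of its columns.
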
mhealthx.synapse_io.extract_rows(synapse_table, save_path=None, limit=None, username='', password='')

Extract rows from a Synapse table.

Parameters:

synapse_table : string or Schema
    a Synapse ID or Synapse table Schema object
save_path : string
    save rows as separate files in this path, unless empty or None
limit : integer or None
    maximum number of rows returned by the query
username : string
    Synapse username (only needed once on a given machine)
password : string
    Synapse password (only needed once on a given machine)

Returns:

rows : list of pandas Series
    each row of a Synapse table
row_files : list of strings
    file names corresponding to each of the rows

Examples:

>>> from mhealthx.synapse_io import extract_rows
>>> import synapseclient
>>> syn = synapseclient.Synapse()
>>> syn.login()
>>> synapse_table = 'syn4590865'
>>> save_path = '.'
>>> limit = 3
>>> username = ''
>>> password = ''
>>> rows, row_files = extract_rows(synapse_table, save_path, limit, username, password)

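
Synapse tables are queried with SQL-like strings, so a row limit can be expressed as a limit clause. The sketch below shows one way the limit argument might be turned into such a query. It is an assumption, since the query that extract_rows actually issues is not shown here; build_query is a hypothetical helper, and it handles only the case where the table is given as a Synapse ID string.

```python
def build_query(synapse_table_id, limit=None):
    """Build a query string for a Synapse table (hypothetical helper).

    synapse_table_id : string
        Synapse ID of the table, e.g. 'syn4590865'
    limit : integer or None
        maximum number of rows to request; None requests all rows
    """
    query = 'select * from {0}'.format(synapse_table_id)
    if limit is not None:
        # Cast to int so that only a whole number ends up in the query.
        query += ' limit {0}'.format(int(limit))
    return query


print(build_query('syn4590865', 3))
print(build_query('syn4590865'))
```

A string built this way could then be passed to the Synapse client's tableQuery method by a logged-in session.
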
mhealthx.synapse_io.feature_file_to_synapse_table(feature_file, raw_feature_file, source_file_id, provenance_activity_id, command, command_line, synapse_table_id, username='', password='')

Upload files and file handle IDs to Synapse.

Parameters:

feature_file : string
    path to feature file to upload to Synapse
raw_feature_file : string
    path to raw feature file to upload to Synapse
source_file_id : string
    Synapse file handle ID of the source file used to generate features
provenance_activity_id : string
    Synapse provenance activity ID
command : string
    name of command run to generate raw feature file
command_line : string
    full command line run to generate raw feature file
synapse_table_id : string
    Synapse table ID for table to store file handle IDs, etc.
username : string
    Synapse username (only needed once on a given machine)
password : string
    Synapse password (only needed once on a given machine)

Examples:

>>> from mhealthx.synapse_io import feature_file_to_synapse_table
>>> feature_file = '/Users/arno/Local/wav/test1.wav'
>>> raw_feature_file = '/Users/arno/Local/wav/test1.wav'
>>> source_file_id = ''
>>> provenance_activity_id = ''
>>> command = 'SMILExtract'
>>> command_line = 'SMILExtract -C blah -I blah -O blah'
>>> synapse_table_id = 'syn4899451'
>>> username = ''
>>> password = ''
>>> feature_file_to_synapse_table(feature_file, raw_feature_file, source_file_id, provenance_activity_id, command, command_line, synapse_table_id, username, password)

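
The signature takes both command (the name of the command) and command_line (the full command line). When only the full command line is at hand, the command name can be derived from it. This is a minimal sketch using the standard library, assuming the first token of the command line is the command name.

```python
import shlex

command_line = 'SMILExtract -C blah -I blah -O blah'

# shlex.split tokenizes the string the way a POSIX shell would,
# so quoted arguments containing spaces stay in one piece.
tokens = shlex.split(command_line)

# Assumption: the first token is the name of the command.
command = tokens[0]
print(command)
```
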
mhealthx.synapse_io.read_files_from_row(synapse_table, row, column_name, out_path=None, username='', password='')

Read data from a row of a Synapse table.

Parameters:

synapse_table : string or Schema
    a Synapse ID or Synapse table Schema object
row : pandas Series or string
    row of a Synapse table converted to a Series or CSV file
column_name : string
    name of file handle column
out_path : string or None
    a local path in which to store downloaded files; if None, stores them in ~/.synapseCache
username : string
    Synapse username (only needed once on a given machine)
password : string
    Synapse password (only needed once on a given machine)

Returns:

row : pandas Series
    same as passed in: row of a Synapse table as a file or Series
filepath : string
    downloaded file (full path)

Examples:

>>> from mhealthx.synapse_io import extract_rows, read_files_from_row
>>> import synapseclient
>>> syn = synapseclient.Synapse()
>>> syn.login()
>>> synapse_table = 'syn4590865'
>>> save_path = '.'
>>> limit = 3
>>> username = ''
>>> password = ''
>>> rows, row_files = extract_rows(synapse_table, save_path, limit, username, password)
>>> column_name = 'audio_audio.m4a'
>>> out_path = '.'
>>> for row in rows:
...     row, filepath = read_files_from_row(synapse_table, row, column_name, out_path, username, password)
...     print(row)

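
The out_path fallback described above can be sketched as follows. This is an illustration under assumptions, not the code used by read_files_from_row: resolve_out_path is a hypothetical helper, and the exact location of a file inside the cache directory is left to the Synapse client.

```python
import os


def resolve_out_path(out_path=None):
    """Return the directory where downloaded files would be stored.

    Hypothetical helper: falls back to the cache directory in the
    user's home directory when no out_path is given.
    """
    if out_path is None:
        return os.path.expanduser(os.path.join('~', '.synapseCache'))
    # Normalize relative paths such as '.' to an absolute path.
    return os.path.abspath(out_path)


print(resolve_out_path())
print(resolve_out_path('.'))
```
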
mhealthx.synapse_io.write_synapse_table(table_data, synapse_project_id, table_name='', username='', password='')

Write data to a Synapse table.

Parameters:

table_data : Pandas DataFrame
    Synapse table contents
synapse_project_id : string
    Synapse ID of the project within which the table is to be written
table_name : string
    schema name of table
username : string
    Synapse username (only needed once on a given machine)
password : string
    Synapse password (only needed once on a given machine)

Examples:

>>> from mhealthx.synapse_io import read_synapse_table_files, write_synapse_table
>>> in_synapse_table_id = 'syn4590865'
>>> synapse_project_id = 'syn4899451'
>>> column_names = []
>>> download_limit = None
>>> out_path = '.'
>>> username = ''
>>> password = ''
>>> table_data, files = read_synapse_table_files(in_synapse_table_id, column_names, download_limit, out_path, username, password)
>>> table_name = 'Contents of ' + in_synapse_table_id
>>> write_synapse_table(table_data, synapse_project_id, table_name, username, password)