Fileformat

genie.example_filetype_format

TODO: Rename this to model.py.

This contains the GENIE model objects.

Attributes

logger = logging.getLogger(__name__) module-attribute

Classes

ValidationResults dataclass

Validation results

Functions
is_valid()

True if file is valid

collect_errors_and_warnings()

Aggregates errors and warnings into a single string.

RETURNS DESCRIPTION
str

A message containing errors and warnings.

TYPE: str
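As a rough illustration of how a results object like this might behave, the sketch below defines a stand-in dataclass. The class name `ResultsSketch`, the `errors` and `warnings` string fields, and the layout of the combined message are assumptions made for illustration; they are not taken from the GENIE implementation.

```python
from dataclasses import dataclass


@dataclass
class ResultsSketch:
    """Hypothetical stand-in for ValidationResults (fields are assumed)."""

    errors: str = ""    # assumed: newline-separated error messages
    warnings: str = ""  # assumed: newline-separated warning messages

    def is_valid(self) -> bool:
        # Treat the file as valid when no errors were recorded.
        return self.errors == ""

    def collect_errors_and_warnings(self) -> str:
        # Join whichever of the two parts are non-empty into one message.
        parts = [part for part in (self.errors, self.warnings) if part]
        return "\n".join(parts)


ok = ResultsSketch(warnings="Optional column is missing.")
bad = ResultsSketch(errors="SAMPLE_ID column is required.")
message = bad.collect_errors_and_warnings()
```

In this sketch a file with warnings only still counts as valid, which is one plausible reading of "True if file is valid".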

FileTypeFormat

Functions
__init__(syn, center, genie_config=None, ancillary_files=None)

A validator helper class for a center's files.

PARAMETER DESCRIPTION
syn

A synapseclient.Synapse object.

TYPE: Synapse

center

The participating center name.

TYPE: str

genie_config

The configuration needed for the GENIE codebase: a mapping of GENIE table type/name to Synapse ID. Defaults to None.

TYPE: dict DEFAULT: None

ancillary_files

All files downloaded for validation. Defaults to None.

TYPE: List[List[Entity]] DEFAULT: None

read_file(filePathList)

Reads each file in for validation and processing. This method is not meant to be changed by the individual file types.

PARAMETER DESCRIPTION
filePathList

A list of file paths (at most two, for the two clinical files).

RETURNS DESCRIPTION
df

pandas DataFrame of the file contents.

validateFilename(filePath)

Validates the file name. The file name is what maps a file to its validation and processing logic.

PARAMETER DESCRIPTION
filePath

Path to file

RETURNS DESCRIPTION
str

The file type, as defined by self._fileType.
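The sketch below shows one way a file name check could map a file to its type. Only `validateFilename` and `self._fileType` come from the description above; the classes `FileTypeSketch` and `ExampleTextFormat`, the `_check_name` hook, and the naming rule itself are hypothetical.

```python
import os


class FileTypeSketch:
    """Hypothetical base class; only _fileType is named in the docs above."""

    _fileType = "fileType"

    def _check_name(self, file_path):
        # Assumed hook: each file type raises AssertionError on a bad name.
        raise NotImplementedError

    def validateFilename(self, filePath):
        # If the name check passes, report which file type matched.
        self._check_name(filePath)
        return self._fileType


class ExampleTextFormat(FileTypeSketch):
    _fileType = "example"

    def _check_name(self, file_path):
        # Assumed naming rule, for illustration only: example_<center>.txt
        name = os.path.basename(file_path)
        assert name.startswith("example_") and name.endswith(".txt")


file_type = ExampleTextFormat().validateFilename("/tmp/example_SAGE.txt")
```

A caller could try each known file type in turn and keep the first one whose check does not raise.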

process_steps(df, **kwargs)

This method is implemented differently for each file type. It reformats the file and stores it in the database and Synapse.

preprocess(newpath)

Performs any preprocessing of the entity name that is needed to build the kwargs for processing. The entity name is included in the new path.

PARAMETER DESCRIPTION
newpath

Path to file

process(filePath, **kwargs)

This is the main processing function.

PARAMETER DESCRIPTION
filePath

Path to file

kwargs

The kwargs are determined by self._process_kwargs

DEFAULT: {}

RETURNS DESCRIPTION
str

file path of processed file
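To make the relationship between `preprocess`, `process_steps`, and `process` more concrete, here is a minimal sketch of how they might fit together. The call order, the `newPath` kwarg name, the `entity_name` key, and the list-of-dicts stand-in for a DataFrame are all assumptions; the real method may differ.

```python
class ProcessSketch:
    """Hypothetical flow; the call order is an assumption, not GENIE's code."""

    _process_kwargs = ["newPath"]  # assumed list of required kwarg names

    def read_file(self, filePathList):
        # Stand-in for the real reader, which returns a pandas DataFrame.
        return [{"source": path} for path in filePathList]

    def preprocess(self, newpath):
        # Derive extra kwargs from the entity name embedded in the path.
        return {"entity_name": newpath.rsplit("/", 1)[-1]}

    def process_steps(self, df, **kwargs):
        # File-type-specific reformatting and storage would go here.
        return kwargs["newPath"]

    def process(self, filePath, **kwargs):
        # Keep only the kwargs this file type declares, failing early if absent.
        missing = [k for k in self._process_kwargs if k not in kwargs]
        if missing:
            raise ValueError(f"missing kwargs: {missing}")
        selected = {k: kwargs[k] for k in self._process_kwargs}
        selected.update(self.preprocess(kwargs["newPath"]))
        df = self.read_file([filePath])
        return self.process_steps(df, **selected)


processed_path = ProcessSketch().process("in/data.txt", newPath="out/data.txt")
```

Filtering kwargs against a declared list keeps each file type's `process_steps` signature explicit about what it needs.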

validate(filePathList, **kwargs)

This is the main validation function. It calls self._validate, which is implemented differently for each file type.

PARAMETER DESCRIPTION
filePathList

A list of file paths.

kwargs

The kwargs are determined by self._validation_kwargs

DEFAULT: {}

RETURNS DESCRIPTION
ValidationResults

The errors and warnings collected during validation.

TYPE: ValidationResults
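The division of labor between `validate` and `self._validate` resembles the template-method pattern. The sketch below illustrates that pattern under stated assumptions: `ResultsSketch`, `ValidatorSketch`, and `HeaderCheck` are hypothetical names, `_validate` is assumed to return an `(errors, warnings)` pair of strings, and plain text lines stand in for a DataFrame.

```python
from dataclasses import dataclass


@dataclass
class ResultsSketch:
    """Hypothetical stand-in for ValidationResults."""

    errors: str
    warnings: str


class ValidatorSketch:
    _validation_kwargs = []  # assumed: names of kwargs forwarded to _validate

    def read_file(self, filePathList):
        # Stand-in reader: collect the text lines of every file.
        rows = []
        for path in filePathList:
            with open(path) as handle:
                rows.extend(handle.read().splitlines())
        return rows

    def _validate(self, rows, **kwargs):
        # Default hook: no errors, no warnings. File types override this.
        return "", ""

    def validate(self, filePathList, **kwargs):
        selected = {k: kwargs[k] for k in self._validation_kwargs}
        try:
            rows = self.read_file(filePathList)
        except OSError as err:
            # An unreadable file is reported as a validation error.
            message = f"cannot read {filePathList}: {err}"
            return ResultsSketch(errors=message, warnings="")
        errors, warnings = self._validate(rows, **selected)
        return ResultsSketch(errors=errors, warnings=warnings)


class HeaderCheck(ValidatorSketch):
    def _validate(self, rows, **kwargs):
        # Assumed rule, for illustration: header must start with SAMPLE_ID.
        has_header = bool(rows) and rows[0].startswith("SAMPLE_ID")
        errors = "" if has_header else "First column must be SAMPLE_ID."
        return errors, ""
```

With this shape, the shared method owns file reading and result packaging, and each file type supplies only its own checks.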