transform

genie.transform

This module contains all the transformation functions used throughout the GENIE package.

Functions

_col_name_to_titlecase(string)

Converts a string to title case. Supports strings whose words are separated by underscores (_).

PARAMETER DESCRIPTION
string

A string

TYPE: str

RETURNS DESCRIPTION
str

A string converted to title case

TYPE: str
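
As a rough illustration of the behaviour described above, the sketch below title-cases each underscore-separated word. It is not the GENIE implementation: the function name `col_name_to_titlecase_sketch` is hypothetical, and it assumes the underscores are kept in the output, which the description does not state.

```python
def col_name_to_titlecase_sketch(string: str) -> str:
    """Hypothetical sketch: title-case each underscore-separated word.

    Assumption: underscores are preserved in the result.
    """
    # str.capitalize() upper-cases the first character and lower-cases the rest
    return "_".join(part.capitalize() for part in string.split("_"))


titled = col_name_to_titlecase_sketch("tumor_sample_barcode")
```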

_convert_col_with_nas_to_str(df, col)

Converts a column to str while preserving NAs.
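
One way such a conversion could work is sketched below: only non-missing entries are cast to `str`, so missing values pass through unchanged. The function name and the choice to return a plain list are assumptions made for illustration; the return type is not documented here.

```python
import pandas as pd


def convert_col_with_nas_to_str_sketch(df: pd.DataFrame, col: str) -> list:
    """Hypothetical sketch: cast non-missing values to str, leave NAs alone.

    Assumption: returns a list of values rather than modifying df.
    """
    return [str(val) if pd.notna(val) else val for val in df[col]]


example = pd.DataFrame({"a": [1, None, "x"]})
converted = convert_col_with_nas_to_str_sketch(example, "a")
```

Calling `astype(str)` on the whole column would instead turn missing values into literal strings such as "nan" or "None", which is presumably what this helper is meant to avoid.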

_convert_float_col_with_nas_to_int(df, col)

Converts an integer column that pandas turned into a float column (which it does for integer values that contain NAs) back into an integer column with the NAs intact.
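
The sketch below shows one way to do this with pandas's nullable `Int64` dtype, which can hold integers alongside missing values. The function name is hypothetical, and returning a Series (rather than modifying `df`) is an assumption; the actual implementation may differ.

```python
import pandas as pd


def convert_float_col_with_nas_to_int_sketch(df: pd.DataFrame, col: str) -> pd.Series:
    """Hypothetical sketch: float column with NaNs -> nullable integer column."""
    if not pd.api.types.is_float_dtype(df[col]):
        # Nothing to undo if pandas did not promote the column to float
        return df[col]
    # "Int64" (capital I) is the nullable integer dtype; NaN becomes pd.NA.
    # This raises if the column holds non-integer values such as 1.5.
    return df[col].astype("Int64")


example = pd.DataFrame({"n": [1.0, float("nan"), 3.0]})
as_int = convert_float_col_with_nas_to_int_sketch(example, "n")
```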

_convert_df_with_mixed_dtypes(read_csv_params)

Checks whether a dataframe read in with the default settings comes out with mixed data types. This can happen when low_memory=True, because read_csv parses the file in chunks and infers dtypes chunk by chunk. If it does, the columns with mixed data types are converted to a single data type.

PARAMETER DESCRIPTION
read_csv_params

Dictionary of input parameters and values for pandas's read_csv function. Must include the file path of the dataset to be read in.

TYPE: dict

RETURNS DESCRIPTION
DataFrame

pd.DataFrame : The dataset read in
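
A minimal sketch of this check-and-retry pattern follows. It assumes the mixed-dtype case is detected by promoting pandas's `DtypeWarning` to an error and that the fallback is a second read with `low_memory=False`; the function name is hypothetical and the real implementation may detect or resolve mixed dtypes differently.

```python
import os
import tempfile
import warnings

import pandas as pd


def read_csv_single_dtype_sketch(read_csv_params: dict) -> pd.DataFrame:
    """Hypothetical sketch: retry the read without chunked dtype inference."""
    with warnings.catch_warnings():
        # Turn the mixed-dtype warning into an exception for this read only
        warnings.simplefilter("error", pd.errors.DtypeWarning)
        try:
            return pd.read_csv(**{**read_csv_params, "low_memory": True})
        except pd.errors.DtypeWarning:
            # low_memory=False parses the file in one pass, so each column
            # gets one inferred dtype; this option applies to the C engine
            params = {**read_csv_params, "low_memory": False, "engine": "c"}
            return pd.read_csv(**params)


# Usage: write a tiny CSV to a temporary file and read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")
    with open(path, "w") as handle:
        handle.write("a,b\n1,x\n2,y\n")
    example_df = read_csv_single_dtype_sketch({"filepath_or_buffer": path})
```

Passing a file path rather than an open buffer matters in this sketch, because the fallback reads the source a second time.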

_convert_values_to_na(input_df, values_to_replace, columns_to_convert)

Converts given values to NA in an input dataset

PARAMETER DESCRIPTION
input_df

input dataset

TYPE: DataFrame

values_to_replace

string values to replace with NA

TYPE: List[str]

columns_to_convert

subset of columns in which to replace the given values with NA

TYPE: List[str]

RETURNS DESCRIPTION
DataFrame

pd.DataFrame: dataset with specified values replaced with NAs
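
The replacement could be sketched as below, using `DataFrame.isin` to find the matching cells and `DataFrame.mask` to blank them out. This is an illustration only: the function name is hypothetical, and working on a copy of `input_df` is an assumption, since the description does not say whether the input is modified in place.

```python
from typing import List

import pandas as pd


def convert_values_to_na_sketch(
    input_df: pd.DataFrame,
    values_to_replace: List[str],
    columns_to_convert: List[str],
) -> pd.DataFrame:
    """Hypothetical sketch: replace the given values with NA in chosen columns."""
    output_df = input_df.copy()  # assumption: leave the caller's data untouched
    subset = output_df[columns_to_convert]
    # mask() sets cells to NaN wherever the condition is True
    output_df[columns_to_convert] = subset.mask(subset.isin(values_to_replace))
    return output_df


example = pd.DataFrame({"a": ["x", "Unknown"], "b": ["Unknown", "y"]})
cleaned = convert_values_to_na_sketch(example, ["Unknown"], ["a"])
```

Only column `a` is listed in `columns_to_convert`, so the "Unknown" in column `b` is left as-is.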