Cohort generator

create_cohort_tables(cohort_database_schema, connection_details=None, connection=None, cohort_table_names=None, incremental=False)

Create cohort tables

Creates the cohort tables in the database.

Wraps the R CohortGenerator::createCohortTables function defined in CohortGenerator/R/CohortTables.R.

Parameters:

cohort_database_schema (str) – The schema to create the cohort tables in
connection_details (ListVector, optional) – The connection details, by default None
connection (RS4, optional) – The connection, by default None
cohort_table_names (ListVector, optional) – The names of the cohort tables, by default None
incremental (bool, optional) – Whether to create the tables incrementally, by default False

Returns:

A wrapped cohort generation R object

Return type:

RS4
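
The call above is typically paired with get_cohort_table_names. The following is a minimal sketch, not the package's own code: the helper name prepare_cohort_tables is hypothetical, and the module is passed in as the parameter cg because this page does not state its import path. Only keyword arguments documented on this page are used.

```python
def prepare_cohort_tables(cg, connection_details, cohort_database_schema,
                          cohort_table="cohort"):
    """Resolve the cohort table names, then create the tables.

    `cg` is assumed to be the module (or any object) exposing the
    functions documented on this page; this helper is hypothetical.
    """
    # Build the table-name object once so later calls can reuse it.
    table_names = cg.get_cohort_table_names(cohort_table=cohort_table)

    # This sketch supplies connection_details; a live connection could be
    # passed through the `connection` keyword instead.
    cg.create_cohort_tables(
        cohort_database_schema=cohort_database_schema,
        connection_details=connection_details,
        cohort_table_names=table_names,
        incremental=False,
    )
    return table_names
```

Returning the table names lets the same object be reused for the generation and export calls, so every step refers to the same set of tables.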

create_empty_cohort_definition_set(verbose=False)

Create an empty CohortDefinitionSet object

Wraps the R CohortGenerator::createEmptyCohortDefinitionSet function defined in CohortGenerator/R/CohortDefinitionSet.R.

Parameters:

verbose (bool, optional) – When True, descriptions of each field in the data frame are returned. By default False.

Return type:

RS4

export_cohort_stats_tables(cohort_database_schema, cohort_statistics_folder, connection_details=None, connection=None, cohort_table_names=None, snake_case_to_camel_case=True, file_names_in_snake_case=False, incremental=False, database_id=None)

Export the cohort statistics tables to the file system

This function retrieves the data from the cohort statistics tables and writes it to the cohort statistics folder specified in the function call.

Parameters:

cohort_database_schema (str) – The schema containing the cohort tables
cohort_statistics_folder (str) – The path to the folder where the cohort statistics will be written
connection_details (ListVector, optional) – The connection details, by default None
connection (RS4, optional) – The connection, by default None
cohort_table_names (ListVector, optional) – The names of the cohort tables, by default None
snake_case_to_camel_case (bool, optional) – Whether to convert column names in the exported files from snake_case to camelCase, by default True
file_names_in_snake_case (bool, optional) – Whether the exported files should use snake_case names, by default False
incremental (bool, optional) – Whether to export incrementally, by default False
database_id (str, optional) – When specified, the databaseId is added to the exported results, by default None
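
To make the effect of snake_case_to_camel_case concrete, the renaming can be illustrated in plain Python. This is only an illustration of the naming convention, not the conversion code used by the R package; the function name snake_to_camel is hypothetical.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case column name to camelCase (illustrative only)."""
    head, *rest = name.split("_")
    # The first segment stays lowercase; each later segment is capitalized.
    return head + "".join(part.capitalize() for part in rest)


print(snake_to_camel("cohort_definition_id"))  # cohortDefinitionId
print(snake_to_camel("person_count"))          # personCount
```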

generate_cohort_set(cdm_database_schema, cohort_definition_set, connection_details=None, connection=None, temp_emulation_schema=None, cohort_database_schema=None, cohort_table_names=None, stop_on_error=True, incremental=False, incremental_folder=None)

Generate a cohort set

This function generates a set of cohorts in the cohort table.

Wraps the R CohortGenerator::generateCohortSet function defined in CohortGenerator/R/CohortConstruction.R.

Parameters:

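cdm_database_schema (str) – The schema containing the CDM
cohort_definition_set (RS4) – The cohort definition set to generate
connection_details (ListVector, optional) – The connection details obtained using Connect.create_connection_details(...), by default None
connection (RS4, optional) – The connection object obtained from Connect.connect(...), by default None
temp_emulation_schema (str, optional) – The schema to use for temp tables, by default None
cohort_database_schema (str, optional) – The schema containing the cohort tables, by default None
cohort_table_names (ListVector, optional) – The names of the cohort tables, by default None
stop_on_error (bool, optional) – Whether to stop processing the remaining cohorts when one cohort fails to generate, by default True
incremental (bool, optional) – Whether to generate only cohorts that have not been generated before, by default False
incremental_folder (str, optional) – The folder where records of completed generations are kept when incremental is True, by default None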

Return type:

RS4

get_cohort_counts(cohort_database_schema, connection_details=None, connection=None, cohort_table='cohort', cohort_ids=[], cohort_definition_set=None, database_id=None)

Get cohort counts.

Gets the counts for the specified cohort ids.

Wraps the R CohortGenerator::getCohortCounts function defined in CohortGenerator/R/CohortCount.R.

Parameters:

cohort_database_schema (str) – The schema containing the cohort tables
connection_details (ListVector, optional) – The connection details, by default None
connection (RS4, optional) – The connection, by default None
cohort_table (str, optional) – The name of the cohort table, by default “cohort”
cohort_ids (list[int], optional) – The cohort ids to get the counts for, by default []
cohort_definition_set (RS4, optional) – The cohort definition set, by default None
database_id (str, optional) – The database id, by default None

Returns:

A wrapped cohort counts R object

Return type:

RS4
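
Generation and counting are usually run back to back. The sketch below shows one way this might look, under the same assumptions as elsewhere on this page: cg stands in for the module exposing these functions (its import path is not given here), and generate_and_count is a hypothetical helper name.

```python
def generate_and_count(cg, connection_details, cdm_database_schema,
                       cohort_database_schema, cohort_definition_set,
                       table_names, cohort_table="cohort"):
    """Generate every cohort in the set, then fetch the resulting counts.

    `cg` is assumed to expose generate_cohort_set and get_cohort_counts
    with the signatures documented above; this helper is hypothetical.
    """
    cg.generate_cohort_set(
        cdm_database_schema=cdm_database_schema,
        cohort_definition_set=cohort_definition_set,
        connection_details=connection_details,
        cohort_database_schema=cohort_database_schema,
        cohort_table_names=table_names,
        stop_on_error=True,
    )
    # Passing the definition set along lets the counts be matched to it.
    return cg.get_cohort_counts(
        cohort_database_schema=cohort_database_schema,
        connection_details=connection_details,
        cohort_table=cohort_table,
        cohort_definition_set=cohort_definition_set,
    )
```

The cohort_table argument is expected to name the same table that table_names refers to; keeping both in one helper makes a mismatch easier to spot.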

get_cohort_table_names(cohort_table='cohort', cohort_inclusion_table=None, cohort_inclusion_result_table=None, cohort_inclusion_stats_table=None, cohort_summary_stats_table=None, cohort_censor_stats_table=None)

Get the names of the cohort tables

Wraps the R CohortGenerator::getCohortTableNames function defined in CohortGenerator/R/CohortTables.R.

Parameters:

cohort_table (str, optional) – The name of the cohort table, by default “cohort”
cohort_inclusion_table (str, optional) – The name of the cohort inclusion table, by default None
cohort_inclusion_result_table (str, optional) – The name of the cohort inclusion result table, by default None
cohort_inclusion_stats_table (str, optional) – The name of the cohort inclusion stats table, by default None
cohort_summary_stats_table (str, optional) – The name of the cohort summary stats table, by default None
cohort_censor_stats_table (str, optional) – The name of the cohort censor stats table, by default None

Returns:

A list of cohort table names

Return type:

ListVector
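
In the underlying R function, the auxiliary table names default to the cohort table name plus a suffix. Assuming this wrapper treats a None argument the same way (this page does not confirm it), the resolution could be pictured as follows. The function resolve_cohort_table_names and the SUFFIXES mapping are hypothetical, written in plain Python for illustration only.

```python
# Suffixes assumed from the defaults of the R function; not confirmed for this wrapper.
SUFFIXES = {
    "cohort_inclusion_table": "_inclusion",
    "cohort_inclusion_result_table": "_inclusion_result",
    "cohort_inclusion_stats_table": "_inclusion_stats",
    "cohort_summary_stats_table": "_summary_stats",
    "cohort_censor_stats_table": "_censor_stats",
}


def resolve_cohort_table_names(cohort_table="cohort", **overrides):
    """Return a dict of table names, deriving any name that is left as None."""
    names = {"cohort_table": cohort_table}
    for key, suffix in SUFFIXES.items():
        value = overrides.get(key)
        names[key] = value if value is not None else cohort_table + suffix
    return names


names = resolve_cohort_table_names("my_cohort", cohort_censor_stats_table="censor")
print(names["cohort_inclusion_table"])     # my_cohort_inclusion
print(names["cohort_censor_stats_table"])  # censor
```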

save_cohort_definition_set(cohort_definition_set, settings_file_name='inst/cohorts.csv', json_folder='inst/cohorts', sql_folder='inst/sql/sql_server', cohort_file_name_format='%s', cohort_file_name_value=['cohort_id'], subset_json_folder='inst/cohort_subset_definitions/', verbose=False)

Save the cohort definition set to the file system

This function saves a cohort_definition_set to the file system and lets you specify where each element is written: the cohort information is written as a CSV file to settings_file_name, the cohort JSON to json_folder, and the cohort SQL to sql_folder. The JSON and SQL file names can be customized with the cohort_file_name_format and cohort_file_name_value parameters.

Wraps the R CohortGenerator::saveCohortDefinitionSet function defined in CohortGenerator/R/CohortDefinitionSet.R.

Parameters:

cohort_definition_set (RS4) – A CohortDefinitionSet object
settings_file_name (str, optional) – The name of the CSV file that will hold the cohort information, including the cohortId and cohortName, by default 'inst/cohorts.csv'
json_folder (str, optional) – The name of the folder that will hold the JSON representation of the cohort, if it is available in the cohort definition set, by default 'inst/cohorts'
sql_folder (str, optional) – The name of the folder that will hold the SQL representation of the cohort, by default 'inst/sql/sql_server'
cohort_file_name_format (str, optional) – The format string for naming the cohort JSON and SQL files. It follows the conventions of R's base sprintf function. By default '%s'
cohort_file_name_value (list[str], optional) – The columns in the cohort definition set whose values are substituted into cohort_file_name_format, by default ['cohort_id']
subset_json_folder (str, optional) – The folder in which to store the subset JSON, by default 'inst/cohort_subset_definitions/'
verbose (bool, optional) – When True, logging messages are emitted to indicate export progress. By default False.

Return type:

None
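
The interaction between cohort_file_name_format and cohort_file_name_value can be illustrated with Python's % operator, which handles %s placeholders the way sprintf does. This is a sketch of the naming idea only, not the code used by the R package; cohort_file_stem and the example row are hypothetical.

```python
def cohort_file_stem(row, file_name_format="%s", file_name_value=("cohort_id",)):
    """Build a file-name stem from one cohort row (illustrative only).

    `row` maps column names to values; each column listed in
    `file_name_value` fills one placeholder in `file_name_format`.
    """
    values = tuple(row[column] for column in file_name_value)
    return file_name_format % values


row = {"cohort_id": 12, "cohort_name": "t2dm"}
print(cohort_file_stem(row))                                          # 12
print(cohort_file_stem(row, "%s_%s", ("cohort_id", "cohort_name")))   # 12_t2dm
```

The number of placeholders in the format string has to match the number of columns listed, otherwise the formatting step raises a TypeError in this sketch.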