Cohort generator

create_cohort_tables(cohort_database_schema, connection_details=None, connection=None, cohort_table_names=None, incremental=False)

Create cohort tables

Creates the cohort tables in the database.

Wraps the R CohortGenerator::createCohortTables function defined in CohortGenerator/R/CohortTables.R.

Parameters:
  • cohort_database_schema (str) – The schema to create the cohort tables in

  • connection_details (ListVector, optional) – The connection details, by default None

  • connection (RS4, optional) – The connection, by default None

  • cohort_table_names (ListVector, optional) – The names of the cohort tables, by default None

  • incremental (bool, optional) – Whether the tables should be created incrementally, by default False

Returns:

A wrapped cohort generation R object

Return type:

RS4

create_empty_cohort_definition_set(verbose=False)

Create an empty CohortDefinitionSet object

Wraps the R CohortGenerator::createEmptyCohortDefinitionSet function defined in CohortGenerator/R/CohortDefinitionSet.R.

Parameters:

verbose (bool, optional) – When True, descriptions of each field in the data frame are returned. By default False.

Return type:

RS4

export_cohort_stats_tables(cohort_database_schema, cohort_statistics_folder, connection_details=None, connection=None, cohort_table_names=None, snake_case_to_camel_case=True, file_names_in_snake_case=False, incremental=False, database_id=None)

Export the cohort statistics tables to the file system

This function retrieves the data from the cohort statistics tables and writes it to the folder given by cohort_statistics_folder.

Parameters:
  • cohort_database_schema (str) – The schema containing the cohort tables

  • cohort_statistics_folder (str) – The path to the folder where the cohort statistics results will be written

  • connection_details (ListVector, optional) – The connection details, by default None

  • connection (RS4, optional) – The connection, by default None

  • cohort_table_names (ListVector, optional) – The names of the cohort tables, by default None

  • snake_case_to_camel_case (bool, optional) – Whether column names in the exported files should be converted from snake_case to camelCase, by default True

  • file_names_in_snake_case (bool, optional) – Whether the exported file names should use snake_case, by default False

  • incremental (bool, optional) – Whether the tables should be created incrementally, by default False

  • database_id (str, optional) – When specified, the databaseId will be added to the exported results
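To make the effect of snake_case_to_camel_case concrete, here is a minimal sketch of the kind of renaming it implies. The helper snake_to_camel and the column names are illustrative assumptions, not part of this package; the actual conversion is done on the R side and may treat edge cases differently.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case identifier to camelCase (illustrative helper only)."""
    head, *rest = name.lower().split("_")
    # Keep the first word lowercase and capitalize each following word.
    return head + "".join(part.capitalize() for part in rest)


# Hypothetical column names, chosen only for illustration.
columns = ["cohort_id", "person_count", "rule_sequence"]
print([snake_to_camel(column) for column in columns])
```

With snake_case_to_camel_case=False, the exported columns would presumably keep their snake_case names.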

generate_cohort_set(cdm_database_schema, cohort_definition_set, connection_details=None, connection=None, temp_emulation_schema=None, cohort_database_schema=None, cohort_table_names=None, stop_on_error=True, incremental=False, incremental_folder=None)

Generate a cohort set

This function generates a set of cohorts in the cohort table.

Wraps the R CohortGenerator::generateCohortSet function defined in CohortGenerator/R/CohortConstruction.R.

Parameters:
  • cdm_database_schema (str) – The schema containing the CDM

  • connection_details (ListVector | None, optional) – The connection details obtained using Connect.create_connection_details(...), by default None

  • connection (None, optional) – The connection object obtained from Connect.connect(...), by default None

  • temp_emulation_schema (None, optional) – The schema to use for temp tables, by default None

Return type:

RS4

get_cohort_counts(cohort_database_schema, connection_details=None, connection=None, cohort_table='cohort', cohort_ids=[], cohort_definition_set=None, database_id=None)

Get cohort counts.

Gets the counts for the specified cohort ids.

Wraps the R CohortGenerator::getCohortCounts function defined in CohortGenerator/R/CohortCount.R.

Parameters:
  • cohort_database_schema (str) – The schema containing the cohort tables

  • connection_details (ListVector, optional) – The connection details, by default None

  • connection (RS4, optional) – The connection, by default None

  • cohort_table (str, optional) – The name of the cohort table, by default “cohort”

  • cohort_ids (list[int], optional) – The cohort ids to get the counts for, by default []

  • cohort_definition_set (RS4, optional) – The cohort definition set, by default None

  • database_id (str, optional) – The database id, by default None

Returns:

A wrapped cohort counts R object

Return type:

RS4

get_cohort_table_names(cohort_table='cohort', cohort_inclusion_table=None, cohort_inclusion_result_table=None, cohort_inclusion_stats_table=None, cohort_summary_stats_table=None, cohort_censor_stats_table=None)

Get the names of the cohort tables

Wraps the R CohortGenerator::getCohortTableNames function defined in CohortGenerator/R/CohortTables.R.

Parameters:
  • cohort_table (str, optional) – The name of the cohort table, by default “cohort”

  • cohort_inclusion_table (str, optional) – The name of the cohort inclusion table, by default None

  • cohort_inclusion_result_table (str, optional) – The name of the cohort inclusion result table, by default None

  • cohort_inclusion_stats_table (str, optional) – The name of the cohort inclusion stats table, by default None

  • cohort_summary_stats_table (str, optional) – The name of the cohort summary stats table, by default None

  • cohort_censor_stats_table (str, optional) – The name of the cohort censor stats table, by default None

Returns:

A list of cohort table names

Return type:

ListVector
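One plausible way to chain the functions documented above is sketched below. It uses only the keyword arguments listed in the signatures on this page. The helper run_cohort_pipeline, the RecordingStub class, and the schema names are assumptions made for illustration: the stub stands in for the real module so the call order can be inspected without R or a database.

```python
def run_cohort_pipeline(cg, connection_details, cdm_schema, cohort_schema,
                        cohort_definition_set, cohort_table="cohort"):
    """Create the cohort tables, generate the cohorts, and return the counts.

    `cg` is any object exposing the functions documented on this page,
    normally the imported cohort generator module (import path not shown here).
    """
    table_names = cg.get_cohort_table_names(cohort_table=cohort_table)
    cg.create_cohort_tables(
        cohort_database_schema=cohort_schema,
        connection_details=connection_details,
        cohort_table_names=table_names,
    )
    cg.generate_cohort_set(
        cdm_database_schema=cdm_schema,
        cohort_definition_set=cohort_definition_set,
        connection_details=connection_details,
        cohort_database_schema=cohort_schema,
        cohort_table_names=table_names,
    )
    return cg.get_cohort_counts(
        cohort_database_schema=cohort_schema,
        connection_details=connection_details,
        cohort_table=cohort_table,
    )


class RecordingStub:
    """Hypothetical stand-in for the module: records call names, touches no database."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the wrapped functions.
        def record(**kwargs):
            self.calls.append(name)
            return f"<{name} result>"
        return record


stub = RecordingStub()
result = run_cohort_pipeline(
    stub,
    connection_details=None,      # placeholder; a real run needs connection details
    cdm_schema="cdm",             # hypothetical schema names
    cohort_schema="results",
    cohort_definition_set=None,   # placeholder for a CohortDefinitionSet object
)
print(stub.calls)
```

In a real session, the stub would be replaced by the module itself, and connection_details by the object obtained from Connect.create_connection_details(...).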

save_cohort_definition_set(cohort_definition_set, settings_file_name='inst/cohorts.csv', json_folder='inst/cohorts', sql_folder='inst/sql/sql_server', cohort_file_name_format='%s', cohort_file_name_value=['cohort_id'], subset_json_folder='inst/cohort_subset_definitions/', verbose=False)

Save the cohort definition set to the file system

This function saves a cohort_definition_set to the file system and lets you choose where each element is written: the settings file, named by settings_file_name, holds the cohort information as a CSV; the cohort JSON is written to json_folder; and the SQL is written to sql_folder. The cohort_file_name_format and cohort_file_name_value parameters control how the JSON and SQL files are named.

Wraps the R CohortGenerator::saveCohortDefinitionSet function defined in CohortGenerator/R/CohortDefinitionSet.R.

Parameters:
  • cohort_definition_set (RS4) – A CohortDefinitionSet object

  • settings_file_name (str, optional) – The name of the CSV file that will hold the cohort information including the cohortId and cohortName

  • json_folder (str, optional) – The name of the folder that will hold the JSON representation of the cohort, if it is available in the cohort_definition_set

  • sql_folder (str, optional) – The name of the folder that will hold the SQL representation of the cohort

  • cohort_file_name_format (str, optional) – The format string used to name the cohort JSON and SQL files. It follows the conventions of the base R sprintf function.

  • cohort_file_name_value (list[str], optional) – The columns of the cohort_definition_set whose values are substituted into cohort_file_name_format

  • subset_json_folder (str, optional) – Defines the folder to store the subset JSON

  • verbose (bool, optional) – When True, logging messages are emitted to indicate export progress. By default False.

Return type:

None
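The interplay of cohort_file_name_format and cohort_file_name_value can be sketched in plain Python, whose % operator follows the same sprintf conventions for simple placeholders such as %s. The helper cohort_file_stem and the example row are hypothetical; the real naming happens inside the R function and may differ in detail.

```python
def cohort_file_stem(row, name_format="%s", name_value=("cohort_id",)):
    """Build a file-name stem from one cohort row (illustrative helper only)."""
    # Pick the listed columns, in order, and substitute them into the format.
    values = tuple(row[column] for column in name_value)
    return name_format % values


# Hypothetical row of a cohort definition set.
row = {"cohort_id": 1234, "cohort_name": "example"}

print(cohort_file_stem(row))
print(cohort_file_stem(row, "%s_%s", ["cohort_id", "cohort_name"]))
```

The number of placeholders in the format string has to match the number of columns listed; otherwise the substitution fails.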