ipumspy.api.MicrodataExtract.__init__

MicrodataExtract.__init__(collection, samples, variables, description='', data_format='fixed_width', data_structure={'rectangular': {'on': 'P'}}, time_use_variables=None, **kwargs)

Class for defining an extract for an IPUMS microdata collection.

Parameters:
  • collection (str) – name of an IPUMS data collection

  • samples (Union[List[str], List[Sample]]) – list of sample IDs from an IPUMS microdata collection

  • variables (Union[List[str], List[Variable]]) – list of variable names from an IPUMS microdata collection

  • description (str) – short description of your extract

  • data_format (str) – format of the extract data file; “fixed_width” (the default) and “csv” are supported

  • data_structure (Dict) – nested dict with “rectangular”, “hierarchical”, or “household-only” as first-level key. “rectangular” extracts require further specification of “on”: <record type>. The default, {“rectangular”: {“on”: “P”}}, requests an extract rectangularized on the “P” record.

  • time_use_variables (Union[List[str], List[TimeUseVariable], None]) – a list of IPUMS Time Use Variable names or TimeUseVariable objects. This argument is only valid for the IPUMS ATUS, MTUS, and AHTUS data collections. If the list contains user-created Time Use Variables, all entries must be passed as TimeUseVariable objects with the ‘owner’ field specified.
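
The nested shape of data_structure may be easier to read when written out. The following is a minimal sketch using plain dicts only. The helper check_data_structure is hypothetical and not part of ipumspy, the empty inner dict for “hierarchical” is an assumption, and the one-key rule it enforces is inferred from the description above rather than confirmed by the library.

```python
# Plain-dict sketches of the data_structure argument described above.
RECTANGULAR_ON_P = {"rectangular": {"on": "P"}}  # the documented default
HIERARCHICAL = {"hierarchical": {}}  # assumption: no further options needed


def check_data_structure(data_structure):
    """Hypothetical helper (not an ipumspy function).

    Checks the shape described in the parameter list: a single
    first-level key, with "rectangular" also requiring an "on" entry.
    Returns the first-level key.
    """
    if len(data_structure) != 1:
        raise ValueError("expected exactly one first-level key")
    kind, options = next(iter(data_structure.items()))
    if kind == "rectangular" and "on" not in options:
        raise ValueError('"rectangular" requires {"on": <record type>}')
    return kind


kind = check_data_structure(RECTANGULAR_ON_P)
```

A dict such as {"rectangular": {}} would be rejected by this sketch because the “on” record type is missing.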

Keyword Arguments:
  • data_quality_flags – a boolean value. If True, the extract includes the data quality flag for each variable in the variables list that has one.

  • sample_members – a dictionary of non-default sample members to include for Time Use collections. Keys are strings indicating the sample member type and values are booleans. This argument is only valid for the IPUMS ATUS, MTUS, and AHTUS data collections. Valid keys include ‘include_non_respondents’ and ‘include_household_members’.

  • case_select_who – indicates how to interpret any case selections included for variables in the extract. "individuals" includes records for all individuals who match the specified case selections, while "households" includes records for all members of each household that contains an individual who matches the specified case selections.
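
The parameters and keyword arguments above can be combined as in the sketch below. The collection name, sample ID, variable names, and description are illustrative assumptions, not values taken from this page, and the import is guarded so the argument dicts can be inspected even where ipumspy is not installed.

```python
# Arguments for a rectangular CSV extract; all values are illustrative.
extract_args = dict(
    collection="usa",           # assumed collection name
    samples=["us2012b"],        # assumed sample ID
    variables=["AGE", "SEX"],   # assumed variable names
    description="Example extract",
    data_format="csv",
    data_structure={"rectangular": {"on": "P"}},
    # Keyword arguments passed through **kwargs:
    data_quality_flags=True,
    case_select_who="households",
)

# sample_members applies only to ATUS, MTUS, and AHTUS extracts,
# so it is shown separately rather than added to the "usa" arguments.
time_use_kwargs = {
    "sample_members": {
        "include_non_respondents": True,
        "include_household_members": False,
    }
}

try:
    from ipumspy.api import MicrodataExtract
except ImportError:  # ipumspy not installed; keep the sketch importable
    MicrodataExtract = None

# Build the extract definition only when the library is available.
extract = MicrodataExtract(**extract_args) if MicrodataExtract else None
```

With case_select_who set to "households", any case selections on the variables would keep every member of a household that contains a matching individual, as described above.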