py_research.geo module

Utilities for working with geographical data, especially data associated with countries.

class GeoAlliance(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: StrEnum

List of international alliances used to define geo-regions of interest.

EU = 'eu'

See Eurostat definition

EU12 = 'eu12'
EU15 = 'eu15'
EU25 = 'eu25'
EU27 = 'eu27'
EU27_2007 = 'eu27_2007'
EU28 = 'eu28'
EEA = 'eea'

EEA according to Eurostat

G7 = 'g7'

G7 according to Wikipedia

G20 = 'g20'

G20 according to Wikipedia

APEC = 'apec'

APEC according to Wikipedia

BRIC = 'bric'

BRIC according to Wikipedia

BASIC = 'basic'

BASIC according to Wikipedia

CIS = 'cis'

CIS according to Wikipedia

OECD = 'oecd'

OECD members
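Because GeoAlliance derives from StrEnum, its members behave like plain strings. The sketch below illustrates that behaviour with a hypothetical stand-in class, AllianceSketch, that copies three of the member values listed above. It does not import py_research, and the fallback for interpreters older than Python 3.11 is an assumption added only to keep the example runnable.

```python
from enum import Enum

try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:  # fallback for older interpreters
    class StrEnum(str, Enum):
        """Minimal stand-in: members are also str instances."""


class AllianceSketch(StrEnum):
    """Hypothetical stand-in mirroring three of the documented members."""

    EU = "eu"
    G7 = "g7"
    OECD = "oecd"


# Members compare equal to their plain string values.
is_plain_string = AllianceSketch.G7 == "g7"

# A member can be looked up from its value, e.g. one read from a config file.
from_value = AllianceSketch("oecd")

# Iteration yields the members in definition order.
member_values = [member.value for member in AllianceSketch]
```

This makes it possible to pass either the enum member or its lower-case string value wherever an alliance label is compared against string data.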

class GeoScheme(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: StrEnum

Schemes that can be used to define geo-regions of interest.

country_name = 'country_name'

Short name of a country.

continent = 'continent'

Name of a continent.

cc_iso3 = 'cc_iso3'

ISO3 code of a country.

cc_iso2 = 'cc_iso2'

ISO2 code of a country.

alliance = 'alliance'

Name of an international alliance to which a country belongs.

world = 'world'

Dummy scheme to match all of the world.

to_coco_scheme()

Return associated coco scheme, if applicable.

Return type:

str | None

class GeoRegion(label, scheme=GeoScheme.cc_iso3, display_label=None, exclude_already_covered=True)

Bases: object

Define a geo-region according to some scheme (country, continent, etc.).

Parameters:
  • label (str) –

  • scheme (GeoScheme) –

  • display_label (str | None) –

  • exclude_already_covered (bool) –

label: str

The geo-location’s label according to scheme.

scheme: GeoScheme = 'cc_iso3'

The naming/classification scheme used.

display_label: str | None = None

Optional, custom display label for the geo-location.

exclude_already_covered: bool = True

When listing multiple geo-regions, exclude locations which have already been covered by previously listed regions.
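The effect of exclude_already_covered can be pictured with a small plain-Python sketch. Everything here is an assumption made for illustration: the MEMBERS table, the "europe_subset" label and the resolve_regions helper are hypothetical and not part of py_research. The sketch also applies the flag to the whole listing, whereas the documented attribute is set per region.

```python
# Hypothetical membership table (ISO3 codes); the real module resolves
# region members itself.
MEMBERS = {
    "g7": ["CAN", "FRA", "DEU", "ITA", "JPN", "GBR", "USA"],
    "europe_subset": ["FRA", "DEU", "ITA", "ESP", "POL"],
}


def resolve_regions(labels, exclude_already_covered=True):
    """Map each region label to its countries, in listing order."""
    covered = set()
    resolved = {}
    for label in labels:
        countries = MEMBERS[label]
        if exclude_already_covered:
            # Drop countries that an earlier region in the list already claimed.
            countries = [cc for cc in countries if cc not in covered]
        resolved[label] = countries
        covered.update(countries)
    return resolved


exclusive = resolve_regions(["g7", "europe_subset"])
overlapping = resolve_regions(["g7", "europe_subset"], exclude_already_covered=False)
```

With exclusion enabled, "europe_subset" resolves to ESP and POL only, because FRA, DEU and ITA were already claimed by "g7". With exclusion disabled it keeps all five countries, so those three would be counted in both regions.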

get_label()

Return string label of this region.

Return type:

str

to_country_list(scheme=GeoScheme.cc_iso3)

Return list of matching countries in given scheme.

Parameters:

scheme (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3]) – The scheme in which to return the countries.

Returns:

List of countries in scheme.

Return type:

list[str]

CountryScheme

A GeoScheme which can be used to define a single country.

alias of Literal[country_name, cc_iso2, cc_iso3]

class Country(label, scheme=None)

Bases: UserString

A country represented by ISO2 code, ISO3 code or name.

Parameters:
  • label (str) –

  • scheme (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3] | None) –

label: str

The country’s label according to scheme.

scheme: Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3] | None = None

The naming/classification scheme used.

to(scheme)

Convert to other scheme.

Parameters:

scheme (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3]) –

Return type:

Self

property data

Return the country’s label.
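The following sketch shows how a UserString-based country label with a to() conversion could work in principle. CountrySketch and the three-row _ROWS table are hypothetical stand-ins written for this example. How the real Country class performs the lookup is not documented here, and plain strings are used for the scheme names instead of GeoScheme members.

```python
from collections import UserString

# Illustrative three-row lookup table; a real table would cover all countries.
_ROWS = [
    {"country_name": "Germany", "cc_iso2": "DE", "cc_iso3": "DEU"},
    {"country_name": "France", "cc_iso2": "FR", "cc_iso3": "FRA"},
    {"country_name": "Japan", "cc_iso2": "JP", "cc_iso3": "JPN"},
]


class CountrySketch(UserString):
    """Hypothetical stand-in: a string-like country label with a scheme."""

    def __init__(self, label, scheme=None):
        super().__init__(label)
        self.scheme = scheme

    def to(self, scheme):
        """Return the same country expressed in another scheme."""
        for row in _ROWS:
            # With no source scheme given, match the label against any column.
            candidates = row.values() if self.scheme is None else [row[self.scheme]]
            if self.data in candidates:
                return CountrySketch(row[scheme], scheme)
        raise KeyError(f"unknown country label: {self.data!r}")


germany = CountrySketch("DE", "cc_iso2")
iso3 = germany.to("cc_iso3")
name = CountrySketch("JPN").to("country_name")
```

Because the class derives from UserString, iso3 compares equal to the plain string "DEU", and the usual string methods remain available on it.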

countries_to_scheme(countries, target=GeoScheme.cc_iso3, src=None)

Translate given series of country labels to scheme.

Parameters:
  • countries (Series) – Series of country labels.

  • target (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3]) – Target scheme to translate to.

  • src (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3] | None) – Source scheme to translate from.

Returns:

Series of translated country labels.

Return type:

Series

expand_geo_col_to_cc(df, geo_col, scheme=GeoScheme.country_name, cc_scheme=GeoScheme.cc_iso3)

Expand geo-regions present in geo_col to country codes.

Expand such that rows of df which map to multiple country codes are duplicated, one copy per code.

Parameters:
  • df (DataFrame) – The dataframe to expand.

  • geo_col (str) – The column containing geo-regions.

  • scheme (GeoScheme) – The scheme used to define the geo-regions.

  • cc_scheme (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3]) – The scheme to expand to.

Returns:

The expanded dataframe.

Return type:

DataFrame
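The row expansion described for expand_geo_col_to_cc can be sketched without pandas, using a list of dicts in place of a DataFrame. The REGION_TO_CC mapping, the expand_rows helper and the "cc" output column name are assumptions made for this example and are not taken from the module.

```python
# Hypothetical region-to-country mapping (ISO3 codes), for illustration only.
REGION_TO_CC = {
    "bric": ["BRA", "RUS", "IND", "CHN"],
    "Japan": ["JPN"],
}


def expand_rows(rows, geo_col, cc_col="cc"):
    """Emit one copy of each row per country code its region maps to."""
    expanded = []
    for row in rows:
        for cc in REGION_TO_CC.get(row[geo_col], []):
            # Copy the original row and attach a single country code.
            expanded.append({**row, cc_col: cc})
    return expanded


rows = [
    {"region": "bric", "value": 10},
    {"region": "Japan", "value": 3},
]
expanded = expand_rows(rows, "region")
```

Here the "bric" row becomes four rows and the "Japan" row stays a single row, giving five rows in total. On a real DataFrame a comparable result can be obtained with pandas' DataFrame.explode on a list-valued column, though whether the module does it that way is not stated.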

merge_geo_regions(df, geo_col, geo_regions, input_scheme=GeoScheme.country_name, pretty_labels=True)

Right-merge geo_regions onto df based on geo_col.

Merge such that rows which map to multiple regions are duplicated, one copy per region.

Parameters:
  • df (DataFrame) – The dataframe to merge into.

  • geo_col (str) – The column containing geo-regions.

  • geo_regions (Iterable[GeoRegion | str]) – The geo-regions to merge.

  • input_scheme (GeoScheme) – The scheme used to define the geo-regions.

  • rest_of_world – Whether to add a “Rest of World” region.

  • pretty_labels (bool) – Whether to use pretty labels for regions.

Returns:

The merged dataframe.

Return type:

DataFrame

match_to_geo_region(countries, geo_region, country_scheme=None)

Check whether countries are in given geo-region.

Parameters:
  • countries (Series) – Series of countries to check.

  • geo_region (GeoRegion) – The geo-region to check against.

  • country_scheme (Literal[GeoScheme.country_name, GeoScheme.cc_iso2, GeoScheme.cc_iso3] | None) – The scheme of the countries.

Returns:

Series of booleans indicating whether countries are in geo-region.

Return type:

Series
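The kind of result match_to_geo_region returns, one boolean per input country, can be illustrated with plain lists. The helper match_to_region and the hard-coded G7_ISO3 set are hypothetical. The documented function takes a Series and a GeoRegion instead.

```python
# G7 members as ISO3 codes, hard-coded for this illustration.
G7_ISO3 = frozenset({"CAN", "FRA", "DEU", "ITA", "JPN", "GBR", "USA"})


def match_to_region(countries, region_members):
    """Return one boolean per input label: True if it is in the region."""
    return [cc in region_members for cc in countries]


countries = ["USA", "BRA", "JPN", "IND"]
mask = match_to_region(countries, G7_ISO3)

# The mask can then be used to filter the original labels.
in_g7 = [cc for cc, hit in zip(countries, mask) if hit]
```

For these inputs the mask is [True, False, True, False], so filtering keeps USA and JPN. With pandas the same mask would typically be used for boolean indexing on the original Series or DataFrame.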

gen_flag_url(cc, width)

Get the URLs of small flag images for the given country codes.

Parameters:
  • cc (Series) – Series of country codes.

  • width (int) – The desired width of the flag.

Returns:

Series of flag image URLs.

Return type:

Series

gen_flag_img_tag(cc, width)

Generate HTML image tags with small flags for the given country codes.

Parameters:
  • cc (Series) – Series of country codes.

  • width (int) – The desired width of the flag.

Returns:

Series of HTML image tags.

Return type:

Series
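As a rough illustration of turning country codes into flag image tags, the sketch below builds the URL and the tag with the standard library only. The URL pattern is a placeholder on example.com, since the actual flag host and path layout used by gen_flag_url are not documented here. The helper names flag_url and flag_img_tag are likewise hypothetical.

```python
from html import escape

# Placeholder URL pattern: the real flag host and path layout are not
# documented here, so example.com is used purely for illustration.
FLAG_URL_TEMPLATE = "https://example.com/flags/w{width}/{cc}.png"


def flag_url(cc, width):
    """Build a flag image URL for one country code."""
    return FLAG_URL_TEMPLATE.format(width=width, cc=cc.lower())


def flag_img_tag(cc, width):
    """Wrap the flag URL in an HTML img tag, escaping attribute values."""
    url = escape(flag_url(cc, width), quote=True)
    return f'<img src="{url}" width="{width}" alt="{escape(cc)}">'


# Apply element-wise, as the documented functions do for a Series.
tags = [flag_img_tag(cc, 20) for cc in ["DE", "FR"]]
```

Escaping the attribute values keeps the generated markup well-formed even if a label contains quotes or angle brackets.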