Methodology and data notes

Overview

The Global Education Futures Readiness Index (GEFRI) benchmarks countries’ education futures readiness using globally comparable, openly available data. This methodology page summarises indicator selection, imputation, and scoring at a policy level. For deeper, code-level review, consult the GEFRI scoring technical appendix, which contains additional details on the indicators, dimensions, and the scoring engine.

How GEFRI works

  1. Retrieve data from the latest World Bank indicator updates.
  2. Clean and validate the data, flagging microstates (population < 300,000) so they do not anchor reference statistics.
  3. Impute missing values using a Region+Income → Region → Income → Global hierarchy, excluding microstates from every reference set.
  4. Apply indicator-specific transformations (log1p, articles per million, etc.) and min–max normalization on observed non-microstate values.
  5. Compute dimension scores with plausibility checks and FCS caps.
  6. Assign confidence levels from the share of directly observed indicators and apply penalties to low-confidence dimensions.
  7. Average the penalised dimension scores to produce the composite GEFRI score.

For the exact Python implementation of these steps, see the technical appendix.

Data sources

Indicator families

GEFRI tracks five readiness dimensions, each anchored by a curated set of World Bank and UNESCO indicators. This page summarises the role of each dimension; the complete indicator catalog (including codes and transformations) is published in the GEFRI scoring technical appendix.

  • Infrastructure: Power and connectivity indicators that enable digital and hybrid learning environments.
  • Human Capital: Investment, literacy, and participation measures that show how well learners are supported across the lifecycle.
  • School Access & Gender Parity: Enrollment, completion, and gender parity metrics that surface who is still being left out.
  • Innovation: R&D expenditure, research intensity, and technology export indicators that describe a country’s ability to create and scale new solutions.
  • Governance: World Governance Indicators that track policy effectiveness, regulatory quality, corruption control, and civic voice.

All inputs are sourced from openly available datasets; where multiple agencies contribute to an indicator series, attribution is provided within the appendix.

Indicator transformations & normalization bounds

GEFRI sets normalization limits for each indicator using a rolling 18-month window of observed minimum and maximum values across non-microstate countries. This approach stabilises month-to-month results while preserving the empirical nature of min-max scaling. Because many indicators—particularly research production and high-tech exports—have no natural upper limit, the global maximum may rise over time as new data enter the window. When this occurs, countries with stable raw values may receive a lower normalized score. This behaviour reflects GEFRI’s design as a relative, frontier-tracking index for open-ended indicators rather than an absolute scoring system. Imputed values adopt these bounds but never define them, and the window is seeded from historical archives to maintain continuity during transitions between annual and monthly updates.
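The frontier effect described above can be illustrated with a minimal sketch. It assumes the window is stored as one (minimum, maximum) pair per monthly snapshot and that the effective bounds are the widest range across those snapshots; the function names and sample figures are hypothetical and are not taken from the GEFRI scoring engine.

```python
def window_bounds(history):
    """Return the widest (min, max) range across the snapshots in the window.

    `history` is a list of (observed_min, observed_max) pairs, one per monthly
    snapshot of non-microstate values (an assumed storage format).
    """
    lows = [low for low, _ in history]
    highs = [high for _, high in history]
    return min(lows), max(highs)


def min_max_score(value, lower, upper):
    """Scale a raw value to 0-100 against the given bounds, clipping to range."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    clipped = min(max(value, lower), upper)
    return 100.0 * (clipped - lower) / (upper - lower)


# A country holds a raw value of 100 while the global maximum rises from 200 to 250.
before = min_max_score(100, *window_bounds([(0, 200)]))
after = min_max_score(100, *window_bounds([(0, 200), (0, 250)]))
print(before, after)  # 50.0 40.0
```

The raw value is unchanged in both calls; only the upper bound moves, which is why the normalized score falls.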

  • Linear indicators: Use direct min-max scaling; negative or impossible values are clipped to plausible ranges before scaling.
  • Log/derived indicators: Secure servers, researchers per million, journal articles, and high-tech exports use log1p transforms to dampen outliers before min-max scaling.
  • Per-million eligibility: Scientific articles per million is computed only for countries with populations ≥ 1 million. Smaller systems keep the indicator missing, which lowers their Innovation confidence rating.
  • Equity functions: Out-of-school, completion, and gender-parity indicators use purpose-built linear functions with fixed anchors (described below) rather than dataset-derived min and max.
  • Frontier-following behaviour: For indicators that expand over time, such as scientific output or secure servers, rising global maxima can widen the normalization range. Countries with stable values may therefore see lower normalized scores even though their underlying data have not changed. This reflects the relative nature of GEFRI’s scaling for indicators without fixed theoretical limits.
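As an illustration of the log/derived treatment listed above, the following sketch applies log1p before min-max scaling. The helper name and the sample values are hypothetical, and the bounds are taken directly from the supplied observations rather than from the rolling window, purely to keep the example short.

```python
import math


def log1p_min_max(value, observed):
    """Score `value` on 0-100 after a log1p transform.

    `observed` holds directly reported, non-microstate values that define
    the bounds. Negative inputs are clipped to 0 before the transform.
    """
    transformed = [math.log1p(max(v, 0.0)) for v in observed]
    low, high = min(transformed), max(transformed)
    if high <= low:
        raise ValueError("observed values must span a non-zero range")
    x = math.log1p(max(value, 0.0))
    x = min(max(x, low), high)  # imputed values follow the bounds, never extend them
    return 100.0 * (x - low) / (high - low)


observed = [0.0, 99.0, 9999.0]
score = log1p_min_max(99.0, observed)
linear = 100.0 * 99.0 / 9999.0
print(round(score, 1), round(linear, 1))  # 50.0 1.0
```

On a linear scale the middle value would score about 1 out of 100; after log1p it scores 50, which is the dampening of heavy-tailed indicators that the bullet describes.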

Code | Indicator | Dimension
EG.ELC.ACCS.ZS | Access to electricity (% of population) | Infrastructure
IT.NET.USER.ZS | Internet users (% of population) | Infrastructure
IT.NET.SECR.P6 | Secure internet servers (per 1 million people) | Infrastructure
IT.CEL.SETS.P2 | Mobile cellular subscriptions (per 100 people) | Infrastructure
SE.XPD.TOTL.GD.ZS | Government expenditure on education (% of GDP) | Human Capital
SE.ADT.LITR.ZS | Adult literacy rate (% age 15+) | Human Capital
SE.SEC.ENRR | School enrollment, secondary (% gross) | Human Capital
SE.TER.ENRR | School enrollment, tertiary (% gross) | Human Capital
SE.ENR.SECO.FM.ZS | Secondary GPI (Gross enrollment ratio, female/male) | School Access & Gender Parity
SE.ENR.TERT.FM.ZS | Tertiary GPI (Gross enrollment ratio, female/male) | School Access & Gender Parity
SE.PRM.UNER.ZS | Children out of school, primary (% of primary school age) | School Access & Gender Parity
SE.SEC.UNER.LO.ZS | Adolescents out of school, secondary (% of lower secondary school age) | School Access & Gender Parity
SE.SEC.CMPT.LO.FE.ZS | Lower secondary completion rate, female (% of relevant age group) | School Access & Gender Parity
SE.SEC.CMPT.LO.MA.ZS | Lower secondary completion rate, male (% of relevant age group) | Auxiliary
GB.XPD.RSDV.GD.ZS | R&D expenditure (% of GDP) | Innovation
SP.POP.SCIE.RD.P6 | Researchers in R&D (per million people) | Innovation
IP.JRN.ARTC.SC | Scientific and technical journal articles | Innovation
TX.VAL.TECH.CD | High-tech exports (current US$) | Innovation
SP.POP.TOTL | Population, total | Auxiliary
GE.EST | Government Effectiveness (WGI) | Governance
RQ.EST | Regulatory Quality (WGI) | Governance
CC.EST | Control of Corruption (WGI) | Governance
VA.EST | Voice and Accountability (WGI) | Governance

IP.JRN.ARTC.SC is ingested as the raw article count and converted to a per-million indicator for Innovation scoring when population ≥ 1 million. “Population, total” and the male completion rate are auxiliary inputs that support these derivations, imputation confidence, and dashboards; they do not feed directly into the composite GEFRI score. For transformation and imputation notes on each indicator, see the technical appendix.
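The per-million derivation can be sketched as follows. The function name is hypothetical; only the 1 million population threshold comes from the text above.

```python
def articles_per_million(article_count, population, min_population=1_000_000):
    """Convert a raw article count to articles per million residents.

    Returns None (indicator kept missing) when either input is absent or
    the population is below the eligibility threshold.
    """
    if article_count is None or population is None:
        return None
    if population < min_population:
        return None
    return article_count / (population / 1_000_000)


print(articles_per_million(5000, 10_000_000))  # 500.0
print(articles_per_million(50, 500_000))       # None
```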

Data imputation and confidence

GEFRI aims to maximize data comparability while remaining transparent about imputed values, that is, values estimated or filled using averages from comparable countries. When indicator data are missing for a country, values are imputed using a structured, four-stage process:

  1. Region + Income Group Average: If both region and income group are known and there is sufficient data, the average for countries matching both is used.
  2. Regional Average: If no value is available for the combined group, the average for all countries in the same region is used.
  3. Income Group Average: If regional data are missing, the average for the income group is used.
  4. Global Average: If none of the above are available, the global average is imputed.

Special Case: For Adult literacy rate in high-income countries with missing data, a value of 100% is imputed, with the imputation source recorded as Assumed (high income). This assumption reflects international norms and is applied unless other evidence is available.

All imputed values are clearly flagged, and the specific imputation method is recorded for each data point. For every country and indicator, the source (“Original” or “Imputed: [method]”) is displayed in the data tables. Microstates (countries with populations of less than 300,000) are not included in imputation calculations.
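A minimal sketch of the four-stage hierarchy for a single indicator follows. The record layout (dictionaries with region, income, population, and value keys), the source labels, and the reading of "sufficient data" as "at least one reference value" are assumptions made for illustration; the high-income literacy special case is omitted.

```python
from statistics import mean

MICROSTATE_POPULATION = 300_000  # threshold stated in the methodology


def impute(target, records):
    """Return (value, source) for `target` using the four-stage hierarchy."""
    if target["value"] is not None:
        return target["value"], "Original"

    # Reference set: observed values only, microstates excluded.
    reference = [
        r for r in records
        if r["value"] is not None and r["population"] >= MICROSTATE_POPULATION
    ]
    stages = [
        ("Region+Income", lambda r: r["region"] == target["region"]
                                    and r["income"] == target["income"]),
        ("Region", lambda r: r["region"] == target["region"]),
        ("Income", lambda r: r["income"] == target["income"]),
        ("Global", lambda r: True),
    ]
    for label, matches in stages:
        values = [r["value"] for r in reference if matches(r)]
        if values:  # assumption: one observed value counts as sufficient
            return mean(values), f"Imputed: {label}"
    return None, "Missing"


records = [
    {"region": "X", "income": "High", "population": 5_000_000, "value": 80.0},
    {"region": "X", "income": "Low", "population": 8_000_000, "value": 60.0},
    {"region": "Y", "income": "High", "population": 3_000_000, "value": 90.0},
    {"region": "X", "income": "Low", "population": 100_000, "value": 0.0},  # microstate
]
missing = {"region": "X", "income": "Middle", "population": 2_000_000, "value": None}
print(impute(missing, records))  # (70.0, 'Imputed: Region')
```

In the usage example no country matches both region X and the Middle income group, so the sketch falls back to the regional average of the two non-microstate records.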

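For each GEFRI dimension (Infrastructure, Human Capital, School Access & Gender Parity, Innovation, and Governance), a confidence rating is assigned based on how many underlying indicators were imputed. For a dimension scored on four indicators, these tiers coincide with the observed-share thresholds used in scoring (≥90%, 70–<90%, and <70% directly observed):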

  • High confidence: No imputed data — all indicators for the dimension are based on directly reported values.
  • Moderate confidence: Imputed indicators are present, but less than half of the dimension’s indicators are imputed.
  • Low confidence: Half or more of the dimension’s indicators are imputed.

The full list of imputed indicators, their imputation methods, and confidence ratings is displayed for each country profile. Users should interpret results with extra caution when confidence is low.

Special note: In rare cases where a specific indicator is unavailable for small-population countries (e.g., scientific articles per million for populations under 1 million), the relevant dimension is automatically rated as “Low confidence.”
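The rating rules in this section can be expressed as a small function. This is a sketch only: the function and parameter names are hypothetical, and `missing_eligible` stands in for the special case where an indicator is unavailable because of the population threshold.

```python
def confidence_rating(n_imputed, n_indicators, missing_eligible=False):
    """Rate a dimension as High, Moderate, or Low confidence.

    High: no indicators imputed.
    Moderate: some imputed, but fewer than half.
    Low: half or more imputed, or an indicator is unavailable by design.
    """
    if n_indicators <= 0:
        raise ValueError("a dimension needs at least one indicator")
    if missing_eligible:
        return "Low"
    if n_imputed == 0:
        return "High"
    if 2 * n_imputed < n_indicators:
        return "Moderate"
    return "Low"


print([confidence_rating(n, 4) for n in range(5)])
# ['High', 'Moderate', 'Low', 'Low', 'Low']
```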

Normalization and scoring

  • Min–max scaling: Most indicators are normalized to 0–1 (reported as 0–100) using the observed minimum and maximum captured in the rolling 18-month bounds history for non-microstate countries. Indicator values still come from the latest published datapoint within the seven-year fetch horizon, and imputed values follow these bounds without redefining them.
  • Outlier management: Reported values are clipped to plausible ranges before scaling (e.g., negative rates become 0). After scaling, all results are capped within 0–100 (or 10–100 for equity functions).
  • Log transforms: Secure internet servers, researchers per million, scientific articles per million, and high-tech exports apply log1p transforms prior to scaling to temper heavy-tailed distributions.
  • Population eligibility: Scientific articles per million apply only to countries with populations ≥ 1 million; smaller systems receive “Low” Innovation confidence.
  • Dimension scoring: Infrastructure, Human Capital, Innovation, and Governance scores are the arithmetic mean of their normalized indicators. School Access & Gender Parity uses the minimum of four banded sub-scores, with adjustments for plausibility (high-income anomalies), imputation penalties, and an FCS cap.
  • Confidence & penalties: Dimensions with <70% directly observed indicators (or missing an eligible metric) are labelled “Low” confidence and their scores are multiplied by 0.7. “Moderate” confidence (70–<90%) and “High” confidence (≥90%) are reported without penalty.
  • Composite GEFRI score: The composite is the mean of the (penalised) dimension scores, rescaled to 0–100.
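The scoring steps above can be sketched for the four mean-based dimensions as follows. Function names and sample values are hypothetical; the thresholds and the 0.7 multiplier are those stated in the list. School Access & Gender Parity follows its own rules and would enter the composite as an already adjusted score.

```python
from statistics import mean

LOW_CONFIDENCE_MULTIPLIER = 0.7


def confidence_from_observed_share(observed_share, missing_eligible=False):
    """Map the share of directly observed indicators (0-1) to a confidence tier."""
    if missing_eligible or observed_share < 0.70:
        return "Low"
    if observed_share < 0.90:
        return "Moderate"
    return "High"


def penalised_dimension(normalized_scores, observed_share, missing_eligible=False):
    """Return (score, confidence) for a mean-based dimension on a 0-100 scale."""
    raw = mean(normalized_scores)
    level = confidence_from_observed_share(observed_share, missing_eligible)
    score = raw * LOW_CONFIDENCE_MULTIPLIER if level == "Low" else raw
    return score, level


def composite(dimension_scores):
    """Average the penalised dimension scores into the composite."""
    return mean(dimension_scores)


score, level = penalised_dimension([80.0, 60.0, 40.0, 20.0], observed_share=0.5)
print(round(score, 1), level)                      # 35.0 Low
print(composite([35.0, 70.0, 60.0, 50.0, 85.0]))   # 60.0
```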

School Access & Gender Parity highlights

This dimension reflects how many children are in school, whether girls and boys reach lower secondary, and whether reported gaps are likely to reflect real inequities or data artefacts. It then makes room for fragile or conflict-affected realities so crises do not masquerade as readiness gains.

  • Indicators: Children out of school (primary), adolescents out of school (secondary), lower secondary completion (female), and secondary GPI. Each is converted to a 10–100 score via linear functions.
  • Plausibility adjustment: For high-income, non-FCS countries where three of the four inputs score ≥80 but one indicator falls ≤40, GEFRI substitutes the second-lowest value. This threshold catches known reporting artefacts (e.g., incomplete vocational pathways). A November 2025 sensitivity run shows that removing the rule would depress the School Access & Gender Parity score by 4–12 points across nine high-income systems without changing their relative rank elsewhere.
  • Imputation penalty: 5 points per imputed indicator (capped at 15) plus an additional 10-point deduction when three or more indicators are imputed (maximum penalty 25 points). Penalties apply before plausibility or FCS adjustments.
  • Fragility cap: Countries appearing on the World Bank FY2025 Fragile & Conflict-Affected Situations (FCS) list have their School Access & Gender Parity score capped at 40 after other adjustments. Removing the cap in sensitivity testing raises FCS scores by 8–20 points but does not move any country more than five places in the composite ranking.
  • Transparency: The full public-safe scoring engine, including formulas and a worked example, is available in the GEFRI scoring technical appendix.
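The adjustments listed above might combine as in the sketch below. It takes the four 10–100 sub-scores as given, because the fixed anchors of the linear functions are not stated on this page. How the imputation penalty interacts with the plausibility substitution is an assumption here: the penalty is computed first and then deducted from whichever base score is selected, with a floor of 0. The technical appendix remains the authoritative description.

```python
def imputation_penalty(n_imputed):
    """5 points per imputed indicator (capped at 15), plus 10 if three or more."""
    base = min(5 * n_imputed, 15)
    extra = 10 if n_imputed >= 3 else 0
    return base + extra


def equity_score(sub_scores, n_imputed, high_income, fcs):
    """Combine four sub-scores into a School Access & Gender Parity score."""
    penalty = imputation_penalty(n_imputed)
    ordered = sorted(sub_scores)
    base = ordered[0]  # default: the minimum sub-score

    # Plausibility check: three sub-scores >= 80 and one <= 40,
    # high-income and non-FCS only; use the second-lowest value instead.
    anomalous = ordered[0] <= 40 and all(s >= 80 for s in ordered[1:])
    if high_income and not fcs and anomalous:
        base = ordered[1]

    score = max(base - penalty, 0)  # assumed floor of 0
    if fcs:
        score = min(score, 40)      # fragility cap applies last
    return score


print([imputation_penalty(n) for n in range(5)])                  # [0, 5, 10, 25, 25]
print(equity_score([90, 85, 82, 35], 0, high_income=True, fcs=False))   # 82
print(equity_score([90, 85, 82, 70], 1, high_income=False, fcs=True))   # 40
```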

Design choices and parameter justification

  • Microstate threshold (population < 300,000): Very small systems often have discontinuous data series; excluding them from reference sets prevents volatile values from skewing regional and income-group averages.
  • Articles per million eligibility (population ≥ 1,000,000): The research publication metric becomes unstable in very small populations, so the per-million form is only used when there is a meaningful base of residents.
  • Equity imputation penalties: Deducting 5 points per imputed indicator (up to 15), plus 10 points when three or more equity inputs are estimated, highlights when the School Access & Gender Parity score leans heavily on synthetic data.
  • Equity plausibility check: Substituting the second-lowest equity sub-score in high-income, non-FCS contexts where three indicators are ≥80 but one is ≤40 guards against known reporting artefacts (e.g., vocational pathways missing from administrative counts).
  • Confidence tiers (<70%, 70–<90%, ≥90% observed): These breakpoints mirror widely used completeness thresholds and make the distinction between sparse, partial, and robust indicator coverage clear to the reader.
  • FCS cap at 40: The cap keeps the equity dimension from overstating readiness when countries identified as fragile and conflict-affected face structural service disruptions that raw indicators may not show.

Terminology note: The World Bank now publishes the classification as Fragile and Conflict-Affected Situations (FCS). GEFRI uses the FCS label in the interface, although legacy code and documentation may still reference the earlier FCV acronym.

Data limitations and use notes

GEFRI is based on internationally reported data and transparent processing, but several limitations remain. Each indicator uses the most recent available value, provided it was published within the past 7 years, so that outdated or unrepresentative figures are not used. Imputation introduces uncertainty, especially where many indicators are missing or are based on older data. Some indicators may still lag behind real-world changes, and reporting standards can vary between countries. The population filter on innovation metrics helps prevent distortion from very small populations, though small-sample issues may still occur.
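The seven-year horizon can be sketched as a simple lookup. The series layout (a year-to-value mapping) and the inclusive reading of the window, in which a value from exactly seven years before the reference year still qualifies, are assumptions.

```python
def latest_within_horizon(series, current_year, horizon_years=7):
    """Return (year, value) for the most recent observation inside the horizon.

    `series` maps year -> value, with None for unreported years.
    Returns None when nothing falls inside the window.
    """
    earliest = current_year - horizon_years
    candidates = [
        (year, value) for year, value in series.items()
        if value is not None and earliest <= year <= current_year
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])


series = {2015: 70.0, 2019: 75.0, 2021: None}
print(latest_within_horizon(series, current_year=2025))        # (2019, 75.0)
print(latest_within_horizon({2015: 70.0}, current_year=2025))  # None
```

An indicator that returns None at this step would then be a candidate for imputation.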

GEFRI includes a history of past annual scores, beginning in 2016, drawn from the same indicators that inform the current snapshot and pulled from the World Bank's indicators API. Because the World Bank updates its indicators as new data are received and validated, some historical values may change when GEFRI refreshes its datasets. To reflect these revisions, GEFRI recalculates historical scores for the previous four years each July. Each refresh re-runs the scoring pipeline with the same methodology and reuses the maintained 18-month normalization bounds history (seeded from previous releases) rather than recomputing minimum and maximum limits from scratch.

Historical scores therefore change only when underlying observations are revised or a new data point extends the rolling window. Where global maxima continue to rise, countries with unchanged values may record lower normalized scores, which is expected for indicators that lack a natural upper bound. Users who need time series that are unaffected by this relative scaling should consult the original World Bank indicators. Historical minimum and maximum ranges and shifts in indicator availability do not always align with the readiness lens used in GEFRI and should be interpreted with care.

The index should not be used for simple league tables or to draw definitive conclusions about individual countries, especially where confidence is low. GEFRI is best used as a starting point for inquiry and policy dialogue, supported by local expertise and contextual understanding.

  • Scores reflect the latest available data published by the reporting agency (within a 7-year window), not current-year events or rapid recent changes.
  • High shares of imputed data will reduce confidence in those scores.

For feedback or to report issues, contact us.

Changelog and updates

  • 2025-11-26: Scoring engine 1.1 introduces an 18-month rolling normalization window seeded from historical archives to reduce month-to-month score drift. Major UX/UI improvements, including addition of historical data.
  • 2025-05-27: [1] Microstates are no longer ranked and are no longer included in computations for imputed scores. This results in a new score set. [2] GEFRI applies a plausibility adjustment to the School Access and Gender Parity dimension for high-income, non-conflict countries. When three of four indicators are very high, but a single value is anomalously low, the score is based on the second-lowest indicator to reduce the risk of data artifacts distorting results (e.g., due to vocational tracking of students in secondary education that may not be reflected in reported data).
  • 2025-05-16: Initial public beta, test run of GEFRI Score and component scores for public comment.