Opportunity Data AI Exposure Index Methodology

Construction, data sources, and limitations. For the searchable tables, see the program index or occupation index.

What the Index Measures

The Opportunity Data AI Exposure Index measures how much an occupation's core work tasks overlap with current AI capabilities. It balances one exposure factor (Digital Work Intensity) against two protective factors (Human Contact Intensity and Physical Task Intensity).

A score of 0 means fully protected. A score of 1 means fully exposed. The index does not predict job loss. It measures task-level overlap with AI capabilities. Whether exposed occupations are augmented, restructured, or displaced depends on adoption speed, regulation, and organizational decisions the index cannot capture.

The Nine O*NET Variables

All variables are drawn from O*NET 23.1 (U.S. Department of Labor). O*NET surveys incumbent workers and occupation analysts to produce standardized descriptors for 966 occupations. Scores are self-reported survey responses, not modeled or imputed.

Variable | O*NET Element | Scale | Dimension | Direction
Interacting With Computers | 4.A.3.b.1 | IM (1-5) | Digital | Higher = more exposed
Analyzing Data or Information | 4.A.2.a.4 | IM (1-5) | Digital | Higher = more exposed
Processing Information | 4.A.2.a.2 | IM (1-5) | Digital | Higher = more exposed
Assisting and Caring for Others | 4.A.4.a.5 | IM (1-5) | Contact | Higher = more protected
Performing for or Working Directly with the Public | 4.A.4.a.8 | IM (1-5) | Contact | Higher = more protected
Contact With Others | 4.C.1.a.2.l | CX (1-5) | Contact | Higher = more protected
Spend Time Using Hands | 4.C.2.d.1.g | CX (1-5) | Physical | Higher = more protected
Spend Time Sitting | 4.C.2.d.1.a | CX (1-5) | Physical | Inverted: more sitting = less protected
Responsible for Others' Health and Safety | 4.C.1.c.1 | CX (1-5) | Physical | Higher = more protected

Why "Responsible for Others' Health and Safety" is in the Physical dimension: This variable captures the requirement for physical presence. If someone's safety depends on you being there (a nurse, a firefighter, a pilot, a lifeguard), the work cannot be performed remotely or delegated to software. It complements the hands-on and posture variables by adding a safety-stakes signal that pure manual-work measures miss.

Three Dimensions

Digital Work Intensity

How computer- and data-intensive is the core work? Average of three normalized variables: Interacting With Computers, Analyzing Data or Information, Processing Information.

Human Contact Intensity

Does the job require empathy, direct care, or constant interpersonal contact? Average of three normalized variables: Assisting and Caring for Others, Working Directly with the Public, Contact With Others.

Physical Task Intensity

Is the work grounded in physical presence, manual skill, or safety responsibility? Average of three normalized variables: Spend Time Using Hands, Spend Time Sitting (inverted), Responsible for Others' Health and Safety.

Each dimension ranges from 0 to 1 after normalization. Digital intensity drives exposure up. Human contact and physical task intensity drive it down.

Construction: From Raw Scores to Composite Index

1. Extract and normalize. For each of the 966 O*NET occupations, extract the raw score (1-5) for all 9 variables. Min-max normalize each variable to [0, 1] across all 966 occupations. Invert "Spend Time Sitting" so that higher values mean less sitting. Occupations missing any variable are excluded, leaving 966 scored occupations.
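The normalization and inversion in this step can be sketched as follows. This is a minimal illustration, not the repository's script: the function name, the handling of a constant column, and the sample scores are all assumptions.

```python
def min_max_normalize(values):
    """Rescale raw scores to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a variable with no spread carries no signal, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw "Spend Time Sitting" scores (1-5 scale) for four occupations.
raw_sitting = [1.5, 2.0, 3.5, 4.5]
sitting = min_max_normalize(raw_sitting)

# Inversion: after this step, higher values mean less sitting.
less_sitting = [1.0 - s for s in sitting]
```

In the real index the minimum and maximum would be taken across all 966 occupations for each variable, so a variable's normalized value depends on the full set of occupations, not on the theoretical 1-5 scale.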

2. Compute dimension scores. Each dimension is the simple average of its constituent normalized variables.

Digital      = mean( computers, analyzing, processing )
Contact     = mean( caring, public, contact_with_others )
Physical    = mean( hands, 1 − sitting, responsible )
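
As a rough sketch of the three averages above, the snippet below computes dimension scores for one occupation. The dictionary keys mirror the short names in the formulas; the input values are hypothetical, and the function name is an assumption rather than something taken from the repository.

```python
from statistics import mean

# Hypothetical normalized values (already in [0, 1]) for a single occupation.
occupation = {
    "computers": 0.9, "analyzing": 0.8, "processing": 0.7,
    "caring": 0.2, "public": 0.3, "contact_with_others": 0.4,
    "hands": 0.1, "sitting": 0.8, "responsible": 0.3,
}


def dimension_scores(v):
    """Return the Digital, Contact, and Physical scores as simple averages."""
    return {
        "digital": mean([v["computers"], v["analyzing"], v["processing"]]),
        "contact": mean([v["caring"], v["public"], v["contact_with_others"]]),
        # "sitting" is inverted so that more sitting lowers the Physical score.
        "physical": mean([v["hands"], 1 - v["sitting"], v["responsible"]]),
    }


dims = dimension_scores(occupation)
```

For these made-up inputs the scores come out at roughly 0.8 (Digital), 0.3 (Contact), and 0.2 (Physical).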

3. Compute the composite. One exposure factor minus two protective factors, rescaled to [0, 1].

AI Exposure Index = ( Digital − Contact − Physical + 2 ) / 3

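The raw difference Digital − Contact − Physical ranges from −2 (no digital work, maximum contact and physical intensity) to 1 (maximum digital work, no contact or physical intensity). Adding 2 and dividing by 3 rescales that range to [0, 1]. All three dimensions receive equal weight (1:1:1).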
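
The composite formula can be checked at its extremes with a short sketch. The function name and the sample dimension scores are illustrative assumptions; only the formula itself comes from the text above.

```python
def ai_exposure_index(digital, contact, physical):
    """Composite index: (Digital - Contact - Physical + 2) / 3."""
    return (digital - contact - physical + 2) / 3


# Extremes of the scale.
fully_exposed = ai_exposure_index(1.0, 0.0, 0.0)    # all digital, no protection
fully_protected = ai_exposure_index(0.0, 1.0, 1.0)  # no digital, full protection

# Hypothetical occupation with Digital 0.8, Contact 0.3, Physical 0.2.
example = ai_exposure_index(0.8, 0.3, 0.2)
```

The two extreme cases evaluate to 1.0 and 0.0, and the hypothetical occupation lands at about 0.77.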

4. Assign tiers.

Tier | Score Range | Occupations | U.S. Workers
AI-protected | Below 0.45 | 356 | 61.7 million
Moderate | 0.45 - 0.60 | 230 | 35.3 million
AI-exposed | 0.60 - 0.70 | 134 | 19.2 million
Highly exposed | Above 0.70 | 52 | 6.4 million
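
A tier lookup along these lines might look like the sketch below. The table does not say which tier a score exactly on a boundary belongs to, so the boundary handling here is an assumption: "Below 0.45" and "Above 0.70" are read literally as strict inequalities, and a score of exactly 0.60 is placed in AI-exposed.

```python
def assign_tier(score):
    """Map a composite score in [0, 1] to a tier label.

    Boundary handling is an assumption, not taken from the source:
    0.45 -> Moderate, 0.60 -> AI-exposed, 0.70 -> AI-exposed.
    """
    if score < 0.45:
        return "AI-protected"
    if score < 0.60:
        return "Moderate"
    if score <= 0.70:
        return "AI-exposed"
    return "Highly exposed"


# Hypothetical composite scores.
tiers = [assign_tier(s) for s in (0.30, 0.45, 0.65, 0.80)]
```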

Occupation-Level Index


772 occupations covering 122.6 million workers

The occupation index scores each Standard Occupational Classification (SOC) code directly from O*NET variables and pairs it with national employment estimates from the BLS Occupational Employment and Wage Statistics (May 2024). Of the 966 scored O*NET occupations, 772 map to unique 6-digit BLS codes. The remaining 194 are O*NET-specific subdivisions (e.g., "Data Warehousing Specialists") that map to broader BLS categories.

Each occupation receives scores on all three dimensions plus the composite index, alongside its BLS employment count. This is the foundation layer: every program-level score traces back to these occupation scores.
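
To illustrate how O*NET occupations relate to 6-digit BLS codes, the sketch below groups O*NET-SOC codes by their 6-digit SOC prefix. O*NET-SOC codes have the form XX-XXXX.YY, where a .00 suffix marks the base occupation and other suffixes mark O*NET-specific subdivisions. The codes listed are illustrative, and how the index treats subdivision scores when mapping to BLS codes is not specified in this document, so the snippet only shows the grouping step.

```python
from collections import defaultdict


def to_soc6(onet_soc_code):
    """Drop the O*NET suffix: 'XX-XXXX.YY' becomes 'XX-XXXX'."""
    return onet_soc_code.split(".")[0]


# Illustrative codes in O*NET-SOC format (assumed for this example).
onet_codes = ["11-1011.00", "11-1011.03", "29-1141.00", "29-1141.01"]

# Group O*NET-SOC codes under their shared 6-digit SOC code.
by_soc6 = defaultdict(list)
for code in onet_codes:
    by_soc6[to_soc6(code)].append(code)
```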


Academic Program-Level Index


1,786 academic programs across the full federal CIP taxonomy

The program index maps each Classification of Instructional Programs (CIP) code to occupations via the NCES CIP2020-SOC2018 Crosswalk. For each program, the linked occupations' AI Exposure scores are averaged using BLS national employment as weights. Programs whose linked occupations all have zero BLS employment receive an unweighted average.

Employment weighting

Employment weighting ensures that a program's score reflects the occupations graduates actually enter, not just the full list of possible jobs. Larger occupations contribute proportionally more to the program score.

Example: CIP 51.3801 (Registered Nursing) links to SOC codes for Registered Nurses (3.2M employed), Nurse Practitioners (355K), Nurse Anesthetists (45K), and others. The program's score is the employment-weighted average of those occupations' individual scores, so the 3.2 million Registered Nurses dominate the program's composite.
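
The employment-weighted average, including the unweighted fallback for programs whose linked occupations all have zero BLS employment, can be sketched as below. The employment figures are the ones quoted in the nursing example; the occupation scores are hypothetical placeholders, and the function name is an assumption.

```python
def program_score(linked):
    """Employment-weighted mean of occupation scores.

    linked: list of (score, employment) pairs for the occupations
    a program maps to. Falls back to an unweighted mean when total
    employment is zero. Returns None for a program with no links.
    """
    if not linked:
        return None
    total_employment = sum(emp for _, emp in linked)
    if total_employment == 0:
        return sum(score for score, _ in linked) / len(linked)
    return sum(score * emp for score, emp in linked) / total_employment


# Employment from the example above; scores are hypothetical.
nursing = [
    (0.30, 3_200_000),  # Registered Nurses
    (0.35, 355_000),    # Nurse Practitioners
    (0.40, 45_000),     # Nurse Anesthetists
]
weighted = program_score(nursing)

# Fallback case: every linked occupation has zero BLS employment.
unweighted = program_score([(0.20, 0), (0.40, 0)])
```

With these placeholder scores the weighted result is about 0.306, close to the Registered Nurses score of 0.30, which shows how the largest occupation dominates the program composite.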

Data Sources

Source | Version | What We Use
O*NET Database (U.S. Department of Labor) | 23.1 | 9 work characteristic variables (5 Work Activities, 4 Work Context) for 966 occupations. Self-reported survey data from incumbent workers.
BLS OES (Bureau of Labor Statistics) | May 2024 | National employment estimates for 831 detailed occupations. Used as weights for program-level aggregation and displayed alongside occupation scores.
NCES CIP-SOC Crosswalk (National Center for Education Statistics) | 2020 CIP / 2018 SOC | Maps academic programs (CIP codes) to occupations (SOC codes). Used to aggregate occupation-level scores to the program level.

Limitations and Scope

What the index does

What it does not do

Known data limitations

Reproducibility

All source data, construction scripts, and intermediate outputs are available in the GitHub repository.

File | Description
onet_raw/db_23_1_text/Work Activities.txt | Source: O*NET Work Activities (5 variables)
onet_raw/db_23_1_text/Work Context.txt | Source: O*NET Work Context (4 variables)
national_M2024_dl.xlsx | Source: BLS OES May 2024 national employment
CIP2020_SOC2018_Crosswalk.xlsx | Source: NCES CIP-SOC crosswalk
dii_v2_by_soc.csv | Intermediate: 966 occupations with raw + normalized scores
soc_dii_v2_employment.csv | Output: 772 occupations with dimensions, composite, employment
all_cip_dii_v2.csv | Output: 1,786 programs with dimensions and composite
dii_v2_variables.csv | Reference: 9-row variable table with element IDs and dimensions

Citation

Suggested citation:
Rowe, B. (2026). Opportunity Data AI Exposure Index. Opportunity Data. opportunitydata.org/ai-exposure-methodology