Name: Mapping the Structural Divide: Institutional Resilience, Post-College Market Position, and AI Exposure Across U.S. Higher Education
Creator: Kyle Saunders
Published: 2026-03-22
License: https://creativecommons.org/licenses/by/4.0/

Where does your institution stand?

U.S. four-year colleges and universities face compounding pressures — demographic decline, fiscal stress, and artificial intelligence — that will reshape the sector over the next decade. This project maps where 1,556 institutions are structurally positioned across two dimensions, using federal data anyone can verify.

X-axis: Institutional Resilience
Can this institution absorb financial and enrollment shocks?
Endowment per student · Revenue diversification · Enrollment trend · Admissions selectivity

Y-axis: Post-College Market Position
How well does this institution position graduates for the labor market ahead?
Completion rate · Earnings-to-debt ratio · AI exposure (inverted) · Demographic trajectory

Institutions above the median on both axes are High Capacity; below both, High Stress. The other two quadrants — Market Misaligned and Structurally Exposed — capture institutions with mixed structural positions. Institutions scoring higher are better-positioned on the indicators I track — but the indicators I track are analytical choices, and institutions serve missions that no set of quantitative measures can fully represent.

A few patterns from the data: most R1 research universities fall in High Capacity. Master's and RCU institutions distribute across all four quadrants; these are the tiers where institutional choices matter most. Carnegie classification alone explains less than 30% of the variance in combined scores, which is why a framework like this adds information beyond what tier labels provide.

Data: All measures are derived from IPEDS, College Scorecard, O*NET, WICHE, and the Anthropic Economic Index. The sample includes 1,556 of 1,609 four-year institutions classified under the 2025 Carnegie system with sufficient data. For methodology, see the Methods tab or the working paper.

The Whole 2×2: See all 1,556 institutions on the resilience × market position map.
AI Exposure: Explore which degree fields face the highest entry-level AI task exposure.
Factor View: See how the data's own structure groups institutions across three latent dimensions.

Position on the 2×2

Resilience Components (Percentile)

Each bar shows where this institution ranks relative to all 1,556 mapped institutions. E.g., "72nd" means it scores higher than 72% of all institutions on that measure.

Market Position Components (Percentile)

The Institutional Landscape

1,556 institutions positioned by Institutional Resilience (x-axis) and Post-College Market Position (y-axis). Dashed lines mark the sample median on each axis. Institutions in the upper right (High Capacity) score above the median on both dimensions; those in the lower left (High Stress) score below both.

This is a strategic classification heuristic, not a ranking or a predictive model. The axes are equal-weight composites of publicly available indicators. Institutions scoring higher are better-positioned on the measures I track — but the measures are analytical choices, and institutions serve missions beyond what any set of quantitative indicators captures. Component-level scores tell a richer story than the quadrant label alone. Note: 53 of 1,609 eligible institutions lack sufficient data for both scores and are not shown — these are disproportionately smaller private and Baccalaureate Diverse schools, meaning the map likely underrepresents the most structurally stressed institutions.

Quadrants shown on the map: High Capacity · Structurally Exposed · Market Misaligned · High Stress

How the Tiers Separate: Combined Score Distribution by Carnegie Classification

The chart below shows the distribution of combined scores (Resilience + Market Position) within each Carnegie tier. The combined score is used here for distributional comparison only — it is not a ranking, and the framework deliberately avoids collapsing two distinct dimensions into a single number for institutional assessment. The purpose is to show where tiers overlap and separate: a Master's L/M institution can score as well as an R2, or as poorly as a Baccalaureate. Where they separate, the tiers represent genuinely distinct structural positions. The new RCU (Research Colleges and Universities) tier captures 182 institutions with meaningful research activity that previously had no research recognition.

What Does This Tell You?

The tiers are real, but they're not destiny. R1 research universities cluster in the upper right: 82% land in the High Capacity quadrant. The Master's tiers show the opposite pattern: 46% of Master's L/M and 43% of Master's S institutions fall in High Stress. Baccalaureate institutions are the most concentrated in High Stress (50%). Carnegie classification alone explains about 25% of the variance in combined scores, and a Carnegie-modal baseline (assigning every institution its tier's most common quadrant) achieves only 45% quadrant accuracy, so tier alone misplaces more than half of institutions. Two institutions in the same tier can occupy opposite quadrants depending on their endowment, completion rate, program mix, enrollment trajectory, and regional context.
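
As an illustration of what a Carnegie-modal baseline means, the sketch below predicts every institution's quadrant from its tier's most common quadrant and reports the share predicted correctly. The function name and the six sample rows are hypothetical, chosen only to show the mechanics; they are not drawn from the dataset.

```python
from collections import Counter, defaultdict

def modal_baseline_accuracy(rows):
    """rows: list of (tier, quadrant). Predict each tier's most common quadrant."""
    by_tier = defaultdict(Counter)
    for tier, quadrant in rows:
        by_tier[tier][quadrant] += 1
    # Within a tier, the modal prediction is right exactly as often as the mode occurs.
    correct = sum(counts.most_common(1)[0][1] for counts in by_tier.values())
    return correct / len(rows)

# Hypothetical rows (illustrative only, not real institutions).
rows = [
    ("R1", "High Capacity"),
    ("R1", "High Capacity"),
    ("R1", "Market Misaligned"),
    ("Master's S", "High Stress"),
    ("Master's S", "High Capacity"),
    ("Master's S", "Structurally Exposed"),
]
accuracy = modal_baseline_accuracy(rows)
print(accuracy)  # 0.5 for these six illustrative rows
```

A tier whose institutions spread evenly across quadrants contributes little to accuracy, which is the situation the Master's tiers approximate.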

37% of institutions are simultaneously shrinking and structurally stressed. These 581 institutions — disproportionately in the Midwest and Northeast, concentrated in the Baccalaureate and Master's S tiers — are losing enrollment from a position of financial weakness. Many serve as anchor institutions in their communities, providing both educational access and local economic activity. The framework cannot predict which will close, but it identifies the population where compound pressures are most intense. About 38% (584) are both growing and above the median on resilience.

The quadrant labels are shorthand, not sentences. For any individual institution, the component-level scores — visible in the institution detail view — tell a richer story than the quadrant label alone. The Methods tab details how these assignments hold up under alternative specifications. Use the "Find Your Institution" tab to see the full breakdown.

AI Exposure: Two Timelines of Disruption

EXPLORATORY MEASURE — This is a first-generation attempt to quantify institutional AI exposure. External validation against actual labor market outcomes is not yet possible; I present it as a structured starting point for a conversation the sector needs to have.

What this measure captures — and what it can't

Note: AI exposure (inverted) already feeds into the market position Y-axis as one of four equally-weighted components. What follows on this tab is a deeper, field-level analysis: an exploratory extension that compares the measure against real-world adoption data and examines its relationship to earnings. These are related but distinct uses of the same underlying measure.

The AI exposure component asks a specific question: to what degree do this institution's graduates enter career pathways where entry-level tasks could be performed or assisted by current AI systems? It is not measuring whether a school uses AI, how "techy" its programs are, or how vulnerable any individual graduate is. It is measuring task-level overlap between AI capabilities and the occupations that graduates of specific degree programs typically enter.

The pipeline works in five steps, each of which introduces assumptions:

1. Task classification. I classify 41 standardized O*NET work activities as AI-positive (routine cognitive tasks: processing information, analyzing data, working with computers), AI-negative (physical, interpersonal, and embodied tasks: handling objects, caring for others, operating machinery), or neutral. This is a judgment call, and reasonable people could draw the line differently.

2. Occupation scoring. For each of 894 O*NET occupations, I compute a blended AI exposure score from two approaches: work activity importance weighting and task-level classification of Detailed Work Activities (DWAs). The two methods correlate at ρ = 0.835.

3. Entry-level weighting. I weight by O*NET Job Zone because AI disruption disproportionately affects entry-level work. Zones 2–3 (typical college-graduate entry points) receive full weight; Zone 4 receives half; Zones 1 and 5 receive minimal weight (0.2). This is consequential: Statisticians drop from 1.0 raw exposure to 0.2 entry-weighted.

4. Degree-to-occupation crosswalk. I map degree fields to occupations via the NCES CIP-SOC crosswalk, then aggregate across each occupation's linked tasks. This step assumes that the occupational distribution implied by a degree field actually reflects where graduates end up — a reasonable but imperfect assumption.

5. Institutional aggregation. Each institution's score is the enrollment-weighted average across its degree programs. This smooths out field-level variation: most institutions cluster in a narrow band (standard deviation = 0.030 on a 0.27–0.55 scale) because program diversification washes out the signal. The measure discriminates most at the field level, not the institutional level — which is why I show the field-level heatmap alongside the institutional score.
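
Steps 3 and 5 above can be sketched in a few lines. This is a minimal illustration, assuming the Job Zone weight is applied multiplicatively (consistent with the Statisticians example, treated here as a Zone 5 occupation) and that program enrollment is the aggregation weight. The function names and the two-program institution are hypothetical.

```python
# Job Zone weights from step 3: Zones 2-3 full, Zone 4 half, Zones 1 and 5 minimal.
ZONE_WEIGHT = {1: 0.2, 2: 1.0, 3: 1.0, 4: 0.5, 5: 0.2}

def entry_weighted(raw_exposure, job_zone):
    """Scale an occupation's raw exposure by its Job Zone weight (assumed multiplicative)."""
    return raw_exposure * ZONE_WEIGHT[job_zone]

def institution_exposure(programs):
    """Enrollment-weighted mean exposure; programs is a list of (field_exposure, enrollment)."""
    total_enrollment = sum(n for _, n in programs)
    return sum(e * n for e, n in programs) / total_enrollment

# Statisticians: 1.0 raw exposure, assumed Job Zone 5.
statistician = entry_weighted(1.0, 5)

# Hypothetical institution with two programs (illustrative values only).
score = institution_exposure([(0.50, 300), (0.30, 100)])
print(statistician, round(score, 2))
```

Because the larger program dominates the average, a single highly exposed field moves the institutional score only in proportion to its enrollment share, which is why diversified institutions cluster in a narrow band.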

The fundamental limitation is that this entire pipeline uses retrospective occupational data — O*NET describes jobs as they exist today — to make prospective claims about what AI might change. The approach uses the rearview mirror to look forward. A high exposure score means that AI is theoretically capable of performing tasks central to an occupation's entry-level work. It does not tell you whether the market response will be displacement (AI replacing workers), augmentation (AI enhancing productivity), or restructuring (AI reorganizing the occupation entirely). These are distinct labor market dynamics with very different implications for graduates, and I cannot distinguish between them with current data.

The gap between theory and reality

This is where the Anthropic Economic Index comes in. Anthropic recently published observed real-world AI adoption rates by occupation, based on actual conversations with Claude. This provides something genuinely new: a way to compare what AI could theoretically do (the O*NET-derived measure) with what AI is actually doing (Anthropic's observed usage data).

The field-level correlation between theoretical task exposure and observed AI adoption is approximately zero (ρ ≈ −0.09). The fields where AI is most actively used today — computer science, education, arts — are not the same fields where the task structure is most susceptible to entry-level automation — business administration, engineering technology, legal support. The scatter plot below shows this gap for each degree field.
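
For readers who want to reproduce a field-level comparison like this one, here is a minimal rank-correlation sketch in plain Python. It assumes ρ denotes Spearman's rank correlation (the Methods tab uses Spearman ρ for its sensitivity checks) and uses average ranks for ties. The function names and toy inputs are hypothetical and are not the field-level data.

```python
def ranks(xs):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def spearman(a, b):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(a), ranks(b))

# Toy values for four hypothetical fields (illustrative only).
theoretical = [0.2, 0.4, 0.6, 0.8]
observed = [0.3, 0.1, 0.9, 0.5]
rho = spearman(theoretical, observed)
print(round(rho, 2))  # 0.6
```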

This divergence is consistent with a substantive interpretation — that what AI can theoretically automate and what AI is currently automating are largely independent, implying two distinct exposure timelines. However, alternative interpretations (measurement artifacts, vendor-specific bias in single-platform adoption data, uneven regulatory constraints) are comparably plausible. The data cannot empirically distinguish among them. The measure belongs in the framework as a forward-looking signal, but its predictive validity is genuinely unknown.

Theoretical Exposure vs. Observed Adoption by Degree Field

AI Entry-Level Exposure by Degree Field (Ranked)

Institution-Level AI Exposure: Theoretical vs. Observed

The charts above show AI exposure by degree field. This chart asks a different question: where does your institution fall? Each dot is a single institution. Its horizontal position reflects the weighted average theoretical AI exposure of its graduates' career pathways (based on what programs it offers). Its vertical position reflects the weighted average observed AI adoption in those same career pathways (based on how much AI is actually being used).

The four corners tell different stories. Upper right: graduates enter fields where AI is both structurally capable of performing entry-level tasks and already actively in use — these institutions' program mix puts graduates in the thick of the AI transition right now. Lower right: graduates enter fields with high theoretical exposure but low current adoption — the "latent vulnerability" zone, where the task structure is ready for AI but the tools haven't arrived yet. Upper left: graduates are in fields with high AI adoption but lower theoretical task susceptibility, suggesting augmentation (AI as a tool) rather than substitution. Lower left: graduates enter physical, interpersonal, or hands-on fields where AI has limited relevance on either dimension.

Most institutions cluster in a narrow band because program diversification smooths out field-level differences. The institutions at the edges are the ones with concentrated program mixes — heavily STEM, heavily business, or heavily health/education — where the field-level AI signal comes through at the institutional level.

What Does This Tell You?

What AI can do and what AI is being used for are almost entirely unrelated — for now. The near-zero correlation (ρ ≈ −0.09) between theoretical task exposure and observed adoption means that knowing how structurally susceptible a field's entry-level tasks are to AI tells you almost nothing about how much AI is actually being used in that field today. Computer science and education show high observed adoption; business administration and engineering technology show high theoretical exposure. These are different lists.

Higher-earning fields tend to be more exposed, and this is not a coincidence. In the PSEO comparison, field-level AI exposure and median earnings are positively correlated (ρ = 0.257). The fields that command the highest graduate earnings (engineering, computer science, business, finance) involve exactly the kind of premium cognitive work (analyzing data, processing information, working with complex systems) that AI is structurally capable of performing. This creates a paradox for institutions: the programs that currently deliver the best return on investment for students are the same programs whose graduates face the most structural AI task exposure. Expanding these programs improves an institution's market position score today but may increase its exposure to disruption tomorrow.

Alternate View: What the Data's Own Structure Reveals

The primary framework imposes two theoretically-motivated axes — Resilience and Post-College Market Position — with equal weighting across components. But what happens when the data speaks for itself?

Factor analysis — tested across multiple specifications (varimax, promax, and oblimin rotations; two and three factors) — reveals a consistent three-factor structure: Credential Outcomes (completion rate, earnings-to-debt ratio), Institutional Character (endowment, revenue diversification, AI exposure), and Demand Environment (enrollment trajectory, regional demographics). The Kaiser criterion supports three factors across all specifications. The two-axis equal-weight model deliberately compresses three empirical dimensions into two composite axes, with the third dimension captured by the demand environment overlay.

Critically, the forward-looking indicators the framework deliberately elevates — enrollment trend and demographic trajectory — have uniqueness exceeding 0.87 in every two-factor solution. The historical covariance structure treats them as noise. They emerge as a distinct third dimension, "Demand Environment," only when three factors are extracted. This is the framework's most consequential bet: elevating indicators whose effects have not yet registered in the backward-looking data.

The scatter plot below positions institutions on the two-factor solution. Dot color reflects the third factor (Demand Environment) — warmer colors indicate institutions with stronger demand indicators (growing enrollment, favorable demographics), cooler colors indicate weaker demand environment.

Factor Loadings: How Components Group Together

The chart below shows which components tend to go together across institutions. Each row is one of the eight components in the framework. The colored bars show how strongly each component is associated with each factor — longer bars mean a stronger association. When two components have long bars on the same factor (same color), they tend to rise and fall together: institutions that score high on one tend to score high on the other.

Green (Factor 1: Credential Outcomes) groups completion rate and earnings-to-debt ratio — the components measuring whether institutions deliver credentials that pay off. Blue (Factor 2: Institutional Character) groups endowment and AI exposure on one side, with revenue diversification on the other — capturing institutional capacity and structural character. Orange (Factor 3: Demand Environment) groups enrollment trend and demographic trajectory — the external demand pressures that don't correlate with either outcomes or institutional resources. Faded bars indicate weak associations (below 0.35).

Equal-Weight vs. Factor-Implied Component Weights

Weights normalized across all 8 components.

Robustness Across Specifications

Results are robust to extraction method (PCA, iterated principal factors), rotation (varimax, oblique promax), and variable scaling (raw standardized, percentile rank) on the core findings. What is stable across all specifications: earnings-to-debt ratio and completion rate anchor Factor 1 (loadings 0.85 and 0.83); forward-looking indicators are invisible in two-factor solutions (uniqueness > 0.87) and anchor Factor 3 when three factors are extracted; AI exposure loads on Factor 2, never with earnings outcomes; oblique rotation shows only weak factor correlations (|r| = 0.10–0.22). What varies: the specific composition of Factor 2 (endowment groups with outcomes in raw data but separates in ranked data) and where completion loads most heavily. Full results across all specifications are in the supplementary materials.

What Does This Tell You?

The factor analysis reveals something that matters for how you read every other tab on this site.

There is one dominant dimension in American higher education, and everyone already knows what it is. Call it institutional hierarchy: the gradient from well-resourced, selective institutions with strong graduate outcomes to tuition-dependent, open-access institutions with weaker outcomes. Completion rate, earnings, endowment, and selectivity all load together because they are, in practice, measuring overlapping aspects of the same underlying reality. This is what Carnegie classification, U.S. News rankings, and common intuition approximate. The factor analysis confirms it formally.

The interesting question is what's independent of that hierarchy. AI exposure, demographic trajectory, and enrollment trend do not correlate with institutional hierarchy — statistically, they are nearly orthogonal to it. A prestigious R1 university in the Northeast and a struggling regional master's institution in the same state may score very differently on the hierarchy dimension but face similar demographic headwinds. A well-endowed liberal arts college and a tuition-dependent business school may occupy opposite ends of the prestige spectrum but share similar AI exposure profiles because of their program mix.

This is where the framework adds something new. Everyone already knows where institutions stand relative to each other on the hierarchy. What the framework surfaces are the contextual pressures that cut across that hierarchy — pressures that can't be predicted from an institution's prestige, wealth, or outcome quality alone. The equal-weight composites deliberately elevate these independent dimensions because they represent emerging conditions whose effects haven't fully materialized in the backward-looking data. The factor analysis can't validate this bet — only time can. But it can confirm that the bet is real: the framework is deliberately weighting information the data's own structure treats as noise.

Methods

1. Scope & Data

1.1 Institutional Universe

Institutions are included if they are four-year, degree-granting, public or private nonprofit, and currently operating. The analysis sample of 1,609 institutions is drawn from the 2021 IPEDS universe and classified using the 2025 Carnegie Classification system (Institutional Classification and Research Activity Designation, as incorporated in IPEDS HD2024). The 9-tier grouping uses Research Activity Designation (R1/R2/RCU) as the primary axis for research institutions and Award Level Focus for non-research institutions. Under the 2025 system, some institutions previously classified as Baccalaureate are now classified as Special Focus based on academic program concentration — these remain in the analysis sample. Of the 1,609 institutions, 1,556 have sufficient data to compute both composite scores. The 53 institutions lacking both scores are disproportionately smaller private institutions — precisely the population most likely to be structurally stressed. Results should be interpreted with this coverage gap in mind.

1.2 Data Sources

IPEDS institutional characteristics, finance, enrollment, and completions (2024) · College Scorecard institution-level outcomes (most recent cohort) · WICHE Knocking at the College Door (11th ed., 2024) · O*NET Database (v29.0) · NCES CIP 2020 to SOC 2018 Crosswalk · Anthropic Economic Index (August 2025) · Census Post-Secondary Employment Outcomes (PSEO).

1.3 GASB/FASB Finance Correction

A critical data-processing correction was required for cross-sector financial comparisons. Public institutions report under GASB standards where the F1A form uses cumulative revenue fields that, if misinterpreted as individual line items, produce tuition dependence ratios near 1.0 for all publics. The corrected approach uses F1D01 (total core revenues) and F1B01 (net tuition and fees). After correction, mean tuition dependence is 16.7% for publics and 77.8% for private nonprofits. This correction eliminated a substantial systematic bias against public institutions.

2. Framework Construction

2.1 X-Axis: Institutional Resilience

A composite of four equally-weighted percentile-ranked components: endowment per full-time-equivalent student, revenue diversification (1 − tuition dependence ratio, where tuition dependence = net tuition revenue / total core revenues), five-year enrollment trajectory (percent change in 12-month unduplicated headcount, 2019–2024), and selectivity (1 − admission rate).
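
The construction above can be sketched as follows. This is an illustrative reading, not the replication script: it assumes a percentile rank means the share of institutions scoring strictly lower, and every field name and all three sample institutions are hypothetical.

```python
def percentile_ranks(values):
    """Percent of institutions scoring strictly lower, for each value (tie rule assumed)."""
    n = len(values)
    return [100.0 * sum(1 for w in values if w < v) / n for v in values]

def resilience_scores(institutions):
    """Equal-weight mean of four percentile-ranked components, one score per institution."""
    components = [
        # Endowment per full-time-equivalent student.
        [i["endowment_per_fte"] for i in institutions],
        # Revenue diversification = 1 - tuition dependence.
        [1.0 - i["net_tuition"] / i["core_revenue"] for i in institutions],
        # Five-year enrollment trajectory as percent change.
        [100.0 * (i["enroll_2024"] - i["enroll_2019"]) / i["enroll_2019"] for i in institutions],
        # Selectivity = 1 - admission rate.
        [1.0 - i["admit_rate"] for i in institutions],
    ]
    ranked = [percentile_ranks(c) for c in components]
    return [sum(per_inst) / len(ranked) for per_inst in zip(*ranked)]

# Three hypothetical institutions (illustrative values only).
sample = [
    {"endowment_per_fte": 200_000, "net_tuition": 20, "core_revenue": 100,
     "enroll_2019": 1000, "enroll_2024": 1100, "admit_rate": 0.2},
    {"endowment_per_fte": 50_000, "net_tuition": 60, "core_revenue": 100,
     "enroll_2019": 1000, "enroll_2024": 1000, "admit_rate": 0.6},
    {"endowment_per_fte": 10_000, "net_tuition": 90, "core_revenue": 100,
     "enroll_2019": 1000, "enroll_2024": 900, "admit_rate": 0.9},
]
scores = resilience_scores(sample)
print([round(s, 1) for s in scores])
```

Ranking before averaging puts all four components on a common 0 to 100 scale, so a very large endowment cannot dominate the composite the way it would in a raw average.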

Selectivity is the most contested component, and it admits a dual interpretation: a low admission rate can signal either a strong demand buffer (more applicants than seats) or a limited applicant pool that constrains growth. I include it as a demand signal — institutions with more applicants than seats have a structural advantage in enrollment management — while acknowledging the ambiguity. Removing it from the index produces 85.7% quadrant agreement with the baseline, meaning the vast majority of institutional placements are unaffected. Readers concerned about this choice can consult the no-selectivity specification in the sensitivity analysis (Supplementary Materials, Appendix C), where I report full results with selectivity excluded.

2.2 Y-Axis: Post-College Market Position

A composite of four equally-weighted percentile-ranked components: six-year completion rate for first-time, full-time bachelor’s degree-seeking students, earnings-to-debt ratio (10-year median earnings / median graduate debt), inverted AI entry-level exposure, and regional demographic trajectory (WICHE projected change in high school graduates, 2024–2030).
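
Two of these components involve a small transformation before ranking. The sketch below assumes the earnings-to-debt ratio is a simple quotient and that "inverted" means institutions with lower AI exposure receive higher percentile ranks. How the inversion is actually implemented is not stated here, and the function names and numbers are hypothetical.

```python
def earnings_to_debt(median_earnings_10yr, median_debt):
    """Ratio of 10-year median earnings to median graduate debt."""
    return median_earnings_10yr / median_debt

def inverted_percentile_ranks(values):
    """Percent of institutions with a strictly HIGHER raw value, so low exposure ranks high."""
    n = len(values)
    return [100.0 * sum(1 for w in values if w > v) / n for v in values]

# Hypothetical inputs (illustrative only).
ratio = earnings_to_debt(60_000, 24_000)
exposure = [0.30, 0.40, 0.50]
ai_component = inverted_percentile_ranks(exposure)
print(ratio, [round(p, 1) for p in ai_component])
```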

This axis is deliberately a hybrid: it blends realized outcomes (completion rate, earnings-to-debt ratio) with a modeled future-risk construct (AI exposure) and a regional exogenous condition (demographic trajectory). This heterogeneity is a design choice — the goal is to capture both what the institution delivers for its students and how exposed that position may be to emerging pressures — but it means the axis is better understood as a strategic index than as a single latent construct. Neither axis measures one coherent underlying reality in the psychometric sense; both are strategically bundled composites designed for institutional diagnosis, not latent variable measurement.

2.3 Weighting Rationale

All components within each axis receive equal weight. This is a deliberate simplicity choice for transparency and reproducibility, not an empirical claim that each component contributes equally to institutional outcomes. Factor analysis reveals that the data's own covariance structure would weight endowment and earnings far more heavily while giving minimal weight to AI exposure, demographic trajectory, and enrollment trend. I retain equal weighting because it avoids data-driven weights that would reflect historical relationships and thereby underweight emerging pressures whose effects have not yet fully materialized.

2.4 Quadrant Assignment

Quadrant boundaries are set at the sample median on each axis: High Capacity (above both medians), High Stress (below both), Structurally Exposed (below the resilience median, above the market position median), and Market Misaligned (above the resilience median, below the market position median). These labels are descriptive shorthand, not definitive assessments. The median split imposes categorical distinctions on a continuous space.
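
A minimal sketch of the median split, for illustration. It assumes a score exactly at the median counts as "below"; the tie rule is not specified here. The function name and the four sample institutions are hypothetical.

```python
from statistics import median

def assign_quadrants(scores):
    """scores maps institution name -> (resilience, market_position)."""
    res_median = median(r for r, _ in scores.values())
    mkt_median = median(m for _, m in scores.values())
    labels = {}
    for name, (res, mkt) in scores.items():
        # Assumption: values equal to the median are treated as not above it.
        high_res, high_mkt = res > res_median, mkt > mkt_median
        if high_res and high_mkt:
            labels[name] = "High Capacity"
        elif high_res:
            labels[name] = "Market Misaligned"
        elif high_mkt:
            labels[name] = "Structurally Exposed"
        else:
            labels[name] = "High Stress"
    return labels

# Hypothetical composite scores (illustrative only).
sample = {"A": (80, 75), "B": (30, 70), "C": (70, 20), "D": (20, 25)}
labels = assign_quadrants(sample)
print(labels)
```

Here the medians are 50 on resilience and 47.5 on market position, so each sample institution lands in a different quadrant.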

3. AI Exposure Methodology

3.1 Theoretical Exposure Pipeline

The AI exposure component is the most exploratory element of the framework. It measures the degree to which graduates' typical entry-level career pathways involve tasks that current AI systems can perform or assist with. The pipeline: (1) classify 41 O*NET work activities as AI-positive (routine cognitive tasks), AI-negative (physical, interpersonal, embodied tasks), or neutral; (2) compute occupation-level exposure scores; (3) weight by O*NET Job Zone for entry-level relevance (Zones 2–3 full weight, Zone 4 half, Zones 1 and 5 minimal); (4) crosswalk from SOC occupations to CIP degree fields via the NCES CIP-SOC mapping; (5) aggregate to institutions using program mix weights.

3.2 Exposure vs. Substitution vs. Augmentation

This measure captures task-level exposure — whether AI systems can perform the tasks central to an occupation's entry-level work. It does not determine whether the labor market response will be task substitution (AI replacing workers), task augmentation (AI enhancing productivity), or occupational restructuring (workflow reorganization). I use the term "exposure" rather than "vulnerability" or "risk" for this reason. The measure is presented as provisional and exploratory — external validation against actual labor market disruption outcomes is not yet possible given the recency of large-scale AI deployment.

3.3 Theoretical vs. Observed Adoption

When compared with observed real-world AI adoption rates from the Anthropic Economic Index, the field-level correlation is approximately zero (ρ ≈ −0.09). Multiple interpretations are possible, including a substantive temporal gap between capability and adoption, measurement artifacts, vendor-specific bias in adoption data, and uneven enterprise governance. I present both theoretical and observed dimensions separately rather than collapsing them into a single score.

4. Validation & Robustness

4.1 Sensitivity Analysis

I tested 18 alternative specifications plus a z-score scaling alternative against the baseline. The resilience axis is highly robust (all Spearman ρ > 0.93 across the original specifications). The post-college market position axis is more sensitive, particularly to AI specification (ρ ranges from 0.68 to 0.97). Across all specifications, 488 institutions (31.4%) never change quadrant: 216 consistently High Capacity, 167 consistently High Stress, and 105 consistently in one of the two mixed quadrants. These stable positions are overdetermined and represent robust structural assessments. The other 69% of institutions are boundary cases whose classification depends on analytical assumptions.

The z-score specification produces 92.1% quadrant agreement with the baseline (ρ = 0.987 on resilience, ρ = 0.985 on market position), confirming that the main tier-level patterns are robust to the scaling choice. The 123 institutions that shift are distributed across all tiers, with no tier showing agreement below 90%. I recommend examining component-level scores and stability ratings alongside quadrant labels.
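
Quadrant agreement is the share of institutions whose label is unchanged between two specifications. The sketch below shows the calculation on hypothetical labels, then checks the reported figure: 123 shifted institutions out of 1,556 corresponds to 92.1% agreement. The function name and the four sample institutions are illustrative.

```python
def quadrant_agreement(baseline, alternative):
    """Share of institutions with the same quadrant label under both specifications."""
    unchanged = sum(1 for name in baseline if baseline[name] == alternative[name])
    return unchanged / len(baseline)

# Hypothetical labels under two specifications (illustrative only).
baseline = {"A": "High Capacity", "B": "High Stress",
            "C": "Market Misaligned", "D": "High Stress"}
z_score = {"A": "High Capacity", "B": "Structurally Exposed",
           "C": "Market Misaligned", "D": "High Stress"}
toy_agreement = quadrant_agreement(baseline, z_score)

# Consistency check on the figures reported in the text.
reported = round(100 * (1 - 123 / 1556), 1)
print(toy_agreement, reported)  # 0.75 92.1
```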

4.2 Factor Analysis

Principal components analysis of the eight percentile-ranked framework components (KMO = 0.63, Bartlett's χ²(28) = 1900.58, p < .001, N = 1,262 complete cases) reveals a consistent three-factor structure: "Credential Outcomes" (completion, earnings-to-debt), "Institutional Character" (endowment, revenue diversification, AI exposure), and "Demand Environment" (enrollment, demographics). Both the Kaiser criterion and parallel analysis (Horn, 1965) support three factors; the third eigenvalue (1.15) exceeds the 95th percentile of random data (1.06). The equal-weight model compresses these into two axes, with the third captured by the demand environment overlay. The factor-derived alternate view is available on the Factor View tab. Complete results are in the supplementary materials.

4.3 PSEO Earnings Context

For 572 institutions with Census PSEO earnings data, I compared AI entry-level exposure scores to actual median earnings by field. The field-level correlation is positive (ρ = 0.257): higher AI exposure is associated with higher current earnings. This reflects the structural composition of AI-exposed fields — engineering technology, computer science, and business command above-average wages because they involve premium cognitive work that is also structurally susceptible to AI. The pattern does not confirm or disconfirm the AI exposure measure as a predictor of future disruption; it contextualizes what the measure currently captures.

5. Limitations & Caveats

The AI exposure measure captures structural task characteristics but not institutional curricular response — a school teaching students to work with AI receives the same exposure score as one ignoring AI entirely. College Scorecard earnings data covers only Title IV federal financial aid recipients, systematically understating outcomes at wealthier institutions. The CIP-SOC crosswalk applies typical career pathways regardless of institutional prestige — a psychology major from Harvard enters a different labor market than one from a regional institution, but both receive the same AI exposure score. The analysis is cross-sectional, providing a snapshot rather than a trajectory. Completion rate correlates strongly with both endowment and earnings, raising a question about whether the framework partly rediscovers institutional hierarchy through a variable already known to correlate with selectivity and socioeconomic composition.

Note on structurally unusual institutions

The resilience index uses IPEDS and Scorecard data — endowment per student, revenue diversification, enrollment trends, selectivity. Military academies are structurally unusual in ways that cut against some of those components. Revenue diversification, for instance, would score low because funding is almost entirely federal — but that's arguably the most secure funding source in higher education. The index reads concentration as risk; for a service academy, concentration is the opposite of risk. It's a real limitation of building composites from general-purpose institutional data. The framework works well for the ~1,500 institutions that operate in something like a common market. Institutions with fundamentally different funding and enrollment structures — military academies, tribal colleges, a few others — are cases where the component scores are technically accurate but strategically misleading.

This framework is a strategic classification heuristic, not a predictive model or a measure of latent social reality. The quadrant labels describe composite positions on the indicators I track. Institutions scoring higher are better-positioned on those measures, but the measures are analytical choices — institutions serve missions that no set of quantitative indicators fully represents.

6. Working Paper & Citation

Saunders, K. (2026). "Mapping the Structural Divide: Institutional Resilience, Post-College Market Position, and Artificial Intelligence Exposure Across U.S. Higher Education." Working Paper, April 2026.

Download Working Paper (PDF) Download Supplementary Materials (PDF)

Data & Downloads

All data used in this analysis are derived from publicly available federal sources. I provide the complete institutional dataset for transparency and replication.

What's in the dataset

1,556 institutions with: composite resilience and market position scores, all eight component percentile ranks, raw values for key variables (endowment, enrollment, completion rate, earnings-to-debt ratio, admission rate), AI exposure scores (theoretical and observed), quadrant assignment, 2025 Carnegie classification (IC2025, Research Activity Designation, and derived 9-tier grouping), legacy 2021 Basic Classification, and state.

Data Sources

IPEDS (nces.ed.gov/ipeds) · College Scorecard (collegescorecard.ed.gov) · WICHE (wiche.edu/knocking) · O*NET (onetcenter.org) · NCES CIP-SOC Crosswalk · Anthropic Economic Index (huggingface.co/datasets/Anthropic/EconomicIndex) · Census PSEO (lehd.ces.census.gov/data/pseo)

License

The dataset is released under CC BY 4.0. You are free to use, share, and adapt it with attribution.

Replication Materials

The complete analysis is reproducible from raw federal data files using a single Python script. The replication package includes:

replicate.py — A self-contained script that loads all 11 raw data files, executes every processing step (IPEDS merging, GASB/FASB finance correction, Scorecard merge, WICHE demographic projections, O*NET AI exposure pipeline, CIP-SOC crosswalk, percentile ranking, composite scoring, and quadrant assignment), and produces the final dataset.

README.md — Download URLs for all required raw data files, directory structure, output field descriptions, and critical methodological notes.

Requirements: Python 3.8+, pandas, numpy, scipy, openpyxl.
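
The last three steps named in the script description (percentile ranking, composite scoring, and quadrant assignment) can be sketched roughly as below. This is an illustration under stated assumptions, not an excerpt from replicate.py: the column names are hypothetical, the inputs are assumed to be oriented so that higher is better, and the mapping of the two mixed quadrants to the labels Market Misaligned and Structurally Exposed is my guess at the convention.

```python
import pandas as pd

# Hypothetical column names; the released dataset's field names may differ.
RESILIENCE = ["endowment_per_student", "revenue_diversification",
              "enrollment_trend", "selectivity"]
MARKET = ["completion_rate", "earnings_to_debt",
          "ai_exposure_inverted", "demographic_trajectory"]

# Which mixed quadrant carries which label is an assumption of this sketch.
LABELS = {
    (True, True): "High Capacity",
    (False, False): "High Stress",
    (True, False): "Market Misaligned",
    (False, True): "Structurally Exposed",
}


def assign_quadrants(df: pd.DataFrame) -> pd.DataFrame:
    """Percentile-rank components, average per axis, then median-split."""
    out = df.copy()
    # Percentile ranks in (0, 1]; inputs are assumed oriented so higher is better.
    ranks = out[RESILIENCE + MARKET].rank(pct=True)
    # Equal-weight composite for each axis.
    out["resilience"] = ranks[RESILIENCE].mean(axis=1)
    out["market_position"] = ranks[MARKET].mean(axis=1)
    # Median split; institutions exactly at the median count as "below" here.
    high_res = out["resilience"] > out["resilience"].median()
    high_mkt = out["market_position"] > out["market_position"].median()
    out["quadrant"] = [LABELS[(bool(r), bool(m))] for r, m in zip(high_res, high_mkt)]
    return out


# Toy input: four made-up institutions, identical values within each axis.
toy = pd.DataFrame(
    {**{c: [4, 1, 3, 2] for c in RESILIENCE}, **{c: [4, 1, 2, 3] for c in MARKET}},
    index=["A", "B", "C", "D"],
)
result = assign_quadrants(toy)
print(result[["resilience", "market_position", "quadrant"]])
```

How ties at the median are handled is a design choice; this sketch treats them as below the median, and the actual script may do otherwise.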

Download Replication Script (Python) Download README

About This Project

Origin

I first encountered Scott Galloway's 2020 "USS University" 2×2 not long after he published it. It stuck with me because the exercise itself was clarifying in a way that most writing about higher education isn't. Mapping institutions on two dimensions forced a conversation about which schools were structurally positioned for what was coming and which weren't. It made the landscape legible. Galloway deserves credit for that: he was, as far as I know, the first person to attempt this kind of strategic positioning exercise for the full sector, and the fact that it resonated so widely tells you something about the demand for this kind of analysis.

The project came back to mind a few months ago as I started thinking more seriously about the compounding pressures facing higher education — demographic decline, enrollment contraction, fiscal stress, and now artificial intelligence. Galloway had done his version through a COVID lens: international student dependence, brand strength, pandemic resilience. The pandemic turned out to be a shock that most institutions survived. The pressures accumulating now feel more structural and more permanent, and the measures available to capture them have improved substantially since 2020.

When I went back to the literature, I found that Galloway's work sat inside a larger scholarly conversation I hadn't fully appreciated. Kelchen, Ritter, and Webber (2025) had built comprehensive institutional closure prediction models. Zemsky, Shaman, and Baldridge (2020) had developed the "College Stress Test." Grawe (2018, 2021) had mapped the demographic cliff in granular detail. Third Way (2023) had documented persistent outcome variation across the sector. What Galloway added to this was the strategic visualization — the idea that you could position institutions on a map and let the quadrants tell the story. What I wanted to add was methodological rigor: outcome-based measures instead of brand proxies, forward-looking indicators including AI exposure, and full transparency about the analytical choices involved.

One data source that made this project possible in a way it wouldn't have been even a year ago is the Anthropic Economic Index — a recently released dataset that provides observed real-world AI adoption rates by occupation, based on actual conversations with Claude. This let me do something genuinely new: compare theoretical AI task exposure (derived from O*NET occupational analysis) with observed AI adoption (what's actually happening in the workplace). The near-zero correlation between the two is one of the project's most distinctive findings — and it's only measurable because Anthropic made this data publicly available.

This project is not a prediction and not a judgment about which institutions deserve to survive. It is evaluative — institutions scoring higher are better-positioned on the indicators I track — but the indicators are analytical choices, not comprehensive measures of institutional value or educational mission. An institution serving a vital access mission in a declining-enrollment region may score low on these measures precisely because it is doing important work in a structurally challenging context. The framework makes structural positions visible; it does not determine which positions are worth occupying.

The companion essay, "On the Adaptations Before the Exigencies: Higher Education and the Structural Pressures Ahead", provides the broader argument for why this mapping exercise matters now — and why the signals the labor market is sending back to higher education may be more consequential than the sector has yet recognized. A related piece, "Why Higher Education's AI Backlash Reveals Some of Its Deepest Cracks", examines how institutional responses to AI — from outright bans to cautious avoidance — may themselves be a source of structural vulnerability.

Human–AI Collaboration

I built this project in collaboration with Claude (Anthropic). The AI contributed substantially to data processing, pipeline development, code generation, statistical analysis, visualization, and drafting across many iterative sessions. It would be dishonest to minimize that contribution — the scope and speed of this work would not have been possible without it.

That said, the intellectual direction is mine. The decision to replace Galloway's brand proxies with outcome measures. The choice to include AI exposure as an exploratory component despite knowing it would complicate the factor structure. The insistence on presenting limitations prominently and owning the framework as a strategic heuristic rather than a discovered latent structure. The interpretation of what the patterns mean for institutions, students, and the sector. These are human decisions, informed by years of thinking about research methodology and, more recently, about what's happening to the institutions where that methodology gets practiced.

The convention for crediting AI contributions in academic work is not yet settled. I've chosen transparency: describe what the tool did, be honest about it, and take responsibility for the output. That seems like the right standard for now.

Author

Kyle Saunders
Professor, Department of Political Science
Colorado State University
kyle.saunders@colostate.edu
kylesaunders.com · Substack · Google Scholar

How to Cite

Saunders, K. (2026). Mapping the Structural Divide: Institutional Resilience, Post-College Market Position, and Artificial Intelligence Exposure Across U.S. Higher Education. Working Paper, Colorado State University. https://kylesaunders.com/university-map

Acknowledgments

I will keep updating this list, but I already owe thanks to the many people who have influenced my thinking on this project and helped with its early drafts. I appreciate you.

Version History

Last updated: April 15, 2026

v1.4 — April 15, 2026

Z-score robustness check: Added a z-score (probit transform) scaling alternative to the sensitivity analysis. The z-score specification produces 92.1% quadrant agreement with the baseline (ρ = 0.987 on resilience, ρ = 0.985 on market position), confirming that the main tier-level patterns are robust to the choice between percentile ranking and standardization. The 123 institutions that shift are distributed across all tiers, with no tier showing agreement below 90%; R1 institutions show the highest agreement (96.0%).

Manuscript framing: Revised AI exposure interpretation on the AI tab to present all interpretations with equal weight, matching the manuscript's treatment. Strengthened the two-timelines framing. Added Master's tier High Stress percentages (46% L/M, 43% S) and Carnegie-modal baseline accuracy (45%) to the Overview tab. Updated sensitivity section with stable-institution quadrant breakdown (216 HC, 167 HS) and axis-level robustness statistics.

Static figures: Regenerated all four manuscript figures with website-matched D3.js color palette for visual parallelism between the journal submission and the interactive tool.
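
To make the z-score robustness check in this release concrete, here is a rough sketch of how quadrant agreement between the two scalings could be computed. It assumes the probit transform is applied to each component's percentile rank before averaging, and it uses two made-up components per axis for brevity; the actual specification may differ in both respects.

```python
from statistics import NormalDist, mean, median

_STD_NORMAL = NormalDist()


def probit(p: float) -> float:
    """Map a percentile strictly between 0 and 1 to a standard-normal z-score."""
    return _STD_NORMAL.inv_cdf(p)


def composite(rows, transform):
    """Equal-weight mean of the transformed component percentiles."""
    return [mean(transform(p) for p in row) for row in rows]


def above_median(scores):
    """Median split: True where a score is strictly above the median."""
    cut = median(scores)
    return [s > cut for s in scores]


def quadrant_agreement(resilience_rows, market_rows):
    """Share of institutions placed in the same quadrant under both scalings."""
    def quadrants(transform):
        return list(zip(above_median(composite(resilience_rows, transform)),
                        above_median(composite(market_rows, transform))))

    baseline = quadrants(lambda p: p)   # percentile-rank scaling
    z_scored = quadrants(probit)        # probit (z-score) scaling
    same = sum(b == z for b, z in zip(baseline, z_scored))
    return same / len(baseline)


# Made-up component percentiles (two per axis for brevity; the framework uses four).
resilience_rows = [(0.98, 0.30), (0.66, 0.66), (0.10, 0.20), (0.90, 0.90)]
market_rows = [(0.20, 0.20), (0.40, 0.40), (0.60, 0.60), (0.80, 0.80)]
agreement = quadrant_agreement(resilience_rows, market_rows)
print(f"quadrant agreement: {agreement:.0%}")
```

The transform is monotone, so it can only move an institution across the median when its components are uneven: the probit stretches extreme percentiles, which rewards one very high component more than averaging raw percentiles does.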

v1.3 — March 29, 2026

Carnegie 2025 integration: Replaced Carnegie Basic 2021 (C21BASIC) with the 2025 Carnegie Classification system as reported in IPEDS HD2024. Tier groupings now use a 9-tier scheme derived from the Research Activity Designation (R1/R2/RCU for research institutions) and Award Level Focus (Doctorate/Master's/Baccalaureate/Associate for non-research institutions). All visualizations — ridge plots, scatter plot color-coding, filter dropdowns — updated to the new tier definitions. The new RCU (Research Colleges and Universities) tier captures 182 institutions with meaningful research activity that previously had no research tier recognition. Raw 2025 classification codes (IC2025, RESEARCH2025) added to the downloadable dataset. Composite scores unchanged (Carnegie tier is a grouping variable, not a model input).

Expanded sensitivity analysis: Added 5 new specifications testing revenue diversification (sponsored research discount via ICR and full HERD removal), endowment encumbrance (yield vs. level), half-weight endowment, and admission yield as a selectivity alternative. Total sensitivity specifications: 18 (up from 13). Stability score now reflects proportion of all 18 specifications producing the same quadrant assignment.

Quadrant label correction: Fixed a temporary label transposition in the sensitivity analysis function where High Stress and Structurally Exposed assignments were swapped. All 780 affected institutions now carry correct quadrant labels. Composite scores and component values were unaffected — this was purely a labeling fix in the quadrant assignment logic.

Data integrity audit: Forensic diff of v1.2 → v1.3 dataset confirmed zero changes to any composite score or component value across all 1,609 institutions. JSON and CSV verified in sync (1,609 records, zero empty strings, all numeric fields in [0,1] range). Cross-referenced all quantitative claims across working paper, supplementary appendix, website, and README. Cache buster updated to force browsers to load current data file.
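
The stability score described under the expanded sensitivity analysis can be illustrated with a short sketch. It assumes "the same quadrant assignment" means agreement with the baseline quadrant; the function name and the example values are hypothetical.

```python
def stability_score(baseline: str, spec_quadrants: list) -> float:
    """Share of specifications that reproduce the baseline quadrant."""
    if not spec_quadrants:
        raise ValueError("need at least one specification")
    matches = sum(q == baseline for q in spec_quadrants)
    return matches / len(spec_quadrants)


# Made-up example: 18 specifications, 15 of which agree with the baseline.
specs = ["High Stress"] * 15 + ["Structurally Exposed"] * 3
score = stability_score("High Stress", specs)
print(f"{score:.3f}")
```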

v1.2.1-geo — March 25, 2026

GEO/AEO metadata (back-of-house): Added JSON-LD structured data (Dataset and WebApplication schemas) to page head for AI search engine and generative engine discoverability. Added sameAs entity linking to author profiles. Created llms.txt providing a machine-readable project summary. No changes to visible content, data, or methodology.

v1.2.1 — March 23, 2026

Carnegie 2025 notice: Added preliminary notice about the 2025 Carnegie reclassification. (Superseded by v1.3 full Carnegie 2025 integration.)

Structurally unusual institutions: Added limitation note in Methods tab Section 5 acknowledging that military academies, tribal colleges, and other institutions with fundamentally different funding and enrollment structures may produce technically accurate but strategically misleading component scores under the framework's general-purpose indicators.

Ordinal suffix fix: Corrected percentile display across institution detail panel and scatter plot tooltips — previously all values displayed with "th" suffix (e.g., "63th," "2th"); now renders correctly (e.g., "63rd," "2nd," "1st").

Ridge plot label fix: Increased left margin on combined score distribution chart to prevent Carnegie tier labels from clipping outside the visible area.

Prose clarifications: Added note on AI Exposure tab clarifying the dual role of AI exposure as both a market position component and the subject of a separate field-level analysis. Added combined-score rationale to distribution chart description. Expanded selectivity component description to flag dual interpretation (demand buffer vs. limited pool). Added parenthetical gloss to demographic trajectory label on institution detail.
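
For readers curious about the ordinal suffix fix in this release, the logic amounts to the following. This is a sketch in Python for consistency with the replication script; the site's own implementation is not shown in this document and may differ.

```python
def ordinal(n: int) -> str:
    """Format a non-negative integer with its English ordinal suffix."""
    # 11, 12, and 13 take "th" even though they end in 1, 2, and 3.
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


for value in (1, 2, 63, 11, 12, 13, 21, 100):
    print(ordinal(value))
```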

v1.2 — March 20, 2026

Factor analysis diagnostics: Added KMO measure of sampling adequacy (0.63), Bartlett's test of sphericity (χ²(28) = 1900.58, p < .001), parallel analysis (Horn, 1965; 1,000 replications confirming 3-factor solution), communality tables for all loading specifications, and per-variable MSA values. Documented missingness: 294 of 1,556 mapped institutions excluded from factor analysis due to incomplete component data, with excluded institutions disproportionately in High Stress (42.5%). Clarified PCA used for data reduction, not latent construct identification. Updated Methods tab and supplementary materials throughout.

Framework framing: Revised quadrant label language across all pages to acknowledge the evaluative gradient while contextualizing what the indicators do and don't capture. Replaced "descriptive labels, not predictions or rankings" with language that owns the directional content of the axes while noting that the indicators are analytical choices and institutions serve missions beyond what quantitative measures represent.

PSEO section: Reframed from "External Benchmarking" to "Earnings Context" across working paper, supplementary materials, and website. The PSEO finding contextualizes what the AI exposure measure captures but does not validate it as a predictor of future disruption.

AI exposure interpretation: Softened "preferred interpretation" language around the ρ ≈ −0.09 finding. All interpretations now presented with equal weight; the substantive interpretation (temporal gap between capability and adoption) is one of several plausible readings.

Methodology citations: Added references to composite index methodology literature (Saisana & Saltelli, 2011; NORC, 2024), parallel analysis (Horn, 1965), and Kelchen et al. (2025) as external evidence supporting enrollment trend as a forward-looking indicator.

AI tab: Added institutional-level standard deviation (0.030) to clarify that AI exposure has low institutional-level discriminating power; field-level analysis is where the measure provides meaningful differentiation.
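
As a reference point for the factor analysis diagnostics in this release, Bartlett's test of sphericity can be computed from a correlation matrix as sketched below. The data are simulated, not the project's, and the function name is my own. With eight components the degrees of freedom are 8 × 7 / 2 = 28, which matches the χ²(28) reported above.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_sphericity(data: np.ndarray):
    """Bartlett's test that the correlation matrix is an identity matrix."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # log|R| via slogdet avoids underflow when the determinant is tiny.
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    p_value = chi2.sf(statistic, dof)
    return statistic, dof, p_value


# Simulated data: 8 variables sharing one common factor (not the project's data).
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
simulated = common + rng.normal(size=(500, 8))

statistic, dof, p_value = bartlett_sphericity(simulated)
print(f"chi2({dof}) = {statistic:.2f}, p = {p_value:.3g}")
```

A significant result only indicates that the components are correlated enough for data reduction to be meaningful; it does not by itself justify any particular number of factors.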

v1.1 — March 10, 2026

Framework restructure: Restructured component axes based on pre-factor correlation diagnostics. Dropped raw median earnings (redundant with earnings-to-debt ratio at r = 0.80). Moved completion rate from Resilience to Market Position axis to resolve cross-axis loading (completion–earnings r = 0.734). Framework now uses 8 components (4 per axis). Recomputed all composites, quadrant assignments, factor loadings, and sensitivity analysis. 1,556 institutions mapped (vs. 1,550 previously).

Factor analysis: Recomputed all factor results using proper 8-component specification. Updated promax loadings, factor correlations, and uniqueness values across website, working paper, and supplementary materials. Renamed Factor 2 from “Institutional Resources” to “Institutional Character.” Flipped Factor 1 sign on the factor scatter so higher credential outcomes appear to the right, matching the axis arrow.

Front page and navigation: Added contextual onboarding around the search box — axis definitions, quadrant labels, key patterns, data sources, and explore cards for each visualization tab. Added same context to institution detail view. Fixed mobile tab scrolling. Renamed “The 2×2” tab to “The Whole 2×2.”

AI Exposure tab: Added comprehensive explainer walking through the 5-step AI exposure pipeline, the retrospective-vs-prospective problem, and the Anthropic Economic Index comparison. Reframed the exploratory badge from warning to informational.

Parallelism and documentation: Full consistency pass across all deliverables. Fixed abstract sensitivity wording (31% stable, not 31% sensitive). Clarified 1,262 complete-case vs. 1,556 mapped institution counts. Updated README with correct v1.1 component structure and field descriptions. Updated supplementary Appendix C to reflect current axis composition. Regenerated all PDFs with embedded figures.

SSRN: Working paper posted at SSRN.

Website readability: Increased font sizes and added bold weight to axis labels, quadrant markers, and tick text across all chart tabs (Whole 2×2, AI Exposure, Factor View). Fixed y-axis arrow direction on main scatter plot. Adjusted chart legend positioning to prevent overlap with data. Corrected equal-weight normalization in component weights comparison chart (now normalized across all 8 components). Added caption clarifying normalization basis.

v1.0 — March 8, 2026

Initial public release. Website, working paper (v1.0), and supplementary materials published simultaneously.

Framework: 1,556 institutions, 8 components, equal-weight composite scoring, median-split quadrant assignment. AI exposure derived from O*NET work activities pipeline with entry-level Job Zone weighting. Factor analysis across varimax, promax, and oblimin rotations (2 and 3 factors). Sensitivity analysis across 13 alternative specifications. PSEO external benchmarking for 572 institutions.

Planned Updates

Annual data refresh: IPEDS, College Scorecard, and WICHE data update annually. I intend to refresh the dataset each fall when new data releases are available, enabling longitudinal tracking of institutional positions.

Replication code: Python replication script and README available on the Data & Downloads tab. Full raw-data-to-output pipeline.

Preprint: Working paper v1.4 is available on SSRN.

Correspondence

Working paper version: v1.4 (April 2026)
Website version: v1.4
Dataset version: v1.4
Correspondence: kyle.saunders@colostate.edu
All three are synchronized — changes to any component will be reflected in updated version numbers across all materials.