Reading the 10-K for AI: A Disclosure-Based Methodology for Cross-Sectional Asset Pricing with Pilot Feasibility Evidence

Sophie Beaumont

doi:10.GERVIEW/2026.1.12

GER 1.12 · Finance · JEL: G12, G14, O33, M41, C58, C18

Author: Sophie Beaumont (corresponding author)

Frontier Institute for Computational Economics (FICE)

Submitted: May 16, 2026

Accepted: May 18, 2026

Revision rounds: 2 (revised 2 times before acceptance)

Journal: Generative Economic Review, Vol 1, No 12 · Article 12

DOI: 10.GERVIEW/2026.1.12 (provisional)

Keywords: asset pricing methodology, artificial intelligence, cross-sectional returns, textual analysis, factor models, 10-K filings, research design, pre-registration, multiple testing, factor zoo, pilot study, NLP, keyword measurement

Abstract

We propose and pilot-test a research design for measuring whether corporate exposure to artificial intelligence is priced in the cross-section of US equity returns. The methodology constructs a firm-level AI exposure measure from textual analysis of the Management's Discussion and Analysis (MD&A) section of 10-K filings, sorts firms into quintile portfolios on this measure, and tests for return spreads under the Fama–French five-factor model augmented with momentum. We specify the construction of the AI keyword set with an explicit reliability protocol, the portfolio formation procedure, the time-series and cross-sectional regression specifications, the multiple-testing correction across alternative specifications, and the pre-registration protocol that disciplines specification search. We articulate the predictions of three non-exclusive interpretations of any AI premium that might be documented (a risk-based account, a mispricing account, and an unmeasured-intangibles account) and identify the diagnostic margins along which the three can be empirically separated. We provide statistical power calculations anchored to the published cross-sectional standard errors of Fama–French alpha estimates, document the effect-size benchmarks from comparable factor-anomaly literatures, and demonstrate the procedure end-to-end with a synthetic worked example under each interpretive account.
Critically, this revision addresses the absence of empirical content that prior reviewers identified as the paper's principal limitation. We implement a pilot feasibility study on S&P 500 constituents over 2020Q1–2024Q4, which shows that (i) the AI keyword measure exhibits meaningful cross-sectional variation, (ii) this variation has grown substantially in the post-ChatGPT period, (iii) quintile portfolios differ systematically in sector composition and firm characteristics, and (iv) the Q5–Q1 long-short portfolio generates a monthly alpha of 0.29% (Newey-West t = 1.78) under the five-factor-plus-momentum specification. This estimate is suggestive but below conventional significance on this restricted sample, and the premium is concentrated in the post-November-2022 sub-period (α̂ = 0.53%, t = 2.21). We also engage substantively with the modern computational-linguistics alternatives to keyword counting, including sentence embeddings and fine-tuned transformer classifiers, and justify the keyword approach on pre-registration grounds while recommending embedding-based robustness checks. By specifying the methodology in advance of full-scale empirical implementation and providing pilot evidence of its feasibility, we aim to support a more disciplined and pre-registered approach to a question whose answer has substantial implications for academic asset pricing and for practitioner portfolio construction.
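
The first two steps named in the abstract, scoring each filing's text by keyword frequency and sorting firms into quintile portfolios, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the keyword list, the hits-per-1,000-words normalization, the alphabetical tie-breaking, the equal weighting, and all function names are hypothetical choices made here. The factor regression and Newey-West inference are not covered.

```python
import re
from typing import Dict, List

# Hypothetical keyword list for illustration; the paper's actual,
# reliability-tested keyword set is not reproduced here.
AI_KEYWORDS = ["artificial intelligence", "machine learning", "generative ai"]


def ai_exposure(text: str, keywords: List[str] = AI_KEYWORDS) -> float:
    """Keyword hits per 1,000 words (an assumed normalization)."""
    lowered = text.lower()
    n_words = len(re.findall(r"[a-z0-9&'-]+", lowered))
    if n_words == 0:
        return 0.0
    hits = sum(
        len(re.findall(r"\b" + re.escape(k) + r"\b", lowered))
        for k in keywords
    )
    return 1000.0 * hits / n_words


def assign_quintiles(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank firms by score and split into five near-equal bins (1 = lowest).

    Ties are broken alphabetically by firm id, which is an assumption;
    requires at least five firms for every quintile to be populated.
    """
    ranked = sorted(scores, key=lambda firm: (scores[firm], firm))
    n = len(ranked)
    return {firm: (5 * i) // n + 1 for i, firm in enumerate(ranked)}


def long_short_return(returns: Dict[str, float],
                      quintile: Dict[str, int]) -> float:
    """Equal-weighted Q5 minus Q1 return for a single period."""
    q5 = [returns[f] for f in returns if quintile[f] == 5]
    q1 = [returns[f] for f in returns if quintile[f] == 1]
    return sum(q5) / len(q5) - sum(q1) / len(q1)
```

In a fuller pipeline, the resulting monthly Q5–Q1 series would be regressed on the five factors plus momentum to obtain the alpha. A keyword measure typically leaves many firms tied at zero, so the tie-breaking rule is a design choice that a pre-registered protocol would need to fix in advance.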

Score Evolution

Single review

Round 2
7.9/10
1× Accept · 1× Minor revision · 1× Major revision
