DEF 14A (Proxy Statement, SEC Filing)

Data Overview

The definitive proxy statement (SEC Form DEF 14A) is the document U.S. public companies must deliver to shareholders ahead of an annual or special meeting so owners can vote on director elections, executive pay, capital-structure changes, M&A approvals and shareholder proposals. Filed under Exchange Act §14(a), it is the single most granular source of governance, compensation and voting-rights information available before ballots are cast, and the SEC requires it so investors can cast informed votes and to deter undisclosed related-party dealings.

Relevance for predictive modeling

Executive-compensation design, buy-back approvals and shareholder proposals create event windows around annual meetings; pay-for-performance mis-alignment has shown negative alpha one quarter out.

Content & format

A DEF 14A is an XHTML/PDF bundle that can exceed 150 pages.

Each section is marked by headings that make deterministic parsing feasible. Core blocks include:

  1. election of directors

  2. executive-compensation discussion & analysis (CD&A)

  3. pay-versus-performance tables

  4. related-party transactions

  5. audit-committee matters

  6. shareholder proposals

  7. say-on-pay frequency

  8. voting and record-date logistics

  9. security ownership of insiders and 5 % holders
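As a rough illustration of heading-based parsing, the sketch below maps headings to the nine blocks with regular expressions. The pattern wording, the block names and the rule that headings appear in upper case are all assumptions made for this example; real proxies vary, and the patterns would need tuning per issuer.

```python
import re
from typing import Dict, List, Optional

# Hypothetical heading patterns for the nine core blocks (illustrative, not exhaustive).
SECTION_PATTERNS = {
    "election_of_directors": r"election of directors",
    "cd_and_a": r"compensation discussion (and|&) analysis",
    "pay_versus_performance": r"pay[ -]versus[ -]performance",
    "related_party_transactions": r"related[ -](party|person) transactions",
    "audit_committee": r"audit committee",
    "shareholder_proposals": r"(shareholder|stockholder) proposals?",
    "say_on_pay_frequency": r"say[- ]on[- ]pay frequency",
    "voting_logistics": r"record date|voting (procedures|information)",
    "security_ownership": r"security ownership",
}
COMPILED = {name: re.compile(pat, re.IGNORECASE) for name, pat in SECTION_PATTERNS.items()}


def tag_heading(heading: str) -> Optional[str]:
    """Return the first block whose pattern matches the heading, else None."""
    for name, pattern in COMPILED.items():
        if pattern.search(heading):
            return name
    return None


def split_sections(lines: List[str]) -> Dict[str, List[str]]:
    """Group body lines under the most recent recognised heading."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        # Assumption: headings are written in upper case; body text is not.
        tag = tag_heading(line) if line.isupper() else None
        if tag:
            current = tag
            sections.setdefault(current, [])
        elif current:
            sections[current].append(line)
    return sections
```

Lines that appear before the first recognised heading are dropped in this sketch; a production parser would probably keep them in a catch-all bucket for lineage.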

Latency profile

Where a preliminary filing is required, preliminary copies (PRE 14A; PRER14A for revised versions) must reach the SEC at least ten calendar days before definitive materials are first sent to holders, and the definitive DEF 14A must be on EDGAR no later than 120 days after fiscal year-end for companies that wish to incorporate Part III of Form 10-K by reference. Once accepted, EDGAR exposes the filing in <15 s; vendor feeds with NLP tags arrive within ~2 min. Because many investors digest governance data only shortly before the meeting, a desk that snapshots the text the moment it posts can trade ahead of sluggish price discovery in the following 1-10 trading days.
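The two timing rules above reduce to simple date arithmetic. The sketch below is a minimal illustration that counts plain calendar days and ignores any weekend or holiday adjustments; the function names are hypothetical.

```python
from datetime import date, timedelta

PART_III_WINDOW_DAYS = 120  # from the text: 120 days after fiscal year-end
PRELIM_LEAD_DAYS = 10       # from the text: preliminary copy at least ten days ahead


def part_iii_deadline(fiscal_year_end: date) -> date:
    """Last day a DEF 14A can be filed and still back Part III of the 10-K."""
    return fiscal_year_end + timedelta(days=PART_III_WINDOW_DAYS)


def latest_preliminary_filing(mailing_date: date) -> date:
    """Latest preliminary filing date for a planned definitive mailing date."""
    return mailing_date - timedelta(days=PRELIM_LEAD_DAYS)


def days_of_slack(fiscal_year_end: date, filing_date: date) -> int:
    """Calendar days between the filing and the 120-day deadline (negative if late)."""
    return (part_iii_deadline(fiscal_year_end) - filing_date).days
```

For a fiscal year ending 2025-01-31, `part_iii_deadline` returns 2025-05-31, so a filing dated 2025-04-12 leaves 49 days of slack.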

 

Data Processing Pipeline

This is an overview of what the pipeline could look like as part of a first-draft requirements sheet. Teams should refine based on tech stack and custom needs.

  • Ingest — stream SEC “companyevents” RSS and Schedule 14A CIK endpoints into Kafka; back-fill PDFs from investor-relations sites.

  • Parse — convert XHTML to text with BeautifulSoup; pipe scanned pages through Tesseract; keep page numbers for lineage.

  • Section tagging — a RoBERTa legal fine-tune classifies paragraphs into the nine canonical blocks.

  • Entity extraction — regex + spaCy capture pay levels, option grants, director tenures, share counts, proposal IDs, ISS/Glass Lewis recommendations and record dates.

  • Derived metrics — compute CEO pay ratio, option dilution %, pay-versus-TSR mis-alignment, board refresh rate, proposal polarity scores.

  • Manual QA — governance analysts review any auto-extracted number outside 3 σ sector range or with OCR confidence < 90 %.

  • Storage — write point-in-time feature JSON keyed by (ticker, meeting_date) to the feature store; raw text stays under entitlements for Reg FD compliance.
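The manual-QA trigger in the list above can be expressed as a small gate. This is a sketch under stated assumptions: the sector distribution is summarised by its sample mean and standard deviation, OCR confidence is a fraction between 0 and 1, and the function name and parameters are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def needs_manual_review(
    value: float,
    sector_values: Sequence[float],
    ocr_confidence: float,
    sigma: float = 3.0,
    min_confidence: float = 0.90,
) -> bool:
    """Flag a number for analyst review if OCR is weak or the value is a sector outlier."""
    if ocr_confidence < min_confidence:
        return True
    if len(sector_values) < 2:
        # Not enough peers to estimate a spread; be conservative and review.
        return True
    mu = mean(sector_values)
    sd = stdev(sector_values)
    if sd == 0:
        return value != mu
    return abs(value - mu) > sigma * sd
```

Mean and standard deviation are themselves sensitive to outliers, so a team might prefer a median and MAD variant; the 3 σ rule is used here only because it is the threshold the pipeline description names.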

 

Features for Predictive Modeling

{
  "ticker": "NVDA",
  "filing_metadata": {
    "cik": "0001045810",
    "document_type": "DEF 14A",
    "filing_date": "2025-04-12",
    "fiscal_year_end": "2025-01-31",
    "meeting_date": "2025-06-06",
    "meeting_type": "Annual Meeting",
    "location": "3401 Leonard Street, Santa Clara, CA 95054",
    "virtual_flag": true,
    "record_date": "2025-03-20",
    "distribution_date": "2025-04-14"
  },
  "proposals": [
    {
      "proposal_number": 1,
      "proposal_type": "Election of Directors",
      "management_recommendation": "For",
      "is_shareholder_proposal": false,
      "pass_threshold": "Plurality",
      "director_names": ["Jensen Huang", "Harvey Jones", "Tench Coxe"],
      "board_slate_size": 12,
      "dissident_slate_size": 0,
      "contested_flag": false
    },
    {
      "proposal_number": 2,
      "proposal_type": "Say-on-Pay (Advisory Vote)",
      "management_recommendation": "For",
      "is_shareholder_proposal": false,
      "pass_threshold": "Majority",
      "prev_year_vote_pct": 94.5
    },
    {
      "proposal_number": 3,
      "proposal_type": "Shareholder Proposal: Climate Reporting",
      "management_recommendation": "Against",
      "is_shareholder_proposal": true,
      "proponent_type": "Institutional Investor",
      "pass_threshold": "Majority",
      "supporting_statements": [
        "Seeks annual disclosure of emissions reduction targets and progress."
      ],
      "prev_year_vote_pct": 19.3
    }
    // ...further proposals
  ],
  "board_composition": {
    "total_board_members": 12,
    "average_tenure_years": 7.2,
    "independent_board_ratio": 0.83,
    "gender_diversity_pct": 33.3,
    "ethnic_diversity_pct": 25.0,
    "lead_independent_director_flag": true,
    "officer_on_board_flag": true,
    "key_committee_chairs": {
      "audit_committee": "Harvey Jones",
      "compensation_committee": "Tench Coxe",
      "governance_committee": "Persis Drell"
    }
  },
  "executive_compensation": {
    "ceo_total_comp_usd": 34000000,
    "ceo_bonus_pct": 0.35,
    "median_employee_pay_usd": 185000,
    "pay_ratio_ceo_to_median": 183.8,
    "average_named_exec_comp": 14200000,
    "comp_advisory_vote_past_support_pct": 94.5,
    "equity_incentive_pct": 0.72,
    "clawback_policy_flag": true,
    "peer_group": ["AMD", "Intel", "Qualcomm", "Broadcom"]
  },
  "shareholder_rights": {
    "proxy_access_flag": true,
    "majority_voting_directors_flag": true,
    "staggered_board_flag": false,
    "special_meeting_threshold_pct": 20,
    "written_consent_rights": true,
    "supermajority_voting_items": ["Bylaw Amendment", "M&A"]
  },
  "governance_analytics": {
    "recent_activism_flag": false,
    "prev_contested_election_flag": false,
    "recent_director_changes_flag": true,
    "board_refreshment_score": 0.21,
    "flag_for_unusual_policy_change": false
  },
  "compensation_policy_notes": {
    "performance_metrics": ["EPS Growth", "TSR", "Product Development Milestones"],
    "long_term_incentive_weight_pct": 58,
    "use_relative_vs_peers": true,
    "non_gaap_metric_use_flag": true,
    "comp_consultant_name": "Mercer"
  }
}
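Most models expect a flat row rather than nested JSON. The sketch below shows one possible way to flatten a record like the sample into dotted column names. The `flatten` helper and the choice to reduce lists to a count are assumptions for illustration; proposals in particular would normally get their own aggregation.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys; lists are reduced to their length."""
    row: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Lists (proposals, peer groups) need bespoke aggregation; keep only the size here.
            row[f"{name}.count"] = len(value)
        else:
            row[name] = value
    return row


sample = {
    "ticker": "NVDA",
    "filing_metadata": {"meeting_date": "2025-06-06", "virtual_flag": True},
    "proposals": [{"proposal_number": 1}, {"proposal_number": 2}],
    "executive_compensation": {"peer_group": ["AMD", "Intel"]},
}
row = flatten(sample)
```

Here `row` holds keys such as `filing_metadata.meeting_date` and `proposals.count`, and can be keyed by (ticker, meeting_date) when written to a feature store.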

Feature Groups

Filing Metadata (filing_metadata)

  • cik: Company’s SEC CIK identifier.

  • document_type: DEF 14A (Definitive Proxy Statement).

  • filing_date/meeting_date: Context for event timing and arbitrage windows.

  • meeting_type/location/virtual_flag: In-person or virtual event, location.

  • record_date/distribution_date: Shareholder eligibility and packet send dates.

Proposals (proposals – Array)

  • proposal_type: Director, say-on-pay, auditor, M&A, equity plan, shareholder ESG.

  • management_recommendation: For, Against, Abstain.

  • is_shareholder_proposal/proponent_type: Shareholder-initiated and type (e.g., Institutional, Retail).

  • pass_threshold: Plurality, Majority, Supermajority.

  • director_names/board_slate_size/contested_flag: Detailed for election proposals.

  • prev_year_vote_pct: Historical support, measures trend for activism/arbitrage.

Board Composition & Diversity (board_composition)

  • total_board_members/independent_board_ratio: Board structure.

  • average_tenure_years: Refreshment and entrenchment.

  • gender_diversity_pct/ethnic_diversity_pct: Gender and ethnic mix as a percentage of the board, critical for ESG screening.

  • lead_independent_director_flag: True if lead independent.

  • key_committee_chairs: Audit/compensation/governance leadership.

Executive Compensation (executive_compensation)

  • ceo_total_comp_usd: CEO pay – outlier levels often move sentiment.

  • pay_ratio_ceo_to_median: CEO pay ratio (CEO total compensation divided by median employee pay), used in governance models.

  • ceo_bonus_pct, clawback_policy_flag, peer_group: Compensation policy elements.

  • comp_advisory_vote_past_support_pct: Past say-on-pay support (activism risk).

  • equity_incentive_pct: Share of pay tied to equity.
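Because the pay ratio is derived from two other fields, it can be recomputed as a consistency check on the extraction. A minimal sketch follows, using the field names from the sample record; the helper names and the tolerance are hypothetical.

```python
def pay_ratio(ceo_total_comp_usd: float, median_employee_pay_usd: float) -> float:
    """CEO total compensation divided by median employee pay, rounded to one decimal."""
    if median_employee_pay_usd <= 0:
        raise ValueError("median employee pay must be positive")
    return round(ceo_total_comp_usd / median_employee_pay_usd, 1)


def ratio_is_consistent(comp: dict, tolerance: float = 0.1) -> bool:
    """Compare the extracted ratio with one recomputed from its inputs."""
    recomputed = pay_ratio(comp["ceo_total_comp_usd"], comp["median_employee_pay_usd"])
    return abs(recomputed - comp["pay_ratio_ceo_to_median"]) <= tolerance
```

With the sample values (34,000,000 and 185,000) the recomputed ratio is 183.8, matching the extracted field. A mismatch would suggest that one of the three numbers was mis-parsed.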

Shareholder Rights (shareholder_rights)

  • proxy_access_flag: Can shareholders nominate directors?

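  • majority_voting_directors_flag: True if directors are elected under a majority-vote standard rather than plurality.

  • staggered_board_flag: True if the board is classified (staggered), so only a subset of directors stands for election each year.

  • special_meeting_threshold_pct/written_consent_rights: Ease of calling actions outside annual meeting.

  • supermajority_voting_items: Critical items that need more than a simple majority to pass (for example, a two-thirds vote).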

Structural & Governance Analytics (governance_analytics)

  • recent_activism_flag/prev_contested_election_flag: Recent or ongoing activism.

  • recent_director_changes_flag/board_refreshment_score: Signals for change risk, potential value shifts.

  • flag_for_unusual_policy_change: Catches significant, potentially market-moving governance amendments.

Compensation Policy & Peer Comparison (compensation_policy_notes)

  • performance_metrics/long_term_incentive_weight_pct: Metrics and weighting used in comp plan design.

  • use_relative_vs_peers/non_gaap_metric_use_flag: Use of peer-relative performance; non-GAAP metric inclusion.

  • comp_consultant_name: External validation (Mercer, Willis, etc.).

 

Alpha Hypotheses

  • Say-on-pay backlash: support below 70 % followed by an “ISS Against” recommendation predicts −0.8 % abnormal return in the ten trading days post-filing as investors anticipate governance pressure.

  • CEO pay-versus-performance gap: a top-quartile positive gap (pay up, TSR down) forecasts under-performance over the next quarter as proxy advisors mobilise and activists accumulate.

  • Fresh shareholder proposals: the first appearance of a high-support (>40 %) governance proposal raises the probability of an activist campaign and increases event-risk-adjusted volatility by ~15 % in the following six months.

  • Board refresh rate: accelerated turnover (>30 % new directors over three years) links to higher subsequent ROIC and +30 bps alpha relative to sector medians, implying a potential long tilt toward actively refreshed boards.
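Two of the hypotheses above reduce to threshold screens, sketched below. The thresholds (70 % support, 30 % turnover) come from the text; the function names, the `iss_recommendation` input and the `new_directors_3y` count are hypothetical and do not appear in the sample record.

```python
def say_on_pay_backlash(prev_support_pct: float, iss_recommendation: str) -> bool:
    """True when prior support is below 70 % and the advisor recommends Against."""
    return prev_support_pct < 70.0 and iss_recommendation.strip().lower() == "against"


def board_refresh_rate(new_directors_3y: int, total_board_members: int) -> float:
    """Share of current directors who joined within the last three years."""
    if total_board_members <= 0:
        raise ValueError("board must have at least one member")
    return new_directors_3y / total_board_members


def accelerated_refresh(rate: float) -> bool:
    """True when more than 30 % of the board is new over three years."""
    return rate > 0.30
```

These screens only label events. Measuring the abnormal returns quoted in the hypotheses would still require an event-study step against the price tape, which is not shown here.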

Risks and Mitigation

  • Boiler-plate dilution — CD&A language can swamp sentiment models; always anchor on numeric fields (dollars, shares, percentages).

  • Multi-file sprawl — large-cap issuers often lodge separate appendices or incorporate by reference; failure to merge them can drop critical pay tables.

  • Optical-character errors — older scanned proxies mis-OCR “$1,000,000” as “$1000,000”; apply digit heuristics and manual review triggers.

  • Meeting-date drift — supplements and revisions (DEFA14A, DEFR14A) can shift record or meeting dates; ensure joins to the price tape use the latest filed metadata.

  • Crowding — simple CEO-pay ratio screens are commoditised; our edge comes from composite features like pay-TSR mis-alignment × ISS stance × dual-class flag.
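The digit heuristic mentioned under optical-character errors could look like the following sketch. It assumes US-style comma grouping and whole-dollar amounts; the `parse_usd` helper is hypothetical, and it returns the parsed value together with a flag that routes badly grouped amounts to manual review.

```python
import re
from typing import Tuple

# Well-formed grouping: one to three leading digits, then groups of exactly three.
WELL_GROUPED = re.compile(r"\d{1,3}(,\d{3})+")


def parse_usd(token: str) -> Tuple[int, bool]:
    """Return (value, suspicious) for a whole-dollar amount such as "$1,000,000"."""
    digits = token.strip().lstrip("$").strip()
    if not re.fullmatch(r"\d[\d,]*", digits):
        raise ValueError(f"not a dollar amount: {token!r}")
    # Commas present but not in thousands groups suggests an OCR slip.
    suspicious = "," in digits and WELL_GROUPED.fullmatch(digits) is None
    return int(digits.replace(",", "")), suspicious
```

For the example in the text, `parse_usd("$1000,000")` yields the value 1000000 with the suspicious flag set, while `parse_usd("$1,000,000")` yields the same value unflagged. The flag, not the repaired value, is what should trigger review, since a dropped digit would also break the grouping.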
