I like to write, and LLMs make it even better.
This page collects some of my notes from learning about and working with different data types.
DATA DUMP
Form 4 (Insider Transactions, SEC Filing)
Finding relevant features within Form 4s based on insider info, transaction details, post-transaction position, historical context, regulatory flags
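As a rough illustration of the transaction-detail and post-transaction-position features, here is a minimal sketch. The `Form4Transaction` record and its field names are hypothetical simplifications, not the actual filing schema; the only detail taken from Form 4 itself is that transaction code "P" marks a purchase and "S" a sale.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Form4Transaction:
    # Hypothetical, simplified record; a real Form 4 carries many more fields.
    insider_role: str          # e.g. "CEO", "Director"
    code: str                  # transaction code: "P" = purchase, "S" = sale
    shares: float              # number of shares transacted
    price: float               # price per share
    shares_owned_after: float  # holdings reported after the transaction


def form4_features(tx: Form4Transaction) -> dict:
    """Derive a few illustrative features from one transaction."""
    # Sign the share count so purchases add to holdings and sales subtract.
    if tx.code == "P":
        signed_shares = tx.shares
    elif tx.code == "S":
        signed_shares = -tx.shares
    else:
        signed_shares = 0.0  # other codes are ignored in this sketch

    shares_before = tx.shares_owned_after - signed_shares
    pct_change: Optional[float] = (
        signed_shares / shares_before if shares_before > 0 else None
    )
    return {
        "is_purchase": tx.code == "P",
        "transaction_value": tx.shares * tx.price,
        "pct_change_in_holdings": pct_change,
        "is_officer": tx.insider_role in {"CEO", "CFO", "COO"},
    }


example = form4_features(Form4Transaction("CEO", "P", 1000, 10.0, 5000))
```

The percentage change in holdings is often more informative than the raw share count, since the same purchase means something different for an insider holding 4,000 shares than for one holding 4 million.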
DEF 14A (Proxy Statement, SEC Filing)
Finding relevant features within DEF 14As based on metadata, proposals, board composition, compensation, shareholder rights, governance
10-K (Annual Report, SEC Filing)
Finding relevant features within 10-Ks based on filing structure, risk disclosures, MD&A analytics, financial statement metrics, footnotes, ESG, semantic NLP, visuals
10-Q (Quarterly Report, SEC Filing)
Finding relevant features within 10-Qs based on metadata, financial statement metrics, qualitative sections and risk disclosures, footnotes, ESG, semantic NLP
8-K (Current Report, SEC Filing)
Finding relevant features within 8-Ks based on filing metadata, item code information, text metrics, sentiment features, financial disclosures, topic modeling, named entity recognition, timing and temporal metadata
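The item-code features can be sketched with a small regular expression. This is a minimal example, assuming the filing text is already available as a plain string; the function names and the particular feature flags are illustrative choices, not a fixed scheme. Item 2.02 (Results of Operations and Financial Condition) and Item 9.01 (Financial Statements and Exhibits) are standard 8-K item numbers.

```python
import re

# Matches headings such as "Item 2.02" or "ITEM 9.01" (one digit, dot, two digits).
ITEM_RE = re.compile(r"\bItem\s+(\d\.\d{2})\b", re.IGNORECASE)


def extract_item_codes(text: str) -> list:
    """Return the distinct item codes in order of first appearance."""
    seen = []
    for code in ITEM_RE.findall(text):
        if code not in seen:
            seen.append(code)
    return seen


def item_code_features(text: str) -> dict:
    """Build a few illustrative features from the item codes in an 8-K."""
    codes = extract_item_codes(text)
    return {
        "item_codes": codes,
        "n_items": len(codes),
        "has_earnings_item": "2.02" in codes,   # earnings-related 8-K
        "exhibits_only": codes == ["9.01"],     # filing carries only exhibits
    }


sample = (
    "Item 2.02 Results of Operations and Financial Condition.\n"
    "The company announced its quarterly results.\n"
    "Item 9.01 Financial Statements and Exhibits.\n"
)
features = item_code_features(sample)
```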
Private Placement Memorandum (PPM)
Finding relevant data to extract from PPMs to enable better onboarding of fund documents, across categories such as issuer information, offering structure, fees, performance targets, fund management, strategy, legal risks, ESG
Startup Pitch Deck
Finding relevant data to extract from startup pitch decks to evaluate investment potential based on market opportunity, business traction, technology and product, go-to-market strategy, team, funding, investors, risks
Earnings Call Transcript
Finding relevant features within earnings call transcripts based on call metadata, sentiment metrics, emotion signals, Q&A interaction dynamics, market reaction
Credit / Debit-Card Transaction Panels
Finding relevant features within credit/debit card data based on identifiers, temporal, transaction content, merchant, behavioral rolling, customer profile, clustering
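One way to picture the behavioral rolling features is a trailing spend total per customer. The sketch below uses only the standard library and assumes each transaction is a hypothetical `(customer_id, date, amount)` tuple; the 7-day window is an arbitrary choice for illustration.

```python
from collections import defaultdict, deque
from datetime import date, timedelta


def rolling_spend(transactions, window_days=7):
    """Compute trailing spend and transaction count per customer.

    transactions: iterable of (customer_id, date, amount) tuples.
    Returns a list of (customer_id, date, amount, trailing_spend, trailing_count),
    where the trailing values cover the last `window_days` days, including
    the current transaction.
    """
    window = timedelta(days=window_days)
    recent = defaultdict(deque)   # customer -> transactions still inside the window
    totals = defaultdict(float)   # customer -> running sum over that window
    out = []
    # Process in chronological order so the window only ever moves forward.
    for cust, day, amount in sorted(transactions, key=lambda t: (t[1], t[0])):
        q = recent[cust]
        # Drop transactions that have fallen out of the trailing window.
        while q and q[0][0] <= day - window:
            _, old_amount = q.popleft()
            totals[cust] -= old_amount
        q.append((day, amount))
        totals[cust] += amount
        out.append((cust, day, amount, round(totals[cust], 2), len(q)))
    return out


rows = rolling_spend([
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 5), 20.0),
    ("b", date(2024, 1, 5), 7.0),
    ("a", date(2024, 1, 9), 5.0),
])
```

In the last row for customer "a", the January 1 purchase has aged out of the 7-day window, so the trailing spend covers only the January 5 and January 9 transactions.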
News on Webull
Finding relevant features within news headlines based on headline content, event annotations, publisher metrics, temporal context, behavioral signals, inter-headline relations
Twitter / X
Finding relevant features within tweets based on metadata, sentiment, volume, user diversity, topic, engagement, volatility