What a Real Track Record Looks Like

Research April 12, 2026 8 min read

The previous analysis argued that systematic process outperforms discretionary judgment through consistency. But a systematic process is only credible if its results are published transparently — otherwise it is just another black box making claims.

The investment industry’s relationship with performance reporting is an elaborate exercise in selective disclosure. Track records are curated, cherry-picked, backfill-adjusted, and survivorship-biased before they reach any investor. The result is that the performance data most widely presented to investors is systematically and materially misleading — not usually because of outright fraud, but because the reporting conventions that define standard practice have been developed to maximize persuasiveness, not accuracy.

Understanding what a real track record requires is, in effect, understanding how to strip away every layer of distortion that standard reporting adds. What remains, once those layers are removed, is either credible evidence or a marketing document.

The Cost Layer: Gross vs. Net

The first and most straightforward distortion is the gap between gross and net returns. Gross returns are what a strategy earned before fees. Net returns are what an investor actually received after the manager took their share.

The conventional hedge fund fee structure, a 2% annual management fee on assets plus a 20% performance fee on gains, creates a substantial and often underappreciated wedge. On a fund generating 15% gross returns, the arithmetic runs as follows: the 2% management fee reduces the return to 13%. The 20% performance fee on the remaining 13% takes an additional 2.6 percentage points. Net return to the investor: 10.4%. The manager collected 4.6 percentage points of a 15% return year, capturing roughly 31% of the upside.
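
The fee arithmetic above can be reproduced in a few lines. This is a minimal sketch under the same simplification the paragraph uses: the management fee is deducted first and the performance fee is charged on whatever gain remains. The function name `two_and_twenty` and its parameters are illustrative, not a standard API, and real fund documents may compute fees differently (for example on average assets, or with a hurdle rate).

```python
def two_and_twenty(gross_return, mgmt_fee=0.02, perf_fee=0.20):
    """Net return and total fees under the simplified arithmetic in the text.

    Assumption: the management fee is deducted first, and the performance
    fee is charged only on a positive return remaining after that deduction.
    """
    after_mgmt = gross_return - mgmt_fee
    performance_charge = perf_fee * max(after_mgmt, 0.0)
    net_return = after_mgmt - performance_charge
    total_fees = gross_return - net_return
    return net_return, total_fees


gross = 0.15
net, fees = two_and_twenty(gross)
manager_share = fees / gross  # fraction of the gross return kept by the manager

print(f"net return:     {net:.1%}")
print(f"total fees:     {fees:.1%}")
print(f"share of gross: {manager_share:.1%}")
```

Under these assumptions the sketch prints a net return of 10.4%, total fees of 4.6%, and a manager share of about 30.7% of the gross return.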

The asymmetry is more damaging in drawdown years. The manager’s 2% annual management fee continues to accrue regardless of performance. An investor who enters a fund before a 30% drawdown year watches their capital decline by 30% while simultaneously paying 2% in management fees — with no offsetting refund of performance fees earned in prior years in most fund structures. High-water marks protect against paying performance fees on recovered losses, but they do not protect against paying management fees during periods of persistent underperformance.
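
The asymmetry is easier to see over several years. The sketch below is a simplified model, not any specific fund's terms: it assumes the management fee is charged on start-of-year NAV, the performance fee applies only to NAV above the prior high-water mark, and earlier performance fees are never refunded. The return path (a good year, a 30% drawdown year, a partial recovery) is hypothetical.

```python
def simulate_fees(gross_returns, start_nav=100.0, mgmt_fee=0.02, perf_fee=0.20):
    """Yearly (net NAV, management fee, performance fee) under a high-water mark.

    Assumptions: management fee on start-of-year NAV; performance fee only on
    the portion of NAV above the previous high-water mark; no fee clawback.
    """
    nav = start_nav
    high_water_mark = start_nav
    rows = []
    for gross in gross_returns:
        mgmt_paid = mgmt_fee * nav
        nav_before_perf = nav * (1 + gross) - mgmt_paid
        perf_paid = perf_fee * max(nav_before_perf - high_water_mark, 0.0)
        nav = nav_before_perf - perf_paid
        high_water_mark = max(high_water_mark, nav)
        rows.append((nav, mgmt_paid, perf_paid))
    return rows


# Hypothetical path: +15%, then a 30% drawdown year, then +20%.
rows = simulate_fees([0.15, -0.30, 0.20])
for year, (nav, mgmt_paid, perf_paid) in enumerate(rows, start=1):
    print(f"year {year}: NAV {nav:7.2f}  mgmt fee {mgmt_paid:5.2f}  perf fee {perf_paid:5.2f}")
```

In this hypothetical path the high-water mark suppresses the performance fee in years two and three, because NAV never regains its year-one level, but the management fee is still paid in both years and the year-one performance fee stays with the manager.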

Any credible track record must report net-of-all-fees returns as the primary performance figure. Gross returns are an accounting artifact that does not represent any investor’s actual experience. Reporting gross returns as the headline figure and burying net returns in a footnote is standard practice and is uniformly misleading.

The Survivor Problem: What the Dead Funds Cannot Tell You

Survivorship bias is the most structurally embedded distortion in investment performance data. Funds that perform poorly close. Closed funds are removed from performance databases. The performance databases that investors use to evaluate managers therefore contain only the funds that survived — which are, by selection, the funds with relatively good records.

Fung and Hsieh’s 2000 paper “Performance Characteristics of Hedge Funds and Commodity Funds: Natural vs. Spurious Biases” (Journal of Financial and Quantitative Analysis) estimated that survivorship bias in hedge fund databases created an average upward bias in reported performance of approximately 2 to 3 percentage points annually. Their methodology was conservative: they estimated the return drag from including dead funds in the database rather than excluding them, using data from a period when database construction was still being standardized.

Other research from the same period examined the same problem. Ackermann, McEnally, and Ravenscraft’s 1999 paper “The Performance of Hedge Funds: Risk, Return, and Incentives” (Journal of Finance) also measured survivorship and related reporting biases in hedge fund databases, though estimates of the magnitude vary with the database and sample period used. The implication of a bias in the 2 to 3 point range is direct: when an investor evaluates a hedge fund universe and sees an average net return of 8%, the true average net return, including the funds that closed during the measurement period, was likely 5 to 6%. The investment universe looks better than it is because the worst evidence has been systematically removed before the investor sees it.
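
The mechanism can be illustrated with a toy simulation. Everything in this sketch is synthetic: the return distribution, the closure rule (a fund closes once its NAV falls below 70% of its starting value), and the function name `simulate_universe` are assumptions chosen for illustration, so the size of the gap it prints says nothing about the real-world magnitude. The point is only the direction: averaging over survivors overstates the average over every fund that existed.

```python
import random
from statistics import mean


def simulate_universe(n_funds=2000, n_years=10, mu=0.06, sigma=0.15,
                      close_below=0.70, seed=7):
    """Return (survivor_returns, all_returns) as flat lists of annual returns."""
    rng = random.Random(seed)
    survivor_returns, all_returns = [], []
    for _fund in range(n_funds):
        nav, history, alive = 1.0, [], True
        for _year in range(n_years):
            r = rng.gauss(mu, sigma)
            history.append(r)
            nav *= 1 + r
            if nav < close_below:  # fund closes and stops reporting
                alive = False
                break
        all_returns.extend(history)  # dead funds still count here
        if alive:
            survivor_returns.extend(history)  # what a survivor-only database shows
    return survivor_returns, all_returns


survivors, everyone = simulate_universe()
survivor_avg = mean(survivors)
full_avg = mean(everyone)
print(f"survivors only: {survivor_avg:.2%}")
print(f"all funds:      {full_avg:.2%}")
print(f"upward bias:    {survivor_avg - full_avg:.2%}")
```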

Survivorship bias operates below the fund level as well. Within a fund family that manages multiple strategies, poor-performing strategies are quietly closed or merged into better-performing ones, and the surviving record gets attributed to the manager’s overall track record. At the strategy level within a fund, signals or factors that underperformed get dropped from the model and their historical underperformance is no longer visible in the reported track record.

A credible track record must be generated from a specific, fixed strategy with no post-hoc modification, measured over the full period including all losing sub-periods, and reported without the ability to exclude any portion of the history on the grounds that the strategy was “being refined.”

The Backtest Problem: Paper Performance Is Not Real Performance

The distinction between backtested performance and live performance is the most important distinction in quantitative investment reporting, and it is routinely obscured.

A backtest is a simulation. The researcher applies a defined set of rules to historical data and calculates what returns would have been if those rules had been followed. Backtests have a structural bias toward overestimating live performance because they are conducted with knowledge of the outcome. Even researchers who make every effort to avoid look-ahead bias — using only data that would have been available at each historical decision point — cannot fully escape the fact that the strategy was designed with some awareness of what the historical data looks like.

The gap between backtested performance and live performance is consistently large. Correlation between a strategy’s backtested Sharpe ratio and its live Sharpe ratio is significantly below 1. The reasons are well understood: transaction costs in live trading are higher than backtest assumptions, position sizes in live markets create market impact that backtests cannot model, fills are partial and timing is imperfect, and the strategies that get deployed live are the ones that looked best in backtesting — creating a selection effect where live strategies start from an already-elevated backtested baseline.

Lopez de Prado’s 2018 book “Advances in Financial Machine Learning” (Wiley) documents the multiple testing problem that amplifies backtest inflation for ML-based strategies specifically: when a researcher tests many variations of a strategy and selects the best-performing version for deployment, the selected strategy’s backtest can be expected to overstate its live performance, and the expected overstatement grows with the number of variations tested. The strategy was not found; it was mined from the data. Live performance cannot be expected to match a number derived through data mining.
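
A small experiment shows the selection effect in isolation. This is an illustrative sketch, not the book's procedure: every "variation" here is pure noise with zero true edge, the volatility and sample length are arbitrary assumptions, and the risk-free rate is taken as zero. The best in-sample Sharpe ratio still rises as more variations are tried, while the live Sharpe ratio of the selected variation is just another draw of noise.

```python
import random
from statistics import mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(returns) / stdev(returns) * periods_per_year ** 0.5


def best_backtest_vs_live(n_variations, n_days=252, daily_vol=0.01, seed=11):
    """Pick the best of n zero-edge variations in-sample, then 'trade' it live."""
    rng = random.Random(seed)
    backtests = [[rng.gauss(0.0, daily_vol) for _day in range(n_days)]
                 for _variation in range(n_variations)]
    best_in_sample = max(sharpe(r) for r in backtests)
    # The selected variation has no real edge, so its live returns are fresh noise.
    live = [rng.gauss(0.0, daily_vol) for _day in range(n_days)]
    return best_in_sample, sharpe(live)


results = {n: best_backtest_vs_live(n) for n in (1, 10, 100, 1000)}
for n, (in_sample, live) in results.items():
    print(f"{n:>5} variations: best backtest Sharpe {in_sample:5.2f}, live Sharpe {live:5.2f}")
```

Because the same seed is reused, each larger run contains the smaller runs' variations, so the best backtest figure can only stay flat or rise as the count grows.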

A credible track record labels every return clearly as backtested, paper traded, or live. It reports the live track record as the primary evidence and treats backtested returns as hypothesis generation rather than performance evidence. Any presentation that blends backtested and live returns into a single continuous equity curve without explicit labeling is obscuring information that any rational investor would want to have.
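
One way to make the labeling requirement hard to violate is to attach the label to every return record and refuse to compound across labels. The sketch below is a hypothetical data layout, not a reporting standard: the names `Source`, `MonthlyReturn`, and `cumulative_by_source` and the sample figures are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Source(Enum):
    BACKTEST = "backtest"
    PAPER = "paper"
    LIVE = "live"


@dataclass(frozen=True)
class MonthlyReturn:
    month_end: date
    net_return: float  # net of all fees and costs
    source: Source


def cumulative_by_source(records):
    """Compound net returns separately for each label; never blend across labels."""
    growth = {}
    for rec in records:
        growth[rec.source] = growth.get(rec.source, 1.0) * (1 + rec.net_return)
    return {source: g - 1 for source, g in growth.items()}


# Hypothetical records for illustration only.
records = [
    MonthlyReturn(date(2024, 1, 31), 0.03, Source.BACKTEST),
    MonthlyReturn(date(2024, 2, 29), 0.02, Source.BACKTEST),
    MonthlyReturn(date(2024, 3, 31), 0.01, Source.PAPER),
    MonthlyReturn(date(2024, 4, 30), -0.02, Source.LIVE),
    MonthlyReturn(date(2024, 5, 31), 0.01, Source.LIVE),
]
summary = cumulative_by_source(records)
for source, total in summary.items():
    print(f"{source.value:>8}: {total:+.2%}")
```

Keeping the three figures separate means the live number is never flattered by the simulated months that came before it.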

The Selection Effect: Showing Only Winners

Performance reporting at the strategy selection level has a selection effect problem analogous to publication bias in academic research. Managers report the strategies that worked. The strategies that did not work are not reported.

This applies both to forward-looking performance claims and to historical analysis. An asset manager who ran five strategies over the past decade, three of which underperformed and two of which outperformed, will present the two outperformers in their marketing materials. The three underperformers are described as “legacy strategies no longer offered” or simply omitted. The investor who sees only the two successful strategies is seeing a deliberately incomplete picture.

The SEC has taken enforcement action on precisely this issue. Under Rule 206(4)-1 of the Investment Advisers Act, it is unlawful for an investment adviser to include in advertisements any untrue statement of a material fact or to omit a material fact necessary to make the statement not misleading. The SEC’s 2022 guidance on investment adviser marketing rules specifically addressed cherry-picking — the practice of selecting from a portfolio of accounts or strategies only those that performed well when showing historical performance — as a deceptive practice.

Cherry-picking is a separate problem from the gross-versus-net and dead-fund distortions. It operates at the level of what gets included in the track record at all. Any report that does not cover all strategies run by the manager, not just the successful ones, is at minimum incomplete.

Maximum Drawdown: The Number Nobody Leads With

Maximum drawdown, the largest peak-to-trough decline in the strategy’s equity curve over the full measurement period, is one of the most informative risk statistics an investor can see, and one of the most consistently underemphasized in marketing-oriented performance presentations.

Returns are routinely presented as annualized CAGR without any companion statistic describing the path of those returns. A strategy that earned 12% CAGR over 15 years looks the same in that single number whether the path was a smooth upward trend or included a 45% drawdown from which recovery took four years. The investor’s experience of those two paths is completely different. Their ability to maintain the strategy — to avoid the behavioral costs documented in the Dalbar research — is completely different.

Maximum drawdown, paired with time-to-recovery, provides the information that CAGR alone obscures. The investor who understands that a strategy has historically declined by up to 35% from peak, and that recovery from such declines has taken 18 to 30 months, has the information needed to make a realistic judgment about whether they can hold through such a period without making the behavioral errors that destroy the return they were promised in the CAGR figure.

A credible track record leads with risk metrics alongside return metrics: maximum drawdown, time to recovery from maximum drawdown, Sharpe ratio (net of all costs), and Calmar ratio (annualized return divided by maximum drawdown). The return figure reported without these companions is an incomplete description of what the investor actually experienced.
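
These companion statistics are straightforward to compute from a net equity curve. The sketch below makes several assumptions worth stating: recovery time is counted from the trough until the prior peak is regained (some reports count from the peak instead), the Sharpe ratio uses a zero risk-free rate, and the annual equity curve is hypothetical. The function names are illustrative.

```python
from statistics import mean, stdev


def max_drawdown(equity):
    """Largest peak-to-trough decline as a positive fraction, plus peak and trough indices."""
    peak_idx, worst, worst_peak, worst_trough = 0, 0.0, 0, 0
    for i, value in enumerate(equity):
        if value > equity[peak_idx]:
            peak_idx = i
        decline = 1 - value / equity[peak_idx]
        if decline > worst:
            worst, worst_peak, worst_trough = decline, peak_idx, i
    return worst, worst_peak, worst_trough


def periods_to_recover(equity, peak_idx, trough_idx):
    """Periods from the trough until the prior peak is regained; None if it never is."""
    for i in range(trough_idx, len(equity)):
        if equity[i] >= equity[peak_idx]:
            return i - trough_idx
    return None


def cagr(equity, periods_per_year):
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1 / years) - 1


def sharpe(net_returns, periods_per_year):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(net_returns) / stdev(net_returns) * periods_per_year ** 0.5


# Hypothetical year-end net equity values.
equity = [100.0, 120.0, 90.0, 99.0, 125.0, 140.0]
net_returns = [b / a - 1 for a, b in zip(equity, equity[1:])]

drawdown, peak, trough = max_drawdown(equity)
recovery = periods_to_recover(equity, peak, trough)
growth = cagr(equity, periods_per_year=1)
calmar = growth / drawdown

print(f"CAGR:          {growth:.2%}")
print(f"max drawdown:  {drawdown:.2%}")
print(f"recovery:      {recovery} years from trough")
print(f"Sharpe (net):  {sharpe(net_returns, 1):.2f}")
print(f"Calmar:        {calmar:.2f}")
```

For this hypothetical curve the maximum drawdown is 25% (120 to 90) and the prior peak is regained two years after the trough, context that the CAGR figure alone would not reveal.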

What a Real Track Record Contains

The standard for a credible investment track record is not complicated, but it is high enough that most published track records do not meet it:

Net-of-all-fees returns, with explicit disclosure of the fee structure applied. No gross return figures presented as the primary performance metric.

Full time period with no gaps, starting from the strategy’s actual inception date, including all drawdown periods, all underperformance periods, and all years that the strategy performed poorly relative to its benchmark.

Clear labeling of backtested versus paper-traded versus live returns, with live returns treated as the primary evidence and backtest returns treated as hypothesis generation.

All strategies run by the manager, not a selected subset of successful ones. If legacy strategies were closed due to underperformance, that fact and their performance record should be disclosed.

Maximum drawdown, time to recovery, volatility, and Sharpe ratio presented alongside return figures as first-class statistics, not footnotes.

Methodology disclosure sufficient for an independent party to verify that the reported returns are consistent with the stated strategy — not sufficient to replicate the implementation, but sufficient to confirm that the results are arithmetically consistent and not dependent on look-ahead bias or data snooping.

The Transparency Standard

Any investment process that claims systematic or quantitative rigor bears a particular obligation here. The entire premise of systematic investing — as argued in the preceding analysis in this series — is that process consistency and reproducibility are the source of edge. A systematic process whose results cannot be examined critically is making the strongest possible claim (algorithmic discipline) while refusing to submit to the most basic evidentiary standard (verifiable results).

The credibility of a systematic approach rests entirely on the quality and integrity of its reported results. Without that, the claim of systematic discipline is indistinguishable from any other marketing claim made by any manager who prefers opacity.

This publication will launch a transparent trading journal. It will show delayed results from a live systematic process — including every loss, every defensive action, every month the system underperformed its benchmark. The numbers will be real. The failures will be published alongside the successes. That is the standard.
