Sound judgment toolkit: Representativeness

Why I prefer secondary over primary data, why I read first & talk later, how I try to limit bias.

May 04, 2021

Bias is an increasingly fashionable term amidst polarized discourse. Fashionable implies that we refer to it more, not that it’s increased over time. Buggy humans have been biased forever and won’t ever change. To make it worse, we’re biased while believing that only others are so, especially those on the other team. We’re as hypocritical as we’re biased.

How can I limit damage from bias and improve judgment, in my day job and otherwise? Not by pretending to be objective or meditating my way into illusory self-awareness. To continue from my previous essay, it’s easier to change the context than to change the individual. It’s easier to not stock chips than to test my self-control. I have found it helpful to focus on improving representativeness of evidence as an indirect path to limiting (never eliminating) effects of bias.

Representativeness means all evidence, not some. Evidence pertaining to all of the relevant universe, not a cherry-picked part of it. In studying any company, first step is to map its entire industry, broadly defined, using reliable secondary data. Ideally, this shouldn’t be limited to India. Or to present time. Over decades, across places, what did industry look like? How did comparable companies fare? How does company in question stack up against its contextual background? Does evidence suggest company is in a good place? Special within it?

Representativeness means viewing absence of evidence as evidence too. In studying industries with high fatality rates, it’s not helpful to find five survivors who’ve done well without realizing that a hundred started out. Or get carried away by up-cycle data, without reviewing decades. This is especially true in my own industry where active manager performance looks even worse after incorporating survivorship bias. In analysing frauds like Madoff, absence of volatility is relevant evidence. Company in question not making ruinous decisions that peers did counts as evidence. Figuring out what’s missing or what was consciously avoided is crucial. Actively evaluating counterfactual narratives is part of representativeness.
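To put rough numbers on the five-survivors-out-of-a-hundred point, here is a minimal sketch. The returns are made up for illustration, and treating every non-survivor as a total loss (`failed_return=-1.0`) is an assumption, not data.

```python
from statistics import mean


def whole_cohort_average(survivor_returns, cohort_size, failed_return=-1.0):
    """Average return across everyone who started, not just those still standing.

    failed_return is an assumption: -1.0 treats each non-survivor as a total loss.
    """
    if cohort_size < len(survivor_returns):
        raise ValueError("cohort cannot be smaller than the number of survivors")
    failed = cohort_size - len(survivor_returns)
    total = sum(survivor_returns) + failed * failed_return
    return total / cohort_size


# Hypothetical returns (as multiples of capital gained) for the five visible survivors.
survivors = [4.0, 3.0, 2.5, 2.0, 1.5]

survivors_only = mean(survivors)
whole_cohort = whole_cohort_average(survivors, cohort_size=100)

print(f"survivors only: {survivors_only:.2f}")
print(f"whole cohort:   {whole_cohort:.2f}")
```

Same five companies, opposite conclusion. The only thing that changed is whether the ninety-five that disappeared are counted.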

Representativeness means never viewing anecdotes in isolation. It’s easy to talk to a few industry ‘experts’ or pick up a paper to read headlines, and feel like an ‘expert’ myself. In my experience, doing so with a ‘blank slate’ is positively dangerous. Since minds are hard to change, an initial unrepresentative opinion is a hole that’s hard to get out of. When I study a company for the first time, I don’t schedule any diligence-calls without at least a fortnight of data gathering, reading, chewing the cud, seeking out more info and figuring out questions that matter. Regulatory filings of Western companies with cleverly hidden segmental details that overlap with my company of interest or credit rating notes with colour on unlisted competitors are invaluable. Having digested broad-based published evidence, I form a mental map that’s relatively uncoloured by anecdotes or impressions. Subsequent discussions with ‘experts’ are to refine this map. I juxtapose what I directly hear from a few against what I’ve indirectly learnt from many. If I hear something incongruous, I make a note and see if any other ‘expert’ makes the same point without prompting. If there’s a pattern, I revisit broader evidence with more pointed questions and update my mental map. If there are only stray anecdotes, I don’t necessarily do so. Don’t ignore stories, but don’t start or stop with them.

Representativeness means recognizing that statistics are stories too. Less vivid, but way more numerous. Sometimes, investors tell me how they commissioned primary research and spoke to hundreds of customers to derive profound insights. Not only have I never done this, I am unsure of the utility of asking for opinions. We don’t really know why we do what we do (but are ready with compelling rationalizations). I draw solace from knowing that secondary data is a summation of millions of customer opinions. Without a questionnaire, but backed by real money. When I study the paints or battery industry using decadal data covering a vast majority of that universe, my (free) sample runs into millions. I don’t have to eavesdrop on every purchase to form a representative view of what consumers prefer. Why bother with opinion polls when I already have election results?

Representativeness means calibrating opinions. While we cannot avoid opinions as an input in messy world, we can calibrate opinion-givers to decide what weightage to assign. Luckily, opinion-givers too have a track record. I usually go through every conference call transcript, sometimes spanning decades, keeping track of what was said and what actually happened. Ditto with external ‘experts’ and analysts. It not only gives colour on what went on behind the evidence, but lets me gauge reliability of protagonists. I know who’s conservative, who bullshits and who’s an inveterate idiot. Against my default setting of zero weightage to ‘experts’ and opinions, I gradually make exceptions for a few whose opinions are most representative of reality. By implication, I am perennially judging every person I encounter. Sorry, occupational hazard.
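The bookkeeping behind this can be sketched in a few lines. This is an illustration under assumptions, not my actual tooling: the record format, the names and the `min_calls` cut-off are all hypothetical, and real forecasts are rarely as clean as a yes/no call.

```python
from collections import defaultdict


def hit_rates(records, min_calls=3):
    """Share of calls each speaker got right.

    records: iterable of (speaker, predicted_up, actually_up) tuples.
    Speakers with fewer than min_calls records are left out, which keeps
    them at the default of zero weight until they build a track record.
    """
    tally = defaultdict(lambda: [0, 0])  # speaker -> [hits, total]
    for speaker, predicted, actual in records:
        tally[speaker][1] += 1
        if predicted == actual:
            tally[speaker][0] += 1
    return {
        speaker: hits / total
        for speaker, (hits, total) in tally.items()
        if total >= min_calls
    }


# Hypothetical log built from old transcripts: what was said vs what happened.
records = [
    ("CEO A", True, True), ("CEO A", True, True),
    ("CEO A", False, False), ("CEO A", True, False),
    ("Analyst B", True, False), ("Analyst B", True, False),
    ("Analyst B", True, False), ("Analyst B", True, True),
    ("Expert C", True, True),  # a single call: not enough to judge
]

rates = hit_rates(records)
print(rates)
```

Someone with one lucky call does not get a score at all. That is deliberate: weight is earned over many calls, not granted on the first impressive one.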

Representativeness is removing my own discretion. To find out who’s the lowest cost commodity producer, I don’t start with analyst opinion or my first impression. I also don’t go company-wise micro-adjusting financial statements of my preferred candidate. I decide which parameters to base this on and apply a mechanical screen on downloaded data across dozens of companies to see how companies fare. Sure, there’ll be idiosyncratic effects that I’ll factor into next level of analysis, but my first-cut should be an unbiased sweep across the world as it is, not as I want it to be. If representative analysis shows my favourite to be outside top-5, it’s harder for me to sustain self-delusion (not impossible, though). Key is to pick robust criteria and apply method to all, never a few. Poking around selectively can yield stories or tell me what I want to hear. Without systematically poking around everywhere, there’s no pattern or inference. Especially in relatively crowded industries, this approach has been invaluable to picking objectively superior companies over seemingly fashionable ones.
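A mechanical first-cut of this kind might look like the sketch below. The companies, metrics and numbers are invented for illustration, and summing ranks across criteria is just one simple way to combine them; I am not claiming it is the best one.

```python
def screen(companies, criteria):
    """Rank every company on every criterion, then order by total rank.

    companies: {name: {metric: value}}
    criteria:  {metric: True if lower is better, False if higher is better}
    Returns names sorted best first. Ties break alphabetically.
    """
    scores = {name: 0 for name in companies}
    for metric, lower_is_better in criteria.items():
        ordered = sorted(
            companies,
            key=lambda name: companies[name][metric],
            reverse=not lower_is_better,
        )
        for rank, name in enumerate(ordered, start=1):
            scores[name] += rank
    return sorted(companies, key=lambda name: (scores[name], name))


# Hypothetical commodity producers. "Favourite" is the one I walked in liking.
companies = {
    "Alpha": {"cash_cost_per_tonne": 310, "ebitda_margin": 0.22},
    "Bravo": {"cash_cost_per_tonne": 280, "ebitda_margin": 0.27},
    "Charlie": {"cash_cost_per_tonne": 350, "ebitda_margin": 0.15},
    "Favourite": {"cash_cost_per_tonne": 340, "ebitda_margin": 0.18},
}

# Criteria are fixed before looking at any single company.
criteria = {"cash_cost_per_tonne": True, "ebitda_margin": False}

ranking = screen(companies, criteria)
print(ranking)
```

The point is in the order of operations: criteria first, every company second, my favourite last. If it lands third out of four, the screen has done its job.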

But how about chor promoters or dodgy accounting, where secondary data is unreliable? Fun part about lies is that it’s hard to keep them consistent. When my sweep is rigorous, bad apples end up sticking out on a few key parameters. Underproviding for bad loans, extra-long debtor days or absurdly high capex intensity (a favoured spot for siphoning cash) is best seen in light of what’s par for that peerset. While I can poke around further to decode exact nature of malfeasance, my broader objective of not getting suckered is already met.
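Seeing a number "in light of what’s par for that peerset" can be made mechanical too. The sketch below measures each company's distance from the peer median in units of median absolute deviation. That choice of measure, the threshold of 3 and the debtor-days figures are all assumptions for illustration.

```python
from statistics import median


def flag_outliers(peers, metric, threshold=3.0):
    """Names whose metric sits far from the peer median.

    Distance is measured in median absolute deviations (MAD), which a single
    extreme value cannot distort the way it distorts a mean.
    threshold=3.0 is an arbitrary illustrative cut-off.
    """
    values = [row[metric] for row in peers.values()]
    mid = median(values)
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        return []  # no spread among peers, so nothing to compare against
    return sorted(
        name
        for name, row in peers.items()
        if abs(row[metric] - mid) / mad > threshold
    )


# Hypothetical peer set: debtor days for six companies in one industry.
peers = {
    "A": {"debtor_days": 45},
    "B": {"debtor_days": 50},
    "C": {"debtor_days": 48},
    "D": {"debtor_days": 52},
    "E": {"debtor_days": 47},
    "Suspect": {"debtor_days": 140},
}

flagged = flag_outliers(peers, "debtor_days")
print(flagged)
```

A flag is a prompt to dig or to walk away, not proof of malfeasance. There can be legitimate reasons for sticking out, which is what the next level of analysis is for.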

Statistics gets a bad rap, more so in times like these. It’s dry and feels heartless. Whether in investing or in the real world, we tend to overweight vivid stories, moving anecdotes and first impressions. Effectively, this is analysis by walking around and looking selectively. Even if we were robots, it’s an unreliable method. For buggy humans, it’s disastrous, reinforcing innate bias with selection bias and confirmation bias, leading to poor inferences and decisions. In investing decisions, I have been well served by forcing myself to get broad-based, representative evidence. Weighing secondary data over primary impressions has improved decision-quality. I have made my peace with this impersonal approach via the realization that statistics is merely an aggregation of stories, albeit non-detailed. While each story may have been stripped of vividness, I process millions of them without their being cherry-picked. While I am still biased, my inputs are less so.

Representativeness simply means seeking evidence that represents reality. All of it. Without selectivity or photo-shopping. Zoom out to the most panoramic view possible before clicking. Ideally, convert photo into movie, by adding a time dimension. Since there’s no pixel or bandwidth constraint, picture can still be high-resolution. We can truly see how our area of interest contrasts against its broader backdrop. Any filters are to be applied to the entire panorama, not just to our favourite part. I can still cheat by cropping out inconvenient bits, but there’s a difference between doing so explicitly vs implicitly. A bit of dissonance creeps in, making it a wee bit harder for me to airbrush reality.

Buggy Humans in a Messy World
