I spent the first week of 2021 in the same way that I have spent the first week of every year since 1995, collecting data on publicly traded companies and analyzing how they navigated the cross currents of the prior year, in both operating and market value terms. I knew that this year would be more challenging than most other years, for two reasons. The first was that the shutdown of the global economy, initiated by the spread of COVID early last year, had significant effects on the operations of companies in different sectors, and across the world. The second was that, starting mid-year in 2020, equity markets and the real economy moved in different directions, with the former rising on the expectation of a post-virus future, and the latter languishing, as most of the world continued to operate with significant constraints. In this post, I will start with a rationalization of why I do this data analysis every year, follow up with a description (geographic and sector) of the overall universe of companies that are in my analysis, list out the variables that I estimate and report, and conclude with a short caveat about 2020 data.
Data: A Pragmatist View
We live in the age of data worship, where investors, analysts and businesses all seem to have bought into the idea that big data has answers for every question and that collecting the data (or paying for it) will create positive payoffs. I am a skeptic and I have noted that to make money on big data, two conditions have to be met:
- If everyone has it, no one does: I believe that if everyone has a resource or easy access to that resource, it is difficult to make money off that resource. Applying that concept to data, the most valuable data is unique and exclusively available to its owner, and the further away you get from exclusivity, the less valuable data becomes.
- Data is not dollars: Data is valuable only if it can be converted into a product or service, or improvements thereof, allowing a company to capture higher earnings and cash flows from those actions. Data that is interesting but that cannot be easily monetized through products or services is not as valuable.
- Gain perspective: One of the challenges of being a business or an investor is developing and maintaining perspective, i.e., a big picture view of what comprises normal, high or low. Consider, for instance, an investor who picks stocks based upon price to book ratios, who finds a stock trading at a price to book ratio of 1.5. To make a judgment on whether that stock is cheap or expensive, she would need to know what the distribution of price to book ratios is for companies in the sector that the company operates in, and perhaps in the market in which it is traded.
- Clear tunnel vision: Investors are creatures of habit, staying in their preferred markets, and often within those markets, in their favored sectors. Equity research analysts are even more focused on handfuls of companies in their assigned industries. So what? By focusing so much attention on a small subset of companies, you risk developing tunnel vision, especially when doing peer group comparisons. Thus, an analyst who follows young technology companies may decide that paying ten times revenues for a company is a bargain, if all of the companies that he tracks trade at multiples greater than ten times revenues. Nothing is lost, and a great deal is gained, by stepping back from your corner of the market and looking at how stocks are priced across industries and markets.
- Expose BS: I know that everyone is entitled to their opinions, but they are not entitled to their facts. I am tired of market experts and analysts who make assertions, often based upon anecdotal evidence and plainly untrue, to advance a thesis or agenda. Are US companies more levered than they were a decade ago? Do US companies pay less in taxes than companies in other parts of the world? Are companies that buy back stocks more financially fragile than companies that do not? These are questions that beg to be addressed with data, not emotions or opinions. (I know, I know... You would like those answers now, but stay tuned to the rest of my data updates and the data will speak for itself.)
- Challenge rules of thumb and conventional wisdom: Investing has always had rules of thumb on how and when to invest, ranging from using historical PE or CAPE ratios to decide if markets are overvalued, to simplistic rules (e.g., buy stocks that trade at less than book value or trade at PEG ratios less than one) for individual stocks. It is very likely that these rules of thumb were developed from data and observation, but at a different point in time. As markets and companies have moved on, many of them no longer work, but investors continue to use them. To illustrate, consider a practice in valuation, where analysts are trained to add a small cap premium to discount rates for smaller companies, on the intuition that they are riskier than larger companies. The small cap premium was uncovered in the early 1980s, when researchers found that small cap stocks in the US earned significantly higher returns than large cap stocks, based upon data from 1926 to 1980. In the decades since, the small cap premium has disappeared in returns, but inertia and laziness have kept the practice of adding a small cap premium alive.
- Currency: One of the challenges of dealing with a global sample is that you are working with accounting and market numbers quoted in multiple currencies, and since you cannot aggregate or average across them, I will employ two techniques. First, all value numbers (like market capitalization, debt or revenues) that I aggregate or average will be converted into US dollars to ensure currency consistency. Second, most of the numbers that I report will be ratios, where the currencies are no longer an issue; a PE ratio for a Turkish company, with market cap and earnings denominated in Turkish Lira, can be compared to the PE ratio for a US company, with market cap and earnings denominated in US $. It is true that the Turkish company will face more risk because of its location, but that is an issue separate from currency.
- Missing Data: Information disclosure requirements vary across the world, and there are some data items that are available for some companies or some markets, and not for others. Rather than remove all firms with missing data, which will eliminate a large portion of my sample, I keep the firms in the sample and report only the variable/metric that is affected by the missing item as "NA". For instance, I have always computed the present value of lease commitments in future years and treated that value as debt, a practice that IFRS and GAAP adopted in 2019, but that computation requires explicit disclosures of lease commitments in future years. That is standard practice in the United States, but not in many emerging markets. Rather than skip the computation for all companies or remove all companies with missing lease commitments, I compute lease debt for those companies that report commitments and report it as zero for those companies that do not, an imperfect solution but the least imperfect of the many choices.
- Accounting Differences: In addition to disclosure differences, there are also accounting differences in revenue recognition, expensing rules, depreciation methods and other details across markets. I work with the publicly available data, recognizing that net income for a Japanese company may be measured differently than earnings for an Indian company, and accept that this may skew the results. However, it is worth noting that accounting rules around the world have converged over the last four decades, and they share a lot more in common than they have as differences.
- Source Reliability and Errors: I obtain my raw data from S&P, Bloomberg and others, and I am grateful that I can do that, because I could never hand collect and input data on 40,000+ companies. They, in turn, obtain the data from computerized databases in different markets that collect public filings, and at every stage in this process, there is the possibility of errors. I do run some simple error checks to eliminate obvious mistakes, but I am sure that there are others that I miss. My defense is that, unless the mistake is on a very large scale, the impact it has on my group statistics is small, simply because of my sample size. In addition, there is also the possibility of accounting fraud in some companies, and there is little or nothing that can be done about that.
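As a rough illustration of the lease computation described under Missing Data, here is a minimal Python sketch. It is not the exact procedure behind the datasets: the function name `lease_debt`, the use of a pre-tax cost of debt as the discount rate, the assumption that each commitment is paid at the end of its year, and the sample figures are all assumptions made for the example.

```python
from typing import Optional, Sequence


def lease_debt(commitments: Optional[Sequence[float]],
               pre_tax_cost_of_debt: float) -> float:
    """Present value of future lease commitments, treated as debt.

    commitments[i] is the amount due at the end of year i + 1 (assumed timing).
    Companies that do not disclose commitments get a lease debt of zero,
    mirroring the fallback described in the text.
    """
    if not commitments:
        return 0.0
    return sum(
        payment / (1 + pre_tax_cost_of_debt) ** year
        for year, payment in enumerate(commitments, start=1)
    )


# Hypothetical figures in US$ millions, for illustration only.
disclosed = lease_debt([100.0, 100.0, 100.0], pre_tax_cost_of_debt=0.05)
undisclosed = lease_debt(None, pre_tax_cost_of_debt=0.05)
print(f"Lease debt with disclosure: {disclosed:.2f}")
print(f"Lease debt without disclosure: {undisclosed:.2f}")
```

The zero fallback understates debt for companies that lease heavily but do not disclose commitments, which is the imperfection acknowledged above.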
- US Equity Risk Premiums: The equity risk premium is the price of risk in equity markets. In my view, it is the most comprehensive measure (much more so than PE ratios or other pricing multiples) of how stocks are being priced, with a higher (lower) equity risk premium correlating with lower (higher) stock prices. The conventional approach to measuring this premium is looking at past returns on stocks and treasuries (or something close to riskfree) and measuring the difference in historical returns and I report the updated levels (through 2020) for historical premiums for stocks over treasuries in this dataset. I have argued that this approach is both backward looking and static, and have computed and reported forward-looking and dynamic equity risk premiums, based upon current stock price levels and expected future cash flows; you can find both the current level and past values in this dataset.
- Country Risk Premiums: In a world where both investing and business are globalized, we need equity risk premiums for markets around the world, not only to value companies listed in those markets, but also to value companies that have operations in those countries. I will not delve into the details here, but I use the same approach that I have used for the last 30 years to estimate the additional risk premiums that I would charge for investing in other markets to get equity risk premiums for almost every country in the world in this dataset.
- Bond Default Spreads: The bond default spread is the price of risk in corporate bond (lending) markets, and as with the equity risk premium, higher (lower) spreads go with lower (higher) corporate bond prices. In the same dataset where I compute historical equity risk premiums, I report historical returns on corporate bonds in two ratings classes (Moody’s Aaa and Baa ratings). I also report estimates of the default spreads based upon current yields on bonds in different ratings classes and the current riskfree rate.
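To make the idea of a forward-looking equity risk premium concrete, here is a minimal Python sketch of one way to back an implied premium out of a price level and expected cash flows. The post does not spell out the model, so everything here is an assumption for illustration: the function names, the explicit forecast period followed by a perpetual-growth terminal value, the bisection solver, and the input numbers.

```python
def present_value(rate, cash_flows, terminal_growth):
    """PV of year-end cash flows plus a growing perpetuity after the last one."""
    pv = sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))
    terminal_value = (
        cash_flows[-1] * (1 + terminal_growth) / (rate - terminal_growth)
    )
    return pv + terminal_value / (1 + rate) ** len(cash_flows)


def implied_equity_risk_premium(index_level, cash_flows, riskfree_rate,
                                terminal_growth):
    """Solve for the discount rate that sets PV equal to the index level,
    then subtract the riskfree rate. Assumes positive cash flows, so PV
    falls as the discount rate rises and bisection applies."""
    low, high = terminal_growth + 1e-6, 1.0
    for _ in range(200):
        mid = (low + high) / 2
        if present_value(mid, cash_flows, terminal_growth) > index_level:
            low = mid   # PV too high: the required return must be higher
        else:
            high = mid  # PV too low: the required return must be lower
    return (low + high) / 2 - riskfree_rate


# Hypothetical inputs, not actual market data.
expected_cash_flows = [50.0, 53.0, 56.0, 59.0, 62.0]
erp = implied_equity_risk_premium(
    index_level=1000.0,
    cash_flows=expected_cash_flows,
    riskfree_rate=0.01,
    terminal_growth=0.01,
)
print(f"Implied equity risk premium: {erp:.2%}")
```

Holding the cash flows fixed, a higher index level produces a lower implied premium, which is the inverse relationship between premiums and prices noted above.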
- Industry: Data can be consolidated by geography, industry or company size, and I use all three, to some extent or the other. My primary consolidation is by industry and I break down my sample of 46,579 firms into 94 industry groupings. To make this classification, I start with the industry classifications that are in my raw data, but create my own industry groups, again to prevent stepping on the toes of my data providers. I know that this description is opaque, but the best way to understand my industry groups is to go to this dataset, where I report the companies that I include in each industry.
- Geography: There is one dataset where I look at companies broken down by country, and I report a number of different statistics for 136 countries. I would caution you to take this data with a grain of salt, since there are only a handful of listings in some countries. I do report much more data for a broader geographical classification, where I classify firms into six broad geographical groupings. Note that while emerging markets is a very large and diverse group, I do report statistics for India and China, two of the bigger components, separately.
- Averaging: I hope it does not sound patronizing, but I want to explain how I compute group values (averages), because it is not as obvious as it sounds. To illustrate why, consider the challenge of computing the PE ratio for companies in the software industry. You could compute the PE ratio for each company in the group and take a simple average, but that approach has two problems. First, it weights small firms as much as large firms, and outliers can cause the average to take on outlandish values. Second, it eliminates firms that have negative earnings (and thus have no PE ratios) from the sample, potentially creating biased samples. Using a weighted average PE ratio can counter the first problem and using a median can reduce the outlier effect, but neither approach can deal with the second problem (of sample bias). For most of my industry averaging, I use an aggregation approach, where I compute ratios using aggregated values; to compute the PE ratio for chemical companies, I add up the market capitalizations of all chemical companies and divide that number by the aggregate net income of all chemical companies (including money losers). This ensures that (a) all companies are counted, (b) the computed number is weighted since larger companies contribute more to the aggregate and (c) the risk of outliers is reduced, since it is less likely to occur in a large sample than for an individual firm.
- Current data: The focus of this data update is to report on how companies did in 2020, rather than to provide historical time series. Since I am updating the data in early January 2021, and the complete numbers for 2020 will not be available until March or April at the earliest, I will be using the trailing twelve month numbers for operating variables (like revenues and operating income) to compute ratios. For accounting numbers, that effectively means the twelve months through September 30, 2020. This practice of focusing on current data can cause the computed numbers to be volatile, but I do have the archived data going back in time (more than 20 years in the US, less for other parts of the world) at this link.
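The three averaging approaches described under Averaging can be compared side by side in a short Python sketch. The company figures are hypothetical and the function names are chosen for the example; only the aggregation logic (total market capitalization divided by total net income, money losers included) comes from the description above.

```python
from statistics import mean, median

# Hypothetical companies: (market capitalization, net income) in US$ millions.
companies = [
    (1000.0, 50.0),
    (200.0, 2.0),     # tiny earnings produce an outlier PE of 100
    (300.0, -40.0),   # money loser: no meaningful PE ratio
    (5000.0, 250.0),
]


def company_pe_ratios(firms):
    """PE ratios for firms with positive earnings; money losers drop out."""
    return [cap / income for cap, income in firms if income > 0]


def simple_average_pe(firms):
    return mean(company_pe_ratios(firms))


def median_pe(firms):
    return median(company_pe_ratios(firms))


def aggregated_pe(firms):
    """Aggregate market cap over aggregate net income, losers included.
    Returns None when the group as a whole is losing money."""
    total_cap = sum(cap for cap, _ in firms)
    total_income = sum(income for _, income in firms)
    if total_income <= 0:
        return None
    return total_cap / total_income


print(f"Simple average PE: {simple_average_pe(companies):.2f}")
print(f"Median PE:         {median_pe(companies):.2f}")
print(f"Aggregated PE:     {aggregated_pe(companies):.2f}")
```

In this made-up sample, the simple average is pulled up by the one outlier and both the average and the median ignore the money-losing firm, while the aggregated ratio counts every company and weights the larger ones more heavily.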
- COVID effects: To capture how COVID has played out in different sectors and geographies, I computed the changes in aggregate market capitalization during 2020 broken down by sub periods (1/1/20 - 2/14, 2/14 - 3/20, 3/20 - 9/1 and 9/1 - 12/31/20), reflecting the ups and downs in the overall market. I also looked at the change in revenues and operating income over the last twelve months (October 2019 - September 2020) compared to the same numbers in the year prior (October 2018 - September 2019). Since the worst effects of the crisis were in the second and third quarters of 2020, this comparison should provide insight into how much damage was wrought by the viral shutdown. Just as a preview of how consequential the year was for stock prices, take a look at the median percentage change in market capitalization in the table below:
I will write more extensively about the COVID effects in a future post, but you can see the data for US companies, broken down by industry, by clicking here, and you can also download the data for other geographies here: Europe, Emerging Markets (China and India), Australia, NZ & Canada, Japan & Global.
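The sub-period breakdown of market capitalization changes described above can be sketched in a few lines of Python. The breakpoint dates come from the text; the aggregate values and the function name `sub_period_changes` are hypothetical, chosen only to show the arithmetic.

```python
from datetime import date

# Sub-period boundaries taken from the text.
BREAKPOINTS = [
    date(2020, 1, 1),
    date(2020, 2, 14),
    date(2020, 3, 20),
    date(2020, 9, 1),
    date(2020, 12, 31),
]


def sub_period_changes(aggregate_cap_by_date):
    """Return (start, end, percentage change) for each consecutive
    pair of breakpoint dates."""
    changes = []
    for start, end in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        start_cap = aggregate_cap_by_date[start]
        end_cap = aggregate_cap_by_date[end]
        changes.append((start, end, end_cap / start_cap - 1))
    return changes


# Hypothetical aggregate market capitalizations for one industry group,
# indexed to 100 at the start of the year. Not actual data.
hypothetical_caps = dict(zip(BREAKPOINTS, [100.0, 105.0, 70.0, 110.0, 120.0]))

for start, end, change in sub_period_changes(hypothetical_caps):
    print(f"{start:%m/%d/%y} - {end:%m/%d/%y}: {change:+.1%}")
```

Running the same function on each industry or region would give the kind of breakdown the linked spreadsheets report.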
- Operating metrics: My computations for operating margins and accounting returns (returns on equity and capital) reflect the COVID effect on earnings in 2020, and not surprisingly, you will see that their values are much lower for the most damaged sectors (restaurants, airlines) than in prior years. If you are comparing across companies in these sectors, that may not be an issue, but if you are valuing companies and want to find a target value for margins or accounting returns, you will be better served using my archived values for these variables from 2019.
- Pricing metrics: I compute and report a range of pricing multiples from PE ratios to Enterprise Value (EV) to sales ratios, but as with the operating metrics, COVID has left its imprint on the numbers this year. As market capitalizations have quickly retraced their losses, but operating variables have not, the multiples reflect that disconnect. They are either not meaningful in some sectors, which are reporting aggregated losses, or at elevated levels in others (where the collective earnings are down significantly, but market values are not). Again, while this should not be an issue with cross company comparisons, there are two cautionary notes. The first is that investors who come in with strong preconceptions of what comprises cheap or expensive in pricing ratio terms or historic norms will find that everything looks exorbitantly priced. The other is that you will lose large segments of your peer groups if you stick with multiples like PE ratios for comparisons, since so many companies will be money losing.
- Data access: The data is accessible on my website, if you click on data. The data for 2020 is available under current data, and data from previous years under archived data. If you do click on current data, you will see the data classified into groupings based upon how I see the world (corporate governance, risk, investment returns, debt, dividends and pricing). You can see the data online for US companies by clicking on the links next to the data item, but I would strongly recommend that you download the Excel spreadsheets that contain the data instead. Not only will this let you access data for other geographical regions, but each Excel spreadsheet includes descriptions of the variables reported in that dataset and many include short YouTube videos explaining the data.
- Data Use: I know that those who download my data use it in many different contexts. If you use it at your job as a corporate finance or equity analyst, I am glad to take some of that burden off you, and I hope that you find more enjoyable uses for the time you save. If you use my data to buttress an argument or debunk someone else’s, I wish you the best, as long as you don’t make it personal. If you use it to back your case in legal settings, go ahead, but please do not involve me formally. I believe that courts are a graveyard for good valuation practices and while I do not begrudge you, I have no interest in playing that game.
- Data Questions: If you have a question about how a variable is computed, please check the website first, since the question has probably been answered already, but if you cannot find that answer, you know how to find me, and I will try to address your issues. As I mentioned earlier, the excel spreadsheets that contain the data include the descriptions of how I compute the variables.
- Suggestions and Complaints: Before you send off angry or impassioned emails to my team, about my data, you should know that this team has only one member and it is me. I am not a full time data service, and I cannot provide customized data solutions. If you find a mistake in a computation or a typo, please do let me know, and I will fix the error, perhaps not as quickly as you would like me to, but I will.
- Data Update 1: A (Data) Look Back at a Most Forgettable Year!