Appendix 4 Purchasing power parities: how to make comparisons of GDP across countries

Joe Grice

A4.1 Introduction

One of the things we might want to know is how the size of one economy compares to another, for instance the UK economy relative to the United States or China. In each case we can measure the size of the economy using gross domestic product (GDP) or an alternative, such as gross national income (GNI), but these tend to be measured in national currencies.

The Office for National Statistics presents measures of UK GDP in pounds sterling. Likewise, the Bureau of Economic Analysis publishes GDP measures for the US in dollars, and the National Bureau of Statistics of China calculates Chinese GDP in renminbi. Comparing the size, productivity or welfare of different economies is therefore complicated by the fact that GDP numbers are produced in different currencies.

This appendix looks at how these comparisons can be made and what they tell us.

A4.2 Comparing different countries is not as simple as it looks

At first sight, comparisons across countries might seem straightforward, particularly for countries which share the same currency, for example, countries that are members of the euro area. We can see, for example, that in 2019, the GDP of Italy was €1.79 trillion and of Germany was €3.45 trillion. So, we conclude, the German economy must be almost twice as large as the Italian one.

Is there any more to it than that?

Let’s use the same method to compare countries with different currencies, because we can look up exchange rates between currencies. In 2019, UK GDP was £2.22 trillion. The exchange rate on average was £1 = €1.1384. So, the figure we calculate for UK GDP is €2.53 trillion.

Can we now use this method to compare the size of the UK economy with that of Germany, Italy and any other country that uses the euro? Or for that matter, can we use other exchange rates to compare the size of the UK’s economy with that of any other country in the world?

In general, such comparisons can be seriously misleading. Most obviously, we will usually be interested in what the comparisons tell us about the relative prosperity of an average citizen in those economies. In 2019, the GDP of China in US dollars was $14.3 trillion, about five times larger than the UK economy ($2.83 trillion). However, the population of China was 1.4 billion people, far larger than the 66.6 million who were living in the UK.

If we compare GDP per capita instead, the average UK citizen would have a share of GDP worth $42,504, while GDP per capita for the average Chinese citizen would be less than a quarter of that figure at $10,214.

Here we have also ignored an important limitation of GDP per capita in that it tells us nothing about the distribution of income. It is simply an arithmetic average, and so the same level of aggregate GDP per capita would be consistent with either an economy in which most citizens enjoyed the same level of income, or one in which income was mostly earned by a small affluent elite while most people struggled to survive.

Any systematic comparison of welfare across countries will therefore also need to be based on information about the dispersion of income levels as well as the average. Chapter 7 on Measuring economic inequality discusses how this can be approached.

Comparing per capita GDP can take us a long way towards understanding the relative prosperity of different countries. Up-to-date population figures are readily available, so this is also not a difficult adjustment to make. But comparisons between countries based on exchange rates, whether in per capita terms or not, can still be misleading.
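The arithmetic behind these market exchange rate comparisons can be sketched in a few lines of Python. This is only an illustration using the rounded 2019 figures quoted in the text; the variable names are assumptions, and because the inputs are rounded the UK per capita result differs slightly from the figure quoted above.

```python
# Rounded 2019 figures quoted in the text; results are therefore approximate.
UK_GDP_GBP = 2.22e12      # UK GDP in pounds sterling
EUR_PER_GBP = 1.1384      # average market exchange rate, euros per pound

# Convert UK GDP into euros at the market exchange rate.
uk_gdp_eur = UK_GDP_GBP * EUR_PER_GBP

# GDP in US dollars at market exchange rates, and population.
gdp_usd = {"China": 14.3e12, "UK": 2.83e12}
population = {"China": 1.4e9, "UK": 66.6e6}

# GDP per capita is total GDP divided by the number of people.
gdp_per_capita = {country: gdp_usd[country] / population[country] for country in gdp_usd}

print(f"UK GDP: EUR {uk_gdp_eur / 1e12:.2f} trillion")
for country, value in gdp_per_capita.items():
    print(f"{country}: ${value:,.0f} per head")
```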

A4.3 The law of one price (LOOP) and purchasing power parity (PPP)

If we are trying to assess the relative incomes of an average Chinese citizen and UK citizen, what we really need to know is the amount of goods and services that this person could buy with their income, given the prices they pay in local shops. If price levels in the UK are higher than prices in China, we would clearly need to take that into account, because the real command over resources of people facing higher prices would be lower than those facing lower prices.

This is an issue that goes to the heart of the debate about how exchange rates are determined. There is a very old proposition in economics called the “law of one price”.

law of one price (LOOP)
When expressed in a common currency, the price of the same good is the same everywhere. Note that price comparison data finds little support for the law of one price holding in most cases.

The law of one price is a very old concept. Historians of economic thought can trace it back to the School of Salamanca in the early 16th century. It has since resurfaced in different reformulations over the centuries. Unfortunately, this venerability does not imply its validity.

arbitrage
The purchase and sale of assets in different markets to exploit differences in price. Economic theory suggests that this process reduces the price difference until it is no longer profitable for the arbitrageur.

This proposition states that the price of the same goods when expressed in the same currency must be the same in every country. The underlying rationale is arbitrage. Suppose the price of a tin of baked beans in US dollars was lower in the US than in the UK. Then someone could make a profit by buying baked beans in the US and exporting them for sale in the UK.

This would increase demand for beans in the US, and so the price of beans would rise. The opposite would occur in the UK, where more tins of beans being offered for sale would lower prices. This process would go on until the price of baked beans in common currency terms was the same in both markets. At this point, there would then be no further opportunity for arbitrage profits.
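The baked beans example can be written as a simple profit check. This is a minimal sketch with hypothetical prices and a hypothetical exchange rate, and it assumes away transport costs, taxes and any other trading costs; the function name is illustrative only.

```python
def arbitrage_profit_usd(price_us_usd, price_uk_gbp, usd_per_gbp):
    """Profit in dollars from buying one tin in the US and selling it in the UK.

    Assumes no transport costs, taxes or other trading costs.
    """
    uk_price_in_usd = price_uk_gbp * usd_per_gbp
    return uk_price_in_usd - price_us_usd


# Hypothetical starting point: a tin costs $1.00 in the US and £1.00 in the UK,
# with a market exchange rate of £1 = $1.30.
profit = arbitrage_profit_usd(price_us_usd=1.00, price_uk_gbp=1.00, usd_per_gbp=1.30)

# The law of one price holds only when there is no profit left to exploit.
loop_holds = abs(profit) < 1e-9

print(f"Profit per tin: ${profit:.2f}; law of one price holds: {loop_holds}")
```

In this sketch, arbitrage would continue until the US price rose, the UK price fell, or both, so that the profit per tin reached zero.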

If the LOOP were universally true, the prices in a common currency faced by the average UK citizen and average Chinese citizen would be the same, and the simple comparisons we made above on the basis of exchange rates would tell us all we needed to know about relative prosperity. There are, however, a range of reasons why the LOOP might not hold, and the overwhelming empirical evidence is that it does not:

For these and many other reasons, empirical evidence overwhelmingly suggests that the levels of prices in separate economies but in common currencies can and do differ over long periods of time, if not indefinitely. In fact, it is now widely accepted that the law of one price will only occur under certain exacting conditions. So, if we want to make comparisons across economies, we need to take this complication into account.

There are services that are traded freely, such as financial and professional services and tourism. But trade in goods is normally more important. Services account for three-quarters of the UK domestic economy, but UK trade in services is just over half of the size of the trade in goods.

A4.3.1 Comparing countries with a common currency

Since the problem arises because of differences in the price for the same goods and services in different countries, the obvious approach is to measure those differences. To simplify matters, let’s suppose we are dealing with only two countries, A and B, with the same currency and only one product. Continuing with the theme discussed in Chapter 2, we will say this is pasta production.

We can observe current price GDP in both countries, which we will denote by GDPA and GDPB, as simply the value of pasta production in each. If the quantities of pasta produced in the two countries are QA and QB respectively and the prices of pasta in the two countries are PA and PB respectively, then:

\[\text{GDP}_{\text{A}} = P_{\text{A}} \times Q_{\text{A}}\]

\[\text{GDP}_{\text{B}} = P_{\text{B}} \times Q_{\text{B}}\]

The price ratio PB/PA simply tells us the price of pasta in country B relative to that in country A.

The volume of pasta production in country B (QB), or real GDP if you like, can be found just by dividing country B’s current price GDP by its deflator, PB.

But suppose we calculate instead a different amount, QB*, by deflating country B’s current price GDP by country A’s level of prices:

\[Q_{\text{B}}^{*} = \text{GDP}_{\text{B}} / P_{\text{A}}\]

QB* can be described as the volume of production in country B, expressed at country A’s prices. It tells us the purchasing power of GDP produced in country B if it were to be consumed in country A.

Finally, we can calculate the volume of GDP in country A in its own prices by using a normal deflator for country A (the price of pasta in country A, which is PA).

So, we can now make a direct comparison based on the same prices, which are those prevailing in country A. That is, we compare QB* and QA. In this example, country A is known as the “numeraire” country, although we could have carried out the procedure the other way around and made the comparison in terms of country B’s prices. In that case country B would have been the numeraire.
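The steps above can be traced with a short numerical sketch. The prices and quantities are hypothetical, chosen only to make the arithmetic easy to follow, and the variable names are assumptions rather than anything defined in the text.

```python
# Hypothetical prices (in the common currency) and quantities of pasta.
P_A, Q_A = 2.00, 100.0   # country A (the numeraire)
P_B, Q_B = 1.50, 120.0   # country B

# Current price GDP is price times quantity in each country.
gdp_a = P_A * Q_A
gdp_b = P_B * Q_B

# Deflating by a country's own price recovers its own volume of production.
q_a = gdp_a / P_A
q_b_own = gdp_b / P_B

# Deflating country B's GDP by country A's price gives QB*.
q_b_star = gdp_b / P_A

# QB* equals QB scaled by the price ratio PB/PA.
assert abs(q_b_star - Q_B * (P_B / P_A)) < 1e-9

# Comparison at country A's prices: QB* relative to QA.
relative_size = q_b_star / q_a

print(f"QA = {q_a:.0f}, QB = {q_b_own:.0f}, QB* = {q_b_star:.0f}")
print(f"QB* / QA = {relative_size:.2f}")
```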

A4.3.2 Comparing countries with different currencies

So far so good. But what happens if the two countries have different currencies with an exchange rate between them that isn’t one-to-one? Fortunately, this does not lead to much extra complication. Suppose instead of having the same currency, country A uses sterling (£) and country B uses dollars ($).

Our main object of concern is still the relative price level PA/PB.

When we calculate this ratio for the common currency case, the result is a unitless ratio. Prices in country A are, say, 20% higher or lower than in country B. With different currencies, the ratio is no longer unitless, but expressed in terms of units of sterling per dollar. This generates further useful information.

purchasing power parity (PPP)
The rate at which the currency of one country would have to be converted into that of another country so that one unit of currency can buy the same amount of goods and services in each country.

What we have has the form of an exchange rate. As a straightforward matter of arithmetic, we can calculate the hypothetical exchange rate that would equalise the level of prices in the two economies in common currency terms. This hypothetical rate is known as the purchasing power parity rate (PPP).

We can then compare this PPP rate with the actual exchange rate in the market. As an illustration, suppose we calculate the PPP between the UK and the US to be £1 = $1.45. This means that you would require $1.45 to buy the same basket of goods and services in the US as you could buy for £1 in the UK. If, however, the market exchange rate was £1 = $1.55, this would be evidence suggesting sterling was expensive, because, in common currency terms, prices in the US would be lower than in the UK. That is, £1 would buy more goods and services in the US than in the UK.

If the market rate was only £1 = $1.20, this would be evidence that sterling was cheap, because the reverse would be the case. Now, £1 would buy fewer goods and services in the US than the UK.
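The comparison between the PPP rate and the market rate can be expressed as a single ratio. The sketch below uses the illustrative rates from the text (a PPP of £1 = $1.45 and market rates of $1.55 and $1.20); the function name is an assumption made for this example.

```python
def sterling_misvaluation(market_usd_per_gbp, ppp_usd_per_gbp):
    """Fraction by which sterling is over (+) or under (-) valued against its PPP rate."""
    return market_usd_per_gbp / ppp_usd_per_gbp - 1.0


PPP_USD_PER_GBP = 1.45   # illustrative PPP rate from the text

# Market rate above the PPP rate: £1 buys more in the US than at home.
expensive = sterling_misvaluation(1.55, PPP_USD_PER_GBP)

# Market rate below the PPP rate: £1 buys less in the US than at home.
cheap = sterling_misvaluation(1.20, PPP_USD_PER_GBP)

print(f"At $1.55, sterling is {expensive:+.1%} relative to PPP")
print(f"At $1.20, sterling is {cheap:+.1%} relative to PPP")
```

On these numbers sterling would be roughly 7% expensive at $1.55 and roughly 17% cheap at $1.20.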

It should be stressed that this does not mean future exchange rate movements will be in this direction, at least in the short run. The evidence is that periods of over- or undervaluation of currencies, in these terms, can persist for prolonged periods of time, because the trade arbitrage we discussed may be too weak to have immediate or decisive effects.

So far, we have been talking about a single product (pasta) in the two economies. Of course, the reality is that there are many products, and we need to incorporate this.

Again, at a conceptual level, this does not present many difficulties as all we need is a representative price index reflecting the cost of output in each country. From Chapter 1 and Appendix 5, the aggregate price level in the two countries relevant to their respective GDPs can be calculated simply by compiling price indices to give us the GDP deflator in each case. To compare price levels for the purposes of the PPP, we simply take the ratio of the two, just as we did in the single product case when we compared the respective prices of pasta.

A4.4 Some complications

So far, at a conceptual level, this all seems to be plain sailing. We have a well-based and methodologically straightforward approach to comparing GDP or GDP per capita in different economies. In practice, it is not so simple.

A4.4.1 Finding comparable products and services

For historical or cultural reasons, different economies often comprise varying packages of goods and services. So, an important product in one economy may have much less significance in another. An extreme example is staple cereals. In Ethiopia, the key staple is teff grain, while rice is largely unobtainable. In Thailand, rice is the main staple while teff is not generally available. As an intermediate case, in the UK, rice is available, though not the main cereal consumed, while teff is scarcely sold.

There is no practical solution to the problem of finding a way to make comparisons that do justice to the varying importance of these two cereals in the different economies. This is an extreme case, but the same applies to some degree for thousands of products and services. Air conditioning, for example, will be important in economies with hot or humid climates but less so in countries with temperate climates. Going out for restaurant meals is more prevalent in some countries than others.

A4.4.2 Comparing quality

We must also account for differences in quality.

As we saw in Appendix 5, when comparing growth in the same economy over time, recognising changes in quality is a significant issue. But changes in quality through time are likely to be gradual. When we compare across economies at a single point, differences in quality may be far greater.

An example would be how to compare an economy in which haircuts are cheap but basic, with one where sophisticated salons are more prevalent and in which the prices are higher. Ignoring the quality difference could lead to misleading conclusions. Again, this applies to a broad range of goods and services.

A4.4.3 Who provides the service?

Services in different economies may be delivered in different ways. A good example would be healthcare. In the UK, most healthcare is delivered by the government through the National Health Service and is free at the point of delivery. In contrast, healthcare provision in the US comes primarily from the private sector, often paid for by private insurance or from government schemes. Modes of health care provision in other countries also have important variations.

This is a far from trivial issue. In the UK, healthcare expenditure accounts for nearly a tenth of the economy and in the US it is rather more. Calculating a relative price for healthcare in such circumstances is far from straightforward.

Differences in the ways in which universities operate and are financed, and how educational services more widely are provided, are other examples.

A4.4.4 Complex formulae

A comprehensive and detailed account of the issues and methodology for compiling PPPs is given in Chapter 12, ‘Calculation and aggregation of PPPs’ of the Eurostat-OECD Methodological Manual on Purchasing Power Parities.

One of the main sources of information about PPPs is provided by the OECD’s online database.

The International Comparisons Program (ICP) at the World Bank is a long-established authority on the calculation of PPP exchange rates.

When compiling deflators that compare the performance of a single economy over time, the arithmetic and algebra of Laspeyres, Paasche and other indices are relatively straightforward and intuitive, even in chained index form. But while the principle of what comparisons of economies at a single point of time are intended to achieve is clear, the algebraic formulae required to implement these are far more complex:

Finding a methodology that achieves these aims at the same time turns out to be more or less impossible! (Finding improved methodologies is a noble endeavour amongst specialists working in this area.) The best obtainable methodologies today are not perfect, but they are the best we can do to create something that is computable and reflects the underlying aims of PPP comparisons.

How it’s done: How PPP is calculated

The current workhorse procedure is based on a methodology first proposed in the 1960s, called EKS after its initial proponents, Éltető, Köves and Szulc.

The EKS methodology has some drawbacks and, as Chapter 12, ‘Calculation and aggregation of PPPs’ in the Eurostat-OECD Methodological Manual on Purchasing Power Parities explains, attempts have been made to improve it. This manual is probably written as clearly as the complexity of the technical issues involved allows it to be. Be warned, however, that it is still very heavy going.
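To give a flavour of the EKS idea, the following is a minimal sketch of its textbook form: bilateral Fisher price indices are computed for every pair of countries, and each pair's final parity is the geometric mean of all the indirect comparisons through third countries. The data are hypothetical, and the official procedure described in the manual works with much more detailed expenditure data, so this should be read as an illustration of the principle rather than the OECD's implementation.

```python
import math


def fisher(p_j, q_j, p_k, q_k):
    """Bilateral Fisher price index of country k relative to country j."""
    # Laspeyres-type index: country j's quantities as weights.
    laspeyres = sum(pk * qj for pk, qj in zip(p_k, q_j)) / sum(pj * qj for pj, qj in zip(p_j, q_j))
    # Paasche-type index: country k's quantities as weights.
    paasche = sum(pk * qk for pk, qk in zip(p_k, q_k)) / sum(pj * qk for pj, qk in zip(p_j, q_k))
    return math.sqrt(laspeyres * paasche)


def eks(prices, quantities):
    """EKS parities: geometric mean of all indirect Fisher comparisons."""
    names = list(prices)
    n = len(names)
    f = {(j, k): fisher(prices[j], quantities[j], prices[k], quantities[k])
         for j in names for k in names}
    return {(j, k): math.prod(f[j, l] * f[l, k] for l in names) ** (1.0 / n)
            for j in names for k in names}


# Hypothetical local-currency prices and quantities for two products.
prices = {"A": [1.0, 4.0], "B": [2.0, 6.0], "C": [150.0, 400.0]}
quantities = {"A": [10.0, 5.0], "B": [8.0, 6.0], "C": [12.0, 3.0]}

ppp = eks(prices, quantities)
print(f"PPP of C relative to A: {ppp['A', 'C']:.2f}")
```

The point of the averaging step is that the resulting parities are transitive: the parity between A and C equals the parity between A and B multiplied by that between B and C, which simple bilateral indices do not guarantee.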

As a result of these complications, estimates of PPPs, however conscientiously compiled, have to be treated with caution. The OECD, one of the main practitioners in this field, warns against regarding its estimates as accurate to better than plus or minus 5%. So, differences in PPPs of less than this magnitude do not have any great significance.

Does this mean, though, that after all this effort, calculating PPPs is a waste of time? Far from it. Keynes’s advice that it is better to be approximately right than precisely wrong comes into play here. As we shall see, if we had to rely only on exchange rates as the way of comparing different economies, we could be precise and very wrong indeed.

A4.5 What do the calculations show?

The OECD dataset of PPPs provides conversion factors (exchange rates) to apply to national levels of GDP to allow comparison between them, taking the US as the numeraire. This means GDP expressed in PPP is presented in US dollars ($). Helpfully, it also provides the corresponding conversion factors that would come from using only market exchange rates between respective national currencies and the US dollar.

The dataset provides estimates of PPPs for 58 countries for each year from 2000 to 2019 and, for some countries, up to 2020. These conversion factors are then used to calculate comparable levels of GDP and GDP per capita (GDP per head).

Comparisons of GDP and GDP per capita are most conveniently obtained from the OECD’s annual publication National Accounts of OECD Countries.

Figure A4.1 shows a selection of these estimates for 2019.

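| Country | GDP at PPP exchange rates ($ billion) | GDP at market exchange rates ($ billion) | GDP per capita at PPP exchange rates ($) | GDP per capita at market exchange rates ($) |
|---|---|---|---|---|
| US | 21,433 | 21,433 | 65,240 | 65,240 |
| China | 23,520 | 14,280 | 18,255 | 10,214 |
| Denmark | 352 | 350 | 60,308 | 60,108 |
| France | 3,321 | 2,717 | 49,226 | 40,256 |
| Germany | 4,644 | 3,861 | 55,891 | 46,467 |
| UK | 3,243 | 2,831 | 48,342 | 42,379 |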

Figure A4.1 Comparisons of national output at purchasing power parity and at current exchange rates, 2019


Several interesting points emerge from these comparisons:

These implications necessarily carry through to the comparative estimates of GDP per capita. Most strikingly, GDP per capita in China would have been estimated at market exchange rates as less than a sixth of that in the US. But reflecting the reality of purchasing power in the two countries, the PPP estimate raises the ratio to more than a quarter. GDP per capita in France, Germany and the UK is also seen to be closer to that of the US than the crude exchange rate valuation would have suggested.
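These ratios can be checked directly from the per capita columns of Figure A4.1. The sketch below simply divides each country's figure by the US figure under both valuations; the data structure and names are assumptions made for this illustration.

```python
# GDP per capita in 2019 from Figure A4.1: (PPP exchange rates, market exchange rates), in $.
gdp_per_capita = {
    "US": (65240, 65240),
    "China": (18255, 10214),
    "Denmark": (60308, 60108),
    "France": (49226, 40256),
    "Germany": (55891, 46467),
    "UK": (48342, 42379),
}

us_ppp, us_market = gdp_per_capita["US"]

# Each country's GDP per capita as a share of the US level, under both valuations.
relative_to_us = {
    country: (ppp / us_ppp, market / us_market)
    for country, (ppp, market) in gdp_per_capita.items()
}

for country, (ppp_ratio, market_ratio) in relative_to_us.items():
    print(f"{country}: {ppp_ratio:.1%} of US at PPP, {market_ratio:.1%} at market rates")
```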

The overriding conclusion is that using PPPs for purposes of international comparisons is important. Even if with some necessary approximation, valuation on this basis does reflect the reality of the prices and purchasing power of agents in different economies. By contrast, valuations that rely on the vagaries of the foreign exchange markets can give rise to misleading conclusions, and, as we have seen, potentially by large amounts.

A4.6 Comparisons over time at constant PPP

Policymakers are interested in how matters are changing over time. For example, how does GDP per capita in a particular economy compare with that in the US, and how is the gap changing over time?

It would obviously be possible to try to answer such questions by using the market exchange rate to allow such comparisons to be made in common currency terms. But we now know why this may lead to inaccurate and misleading conclusions.

One approach would be to construct a time series of comparisons of economies’ sizes based on valuations at PPPs. We did this for one year, 2019, in the first and third columns in Figure A4.1. Given that PPP conversion factors have been available annually since at least 2000, we should be able to carry out similar compilations for various series of years as we choose.

Again, it is not so simple. Experience suggests a variation on this approach is preferable. Our rationale for using PPPs at all is to take account of different price levels in different economies. But for individual economies, national deflators, as discussed in Appendix 5, are likely to be more reliable as indicators of changes in an economy than are changes in PPPs, not least because more resources are available for national statisticians to compile national deflators than for constructing international PPPs. In addition, some of the complications inherent in constructing PPPs noted earlier do not apply or apply to a lesser degree in the calculation of national deflators.

For these reasons, OECD and other international guidance suggests a procedure known as “comparison at constant PPP”. Under this methodology, a normal PPP comparison is made for a particular single year. The OECD currently uses 2015 for this base year purpose. For other years, before and after the base year, the changes in the national deflators are applied to the base year starting point. So, effectively for the countries being compared, changes in constant (national) price GDP are used, but from a starting point set by a full PPP comparison in the base year. In this way, it is possible to benefit from the relative strength of the national deflators, but still to abstract from the distortions that could be caused by making a comparison based only on market exchange rate valuations.
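The mechanics of a comparison at constant PPP can be sketched as follows. The base-year level and the volume index are hypothetical numbers for an imaginary economy, chosen only to show how the base-year PPP valuation is carried forwards and backwards using changes in national constant-price GDP per capita.

```python
BASE_YEAR = 2015

# Hypothetical GDP per capita in the base year, valued at PPP in US dollars.
base_level_usd = 40000.0

# Hypothetical index of national constant-price GDP per capita (base year = 100).
volume_index = {2014: 98.0, 2015: 100.0, 2016: 101.5}

# Scale the base-year PPP level by the change in the national volume index.
constant_ppp_series = {
    year: base_level_usd * index / volume_index[BASE_YEAR]
    for year, index in volume_index.items()
}

for year in sorted(constant_ppp_series):
    print(f"{year}: ${constant_ppp_series[year]:,.0f}")
```

Only the base year relies on a full PPP calculation; every other year inherits its movement from the national accounts.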

The OECD National Accounts publication cited earlier presents such “comparisons at constant PPP” for all its member states. Figure A4.2 gives a selection of these to illustrate some recent movements. For the comparison of the UK against the US, a comparison based only on market exchange rates is presented as a memorandum item in the final column of the table. Its purpose is not to be an indicator in its own right, but to show how different the calculated trends would be had this unreliable procedure been used instead.

| Year | US | France | Germany | Japan | UK | UK (market exchange rate) |
|---|---|---|---|---|---|---|
| 2012 | 53,932 | 40,332 | 46,412 | 38,847 | 40,439 | 42,446 |
| 2013 | 54,553 | 40,356 | 46,488 | 39,692 | 41,065 | 43,416 |
| 2014 | 55,552 | 40,554 | 47,317 | 39,863 | 41,919 | 47,456 |
| 2015 | 56,832 | 40,830 | 47,610 | 40,398 | 42,572 | 45,045 |
| 2016 | 57,399 | 41,123 | 48,280 | 40,666 | 42,950 | 41,026 |
| 2017 | 58,370 | 41,920 | 49,352 | 41,622 | 43,438 | 40,318 |
| 2018 | 59,801 | 42,543 | 49,827 | 41,843 | 43,700 | 43,011 |
| 2019 | 60,800 | 43,062 | 49,991 | 42,226 | 44,800 | 42,379 |
| % change 2012–2019 | 12.7 | 6.8 | 7.8 | 8.7 | 10.8 | −0.2 |

Figure A4.2 Comparisons of GDP per capita at constant 2015 purchasing power parity ($)


Several points are clear from this table:

The memorandum item for the UK of results based only on market exchange rate valuations shows very clearly its defects:

All this underlines the point that, in making international comparisons between different economies, relying only on market-based exchange rate valuations is liable to produce misleading, and sometimes absurd, conclusions.
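The contrast between the two UK columns in Figure A4.2 can be seen by recomputing the percentage changes between 2012 and 2019. The sketch below does this for the US and the two UK series only, using the 2012 and 2019 values from the figure; the names are assumptions made for the illustration.

```python
# GDP per capita ($) in 2012 and 2019, taken from Figure A4.2.
gdp_per_capita = {
    "US": {2012: 53932, 2019: 60800},
    "UK": {2012: 40439, 2019: 44800},
    "UK (market exchange rate)": {2012: 42446, 2019: 42379},
}


def pct_change(start, end):
    """Percentage change from start to end."""
    return (end / start - 1.0) * 100.0


changes = {
    name: round(pct_change(values[2012], values[2019]), 1)
    for name, values in gdp_per_capita.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

At constant PPP, UK GDP per capita grows by about a tenth over the period, whereas the market exchange rate valuation shows a slight fall.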

A4.7 More direct measures of PPP

It seems clear that comparisons between different economies based only on market exchange rates have many problems and risk misleading or wrong conclusions. But at the same time, PPPs laboriously calculated by organisations such as the OECD involve complicated methodologies and are computationally time-consuming. So, are there more straightforward and direct ways to calculate relative differences in purchasing power between economies?

One of the first, and perhaps still most prominent, of these more direct measures is the Big Mac index compiled by The Economist magazine. This was first published in 1986 and has been updated annually since. It draws its inspiration from the fact that the McDonald’s hamburger is widely available around the world to (nearly) the same specifications. So, if the Big Mac price is taken as representative of prices in general in a particular location, it is straightforward to calculate the purchasing power parity exchange rate which would equate prices in common currency terms. This follows the same underlying logic as discussed earlier, but achieves the results using much simpler calculations.

Take a look at the Big Mac Index.

A number of indices with a similar intent have since been compiled and published. These have been based on prices of items such as a large Starbucks latte, an Apple iPod and IKEA’s Billy bookcases.

In fact, some years before the Big Mac index, Milton Friedman had suggested using the prices of men’s haircuts around the world as a similar basis for such an index. But he appears not to have taken this proposition forward.
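The Big Mac calculation described above amounts to a single division followed by a comparison with the market rate. The sketch below uses hypothetical prices and a hypothetical exchange rate, not figures published by The Economist, and the function name is an assumption made for this example.

```python
def big_mac_valuation(local_price, us_price_usd, market_usd_per_local):
    """Return the burger-implied PPP (dollars per local unit) and the currency's valuation.

    A negative valuation means the local currency looks undervalued against the dollar.
    """
    implied_ppp = us_price_usd / local_price
    valuation = market_usd_per_local / implied_ppp - 1.0
    return implied_ppp, valuation


# Hypothetical example: the burger costs 3.00 in local currency and $5.00 in the US,
# while the market exchange rate is 1 local unit = $1.30.
implied_ppp, valuation = big_mac_valuation(
    local_price=3.00, us_price_usd=5.00, market_usd_per_local=1.30
)

print(f"Implied PPP: 1 local unit = ${implied_ppp:.2f}")
print(f"Valuation against the dollar: {valuation:+.0%}")
```

Equivalently, the valuation is the local burger price converted into dollars at the market rate, divided by the US price, minus one.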

There are qualifications to be made on the use of the Big Mac index:

Figure A4.3 sets out the degree of currency over/undervaluation for 2019 for a selection of countries, taking the US as the numeraire, based on the OECD’s PPP methodology and the methodology implied by the Big Mac index.

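| Country | Price of a Big Mac ($) | % over (+) or under (−) valuation, OECD method | % over (+) or under (−) valuation, Big Mac index |
|---|---|---|---|
| US | 5.66 | | |
| UK | 4.44 | −13 | −27 |
| China | 3.46 | −39 | −38 |
| Euro area | 5.16 | −17 | −14 |
| Denmark | 4.90 | −1 | −14 |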

Figure A4.3 Estimates of purchasing power parities compared, 2019


OECD – PPPs and exchange rates.
The Economist (2021), Burgernomics: The Big Mac Index

Note: The comparisons are not exact because of timing issues. The Big Mac index relates to the end of January while the OECD estimates relate to the year as a whole.

These estimates show a significant degree of correlation. Both the OECD-based and Big Mac-based figures indicate that each of these four currencies was undervalued against the dollar. In addition, the quantitative estimates of the size of the undervaluation for China and the euro area are notably close.

However, there are differences in regard to the UK and Denmark. The OECD estimates suggest that sterling’s undervaluation against the dollar was only half as large as that implied by the Big Mac index. For the Danish krone, the OECD figures portray negligible undervaluation, whereas the Big Mac methodology would suggest undervaluation by some 14%.
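One way to see where valuations of the OECD kind can come from is to compare a country's GDP valued at market exchange rates with the same GDP valued at PPP, using the totals in Figure A4.1. The text does not state that the OECD column in Figure A4.3 was derived this way, so the sketch below is only an illustrative cross-check, and the names are assumptions.

```python
# GDP in 2019 from Figure A4.1, in $ billion: (PPP exchange rates, market exchange rates).
gdp_billion_usd = {
    "UK": (3243, 2831),
    "China": (23520, 14280),
    "Denmark": (352, 350),
}

# If GDP at market rates is below GDP at PPP, the currency buys less abroad
# than its purchasing power at home would suggest, so it looks undervalued.
implied_valuation_pct = {
    country: round((market / ppp - 1.0) * 100.0)
    for country, (ppp, market) in gdp_billion_usd.items()
}

for country, pct in implied_valuation_pct.items():
    print(f"{country}: {pct:+d}%")
```

For these three countries the results agree with the rounded OECD-method figures shown in Figure A4.3.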

When looking at the entire sample of countries in the Big Mac index, one of the interesting conclusions is that it tends to conform to the “Balassa-Samuelson effect” – that is, PPP calculations are biased by relative productivity levels across countries. This shows up with reported under-valuations usually being fairly large in developing countries, where labour costs and consumer prices are systematically lower, compared to developed countries.

Overall, the conclusion is perhaps that broad-brush measures such as the Big Mac index have value as giving a quickly available indication of the general picture. On the other hand, they are not likely to be a substitute for the methodologically more rigorous approaches discussed earlier.

A4.8 Summary

We have looked at ways to compare the size of different economies at a point of time and extended this to look at how their relative size changes over time. An important extension is to look at the same information in per capita terms, so we can look at material well-being of “average” citizens in different economies and how these compare with each other.

To do this, we took account of the fact that prices for the same goods and services can differ between economies even when expressed in common currency terms using the prevailing exchange rate. For this reason, we looked at the concept and calculation of purchasing power parities (PPPs), best thought of as the hypothetical exchange rate which would equate prices between the countries concerned in common currency terms. We used these PPPs as the basis for our comparisons, rather than actual market exchange rates.

If there was any temptation to regard this as an unnecessary complication, we saw the evidence suggesting that failing to follow this approach and relying only on market exchange rates for making the comparisons may lead to misleading and sometimes absurd results.

A4.9 Further reading