The truth about publishing statistics

I learned an interesting lesson today about how the publishing industry reports its sales, and I want to share it with you. It will give you an inside view of publishing that I rather wish I didn’t have.

Update: The AAP now holds the monthly stats until 3 months have passed, so the figures are far more accurate.

The short version of the lesson is this: the sales stats released each month are accurate only at the time of release. By the time a month has passed, the numbers will be wrong.

The Long Version

The May 2011 sales stats were released today by the AAP, and as usual I tried to do a month-to-month analysis. My figures for January through April didn’t quite match what other blogs were reporting, and the ebook sales figures in the old press releases didn’t seem to add up, so I asked someone at the AAP for help.

Let me pass along what I was told.

The AAP publishes numbers each month, which are then revised in the weeks and months after the press release goes out. The figures change because publishers send in updated sales data for a given month for a number of reasons: late returns, unreported sales, sales originally reported in the wrong categories, etc. Today’s numbers are going to change over the next few weeks.

You know, if Al Capone had gotten into publishing I don’t think the IRS would ever have caught him.

What this means is that all of my analysis is now bunk. The numbers I was using were very likely wrong fairly soon after they were committed to paper. It also means that all the news articles based on the monthly AAP figures – everyone’s articles – are wrong. They were based on fiction. No one really knows what the sales figures are until months and months later.

I’m not sure which bothers me more: the fact that I passed along bad data, or the fact that everyone has passed along bad data and will continue to do so.

In the future, when I hear about spectacular sales, I plan to take them with a grain of salt.


Mike Cane July 21, 2011 at 7:10 pm

They’re not exactly fiction. They’re just fluid. This is why I no longer believe any gov’t economic stats. Those too are always revised. Any July stats aren’t real until September.

Nate Hoffelder July 21, 2011 at 7:34 pm

They’re as real as next year’s budget projection.

Mike Cane July 22, 2011 at 6:08 am

No, next year’s budget projection contains more red ink than ever. If it said we’d be in the black, then it’d be fiction.

Kat July 21, 2011 at 9:37 pm

Sounds like any other balance sheet. The point is that at any given time you can do some comparisons of data.

Doug July 22, 2011 at 4:36 pm

Mike: there’s a (probably irrelevant to you) difference between the publishing stats changing and economic stats changing.

The publishing stats are changing because the reported data is changing, but economic stats change because they’re based on a moving average of this month plus the previous 2 months and the next 2 months. Until the next 2 months happen, the "average" for this month is going to be funky. (For various data, the number of months ahead and back may not be 2, but that’s a typical example.) There are other factors related to auto-regression removal (including seasonal influences) if the AR coefficients are recomputed each month, but the "moving average" part is easiest to understand. If you really care, look into the X-11, X-12, and X-13 algorithms. X-11 was particularly clunky in dealing with the lack of future data.
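The moving-average point in the comment above can be illustrated with a rough sketch. It assumes a simple centered average over 2 months back and 2 months ahead, and it simply shrinks the window when future months are missing; that shortcut is a simplification for illustration, not a description of how the X-11 family treats the end of a series. The function name and the monthly figures are hypothetical, invented for this example.

```python
def centered_moving_average(values, index, half_window=2):
    """Average the values around `index`, using only months that exist so far.

    The window covers `half_window` months on each side. Near the end of the
    series the later months are not available yet, so the window shrinks
    (an illustrative shortcut, not an official adjustment method).
    """
    start = max(0, index - half_window)
    end = min(len(values), index + half_window + 1)
    window = values[start:end]
    return sum(window) / len(window)


# Hypothetical monthly figures, made up purely for illustration.
monthly = [100, 110, 120, 130, 140]

# Estimate for the third month (index 2) when it is the newest month on hand:
# only the two earlier months and the month itself are in the window.
early = centered_moving_average(monthly[:3], 2)

# The same month's estimate once two further months have been reported.
final = centered_moving_average(monthly, 2)

print(early, final)  # 110.0 120.0
```

The estimate for the same month moves from 110.0 to 120.0 even though none of the reported figures changed, which is the distinction being drawn from the AAP case, where the underlying data itself is revised.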

Ebooks now 15% of Simon & Schuster revenue – The Digital Reader August 4, 2011 at 12:03 am

[…] Let me put the growth into perspective. Last year the US ebook market averaged around $30 million in reported sales each month. This means that S&S reported sales for Q2 would have been around a third of the ebook market for last year. Considering that they’re significantly less than a third of the general book market, it should tell you how fast the ebook market has grown. Of course, the S&S figures are from the quarter ending in June and thus they are too new to be considered reliable. […]

dee August 4, 2011 at 10:42 am

Mike – great point, I had no idea. It makes sense once you explain the reasons for the fluid numbers, of course.

New Statistics Show Publishing Isn’t in a Death Spiral After All – eBookNewser August 9, 2011 at 10:06 am

[…] The data released today doesn’t cover 2011, though. That’s because most of the stats for 2011 are still up in the air. No one knows what the sales figures are for a given month until about 4 months afterwards (here’s why). […]

Ebooks now 12% of sales at HarperCollins – The Digital Reader August 11, 2011 at 9:48 am

[…] Of course, any figures from a recently ended quarter are too new to be considered reliable. […]
