The truth about publishing statistics

I learned an interesting lesson today about how the publishing industry reports its sales, and I want to share it with you. It will give you an inside view of publishing that I rather wish I didn't have.

Update: The AAP now holds the monthly stats until 3 months have passed, so the figures are far more accurate.

The short version of the lesson is this: the sales stats released each month are accurate only at the time of release. By the time a month has passed, the numbers will be wrong.

The Long Version

The May 2011 sales stats were released today by the AAP, and as usual I tried to do a month-to-month analysis. My figures for sales in January through April didn't quite match what other blogs were reporting, and the ebook sales figures in the old press releases didn't seem to add up either, so I asked someone at the AAP for assistance.

Let me pass along what I was told.

The AAP publishes numbers each month, which are then revised in the weeks and months after the press release goes out. The figures change because publishers send in updated sales data for a given month for a number of reasons: late returns, unreported sales, sales originally reported in the wrong categories, etc. Today's numbers are going to change over the next few weeks.

You know, if Al Capone had gotten into publishing I don't think the IRS would ever have caught him.

What this means is that all of my analysis is now bunk. The numbers I was using were very likely wrong fairly soon after they were committed to paper. It also means that the news articles based on the monthly AAP figures - everyone's articles - are all wrong. They were based on fiction. No one really knows what the sales figures are until months and months later.

I'm not sure which bothers me more: the fact that I passed along bad data, or the fact that everyone has passed along bad data and will continue to pass along bad data.

In the future, when I hear about spectacular sales, I plan to take them with a grain of salt.

Nate Hoffelder

Nate Hoffelder is the founder and editor of The Digital Reader. He's here to chew bubble gum and fix broken websites, and he is all out of bubble gum. He has been blogging about indie authors since 2010 while learning new tech skills at the drop of a hat. He fixes author sites and shares what he learns on The Digital Reader's blog. In his spare time, he fosters dogs for A Forever Home, a local rescue group.

9 Comments

  1. Mike Cane, 21 July 2011

    They’re not exactly fiction. They’re just fluid. This is why I no longer believe any gov’t economic stats. Those too are always revised. Any July stats aren’t real until September.

    1. Nate Hoffelder, 21 July 2011

      They’re as real as next year’s budget projection.

      1. Mike Cane, 22 July 2011

        No, next year’s budget projection contains more red ink than ever. If it said we’d be in the black, then it’d be fiction.

  2. Kat, 21 July 2011

    Sounds like any other balance sheet. The point is that at any given time you can do some comparisons of data.

  3. Doug, 22 July 2011

    Mike: there’s a (probably irrelevant to you) difference between the publishing stats changing and economic stats changing.

    The publishing stats are changing because the reported data is changing, but economic stats change because they’re based on a moving average of this month plus the previous 2 months and the next 2 months. Until the next 2 months happen, the “average” for this month is going to be funky. (For various data, the number of months ahead and back may not be 2, but that’s a typical example.) There are other factors related to auto-regression removal (including seasonal influences) if the AR coefficients are recomputed each month, but the “moving average” part is easiest to understand. If you really care, look into the X-11, X-12, and X-13 algorithms. X-11 was particularly clunky in dealing with the lack of future data.

  4. […] Let me put the growth into perspective. Last year the US ebook market averaged around $30 million in reported sales each month. This means that S&S reported sales for Q2 would have been around a third of the ebook market for last year. Considering that they’re significantly less than a third of the general book market, it should tell you how fast the ebook market has grown. Of course, the S&S figures are from the quarter ending in June and thus they are too new to be considered reliable. […]

  5. dee, 4 August 2011

    Mike – great point, I had no idea. It makes sense once you explain the reasons for the fluid numbers, of course.

  6. […] The data released today doesn’t cover 2011, though. That’s because most of the stats for 2011 are still up in the air. No one knows what the sales figures are for a given month until about 4 months afterwards (here’s why). […]

  7. […] Of course, any figures from a recently ended quarter are too new to be considered reliable. […]
