iBooks only has 33k titles, and other embarrassing details
An interesting article was posted on Monday about the relative sizes of the various ebookstores. It took me a couple of days to think it over and double-check his methodology. I’m now confident that the data is not valid, and I can show you why.
The author didn’t go far enough. I found the error in the data because I checked more stores than he did. He only checked 4 ebookstores, and among the ones he skipped were Borders, Kobo, and Smashwords. There are others, yes, but almost all of the smaller ebookstores are honest about the number of titles they sell, so there is little reason to double-check them.
I’m going to take you through the methodology and the results, then I’ll explain why the results are invalid.
First, a word about Smashwords. This was the one honest site of the bunch. Their about page says 18k titles and I found 22k. Obviously they’ve grown since the last time that page was updated.
I used Google to search the ebookstore sites. In this day and age, a professionally made site should be set up so Google can find every page. (If it’s not, then it’s the fault of the developers, not me.) The Google searches fell into 2 types. Some sites (Kobo, Sony, Smashwords) had their ebooks organized in a particular location. Other sites required me to identify and search for a particular term found on each ebook listing. Here are the searches, if you’re interested:
- Sony (www.google.com/search?q=site:ebookstore.sony.com/ebook)
- Kobo (www.google.com/search?q=site:kobobooks.com/ebook/)
- Borders (www.google.com/search?q=site:borders.com "About the eBook")
- Smashwords (www.google.com/search?q=site:smashwords.com/books/)
- iBooks (www.google.com/search?q=site:itunes.apple.com/us/book "This book is available for download with iBooks")
- B&N (www.google.com/search?q="buy this ebook" site:search.barnesandnoble.com)
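If you would rather rebuild those search links than copy them, here is a minimal Python sketch. It only assembles the URLs from the site paths and phrases listed above; the `build_search_url` helper, the `SEARCHES` table, and the `https://` prefix are my own illustrative assumptions, and the result count still has to be read off the Google results page by hand.

```python
from urllib.parse import urlencode

# Site path and optional exact phrase for each store, copied from the list above.
SEARCHES = {
    "Sony": ("ebookstore.sony.com/ebook", None),
    "Kobo": ("kobobooks.com/ebook/", None),
    "Borders": ("borders.com", "About the eBook"),
    "Smashwords": ("smashwords.com/books/", None),
    "iBooks": ("itunes.apple.com/us/book",
               "This book is available for download with iBooks"),
    "B&N": ("search.barnesandnoble.com", "buy this ebook"),
}


def build_search_url(site, phrase=None):
    """Return a Google search URL restricted to `site`.

    If `phrase` is given, it is added as a quoted exact-match term.
    Hypothetical helper: it builds the link but does not fetch anything.
    """
    query = f"site:{site}"
    if phrase:
        query += f' "{phrase}"'
    # urlencode percent-escapes the colon, slashes, quotes and spaces.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for store, (site, phrase) in SEARCHES.items():
        print(f"{store}: {build_search_url(site, phrase)}")
```

The two kinds of search map onto the two shapes of entry: a bare site path for stores that keep their ebooks in one location, and a site plus a quoted phrase for stores where every listing has to be matched by its text.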
You might have noticed that I left Amazon out of the mix. If you ran a similar search for them you’d get around 700k titles. That’s about the number they claim to have (even though most of the titles are in the public domain).
Here are the counts Google returned for each store:
- Sony (57k)
- Kobo (128k)
- Borders (92k)
- Smashwords (22k)
- iBooks (33k)
- B&N (29k)
Smashwords were the key to finding out the results were invalid. You see, Smashwords are both a bookstore and a distributor. Their titles are available from Sony, B&N, and iBooks.
Think about it. If Smashwords are providing 20 thousand titles to B&N then all of B&N’s other distributors account for only 9k. Does that make any sense? The same applies for iBooks. Do you really think that all of the Agency 5 only account for 13k titles?
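The subtraction behind that argument can be written out as a short Python sketch. The counts are the Google figures listed above, and the 20k is the round number used in the argument; the assumption that every outlet carries that full amount, and the names `residual_titles_k` and `SMASHWORDS_SUPPLIED_K`, are mine and purely illustrative.

```python
# Google result counts from the list above, in thousands of titles.
GOOGLE_COUNTS_K = {
    "Sony": 57,
    "Kobo": 128,
    "Borders": 92,
    "Smashwords": 22,
    "iBooks": 33,
    "B&N": 29,
}

# Stores that carry Smashwords titles, according to the post.
SMASHWORDS_OUTLETS = ["Sony", "B&N", "iBooks"]

# Round figure from the argument above. Assumption: each outlet gets all of it.
SMASHWORDS_SUPPLIED_K = 20


def residual_titles_k(counts, outlets, supplied):
    """Titles (in thousands) left for every other supplier at each outlet."""
    return {store: counts[store] - supplied for store in outlets}


residuals = residual_titles_k(
    GOOGLE_COUNTS_K, SMASHWORDS_OUTLETS, SMASHWORDS_SUPPLIED_K
)

for store, left in residuals.items():
    print(f"{store}: {left}k titles left for all other suppliers")
```

A store with only a few thousand titles left over for every major publisher combined is the sign that the Google count undershoots the real catalog.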
It was an interesting idea, but unfortunately this method doesn’t work.
Mike Cane September 8, 2010 at 11:02 am
This is a good argument for metadata. The stores themselves could poll the metadata and break out the titles anywhichway. We have X free. We have X free public domain. We have X between $1-$4.99. We have X between $5.00-$9.99.
I’m still frikkin confused over who has the most *real* books now — setting aside the public domain stuff ripped off for resale.
Smashwords also distributes to Kobo, I think.
Doug September 8, 2010 at 2:22 pm
The technique used (Google search) doesn’t give valid numbers except as to the number of ebook pages that Google has picked up. Useful if you’re interested in Google’s indexing, but not helpful if you’re figuring out the size of the site’s e-book catalogue.
A simple search of the entire B&N e-book store (including Google Books) gives 1,176,150 titles:
Of those, 956,156 titles are free: "0.00"
That leaves 219,994 titles that aren’t free. Total Smashwords titles are 9065, of which 1044 are free, so 8021 of those not-free titles are from Smashwords (not all Smashwords titles are eligible for the Premium Catalog, which is distributed to the e-booksellers).
The "29K" number from the Google search is nowhere close to reality.
[I chose B&N because that’s the site I’m familiar with, not because I thought they were singled out.]
Nate the great September 8, 2010 at 6:38 pm
Thanks for pointing out the better technique.