January 2018 Report: US online book sales, Q2-Q4 2017

It has been nearly a year since our last Author Earnings report, which is probably far too long between updates. But while we haven’t said much publicly during that time, behind the scenes we’ve been super busy on the commercial side, and as a result we’ve taken our industry data and analytics capabilities to a whole new professional level.

For large publishers and other scaled industry players, this has led to a brand new source of real-time business data: a perfect complement to Bookscan, covering digital and online book sales. For authors, it means that we can now provide a far greater depth and accuracy of analysis here, pro bono, under the AuthorEarnings banner. So it’s a win-win for everyone.

But why did traditional publishers and publishing-industry analysts become so interested in our data in the first place?

Two reasons: Full-market coverage. And timeliness.

Over the past few years, traditional publishers have largely been able to navigate the digital disruption and adapt their businesses to the changing bookselling landscape with varying degrees of success. Unfortunately, the industry’s legacy sales-reporting providers, upon whom those publishers rely for data… haven’t.

Which has caused problems industry-wide.

For some book formats, these providers were still able to give decent visibility into overall sales. Print sales data from Bookscan, for instance, captures somewhere between 70% and 80% of all US hardcover and paperback purchases at point of sale, giving publishers a reasonably accurate and statistically meaningful picture of which books US readers are buying in hardcover and paperback formats. And more importantly, Bookscan sales numbers for last week are available this week, to support publisher business decisions for next week.

Data reporting on the digital side of the market has been a whole different story.

Legacy data providers like PubTrack Digital and the AAP are effectively blind to vast sectors of the consumer ebook & audiobook market. And those non-traditional sectors are precisely where ebook sales have continued to grow, year after year, even as PubTrack-and-AAP-reporting publishers have seen their own ebook sales dramatically shrink. As a result, what was once a small blind spot in the industry’s online-sales numbers now blocks half the view. Data from PubTrack and the AAP is now missing two thirds of US consumer ebook purchases, and nearly half of all ebook dollars those consumers spend. (And reporting is so long-delayed–often by 4-6 months–that even if the data were more complete, it would still be useless.)

When you can only see half the market, five months after the fact, you miss a lot.

And nobody likes running their business half blind.

Which is why more and more publishers have privately sought out our data and analysis–which gives visibility into the industry in real-time, and *includes* all those untracked purchases that these other industry data providers can’t see. And not just publishers have sought our help, but also book distributors, aggregators, global consulting firms, international publishing startups, and even private-equity firms investing in or advising major transactions in the publishing space. In other words, we’ve been busy.

While these commercial efforts have been kept wholly separate from AuthorEarnings, they’ve put us in a unique position, data-wise. In the past, even when we analyzed a million top selling titles at a time, we were still only looking at a single day’s sales. But no longer.

Now we capture over a million top selling titles a day. Every day.

Our analytics run in real-time, 24/7.

Which means that if a book sold even a single online copy since April 2017, no matter who the publisher or author, we can probably find it in our ever-growing dataset. Whether that title sold two copies yesterday or two thousand, we can see those sales. We can total them up in our dashboard. And for next week’s unreleased titles–or next month’s–we can tally up their accumulated online preorders, too.

With over 250 million rows of live ebook, audiobook, and online-print data at our fingertips, we can now, with the click of a mouse, slice & dice online book sales from last quarter, last month, or last week, any way we like. So let’s take a look back over the last three quarters of 2017, from late April thru the end of December, to see what our dashboard can tell us about which books US consumers were actually buying online during that 9-month period.

The Big Picture: 9 Months of US Online Book Sales, Captured Daily

During the last three quarters of 2017, we recorded $1.3 billion in individually tracked ebook sales, $490 million in individually tracked audiobook sales, and $3.1 billion in individually tracked online hardcover and paperback purchases. While this is not quite 100% of online sales during the period, it comes pretty close — we ramped up from a much smaller share in April, to where we are capturing more than 90% of all US online sales for Q4 2017 and beyond.

No other data set in our industry comes close to matching ours for full-market coverage, let alone timeliness. So, although we’ll barely have time to scratch the surface here, let’s dig in.

Online Sales By Format (All Book Categories Combined)

The above two pie charts show 2017 US online book sales by format: on the left, total units purchased, and on the right, total consumer dollars spent.

(It’s worth noting that even print sales have, by now, moved mostly online: in 2017, we show a full 45.5% of Bookscan’s reported 687 million total US print book sales coming from Amazon alone. By our measurement, Amazon’s share of the print market has been steadily and continuously climbing, from “only” 41.7% in 2016 and 37.7% in 2015, while sales at bookstores and other brick and mortar outlets shrink. That fact is obscured by Bookscan’s lumping of Amazon online sales and brick and mortar bookstore sales together in a single combined category called “Club & Retail.” So the red online “print” share shown here represents roughly half of all US print sales, period.)
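As a quick back-of-the-envelope check, the Amazon figures quoted in the parenthetical above can be turned into implied unit counts and year-over-year share gains. This is only a sketch using the rounded numbers from the text; the variable names are illustrative and not part of any real dataset or API.

```python
# Figures quoted in the paragraph above (rounded as published).
bookscan_print_units_2017_m = 687  # millions of US print units reported by Bookscan
amazon_print_share = {2015: 0.377, 2016: 0.417, 2017: 0.455}

# Implied Amazon print units for 2017, in millions.
amazon_units_2017_m = round(bookscan_print_units_2017_m * amazon_print_share[2017], 1)

# Year-over-year change in Amazon's share, in percentage points.
years = sorted(amazon_print_share)
share_gain_pts = {
    later: round(100 * (amazon_print_share[later] - amazon_print_share[earlier]), 1)
    for earlier, later in zip(years, years[1:])
}

print(amazon_units_2017_m)  # roughly 312.6 million units
print(share_gain_pts)       # gain in share points for 2016 and 2017
```

On these numbers, Amazon's share rose by about four points in each of the two years.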

Unsurprisingly, when we look at the above pie charts, most online book purchases in 2017 were ebooks (55%), while audiobooks made up a small but fast-growing share of units (6%), and print books accounted for the remaining 39% of units. In dollar terms, ebooks–with their generally lower purchase prices–made up a far smaller share of total online dollar spending, while 63% of online book dollar spending was for print.
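The 63% print figure can be roughly cross-checked against the tracked dollar totals quoted earlier in this report ($1.3 billion ebook, $490 million audiobook, $3.1 billion online print). The sketch below assumes those rounded totals are a fair stand-in for the data behind the pie chart, which may not hold exactly; the function and variable names are illustrative.

```python
# Tracked Q2-Q4 2017 online dollar sales quoted earlier in the report, in millions of USD.
tracked_dollars_musd = {"ebook": 1300, "audiobook": 490, "print": 3100}


def dollar_shares(dollars_by_format):
    """Return each format's percentage share of total dollars, rounded to one decimal."""
    total = sum(dollars_by_format.values())
    return {fmt: round(100 * amount / total, 1) for fmt, amount in dollars_by_format.items()}


shares = dollar_shares(tracked_dollars_musd)
print(shares)
```

With these inputs, print comes out a little above 63% of dollars, in line with the figure above, with ebooks at a bit over a quarter and audiobooks near a tenth.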

But that doesn’t mean adult fiction dollars split that way. Nor even trade adult nonfiction dollars.

Why? Because it turns out a huge chunk of those print dollars are actually going to textbooks and other academic/professional print titles (strangely, the DSM-5 Psychiatric Manual of Mental Disorders was a particularly high 2017 seller). Textbooks, which are generally priced in the $60-$200 range, skew the dollar total significantly toward print. As do children’s books (including Board Books), another huge category of book sales where almost all purchases are in print.

When we leave out textbooks and children’s titles, and look only at adult fiction & trade nonfiction, the picture changes somewhat…

Online Sales By Format (Adult Fiction & Nonfiction)

70% of online purchases of adult fiction & nonfiction are ebooks & audiobooks, and online consumer dollars skew mostly digital, too. In fact, most of the remaining online print share here is nonfiction; further narrowing the scope to just adult fiction, we see that online sales are even more digitally dominated, as shown below.

Online Sales By Format (Adult Fiction)

In short, the answer to “which format(s) now make up the majority of sales” depends entirely on the genre. Blending together textbooks and “trade” titles, children’s books and adult books, and fiction with nonfiction obscures which formats particular readers are buying most.

Even within the broader category of Adult fiction, we see wide variation. Romance readers are overwhelmingly buying digital now: 90% of all Romance purchases are ebooks. And we can see that Science Fiction & Fantasy, with roughly 75% of sales now ebooks & audio, is not that far behind. On the other hand, readers of Poetry are still buying 82% of those Poetry books as print, and 85% of Drama & Plays are bought in print, even online.

So clearly, depending on what you write, YMMV.

Online Sales By Month

Overall US ebook dollars (green) and audiobook dollars (yellow) were pretty flat from month to month: December doesn’t really look all that different than May. Digital book-buyers seem on the whole to be very steady consumers, showing little seasonality in their overall purchasing behavior.

Print sales (shown in red), on the other hand, exhibit wild seasonal swings, even online, with total print sales more than doubling during back-to-school months (August/September & January), mainly due to the sales of those textbooks we mentioned earlier. There’s also a secondary peak in December, around the time of trade publishers’ big holiday releases, but it’s significantly smaller in size. (That’s unlike brick-and-mortar bookstores, for whom a 2x December sales spike typically means make-or-break for the year.)

So how much does release timing matter for your next title? Less and less, nowadays. It seems to matter mostly for print, and even then, only for bookstore sales. For online and digital sales, it appears the “pool is open” year round.

US Online Book Sales by Publisher Type

From 2014 – 2016, Author Earnings reports only broke sales down into five broad publisher groupings. The 5 largest “Big Five” trade publishers were broken out into a category, as were Amazon’s own publishing imprints. We had a category for self-published indie authors whom we had individually verified, but we threw the rest of the unverified “single author” publishers into a different category. And finally, we had a catch-all category called “Small/Medium Publisher” where we basically stuck all other traditional publishers of any size — a gross simplification that let us put off the administrative pain of manually sorting through hundreds of thousands of variations of publishing imprint names and scattershot publisher metadata labels.

In 2017, we finally bit the bullet, classifying tens of thousands of top selling publishing imprints and metadata labels, and grouping them all under their appropriate parent entities — it was a ton of work, but worth it.

Now, instead of 5 publisher types, we have 20 — giving us laser-keen visibility into how traditional publishers of various sizes and stripes are faring, too.

2017 Ebook Sales by Publisher Type

With apologies in advance to color-challenged readers, here are pie charts showing US ebook sales for the last 3 quarters of 2017, in unit sales and in total consumer dollars, broken down by publisher type. (In assigning the new colors, we tried to stay as consistent as possible with the historic AE report palette. Purples and pinks/reds still capture the market shares of traditional publishers of various sizes, blue wedges depict market shares for different types of self-publisher, and green shows Amazon imprints.)

Circling clockwise from the top:

  • Purple represents the Big Five’s share of US sales.
  • The next four slices, magenta through light pink, show in order of descending size the shares held by Large, Medium, and Small trade publishers, and lastly the smallest Micropresses.
  • Dark yellow-orange shows Large Academic Publishers, while the next lighter yellow wedge (here an invisible sliver) shows other Academic publishers, and the even paler yellow sliver that follows is University Presses.
  • Next comes Amazon Publishing Imprints in green: Montlake, Thomas & Mercer, 47North, etc.
  • Followed by dark teal “Single Author Mega Imprints” (which is J.K. Rowling’s Pottermore getting its own category, basically).
  • Then come 3 types of blue indie self-publisher: pale aqua for small indie collectives who share publishing duties via a shared imprint, darker blue for individual self-publishers with their own LLC or DBA imprint name, and finally a slightly lighter blue for those individual self-publishers who use no separate imprint name at all.
  • Lastly, we have 3 types of Uncategorized imprints: pale gray for Single-Author imprints, medium gray for Few-Author imprints, and dark gray for Many-Author imprints. Collectively, these are the 200,000 lowest-selling imprints and label names left after we were done categorizing more than 90% of 2017’s online sales, starting from the top-selling imprints on down. None of the imprints that were left uncategorized — even the Many-Author ones — had 2017 sales adding up to more than $30,000.

And in tabular form, here’s the 2017 US ebook market broken down by publisher type, ordered from largest dollar share down to smallest:

Remember that spike in “Small/Medium Publisher” ebook market share we reported, back in October 2016?

At the time, AE’s categorization of traditional publishers was way too broad: just one category for the Big Five, and one for everyone else. In our October 2016 report, when the non-Big Five traditional share of ebook dollars suddenly ramped up, we knew we needed to dig deeper. Now that we are able to differentiate the sales of 20 different types of publisher, the answer is crystal clear. The single category responsible for that late 2016 spike was Large Academic Publishers.

eTextbooks, in other words.

In Q3 2016, several of the largest academic publishers, frustrated with the growing competition from online sales of used print textbooks from previous years, began moving more aggressively into ebook sales. Their eTextbooks, while cheaper than new print textbooks, are on average 5x as expensive as trade fiction/nonfiction ebooks, leading to rapid dollar market share capture. For 2017, ebooks from Large Academic Publishers–while only 1% of ebook units–captured a full 5.5% of ebook dollar spending.
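The unit and dollar shares above imply a price multiple that can be compared with the "5x" figure. The sketch below is only an approximation: the 1% unit share is rounded, and "the rest of the market" includes more than trade ebooks. The function names are illustrative.

```python
def relative_price(unit_share, dollar_share):
    """Average price of a segment divided by the average price of the whole market."""
    return dollar_share / unit_share


def relative_price_vs_rest(unit_share, dollar_share):
    """Average price of a segment divided by the average price of everything else."""
    segment_price = dollar_share / unit_share
    rest_price = (1 - dollar_share) / (1 - unit_share)
    return segment_price / rest_price


# Large Academic Publishers in 2017: about 1% of ebook units, 5.5% of ebook dollars.
vs_market = round(relative_price(0.01, 0.055), 2)
vs_rest = round(relative_price_vs_rest(0.01, 0.055), 2)
print(vs_market, vs_rest)
```

Both ratios land between 5x and 6x, which is broadly consistent with the multiple quoted above given the rounding in the unit share.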

A mystery back then, a one-click answer now…

So was self-publishing losing ground or gaining market share in 2017?

Throughout 2017, a frequent meme circulating in indie author loops was that self-publishing was hitting headwinds, and that self-published sales had slowed dramatically for “everyone.” Even the biggest indie stars of yesteryear were no longer pulling down what they used to, so times must be even tougher for everyone else.

A quick glance at the pie charts above reveals a different story. The indie share of the entire US ebook market, comprising the various blue wedges in the pies above, now looks like what the indie share of Amazon alone used to be, in our quarterly snapshots from previous years. In other words, far from losing ground, the overall indie market share has grown.

By how much? To know for sure, we’d need to be able to do apples-to-apples same-month comparisons, year over year. Which requires having more than a year’s worth of continuous market-wide data. And since we only started collecting continuous data in April, we won’t quite reach that point for another 3 months. (It doesn’t make sense to compare our new, continuously aggregated market-wide data against one of our old single-day, single-retailer AE snapshots; that would be an apples-vs-oranges comparison, and relatively meaningless.)

In the meantime, we can look at how indie ebook sales have trended relative to those of traditional trade publishers during the 9 months we’ve been tracking the whole market, using each group’s sales for the first few months as a baseline. The graph below shows that.

Month by month, we can see the percentage increase/decrease in indie sales relative to those of all trade publishers combined. The curve is red where indies grew slower (or shrank faster) than trade publishers, and green where indies grew faster (or shrank slower).

It tells us that, in aggregate, trade publishers of all sizes, combined, grew their dollar sales 1.1% over the 9 month period. Indies grew theirs 2.1% during the same period. For the last 9 months of 2017, then, it appears self-publishers in the aggregate were still gaining market share, albeit slowly.
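One plausible way to build such a baseline-relative comparison is sketched below. The three-month baseline window is an assumption (the text says only "first few months"), and the helper names are hypothetical; this is not the dashboard's actual method. Only the two period-level growth figures quoted above are used as inputs.

```python
def index_to_baseline(monthly_sales, baseline_months=3):
    """Express each month's sales as a multiple of the average of the first few months."""
    baseline = sum(monthly_sales[:baseline_months]) / baseline_months
    return [month / baseline for month in monthly_sales]


def relative_growth(indie_growth, trade_growth):
    """Growth of indie sales relative to trade sales, as a fraction."""
    return (1 + indie_growth) / (1 + trade_growth) - 1


# Period totals quoted above: trade publishers +1.1%, indies +2.1%.
indie_vs_trade_pct = round(100 * relative_growth(0.021, 0.011), 2)
print(indie_vs_trade_pct)
```

On those two figures, indie dollar sales end the period roughly 1% higher relative to trade sales than where they started, which matches the "gaining share, albeit slowly" reading.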

So why have we been hearing so many prominent indie pioneers telling newer authors and aspirants that “things are tough all over” now? Why are so many of them saying “indie publishing isn’t what it used to be”?

When we dug deeper, the data led us to a pretty simple answer, which we’ll circle back to at the end of the report.

In the meantime, let’s look at audio.

2017 Audiobook Sales by Publisher Type

The audio breakdown is a little more complicated. Amazon-owned publishing imprints (in this case, Audible Studios and Brilliance Audio) are the publisher of record for a fifth of all US audio sales. But a significant minority of these are sales of titles where audio rights have been sub-licensed from self-published authors, the Big Five, or other traditional publishers.

Amazingly, J.K. Rowling alone is still capturing 2.4% of all US audio dollars with her Harry Potter titles!

Pottermore sales make up the “Single Author Mega-Imprint” category in the table below:

And finally, let’s see how online print sales break down.

2017 Online Print Sales by Publisher Type

Here, in the dark yellow and lighter yellow wedges, we can see the massive share of print sales which are Textbooks and other Academic/Professional publications, accounting for more than 38% of online print dollars in Q2-Q4 2017. And again, below, in tabular form:

Next, let’s look at the 10 top selling genres for ebooks, for audiobooks, and for online print book sales. Note that what sells best in each format is quite different.

US Ebook Sales By Genre (Top 10)

US Audiobook Sales By Genre (Top 10)

US Online Print Sales By Genre (Top 10)

Next, a brief look at frontlist sales versus backlist sales.

US Ebook Sales : Frontlist vs Backlist

In a future post, we can split this out and show a separate frontlist-vs-backlist sales graph for each type of publisher; the differences between them are quite illuminating.

But for now, we’ll move on to pricing, and show how many books US customers are buying, and how many dollars they are spending, at different price points.

Amazon Ebook Units Sold By Price Point

Amazon Ebook Dollar Spend By Price Point

Needless to say, this represents the aggregate across all genres and publisher types. It’s now easy to drill down into specific genres, and in a future post we can show how sales vary by price point for particular types of book.

On a related note, let’s take a quick look at US ebook dollar sales by discount from list price.

Amazon Ebook Dollars By Discount From Digital List Price


And finally, because everyone loves bestseller lists, we’ll share the overall top 25 publishers and authors in each format, ranked by total dollar sales, for the entire nine-month period (Q2-Q4 2017). The color-coded dots alongside each entry indicate the publisher type (or, for authors, whichever publisher type generated the majority of their 2017 sales dollars).

Total units and actual dollar sales have been blurred out for obvious privacy reasons.

Top 25 Ebook Publishers by Dollar Sales, Q2-Q4 2017

For more detailed analysis, we could drill down further into individual imprints for each publisher, and even specific labels within each imprint. But for a best seller list, that would be superfluous: the parent entity is most relevant here.

Remarkably, one of the top 25 ebook publishers for the US, ranked by total gross dollar sales for the entire nine month period, turned out to be the self-publishing imprint of a single indie author!

Top 25 Audiobook Publishers by Dollar Sales, Q2-Q4 2017

Top 25 Online Print Publishers by Dollar Sales, Q2-Q4 2017

And now, top selling online authors…

Top 25 Ebook Authors by Dollar Sales, Q2-Q4 2017

Note that we are ranking here by total customer dollars spent on the author’s books, rather than the portion that actually goes to the author. So it’s not surprising to see a lot of the Big Five’s longest-tenured bestselling authors at the top of the list; in fact, it’s stunning that 2017’s #11 overall best selling ebook author in dollar terms was an indie, who doesn’t have to split those dollars with a publisher.

Further down the list, even in dollar terms, indies become much more prevalent.

  • 7 of the top 100 selling ebook authors in the US were self-published indies
  • 50 of the top 250 selling ebook authors in the US were self-published indies
  • 121 of the top 500 selling ebook authors in the US were self-published indies
  • 284 of the top 1,000 selling ebook authors in the US were self-published indies
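The cumulative counts above can be broken into rank bands to show how the indie share changes further down the list. This is a small sketch using only the four counts quoted in the bullets; the function name and tuple layout are illustrative.

```python
# (top N authors, number of self-published indies among them), as listed above.
cumulative_indies = [(100, 7), (250, 50), (500, 121), (1000, 284)]


def share_by_rank_band(cumulative):
    """Convert cumulative counts into (first_rank, last_rank, indie_count, indie_pct) per band."""
    bands = []
    prev_rank, prev_count = 0, 0
    for rank, count in cumulative:
        band_size = rank - prev_rank
        band_count = count - prev_count
        bands.append((prev_rank + 1, rank, band_count, round(100 * band_count / band_size, 1)))
        prev_rank, prev_count = rank, count
    return bands


bands = share_by_rank_band(cumulative_indies)
for first, last, count, pct in bands:
    print(f"ranks {first}-{last}: {count} indies ({pct}%)")
```

Read this way, indies are 7% of the top 100 but somewhere around 28% to 33% of each band below that.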

Amazon Publishing also makes a strong ebook showing.

  • 102 of the top 1,000 selling ebook authors in the US were published by Amazon imprints

So why are so many memes circulating about sales falling for “all” indies? We think we know why… and we’ll talk about that last.

Top 25 Audiobook Authors by Dollar Sales, Q2-Q4 2017

Note: entry #5, Roy Dotrice, is actually George R. R. Martin’s narrator, a glitch that will shortly be corrected in the dataset.

Top 25 Online Print Authors by Dollar Sales, Q2-Q4 2017


And finally, circling back to the earlier anecdotal reports about “all” indies struggling, we took a look at the 50 top-selling indie authors for 2017, ranked by total US dollar sales, and noticed something interesting…

A Changing of the Guard Among Indie Best Sellers

ETA: We had initially shared a ranked Top 50 Indie Ebook Sellers list here (with units and dollars blurred out, of course). But then some of the authors on it started emailing us and asking for their names to be blurred out, too. As a courtesy to those authors, upon request we did so, but after the first few it became too much of a hassle. And besides, with a quarter of the list greyed out, it no longer effectively illustrated the point we were making, anyway. So we yanked the whole list, and will just simply state what we observed on it.

Despite the prevalent indie doom-and-gloom rumors, today’s top sellers are handily making as much as or more than the top selling indies from prior years were.

But here’s the kicker:

  • A lot of today’s top-selling indies are relatively new names. We didn’t recognize a lot of them.
  • And a lot of yesteryear’s pioneering indie superstars no longer even make the Top 50.

In retrospect, that shouldn’t really be surprising to anyone. But it helps explain why, despite today’s rosy picture, a lot of the public indie-author-community chatter sounds so unrelievedly grim.

Remember that the early indie pioneers, who rode the first wave of self-publishing when it was shiny and new, were super excited about the industry changes that enabled their newfound success. Many of them consequently spent time and energy evangelizing for self-publishing, accumulating along the way large followings among their fellow authors and aspirants, and to this day, many other authors and newbies still look to those early pioneer gurus for ultimate insight and information about the industry.

So perhaps it was inevitable that, now that we’re 7 or 8 years into the era of viable self-publishing, many of these early pioneers are hitting their sophomore slumps. And telling their large author followings that indie publishing is no longer what it used to be, and that “nobody” is making big money anymore.

The data proves them wrong.

What it shows instead is a changing of the author guard.

But the newest superstars of Indie, Inc., who have replaced the indie pioneers in the top rankings, are in general a quieter bunch than their predecessors were. They spend far less time evangelizing self-publishing, or giving advice to large groups of authors. And why would they? Indie publishing has gone mainstream.

Which is why it’s nice that we finally have comprehensive data covering the nontraditional side of the industry, too. 🙂

132 Responses to “January 2018 Report: US online book sales, Q2-Q4 2017”

  1. John B says:

    The audiobooks list is missing Recorded Books (RBMedia) which is an umbrella company comprising Recorded Books, Tantor Media, HighBridge Audio, ChristianAudio, Gildan Media, W.F. Howes, Wavesound .. this is the largest audiobook publisher in the world. They must be in the top 5 if not top 3. They sell 30,000 titles, including works by Danielle Steel, J.R.R. Tolkien, Dean Koontz.

    • Data Guy says:

      Hi John,

      Entry #3 for audio is Recorded Books, just as you figured — which list were you looking at?

      • John B says:

        I’m blind – thanks! I wonder why it says 18,000 titles when their website says 30,000; maybe this is only for the original Recorded Books and not the new umbrella RBMedia, which formed in 2017 and includes Tantor and the rest.

        • Data Guy says:

          Rolled up into Recorded Books, we track as separate imprints: Tantor, Gildan Media, Highbridge, Christian Audio, Audiobooks.com, and Novel Audio.

          But our data spiders don’t capture data on non-selling books, so that’s probably why we only see 18,000 for them rather than 30,000.

  2. One thing not captured in print sales is the secondary market. I know many print readers who almost never buy a new book, only used, many by exchange with used bookstores. While there’s no way to track these sales, maybe there’s a way to estimate them to get a little closer to full accuracy?

    • Nirmala says:

      I have often wondered if a sale of a used copy of a book on Amazon affects its sales rank. If so, would used sales already be included in this data?

    • Data Guy says:

      Hi David,

      That’s a request I get a lot from publisher clients, too, especially in the academic/professional space; for them, those used print (textbook) sales represent a business-critical competitive challenge.

      Used sales are less relevant in an Author Earnings context, though — just like we ignore free ebook downloads, I don’t anticipate AE looking at used print, either, here.

  3. Elizabeth Jennings says:

    David–this website is author EARNINGS. Why would data on secondary market sales be of use to authors, since they won’t earn anything from them?

  4. Orna Ross says:

    Fascinating and important findings, as ever. Thank you so much for the work!

    Amazing data and a great analysis as usual. It’s a shame that traditional publishers haven’t been able to come to terms with ebooks and digital distribution.

    Looking at all those sales by indie authors, I can’t help wondering how many books are given away as promotions. From my experience–which has been at the far end of the tail–I give away about 10 books for every one I sell.

    What impact do these freebies have on sales? Do they stimulate sales by drawing attention to an author’s work, or do they reduce sales by muscling out paid titles a reader would have purchased? Are free ebooks, given out by the author as promotions, replacing libraries, which have been less than enthusiastic about the rise of indie authors?

    Have y’all considered counting freebies (if it’s even possible)? I know it’s not about author earnings, but we could chalk it up as author expenses.

    Thanks for all the work, btw, it’s amazing.

    • Data Guy says:

      Hi Hercules,

      Given our focus (i.e. author earnings), we’ve not paid much attention to free books.
      We’ve got our hands full tracking just the paid ones. 🙂

      • Hercules Bantas says:

        It was worth a try 🙂

        • Chuck Litka says:

          Hi Hercules,
          The Data Guy said in a comment:
          “As to the large numbers of library ebook free downloads, we similarly don’t count Amazon’s vast volume of free ebook downloads, either, which is several times larger than their paid sales.
          Our focus is author earnings, so we only count as unit sales the transactions that generate earnings for authors.”
          The comment can be found in the comment section of this post:

          “Ignore the little man behind the curtain.” 🙂

          • Hercules Bantas says:

            Thanks Chuck,

            The link is much appreciated.

            If the number of freebies is several times the magnitude of sales on Amazon, and assuming that most freebies are indie published, then readers are downloading more indie books than industry published books.

            Even if only a fraction of those freebies are read, that still represents more eyeballs on indie published ebooks than industry published ebooks. The question is, what is the industry going to do about their falling market share?

            I thought Pronoun was the industry answer, but Macmillan killed it for some reason. What next? Another attempt at an online competitor to Amazon? Good luck with that. And what if Amazon starts opening brick-and-mortar stores?

            Sorry, wandering off topic as usual.



      • Don Trowden says:

        Years ago I read that the typical classic paperback novel is read 77 times but paid for only once. Do not recall the source. So at a 15% royalty rate the author is paid pennies. This is why treating ebooks like software, with licensing, can deliver more money than print in a digital future. I’d rather be paid $2.50 for each ebook I sell than pennies for each print book passed around. Plus I can charge the purchaser far less as an ebook, around $5.00 on average, which was once the right mass market paperback price. Anyway great information and thanks so much for trying to measure the full market unlike the AAP.

  6. Mike Bray says:

    Thank you for another fascinating report. Will your raw data be accessible as it has been in the past?

    • Data Guy says:

      Hi Mike,

      No, unfortunately it won’t — this report is based on hundreds of millions of rows of data, which costs orders of magnitude more dollars to gather and process than our past snapshots did; so we had to make some tradeoffs in how much raw data we would share free.

  7. Diane Capri says:

    Thanks for this! Lots of good data here. Where can we access the full list of top 2,000 indie authors?

    • Amy Gamet says:

      LOL Diane, that’s what I wanted to see, too!

    • Data Guy says:

      Hi Diane,

      We probably won’t be making that public; it’s always a balancing act between individual author privacy versus sharing data that helps all authors. In this case, the main reason we shared the top 50 was to illustrate a particular point.

      I think, also, that focusing too much on the top sellers, indie or otherwise, distracts from the broader picture: the opportunities that indie publishing has created for midlist authors to earn a living or at least pay a few bills doing what they love.

      • I notice that between yesterday and today several more independent author names in the top 50 list have been obscured. I assume they’ve requested it? They don’t need the “visibility” with their readers, and I understand why they wouldn’t want to suddenly be inundated with requests for blurbs, collaborations and even contract offers, but it’s an interesting conundrum. It seems a bit disingenuous to suggest their names are a distraction from the larger picture. The larger picture is what we had before your amazing development work. The specifics and granularity of the new information presented are what make it so valuable. Again, I sympathize with the desire for privacy, but the top 50 aren’t just private individuals; they are industry professionals and that’s an industry statistic. It doesn’t seem like they should be able to opt out. What if the Forbes 500 list had its #4 slot obscured?! Or maybe I’m completely wrong and the names are obscured for a different reason? At any rate, I think it’s a fantastic development to have a bestsellers list among independent authors. I’d never heard of most of them. I’d like to check out their work as a reader, and as a professional see if there is something for me to emulate. I’ve taken screen shots in case more of them disappear! 🙂

        • Debora Geary says:

          Kathryn, being inundated with email is the least of those authors’ worries. Being targeted by scammers would be at the top of my list, but there is a substantial and very worrying list of problems these authors are all facing now, including the potential selling of that data to parties who absolutely do not have the skills to compile it themselves.

          While this data can be compiled through publicly available information, the level of work Data Guy and his team went through to get it, in my mind, could be likened to having a PI stalk you for a month and then put a “Personal Profile of Kathryn Guare” on the Internet – down to the foods you like, the books you think suck, the noises you make in your bedroom, and best guesses as to your bank card PIN number. It makes me very uncomfortable to consider that level of detail “industry data.”

          • I agree with Debora,

            With the scammer shenanigans on Amazon, why on earth put a target on people’s backs? One thing that seriously bothers me about naming names and especially the comment about names being blanked (not actually removed) is this: What is it that makes people believe they have a ‘right’ to anything and everything about anyone? Not just data, but to people’s supposed earnings and invading their privacy and their lives? It’s as if, if we don’t reveal stuff on demand then we’re in the wrong? Speaking personally, other people have no conception of my life, for example, or what I’ve been through, but unless I choose to ‘give’ people access on their terms, to personal details (like how much I earn), suddenly I’m the one with a problem? I don’t effing think so. The people who are cheering on Data Guy (whose identity is hidden, btw) do not know me. I don’t know them. I owe them no part of me or my earnings, my books, OR my identity. Just sayin’. The sense of entitlement is breathtaking.

  8. Martin Lake says:

    Another fascinating report after what must have involved a truly mind-boggling amount of work. There is one part I’m unclear about which is the difference between your 3 types of Uncategorized imprints: Single-Author imprints, Few-Author imprints, and Many-Author imprints so I’d welcome more info on this please.

    This was well worth the wait.

    • Data Guy says:

      Hi Martin,

      Uncategorized Single-Author Imprints are those with only a single associated author name (and sometimes only a single book) selling any copies at all, but with sales too low to justify the effort of manually categorizing them.

      Uncategorized Few-Author Imprints are those with 2-10 authors publishing under the imprint name but collectively selling too little to be worth categorizing.

      Uncategorized Many-Author Imprints are those with 11-500 authors publishing under the imprint name, but collectively selling too little for us to get around to categorizing that particular imprint.
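
The three buckets described above depend only on how many author names are attached to an imprint, so they can be sketched as a simple threshold function. This is only an illustration, not the Author Earnings code: the function name and the handling of counts outside 1-500 are assumptions, and the separate "selling too little to categorize by hand" test is not modeled.

```python
def uncategorized_bucket(author_count):
    """Map an imprint's author count to the bucket names used in the reply above."""
    if author_count < 1:
        raise ValueError("an imprint needs at least one author")
    if author_count == 1:
        return "Uncategorized Single-Author Imprint"
    if author_count <= 10:
        return "Uncategorized Few-Author Imprint"
    if author_count <= 500:
        return "Uncategorized Many-Author Imprint"
    # Assumption: the reply does not say what happens above 500 authors.
    return "Not covered by the description"
```

For example, an imprint with 7 authors lands in the Few-Author bucket, and one with 11 in the Many-Author bucket.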

  9. Alyssa Day says:

    Thanks for this! You once published a breakdown of how many authors were making more than 10K per year, more than 25K, 50K, 100K, etc. Do you have that again this time? When people asked me what to expect when they went indie, I pointed them to your survey. I’d love to see an updated chart.

    • Data Guy says:

      Hi Alyssa,

      That’s a great idea! And in a few months, we’ll even have a full year’s worth of comprehensive data to base it on. That’ll make the conclusions ironclad, so let’s do it then. 🙂

      • Melanie Cellier says:

        That’s what I was hoping to see as well, so I’m glad to hear it will be included in the next report. I remember it was very eye-opening, especially once you broke it out into when the authors debuted as well.

    • Tim Moon says:

      I was wondering the same thing and I’m glad to hear that will be included later.

  10. Thank you so much for the data. Your findings are the opposite of what I usually read nowadays, but it makes sense. The older ‘guard’ evangelized self-publishing. The ‘new’ generation focuses on creating book after book for their target audience without telling the public about the process, because as you said, indie publishing is mainstream.

  11. William Ockham says:


    By the way, the DSM is a big print seller every year. It’s a textbook and a necessary professional reference for tens of thousands of mental health and primary care providers.

    • Mr says:

      The reason the DSM vaulted so high is they finally updated it. They added more subclasses and specifics to categories, so professionals are ordering the current edition, as it will go out of print, plus the updated one to cross-reference. Also, a lot of coders are taking classes to keep up with the ever-changing codes, and the classes include the book in the price of the class itself. A lot of times the coders just get the new books and keep them rather than go through the hassle of contacting the entity offering the class; it’s a pain. They just end up with a backup book, and most want one anyway. When ICD-10 is completely implemented and ICD-9 is no longer used in any way, I imagine sales will drop. You just have to wait for doctors’ offices not attached to hospitals to catch up to the new AHIMA laws. Once they are nationwide, sales will drop significantly.

      • HIM person says:

        Um, AHIMA doesn’t make laws. It’s an organization that represents health information management professionals. CMS makes laws, HHS makes laws. AHIMA merely alerts its members to what they are, and the guidelines for following them. But you’re quite correct that any major changes set forth in criteria, guidelines and codes that reflect them will spur a flurry of purchasing activity to remain compliant with the changes.

  12. Rockin’ as always, guys! Thanks for all the great info!

  13. Curious says:

    I’m curious as to where you got your indie sales/earnings data? I’m in several author groups where sales and earnings are shared openly and there are a good handful of authors who verifiably out-earned many of the authors on your list but they’re not shown. I’m just wondering how accurate your data is or if your data collection is maybe flawed.

    • Data Guy says:

      Hi Curious,

      Authors with multiple pen names could be part of the reason; the list here really should have been called “Top 50 Indie Pen Names.” The rankings also don’t include non-indie revenue for hybrid authors (i.e. revenue from Amazon Imprints or any traditionally published titles those authors have). Nor do they include (yet) revenue from Kobo US & GooglePlay US.

      Sales from other countries are also not part of the calculations here; while US sales are 70%+ of the revenue pie for most indie authors, some do see a disproportionately high share of their revenue coming from the UK, Canada, Australia, Germany, etc — those dollars aren’t included here, as we’re ranking on US sales only.

      Finally, sales from two-author collaborations and multi-author box sets aren’t divided up neatly between authors in our data; so authors who derive a big chunk of their revenue from either or both could be getting assigned too large a share of revenue from those box sets, or too little.

  14. AW says:

    Hi Data Guy! Great info as always. Quick question: some of your top 50 ebook earning authors have incorrect title counts. At a glance, a few are just way bloated, with more titles than that author actually seems to have. What data are you pulling that’s making the title lists look like this? Thanks!

    • Data Guy says:

      Hi AW,

      Sometimes, especially when an indie isn’t using ISBNs (and most aren’t), the algorithms don’t do a great job of auto-correlating different versions of the same book selling at different retailers, so they look like different books to our spider.

      Foreign-language editions, even very low-selling ones, also creep into the title count if they are available from US retailers.
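
To make the title-count problem above concrete, here is a minimal sketch of matching the same book across retailers when there is no ISBN to join on. It is not the spider's actual algorithm: the normalization rules, function names, and sample listings are all hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

def _norm(text):
    """Lowercase, strip accents, drop parentheticals and punctuation."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.casefold()
    text = re.sub(r"\(.*?\)", " ", text)      # e.g. "(Kindle Edition)"
    text = re.sub(r"[^a-z0-9 ]+", " ", text)  # punctuation to spaces
    return " ".join(text.split())

def title_key(author, title):
    """Hypothetical match key used in place of an ISBN."""
    return _norm(author), _norm(title)

def group_listings(listings):
    """Group (retailer, author, title) listings that share a match key."""
    groups = defaultdict(list)
    for retailer, author, title in listings:
        groups[title_key(author, title)].append(retailer)
    return dict(groups)

# Made-up listings for one book by a made-up author.
listings = [
    ("Retailer A", "Jane Doe", "The Long Road (Kindle Edition)"),
    ("Retailer B", "jane doe", "The Long Road"),
    ("Retailer C", "Jane Doe", "The Long Road: A Novel"),
]
groups = group_listings(listings)
```

The first two listings collapse into one title, but the third (with its added subtitle) still looks like a different book to this simple key, which is the kind of miss that inflates a title count.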

  15. In a few charts, “subscription sales” are visible. Could you explain how you counted these? I always assumed you used bestseller rank to approximate sales. By my experience, the share of subscription sales varies a lot between genres. And it’s impossible to calculate or approximate them from real data because KDP does not tell their numbers (only pagereads are counted which are not reflected in rank). This could skew any author rankings (and earnings as well) in favour of genres with a high borrow rate… I hope I made my objection clear 🙂

    • Data Guy says:

      Hi Matthias,

      We’ve described our core methodology here and in past Author Earnings reports like this one. The way we initially calculated subscription revenue from Kindle Unlimited borrows is described here. The way we do both now has become a little more complicated, because we’re now basing it on rolling rank histories for all titles rather than a single snapshot, and we’ve become a little more granular in how we assign average read-through rates to titles in different genres.

      Your objection is noted. 🙂 But unwarranted, we think, at least in the aggregate, because the total page counts our methodology generates for each month tend to match the page counts calculable from Amazon’s monthly KU pot size announcements quite well. While undoubtedly we’re still over-attributing reads to some books and under-attributing them to others, we’ve matched actuals surprisingly well for the majority of authors we’ve cross-checked.

  16. Thanks for providing this data! It’s extraordinary information. In the AudioBooks by Genre chart, you show the top two categories as Literature & Fiction, then Fiction & Literature. What is the difference between the two?

  17. Top 54 Audiobook Sub-genres:
    #1 Literature & Fiction
    #2 Fiction & Literature
    The difference being…?

    • Data Guy says:

      Hi Mark,

      The difference being only how the particular genre category happened to be named by a retailer, which is slightly annoying. But hopefully by next report, we’ll be in a position to show rankings by industry-standard category (Adult Fiction, Children’s Nonfiction, etc.) and even potentially by BISAC code.

      With literally tens of thousands of retailer genre categories and subcategories and five thousand BISAC codes, cross-mapping them accurately has been a bit of a data challenge — we’re working on it now.

      Even for what’s shown here, the fact that each title is listed in multiple genre categories required applying some large-scale real-time number crunching to make sense of the data. 🙂
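
As a toy illustration of the naming problem described above, the sketch below collapses retailer category names that differ only in word order. It is an assumption-laden example, not the cross-mapping actually being built: the function name and the rule (sort the words, ignore "and") are hypothetical, and mapping tens of thousands of categories onto BISAC codes needs far more than this.

```python
import re

def canonical_genre(name):
    """Build an order-independent key from a retailer category name."""
    words = re.findall(r"[a-z0-9']+", name.casefold())
    words = [w for w in words if w != "and"]
    return " & ".join(sorted(set(words)))
```

With this key, "Literature & Fiction" and "Fiction & Literature" both become "fiction & literature" and could be reported as a single row.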

  18. Dan says:

    Hey Data Guy!

    Amazing, as always. What is the very latest guide you are using to correlate author rank with sales per day? Is that something you could share? Thanks! Dan.

    • Data Guy says:

      Hi Dan,

      In exchange for being able to report on a far greater depth of data now, we’ve had to make some tradeoffs in what we can share for free, and so unfortunately we can’t share it. Further complicating things, sales are now calculated from a title’s entire history of past rankings, rather than a one-day ranking measurement. So even applying it to a single title requires a bunch of number crunching.

      • Dan says:

        Completely understand. I asked because I’m always interested in what that looks like these days and because there was a post a couple days ago on the Kboards about the topic. A couple people shared their current guesses at what a chart might look like these days. Here’s one example of one for a daily ranking between 1 and 10,000. Can you at least nod or tap your finger twice if this looks like it sort of comes close? 🙂

        Thanks, Dan.

        Rank Consistent Daily Sales
        1 4885
        5 2588
        20 1301
        35 950
        100 497
        200 311
        350 208
        500 160
        750 117
        1000 93
        2000 52
        3000 37
        4000 28
        5000 23
        6000 20
        7000 17
        8000 15
        9000 13
        10000 12

        • Data Guy says:

          If those numbers are for just Amazon ebooks, and for titles with a consistent daily rate of sales, then the ranks shown are not too terribly far off. Number 1, of course, can vary significantly — a lot more than the other ranks.
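
For ranks that fall between the rows of Dan's table, one simple option is to interpolate on a log-log scale, since rank-to-sales curves tend to look closer to a straight line that way. The sketch below does that with the numbers Dan posted. It is an assumption about how such a table could be used, not the Author Earnings method (which, per the reply above, works from a title's whole rank history), and the function name is hypothetical.

```python
import bisect
import math

# Dan's guessed (rank, consistent daily sales) points from the comment above.
POINTS = [
    (1, 4885), (5, 2588), (20, 1301), (35, 950), (100, 497),
    (200, 311), (350, 208), (500, 160), (750, 117), (1000, 93),
    (2000, 52), (3000, 37), (4000, 28), (5000, 23), (6000, 20),
    (7000, 17), (8000, 15), (9000, 13), (10000, 12),
]

def estimate_daily_sales(rank):
    """Interpolate daily sales for a rank, linearly in log(rank) vs log(sales)."""
    ranks = [r for r, _ in POINTS]
    if not ranks[0] <= rank <= ranks[-1]:
        raise ValueError("rank outside the table's range")
    i = bisect.bisect_left(ranks, rank)
    if ranks[i] == rank:
        return float(POINTS[i][1])  # exact table row
    (r0, s0), (r1, s1) = POINTS[i - 1], POINTS[i]
    t = (math.log(rank) - math.log(r0)) / (math.log(r1) - math.log(r0))
    return math.exp(math.log(s0) + t * (math.log(s1) - math.log(s0)))
```

A rank of 1,500, for instance, gets an estimate somewhere between the table's 93 (rank 1,000) and 52 (rank 2,000). The estimate is only as good as the guessed table behind it.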

          • Penny Reid says:

            1 4885
            5 2588
            20 1301
            35 950
            100 497

            In my experience, the top is more like:
            (1) Over 5k a day
            (5) Over 3k a day for 2 or more days
            (20) 2000 a day for 2 or more days
            (35) 1700 a day for 2 or more days
            (100) 1000 a day for 2 or more days

          • Justin says:

            Would you say that print book sales on Amazon.com reflect those same numbers? Or would they be higher or lower?


          • Data Guy says:

            Hi Justin,


            The Amazon rank-to-sales curve for print books has a somewhat shorter head, but a longer tail once you get out past the top 100,000-ish. But those long-tail sales aren’t enough to tip the balance.

          • RJ says:

            I recently hit #35, and when I did, my sales were 200 sales per day for 2 days, then 2,300 sales during a bookbub to hit it. I only did ads to get the 200 per day for 2 days to help with the bookbub when it dropped hoping for better organic exposure. I didn’t hold the position for very long (four or five ranking cycles,) but that’s what happened with my specific title. It took about 24 hours to exit top 100, and the following day I had 371 sales on the title. This was in late July. A one day spike can get you there, but you won’t stay long in my personal experience.

  19. Rook Winters says:

    This is amazing! Thank you for sharing.

  20. Sela says:

    Hey, Data Guy!

    Great report, as usual. I was hankering for another report and voila!

    It would be great to have some data on how many indies are making a living at self-publishing. While there will always be those huge sellers who make shedloads of money, a healthy indie author community with indies able to do this for a living is important for our collective futures. I like the idea of thousands of indies making a living at this gig as independent author/publishers without the Big 5 involved. If you could in the future provide an earnings breakdown so we can see the average income or more for the US as a whole, etc. that would be really useful and encouraging.

    Keep up the great work!


    • Data Guy says:

      Hi Sela,


      Love the idea; we’ll definitely show a histogram of author counts by total (tracked) one-year revenue in the next report. By then, we’ll have a complete year’s worth of daily data to base it on.

      • Dan says:

        That would be amazing. Would love to see how many make 100K, 75K, 50K, 25K, 10K etc. Would be really cool. As always, thanks for doing this!


  21. David Lang says:

    I think it would be very interesting to see what the numbers look like if you filter it to only include books that have both print and e-book versions out.

  22. Debora Geary says:

    Data Guy, can you state your policy (or your commercial arm’s policy) on selling individual-identified sales data to your trad-pub clients? Have you done it? Will you do it in future?

  23. Widdershins says:

    Thank you for this waaaaay-beyond-impressive data. 🙂

  24. Domino Finn says:

    Awesome reading! Love the focused analysis, Data Guy!

    This might be a statistically insignificant quibble, but it always bugs me when I read your audiobook data. You separate book sales into digital and physical (ie. print), but you don’t separate audiobook sales into digital and physical (ie. CDs).

    Now, we both know most audiobook sales are going to be digital, but I’m curious what that breakdown is and if you even track it. The lack of data makes me wonder whether you’re underreporting total audio sales or overreporting digital audio sales.

    • Data Guy says:

      Hi Domino,

      The industry convention is to track physical audio as part of print sales (Bookscan includes physical audiobook CD sales in a subcategory of print sales called “Other,” along with Calendars and other miscellany). So, strange though it may seem, we’ve adopted the same convention here.

      Either way, physical audio sales have fallen to such an insignificant number that it doesn’t skew things much one way or another.

      For Q2-Q4 2017, physical audiobook CDs sold online only added up to 3% of our digital audio total, and 0.5% of our print total (which they are included in).
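
Those two percentages also imply a rough size comparison between digital audio and print. The arithmetic below is a back-of-the-envelope sketch that assumes both percentages are measured in the same unit (the reply doesn't say whether that is units or dollars) and ignores rounding in the quoted figures.

```python
# Figures quoted in the reply above.
cd_share_of_digital_audio = 0.03  # CDs equal 3% of the digital audio total
cd_share_of_print = 0.005         # CDs equal 0.5% of the print total

# CDs = 0.03 * digital_audio = 0.005 * print
# => digital_audio / print = 0.005 / 0.03
digital_audio_vs_print = cd_share_of_print / cd_share_of_digital_audio
```

Under those assumptions, the digital audio total works out to about one sixth the size of the print total.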

      • Domino Finn says:

        Thanks for the response and the added data! Audiobook CDs are quite insignificant!

        The programmer in me sees an opportunity to “fix” the old method of data categorization, ie. remove things like calendars from print book sales and properly categorize CDs, but I can understand why those aren’t priorities. (It makes me wonder how much the “Other” category skews print sales.) But as your report shows, those overall pie charts are not nearly as useful as the category/genre-specific ones, which are probably the best way to analyze trends.

  25. Data Guy, amazing amount of work and great data as others have said.

    I have one issue I need you to drop by and address on my web site if you would. Behind your paywall to big publishers, you are not releasing individual data of any writer’s sales. Correct?

    Or at least not without written permission. Correct?

    Please, please tell me I am correct.

    I know you blocked sales and names here, but the sound of what you are doing behind the paywall scares many I have talked to today about this. There are major business issues with business privacy and so on. Because if traditional publishers and the movie industry and gaming industry and such start making decisions on books because they know exact sales from you of any author, that will lead to more lawsuits than I can imagine.

    So please clear that up if you would.

    Fantastic work on all this.

    • Debora Geary says:

      I asked a very similar question above. Data Guy, if you could answer it here in both places, that would be much appreciated.

    • Data Guy says:

      Hi Dean,

      No part of any business I’m involved with is sharing or releasing actual sales data that has been shared with me confidentially by any author, publisher, or agent.

      Nor will it ever happen.

      • Debora Geary says:

        Checking to see if I’ve been blocked, as well as having my follow-up question deleted. Here it is again, in case people are interested and this actually posts. [And Data Guy, if you are moderating comments, please know that deleting polite questions asking whether you are selling individual author data seems very much like the behavior of someone who is.]

        My deleted question:

        “Data Guy, that’s a partial answer, but not the part that at least I was asking. You calculate individual-level estimates of units sold and income earned (with a very sophisticated algorithm you are comparing to data sources like Bookscan, which are actual sales.) That data is blurred in this report, but with a “lock” symbol. Are you selling unlocked versions of that individual-identified data to your paying clients? What is your policy on selling this data if you have not already done so?”

        • Debora Geary says:

          It looks like my deleted comment was undeleted as I was reposting it.

          • Debora Geary says:

            Ah. Nope. It’s all just “awaiting moderation” now. Data Guy, I have screen shots of all this and I’m sharing them. I really, really don’t like doing that – I believe it’s entirely possible you got to here with nothing but good intentions. I know what it is to chase data and not think hard enough about the consequences of what you find.

            Please give us honest answers and develop a clear and ethical data use policy moving forward.

      • AW says:

        Like Debora, I am checking to see if I’ve been blocked, as well as having my comment deleted. Here it is again, in case people are interested and this actually posts.

        My deleted comment:

        You may not release the actual sales data that has been released to you by others, but you have used that data, as you admit, to check and fine tune your algorithm. Hence, if you release the data provided by your algorithm, as it appears you do as indicated by the paywall, you are in part, betraying the confidential sharing of data, as this has been specifically used to make sure that your algorithm ends up giving results that will be pretty close to the data that was shared. So to claim that you will release only “your” algorithm data, and not the shared data, is a legalistic workaround.

  26. Great job as always. I’m so glad you guys are able to gather data continuously now, that’s so HUGE! Question for you…do you adjust your sales-to-rank ratio depending on the time of year? For instance I know from my own data (where I can see my own daily sales vs daily rank) that a rank of 1,000 in December produces a much different number of sales than 1,000 in April. How do you account for this, if you do?

    • Data Guy says:

      Hi Michael,

      Seasonal adjustments are factored in, although the whole revamped continuous-data approach is new enough that we’re learning as we go. There’s also a very nontrivial day-of-week pattern to ebook sales that we’ll be factoring in shortly. 🙂

  27. Mit Sandru says:

    I’m glad that the Indie authors are doing well, at least a few of them. But, you gave no answer why the sales are slowing down for the rest of us. Hmm!

  28. David says:

    FYI- Roy Dotrice is listed as the #5 audiobook author but he was actually a narrator. (Narrated the Song of Ice and Fire books and a few others.)

  29. Anma Natsu says:

    Really interesting trends since the last report! I’d be curious to know, when looking at the print sales, could the domination of print by traditional labels be due, in part, to indies who ignore the print market and only go ebook? Just curious as to how many of the ebook authors also have print versions, or is there a large swath ignoring a potentially untapped market?

    • Data Guy says:

      Hi Anma,

      Personally, I think it’s a little of both. Most indies who sell ebooks don’t seem to have print editions at all. It would be nice to see that change over time. While the best-selling print genres and best-selling ebook genres, as shown early in the report, do seem fairly distinct from each other, it’s hard to imagine there isn’t a significant untapped market there for indie print.

  30. Susan Faw says:

    Love the data. Indie is here to stay. There is always room to climb, if you have the talent 🙂 Thank you for sharing so freely with us all!

  31. Amy Maroney says:

    Hi Data Guy, thanks so much for this fascinating treasure trove of information. I’m going to share it widely with other indies. I appreciate all you do to shine a light on the data behind the trends in publishing. Keep up the great work!

  32. Mandy says:

    Great work, thanks.

  33. JCK says:

    Out of curiosity, why is JK Rowling listed twice on the print book list (5 & 27)?

    • Data Guy says:

      Hi JCK,

      Inconsistency in the publisher-entered “author name” metadata at the different retailers, or for different books by the same author at the same retailer. One of the many items on the cleanup list… 😉

  34. antares says:


    You collected truckloads of data. What software did you use for your analyses? SAS? R?

    • Data Guy says:

      Hi Antares,

      Some of the calibration & formulae were initially developed in R, as you surmise.

      But at the scale it runs at now, the core engine is all custom code, leveraging both columnar SQL RDBMS & clustered NoSQL in the architecture.

      Unfortunately I’m going to have to stop there, because some parts of the software architecture as implemented are non-obvious/innovative enough that it might be protectable IP.

      • antares says:

        In my previous life as a statistician, I dealt with large datasets, too. For us, 50,000 datapoints was small. Some datasets ran into the millions. Data gathered every quarter — from multiple sources — and collated, analyzed, and reported within a week of capture.

        Yeah, your data gathering and collating exercise is non-trivial. My guess is that all the code you write can be copyrighted and the data gathering and collating routines can be patented.

        Best regards.

  35. Colette says:

    Dear Data Guy

    Thank you so much for this amazing resource.
    I was wondering: does your magic work on foreign markets? I am interested in this kind of data (even rough) for the French market. Or perhaps you could kindly point me to some people who do?
    Thanks a million

  36. Susan says:

    Do the earnings include sales to libraries?

    • Data Guy says:

      Hi Susan,

      No — it’s new sales at major online retailers only. (However, libraries buying new print books on Amazon, for instance, will be counted in here.)

      The thing to keep in mind about library digital borrow #’s is that, most of the time, they don’t generate any author revenue beyond the initial sale to the library… and Author Earnings are what this site is focused on.

  37. Jackie Weger says:

    Great stuff, Data Guy. And you verified something I’ve been saying for three years…Many of the early indie authors earning in six figures are falling away since Amazon leveled the playing field with a change in algorithms and the newness of digital wore off and settled into a mature market. Those of us who are new just quietly built our subscriber lists. We learned to evaluate promoters. We don’t ‘fake’ editors, but actually hire professionals for editing/proofing/cover art and formatting. We sell books. Many of us are in Amazon Select and often see borrows/KENP pages read in the hundreds of thousands. Without seeing your reports, I know where I stand among indie author earners as do many of my close indie colleagues. We are not walking away from the challenges, but working within them and always searching and beta testing new ways to promote our books. Not a single one of us believes the indie market is dying. Some of us see modest sales, some better than modest. It takes work to market our books, so we do it.

    I love your amazing reports. Best to you and your team.

  38. Hi, Data Guy,

    I’m going through the information you’ve presented for free here and find it fascinating. I have one question, though. Why is the paywall structured so that only companies with revenues of $10 million can subscribe? I often purchase data for my blog from analysis firms that usually only cater to larger businesses. I report only what I’m allowed to report, but it does give me an insight into the larger picture that those firms have collated.

    In all my years of blogging and in my years as a reporter, I have only seen paywalls like this when a certain business is targeted–for example, the analysis firm might be targeting venture capital firms with financial data as opposed to small banks, things like that. So what was your thinking setting up the paywall in this way? Are you going after a particular market?

    Also, I’ve scanned this site again, and no longer see a reference to your new data company. Maybe I missed it. In case others are wondering what I’m referring to, it’s this: https://bookstat.com/#contact

    Thanks, DG.

    • Data Guy says:

      Hi Kris,

      The costs of industry data collection alone at this scale are exorbitant, let alone the costs of infrastructure and support for real-time analytics software to interpret it.

      Without AuthorEarnings.com reports, that data and insight would remain entirely the province of the industry’s largest publishers, the biggest global consulting firms, and other related entities. Why? Because many of them have already built similar whole-market tracking capabilities in house. And more are doing so today.

      Obviously, these companies will never publicly share any of the data they’ve collected or insight they’ve gained into the broader book market. Why would they? In fact, for a number of obvious reasons, their recent public statements on the direction of the market and its prevailing trends are carefully crafted to be most supportive of their own existing business models. That’s just how PR works.

      The last 4 years of AuthorEarnings reports have provided a pro bono counterweight to that industry information asymmetry. They have given authors a peek behind the curtain, at what the largest and most capable publishers already could see for themselves.

      But for cost reasons, AuthorEarnings simply could not — and cannot — continue to exist in its previous form.

      For 4 years, Hugh and I gave generously and unstintingly of our time, and each of us has over that period separately dug deep into our pockets (he for site hosting, I for the full computer costs of data gathering and analysis), for the sole purpose of helping our fellow authors.

      The cost of the latter piece (data gathering and analysis) has spiraled ever upward, for a variety of technical reasons, and continues to do so rapidly today. In short, AE as it had previously existed had finally reached the end of the road.

      This is the only solution that lets me continue to freely share any industry data at all with the author community.

      It would be remarkably naive for any author to imagine that, even apart from the big publishers’ own in-house efforts, there aren’t already multiple third-party businesses doing this kind of data collection and selling that data to the industry. There are. I know. I talk to the principals at many of them. And there are many more of those companies on the way; the era of Big Data analytics is coming into its own and finally reaching our little, late-adopting industry of book publishing.

      Whether I’m involved or not, large-scale industry data collection is going to continue — and continue to grow rapidly. It’s unrealistic for anyone to imagine otherwise. Even the principals at the existing analytics companies that have served this industry for decades–and I talk to them, too–are looking at ways to expand into this area.

      So the only difference between whether I personally stay involved or I leave the book publishing industry and instead do similar work for another large, more lucrative industry vertical, is whether there’s anyone involved at all who gives authors even an occasional unvarnished data-driven peek at what’s really going on in this industry.

      Like I did here.

      And I know that it’s doubtful anybody else with equal visibility into the industry is likely to do the same, going forward, regardless of what they are seeing behind closed doors — and regardless of which particular companies happen to be collecting and supplying that data.

      So I share what I can, when I can, because I love to help authors.

      You ask why a business would commercially focus on the largest customers first. For the same reason that every other similar Big Data business does: when startup costs and the incremental costs of supporting each new customer are exorbitantly high, there is no real economic alternative. But in the future, when economies of scale make it possible, the long-term goal also includes business models geared toward smaller entities. Fingers crossed.

      In the meantime, I hope to continue being able to share as much as I can pro bono here with the author community.

      Further responses will be limited to questions only about the AE report data above.

      • Thanks for the explanation of your thinking, Data Guy. I’m sorry that you haven’t done what many other similar companies do, and make reports available to anyone who can afford a particular fee. It might be exorbitant, but then paying or not paying is the choice of the consumer. By doing this in such a manner, you’re automatically excluding a large market, one that helped you build your business.

        I see that you’ve acknowledged that with the idea of moving toward smaller entities down the road. I would hope you find that solution sooner rather than later.

        But it is your business, after all. I appreciate the response, and hope it gives some clarity to the various discussions going on across the web.


  39. Peter Griffin says:

    Great work Data Guy – very much worth the wait.

    Two things from me – personally I think you should go ahead and publish the names and sales figures of indie authors. It is essentially publicly available data, just difficult for people to collate. Transparency is good and information is power for authors going into negotiations as well.

    Secondly, I don’t really buy the theory about the sentiment change expressed by the first wave of indie publishers and the “sophomore slump”.

    I read a lot of their blogs and they are saying it is tougher because they have to work harder to sell the same number of books for the same value AND their marketing and advertising costs are rising. Before, they could count on organic reach and cheap ad campaigns to reach readers. Now they can’t, so they are struggling to make the same returns. That’s the elephant in the room, and it is down to indie publishers’ reliance on a small number of massive platforms, namely Amazon, Facebook and Google. The glory days are over.

    • Transparency is good and information is power for authors going into negotiations as well.

      But the information provided by Bookstat won’t be available to most authors. It will be available to multi-million-dollar corporations (publishers, movie makers, etc.), giving them yet more power in their negotiations with authors.

  40. AnotherDataGuy says:

    DataGuy, great report as always. One thing that could shed more light on your Unit Sales by Price Point and Dollar Sales by Price Point graphs would be to divide them by the distinct number of titles sold at that price point. By doing that it might show that $1.99 isn’t quite the black hole it appears to be. Or that the cliff from $4.99 to $5.99 isn’t as steep. I believe it would better show how willing people are to buy books at each price point.

    Thanks again for a great report.

    • Data Guy says:

      Hi AnotherDataGuy,

      Cool name. 😉

      And great suggestion for next time — I definitely agree. The hard part would be to also count all the non-selling books in each price band. That could drive data collection costs up 10-20x, and they are eye-wateringly ugly already!

      There’s another factor at play here, too, in the price point graph: publisher type.

      Traditional publishers whose Amazon contracts pay them the same revenue-share percentage at $1.99 as at higher prices love to temporarily price at $1.99, undercutting $2.99 indies in a price band most indies stay out of because it pays them only a 35% royalty.

      BTW, that was how Pronoun could offer indies 70% below $2.99 for a brief while — they were no doubt leveraging the Amazon terms of the larger Macmillan Amazon contract that was in place for Macmillan’s own books.

      • AnotherDataGuy says:

        Even if you only counted the distinct number of titles that had a sale during the quarter (which you’re already doing, right?) I think it would give insight into price sensitivity. If a book doesn’t sell a copy in a given quarter, it’s probably because it has no visibility in the store. As a result, one could argue that the book effectively isn’t in the store, because no potential buyers ever see it. Anyway, counting only the books you’re already processing should give you more info on price sensitivity without additional data collection costs.

        Again, thanks for providing the data to the public.

        • Data Guy says:

          That makes sense, and we do definitely track the # of *selling* titles in each price band, as well as the total sales. For the next report, we might try to do a per-selling-title average in each price range to see what insights it can provide. Great suggestion.
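The per-selling-title average discussed in this exchange can be sketched roughly as follows. This is only an illustration under stated assumptions: the price points and unit counts are invented, and `per_selling_title_average` is a hypothetical helper, not anything from the AE or Bookstat pipeline. Titles with zero sales are skipped, matching the suggestion to count only titles that sold during the quarter.

```python
from collections import defaultdict

# Hypothetical per-title quarterly records: (price, units_sold).
# These numbers are invented for illustration; they are not AE data.
titles = [
    (0.99, 1200), (0.99, 300),
    (1.99, 150),
    (2.99, 900), (2.99, 600), (2.99, 0),
    (4.99, 1100), (4.99, 500),
]

def per_selling_title_average(records):
    """Return {price: average units per title that had at least one sale}."""
    units = defaultdict(int)
    selling_titles = defaultdict(int)
    for price, sold in records:
        if sold > 0:  # ignore titles with no sales in the period
            units[price] += sold
            selling_titles[price] += 1
    return {price: units[price] / selling_titles[price] for price in units}

averages = per_selling_title_average(titles)
for price in sorted(averages):
    print(f"${price:.2f}: {averages[price]:.1f} units per selling title")
```

In this toy data the $2.99 band has the most total units, yet its per-selling-title average is no higher than the $0.99 band's, which is the kind of distinction the normalized view is meant to surface.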

  41. Erwin says:

    Given that you are doing continuous acquisition, is there any way of getting typical effect sizes for things like publishing a new title or putting one (of several) novels on Kindle Unlimited? I notice a lot of authors experimenting with that sort of thing – and getting an idea of how well that sort of thing works might ease the pain of experimenting on one’s revenue stream.

    • Data Guy says:

      Hi Erwin,

      Short answer: yes, it probably would be possible to dig that info out of the data, at least in general terms. As always, mileage would vary substantially by genre, and by individual author — but the broader statistical picture might still be very helpful.

      Thanks for the suggestion — we might tackle that in a future report, to see what we can learn.

  42. M.H. Lee says:

    First, thank you for publishing this analysis. I always appreciate data and appreciate you sharing with all indies something that could be kept for just high-paying clients.

    Second, I was wondering how you were calculating KENP for titles. Not sure if you were aware of some of the recent discussion around titles that contain bonus content, but part of that conversation was that some authors have taken steps so that the page count for their ebooks as reported on Amazon is much lower than it actually is with the bonus content. Using the publicly available page count on those books might give an estimated KENP of 400 when the actual KENP of the title is potentially 3000. Is this something you were aware of and have captured in your analysis?

    • Data Guy says:

      Hi M.H.,

      Here’s a slightly outdated description: (link). The approach has been updated somewhat since then, but remains generally similar.

      To answer your specific question about KENP count, specific adjustments have been made to account for the different typical read-through rates seen for books with extra-long page counts.

  43. Kristin Over-Rein says:

    Thank you so much for the new report! I was starting to think you had stopped making reports, but I can tell you have done a comprehensive amount of work over the last year.

    I am a Norwegian entrepreneur and founder of a startup in the indie author business. In Norway we are lagging some years behind the US book publishing industry. We are now in the “evangelizing phase”, working to raise awareness that indie authors can use the same services the publishers use to make quality books.

    I will make a blog post pointing to this report. But there is a distinction in the self-publishing categories that I don’t understand. Could you please explain this?

    – darker blue for individual self-publishers with their own LLC or DBA imprint name
    – slightly lighter blue for those individual self-publishers who use no separate imprint name at all
    – pale gray for Single-Author imprints

    How would you explain the differences, and the need to split them into three categories? I mean, if they were one group, it would be more obvious that independent authors have a larger market share.

    • Data Guy says:

      Hi Kristin,

      You’re probably right that combining the four different self-publisher pie wedges would make the full size of the self-publishing sector clearer.

      I do however want to continue to differentiate between *individually verified* self-publishers (shown in the 3 blue wedges) and the grey “Single-author Uncategorized Imprints” who, despite being 90+% self-publishers, weren’t individually verified as such, the way all titles and authors in the other 3 self-publisher categories were.

      That way, our calculation of self-published market share always remains a conservative underestimate, rather than risking an overstatement.

  44. DT says:

    Is any of the data collected by Bookstat not publicly available? And I am talking about raw data before it runs through the algorithms.

  45. PS says:

    Can I ask how author earnings were calculated for books that were enrolled in Prime Reading? The borrows there impact the sales ranks of the books, but some authors received $500+ as a flat fee while others got “exposure” – and Amazon did not pay authors on a per-unit basis for PR books borrowed by Prime members.

  46. Bryan says:

    Hi @Data Guy

    Quick question – in your 2016 presentation to the RWA, you put the size of the US romance ebook market at 235m units. In the data above, 9 months is 50m units – which seems a precipitous drop. What’s responsible for this? Improved accuracy in new crawling?

    It seems an enormous difference. Would that call into question your previous estimates of the market, or am I misreading the data? (More than likely.)


    • Data Guy says:

      Hi Bryan,

      Congratulations on your keen eye and discernment; I was actually waiting for someone to ask that question, and was kind of bummed out when no one did! 🙂

      Note the sneaky language in the genre-breakdown table headers here, i.e., “Share of Units” and “Share of Dollars”?

      The difference between the old 235M-ish number and what’s calculated here comes down to how particular book sales get assigned to particular genres, basically, and how to account for titles that are listed under multiple genre categories.

      The old number you referenced included every sale of every single book that was listed under any Romance subcategory, even if that same book was *also* listed under Mystery/Thriller and/or Scifi/Fantasy and/or Literature & Fiction, etc. On the other hand, these newer calculations shown here only give “partial credit” to Romance for sales of books that are double-, triple-, or quadruple-listed under other genres.

      Doing it this way gives a truer, non-overlapping measure of what percentage share of all book sales each genre or subgenre commands. The percentage breakdowns by genre or subgenre will now add up to 100%, which is the kind of intuitive slice-of-the-pie measure most people tend to expect.
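To make the difference between the two counting methods concrete, here is a minimal sketch. It assumes an equal split of each title's sales across its listed top-level genres; the titles, unit counts, and function names are hypothetical and are not the actual AE calculation.

```python
from collections import defaultdict

# Hypothetical titles: (units_sold, top-level genres the title is listed under).
books = [
    (600, ["Romance"]),
    (300, ["Romance", "Mystery/Thriller"]),
    (900, ["Romance", "Scifi/Fantasy", "Literature & Fiction"]),
]

def full_credit(records):
    """Old-style count: every listed genre gets all of a title's units."""
    totals = defaultdict(float)
    for units, genres in records:
        for genre in genres:
            totals[genre] += units
    return dict(totals)

def partial_credit(records):
    """Newer-style count: a title's units are split equally among its genres."""
    totals = defaultdict(float)
    for units, genres in records:
        share = units / len(genres)
        for genre in genres:
            totals[genre] += share
    return dict(totals)

full = full_credit(books)
partial = partial_credit(books)
grand_total = sum(units for units, _ in books)

for genre in sorted(partial):
    pct = partial[genre] / grand_total
    print(f"{genre}: full={full[genre]:.0f} partial={partial[genre]:.0f} share={pct:.1%}")
```

With full credit, the genre totals here sum to 3,900 units even though only 1,800 were sold. With partial credit they sum to exactly 1,800, so the genre shares add up to 100%.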

  47. Zara Altair says:

    Ponderous amount of data. Thank you for putting it all together, mining, and the interpretation.

  48. Sarah says:

    Hey Data Guy,

    Thanks for this great report. By far the best one in the industry IMO! I was wondering if you can confirm some of Mark Coker’s findings that he states in his 2017 Smashwords Report.

    There he says that “New Adult Romance” is a potentially underserved romance category which had the highest yield per title in 2017 (meaning the sum of all sales divided by the number of titles being the highest above avg). He comes to the conclusion that “New Adult Romance” is one of the least saturated niches which will experience a crazy rise in the next few months. Can you confirm that?

    You say that the ebook romance market is ~$160m. What’s the share that “Contemporary Romance” has and the one “New Adult Romance” has and how did they evolve over time (since you started tracking on a daily basis)? Are there really hidden opportunities in the “New Adult Romance” category?

    Thanks so much in advance for helping! 🙂

    • Data Guy says:

      Hi Sarah,

      The broader market data for the last 11 months shows something different than what Mark sees in his Smashwords-specific dataset. Looking at the whole US market, sales of “New Adult & College” Romance titles have grown 2.8% during the last 11 months, whereas the rest of Romance has grown somewhat more rapidly than NA, seeing sales expand 4.1% during that same time period.

      Of course a slight difference in growth rates shouldn’t be viewed as indicating a lack of opportunity, hidden or otherwise, to sell well in that genre.

      About the $160M number for all of Romance, that was only for 9 months, but more importantly it was also only a fractional share computed by giving “partial genre credit” to titles listed under multiple top-level genres: please see my answer to Bryan above. In other words, a title listed under multiple top-level genres (e.g. Romance + Fiction/Literature, Romance + Mystery/Thriller/Suspense, etc.) gets only half its sales counted, or a third, etc. in that $160M total. And for Romance-listed titles, most are multiply-listed in other genres as well (esp. Fiction & Literature subgenres).

      When we consider ALL titles listed under Romance or any Romance sub-genre, regardless of what other non-Romance genres they are also listed under, the 9-month total for the last 3 quarters of 2017 was $483 million (meaning each Romance-listed title was, on average, listed under several other non-Romance categories as well). Of that $483 million, $118 million, or almost 25%, is the share of Romance sales attributed to Contemporary Romance titles. On the other hand, $26.7 million, or 5.5%, is the share of Romance sales attributed to New Adult & College Romance titles.

      So, in total dollar sales terms, NA Romance is between a fifth and a quarter the size of Contemporary Romance.
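The percentages quoted in this reply can be checked with a few lines of arithmetic. The dollar figures below are taken from the reply itself; the variable names are just for illustration, and the "overlap factor" is a rough derived ratio, not a figure stated in the report.

```python
# Dollar figures in $ millions for Q2-Q4 2017, as quoted in the reply above.
romance_all_listed = 483.0     # every title listed under any Romance category
romance_partial_credit = 160.0 # approximate Romance total under partial genre credit
contemporary = 118.0
new_adult = 26.7

contemporary_share = contemporary / romance_all_listed
new_adult_share = new_adult / romance_all_listed
na_vs_contemporary = new_adult / contemporary

# Rough ratio of full-credit to partial-credit dollars (an illustrative derived
# number): about 3 means Romance gets roughly a third of the credit on average.
overlap_factor = romance_all_listed / romance_partial_credit

print(f"Contemporary share of Romance: {contemporary_share:.1%}")
print(f"New Adult share of Romance:    {new_adult_share:.1%}")
print(f"New Adult vs. Contemporary:    {na_vs_contemporary:.1%}")
print(f"Full vs. partial credit:       {overlap_factor:.2f}x")
```

The New Adult to Contemporary ratio comes out at about 22.6%, which sits between a fifth (20%) and a quarter (25%), consistent with the summary above.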

  49. Thank you so much for this… I love your no-nonsense breakdown so that we can get a real distinction for what’s going on…. it’s crucial for running a small business. Great as always, and so appreciated!

  50. Henry Stradford says:

    Thanks for the info, again!

  51. Bill Peschel says:

    Just checking in. Like others, I was wondering about the breakdown by writers’ projected income. In January, you promised “within a few months.” Not complaining, but just letting you know we’ll be very interested in seeing the results of your in-depth survey.

    • Data Guy says:

      Hi Bill,

      Hopefully sometime this summer I’ll get a chance to do so and share it here.

      I’ll probably put it in terms of total gross dollar sales, though, leaving the derivation of traditionally published and self-published “author income” from those gross dollar sales as an exercise for the reader. 🙂

  52. Remi says:

    I might be missing something really obvious, but I can’t wrap my head around this.

    In your February report, you reported close to $3.2B in ebook sales. In this report, we see a very steady, slightly increasing $150M / month. That’s a drop of almost 60%.

    Did the ebook market really crash somewhere between these 2 data sets? I expect the prevalence and popularity of Kindle Unlimited will have a strong downward effect on gross dollar sales (even as respective Author Earnings increase), but that’s a startling difference.

    • Data Guy says:

      Hi Remi,

      Not at all — which is why we specifically cautioned against comparing these numbers to the old AE numbers — you can’t directly compare them.
      Old AE snapshots were statistical projections based on a single day’s collected sales.
      This new data OTOH is a very, very large collected sample taken over the entire time period, without any statistical projection to the missed books whatsoever.

      What seems to throw people off is that our latest sample is now so huge, it’s assumed to be the whole market rather than just the majority of it.

      So statistically projecting 2017 based on the known coverage levels we achieved with our new, huge data sample, we can actually see modest growth in US ebook units coupled with a modest decline in dollars.

      US ebook sales totaled roughly 545 million units for the last 12 months, up from the 510 million & 525 million in 2015 & 2016. Dollar sales OTOH are down about 7%, to just over $3.04 billion, reflecting a continued shift away from high-priced Big Five ebooks and toward lower-priced indie and Amazon-imprint published ebooks.
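For readers who want to see what those projected totals imply, here is a small back-of-the-envelope calculation. It uses only the figures quoted in the reply above and assumes the "about 7%" decline is measured against the prior 12 months; the derived values are rough approximations, not numbers from the report.

```python
# Figures quoted in the reply above.
units_2015 = 510e6
units_2016 = 525e6
units_last_12m = 545e6
dollars_last_12m = 3.04e9
dollar_decline = 0.07  # "about 7%", assumed to be relative to the prior 12 months

# Unit growth relative to the two earlier years.
unit_growth_vs_2016 = units_last_12m / units_2016 - 1
unit_growth_vs_2015 = units_last_12m / units_2015 - 1

# Prior-period dollars implied by a 7% decline to $3.04B.
implied_prior_dollars = dollars_last_12m / (1 - dollar_decline)

# Average gross dollars per ebook unit over the last 12 months.
avg_price = dollars_last_12m / units_last_12m

print(f"Unit growth vs. 2016: {unit_growth_vs_2016:.1%}")
print(f"Unit growth vs. 2015: {unit_growth_vs_2015:.1%}")
print(f"Implied prior-period dollars: ${implied_prior_dollars / 1e9:.2f}B")
print(f"Average gross price per unit: ${avg_price:.2f}")
```

Units rising by a few percent while dollars fall by about 7% means the average price per unit sold has dropped, which fits the described shift toward lower-priced indie and Amazon-imprint ebooks.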

  53. Neliza Drew says:

    I wish the print category was broken down a bit more. I’d love to know the percentage of sales for hard cover, mass market, and trade paperback. Be interesting to see if there was a geographical or age difference in preference, too.

  54. naveen says:

    Hello Data Guy,
    In your last report you mentioned that total ebook sales in the USA in 2017 were $3,177M, but for the last 3 quarters you showed $1,339M. Can you please let me know if both figures are correct?

  55. Bobbi Williams says:

    Just looking to see what percentage of sales are cancer survivor memoirs

  56. Thanks so much for putting together these incredibly helpful statistics into an easy to understand report. I look forward to every new release. As a writer, I have used many of the findings in your reports to help shape my sales and publishing strategy for the future. I can’t thank you enough!

  57. Tom Shepherd says:

    That’s a level of insight that I would like to have real time!

  58. Marilyn says:

    I’m not in the book business, but I am writing a report on book sales by genre. Can you point me in the right direction? I don’t mind paying a little for the information, but I will only be using it once for this report.

  59. Love this snapshot of who is selling, etc. I believe that you should not look at what others are doing, but just do what you want to do. When someone says the indie market is down, they are only looking at the accumulated numbers, not at an individual’s success. For me, the success was in finishing the book. Everything after that is gravy. I have self-sold 60 of my 125 hardcovers of “My Time with Meta Given” in less than a month. I now have my book on Kindle and I intend to put it on Audible. Will I be successful? Like I said, I am already successful. The Amazon process lets you do it for free. If you are not marketing your book every day, then the future of your success is grim. Never stop selling your book. Every day I find about 2 people to purchase my book. At $25 a pop that’s $50 a day, just for talking. My actual profit is still in the hole with the hardcovers, but I don’t care. I had enough money to pay off my credit card and that is all that matters. I wrote a book about a woman who was left out of history. It is my life’s work from here forward to put her back into history. I have already been successful. I wrote a book!

  60. Lavinia says:

    Hi there,

    really interesting statistic. However, I was wondering if you could add alternative descriptions. I’m using a screenreader and cannot read the images otherwise

    Best wishes

  61. Martin Kenney says:


    I have watched your datasite idiosyncratically for a couple of years and find it really interesting. I am writing a paper about “platform entrepreneurs” and have been looking for a graphic that shows the sales income by rank over an entire population. I have tried to back out this data for YouTube from Social Blade, but the results are quite unsatisfactory. It seems like it is a power law curve….big winners, smallish middle and very very long tail.

    Have you taken your data and done such a curve? Basically, sales $ per day/month/whatever versus rank, #1 – #N. If you have, could you point me to that graphic?

    It seems like there are so many platform-based businesses that have these similar power law rewards. Finally, do you think this is more true in the ebook market than in physical books or in the days prior to online book sales? Thanks for the data analysis.

  62. Harry says:

    Extremely interesting statistics. However, I was wondering whether you could include alternative descriptions. I’m using a screen reader and can’t read the images otherwise.

  63. Christy says:

    What you said about PubTrack and the AAP is really true. They need to change.
