
How to Compress a Scanned PDF (Why Scans Are So Large)

Scanned PDFs are large because every page is a full-resolution image. Here's how to compress one online, why it happens, and how to keep quality intact.

To compress a scanned PDF, run it through an online PDF compressor that downsamples the images inside it. A scanner saves every page as a high-resolution photo, so reducing the image resolution (DPI) and re-encoding those photos with JPEG compression typically shrinks the file by 50-90% while keeping pages readable.

Key takeaways

  • A scanned PDF is large because each page is a full-resolution image (often 300 DPI or higher), not selectable text.
  • The fastest fix is to downsample the images inside the file to a lower DPI and recompress them.
  • For everyday documents, 150 DPI is a sweet spot: clearly readable, far smaller than 300 DPI.
  • Converting color scans to grayscale can cut the size again with no real loss when the page is just black ink.
  • Aggressive compression blurs small print, so always check the result before sending.
  • If you need to search or copy the text, run OCR after compressing, not instead of it.

Why is my scanned PDF so large?

When you scan a page, the scanner doesn't "read" the words. It takes a photograph of the paper and stores that picture inside the PDF. A single letter-size sheet scanned at 300 DPI (dots per inch) holds roughly 8.4 million pixels in color. Multiply that across 20, 50, or 100 pages and the file balloons to tens or even hundreds of megabytes.
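The arithmetic behind that pixel count is easy to check. Here is a minimal sketch, assuming a US letter page (8.5 by 11 inches) and 24-bit color at 3 bytes per pixel. The function name `page_pixels` is just an illustration.

```python
def page_pixels(width_in: float, height_in: float, dpi: int) -> int:
    """Pixels in one scanned page: (width x dpi) * (height x dpi)."""
    return round(width_in * dpi) * round(height_in * dpi)


LETTER = (8.5, 11.0)  # US letter size in inches

pixels = page_pixels(*LETTER, dpi=300)
raw_color_bytes = pixels * 3  # 3 bytes per pixel for 24-bit color, before any compression

print(f"{pixels:,} pixels per page")                       # 8,415,000 pixels per page
print(f"{raw_color_bytes / 1_000_000:.1f} MB raw per page")  # 25.2 MB raw per page
```

These are raw, uncompressed figures. Scanners compress the images before saving, which is why real files come out smaller than this, but the pixel count is what the compression has to work against.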

Compare that to a normal, digitally-created PDF, where text is stored as actual characters and fonts. A 50-page text document might be under 1 MB. The same 50 pages scanned can easily hit 80 MB. The difference isn't the words on the page; it's that one file stores letters and the other stores pictures of letters. A picture of the letter "e" takes thousands of times more space than the single character "e," and every page of a scan is nothing but those pictures.

Three things drive the size of a scanned PDF:

  • Resolution (DPI). Higher DPI means more pixels per page and a bigger file. Doubling the DPI quadruples the pixel data, so a 600 DPI scan holds four times the pixels of a 300 DPI scan of the same page.
  • Color depth. A full-color scan is far heavier than grayscale, which is heavier than pure black-and-white. Color stores three values for every pixel; grayscale stores one.
  • Image compression. Some scanners save pages as lightly-compressed or even uncompressed images, which wastes enormous space. The same scan can be three or four times larger depending only on how the scanner chose to encode it.
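To see how the first two factors multiply, here is a rough sketch that estimates the raw (uncompressed) size of one letter-size page at different resolutions and color depths. The bit depths are the usual ones (24-bit color, 8-bit grayscale, 1-bit black-and-white), and the helper name `raw_page_bytes` is made up for this example.

```python
def raw_page_bytes(dpi: int, bits_per_pixel: int,
                   width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Uncompressed size of one page image in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Typical bit depths per pixel for each scan mode.
BITS = {"color": 24, "grayscale": 8, "black-and-white": 1}

for dpi in (150, 300, 600):
    for mode, bits in BITS.items():
        mb = raw_page_bytes(dpi, bits) / 1_000_000
        print(f"{dpi} DPI {mode:<16} {mb:6.1f} MB")
```

The third factor, image compression, then decides how much of that raw data actually lands in the file, so treat these numbers as relative weights rather than final file sizes.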

There's a fourth, quieter culprit worth naming: scanner defaults. Many office scanners and phone scanning apps ship set to "high quality" or "300 DPI color," and most people never change them. So a one-page receipt that could be a 200 KB grayscale file arrives as a 4 MB color image instead. If you scan a lot, lowering your scanner's default resolution at the source saves you from compressing everything afterward.

If you want the deeper version of this, with all the non-scanner culprits too, our guide on why your PDF file is so large and the hidden causes behind it walks through each one. But for scans specifically, it almost always comes down to those factors above.

How to compress a scanned PDF (step by step)

Here's the straightforward path using an online editor. You don't need to re-scan anything, and the work happens on the server, so a large scan won't bog down your computer.

  1. Open the compressor. Go to the PDF editor and choose the compress option, then upload your scanned file. Larger files take a little longer to upload, but the processing itself is quick.
  2. Pick a compression level. Most tools offer something like high quality, balanced, and small file. "Balanced" usually downsamples images to around 150 DPI, which is plenty for reading and printing everyday documents.
  3. Let it downsample the images. Behind the scenes, the tool reduces the resolution of each page-image and re-encodes it with efficient JPEG compression. This is where the size drops the most.
  4. Review the result. Open the compressed file and zoom in on the smallest text: footnotes, fine print, account numbers, or handwritten notes. Make sure it's still legible.
  5. Download and use it. If it looks good, you're done. If it's too blurry, go back to the original and compress it again at a gentler level; if it's still too big, try a more aggressive one.
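If you'd rather run the same kind of downsampling on your own machine, one widely used option is Ghostscript. The sketch below is not how the online editor works internally; it only builds a Ghostscript command using its built-in `-dPDFSETTINGS` presets. The level names ("small", "balanced", "high") and their mapping to presets are assumptions made for this example, and the executable is assumed to be `gs` (on Windows it usually has a different name, such as `gswin64c`).

```python
import subprocess

# Assumed mapping from the article's level names to Ghostscript presets.
PRESETS = {
    "small": "/screen",    # downsamples images to roughly 72 DPI
    "balanced": "/ebook",  # downsamples images to roughly 150 DPI
    "high": "/printer",    # keeps images at roughly 300 DPI
}


def build_gs_command(src: str, dst: str, level: str = "balanced") -> list[str]:
    """Build the Ghostscript command line for rewriting src as a smaller PDF."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={PRESETS[level]}",
        "-dNOPAUSE", "-dBATCH", "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def compress(src: str, dst: str, level: str = "balanced") -> None:
    """Run Ghostscript; requires it to be installed and on the PATH."""
    subprocess.run(build_gs_command(src, dst, level), check=True)


print(" ".join(build_gs_command("scan.pdf", "scan-compressed.pdf")))
```

Whatever tool you use, step 4 still applies: open the output and check the smallest text before you rely on it.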

That's the whole task. Once you've picked the right level, the compression itself is a single step. The skill is in choosing how aggressive to go, which depends entirely on what the document is for.

Choosing the right level for the job

Compression isn't a single "best" setting; it's a trade-off you tune to the document's purpose. A scan you'll email and forget needs nothing like the care a scan you'll keep for ten years does.

  • Email or upload to a portal: Go for "small file." A receipt or signed form doesn't need to be archive-quality, and many portals reject anything over a few megabytes anyway.
  • Printing later: Stay at "balanced" (around 150 DPI) so print output stays crisp. Print needs more detail than a screen does, because a printer lays down more dots per inch than a monitor displays.
  • Legal, medical, or archival records: Use a gentle level, or skip compression on the pages that matter. Detail you throw away here is gone for good, and you may not find out it mattered until much later.
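When a portal enforces a size limit, the rule of thumb is to use the gentlest level whose output still fits. Here is a small sketch of that rule. The level names follow the ones used in this article, and the file sizes are hypothetical numbers for one scan, not measurements.

```python
from typing import Optional

# Ordered from gentlest (best quality) to most aggressive (smallest file).
LEVELS = ["high quality", "balanced", "small file"]


def gentlest_level_that_fits(sizes_bytes: dict[str, int],
                             limit_bytes: int) -> Optional[str]:
    """Return the first (gentlest) level whose output is within the limit."""
    for level in LEVELS:
        size = sizes_bytes.get(level)
        if size is not None and size <= limit_bytes:
            return level
    return None  # nothing fits; the scan needs splitting or rescanning


# Hypothetical output sizes after compressing the same scan at each level.
sizes = {"high quality": 12_000_000, "balanced": 4_200_000, "small file": 1_500_000}

print(gentlest_level_that_fits(sizes, limit_bytes=5_000_000))  # balanced
```

A size limit only tells you how far you must go, not how far you can go, so the chosen level still has to pass the legibility check.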

A practical habit: name your compressed copy something obvious like contract-compressed.pdf and keep the original beside it. If anyone ever questions the document, you still have the full-quality source.

How to compress a scanned PDF without losing quality

"Without losing quality" needs a small honesty caveat: downsampling and JPEG compression always discard some data. The goal isn't zero loss; it's loss you can't notice at normal reading size. Here's how to keep it invisible.

  • Don't over-downsample. Dropping from 300 DPI to 150 DPI is usually unnoticeable on screen and fine for print. Dropping to 72 DPI will visibly soften small text. Stop at the lowest DPI where your smallest text still reads cleanly.
  • Convert color scans to grayscale when color doesn't matter. A black-ink contract scanned in color carries useless color data in every pixel. Switching to grayscale alone can cut the size substantially with no real quality loss.
  • Compress once, not repeatedly. Each pass re-encodes the images and adds artifacts, and those artifacts stack. Keep your original, compress it a single time, and save that as the working copy. If you later need a smaller version, go back to the original rather than re-compressing the already-compressed file.
  • Match the level to the content. Photos and detailed diagrams tolerate less compression than plain typed text. A scanned page of black text on white paper is the most forgiving thing you can compress; a scanned color photograph is the least.
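The grayscale saving comes straight from the pixel format: three stored values per pixel become one. Here is a minimal sketch of the conversion using the common luma weights (0.299, 0.587, 0.114). Real tools may use slightly different weights or color management, so treat it as an illustration rather than what any particular compressor does.

```python
def to_gray(r: int, g: int, b: int) -> int:
    """Collapse one RGB pixel into a single brightness value (0-255)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# A few pixels from a black-ink page scanned in color: paper, ink, paper.
color_pixels = [(255, 255, 255), (20, 20, 20), (255, 255, 255)]
gray_pixels = [to_gray(*p) for p in color_pixels]

print(gray_pixels)                                   # [255, 20, 255]
print(len(color_pixels) * 3, "values ->", len(gray_pixels), "values")
```

For ink on paper the three color values are nearly identical anyway, which is why dropping two of them costs almost nothing visually.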

If your PDF is a mix of a scanned page or two plus regular digital content, you may only need to squeeze the embedded images rather than touch the whole file. Our walkthrough on how to compress the images inside a PDF without re-scanning covers that targeted approach, which preserves your real text untouched while shrinking only the heavy pictures.

The realistic failure mode: text becomes unreadable

The most common way scanned-PDF compression goes wrong is pushing the level too hard. When the DPI drops too low or the JPEG quality is cranked down, letters get a fuzzy halo, thin strokes break apart, and tiny print turns to mush. This is especially brutal on handwriting, stamps, signatures, and 8-point legal footnotes, where the detail was already barely there.
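A toy example shows why thin strokes suffer first. The sketch below downsamples a tiny grayscale patch by averaging blocks of pixels (a simple box filter). Real compressors use more refined resampling, so this is only an illustration of the effect: a one-pixel-wide black stroke fades toward gray as the resolution drops.

```python
def box_downsample(rows: list[list[int]], factor: int) -> list[list[int]]:
    """Shrink a grayscale image by averaging each factor x factor block into one pixel."""
    out = []
    for y in range(0, len(rows) - factor + 1, factor):
        out_row = []
        for x in range(0, len(rows[0]) - factor + 1, factor):
            block = [rows[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(round(sum(block) / len(block)))
        out.append(out_row)
    return out


WHITE, BLACK = 255, 0
# 8x8 patch of white paper with a one-pixel-wide vertical black stroke at column 3.
patch = [[BLACK if x == 3 else WHITE for x in range(8)] for _ in range(8)]

half = box_downsample(patch, 2)     # comparable to going from 300 to 150 DPI
quarter = box_downsample(patch, 4)  # comparable to going from 300 to 75 DPI

print(half[0])     # [255, 128, 255, 255]  stroke is now mid-gray
print(quarter[0])  # [191, 255]            stroke is now light gray
```

At half resolution the stroke is still clearly there; at a quarter it has lost most of its contrast, and JPEG artifacts on top of that make things worse.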

There's a second, sneakier failure: a scan that looks fine on your laptop screen but falls apart when printed or zoomed. Screens forgive a lot because they show fewer dots per inch than a printer lays down. So the page that looked acceptable at 100% on your monitor can come out fuzzy on paper. That's why step 4 above matters, and why "balanced" beats "small file" whenever printing is on the table.

There's no universally "correct" setting: a glossy photo brochure and a typed lease agreement need different treatment. So the fix is simple: always open the compressed file and inspect the worst-case content before you rely on it. If the smallest text on the page is still sharp, you compressed it well. If it isn't, you went one notch too far, so step back to a gentler level and try again.

One more trap: compression does not make a scan searchable. Many people expect to copy text from a compressed scan and are surprised when they can't. Compressing shrinks the picture; it doesn't turn the picture back into words.

What about making the scan searchable too?

Compression and searchability are two different jobs that people often confuse. Compression makes the file smaller. OCR (optical character recognition) reads the pictures of letters and adds a hidden text layer behind the image, so you can search, select, and copy the words even though you're still looking at the scan.

If you need both, do them in a sensible order: compress first to a reasonable level, then run OCR to add the text layer. A good OCR pass adds very little to the file size, because the text characters it stores are tiny next to the page images. So you get a searchable, smaller file without paying much of a size penalty for the searchability. Our guide on turning scanned PDFs into searchable PDFs covers exactly how that works and what to expect from the accuracy.

Don't compress so aggressively that OCR can no longer read the letters, though. OCR works by recognizing letter shapes, and if you've blurred those shapes into mush, the software guesses wrong or gives up. If you plan to OCR, keep the text reasonably crisp, which is one more vote for stopping at 150 DPI rather than going lower.

A quick word on quality versus size

Every scanned PDF sits on a slider between small file and perfect detail, and you can't max out both at once. The art is knowing where your particular document needs to land:

  • A receipt for an expense report: slide hard toward small file. Nobody will ever zoom in on it.
  • A contract you'll print and sign: stay near the middle, where text is sharp but the file is still manageable.
  • A historical record or certified document: stay near perfect detail, and keep the original untouched no matter what.

Pick the lightest setting that still serves the document's real purpose, verify it with your own eyes, and you'll get a file that's a fraction of the size without anyone noticing what changed. That last part, verifying with your own eyes, is the whole game. The right setting is whatever passes that check for your specific scan.

FAQ

Why is my scanned PDF so large?

Because your scanner saved each page as a high-resolution photograph rather than as text. A single page at 300 DPI in color can be several megabytes, and that adds up fast across many pages. The size comes from pixel data, not the words, which is why compressing or downsampling those page-images shrinks the file so dramatically.

How much smaller can I make a scanned PDF?

It varies with the original, but a 50-90% reduction is common. Scans created at very high DPI or saved with weak compression have the most room to shrink. A file that's already lean, or one scanned at modest resolution, won't drop as much because there's less waste to remove.

Will compressing a scanned PDF make the text searchable?

No. Compression only reduces the file size of the images; the text inside a scan stays as a picture. To search or copy the words, you need OCR, which adds a hidden text layer to the scan. Compress first, then run OCR if you need both a smaller file and searchable text.

What DPI should I compress a scanned PDF to?

For everyday documents, 150 DPI is a reliable target. It stays clearly readable on screen and prints well, while cutting file size significantly compared to 300 or 600 DPI. Drop to 100 DPI or below only for quick on-screen viewing, since small text starts to blur at those resolutions.

Does converting a scan to grayscale reduce the size?

Yes, often a lot. Color scans store three channels of data per pixel; grayscale stores one. If your document is black ink on white paper, the color information is wasted, so converting to grayscale shrinks the file with essentially no loss of useful detail.

Is it safe to compress an important document?

It can be, as long as you keep your original and compress gently. Always save the uncompressed scan, compress a copy, and inspect the smallest text before relying on it. For legal, medical, or archival records, use a light compression level or skip compression on critical pages, because any detail you discard cannot be recovered later.

Written by Usama Ramzan, Founder, Online PDF Edits

Usama Ramzan is the founder of Online PDF Edits, a browser-based PDF editor built to change text, images, and tables in existing PDFs without breaking their fonts, spacing, or multi-page layout. He writes about practical PDF editing, document workflows, and the engineering behind layout-safe editing.
