How to Convert a PDF to Markdown

A plain-English guide to converting a PDF into clean Markdown, with simple steps and fixes for the formatting that usually goes wrong.

To convert a PDF to Markdown, open the PDF in an online editor that exports text, confirm the text is selectable, choose Markdown (.md) as the output format, and download the result. The tool rewrites the headings, paragraphs, and lists using Markdown symbols, like # for a title and - for a bullet. You then tidy up anything the conversion missed.

Key takeaways

  • Markdown is plain text with light formatting symbols, so converting from PDF strips away fonts and colors and keeps the underlying structure.
  • The cleanest results come from PDFs built from real, selectable text, not scanned images of pages.
  • Always proofread the output. Headings, tables, and multi-column layouts are the parts most likely to come out wrong.
  • An online PDF editor lets you run the conversion from your browser with nothing to install. Files are processed on the server and are not stored long-term.
  • For the reverse trip, see how to convert a Markdown file to PDF.
  • If you want styled rich text instead of plain Markdown, converting to RTF may suit you better.

What Markdown actually is (and why people want it)

Markdown is a way of writing formatted text using ordinary keyboard characters. Instead of clicking a "bold" button, you wrap a word in **asterisks**. Instead of choosing a heading style from a menu, you put a # in front of a line. The file itself is plain text, which means it opens in any text editor, weighs almost nothing, and stays readable for decades.

People convert PDFs to Markdown for practical reasons. Writers paste content into note apps like Obsidian or Notion. Teams move documentation into a shared repository, where Markdown is the everyday standard. Bloggers feed text into a website builder that renders Markdown into pages. In each case the goal is the same: take the words trapped inside a fixed PDF and turn them into clean, editable, portable text.

The trade-off is that Markdown is deliberately simple. It has headings, bold, italics, lists, links, code blocks, blockquotes, and (in most flavors) basic tables, and that is roughly the whole vocabulary. A PDF can hold things Markdown cannot express, such as exact fonts, columns, and absolute positioning, so part of converting is accepting that some visual polish will not survive the trip. That is usually fine, because the reason you wanted Markdown was to get at the substance, not the styling.

It helps to picture what the converter is doing under the hood. It walks through the text in your PDF, looks at clues like font size and weight, and makes its best guess about which lines are headings, which are body paragraphs, and which belong to a list. Those guesses are good on a clean, simply laid-out document and shakier on a heavily designed one. Knowing that the output is an interpretation, not a perfect copy, sets the right expectation before you start.
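As a rough illustration of that guessing, here is a minimal Python sketch. It is not how any particular converter works. The input format (a list of text, font size, and bold values), the function name classify_lines, and the 1.5x and 1.2x size thresholds are all assumptions made up for the example.

```python
from statistics import median


def classify_lines(lines):
    """Turn (text, font_size, is_bold) tuples into Markdown lines.

    Hypothetical heuristic: the median font size is treated as body text,
    and anything clearly larger is guessed to be a heading.
    """
    body_size = median(size for _, size, _ in lines)
    output = []
    for text, size, bold in lines:
        stripped = text.strip()
        if stripped.startswith(("• ", "- ", "* ")):
            # A leading bullet character marks a list item.
            output.append("- " + stripped[2:].strip())
        elif size >= body_size * 1.5:
            # Much larger than body text: guess a top-level heading.
            output.append("# " + stripped)
        elif size >= body_size * 1.2 or (bold and len(stripped) < 60):
            # Somewhat larger, or short and bold: guess a section heading.
            output.append("## " + stripped)
        else:
            output.append(stripped)
    return output
```

A short bold sentence in the body would be mislabeled as a heading by this sketch, which is the kind of wrong guess the proofread is there to catch.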

How do I turn a PDF into Markdown?

The fastest route for most people is a browser-based editor that can read a PDF's text and export it. Here is the full process.

  1. Open your PDF in an online editor. Head to a tool like our online PDF editor and upload the file. It loads on the server and shows you the pages, so you can confirm you have the right document before doing anything else.
  2. Check that the text is selectable. Try highlighting a sentence on the page. If you can select it, the PDF contains real text and will convert well. If your cursor just draws a box over the page the way it would over a photo, the PDF is a scan and needs an extra step, which is covered below.
  3. Choose Markdown as the export format. Look for an export or download option and pick Markdown (.md). If the editor offers plain text or HTML instead, those are close cousins you can clean up, but native Markdown saves the most work.
  4. Review the structure before you download. Glance at how headings, lists, and tables have been interpreted. This is your chance to spot a heading that got flattened into an ordinary paragraph, or a list that lost its bullets.
  5. Download the .md file. Save it somewhere you will remember. The file will be tiny, often a fraction of the size of the original PDF.
  6. Open it in a Markdown editor and proofread. Tools like VS Code, Obsidian, or a basic text editor with a preview pane show you the rendered result side by side with the raw symbols. Fix anything that looks off.

That is the whole job. For a clean, text-based PDF the result is usually most of the way there out of the box, and the remaining cleanup takes a few minutes.

The realistic failure mode

The single biggest reason a conversion disappoints is that the PDF was a scanned image rather than real text. A scan is a photograph of a page, so there are no letters to extract, only pixels. A converter will either return an empty file or, if it includes optical character recognition (OCR), a rough text guess that needs heavy proofreading. Before you blame the tool, check step 2 above. If the text will not highlight, run OCR first to create a text layer, then convert.

The second most common problem is layout that does not map onto Markdown. Two-column newsletters, sidebars, footnotes, repeating headers and footers, and text wrapped around images can all confuse the reading order. The converter reads the page in a sequence that made sense to it but not to you, and sentences arrive jumbled. There is no magic fix for this; you reflow the text by hand afterward. Documents with a simple top-to-bottom flow rarely have this trouble, which is why a plain report converts far more cleanly than a glossy brochure.
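To see why a two-column page arrives jumbled, consider this small Python sketch. It assumes each text block comes with an x and y position (y growing downward) and that the page splits neatly at its midpoint. Both are simplifications for illustration, and the function names are hypothetical.

```python
def naive_order(blocks):
    """Read strictly top to bottom, then left to right (ignores columns)."""
    ordered = sorted(blocks, key=lambda b: (b["y"], b["x"]))
    return [b["text"] for b in ordered]


def column_aware_order(blocks, page_width):
    """Read the whole left column first, then the right column."""
    mid = page_width / 2
    left = sorted((b for b in blocks if b["x"] < mid), key=lambda b: b["y"])
    right = sorted((b for b in blocks if b["x"] >= mid), key=lambda b: b["y"])
    return [b["text"] for b in left + right]


blocks = [
    {"x": 50, "y": 100, "text": "Left line 1"},
    {"x": 320, "y": 100, "text": "Right line 1"},
    {"x": 50, "y": 120, "text": "Left line 2"},
    {"x": 320, "y": 120, "text": "Right line 2"},
]
print(naive_order(blocks))              # columns interleaved
print(column_aware_order(blocks, 600))  # left column, then right column
```

The naive version interleaves the two columns line by line, which is the jumbled effect described above. Real pages with sidebars, footnotes, or uneven columns do not split this cleanly, which is why hand reflowing is still the dependable fix.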

A third, smaller snag is special characters. Curly quotes, em dashes, accented letters, and symbols usually carry over fine, but the occasional one lands as a stray code or a question mark. These are quick to spot and quick to fix during the proofread.

Getting the cleanest possible result

A little preparation makes a noticeable difference.

Start from the original file. If you have the source PDF that was exported straight from a word processor, use it. A PDF that has been printed, scanned, re-saved, and emailed three times carries baggage that hurts the conversion and is far more likely to behave like an image than like text.

Simplify before you convert when you can. If your editor lets you delete a decorative cover page, a full-page logo, or repeating headers and footers, removing them first keeps junk out of your Markdown. Every element you cut beforehand is one less thing to clean up afterward.

Pick the right output for the job. Markdown is ideal when you want lightweight, portable, version-controllable text. If you actually need the bold, italics, fonts, and colors preserved as visible formatting, Markdown is the wrong target and converting to RTF will serve you better, since RTF carries rich styling that Markdown intentionally drops.

Know how tables will behave. Markdown tables exist but are basic. A simple grid of rows and columns usually survives. A table with merged cells, nested headers, or cells that span several lines will likely come out misaligned, and you will need to rebuild it using Markdown's pipe-and-dash syntax. For a one-off complex table, copying it into a spreadsheet and rebuilding from there is sometimes faster than fighting the raw output.
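If you end up rebuilding a table from spreadsheet data, the pipe-and-dash syntax is regular enough to generate. The Python sketch below is one minimal way to do it. The function name to_markdown_table is made up for the example, and it assumes every row has the same number of cells as the header.

```python
def to_markdown_table(header, rows):
    """Build a pipe-and-dash Markdown table from a header and a list of rows."""

    def format_row(cells):
        # Escape literal pipes so cell text cannot break the columns.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    lines = [format_row(header)]
    # The separator row of dashes tells the renderer the first row is a header.
    lines.append("| " + " | ".join("---" for _ in header) + " |")
    lines.extend(format_row(row) for row in rows)
    return "\n".join(lines)


print(to_markdown_table(["Item", "Qty"], [["Apples", 3], ["Pears", 5]]))
```

Merged cells and multi-line cells have no equivalent in this syntax, so those still need to be flattened by hand before the table is rebuilt.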

Convert in sensible chunks for very long files. If you are dealing with a two-hundred-page PDF, it can be easier to convert and proofread it in sections rather than wrestling one enormous Markdown file. You catch reading-order problems sooner and keep each pass manageable.

Cleaning up the Markdown after conversion

No conversion is perfect, so budget a few minutes to polish. Open the file in any editor with a live preview and work through this short checklist.

  • Heading levels. Make sure your document title is a single #, section titles are ##, and subsections are ###. Converters sometimes guess the level wrong or turn a heading into bold text instead of a proper heading.
  • Lists. Confirm bullet points use - and numbered lists use 1. (a number followed by a period). Watch for lists that got merged into one long paragraph, which happens when the original spacing was tight.
  • Line breaks. PDFs often hard-wrap every line. You may see a single sentence broken across several lines in the raw text. Joining those back into full paragraphs makes the Markdown cleaner and easier to edit later.
  • Links. Check that any web links survived as [text](url) and are not just bare URLs or, worse, dropped entirely.
  • Stray characters. Page numbers, footer text, and hyphenation artifacts (a word split as "exam-\nple") sometimes sneak in. A quick read-through catches them.

Because Markdown is plain text, every one of these fixes is just typing. You do not need a special program; you only need to see the symbols and understand what they do. If you learn just six pieces of syntax, # for headings, ** for bold, * for italics, - for bullets, 1. for numbered lists, and [text](url) for links, you can repair almost any messy conversion by hand.
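For longer files, two of the checklist items, hard-wrapped lines and hyphenation artifacts, are mechanical enough to script. The Python sketch below is a minimal example, not a complete cleaner. The function name unwrap_paragraphs is hypothetical, and it assumes blank lines separate paragraphs. Note that it will also join genuinely hyphenated words that happen to break at a line end (for example "well-" followed by "known"), so skim the result afterward.

```python
import re


def unwrap_paragraphs(text):
    """Rejoin hard-wrapped lines into paragraphs, leaving lists and headings alone."""
    # Rejoin words split across a line break, e.g. "exam-\nple" -> "example".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    cleaned = []
    # Blank lines are treated as paragraph boundaries.
    for para in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in para.splitlines() if ln.strip()]
        if not lines:
            continue
        # Keep headings and list items on their own lines.
        if any(re.match(r"(#{1,6} |[-*] |\d+\. )", ln) for ln in lines):
            cleaned.append("\n".join(lines))
        else:
            cleaned.append(" ".join(lines))
    return "\n\n".join(cleaned)
```

Run it on a copy of the converted file rather than the original, so you can compare the two if a paragraph joins in a way you did not expect.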

When Markdown is the right choice

Markdown shines when your priority is the words and their structure, and you want a file that is small, future-proof, and easy to edit anywhere. It is the natural fit for notes, documentation, blog drafts, README files, and anything headed into a system that reads Markdown natively.

It is the wrong choice when you need the document to look exactly like the PDF, with precise fonts, page layout, and graphics intact. In that situation, leave it as a PDF or convert to a format built to preserve appearance. And if you later finish editing your Markdown and want a polished, shareable document again, you can go the other direction and turn it back into a PDF; our guide on how to convert a Markdown file to PDF walks through that.

PDFs have carried formatted documents since Adobe released PDF 1.0 in 1993, and the format became an open ISO standard (ISO 32000-1) in 2008. It was built to make a page look identical everywhere it is opened. Markdown was designed for the opposite priority: the lightest possible way to write structured text that stays editable. Converting between them is really about choosing whether you value fixed appearance or flexible substance for the task in front of you.

FAQ

How do I turn a PDF into Markdown?

Open the PDF in an online editor, confirm the text is selectable, choose Markdown (.md) as the export format, and download the file. Then open it in a Markdown editor with a preview pane and proofread the headings, lists, and tables. If the text will not highlight in the PDF, it is a scan and you will need to run OCR before converting.

Can I convert a scanned PDF to Markdown?

Yes, but only with an extra step. A scanned PDF is an image, so there is no text to pull out directly. You first run optical character recognition (OCR) to create a text layer, then convert that to Markdown. Expect to proofread carefully, because OCR can misread characters, especially on faint or skewed scans.

Will my tables and images survive the conversion?

Simple tables usually convert into basic Markdown tables, but complex ones with merged or multi-line cells often come out misaligned and need rebuilding by hand. Images are referenced rather than embedded in Markdown, so they generally are not carried into the text file. Plan to re-add any pictures separately if you need them alongside the text.

Is converting a PDF to Markdown free and safe?

Many online tools, including ours, let you convert without installing anything. Files are processed on the server to do the conversion and are not kept long-term. As with any document handling, avoid uploading highly sensitive material to a service you do not trust, and read the provider's privacy terms first.

What is the difference between PDF to Markdown and PDF to RTF?

Markdown is plain text with simple symbols and deliberately drops fonts, colors, and complex layout. RTF (Rich Text Format) keeps visible styling like bold, italics, and fonts. Choose Markdown for lightweight, portable, editable text, and choose RTF when you want the formatting to stay intact.

Why does my Markdown look messy after converting?

Usually because the PDF had a layout Markdown cannot mirror, such as multiple columns, sidebars, or a hard line break on every line. The converter reads the page in its own order and the text arrives jumbled or over-split. Open the file in a preview editor, rejoin broken paragraphs, fix the heading levels, and the result cleans up quickly.

Written by Usama Ramzan, Founder, Online PDF Edits

Usama Ramzan is the founder of Online PDF Edits, a browser-based PDF editor built to change text, images, and tables in existing PDFs without breaking their fonts, spacing, or multi-page layout. He writes about practical PDF editing, document workflows, and the engineering behind layout-safe editing.
