Question 1

How does it detect headings?

Accepted Answer

By font size. Text meaningfully larger than the body text becomes H1, H2, or H3, depending on how much larger it is. This works well for most document layouts.
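As a rough illustration of the idea, here is a minimal sketch of font-size-based heading detection. The function names and the ratio thresholds (1.8, 1.4, 1.15) are assumptions made up for this example, not the tool's actual cutoffs, and the body size is assumed to be the most common font size in the document.

```typescript
// Hypothetical sketch: thresholds and names are illustrative assumptions,
// not the tool's real implementation.

// Assume the body size is the most frequently occurring font size.
function estimateBodySize(sizes: number[]): number {
  const counts = new Map<number, number>();
  let best = sizes[0] ?? 0;
  for (const size of sizes) {
    const n = (counts.get(size) ?? 0) + 1;
    counts.set(size, n);
    if (n > (counts.get(best) ?? 0)) best = size;
  }
  return best;
}

// Map "how much bigger than body" to a heading level, or null for body text.
function headingLevel(fontSize: number, bodySize: number): 1 | 2 | 3 | null {
  if (bodySize <= 0) return null;
  const ratio = fontSize / bodySize;
  if (ratio >= 1.8) return 1;
  if (ratio >= 1.4) return 2;
  if (ratio >= 1.15) return 3;
  return null;
}

// Prefix the line with the matching number of '#' characters.
function formatLine(text: string, fontSize: number, bodySize: number): string {
  const level = headingLevel(fontSize, bodySize);
  return level === null ? text : `${"#".repeat(level)} ${text}`;
}
```

With a 12 pt body, a 24 pt line would come out as an H1 under these assumed thresholds, while a 12 pt line is left untouched.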

Question 2

Does it handle bulleted and numbered lists?

Accepted Answer

Yes. Lines starting with •, ●, -, *, or a number followed by a period or closing parenthesis become proper Markdown list items.
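The rule above can be sketched with two regular expressions. This is only an illustration: the patterns, the requirement of whitespace after the marker, and the function name `toListItem` are assumptions for this example, not the tool's actual code.

```typescript
// Hypothetical sketch of the list-marker rule; not the tool's real patterns.

// A bullet marker (•, ●, -, *) followed by at least one space.
const BULLET = /^\s*[•●\-*]\s+(.*)$/;
// A number followed by '.' or ')' and at least one space.
const NUMBERED = /^\s*(\d+)[.)]\s+(.*)$/;

// Returns a Markdown list item, or null if the line is not a list item.
function toListItem(line: string): string | null {
  const bullet = BULLET.exec(line);
  if (bullet) return `- ${bullet[1]}`;
  const numbered = NUMBERED.exec(line);
  if (numbered) return `${numbered[1]}. ${numbered[2]}`;
  return null;
}
```

Requiring whitespace after the marker is a design choice in this sketch: it keeps lines such as "3.14 is pi" or "-5 degrees" from being mistaken for list items.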

Question 3

What about tables?

Accepted Answer

Tables are extracted as plain text, not Markdown table syntax; table detection is a separate heuristic we're considering adding. For now, expect row-by-row text output that you may need to reformat.
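If you need to reformat that row-by-row output yourself, a small helper along these lines may be a starting point. It is a sketch under a loud assumption: that cells in each extracted row are separated by a tab or by two or more spaces. The real output may not be that regular, so the separator is a parameter, and the function name `rowsToMarkdownTable` is made up for this example.

```typescript
// Hypothetical helper: assumes cells are separated by a tab or 2+ spaces,
// which may not match the actual extracted text.
function rowsToMarkdownTable(
  rows: string[],
  separator: RegExp = /\s{2,}|\t/
): string {
  const cells = rows
    .map((row) => row.trim())
    .filter((row) => row.length > 0)
    .map((row) => row.split(separator));
  if (cells.length === 0) return "";

  // Pad short rows so every row has the same number of columns.
  const width = Math.max(...cells.map((row) => row.length));
  const renderRow = (row: string[]): string => {
    const padded = [...row, ...new Array<string>(width - row.length).fill("")];
    return `| ${padded.join(" | ")} |`;
  };

  // Treat the first row as the header and insert the divider after it.
  const lines = cells.map(renderRow);
  const divider = `| ${new Array<string>(width).fill("---").join(" | ")} |`;
  lines.splice(1, 0, divider);
  return lines.join("\n");
}
```

Cells that themselves contain a `|` character would still need escaping by hand; the sketch does not handle that.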

Question 4

Why do I see --- between sections?

Accepted Answer

Those are page breaks. The Markdown output preserves the original page boundaries so you can see where one PDF page ends and the next begins.
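To make the behavior concrete, here is a minimal sketch of how per-page Markdown could be joined with `---` separators, plus a helper for removing them again. Both function names are hypothetical, and the exact spacing the tool emits around `---` is an assumption.

```typescript
// Hypothetical sketch; the tool's actual separator spacing is assumed.

// Join per-page Markdown with a horizontal rule between pages.
// The blank lines matter: "text\n---" would be read as a setext heading.
function joinPages(pages: string[]): string {
  return pages
    .map((page) => page.trim())
    .filter((page) => page.length > 0)
    .join("\n\n---\n\n");
}

// Remove the separators if page boundaries are not wanted.
// Caveat: this also removes any genuine horizontal rule with the same spacing.
function removePageBreaks(markdown: string): string {
  return markdown.replace(/\n\n---\n\n/g, "\n\n");
}
```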

Question 5

Is this better than copy-pasting from a PDF viewer?

Accepted Answer

Yes. Direct copy-paste often mangles line breaks, skips headers/footers, and loses reading order. This tool runs pdf.js's reading-order heuristics, then applies our Markdown formatting on top.
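For a sense of what recovering reading order involves, here is a simplified sketch that groups positioned text fragments into lines, top to bottom and left to right. It is not pdf.js's algorithm or this tool's code. The `PositionedText` type, the `groupIntoLines` name, and the 2-unit tolerance are assumptions, and it assumes the default PDF coordinate system, where y grows upward from the bottom of the page.

```typescript
// Hypothetical sketch of line grouping; not pdf.js's actual heuristics.
interface PositionedText {
  text: string;
  x: number; // horizontal position
  y: number; // vertical position; larger y is higher on the page (assumed)
}

function groupIntoLines(items: PositionedText[], tolerance = 2): string[] {
  // Visit fragments from the top of the page to the bottom.
  const byY = [...items].sort((a, b) => b.y - a.y);

  const lines: { y: number; items: PositionedText[] }[] = [];
  for (const item of byY) {
    const current = lines[lines.length - 1];
    // Fragments whose baselines are within the tolerance share a line.
    if (current !== undefined && Math.abs(current.y - item.y) <= tolerance) {
      current.items.push(item);
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }

  // Within each line, read left to right.
  return lines.map((line) =>
    line.items
      .sort((a, b) => a.x - b.x)
      .map((item) => item.text)
      .join(" ")
  );
}
```

Multi-column pages, rotated text, and right-to-left scripts all need more than this, which is part of why naive copy-paste so often scrambles the order.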

Question 6

Is my PDF uploaded?

Accepted Answer

Never. All extraction and formatting happen client-side.
