DropFormat

Deskew PDF scans without uploading your file

Fix crooked scanned PDF pages without uploading the file. Drop a PDF here to straighten image-only pages while leaving searchable text pages untouched.

The deskew pass runs entirely in your browser. DropFormat checks each page, skips pages that already contain selectable text, rotates scanned image pages when a confident angle is found, and builds a new PDF locally.

Use the quality slider to keep scanned text sharp. If automatic detection misses a page, the force-angle field lets you apply a corrective angle between -15 and 15 degrees.
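As a rough sketch of how a force-angle field like this could be interpreted: a blank value means automatic detection, and anything outside the -15 to 15 degree range is rejected. The helper name `parseForceAngle` and the choice to reject out-of-range input rather than clamp it are assumptions for illustration, not DropFormat's actual code.

```typescript
// Hypothetical helper (not from the DropFormat source): interprets the raw
// text of a force-angle field.
// Returns null for "use automatic detection", a number for a forced angle,
// and throws when the input is not a number in the supported range.
const MAX_FORCE_ANGLE_DEG = 15; // range stated in the text: -15 to 15 degrees

function parseForceAngle(raw: string): number | null {
  const trimmed = raw.trim();
  if (trimmed === "") {
    return null; // blank field: leave detection automatic
  }
  const angle = Number(trimmed);
  if (!Number.isFinite(angle)) {
    throw new RangeError(`Force angle must be a number, got "${raw}"`);
  }
  if (Math.abs(angle) > MAX_FORCE_ANGLE_DEG) {
    // Assumption: reject instead of silently clamping to the nearest limit.
    throw new RangeError("Force angle must be between -15 and 15 degrees");
  }
  return angle;
}
```

For example, `parseForceAngle("")` returns `null` and `parseForceAngle("-2.5")` returns `-2.5`, while `parseForceAngle("16")` throws.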

How it works

The tool renders image-only pages with pdf.js, estimates the skew angle with several browser-side detectors, rotates the page image, and writes the result with pdf-lib. Page count and page dimensions are preserved.
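To illustrate how page dimensions can survive a render-and-rewrite round trip, here is a minimal sketch of the size bookkeeping. PDF user space defaults to 72 points per inch, so a target resolution maps to a render scale of `dpi / 72`, which is the kind of value passed to pdf.js as `page.getViewport({ scale })`. The function name `planRaster` and the example resolution are assumptions, not values taken from DropFormat.

```typescript
// PDF user space defaults to 72 units (points) per inch.
const POINTS_PER_INCH = 72;

interface RasterPlan {
  scale: number;        // render scale relative to 72 DPI
  widthPx: number;      // raster width used for detection and rotation
  heightPx: number;     // raster height used for detection and rotation
  pageWidthPt: number;  // output page width, unchanged from the input page
  pageHeightPt: number; // output page height, unchanged from the input page
}

// Hypothetical helper: picks raster dimensions for a page at a target DPI
// while carrying the original page size through untouched.
function planRaster(pageWidthPt: number, pageHeightPt: number, dpi: number): RasterPlan {
  if (!(pageWidthPt > 0 && pageHeightPt > 0 && dpi > 0)) {
    throw new RangeError("Page size and DPI must be positive");
  }
  const scale = dpi / POINTS_PER_INCH;
  return {
    scale,
    widthPx: Math.ceil(pageWidthPt * scale),
    heightPx: Math.ceil(pageHeightPt * scale),
    pageWidthPt,
    pageHeightPt,
  };
}
```

A US Letter page (612 by 792 points) planned at 144 DPI gives a scale of 2 and a 1224 by 1584 pixel raster, while the output page stays 612 by 792 points. The rotated image is drawn back onto a page of that original size, which is why the raster resolution does not change the page dimensions.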

Pages with selectable text are copied instead of flattened so searchable text stays searchable. Pages that are already straight, blank, or too ambiguous to measure are reported in the result summary instead of being silently changed.
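The per-page decision described above can be sketched as a small classifier plus a tally for the result summary. The type names, the outcome labels, and the 0.1 degree "already straight" tolerance are all assumptions made for this example; the real thresholds live in the plugin source.

```typescript
// Hypothetical outcome labels for the result summary.
type PageOutcome =
  | "copied-text-page"
  | "rotated"
  | "already-straight"
  | "blank"
  | "ambiguous";

interface PageProbe {
  hasSelectableText: boolean; // page already carries a text layer
  isBlank: boolean;           // nothing on the page to measure
  angleDeg: number | null;    // null when no confident angle was found
}

// Assumed tolerance below which a page counts as straight.
const STRAIGHT_TOLERANCE_DEG = 0.1;

function classifyPage(probe: PageProbe): PageOutcome {
  if (probe.hasSelectableText) return "copied-text-page"; // copy, never flatten
  if (probe.isBlank) return "blank";
  if (probe.angleDeg === null) return "ambiguous";         // report, do not guess
  if (Math.abs(probe.angleDeg) < STRAIGHT_TOLERANCE_DEG) return "already-straight";
  return "rotated";
}

// Counts outcomes so that skipped pages are reported rather than hidden.
function summarizePages(pages: PageProbe[]): Record<PageOutcome, number> {
  const summary: Record<PageOutcome, number> = {
    "copied-text-page": 0,
    "rotated": 0,
    "already-straight": 0,
    "blank": 0,
    "ambiguous": 0,
  };
  for (const page of pages) {
    summary[classifyPage(page)] += 1;
  }
  return summary;
}
```

Checking for selectable text first is the important ordering here: a text page is copied even if a detector would have reported an angle for it.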

Common questions

Will OCR text stay selectable?
Yes. If a page has selectable text, the deskew tool skips that page and copies it into the output PDF.

What does force angle mean?
It is the corrective rotation applied to image-only pages. Leave it blank for automatic detection. Use it only when you know the scan needs a specific rotation.

Are my PDF pages uploaded?
No. The PDF is read, rendered, rotated, and saved in your browser tab. Error reports include file metadata and the error message, never file contents.

Accuracy benchmark

The deskew detector that ships in DropFormat was benchmarked on 642 synthetic skewed PDF pages, with skew between -15 and +15 degrees, against the four classical detection methods that existing open-source deskew tools draw from. Each method ran on the same rasterized pages in the same browser. The consensus detector had the lowest maximum error (1.235 degrees), a 95th-percentile error within 0.01 degrees of the best method, no wrong-direction corrections, and abstained on only 8 of the 642 pages.

Per-method accuracy across 642 synthetic skewed PDF pages: -15 to +15 degrees in 0.1-degree steps on body text and tables, plus stress angles on multi-column, low-density, dark-background, and low-DPI fixtures. Lower is better for every column except Confident pages.

| Detector | Median absolute error (deg) | 95th-percentile error (deg) | Maximum error (deg) | Wrong-direction rate | Confident pages |
|---|---|---|---|---|---|
| DropFormat consensus | 0.04 | 0.127 | 1.235 | 0% | 634 / 642 |
| Projection-profile variance (used by ImageMagick deskew) | 0.025 | 0.12 | 5.58 | 0% | 642 / 642 |
| Hough transform (textbook edge voting) | 7.455 | 14.3 | 15 | 0% | 642 / 642 |
| Fourier-domain line energy | 0.177 | 0.483 | 27.867 | 1.71% | 642 / 642 |
| Connected components baseline fit | 1.083 | 1.225 | 1.289 | 1.18% | 424 / 642 |
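To make the column definitions concrete, here is a minimal scoring sketch for one detector. It is not the benchmark script. The nearest-rank percentile, the sign-based definition of a wrong-direction correction, and taking that rate over confident pages only are assumptions; `scripts/pdf-deskew-accuracy-spike.js` may define them differently.

```typescript
interface Sample {
  trueDeg: number;             // skew applied to the synthetic page
  estimatedDeg: number | null; // null when the detector abstained
}

interface DetectorScore {
  medianAbsErrorDeg: number;
  p95AbsErrorDeg: number;
  maxAbsErrorDeg: number;
  wrongDirectionRate: number; // fraction of confident pages, 0 to 1
  confidentPages: number;
  totalPages: number;
}

// Median of an ascending, non-empty array.
function medianOfSorted(sortedAsc: number[]): number {
  const mid = Math.floor(sortedAsc.length / 2);
  return sortedAsc.length % 2 === 1
    ? sortedAsc[mid]
    : (sortedAsc[mid - 1] + sortedAsc[mid]) / 2;
}

// Nearest-rank percentile of an ascending, non-empty array (assumed method).
function nearestRank(sortedAsc: number[], fraction: number): number {
  const rank = Math.max(1, Math.ceil(fraction * sortedAsc.length));
  return sortedAsc[rank - 1];
}

function scoreDetector(samples: Sample[]): DetectorScore {
  const confident = samples.filter(
    (s): s is Sample & { estimatedDeg: number } => s.estimatedDeg !== null,
  );
  if (confident.length === 0) {
    throw new Error("No confident estimates to score");
  }
  const errors = confident
    .map((s) => Math.abs(s.estimatedDeg - s.trueDeg))
    .sort((a, b) => a - b);
  // Assumption: "wrong direction" means the estimate has the opposite sign
  // to the true skew, so correcting by it would increase the tilt.
  const wrong = confident.filter((s) => s.estimatedDeg * s.trueDeg < 0).length;
  return {
    medianAbsErrorDeg: medianOfSorted(errors),
    p95AbsErrorDeg: nearestRank(errors, 0.95),
    maxAbsErrorDeg: errors[errors.length - 1],
    wrongDirectionRate: wrong / confident.length,
    confidentPages: confident.length,
    totalPages: samples.length,
  };
}
```

Abstentions are excluded from the error columns and show up only in the confident-page count, which is why a detector that abstains often can still post small errors.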

The takeaway is the spread between the columns, not any single row. Projection-profile variance has a lower median and 95th-percentile error than the consensus on this corpus, but its maximum error reaches 5.58 degrees on the inverted-scan and ruled-table fixtures. Hough has a 7.455-degree median error because page rules dominate the edge vote. Fourier rotates pages the wrong way 1.71 percent of the time. Connected components finds enough glyphs to commit to an angle on only about two thirds of the corpus (424 of 642 pages). The consensus rule trades a small amount of median accuracy for a flat tail and a zero wrong-direction rate.
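The actual consensus rule lives in the plugin source and is not reproduced here. As a rough illustration of why combining detectors can flatten the tail, the sketch below takes the median of the detectors that committed to an angle, keeps only the estimates close to that median, and abstains when too few agree. The function name, the 0.5 degree tolerance, and the two-detector minimum are all assumptions chosen for the example.

```typescript
// Hypothetical consensus sketch, not DropFormat's shipped rule.
// Each entry is one detector's estimate in degrees, or null if it abstained.
function consensusAngle(
  estimates: Array<number | null>,
  toleranceDeg = 0.5, // assumed agreement window around the median
  minAgreeing = 2,    // assumed minimum number of agreeing detectors
): number | null {
  const committed = estimates
    .filter((e): e is number => e !== null)
    .sort((a, b) => a - b);
  if (committed.length === 0) {
    return null; // every detector abstained
  }
  const mid = Math.floor(committed.length / 2);
  const median =
    committed.length % 2 === 1
      ? committed[mid]
      : (committed[mid - 1] + committed[mid]) / 2;
  // Outliers such as a rule-dominated Hough vote fall outside the window.
  const agreeing = committed.filter((e) => Math.abs(e - median) <= toleranceDeg);
  if (agreeing.length < minAgreeing) {
    return null; // abstain rather than risk a large or wrong-way rotation
  }
  const sum = agreeing.reduce((total, e) => total + e, 0);
  return sum / agreeing.length;
}
```

With estimates of 2.0, 2.1, and 14.3 degrees plus one abstention, this sketch discards the 14.3 outlier and returns roughly 2.05. With only 1 and 9 degrees it abstains, which mirrors the trade described above: a few pages go uncorrected in exchange for avoiding large errors.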

The corpus, the detector code, and the consensus rule live in the open-source plugin that runs this site. The benchmark summary is in docs/decisions/pdf-deskew-benchmark.json, and the methodology is written up in "How we measured PDF deskew accuracy". Both can be regenerated locally with node scripts/pdf-deskew-accuracy-spike.js --profile decision.