SnapPDF
endpoint

Extract text

POST /api/v1/extract-text

Extracts the text embedded in a PDF. For scanned (image-only) PDFs, use /ocr instead.

credits: 1
returns: application/json

Parameters

Name            Type             Required  Description
file            multipart file   required  Source PDF.
preserveLayout  boolean (query)  optional  Preserve whitespace and reading order.
positions       boolean (query)  optional  Return per-word (x, y, width, height).

Examples

curl
curl -X POST "https://api.snappdf.au/api/v1/extract-text?preserveLayout=true" \
  -H "Authorization: Bearer $SNAPPDF_API_KEY" \
  -F "file=@doc.pdf"

JavaScript
const r = await snap.pdf.extractText({ file: bytes, preserveLayout: true });
console.log(r.text);

Python
r = snap.pdf.extract_text(file=bytes, preserve_layout=True)

PHP
$r = $snap->pdf->extractText(file: $bytes, preserveLayout: true);

Ruby
r = snap.pdf.extract_text(file: bytes, preserve_layout: true)

Go
r, _ := client.ExtractText(ctx, &snappdf.ExtractTextInput{File: bytes, PreserveLayout: true})
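
Python (standard library, no SDK)
If the SDK is not available, the curl request above can be reproduced with Python's standard library. The following is a minimal sketch, not an official client: the helper names build_url and build_request are hypothetical, the part content type application/pdf is an assumption, and the lowercase "true" encoding of the boolean flags is inferred from the curl example. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request
import uuid

BASE_URL = "https://api.snappdf.au/api/v1/extract-text"


def build_url(preserve_layout=False, positions=False):
    """Append the optional boolean flags as query parameters (hypothetical helper)."""
    params = {}
    if preserve_layout:
        params["preserveLayout"] = "true"  # encoding assumed from the curl example
    if positions:
        params["positions"] = "true"
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


def build_request(pdf_bytes, api_key, filename="doc.pdf", **flags):
    """Build, but do not send, a multipart/form-data POST with a single 'file' part."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",  # assumed part content type
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(
        build_url(**flags), data=body, headers=headers, method="POST"
    )
```

To send the request, pass the result to urllib.request.urlopen and parse the JSON body with json.load. The JavaScript example suggests the response carries a text field; any other response fields are not documented here.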