PDF Redaction

Anonymize a PDF via file upload

Anonymize a PDF file by detecting and redacting PII (personally identifiable information). Accepts the PDF as a multipart/form-data file upload and returns the anonymized PDF as a binary stream. This endpoint is useful for direct file uploads without base64 encoding. It supports multiple OCR languages, detection of rotated text, and customizable PII tag detection via query parameters. Only the first page of the PDF is processed.

POST /api/anonymize/file/pdf

Authorization

APIKeyHeader
X-API-Key: <token>

In: header

Request Body

multipart/form-data

pdf (file, required)

PDF file to anonymize.

Format: binary

force_ocr (boolean, optional)

Force OCR processing even if text can be extracted from the PDF.

Default: false

rotated_text (boolean, optional)

Enable detection and recognition of rotated text in the document.

Default: false

redact_text (boolean, optional)

Enable text redaction using NER (Named Entity Recognition). When enabled, detected PII entities are redacted (blacked out) in the output PDF.

Default: true

min_chunk_size (integer, optional)

Minimum chunk size for text processing. Controls how text is segmented for NER processing. Larger values may improve accuracy but increase processing time.

Default: 0

custom_tags (optional, nullable)

List of custom tags to detect and redact. These tags are added to the standard PII tags.

tags (optional, nullable)

List of predefined PII tags to detect and redact. If empty, all available tags are used. Available tags: DATE, PERSON_NAME, ORGANIZATION, LOCATION, EMAIL, PHONE, ID, ACCOUNT, ZIP_CODE, ADDRESS, IP, URL, SSN, DRIVER_LICENSE, PASSPORT, PASSWORD, AGE, CREDIT_CARD, MONEY_AMOUNT, SIGNATURE, QR_CODE, FACE. Can be a comma-separated string such as 'PERSON_NAME,EMAIL,PHONE' or a list of strings.

ocr_langs (optional, nullable)

List of OCR languages to use for text recognition. Available languages: eng (English), spa (Spanish), fra (French), deu (German), ita (Italian), por (Portuguese), rus (Russian). Defaults to English only. Specifying multiple languages can improve accuracy for multilingual documents. Can be a comma-separated string such as 'eng,spa' or a list of strings.

Default: ["eng"]
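
The tags and ocr_langs parameters accept either a list or a comma-separated string. The following Python sketch normalizes both forms into a comma-separated string and rejects values that are not in the lists above. The helper names (to_csv, build_options) are hypothetical, and serializing booleans as "true"/"false" is an assumption, since the reference does not state the wire format.

```python
from urllib.parse import urlencode

# Values copied from the parameter descriptions above.
AVAILABLE_TAGS = {
    "DATE", "PERSON_NAME", "ORGANIZATION", "LOCATION", "EMAIL", "PHONE",
    "ID", "ACCOUNT", "ZIP_CODE", "ADDRESS", "IP", "URL", "SSN",
    "DRIVER_LICENSE", "PASSPORT", "PASSWORD", "AGE", "CREDIT_CARD",
    "MONEY_AMOUNT", "SIGNATURE", "QR_CODE", "FACE",
}
AVAILABLE_OCR_LANGS = {"eng", "spa", "fra", "deu", "ita", "por", "rus"}


def to_csv(values, allowed, name):
    """Accept a list or a comma-separated string; return a validated CSV string."""
    if isinstance(values, str):
        values = values.split(",")
    items = [v.strip() for v in values if v.strip()]
    unknown = [v for v in items if v not in allowed]
    if unknown:
        raise ValueError(f"unknown {name}: {unknown}")
    return ",".join(items)


def build_options(tags=None, ocr_langs=None, force_ocr=False,
                  rotated_text=False, redact_text=True, min_chunk_size=0):
    """Hypothetical helper: collect the documented options into a flat dict."""
    options = {
        # Assumption: booleans are sent as lowercase "true"/"false".
        "force_ocr": str(force_ocr).lower(),
        "rotated_text": str(rotated_text).lower(),
        "redact_text": str(redact_text).lower(),
        "min_chunk_size": str(min_chunk_size),
    }
    if tags:  # omitted when empty, so the service uses all available tags
        options["tags"] = to_csv(tags, AVAILABLE_TAGS, "tags")
    if ocr_langs:  # omitted when empty, so the documented default applies
        options["ocr_langs"] = to_csv(ocr_langs, AVAILABLE_OCR_LANGS, "ocr_langs")
    return options


if __name__ == "__main__":
    opts = build_options(tags=["PERSON_NAME", "EMAIL"], ocr_langs="eng, deu")
    # The resulting dict can be encoded as a query string if the options
    # are sent as query parameters.
    print(urlencode(opts))
```

Validating on the client side is optional; it only catches typos in tag or language names before the upload is sent.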

Response Body

application/pdf (success)

application/json (error responses)

curl -X POST "https://api.pdf-redaction.com/api/anonymize/file/pdf" \
  -H "X-API-Key: <token>" \
  -F "pdf=@document.pdf" \
  --output anonymized.pdf
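
For clients that do not use curl, the same upload can be sketched in Python with only the standard library. The function name build_upload_request is hypothetical. The form field name pdf, the X-API-Key header, and the URL come from this reference; the filename and the application/pdf part type are assumptions.

```python
import uuid
from urllib.request import Request

API_URL = "https://api.pdf-redaction.com/api/anonymize/file/pdf"


def build_upload_request(pdf_bytes, api_key, filename="document.pdf"):
    """Build (but do not send) a multipart/form-data POST carrying one PDF."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # "pdf" is the required form field named in the request body section.
        f'Content-Disposition: form-data; name="pdf"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),  # closing boundary
    ])
    headers = {
        "X-API-Key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(API_URL, data=body, headers=headers, method="POST")


# Usage (performs a real network call, so it is left commented out):
# from urllib.request import urlopen
# with open("document.pdf", "rb") as f:
#     req = build_upload_request(f.read(), api_key="<token>")
# with urlopen(req) as resp, open("anonymized.pdf", "wb") as out:
#     out.write(resp.read())
```

Building the request separately from sending it makes it easy to inspect the headers and body before any data leaves the machine.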
"binary pdf data"
{
  "detail": "Invalid document type"
}
Empty
{
  "error_code": "LLM_CALL_ERROR",
  "message": "string"
}
{
  "detail": [
    {
      "loc": [
        "string"
      ],
      "msg": "string",
      "type": "string"
    }
  ]
}
Empty
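
Because a successful call returns PDF bytes while failures return one of several JSON shapes (or an empty body), a client may want to branch on the content type. The following is a minimal Python sketch under that assumption; the function names are hypothetical, and the mapping of each shape to a specific HTTP status code is not stated in this reference, so the sketch does not rely on status codes.

```python
import json


def error_message(payload):
    """Turn any of the JSON error shapes shown above into one readable string."""
    if "message" in payload:
        # Shape: {"error_code": "...", "message": "..."}
        return f"{payload.get('error_code', 'UNKNOWN')}: {payload['message']}"
    detail = payload.get("detail")
    if isinstance(detail, str):
        # Shape: {"detail": "Invalid document type"}
        return detail
    if isinstance(detail, list):
        # Shape: {"detail": [{"loc": [...], "msg": "...", "type": "..."}]}
        return "; ".join(
            f"{'.'.join(map(str, item.get('loc', [])))}: {item.get('msg', '')}"
            for item in detail
        )
    return "unrecognized error payload"


def handle_response(content_type, body):
    """Return the PDF bytes on success; raise RuntimeError otherwise."""
    if content_type.split(";")[0].strip().lower() == "application/pdf":
        return body
    if not body:
        # Some responses are documented as having an empty body.
        raise RuntimeError("empty error response")
    raise RuntimeError(error_message(json.loads(body)))
```

The caller would pass the Content-Type response header and the raw response body, then write the returned bytes to a file.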