Specify Document Pages

By default, Rhubarb attempts to process every page of a PDF or TIFF file, up to a maximum of 20 pages. To process only specific pages, pass the pages parameter to DocAnalysis.

Note

Page numbers passed in pages start at 1, and the highest accepted value is 20.
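The range rule in this note can be checked before a document is submitted. The sketch below is a minimal, standalone illustration: validate_pages and MAX_PAGE are hypothetical names introduced here and are not part of Rhubarb, and Rhubarb may perform its own validation.

```python
MAX_PAGE = 20  # upper bound taken from the note above


def validate_pages(pages):
    """Hypothetical helper (not part of Rhubarb).

    Returns a sorted, de-duplicated list of page numbers,
    or raises ValueError if any entry is outside 1..MAX_PAGE.
    """
    if not pages:
        raise ValueError("pages must contain at least one page number")
    for page in pages:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(page, bool) or not isinstance(page, int):
            raise ValueError(f"page numbers must be integers, got {page!r}")
        if not 1 <= page <= MAX_PAGE:
            raise ValueError(f"page {page} is outside the range 1..{MAX_PAGE}")
    return sorted(set(pages))
```

The returned list could then be passed as the pages argument, which surfaces an invalid page number before any request is made.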

Specify a single page

To process a single page, pass a list containing that page number. For example, to process only the third page of a document:

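  import boto3
  from rhubarb import DocAnalysis

  # Assumes AWS credentials are available from the default configuration
  session = boto3.Session()

  # pages=[3] limits processing to the third page
  da = DocAnalysis(file_path="./test_docs/employee_enrollment.pdf",
                   boto3_session=session,
                   pages=[3])
  resp = da.run(message="For beneficiary type 'Secondary', what is the full name?")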

Specify multiple pages

To process several pages, pass a list of page numbers. For example, to process the 3rd, 5th, and 6th pages of a document:

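  import boto3
  from rhubarb import DocAnalysis

  # Assumes AWS credentials are available from the default configuration
  session = boto3.Session()

  # pages=[3, 5, 6] limits processing to the 3rd, 5th, and 6th pages
  da = DocAnalysis(file_path="./test_docs/employee_enrollment.pdf",
                   boto3_session=session,
                   pages=[3, 5, 6])
  resp = da.run(message="For beneficiary type 'Secondary', what is the full name?")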
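When the pages of interest are contiguous, the list can be built rather than typed out. The following is a small sketch in plain Python; page_range is a hypothetical helper introduced for illustration and is not provided by Rhubarb.

```python
def page_range(first, last):
    """Hypothetical helper (not part of Rhubarb).

    Returns the inclusive list of page numbers from first to last.
    Page numbers are 1-based, matching the pages parameter.
    """
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # range() excludes its stop value, so add 1 to include the last page
    return list(range(first, last + 1))


pages = page_range(3, 6)  # [3, 4, 5, 6]
```

The resulting list has the same form as the literals used above, so it could be supplied as pages=pages when constructing DocAnalysis.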