Quickstart
This guide details the steps needed to install or update Rhubarb.
Installation
You will need Python 3.9 or above to use Rhubarb.
Install or update Python
Before installing Rhubarb, install Python 3.9 or later. For information about how to get the latest version of Python, see the official Python documentation.
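As a quick sanity check of the version requirement above, the following minimal sketch compares the running interpreter against Python 3.9. The helper name python_is_supported is made up for this illustration and is not part of Rhubarb.

```python
import sys

# Minimum version stated in this guide.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) of version_info meets the minimum."""
    # python_is_supported is a hypothetical helper, not a Rhubarb API.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old for Rhubarb"
    print(f"Python {current}: {status}")
```

Running the file with the interpreter you plan to use for Rhubarb reports whether that interpreter meets the requirement.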
Install Rhubarb
The official Rhubarb package is available on PyPI and can be installed with pip:
pip install pyrhubarb
By default this will install the latest version of Rhubarb.
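To confirm which version pip actually installed, one option is to query the package metadata from Python. This is a small sketch using only the standard library; the helper name installed_version is an assumption made for illustration, and the distribution name pyrhubarb is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="pyrhubarb"):
    """Return the installed version string of a distribution, or None if absent."""
    # installed_version is a hypothetical helper, not a Rhubarb API.
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"pyrhubarb version: {found}" if found else "pyrhubarb is not installed")
```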
From Source
To install the package from source, clone the repository with the following command:
git clone git@github.com:awslabs/rhubarb.git
Navigate into the project directory on the terminal and perform the following steps.
1. It is recommended that you use a Python virtual environment such as venv. Create one using python -m venv rhubarb.
2. Activate the virtual environment using source rhubarb/bin/activate.
3. Rhubarb is built and packaged using Poetry. You will need to install Poetry.
4. Run poetry install from the root of the project, i.e. where the pyproject.toml file resides. This will install all the dependencies, including Rhubarb.
5. Install pre-commit hooks using pre-commit install. This will set up ruff linter and formatter checks that run before your changes are committed to the repo.
6. Install dependencies using poetry install.
7. Build the project using poetry build.
8. Poetry will create a whl file within a dist/ directory, which can be installed with pip:
pip install pyrhubarb-0.0.1-py3-none-any.whl
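The wheel file name changes with the package version, so it can help to look up what the build actually produced rather than typing the name by hand. The sketch below lists wheel files in a dist/ directory; the helper name find_wheels is made up for this example, and it assumes you run it from the project root.

```python
from pathlib import Path


def find_wheels(dist_dir="dist"):
    """Return wheel files in dist_dir, sorted by name; empty list if none exist."""
    # find_wheels is a hypothetical helper, not part of Poetry or Rhubarb.
    directory = Path(dist_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.whl"))


if __name__ == "__main__":
    for wheel in find_wheels():
        # Print the pip command for each wheel that the build produced.
        print(f"pip install {wheel}")
```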
Configuration
Note
You must have access to the Anthropic Claude V3 and Amazon Titan Multi-modal Embedding models enabled in Amazon Bedrock. To enable models, see the Amazon Bedrock documentation.
Rhubarb connects to AWS through a Boto3 session. Before using it, you need to set up authentication credentials for your AWS account using either the IAM Console or the AWS CLI. You can either choose an existing user or create a new one.
For instructions about how to create a user using the IAM Console, see Creating IAM users. Once the user has been created, see Managing access keys to learn how to create and retrieve the keys used to authenticate the user.
If you have the AWS CLI installed, then you can use the aws configure command to configure your credentials file:
aws configure
Alternatively, you can create the credentials file yourself. By default, its location is ~/.aws/credentials. At a minimum, the credentials file should specify the access key and secret access key. In this example, the key and secret key for the account are specified in the default profile:
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
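If you write the credentials file by hand, a quick check that both required keys are present can catch typos early. The following sketch parses credentials text in the format shown above with the standard library's configparser. The helper name missing_credential_keys is an assumption for illustration, and the check is only a local sanity test: it does not contact AWS or validate the keys.

```python
import configparser

# Keys the credentials file should contain at a minimum, per this guide.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(credentials_text, profile="default"):
    """Return the required keys that are absent or empty in the given profile."""
    # missing_credential_keys is a hypothetical helper, not a Boto3 or Rhubarb API.
    # Interpolation is disabled so that '%' in a value is never treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

To check a real file, read it first, for example with pathlib.Path.home() / ".aws" / "credentials" and its read_text() method, and pass the text to the function.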
You may also want to add a default region to the AWS configuration file, which is located by default at ~/.aws/config:
[default]
region=us-east-1
Alternatively, you can pass a region_name when creating clients and resources.
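A region can also be set on the session itself, since boto3.Session accepts region_name and profile_name. The sketch below passes only the values you supply, so that anything omitted falls back to Boto3's own configuration lookup. The helper names session_kwargs and make_session are made up for this example.

```python
def session_kwargs(region_name=None, profile_name=None):
    """Build keyword arguments for boto3.Session, omitting values left as None."""
    # session_kwargs is a hypothetical helper, not a Boto3 or Rhubarb API.
    candidates = {"region_name": region_name, "profile_name": profile_name}
    return {name: value for name, value in candidates.items() if value is not None}


def make_session(region_name=None, profile_name=None):
    """Create a Boto3 session; requires boto3 to be installed."""
    import boto3  # imported lazily so session_kwargs works without boto3

    return boto3.Session(**session_kwargs(region_name, profile_name))
```

For example, make_session(region_name="us-east-1") returns a session pinned to that region, which can then be passed to Rhubarb as boto3_session.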
You have now configured credentials for the default profile as well as a default region to use when creating connections. See Boto3 configuration for in-depth configuration sources and options.
Using Rhubarb
Using with a local file
from rhubarb import DocAnalysis
import boto3
session = boto3.Session()
da = DocAnalysis(file_path="./path/to/doc/doc.pdf", boto3_session=session)
response = da.run(message="What is the employee's name?")
With file in Amazon S3
from rhubarb import DocAnalysis
import boto3
session = boto3.Session()
da = DocAnalysis(file_path="s3://path/to/doc/doc.pdf", boto3_session=session)
response = da.run(message="What is the employee's name?")
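The two examples differ only in the file_path value: a local path in one, an s3:// URI in the other. If your own code needs to tell the two forms apart before calling Rhubarb, a sketch along these lines may help. The helper name split_s3_uri is hypothetical, and this is not a description of how Rhubarb handles paths internally.

```python
from urllib.parse import urlparse


def split_s3_uri(file_path):
    """Return (bucket, key) for an s3:// URI, or None for anything else."""
    # split_s3_uri is a hypothetical helper, not a Rhubarb API.
    parsed = urlparse(file_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        return None
    # The bucket is the URI authority; the key is the path without the leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    for candidate in ("./path/to/doc/doc.pdf", "s3://path/to/doc/doc.pdf"):
        location = split_s3_uri(candidate)
        kind = "Amazon S3 object" if location else "local file"
        print(f"{candidate}: {kind}")
```

Object keys that contain characters such as ? or # would need extra handling, because urlparse treats them as query and fragment separators.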