gpf_entrepot_toolbelt.orchestrator.storage.s3_client module

S3 client.

class gpf_entrepot_toolbelt.orchestrator.storage.s3_client.GpfS3Client(endpoint_url: str | None = None, region_name: str = 'sbg', access_key_id: str | None = None, access_key_secret: str | None = None, service_name='s3', header_user_agent: str = 'IGNGéoplateformePythonToolbelt/1.14.0', verbose: bool = False, skip_checks: bool = True, bucket_name: str | None = None, object_threads: int = 20, output_dir: Path | None = None, prefix: str | None = None)

Bases: object

Client for interacting with object storage through the S3 protocol, in the IGN Geoplateforme context.

Raises:

ClientException – when something goes wrong on the client side of the network

__init__(endpoint_url: str | None = None, region_name: str = 'sbg', access_key_id: str | None = None, access_key_secret: str | None = None, service_name='s3', header_user_agent: str = 'IGNGéoplateformePythonToolbelt/1.14.0', verbose: bool = False, skip_checks: bool = True, bucket_name: str | None = None, object_threads: int = 20, output_dir: Path | None = None, prefix: str | None = None)

Client initialization.

Parameters:
  • endpoint_url (str, optional) – URL of the S3 API endpoint. Defaults to None. Overridden by “GPF_S3_URL” or “S3_ENDPOINT” environment variables.

  • region_name (str, optional) – name of the cloud region. Defaults to 'sbg'. Overridden by “GPF_S3_REGION” environment variable.

  • access_key_id (str, optional) – key id used to authenticate. Defaults to None. Overridden by “AWS_ACCESS_KEY_ID” or “GPF_S3_KEY” environment variables.

  • access_key_secret (str, optional) – secret key used to authenticate. Defaults to None. Overridden by “GPF_S3_SECRETKEY” or “AWS_SECRET_ACCESS_KEY” environment variables.

  • header_user_agent (str, optional) – custom user-agent for HTTP requests. Defaults to f"{__title_clean__}/{__version__}".

  • verbose (bool, optional) – if enabled, operations details are logged. Defaults to False.

  • skip_checks (bool, optional) – if enabled, the connection check and the parameter validation that normally occur when serializing requests are skipped, which can improve performance. Defaults to True.

  • bucket_name (str, optional) – name of container. Mandatory for download method. Defaults to None.

  • object_threads (int, optional) – number of threads to use to download objects. Defaults to 20. Overridden by “GPF_S3_THREADS_OBJECTS” environment variable.

  • output_dir (Path, optional) – local folder where to download objects. Defaults to None.

  • prefix (str, optional) – only download objects whose name starts with this prefix. Defaults to None.
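The environment-variable override described for several parameters above can be sketched as follows. This is a minimal illustration rather than the library's own code: the helper `resolve_setting` is hypothetical, the region value 'gra' is only an example, and the order in which several candidate variables are checked is an assumption.

```python
import os


def resolve_setting(explicit, env_names, default=None):
    """Return the first non-empty environment variable among env_names,
    else the explicit argument, else the default.

    Hypothetical helper: precedence among several variables is assumed
    to follow the order of env_names.
    """
    for name in env_names:
        value = os.environ.get(name)
        if value:
            return value
    return explicit if explicit is not None else default


# Without the variable, the documented default applies
os.environ.pop("GPF_S3_REGION", None)
region_default = resolve_setting(None, ["GPF_S3_REGION"], default="sbg")

# With the variable set, it overrides the value passed by the caller
os.environ["GPF_S3_REGION"] = "gra"
region_from_env = resolve_setting("sbg", ["GPF_S3_REGION"], default="sbg")
```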

check_bucket_exist(bucket_name: str | None = None) bool

Check whether a bucket exists.

Parameters:

bucket_name (str, optional) – bucket name. If not specified, the one given at object instantiation is used. Defaults to None.

Raises:

error – if the request fails with an error other than 404

Returns:

True if the bucket exists, False otherwise.

Return type:

bool
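The 404 handling described above (False when the bucket is missing, any other error re-raised) might look like the sketch below. It does not call a real S3 service: `FakeClientError`, `fake_head_bucket` and `bucket_exists` are hypothetical stand-ins used only to show the control flow.

```python
class FakeClientError(Exception):
    """Stand-in for an S3 client error carrying an HTTP status code (hypothetical)."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def bucket_exists(head_bucket, bucket_name):
    """Return True if head_bucket succeeds, False on 404, re-raise otherwise."""
    try:
        head_bucket(bucket_name)
    except FakeClientError as error:
        if error.status_code == 404:
            return False
        raise
    return True


def fake_head_bucket(name):
    # Pretend 'known-bucket' exists and 'private-bucket' is forbidden
    if name == "known-bucket":
        return None
    if name == "private-bucket":
        raise FakeClientError(403)
    raise FakeClientError(404)
```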

check_connection() bool

Check whether the authentication settings allow a connection to be authenticated.

Returns:

True if everything is OK.

Return type:

bool

check_object_is_file(s3_object) bool

Determine whether the S3 object is a file or not (a folder).

Parameters:

s3_object (S3Object) – S3 object to check.

Returns:

True if the object is a file, False otherwise.

Return type:

bool
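One common way to tell files from folders in object storage is the key's trailing slash. The sketch below assumes that convention; the documentation above does not state which criterion the method actually uses, and `key_is_file` is a hypothetical helper working on a plain key string.

```python
def key_is_file(key: str) -> bool:
    """Assumed convention: keys ending with '/' are folder placeholders."""
    return not key.endswith("/")


examples = {key: key_is_file(key) for key in ("data/tile.tif", "data/")}
```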

property client_config: Config

Create client configuration from object parameters.

Returns:

configuration object

Return type:

Config

delete_multiple_objects(bucket_name: str, prefix: str) tuple[list[str], list[str]]

Delete multiple objects.

Parameters:
  • bucket_name (str) – name of the S3 bucket

  • prefix (str) – only delete objects whose name starts with this prefix.

Returns:

list of successfully deleted objects, and list of objects whose deletion failed.

Return type:

tuple[list[str], list[str]]

delete_single_object(s3_object: Object, bucket_name: str | None = None, s3_client: S3Client | None = None) tuple[str, Path]

Delete an S3 object.

Parameters:
  • s3_object (Object) – object to be deleted.

  • bucket_name (str | None, optional) – bucket name. If not specified, the one given at object instantiation is used. Defaults to None.

  • s3_client (S3Client | None, optional) – client to use. If not set, the object’s client will be used. Defaults to None.

Returns:

a status message and the deleted object’s filepath

Return type:

tuple[str, Path]

download_objects(output_folder: Path | None = None, prefix: str | None = None) tuple[list[str], list[str]]

Download stored objects.

Parameters:
  • output_folder (Path, optional) – folder where to download. If not set, the folder defined at object instantiation is used. Defaults to None.

  • prefix (str, optional) – only download objects with this prefix. Defaults to None.

Raises:

ClientException – if something goes wrong

Returns:

tuple of the list of successfully downloaded objects and the list of those whose download failed.

Return type:

tuple[list[str], list[str]]
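The pair of lists returned here, together with the object_threads setting, suggests a thread pool whose results are split into successes and failures. The sketch below shows that pattern with the standard library only; `run_in_threads` and `fake_download` are hypothetical names and this is not the library's implementation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_threads(task, items, max_workers=20):
    """Run task(item) for each item in a thread pool.

    Returns (succeeded, failed): two lists of items, split according to
    whether task raised an exception.
    """
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(task, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            if future.exception() is None:
                succeeded.append(item)
            else:
                failed.append(item)
    return succeeded, failed


def fake_download(key):
    # Hypothetical task: fail for keys flagged as broken
    if key.startswith("broken"):
        raise OSError(f"cannot download {key}")
    return key


ok, ko = run_in_threads(fake_download, ["a.txt", "broken.bin", "b.txt"])
```

Because results are collected as tasks complete, the order of the success list is not guaranteed to match the input order.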

download_single_object(s3_object: Object, s3_client: S3Client | None = None, output_folder: Path | None = None) tuple[str, Path]

Download single object.

Parameters:
  • s3_object (Object) – object from S3 to be downloaded

  • s3_client (S3Client, optional) – client to use. If not set, the object’s client will be used. Defaults to None.

  • output_folder (Path, optional) – folder where to download. If not set, the folder defined at object instantiation is used. Defaults to None.

Returns:

tuple containing the operation message and the local filepath

Return type:

tuple[str, Path]

get_object(key: str) Object

Get an object stored in a bucket.

Parameters:

key (str) – search key for the object in the bucket

Returns:

An ObjectSummary resource

Return type:

Object

get_prefix_size(bucket_name: str | None = None, prefix: str | None = None) int | tuple[int, int]
get_prefix_size_details(bucket_name: str | None = None, prefix: str | None = None, number_file_needed: bool | None = True) int | tuple[int, int]

Get the size of an S3 "directory" in a bucket.

Parameters:
  • bucket_name (str, optional) – bucket name. If not specified, the one given at object instantiation is used. Defaults to None.

  • prefix (str, optional) – only count objects with this prefix. Defaults to None.

  • number_file_needed (bool, optional) – if enabled, the number of files is also returned and the return type changes from int to tuple[int, int]. Defaults to True.

Returns:

int: size of the S3 "directory", in bytes. tuple[int, int]: size of the S3 "directory", in bytes, and number of files in the S3 "directory".

Return type:

int | tuple[int, int]
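The two return shapes can be illustrated with a small in-memory sketch. `FakeObject` and `prefix_size` are hypothetical, and the sizes are made-up values; a real call would iterate over objects listed from the bucket.

```python
from typing import NamedTuple


class FakeObject(NamedTuple):
    """Minimal stand-in for a listed S3 object (hypothetical)."""

    key: str
    size: int


def prefix_size(objects, prefix="", with_count=False):
    """Sum the sizes of objects under prefix, optionally with their count."""
    matching = [obj for obj in objects if obj.key.startswith(prefix)]
    total = sum(obj.size for obj in matching)
    return (total, len(matching)) if with_count else total


objects = [
    FakeObject("data/a.tif", 100),
    FakeObject("data/b.tif", 250),
    FakeObject("logs/run.log", 40),
]
size_only = prefix_size(objects, "data/")
size_and_count = prefix_size(objects, "data/", with_count=True)
```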

list_objects(bucket_name: str | None = None, prefix: str | None = None, limit: int | None = None) Iterable[Object]

List objects stored in a bucket.

Parameters:
  • bucket_name (str, optional) – bucket name. If not specified, the one given at object instantiation is used. Defaults to None.

  • prefix (str, optional) – only list objects with this prefix. Defaults to None.

  • limit (int, optional) – maximum number of objects returned. Defaults to None.

Returns:

An iterable of ObjectSummary resources

Return type:

Iterable
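The combined effect of prefix and limit can be sketched on plain key strings. `filter_keys` is a hypothetical helper, not the method itself; it only shows that the filter applies first and the limit then truncates a lazy iterable.

```python
from itertools import islice


def filter_keys(keys, prefix=None, limit=None):
    """Lazily yield keys matching prefix, stopping after limit items."""
    selected = (key for key in keys if prefix is None or key.startswith(prefix))
    # islice with a stop of None yields every remaining item
    return islice(selected, limit)


keys = ["data/a.tif", "data/b.tif", "logs/run.log"]
first_data_key = list(filter_keys(keys, prefix="data/", limit=1))
```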

local_file_path_from_object(s3_object: Object, output_folder: Path | None = None) Path

Return the full path of the file to store the object.

Concatenates the local directory with the object name.

Parameters:
  • s3_object (Object) – input S3 object

  • output_folder (Path, optional) – folder where to download. Defaults to None.

Returns:

local path for the object

Return type:

Path
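The concatenation described above can be sketched with pathlib. `local_path_for_key` is a hypothetical helper that takes the object key as a string; splitting the key on '/' is an assumption made so that the resulting path is built portably.

```python
from pathlib import Path, PurePosixPath


def local_path_for_key(key: str, output_folder: Path) -> Path:
    """Join output_folder with the components of an S3 key."""
    # S3 keys use '/' as separator whatever the local platform
    return output_folder.joinpath(*PurePosixPath(key).parts)


target = local_path_for_key("data/a.tif", Path("downloads"))
```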

property s3_client: S3Client

Create an S3 client, the low-level interface.

For more on how it differs from Resource, see:

https://www.learnaws.org/2021/02/24/boto3-resource-client/.

Returns:

client object

Return type:

S3Client

property s3_resource: S3ServiceResource

Create an S3 Resource, the high-level interface built on top of boto3.client.

For more on the differences, see:

https://www.learnaws.org/2021/02/24/boto3-resource-client/.

Returns:

S3 Service Resource object

Return type:

S3ServiceResource

upload_files(local_input_paths: Iterable[Path] | Path, root_dir: Path | None = None) tuple[list, list]

Upload a list of files or a folder from local source to S3.

Parameters:
  • local_input_paths (Iterable[Path]|Path) – list (or tuple) of file paths, or path to a folder, to upload.

  • root_dir (Path, optional) – root directory for upload

Raises:

ValueError – if the bucket does not already exist

Returns:

tuple of the list of successful uploads and the list of failed uploads

Return type:

tuple[list, list]
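Since local_input_paths accepts either a collection of files or a single folder, the input has to be normalized to a list of files at some point. The sketch below shows one way to do that; `expand_inputs` is hypothetical, and the recursive walk into subfolders is an assumption that the documentation above does not confirm.

```python
from pathlib import Path


def expand_inputs(local_input_paths):
    """Normalize a folder, a single file or an iterable of paths to a list of files."""
    if isinstance(local_input_paths, Path):
        if local_input_paths.is_dir():
            # Assumed behavior: walk the folder recursively, files only
            return sorted(p for p in local_input_paths.rglob("*") if p.is_file())
        return [local_input_paths]
    return [Path(p) for p in local_input_paths]
```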

upload_single_file(local_file_path: Path, s3_client: S3Client | None = None, key_name: str | None = None, root_dir: Path | None = None) tuple[str, Path]

Upload a file from local source to S3.

Parameters:
  • local_file_path (Path) – path to the file to upload

  • s3_client (S3Client, optional) – client to use. If not set, the object’s client will be used. Defaults to None.

  • key_name (str, optional) – name of the destination key. If key_name is None, the key is derived from the file name, or from the file path relative to the root directory. Defaults to None.

  • root_dir (Path, optional) – root directory for upload
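The key_name fallback described above can be sketched as follows. `derive_key_name` is a hypothetical helper, and the exact rule applied by the library (for instance any configured prefix being prepended) may differ.

```python
from pathlib import Path


def derive_key_name(local_file_path: Path, key_name=None, root_dir=None) -> str:
    """Pick the destination key: explicit name, path relative to root_dir, or file name."""
    if key_name is not None:
        return key_name
    if root_dir is not None:
        # as_posix() keeps '/' separators, as expected in S3 keys
        return local_file_path.relative_to(root_dir).as_posix()
    return local_file_path.name


source = Path("in/sub/a.tif")
key_from_name = derive_key_name(source)
key_from_root = derive_key_name(source, root_dir=Path("in"))
```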