gpf_entrepot_toolbelt.utils.get_file_informations module

Helpers to get information about a file: size, MIME type, hash, etc.

Author: Loïc Bartoletti (https://github.com/lbartoletti)

gpf_entrepot_toolbelt.utils.get_file_informations.convert_octets(octets: int) str

Convert a number of octets (bytes) into a human-readable size.

Parameters:

octets – number of octets to convert

Returns:

size in a human-readable format (ko, Mo, etc.)

Return type:

str

Example:

>>> convert_octets(1024)
1 ko
>>> from pathlib import Path
>>> convert_octets(Path("my_file.txt").stat().st_size)
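
The conversion described above can be illustrated with a minimal sketch. The function human_size and the UNITS tuple are hypothetical names introduced here; this is not the library's implementation, and the real convert_octets may round or format its output differently.

```python
# Hypothetical unit labels, following the "ko, Mo, etc." convention above.
UNITS = ("octets", "ko", "Mo", "Go", "To", "Po")


def human_size(octets: int) -> str:
    """Format a byte count using 1024-based units (illustrative only)."""
    if octets <= 0:
        return "0 octets"
    # bit_length gives an exact power-of-1024 index without floating-point logs.
    index = min((octets.bit_length() - 1) // 10, len(UNITS) - 1)
    value = round(octets / 1024 ** index, 2)
    # ":g" drops a trailing ".0", so 1.0 is printed as "1".
    return f"{value:g} {UNITS[index]}"


print(human_size(1024))     # 1 ko
print(human_size(1536))     # 1.5 ko
print(human_size(1048576))  # 1 Mo
```

Using an integer computation for the unit index avoids rounding surprises at exact powers of 1024.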
gpf_entrepot_toolbelt.utils.get_file_informations.filter_files_by_extensions(input_dir: Path, extensions_list: str) list[Path]

Return the list of files that have one of the specified extensions.

Parameters:
  • input_dir (Path) – Directory containing the files

  • extensions_list (str) – Extensions to filter on, given as a single string (for example "tif,tiff,jp2").

Returns:

Paths of the matching files

Example

from pathlib import Path

upload_dir = Path("tests/livraisons/fixtures/good/generic")

# Get raster filepaths in upload directory
raster_files = filter_files_by_extensions(
    upload_dir, extensions_list="tif,tiff,jp2"
)
print(raster_files)

Return type:

list[Path]
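
As a rough illustration of this kind of filtering, the sketch below uses pathlib only. The name filter_by_extensions is hypothetical, and the choices made here (comma-separated input, case-insensitive match, no recursion into subdirectories) are assumptions, not confirmed behaviour of filter_files_by_extensions.

```python
from pathlib import Path


def filter_by_extensions(input_dir: Path, extensions: str) -> list[Path]:
    """Return files in input_dir whose suffix is in a comma-separated list."""
    # Normalise "tif", ".tif" and " TIF " to the same ".tif" form.
    wanted = {
        "." + ext.strip().lstrip(".").lower()
        for ext in extensions.split(",")
        if ext.strip()
    }
    # Assumption: only direct children are inspected, and directories are skipped.
    return sorted(
        path
        for path in input_dir.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
```

Sorting the result keeps the output stable across file systems, which iterdir alone does not guarantee.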

gpf_entrepot_toolbelt.utils.get_file_informations.generate_md5_sum(filename: str, chunksize: int = 8192) str

Generate an MD5 hash of the file filename.

Example

For a file containing the text "md5", created with echo "md5" > /tmp/md5.txt:

>>> generate_md5_sum("/tmp/md5.txt")
'772ac1a55fab1122f3b369ee9cd31549'

Preconditions:

filename is a valid file

Postconditions:

Returns an MD5 hash (32-character hexadecimal string)
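
The chunksize parameter suggests that the file is read in blocks rather than loaded whole. A minimal sketch of that technique with the standard hashlib module follows; md5_of_file is a hypothetical name, and this is not necessarily how generate_md5_sum is written.

```python
import hashlib


def md5_of_file(filename: str, chunksize: int = 8192) -> str:
    """Hash a file block by block so memory use stays bounded."""
    digest = hashlib.md5()
    with open(filename, "rb") as stream:
        while True:
            chunk = stream.read(chunksize)
            if not chunk:  # an empty read means end of file
                break
            digest.update(chunk)
    # hexdigest() returns the 32-character hexadecimal form.
    return digest.hexdigest()
```

The digest does not depend on chunksize: feeding the same bytes in smaller or larger blocks yields the same result.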

gpf_entrepot_toolbelt.utils.get_file_informations.get_dir_md5_hash(input_dir: str | Path, raise_error: bool = True) str | None

Get the md5 hash of the input path (must be a directory).

The code uses the md5 function from the standard library hashlib: https://docs.python.org/3/library/hashlib.html

Parameters:
  • input_dir (Union[str, Path]) – path to check

  • raise_error (bool, optional) – if True, raise an exception on error. Defaults to True.

Raises:

FileNotFoundError – if the path is not a directory and raise_error is set to True

Returns:

MD5 hash of input_dir, or None if an error occurs and raise_error is set to False.

Return type:

Union[str, None]

gpf_entrepot_toolbelt.utils.get_file_informations.get_md5_hash(input_file: str | Path, raise_error: bool = True) str | None

Get the md5 hash of the input path (must be a file).

The code uses the md5 function from the standard library hashlib: https://docs.python.org/3/library/hashlib.html

Parameters:
  • input_file (Union[str, Path]) – path to check

  • raise_error (bool, optional) – if True, raise an exception on error. Defaults to True.

Raises:

FileNotFoundError – if the path is not a file and raise_error is set to True

Returns:

MD5 hash of input_file, or None if an error occurs and raise_error is set to False.

Return type:

Union[str, None]
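
The raise_error contract documented here (raise FileNotFoundError, or return None when raise_error is False) can be sketched as below. The name file_md5_or_none is hypothetical and the body is only an illustration of the contract, not the library's code.

```python
import hashlib
from pathlib import Path
from typing import Optional, Union


def file_md5_or_none(
    input_file: Union[str, Path], raise_error: bool = True
) -> Optional[str]:
    """Return the MD5 of a file, or fail according to raise_error."""
    path = Path(input_file)
    if not path.is_file():
        if raise_error:
            raise FileNotFoundError(f"{path} is not a file")
        # Lenient mode: report the problem through the return value.
        return None
    # Assumption for brevity: the file is small enough to read at once.
    return hashlib.md5(path.read_bytes()).hexdigest()
```

With raise_error=False the caller has to check for None before using the result, for example `if (digest := file_md5_or_none(path, raise_error=False)) is None: ...`.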

gpf_entrepot_toolbelt.utils.get_file_informations.md5_update_from_dir(directory: str | Path, hash) Any

Update hash with directory content.

Parameters:
  • directory (Union[str, Path]) – path to check

  • hash – initial hash

Returns:

updated hash

gpf_entrepot_toolbelt.utils.get_file_informations.md5_update_from_file(filename: str | Path, hash_, chunksize: int = 8192) Any

Update hash with file content.

Parameters:
  • filename (Union[str, Path]) – path to check

  • hash_ – initial hash

  • chunksize (int) – size of the chunks read from the file

Returns:

updated hash
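
Both md5_update_from_dir and md5_update_from_file take an existing hash object and return it updated, which lets one hash accumulate the content of a whole tree. The sketch below shows one way this could work; update_from_file and update_from_dir are hypothetical names, and mixing entry names into the hash and visiting entries in sorted order are assumptions, not documented behaviour.

```python
import hashlib
from pathlib import Path
from typing import Union


def update_from_file(filename: Union[str, Path], hash_, chunksize: int = 8192):
    """Feed a file's bytes into an existing hash object, chunk by chunk."""
    with open(filename, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunksize), b""):
            hash_.update(chunk)
    return hash_


def update_from_dir(directory: Union[str, Path], hash_):
    """Feed a directory tree into an existing hash object, recursively."""
    # Sorted traversal makes the result independent of file system ordering.
    for entry in sorted(Path(directory).iterdir(), key=lambda p: p.name):
        # Assumption: entry names are hashed too, so renames change the digest.
        hash_.update(entry.name.encode("utf-8"))
        if entry.is_file():
            update_from_file(entry, hash_)
        elif entry.is_dir():
            update_from_dir(entry, hash_)
    return hash_
```

A directory digest would then be obtained with `update_from_dir(some_dir, hashlib.md5()).hexdigest()`, where some_dir is any existing directory.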