aws
AWS Cloud Utility API
This module provides utility functions to interact with AWS services.
Functions:
- download_s3_file – Download a file from S3
- download_s3_object – Download an object from S3
- download_s3_prefix – Download all objects under an S3 prefix into a local directory
- download_s3_objects – Download all objects in an S3 bucket with a given prefix (deprecated)
Functions
download_s3_file
download_s3_file(key: str, dst: Path, bucket: str, client: boto3.client = None, checksum: str = 'size', config: Config | None = Config(signature_version=UNSIGNED)) -> bool
Download a file from S3
Parameters:
- key (str) – Object key
- dst (Path) – Destination path
- bucket (str) – Bucket name
- client (client, default: None) – S3 client
- checksum (str, default: 'size') – Checksum type. Defaults to "size".
- config (Config, default: Config(signature_version=UNSIGNED)) – Boto3 config. Defaults to Config(signature_version=UNSIGNED).
Returns:
- bool – True if the file was downloaded, False if it already exists
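The boolean return value and the default checksum of "size" suggest a skip-if-unchanged check. As a minimal sketch, assuming that "size" means comparing the local file's byte count with the remote object's size (the source does not spell this out), the decision could look like the following. The helper name needs_download and its remote_size argument are hypothetical and are not part of helia_edge.

```python
import tempfile
from pathlib import Path


def needs_download(dst: Path, remote_size: int) -> bool:
    """Hypothetical helper: True when dst is missing or its size differs."""
    if not dst.is_file():
        return True
    return dst.stat().st_size != remote_size


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "00001.h5"
    missing = needs_download(local, remote_size=5)    # no local file yet
    local.write_bytes(b"hello")                       # 5 bytes on disk
    same_size = needs_download(local, remote_size=5)  # sizes agree
    changed = needs_download(local, remote_size=9)    # sizes differ
```

A size comparison is cheap but cannot detect a same-length file with different contents, which is presumably why a stronger checksum option is also offered.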
Source code in helia_edge/utils/aws.py
download_s3_object
download_s3_object(item: dict[str, str], dst: Path, bucket: str, client: boto3.client = None, checksum: str = 'size', config: Config | None = Config(signature_version=UNSIGNED)) -> bool
Download an object from S3
Parameters:
- item (dict[str, str]) – Object metadata
- dst (Path) – Destination path
- bucket (str) – Bucket name
- client (client, default: None) – S3 client
- checksum (str, default: 'size') – Checksum type. Defaults to "size".
- config (Config, default: Config(signature_version=UNSIGNED)) – Boto3 config. Defaults to Config(signature_version=UNSIGNED).
Returns:
- bool – True if the file was downloaded, False if it already exists
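The source does not list the keys expected in item. One plausible reading, stated here as an assumption, is that it resembles an entry of the "Contents" list returned by boto3's list_objects_v2, carrying at least "Key" and "ETag". For objects uploaded in a single part without KMS encryption, the S3 ETag is the hex MD5 of the content wrapped in double quotes, so an "md5" check could be sketched as below. The helpers local_md5 and matches_etag are hypothetical names and are not part of helia_edge.

```python
import hashlib
import tempfile
from pathlib import Path


def local_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path: Path, item: dict[str, str]) -> bool:
    """Hypothetical helper: compare a local file with item["ETag"]."""
    return path.is_file() and local_md5(path) == item["ETag"].strip('"')


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "00001.h5"
    local.write_bytes(b"hello")

    # Assumed metadata shape; the ETag is md5(b"hello") wrapped in quotes.
    good_etag = '"' + hashlib.md5(b"hello").hexdigest() + '"'
    item = {"Key": "datasets/ptbxl/00001.h5", "ETag": good_etag}
    up_to_date = matches_etag(local, item)

    # A different ETag means the local copy would be treated as stale.
    stale_item = {"Key": item["Key"], "ETag": '"' + "0" * 32 + '"'}
    stale = matches_etag(local, stale_item)
```

Multipart uploads produce ETags that are not plain MD5 digests, so a check of this kind only applies to single-part objects.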
Source code in helia_edge/utils/aws.py
download_s3_objects
download_s3_objects(bucket: str, prefix: str, dst: Path, checksum: str = 'size', progress: bool = True, num_workers: int | None = None, config: Config | None = Config(signature_version=UNSIGNED))
Download all objects in an S3 bucket with a given prefix.
Deprecated: use download_s3_prefix instead. This function preserves the
full S3 key (including the prefix) when building local paths, which
causes files to be nested one level too deep when dst already
contains the prefix directory. The replacement strips the prefix so
that dst is always the root of the downloaded tree.
Parameters:
- bucket (str) – Bucket name
- prefix (str) – Prefix to filter objects
- dst (Path) – Destination directory
- checksum (str, default: 'size') – Checksum type. Defaults to "size".
- progress (bool, default: True) – Show progress bar. Defaults to True.
- num_workers (int | None, default: None) – Number of workers. Defaults to None.
- config (Config | None, default: Config(signature_version=UNSIGNED)) – Boto3 config. Defaults to Config(signature_version=UNSIGNED).
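The nesting problem described in the deprecation note comes down to how each object key is mapped to a local path. The sketch below contrasts the two mappings as the note describes them; it is an illustration rather than the library's actual code, and legacy_local_path and stripped_local_path are hypothetical names.

```python
from pathlib import Path


def legacy_local_path(dst: Path, key: str) -> Path:
    """Mapping described for download_s3_objects: keep the full key."""
    return dst.joinpath(*key.split("/"))


def stripped_local_path(dst: Path, key: str, prefix: str) -> Path:
    """Mapping described for download_s3_prefix: drop the prefix first."""
    # Normalize the prefix so it always ends with a single trailing slash.
    normalized = prefix if prefix.endswith("/") else prefix + "/"
    relative = key.removeprefix(normalized)
    return dst.joinpath(*relative.split("/"))


dst = Path("data/ptbxl")
key = "datasets/ptbxl/00001.h5"

legacy = legacy_local_path(dst, key)
stripped = stripped_local_path(dst, key, "datasets/ptbxl")
```

With these inputs, legacy is data/ptbxl/datasets/ptbxl/00001.h5 while stripped is data/ptbxl/00001.h5, which is why the replacement lets dst act as the root of the downloaded tree.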
Source code in helia_edge/utils/aws.py
download_s3_prefix
download_s3_prefix(bucket: str, prefix: str, dst: Path, checksum: str = 'size', progress: bool = True, num_workers: int | None = None, config: Config | None = Config(signature_version=UNSIGNED)) -> int
Download all objects under an S3 prefix into a local directory.
Unlike download_s3_objects, this function strips the prefix
from each object key before joining it with dst, so that dst becomes
the root of the downloaded tree.
Example:
    from pathlib import Path

    from helia_edge.utils.aws import download_s3_prefix

    # S3 objects: s3://my-bucket/datasets/ptbxl/00001.h5
    #             s3://my-bucket/datasets/ptbxl/00002.h5
    download_s3_prefix(
        bucket="my-bucket",
        prefix="datasets/ptbxl",
        dst=Path("./data/ptbxl"),
    )
    # Results in: ./data/ptbxl/00001.h5
    #             ./data/ptbxl/00002.h5
Parameters:
- bucket (str) – Bucket name.
- prefix (str) – Key prefix to filter objects. A trailing / is added automatically if missing.
- dst (Path) – Local directory that will mirror the contents found under prefix.
- checksum (str, default: 'size') – Checksum strategy ("size" or "md5"). Defaults to "size".
- progress (bool, default: True) – Show a tqdm progress bar. Defaults to True.
- num_workers (int | None, default: None) – Thread-pool size. None uses the concurrent.futures.ThreadPoolExecutor default.
- config (Config | None, default: Config(signature_version=UNSIGNED)) – Boto3 client config. Defaults to unsigned requests.
Returns:
- int – Number of objects downloaded (excludes skipped / up-to-date).
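The num_workers parameter and the integer return value fit a pattern where each object is handled by a thread-pool task that reports whether it actually downloaded anything. The following is a minimal sketch of that pattern, assuming a per-object function that returns True for a download and False for a skip; fetch is a hypothetical stand-in that touches neither S3 nor the disk.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(key: str) -> bool:
    """Hypothetical stand-in for a per-object download.

    Returns True when the object was downloaded, False when it was skipped.
    """
    already_local = {"datasets/ptbxl/00002.h5"}
    return key not in already_local


keys = [f"datasets/ptbxl/{i:05d}.h5" for i in range(1, 4)]

# max_workers=None lets ThreadPoolExecutor choose its own default size.
with ThreadPoolExecutor(max_workers=None) as pool:
    results = list(pool.map(fetch, keys))

num_downloaded = sum(results)  # True counts as 1, False as 0
```

Threads suit this workload because each task spends most of its time waiting on network I/O rather than on the CPU.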
Source code in helia_edge/utils/aws.py