Storage
Azure
These modules outline how to interact with Azure data stores, specifically Azure Blob Storage and Azure Data Lakes.
papermill.abs module
papermill.adl module
AWS
This module shows how to interact with AWS S3 data stores.
papermill.s3 module
Utilities for working with S3.
class papermill.s3.Bucket(name, service=None)
Bases: object
Represents a bucket of storage on S3.
- Parameters
name (string) – name of the bucket
service (string, optional (Default is None)) – name of a service resource, such as SQS, EC2, etc.
list(prefix='', delimiter=None)
Limits a list of Bucket’s objects based on prefix and delimiter.
class papermill.s3.Key(bucket, name, size=None, etag=None, last_modified=None, storage_class=None, service=None)
Bases: object
A key that represents a unique object in an S3 Bucket.
Represents a file or stream.
- Parameters
bucket (object) – A bucket of S3 storage
name (string) – representative name of the key
size (???, optional (Default is None)) –
etag (???, optional (Default is None)) –
last_modified (date, optional (Default is None)) –
storage_class (???, optional (Default is None)) –
service (string, optional (Default is None)) – name of a service resource, such as SQS, EC2, etc.
class papermill.s3.Prefix(bucket, name, service=None)
Bases: object
Represents a prefix used in an S3 Bucket.
- Parameters
bucket (object) – A bucket of S3 storage
name (string) – name of the prefix
service (string, optional (Default is None)) – name of a service resource, such as SQS, EC2, etc.
class papermill.s3.S3(keyname=None, *args, **kwargs)
Bases: object
Wraps S3.
- Parameters
keyname (TODO) –
-
The following are wrapped utilities for S3:
cat
cp_string
list
listdir
read
cat(source, buffersize=None, memsize=16777216, compressed=False, encoding='UTF-8', raw=False)
Returns an iterator over the data in the key, or nothing if the key doesn’t exist. Decompresses data on the fly (if compressed is True or the key ends with .gz) unless raw is True. Pass None for encoding to skip decoding.
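The interaction between compressed, encoding, and raw can be sketched without touching S3 at all. The helper below, iter_text, is hypothetical and is not papermill's implementation: it assumes the object's bytes arrive as an iterable of chunks, and it only honors the compressed flag (it does not inspect a key name for a .gz suffix).

```python
import codecs
import gzip
import zlib


def iter_text(chunks, compressed=False, encoding="UTF-8", raw=False):
    """Yield data from an iterable of byte chunks (hypothetical helper)."""
    if raw:
        # raw=True: hand the bytes back untouched, even if they are gzipped.
        yield from chunks
        return
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS) if compressed else None
    # An incremental decoder copes with multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder(encoding)() if encoding else None
    for chunk in chunks:
        if inflater is not None:
            chunk = inflater.decompress(chunk)
        if decoder is not None:
            chunk = decoder.decode(chunk)
        if chunk:
            yield chunk
    # Flush whatever the decompressor and decoder are still holding.
    tail = inflater.flush() if inflater is not None else b""
    if decoder is not None:
        tail = decoder.decode(tail, final=True)
    if tail:
        yield tail


# Simulate a gzipped object delivered in small chunks.
payload = gzip.compress("héllo\nwörld\n".encode("utf-8"))
chunks = [payload[i:i + 7] for i in range(0, len(payload), 7)]
text = "".join(iter_text(chunks, compressed=True))
```

With encoding=None the same helper yields decompressed bytes, and with raw=True it yields the original gzip bytes.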
cp_string(source, dest, **kwargs)
Copies source string into the destination location.
- Parameters
source (string) – the string with the content to copy
dest (string) – the s3 location
list(name, iterator=False, **kwargs)
Returns a list of the files under the specified path.
name must be in the form of s3://bucket/prefix
- Parameters
keys (optional) – if True then this will return the actual boto keys for files that are encountered
objects (optional) – if True then this will return the actual boto objects for files or prefixes that are encountered
delimiter (optional) – if set this
iterator (optional) – if True return iterator rather than converting to list object
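As an illustration of the s3://bucket/prefix form that name must take, the sketch below splits such a string into its bucket and prefix parts. split_s3_url is a hypothetical helper built on the standard library, not a papermill function, and the bucket and prefix names are made up.

```python
from urllib.parse import urlparse


def split_s3_url(name):
    """Split 's3://bucket/prefix' into (bucket, prefix). Hypothetical helper."""
    parsed = urlparse(name)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://bucket/prefix, got {name!r}")
    # Drop the leading slash: S3 key names do not start with one.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_url("s3://my-bucket/reports/2020/")
```

Here bucket is "my-bucket" and prefix is "reports/2020/". A name without the s3:// scheme raises ValueError.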
listdir(name, **kwargs)
Returns a list of the files under the specified path.
This is different from list as it will only give you files under the current directory, much like ls.
name must be in the form of s3://bucket/prefix/
- Parameters
keys (optional) – if True then this will return the actual boto keys for files that are encountered
objects (optional) – if True then this will return the actual boto objects for files or prefixes that are encountered
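The difference between list and listdir can be illustrated on an in-memory list of key names. This is a sketch of general prefix and delimiter semantics, not papermill's code: both functions are hypothetical, and whether papermill's listdir also reports sub-prefixes (such as data/raw/ below) is an assumption not stated here.

```python
def list_recursive(keys, prefix):
    """Every key under the prefix, at any depth."""
    return [k for k in keys if k.startswith(prefix)]


def list_one_level(keys, prefix, delimiter="/"):
    """Only entries directly under the prefix, much like `ls`."""
    entries = []
    for k in keys:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # A key deeper in the tree collapses to its first path segment.
        entry = prefix + head + sep
        if entry not in entries:
            entries.append(entry)
    return entries


keys = ["data/a.csv", "data/b.csv", "data/raw/c.csv", "logs/x.log"]
everything = list_recursive(keys, "data/")
top_level = list_one_level(keys, "data/")
```

In this example everything contains all three keys under data/, while top_level contains data/a.csv, data/b.csv, and a single data/raw/ entry standing in for the nested key.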
lock = <unlocked _thread.RLock object owner=0 count=0>
read(source, compressed=False, encoding='UTF-8')
Iterates over a file in S3, split on newline.
Yields a line in the file.
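Splitting streamed data on newlines needs a small buffer, because a line can straddle two chunks. The sketch below shows one way to do that over an iterable of text chunks. iter_lines is a hypothetical helper, not necessarily how read is implemented, and it assumes lines are yielded without their trailing newline.

```python
def iter_lines(chunks):
    """Yield lines (without the trailing newline) from text chunks."""
    pending = ""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is a run of complete lines;
        # the remainder stays buffered until more data arrives.
        *lines, pending = pending.split("\n")
        yield from lines
    if pending:
        # The final line may lack a trailing newline.
        yield pending


lines = list(iter_lines(["ab\ncd", "ef\n", "gh"]))
```

The second line, cdef, is assembled from two chunks, and the last line is still yielded even though the data does not end with a newline.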
s3_session = (None, None, None)