Module markov.api.utils.file_utils

Functions

def archive_directory(input_dir, output_archive)

Archives a directory using tar.

Args

input_dir : str
Path to the input directory.
output_archive : str
Path to the output archive file.
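
The source does not show how the archive is written. A minimal sketch using the standard library's tarfile module, assuming an uncompressed tar and that only the directory's own name is stored inside the archive (archive_directory_sketch is a hypothetical name, not the module's function):

```python
import os
import tarfile


def archive_directory_sketch(input_dir: str, output_archive: str) -> None:
    """Write input_dir into a tar archive (illustrative stand-in only)."""
    if not os.path.isdir(input_dir):
        raise NotADirectoryError(f"{input_dir} is not a directory")
    # Assumption: plain "w" mode (no compression); the real function may differ.
    with tarfile.open(output_archive, "w") as tar:
        # arcname stores entries under the directory's base name instead of
        # the full filesystem path.
        base_name = os.path.basename(os.path.normpath(input_dir))
        tar.add(input_dir, arcname=base_name)
```
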
def compress_directory(input_path: str)
def compress_file(input_file, output_file: str = None)

Compresses an input file using gzip.

Args

input_file : str
Path to the input file.
output_file : str
Path to the output compressed file.
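
The behaviour when output_file is omitted is not documented. A minimal sketch of streaming gzip compression, assuming ".gz" is appended to the input path in that case and that the output path is returned (both are assumptions; compress_file_sketch is a hypothetical name):

```python
import gzip
import shutil
from typing import Optional


def compress_file_sketch(input_file: str, output_file: Optional[str] = None) -> str:
    """Gzip input_file and return the path of the compressed copy."""
    # Assumption: default output name is the input path plus ".gz".
    if output_file is None:
        output_file = input_file + ".gz"
    with open(input_file, "rb") as src, gzip.open(output_file, "wb") as dst:
        # copyfileobj copies in chunks, so the whole file is never held in memory.
        shutil.copyfileobj(src, dst)
    return output_file
```
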
def compress_file_or_directory(input_path)

Compresses either a file or a directory.

Args

input_path : str
Path to the input file or directory.

Returns

str
Path to the compressed file.
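
The output formats are not stated in the source. One plausible way to dispatch on the path type, assuming files become ".gz" and directories become ".tar.gz" (compress_path_sketch and both naming rules are assumptions made for illustration):

```python
import gzip
import os
import shutil
import tarfile


def compress_path_sketch(input_path: str) -> str:
    """Compress a file with gzip or a directory with tar+gzip; return the new path."""
    normalized = os.path.normpath(input_path)
    if os.path.isdir(normalized):
        output = normalized + ".tar.gz"  # assumed naming rule
        with tarfile.open(output, "w:gz") as tar:
            tar.add(normalized, arcname=os.path.basename(normalized))
        return output
    if os.path.isfile(normalized):
        output = normalized + ".gz"  # assumed naming rule
        with open(normalized, "rb") as src, gzip.open(output, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return output
    raise FileNotFoundError(f"{input_path} is neither a file nor a directory")
```
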
def download_from_uri(uri: str, out_dir: str, headers: Dict = None) -> str

Downloads content from the specified URI and saves it to the specified directory.

Args

uri : str
The URI from which to download content.
out_dir : str
The directory where the downloaded content will be saved.
headers : Optional[Dict]
Additional HTTP headers to include in the request.

Returns

str
The directory where the downloaded content is saved.
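
The HTTP client used by the module is not shown. A rough sketch of the same contract with the standard library's urllib.request, assuming the saved file is named after the last path segment of the URI (that naming rule and the name download_from_uri_sketch are assumptions):

```python
import os
import shutil
import urllib.request
from typing import Dict, Optional
from urllib.parse import unquote, urlparse


def download_from_uri_sketch(
    uri: str, out_dir: str, headers: Optional[Dict[str, str]] = None
) -> str:
    """Fetch uri into out_dir and return out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    # Assumption: the file name comes from the URI path; "download" is a fallback.
    file_name = os.path.basename(unquote(urlparse(uri).path)) or "download"
    request = urllib.request.Request(uri, headers=headers or {})
    target = os.path.join(out_dir, file_name)
    with urllib.request.urlopen(request) as response, open(target, "wb") as out:
        # Stream the body to disk in chunks.
        shutil.copyfileobj(response, out)
    return out_dir
```
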
def get_file_split_part_count(file_path: str, part_size: int = 5242880)

Calculates the number of parts a file should be split into for uploading.

Args

file_path : str
Path to the file to be split.
part_size : int
Size of each part in bytes.

Returns

int
Number of parts needed to split the file.
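
The count is presumably the file size divided by part_size, rounded up. A minimal sketch of that arithmetic (part_count_sketch is a hypothetical name; returning 0 for an empty file is an assumption):

```python
import os

DEFAULT_PART_SIZE = 5 * 1024 * 1024  # 5242880 bytes, the documented default


def part_count_sketch(file_path: str, part_size: int = DEFAULT_PART_SIZE) -> int:
    """Return how many part_size-byte parts are needed to cover the file."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    size = os.path.getsize(file_path)
    # Integer ceiling division: any trailing partial part counts as one more part.
    return (size + part_size - 1) // part_size
```

For example, a 10-byte file with part_size=4 needs 3 parts (4 + 4 + 2 bytes).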
def split_large_file(file_path, part_size: int = 5242880)

Splits a large file into parts. The function is a generator, which keeps it memory efficient: only the part currently being yielded is held in memory.

Args

file_path : str
Path to the input file.
part_size : int
Size of each part in bytes.

Yields

bytes
Data of each part.
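
The generator pattern described above can be sketched as follows (split_file_sketch is a hypothetical name; the module's actual implementation is not shown in the source):

```python
from typing import Iterator


def split_file_sketch(file_path: str, part_size: int = 5242880) -> Iterator[bytes]:
    """Yield successive chunks of at most part_size bytes from the file."""
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                # An empty read means end of file.
                break
            yield chunk
```

Because each chunk is read only when the consumer asks for the next item, peak memory stays near part_size regardless of the file size.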
def unpack_archive_file(file_path, mimetype, target_dir=None) -> None

Unpacks an archive file to the specified target directory. This function handles both zip and tar files.

Args

file_path : str
The path to the archive file.
mimetype : str
The MIME type of the archive. Allowed MIME types: ZIP and TAR.
target_dir : Optional[str]
The directory where the contents of the archive will be extracted. If not provided, the same directory as the archive file will be used.

Raises

RuntimeError
If the file format is not valid or if an error occurs during unpacking.

Side effects: The archive at file_path is unpacked into target_dir.
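
The exact MIME strings the function accepts are not listed. A minimal sketch of the dispatch, assuming common values such as "application/zip" and "application/x-tar" (the two sets below and the name unpack_archive_sketch are assumptions):

```python
import os
import tarfile
import zipfile
from typing import Optional

# Assumed MIME values; the real function may accept a different set.
ZIP_MIMETYPES = {"application/zip"}
TAR_MIMETYPES = {"application/x-tar", "application/gzip"}


def unpack_archive_sketch(
    file_path: str, mimetype: str, target_dir: Optional[str] = None
) -> None:
    """Extract a zip or tar archive, raising RuntimeError on any failure."""
    if target_dir is None:
        # Documented default: extract next to the archive itself.
        target_dir = os.path.dirname(os.path.abspath(file_path))
    os.makedirs(target_dir, exist_ok=True)
    try:
        # extractall trusts member paths; use only on archives from trusted sources.
        if mimetype in ZIP_MIMETYPES:
            with zipfile.ZipFile(file_path) as archive:
                archive.extractall(target_dir)
        elif mimetype in TAR_MIMETYPES:
            with tarfile.open(file_path) as archive:
                archive.extractall(target_dir)
        else:
            raise RuntimeError(f"Unsupported archive type: {mimetype}")
    except (zipfile.BadZipFile, tarfile.TarError, OSError) as exc:
        raise RuntimeError(f"Failed to unpack {file_path}") from exc
```
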

def upload_large_file_as_parts(file_path: str, upload_urls: List[str]) -> List

Uploads a large file in parts to multiple upload URLs concurrently.

Args

file_path : str
Path to the file to be uploaded.
upload_urls : List[str]
List of upload URLs for each part.

Returns

List
List of responses from the uploads.
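
How the concurrency is implemented is not shown in the source. One way to sketch it is a thread pool that pairs each URL with the next part of the file. Here upload_fn is a hypothetical stand-in for upload_part, and the part_size parameter and worker count are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def upload_parts_sketch(
    file_path: str,
    upload_urls: List[str],
    upload_fn: Callable[[str, bytes, int], Any],
    part_size: int = 5242880,
) -> List[Any]:
    """Upload consecutive file parts concurrently; results keep part order."""
    with open(file_path, "rb") as f, ThreadPoolExecutor(max_workers=4) as pool:
        futures = []
        # Part numbers start at 1 (an assumption borrowed from multipart upload APIs).
        for part_number, url in enumerate(upload_urls, start=1):
            data = f.read(part_size)
            if not data:
                break  # fewer parts than URLs
            # Note: every submitted part stays in memory until its upload finishes.
            futures.append(pool.submit(upload_fn, url, data, part_number))
        # result() re-raises any exception raised inside upload_fn.
        return [future.result() for future in futures]
```
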
def upload_part(upload_url, data, part_number)

Uploads a part of a file to an upload URL.

Args

upload_url : str
URL for uploading the part.
data : bytes
Data of the part.
part_number : int
Part number.

Returns

Response
Response from the upload.

Raises

Exception
If the upload fails.
def validate_file_max_size(file_path)

Validates that a file does not exceed the maximum allowed size.

Args

file_path : str
Path to the input file.

Returns

bool
True if the file size is within the allowed limit, False otherwise.
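
The maximum allowed size is not stated in the source. A minimal sketch of the check, with a placeholder limit (the 5 GiB value, the max_size parameter, and the function name are all assumptions):

```python
import os

# Placeholder limit; the module's real maximum is not documented here.
MAX_FILE_SIZE_BYTES = 5 * 1024 ** 3


def validate_file_max_size_sketch(
    file_path: str, max_size: int = MAX_FILE_SIZE_BYTES
) -> bool:
    """Return True when the file is no larger than max_size bytes."""
    return os.path.getsize(file_path) <= max_size
```
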
def zip_files_and_directories(paths: List[str], output_file: str)

Compresses multiple files and directories into a zip archive.

Args

paths : List[str]
A list of file and directory paths to be included in the zip archive.
output_file : str
The path to the output zip file to create.

Returns

str
The path to the created zip archive.

Raises

IOError
If any of the input paths are neither a file nor a directory.
PermissionError
If there are issues accessing the output file or its parent directory.
zipfile.BadZipFile
If the output file is not a valid zip archive.

Example

    paths = ['/path/to/file1.txt', '/path/to/directory', '/path/to/file2.txt']
    output_file = '/path/to/output.zip'
    zip_files_and_directories(paths, output_file)

This function takes a list of file and directory paths (paths) and compresses them into a single zip archive at output_file. Each file in the list is added to the archive individually; each directory is added recursively, together with all of its contents. The function returns the path to the created zip archive.

Note

If output_file does not end with '.zip', the '.zip' extension is appended automatically.
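
The documented behaviour (extension handling, file versus directory handling, IOError on invalid paths) can be sketched with the standard zipfile module. The entry names chosen inside the archive and the name zip_paths_sketch are assumptions:

```python
import os
import zipfile
from typing import List


def zip_paths_sketch(paths: List[str], output_file: str) -> str:
    """Zip the given files and directories; return the archive path."""
    if not output_file.endswith(".zip"):
        output_file += ".zip"
    # Validate up front so no partial archive is written for bad input.
    for path in paths:
        if not (os.path.isfile(path) or os.path.isdir(path)):
            raise IOError(f"{path} is neither a file nor a directory")
    with zipfile.ZipFile(output_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            if os.path.isfile(path):
                # Assumption: files are stored under their base name.
                archive.write(path, arcname=os.path.basename(path))
                continue
            # Assumption: directory entries are stored relative to the
            # directory's parent, so the directory name is kept.
            parent = os.path.dirname(os.path.abspath(path))
            for root, _dirs, files in os.walk(path):
                for name in files:
                    full_path = os.path.abspath(os.path.join(root, name))
                    archive.write(full_path, arcname=os.path.relpath(full_path, parent))
    return output_file
```
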

def zipdir(path: str, ziph: zipfile.ZipFile)

Recursively compresses a directory into a zip archive.

Args

path : str
The path of the directory to be compressed.
ziph : zipfile.ZipFile
The zipfile handle where the compressed data will be written.

Returns

None

Example

    with zipfile.ZipFile('archive.zip', 'w') as zip_file:
        zipdir('/path/to/directory', zip_file)

This function takes a directory path (path) and a zipfile handle (ziph) and recursively compresses the entire directory and its contents into the zip archive specified by ziph.
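
The recursive walk can be sketched with os.walk. How entry names are chosen inside the archive is not documented, so storing them relative to the directory's parent is an assumption, as is the name zipdir_sketch:

```python
import os
import zipfile


def zipdir_sketch(path: str, ziph: zipfile.ZipFile) -> None:
    """Add every file under path to the open zip handle."""
    # Assumption: names are relative to the parent of path, so the
    # top-level directory name appears inside the archive.
    base = os.path.dirname(os.path.abspath(path))
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.abspath(os.path.join(root, name))
            ziph.write(full_path, arcname=os.path.relpath(full_path, base))
    # Empty directories are not recorded by this sketch.
```
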