Module markov.api.utils.file_utils
Functions
def archive_directory(input_dir, output_archive)-
Archives a directory using tar.
Args
input_dir:str- Path to the input directory.
output_archive:str- Path to the output archive file.
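A minimal sketch of what archiving a directory with tar can look like, using only the standard library. The name archive_directory_sketch is hypothetical, and the choice of an uncompressed tar whose entries are stored under the directory's own name is an assumption; the source only says the directory is archived using tar.

```python
import os
import tarfile
import tempfile


def archive_directory_sketch(input_dir: str, output_archive: str) -> None:
    """Hypothetical stand-in: write input_dir into an uncompressed tar file."""
    with tarfile.open(output_archive, "w") as tar:
        # arcname stores entries under the directory's own name instead of
        # its full path on disk (an assumption about the desired layout).
        top_level = os.path.basename(os.path.normpath(input_dir))
        tar.add(input_dir, arcname=top_level)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data")
    os.makedirs(src)
    with open(os.path.join(src, "a.txt"), "w") as fh:
        fh.write("hello")
    archive_path = os.path.join(tmp, "data.tar")
    archive_directory_sketch(src, archive_path)
    with tarfile.open(archive_path) as tar:
        tar_member_names = sorted(tar.getnames())
```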
def compress_directory(input_path: str)
def compress_file(input_file, output_file: str = None)-
Compresses an input file using gzip.
Args
input_file:str- Path to the input file.
output_file:str- Path to the output compressed file.
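As an illustration of gzip compression of a single file, the sketch below streams the input into a gzip file. The name compress_file_sketch is hypothetical, and defaulting the output to the input path plus ".gz" when output_file is None is an assumption, since the source does not say what the default is.

```python
import gzip
import os
import shutil
import tempfile
from typing import Optional


def compress_file_sketch(input_file: str, output_file: Optional[str] = None) -> str:
    """Hypothetical stand-in: gzip input_file and return the output path."""
    if output_file is None:
        # Assumption: the default output name is the input name plus ".gz".
        output_file = input_file + ".gz"
    with open(input_file, "rb") as src, gzip.open(output_file, "wb") as dst:
        # copyfileobj streams in chunks, so the whole file is never in memory.
        shutil.copyfileobj(src, dst)
    return output_file


# Round-trip check in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "notes.txt")
    payload = b"some text " * 100
    with open(plain_path, "wb") as fh:
        fh.write(payload)
    gz_path = compress_file_sketch(plain_path)
    gz_name = os.path.basename(gz_path)
    with gzip.open(gz_path, "rb") as fh:
        round_trip = fh.read()
```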
def compress_file_or_directory(input_path)-
Compresses either a file or a directory.
Args
input_path:str- Path to the input file or directory.
Returns
str- Path to the compressed file.
def download_from_uri(uri: str, out_dir: str, headers: Dict = None) ‑> str-
Downloads content from the specified URI and saves it to the specified directory.
Args
uri:str- The URI from which to download content.
out_dir:str- The directory where the downloaded content will be saved.
headers:Optional[Dict]- Additional HTTP headers to include in the request.
Returns
str- The directory where the downloaded content is saved.
def get_file_split_part_count(file_path: str, part_size: int = 5242880)-
Calculates the number of parts a file should be split into for uploading.
Args
file_path:str- Path to the file to be split.
part_size:int- Size of each part in bytes.
Returns
int- Number of parts needed to split the file.
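The part count is a ceiling division of the file size by the part size. The sketch below works on a size in bytes rather than a path to keep the arithmetic visible; part_count_sketch is a hypothetical name, and returning 0 for an empty file is an assumption about an edge case the source does not describe.

```python
DEFAULT_PART_SIZE = 5242880  # 5 MiB, the default shown in the signature


def part_count_sketch(file_size: int, part_size: int = DEFAULT_PART_SIZE) -> int:
    """Hypothetical stand-in: number of parts needed for file_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Integer ceiling division; avoids float rounding on very large files.
    return (file_size + part_size - 1) // part_size


twelve_mib_parts = part_count_sketch(12 * 1024 * 1024)  # 5 + 5 + 2 MiB
exact_fit_parts = part_count_sketch(DEFAULT_PART_SIZE)
one_byte_over_parts = part_count_sketch(DEFAULT_PART_SIZE + 1)
```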
def split_large_file(file_path, part_size: int = 5242880)-
Splits a large file into parts. It is written as a generator for memory efficiency: only the part currently being yielded is held in memory.
Args
file_path:str- Path to the input file.
part_size:int- Size of each part in bytes.
Yields
bytes- Data of each part.
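The generator approach described above can be sketched as follows. The name split_file_sketch is hypothetical; the real function may differ in details such as how it opens the file.

```python
import os
import tempfile
from typing import Iterator


def split_file_sketch(file_path: str, part_size: int = 5242880) -> Iterator[bytes]:
    """Hypothetical stand-in: yield the file's contents part_size bytes at a time."""
    with open(file_path, "rb") as fh:
        while True:
            chunk = fh.read(part_size)
            if not chunk:
                # An empty read means end of file.
                break
            # Only this chunk is held in memory until the caller asks for more.
            yield chunk


# A 10-byte file split into 4-byte parts gives parts of 4, 4 and 2 bytes.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.bin")
    with open(sample_path, "wb") as fh:
        fh.write(b"0123456789")
    split_parts = list(split_file_sketch(sample_path, part_size=4))
    split_part_lengths = [len(part) for part in split_parts]
```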
def unpack_archive_file(file_path, mimetype, target_dir=None) ‑> None-
Unpacks an archive file to the specified target directory. This method handles both zip and tar files.
Args
file_path:str- The path to the archive file.
mimetype:str- The MIME type of the archive. (Allowed mime types - ZIP and TAR)
target_dir:Optional[str]- The directory where the contents of the archive will be extracted. If not provided, the same directory as the archive file will be used.
Raises
RuntimeError- If the file format is not valid or if an error occurs during unpacking.
Side effects: The archive at file_path is unpacked into the target directory.
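One way to dispatch on the MIME type and fall back to the archive's own directory is sketched below with shutil.unpack_archive. The name unpack_archive_sketch and the exact MIME strings are assumptions; the source only states that ZIP and TAR are allowed.

```python
import os
import shutil
import tempfile
import zipfile
from typing import Optional

# Assumed MIME strings; the source only says "ZIP and TAR".
MIME_TO_FORMAT = {"application/zip": "zip", "application/x-tar": "tar"}


def unpack_archive_sketch(
    file_path: str, mimetype: str, target_dir: Optional[str] = None
) -> None:
    """Hypothetical stand-in: unpack a zip or tar archive into target_dir."""
    archive_format = MIME_TO_FORMAT.get(mimetype)
    if archive_format is None:
        raise RuntimeError(f"Unsupported archive type: {mimetype}")
    if target_dir is None:
        # Default to the directory that contains the archive.
        target_dir = os.path.dirname(os.path.abspath(file_path))
    os.makedirs(target_dir, exist_ok=True)
    try:
        shutil.unpack_archive(file_path, target_dir, archive_format)
    except (shutil.ReadError, OSError) as exc:
        raise RuntimeError(f"Could not unpack {file_path}") from exc


# Build a tiny zip, unpack it, then confirm an unknown type is rejected.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("hello.txt", "hi")
    out_dir = os.path.join(tmp, "out")
    unpack_archive_sketch(demo_zip, "application/zip", out_dir)
    with open(os.path.join(out_dir, "hello.txt")) as fh:
        extracted_text = fh.read()
    try:
        unpack_archive_sketch(demo_zip, "text/plain", out_dir)
        unknown_type_rejected = False
    except RuntimeError:
        unknown_type_rejected = True
```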
def upload_large_file_as_parts(file_path: str, upload_urls: List[str]) ‑> List-
Uploads a large file in parts to multiple upload URLs concurrently.
Args
file_path:str- Path to the file to be uploaded.
upload_urls:List[str]- List of upload URLs for each part.
Returns
List- List of responses from the uploads.
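The concurrent upload pattern can be sketched with a thread pool that pairs each part with one URL. Everything here is illustrative: upload_parts_sketch, iter_parts, the upload_fn parameter and fake_upload are hypothetical, numbering parts from 1 is an assumption, and no network request is made. In real use upload_fn would perform the HTTP upload of one part.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterator, List


def iter_parts(file_path: str, part_size: int) -> Iterator[bytes]:
    """Yield the file's contents part_size bytes at a time."""
    with open(file_path, "rb") as fh:
        while chunk := fh.read(part_size):
            yield chunk


def upload_parts_sketch(
    file_path: str,
    upload_urls: List[str],
    part_size: int,
    upload_fn: Callable[[str, bytes, int], Any],
) -> List[Any]:
    """Hypothetical stand-in: send each part to its URL on a worker thread."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # zip pairs the i-th URL with the i-th part; it stops at the shorter
        # of the two. Parts are read as they are submitted, so all submitted
        # parts may be in memory at the same time.
        futures = [
            pool.submit(upload_fn, url, data, number)
            for number, (url, data) in enumerate(
                zip(upload_urls, iter_parts(file_path, part_size)), start=1
            )
        ]
        # Collect in submission order so responses line up with part numbers.
        return [future.result() for future in futures]


def fake_upload(upload_url: str, data: bytes, part_number: int):
    """Stand-in uploader that records what it was given instead of sending it."""
    return (part_number, len(data))


with tempfile.TemporaryDirectory() as tmp:
    upload_source = os.path.join(tmp, "big.bin")
    with open(upload_source, "wb") as fh:
        fh.write(b"0123456789")
    placeholder_urls = [f"https://example.invalid/part-{i}" for i in (1, 2, 3)]
    upload_results = upload_parts_sketch(
        upload_source, placeholder_urls, part_size=4, upload_fn=fake_upload
    )
```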
def upload_part(upload_url, data, part_number)-
Uploads a part of a file to an upload URL.
Args
upload_url:str- URL for uploading the part.
data:bytes- Data of the part.
part_number:int- Part number.
Returns
Response- Response from the upload.
Raises
Exception- If the upload fails.
def validate_file_max_size(file_path)-
Validates that a file does not exceed the maximum allowed size.
Args
file_path:str- Path to the input file.
Returns
bool- True if the file size is within the allowed limit, False otherwise.
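The size check itself is a comparison against a limit. The source does not state the maximum allowed size, so the sketch below takes it as an explicit max_size_bytes parameter; both that parameter and the name validate_size_sketch are hypothetical.

```python
import os
import tempfile


def validate_size_sketch(file_path: str, max_size_bytes: int) -> bool:
    """Hypothetical stand-in: True if the file is no larger than max_size_bytes."""
    return os.path.getsize(file_path) <= max_size_bytes


with tempfile.TemporaryDirectory() as tmp:
    sized_path = os.path.join(tmp, "ten_bytes.bin")
    with open(sized_path, "wb") as fh:
        fh.write(b"0123456789")
    within_limit = validate_size_sketch(sized_path, max_size_bytes=10)
    over_limit = validate_size_sketch(sized_path, max_size_bytes=9)
```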
def zip_files_and_directories(paths: List[str], output_file: str)-
Compresses multiple files and directories into a zip archive.
Args
paths:List[str]- A list of file and directory paths to be included in the zip archive.
output_file:str- The path to the output zip file to create.
Returns
str- The path to the created zip archive.
Raises
IOError- If any of the input paths are neither a file nor a directory.
PermissionError- If there are issues accessing the output file or its parent directory.
zipfile.BadZipFile- If the output file is not a valid zip archive.
Example
paths = ['/path/to/file1.txt', '/path/to/directory', '/path/to/file2.txt']
output_file = '/path/to/output.zip'
zip_files_and_directories(paths, output_file)
This function takes a list of file and directory paths (paths) and compresses them into a single zip archive specified by the output_file parameter. If a path in the list is a file, it will be compressed individually. If it's a directory, the entire directory and its contents will be added to the zip archive. The function returns the path to the created zip archive.
Note
If output_file does not end with '.zip', it will be automatically appended.
def zipdir(path: str, ziph: zipfile.ZipFile)-
Recursively compresses a directory into a zip archive.
Args
path:str- The path of the directory to be compressed.
ziph:zipfile.ZipFile- The zipfile handle where the compressed data will be written.
Returns
None
Example
with zipfile.ZipFile('archive.zip', 'w') as zip_file:
    zipdir('/path/to/directory', zip_file)
This function takes a directory path (path) and a zipfile handle (ziph) and recursively compresses the entire directory and its contents into the zip archive specified by ziph.
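A recursive walk of this kind can be sketched with os.walk. The name zipdir_sketch is hypothetical, and storing entries relative to the directory's parent (so the directory name is the top-level entry in the archive) is an assumption about the layout.

```python
import os
import tempfile
import zipfile


def zipdir_sketch(path: str, ziph: zipfile.ZipFile) -> None:
    """Hypothetical stand-in: add every file under path to an open zip handle."""
    parent = os.path.dirname(os.path.normpath(path))
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            # Store names relative to the parent so the archive starts with
            # the directory itself. Empty directories are not recorded.
            ziph.write(full_path, arcname=os.path.relpath(full_path, parent))


with tempfile.TemporaryDirectory() as tmp:
    project_dir = os.path.join(tmp, "proj")
    os.makedirs(os.path.join(project_dir, "sub"))
    for relative in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(project_dir, relative), "w") as fh:
            fh.write("x")
    zip_path = os.path.join(tmp, "proj.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zip_file:
        zipdir_sketch(project_dir, zip_file)
    with zipfile.ZipFile(zip_path) as zip_file:
        zip_member_names = sorted(zip_file.namelist())
```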