Creating Tar Files with tarfile Module
The tarfile
module in Python provides a simple way to create and manage tar archives. Here's how to use it:
- Import the tarfile module.
- Use
tarfile.open()
with "w" mode to create a new archive. - For gzip-compressed tars, use "w:gz" mode.
- Add files to the archive with
add()
. - To include multiple files, loop over them and use
tar.add(filename)
. - For directories, use
os.walk()
to maintain the structure. - Always call
close()
to finalize the archive.
For command-line operations, subprocess.call()
can be used to run tar commands.
Note: Be cautious when extracting archives from unknown sources due to potential security risks.
Compressing Archives Using gzip and bzip2
The tarfile
module supports two main compression methods: gzip and bzip2.
Method | Mode | File Extension | Characteristics |
---|---|---|---|
Gzip | "w:gz" | .tar.gz | Good balance of compression and speed |
Bzip2 | "w:bz2" | .tar.bz2 | Better compression but may be slower |
Example:
import tarfile
# For gzip
with tarfile.open("archive.tar.gz", "w:gz") as tar:
tar.add("file_to_compress.txt")
# For bzip2
with tarfile.open("archive.tar.bz2", "w:bz2") as tar:
tar.add("file_to_compress.txt")
Remember to close the tarfile after operations to ensure all changes are saved.
Extracting and Reading Tar Archives
To extract and read tar archives using the tarfile
module:
- Open the archive:
tar = tarfile.open("sample.tar.gz", "r:gz")
- Extract all files:
tar.extractall("destination_folder")
- Extract a specific file:
file = tar.extractfile("specific_file.txt") content = file.read()
- List archive contents:
for member in tar.getmembers(): print(member.name)
- Close the archive when finished:
tar.close()
These methods work for .tar, .tar.gz, and .tar.bz2 files. Adjust the mode in tarfile.open()
accordingly ("r", "r:gz", or "r:bz2").
Compressing and organizing files with Python's tarfile
module is straightforward, offering a reliable way to manage data efficiently. By understanding how to create, compress, and extract archives, you can keep your digital assets neatly arranged without hassle.
Writio: Your AI content writer for top-notch articles. This article was written by Writio.
- Python Software Foundation. tarfile — Read and write tar archive files. Python Documentation.
- Rossum G, Drake FL. The Python Library Reference. Python Software Foundation.
- Chun WJ. Core Python Applications Programming. Prentice Hall.