Bash Gzip
Learn file compression and decompression with gzip
🗜️ What is Gzip?
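Gzip is a fast compression utility that reduces file sizes using the DEFLATE algorithm. It is common on Unix-like systems for compressing individual files: each input file becomes its own .gz file, saving storage space and bandwidth. Gzip does not bundle several files into one archive; that job is usually left to tar, which is why backups often end in .tar.gz.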
# Compress a file with gzip
gzip file.txt
Output:
(none; gzip prints nothing on success and leaves file.txt.gz in place of file.txt)
Key Gzip Concepts
Compress
Reduce file size efficiently
gzip file.txt
Decompress
Restore original files
gunzip file.txt.gz
View
Read contents without extracting to disk
zcat file.txt.gz
Keep Original
Preserve source file
gzip -k file.txt
🔹 Basic Gzip Compression
The gzip command compresses individual files, replacing each original with a smaller .gz version to save storage space and reduce transfer times. Running gzip document.txt creates document.txt.gz and removes document.txt. Savings are largest for text, source code, and other uncompressed data; files that are already compressed (JPEG images, for example) shrink little or not at all. Compression uses the DEFLATE algorithm, which balances compression ratio against processing speed. By default gzip prints nothing on success; add -v to see the percentage reduction for each file, which makes it easy to compare how well different file types compress.
# Compress a file (replaces original)
gzip document.txt
# Compress with verbose output
gzip -v largefile.log
# Compress multiple files
gzip file1.txt file2.txt file3.txt
Output:
largefile.log: 65.2% -- replaced with largefile.log.gz
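The replace-the-original behavior can be checked with a short script. This is a minimal sketch, assuming gzip, awk, and mktemp are available; the scratch directory and the name sample.txt are made up for illustration.

```shell
# Sketch: compress a scratch file and confirm gzip replaced the original.
set -eu

workdir=$(mktemp -d)              # throwaway directory
sample="$workdir/sample.txt"      # hypothetical file name

# Generate repetitive text, which DEFLATE compresses well.
awk 'BEGIN { for (i = 1; i <= 2000; i++) print "line " i ": the quick brown fox jumps over the lazy dog" }' > "$sample"

original_size=$(wc -c < "$sample" | tr -d ' ')

gzip "$sample"                    # prints nothing on success

compressed_size=$(wc -c < "$sample.gz" | tr -d ' ')
if [ -e "$sample" ]; then original_present=yes; else original_present=no; fi

echo "original: $original_size bytes, compressed: $compressed_size bytes"
echo "original still present: $original_present"

rm -rf "$workdir"
```

The second line of output should read "original still present: no", since plain gzip removes the source file once the .gz has been written.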
🔹 Decompress Gzip Files
Use gunzip or gzip -d to restore a .gz file to its original state; both remove the compressed version once decompression succeeds. The command gunzip file.txt.gz recreates file.txt with identical content and carries over the compressed file's permissions and modification time. Decompression is typically faster than compression, and gzip verifies a CRC-32 checksum stored in the file, so corruption is reported rather than silently passed through.
# Decompress using gunzip
gunzip file.txt.gz
# Decompress using gzip -d
gzip -d archive.gz
# Decompress with verbose output
gunzip -v backup.tar.gz
Output:
backup.tar.gz: 65.2% -- replaced with backup.tar
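A round trip makes the "identical content" claim concrete. The following is a small sketch, assuming gzip and gunzip are installed; notes.txt and reference.txt are hypothetical names.

```shell
# Sketch: compress, decompress, and compare against an untouched copy.
set -eu

workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/notes.txt"
cp "$workdir/notes.txt" "$workdir/reference.txt"   # copy kept for comparison

gzip "$workdir/notes.txt"         # notes.txt    -> notes.txt.gz
gunzip "$workdir/notes.txt.gz"    # notes.txt.gz -> notes.txt (same as gzip -d)

if cmp -s "$workdir/notes.txt" "$workdir/reference.txt"; then
    roundtrip=identical
else
    roundtrip=different
fi
if [ -e "$workdir/notes.txt.gz" ]; then gz_present=yes; else gz_present=no; fi

echo "round trip: $roundtrip; .gz still present: $gz_present"

rm -rf "$workdir"
```

cmp -s compares the files byte for byte and reports the result only through its exit status, which is why it sits inside the if.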
🔹 Keep Original File
The -k (keep) option preserves the input file during both compression and decompression, so both versions exist afterwards. Running gzip -k document.txt generates document.txt.gz and leaves document.txt in place. This is useful when you want to verify the compression result, keep the original readily accessible, or produce a compressed copy without touching the source. The option requires gzip 1.6 or later; on older versions, gzip -c document.txt > document.txt.gz has the same effect.
# Compress and keep original
gzip -k document.txt
# Decompress and keep compressed file
gunzip -k archive.gz
# Keep with verbose output
gzip -kv file.txt
Output:
file.txt: 65.2% -- created file.txt.gz
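To see -k working in both directions, a sketch along these lines may help. It assumes a gzip new enough to support -k (1.6 or later); report.txt is a hypothetical file name.

```shell
# Sketch: -k keeps the input file when compressing and when decompressing.
set -eu

workdir=$(mktemp -d)
printf 'keep me\n' > "$workdir/report.txt"

gzip -k "$workdir/report.txt"     # creates report.txt.gz, keeps report.txt
if [ -e "$workdir/report.txt" ] && [ -e "$workdir/report.txt.gz" ]; then
    both_after_compress=yes
else
    both_after_compress=no
fi

rm "$workdir/report.txt"          # drop the original so gunzip can recreate it
gunzip -k "$workdir/report.txt.gz"   # recreates report.txt, keeps report.txt.gz
if [ -e "$workdir/report.txt" ] && [ -e "$workdir/report.txt.gz" ]; then
    both_after_decompress=yes
else
    both_after_decompress=no
fi

echo "both present after gzip -k: $both_after_compress"
echo "both present after gunzip -k: $both_after_decompress"

rm -rf "$workdir"
```

The original is deleted before the gunzip -k step because gzip refuses to overwrite an existing output file unless -f is given.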
🔹 Compression Levels
Trade compression ratio against processing speed with a numeric level from 1 (fastest) to 9 (smallest output), written as -1 through -9. The long options --fast and --best are aliases for -1 and -9. The default, level 6, favors compression ratio at a moderate cost in speed and suits most uses. GNU gzip has no level 0; storing data without compression is a zip feature. For example, gzip -9 largefile.log applies maximum compression for archival storage, while gzip -1 largefile.log is a better fit when speed matters more than size. Files that are already compressed, such as JPEG images, gain little at any level.
# Fast compression (level 1)
gzip -1 file.txt
# Default compression (level 6)
gzip file.txt
# Maximum compression (level 9)
gzip -9 document.txt
# Best compression with keep
gzip -9k largefile.log
Output (shown only when -v is added; percentages are illustrative):
file.txt: 45.8% -- replaced with file.txt.gz
document.txt: 72.3% -- replaced with document.txt.gz
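One way to compare levels on your own data is to compress to standard output and count bytes, which leaves the source file untouched. This is a rough sketch with made-up log content; the actual sizes depend on the input and the gzip version, so no particular numbers are claimed here.

```shell
# Sketch: measure output size at levels 1, 6 (default), and 9.
set -eu

workdir=$(mktemp -d)
sample="$workdir/sample.log"      # hypothetical file name

# Synthetic log lines (invented data) with some variation per line.
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "2024-01-01 request " i " status " (i % 7) " bytes " (i * 37 % 1000) }' > "$sample"

original_size=$(wc -c < "$sample" | tr -d ' ')
fast_size=$(gzip -1 -c "$sample" | wc -c | tr -d ' ')
default_size=$(gzip -c "$sample" | wc -c | tr -d ' ')
best_size=$(gzip -9 -c "$sample" | wc -c | tr -d ' ')

echo "original: $original_size bytes"
echo "level 1:  $fast_size bytes"
echo "level 6:  $default_size bytes"
echo "level 9:  $best_size bytes"

rm -rf "$workdir"
```

Prefixing each gzip call with time would show the other half of the trade-off; on small inputs like this one the difference is usually negligible.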
🔹 View Compressed Files
The utilities zcat, zless, and zmore let you read gzip-compressed content without leaving a decompressed copy on disk. They decompress the data on the fly and leave the .gz file untouched: zcat writes the entire content to standard output, while zless and zmore page through it. zgrep searches compressed files in the same way. This is ideal for examining logs, reviewing documentation, or searching archived data without spending storage on uncompressed copies. Each tool behaves like its uncompressed counterpart (cat, less, more, grep) and handles decompression transparently.
# Display entire compressed file
zcat file.txt.gz
# View with pagination
zless document.txt.gz
# Page through with more
zmore log.txt.gz
# Search in compressed file
zgrep "error" logfile.gz
Output:
This is the content of the file
displayed without decompression
keeping the .gz file intact
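Because zcat writes to standard output, it combines with ordinary text tools. The sketch below uses a hypothetical app.log with invented contents and assumes a zcat that accepts .gz files (as GNU gzip's does); the zcat-plus-grep pipeline is roughly what zgrep does for you.

```shell
# Sketch: inspect a compressed log without extracting it to disk.
set -eu

workdir=$(mktemp -d)
printf 'info: started\nerror: disk full\ninfo: retrying\nerror: timeout\n' > "$workdir/app.log"
gzip "$workdir/app.log"           # only app.log.gz remains

# Count lines and matching lines straight from the compressed file.
line_count=$(zcat "$workdir/app.log.gz" | wc -l | tr -d ' ')
error_count=$(zcat "$workdir/app.log.gz" | grep -c "error")

# The uncompressed file was never recreated.
if [ -e "$workdir/app.log" ]; then extracted=yes; else extracted=no; fi

echo "lines: $line_count, errors: $error_count, extracted to disk: $extracted"

rm -rf "$workdir"
```

On systems where zcat expects .Z files, gzip -dc is a portable substitute with the same behavior.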
🔹 Test Compressed Files
The -t option checks the integrity of a gzip file without writing any decompressed output. Running gzip -t archive.gz decompresses the whole stream internally, verifies its structure and stored CRC-32 checksum, and reports an error with a non-zero exit status if the file is truncated or corrupt. A successful test prints nothing unless -v is given. This check is valuable for validating backups, confirming that a transfer completed intact, and periodically auditing long-term archives so that problems surface before the data is needed.
# Test single file
gzip -t file.txt.gz
# Test with verbose output
gzip -tv archive.gz
# Test multiple files
gzip -t *.gz
Output:
archive.gz: OK
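In scripts, the exit status of gzip -t is usually more useful than its messages. The following sketch simulates damage by truncating a copy of a valid file; the file names are hypothetical and head -c is assumed to be available.

```shell
# Sketch: use the exit status of gzip -t to tell good files from damaged ones.
set -eu

workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 500; i++) print "record " i ": some payload text" }' > "$workdir/data.txt"

gzip -c "$workdir/data.txt" > "$workdir/good.gz"

# Simulate corruption by keeping only the first 100 bytes of the valid file.
head -c 100 "$workdir/good.gz" > "$workdir/truncated.gz"

if gzip -t "$workdir/good.gz" 2>/dev/null; then
    good_status=ok
else
    good_status=corrupt
fi

if gzip -t "$workdir/truncated.gz" 2>/dev/null; then
    truncated_status=ok
else
    truncated_status=corrupt
fi

echo "good.gz: $good_status"
echo "truncated.gz: $truncated_status"

rm -rf "$workdir"
```

Error messages are sent to /dev/null here only to keep the output short; in a real backup check you would likely want to log them.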
🔹 Compress from Standard Input
When no file is named, gzip reads standard input and writes the compressed stream to standard output, so it slots directly into pipelines and can compress the output of other commands without intermediate files. With a file argument, the -c option sends the compressed data to standard output and leaves the original untouched. For example, gzip -c document.txt > document.txt.gz creates the compressed file while preserving document.txt. This suits streaming workloads and automated workflows where intermediate files would add complexity or storage overhead.
# Compress and redirect output
cat file.txt | gzip > file.txt.gz
# Compress with custom name
gzip -c document.txt > backup.gz
# Decompress to stdout
gunzip -c file.txt.gz > restored.txt
# Concatenate files into one compressed stream (file boundaries are not preserved)
cat file1.txt file2.txt | gzip > combined.gz
Output:
(none; these commands print nothing on success)
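The three stream-oriented patterns above can be tried together in one script. This is a minimal sketch using hypothetical files a.txt and b.txt in a scratch directory.

```shell
# Sketch: gzip in pipelines, with -c, and on concatenated input.
set -eu

workdir=$(mktemp -d)
printf 'first\n'  > "$workdir/a.txt"
printf 'second\n' > "$workdir/b.txt"

# No file argument: gzip reads stdin and writes stdout.
piped=$(printf 'streamed text\n' | gzip | gzip -dc)

# -c with a file argument writes to stdout and leaves the source in place.
gzip -c "$workdir/a.txt" > "$workdir/a-copy.gz"
if [ -e "$workdir/a.txt" ]; then source_kept=yes; else source_kept=no; fi

# Concatenating before compressing yields a single stream of both files.
cat "$workdir/a.txt" "$workdir/b.txt" | gzip > "$workdir/combined.gz"
combined_lines=$(gzip -dc "$workdir/combined.gz" | wc -l | tr -d ' ')

echo "pipeline round trip: $piped"
echo "source kept after -c: $source_kept"
echo "lines in combined stream: $combined_lines"

rm -rf "$workdir"
```

Decompressing combined.gz gives back one continuous stream, so if the individual files need to be recoverable by name, tar them first and compress the tarball instead.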