Bash Remove Section (cut)

Extract specific portions of text from lines

✂️ What is cut?

cut extracts specific sections from each line of text. It's perfect for extracting columns from delimited files, getting specific characters, or processing structured data by selecting only the fields you need from input.


# Extract first field from CSV
cut -d',' -f1 data.csv

# Extract characters 1-5
cut -c1-5 file.txt

cut Basics

📋 Extract Fields

Get specific columns

cut -d',' -f1 file.csv

🔤 Extract Characters

Get character positions

cut -c1-10 file.txt

📊 Multiple Fields

Extract several columns

cut -d',' -f1,3,5 file.csv

🔀 Field Ranges

Get range of fields

cut -d':' -f1-3 file.txt

🔹 Extracting Fields by Delimiter

The cut command is purpose-built for extracting columns from structured text using a single-character delimiter. Its syntax, cut -d':' -f1, is straightforward for simple column extraction. It is a natural fit for jobs like getting usernames from /etc/passwd or pulling columns out of simple CSV exports before importing them elsewhere. Note that cut splits on every occurrence of the delimiter and does not understand CSV quoting, so a quoted field that contains a comma is split in two. While less flexible than AWK for complex logic, cut is simpler and typically lighter for straightforward field slicing in pipelines and scripts.

# Extract first field from CSV
cut -d',' -f1 data.csv

# Extract second field from colon-separated file
cut -d':' -f2 /etc/passwd

# Extract multiple fields
cut -d',' -f1,3 data.csv

# Extract field range
cut -d',' -f2-4 data.csv

Input (data.csv):

John,25,Engineer,NYC
Sarah,30,Designer,LA
Mike,28,Developer,SF

Output (cut -d',' -f1,3):

John,Engineer
Sarah,Designer
Mike,Developer
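
The example above can be reproduced without a data.csv file. This is a minimal sketch: the sample_rows helper and the quoted "Doe, John" row are hypothetical, added only to show how cut treats a comma inside quotes.

```shell
# Sample rows copied from the example above, generated inline so no file is needed.
sample_rows() {
  printf '%s\n' 'John,25,Engineer,NYC' 'Sarah,30,Designer,LA' 'Mike,28,Developer,SF'
}

# Fields 1 and 3: name and job title.
names_and_jobs=$(sample_rows | cut -d',' -f1,3)
printf '%s\n' "$names_and_jobs"

# cut has no notion of CSV quoting: the comma inside the quotes still ends field 1.
quoted_first=$(printf '%s\n' '"Doe, John",25,Engineer' | cut -d',' -f1)
printf '%s\n' "$quoted_first"
```

The last command prints only "Doe (with the opening quote), which is why files with quoted fields are better handled by a CSV-aware tool.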

🔹 Extracting Characters

When data is in fixed-width format, common in legacy system reports and some log files, cut -c is a good fit. It extracts based on character position, not delimiters. For example, a file might have a date in characters 1-10, an ID in 11-20, and a status in 21-25. cut -c1-10,21-25 file.txt would extract just the date and status, written back to back with no separator between the two ranges. This positional addressing is precise and reliable for formats where column boundaries are strictly defined by character count.

# Extract first 5 characters
cut -c1-5 file.txt

# Extract characters 10 to 20
cut -c10-20 file.txt

# Extract specific positions
cut -c1,5,10 file.txt

# Extract from position 5 to end
cut -c5- file.txt

# Extract up to position 10
cut -c-10 file.txt

Input:

Hello World 2024
Bash Tutorial Here

Output (cut -c1-5):

Hello
Bash 
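
The fixed-width layout described above (date in 1-10, ID in 11-20, status in 21-25) can be tried with a generated record. This is a sketch under assumptions: the record contents are made up, and the text is plain ASCII so character and byte positions coincide.

```shell
# Hypothetical fixed-width record: date in columns 1-10, ID in 11-20, status in 21-25.
# printf pads each value with spaces to its column width.
record=$(printf '%-10s%-10s%-5s' '2024-01-15' 'ID00042' 'OK')

# Keep the date and status columns; the two ranges are joined with no separator.
date_and_status=$(printf '%s\n' "$record" | cut -c1-10,21-25)

# Keep only the ID column, including its trailing padding.
id_only=$(printf '%s\n' "$record" | cut -c11-20)

# Brackets make the space padding visible.
printf '[%s]\n' "$date_and_status"
printf '[%s]\n' "$id_only"
```

The padding is part of the extracted text, so a later step may still need to trim trailing spaces.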

🔹 Extracting Bytes

The -b option in cut performs byte-level extraction, which differs from character extraction in multi-byte encodings. In UTF-8, a single character like 'é' is encoded as two bytes (C3 A9). POSIX defines -c as counting characters, so a multi-byte-aware cut treats 'é' as one character, while -b always treats it as two separate bytes. Be aware that some implementations, including GNU coreutils in many releases, treat -c exactly like -b, so test on your own system before relying on -c with non-ASCII text. Byte offsets matter when a record layout is specified in bytes rather than characters. Because cut works line by line, it suits text records with byte-defined columns better than arbitrary binary data.

# Extract first 10 bytes
cut -b1-10 file.txt

# Extract bytes 5 to 15
cut -b5-15 file.txt

# Extract specific byte positions
cut -b1,5,10,15 file.txt

Bytes vs Characters:

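  • -c: Works with characters (multi-byte aware only where the implementation supports it)
  • -b: Works with bytes (raw data)
  • For ASCII text, -c and -b produce the same results
  • For UTF-8 text, -c keeps multi-byte characters intact on multi-byte-aware implementations; -b can split a character in the middle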
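
The byte/character difference can be checked directly. This is a small sketch that writes 'é' as its two UTF-8 bytes with octal escapes (\303\251 is C3 A9), so it does not depend on how the script file is encoded. What the final -c1 command prints depends on your cut implementation and locale, so no output is claimed for it.

```shell
# 'é' in UTF-8 is the two bytes C3 A9; count them.
byte_count=$(printf '\303\251' | wc -c | tr -d ' ')
printf 'bytes in é: %s\n' "$byte_count"

# -b1-2 keeps both bytes of the first character of "éa"; -b1 alone would cut it in half.
first_char_bytes=$(printf '\303\251a\n' | cut -b1-2)
printf '%s\n' "$first_char_bytes"

# Inspect what -c1 returns on this system as hex bytes.
# A multi-byte-aware cut shows c3 a9; a byte-oriented one shows only c3.
printf '\303\251a\n' | cut -c1 | od -An -tx1
```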

🔹 Multiple Field Selection

cut can extract non-adjacent fields when you list them separated by commas. For instance, to get the 1st, 3rd, and 5th fields from a colon-delimited file: cut -d':' -f1,3,5. Fields are always written in the order they appear in the input, regardless of the order in the list. Selecting only the columns you need from a wider dataset reduces output clutter and gives later pipeline stages less data to process.

# Extract fields 1, 3, and 5
cut -d',' -f1,3,5 data.csv

# Extract fields 1, 2, and 5 through 7
cut -d':' -f1,2,5-7 file.txt

# Extract first field and last two fields (assumes each line has exactly five fields)
cut -d',' -f1,4,5 data.csv

Input:

A,B,C,D,E,F
1,2,3,4,5,6

Output (cut -d',' -f1,3,5):

A,C,E
1,3,5
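
One detail worth seeing in practice: cut writes selected fields in input order, so the order of the list after -f does not reorder columns. The sketch below uses the sample row from above; the awk line is one possible way to reorder, shown as an alternative rather than a feature of cut.

```shell
row='A,B,C,D,E,F'

# Listing fields in order.
picked=$(printf '%s\n' "$row" | cut -d',' -f1,3,5)

# Listing the same fields in a different order gives the same output.
shuffled_list=$(printf '%s\n' "$row" | cut -d',' -f5,1,3)

# To actually reorder columns, a tool such as awk is needed.
reordered=$(printf '%s\n' "$row" | awk -F',' -v OFS=',' '{print $5, $1}')

printf '%s\n' "$picked" "$shuffled_list" "$reordered"
```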

🔹 Using Field Ranges

Field ranges with a hyphen simplify the extraction of consecutive columns, improving command readability and reducing typing errors. Syntax like -f2-5 extracts fields two through five inclusive. Open-ended ranges are also powerful: -f3- gets everything from the third field to the end of the line, useful when you want to discard leading metadata. This range notation is efficient for working with tables where you need a contiguous block of columns for analysis or reporting.

# Extract fields 2 through 5
cut -d',' -f2-5 data.csv

# Extract from field 3 to end
cut -d':' -f3- /etc/passwd

# Extract from beginning to field 4
cut -d',' -f-4 data.csv

# Extract all except first field
cut -d',' -f2- data.csv

Input:

A,B,C,D,E,F
1,2,3,4,5,6

Output (cut -d',' -f2-4):

B,C,D
2,3,4
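
The open-ended forms can be compared side by side. A short sketch using the same sample row; the three-field row in the last command is an added illustration of a range that runs past the end of the line.

```shell
row='A,B,C,D,E,F'

# From field 3 to the end of the line.
from_third=$(printf '%s\n' "$row" | cut -d',' -f3-)

# From the first field up to field 2.
up_to_second=$(printf '%s\n' "$row" | cut -d',' -f-2)

# A range that extends past the last field simply stops at the end of the line.
past_end=$(printf '%s\n' 'A,B,C' | cut -d',' -f2-10)

printf '%s\n' "$from_third" "$up_to_second" "$past_end"
```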

🔹 Changing Output Delimiter

The --output-delimiter option, a GNU extension that may be missing from other cut implementations, replaces the input separator with a different one in the output. This is useful for data conversion. For example, converting a comma-separated list to a pipe-separated one: cut -d',' -f1-3 --output-delimiter='|' data.csv. It helps when a downstream application expects a specific delimiter, when the original delimiter is a space or comma that blends into the data, or when you want visually distinct output for debugging.

# Change comma to tab
cut -d',' -f1,2,3 --output-delimiter=$'\t' data.csv

# Change comma to pipe
cut -d',' -f1,2 --output-delimiter='|' data.csv

# Change colon to comma
cut -d':' -f1,3,5 --output-delimiter=',' /etc/passwd

Input:

John,25,Engineer
Sarah,30,Designer

Output (with pipe delimiter):

John|25|Engineer
Sarah|30|Designer
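
Where --output-delimiter is not available, a similar result can often be had by piping through tr. This is a sketch, assuming a single-character replacement delimiter; the rows helper is hypothetical, and the GNU command's errors are silenced so the script also runs where the option is unsupported.

```shell
rows() { printf '%s\n' 'John,25,Engineer' 'Sarah,30,Designer'; }

# GNU cut: change the delimiter directly (empty result if the option is unsupported).
gnu_piped=$(rows | cut -d',' -f1-3 --output-delimiter='|' 2>/dev/null)

# Fallback: after cut, the only commas left are the separators between kept fields,
# so tr can swap them for another single character.
portable_piped=$(rows | cut -d',' -f1-3 | tr ',' '|')

printf '%s\n' "$portable_piped"
```

The tr approach cannot produce a multi-character separator, which --output-delimiter can.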

🔹 Suppressing Lines Without Delimiter

The -s option tells cut to silently skip lines that do not contain the specified delimiter. It only has an effect together with -f. By default, such lines are printed unchanged, which can intersperse unwanted content (like header notes or malformed rows) with your cleanly extracted data. Using -s keeps only the records that contain the delimiter, so the output stays consistent. This is particularly useful when processing semi-structured logs or files where only a subset of lines follow the columnar format you intend to parse.

# Only show lines with delimiter
cut -d',' -f1 -s data.csv

# Suppress lines without colon
cut -d':' -f1 -s file.txt

# Useful for filtering malformed data
cut -d',' -f1,2 -s mixed_data.txt

Input:

John,25,Engineer
This line has no commas
Sarah,30,Designer

Output (cut -d',' -f1 -s):

John
Sarah
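
Running the same input with and without -s makes the default behavior visible. A minimal sketch using the three lines from the example above; the mixed helper is hypothetical.

```shell
mixed() {
  printf '%s\n' 'John,25,Engineer' 'This line has no commas' 'Sarah,30,Designer'
}

# Default: the line without a comma is passed through unchanged.
without_s=$(mixed | cut -d',' -f1)

# With -s: the line without a comma is dropped.
with_s=$(mixed | cut -d',' -f1 -s)

printf 'default:\n%s\n\nwith -s:\n%s\n' "$without_s" "$with_s"
```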

🔹 Practical cut Examples

Common practical uses of cut include parsing system files, processing command output, and data preparation. For example, cut -d' ' -f1 access.log | sort | uniq -c counts requests per client address in a log whose first space-separated field is the client IP, and echo "$PATH" | cut -d':' -f1 prints the first directory on the search path. These examples show cut's role as a basic text slicer in a sysadmin's toolkit, often used in the first stage of a data extraction pipeline.

# Extract usernames from /etc/passwd
cut -d':' -f1 /etc/passwd

# Get IP addresses from log file
cut -d' ' -f1 access.log | sort | uniq

# Extract email domain
cut -d'@' -f2 emails.txt

# Get first and last names from CSV
cut -d',' -f1,2 contacts.csv

# Extract date from log entries
cut -c1-10 app.log

# Get file extensions for simple names like report.txt (-s skips names without a dot)
ls -1 | cut -d'.' -f2 -s

# Process CSV and pipe to another command
cut -d',' -f3 sales.csv | awk '{sum+=$1} END {print sum}'
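
Two of the patterns above can be tried on inline data. This is a sketch with made-up input: the emails and sales helpers and their contents are hypothetical stand-ins for emails.txt and sales.csv.

```shell
emails() { printf '%s\n' 'ann@example.com' 'bob@example.org' 'cy@example.com'; }
sales()  { printf '%s\n' 'north,widgets,120' 'south,gadgets,80' 'east,widgets,50'; }

# Distinct email domains: take the part after '@', then de-duplicate.
domains=$(emails | cut -d'@' -f2 | sort -u)

# Sum of the third CSV column: cut isolates the numbers, awk adds them up.
total=$(sales | cut -d',' -f3 | awk '{sum += $1} END {print sum}')

printf '%s\n' "$domains"
printf 'total: %s\n' "$total"
```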

🔹 Combining cut with Other Commands

The true power of cut emerges in Unix pipelines, where it works alongside filters, sorters, and aggregators. A typical pipeline: grep "GET" access.log | cut -d' ' -f7 | sort | uniq -c | sort -rn | head -20. Here, grep filters the lines and cut isolates the URL, which is the seventh space-separated field in the common web server log format. Combining simple, single-purpose tools in this way (the Unix philosophy) lets you build ad-hoc data processing workflows without writing a dedicated program.

# Extract and count unique values
cut -d',' -f3 data.csv | sort | uniq -c

# Filter then extract
grep "error" app.log | cut -d' ' -f1-3

# Extract and search
cut -d':' -f1 /etc/passwd | grep "^a"

# Chain multiple cuts
cut -d',' -f2 data.csv | cut -c1-5

# Extract, sort numerically, and show the 10 largest values
cut -d',' -f3 sales.csv | sort -n | tail -10

Output (unique count, illustrative; sort puts the values in alphabetical order):

   2 Designer
   5 Developer
   3 Engineer
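
The access-log pipeline described above can be run end to end on a few generated lines. This is a sketch, assuming the common web server log format; the access_log helper and its four entries are hypothetical. The final awk step only trims the padding that uniq -c puts in front of each count.

```shell
access_log() {
  printf '%s\n' \
    '192.0.2.1 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326' \
    '192.0.2.2 - - [10/Oct/2024:13:55:40 +0000] "GET /about.html HTTP/1.1" 200 1024' \
    '192.0.2.1 - - [10/Oct/2024:13:56:02 +0000] "POST /login HTTP/1.1" 302 0' \
    '192.0.2.3 - - [10/Oct/2024:13:56:10 +0000] "GET /index.html HTTP/1.1" 200 2326'
}

# Keep GET requests, isolate the URL (field 7), count each URL, most frequent first.
top_url=$(access_log | grep 'GET' | cut -d' ' -f7 | sort | uniq -c | sort -rn \
  | head -n 1 | awk '{print $1, $2}')

printf '%s\n' "$top_url"
```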

🔹 Common cut Options

Mastering cut's core options (-d, -f, -c, -b, -s, and --output-delimiter) enables precise control over text extraction. Understanding the context for each (e.g., -f for delimited data, -c for fixed-width) allows you to choose the right tool for the job. Combining these options efficiently solves a wide range of data manipulation tasks, from quick terminal one-liners to robust shell scripts that process logs, configuration files, and data exports on a regular basis.

Key Options:

  • -d: Specify field delimiter
  • -f: Select fields by number
  • -c: Select characters by position
  • -b: Select bytes by position
  • -s: Suppress lines without delimiter
  • --output-delimiter: Change output delimiter (GNU extension)
  • --complement: Invert selection (select all except specified; GNU extension)

# Complement selection (all except field 2)
cut -d',' -f2 --complement data.csv

# Multiple options combined
cut -d':' -f1,3 -s --output-delimiter=',' /etc/passwd
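
Since --complement is not part of every cut implementation, the same "everything except field 2" selection can be written with an ordinary field list. A minimal sketch on a made-up row; the GNU command's errors are silenced so the script still runs where the option is unsupported.

```shell
row='A,B,C,D'

# GNU cut: drop field 2 by inverting the selection (empty result if unsupported).
gnu_complement=$(printf '%s\n' "$row" | cut -d',' -f2 --complement 2>/dev/null)

# Same selection without --complement: field 1, then field 3 to the end.
portable_complement=$(printf '%s\n' "$row" | cut -d',' -f1,3-)

printf '%s\n' "$portable_complement"
```

The explicit list gets unwieldy when several scattered fields must be dropped, which is where --complement is more convenient.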
