- Preamble
- A Bit of Theory
- A Word on Internal Field Separators
- Time for Tuning | Use IFS to Your Advantage
Preamble
Are you a developer, sysadmin, or DevOps engineer? Then you probably spend lots of time on your terminal. I bet you use the GNU Bash shell or something highly similar — take Zsh, for example.
It's essential to know your tools well to work more efficiently. As a Linux geek, I use Bash to automate my daily routine even when it's not for my job, but I got tired of constantly searching the Web to learn how to achieve certain things in Bash. So I recently read through the whole manual. It's relatively small, a little more than 190 pages in total.
This article is going to start a series on mastering GNU Bash.
A Bit of Theory
Throwing a bunch of commands at the reader without somewhat detailed explanations is as helpful as teaching a five-year-old how to swim by throwing them in the sea. That's why tutorials should start at least with some general knowledge IMHO.
Shell Expansions
Bash is very flexible when it comes to writing commands. Thanks to the different types of expansions Bash performs on the commands it receives from a user, you can often write something much simpler and shorter than you otherwise could.
In essence, shell expansions enable users to conveniently manipulate data on the command line, such as regular text or variables, using special syntactic constructs. Think of them as shortcuts.
Currently, there are seven types of shell expansions supported in Bash:
- brace expansion
- tilde expansion
- parameter and variable expansion
- command substitution
- arithmetic expansion
- word splitting
- filename expansion
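To give a taste of a few items from the list, here is a minimal sketch. The variable names and the /tmp/example/dir path are made up for illustration, and the snippet assumes it runs in Bash rather than a plain POSIX sh.

```shell
# Brace expansion: generates three words from one pattern.
brace=$(echo file_{a,b,c}.txt)

# Parameter expansion: substitutes the value of a variable.
name='world'
greeting="hello, ${name}"

# Command substitution: replaces $(...) with the command's output.
last_part=$(basename /tmp/example/dir)

# Arithmetic expansion: evaluates an integer expression.
sum=$(( 2 + 3 * 4 ))

echo "$brace"       # file_a.txt file_b.txt file_c.txt
echo "$greeting"    # hello, world
echo "$last_part"   # dir
echo "$sum"         # 14
```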
We won't cover all of them since that would take too much time. Besides, there's no point in duplicating the manual.
Here's a short example of a parameter expansion:
[deathroll@fedora ~]$ EXAMPLE_TEXT='sOmE tExT Owo'; echo ${EXAMPLE_TEXT,,}
Parameter expansion — alongside command substitution and arithmetic expansion — is denoted by the "$" symbol.
The command above produces the text below because the shell performs a parameter expansion that converts the alphabetic characters of the given variable to lowercase.
some text owo
You could also provide a pattern for operating only on specific characters. The pattern is matched against one character at a time, though, so whole words and character sequences can't be used.
[deathroll@fedora ~]$ EXAMPLE_TEXT='sOmE tExT Owo'; echo ${EXAMPLE_TEXT,,O}
somE tExT owo
[deathroll@fedora ~]$ echo ${EXAMPLE_TEXT,,[OT]}
somE tExt owo
[deathroll@fedora ~]$
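The lowercase operators have uppercase siblings that follow the same rules. A small sketch, assuming Bash 4 or newer (the case-modification operators are not available in Bash 3.2); the variable names are only for illustration.

```shell
text='sOmE tExT Owo'
name='BASH'

upper_all=${text^^}        # every character to uppercase
upper_first=${text^}       # only the first character to uppercase
upper_some=${text^^[st]}   # only characters matching the pattern [st]
lower_first=${name,}       # only the first character to lowercase

echo "$upper_all"     # SOME TEXT OWO
echo "$upper_first"   # SOmE tExT Owo
echo "$upper_some"    # SOmE TExT Owo
echo "$lower_first"   # bASH
```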
Builtins
You may already know this, but not all commands you type on the command line are actual programs found somewhere on a filesystem. Bash has built-in commands called "builtins." And you use them all the time. Probably the most commonly used are echo and cd.
My personal favorite is the declare builtin. It enables you to declare a variable of a specific type (e.g., array or integer).
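Here is a short sketch of what declare can do, plus a way to check whether a command is a builtin. The variable names are invented for the example, and the associative array assumes Bash 4 or newer.

```shell
declare -i counter=5                       # integer attribute
counter+=10                                # arithmetic addition, not string concatenation
declare -a fruits=(apple banana)           # indexed array
declare -A ports=([http]=80 [https]=443)   # associative array

echo "$counter"          # 15
echo "${#fruits[@]}"     # 2
echo "${ports[https]}"   # 443

# type -t reports how Bash resolves a command name.
echo_kind=$(type -t echo)
cd_kind=$(type -t cd)
echo "$echo_kind $cd_kind"   # builtin builtin
```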
Fields
Since the topic is about internal field separators, it's also a good idea to touch a bit on what fields are.
Here's how the manual defines field:
A unit of text that is the result of one of the shell expansions. After expansion, when executing a command, the resulting fields are used as the command name and arguments.
Pretty self-explanatory.
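If you'd like to see fields in action, the sketch below counts how many arguments a command receives. count_fields is a hypothetical helper written for this example, and the default IFS value is assumed.

```shell
# Prints the number of arguments (fields) the function was called with.
count_fields() { echo "$#"; }

greeting='hello big world'

unquoted=$(count_fields $greeting)     # the expansion is split into three fields
quoted=$(count_fields "$greeting")     # quoting keeps it as a single field

echo "$unquoted"   # 3
echo "$quoted"     # 1
```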
A Word on Internal Field Separators
Internal field separators tell Bash how to split a field into words, which are just sequences of characters treated as single units.
The separators are listed in the IFS variable, which contains three characters by default: space, tab, and newline.
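You can check the current value yourself. Since the characters are invisible, this sketch uses the %q format of the printf builtin to print them in a quoted, readable form. It assumes a session where IFS hasn't been modified.

```shell
# %q prints the value in a form that could be reused as shell input.
printf '%q\n' "$IFS"    # $' \t\n'

# Three characters: space, tab, newline.
ifs_length=${#IFS}
echo "$ifs_length"      # 3
```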
— But what does it mean for me as a user?
— Things will get clear with a couple of simple examples.
Suppose you want to get a list of directories and perform some actions on their names. If you search for directories with the find command, the output will contain multiple lines (i.e., the entries are separated by newline characters).
[deathroll@fedora ~]$ find . -mindepth 1 -maxdepth 1 -type d -not -name '.*'
./Pictures
./bin
./Videos
./Public
./Music
./Templates
./git
./nvim
./Downloads
./projects
./NAS
./Desktop
./Documents
./snap
./learn
[deathroll@fedora ~]$
When you substitute the command above so that its output becomes the value of some variable or a part of another command, the shell splits this field into words, dividing it wherever it finds the newline character or any other character listed in the IFS variable.
[deathroll@fedora ~]$ declare -a EXAMPLE_ARR=(`find . -mindepth 1 -maxdepth 1 -type d -not -name '.*'`)
[deathroll@fedora ~]$ echo ${EXAMPLE_ARR[@]}
./Pictures ./bin ./Videos ./Public ./Music ./Templates ./git ./nvim ./Downloads ./projects ./NAS ./Desktop ./Documents ./snap ./learn
[deathroll@fedora ~]$ echo ${EXAMPLE_ARR[0]}
./Pictures
[deathroll@fedora ~]$
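To try this without touching your home directory, here is a reproducible sketch. The sandbox directory and its contents are made up for the example, and the default IFS value is assumed.

```shell
# Hypothetical sandbox with three directories.
sandbox=$(mktemp -d)
mkdir "$sandbox/bin" "$sandbox/git" "$sandbox/Music"

# The unquoted command substitution is split into words, one per output line here.
dirs=( $(cd "$sandbox" && find . -mindepth 1 -maxdepth 1 -type d) )

dir_count=${#dirs[@]}
echo "$dir_count"    # 3
declare -p dirs      # shows every index together with its value

rm -r "$sandbox"
```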
OK, that seems to be pretty reasonable. Now I want you to see the commands below and think about the data stored in a variable. What do the array elements look like? More specifically, the first element. Does it look like "2023-02-26 22-10-40.mp4"?
[deathroll@fedora Videos]$ ls *' '*.mp4
'2023-02-26 22-10-40.mp4' '2023-03-14 00-22-19.mp4' '2023-03-31 21-12-39.mp4' '2023-04-08 13-45-36.mp4'
'2023-02-26 22-16-21.mp4' '2023-03-24 11-54-46.mp4' '2023-04-04 10-05-41.mp4' '2023-04-11 11-53-24.mp4'
'2023-03-12 18-59-36.mp4' '2023-03-24 11-54-57.mp4' '2023-04-04 11-37-05.mp4' '2023-04-11 12-00-41.mp4'
'2023-03-14 00-10-02.mp4' '2023-03-24 11-55-14.mp4' '2023-04-04 12-06-14.mp4' '2023-04-11 12-00-51.mp4'
'2023-03-14 00-11-13.mp4' '2023-03-30 16-16-21.mp4' '2023-04-04 12-19-03.mp4' '2023-04-11 13-55-54.mp4'
'2023-03-14 00-17-25.mp4' '2023-03-31 21-07-22.mp4' '2023-04-04 12-19-23.mp4' '2023-04-20 19-41-44.mp4'
[deathroll@fedora Videos]$ EXAMPLE_ARR=(`ls *' '*.mp4`)
[deathroll@fedora Videos]$
Let's list the array elements, each on a separate line, to make the output more readable, and pick just the first ten lines to keep it short. The moment of truth... 😰
[deathroll@fedora Videos]$ head -10 <(for F in ${EXAMPLE_ARR[@]}; do printf '%b\n' $F; done)
2023-02-26
22-10-40.mp4
2023-02-26
22-16-21.mp4
2023-03-12
18-59-36.mp4
2023-03-14
00-10-02.mp4
2023-03-14
00-11-13.mp4
[deathroll@fedora Videos]$ # Oh no! My filenames are broken! 😱
[deathroll@fedora Videos]$
Congrats if your guess was right! 🎉
As you may have already noticed, the filenames contain the space character. Now, recall what is written about IFS at the start of this section of the article.
Bash splits filenames into words where it finds any of the characters listed in the IFS variable. Sometimes that's crucial, and you want to change the shell's behavior to split only at the characters you specify.
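As a side note, when the list comes from a glob rather than from a command's output, splitting can be avoided altogether, because filename expansion happens after word splitting. A minimal sketch with a made-up sandbox and two sample filenames:

```shell
# Hypothetical sandbox with two files whose names contain a space.
sandbox=$(mktemp -d)
touch "$sandbox/2023-02-26 22-10-40.mp4" "$sandbox/2023-02-26 22-16-21.mp4"

# Each glob match becomes exactly one array element, spaces included.
videos=( "$sandbox"/*' '*.mp4 )

# Quoting "${videos[@]}" keeps every element intact while looping.
for f in "${videos[@]}"; do
    printf '%s\n' "${f##*/}"
done

video_count=${#videos[@]}
first_video=${videos[0]##*/}

rm -r "$sandbox"
```

This doesn't help when the data comes from another program's output, which is where adjusting IFS earns its keep.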
Time for Tuning | Use IFS to Your Advantage
Since IFS is a variable, it can be altered by a user. All you need to do is assign a new value to it. But make sure to use appropriate quoting when you intend to use escape sequences: ANSI-C quoting ($'...') turns \n into a real newline, whereas plain double quotes ("...") keep the backslash and the letter n as two literal characters.
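A quick way to check which quoting form actually yields a newline is to compare the lengths of the resulting strings. The variable names below are just for illustration.

```shell
ansi_c=$'\n'    # ANSI-C quoting: one character, a real newline
double="\n"     # double quotes: two characters, a backslash and the letter n

echo "${#ansi_c}"   # 1
echo "${#double}"   # 2

# %q shows each value in a reusable, escaped form.
printf '%q\n' "$ansi_c" "$double"
```

So for a newline separator, IFS=$'\n' is the form to reach for.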
Changing the IFS value for the rest of the shell process execution is not recommended, since it can cause more trouble than it's worth.
What I prefer is to either change the IFS value, execute the commands where custom field separators are needed, and then unset IFS, or execute the commands, including the IFS variable assignment, in a subshell to avoid mutating the current shell environment.
Don't worry. The shell can't be broken by unsetting the IFS variable. When IFS is unset, Bash behaves as if it held the default value, which is identical to the variable's initial value.
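The sketch below contrasts a custom IFS, an empty IFS, and an unset IFS. count_words is a hypothetical helper and the sample string is made up. Note that an empty IFS is not the same as an unset one: with an empty value, no splitting happens at all.

```shell
# Prints the number of words the unquoted expansion was split into.
count_words() { echo "$#"; }

line='a b c:d'

IFS=':'
custom=$(count_words $line)        # split at ":" only: "a b c" and "d"

IFS=''
empty=$(count_words $line)         # empty IFS: no word splitting

unset IFS
after_unset=$(count_words $line)   # default splitting: "a", "b", "c:d"

echo "$custom $empty $after_unset"   # 2 1 3
```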
Example 1 — Unsetting
[deathroll@fedora Videos]$ IFS=$'\n'
[deathroll@fedora Videos]$ head -10 <(for F in `ls *' '*.mp4`; do printf '%b\n' $F; done)
2023-02-26 22-10-40.mp4
2023-02-26 22-16-21.mp4
2023-03-12 18-59-36.mp4
2023-03-14 00-10-02.mp4
2023-03-14 00-11-13.mp4
2023-03-14 00-17-25.mp4
2023-03-14 00-22-19.mp4
2023-03-24 11-54-46.mp4
2023-03-24 11-54-57.mp4
2023-03-24 11-55-14.mp4
[deathroll@fedora Videos]$ unset IFS
[deathroll@fedora Videos]$
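If forgetting the unset worries you, one alternative (not covered above, so take it as a suggestion) is to change IFS inside a function with the local builtin. The change disappears when the function returns. count_lines and the sample listing are hypothetical.

```shell
count_lines() {
    local IFS=$'\n'            # affects only this function call
    local -a parts
    parts=( $1 )               # unquoted on purpose so that splitting happens
    line_count=${#parts[@]}    # result stored in a global variable
}

listing=$'2023-02-26 22-10-40.mp4\n2023-02-26 22-16-21.mp4'

ifs_before=$IFS
count_lines "$listing"
ifs_after=$IFS

echo "$line_count"   # 2
```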
Example 2 — Subshell
[deathroll@fedora Videos]$ (IFS=$'\n'; head -10 <(for F in `ls *' '*.mp4`; do printf '%b\n' $F; done))
2023-02-26 22-10-40.mp4
2023-02-26 22-16-21.mp4
2023-03-12 18-59-36.mp4
2023-03-14 00-10-02.mp4
2023-03-14 00-11-13.mp4
2023-03-14 00-17-25.mp4
2023-03-14 00-22-19.mp4
2023-03-24 11-54-46.mp4
2023-03-24 11-54-57.mp4
2023-03-24 11-55-14.mp4
[deathroll@fedora Videos]$ # Since the command was executed in a subshell, the current environment is left untouched.
[deathroll@fedora Videos]$ head -10 <(for F in `ls *' '*.mp4`; do printf '%b\n' $F; done)
2023-02-26
22-10-40.mp4
2023-02-26
22-16-21.mp4
2023-03-12
18-59-36.mp4
2023-03-14
00-10-02.mp4
2023-03-14
00-11-13.mp4
[deathroll@fedora Videos]$
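Here is a self-contained sketch of the same idea that doesn't depend on any files, plus a related trick: prefixing a single command with the assignment so that IFS changes only for that command. The csv string and variable names are invented for the example.

```shell
csv='alpha,beta gamma,delta'
ifs_before=$IFS

# 1) Subshell: command substitution runs in one, so the assignment stays inside.
in_subshell=$( IFS=','; set -- $csv; echo "$#" )

# 2) Prefix assignment: IFS is changed only for this one read command.
IFS=',' read -r -a columns <<< "$csv"

ifs_after=$IFS

echo "$in_subshell"     # 3
echo "${#columns[@]}"   # 3
echo "${columns[1]}"    # beta gamma
```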