Tuesday, 12-Dec-2000 10:39:55 EST

Globbing on the Command Line
Written By: Devin Carraway

Now, let's suppose that in your long-running correspondence with Mom, you'd sometimes saved your files beginning with "letter_to_mom" and other times with "letter_to_Mom" -- the difference being that capital M. In UNIX, filenames are case-sensitive; that means that 'a' and 'A' aren't the same. If you specify the files on the command line as 'letter_to_mom*', you'll miss the ones that have the capital M. In such a case, the ? and [] meta-characters are useful:

mv letter_to_?om* mom/

... the ? means that any single character can be there -- m and M included -- while the *, as before, means "anything or nothing." Thus, the capitalization problem is quickly avoided. An even neater way to do the same thing would be this:

mv letter_to_[mM]om* mom/

... in this case, [mM] means "either 'm' or 'M'." It's called a "character class," or more simply a "character list" (actually, you can call it whatever you like). When you use [], you can put as many characters in it as you'd like -- even most punctuation, or a space if you escape it with a backslash -- and it will match any one of the characters you've listed. This is useful because it gives you more exact control: in the ? example earlier, ? matched the M or m in Mom, but it would also have matched "dom" and "tom", whereas [Mm] matches only "Mom" and "mom", and that's it.

You can also use "ranges" with [] -- that's where you say "any character between these two characters." The easy example is [0-9], which means "any single digit." You might also see [a-z], which means "any lowercase letter." Any number of ranges can be included in a [], even alongside other characters you've put in there: [a-zA-Z] is another common one, meaning "any upper- or lower-case letter"; [a-z13579] means "any lowercase letter, or the digit 1, 3, 5, 7 or 9."

You'll probably find [] most useful when you want to extract a fairly strictly limited set of files out of a long list. Returning to our correspondence example, you might want to get letters to Mom #3 through #6 -- so, you'd use this:

mv letter_to_mom-[3-6].txt mom/

... in this example, the [3-6] means "3, 4, 5 or 6" -- thus letter_to_mom-4.txt will be moved to the mom directory, but letter_to_mom-2.txt will not.
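
A safe way to check patterns like these before handing them to mv is to try them with echo in a scratch directory. The sketch below assumes a POSIX-style shell such as bash; the file names and the variable names are made up for illustration and are not part of the original examples.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical file names, modeled on the examples above.
touch letter_to_mom-1.txt letter_to_Mom-2.txt letter_to_dom-3.txt \
      letter_to_mom-4.txt letter_to_mom-7.txt

# echo prints whatever the shell expanded each pattern to,
# so it previews a glob without moving or copying anything.
any_char=$(echo letter_to_?om*)          # ? accepts any one character
either_m=$(echo letter_to_[mM]om*)       # only m or M in that position
in_range=$(echo letter_to_mom-[3-6].txt) # only the digits 3, 4, 5 or 6

echo "?om pattern:    $any_char"
echo "[mM]om pattern: $either_m"
echo "[3-6] pattern:  $in_range"
```

With these five files, the ? pattern lists all of them (letter_to_dom-3.txt included), the [mM] pattern leaves the "dom" file out, and the [3-6] pattern lists only letter_to_mom-4.txt. The order in which the shell lists matches can vary with your locale settings.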

Note that we've used - to indicate a range. While the '-' character generally acts like any other, inside a [] it's a meta-character. If you'd like a literal '-' in a character class, you can "escape" it with a backslash (\) character (in many places on the command line and elsewhere, \ means "take the next character literally"), or simply put the '-' first or last inside the brackets. So if you had two files, "myfile-1" and "myfile_2", you could match them both with myfile[_\-]* -- the _ is a normal character in the character class, the \ indicates that the - should be treated as one also (that is to say, it isn't being used to indicate a range), and the trailing * picks up the rest of each name.
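
To see the literal '-' at work, here is a small sketch along the same lines, again assuming a POSIX-style shell such as bash. The third file, myfileX3, and the variable names are invented for illustration; the file is there only to show something that should not match.

```shell
# Throwaway directory with the two files from the text plus one decoy.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch myfile-1 myfile_2 myfileX3

# A backslash makes the - an ordinary member of the class.
escaped=$(echo myfile[_\-]*)

# Putting - last in the class has the same effect without the backslash.
hyphen_last=$(echo myfile[_-]*)

echo "escaped:     $escaped"
echo "hyphen last: $hyphen_last"
```

Both patterns pick up myfile-1 and myfile_2 and skip myfileX3, since X is neither an underscore nor a hyphen.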

One other trick about character classes -- they can be "negated" if the first character is a caret (^). The caret changes the meaning of the class from "any one of these characters" to "any one character other than these." So, while [0-9] means "any digit," [^0-9] means "any character but a digit." (bash also accepts ! in place of the caret, as in [!0-9]; that is the spelling the POSIX standard defines.)
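
Negation can be tried out the same way. This sketch uses the ! spelling, which POSIX shells are required to understand; in bash, [^0-9] behaves the same. The report-* file names and the variable names are hypothetical.

```shell
# Throwaway directory with two numbered and two lettered files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch report-1.txt report-2.txt report-a.txt report-b.txt

# A plain class: one digit after the hyphen.
digits=$(echo report-[0-9].txt)

# The negated class: one character after the hyphen that is NOT a digit.
not_digits=$(echo report-[!0-9].txt)

echo "digit:     $digits"
echo "not digit: $not_digits"
```

The first pattern lists report-1.txt and report-2.txt; the negated one lists report-a.txt and report-b.txt. Note that a negated class still has to match exactly one character, so it never matches "nothing at all."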

The problem with character classes is that each one refers to only a single character, and it's a pain to type several of them in a row, especially if they're long and complicated. Often you just want to refer to one of a few different words, and character classes are unwieldy for that. That's where the {}, or "alternative list," comes in. {} contains a list of words, separated by commas, and the shell tries each of those words in turn at that spot in the name. Once again, let's say you have your letters to Mom and Dad as letter_to_mom-NNN.txt and letter_to_dad-NNN.txt. Also suppose you have a friend named Dominique, and her letters are named letter_to_dom-NNN.txt. Now, if you were to use character classes as above, you might try:

cp letter_to_[md][oa][md]* parents/ # note: this is wrong

... this is somewhat hard to read -- it means "any file whose name begins with 'letter_to_', and then has either an 'm' or a 'd', then either an 'o' or an 'a', then either an 'm' or a 'd', then any number of characters." It's complicated, and beyond that, it doesn't work: while it will indeed copy all of the letter_to_mom* and letter_to_dad* files, the character classes allow letter_to_dom* to match also (d, o and m from each class, just as d, a and d and m, o and m worked). This is an excellent place to use {} -- just specify {mom,dad} instead of the messy character classes, and you have:

cp letter_to_{mom,dad}* parents/

... which is much more readable, and also has a simpler meaning -- "any file beginning with 'letter_to_', then having either 'mom' or 'dad', then any number of characters." You're also allowed to use the globbing characters inside the alternative lists -- for example, suppose you did want to get your letters to Dominique also:

cp letter_to_{[md]om,dad}* correspondence/

... This is the same as the previous example, except that instead of matching 'mom', the shell will match either an 'm' or a 'd' followed by 'om'. Likewise, you're allowed to use * in alternative lists. Suppose that once in a while you'd saved a letter to Dominique as letter_to_dominique-NNN.txt instead of letter_to_dom-NNN.txt. (Most people when they create a lot of files wind up trying to use some sort of consistent scheme for naming them -- and most of those people find themselves breaking their own scheme sometimes; shells help by making such inconsistencies easier to cope with). If you wanted to collect your first few letters to Mom, Dad and Dom, you could use:

cp letter_to_{mom,dom*,dad}-[1-3].txt correspondence/

... The 'dom*' in the alternative list means "dom followed by zero or more characters." You could also have written {mom,dom,dominique,dad}, but this was terser.
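
The difference between the character-class attempt and the alternative list is easy to see with echo. The sketch below needs to be run in bash (or another shell that supports {}; a minimal POSIX sh may not). The file names and variable names are made up for illustration.

```shell
# Run this in bash: {} alternative lists are not part of plain POSIX sh.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch letter_to_mom-1.txt letter_to_dad-1.txt letter_to_dom-1.txt \
      letter_to_dominique-2.txt

# The character-class attempt also catches the letter to Dominique.
class_match=$(echo letter_to_[md][oa][md]-*)

# The alternative list names exactly the two words we want.
brace_match=$(echo letter_to_{mom,dad}-*)

# Wildcards may appear inside the list: dom* covers dom and dominique.
with_star=$(echo letter_to_{mom,dom*,dad}-[1-3].txt)

# Each alternative is written out even when no such file exists.
no_such_file=$(echo letter_to_{mom,dad}-9.txt)

echo "classes:      $class_match"
echo "braces:       $brace_match"
echo "with dom*:    $with_star"
echo "missing file: $no_such_file"
```

The last line is worth a look: bash prints letter_to_mom-9.txt and letter_to_dad-9.txt even though neither exists, because it writes out each alternative first and only then looks for matching files. In practice this means cp or mv may complain about a missing file if one of the alternatives in your list has no match.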

One final quickie for holding out this long -- ~. Almost anytime you move around a UNIX system, you'll be moving into or out of your home directory (which is usually something like /home/yourname/). When used at the beginning of a pathname, ~ means "my home directory." Suppose you had a directory "myfiles" in your home directory, and wanted to move some files there from /tmp. If you had already cd'ed to /tmp, you could then move the files with a command such as:

mv letter_to_* ~/myfiles/

... in this example, ~ is replaced by the shell with the full path to your home directory (e.g. /home/yourname/). Under the bash and tcsh shells, the ~ character can be followed by a username, in which case it will refer to the home directory of that user, rather than your own home directory. For example, if you were copying some files from a floppy disk as root to your normal user home directory, you might use:

cp /mnt/floppy/* ~bob

... when the shell gets this command line, it replaces ~bob with the path to bob's home directory (e.g. /home/bob/).
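
You can watch the shell make these substitutions with echo. This is a small sketch that assumes your shell has the HOME variable set (it normally is) and that the system has a user named root, which is true of nearly every UNIX system; the variable names are just for illustration.

```shell
# An unquoted ~ at the start of a word becomes your home directory.
my_home=$(echo ~)
my_files=$(echo ~/myfiles)

# Inside quotes the ~ is left alone, so the shell passes it on literally.
quoted=$(echo "~/myfiles")

# ~username looks up that user's home directory (root is used here
# because, unlike bob, it exists on almost every system).
root_home=$(echo ~root)

echo "home:        $my_home"
echo "myfiles:     $my_files"
echo "quoted:      $quoted"
echo "root's home: $root_home"
```

The quoted case is a common stumbling block: if a command complains that it can't find a directory literally named "~", check whether the ~ ended up inside quotes.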

Conclusion

This is pretty much all there is to using shell wildcards -- taken together, wildcards set a good balance between simplicity and power in identifying files precisely and quickly according to fairly straightforward rules.

These guidelines are summarized in the "Pathname Expansion" section of the bash(1) manpage, and the "Filename substitution" section of the tcsh(1) manpage.

Wildcards are a simplified form of what are called "regular expressions" -- often abbreviated "regexps," these are extremely powerful devices for text matching (more powerful than are usually required for moving files around). Regular expressions are very important in some areas of Linux, UNIX and the Internet, especially if you find yourself learning UNIX or CGI programming. For more on regexps, see http://www.tuxedo.org/~esr/jargon/html/entry/regexp.html, the sed(1) and Perl documentation, or O'Reilly's book, Mastering Regular Expressions.

Copyright 2000 internet.com Corp. All Rights Reserved. Legal Notices Privacy Policy
