Friday, February 03, 2023

Using Clap to Parse Rust Program Arguments

 

The native Rust argument-parsing capability is pretty limited, so we are now turning to the third-party crate, "Clap" (which stands for Command-Line Argument Parser). To do that, let's create a new project:
$ cd ~/projects/RUST
$ cargo new parse_clap
$ cd parse_clap

You should pretty much know what the "src/main.rs" file looks like in a new Cargo-created Rust project. Before we begin working with that file, we need to let Cargo know we're going to use the "Clap" crate. Since this is a new project, the "Cargo.toml" file has no dependencies listed. We'll need to add a dependency to the "Cargo.toml" file for Clap.

Clap is found at "crates.io" (the Rust community's package registry). If you browse to that site and search for "clap", you'll find (at least near the time of this writing) both a version 3 and a version 4. We want the most recent version, which at the time of this writing is 4.0.29. If you click on it to get more details, you'll see in the right-hand column a line that needs to be added to the "[dependencies]" section of your "Cargo.toml" file, in order to tell Cargo to use the Clap crate. With older versions of Cargo, this had to be added to the file manually, but with more recent versions (1.62 and later), you can just run:

$ cargo add clap@4.0.29

If you don't want a specific version, but rather prefer the newest one available, you can instead run:

$ cargo add clap

Be aware, this may take a few minutes. If you do this, and then examine your "Cargo.toml" file again, you'll discover that the needed line has automatically been added to that file.

But, unfortunately, this instruction does not tell you that you need to add more to that command, in order to get all that we need. You can learn this by clicking on the "docs.rs" link at the "crates.io" website. The command you really need (you can run it even if you ran the previous command) is:

$ cargo add clap --features derive

The second run goes much faster than the first, because most of the work has already been done in our first attempt to add this to "Cargo.toml".

Now look again at your "Cargo.toml" file; if it looks pretty much like below, we should be good to go.

[package]
name = "parse_clap"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
clap = { version = "4.0.29", features = ["derive"] }

Now we're ready to edit your "src/main.rs" file to be identical to the following:

use clap::Parser;

#[derive(Parser)]
struct ArgsType {
  /// Are you happy or sad?
  #[arg(short, long)]
  mood: String,
}

fn main() {
  let switches: ArgsType = ArgsType::parse();

  if switches.mood == "happy" {
    println!("Whoo-hoo! I am {}! {}! {}! {}!", switches.mood, switches.mood, switches.mood, switches.mood);
  }

  if switches.mood == "sad" {
    println!("Boo-hoo! I am {}!", switches.mood);
  }
  
  println!("Hello, world!");
} // end of main()

Now compile and run this with the indicated program arguments, like so:

$ cargo run -- --mood happy

Try providing different switches, in different order, in different numbers. Try the "-V" option, as well as the "-v" non-option. Try the "-h" option.

You can see that this Clap crate is already pretty useful, in that it provides some help screens when the arguments are not what are expected. It doesn't handle every wrong argument (as the program is currently written), but you can see that there's some potential here.

Let's try to understand this program, and then explore a bit of that potential.

The first line, "use clap::Parser;", brings Clap's "Parser" trait into scope. Both the "#[derive(Parser)]" line and the "ArgsType::parse()" call depend on it. Just know it's needed.

The next line, "#[derive(Parser)]", tells the compiler to generate the argument-parsing code from the struct we build next. Clap can be configured using the Builder Application Programming Interface (API) or the Derive API (or a mix, as I understand it). As a general rule, unless you need to get deeper under the hood of Clap, you'll probably want to use the Derive API. The FAQ at https://docs.rs/clap/4.1.4/clap/_faq/index.html#when-should-i-use-the-builder-vs-derive-apis says this:

When should I use the builder vs derive APIs?

Our default answer is to use the Derive API:

  • Easier to read, write, and modify
  • Easier to keep the argument declaration and reading of argument in sync
  • Easier to reuse, e.g. clap-verbosity-flag

The Builder API is a lower-level API that someone might want to use for

So in other words, do it the way we're doing here, with a struct, not the way other tutorials might show you, without a struct. At least until you want/need to dive deeper.

The "struct" section provides a defining template for a new type of variable. This section does not declare an actual variable (we'll declare that later), but only a new type of variable. This new type of variable is based on a struct format. (A struct is a custom-made type that groups other variables together under one name.)

Any variables declared to be of this new type are defined by this "struct" section, which defines what arguments are allowed to be given as the program's command-line arguments, and what the internal variable names are that will hold those arguments for use in the program. The Clap Derive API uses this struct to define and build the new type of variable. We could call this new type anything we wanted, like "ProgInputs" or "Options" or "OptionsType", etc. (Rust convention is to capitalize type names.) We're calling it "ArgsType". Currently this new type of variable defines one internal variable, named "mood", which is designated to hold String data.

The section that defines this inner "mood" variable is introduced by a line with three forward slashes (///). Whereas two slashes are the beginning of a "comment", which is ignored by the compiler but helps the programmer to keep notes about the code, a three-slash line functions as both a comment and a documentation line, which Rust's documentation tools can pick up and which Clap uses to create help text. If you run "cargo run -- --help", you can see that text, "Are you happy or sad?", in the output.

The line beginning with a hash mark tells Clap how to handle this program argument: whether it can be entered in a long form (--mood), in a short form (-m), whether it is required, whether it has a default value, etc.

We can add additional internal variables (and therefore additional program input possibilities) by adding more "#[arg..." sections to the struct design. For example, in addition to the user's mood, perhaps we'd like to know the person's name and age:

struct ArgsType {
  /// Are you happy or sad?
  #[arg(short, long)]
  mood: String,
    
  /// What is your name?
  #[arg(short, long)]
  name: String,
  
  /// What is your age?
  #[arg(short, long, default_value_t = 16)]
  age: u8,
} 

Notice that the "age" variable has a default value (which is a bit silly, but this is just an example). Because of this, Clap won't require the user to enter that option, but it will the other two. You can force it to be required like this:

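#[arg(short, long, default_value_t = 3, required = true)]

but that kind of defeats the purpose of having a default, and Clap itself may reject an argument that is both required and defaulted when it checks the definition in a debug build.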

Although technically an age entered on the command line in a command such as "cargo run -- --name Kent --age 35" starts out as a "String" (everything entered on the command line starts out as a "String"), by the time it gets to our "age" variable, Clap will have converted it from a "String" to a "u8" (which is an unsigned (i.e., non-negative) integer in the range of 0 to 255).
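To get a feel for what that conversion involves, here is a minimal sketch that uses only the standard library, no Clap. Clap's own conversion is more elaborate, so treat this as an imitation, not as what Clap literally runs. The helper name "text_to_age" is invented for this example.

```rust
// Hypothetical helper: turn command-line text into a u8, or None if the
// text is not a whole number in the range 0 to 255.
fn text_to_age(text: &str) -> Option<u8> {
  // str::parse returns an Err for non-numbers and for out-of-range values.
  text.parse::<u8>().ok()
}

fn main() {
  for text in ["35", "253", "256", "-1", "old"] {
    match text_to_age(text) {
      Some(age) => println!("{} -> accepted as {}", text, age),
      None => println!("{} -> rejected", text),
    }
  }
}
```

This is also why an age of 253 gets through but 256 would not: a "u8" simply has no room for it.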

Note again that we have not yet declared a variable of this new type; we have only defined a new type of variable. We actually declare a variable in the main() function. Note also that since the struct is defined outside of the main() function, this definition of a new type of variable is "visible" (or "is in scope") to all parts of the program within this "main.rs" file. If we should create a new function later on in this same file, say, a function called "part_two()", that function will be able to access this "ArgsType" definition; had we put this definition within the main() function, it would only be visible to the main() function itself, but not to the "part_two()" function.
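Here is a stripped-down sketch of that scoping point, with Clap left out so it can be run on its own. The struct "Visitor" is a stand-in for "ArgsType", and "part_two()" is the hypothetical second function mentioned above.

```rust
// A new type, defined at the top level of the file, outside any function.
struct Visitor {
  name: String,
  age: u8,
}

// Hypothetical second function: it can use `Visitor` because the struct
// was defined at file level rather than inside main().
fn part_two(visitor: &Visitor) -> String {
  format!("{} is {} years old", visitor.name, visitor.age)
}

fn main() {
  // Only here is an actual variable of the new type declared.
  let guest = Visitor { name: String::from("Kent"), age: 35 };
  println!("{}", part_two(&guest));
}
```

If you move the "struct Visitor" block inside main(), the compiler should complain that "part_two()" cannot find the type.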

Now let's look at the main() function. The "let switches: ArgsType = ArgsType::parse();" line actually declares our variable. The name of the variable is "switches", and the type of the variable is not String, i32, u8, or any other built-in type, but "ArgsType", the type we just invented. If this line seems complicated to you, take out the ": ArgsType", to make the line be just "let switches = ArgsType::parse();", which may be less daunting to look at and therefore less daunting to understand. It's basically just calling a "function" named "parse" that is "located" in the "ArgsType" struct we just built (not exactly, but close enough), and assigning the results of that "function" to the variable "switches".

The variable "switches" now holds three variables within it, one for each field of the struct, which we can access as "switches.mood", "switches.name", and "switches.age". Here are some mods to our program, including a fourth field, a boolean flag to specify if the user is human or not, which defaults to "no" (false):

use clap::Parser;

#[derive(Parser)]
struct ArgsType {
  /// Are you happy or sad?
  #[arg(short, long)]
  mood: String,
  
  /// What is your name?
  #[arg(short, long, value_name = "What yo momma called you...")]
  name: String,
  
  /// What is your age?
  #[arg(short, long, default_value_t = 16)]
  age: u8,

  /// Are you a human?
  #[arg(short = 'H', long, default_value_t = false)]  // 'h' would have conflicted with "help".
  human: bool,
}

fn main() {
  let switches: ArgsType = ArgsType::parse();

  if switches.human {
    println!("Hi, {}! You seem very {} to be {} years old, but that's understandable, since you are a human.",
      switches.name,
      switches.mood,
      switches.age
    );
  } else {
    println!("Hi, {}! You seem very {} to be {} years old, but that's understandable, since you are not a human.",
      switches.name,
      switches.mood,
      switches.age
    );
  }

  if switches.mood == "happy" {
    println!("Whoo-hoo! I am {}! {}! {}! {}!", switches.mood, switches.mood, switches.mood, switches.mood);
  }

  if switches.mood == "sad" {
    println!("Boo-hoo! I am {}!", switches.mood);
  }
  
} // end of main()

Running this program results in:

$ cargo run -- --mood happy --name Kent --age 253
Compiling parse_clap v0.1.0 (/home/westk/projects/RUST/parse_clap)
Finished dev [unoptimized + debuginfo] target(s) in 0.56s
Running `target/debug/parse_clap --mood happy --name Kent --age 253`
Hi, Kent! You seem very happy to be 253 years old, but that's understandable, since you are not a human.
Whoo-hoo! I am happy! happy! happy! happy!
$
$ cargo run -- --name=Kent --age 253 -m happy --human -h
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/parse_clap --name=Kent --age 253 -m happy --human -h`
Usage: parse_clap [OPTIONS] --mood <MOOD> --name <What yo momma called you...>

Options:
  -m, --mood <MOOD>                         Are you happy or sad?
  -n, --name <What yo momma called you...>  What is your name?
  -a, --age <AGE>                           What is your age? [default: 16]
  -H, --human                               Are you a human?
  -h, --help                                Print help information
  -V, --version                             Print version information
$

Note also that various formats can be used for entering the arguments:

--name Kent
-nKent
-n=Kent
--name=Kent

But --nameKent won't work.

And that's pretty much it. We've got our feet wet with parsing arguments in Rust using the Clap crate.