Format (Common Lisp)


Format is a function in Common Lisp that produces formatted text from a format string similar to the printf format string. It provides more functionality than printf, allowing the user to output numbers in various formats (for instance hexadecimal, binary, octal, Roman numerals, and English words), apply certain format specifiers only under certain conditions, iterate over data structures, output data in tabular form, and even recurse, calling format internally to handle data structures that include their own preferred formatting strings. This functionality originates in MIT's Lisp Machine Lisp, where it was based on Multics ioa_[citation needed].

Specification

The format function is specified by the syntax[1]

format destination controlString &rest formatArguments

Directives in the control string are interpolated using the format arguments, and the resulting character sequence is written to the destination.

Destination

The destination may be a stream, a dynamic string (one with a fill pointer), T, or NIL. NIL is a special case: format creates a new string, writes the output into it, and returns it. T refers to the standard output, which is usually the console. Streams in Common Lisp include, among others, string output streams and file streams. Because it can write to such a variety of destinations, format unifies capabilities that some other programming languages distribute among distinct commands, such as C's printf for console output, sprintf for string formatting, and fprintf for file writing.

The multitude of destination types is exemplified in the following:

;; Prints "1 + 2 = 3" to the standard output and returns ``NIL''.
(format T "1 + 2 = ~d" 3)

;; Creates and returns a new string containing "1 + 2 = 3".
(format NIL "1 + 2 = ~d" 3)

;; Creates and returns a new string containing "1 + 2 = 3".
(with-output-to-string (output)
  (format output "1 + 2 = ~d" 3))

;; Appends the string "1 + 2 = 3" to the file "outputFile.txt", creating the file if necessary.
(with-open-file (output "outputFile.txt"
                  :direction         :output
                  :if-does-not-exist :create
                  :if-exists         :append)
  (format output "1 + 2 = ~d" 3))

;; Appends to the dynamic string the string "1 + 2 = 3".
(let ((output-string (make-array 0
                       :element-type 'character
                       :adjustable T
                       :fill-pointer 0)))
  (declare (type string output-string))
  (format output-string "1 + 2 = ~d" 3)
  (the string output-string))

Control string and format arguments

The control string may contain literal characters as well as the meta character ~ (tilde), which demarcates format directives. While literals in the input are echoed verbatim, directives produce a special output, often consuming one or more format arguments.
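As a minimal sketch of this distinction, the following call mixes literal text, an escaped tilde, and one directive that consumes an argument. The symbol used as the argument is arbitrary and chosen only for illustration.

```lisp
;; Literal characters are copied verbatim, ~~ emits a single tilde,
;; and ~a consumes one format argument.
(format nil "Use ~~a to print ~a." 'symbols)
;; ⇒ "Use ~a to print SYMBOLS."
```

The symbol appears in uppercase because the example assumes the default reader and printer settings.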

Directives

A format directive, introduced by a ~, is followed by zero or more prefix parameters, zero or more modifiers, and the directive type. A directive definition, hence, must conform to the pattern

~[prefixParameters][modifiers]directiveType

The directive type is always specified by a single character, which is case-insensitive if it is a letter. The data processed by a format directive, if any, is called its format argument and consists of zero or more objects of a compatible type. Whether and how many arguments are accepted depends on the directive and on the modifiers applied to it. The directive ~%, for instance, consumes no format arguments, whereas ~d expects exactly one integer to print, and ~@{, a directive influenced by the at-sign modifier, processes all remaining arguments.
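The three consumption patterns just described can be combined in one control string. This is only an illustrative sketch; the argument values are arbitrary.

```lisp
;; ~d consumes one argument (3), ~% consumes none, and ~@{ ... ~}
;; loops over every argument that is still unconsumed (A, B, C).
(format nil "Count: ~d~%~@{[~a]~}" 3 'a 'b 'c)
;; Returns a string holding "Count: 3", a newline, and "[A][B][C]".
```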

The following directive, ~b, expects one number from the format arguments and writes its binary (radix 2) representation to the destination, here the standard output.

;; Prints "101".
(format T "~b" 5)

Directives that permit them may additionally be given prefix parameters.

Prefix parameters

Prefix parameters supply a directive with additional information to operate upon, much as arguments do for a function. Prefix parameters are always optional and, if provided, must be located between the introducing ~ and either the modifiers or, if none are present, the directive type. The values are separated by commas, and no whitespace is permitted on either side of a comma. The number and type of these parameters depend on the directive and on the influence of any modifiers.
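A short sketch of the prefix parameter syntax, using the ~d directive, whose parameters are the minimum column count, the padding character, the group separator, and the group size. The numbers are arbitrary sample values.

```lisp
;; A character parameter is written with a leading single quote: '* is
;; the padding character, 10 the minimum number of columns.
(format nil "~10,'*d" 42)
;; ⇒ "********42"

;; Parameters may be omitted while keeping their commas. Here only the
;; group separator ('.) and the group size (3) are given; the : modifier
;; switches grouping on.
(format nil "~,,'.,3:d" 1234567)
;; ⇒ "1.234.567"
```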

Two particular characters may be used as prefix parameter values with a special interpretation. v (or V) acts as a placeholder for an integer or character that is consumed from the format arguments and put in its place. The second special character, #, is replaced by the number of format arguments that remain to be consumed. Both v and # allow the prefix parameter list to be determined by dynamic content.

The v parameter value plays a role equivalent to that of a variable in general programming. As a simple scenario, to left-pad the binary representation of the integer 5 with zeros to at least eight digits, the literal solution is as follows:

;; Prints "00000101".
(format T "~8,'0b" 5)

The first prefix parameter, which controls the output width, may instead be given as v, delegating the parameter value to the next format argument, in this case 8.

;; Prints "00000101".
(format T "~v,'0b" 8 5)

Solutions of this kind are particularly useful when parts of the prefix parameter list are to be supplied by variables or function arguments instead of literals, as in the following piece of code:

(let ((number-of-digits 8))
  (declare (type (integer 0 *) number-of-digits))
  (format T "~v,'0b" number-of-digits 5))

Even more fitting in those situations involving external input, a function argument may be passed into the format directive:

(defun print-as-hexadecimal (number-to-format number-of-digits)
  "Prints the NUMBER-TO-FORMAT in the hexadecimal system (radix 16),
   left-padded with zeros to at least NUMBER-OF-DIGITS."
  (declare (type number        number-to-format))
  (declare (type (integer 0 *) number-of-digits))
  (format T "~v,'0x" number-of-digits number-to-format))

;; Prints "0C".
(print-as-hexadecimal 12 2)

# as a prefix parameter counts the format arguments not yet processed by preceding directives, without consuming anything from this list. The utility of such a dynamically inserted value is largely restricted to conditional processing. As the argument count can only be an integer greater than or equal to zero, it can serve as an index into the clauses of a conditional ~[ directive.

The interplay of the special # prefix parameter value with the conditional selection directive ~[ is illustrated in the following example. The conditional states three indexed clauses, selected by the indices 0, 1, and 2, plus a default clause, introduced by ~:;, that is chosen for any larger index. The number of format arguments serves as the clause index; to this end, # is inserted into the conditional directive, which permits the index to be a prefix parameter. # computes the number of remaining format arguments and supplies it as the selection index. The arguments, not consumed by this act, are then available to and processed by the selected clause's directives.

;; Prints "none selected".
(format T "~#[none selected~;one selected: ~a~;two selected: ~a and ~a~:;more selected: ~@{~a~^, ~}~]")

;; Prints "one selected: BUNNY".
(format T "~#[none selected~;one selected: ~a~;two selected: ~a and ~a~:;more selected: ~@{~a~^, ~}~]" 'bunny)

;; Prints "two selected: BUNNY and PIGEON".
(format T "~#[none selected~;one selected: ~a~;two selected: ~a and ~a~:;more selected: ~@{~a~^, ~}~]" 'bunny 'pigeon)

;; Prints "more selected: BUNNY, PIGEON, MOUSE".
(format T "~#[none selected~;one selected: ~a~;two selected: ~a and ~a~:;more selected: ~@{~a~^, ~}~]" 'bunny 'pigeon 'mouse)

Modifiers

Modifiers act as flags that influence the behavior of a directive. Which modifiers are admitted and how strongly they alter the behavior depends, as with prefix parameters, upon the directive. In some cases, a modifier changes the syntax of a directive to such a degree that certain prefix parameters become invalid; this power especially distinguishes modifiers from most parameters. The two valid modifier characters are @ (at-sign) and : (colon), possibly in combination as either :@ or @:.

The following example illustrates a rather mild case of influence exerted upon a directive by the @ modifier: it merely ensures that the binary representation of a formatted number is always preceded by the number's sign:

;; Prints "+101".
(format T "~@b" 5)
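To show how the two modifiers combine, here is a brief sketch using the ~d directive, where : switches on digit grouping and @ forces the sign. The number is an arbitrary sample value.

```lisp
(format nil "~d"   1234567)   ; ⇒ "1234567"
(format nil "~:d"  1234567)   ; ⇒ "1,234,567"
(format nil "~@d"  1234567)   ; ⇒ "+1234567"
;; Both modifiers together; their order does not matter.
(format nil "~:@d" 1234567)   ; ⇒ "+1,234,567"
(format nil "~@:d" 1234567)   ; ⇒ "+1,234,567"
```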

Format directives

A list of the format directives, including their complete syntax and modifier effects, is given below.[2]

Each entry names the directive, describes it, gives its full form including prefix parameters, and states the effect of each modifier combination (none, :, @, and :@).

~~: Prints the literal ~ character.
  • Full form: ~repetitions~
  • Without modifiers: Prints the ~ character repetitions times.
  • With :, @, or :@: Invalid.

~c, ~C: Prints a single character.
  • Full form: ~c
  • Without modifiers: Prints the format argument character without prefix.
  • With :: Spells out non-printing characters.
  • With @: Prepends the readable #\ prefix.
  • With :@: Spells out non-printing characters and mentions shift keys.

~%: Prints an unconditional newline.
  • Full form: ~repetitions%
  • Without modifiers: Prints repetitions line breaks.
  • With :, @, or :@: Invalid.

~&: Prints a conditional newline, or fresh-line.
  • Full form: ~repetitions&
  • Without modifiers: If the destination is not at the beginning of a fresh line, prints repetitions line breaks; otherwise, prints repetitions - 1 line breaks.
  • With :, @, or :@: Invalid.

~|: Prints a page separator.
  • Full form: ~repetitions|
  • Without modifiers: Prints the page separator repetitions times.
  • With :, @, or :@: Invalid.

~r, ~R: Either prints the number in the specified base (radix) or spells it out.
  • Full form: ~radix,minColumns,padChar,commaChar,commaIntervalR
  • With prefix parameters, prints the argument in the given radix (base).
  • Without prefix parameters, the format argument is spelt out, either in English words or in Roman numerals, as follows.
  • Without modifiers: Prints the argument as an English cardinal number.
  • With :: Prints the argument as an English ordinal number.
  • With @: Prints the argument in Roman numerals using the usual Roman format (e.g., 4 = IV).
  • With :@: Prints the argument in Roman numerals using the old Roman format (e.g., 4 = IIII).

~d, ~D: Prints the argument in decimal radix (base = 10).
  • Full form: ~minColumns,padChar,commaChar,commaIntervald
  • Without modifiers: Prints as decimal number without + (plus) sign or group separator.
  • With :: Uses commas as group separator.
  • With @: Prepends the sign.
  • With :@: Prepends the sign and uses commas as group separator.

~b, ~B: Prints the argument in binary radix (base = 2).
  • Full form: ~minColumns,padChar,commaChar,commaIntervalb
  • Without modifiers: Prints as binary number without + (plus) sign or group separator.
  • With :: Uses commas as group separator.
  • With @: Prepends the sign.
  • With :@: Prepends the sign and uses commas as group separator.

~o, ~O: Prints the argument in octal radix (base = 8).
  • Full form: ~minColumns,padChar,commaChar,commaIntervalo
  • Without modifiers: Prints as octal number without + (plus) sign or group separator.
  • With :: Uses commas as group separator.
  • With @: Prepends the sign.
  • With :@: Prepends the sign and uses commas as group separator.

~x, ~X: Prints the argument in hexadecimal radix (base = 16).
  • Full form: ~minColumns,padChar,commaChar,commaIntervalx
  • Without modifiers: Prints as hexadecimal number without + (plus) sign or group separator.
  • With :: Uses commas as group separator.
  • With @: Prepends the sign.
  • With :@: Prepends the sign and uses commas as group separator.

~f, ~F: Prints the argument as a float in fixed-point notation.
  • Full form: ~width,numDecimalPlaces,scaleFactor,overflowChar,padCharf
  • Without modifiers: Prints as fixed-point without + (plus) sign.
  • With :: Invalid.
  • With @: Prepends the sign.
  • With :@: Invalid.

~e, ~E: Prints the argument as a float in exponential notation.
  • Full form: ~width,numDecimalPlaces,numDigits,scaleFactor,overflowChar,padChar,exponentChare
  • Without modifiers: Prints as exponential without + (plus) sign.
  • With :: Invalid.
  • With @: Prepends the sign.
  • With :@: Invalid.

~g, ~G: Prints the argument as a float in either fixed-point or exponential notation, choosing automatically.
  • Full form: ~width,numDecimalPlaces,numDigits,scaleFactor,overflowChar,padChar,exponentCharg
  • Without modifiers: Prints as fixed-point or exponential without + (plus) sign.
  • With :: Invalid.
  • With @: Prepends the sign.
  • With :@: Invalid.

~$: Prints the argument according to monetary conventions.
  • Full form: ~numDecimalPlaces,minWholeDigits,minTotalWidth,padChar$
  • Without modifiers: Prints in monetary conventions without + (plus) sign; any sign follows the padding.
  • With :: Places the sign before any padding characters.
  • With @: Always prints the sign.
  • With :@: Always prints the sign and places it before any padding characters.

~a, ~A: Prints the argument in a human-friendly manner.
  • Full form: ~minColumns,colInc,minPad,padChara
  • Without modifiers: Prints human-friendly output without justification.
  • With :: Prints NIL as empty list () instead of NIL.
  • With @: Pads on the left instead of the right side.
  • With :@: Pads on the left and prints NIL as ().

~s, ~S: Prints the argument in a manner compatible with the read function.
  • Full form: ~minColumns,colInc,minPad,padChars
  • Without modifiers: Prints read-compatible output without justification.
  • With :: Prints NIL as empty list () instead of NIL.
  • With @: Pads on the left instead of the right side.
  • With :@: Pads on the left and prints NIL as ().

~w, ~W: Prints the argument in accordance with the printer control variables.
  • Full form: ~w
  • Without modifiers: Prints in accordance with the currently set control variables.
  • With :: Enables pretty printing.
  • With @: Ignores print level and length constraints.
  • With :@: Ignores print level and length constraints and enables pretty printing.

~_: Prints a line break according to the pretty printer rules.
  • Full form: ~_
  • Without modifiers: Linear style; breaks the line if the enclosing section does not fit on a single line.
  • With :: Fill style; breaks the line if the following section does not fit on the current line or the preceding section was not printed on a single line.
  • With @: Miser style; behaves like the linear style, but only while the compact (miser) mode is in effect.
  • With :@: Always inserts a line break.

~<: Justifies the output.
  • Full form: ~minColumns,colInc,minPad,padChar<expression~>
  • Without modifiers: Spreads the text segments across the field, the first flush left and the last flush right; a single segment is right-justified.
  • With :: Adds padding before the first segment (a single segment is right-justified).
  • With @: Adds padding after the last segment (a single segment is left-justified).
  • With :@: Adds padding on both sides (a single segment is centered).

~i, ~I: Indents a logical block.
  • Full form: ~i
  • Without modifiers: Starts indenting from the first character.
  • With :: Indents starting from the current output position.
  • With @ or :@: Invalid.

~/: Dispatches the formatting operation to a user-defined function. The function must accept at least four parameters:
  1. The stream or adjustable string to print to,
  2. The format argument to process,
  3. A Boolean value which is T if the : modifier was supplied, and
  4. A Boolean value which is T if the @ modifier was supplied.
Additionally, zero or more further parameters may be specified if the function shall also permit prefix parameters.
  • Full form: ~prefixParams/function/
  • Modifiers: Depends on the function implementation.

~t, ~T: Moves the output cursor to a given column or by a horizontal distance.
  • Full form: ~columnNumber,columnIncrementt
  • Without modifiers: Moves to the specified column.
  • With :: Moves to the specified column, measured relative to the start of the enclosing section.
  • With @: Moves the cursor relative to the current position.
  • With :@: Moves the cursor relative to the current position, oriented at the enclosing section.

~*: Navigates across the format arguments.
  • Full form: ~numberOfArgs*
  • Without modifiers: Skips the next numberOfArgs format arguments.
  • With :: Moves numberOfArgs arguments back.
  • With @: Moves to the argument at index numberOfArgs.
  • With :@: Invalid.

~[: Prints an expression based upon a condition. These expressions, or clauses, are separated by the ~; directive, and a default clause can be stated by using ~:; as its leading separator. The number of permitted clauses depends upon the variety of this directive as stated by its modifier or modifiers. The whole conditional portion must be terminated with a ~].
  • Full form: ~[clause1~;...~;clauseN~:;defaultClause~]
  • Alternative form, valid only without modifiers: ~selectionIndex[clause1~;...~;clauseN~:;defaultClause~]. It relocates the index of the clause to select, selectionIndex, from the format arguments to a prefix parameter. This syntax is especially useful in conjunction with the special prefix parameter character #, which makes the number of format arguments left to process the selection index. A directive of this kind allows for a very concise modeling of multiple selections.
  • Without modifiers: The format argument must be a zero-based integer index, its value being that of the clause to select and print.
  • With :: Selects the first clause if the format argument is NIL, otherwise the second one.
  • With @: Processes the single clause only if the format argument is true (non-NIL), leaving that argument available to the clause; otherwise skips the clause.
  • With :@: Invalid.

~{: Iterates over one or more format arguments and prints these. The iterative portion must be closed with a ~} directive. If the directive ~^ is found inside of the enclosed portion, any content following it is only processed if the current element is not the last one in the processed list. If the prefix parameter numberOfRepetitions is specified, its value defines the maximum number of iterations; otherwise all elements are consumed.
  • Full form: ~numberOfRepetitions{expression~}
  • Without modifiers: A single format argument is expected to be a list; its elements are consumed in order by the enclosed directives.
  • With :: Expects the format argument to be a list of lists, consuming its sublists.
  • With @: Regards all remaining format arguments as a list and consumes these.
  • With :@: Regards all remaining format arguments as a list of sublists, consuming these sublists.

~?: Replaces the directive with the next format argument, which is expected to be a control string, and uses the subsequent format arguments for the inserted portion.
  • Full form: ~?
  • Without modifiers: Expects the format argument after the control string to be a list whose elements serve as the arguments of the inserted control string.
  • With :: Invalid.
  • With @: Expects separate format arguments instead of a list of these for the inserted portion, as one would specify in the usual manner.
  • With :@: Invalid.

~(: Modifies the case of the enclosed string.
  • Full form: ~(expression~)
  • Without modifiers: Converts all characters to lower case.
  • With :: Capitalizes all words.
  • With @: Capitalizes the first word only and converts the rest to lower case.
  • With :@: Converts all characters to upper case.

~p, ~P: Prints a singular or plural suffix depending upon the numeric format argument.
  • Full form: ~p
  • Without modifiers: Prints nothing if the argument equals 1, otherwise prints s.
  • With :: Moves back to the last consumed format argument, printing nothing if it was 1, otherwise printing s.
  • With @: Prints a y if the argument equals 1, otherwise prints ies.
  • With :@: Moves back to the last consumed format argument, printing y if it was 1, otherwise printing ies.

~^: Used in an iteration directive ~{...~} to terminate processing of the enclosed content if no further format arguments follow.
  • Full form: ~p1,p2,p3^
  • If no prefix parameter is specified, the directive ceases if zero arguments remain to process.
  • If one prefix parameter p1 is specified, the directive ceases if p1 resolves to zero.
  • If two prefix parameters p1 and p2 are specified, the directive ceases if p1 equals p2.
  • If three prefix parameters p1, p2 and p3 are specified, the directive ceases if p1 ≤ p2 ≤ p3 holds.
  • Without modifiers: Operates as described.
  • With :: Inside a ~:{...~} iteration, terminates the entire iteration instead of only the current step.
  • With @ or :@: Invalid.

~Newline: Skips or retains line breaks and adjacent whitespace in a multi-line control string.
  • Full form: ~Newline
  • Without modifiers: Skips the immediately following line break and adjacent whitespace.
  • With :: Skips the immediately following line break, but retains adjacent whitespace.
  • With @: Retains the immediately following line break, but skips adjacent whitespace.
  • With :@: Invalid.
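A few of the directives listed above can be tried at the REPL. The following calls are a small, non-exhaustive sketch; the argument values are arbitrary and the results assume the default printer settings.

```lisp
;; ~r without prefix parameters spells numbers out.
(format nil "~r"   42)     ; ⇒ "forty-two"
(format nil "~:r"  42)     ; ⇒ "forty-second"
(format nil "~@r"  1994)   ; ⇒ "MCMXCIV"
(format nil "~:@r" 4)      ; ⇒ "IIII"

;; ~:p backs up to the number just printed and pluralizes accordingly.
(format nil "~d file~:p" 1)   ; ⇒ "1 file"
(format nil "~d file~:p" 3)   ; ⇒ "3 files"

;; ~:[ selects between two clauses on a Boolean argument.
(format nil "~:[off~;on~]" t)   ; ⇒ "on"

;; ~? splices in another control string together with its argument list.
(format nil "~?: done" "~a of ~a" '(3 7))   ; ⇒ "3 of 7: done"

;; ~:* moves one argument back, so the same argument is printed twice.
(format nil "~a ~:*~a" 'echo)   ; ⇒ "ECHO ECHO"

;; ~( ... ~) converts the enclosed output to lower case.
(format nil "~(~a~)" "MIXED Case")   ; ⇒ "mixed case"
```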

Example

An example of a C printf call is the following:

 printf("Color %s, number1 %d, number2 %05d, hex %x, float %5.2f, unsigned value %u.\n",
             "red", 123456, 89, 255, 3.14, 250);

Using Common Lisp, this is equivalent to:

 (format t "Color ~A, number1 ~D, number2 ~5,'0D, hex ~X, float ~5,2F, unsigned value ~D.~%"
             "red" 123456 89 255 3.14 250)
 ;; prints: Color red, number1 123456, number2 00089, hex FF, float  3.14, unsigned value 250.

Another example is printing every element of a list delimited with commas, which can be done using the ~{, ~^ and ~} directives:[3]

 (let ((groceries '(eggs bread butter carrots)))
   (format t "~{~A~^, ~}.~%" groceries)         ; Prints in uppercase
   (format t "~:(~{~A~^, ~}~).~%" groceries))   ; Capitalizes output
 ;; prints: EGGS, BREAD, BUTTER, CARROTS.
 ;; prints: Eggs, Bread, Butter, Carrots.

Note that not only is the list of values iterated over directly by format, but the commas are correctly printed between items, not after them. A yet more complex example would be printing out a list using customary English phrasing:

(let ((template "The lucky winners were:~#[ none~; ~S~; ~S and ~S~
           ~:;~@{~#[~; and~] ~S~^,~}~]."))
  (format nil template)
  ;; ⇒   "The lucky winners were: none."
  (format nil template 'foo)
  ;; ⇒   "The lucky winners were: FOO."
  (format nil template 'foo 'bar)
  ;; ⇒   "The lucky winners were: FOO and BAR."
  (format nil template 'foo 'bar 'baz)
  ;; ⇒   "The lucky winners were: FOO, BAR, and BAZ."
  (format nil template 'foo 'bar 'baz 'quux)
  ;; ⇒   "The lucky winners were: FOO, BAR, BAZ, and QUUX."
  )
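Tabular output can be sketched in a similar spirit by combining the ~:{ iteration over a list of sublists with minimum column widths. The variable *inventory* and its contents are hypothetical sample data, and fixed column widths via ~10a and ~3d are just one possible approach.

```lisp
;; Hypothetical sample data: each sublist is one table row.
(defparameter *inventory* '(("eggs" 12) ("bread" 2) ("butter" 1)))

;; ~:{ ... ~} takes one sublist per iteration. ~10a pads the name on the
;; right to 10 columns, ~3d pads the number on the left to 3 columns,
;; and ~% ends the row.
(format t "~:{~10a~3d~%~}" *inventory*)
;; Prints three rows, with the numbers right-aligned in column 13.
```

Passing nil instead of t as the destination returns the same table as a string, which is convenient for testing.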

The ability to define a new directive through ~/functionName/ provides the means for customization. A function name given without a package prefix is looked up in the COMMON-LISP-USER package. The next example implements a function which prints an input string unchanged, in lowercase, in uppercase, or reversed, and which also permits configuring the number of repetitions.

(defun mydirective (destination
                    format-argument
                    colon-modifier-supplied-p
                    at-sign-modifier-supplied-p
                    &optional (repetitions 1))
  "This function represents a callback suitable as a directive in a
   ``format'' invocation, expecting a string as its FORMAT-ARGUMENT
   to print REPETITIONS number of times to the DESTINATION.
   ---
   The COLON-MODIFIER-SUPPLIED-P and AT-SIGN-MODIFIER-SUPPLIED-P flags
   expect a generalized Boolean each, being the representatives of the
   ``:'' and ``@'' modifiers respectively. Their influence is defined
   as follows:
     - If no modifier is set, the FORMAT-ARGUMENT is printed without
       further modifications.
     - If the colon modifier is set, but not the at-sign modifier, the
       FORMAT-ARGUMENT is converted into lowercase before printing.
     - If the at-modifier is set, but not the colon-modifier, the
       FORMAT-ARGUMENT is converted into uppercase before printing.
     - If both modifiers are set, the FORMAT-ARGUMENT is reversed before
       printing.
   ---
   The number of times the FORMAT-ARGUMENT string is to be printed is
   determined by the prefix parameter REPETITIONS, which must be a
   non-negative integer number and defaults to one."
  (declare (type (or null (eql T) stream string) destination))
  (declare (type T                               format-argument))
  (declare (type T                               colon-modifier-supplied-p))
  (declare (type T                               at-sign-modifier-supplied-p))
  (declare (type (integer 0 *)                   repetitions))
  
  (let ((string-to-print format-argument))
    (declare (type string string-to-print))
    
    ;; Adjust the STRING-TO-PRINT based upon the modifiers.
    (cond
      ((and colon-modifier-supplied-p at-sign-modifier-supplied-p)
        (setf string-to-print (reverse string-to-print)))
      (colon-modifier-supplied-p
        (setf string-to-print (string-downcase string-to-print)))
      (at-sign-modifier-supplied-p
        (setf string-to-print (string-upcase string-to-print)))
      (T
        NIL))
    
    (loop repeat repetitions do
      (format destination "~a" string-to-print))))

;; Print "Hello" a single time.
(format T "~/mydirective/" "Hello")

;; Print "Hello" three times.
(format T "~3/mydirective/" "Hello")

;; Print a lowercase "Hello" (= "hello") three times.
(format T "~3:/mydirective/" "Hello")

;; Print an uppercase "Hello" (= "HELLO") three times.
(format T "~3@/mydirective/" "Hello")

;; Print a reversed "Hello" (= "olleH") three times.
(format T "~3:@/mydirective/" "Hello")
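As a smaller, self-contained sketch of the same mechanism, the following defines a hypothetical directive function named shout (not part of any library) and calls it both with and without a package prefix. It assumes the function is defined in the COMMON-LISP-USER package, which is where format looks up unqualified names.

```lisp
(in-package :common-lisp-user)

;; Hypothetical directive function: prints ARGUMENT in uppercase,
;; followed by COUNT exclamation marks. COUNT is the optional first
;; prefix parameter and defaults to 1.
(defun shout (stream argument colon-p at-sign-p &optional (count 1))
  (declare (ignore colon-p at-sign-p))
  (write-string (string-upcase (princ-to-string argument)) stream)
  (loop repeat (or count 1)
        do (write-char #\! stream)))

;; The first call site uses the unqualified name, the second one the
;; package-qualified name together with a prefix parameter.
(format nil "~/shout/ and ~3/cl-user::shout/" "hey" "wow")
;; ⇒ "HEY! and WOW!!!"
```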

Whilst format is somewhat infamous for its tendency to become opaque and hard to read, it provides a remarkably concise yet powerful syntax for a specialised and common need.[3]

A Common Lisp FORMAT summary table is available.[4]

References

  1. "CLHS: Function FORMAT". www.lispworks.com.
  2. "CLHS: Section 22.3". www.lispworks.com.
  3. "18. A Few FORMAT Recipes", from Practical Common Lisp.
  4. Common Lisp FORMAT summary table.

Books