Friday, October 26, 2018

Musings on the Formatting of C and C++ Code


Musings on the Formatting of C and C++ Code

October 4, 2013; October 26, 2018

I was watching the video The Care and Feeding of C++'s Dragons. I found it to be very interesting. I especially found the CLang based program that reformats code to be promising. However, there were some things about the tool that I find disturbing. It seems to use some formatting patterns that I think are huge mistakes.

To illustrate, some of the sample code looks like the following.

int functionName(int a, int b, int c,
                 int d, int e, int f,
                 int g, int h, int i) const {
  /* Code omitted. */
  if (test()) {
    /* Code omitted. */
  }
}

There are two problems that I see with formatting code in this way. The first is that on the line where the list of variables is continued, there is a great deal of what I see as entirely unnecessary white space. In my opinion it should be as follows.

int functionName(int a, int b, int c,
  int d, int e, int f,
  int g, int h, int i) const {

Or, my personal preference would be the following.

int functionName(
  int a, int b, int c,
  int d, int e, int f,
  int g, int h, int i) const {

My reasoning is quite simple.

Examine what happens when the name of the function is changed to something longer, as follows, and then it is reformated. The code becomes as follows.

int aMuchLongerFunctionName(int a, int b, int c
                            int d, int e, int f,
                            int g, int h, int i) const {

If you do a diff of the original version of the code and the new code, it will show that three lines have changed instead of showing that only one line has changed. I am aware of the fact that many diff tools have an ignore white space option and that if this option is set it will only show one line as having changed. However, not all diff tools have that option. In addition, some companies and organizations have a strict policy that every line of code that changes must be changed for a purpose associated with a given task. And these companies do not accept changes that are only related to reformatting as being associated with a given task. In essence, if the changed line does not affect the functionality of the code, it is not an acceptable change. And these organizations will deliberately not turn on the ignore white space option and will turn a deaf ear to the argument that they should just enable that option (can you tell that I am speaking from experience?).

If you are in such a situation and you change the name of a function that initially is formatted with the parameter list aligned with the end of the function name and you adhere to a strict "every changed line must have a functional purpose" rule you will inevitably end up with the following.

int aMuchLongerFunctionName(int a, int b, int c
                 int d, int e, int f,
                 int g, int h, int i) const {

This just looks wrong!

There is also another reason for not aligning the parameters with the end of the function name. Consider the following.

int aVeryLongFunctionNameThatGoesBeyondTheEdgeOfTheScreen(int a, int b, int c,
                                                          int d, int e, int f,
                                                          int g, int h, int i) const {

In this case, you cannot see the parameters at all without wasting your time scrolling across the screen.

If you always begin the parameter list on its own line that is indented one level deep as follows, you would not ever have to scroll the screen just to see the parameter list.

int aVeryLongFunctionNameThatGoesBeyondTheEdgeOfTheScreen(
  int a, int b, int c,
  int d, int e, int f,
  int g, int h, int i) const {

The second issue I have is with putting braces at the end of the line. In C and C++ braces are optional for some statements such as if. And lets face the facts, C and C++ is often inconsistently indented. Putting braces at the end leads to more work in the following scenario.

if (aVeryLongTestThatGoesPastTheEdgeOfTheScreen()) {
  /*
       Thousands
 of
         inconsistently indented
    lines of
 code /*
}

Putting the brace at the end of the if line forces someone who is reading the code to hit the end key to determine how much code will only be called if the condition is true, one line or thousands of lines, when they might not give a damned about seeing the end of the test because the beginning of it is enough to tell them whether or not the condition can be true in the scenario they are working on. What if the person knows that in the case they are working on, the function "aVeryLongTestThatGoesPastTheEdgeOfTheScreen" will return false. They really do not need to see the end of the test in this case except to find out how many lines they need to skip past in order to get to code that is relevant to their task. Why not just put the brace on a line by itself and make everyone's life so much easier? Why force someone to hit the end key just so they can answer the following question. How many lines do I need to skip to get to code that is relevant to my task?

Until C and C++ do as they did in the Go language and make the braces mandatory, I believe braces should never be at the end of the line.

In Go, where the braces are mandatory, it does not matter as much to me because I know that if the code compiles the brace is there and I do not care if I cannot see it. But in C and C++, I do not want you to force me to find and hit the end key just so I can tell where your if statement ends. Of course, that does not mean that I think that the decision to put the brace at the end was a good one for Go. I often use a paren matching feature to skip past the irrelevant code in the scenario I have described. That requires that the caret be on the opening paren. In Go I need to hit the end key anyway just to get the caret on the brace so I can use the paren matching feature to skip past code I do not care about. Why? If the brace were on a line by itself, I do not need to locate and hit the end key. I can just arrow down to the brace line and use the paren matching feature.

I know that these arguments are only relevant to the placement of braces for conditional statements and that they are not relevant to the placement of braces at the beginning of functions. However, I still feel that the opening brace of a function should be on a line by itself for the sake of consistency.

I cannot believe other people have adopted code formatting patterns that to me are so obviously mistakes. Is there something I am missing that makes my arguments invalid?

And before you say, "just hit the end key, it is not that hard," consider the fact that some people are hunt an peck typists. For some people, any extra key they need to hunt for unnecessarily is an aggravation that interrupts their work flow. I am certain that for some people who are touch typists, hitting one additional key is no big deal, but for hunt and peck typists, it can be.

I for one am a hunt and peck typist despite the fact that I began using computers in 1985 and for me finding the end key just to find out how many lines of code will only get called in the condition is true case is enough of a disruption that I find it to be extremely annoying.

When I first wrote this article back in 2013, I was not aware of the many options for customizing the behavior of clang-format.

Fortunately, you can easily customize the behavior of clang-format. There are numerous Clang-Format Style Options available. For example, you can instruct clang-format to "always break after an open bracket, if the parameters don’t fit on a single line" by setting AlignAfterOpenBracket to AlwaysBreak.

When you use clang-format to format a file it will search for a ".clang-format file located in one of the parent directories of the source file" and load the various formatting options from there. Clang-format also has a number of predefined coding styles to choose from: LLVM, Google, Chromium, Mozilla, and WebKit. You can use the -fallback-style and -style command line arguments to specify the coding style you wish to use. For more information see the ClangFormat manual.
I have begun using clang-format for my own open source projects, and I am pleased with the results. If you are interested, you can take a look at my SnKOpen .clang-format file.

There are various websites that will help you to generate the perfect .clang-format file for your project. One of the best is the clang-format configurator. The Unformat project, which generates a .clang-format file from example codebase may also be worth investigating.

No comments:

Post a Comment