C Coding Conventions
Conventions should be established early in a project. These conventions are necessary to maintain consistency throughout the project. Adopting conventions increases productivity and simplifies project maintenance.
There are many ways to code a program in C (or any other language). The style you use is just as good as any other as long as you strive to attain the following goals:
- Portability
- Consistency
- Neatness
- Easy maintenance
- Easy understanding
- Simplicity
Whichever style you use, I would emphasize that it should be adopted consistently throughout all your projects. I would further insist that a single style be adopted by all team members in a large project. To this end, I would recommend that a C programming style document be formalized for your organization. Adopting a common coding style reduces code maintenance headaches and costs. Adopting a common style will avoid code rewrites. This section describes the C programming style I use. The main emphasis on the programming style presented here is to make the source code easy to follow and maintain.
I don't like to limit the width of my C source code to 80 characters just because today's monitors only allow you to display 80 characters wide. My limitation is actually how many characters can be printed on an 8.5" by 11" page using an 8 point, fixed width font. With an 8 point font, you can accommodate up to 132 characters and have enough room on the left of the page for holes for insertion in a three ring binder. Allowing 132 characters per line prevents having to interleave source code with comments.
Header
The header of a C source file looks as shown below. Your company name and address can be on the first few lines followed by a title describing the contents of the file. A copyright notice is included to give warning of the proprietary nature of the software.
/*
************************************************************************************************
*                                         Company Name
*                                           Address
*
*                       (c) Copyright 19xx, Company Name, City, State
*                                      All Rights Reserved
*
*
* Filename      :
* Programmer(s) :
* Description   :
************************************************************************************************
*/
/*$PAGE*/
The name of the file is supplied followed by the name of the programmer(s). The name of the programmer who created the file is given first. The last item in the header is a description of the contents of the file.
I like to dictate where page breaks occur in my listings when my code doesn't fit on a printed page. In fact, I like to find a logical spot, such as after a comment block, if the comment block and the actual code don't both fit on one page. For historical reasons, I insert the special comment /*$PAGE*/ followed by a form feed character (0x0C). I like to use /*$PAGE*/ because it tells the reader where the page break will occur.
Include Files
The header files needed for your project immediately follow the revision history section. You may either list only the header files required for the module or combine the header files into a single header file, as I do in a file called INCLUDES.H. I like to use an INCLUDES.H header file because it saves you from having to remember which header file goes with which source file, especially when new modules are added. The only inconvenience is that each file takes longer to compile.
/*
************************************************************************************************
*                                         INCLUDE FILES
************************************************************************************************
*/
#include "INCLUDES.H"
/*$PAGE*/
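A master header of this kind might look like the minimal sketch below. The contents are assumptions made for illustration: the include guard is not mentioned in the text, the standard headers are arbitrary choices, and KEY.H and VIDEO.H are hypothetical project headers shown only as comments.

```c
/* INCLUDES.H : hypothetical contents, for illustration only */
#ifndef  INCLUDES_H                       /* Include guard (an assumption, not from the text)  */
#define  INCLUDES_H

#include <stdio.h>                        /* Standard library headers used by the project      */
#include <string.h>
#include <stdlib.h>

/* #include "KEY.H"   */                  /* Project headers would follow (hypothetical names) */
/* #include "VIDEO.H" */

#endif
```

Every source file then needs only the single #include "INCLUDES.H" line, at the cost of compiling all of these headers each time.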
Naming Identifiers
C compilers that conform to the ANSI C standard (most C compilers do by now) treat at least the first 31 characters of an internal identifier as significant. Identifiers are variables, structure/union members, functions, macros, #defines and so on. Descriptive identifiers can be formulated within this limit by using acronyms, abbreviations and mnemonics (see Acronyms, Abbreviations & Mnemonics). Identifier names should reflect what the identifier is used for. I like to use a hierarchical method when creating an identifier. For instance, the function OSSemPend() indicates that it is part of the operating system (OS), that it deals with a semaphore (Sem) and that the operation being performed is to wait (Pend) for the semaphore. This method allows me to group all functions related to semaphores together. You will notice that some of the functions in µC/OS-II start with OS_ instead of OS. This is done to show you that the OS_ functions are internal to µC/OS-II even though they are global functions.
Variable names should be declared on separate lines rather than combining them on a single line. Separate lines make it easy to provide a descriptive comment for each variable.
I use the file name as a prefix for variables that are either local (static) or global to the file. The prefix makes it clear which module owns each variable, whether the variable is used locally or globally. For example, local and global variables of a file named KEY.C are declared as follows:
static INT16U  KeyCharCnt;                /* Number of keys pressed                  */
static char    KeyInBuf[100];             /* Storage buffer to hold chars            */
char           KeyInChar;                 /* Character typed                         */
/*$PAGE*/
Upper case characters are used to separate words in an identifier. I prefer this technique to using the underscore character (_), because underscores do not add any meaning to names and also use up character spaces.
Global variables (external to the file) can use any name as long as they contain a mixture of upper case and lower case characters and are prefixed with the module/file name (e.g., all global keyboard-related variable names would be prefixed with the word Key).
Formal arguments to a function and local variables within a function are declared in lower case. The lower case makes it obvious that such variables are local to a function; global variables will contain a mixture of upper and lower case characters. To make such variables readable, you can use the underscore character (_).
Within functions, certain variable names can be reserved to always have the same meaning. Some examples are given below but others can be used as long as consistency is maintained.
- i, j and k for loop counters.
- p1, p2 ... pn for pointers.
- c, c1 ... cn for characters.
- s, s1 ... sn for strings.
- ix, iy and iz for intermediate integer variables.
- fx, fy and fz for intermediate floating point variables.
To summarize:
- Formal parameters in a function declaration should only contain lower case characters.
- auto variable names should only contain lower case characters.
- static variables and functions should use the file/module name (or a portion of it) as a prefix and should make use of upper/lower case characters.
- extern variables and functions should use the file/module name (or a portion of it) as a prefix and should make use of upper/lower case characters.
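The rules summarized above can be seen side by side in a small sketch of a keyboard module. KeyPutChar() is a hypothetical function invented for this illustration, and INT16U is assumed here to map to unsigned short; neither detail comes from the text.

```c
#include <stddef.h>

typedef unsigned short INT16U;            /* Assumed to be 16 bits on this compiler            */

static INT16U  KeyCharCnt;                /* static : module prefix, mixed case                */
static char    KeyInBuf[100];             /* static : module prefix, mixed case                */
char           KeyInChar;                 /* extern : module prefix, mixed case                */

/* Hypothetical function: stores one character, returns 1 if stored or 0 if the buffer is full */
int KeyPutChar (char c)                   /* Formal parameter: lower case                      */
{
    size_t ix;                            /* auto variable: lower case                         */

    ix = (size_t)KeyCharCnt;
    if (ix >= sizeof(KeyInBuf)) {
        return (0);
    }
    KeyInBuf[ix] = c;
    KeyInChar    = c;
    KeyCharCnt++;
    return (1);
}
```

A reader can tell from the spelling alone that c and ix live inside the function, while every Key... name belongs to the module.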
Acronyms, Abbreviations & Mnemonics
When creating names for variables and functions (identifiers), it is often the practice to use acronyms (e.g. OS, ISR, TCB and so on), abbreviations (buf, doc, etc.) and mnemonics (clr, cmp, etc.). The use of acronyms, abbreviations and mnemonics allows an identifier to be descriptive while requiring fewer characters. Unfortunately, if acronyms, abbreviations and mnemonics are not used consistently, they may add confusion. To ensure consistency, I have opted to create a list of acronyms, abbreviations and mnemonics that I use in all my projects. The same acronym, abbreviation or mnemonic is used throughout, once it is assigned. I call this list the Acronym, Abbreviation and Mnemonic Dictionary and the list for µC/OS-II is shown in Table A.1. As I need more acronyms, abbreviations or mnemonics, I simply add them to the list.
Acronym, Abbreviation, or Mnemonic | Meaning |
---|---|
Addr | Address |
Blk | Block |
Chk | Check |
Clr | Clear |
Cnt | Count |
CPU | Central Processing Unit |
Ctr | Counter |
Ctx | Context |
Cur | Current |
Del | Delete |
Dly | Delay |
Err | Error |
Ext | Extension |
FP | Floating Point |
Grp | Group |
HMSM | Hours Minutes Seconds Milliseconds |
ID | Identifier |
Init | Initialize |
Int | Interrupt |
ISR | Interrupt Service Routine |
Max | Maximum |
Mbox | Mailbox |
Mem | Memory |
Msg | Message |
N | Number of |
Opt | Option |
OS | Operating System |
Ovf | Overflow |
Prio | Priority |
Ptr | Pointer |
Q | Queue |
Rdy | Ready |
Req | Request |
Sched | Scheduler |
Sem | Semaphore |
Stat | Status or statistic |
Stk | Stack |
Sw | Switch |
Sys | System |
Tbl | Table |
TCB | Task Control Block |
TO | Timeout |
There might be instances where one list for all products doesn't make sense. For instance, if you are an engineering firm working on a project for different clients and the products that you develop are totally unrelated, then a different list for each project would be more appropriate; the vocabulary for the farming industry is not the same as the vocabulary for the defense industry. I use the rule that if all products are similar, they use the same dictionary.
A dictionary common to a project team will also increase the team's productivity. It is important that consistency be maintained throughout a project, irrespective of the individual programmer(s). Once buf has been agreed to mean buffer, it should be used by all project members instead of having some individuals use buffer and others use bfr. To further this concept, you should always use buf even if your identifier can accommodate the full word buffer.
Comments
I find it very difficult to mentally separate code from comments when code and comments are interleaved. Because of this, I never interleave code with comments. Comments are written to the right of the actual C code. When large comments are necessary, they are written in the function description header.
Comments are lined up as shown in the following example. The comment terminators (*/) do not need to be lined up, but for neatness I prefer to do so. It is not necessary to have one comment per line since a comment could apply to a few lines.
/*
************************************************************************************************
*                                            atoi()
*
* Description : Function to convert string 's' to an integer.
* Arguments   : ASCII string to convert to integer.
*               (All characters in the string must be decimal digits (0..9))
* Returns     : String converted to an 'int'
************************************************************************************************
*/
int atoi (char *s)
{
    int n;                                    /* Partial result of conversion                   */

    n = 0;                                    /* Initialize result                              */
    while (*s >= '0' && *s <= '9' && *s) {    /* For all valid characters and not end of string */
        n = 10 * n + *s - '0';                /* Convert char to int and add to partial result  */
        s++;                                  /* Position on next character to convert          */
    }
    return (n);                               /* Return the result of the converted string      */
}
/*$PAGE*/
#defines
Header files (.H) and C source files (.C) might require that constants and macros be defined. Constants and macros are always written in upper case with the underscore character used to separate words. Note that hexadecimal numbers are always written with a lower case x and upper case letters for the hexadecimal digits A through F. Also, you should note that the constant names are all lined up, as are their values.
/*
************************************************************************************************
*                                      CONSTANTS & MACROS
************************************************************************************************
*/
#define  KEY_FF            0x0C
#define  KEY_CR            0x0D
#define  KEY_BUF_FULL()    (KeyNRd > 0)
/*$PAGE*/
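As a further illustration of these rules, the sketch below defines a few more constants and a macro that takes an argument. All of the names (KEY_BUF_SIZE, KEY_MASK_LOW, KEY_MASK_HIGH, KEY_LOW_NIBBLE() and KeyGetLowNibble()) are hypothetical. Wrapping the macro argument and the macro body in parentheses is a common precaution added here; the text does not discuss it.

```c
#define  KEY_BUF_SIZE          100            /* Decimal constant                              */
#define  KEY_MASK_LOW         0x0F            /* Lower case 'x', upper case hex digits         */
#define  KEY_MASK_HIGH        0xF0
#define  KEY_LOW_NIBBLE(c)    ((c) & KEY_MASK_LOW)

/* Hypothetical helper: returns the low 4 bits of a key code */
int KeyGetLowNibble (int key)
{
    return (KEY_LOW_NIBBLE(key));
}
```

Because of the parentheses, an expression such as KEY_LOW_NIBBLE(a + 1) masks the whole sum rather than only the constant 1.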
Data Types
C allows you to create new data types using the typedef keyword. I declare all data types using upper case characters, and thus follow the same rule used for constants and macros. There is never a problem confusing constants, macros and data types because of the context in which they are used. Since different microprocessors have different word lengths, I like to declare the following data types (assuming Borland C++ V4.51):
/*
************************************************************************************************
*                                          DATA TYPES
************************************************************************************************
*/
typedef unsigned char  BOOLEAN;           /* Boolean                                 */
typedef unsigned char  INT8U;             /* 8 bit unsigned                          */
typedef signed   char  INT8S;             /* 8 bit signed                            */
typedef unsigned int   INT16U;            /* 16 bit unsigned                         */
typedef signed   int   INT16S;            /* 16 bit signed                           */
typedef unsigned long  INT32U;            /* 32 bit unsigned                         */
typedef signed   long  INT32S;            /* 32 bit signed                           */
typedef float          FP;                /* Floating Point                          */
/*$PAGE*/
Using these typedefs, you will always know the size of each data type.
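The sizes only hold if the underlying C types match the target compiler, so the mapping has to be revisited for each port. The sketch below assumes a compiler where short is 16 bits and int is 32 bits (unlike the 16-bit int assumed above), and uses a hypothetical SIZE_CHK() macro to make the build fail if a type has the wrong width. Int16UInc() is likewise a hypothetical function added only to show the 16-bit wrap-around.

```c
#include <limits.h>                       /* CHAR_BIT                                          */

typedef unsigned char   INT8U;            /* Assumed widths: char = 8, short = 16, int = 32    */
typedef signed   char   INT8S;
typedef unsigned short  INT16U;
typedef signed   short  INT16S;
typedef unsigned int    INT32U;
typedef signed   int    INT32S;

/* Hypothetical helper: the array size becomes -1, a compile error, when 'cond' is false */
#define  SIZE_CHK(name, cond)    typedef char name[(cond) ? 1 : -1]

SIZE_CHK(ChkInt8U,  sizeof(INT8U)  * CHAR_BIT ==  8);
SIZE_CHK(ChkInt8S,  sizeof(INT8S)  * CHAR_BIT ==  8);
SIZE_CHK(ChkInt16U, sizeof(INT16U) * CHAR_BIT == 16);
SIZE_CHK(ChkInt16S, sizeof(INT16S) * CHAR_BIT == 16);
SIZE_CHK(ChkInt32U, sizeof(INT32U) * CHAR_BIT == 32);
SIZE_CHK(ChkInt32S, sizeof(INT32S) * CHAR_BIT == 32);

INT16U Int16UInc (INT16U x)               /* Wraps from 65535 to 0 because INT16U is 16 bits   */
{
    return ((INT16U)(x + 1u));
}
```

On a compiler with different widths, the SIZE_CHK() lines stop the build, which signals that the typedefs need to be remapped.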
Local Variables
Some source modules will require that local variables be available. These variables are only needed for the source file (file scope) and should thus be hidden from the other modules. Hiding these variables is accomplished in C by using the static keyword. Variables can be listed either in alphabetical order or in functional order.
/*
************************************************************************************************
*                                        LOCAL VARIABLES
************************************************************************************************
*/
static char    KeyBuf[100];
static INT16S  KeyNRd;
/*$PAGE*/
Function Prototypes
This section contains the prototypes (or calling conventions) used by the functions declared in the file. The order in which functions are prototyped should be the order in which the functions are declared in the file. This order allows you to quickly locate the position of a function when the file is printed.
/*
************************************************************************************************
*                                      FUNCTION PROTOTYPES
************************************************************************************************
*/
       void     KeyClrBuf(void);
static BOOLEAN  KeyChkStat(void);
static INT16S   KeyGetCnt(int ch);
/*$PAGE*/
Also note that the static keyword, the returned data type, and the function names are all aligned.
Function Declarations
As much as possible, there should only be one function per page when code listings are printed on a printer. A comment block should precede each function. All comment blocks should look as shown below. A description of the function should be given and should include as much information as necessary. If the combination of the comment block and the source code extends past a printed page, a page break should be forced (preferably between the end of the comment block and the start of the function). This allows the function to be on a page by itself and prevents having a page break in the middle of the function. If the function itself is longer than a printed page, then it should be broken by a page break comment (/*$PAGE*/) in a logical location (e.g. at the end of an if statement instead of in the middle of one).
More than one small function can be declared on a single page. They should all, however, contain the comment block describing the function. The beginning of a function should start at least two lines after the end of the previous function.
/*
************************************************************************************************
*                                     CLEAR KEYBOARD BUFFER
*
* Description : Flush keyboard buffer
* Arguments   : none
* Returns     : none
* Notes       : none
************************************************************************************************
*/
void KeyClrBuf (void)
{
}
/*$PAGE*/
Functions that are only used within the file should be declared static to hide them from functions in other files.
By convention, I write every invocation of a function without a space between the function name and the opening parenthesis of the argument list. Conversely, I place a space between the name of the function and the opening parenthesis of the argument list in the function declaration, as shown above. This is done so that I can quickly find the function definition using a grep utility: searching for the function name followed by a space matches the declaration but none of the invocations.
Function names should make use of the file name as a prefix. This prefix makes it easy to locate function declarations in medium to large projects. It also makes it very easy to know where these functions are declared. For example, all functions in a file named KEY.C and functions in a file named VIDEO.C could be declared as follows:
KEY.C
    KeyGetChar()
    KeyGetLine()
    KeyGetFnctKey()
VIDEO.C
    VideoGetAttr()
    VideoPutChar()
    VideoPutStr()
    VideoSetAttr()
It's not necessary to use the whole file/module name as a prefix. For example, a file called KEYBOARD.C could have functions starting with Key instead of Keyboard. It is also preferable to use upper case characters to separate words in a function name instead of using underscores. Again, underscores don't add any meaning to names and they use up character spaces. As mentioned previously, formal parameters and local variables should be in lower case. This makes it clear that such variables have a scope limited to the function.
Each local variable name MUST be declared on its own line. This allows the programmer to comment each one as needed. Local variables are indented four spaces. The statements for the function are separated from the local variables by three spaces. Declarations of local variables should be physically separated from the statements because they are different.
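The conventions in this section can be sketched together in a small keyboard module. KeyBufPut(), KeyGetCnt() and KeyIsFull() are hypothetical functions invented for this illustration, and the per-function comment blocks are left out only to keep the sketch short.

```c
#define  KEY_BUF_SIZE    100

static char  KeyBuf[KEY_BUF_SIZE];        /* Storage for characters typed                      */
static int   KeyNRd;                      /* Number of characters waiting in the buffer        */

static int   KeyIsFull(void);             /* Used only in this file, hence 'static'            */

void KeyClrBuf (void)                     /* Definition: space before '('                      */
{
    int i;                                /* Index into the buffer                             */

    for (i = 0; i < KEY_BUF_SIZE; i++) {
        KeyBuf[i] = 0;
    }
    KeyNRd = 0;
}

int KeyBufPut (char c)                    /* Formal parameter in lower case                    */
{
    if (KeyIsFull()) {                    /* Invocation: no space before '('                   */
        return (0);
    }
    KeyBuf[KeyNRd] = c;
    KeyNRd++;
    return (1);
}

int KeyGetCnt (void)
{
    return (KeyNRd);
}

static int KeyIsFull (void)
{
    return (KeyNRd >= KEY_BUF_SIZE);
}
```

Searching this file for "KeyIsFull (" (with the space) finds only the definition, while "KeyIsFull(" finds the prototype and the call.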
Indentation
Indentation is important to show the flow of the function. The question is, how many spaces are needed for indentation? One space is obviously not enough, while 8 spaces is way too much. The compromise I use is four spaces. I also never use TABs because various printers interpret TABs differently, and your code may not look as you want. Avoiding TABs does not mean that you can't use the TAB key on your keyboard. A good editor will give you the option to replace TABs with spaces (in this case, 4 spaces).
A space follows the keywords if, for, while and do. The keyword else has a space both before and after it when curly braces are used. I write if (condition) on its own line and the statement(s) to execute on the following line(s), as follows:
if (x < 0)
    z = 25;
if (y > 2) {
    z = 10;
    x = 100;
    p++;
}
instead of the following method.
if (x < 0) z = 25;
if (y > 2) {z = 10; x = 100; p++;}
There are two reasons for this method. The first is that I like to keep the decision portion apart from the execution statement(s). The second reason is consistency with the method I use for while, for and do statements.
switch statements are treated as any other conditional statement. Note that the statements of each case are lined up with the case label. The important point here is that switch statements must be easy to follow. The cases should also be separated from one another.
if (x > 0) {
    y = 10;
    z = 5;
}
if (z < LIM) {
    x = y + z;
    z = 10;
} else {
    x = y - z;
    z = -25;
}
for (i = 0; i < MAX_ITER; i++) {
    *p2++ = *p1++;
    xx[i] = 0;
}
while (*p1) {
    *p2++ = *p1++;
    cnt++;
}
do {
    cnt--;
    *p2++ = *p1++;
} while (cnt > 0);
switch (key) {
    case KEY_BS :
         if (cnt > 0) {
             p--;
             cnt--;
         }
         break;

    case KEY_CR :
         *p = NUL;
         break;

    case KEY_LINE_FEED :
         p++;
         break;

    default:
         *p++ = key;
         cnt++;
         break;
}
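To see the switch layout in a complete function, the sketch below wraps a similar key handler in a hypothetical KeyEditLine() function. The function, its parameters and the values chosen for KEY_BS, KEY_CR and NUL are assumptions made for this example. The bounds check in the default case is also an addition.

```c
#define  KEY_BS    0x08
#define  KEY_CR    0x0D
#define  NUL       0x00

/* Hypothetical: applies 'key' to line buffer 'buf' holding 'cnt' chars, returns the new count */
int KeyEditLine (char *buf, int cnt, int size, char key)
{
    switch (key) {
        case KEY_BS:                      /* Backspace: drop the last character, if any        */
             if (cnt > 0) {
                 cnt--;
             }
             break;

        case KEY_CR:                      /* Carriage return: terminate the string             */
             buf[cnt] = NUL;
             break;

        default:                          /* Any other key: store it if room is left           */
             if (cnt < size - 1) {
                 buf[cnt] = key;
                 cnt++;
             }
             break;
    }
    return (cnt);
}
```

Feeding the keys 'a', 'b', KEY_BS and KEY_CR into an empty buffer leaves the string "a" and a count of 1.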
Statements & Expressions
All statements and expressions should be made to fit on a single source line. I never use more than one assignment per line such as:
x = y = z = 1;
Even though this is correct in C, when the variable names get more complicated, the intent might not be as obvious.
The following operators are written with no space around them:
Operator | Description | Example |
---|---|---|
-> | Structure pointer operator | p->m |
. | Structure member operator | s.m |
[] | Array subscripting | a[i] |
Parentheses after function names have no space(s) before them. A space should be introduced after each comma to separate each actual argument in a function. Expressions within parentheses are written with no space after the opening parenthesis and no space before the closing parenthesis. Commas and semicolons should have one space after them.
strncat(t, s, n);
for (i = 0; i < n; i++)
The unary operators are written with no space between them and their operands:
!p ~b ++i --j (long)m *p &x sizeof(k)
The binary operators are preceded and followed by one or more spaces, as is the ternary operator:
c1 = c2
x + y
i += 2
n > 0 ? n : -n;
The keywords if, while, for, switch and return are followed by one space.
For assignments, numbers are lined up in columns as if you were to add them. The equal signs are also lined up.
x        = 100.567;
temp     =  12.700;
var5     =   0.768;
variable =  12;
storage  = &array[0];
Structures and Unions
Structures are declared with typedef since this allows a single name to represent the structure. The structure type is declared using all upper case characters with underscore characters used to separate words.
typedef struct line {                     /* Structure that defines a LINE           */
    int LineStartX;                       /* 'X' & 'Y' starting coordinate           */
    int LineStartY;
    int LineEndX;                         /* 'X' & 'Y' ending coordinate             */
    int LineEndY;
    int LineColor;                        /* Color of line to draw                   */
} LINE;

typedef struct point {                    /* Structure that defines a POINT          */
    int PointPosX;                        /* 'X' & 'Y' coordinate of point           */
    int PointPosY;
    int PointColor;                       /* Color of point                          */
} POINT;
Structure members start with the same prefix (as shown in the examples above). Member names should start with the name of the structure type (or a portion of it). This makes it clear when pointers are used to reference members of a structure such as:
p->LineColor; /* We know that 'p' is a pointer to LINE */
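A short sketch shows how the member prefix reads in practice. The LINE type is repeated so that the example stands alone. LineLenSq() is a hypothetical function added for this illustration, not part of the original text.

```c
typedef struct line {                     /* Structure that defines a LINE                     */
    int LineStartX;                       /* 'X' & 'Y' starting coordinate                     */
    int LineStartY;
    int LineEndX;                         /* 'X' & 'Y' ending coordinate                       */
    int LineEndY;
    int LineColor;                        /* Color of line to draw                             */
} LINE;

/* Hypothetical: returns the squared length of the line pointed to by 'p' */
long LineLenSq (const LINE *p)
{
    long dx;                              /* Horizontal distance between the end points        */
    long dy;                              /* Vertical distance between the end points          */

    dx = (long)p->LineEndX - (long)p->LineStartX;
    dy = (long)p->LineEndY - (long)p->LineStartY;
    return (dx * dx + dy * dy);
}
```

Even with a pointer named only p, the Line prefix on every member tells the reader which structure is being accessed.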