How to Tell When Your C++ Code Is Non-Standard

Siddhant Sanyam (siddhantsanyam at gmail dot com)

PDF version: how_to_tell_rusted_cpp.pdf

Version 0.2 -Beta

Copyright Siddhant Sanyam 2009

You are free to distribute, copy and modify this document as long as this copyright message appears.
(This document is still in its beta stage. You can help it evolve by sending your suggestions to the author's e-mail address. Every suggestion and criticism is valuable.)

Version History

Version    Changes                      Date        Editor
0.10 Beta  First Release Up-to Sec:2    2-06-2009   Siddhant Sanyam
0.20 Beta  Added till Sec:3.1           4-06-2009   Siddhant Sanyam
0.21 Beta  Added Sec:3.2                7-06-2009   Siddhant Sanyam
0.22 Beta  Revised/Corrected            17-06-2009  Siddhant Sanyam (suggestion made by Narue)
0.23 Beta  Revised                      27-06-2009  Siddhant Sanyam



1 Introduction

This document is a brief guide for checking whether your C++ coding practices conform to the standard, and it offers possible fixes where they do not. It was written after I got fed up with the non-portable code posted on Usenet and forums by many (many, many) beginners.

I came across a guide, ``C++ Portability Guide'', from the Mozilla Developer Center[MDC's C++ Portability Guide]. What I noticed while reading it (and what upsets me) is that the guide focuses on popular coding practices in C++. It often gives advice that will surely make your code portable, but at the cost of degrading the quality of your code. It suggests not using the 'advanced' features of C++ and argues that, since not many people know these features, you shouldn't use them in your programs. Some of this advice is hazardous, such as ``Don't use the standard library'' and ``Don't use exception handling''. Following it may make your code portable, but dirty.

I don't agree with this approach. It tends to support the motto ``When many people do a thing wrong, it becomes correct''. Rather, we should practice standard coding; that is, write code that agrees with the ISO C++ Standard, which every modern compiler aims to support. In this guide I have tried to promote the following approach: when you write in accordance with the standard, your code has little chance of being unportable. I have tried to summarize, very briefly, the major non-standard practices and to provide compact explanations. Readers may wish to do a web search to gain better insight into a specific topic.

2 The Far-Sighted Symptoms

In this section, we shall look at the common coding constructs that make it obvious that your code is non-standard. (From now on, I will call such code 'rusted'.)

2.1 Using an Old Compiler

If you use an old compiler, chances are very high that the code you write will be rusted. This applies to compilers written before 1995 or so. One major candidate I have come across is Turbo C++ version 1, 2 or 3 (I have never encountered any subsequent version); if you are using it, you are definitely not writing standard code.

A simple test to check whether your compiler is good: compile the following code[Stroustrup's Incomplete Compiler List], as-is, with your compiler and check whether it compiles.

#include <iostream>
#include <string>
using namespace std;
int main() {
   string s;
   cout << "Please enter your first name followed by a newline\n";
   cin >> s;
   cout << "Hello, " << s << '\n';
}

If you run this program, it should prompt you to enter your first name and output ``Hello, first_name''. If your compiler is not able to compile such a simple program: ``Don't wait, this is the time you should migrate to a new compiler.'' No, don't think of doing it tomorrow; migrate now.

This symptom is usually observed among MS-Windows developers. Most *nix (the various Unixes and Linux) developers use free compilers that are frequently upgraded to follow the standard. There are plenty of free compilers available. I prefer them because they are open-source and redistributable. Here are a few recommended compilers and IDEs for MS-Windows:

GNU G++
G++ is part of GCC (the GNU Compiler Collection), which is the standard compiler on many Unix, Linux and Mac OS X systems. It is available on MS-Windows through various ports:

   Code Blocks
   This is an IDE that uses the MinGW (see below) port of G++ to provide a complete solution. Many MS-Windows programmers use it. If you are a beginner, you should perhaps stick to this option. It is open-source. To download, go to the CodeBlocks Download Page http://www.codeblocks.org/downloads/5. Download and run codeblocks-8.02mingw-setup.exe. This will install Code Blocks along with the MinGW G++.

   MinGW
   If you don't want to use an IDE, MinGW is the choice. MinGW is a port of G++ to MS-Windows. To download, go to the MinGW Download Section https://sourceforge.net/project/showfiles.php?group_id=2435. Download and run the Automated MinGW Installer.

   Cygwin
   Cygwin gives you a Unix-like development environment on MS-Windows. Use this option if you have a *nix background.

Microsoft Visual C++ 2008 Express Edition
This is a non-free alternative. (Note that this IDE comes at zero cost. You don't have to pay anything to download, run and use it. But it is not open-source, hence it is not free. To enlighten yourself about free software, check the FSF's Free Software Definition http://www.gnu.org/philosophy/free-sw.html.) I don't prefer this option, as you would develop a habit of depending on a platform-specific tool. Since this IDE runs only on MS-Windows, you are likely to run into trouble when you move to another platform.

2.2 Using void main()

Neither Standard C nor Standard C++ guarantees support for void main(). In Standard C++, the return type of the main function has to be int. Stop and repeat the last sentence 10 times if you are a user of void main(). Many modern compilers refuse to compile your code, or at least warn, unless the main function returns an int. If a compiler accepts it quietly, chances are that it is not a standard compiler and you are a victim of the problem described in section 2.1. The return value of main() is used by your operating system to determine whether your program ran successfully.

If you were always scared of writing the 9 characters of return 0; at the end of every C++ program, there is good news: you don't have to write a return 0; statement in main() of a standard C++ program. When control reaches the end of main(), the effect is the same as return 0;. Hence, the smallest possible C++ program is:

int main(){}
So do whatever you like: call your mom, go on a date with your girlfriend/boyfriend, drink coffee; but do use int main() instead of void main().

2.3 Using Deprecated and Non-Portable Header Files

2.3.1 Deprecated Header Files

In C++, deprecated header files are those that were used before the standard was formulated and are now considered obsolete. They are (usually, though not necessarily) still shipped with compilers, just to provide backward compatibility for old code. They are not part of the Standard C++ Library, and their names end with the traditional .h extension. All the Standard C++ headers should be used without the .h extension. That is, instead of beginning your programs with:

#include <iostream.h>
#include <fstream.h>

use the standard header files like this:

#include <iostream>
#include <fstream>

Much rusted code uses the Standard C library headers with the .h extension, which goes against the convention set by the C++ standard. C header files (like stdio.h and math.h) are to be used with the following convention: ``Strip off the .h extension and prefix a c to the library name''. Hence, the following code:

#include <stdio.h>
#include <math.h>
#include <string.h>
#include <ctype.h>

will be transformed into:

#include <cstdio>
#include <cmath>
#include <cstring>
#include <cctype> //note the two c

The header file string.h needs special consideration. In the C Standard Library, string.h contains functions to manipulate C-strings (that is, null-terminated character arrays). To use this library in your C++ program, you should #include <cstring>. The C++ Standard Library has the string header, which contains the standard string class std::string, the standard implementation of the string data type in C++. If you need this, you should #include <string>.

Some modern compilers (G++, for example) issue warning messages about the use of deprecated header files.

Deprecated header files such as iostream.h are not part of the standard. Hence, a compiler that provides one may implement it in any way it wants and still be called a standard C++ compiler. In simple terms, when you use <iostream>, you can be sure that it will behave as the standard describes. But if you #include <iostream.h>, it is up to the compiler (and not the standard) how it behaves.

2.3.2 Non-Portable Header Files

If you are using header files such as conio.h or graphics.h, it is an indication that your code will not run on a standard implementation. So avoid using them and look for portable alternatives.

These header files are implementation-specific. They may or may not be available on a given implementation. Depending on such header files means that your code may not build or run on another platform.

2.4 Absence of Namespace Qualifiers

If you are compiling your program with a standard compiler (which you should be, after reading section 2.1), you will have to use namespace qualifiers.

The standard library is wrapped in the std namespace (std is a contraction of standard). Hence all the objects, functions and data structures that you use from any standard C++ header (be it iostream, fstream, string or algorithm) must be prefixed by std::. Consequently, the following example is a valid C++ program:

#include <iostream>
#include <fstream>
int main() {
   std::cout<<"Hello" << std::endl;
   std::fstream file1("mytext.txt");
   std::cout<< file1.rdbuf(); //print the whole file
}

If you are too lazy to write std:: more than once, you may bring the names that you use often into the global namespace. This is done with using declarations:

#include <iostream>
#include <fstream>
using std::cout; using std::endl; using std::fstream;
int main() {
   cout<<"Hello" << endl;
   fstream file1("mytext.txt");
   cout<< file1.rdbuf(); //print the whole file
}

If you are still too lazy to write a using declaration for each name, you may globalize the whole namespace with a using directive. This dumps all the names of the std namespace into the global namespace:

#include <iostream>
#include <fstream>
using namespace std;
int main() {
   cout<<"Hello" << endl;
   fstream file1("mytext.txt");
   cout<< file1.rdbuf(); //print the whole file
}

My strong recommendation is: please don't use the last approach. It is a very harmful habit. It kills the very concept of namespaces.

2.5 The use of system() calls

This applies especially to system("pause") and system("cls").

The system() function calls the shell and executes the command passed to it as an argument. pause is a program found on MS-Windows that pauses the shell until a key is pressed. cls is also a program; it clears the console screen.

Calls to system() are very heavy. They cost a lot of (unnecessary) run time and memory. The deeper problem is that they are unportable. As just mentioned, pause is available only on MS-Windows. There is no pause program on *nix (shells there offer a command named read instead). So if your program depends on such a program, it will fail at run time when ported to *nix.

Another good reason not to use system() is that it introduces security loopholes into your program. For example, if a nasty user gets to know that your program executes another program named pause.exe, he may use your program to run any program he wants: he just has to rename that program to pause.exe and place it in the same directory as your program. Then, whenever your program tries to launch the native pause.exe, his hand-made pause.exe gets executed. This can be dangerous in many respects [footnote 1].

In practice, you shouldn't be using any such calls. To pause your program you can use std::cin.get(). You shouldn't be clearing the screen, as discussed in section 2.3.2.

3 Bad Coding Practice

In this section we shall review coding practices that may be popular, working and even portable, but are bad. Bad means that they often make your code unreadable, rusted, less efficient and sometimes non-portable too. You should try to avoid all these habits.

3.1 Using Pre-processor Macros

This problem is persistent and stubborn among C programmers who write C++ code. Using pre-processor macros such as:

#define SQUARE(x) x*x
#define PI 3.14159265

is considered very harmful in C++ programming. This is because macros don't obey C++ scope rules. Macros do text-to-text replacement in your code. Hence, there is no variable named PI and no function named SQUARE() as far as the compiler is concerned; thus a small bug in a macro can be very hard to track down at compile time. Let us consider an example. Suppose a programmer defines the above macros and uses SQUARE(x). Unfortunately, he gets carried away by the function-like look of the macro and writes this:

#include <iostream>
#define SQUARE(x) x*x
#define PI 3.14159265
int main() {
   int a=5;
   std::cout<<SQUARE(++a);//Oops. Undefined; G++ printed 49 for me
}

while expecting the output to be 36 (the square of 6). It is easy to see why this happened. The pre-processor replaced the SQUARE() macro by plain text substitution. Hence the compiler received this:

int main() {
   int a=5;
   std::cout<<++a*++a;
}

Such an expression is, according to the standard, undefined[More than one sequence points], but on most compilers (I tested with my GNU G++) it prints 49 (that is: increment a twice so that a=7 and then evaluate a*a). This, of course, was only a small example to show that macros aren't good. There are plenty more. The value of PI, which is #defined above, is not cute either. You cannot see a variable named PI when you run a debugger. PI won't have an address in memory, so you cannot use its address either. PI won't obey scope rules. That means every #define is effectively global.

The solution is pretty easy. If you want to declare a constant, use the const qualifier like this:

const double PI=3.14159265;

Here, PI is a variable, a const variable. It has an address in memory and obeys scope rules just like other variables do. The only difference is that its value cannot change.

There is also an alternative to function-like macros: inline functions. C++ allows you to request that the compiler make a particular function inline. That means that, rather than calling the function, the compiler inserts the complete body of the function at every place in the code where the function is used [footnote 2]. This is much like macros, but a lot better. First of all, inline functions act like (they actually are) real functions and obey the language rules. Secondly, they are part of the language and not of the pre-processor; hence they are the best substitute for macros. Here is an example of an inline function:

inline int square(int x) {
  return x*x;
}
int main() {
   int N=5;
   int five_squared = square(N);
   int six_squared  = square(++N); //works fine. evaluates 36
}

Macros often find a place in the toolkit of C programmers because they help create type-insensitive functions. Thus, one can argue that the SQUARE() macro is good since it readily works on int, float, double or any other data type that supports the multiplication operator. This viewpoint is valid for a C programmer, since there the language doesn't provide any feature for writing type-insensitive functions. But C++ provides templates for writing generic functions and classes. Thus the SQUARE() macro is best replaced by a function template:

template <typename T>
inline T square(T x) {
  return x*x;
}
int main() {
   int iN=5;
   double dN=5.5;
   int iN_sq = square(iN);
   double dN_sq = square(dN);
}

This version uses a function template. Templates are useful in more ways than one. They are what makes the STL the STL. If you haven't already, you should familiarize yourself with templates.

One indispensable use of #define macros is in conditional compilation, as include guards. They help you avoid accidentally including a header file twice. You should always write your header files like this:

my_header.h :

#ifndef MY_HEADER
#define MY_HEADER

//  The content
//  of header file
//  goes here

#endif

Now, even if you #include "my_header.h" more than once, its content will really be included just once. Another use of conditional compilation is in writing code that can run on multiple platforms. I omit it here, but the reader can always do a web search for it.

The moral of this section is to avoid #define macros as far as possible. But when writing header files, it is always good to use the conditional compilation technique.

3.2 Using Arrays instead of Vectors

I know, I know. But it is true: using arrays is a bad habit in C++. Arrays are the basic built-in data structure that C++ inherited from C. As you know, an array is a fixed-size linear data structure whose size has to be known at compile time. That means you cannot change its size while your program is running. Once it is declared, nothing can be done to resize it.

To work around this, rusted programmers practice explicit memory management (done with the new and delete operators in C++; malloc(), realloc() and free() in C). (Note that I am exploiting the fact that if you write rusted code, you automatically become rusted. I am sorry if that hurts. Please don't take it personally.) This is a very bad idea in itself. Explicit memory management is prone to errors such as memory leaks and data corruption. You have to handle everything yourself, which is tedious and dangerous. The worst part is that your program may run well during tests and then fail after it is released[Murphy's Law].

The C programming language does not provide any solution [footnote 3] to this, but C++ does. The Standard Template Library comes with a collection of generic containers such as vectors, linked lists, maps, multi-maps and sets. Using vectors instead of arrays serves several purposes. Vectors can grow or shrink at run time. Their size need not be a constant. You can have vectors of int, double, or anything you like, just as you can have arrays of int, double, etc. In short, you can do everything with vectors that you used to do with arrays, only with more ease and elegance. Let's take a quick example [footnote 4] of the basic use of a vector.

#include <cstddef>  //needed for std::size_t
#include <iostream>
#include <vector> //needed for vectors
int main() {
   std::vector<int> v1(5); //create vector of int with 5 elements
   //assigning values. Just like arrays
   v1[0]=v1[2];//v1[0] has same value as v1[2]

   //printing as in arrays
   std::cout<<"v1[0]="<<v1[0]<<std::endl;
   //getting number of elements
   std::cout<<"The number of elements in v1 is "<< v1.size()<<std::endl;
   //appending one more element,574, to the vector
   v1.push_back(574);
   //note that v1 size will now increase to 6 automatically
   std::cout<<"v1's size=" << v1.size()<<std::endl;//prints 6
   std::cout<<"v1[5]="<<v1[5]<<std::endl; //prints 574

   //printing all the elements
   for(std::size_t i=0; i!=v1.size(); ++i)
      std::cout<<v1[i]<<std::endl;
}

A vector can be passed as an argument to a function:

void print_vec(const std::vector<int>& v) {
//this function prints vector of ints
//(taken by const reference, to avoid copying the vector)
//to print vector of 'anything', use template function
   for(size_t i=0; i!=v.size(); ++i)
      std::cout<<v[i]<<std::endl;
}

The above examples were just to whet your appetite. Of course, you can learn more about vectors by searching the Internet or reading a good book (for a quick start, you can read A Beginner's Tutorial For std::vector, Part 1 http://www.codeguru.com/cpp/cpp/cpp_mfc/stl/article.php/c4027).

You should almost always use a vector instead of an array. Arrays and explicit memory management should be used only when you are designing a library yourself. Arrays are also useful when you need speed (a std::vector can be slower than an array because it uses dynamic memory management). Arrays have other uses too, but most of the time std::vector should do. In short, do not use arrays unless it becomes mandatory to do so. The std::vector was designed by some of the finest programmers of the language.

Also note that std::vector is a good choice if you need to get an element by its index (that is, when you want random access rather than sequential access). But it is slow at adding and removing an element at an arbitrary position. For this, you might consider std::list, which is a container just like std::vector but is fast at adding and removing elements at arbitrary positions. However, std::list does not provide index access to an element. That means if you have a list L1, you cannot access the 5th element as L1[4]. The [] operator is not supported by std::list.



MDC's C++ Portability Guide
Mozilla Developer Center's C++ Portability Guide https://developer.mozilla.org/en/C___Portability_Guide

Stroustrup's Incomplete Compiler List
This example is modified from the one provided by Bjarne Stroustrup on his home page: Stroustrup's Compiler List http://www.research.att.com/~bs/compilers.html

More than one sequence points
Visit Marshall Cline's FAQs: 39.15 http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.15 and Stroustrup's FAQ: Evaluation Order http://www.research.att.com/~bs/bs_faq2.html#evaluation-order

Murphy's Law
According to Murphy's law, you'll be hit the hardest at the worst possible moment (when the customer is looking, when a high-value transaction is trying to post, etc.). [Taken from Marshall Cline's FAQs http://www.parashift.com/c++-faq-lite/freestore-mgmt.html]


Footnote 1 (... respects)
For example, one could run a virus with the help of your program. The possibilities are limited only by the imagination of the attacker.

Footnote 2 (... used)
Pay close attention to the word request. It is never certain whether an inline function will really be inlined. It is up to the compiler. Still, you should never miss an opportunity to make such a request. You must never assume that the keyword inline guarantees that a function will be inlined. Macros, however, are always expanded inline.

Footnote 3 (... solution)
I agree that libraries exist that provide vector-like alternatives to arrays in C, but they are not as natural as in C++.

Footnote 4 (... example)
This code might feel a little uncomfortable, as I have always explicitly qualified all the names from the std namespace by prefixing std::. This may seem unnatural to some beginners, who might argue ``Why not use a using directive rather than typing std:: over and over?''. Explicitly qualifying names is a good idea since it always indicates which namespace a piece of code comes from. Develop the habit of doing this. The use of using directives should be kept to a minimum. They should never be used in a header file (though we are not writing one now).

Siddhant Sanyam 2009-06-28