GCC compiler: the introductory knowledge you need to know

March 05, 2023

When developing applications for Linux, C is the language used in most cases, so the first problem facing almost every Linux programmer is how to use the C compiler flexibly. The most commonly used C compiler under Linux is GCC (GNU Compiler Collection), the ANSI C-compliant compiler system of the GNU project. It can compile programs written in C, C++, and Objective-C. GCC is not only very powerful but also extremely flexible in structure. Most commendably, it supports further languages, such as Java, Fortran, Pascal, Modula-3, and Ada, through different front-end modules.

Openness, freedom, and flexibility are the hallmarks of Linux, and this is reflected in GCC, which allows programmers to better control the entire compilation process. When using the GCC compiler, the compilation process can be broken down into four phases:

◆ Pre-Processing
◆ Compiling
◆ Assembling
◆ Linking

Linux programmers can have GCC stop at any stage of compilation, according to their needs, in order to inspect or use the compiler's output at that stage. They can also control the resulting binary by adding different amounts and kinds of debugging information, in preparation for later debugging. Like other commonly used compilers, GCC also provides flexible and powerful code optimization features that can be used to generate more efficient code. GCC provides more than 30 warning messages and three warning levels, which help improve program stability and portability. In addition, GCC extends the standard C and C++ languages extensively to improve the execution efficiency of programs, help the compiler optimize the code, and reduce the programming workload.

Getting started with GCC

Before learning to use GCC, the following example can help users quickly understand how GCC works and apply it immediately to actual project development. First enter the code shown in Listing 1 with a familiar editor:

Listing 1: hello.c

#include <stdio.h>

int main(void)
{
    printf("Hello world, Linux programming!\n");
    return 0;
}

Then execute the following command to compile and run this program:

# gcc hello.c -o hello
# ./hello
Hello world, Linux programming!

From the programmer's point of view, a single GCC command is all it takes, but from the compiler's perspective a lot of complicated work has to be done. First, GCC calls the preprocessor cpp, which expands the macros defined in the source file and inserts the contents of the files named in "#include" directives; then GCC calls cc1 and as to compile the processed source code into object code; finally, GCC calls the linker ld to link the generated object code into an executable program. To better understand how GCC works, the compilation process above can be divided into several steps and the result of each step observed. The first step is preprocessing. The -E option makes GCC stop the compilation process once preprocessing is complete:

# gcc -E hello.c -o hello.i

At this point, if you look at the contents of the hello.i file, you will find that the contents of stdio.h have indeed been inserted into it, and that the other macro definitions that need preprocessing have been processed as well. The next step is to compile hello.i into object code, which can be done with the -c option:

# gcc -c hello.i -o hello.o

By default, GCC treats a .i file as preprocessed C source code, so the above command automatically skips the preprocessing step and starts directly with compilation. You can also use the -x option to state the input language explicitly, which determines the step from which GCC starts. The final step is to link the generated object file into an executable file:

# gcc hello.o -o hello

When software is designed in a modular way, the whole program usually consists of multiple source files, which in turn form multiple compilation units. GCC manages these well. Suppose you have a program consisting of two source files, foo1.c and foo2.c. To compile them and eventually generate the executable foo, you can use the following command:

# gcc foo1.c foo2.c -o foo

If more than one file is processed at the same time, GCC will continue to follow the pre-processing, compilation, and linking process. If you look deeper, the above command is roughly equivalent to executing the following three commands in sequence:

# gcc -c foo1.c -o foo1.o
# gcc -c foo2.c -o foo2.o
# gcc foo1.o foo2.o -o foo

When a project has many source files, compiling with a single GCC command wastes time. Suppose the project has 100 source files to compile, each containing 10000 lines of code. If a single GCC command is used as above, GCC has to recompile every source file and then link them all together. Obviously this is wasteful, especially when only one of the files has been modified: there is no need to recompile every file, because most of the generated object files will not change. The key to solving this problem is not only to use GCC flexibly but also to use a tool such as Make.

Warning function

GCC includes full error checking and warning prompting features that help Linux programmers write more professional and beautiful code. First read the program shown in Listing 2. This code is very badly written. It is not difficult to pick out a lot of problems when you check it carefully:

◆ The return type of the main function is declared as void, but it should actually be int;
◆ A GNU syntax extension is used: long long declares a 64-bit integer, which does not conform to the ANSI/ISO C89 language standard;
◆ The main function does not execute a return statement before it terminates.

Listing 2: illcode.c

#include <stdio.h>

void main(void)
{
    long long int var = 1;
    printf("It is not standard C code!\n");
}

Let's take a look at how GCC helps programmers discover these errors. When GCC compiles source code that does not conform to the ANSI/ISO C language standard, if the -pedantic option is added, the place where the extended syntax is used will generate a corresponding warning message:

# gcc -pedantic illcode.c -o illcode
illcode.c: In function `main':
illcode.c:9: ISO C89 does not support `long long'
illcode.c:8: return type of `main' is not `int'

It should be noted that the -pedantic compile option does not guarantee that the compiled program is fully compatible with the ANSI/ISO C standard. It can only help Linux programmers get closer and closer to that goal. In other words, the -pedantic option can help programmers find code that does not conform to the ANSI/ISO C standard, but not all of it. In fact, only those cases for which the ANSI/ISO C language standard requires a compiler diagnostic can be found by GCC and reported as warnings. In addition to -pedantic, GCC has other compile options that also generate useful warning messages. Most of these options start with -W. The most valuable of them is -Wall, which makes GCC generate as many warnings as possible:

# gcc -Wall illcode.c -o illcode
illcode.c:8: warning: return type of `main' is not `int'
illcode.c: In function `main':
illcode.c:9: warning: unused variable `var'

Although the warnings GCC gives cannot, strictly speaking, be counted as errors, they often mark places where errors are lurking. A good Linux programmer should try to avoid warnings and keep the code simple, elegant, and robust. Another common compile option for handling warnings is -Werror, which requires GCC to treat all warnings as errors. This is very useful when using automatic build tools such as Make. If you compile with the -Werror option, GCC stops compiling wherever a warning is generated, forcing the programmer to fix the code. Compilation can continue only once the cause of the warning has been removed. It works as follows:

# gcc -Wall -Werror illcode.c -o illcode
cc1: warnings being treated as errors
illcode.c:8: warning: return type of `main' is not `int'
illcode.c: In function `main':
illcode.c:9: warning: unused variable `var'

For Linux programmers, the warning messages given by GCC are valuable. They not only help programmers write more robust programs, but they are also powerful tools for tracking and debugging programs. It is recommended to always use the -Wall option when compiling source code with GCC and gradually develop it into a habit, which is helpful for finding common implicit programming errors.

Library dependency

When developing software under Linux, it is relatively rare to use no third-party libraries at all. Generally, one or more libraries are needed to provide the required functionality. From a programmer's perspective, a library is actually a collection of header files (.h) and library files (.so or .a). Although most libraries under Linux put their header files in the /usr/include/ directory and their library files in the /usr/lib/ directory by default, this is not always the case. Because of this, GCC must have its own way to find the required header files and library files at compile time. GCC uses search directories to find the files it needs. The -I option adds a new directory to the GCC header search path. For example, if the header files needed for compiling are in the /home/xiaowp/include/ directory, you can use the -I option so that GCC finds them:

# gcc foo.c -I /home/xiaowp/include -o foo

Similarly, if you use a library file that is not in a standard location, you can add a new directory to the GCC's library file search path with the -L option. For example, if you have the library file libfoo.so needed in the /home/xiaowp/lib/ directory, in order for GCC to find it successfully, you can use the following command:

# gcc foo.c -L /home/xiaowp/lib -lfoo -o foo

The -l option deserves some explanation: it instructs GCC to link against the library file libfoo.so. Library files under Linux follow a naming convention, namely that their names start with the three letters lib. Since all library files follow the same convention, the three letters lib can be omitted from the name given to the -l option, which means that GCC automatically links the file named libfoo.so when it processes -lfoo. Library files under Linux fall into two categories: dynamic link libraries (usually ending with .so) and static link libraries (usually ending with .a). The difference between the two is whether the code the program needs is loaded dynamically at run time or linked in statically at compile time. By default, GCC prefers dynamic link libraries when linking and considers static link libraries only when no dynamic link library exists. If necessary, you can use the -static option at compile time to force the use of static link libraries. For example, if the library files libfoo.so and libfoo.a are both in the /home/xiaowp/lib/ directory, you can make GCC use only the static link library with the following command:

# gcc foo.c -L /home/xiaowp/lib -static -lfoo -o foo

Code optimization

Code optimization means that the compiler analyzes the source code, finds the parts that are not yet optimal, and reorganizes them to improve the program's execution performance. The code optimization GCC provides is very powerful. It is controlled by the compile option -On, where n is an integer representing the optimization level. The range of values of n and the corresponding optimization effects may differ between versions of GCC; the typical range is from 0 to 2 or 3. Compiling with the option -O tells GCC to reduce both code size and execution time, and is equivalent to -O1. The optimizations performed at this level depend on the target processor, but generally include jump threading and deferred stack pops. The option -O2 tells GCC to perform, in addition to all -O1 optimizations, some further adjustments such as processor instruction scheduling. The option -O3 includes all -O2 optimizations, plus loop unrolling and other optimizations related to processor characteristics. In general, the higher the number, the higher the level of optimization and, usually, the faster the program runs. Many Linux programmers like the -O2 option because it strikes a good balance among optimization level, compile time, and code size.

Let's take a look at GCC's code optimization function through a specific example. The program used is shown in Listing 3.

Listing 3: optimize.c

#include <stdio.h>

int main(void)
{
    double counter;
    double result;
    double temp;

    for (counter = 0; counter < 2000.0 * 2000.0 * 2000.0 / 20.0 + 2020; counter += (5 - 1) / 4) {
        temp = counter / 1979;
        result = counter;
    }
    printf("Result is %lf\n", result);
    return 0;
}

First compile without any optimization options:

# gcc -Wall optimize.c -o optimize

With the time command provided by Linux, you can roughly calculate the time required for the program to run:

# time ./optimize
Result is 400002019.000000
real 0m14.942s
user 0m14.940s
sys 0m0.000s

Then use the optimization option to optimize the code:

# gcc -Wall -O optimize.c -o optimize

Test the run time again under the same conditions:

# time ./optimize
Result is 400002019.000000
real 0m3.256s
user 0m3.240s
sys 0m0.000s

Comparing the output of the two runs, it is not difficult to see that the performance of the program has improved greatly, from the original 14 seconds to 3 seconds. This example was designed specifically to show off GCC's optimization, so the execution speed before and after optimization differs a great deal. Although GCC's code optimization is very powerful, a good Linux programmer must first try to write high-quality code by hand. If the code you write is concise and logically clear, the compiler has less extra work to do and may not need to optimize at all. Although optimization can give a program better execution performance, it is best to avoid optimizing the code in the following situations:

◆ During program development. The higher the optimization level, the longer compilation takes, so do not use optimization options for most of the development period; optimize only the final generated code, when the software is about to be released or development is ending.
◆ When resources are limited. Some optimization options increase the size of the executable code. If the memory a program can obtain at run time is very tight (as on some real-time embedded devices), do not optimize the code, because the negative impact can have very serious consequences.
◆ When tracing and debugging. During optimization, some code may be deleted, rewritten, or reorganized for better performance, which makes tracing and debugging extremely difficult.

Debugging

A powerful debugger not only gives programmers a means of tracing program execution, it also helps them find ways to solve problems. For Linux programmers, GDB (GNU Debugger) works together with GCC to provide a complete debugging environment for Linux-based software development. By default, GCC does not insert debug symbols into the generated binary at compile time, as this increases the size of the executable. If you need debug symbol information to be generated at compile time, you can use GCC's -g or -ggdb option. GCC also takes a graded approach to generating debug symbols. Developers can specify how much debugging information is added to the code by appending the number 1, 2, or 3 to the -g option. The default level is 2 (-g2), at which the debug information includes the extended symbol table, line numbers, and information about local and external variables. Level 3 (-g3) contains all the debugging information in level 2, as well as the macros defined in the source code. Level 1 (-g1) contains no local variable or line number information, so it can only be used for backtracing and stack dumps. Backtracing means examining the history of function calls made while the program was running. A stack dump is a way of saving the program's execution environment in raw hexadecimal format. Both are frequently used debugging methods. The debug symbols GCC generates are general purpose and can be used by many debuggers, but if you are using GDB, you can also include GDB-specific debugging information in the generated binary with the -ggdb option. The advantage of this approach is that it makes debugging with GDB easier; the disadvantage is that other debuggers (such as DBX) may then be unable to debug the program properly. The option -ggdb accepts exactly the same debug levels as -g, and they have the same effect on the debug symbols in the output.

It should be noted that using any of the debugging options increases the size of the resulting binary file considerably and may add to the program's overhead, so debugging options are usually used only during the development and debugging phases of the software. The effect of debugging options on the size of the generated code can be seen from the comparison below:

# gcc optimize.c -o optimize
# ls optimize -l
-rwxrwxr-x 1 xiaowp xiaowp 11649 Nov 20 08:53 optimize (without debugging options)
# gcc -g optimize.c -o optimize
# ls optimize -l
-rwxrwxr-x 1 xiaowp xiaowp 15889 Nov 20 08:54 optimize (add debug option)

Although debugging options increase the size of the file, much Linux software is in fact still compiled with debugging options in its test versions and even in its final releases. The aim is to encourage users who find a problem to solve it themselves, which is a notable feature of Linux. The following concrete example shows how to use debug symbols to analyze an error. The program used is shown in Listing 4.

Listing 4: crash.c

#include <stdio.h>

int main(void)
{
    int input = 0;
    printf("Input an integer:");
    scanf("%d", input);
    printf("The integer you input is %d\n", input);
    return 0;
}

Compiling and running the above code will generate a serious segmentation fault as follows:

# gcc -g crash.c -o crash
# ./crash
Input an integer:10
Segmentation fault

In order to find the error more quickly, you can use GDB for trace debugging, as follows:

# gdb crash
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
......
(gdb)

When the GDB prompt appears, it indicates that GDB is ready to debug. Now you can use the run command to start the program under GDB monitoring:

(gdb) run
Starting program: /home/xiaowp/thesis/gcc/code/crash
Input an integer:10
Program received signal SIGSEGV, Segmentation fault.
0x4008576b in _IO_vfscanf_internal () from /lib/libc.so.6

A careful look at the output from GDB shows that the program was aborted because of a segmentation fault, which indicates a problem with a memory operation; specifically, the problem occurred during the call to _IO_vfscanf_internal (). To get more useful information, you can use the backtrace command provided by GDB. The results are as follows:

(gdb) backtrace
#0 0x4008576b in _IO_vfscanf_internal () from /lib/libc.so.6
#1 0xbffff0c0 in ?? ()
#2 0x4008e0ba in scanf () from /lib/libc.so.6
#3 0x08048393 in main () at crash.c:11
#4 0x40042917 in __libc_start_main () from /lib/libc.so.6

Skip the first three lines in the output. From the fourth line of the output, it's easy to see that GDB has positioned the error in line 11 of crash.c. Now check it out carefully:

(gdb) frame 3
#3 0x08048393 in main () at crash.c:11
11 scanf("%d", input);

The frame command provided by GDB locates the code where the error occurred. The number that follows the command is the one shown at the beginning of the corresponding line in the backtrace output. Now the error has been found: scanf("%d", input); should be changed to scanf("%d", &input);. Once that is done, you can exit GDB with the following command:

(gdb) quit

GDB's capabilities go far beyond this: it can also step through a program, inspect variables in memory, and set breakpoints. During debugging you may need the intermediate results generated by the compiler. The -save-temps option makes GCC save the preprocessed code, assembly code, and object code as files. If you want to check whether the generated code can be tuned by hand to improve execution performance, the intermediate files generated during compilation will be very helpful, as follows:

# gcc -save-temps foo.c -o foo
# ls foo*
foo foo.c foo.i foo.s

Other debugging options supported by GCC include -p and -pg, which add profiling information to the final generated binary. Profiling information helps in finding a program's performance bottlenecks and is a powerful aid for Linux programmers developing high-performance programs. Adding the -p option at compile time adds to the generated code the statistics that the generic profiling tool (Prof) recognizes, while the -pg option generates statistics that only the GNU profiling tool (Gprof) can recognize. Last but not least, although GCC allows debug symbol information to be added at the same time as optimization, optimized code is a big challenge for debugging. After the code is optimized, variables declared and used in the source program may no longer exist, the control flow may suddenly jump to an unexpected place, and loop statements may end up scattered everywhere because of loop unrolling. All of this is a nightmare for debugging. It is recommended that you not use any optimization options while debugging and enable optimization only when the program is finalized.
