Debugging tactics can involve interactive debugging, control flow analysis, unit testing, integration testing, log file analysis, monitoring at the application or system level, memory dumps, and profiling.
Origin of the term
The terms "bug" and "debugging" are popularly attributed to Admiral Grace Hopper in the 1940s. While she was working on a Mark II computer at Harvard University, her associates discovered a moth stuck in a relay and thereby impeding operation, whereupon she remarked that they were "debugging" the system. However, the term "bug", in the sense of "technical error", dates back at least to 1878 and Thomas Edison (see software bug for a full discussion). Similarly, the term "debugging" seems to have been used as a term in aeronautics before entering the world of computers. Indeed, in an interview Grace Hopper remarked that she was not coining the term. The moth fit the already existing terminology, so it was saved. A letter from J. Robert Oppenheimer (director of the WWII atomic bomb "Manhattan" project at Los Alamos, NM) used the term in a letter to Dr. Ernest Lawrence at UC Berkeley, dated October 27, 1944, regarding the recruitment of additional technical staff.
The Oxford English Dictionary entry for "debug" quotes the term "debugging" used in reference to airplane engine testing in a 1945 article in the Journal of the Royal Aeronautical Society. An article in "Airforce" (June 1945 p. 50) also refers to debugging, this time of aircraft cameras. Hopper's bug was found on September 9, 1947. The term was not adopted by computer programmers until the early 1950s. The seminal article by Gill in 1951 is the earliest in-depth discussion of programming errors, but it does not use the term "bug" or "debugging". In the ACM's digital library, the term "debugging" is first used in three papers from 1952 ACM National Meetings. Two of the three use the term in quotation marks. By 1963 "debugging" was a common enough term to be mentioned in passing without explanation on page 1 of the CTSS manual.
Kidwell's article Stalking the Elusive Computer Bug discusses the etymology of "bug" and "debug" in greater detail.
As software and electronic systems have become generally more complex, the various common debugging techniques have expanded with more methods to detect anomalies, assess impact, and schedule software patches or full updates to a system. The words "anomaly" and "discrepancy" can be used, as being more neutral terms, to avoid the words "error" and "defect" or "bug" where there might be an implication that all so-called errors, defects or bugs must be fixed (at all costs). Instead, an impact assessment can be made to determine if changes to remove an anomaly (or discrepancy) would be cost-effective for the system, or perhaps a scheduled new release might render the change(s) unnecessary. Not all issues are life-critical or mission-critical in a system. Also, it is important to avoid the situation where a change might be more upsetting to users, long-term, than living with the known problem(s) (where the "cure would be worse than the disease"). Basing decisions of the acceptability of some anomalies can avoid a culture of a "zero-defects" mandate, where people might be tempted to deny the existence of problems so that the result would appear as zero defects. Considering the collateral issues, such as the cost-versus-benefit impact assessment, then broader debugging techniques will expand to determine the frequency of anomalies (how often the same "bugs" occur) to help assess their impact to the overall system.
Debugging ranges in complexity from fixing simple errors to performing lengthy and tiresome tasks of data collection, analysis, and scheduling updates. The debugging skill of the programmer can be a major factor in the ability to debug a problem, but the difficulty of software debugging varies greatly with the complexity of the system, and also depends, to some extent, on the programming language(s) used and the available tools, such as debuggers. Debuggers are software tools which enable the programmer to monitor the execution of a program, stop it, restart it, set breakpoints, and change values in memory. The term debugger can also refer to the person who is doing the debugging.
Generally, high-level programming languages, such as Java, make debugging easier, because they have features such as exception handling and type checking that make real sources of erratic behaviour easier to spot. In programming languages such as C or assembly, bugs may cause silent problems such as memory corruption, and it is often difficult to see where the initial problem happened. In those cases, memory debugger tools may be needed.
In certain situations, general purpose software tools that are language specific in nature can be very useful. These take the form of static code analysis tools. These tools look for a very specific set of known problems, some common and some rare, within the source code. concentrating more on the semantics (e.g. data flow) rather than the syntax, as compilers and interpreters do.
Both commercial and free tools exist for various languages; some claim to be able to detect hundreds of different problems. These tools can be extremely useful when checking very large source trees, where it is impractical to do code walkthroughs. A typical example of a problem detected would be a variable dereference that occurs before the variable is assigned a value. As another example, some such tools perform strong type checking when the language does not require it. Thus, they are better at locating likely errors in code that is syntactically correct. But these tools have a reputation of false positives, where correct code is flagged as dubious. The old Unix lint program is an early example.
For debugging electronic hardware (e.g., computer hardware) as well as low-level software (e.g., BIOSes, device drivers) and firmware, instruments such as oscilloscopes, logic analyzers or in-circuit emulators (ICEs) are often used, alone or in combination. An ICE may perform many of the typical software debugger's tasks on low-level software and firmware.
Normally the first step in debugging is to attempt to reproduce the problem. This can be a non-trivial task, for example as with parallel processes or some unusual software bugs. Also, specific user environment and usage history can make it difficult to reproduce the problem.
After the bug is reproduced, the input of the program may need to be simplified to make it easier to debug. For example, a bug in a compiler can make it crash when parsing some large source file. However, after simplification of the test case, only few lines from the original source file can be sufficient to reproduce the same crash. Such simplification can be made manually, using a divide-and-conquer approach. The programmer will try to remove some parts of original test case and check if the problem still exists. When debugging the problem in a GUI, the programmer can try to skip some user interaction from the original problem description and check if remaining actions are sufficient for bugs to appear.
After the test case is sufficiently simplified, a programmer can use a debugger tool to examine program states (values of variables, plus the call stack) and track down the origin of the problem(s). Alternatively, tracing can be used. In simple cases, tracing is just a few print statements, which output the values of variables at certain points of program execution.
- Interactive debugging
- Print debugging (or tracing) is the act of watching (live or recorded) trace statements, or print statements, that indicate the flow of execution of a process. This is sometimes called printf debugging, due to the use of the printf function in C. This kind of debugging was turned on by the command TRON in the original versions of the novice-oriented BASIC programming language. TRON stood for, "Trace On." TRON caused the line numbers of each BASIC command line to print as the program ran.
- Remote debugging is the process of debugging a program running on a system different from the debugger. To start remote debugging, a debugger connects to a remote system over a communications link such as a local area network. The debugger can then control the execution of the program on the remote system and retrieve information about its state.
- Post-mortem debugging is debugging of the program after it has already crashed. Related techniques often include various tracing techniques like examining log files, outputting a call stack on crash, and analysis of memory dump (or core dump) of the crashed process. The dump of the process could be obtained automatically by the system (for example, when the process has terminated due to an unhandled exception), or by a programmer-inserted instruction, or manually by the interactive user.
- "Wolf fence" algorithm: Edward Gauss described this simple but very useful and now famous algorithm in a 1982 article for communications of the ACM as follows: "There's one wolf in Alaska; how do you find it? First build a fence down the middle of the state, wait for the wolf to howl, determine which side of the fence it is on. Repeat process on that side only, until you get to the point where you can see the wolf." This is implemented e.g. in the Git version control system as the command git bisect, which uses the above algorithm to determine which commit introduced a particular bug.
- Record and replay debugging is the technique of creating a program execution recording, which can be replayed and interactively debugged. Useful for remote debugging and debugging intermittent, non-determinstic, and other hard-to-reproduce defects.
- Delta Debugging – a technique of automating test case simplification.:p.123
- Saff Squeeze – a technique of isolating failure within the test using progressive inlining of parts of the failing test.
- Causality tracking: There are techniques to track the cause effect chains in the computation. Those techniques can be tailored for specific bugs, such as null pointer dereferences.
Debugging for embedded systems
In contrast to the general purpose computer software design environment, a primary characteristic of embedded environments is the sheer number of different platforms available to the developers (CPU architectures, vendors, operating systems and their variants). Embedded systems are, by definition, not general-purpose designs: they are typically developed for a single task (or small range of tasks), and the platform is chosen specifically to optimize that application. Not only does this fact make life tough for embedded system developers, it also makes debugging and testing of these systems harder as well, since different debugging tools are needed for different platforms.
Despite the challenge of heterogeneity mentioned above, some debuggers have been developed commercially as well as research prototypes. Examples of commercial solutions come from Green Hills Software  and Microchip's MPLAB-ICD (for in-circuit debugger). Two examples of research prototype tools are Aveksha and Flocklab. They all leverage a functionality available on low-cost embedded processors, an On-Chip Debug Module (OCDM), whose signals are exposed through a standard JTAG interface. They are benchmarked based on how much change to the application is needed and the rate of events that they can keep up with.
In addition to the typical task of identifying bugs in the system, embedded system debugging also seeks to collect information about the operating states of the system that may then be used to analyze the system: to find ways to boost its performance or to optimize other important characteristics (e.g. energy consumption, reliability, real-time response etc.).
Anti-debugging is "the implementation of one or more techniques within computer code that hinders attempts at reverse engineering or debugging a target process". It is actively used by recognized publishers in copy-protection schemas, but is also used by malware to complicate its detection and elimination. Techniques used in anti-debugging include:
- API-based: check for the existence of a debugger using system information
- Exception-based: check to see if exceptions are interfered with
- Process and thread blocks: check whether process and thread blocks have been manipulated
- Modified code: check for code modifications made by a debugger handling software breakpoints
- Hardware- and register-based: check for hardware breakpoints and CPU registers
- Timing and latency: check the time taken for the execution of instructions
- Detecting and penalizing debugger
An early example of anti-debugging existed in early versions of Microsoft Word which, if a debugger was detected, produced a message that said, "The tree of evil bears bitter fruit. Now trashing program disk.", after which it caused the floppy disk drive to emit alarming noises with the intent of scaring the user away from attempting it again.
- InfoWorld Oct 5, 1981
- S. Gill, The Diagnosis of Mistakes in Programmes on the EDSAC, Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, Vol. 206, No. 1087 (May 22, 1951), pp. 538-554
- Robert V. D. Campbell, Evolution of automatic computation, Proceedings of the 1952 ACM national meeting (Pittsburgh), p 29-32, 1952.
- Alex Orden, Solution of systems of linear inequalities on a digital computer, Proceedings of the 1952 ACM national meeting (Pittsburgh), p. 91-95, 1952.
- Howard B. Demuth, John B. Jackson, Edmund Klein, N. Metropolis, Walter Orvedahl, James H. Richardson, MANIAC, Proceedings of the 1952 ACM national meeting (Toronto), p. 13-16
- The Compatible Time-Sharing System, M.I.T. Press, 1963
- Peggy Aldrich Kidwell, Stalking the Elusive Computer Bug, IEEE Annals of the History of Computing, 1998.
- "Postmortem Debugging".
- E. J. Gauss (1982). ""Pracniques: The "Wolf Fence" Algorithm for Debugging"". Cite journal requires
- Zeller, Andreas (2005). Why Programs Fail: A Guide to Systematic Debugging. Morgan Kaufmann. ISBN 1-55860-866-4.
- "Kent Beck, Hit 'em High, Hit 'em Low: Regression Testing and the Saff Squeeze".
- Zeller, Andreas (2002-11-01). "Isolating cause-effect chains from computer programs". ACM SIGSOFT Software Engineering Notes. 27 (6): 1. doi:10.1145/605466.605468. ISSN 0163-5948.
- Bond, Michael D.; Nethercote, Nicholas; Kent, Stephen W.; Guyer, Samuel Z.; McKinley, Kathryn S. (2007). "Tracking bad apples". Proceedings of the 22nd annual ACM SIGPLAN conference on Object oriented programming systems and applications - OOPSLA '07. p. 405. doi:10.1145/1297027.1297057. ISBN 9781595937865.
- Cornu, Benoit; Barr, Earl T.; Seinturier, Lionel; Monperrus, Martin (2016). "Casper: Automatic tracking of null dereferences to inception with causality traces". Journal of Systems and Software. 122: 52–62. doi:10.1016/j.jss.2016.08.062. ISSN 0164-1212.
- "SuperTrace Probe hardware debugger". www.ghs.com. Retrieved 2017-11-25.
- Tancreti, Matthew; Hossain, Mohammad Sajjad; Bagchi, Saurabh; Raghunathan, Vijay (2011). "Aveksha: A Hardware-software Approach for Non-intrusive Tracing and Profiling of Wireless Embedded Systems". Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems. SenSys '11. New York, NY, USA: ACM: 288–301. doi:10.1145/2070942.2070972. ISBN 9781450307185.
- Lim, Roman; Ferrari, Federico; Zimmerling, Marco; Walser, Christoph; Sommer, Philipp; Beutel, Jan (2013). "FlockLab: A Testbed for Distributed, Synchronized Tracing and Profiling of Wireless Embedded Systems". Proceedings of the 12th International Conference on Information Processing in Sensor Networks. IPSN '13. New York, NY, USA: ACM: 153–166. doi:10.1145/2461381.2461402. ISBN 9781450319591.
- Shields, Tyler (2008-12-02). "Anti-Debugging Series – Part I". Veracode. Retrieved 2009-03-17.
- "Software Protection through Anti-Debugging Michael N Gagnon, Stephen Taylor, Anup Ghosh" (PDF). Archived from the original (PDF) on 2011-10-01. Retrieved 2010-10-25.
- Ross J. Anderson (2001-03-23). Security Engineering. p. 684. ISBN 0-471-38922-6.
- "Microsoft Word for DOS 1.15".
- Agans, David J. (2002). Debugging: The Nine Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems. AMACOM. ISBN 0-8144-7168-4.
- Blunden, Bill (2003). Software Exorcism: A Handbook for Debugging and Optimizing Legacy Code. APress. ISBN 1-59059-234-4.
- Ford, Ann R.; Teorey, Toby J. (2002). Practical Debugging in C++. Prentice Hall. ISBN 0-13-065394-2.
- Grötker, Thorsten; Holtmann, Ulrich; Keding, Holger; Wloka, Markus (2012). The Developer's Guide to Debugging, Second Edition. Createspace. ISBN 978-1-4701-8552-7.
- Metzger, Robert C. (2003). Debugging by Thinking : A Multidisciplinary Approach. Digital Press. ISBN 1-55558-307-5.
- Myers, Glenford J (2004). *The Art of Software Testing. John Wiley & Sons Inc. ISBN 0-471-04328-1.
- Robbins, John (2000). Debugging Applications. Microsoft Press. ISBN 0-7356-0886-5.
- Telles, Matthew A.; Hsieh, Yuan (2001). The Science of Debugging. The Coriolis Group. ISBN 1-57610-917-8.
- Vostokov, Dmitry (2008). Memory Dump Analysis Anthology Volume 1. OpenTask. ISBN 978-0-9558328-0-2.
- Zeller, Andreas (2009). Why Programs Fail, Second Edition: A Guide to Systematic Debugging. Morgan Kaufmann. ISBN 978-0-1237-4515-6.
|Wikiquote has quotations related to: Debugging|
|The Wikibook Computer Programming Principles has a page on the topic of: Debugging|
- Crash dump analysis patterns – in-depth articles on analyzing and finding bugs in crash dumps
- Learn the essentials of debugging – how to improve your debugging skills, a good article at IBM developerWorks (archived from the original on February 18, 2007)
- Plug-in Based Debugging For Embedded Systems
- Embedded Systems test and debug – about digital input generation – results of a survey about embedded system test and debug, Byte Paradigm (archived from the original on January 12, 2012)