Wednesday, July 29, 2009

Protocol

A communications protocol is a formal description of digital message formats and the rules for exchanging those messages in or between computing systems and in telecommunications. Protocols may include signaling, authentication and error detection and correction capabilities. A protocol describes the syntax, semantics, and synchronization of communication and may be implemented in hardware or software, or both.

Introduction
In a diplomatic context the word protocol refers to a diplomatic document or a rule, guideline, etc. that guides diplomatic behaviour. Synonyms are procedure and policy. While there is no generally accepted formal definition of "protocol" in computer science, an informal definition, based on the previous, could be "a description of a set of procedures to be followed when communicating". In computer science the word algorithm is a synonym for the word procedure, so a protocol is to communications what an algorithm is to computations.
Communicating systems use well-defined formats for exchanging messages. Each message has an exact meaning intended to provoke a defined response of the receiver. A protocol therefore describes the syntax, semantics, and synchronization of communication. A programming language describes the same for computations, so there is a close analogy between protocols and programming languages: protocols are to communications what programming languages are to computations.
Figure 1. Using a layering scheme to structure a document tree.

Diplomatic documents build on each other, thus creating document-trees. The way the sub-documents making up a document-tree are written has an impact on the complexity of the tree. By imposing a development model on the documents, overall readability can be improved and complexity can be reduced.
An effective model to this end is the layering scheme or model. In a layering scheme the documents making up the tree are thought to belong to classes, called layers. The distance of a sub-document to its root-document is called its level. The level of a sub-document determines the class it belongs to. The sub-documents belonging to a class all provide similar functionality and, when form follows function, have similar form.
The communications protocols in use on the Internet are designed to function in very complex and diverse settings, so they tend to be very complex. Unreliable transmission links add to this by making even basic requirements of protocols harder to achieve.
To ease design, communications protocols are also structured using a layering scheme as a basis. Instead of using a single universal protocol to handle all transmission tasks, a set of cooperating protocols fitting the layering scheme is used.
Figure 2. The TCP/IP model or Internet layering scheme and its relation to some common protocols.

The layering scheme in use on the Internet is called the TCP/IP model. The actual protocols are collectively called the Internet protocol suite. The group responsible for this design is called the Internet Engineering Task Force (IETF).
Obviously the number of layers of a layering scheme and the way the layers are defined can have a drastic impact on the protocols involved. This is where the analogies come into play for the TCP/IP model, because the designers of TCP/IP employed the same techniques used to conquer the complexity of programming language compilers (design by analogy) in the implementation of its protocols and its layering scheme.
Like diplomatic protocols, communications protocols have to be agreed upon by the parties involved. To reach agreement a protocol is developed into a technical standard. International standards are developed by the International Organization for Standardization (ISO).

Communicating systems
The information exchanged between devices on a network or other communications medium is governed by rules (conventions) that can be set out in a technical specification called a communication protocol standard. The nature of the communication, the actual data exchanged and any state-dependent behaviors are defined by the specification. This approach is often taken for protocols used in telecommunications.
In digital computing systems, the rules can be expressed by algorithms and data structures, opening up the possibility of hardware independence. Expressing the algorithms in a portable programming language makes the protocol software operating-system independent. The protocols in use by an operating system itself lend themselves to being described this way and are usually, just like the rest of the operating system, distributed in binary or source form.
Operating systems are usually conceived of as consisting of a set of cooperating processes that manipulate a shared store (on the system itself) to communicate with each other. This communication is governed by well-understood protocols and is only a small part of what a process is supposed to accomplish (managing system resources like CPUs, memory, timers, I/O devices, etc., and providing controlled access to the resources), so these protocols can be embedded in the process code itself as small additional code fragments.
In contrast, communicating systems have to communicate with each other using shared transmission media, because there is no common memory. Unlike a memory store operation, a transmission need not be reliable and can involve different hardware and operating systems on different systems. This complicates matters to the point that some kind of structuring is necessary to conquer the complexity of networking protocols, especially when used on the Internet. The communicating systems can make use of different operating systems, as long as they agree to use the same kind of structuring and the same protocols for their communications.
To implement a networking protocol, the protocol software modules are interfaced with a framework assumed to be implemented on the machine's operating system. This framework implements the networking functionality of the operating system. Obviously, the framework needs to be as simple as possible, to allow for easier incorporation into operating systems. The best-known frameworks are the TCP/IP model and the OSI model.
At the time the Internet was formed, layering had proven to be a successful design approach for both compiler and operating system design and given the similarities between programming languages and communication protocols, it was intuitively felt that layering should be applied to the protocols as well. This gave rise to the concept of layered protocols which nowadays forms the basis of protocol design.
Systems do not use a single protocol to handle a transmission. Instead they use a set of cooperating protocols, sometimes called a protocol family or protocol suite. Some of the best known protocol suites include: IPX/SPX, X.25, AX.25, AppleTalk and TCP/IP. To cooperate the protocols have to communicate with each other, so there is an unnamed 'protocol' to do this. A technique used by this 'protocol' is called encapsulation, which makes it possible to pass messages from layer to layer in the framework.
The protocols can be arranged into groups based on functionality; for instance, there is a group of transport protocols. The functionalities are mapped onto the layers, each layer solving a distinct class of problems relating to, for instance, application, transport, internet and network interface functions. To transmit a message, a protocol has to be selected from each layer, so some sort of multiplexing/demultiplexing takes place. The selection of the next protocol, also part of the aforementioned 'protocol', is accomplished by extending the message with a protocol selector for each layer.
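As a rough illustration of encapsulation and protocol selectors (the header layout and the selector values below are invented for the example, not taken from any real suite), a Python sketch might look like this:

    # Each layer prepends its own header: a one-byte selector naming the protocol
    # that should handle the payload one layer up, plus a two-byte length field.
    def encapsulate(payload, selector):
        header = bytes([selector]) + len(payload).to_bytes(2, "big")
        return header + payload

    def decapsulate(packet):
        """Strip one layer's header and return (selector, payload)."""
        selector = packet[0]
        length = int.from_bytes(packet[1:3], "big")
        return selector, packet[3:3 + length]

    # Sending side: wrap the application data once per layer (selector values made up).
    app_data = b"hello"
    segment = encapsulate(app_data, selector=6)
    datagram = encapsulate(segment, selector=17)

    # Receiving side: peel the layers off again, using each selector to demultiplex.
    sel_outer, inner = decapsulate(datagram)
    sel_inner, data = decapsulate(inner)
    assert data == app_data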
There are myriad protocols, but they all differ only in the details. For this reason the TCP/IP protocol suite can be studied to get the overall protocol picture. The Internet Protocol (IP) and the Transmission Control Protocol (TCP) are the most important of these, and the term Internet Protocol Suite, or TCP/IP, refers to a collection of its most used protocols. Most of the communication protocols in use on the Internet are described in the Request for Comments (RFC) documents of the Internet Engineering Task Force (IETF). RFC 1122, in particular, documents the suite itself.

Basic requirements of protocols
To establish communications, the data representing the messages has to be sent and received by the communicating systems. Protocols should therefore specify rules governing the transmission. In general, much of the following should be addressed:
Data formats for data exchange. In digital messaging, bitstrings are exchanged. The bitstrings are divided into fields and each field carries information relevant to the protocol. Conceptually the bitstring is divided into two parts called the header area and the data area. The actual message is stored in the data area, so the header area contains the fields with more relevance to the protocol. Transmissions are limited in size, because the number of transmission errors is proportional to the size of the bitstrings being sent. Bitstrings longer than the maximum transfer unit (MTU) are divided into pieces of appropriate size. Each piece has almost the same header area contents, because only some fields are dependent on the contents of the data area (notably CRC fields, containing checksums that are calculated from the data area contents).
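A minimal Python sketch of the header/data split and the MTU-based fragmentation described above; the two-byte header (piece index, piece count) and the tiny MTU are assumptions chosen to keep the example readable, not a real packet format:

    MTU = 8   # deliberately tiny so the example stays small

    def fragment(payload, mtu=MTU):
        """Split a bitstring into MTU-sized pieces, each with a small header."""
        chunk_size = mtu - 2   # two header bytes per piece
        chunks = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
        total = len(chunks)
        return [bytes([index, total]) + chunk for index, chunk in enumerate(chunks)]

    def reassemble(pieces):
        ordered = sorted(pieces, key=lambda p: p[0])   # order by piece index
        return b"".join(p[2:] for p in ordered)        # drop the headers, keep the data

    pieces = fragment(b"a message longer than one MTU")
    assert reassemble(pieces) == b"a message longer than one MTU"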
Address formats for data exchange. The addresses are used to identify both the sender and the intended receiver(s). The addresses are stored in the header area of the bitstrings, allowing the receivers to determine whether the bitstrings are intended for themselves and should be processed or (when not to be processed) should be discarded. A connection between a sender and a receiver can be identified using an address pair (sender address, receiver address). Usually some address values have special meanings. An all-1s address could be taken to mean all stations on the network, so sending to this address would result in a broadcast on the local network. Likewise, an all-0s address could be taken to mean the sending station itself (as a synonym of the actual address). Stations have addresses unique to the local network, so usually the address is conceptually divided into two parts: a network address and a station address. The network address uniquely identifies the network on the internetwork (a network of networks). The rules describing the meanings of the address values are collectively called an addressing scheme.
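For illustration only, a toy addressing scheme along these lines could pack a network part and a station part into a 16-bit address, with an all-1s station part meaning broadcast; the layout is an assumption, not any real protocol's address format:

    # Toy scheme: the high 8 bits identify the network, the low 8 bits the station.
    def split_address(addr):
        network = (addr >> 8) & 0xFF
        station = addr & 0xFF
        return network, station

    def is_broadcast(addr):
        return (addr & 0xFF) == 0xFF   # all-1s station part: every station on that network

    print(split_address(0x2A07))   # (42, 7): network 0x2A, station 0x07
    print(is_broadcast(0x2AFF))    # True: broadcast on network 0x2A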
Address mapping. Sometimes protocols need to map addresses of one scheme onto addresses of another scheme, for instance to translate a logical IP address specified by the application into a hardware address. This is referred to as address mapping. The mapping is implied in hierarchical addressing schemes, where only a part of the address is used for the mapped address. In other cases the mapping needs to be described using tables.
Routing. When systems are not directly connected, intermediary systems along the route to the intended receiver(s) need to forward messages (instead of discarding them) on behalf of the sender. Determining the route the message should take is called routing. On the Internet, the networks are connected using routers (gateways). This way of connecting networks is called internetworking. To determine the next router on the path to the destination, all systems consult locally stored tables consisting of (destination network address, delivery address) entries and a special entry consisting of (a 'catch-all' address, default router address). The delivery address is either the address of a router assumed to be closer to the destination and the hardware interface to be used to reach it, or the address of a hardware interface on the system directly connecting a network. The default router is used when no other entry matches the intended destination network.
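The table lookup described above could be sketched roughly as follows; the networks, interfaces and router names are hypothetical:

    # (destination network -> delivery address) entries plus a 'catch-all' default.
    ROUTING_TABLE = {
        "net-A": "router-1 via eth0",
        "net-B": "eth1 (directly connected)",
        "default": "router-2 via eth0",
    }

    def next_hop(destination_network):
        """Return the delivery address, falling back to the default router."""
        return ROUTING_TABLE.get(destination_network, ROUTING_TABLE["default"])

    print(next_hop("net-B"))   # eth1 (directly connected)
    print(next_hop("net-Z"))   # router-2 via eth0 (no entry matches, so the default is used)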
Detection of transmission errors is necessary, because no network is error-free. Bits of the bitstring become corrupted or lost. Usually, CRCs of the data area are added to the end of packets, making it possible for the receiver to notice many (nearly all) differences caused by errors, by recalculating the CRCs of the received packet and comparing them with the CRCs given by the sender. The receiver rejects a packet when the CRCs differ and somehow arranges for retransmission.
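A sketch of the receiver-side check, using the standard library's CRC-32 as a stand-in for whatever CRC a particular protocol actually specifies:

    import zlib

    def make_packet(data):
        crc = zlib.crc32(data).to_bytes(4, "big")
        return data + crc                          # CRC appended to the end of the packet

    def check_packet(packet):
        data, received_crc = packet[:-4], packet[-4:]
        return zlib.crc32(data).to_bytes(4, "big") == received_crc

    packet = make_packet(b"payload bits")
    assert check_packet(packet)                    # clean transmission
    corrupted = b"x" + packet[1:]                  # first byte damaged in transit
    assert not check_packet(corrupted)             # receiver rejects and arranges retransmission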
Acknowledgements of correct reception of packets by the receiver are usually used to prevent the sender from retransmitting the packets. Some protocols, notably datagram protocols like the Internet Protocol (IP), do not acknowledge.
Loss of information - timeouts and retries. Sometimes packets are lost on the network or suffer from long delays. To cope with this, a sender expects an acknowledgement of correct reception from the receiver within a certain amount of time. On timeouts, the packet is retransmitted. In case of a broken link the retransmission has no effect, so the number of retransmissions is limited. Exceeding the retry limit is considered an error.
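A minimal sketch of this timeout-and-retry behaviour in Python, using a UDP socket; the address, the one-second timeout and the retry limit of three are arbitrary assumptions for the example:

    import socket

    RETRY_LIMIT = 3          # assumed limit; exceeding it is treated as an error
    TIMEOUT_SECONDS = 1.0    # assumed time to wait for the acknowledgement

    def send_with_retries(message, addr=("127.0.0.1", 9999)):
        """Send a datagram and wait for an acknowledgement, retransmitting on timeout."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(TIMEOUT_SECONDS)
        try:
            for attempt in range(RETRY_LIMIT):
                sock.sendto(message, addr)
                try:
                    ack, _ = sock.recvfrom(1024)   # acknowledgement of correct reception
                    return ack
                except socket.timeout:
                    continue                       # no acknowledgement in time: retransmit
            raise RuntimeError("retry limit exceeded; the link is assumed to be broken")
        finally:
            sock.close()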
Direction of information flow needs to be addressed if transmissions can only occur in one direction at a time (half-duplex links). To gain control of the link a sender must wait until the line becomes idle and then send a message indicating its wish to do so. The receiver responds by acknowledging and waits for the transmissions to come. The sender only begins transmitting after the acknowledgement. Arrangements have to be made to accommodate the case when two parties want to gain control at the same time.
Sequence control. We have seen that long bitstrings are divided into pieces, which are sent over the network individually. The pieces may get 'lost' on the network or arrive out of sequence, because the pieces can take different routes to their destination. Sometimes pieces are needlessly retransmitted, due to network congestion, resulting in duplicate pieces. By sequencing the pieces at the sender, the receiver can determine what was lost or duplicated and ask for retransmissions. Also the order in which the pieces are to be processed can be determined.
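A receiver-side sketch of how sequence numbers let the receiver detect missing and duplicate pieces; the (sequence number, data) form of a piece is assumed purely for clarity:

    def inspect(received, expected_count):
        """Reorder received (sequence number, data) pairs and report gaps and duplicates."""
        seen = {}
        duplicates = []
        for seq, data in received:
            if seq in seen:
                duplicates.append(seq)    # a needlessly retransmitted piece arrived
            seen[seq] = data
        missing = [seq for seq in range(expected_count) if seq not in seen]
        in_order = b"".join(seen[seq] for seq in sorted(seen))
        return in_order, missing, duplicates

    data, missing, dups = inspect([(0, b"he"), (2, b"lo"), (0, b"he")], expected_count=3)
    print(missing)   # [1]  -> ask the sender to retransmit piece 1
    print(dups)      # [0]  -> the duplicate is simply discarded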
Flow control is needed when the sender transmits faster than the receiver can process the transmissions or when the network becomes congested. Sometimes, arrangements can be made to slow down the sender, but in many cases this is outside the control of the protocol.
Getting the data across is only part of the problem. The data received has to be evaluated in the context of the progress of the conversation, so a protocol has to specify rules describing the context and explaining whether the (form of the) data fits this context or not. These kinds of rules are said to express the syntax of the communications. Other rules determine whether the data is meaningful for the context in which the exchange takes place. These kinds of rules are said to express the semantics of the communications.
Both intuitive descriptions and more formal specifications in the form of finite state machine models are used to describe the expected interactions of the protocol. Formal ways of describing the syntax of the communications are Abstract Syntax Notation One (an ISO standard) and Augmented Backus-Naur Form (an IETF standard).

Tuesday, July 14, 2009

Nobuo Uematsu (Figure)

Nobuo Uematsu (植松 伸夫 Uematsu Nobuo?, born March 21, 1959) is a Japanese video game composer, best known for scoring the majority of titles in the Final Fantasy series. He is regarded as one of the most famous and respected composers in the video game community. Uematsu, a self-taught musician, began playing the piano at the age of eleven or twelve, with Elton John as his biggest influence.
Uematsu joined Square (later Square Enix) in 1986, where he met Final Fantasy creator Hironobu Sakaguchi. They have worked together on numerous titles, most notably the games in the Final Fantasy series. After nearly 20 years in the company, he left Square Enix in 2004 and founded his own company called Smile Please, as well as the music production company Dog Ear Records. He has since composed music as a freelancer for video games primarily developed by Square Enix and Sakaguchi's development studio Mistwalker.
A handful of soundtracks and arranged albums of Uematsu's game scores have been released. Pieces from his video game works have been performed in concerts worldwide, and numerous Final Fantasy concerts have also been held. He has worked with Grammy Award-winning conductor Arnie Roth on several of these concerts. In 2002, he formed a rock band with colleagues Kenichiro Fukui and Tsuyoshi Sekito called The Black Mages, in which Uematsu plays the keyboard. The band plays arranged rock versions of Uematsu's Final Fantasy compositions.

Early Life
Nobuo Uematsu was born in Kōchi, Kōchi Prefecture, Japan. A self-taught musician, he began to play the piano when he was eleven or twelve years old; he did not take any formal piano lessons. He has an older sister who also played the piano. After graduating from Kanagawa University, Uematsu played the keyboard in several amateur bands and composed music for television commercials. When Uematsu was working at a music rental shop in Tokyo, a Square employee asked if he would be interested in creating music for some of the titles they were working on. Although he agreed, Uematsu considered it a side job, and he did not think it would become a full-time job. He said it was a way to make some money on the side, while also keeping his part-time job at the music rental shop.

Career with Square and The Black Mages
The first game Uematsu composed for Square was Genesis in 1985. While working at Square, he met Hironobu Sakaguchi, who asked him if he wanted to create music for some of his games, and Uematsu agreed. From 1986 to 1987, he created music for a number of games that did not achieve any success, and Square was near bankruptcy. In 1987, Uematsu and Sakaguchi collaborated on what was originally intended to be Sakaguchi's last contribution to Square, Final Fantasy. The game turned out to be a huge success and ultimately saved Square from bankruptcy.
Final Fantasy's popularity sparked Uematsu's career in video game music, and he would go on to compose music for over 30 titles, most prominently the subsequent games in the Final Fantasy series. He scored the first installment in the SaGa series, The Final Fantasy Legend, in 1989. For the second and fifth games in the series, Final Fantasy Legend II (1990) and Romancing SaGa 2 (1993), he was assisted by Kenji Ito. Uematsu signed on to finish the soundtrack for the critically acclaimed 1995 title Chrono Trigger after the game's composer, Yasunori Mitsuda, contracted peptic ulcers. In 1996, he co-composed the soundtrack to Front Mission: Gun Hazard and created the entire score for DynamiTracer. He also created music for three of the games in the Hanjuku Hero series.
Outside video games, he has composed the main theme for the 2000 animated film Ah! My Goddess: The Movie and co-composed the anime Final Fantasy Unlimited (2001) with Final Fantasy orchestrator Shirō Hamaguchi. He also inspired the Ten Plants concept albums, and released a solo album in 1994, entitled Phantasmagoria. Feeling gradually more dissatisfied and uninspired, Uematsu requested the assistance of composers Masashi Hamauzu and Junya Nakano for the score to Final Fantasy X in 2001. This marked the first time that Uematsu did not compose an entire main-series Final Fantasy soundtrack. For Final Fantasy XI from 2002, he was joined by Naoshi Mizuta, who composed the majority of the soundtrack, and Kumi Tanioka; Uematsu was responsible for only eleven tracks. In 2003, he assisted Hitoshi Sakimoto in scoring Final Fantasy Tactics Advance by providing the main theme.
In 2002, fellow Square Enix colleagues Kenichiro Fukui and Tsuyoshi Sekito asked Uematsu to join them in forming a rock band that focused on reinterpreting and expanding on Uematsu's compositions. He declined their offer at first because he was too busy with work; however, after agreeing to perform with Fukui and Sekito in a live performance as a keyboardist, he decided to join them in making a band. An employee at Square Enix, Mr. Matsushita, chose the name The Black Mages for their band. In 2003, Keiji Kawamori, Arata Hanyuda, and Michio Okamiya also joined the band. The Black Mages have released three studio albums, and have appeared at several concerts to promote their albums.

As a Freelancer
Uematsu left Square Enix in 2004 and formed his own company called Smile Please; he also created the music production company Dog Ear Records in 2006. The reason for Uematsu's departure was that the company moved their office from Meguro to Shinjuku, Tokyo, and he was not comfortable with the new location. He does, however, continue to compose music as a freelancer for Square Enix. In 2005, Uematsu and several members of The Black Mages created the score for the CGI film Final Fantasy VII Advent Children. Uematsu composed only the main theme for Final Fantasy XII (2006); he was originally offered the job of creating the full score, but Sakimoto was eventually assigned as the main composer instead. Uematsu was also initially going to create the theme song for Final Fantasy XIII (2010). However, after being assigned the task of creating the entire score of Final Fantasy XIV, Uematsu decided to hand the job over to the main Final Fantasy XIII composer, Hamauzu.
Uematsu also works closely with Sakaguchi's development studio Mistwalker, and has composed music for the games in the Blue Dragon series, Lost Odyssey (2007), and Away Shuffle Dungeon (2008); he was also the composer of the canceled game Cry On. He is currently composing the music for another Mistwalker title, The Last Story (2011).
He scored the PlayStation Portable title Anata o Yurusanai in 2007 and the arcade game Lord of Vermillion in 2008. Uematsu created the main theme for the multi-composer game Super Smash Bros. Brawl in 2008. He composed the music for the 2009 anime Guin Saga; this marked the first time he provided a full score for an animated series. He recently worked on Sakura Note for the Nintendo DS and is currently working on another DS project for Level 5 and Brownie Brown called Fantasy Life.

Personal Life
Uematsu currently resides in Tokyo, Japan with his wife, Reiko, whom he met during his college days, and their Beagle, Pao. They also have a summer cabin in Yamanakako, Yamanashi. In his spare time, he enjoys watching professional wrestling, drinking beer, and bicycling. Uematsu has said that he originally wanted to become a professional wrestler, and that it was a career dream when he was little.

Monday, July 13, 2009

Hironobu Sakaguchi (Figure)

Hironobu Sakaguchi (坂口 博信 Sakaguchi Hironobu?) (born November 25, 1962) is a Japanese game designer, game director and game producer. He is famous around the world as the creator of the Final Fantasy series, and he has had a long career in gaming with over 80 million units of video games sold worldwide. He left Square Enix and founded a studio called Mistwalker in 2004.

Early years
Sakaguchi was born in Hitachi, Ibaraki, Japan. He studied electrical engineering while attending Yokohama National University, but dropped out in 1983 mid-semester with Hiromichi Tanaka.

Square
On leaving the university, Sakaguchi became a part-time employee of Square, a newly formed branch of Denyūsha Electric Company founded by Masafumi Miyamoto. When Square became an independent company in 1986, he became a full-time employee as the Director of Planning and Development. The company's first games were very unsuccessful. Sakaguchi then decided to create his final work in the game industry with the rest of Square's money, and appropriately named it Final Fantasy, which he claimed, given Square's uncertain future at the time, was an ironic gesture. The game was released in Japan for the Nintendo Entertainment System on December 18, 1987. The game was successful across Japan. Under Sakaguchi's watchful eye, Final Fantasy developed into a successful franchise, spanning from stand-alone stories to spin-offs to direct sequels. In 1991, following the release of Final Fantasy IV for the Super Nintendo Entertainment System, he was honoured with the position of Executive Vice President. The last Final Fantasy game he directed was Final Fantasy V; he became the producer for future installments of the franchise. In 1995, he became President at Square USA, Inc. His final role as game producer was for Final Fantasy IX. In an interview at the time, he described it as his favourite Final Fantasy. He later went on to serve more as an executive producer of the series, as well as many of Square's other games, including Vagrant Story, Parasite Eve and Kingdom Hearts.
Hironobu Sakaguchi became the third person inducted into the Academy of Interactive Arts and Sciences' Hall of Fame on April 5, 2000. His Hall of Fame status was given to him because of the tremendous number of video games he has sold and created.

Time as film director
A long-time proponent of bringing together the story-telling vehicle of film and the interactive elements of games, Sakaguchi took the leap from games to film when he made his debut as a film director with Final Fantasy: The Spirits Within, an animated motion picture based on his world-famous Final Fantasy series. Despite some positive reviews, the movie was the biggest animated box office bomb in cinema history, losing over 120 million dollars.

Resignation from Square
Sakaguchi voluntarily stepped down from his post as an executive vice president at Square. This event also reduced Square's financial capital. Square later merged with its rival, the Enix Corporation, which led to the creation of Square Enix in 2003. In 2004, Sakaguchi founded Mistwalker with the financial backing of Microsoft Game Studios.

Mistwalker
Sakaguchi giving a presentation on Blue Dragon at the Tokyo Game Show 2006 convention.

In 2001, Sakaguchi founded Mistwalker, which began operation in 2004. In February 2005, it was announced that Mistwalker would be working with Microsoft Game Studios to create two RPGs for the Xbox 360. Still, the company remains independent from console exclusivity. Sakaguchi released the works Blue Dragon in 2006, and Lost Odyssey in 2007 on the Xbox 360, and ASH: Archaic Sealed Heat on the Nintendo DS. He was developing an action-RPG, titled Cry On, until the project was canceled in December 2008.
Currently he is working on a new "large scale project", on which Sakaguchi comments: "I’m betting a lot on this project." This game was announced in January 2010 to be The Last Story, a co-production with Nintendo for the Wii. It was revealed in an interview on Nintendo's website that Sakaguchi is the director of The Last Story, which marks his first time as director of a game since Final Fantasy V.
In July 2010, Sakaguchi announced on Mistwalker's blog that The Last Story may be the final title of his career. However, this turned out to be a translation error in which he meant that he was working on this as if it was his last game.

Thursday, July 9, 2009

The Art of Computer Programming

The Art of Computer Programming (acronym: TAOCP) is a comprehensive monograph written by Donald Knuth that covers many kinds of programming algorithms and their analysis. Knuth began the project, originally conceived as a single book, in 1962. The first three of what were then expected to be seven volumes were published in rapid succession in 1968, 1969, and 1973. The first installment of Volume 4 was not published until February 2005. Additional installments are planned for release approximately biannually with a break before fascicle 5 to finish the "Selected Papers" series.

History
Donald Knuth in 2005

Considered an expert at writing compilers, Knuth started to write a book about compiler design in 1962, and soon realized that the scope of the book needed to be much larger. In June 1965, Knuth finished the first draft of what was originally planned to be a single volume of twelve chapters. His hand-written manuscript was 3,000 pages long: he had assumed that about five hand-written pages would translate into one printed page, but his publisher said instead that about 1½ hand-written pages translated to one printed page. This meant the book would be approximately 2,000 pages in length. At this point, the plan was changed: the book would be published in seven volumes, each with just one or two chapters. Due to the growth in the material, the plan for Volume 4 has since expanded to include Volumes 4A, 4B, 4C, and possibly 4D.
In 1976, Knuth prepared a second edition of Volume 2, requiring it to be typeset again, but the style of type used in the first edition (called hot type) was no longer available. In 1977, he decided to spend a few months working up something more suitable. Eight years later, he returned with TeX, which is currently used for all volumes.
The famous offer of a reward check worth "one hexadecimal dollar" ($1.00 in base 16, i.e. 0x100 = 256 cents = $2.56 in decimal) for any errors found, and the correction of these errors in subsequent printings, has contributed to the highly polished and still-authoritative nature of the work, long after its first publication. Another characteristic of the volumes is the variation in the difficulty of the exercises. The level of difficulty ranges from "warm-up" exercises to unsolved research problems, providing a challenge for any reader. Knuth's dedication is also famous:

  This series of books is affectionately dedicated
  to the Type 650 computer once installed at
  Case Institute of Technology,
  in remembrance of many pleasant evenings.[nb 1]


Assembly language in the book
All examples in the books use a language called "MIX assembly language", which runs on the hypothetical MIX computer. (Currently, the MIX computer is being replaced by the MMIX computer, which is a RISC version.) Software such as GNU MDK exists to provide emulation of the MIX architecture.
Some readers are put off by the use of assembly language, but Knuth considers this necessary because algorithms need to be in context in order for their speed and memory usage to be judged. This does, however, limit the accessibility of the book for many readers, and limits its usefulness as a "cookbook" for practicing programmers, who may not be familiar with assembly, or who may have no particular desire to translate assembly language code into a high-level language. A number of more accessible algorithms textbooks using high-level language examples exist and are popular for precisely these reasons.

Critical response
American Scientist has included this work among "100 or so Books that shaped a Century of Science", referring to the 20th century, and within the computer science community it is regarded as the first and still the best comprehensive treatment of its subject. Covers of the third edition of Volume 1 quote Bill Gates as saying, "If you think you're a really good programmer . . . read (Knuth's) Art of Computer Programming . . . You should definitely send me a résumé if you can read the whole thing." [nb 2] The New York Times referred to it as "the profession's defining treatise".

Sunday, July 5, 2009

Debugging

Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected. Debugging tends to be harder when various subsystems are tightly coupled, as changes in one may cause bugs to emerge in another. Many books have been written about debugging (see below: Further reading), as it involves numerous aspects, including: interactive debugging, control flow, integration testing, log files, monitoring, memory dumps, Statistical Process Control, and special design tactics to improve detection while simplifying changes.

Origin
There is some controversy over the origin of the term "debugging."
The terms "bug" and "debugging" are both popularly attributed to Admiral Grace Hopper in the 1940s. While she was working on a Mark II Computer at Harvard University, her associates discovered a moth stuck in a relay and thereby impeding operation, whereupon she remarked that they were "debugging" the system. However the term "bug" in the meaning of technical error dates back at least to 1878 and Thomas Edison (see software bug for a full discussion), and "debugging" seems to have been used as a term in aeronautics before entering the world of computers. Indeed, in an interview Grace Hopper remarked that she was not coining the term. The moth fit the already existing terminology, so she saved it.
The Oxford English Dictionary entry for "debug" quotes the term "debugging" used in reference to airplane engine testing in a 1945 article in the Journal of the Royal Aeronautical Society; Hopper's bug was found on September 9, 1947. The term was not adopted by computer programmers until the early 1950s. The seminal article by Gill in 1951 is the earliest in-depth discussion of programming errors, but it does not use the term "bug" or "debugging". In the ACM's digital library, the term "debugging" is first used in three papers from the 1952 ACM National Meetings. Two of the three use the term in quotation marks. By 1963, "debugging" was a common enough term to be mentioned in passing without explanation on page 1 of the CTSS manual.
Kidwell's article Stalking the Elusive Computer Bug discusses the etymology of "bug" and "debug" in greater detail.

The scope of debugging
As software and electronic systems have become generally more complex, the various common debugging techniques have expanded with more methods to detect anomalies, assess impact, and schedule software patches or full updates to a system. The words "anomaly" and "discrepancy" can be used as more neutral terms, to avoid the words "error" and "defect" or "bug" where there might be an implication that all so-called errors, defects or bugs must be fixed (at all costs). Instead, an impact assessment can be made to determine if changes to remove an anomaly (or discrepancy) would be cost-effective for the system, or perhaps a scheduled new release might render the change(s) unnecessary. Not all issues are life-critical or mission-critical in a system. Also, it is important to avoid the situation where a change might be more upsetting to users, long-term, than living with the known problem(s) (where the "cure would be worse than the disease"). Basing decisions on the acceptability of some anomalies can avoid a culture of a "zero-defects" mandate, where people might be tempted to deny the existence of problems so that the result would appear as zero defects. Considering collateral issues such as the cost-versus-benefit impact assessment, broader debugging techniques expand to determine the frequency of anomalies (how often the same "bugs" occur) to help assess their impact on the overall system.

Tools
Debugging ranges, in complexity, from fixing simple errors to performing lengthy and tiresome tasks of data collection, analysis, and scheduling updates. The debugging skill of the programmer can be a major factor in the ability to debug a problem, but the difficulty of software debugging varies greatly with the complexity of the system, and also depends, to some extent, on the programming language(s) used and the available tools, such as debuggers. Debuggers are software tools which enable the programmer to monitor the execution of a program, stop it, re-start it, set breakpoints, change values in memory and even, in some cases, go back in time. The term debugger can also refer to the person who is doing the debugging.
Generally, high-level programming languages, such as Java, make debugging easier, because they have features such as exception handling that make real sources of erratic behaviour easier to spot. In lower-level programming languages such as C or assembly, bugs may cause silent problems such as memory corruption, and it is often difficult to see where the initial problem happened. In those cases, memory debugger tools may be needed.
In certain situations, general-purpose software tools that are language-specific in nature can be very useful. These take the form of static code analysis tools. These tools look for a very specific set of known problems, some common and some rare, within the source code. Issues detected by these tools would rarely be picked up by a compiler or interpreter; thus they are not syntax checkers, but rather semantic checkers. Some tools claim to be able to detect 300+ unique problems. Both commercial and free tools exist for various languages. These tools can be extremely useful when checking very large source trees, where it is impractical to do code walkthroughs. A typical example of a problem detected would be a variable dereference that occurs before the variable is assigned a value. Another example would be performing strong type checking when the language does not require it. Thus, they are better at locating likely errors than actual errors. As a result, these tools have a reputation for false positives. The old Unix lint program is an early example.
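As an illustration of the kind of mistake such a tool flags, consider the hypothetical Python snippet below; a linter such as pylint typically reports the read of total as a use-before-assignment problem, even though the code is syntactically valid:

    # Hypothetical snippet: 'total' may be read before it is ever assigned.
    def sum_positive(values):
        for v in values:
            if v > 0:
                total = total + v   # 'total' was never initialised before this read
        return total

    # The interpreter accepts this; it only fails at run time, e.g.
    # sum_positive([1, 2]) raises UnboundLocalError.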
For debugging electronic hardware (e.g., computer hardware) as well as low-level software (e.g., BIOSes, device drivers) and firmware, instruments such as oscilloscopes, logic analyzers or in-circuit emulators (ICEs) are often used, alone or in combination. An ICE may perform many of the typical software debugger's tasks on low-level software and firmware.

Typical debugging process
Normally the first step in debugging is to attempt to reproduce the problem. This can be a non-trivial task, for example with parallel processes or some unusual software bugs. Also, the specific user environment and usage history can make it difficult to reproduce the problem.
After the bug is reproduced, the input of the program may need to be simplified to make it easier to debug. For example, a bug in a compiler can make it crash when parsing some large source file. However, after simplification of the test case, only a few lines from the original source file can be sufficient to reproduce the same crash. Such simplification can be done manually, using a divide-and-conquer approach: the programmer tries to remove some parts of the original test case and checks whether the problem still exists. When debugging a problem in a GUI, the programmer can try to skip some user interaction from the original problem description and check whether the remaining actions are sufficient for the bug to appear.
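A rough Python sketch of this divide-and-conquer simplification; still_fails stands for whatever check the programmer uses to decide that the bug is still reproduced (re-running the compiler on the reduced file, replaying the remaining GUI actions, and so on):

    def simplify(test_input, still_fails):
        """Repeatedly keep whichever half of the input still reproduces the failure."""
        changed = True
        while changed and len(test_input) > 1:
            changed = False
            half = len(test_input) // 2
            for part in (test_input[:half], test_input[half:]):
                if still_fails(part):
                    test_input = part    # the smaller input still shows the bug
                    changed = True
                    break
        return test_input

    # Toy failure: the bug appears whenever the "bad line" is present.
    failing = ["line %d" % i for i in range(100)] + ["bad line"]
    print(simplify(failing, lambda lines: "bad line" in lines))   # ['bad line']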
After the test case is sufficiently simplified, a programmer can use a debugger tool to examine program states (values of variables, plus the call stack) and track down the origin of the problem(s). Alternatively, tracing can be used. In simple cases, tracing is just a few print statements, which output the values of variables at certain points of program execution.
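In Python, such tracing can be as simple as the sketch below; the logging module is used instead of bare print statements so the trace output can be switched off without deleting the lines:

    import logging

    logging.basicConfig(level=logging.DEBUG)

    def interest(balance, rate, years):
        for year in range(years):
            balance = balance * (1 + rate)
            logging.debug("year=%d balance=%.2f", year + 1, balance)  # trace point
        return balance

    interest(100.0, 0.05, 3)
    # DEBUG:root:year=1 balance=105.00
    # DEBUG:root:year=2 balance=110.25
    # DEBUG:root:year=3 balance=115.76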

Various debugging techniques
Print (or tracing) debugging is the act of watching (live or recorded) trace statements, or print statements, that indicate the flow of execution of a process.
Remote debugging is the process of debugging a program running on a system different from the one hosting the debugger. To start remote debugging, the debugger connects to the remote system over a network. Once connected, the debugger can control the execution of the program on the remote system and retrieve information about its state.
Post-mortem debugging is debugging of the program after it has already crashed. Related techniques often include various tracing techniques (for example, [8]) and/or analysis of a memory dump (or core dump) of the crashed process. The dump of the process could be obtained automatically by the system (for example, when the process has terminated due to an unhandled exception), by a programmer-inserted instruction, or manually by the interactive user.
Delta debugging is a technique for automating test case simplification. [9]:p.123
The Saff Squeeze is a technique for isolating a failure within a test by progressively inlining parts of the failing test.