Chapter 1. Introduction


"...but why restrict yourself to one language? It would be wonderful to always be able to use the appropriate tool for the job: write the stateful shell in Eiffel, the computational core in Haskell, the bit-twiddling code in C, and glue it all together with Python or another loose scripting language..."

--Paraphrased from a Usenet thread on comp.lang.eiffel, close to a quote by Roger Browne 

Computer languages are in many ways the primary tool of the software developer. Much can be made of CASE tools, advanced software engineering techniques, project management software, lifecycle selection, or team training, and these are all valuable tools, but there inevitably comes a point at which code of some sort (be it text files, XML, or a visual description) is written and the ideas and plans which have been deliberated and refined are put into practice.

The choice of language is shaped by a truly phenomenal number of competing interests: industry inertia continues to push languages which are in popular use (such as the old standby, C); business constraints may push languages intended for use with a particular vendor's products (Visual Basic, for example); and the problem domain may be most easily addressed by a particular language (Lisp is used extensively in Artificial Intelligence). But too often an implementation language is chosen because it is comfortable rather than because it is a good match to the problem--and even more often, a problem is so complex that it is not well suited to any single language.

For purposes of discussion, I divide languages into four extremely rough groups:

System languages. Old and hoary, system languages do the grunt work in many operating systems. C, probably the most famous (and the most widely used today), has many great strengths: it can directly manipulate memory and system registers; it can be used to twiddle bits or to build large applications; it has enormous industry support; it is extremely portable; and it is predictable, meaning that what the compiler produces is fairly close to what the author wrote (not necessarily what the author intended!). Unfortunately, C has many problems as well: despite its industry support, it is often rather cryptic and difficult to maintain. It gives the user enough power to break the system (I once wiped the BIOS of my personal computer with an errant C program; the results can be far worse). It also requires the user to operate at a fairly low level, manually keeping track of dynamic memory allocation and the like. Nevertheless, its popularity, cultural inertia, and high speed make C a common choice even for many projects where other languages might be better suited.

Scripting languages. Scripting languages constitute a rather ill-defined but extremely useful body of languages, from special-purpose dialects intended to be interpreted by larger programs to complete, flexible, full-featured languages in their own right. In general, scripting languages work at a high level of abstraction: they allow a great deal of functionality to be covered by only a few lines of code. Scripting languages are often interpreted, or allow in some other way a very fast cycle time between coding and testing. Several are dynamically typed. Perl, Python, and Tcl are three extremely popular scripting languages; for purposes of this discussion and project, I'll be dealing primarily with Python, a dynamically typed, interpreted scripting language. Python is extremely powerful, offering both object-oriented and procedural approaches to software construction, and an easy interface to the systems programming language C. But the greatest strength of Python to many developers is its extensive standard library, covering many different aspects of development from regular expression searching to internet downloading to mail functions to system calls to a GUI based on Tk... the list goes on. It is possible to do a great many things in Python in a fraction of the time and effort necessary to do the same tasks in a systems programming language, and Python's extensive library has earned it the unofficial slogan (from Frank Stajano) that "Python is great because it comes with batteries included." But like many interpreted languages, Python is fairly slow: a Python program will often take 3-5 times as long to run as an equivalent one coded in C. Additionally, its dynamic typing and run-time flexibility, a great asset for programming-in-the-small, can be perceived as a liability for programming-in-the-large (an opinion which has not stopped many developers from creating very substantial applications in Python!). Still, its flexibility and strong library support make it ideal for many tasks.
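
As a small illustration of the "batteries included" point, the sketch below pulls numbers out of free text and averages them using nothing but the standard library. The function name extract_numbers and the sample string are hypothetical, chosen only for this example; re and statistics are standard modules in current Python releases.

```python
import re
import statistics


def extract_numbers(text):
    """Return every decimal number found in text as a float.

    Hypothetical helper for illustration only.
    """
    # Optional minus sign, digits, and an optional fractional part.
    return [float(match) for match in re.findall(r"-?\d+(?:\.\d+)?", text)]


# Made-up sample input; any free-form text containing numbers would do.
sample = "latencies: 12.5, 11, 14.0"
values = extract_numbers(sample)
average = statistics.mean(values)
print(values, average)
```

The same task in a systems language would typically need a hand-written scanner or a third-party regular expression library, plus explicit memory management for the resulting list.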

Object-Oriented (OO) languages. It may seem strange to make a separate category for OO languages, but I do so because OO focuses attention on the abstraction and compartmentalization of processes and problems, which often makes OO languages well suited to large-scale software construction. Of the available OO languages, the one used in this project is Eiffel, a pure, compiled OO language with whole-program optimization and analysis. Eiffel is a fairly verbose but cleanly designed language with a great many features which make it well suited to large-scale construction: garbage collection, a simple but clean class system, Design by Contract (a mechanism for describing the semantic use of classes and features), and a well-defined static typing system that catches many potential errors automatically. Unfortunately, this sort of rigor has, in many developers' eyes, placed Eiffel in the category of "bondage and discipline" languages, perceived as restricting developer freedom and being "less fun" to work with. Additionally, Eiffel's focus on safety often makes it difficult to do low-level system work, and its lack of popularity means that the available libraries, while growing, cannot nearly match the breadth of the libraries available for languages such as C and Python.
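
To make the idea of Design by Contract concrete, here is a rough approximation in Python. Eiffel expresses contracts with built-in require, ensure, and invariant clauses; Python has no such construct, so this sketch imitates them with plain assert statements. The Account class is hypothetical and exists only for illustration.

```python
class Account:
    """Hypothetical bank account used only to illustrate contracts."""

    def __init__(self, balance=0):
        # Precondition on creation: the caller must supply a valid balance.
        assert balance >= 0, "precondition: opening balance must be non-negative"
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # Class invariant: must hold after every public operation.
        assert self.balance >= 0, "invariant: balance must never be negative"

    def withdraw(self, amount):
        # Precondition (Eiffel: require): the caller's obligation.
        assert 0 < amount <= self.balance, "precondition: 0 < amount <= balance"
        old_balance = self.balance
        self.balance -= amount
        # Postcondition (Eiffel: ensure): the supplier's promise.
        assert self.balance == old_balance - amount, "postcondition violated"
        self._check_invariant()


acct = Account(100)
acct.withdraw(30)
print(acct.balance)
```

A call such as acct.withdraw(500) fails the precondition and raises AssertionError, placing the blame on the caller rather than on the class. Unlike Eiffel's contracts, these checks are not part of the class interface, and running Python with the -O option disables them entirely.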

Functional languages. Functional languages are a peculiar but potentially powerful subset of languages. They focus on treating the execution of a program as the evaluation of a function; the purest functional languages take this to an extreme: there are no variable assignments and absolutely no side effects from functions; functions do nothing more than take a set of inputs and produce a single output. The lack of variables, assignments, and side effects means that functional programs do away with many of the traditional tools (looping, temporary variables, etc.) of the average software developer; at the same time, that lack means that functional programs are extremely easy to reason about and, in their domain, can be extremely powerful. In an academic independent study from January to March 2000, I wrote several of the exercises for the Software Engineering Institute's Personal Software Process in both C++ and Eiffel; in one case, a correlation calculation that took hundreds of lines of code in either C++ or Eiffel was accomplished in a fraction of the time, in only 33 lines of Haskell, a pure functional language, with only one error in testing. This is a dramatic result and illustrates the power of functional languages--but I then spent several hours over the next four days trying to figure out how to parse the text file containing the numbers I wanted to perform the calculation on! The paradigm is very different, and while functional languages can be extremely powerful in their domains, it is difficult to wrap one's brain around some standard tasks using such a nonstandard paradigm.
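
To give a feel for the style, the sketch below computes a correlation coefficient using only pure functions: no loop statements, no name is ever reassigned, and nothing outside each function is modified. It is written in Python for consistency with the rest of this discussion and is not the 33-line Haskell program mentioned above; the choice of the Pearson formula is an assumption, since the exact calculation required by the exercise is not given here.

```python
from math import sqrt


def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Each name is bound exactly once and the inputs are never mutated.
    """
    mx, my = mean(xs), mean(ys)
    dxs = [x - mx for x in xs]  # deviations of xs from their mean
    dys = [y - my for y in ys]  # deviations of ys from their mean
    covariance = sum(dx * dy for dx, dy in zip(dxs, dys))
    spread = sqrt(sum(dx * dx for dx in dxs) * sum(dy * dy for dy in dys))
    return covariance / spread


print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```

Because every function depends only on its arguments, each can be tested and reasoned about in isolation. Reading the numbers from a file, by contrast, is inherently a side effect, which is where a pure functional language demands a different way of thinking.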