Friday, February 23, 2007

BRAIN SURGERY FOR THE MASSES -- a position paper on HPC Software (appeared in HPCWire)

The evolution of 4th generation surgery tools will help spread brain surgery to the masses, altogether dispensing with neurosurgeons in small hospitals that cannot afford their high pay.

Do you feel that I am pulling your leg? I am. But so is the HPCwire editor when he claims that 4th generation programming languages will make HPC programming available to the masses. Programming – at the very least, programming of a large, complex code – is a specialized task that requires a specialist – a software engineer – to the same extent that brain surgery requires a specialist. You might claim that many non-specialists do write code. It is also true that most of us take care of our routine health problems; but only a fool would try brain surgery because he was successful in removing a corn from his foot. To believe that better languages will soon make software engineers redundant is to believe that Artificial General Intelligence will progress much faster than most of us expect, or to belittle the specialized skills of software engineers.

The editor expects that the masses will program clusters on their own, while software engineers will continue to be needed for programming leading edge supercomputers. However, the difference is not between bleeding edge supercomputers and clusters; it is between complex programming tasks and simple programming tasks. Writing a simple program with little concern for performance and not too much worry about correctness is tantamount to removing a corn; writing a simple program – e.g., an FFT routine – that achieves close to optimal performance is tantamount to brain surgery, even if the target system is the processor that operates my laptop; writing a moderately complex program that is bug free with high confidence and can be used to control a critical system is also tantamount to brain surgery; finally, writing a large, complex program that more or less satisfies specifications seems to be harder than brain surgery: large software projects seem to have a higher mortality rate than brain surgery patients.

Programming is harder when the program is more complex, and when constraints of high efficiency or high confidence are stricter. Performance constraints can appear on large systems, and can appear on small systems: It can be extremely hard to shoehorn a compute intensive application into the power and memory constraints of a cell phone. Performance can matter a lot for cluster programs that are frequently used: the programmers of the MPI or ScaLAPACK libraries have good reasons to carefully tune the performance of their libraries on clusters: these libraries consume many cycles on many clusters, and improving their performance will improve the performance of many applications. While the difficulty of performance tuning relates to the complexity of the target architecture, one can well argue that a cluster is a more complex architecture than a leading edge supercomputer, because of the more complex software environment and the less controllable behavior of commercial LAN switches.

There is no obvious reason for cluster programs to be smaller or for confidence requirements on clusters to be less stringent than for supercomputers. However, it is true that supercomputer computations are more likely to be resource constrained than cluster computations. Indeed, a program will be run on a capability platform only if it cannot execute in a reasonable time on a smaller cluster. Such programs may tax even the resources of a leading edge supercomputer. On the other hand, performance may be less critical for cluster programs that do not consume significant hardware resources. I am not sure this represents a large fraction of cluster cycles.

The editor draws a dichotomy: MPI with C or Fortran for the high priests of supercomputing; MATLAB or SQL for the masses. This dichotomy is false: the highest-performing commercial transaction systems use SQL; but SQL by itself does not make a commercial application. Such an application will use a variety of services and frameworks, and will be written in a variety of programming languages; SQL itself is implemented, by experts, in C or another such language. The same holds true for scientific and engineering computations, be it on clusters or on supercomputers: Users will use available libraries or frameworks, whenever possible; the libraries will be implemented in Fortran, C or a similar language, and users will use these languages for their “glue code”. Libraries have been used for many decades to extend the expressiveness of low level programming languages such as Fortran or C. Computational frameworks are increasingly replacing low level programming languages as the main mechanism for expressing computations in many domains. Such frameworks can be specialized by plugging in specific methods, often written in lower level languages, and can be extended in a variety of ways. I am not sure what the difference between a well-designed computational framework (such as Cactus) and a “fourth generation language” or “domain specific language” is. Such computational frameworks are domain specific; they emphasize higher levels of abstraction; and the execution model is often interactive. Furthermore, computational frameworks are increasingly used for codes that run on the largest supercomputers.

Programming on supercomputers, like programming on any other platform, is likely to evolve toward higher level, more powerful programming languages or frameworks. The use of languages or frameworks that are more extensible, have more powerful type systems and better type inference, support generic programming well and are safer, increases productivity. Such high-level languages are likely to have specialized idioms for specific application domains – to a large extent this is already true for languages such as Java or C#, since much programming is done using powerful domain specific classes: programming a GUI in Java using Swing is very different from programming a business application using Enterprise JavaBeans – and programmers specialize in one or the other. To the same extent, languages such as C# or Java, or next generation languages, can be extended with idioms for scientific computing – this has been done for Java (www.javagrande.org).

The evolution of programming language and compiler technology provides more powerful mechanisms for language extension. The extension mechanisms encompass not only predefined and pre-coded methods; code generation can occur at run-time or, indeed, whenever new relevant information on characteristics of the computation becomes available. The user can control at various levels the implementation mechanisms for the high-level objects and their methods and even the implementation mechanisms for control structures. The Telescopic Languages project of the late Ken Kennedy or the Fortress language project at Sun are showing the strength of such techniques.

A common thread in these projects is that the high level language should match the application domain well – the way application specialists think; the mapping from the logic of the application to the logic of the machine may involve multiple layers of translation; and these translations cannot be fully automated – a specialist programmer is needed to guide these mappings, by implementing run-time code and libraries, by developing preprocessors and application generators or by adding implementation annotations to the core code. The distinction between the application programmer and the language implementer becomes blurred, since application programmers can modify the language and can modify its implementation. However, such a hierarchical design supports specialization well, where some programmers are more focused on application logic and others are more focused on application performance.

The parallel MATLAB solutions of Mathworks or ISC are examples of this trend. MATLAB was not developed for HPC, and would not be a viable product if uniquely targeting HPC; the goal was to provide a notation that is closer to the way scientific programmers think. In both cases, the mapping of a MATLAB code to a parallel machine is not fully automated, and the programmer has to manually parallelize the code. Parallelism is expressed using well-known (low level) paradigms: message-passing (MPI), and distributed arrays and forall loops (HPF). The parallel notation becomes part of the source code, but it should be possible (and desirable) to keep it separate, as an implementation annotation, and to make sure that it does not change the program semantics.

This general approach to high-level language design, while important for HPC, is not unique to HPC. Indeed, one can well argue that designing high-level languages specifically for High Performance Computing is a contradiction in terms: High-level languages should match the application domain, not the architecture of the compute platform. Developing high-level languages that satisfy the needs of HPC but are less convenient to use on more modest platforms is a waste of money.

Unique to HPC is the need for low level implementation languages that can be used to write libraries and implement the high-level objects and methods so as to run efficiently on clusters and supercomputers. This implementation language would be, today, MPI with Fortran or C. What should it be tomorrow (i.e., five years from now)? Could the Partitioned Global Address Space (PGAS) languages, such as UPC, CAF and Titanium, fulfill this role? (In a nutshell, these languages provide the same SPMD model as MPI – with multiple processes each executing on its own data; however, they also provide partitioned global arrays that can be accessed by all processes; communication occurs through access to the non-local part of a global array; simple barrier constructs are available for synchronization.)
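The PGAS model summarized in the parenthesis above can be illustrated with a small sketch. This is not UPC, CAF or Titanium code: it is a plain Python simulation in which threads stand in for SPMD processes, and the names PartitionedGlobalArray, spmd_worker and run are invented for the illustration. Every process runs the same code, writes only the block it owns, reads any element by its global index, and synchronizes with a barrier; reads that fall outside the local block are counted, since on a real machine they would be communication.

```python
import threading


class PartitionedGlobalArray:
    """A global array split into contiguous blocks, one block per process."""

    def __init__(self, length, nprocs):
        self.length = length
        self.nprocs = nprocs
        self.block = max(1, -(-length // nprocs))  # ceiling division
        self.parts = [
            [0] * min(self.block, max(0, length - p * self.block))
            for p in range(nprocs)
        ]
        # remote_reads[p] is only ever updated by process p, so no lock is needed
        self.remote_reads = [0] * nprocs

    def owner(self, i):
        return i // self.block

    def get(self, me, i):
        """Read element i by global index; count it if it is not local to 'me'."""
        p = self.owner(i)
        if p != me:
            self.remote_reads[me] += 1  # would be a one-sided get on a real system
        return self.parts[p][i - p * self.block]

    def put_local(self, me, i, value):
        """Write element i; this sketch only allows writes to the local block."""
        p = self.owner(i)
        assert p == me, "write to a non-local element"
        self.parts[p][i - p * self.block] = value


def spmd_worker(me, a, b, barrier):
    """The single program executed by every process (SPMD)."""
    lo = me * a.block
    hi = min(a.length, lo + a.block)
    for i in range(lo, hi):          # phase 1: initialize the local block of a
        a.put_local(me, i, i * i)
    barrier.wait()                   # all of a must be written before it is read
    for i in range(lo, hi):          # phase 2: b[i] = a[i] + a[right neighbor]
        j = (i + 1) % a.length
        b.put_local(me, i, a.get(me, i) + a.get(me, j))
    barrier.wait()


def run(length=8, nprocs=4):
    a = PartitionedGlobalArray(length, nprocs)
    b = PartitionedGlobalArray(length, nprocs)
    barrier = threading.Barrier(nprocs)
    threads = [
        threading.Thread(target=spmd_worker, args=(p, a, b, barrier))
        for p in range(nprocs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [x for part in b.parts for x in part], a.remote_reads
```

With the default arguments each process owns two elements and performs exactly one non-local read, at the boundary of its block. The point of the sketch is that the communication is implied by the index expression rather than written as a matching send and receive.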

An “implementation language” (IL) for HPC should satisfy the following requirements:

1. Performance. It should be possible to achieve close to optimal performance for programs written in IL. Recent research has shown that programs written in CAF or UPC can sometimes beat the performance of MPI programs; this is very encouraging given that the compiler technology for these languages is still immature, while implementations of MPI are very mature. There are two reasons to believe that PGAS languages could lead to better performance, as compared to MPI: (1) The support by supercomputers and by the interconnect technology used on clusters (Myrinet, Quadrics, Infiniband) of direct remote memory access entails that better communication performance can be achieved using one-sided puts and gets, rather than two-sided message-passing; the design of MPI is well suited to two-sided communication, but perhaps less suited to one-sided communication. (2) A compiler can optimize communication and avoid the overhead of message-passing libraries, further reducing communication overhead. These languages do not yet offer good support for collective communications, and for parallel I/O – but these problems should be fixed within a few years.

2. Transparency. It should be possible for a programmer to predict, with reasonable accuracy, the performance of a code; the transformation done by the compiler or the run-time should not only preserve the semantics of the code, ensuring that the computation is correct, but should also “preserve” performance; i.e., it should support a simple formula for translating program execution metrics into an approximate execution time. ILs are used by programmers to deal with performance issues; but if the programmer has no way of reasoning about performance trade-offs, then performance can be achieved only through an exhaustive search through all possible program versions. PGAS languages are reasonably transparent.

3. User control. The IL should provide the programmer means of controlling how critical resources are used. In particular, for HPC, it is important to exercise some control on scheduling (to achieve load balancing and prevent idle time) and on communication. Load balancing and locality (communication reduction) are often algorithmic problems; without some control on those, one cannot achieve close to optimal performance. Scheduling and communication are under user control with PGAS languages.

4. Modularity and composability. A large application will be composed of independently developed modules. The internal details of one module should not impact other modules, and one should be able to compose modules with limited knowledge of their interface. Sequential programs support only “sequential composition”: a program invokes a module, and control is transferred to that module; upon completion control is transferred back. Programmers have been warned to avoid side effects, leading to a simple interface specification. Parallel programming also requires support for “parallel composition”, or “fork-merge”: several modules execute concurrently, and then combine back into a unified parallel computation. This is essential, e.g., for multiphysics simulations, where multiple physics modules work in parallel and periodically exchange information. MPI supports fork-merge via its Communicators: a group of processes can be split into independent subgroups, and then merged back. The code executed by each subgroup is totally independent of the code executed by other subgroups. UPC and CAF have not yet implemented similar concepts and hence lack good support for modularity. (The CAF community seems to be working on this problem, as part of the Fortran 2008 standard effort.)

5. Backward compatibility. Code written in IL should be able to invoke libraries written using MPI, or other common message passing interfaces. While this has not been a focus of CAF or UPC, there are no inherent obstacles to compatibility.
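The fork-merge composition described under requirement 4 can be sketched in a few lines. The sketch below is plain Python, not MPI: comm_split only mimics the grouping rule of MPI_Comm_split (processes that pass the same color land in the same subgroup, ordered by key and then by old rank), and fluid_step, solid_step and coupled_step are hypothetical stand-ins for two physics modules and their coupling step. The modules are run one after the other here; on a real machine the subgroups would execute concurrently.

```python
def comm_split(colors, keys=None):
    """Group ranks the way MPI_Comm_split does.

    colors[r] is the color chosen by old rank r; keys[r] orders the ranks
    inside each new group, with ties broken by old rank.
    Returns {color: [old ranks, listed in new-rank order]}.
    """
    if keys is None:
        keys = [0] * len(colors)
    groups = {}
    for rank, color in enumerate(colors):
        groups.setdefault(color, []).append(rank)
    for members in groups.values():
        members.sort(key=lambda r: (keys[r], r))
    return groups


def fluid_step(local_values):
    """Hypothetical module 1: knows nothing about the other module."""
    return [v + 1.0 for v in local_values]


def solid_step(local_values):
    """Hypothetical module 2: equally independent."""
    return [v * 2.0 for v in local_values]


def coupled_step(state, colors):
    """One fork-merge step over a parent group whose rank r holds state[r]."""
    modules = {"fluid": fluid_step, "solid": solid_step}
    groups = comm_split(colors)                     # fork into subgroups
    new_state = list(state)
    for color, members in groups.items():
        updated = modules[color]([state[r] for r in members])
        for r, v in zip(members, updated):
            new_state[r] = v
    total = sum(new_state)                          # merge: exchange on the parent group
    return new_state, total
```

For example, coupled_step([1.0, 2.0, 3.0, 4.0, 5.0], ["fluid", "fluid", "solid", "solid", "solid"]) advances ranks 0 and 1 with the first module and ranks 2 to 4 with the second, then combines the results over all five ranks. Neither module needs to know how many ranks the other one uses.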

There is another set of properties that I believe are important and can be supported efficiently; their efficient support, however, is still a matter for research:

1. Determinism. Deterministic, repeatable execution should be the default – nondeterminism should occur only if the programmer explicitly uses nondeterministic constructs. Races and synchronization bugs are hard to detect, and are one of the major difficulties of parallel programming. The use of global address space worsens the problem as it becomes easier to write buggy code and harder to detect the bugs. Transactions and transactional memory are not a solution to this problem: Transactional memory provides efficient mechanisms to ensure the atomicity of transactions, but does not enforce an order between two transactions that access the same data. Transactions are a natural idiom to express the behavior of systems where concurrency is inherent in the problem specification: an online transaction system has to handle concurrent purchasing requests and has to ensure that only one passenger gets the last seat in a plane and that the seat is assigned to the same customer whose credit card was charged – hence atomicity. Transactions are not a natural idiom for most of scientific computing: It is seldom the case that we specify a computation with two conflicting noncommutative updates, where we do not care about their execution order, as long as each executes atomically. The natural idiom for scientific computing is (partial) order, not mutual exclusion. Therefore, races and nondeterminism result most often from programming bugs. The current PGAS languages do not prevent and do not detect races. I believe that race prevention is as essential to parallel programming as memory safety is to sequential programming. Furthermore, it seems plausible that races can be prevented using suitable programming languages and suitable compiler technology, without encumbering the programmer or significantly slowing down execution. We should work hard to ensure this happens, before “race exploits” become daily occurrences.

2. Global name space. A very common idiom in scientific computing is that of a global data structure (e.g., a mesh) that is used to represent the discretization of a continuous field. A simulation step may consist of applying an updating function to this field, or computing the interactions between the field and a set of particles. On a parallel machine one needs to break the structure into patches that are allocated to individual processes; but the patches are not natural objects in the problem definition – they appear only because of the mapping to a parallel system. Similarly, in a particle computation, it may be necessary to partition the particles into chunks, in order to reduce communication and synchronization. While each particle is a natural object in the problem specification, the chunks are not. In both cases, it is more convenient to specify the logic of the computation using global data structures and a global name space; it is desirable to be able to refine such a program and partition data structures without having to change the names of the variables: The name of a variable should relate to its logical role, not to its physical location. (I, therefore, speak of a global name space, not a global address space.) In order to control communication and parallelism, the user should be able to control where data is located – but this should not require changing the names of variables. PGAS languages do provide a global name space, but support only simple, static partitions of arrays. In cases where more complex or more dynamic partitions of global data structures are needed, one needs to explicitly copy and permute data, and change the names of variables.

3. Dynamic data partitioning and dynamic control partitioning. Parallelism is expressed using two main idioms: data parallelism and control parallelism. In data parallelism, data is partitioned; execution gets partitioned by executing statements on the site where their main operand resides. This is done, implicitly, with languages such as HPF and the “owner computes” rule; and explicitly, with forall statements and “on” clauses. In control parallelism, control is partitioned and data is moved implicitly to where it is accessed. Both forms of parallelism are useful. (As an aside: the two are identical in single-assignment languages, such as NESL.) The use of adaptive algorithms, such as Adaptive Mesh Refinement, or multiscale algorithms, requires that partitions be dynamic, as data structures change and the amounts of storage and work associated with a patch change. Current PGAS languages do not support dynamic repartitioning of control and data any better than MPI: such repartitioning will require explicit copying of data and the application then has to maintain the correspondence between the logical name of a variable and its physical location. Dynamic control partitioning is easy for languages such as OpenMP that use a global name space and parallel loops for parallel control. But such languages do not provide good control over locality.
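The separation argued for in properties 2 and 3, where names follow logical role and placement is a separate, changeable decision, can be sketched as follows. This is a hypothetical Python illustration, not the behavior of any existing PGAS language: GlobalArray, repartition and neighbor_sum are names invented here, and per-process memory is simulated with one dictionary per process. The application logic refers to elements only by global index, so it is unchanged when the user moves data from a block partition to a cyclic one.

```python
class GlobalArray:
    """An array addressed by global index; data placement is a separate map."""

    def __init__(self, values, nprocs):
        self.n = len(values)
        self.nprocs = nprocs
        block = max(1, -(-self.n // nprocs))          # ceiling division
        # owner_of[i] is the process that holds element i (block partition at first)
        self.owner_of = [i // block for i in range(self.n)]
        # store[p] simulates the local memory of process p
        self.store = [dict() for _ in range(nprocs)]
        for i, v in enumerate(values):
            self.store[self.owner_of[i]][i] = v
        self.moved = 0                                # elements migrated so far

    def __getitem__(self, i):
        return self.store[self.owner_of[i]][i]

    def __setitem__(self, i, v):
        self.store[self.owner_of[i]][i] = v

    def repartition(self, new_owner_of):
        """Move data to a new placement; global names (indices) do not change."""
        for i, new in enumerate(new_owner_of):
            old = self.owner_of[i]
            if new != old:
                self.store[new][i] = self.store[old].pop(i)
                self.moved += 1
        self.owner_of = list(new_owner_of)

    def local_indices(self, p):
        return sorted(self.store[p])


def neighbor_sum(field):
    """Application logic, written purely in terms of global names."""
    n = field.n
    old = [field[i] for i in range(n)]
    for i in range(n):
        field[i] = old[(i - 1) % n] + old[i] + old[(i + 1) % n]
```

Calling repartition([0, 1, 2, 0, 1, 2]) on a six-element array over three processes changes which process holds each element, yet neighbor_sum runs unmodified and produces the same values. The sketch deliberately leaves out the hard part named in the text, namely doing this efficiently.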

Efficient support for dynamic data and control partitioning is still a research issue: languages with limited, static partitions (such as current PGAS languages) can be implemented efficiently, but force the user to do the work, for dynamic codes; languages that support powerful, dynamic data and control repartitioning can too easily lead to inefficient codes. One limited, but well-tested and fairly powerful step toward supporting dynamic data and control partitioning is to use process virtualization. The model provided by MPI or by the PGAS languages is that of a fixed number of processes, each with its own address space, and (usually) one thread of control. Implementations associate one process with each processor (or core) and applications are written assuming a dedicated fixed set of identical processors. A suitable run-time can be used to virtualize the processes of MPI, UPC or CAF (the AMPI system is already doing this for MPI); the run-time scheduler can map multiple virtual processes (that are actually implemented as user-level threads) onto each physical processor, and can dynamically migrate the processes and change the mapping so as to balance load or reduce communication.
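Process virtualization as described above amounts to a mapping from many virtual processes to fewer physical processors that the run-time is free to change. The sketch below is a minimal Python illustration under stated assumptions: the per-process work estimates are given as input, migration is free, and communication is ignored. The greedy largest-first heuristic in rebalance is chosen only for brevity; it is not a description of the scheduler used by AMPI or any other system, and all function names are hypothetical.

```python
def initial_mapping(num_virtual, num_physical):
    """Round-robin placement of virtual processes on physical processors."""
    return [v % num_physical for v in range(num_virtual)]


def processor_loads(mapping, work, num_physical):
    """Total work assigned to each physical processor under a mapping."""
    loads = [0] * num_physical
    for v, p in enumerate(mapping):
        loads[p] += work[v]
    return loads


def rebalance(work, num_physical):
    """Greedy remapping: place the heaviest virtual process first,
    always on the currently least-loaded physical processor."""
    mapping = [None] * len(work)
    loads = [0] * num_physical
    for v in sorted(range(len(work)), key=lambda v: (-work[v], v)):
        p = min(range(num_physical), key=lambda q: (loads[q], q))
        mapping[v] = p
        loads[p] += work[v]
    return mapping
```

As a usage example, take eight virtual processes on four processors with work [8, 1, 1, 1, 8, 1, 1, 1], as might happen after a mesh refinement makes two patches much heavier. The round-robin mapping puts both heavy processes on processor 0, for a maximum load of 16; after rebalance the maximum load is 8. The application code sees eight processes throughout.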

Process virtualization greatly enhances the modularity of complex parallel codes. Consider, for example, a multiphysics code that couples two physics modules. Normally, each module runs on a dedicated set of processors; the modules independently execute a time step of their simulation, and then exchange data. Suppose that the first module executes a dynamic mesh refinement. The internal logic of this module presumably includes code for repartitioning the mesh and rebalancing the computation, when the mesh is refined. But, after the refinement, this module will take longer to execute a time step, so that the global computation becomes unbalanced: it becomes necessary to steal resources from the second module in order to rebalance the computation. This other module may not have, on its own, any need for dynamic load balancing; and very few parallel programs are written so as to accommodate a run-time change in the number of processors they use. With virtual processes, each module may be written for a fixed number of (virtual) processes, while still allowing resources to be moved from one module to another in a multiphysics computation. Similarly, consider a multiscale computation, where it may be necessary to spawn a new parallel module that refines the computation in one region, using a finer scale, more compute intensive method. With virtual processes, resources can be reallocated within a fixed partition to the spawned module.

In summary, PGAS languages may, with some needed enhancements, be quite useful as HPC implementation languages. Additional work is needed for such languages to support modern scientific codes well – work that, unfortunately, does not seem to be part of the DARPA HPCS agenda.

My discussion, so far, has focused on programming languages. However, it is important to remember that programming languages are only one of many contributors to programmer’s productivity – not the most important one, and not very significant, in isolation. Research on the productivity of object oriented languages has shown that the use of OO languages does not contribute much to productivity, per se. Rather, they contribute indirectly in that they encourage and facilitate code reuse and other useful programming techniques. It would be useful to submit newly proposed programming languages for HPC to that same test: In what way do they support more efficient software development processes?

By far, the most important contributor to software productivity is the quality and experience of the software developer. This, by itself, already suggests that “parallel programming for the masses” is misguided. One should not attempt to develop languages and tools so that Joe Shmo can program clusters or supercomputers; rather, one should educate high quality software engineers who understand programming for HPC, and provide enough rewards and stability to ensure that they stay in their profession and amass experience.

Software productivity is also heavily influenced by the quality of the process used to develop the software and by the quality of the tools and environment used by the software developers. It is important to understand what best practices in the development of HPC software are, and to ensure that these practices are broadly applied. While much of the knowledge from general software development will apply, scientific computing may need different testing and validation processes, and HPC computing may need a different emphasis and a different approach to performance tuning. One can hope that the DARPA HPCS program will result in advances in this area.

HPC software developers have traditionally used programming environments and tools that lagged behind those used in commercial software development: the HPC market has been too small to justify commercial investments in high quality HPC Integrated Development Environments (IDEs), and the government has not had the vision to support such development. Eclipse, the open source IDE framework that is now broadly used for Java development, offers a promise for change. Eclipse based IDEs for Java are as good as or better than any; and the open architecture of Eclipse supports the construction of IDEs for other languages and programming models. It has become possible to have a community effort that will create a modern, high-quality IDE for HPC; this work is already happening in national labs and universities.

One major contributor to the productivity of software developers is the availability of significant compute resources, so as to shorten the edit-compile-test cycle. The limited availability of interactive HPC platforms may be one of the most significant impediments to HPC software development. One should carefully weigh the balance between the allocation of resources to production and the allocation to development; and one should ensure that HPC software development stops being stuck in the era of batch processing.

In summary, there is no magic wand that will make software development for clusters or supercomputers significantly easier than it is now – to the same extent that no magic wand will make brain surgery significantly easier. The technology used in brain surgery continues to improve, enabling brain surgeons to perform more complicated surgeries, and improving the prognosis of brain surgeries. To the same extent, when we think of programming languages or tools that will enhance the productivity of HPC programmers, it is not very useful to focus on “HPC programming for dummies”. Rather, one should focus on better languages and tools for the HPC experts that will enable these experts to develop more complex or higher-performing software for HPC platforms.

Computer Science Education for our Times -- a position paper


Introduction

The world of IT is changing fast around us, raising questions about the up-to-dateness of the education provided by CS departments in US Universities and abroad. Many CS departments have also seen a sharp decline in their enrollments after the dot.com bust, even though IT employment has not declined and is predicted to grow fast. This has raised further doubts about current programs. ACM is defining five different computing disciplines: Computer Science, Computer Engineering, Information Systems, Information Technology and Software Engineering. In actuality, there is much overlap between these disciplines, and IS and IT programs are more often than not dumbed down versions of CS. Some schools, such as the Georgia Tech College of Computing, seem to be altogether dispensing with the idea that there is a Computer Science discipline, allowing students to complete a bachelor's degree by taking two out of eight possible “threads”, and having “foundations” being merely one of the eight possible threads. These are signs of a discipline in crisis.

In this document, we shall briefly describe the state of the IT professions and our view of the CS discipline, and argue for a new, coherent focus for CS education.

What is Computer Science?

Computer Science is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. This broad definition includes several different flavors of research and education.

Computer Science as an engineering discipline:

Bill Wulf defined engineering as "design under constraint". CS is an engineering discipline, by this definition – but a discipline with a very distinct flavor: the artifacts designed by software engineers are mathematical artifacts (algorithms, protocols, or programs). CS is not rooted in the physical sciences: the main empirical constraints on the design of computing systems are economic, legal, social and cognitive. This raises questions about the relevancy of some of the engineering education and accreditation requirements.

Computer Science as an applied science discipline:

Computing and information processing systems augment the cognitive abilities of human beings, allowing them to process information faster, compute faster or communicate faster. One should think of computers as prostheses to the mind, in the same sense that machines are prostheses to our body: artificial devices that replace or augment our brain. Such "prostheses" need to be specialized to their application domain; we have long since ceased to work on "universal artificial intelligence", in the same manner that mechanical engineers do not work on "universal machines". We design systems that are specialized to particular application domains. Applications are growing fast and are motivating an increasing fraction of CS research.

Computer Science as an ontology:

Computer Science provides new abstractions for thinking about systems: we view them as information processing systems. Philosophers use the software-hardware dichotomy as a model for the mind-brain relation; biologists think of biological processes as information processing. Jeannette Wing describes this with the term "Computational Thinking"; the following is a citation from a paper she wrote:

". .. What do I mean by computational thinking? It includes a range of "mental tools" that reflect the breadth of our field. When faced with a problem to solve, we might first ask "How difficult would it be to solve?" and second, "What's the best way to solve it?" Our field has solid theoretical underpinnings to answer these and other related questions precisely. Computational thinking is reformulating a seemingly difficult problem into one we know how to solve, perhaps by reduction, embedding, transformation, or simulation. Computational thinking is type checking, as the generalization of dimensional analysis. Computational thinking is choosing an appropriate representation for a problem or modeling the relevant aspects of a problem to make it tractable. Computational thinking is using abstraction and decomposition when tackling a large complex task or designing a large complex system. It is having the confidence that we can safely use, modify, and influence a large complex system without understanding every detail of it. It is modularizing something in anticipation of multiple users or pre-fetching and caching in anticipation of future use. It is judging a system's design for its simplicity and elegance. It is thinking recursively … In short, computational thinking is taking an approach to solving problems, designing systems, and understanding human behavior that draws on the concepts fundamental to computer science."

These three views of CS suggest three fairly different cultures:

1. An engineering focus: in research, this means being concerned with impact on industrial practice; in education this means being concerned with professional skills.

2. An applied science focus: This means being concerned with multidisciplinary research and education, with CS being one of the ingredients, but not necessarily the motivating goal.

3. A fundamental science focus: This means a focus on theory, and an education that is less concerned with professional proficiency and more with a broadening of the mind.

These cultures coexist within CS, with some tensions.

The Changing Landscape of IT Professions

A recent Gartner report (Diane Morello, The IT Professional Outlook: Where Will We Go From Here?) makes the following predictions:

1) By 2010, six out of ten people affiliated with the IT organization will assume business-facing roles.

2) Through 2010, 30 percent of top technology performers will migrate to IT vendors and IT service providers.

3) By 2010, IT organizations in midsize and large companies will be at least 30% smaller than they were in 2005.

4) By 2010, ten to fifteen percent of IT professionals will drop out of the IT occupation.

5) By 2011, seventy percent of leading-edge companies will seek and develop “versatilists” while de-emphasizing specialists.

These predictions summarize well the changes happening in the IT professions: jobs involving only coding, in the narrow sense of translating a detailed specification into a programming language, are a shrinking fraction of IT employment; these jobs are outsourced to low-wage countries, or are made redundant by the use of higher-level tools and components. The jobs that stay in the country are jobs that are people-oriented – in particular, jobs that require “translating” the informally specified needs of people or organizations into formal specifications; jobs that require versatility; and jobs that require a higher level of abstraction and an understanding of complex information handling processes, including the human in the loop – at the individual and social level.

Where should Undergraduate Computer Science Education Go?

I use the word “education”, not “training”. My focus, in this document, is on the education mission of research universities: their ability to shape minds and to provide foundations that enable alumni to continuously refresh their specific skills throughout a lifelong career that will involve continuous change in technology and in employment. It is not on the specific skills that a student learns. Most companies expect the former from us, and expect to provide significant company-specific training to their new employees.

CS education should not be dumbed down and should not become a smorgasbord of university courses; a strong, coherent, rigorous foundation is essential to the education of the engineer or the applied scientist. For a CS professional this foundation is about computational thinking: information, representation, computation, the ability to abstract the information processing component of systems – these, and the underlying mathematics of computation, are the foundation. This foundation is different from the usual foundations of engineering education, which are found in calculus, the mathematics of continuous systems, and the physical sciences. But this different foundation should be taught no less rigorously than the foundations of physical sciences are taught.

Becoming a proficient programmer is part of the core CS education. Programming is the equivalent of a physics lab for a science student: the practical activity that makes the theoretical framework concrete and helps the student assimilate the concepts and methods. It is an essential part of the education, even if the concrete skills learned are not reused. However, programming can be taught in a multitude of ways, at different levels, using different languages and packages – adapted to the interests of the student.

A large number of CS graduates will continue to work on core IT technologies: they will build the future software products of companies such as Microsoft, IBM, Oracle, Google, etc. These students need to receive, in addition to the foundation courses, the core engineering education that CS students normally receive now: the ability to program in low-level languages such as Java or C++, and a good understanding of computer architecture, programming languages, operating systems, networks, databases, etc.

A much larger number will work more closely with users of the technology in various application domains. Such students will benefit from an education that provides a good foundation for their application domain of choice: accounting and management, for students focused on business applications; art and media, for students focused on media and entertainment applications; physical sciences, for students focused on scientific and engineering applications; cognitive sciences, for students focused on human-computer interactions; and so on.

Finally, every educated person will need a basic understanding of computing and information: “digitacy” becomes as important as “numeracy”. By “digitacy” I mean a basic understanding of the concepts of computational thinking and the ability to use these concepts. This is different from the basic ability to use computers for email, web browsing or word processing, to the same extent that numeracy is different from the ability to use a calculator.

A change in computer science education requires a concerted effort by the leading research universities; it also requires a change in accreditation standards. In particular, the general requirements of physics-based engineering disciplines should not be applied to computer science, computer engineering or software engineering; and programs with a strong application focus need not only a coherent curriculum in Computer Science, but also a coherent, rigorous curriculum in the application domain.

Changes in Computer Science education should focus not only on the “what” – the curriculum being taught – but also on the “how” – how knowledge is imparted. CS students (as well as engineering students, in general) are often accused of being technically bright, but inarticulate and unable to collaborate well. It is important that our education provide more opportunities for collaborative, project-oriented teaching, with projects involving people from various disciplines. Such projects force CS students to understand the language and the concepts of other disciplines, and to make their own language and concepts accessible to other disciplines; they create a healthy appreciation of the unique contributions that each discipline can bring to a complex project. Likewise, students will be better prepared for a global economy if they have opportunities to collaborate with students or developers from other countries on shared projects.

IT will profoundly reshape the modes of academic education in the coming decade. The generation that has learned to multitask with instant messaging and social networking environments, and that has grown accustomed to having virtually unbounded amounts of information instantly available, will be increasingly intolerant of our traditional modes of teaching. It behooves CS departments to be at the forefront of this change, if they wish to be perceived as relevant by their students.

Finally, it is important that the heavy requirements of a top technical education do not prevent students from participating in service and outreach activities, or from pursuing entrepreneurial activities. The NAE's Engineer of 2020 report said that a five-year education may be needed to educate the engineer of the future. The same may apply to the IT professional of the future; the need for a more advanced education will become more pronounced as outsourcing makes entry-level jobs scarcer. Five-year bachelor/master programs may become more attractive, especially for students who want to have strong foundations in computing and in an additional application discipline.

Consciousness

Consciousness seems to be a "big problem" in Neurology and Philosophy (see, e.g., the writings of John Searle on this topic). Here is my small solution to this problem.

The issue is how we shift from "red" to "I see red"; from perceiving our environment to being aware that we perceive it. I would argue that the first step is to be able to think about "he sees red"; i.e., having a model of the cognitive processes happening in other sentient beings, and being able to use this model in order to understand the behavior of other human beings. "He sees a tiger and, therefore, he is taking cover"; "he does not see the tiger that I am seeing, because the grass is hiding the tiger from him; therefore, he has not taken cover". The ability to make judgments of this kind is essential for communication using a developed language.

Once we have a model that we can use to understand the behavior of others (what they perceive, what their goals are, what actions they can be expected to take), it is a relatively small step to apply the same model to ourselves. We became aware of the cognitive processes in other humans; now we become aware of cognitive processes in ourselves: we can reason about our own cognitive processes. Contrary to Searle's claims, there is no reason to believe that this awareness is continuous and unavoidable. After all, we are unaware of the times when we are not aware of our cognitive processes; and introspection seems to indicate that we have long periods when we act mechanically, without any self-awareness.

I was struck a few years ago by an article about communication among chimpanzees. They have a vocabulary and will use different "words" for warning about dangers coming from a snake, or a bird of prey, or an animal of prey. Indeed, the reaction to these different warnings is different: climbing a tree is a good defense against an animal of prey, but a bad defense against a bird of prey. However, the chimpanzee will continue to shout its warning long after all the animals have taken cover and acted on the warning. They do not seem to have an elaborate enough model of the cognitive processes in other chimpanzees -- and especially they do not seem to have a cognitive process that relates their actions to the reactions of the other chimpanzees; such a model is essential for intelligent socialization. The same problem seems to affect autistic children.

The next step, after having a model of the behavior of the "human automaton" as it reacts to its surroundings, is developing a model of the interactions of several such "automata": "he said and, therefore, I said, and, therefore, he said..." This is what computers are still very bad at, and this is what clearly distinguishes communication between humans from communication between chimpanzees, or communication with computers. "I said x and, therefore, I expected him to do y, or to smile, or to nod his head, and he did not, so he probably did not understand what I said, or what I said had an unexpected effect, which can be explained perhaps

The Information Revolution

Claim: The information revolution is a more profound transformation of humankind than the industrial revolution or the agricultural revolution.

Machines, the outgrowth of the industrial revolution, affected manual work: the machines had more brawn than humans. People could move more earth, more ore, more water than before; they could fly and move faster than ever. Machines provided a model for understanding nature and society: the functioning of the celestial bodies or of the human body or of society was explained as the functioning of a machine. In a benevolent view, machines were seen as extending the physical reach of humankind, allowing us to explore the earth and, ultimately, the skies. In a less benevolent view, the mechanization of society was seen as a threat -- think of "Metropolis" by Lang or "Modern Times" by Chaplin -- with humans becoming subservient to dehumanizing machines in large mechanical factories or machine-like cities. Finally, man progressively lost his central place in the universe, as the earth stopped being the center of the universe with the Copernican revolution, and the human race became just a race like any other with the Darwinian revolution. "Manufacturing" stopped meaning the same as "hand made", and "hand made" became a label of quality justifying higher prices.

Thinking machines, in a benevolent view, extend our cognitive abilities: they allow us to collect and process more information and to communicate better. They profoundly affect the information economy -- which is, increasingly, the larger chunk of our economy. But they are also profoundly threatening. In the not too distant future, machines will have "more brains" than humans; they already do, in many specific domains: they outcompute us, and outplay us in chess. They will do so in more and more cognitive domains. Machines will be able to ingest data, analyze it and take decisions faster than human beings, so that the "human in the loop" will be a weak spot that slows decision making and adds risk, as it is already becoming in tasks such as flying aircraft. Eventually, "brain thought" will acquire a connotation similar to "hand made": a song or a movie made the old way, by a sentient human, will have extra value as compared to the cheaper and better finished computer product. This will dethrone the human race from the last place where it is still at the center of the known universe; namely the area of intelligence, of cognitive abilities. The mechanization of thought will be seen as a threat in a world where humans become subservient to machine thinking -- think of the movie War Games, where a computer threatens to start World War III, believing it is playing a computer game.

Computer Scientists must be very aware of this risk, as a Luddite movement is quite likely. We need to ensure that the beast is under control: that we know how to ensure that software does what it is supposed to do, that robots obey the three laws of Asimov.