Monday, October 17, 2011

Clustering

Advantages of Clusters And Distributed Systems

Definition of a Cluster

A cluster is a parallel or distributed system consisting of independent computers that cooperate as a single system. In the case of the In-Q-My Application Server Cluster, this group consists of two types of participants – dispatchers and servers. Generally speaking, clusters make more productive use of computing resources than the same number of machines working standalone – the total result is greater than the sum of the separate parts.

Cost-efficiency is one aspect of choosing whether to set up a cluster or to use standalone machines. Another advantage of clusters is that, from a technological point of view, they are more reliable, secure, and productive, and deliver better overall service than standalone machines.

The goal of a cluster is to distribute the computing load over the several machines participating in it. These machines communicate with the dispatchers and with each other in order to provide better – faster and more reliable – service to the client. The idea of a cluster is that the machines in it act as one unit: clients perceive the cluster as a single entity, not as a group of several computers. For clients, the IP address of the dispatcher is the IP address of the whole cluster. Clients neither see the IP addresses of the servers in the cluster nor have to know them. Administrators, on the other hand, know the IP addresses of all dispatchers and servers in the cluster.

When connected in a cluster, servers are still physically separated as different machines. The participants in the cluster share disk access and resources that manage data, but they do not share memory or processors.

Among the advantages of the clustering solution provided by In-Q-My Application Server are scalability, the ability to use geographically dispersed machines, load balancing, fail-over, high availability, and easy administration.

Scalability of the system

The ability to “plug in” additional resources at any time when more computing power is necessary is one of the main advantages of clusters and distributed systems.

Scalability allows new participants to be added to the cluster when the load of client requests is heavy. This can be done dynamically, without stopping the cluster. This gives the flexibility to adapt quickly to changing workloads and to provide adequate service to more users when necessary. In-Q-My Application Server supports up to 64 participants in a cluster, which allows millions of client requests to be served simultaneously.

More importantly, because In-Q-My Application Server is J2EE™ compatible, the cluster can consist of machines with different hardware architectures running various operating systems – Windows™, UNIX™, Solaris, etc. – which makes In-Q-My Application Server truly cross-platform. This feature of J2EE™ compatible application servers allows building clusters from whatever hardware meets the requirements, whether more or less powerful, inexpensive or cutting-edge.

Scalability is not only a matter of hardware. The architecture of In-Q-My Application Server lets users add new services to the existing ones, which makes it easy to integrate with external systems.

Globally Dispersed

In-Q-My Application Server allows the cluster to be globally distributed. The machines connected in a cluster can belong to different WANs (Wide Area Networks) and can even be distributed globally, i.e. in different countries on different continents. Global distribution of the participants in a cluster is a feature not supported by most application servers, because their clustering technology uses simple IP multicast communication, which restricts – or at least strongly discourages – distributing the participants across a WAN.

Load Balancing

The idea behind load balancing is to send incoming requests for processing to the least busy server in the cluster. Load balancing increases productivity because it fully utilizes the available resources while delivering increased application throughput and optimal response time.

Generally speaking, load balancing for a group of servers can be done via hardware or software. Hardware load balancing usually gives the administrator less flexibility, while software load balancing offers more precise techniques for configuring the system and measuring performance.

The algorithm for load balancing developed by In-Q-My is as follows. At regular intervals, set by the administrator, the servers in the cluster send the dispatcher(s) information about their resource usage as a percentage. When a new request arrives, the dispatcher(s) send it to the least loaded server. The algorithm takes into account not only the percentage of used resources but also the percentage of free resources, which means that if a very large number of requests arrives simultaneously, they will not all be sent to the single server that is least busy at the moment; instead, they will be directed to several machines in a way that distributes the load evenly among them.

This algorithm is simple, yet powerful and functions faultlessly in clusters where servers run on different platforms and/or have different processing capabilities (not equal processing power or not equal amount of memory) as well as in homogeneous clusters where all servers have equal processing power.
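As a rough sketch, the algorithm described above could look like the following Java fragment. The class and method names are illustrative assumptions, not In-Q-My's actual API:

```java
import java.util.*;

// Simplified sketch of the load-balancing idea described above.
public class LoadBalancer {
    private final Map<String, Integer> loadPercent = new HashMap<>();

    // Servers report their resource usage (0-100%) at regular intervals.
    public void reportLoad(String server, int percentUsed) {
        loadPercent.put(server, percentUsed);
    }

    // For a single request: pick the least-loaded server.
    public String pickServer() {
        return Collections.min(loadPercent.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }

    // For a burst of requests: distribute them in proportion to each
    // server's free capacity, so no single server is flooded.
    public Map<String, Integer> distributeBurst(int requests) {
        int totalFree = loadPercent.values().stream()
                .mapToInt(used -> 100 - used).sum();
        Map<String, Integer> assignment = new LinkedHashMap<>();
        int assigned = 0;
        for (Map.Entry<String, Integer> e : loadPercent.entrySet()) {
            int share = requests * (100 - e.getValue()) / totalFree;
            assignment.put(e.getKey(), share);
            assigned += share;
        }
        // Give any rounding remainder to the least-loaded server.
        assignment.merge(pickServer(), requests - assigned, Integer::sum);
        return assignment;
    }
}
```

Note how a burst is split across all machines in proportion to their free resources, rather than being dumped on the single least-busy server.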

Fail-Over Recovery

Fail-over recovery means that if one of the participants in the cluster is temporarily or permanently unavailable, for instance because of a hardware crash, its functions are taken over by another participant. As a result, requests are automatically redirected to a working server in the cluster, and users and applications are unaware of what has happened, because the system continues to process requests properly. Fail-over recovery applies not only to the servers in the cluster but to the dispatchers as well. If there are two or more dispatchers in the system and one of them fails, incoming requests are redirected to the IP address of another dispatcher.
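The redirect logic can be sketched minimally as follows; this is an illustration of the idea, not In-Q-My's actual implementation:

```java
import java.util.*;

// Minimal sketch of fail-over redirection: if the chosen participant is
// down, the request transparently goes to the next available one.
public class FailoverRouter {
    private final List<String> participants;
    private final Set<String> down = new HashSet<>();

    public FailoverRouter(List<String> participants) {
        this.participants = new ArrayList<>(participants);
    }

    public void markDown(String node) { down.add(node); }
    public void markUp(String node)   { down.remove(node); }

    // Route a request to the first participant that is still alive.
    public String route() {
        for (String node : participants) {
            if (!down.contains(node)) {
                return node;
            }
        }
        throw new IllegalStateException("no available participant");
    }
}
```

The client never has to know which node actually served it; the redirect happens inside the cluster.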

Fail-over recovery is provided for data, too. Data is replicated (mirrored) on other participants in the cluster, which ensures that even in case of a crash there will be no loss of information. It is vitally important to preserve not just the data but its latest state after a transaction has been committed or rolled back on a server in the cluster. With this in mind, In-Q-My has developed a distributed database that allows synchronization of the cached data distributed across the cluster. The fail-over algorithms are based on the communication protocols for access to the system developed by In-Q-My. They prevent a single point of failure from affecting the whole system, which leads to improved performance and higher availability.

High Availability

One of the most irritating messages a user sees is “The server could be down or is not responding.” Users expect the server to be up and running twenty-four hours a day, seven days a week. An unavailable server is a disaster for e-business. The losses due to downtime cannot be measured only in terms of revenue or missed profit; downtime damages the image of the company and undermines confidence in it.

High availability is closely tied to scalability and fail-over recovery. These are the two factors (besides the hardware parameters of the participants in the cluster, of course) that predominantly influence the availability of the system. High availability has several aspects. Even when the number of participants in the cluster is very large, and even when the fail-over recovery and load-balancing algorithms are perfect, it is theoretically possible for billions of requests to arrive simultaneously, which poses a risk of overloading the system. In such a situation, it is more important to finish processing the requests already received than to accept new ones. Temporarily refusing to accept new requests is not a high-availability trade-off; it simply guarantees that under extreme circumstances the system will continue to function without compromising security or the reliability of the processed information.

Easy Administration

One of the greatest advantages of clusters is that they are easy to administer. Through the Visual and Console administration facilities, In-Q-My Application Server allows integrated administration of the machines in the cluster instead of administering standalone servers. The presence of a “central point of command” reduces the costs of administration and training in terms of money, time, and overall efficiency. Integrated administration creates a complete view of all the critical components and gives better control over total performance. It also allows fine-tuning of particular components/nodes on one or more of the participants in the cluster, which is a prerequisite for an adequate reaction in critical situations.

In-Q-My Application Server provides means for remote administration of the cluster, too. The administration tools allow secure monitoring both at start-up and during runtime. The runtime administration facilities give information about the status of execution of requests.

The cluster-level log files are another useful instrument for administering a cluster. In-Q-My provides cluster-level log files, which make it possible to monitor what has happened in the various modules of the system and give administrators the freedom to choose what kind of information is logged, where it is logged, and in how much detail.
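As an illustration, such logging choices might be expressed in a property file along these lines. The key names here are hypothetical, invented for the example; the actual configuration keys are documented in the In-Q-My Application Server manual:

```properties
# Hypothetical sketch of cluster-level log settings.
# Which modules to log
log.cluster.modules=ClusterManager, HttpService, DBPool
# Where to log
log.cluster.destination=/var/log/inqmy/cluster.log
# In how much detail
log.cluster.level=INFO
```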

Cluster Architecture

Cluster components

There are two types of cluster components that participate in the In-Q-My Application Server Cluster. As mentioned above, these two kinds of participants are dispatchers and servers.

The Participants in In-Q-My Application Server Cluster

Dispatchers

The function of the dispatcher(s) in the cluster is to receive requests from clients and distribute them to the servers in the cluster for processing. After a request is processed, the result is returned to the dispatcher and from there to the client. The route a request travels differs depending on its complexity and the resources it has to access. The general scheme is discussed in more detail in the chapter about the Route of the Requests.

What is important to note is that dispatchers do not process requests. One might wonder what their role is – only to forward requests to the servers and responses back to the clients? In the cluster hierarchy this role is crucial, especially with regard to load balancing and fail-over.

If a considerable number of requests is expected, the dispatchers must be powerful enough machines to handle the requests without becoming the bottleneck of the system, especially given that each dispatcher communicates with each of the servers in the cluster, as seen in the picture above. But this does not necessarily mean that the dispatcher(s) must be the most powerful machine(s) in the cluster. The recommended (hardware) configurations for the cluster also depend on the complexity of the applications to be deployed and how much processing is to be done, as will be discussed next.

Servers

The role of servers is to process client requests. What is important to note is that servers do not communicate directly with clients. Each of the servers in the cluster communicates with all other servers and dispatchers, as the picture shows. The fact that each server communicates with all other servers means that data is replicated and synchronized, which leads to faster and more secure processing of requests and execution of the business logic.

The business logic of applications (EJBs, JSPs, Servlets, etc.) is hosted on the servers. In the case of complex applications that require a lot of processing and/or communication, servers can become a bottleneck for the system if their number or processing power is insufficient.

Cluster Configurations

In-Q-My Application Server provides several cluster solutions. Depending on the goals set for a particular system – complexity of applications, expected number of requests, available hardware, etc. – its administrator should choose how to configure the cluster. There is no general recommendation about the number of participants in the cluster, the ratio between dispatchers and servers, which services and components to deploy on which machines, and so on. The main guiding light is the overall performance and resource usage of the system. For instance, when the workload is not heavy, it is not reasonable to have many dispatchers and/or many servers, because their resource usage will be low and the system will not be efficient.

In other cases the administrator(s) of the cluster need to configure a cluster comprised of as many machines as possible, because the requests are arriving "in bulk".

The recommended cluster configurations for application servers can be divided into two groups – those with only one dispatcher and those with more than one. Needless to say, at least one dispatcher and one server are necessary for a cluster to be set up. For very small systems this configuration – one dispatcher and one server – could be the right choice.

The first possible configuration is a single dispatcher with n servers. As stated above, the minimum possible number of servers in the cluster is one. With a single server, this configuration allows building a stable and reliable system, but it provides no fail-over or load balancing (obviously, if the only server crashes, there is nowhere else to redirect the requests), which are among the main advantages of clustering and large-scale distributed systems.

More common are configurations that involve two, three, or more servers. Again, all client requests are received by the dispatcher, but in this configuration the dispatcher's role is to redirect each request to the least busy server, where it is processed. This provides better fail-over, load balancing, and scalability, which leads to higher availability and better overall performance. The resources of the system are used better because requests are processed concurrently. The administration tools provided by In-Q-My Application Server make managing such a cluster configuration an easy job.

The "one dispatcher – many servers" configuration is suitable when the number of arriving requests is moderate, so that all of them can be accepted by the dispatcher without turning it into the bottleneck of the system. Still, the single dispatcher is a potential single point of failure. For this configuration it is recommended that applications and services be distributed homogeneously over the servers in the cluster, because this allows better scalability.

The second possible configuration is m dispatchers – n servers (m < n). In this configuration it is not mandatory for all dispatchers and servers to be on the same LAN (Local Area Network). They can belong to different WANs (Wide Area Networks) and can even be distributed globally, i.e. in different countries on different continents.

The m dispatchers – n servers configuration unleashes the full potential of the clustering feature of In-Q-My Application Server, because it allows more requests to be received and processed. Again, there is no general recommendation about the ratio between the dispatchers and the servers in the cluster, nor about how to distribute the services and the applications on the machines in the cluster.

Often it is better to distribute homogeneously those applications and services that allow it. This affords real scalability and fail-over – the risk of a single point of failure is reduced further. In terms of scalability the improvement is clear: a new participant, whether a server or a dispatcher, can be plugged into the system at any time. It only has to register with the group to join the cluster.
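The "register with the group and join" idea can be sketched as follows; the class is an illustrative toy, not In-Q-My's membership protocol:

```java
import java.util.*;

// Sketch of dynamic cluster membership: a new participant registers with
// the group and immediately becomes part of the cluster, with no restart.
public class ClusterGroup {
    public enum Role { DISPATCHER, SERVER }

    private final Map<String, Role> members = new LinkedHashMap<>();

    // A joining node only has to register itself with the group.
    public void register(String nodeId, Role role) {
        members.put(nodeId, role);
    }

    // A leaving (or crashed) node is simply removed from the group.
    public void unregister(String nodeId) {
        members.remove(nodeId);
    }

    public int size() { return members.size(); }

    public long count(Role role) {
        return members.values().stream().filter(r -> r == role).count();
    }
}
```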

One of the issues to be considered with this configuration is overall performance. "By default" the overall performance of an m dispatchers – n servers cluster is better than that of the other possible configurations, but this requires proper installation and configuration of the system, proper development of the applications, and proper deployment. Another potential drawback is out of the reach of In-Q-My Application Server: the operating system(s) the cluster runs on. Obviously, not all operating systems are equally good as server platforms, although In-Q-My Application Server is a cross-platform server, which provides more flexibility in choosing the operating system(s) to run the cluster on.

Logical Modules in the Cluster

A cluster can be discussed from many angles. One of them – the role of the participants in it – was the subject of the previous section. Another approach is to look at the modules, the separate parts that integrate the pieces of code into a system.

There are three types of logical modules in the structure of In-Q-My Application Server:

· core systems (managers);

· core services;

· (additional) services.

These three types of logical modules build every server and dispatcher in the cluster. The list of core systems, core services, and (additional) services differs depending on the type of the participant (a server or a dispatcher). The picture below illustrates the modules on a separate machine and the stream of communication among them. Communication is bi-directional, as the red and blue arrows indicate.

Logical Modules

Managers

Managers are the very base of the servers and dispatchers in the cluster. Core systems are located on each of the machines in the cluster and are not distributed. Managers are the first to be started when a server/dispatcher starts.

The managers included in In-Q-My Application Server are:

· Framework. This is the manager that initializes the rest of the managers. In a way, it is a container for the managers.

· Managers with special cluster-related functions – Cluster Manager and Dispatcher (respectively Server) Service Manager.

· Log Manager, Thread Manager, Ports Manager, Timeout Manager, Memory Manager, Connections Manager, Classloader Manager. Their roles are explained in more detail in the technical documentation of In-Q-My Application Server.

Different managers perform different tasks. The two managers that are (most) related to the cluster are Cluster Manager and Dispatcher/Server Service Manager.

As its name implies, the Cluster Manager is directly related to the cluster. It is the module in In-Q-My Application Server that starts the cluster. If the Cluster Manager on the first dispatcher node does not start, the whole cluster will not start. If the Cluster Manager on a server node does not start, that element will not join the cluster. The property files of the Cluster Manager allow its properties to be set.
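For illustration, such a property file might look roughly like this. The key names below are hypothetical, invented for the example; the real keys are listed in the In-Q-My Application Server technical documentation:

```properties
# Hypothetical excerpt from a Cluster Manager property file.
cluster.name=production
# Node type: "server" or "dispatcher"
cluster.node.type=server
# How long a joining node waits for the group before giving up
cluster.join.timeout.ms=30000
```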

The Dispatcher Service Manager runs only on dispatchers, and the Server Service Manager runs only on servers. With the exception of the Cluster Manager, no manager on one machine communicates with its counterpart (the same manager) on another machine, whether that machine is a server or a dispatcher – not even the Dispatcher Service Manager or the Server Service Manager. All communication in the cluster passes through the Cluster Managers of the servers and dispatchers, as shown in the picture on the next page.

Core Services

Core Services are the second module in the kernel of the server. They provide the basic functionality of the server/dispatcher. On each of the machines in the cluster, Core Services are started by the Service Manager, as seen in the picture above. The Core Services in In-Q-My Application Server include: AI Service, SSL Service, Log, Deploy, Httptunneling, Crypt, P4, etc. The list of Core Services on servers and dispatchers differs. For instance, Admin, SecurityManager, DBMS, etc. are Core Services on servers but, quite logically, are not run on dispatchers at all, because the role of a dispatcher in the cluster does not involve managing users or taking care of security, for instance.

(Additional) Services

Services are the logical modules in the architecture of the cluster that extend its functionality. They are not part of the kernel of the server, and the system can work properly without them, but the corresponding service will not be provided to clients. Again, the list of additional services differs between server and dispatcher nodes. It also differs among the individual servers and dispatchers (Server 1, Server 2, etc.), because it reflects the role of the particular participant. Some of the additional services are: telnet, http, ejbentity, ejbsession, servlet_jsp, dbpool. It is up to the administrator to choose which services to start on which machine, and this decision depends on the needs of the system (for instance, what kind of applications are to be deployed in the cluster and on a particular machine).

Functionality and configuration of (additional) services is discussed in more detail in the documentation of In-Q-My Application Server.

The Stream Of Communication In The Cluster

This picture presents the logical modules in the architecture of In-Q-My Application Server and how they communicate with each other. For clarity, only the communication between the different participants in the cluster is presented here. The data stream of communication between the logical modules for each of the participants is given in the previous picture. It does not change when the machines are connected in a cluster.

The Route of Requests

Application servers and multi-tier architecture

It is easier to understand the logic of the route of processing requests by In-Q-My Application Server, if this process is discussed in connection with the role of Web application servers and multi-tier architecture.

The role of Web application servers is to provide a platform for deploying and running sophisticated Web applications in addition to serving static pages. The emergence of application server technology a couple of years ago led to the advent of multi-tier architecture, which was an answer to the limitations imposed by traditional two-tier (client-server) architectures.

In multi-tier architectures clients do not communicate directly with the database, but send their requests to the application server, which then makes the connection with the database. The next picture shows the general route of requests in multi-tier architectures and the role of In-Q-My Application Server:

Multi-Tier Architectures And The Place Of Web Application Servers In Them

The next picture presents in more detail what is happening to the request inside the application server.

The Route Of Requests

The aim of this picture is to present in a simplified manner what happens to requests from the moment they are sent by the client to the dispatcher till the results are delivered back to the client. When looking at this picture, one should bear in mind that:

a) The route of requests shows the longest possible route a request could travel. It is not meant to imply that every request passes along the whole route. Usually requests do not pass through all of these steps; they follow the order (1-10) but skip some steps and go directly to a later one. For instance, a JSP request has no need to go to the bean container in order to compile a servlet and return it to the client.

b) The picture deliberately shows no cluster – only one server besides the dispatcher. The reason is to simplify the visual presentation of what kind of processing a request undergoes in the system, no matter on which of the many servers in the cluster it happens. If the picture were to present several servers, as is actually the case in a cluster, it would contain n JSP Engines, n Servlet Engines, n Bean Containers, etc., distributed over several servers, making the idea harder for the reader to grasp. It is more important to show the route in general and the different stages a request passes through than which container or engine on which server does what in particular, especially since this depends almost entirely on the way the cluster is configured. Above all, presenting the route as "distributed" among the servers might suggest that this is the recommended configuration – a JSP Engine on Server 1 accesses the Servlet Engine on Server 4, which in turn passes the request to the EJB Container on Server 45, and so on. Once again, there is no recommended configuration for what to deploy where – it is up to the administrator and, above all, to the particular application.

c) The results of processing are delivered along the same route but in reverse order, i.e. steps 10, 9, 8, etc., which is why the return path is not presented separately.

1. The browser (in this case, but it could be another client application) sends a request to the dispatcher(s) of the cluster.

2. The dispatcher(s) receive the request and, depending on its type, direct it to a server to be processed. In the case of a browser, which sends requests using the HyperText Transfer Protocol (HTTP), the dispatcher sends the request to a server on which the HTTP Service of In-Q-My Application Server is running. If a considerable number of requests for static HTML pages is expected, it is possible at this stage to add an Apache Web server and send those requests to it. The idea is to leave the dynamic HTML pages, which require more serious processing before being delivered to the client, to In-Q-My Application Server.

3. A server receives the request. This stage allows requests to be distributed among the servers in the cluster. The corresponding service of In-Q-My Application Server receives the request. Depending on the type of the request, the dispatcher can direct it not only to the HTTP Service but also to the EJB Session Service, EJB Entity Service, Deploy Service, etc.

4. This stage also allows the request to be distributed among the servers in the cluster. If it is an HTML request containing a JSP that must be processed before being returned to the client, the request is passed to the JSP engine.

5. The servlet engine compiles the JSP into a servlet. The servlet can either be returned to the client or, if it uses beans, it goes further along the route. This stage also allows the request to be distributed among the servers in the cluster.

6. If the client is an application that does not need HTTP and the processing of HTML pages, JSPs, and servlets, at step 3 the request is sent to the corresponding EJB service of In-Q-My Application Server, and steps 4 and 5 are skipped. This stage does not allow the request to be distributed among the servers in the cluster, i.e. a given session bean can be processed only on the server where it was received. For faster overall performance, however, it is advisable for session beans to be replicated in the cluster.

7. At this stage entity beans are processed. Entity beans with container-managed persistence can be distributed among the servers in the cluster. There is always one server that is responsible for all transactions with a given bean. EJBs with bean-managed persistence make direct connections to the database through the DB Pool.

8. The role of the buffer (storage) is to establish connections with the pool and to lock connections, transactions, bean instances, etc. It is closely related to entity beans.

9. The DB Pool holds connections to the database (10) that are to be used only for the needs of In-Q-My Application Server. The DB Pool is located on only one machine and is not replicated in the cluster. The binding to the DB Pool is global, and it is accessed via the Naming Service.

10. This could be any external database for which In-Q-My Application Server supports a driver – Oracle, SAP-DB, etc. This is the last stage in the processing of a request by an application server.
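The ten steps above can be summarized as a simple routing table: depending on the request type, some stages are skipped. The following Java fragment is a simplified illustration of that idea, not the server's actual dispatch code, and the request-type names are invented for the example:

```java
import java.util.*;

// Simplified illustration of the route above: each request type passes
// through a subset of the ten stages on its way in; the results travel
// the same route in reverse.
public class RequestRoute {

    public static List<Integer> stagesFor(String requestType) {
        switch (requestType) {
            case "static-html": // served without JSP/servlet/bean processing
                return List.of(1, 2, 3);
            case "jsp":         // compiled to a servlet, no beans involved
                return List.of(1, 2, 3, 4, 5);
            case "ejb":         // steps 4 and 5 (JSP/servlet) are skipped
                return List.of(1, 2, 3, 6, 7, 8, 9, 10);
            default:            // the longest possible route
                return List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        }
    }
}
```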

Bottlenecks Along This Route And In Multi-tier Web Applications In General

Bottlenecks in the functioning of an In-Q-My Application Server cluster can be expected to occur at every transition from one step of the route to the next, as well as in the connections between the tiers. Usually bottlenecks are due to problems external to the cluster. Generally speaking, the cluster keeps functioning, but bottlenecks reduce its performance and efficiency and threaten scalability, load balancing, and fail-over.

Among the factors for occurrence of bottlenecks are:

· insufficient processing power of some or all of the machines in the cluster or on the clients' side;

· improper configuration and administration of the machines in the cluster;

· poorly thought-out development and deployment of the applications running in the cluster.

As this list implies, the reasons for bottlenecks can lie in any of the tiers of a multi-tier architecture.

Bottlenecks In The First Tier – Clients And The Connection To The Internet

This source of trouble has nothing to do with the In-Q-My Application Server clustering solution, but since it also degrades the overall efficiency of the system, it must be mentioned.

Usually bottlenecks in the first tier are due to underpowered machines, which cannot adequately meet the processing requirements at this stage. Sometimes there is another reason for bad performance: the applications and components deployed on In-Q-My Application Server contain a considerable amount of business logic to be executed by clients (applets, scripts, forms), and this overloads the clients' machines. One possible solution is to rewrite the applications and components so that part of the logic is executed on the servers in the cluster, not by the client.

The second possible bottleneck is the client's Internet connection. If the bandwidth of the connection is low, this can also be a problem for thick applications. One possible solution is upgrading the bandwidth; another is again modifying the applications and components to make them thinner. A further possible solution is mirroring (even geographically) the servers in the cluster.

Bottlenecks In Dispatchers And Servers

Basically, the dispatchers and servers in the cluster can become a bottleneck if they are not properly configured for the number of requests and for the complexity of the applications running in the cluster. One possible reason is an inadequate ratio between servers and dispatchers – either the number of dispatchers is not enough for the number of requests, or the servers (their total number, or the number of servers configured for particular services) are not enough and are flooded by requests. The situation gets worse if the requests require more complex processing before being returned to the client.

The above-mentioned problem is more serious in clusters that use hardware for load balancing, because hardware usually does not allow such fine-tuning of the system. Some competitors' clusters, which use load-balancing hardware in place of a software dispatcher, do not provide load balancing and fail-over for clustered servlets and JSPs. The presence of dispatchers in the In-Q-My Application Server cluster solution provides more flexibility in configuring the system, as well as tools for administering it.

Sometimes the number of requests turns the dispatcher(s) into a bottleneck due to external factors. Not all operating systems are equally good as platforms for Web application servers. Under certain circumstances, the operating system the dispatcher is running on can impose limits on the number of requests that can pass through a socket.
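One concrete OS-level limit of this kind is the listen backlog: the queue of pending, not-yet-accepted connections a socket will hold before the operating system starts refusing new ones. A minimal sketch (the backlog value 128 is illustrative; the kernel may silently cap it at its own maximum, e.g. `net.core.somaxconn` on Linux):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
# Ask the OS to queue at most 128 pending connections; beyond that,
# new connection attempts are refused or dropped by the kernel,
# regardless of how fast the dispatcher application itself is.
srv.listen(128)
host, port = srv.getsockname()
srv.close()
```

This is why two identically configured dispatchers can behave differently under load on different operating systems: the effective backlog ceiling is set by the OS, not by the application.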

ARTIFICIAL INTELLIGENCE

We start by making a distinction between mind and cognition, positing that cognition is an aspect of mind. As a working hypothesis we propose a Separability Hypothesis, which posits that we can factor off an architecture for cognition from a more general architecture for mind, thus avoiding a number of philosophical objections that have been raised against the "Strong AI" hypothesis. We also argue that the search for a single architectural level that will explain all the interesting phenomena of cognition is likely to be futile. There are a number of levels which interact, unlike in the computer model, and this interaction makes the explanation of even relatively simple cognitive phenomena in terms of one level quite incomplete.

I. Dimensions for Thinking About Thinking:

A major problem in the study of intelligence and cognition is the range of—often implicit—assumptions about what phenomena these terms are meant to cover. Are we just talking about cognition as having and using knowledge, or are we also talking about other mental states such as emotions and subjective awareness? Are we talking about intelligence as an abstract set of capacities, or as a set of biological mechanisms and phenomena? These two questions set up two dimensions of discussion about intelligence. After we discuss these dimensions we will discuss information processing, representation, and cognitive architectures.

A. Dimension 1: Is intelligence separable from other mental phenomena?

When people think of intelligence and cognition, they often think of an agent being in some knowledge state, that is, having thoughts, beliefs. They also think of the underlying process of cognition as something that changes knowledge states. Since knowledge states are particular types of information states the underlying process is thought of as information processing. However, besides these knowledge states, mental phenomena also include such things as emotional states and subjective consciousness. Under what conditions can these other mental properties also be attributed to artifacts to which we attribute knowledge states? Is intelligence separable from these other mental phenomena?

It is possible that intelligence can be explained or simulated without necessarily explaining or simulating other aspects of mind. A somewhat formal way of putting this Separability Hypothesis is that the knowledge state transformation account can be factored off as a homomorphism of the mental process account. That is: if the mental process can be seen as a sequence of transformations M1 --> M2 --> ..., where Mi is the complete mental state and the transformation function (the function responsible for state changes) is F, then a subprocess K1 --> K2 --> ... can be identified such that each Ki is a knowledge state and a component of the corresponding Mi, the transformation function is f, and f is some kind of homomorphism of F. A study of intelligence alone can restrict itself to a characterization of the K’s and f, without producing accounts of the M’s and F. If cognition is in fact separable in this sense, we can in principle design machines that implement f and whose states are interpretable as K’s. We can call such machines cognitive agents, and attribute intelligence to them. However, the states of such machines are not necessarily interpretable as complete M’s, and thus they may be denied other attributes of mental states.
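The factoring can be sketched concretely. In the toy model below (the state contents are invented for illustration and are not part of the hypothesis itself), `h` projects a complete mental state M onto its knowledge component K, and separability amounts to the commuting condition h(F(M)) = f(h(M)): transforming the whole state and then projecting gives the same result as projecting first and transforming only the knowledge part.

```python
def F(M):
    """Full mental-state transformation: both knowledge and mood change."""
    return {"knowledge": M["knowledge"] | {"derived-fact"},
            "mood": "curious" if M["mood"] == "bored" else M["mood"]}

def h(M):
    """Projection from a complete mental state M to its knowledge state K."""
    return M["knowledge"]

def f(K):
    """Factored knowledge-state transformation: operates on K alone."""
    return K | {"derived-fact"}

M1 = {"knowledge": {"fact-a"}, "mood": "bored"}
# Separability: the knowledge-level account commutes with the full account.
commutes = h(F(M1)) == f(h(M1))
```

A machine implementing only `f` is what the text calls a cognitive agent: its states are interpretable as K’s, but nothing in it corresponds to the mood component of M.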

B. Dimension 2: Functional versus Biological:

The second dimension in discussions about intelligence involves the extent to which we need to be tied to biology for understanding intelligence. Can intelligence be characterized abstractly as a functional capability which just happens to be realized more or less well by some biological organisms? If it can, then study of biological brains, of human psychology, or of the phenomenology of human consciousness is not logically necessary for a theory of cognition and intelligence, just as enquiries into the relevant capabilities of biological organisms are not needed for the abstract study of logic and arithmetic or for the theory of flight. Of course, we may learn something from biology about how to practically implement intelligent systems, but we may feel quite free to substitute non-biological (both in the sense of architectures which are not brain-like and in the sense of being un- constrained by considerations of human psychology) approaches for all or part of our implementation. We might have different constraints on a definition that needed to include emotion and subjective states than one that only included knowledge states. Clearly, the enterprise of AI deeply depends upon this functional view being true at some level, but whether that level is abstract logical representations as in some branches of AI, Darwinian neural group selections as proposed by Edelman, something intermediate, or something physicalist is still an open question.

III. Architectures for Intelligence:

We now move to a discussion of architectural proposals within the information processing perspective. Our goal is to try to place the multiplicity of proposals into perspective. As we review various proposals, we will present some judgements of our own about relevant issues. But first, we need to review the notion of an architecture and make some additional distinctions.

A. Form and Content Issues in Architectures:

In computer science, a programming language corresponds to a virtual architecture. A specific program in that language describes a particular (virtual) machine, which then responds to various inputs in ways defined by the program. The architecture is thus what Newell calls the fixed structure of the information processor being analyzed, and the program specifies a variable structure within this architecture. We can regard the architecture as the form and the program as the content, which together fully instantiate a particular information processing machine. We can extend these intuitions to types of machines which are different from computers: a particular connectionist machine, for example, is instantiated by the "program" that specifies values for its variable parameters, the connection weights.

We have discussed the prospects for separating intelligence (a knowledge state process) from other mental phenomena, and also the degree to which various theories of intelligence and cognition balance between fidelity to biology versus functionalism. We have discussed the sense in which alternatives such as logic, decision tree algorithms, and connectionism are all alternative languages in which to couch an information processing account of cognitive phenomena, and what it means to take a Knowledge Level stance towards cognitive phenomena. We have further discussed the distinction between form and content theories in AI.

B. Intelligence as Just Computation:

Until recently the dominant paradigm for thinking about information processing has been the Turing machine framework, or what has been called the discrete symbol system approach. Information processing theories are formulated as algorithms operating on data structures. In fact, AI was launched as a field when Turing proposed in a famous paper that thinking was computation of this type (the term "artificial intelligence" itself was coined later). Natural questions in this framework would be whether the set of computations that underlie thinking is a subset of Turing-computable functions, and if so how the properties of the subset should be characterized.

Most of AI research consists of algorithms for specific problems that are associated with intelligence when humans perform them. Algorithms for diagnosis, design, planning, etc., are proposed, because these tasks are seen as important for an intelligent agent. But as a rule no effort is made to relate the algorithm for the specific task to a general architecture for intelligence. While such algorithms are useful as technologies and to make the point that several tasks that appear to require intelligence can be done by certain classes of machines, they do not give much insight into intelligence in general.

C. Architectures for Deliberation:

Historically most of the intuitions in AI about intelligence have come from introspections about the relationships between conscious thoughts. We are aware of having thoughts which often follow one after another. These thoughts are mostly couched in the medium of natural language, although sometimes thoughts include mental images as well. When people are thinking for a purpose, say for problem solving, there is a sense of directing thoughts, choosing some, rejecting others, and focusing them towards the goal. Activity of this type has been called "deliberation." Deliberation, for humans, is a coherent goal-directed activity, lasting over several seconds or longer. For many people thinking is the act of deliberating in this sense. We can contrast activities in this time span with other cognitive phenomena, which, in humans, take under a few hundred milliseconds, such as real-time natural language understanding and generation, visual perception, being reminded of things, and so on. These short time span phenomena are handled by what we will call the subdeliberative architecture, as we will discuss later.

Researchers have proposed different kinds of deliberative architectures, depending upon which kind of pattern among conscious thoughts struck them. Two groups of proposals about such patterns have been influential in AI theory-making: the reasoning view and the goal-subgoal view.

1. Deliberation as Reasoning:

People have for a long time been struck by logical relations between thoughts and have made the distinction between rational and irrational thoughts. Remember that Boole’s book on logic was titled "Laws of Thought." Thoughts often have a logical relation between them: we think thoughts A and B, then thought C, where C follows from A and B. In AI, this view has given rise to an idealization of intelligence as rational thought, and consequently to the view that the appropriate architecture is one whose behavior is governed by rules of logic. In AI, McCarthy is most closely identified with the logic approach, and [McCarthy and Hayes, 1969] is considered a clear early statement of some of the issues in the use of logic for building an intelligent machine.

Researchers in AI disagree about how to make machines which display this kind of rationality. One group proposes that the ideal thought machine is a logic machine, one whose architecture has logical rules of inference as its primitive operators. These operators work on a storehouse of knowledge represented in a logical formalism and generate additional thoughts. For example, the Japanese Fifth generation project came up with computer architectures whose performance was measured in (millions of) inferences per second.
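The "inferences per second" picture can be made concrete with a tiny forward-chaining engine: logical rules of inference are the primitive operators, and they repeatedly fire over a storehouse of facts. The rules and facts below are invented for illustration, not drawn from any particular system:

```python
def forward_chain(facts, rules):
    """Apply modus ponens repeatedly until no new facts are generated.

    rules: list of (premises, conclusion) pairs, where premises is a set.
    """
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)   # one primitive "inference" step
                changed = True
    return facts

rules = [({"socrates-is-a-man"}, "socrates-is-mortal"),
         ({"socrates-is-mortal"}, "socrates-will-die")]
derived = forward_chain({"socrates-is-a-man"}, rules)
```

A logic machine in the Fifth Generation sense is, in essence, hardware optimized to execute the inner loop of such an engine as fast as possible.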

Historically rationality was characterized by the rules of deduction, but in AI, the notion is being broadened to include a host of non-deductive rules under the broad umbrella of "non-monotonic logic" [McCarthy, 1980] or "default reasoning," to capture various plausible reasoning rules. There is considerable difference of opinion about whether such rules exist in a domain-independent way as in the case of deduction, and how large a set of rules would be required to capture all plausible reasoning behaviors. If the number of rules is very large, or if they are context-dependent in complicated ways, then logic architectures would become less practical.
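Default reasoning can be sketched as rules that hold unless an exception is known. The standard bird/penguin example below is a textbook illustration (not from this text); the key non-monotonic behavior is that adding a fact can retract a previously drawn conclusion, which never happens in pure deduction:

```python
def flies(animal, facts):
    """Default rule: birds fly, unless known to be an exception."""
    if ("bird", animal) not in facts:
        return False
    # The exception defeats the default: penguins are birds that don't fly.
    return ("penguin", animal) not in facts

facts = {("bird", "tweety")}
before = flies("tweety", facts)       # default applies: True
facts.add(("penguin", "tweety"))
after = flies("tweety", facts)        # new fact retracts the conclusion
```

The text's worry is visible even in this toy: each default carries its own exception conditions, and if those are numerous or context-dependent, the rule set stops looking like a small domain-independent logic.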

2. Deliberation as Goal-Subgoaling:

An alternate view of deliberation is inspired by another perceived relation between thoughts and provides a basic mechanism for control as part of the architecture. Thoughts are often linked by means of a goal-subgoal relation. For example, you may have a thought about wanting to go to New Delhi, then you find yourself having thoughts about taking trains and airplanes, and about which is better, then you might think of making reservations and so on. Newell and Simon [1972] have argued that this relation between thoughts, the fact that goal thoughts spawn subgoal thoughts recursively until the subgoals are solved and eventually the goals are solved, is the essence of the mechanism of intelligence. More than one subgoal may be spawned, and so backtracking from subgoals that didn’t work out is generally necessary. Deliberation thus looks like search in a problem space. Setting up the alternatives and exploring them is made possible by the knowledge that the agent has. A long term memory is generally proposed which holds the knowledge and from which knowledge relevant to a goal is brought to play during deliberation. This analysis suggests an architecture for deliberation that retrieves relevant knowledge, sets up a set of alternatives to explore (the problem space), explores it, sets up subgoals, etc.
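The goal-subgoal mechanism described above can be sketched as recursive search in a problem space: a goal retrieves alternative decompositions from long term memory, each alternative spawns subgoals, and the agent backtracks from alternatives that do not work out. The travel "knowledge base" below is invented to echo the New Delhi example:

```python
# Long term memory: each goal maps to alternative decompositions,
# where a decomposition is a list of subgoals. An empty decomposition []
# means the goal is directly achievable.
KNOWLEDGE = {
    "reach-new-delhi": [["book-flight"], ["book-train"]],
    "book-flight": [["reserve-seat", "pay"]],
    "book-train": [[]],        # directly solvable
    "reserve-seat": [],        # no decomposition known: this path fails
    "pay": [[]],
}

def solve(goal):
    """Try each alternative decomposition; backtrack when one fails."""
    for subgoals in KNOWLEDGE.get(goal, []):
        if all(solve(g) for g in subgoals):
            return True        # this alternative worked
    return False               # all alternatives exhausted: backtrack

plan_found = solve("reach-new-delhi")
```

Here the flight alternative fails (no way to reserve a seat), the search backtracks, and the train alternative succeeds: the goal tree is exactly the spawn-subgoals-and-backtrack pattern the text describes.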

The most recent version of an architecture for deliberation in the goal-subgoal framework is Soar [Newell, 1990]. Soar has two important attributes. The first is that any difficulty it has in solving any subgoal simply results in the setting up of another subgoal, and knowledge from long term memory is brought to bear in its solution. It might be remembered that Newell’s definition of intelligence is the ability to realize the knowledge level potential of an agent. Deliberation and goal-subgoaling are intended to capture that capability: any piece of knowledge in long term memory is available, if it is relevant, for any goal. Repeated subgoaling will bring that knowledge to deliberation. The second attribute of Soar is that it "caches" its successes in problem solving in its long term memory. The next time there is a similar goal, that cached knowledge can be directly used, instead of searching again in the corresponding problem space.
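Soar's second attribute, caching successes, behaves much like memoization layered over deliberate search. The sketch below is an invented analogy, not Soar's actual chunking mechanism: a goal is first matched against "long term memory," and only on a miss does the expensive deliberation run, with its result stored for next time:

```python
chunks = {}          # "long term memory": goal -> previously found solution
search_calls = 0     # counts how often we fall back to deliberate search

def deliberate(goal):
    """Stand-in for expensive problem-space search."""
    global search_calls
    search_calls += 1
    return f"plan-for-{goal}"

def solve(goal):
    # Recognition first: reuse a cached chunk if one matches the goal.
    if goal not in chunks:
        chunks[goal] = deliberate(goal)   # "chunk" the success into memory
    return chunks[goal]

first = solve("make-tea")
second = solve("make-tea")   # second time: recognized, no new search
```

After the second call the search counter is still 1: a similar goal is answered from the cache directly, which is the sense in which repeated problem solving makes the system increasingly powerful.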

One of the results of this form of deliberation is the construction of special purpose algorithms or methods for specific problems. These algorithms can be placed in an external computational medium, and as soon as a subgoal arises that such a method or algorithm can solve, the external medium can solve it and return the results. For example, during design, an engineer might set up the subgoal of computing the maximum stress in a truss, and invoke a finite element method running on a computer. The deliberative engine can thus create and invoke computational algorithms. The goal-subgoaling architecture provides a natural way to integrate external algorithms.

In the Soar view, long term memory is just an associative memory. It has the capability to "recognize" a situation and retrieve the relevant pieces of knowledge. Because of the learning capability of the architecture, each episode of problem solving gives rise to continuous improvement. As a problem comes along, some subtasks are solved by external computational architectures which implement special purpose algorithms, others are directly solved by compiled knowledge in memory, while yet others are solved by additional deliberation. This cycle makes the overall system increasingly powerful.

Deliberation seems to be a source of great power in humans. Why isn’t recognition enough? As Newell points out, the particular advantage of deliberation is distal access to, and combination of, knowledge at run time in a goal-specific way. In the deliberative machine, temporary connections are created between pieces of knowledge that are not hard-coded, and this gives it the ability to more fully realize the knowledge level potential. As an architecture for deliberation, the goal-subgoal view seems to us closer to the mark than the reasoning view. As we have argued elsewhere [Chandrasekaran, 1991], logic seems more appropriate for justification of conclusions and as the framework for the semantics of representations than for the generative architecture.

AI theories of deliberation give central importance to human-level problem solving and reasoning. Any continuity with higher animal cognition or brain structure is at the level of the recognition architecture of memory, about which this view says little other than that it is a recognition memory. For supporting deliberation at the human level, long term memory should be capable of storing and generating knowledge with the full range of ontological distinctions that human language has.

3. Is the Search View of Deliberation Too Narrow?

A criticism of this picture of deliberation as a search architecture is that it is based on a somewhat narrow view of the function of cognition. It is worth reviewing this argument briefly.

Suppose a Martian watches a human in the act of multiplying numbers. The human, during this task, is executing some multiplication algorithm, i.e., appears to be a multiplication machine. The Martian might well return to his superiors and report that the human cognitive architecture is a multiplication machine. We, however, know that the multiplication architecture is a fleeting, evanescent virtual architecture that emerged as an interaction between the goal (multiplication) and the procedural knowledge of the human. With a different goal, the human might behave like a different machine. It would be awkward to imagine cognition to be a collection of different architectures for each such task; in fact, cognition is very plastic and is able to emulate various virtual machines as needed.

If the sole purpose of the cognitive architecture is goal achievement (or "problem solving"), then it is reasonable to assume that the architecture would be hard-wired for this purpose. What, however, if goal achievement is only one of the functions of the cognitive architecture, common though it might be? At least in humans, the same architecture is used to daydream, just take in the external world and enjoy it, and so on. The search behavior that we need for problem solving can come about simply by virtue of the knowledge that is made available to the agent’s deliberation from long term memory. This knowledge is either a solution to the problem, or a set of alternatives to consider. The agent, faced with the goal and a set of alternatives, simply considers the alternatives in turn, and when additional subgoals are set, repeats the process of seeking more knowledge. In fact, this kind of search behavior happens not only with individuals, but with organizations. They too explore alternatives, but yet we don’t see a need for a fixed search engine for explaining organizational behavior. Deliberation of course has to have the right sort of properties to be able to support search. Certainly adequate working memory needs to be there, and probably there are other constraints on deliberation.

The above argument is not meant to deemphasize the importance of problem space search for goal achievement, but to resist identifying the architecture of the conscious processor with one intended exclusively for search. The problem space architecture is still important as the virtual architecture for goal achieving, since that is a common, though not the only, function of cognition. In fact, a number of other such emergent architectures built on top of the deliberative architecture were studied earlier in our work on Generic Task architectures [1986]; these architectures were intended to capture the needs of specific classes of goals (such as classification).

D. Subdeliberative Architectures

We have made a distinction between cognitive phenomena that take less than a few hundred milliseconds for completion and those that evolve over longer time spans. We discussed proposals for the deliberative architecture to account for phenomena taking longer time spans. Some form of subdeliberative architecture is then responsible for phenomena that occur in very short time spans in humans. In deliberation, we have access to a number of intermediate states in problem solving. Many people in AI and cognitive science feel that the emphasis on complex problem solving as the door to understanding intelligence is misplaced, and that theories that emphasize rational problem solving only account for very special cases and do not account for the general cognitive skills that are present in ordinary people. These researchers focus almost completely on the nature of the subdeliberative architecture. There is also a belief that the subdeliberative architecture is directly reflected in the structure of the neural machinery in the brain. Thus, some of the proposals for the subdeliberative architecture claim to be inspired by the structure of the brain and claim a biological basis in that sense.

Alternative Proposals:

The various proposals differ along a number of dimensions: what kinds of tasks the architecture performs, degree of parallelism, whether it is an information processing architecture at all, and, when it is taken to be an information processing architecture, whether it is a symbolic one or some other type.

With respect to the kind of tasks the architecture performs, we mentioned Newell’s view that it is just a recognition architecture. Any smartness it possesses is a result of good abstractions and good indexing, but architecturally, there is nothing particularly complicated. In fact, the good abstractions and indexing themselves were the result of the discoveries of deliberation during problem space search. The real solution to the problem of memory, for Newell, is to get chunking done right: the proper level of abstraction, labeling, and indexing is all done at the time of chunking. In contrast to the recognition view are proposals that see relatively complex problem solving activities going on in subdeliberative cognition. Cognition in this picture is a communicating collection of modular agents, each of whom is simple, but capable of some degree of problem solving. For example, they can use the means-ends heuristic (the goal-subgoaling feature of deliberation in the Soar architecture).

V. Concluding Remarks:

We started by asking how far intelligence or cognition can be separated from mental phenomena in general. We suggested that the problem of an architecture for cognition is not really well-posed, since, depending upon what aspects of the behavior of biological agents are included in the functional specification, there can be different constraints on the architecture. We reviewed a number of issues and proposals relevant to cognitive architectures. Not only are there many levels each explaining some aspect of cognition and mentality, but the levels interact even in relatively simple cognitive phenomena.