Introduction to Microprocessors and Microcontrollers - читать онлайн бесплатно полную версию книги . Страница 11

10. High-level languages

Third-generation languages

The third-generation languages were intended to make life easier. They were designed to improve the readability by using English words which would make it easier to understand and to sort out any faults (bugs) in the program. The process of removing bugs is called debugging. In addition, they should relieve the programmer of any need to understand the internal architecture of the microprocessor and so the program should be totally portable. Ideally the programmer should not even need to know what processor is being used. These languages are called ‘high-level’ and are all procedural. Over the years, many languages have been invented just as there have been many microprocessors. Just like the microprocessors, a few languages had some special aptitude that made them stand out from the crowd. We will introduce some of the survivors.

Fortran

In the early days of computers, they were seen as a means of improving the speed and accuracy of performing mathematical calculations – rather as new and improved calculating machines. IBM dominated the computer world at that time and employed John Backus to produce an improved language to supersede assembly language.

The result, finalized in 1957, was Fortran. This was the first high-level language to gain widespread acceptance. Its claim to fame was that it could evaluate mathematical formulas. This gave rise to its name ‘FORmula TRANslation’ (originally ‘IBM Mathematical Formula Translating System’). (See Figure 10.1.)

Figure 10.1 The first successful high-level language – and still going

Fortran has instructions built-in to handle most scientific formulas such as finding the sine of an angle which would be extremely difficult to do in assembly language. Difficult, but not impossible. After all, the Fortran program must first be converted into the machine code understood by the microprocessor. If the Fortran program can be converted to machine code, then it follows that the program could have been written in machine code in the first case. Its just a matter of saving an enormous amount of work.

As time went by, small additions and alterations were made to the language giving rise to new ‘dialects’. The disadvantage of this is that it starts to erode portability – one of the primary reasons for having a high-level language. In 1958 a re-defined language called Fortran 2 was born which was replaced in turn by Fortran 3 and 4. Still new dialects were sprouting and, in 1966, a final, last, definitive version called Fortran 66 was designed. Then we had the even more final and more definitive Fortran 77 and the absolutely final and totally definitive Fortran 90. This is not as bad as it seems since each new version included extras rather than changes. There was a Fortran 95 version and there is another due out somewhere around the year 2004 to 2006 which, at the moment, is just referred to as 200X but at least this name suggests a launch no later than the year 2009. In the meantime, an easily learnt version, simply called F, was launched. This was compatible with Fortran 77, 90 and 95 but includes some more modern characteristics while older more redundant sections were quietly dropped. High Performance Fortran (HPF) is similar to the 90 and 95 versions except that it has been extended to allow microprocessors to be run in parallel for extra speed.

Compilers

In assembly language, we used an assembler program to convert the mnemonics to machine code. We usually refer to the conversions being from source code to object code but it means the same thing. In Fortran, or any high-level language, we use a compiler to produce the machine code. The compiler will also carry out the useful extras like error and syntax checking that we met with the assemblers. Compilers and assemblers are both software – that is, they are programs designed to do a specific job. If we were using a 68000 microprocessor, and wished to program it using a particular language, say Fortran, then we would have to purchase a ‘Fortran 90 to 68000’ compiler. It would do just this job and nothing else. We could not adapt it in any way to accept a different high-level language or ‘target’ it at a different processor (see Figure 10.2).

Figure 10.2 Compilers are like assemblers

All assemblers are basically one-to-one converters. We enter the op code and out comes the object code so the only difference between two assemblers is in the amount of debugging and label handling help that it can provide. Compilers are different. There are not any direct object code equivalents to things like sine or square root so the final output depends on the skills of the compiler designer. The programmer must look at each of the available commands in the high level language and write an assembly program to perform the function. Some compilers are, therefore, inevitably better than others.

Libraries, linkers and loaders

When the designer has struggled through the process of devising assembly code for a particularly nasty formula it would make sense to store the answer away to allow its use on another occasion. A collection of these solutions is called a library and can be purchased.

This reduces the amount of totally original coding that is needed. Many ‘new’ formulas can be made from combinations of existing code. Slotting these ready-made library routines into the main programs is performed by a linker which is another piece of software. The linker therefore joins or links together many separate pieces of code into one program ready for use. The last job to be done is to load it into some RAM ready for use. Another piece of software is used to determine which addresses in the microprocessor system memory are available. This is called a loader. A loader also converts labels to their final addresses. Very often the linking and loading functions are combined into a single linker-loader program. The process is illustrated in Figure 10.3.

Figure 10.3 From high level to low level

Fortran source code

Fortran consists of numbered statements called program lines and are used to tell the system the order in which the instructions are to be carried out. In the absence of any other commands, the program lines are executed in numerical order.

Fortran code is written in a very compact form much closer to mathematics than English. For example, to load a number and find the square root of it may look like this:

1 Read (4) P

2 A = SQRT(P)

In statement 1, the microprocessor is sent to input number 4 (this must be defined earlier) and retrieves a number which we call P. Statement 2 finds its square root.

Fortran has accumulated enormous libraries to handle scientific and engineering problems. The drawback of Fortran is that its instructions are so very compact that, unless you are happy with formulas, it can look a little frightening. In addition, its format is very precise and this makes it difficult and unforgiving to learn. If you have a morbid dread of mathematics, you may find its approach a little daunting.

Basic

In Dartmouth College, USA, a simplified language was developed. It was based on Fortran and was designed as a simpler language and easier to learn. This language was called Basic (Beginners’ All purpose Symbolic Instruction Code) and first appeared in 1960 (see Figure 10.4).

Figure 10.4 The Basics

In the early days, the emphasis was on ‘easy to learn’ and ‘using a minimum amount of memory’ as memory was very expensive. These two attributes made it very useful in colleges but was largely ignored ‘in the real world’. As microprocessors appeared and gave rise to the microcomputer, the benefits of low memory requirements gave it a renewed popularity and it ‘took off’.

To save memory, Basic was designed as an interpreted language. An interpreter rather than a compiler carried out the conversion of the source code to the object code.

What’s the difference? The compiler converts the whole program into object code, stores it, then runs the program. The interpreter takes a different approach. The first instruction in the program is converted to source code and it is then executed. The next item of source code is then converted to object code and then ran and so on right through the program (see Figure 10.5). The interpreter never stores the whole of the machine code program. It just generates it, a line at a time, as needed.

Figure 10.5 The Slow ’n’ easy interpreter

The development of Basic

The explosion of microcomputers in the 1980s resulted in widespread adoption of Basic. It was used, or at least played with, by more people than any programming language before or since. More variations, or dialects, started to appear, as occurred with Fortran.

In the case of Fortran, the American National Standard Institute (ANSI) collected all the ideas together and produced the standard Fortrans like Fortran 66, 77 and 90. This did not happen with Basic and the result is now an open market with several hundred different competing, sort of compatible, Basics floating around. Most of the earlier ones have withered away leaving a few popular versions like Q-Basic, GW Basic and Quick Basic. More recently, with the virtual monopoly of the Windows operating system, a new version called Visual Basic (VB) has appeared which has features to make the generation of Windows programs much easier.

Over the years, the language has developed to provide more and more features until it closely rivals Fortran, even for calculations. Basic has shaken off its beginner’s label and is used by many professional programmers. Now that memory size is not such a problem, compiled versions are now used to accelerate the execution of programs.

There has been a series of versions of Visual Basic, VB1 to VB6 and we now have VB.Net to allow easy manipulation of Windows™ and Web site material.

Basic is very readable

It uses line numbering as in Fortran and is generally readable. Here is an example. Test out its claim to readability by guessing the outcome.

10 input A, B

20 Let C = A * B

30 Print C

40 End

On line 10, it requests two numbers A and B to be entered, possibly via a keyboard. Line 20 defines a new number C as the result of multiplying them together. Print means providing an output either onto a printer or to the screen of the monitor. The last line stops the program.

As you will remember, we always have to give a microprocessor something safe to do after it has completed a program otherwise it will start following random instructions. We can do this by sending it in a loop as we did in the assembly example to follow the same instruction over and over again or we can send it off to follow another program. The END instruction does just this. The program is returned to a ‘monitor’ program. A monitor program does the often forgotten parts of the system like scanning the keyboard to accept further instructions and is in no way connected with the screen showing the visual images.

The line numbers are executed in numerical order and any missing numbers are just ignored. This allows us to use line numbers that increase in steps of five or ten instead of ones. The advantage of this is that any forgotten instruction can be added later by giving it a new line number. For example, if we remembered that we wanted to divide the value of C by 2 then print this result as well, we could add a couple of extra lines:

10 input A, B

20 Let C = A * B

30 Print C

32 Let D = C/2

35 Print D

40 End

A final point is that we do not say what A, B, C or D stand for before we start. The program implies the necessary information. In other words, we do not have to declare the variables.

Cobol

Fortran and Basic, the son of Fortran, did not make enormous steps towards employing normal English language phrases. This was attempted by the US Defense Department who introduced Cobol in 1959 – just before Basic (see Figure 10.6). Its purpose was not number crunching, like Fortran, but information handling. It proved to be successful at this and spread from the US Navy where it kept records of stocks and supplies, to the business world. The name was derived from COmmon Business Oriented Language.

Figure 10.6 This is the business

Cobol was designed, more in hope than reality, to be easily read by non-programmers. However, it looked friendlier than the mathematical approach of Fortran and has survived to the present day. It is generally used by large corporate computers rather than Desktop PCs.

Large businesses handle enormous amounts of information data. Just imagine the amount of information involved in a few everyday activities. We apply to open a bank account – a form appears asking an almost infinite number of questions. Then we want a credit card – more forms, more information mostly the same as we gave them for the bank account. Then we buy something. What we bought, its stock number, its price, the date, our card number, and name are all transmitted to the national card centre and our account is amended. None of these transactions involve particularly complicated mathematics. The calculations are basically addition and subtraction of totals. So the calculating ability of Cobol does not need to rival Fortran or even Basic. But what it can do it to extract related information – put in our post code and out comes all sorts of information about us – credit rating, employment, home address, hobbies, purchasing patterns and almost anything else they want. Some of this information is bought and sold between companies without reference to us.

Like Fortran, it has survived by meeting a specific need and has had a series of upgraded standard versions. They refer to the date of adoption: Cobol 60, Cobol 74, Cobol 85, Cobol 97 and the new 2002 version.

A real effort was made to allow the language to be understood by those who know more about English than about programming. To this end, Cobol statements are English phrases including a verb and ending with a comma. The phrases can be joined up to form a sentence or a single statement can end with a period (full stop).

A line may read:

1 Add staff to customers to give total people.

Pascal

Pascal was first designed in Switzerland in 1971 (Figure 10.7). It is mostly an academic language and has been largely overtaken for professional programming by languages such as C. When learning other languages, a short course of Pascal is often employed as an introduction. Pascal is used because it is ‘good for you’, just as it is often said that to learn European languages it is ‘good’ to learn Latin first to lay down the rules of language before starting on French, German or Spanish.

Figure 10.7 Pascal does you good

Pascal is a very structured language. A structured program consists of a series of separate, self-contained units each having a single starting point and a single exit point. The program layout looks like a simple block diagram with all the blocks arranged one under the other. Since every unit can be isolated from the ones above and below, detecting an error, or understanding a new program, is relatively easy.

Languages like Basic can use instructions like GO TO to jump to a new part of the program and this often results in what is called ‘spaghetti’ programming, making it very difficult to find a fault in the program or even to understand what the program does (see Figure 10.8). Pascal avoids this by using instructions like ‘Repeat…until’. Basic is cleaning up its act by incorporating this type of instruction into the more recent versions.

Figure 10.8 Structure and spaghetti

C

Apart from teaching good programming habits, Pascal was largely replaced by the language called C, invented a year after Pascal and allowing all the good practice programming methods of Pascal with a few extras (see Figure 10.9).

Figure 10.9 C

The main difference is that it is a lower-level language than Pascal which may seem a strange improvement. Its advantage is that it can control low-level features like memory loading that we last met in assembly without all the drawbacks of using assembly language. It has many high-level features, and low-level facilities when we require them and can produce very compact, and therefore fast, code.

C++ and object-oriented programming

A new version of C has included all of the previous C language and added a new feature called object-oriented programming. This version, called C++, is referred to as a superset of C (see Figure 10.10).

Figure 10.10 C plus objects and Java

Object-oriented programming is a somewhat different approach to programming. In all previous cases, a task has been set and we look at the problem in terms of what processing is required to reach the required result. In object-oriented languages we have a number of objects which can be anything from some data, a diagram on a monitor screen, a block of text or a complete program. Once we have defined our objects, we can then allocate them to their own storage areas and define ways of acting on the entire object at the same time.

As an example, if we drew a square on the monitor screen and wished to move it, we could approach this in two ways. We could take each point on the screen and shift its position and hence rebuild the square in a different position. The object-oriented approach would be to define the shape as an object, then instruct the object to move. This is rather similar to our way of handling parts of the screen in a Windows environment. We use a mouse to take hold of an object, say a menu, and simply drag it to a new position. The menu is being treated as a single lump, which is an example of an object.

All the menus have similarities and differences. Similar objects are grouped into ‘classes’. A class includes the definition and type of the data together with the methods of manipulation that can be applied to an object. In our example of a class including the menus, each specific menu is called an ‘instance’. All of a class share some properties called ‘public’ properties and are different in some way, like different text being entered, these are called ‘private’ properties. As we saw earlier, C++ is just standard C with the object-oriented extras added to it.

Java

Although not its original destiny, Java was found to be ideal for transmitting information over the Internet. It looks, at first glance, to be similar to C++ but it has some important advantages. It is small and does not require any particular architecture and can therefore be embedded in other applications for use in a wide range of systems. This embedded Java code is called ‘Applets’ and is used in Internet Explorer and other browsers.

Fourth-generation languages

Fourth-generation languages are non-procedural and tend to concentrate on what the program must do rather than the mechanics of the step-by-step approach of the procedural languages. A tempting definition is to say they are the most recent, or the most popular or the ‘best’ languages. None of these definitions apply, so it may be better to stick to non-procedural as the most likely definition.

Lisp

Lisp was designed about the same time as Cobol and several years before Basic. It is the work of an American, John McCarthy (see Figure 10.11).

Figure 10.11 Lisp – dealing with lists

Lisp (LISt Processing) involves the manipulation of data which is held in lists or entered by keyboard and is associated with artificial intelligence. Lisp is a function-oriented language. This means that we define a function, such as add, subtract or more complex combinations. A list consists of a series of ‘members’ separated by spaces and enclosed in brackets. Samples of lists are (2 5 56 234) or (mother father son daughter).

A simple function defined as (PLUS 6 4) would return the answer 10 by adding the two numbers. Since it is an interpreted language, the program is executed one step at a time so inputted values are used as they are entered. In some versions of Lisp, this would have been written as (+ 6 4) using the mathematical symbol.

We can define our own function by saying

(defun result (A B)(+A B))

defun = define function and A B are inputted numbers and the answers have been given the name ‘result’. So, if we input the numbers 4 and 5 we would receive the response 9. This has defined a function in which we enter two numbers A and B and they are added. We could use this to enter a list of values for A and B generating a list of all the results.

APL

The letters APL stand, reasonably enough, for ‘A Programming Language’. This is another interpreted language developed by IBM around 1962 and is only used for numerical data. It is a curious mixture of Lisp and Fortran. It combines the function orientation of Lisp with the terse procedural mathematics of Fortran (see Figure 10.12). It allows user-defined functions and has a large library of solutions to common problems. Most people would agree that given a choice, APL is not the language to learn if you are in a hurry. For example, the four basic functions of add, subtract, multiply and divide are present but even here, life is not easy. What would you expect the result of 2*3 to be? Well, it’s not 6. This would be written as 3×2 and 2*3 is actually 2³ or 8. The statement: Value←4–^–2 four take away minus two giving an answer of plus six. Notice the different symbols for minus and a negative number. Other mathematical functions like sin, cos and tan are replaced by special symbols that do not appear on standard keyboards.

Figure 10.12 A programming language

Easy, it is not. The good news is that, once mastered, it provides fast compact programs but the many cryptic statements would need to include many comments to help another person to understand your program.

Prolog

Prolog is called a ‘declarative’ language in which the program designer does not need to know exactly what the output will be when starting the design of the program. It was first developed in France in 1972 with a view to its use in the development of artificial intelligence. Prolog stands for PROgramming by LOGic. Other versions were developed, such as DEC10, IC Prolog, which were produced in the UK and other versions from the US (see Figure 10.13).

Figure 10.13 A logic-based language

Prolog is another non-procedural language in that it is not a route to a goal but a set of information and a method from which the result can be deduced. Basically the idea is to feed in some facts and ask the program to produce some conclusions. You will remember the little logic puzzles like ‘Graham is married to Anne, Kirk is the son of Peter and the brother of Matt … Is Kirk the brother of …’ You know the sort of thing – more and more interconnected pieces of information until your head hurts. Just the job for Prolog.

The program includes facts and rules then we can ask questions. Here is a really simple example.

Facts:

coins (franc, france)

coins (centime, france)

coins (dollar, usa)

coins (cent, usa)

Rules:

french(x):–coins (x,france)

american(x):–coins (x,american)

Now we ask some questions:

?french (centime)

answer: Yes

?french (dollar)

answer: No

OK so far, but:

?american (dime)

answer: No

It has no data so it cannot say that dime is correct so it plays safe and says it is incorrect.

The future

Increasing microprocessor speeds and using several to share the processing tasks together with the decreasing size and cost of memory will be the key to the future. The idea of a desktop computer running at 20 GHz and having 128 Gbytes of memory is no longer ridiculous. In fact, it is looking rather modest after looking at Table 10.1 in which the trend over the last 26 years is projected another 26 years into the future.

‘It cannot keep increasing’ – quote of the year 1972, 1973, 1974, 1975, 1976…

Table 10.1

Similarly priced microprocessor systems	Intel 4004 in 1972	Intel Pentium in 1998	Microprocessor in 2024
Clock speed	0.108 MHz	300 MHz	833 GHz
Memory	640 bytes	64 Mbytes	6.4 Tbytes
Performance (MIPS)	0.06 approx.	600 approx.	6 000 000

A persistent occupant of my crystal ball is real spoken voice communication. A few years ago voice recognition was only a dream and is now a reality and becoming increasingly efficient. Voice synthesis is progressing nicely and is beginning to sound less robotic. When these two technologies mature, simultaneous language translation will not be far away and real dialog with the computer will begin. Another million television programs suddenly become available without sub-titles (you see, all progress comes at a price!). Six million, million instructions per second and 800 GHz clock speed together with total voice control of computer functions will be here in a few years.

‘How about that Mr Spock? ’

‘Fascinating’

That’s my guess.

At least it is something to look back and smile about in the future.

Quiz time 10

In each case, choose the best option.

1 A compiler:

(a) converts machine code to a high level language.

(b) is faster than an interpreter.

(d) is not available for the Basic language.

2 APL was largely influenced by:

(a) Cobol and Prolog.

(b) Lisp and Fortran.

(d) Pascal and Cobol.

3 A fourth generation language can be described as a language which:

(a) is still being used.

(b) is object-oriented.

(d) is non-procedural.

4 A language designed to allow logical deductions to be made from input data is:

(a) C.

(b) Latin.

(d) Fortran.

5 Pascal:

(a) is a low level language compared with C.

(b) is only of use if you are going to translate it to European languages.

(d) was the first popular high-level language.