Introduction to AltiVec Technology:
White Paper

Sam Fuller
System Architecture & Product Planning Manager,
Networking & Computing Core Technologies

Motorola Inc.
Semiconductor Product Sector
6501 William Cannon Drive West, Austin, Texas 78735

Introduction

Over the last 25 years, microprocessors have enjoyed a continuous increase in performance and attendant reduction in price/performance. Current best of breed microprocessors operate at frequencies in excess of 300 MHz and offer superscalar instruction dispatch, sophisticated branch prediction techniques and support for high performance memory systems including external second level cache controllers.

As general purpose microprocessors have continued to become more powerful, they have been asked to perform increasingly complex tasks. In fact, the trend of doubling system performance every 1.5 to 2 years has not met the requirements of the networking and telecommunications infrastructure industry, owing to several emerging applications and trends. Examples include the explosive growth of the Internet and the emergence of new digital communications technologies, including digital cellular phones employing CDMA, TDMA and PCS technologies, IP-based telephony, fax and multimedia, and wireless messaging. A general trend in the industry is the use of programmable processors to implement adaptive filters, modulators/demodulators, and other functions once only possible in hardware. These trends and applications have created tremendous opportunities for high-performance, high-bandwidth processors. These demanding new applications, along with the continually increasing needs of the computing market, necessitated a new approach to maximizing performance in order to provide our customers with the order-of-magnitude increase in key application performance they demand.

To meet these needs, a new class of microprocessor product is called for: one that offers, in a single-chip solution, the highest level of processing performance while expanding the processor's capabilities to concurrently address high-bandwidth data processing and the algorithm-intensive computations that today are typically handled off-chip by other devices, such as dedicated hardware, DSP farms or custom ASICs. Motorola is introducing a new technology that provides for this convergence in capabilities: AltiVec technology.

AltiVec technology is Motorola's high-performance vector parallel processing expansion to the PowerPC™ RISC processor architecture. Motorola microprocessors offering AltiVec technology will represent a new class of product. In addition to providing 100% compatibility with the industry-standard PowerPC Architecture™, AltiVec technology will also provide product designers and customers with a new "one part, one code base" approach to product design, which simplifies design and support while simultaneously providing a tremendous jump in performance.

AltiVec Technology

Motorola's AltiVec technology expands the current PowerPC architecture through the addition of a 128-bit vector execution unit, which operates concurrently with the existing integer and floating point units. This new engine provides for highly parallel operations, allowing for the simultaneous execution of up to 16 operations in a single clock cycle.

AltiVec technology is a short vector parallel architecture. Depending on data size, vectors are 4, 8 or 16 elements long. This can be contrasted with the long vector architectures of supercomputers that were popular in the 1980s. Vector sizes for those machines ranged to hundreds of elements. The long vector approach of supercomputers, while useful for scientific calculations, is not optimal for the communications, multimedia and other performance-driven applications targeted by Motorola with AltiVec technology.

AltiVec technology operations are performed on multiple data elements by a single instruction. This is often referred to as SIMD (single instruction, multiple data) parallel processing. AltiVec technology offers support for:

  • 16-way parallelism for 8-bit signed and unsigned integers and characters
  • 8-way parallelism for 16-bit signed and unsigned integers
  • 4-way parallelism for 32-bit signed and unsigned integers and IEEE floating-point numbers
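
The three cases in the list above all follow from dividing the 128-bit vector width by the element width. As a minimal sketch in portable C (the `vec128` union and `lanes_for_width` helper are hypothetical names used only for illustration, not part of any AltiVec interface), one 128-bit value can be viewed as 16, 8 or 4 elements:

```c
#include <stdint.h>

/* Hypothetical portable model of one 128-bit vector value.
   Assumes float is a 32-bit IEEE type, as on mainstream platforms. */
typedef union {
    uint8_t  u8[16];   /* 16 elements of 8 bits  */
    uint16_t u16[8];   /*  8 elements of 16 bits */
    uint32_t u32[4];   /*  4 elements of 32 bits */
    float    f32[4];   /*  4 single-precision floating-point elements */
} vec128;

enum { VEC_BITS = 128 };

/* Number of elements a single vector instruction operates on
   for a given element width in bits. */
static int lanes_for_width(int element_bits) {
    return VEC_BITS / element_bits;
}
```

Calling `lanes_for_width` with 8, 16 and 32 yields the 16-way, 8-way and 4-way parallelism described in the list.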

AltiVec technology also includes a separate register file containing 32 entries, each 128 bits wide. These 128-bit wide registers hold the data sources for the AltiVec technology execution units. The registers are loaded and unloaded through vector load and vector store instructions that transfer the contents of a single 128-bit register from and to memory.

AltiVec technology can be most accurately thought of as a set of registers and execution units added to the PowerPC architecture in an analogous manner to the addition of floating point units. Floating point units were added to most mainstream microprocessor architectures several years ago to provide better support for high-precision scientific calculations. AltiVec technology is being added to the PowerPC architecture to dramatically accelerate the next level of performance-driven, high-bandwidth communications and computing applications.

Each AltiVec instruction specifies up to three source operands and a single destination operand. All operands are vector registers, with the exception of the load and store instructions and a few instruction types that provide operands from immediate fields within the instruction. 162 new unique instructions are defined for the AltiVec technology. These instructions fall into the following major classes.

1. Intra-Element Arithmetic Operations
Intra-element arithmetic operations perform independent parallel computations on the elements contained in the source vector registers and place the results in the corresponding fields of the destination vector register. Both signed and unsigned integers and floating-point data types are supported by the intra-element operations. The operations support both saturation and modulo arithmetic. A variety of powerful intra-element operations are defined in the AltiVec technology: addition, subtraction, multiply, and multiply-add. Additional instructions perform min, max and average, as well as conversion between floating-point and 32-bit integer numerical formats.
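
The difference between modulo and saturation arithmetic is easiest to see on 8-bit elements. The following is a scalar sketch in portable C that models what a 16-way unsigned byte add would compute in each mode. The function names are hypothetical, and the loops stand in for what the vector unit would do in a single instruction.

```c
#include <stdint.h>

enum { LANES_U8 = 16 };  /* 16 byte elements per 128-bit vector */

/* Modulo add: each element wraps around on overflow (200 + 100 -> 44). */
static void add_u8_modulo(const uint8_t a[16], const uint8_t b[16],
                          uint8_t out[16]) {
    for (int i = 0; i < LANES_U8; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Saturating add: each element clamps at the largest representable
   value instead of wrapping (200 + 100 -> 255). */
static void add_u8_saturate(const uint8_t a[16], const uint8_t b[16],
                            uint8_t out[16]) {
    for (int i = 0; i < LANES_U8; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        out[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Saturation is usually the better fit for pixel and audio sample data, where wrapping from bright to dark or from loud to quiet produces visible or audible artifacts.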

2. Intra-Element Non-Arithmetic Operations
Intra-element non-arithmetic operations include various forms of compare, shift, and rotate. The following logical operations are also supported: AND, OR, NOT, XOR, AND-NOT. A select instruction is also provided. This instruction is designed to select or choose source data from one of two source registers and transfer that data to the results register. The combination of compare and select provides a powerful way to mask and replace data elements across the entire 16-byte field of the vector registers with a very few instructions.
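
As an illustration of the compare-and-select pattern, the sketch below models it in scalar C for 16 byte elements. It assumes that a compare produces an all-ones or all-zeros mask per element and that select picks each result bit from the second source where the mask bit is set. The function names are hypothetical and the example (clamping every element to an upper limit) is chosen only for illustration.

```c
#include <stdint.h>

/* Compare greater-than: mask element is 0xFF where a > b, else 0x00. */
static void cmpgt_u8(const uint8_t a[16], const uint8_t b[16],
                     uint8_t mask[16]) {
    for (int i = 0; i < 16; i++)
        mask[i] = (a[i] > b[i]) ? 0xFF : 0x00;
}

/* Select: take bits from b where the mask bit is 1, from a where it is 0. */
static void select_u8(const uint8_t a[16], const uint8_t b[16],
                      const uint8_t mask[16], uint8_t out[16]) {
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)((a[i] & ~mask[i]) | (b[i] & mask[i]));
}

/* Clamp every element of x to at most limit, with no per-element branch:
   one compare builds the mask, one select replaces the offending elements. */
static void clamp_max_u8(const uint8_t x[16], uint8_t limit,
                         uint8_t out[16]) {
    uint8_t lim[16], mask[16];
    for (int i = 0; i < 16; i++)
        lim[i] = limit;
    cmpgt_u8(x, lim, mask);
    select_u8(x, lim, mask, out);
}
```

On a vector unit the compare and the select would each be a single instruction covering all 16 bytes, which is why the pattern replaces a loop of data-dependent branches.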

3. Inter-Element Arithmetic Operations
A few special inter-element arithmetic operations are provided in the AltiVec technology: sum of products and sum across. These operations allow elements within a single vector register to be summed in combination with a separate accumulation register. They are valuable for generating dot products, which are among the most common vector operations.
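
A dot product built from these two operations can be sketched in scalar C as follows. This is a model under stated assumptions, not the instruction definitions: it assumes eight signed 16-bit elements per source, with adjacent pairs of products added into four 32-bit accumulator elements, and it ignores overflow. All function names are hypothetical.

```c
#include <stdint.h>

/* Sum of products: multiply 8 pairs of 16-bit elements and add each
   adjacent pair of products into one of 4 32-bit accumulator elements. */
static void msum_s16(const int16_t a[8], const int16_t b[8],
                     const int32_t acc_in[4], int32_t acc_out[4]) {
    for (int i = 0; i < 4; i++)
        acc_out[i] = acc_in[i]
                   + (int32_t)a[2 * i] * b[2 * i]
                   + (int32_t)a[2 * i + 1] * b[2 * i + 1];
}

/* Sum across: reduce the 4 partial sums to a single scalar. */
static int32_t sum_across_s32(const int32_t v[4]) {
    return v[0] + v[1] + v[2] + v[3];
}

/* Dot product of n 16-bit elements; n must be a multiple of 8. */
static int32_t dot_s16(const int16_t *x, const int16_t *y, int n) {
    int32_t acc[4] = {0, 0, 0, 0};
    for (int i = 0; i < n; i += 8)
        msum_s16(x + i, y + i, acc, acc);  /* one vector step per 8 elements */
    return sum_across_s32(acc);            /* single reduction at the end */
}
```

The design point is that the accumulation stays in vector form inside the loop, and the comparatively expensive reduction across elements happens only once.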

4. Inter-Element Non-Arithmetic Operations
In addition to the powerful intra-element and inter-element arithmetic operations, AltiVec technology also defines a group of very powerful inter-element non-arithmetic operations. These include wide field shift operations and pack and unpack operations, among them a special operation to handle the 1/5/5/5 pixel format common for 16-bit color pixels. Merge operations are also provided that can interleave data at the byte, halfword and word level.
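
To make the 1/5/5/5 pixel format concrete, here is a scalar sketch in C that packs 32-bit pixels (four 8-bit channels) into 16-bit pixels with one flag bit and three 5-bit color channels. The choice of which source bits feed each field (the low bit of the first byte and the top five bits of each color byte) is an assumption of this sketch, and the function names are hypothetical.

```c
#include <stdint.h>

/* Pack one pixel: 1 flag bit, then 5 bits each of red, green, blue. */
static uint16_t pack_1555(uint8_t a, uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((a & 1u) << 15) |
                      ((unsigned)(r >> 3) << 10) |
                      ((unsigned)(g >> 3) << 5) |
                      (unsigned)(b >> 3));
}

/* Pack 8 pixels stored as 32 consecutive bytes (A, R, G, B per pixel),
   i.e. two 128-bit sources, into 8 16-bit pixels, i.e. one 128-bit result. */
static void pack_pixels_1555(const uint8_t argb[32], uint16_t out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = pack_1555(argb[4 * i], argb[4 * i + 1],
                           argb[4 * i + 2], argb[4 * i + 3]);
}
```

The scalar version needs several shifts, masks and ORs per pixel, which is the work a dedicated pack operation would absorb for eight pixels at once.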

Perhaps the most powerful inter-element operation offered in the AltiVec technology is the permute operation. The permute operation is capable of arbitrarily selecting data with the granularity of a byte from two 16-byte source registers into a single 16-byte destination register.

For operations where 8- and 16-bit data items must be reorganized in memory before or after computations, permute can save significant time. In many instances a single permute operation can operate on 16 bytes of data and replace the 4 or 5 operations per byte that a traditional RISC or DSP implementation would require.
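
The permute operation can be modeled in a few lines of scalar C. In this sketch, which uses hypothetical names, each byte of a control vector holds an index from 0 to 31 into the 32 bytes formed by the two sources: indices 0 to 15 select from the first source and 16 to 31 from the second. Only the low five bits of each control byte are used.

```c
#include <stdint.h>

/* Byte permute: out[i] = byte number control[i] of the 32-byte
   concatenation (a followed by b). out must not overlap a or b. */
static void permute_bytes(const uint8_t a[16], const uint8_t b[16],
                          const uint8_t control[16], uint8_t out[16]) {
    for (int i = 0; i < 16; i++) {
        unsigned idx = control[i] & 0x1Fu;       /* index 0..31 */
        out[i] = (idx < 16) ? a[idx] : b[idx - 16];
    }
}
```

Because the control vector is ordinary data, the same operation can reverse bytes, interleave two streams, extract every other element, or act as a 32-entry table lookup, depending only on the indices supplied.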

The powerful inter-element operations of AltiVec technology define a microprocessor not just capable of operating on 8-, 16- and 32-bit data elements in parallel but of operating on data 128 bits (16 bytes) at a time.

Applications of AltiVec Technology

The initial target applications for AltiVec technology include: IP telephony gateways, multi-channel modems, speech processing systems, echo cancelers, image and video processing systems, scientific array processing systems, as well as network infrastructure such as Internet routers and virtual private network servers.

In addition to accelerating next-generation applications, AltiVec technology can, through its wide datapaths and wide field operations, also accelerate many time-consuming traditional computing and embedded processing operations such as memory copies, string compares and page clears.

Unlike fixed function solutions, which are most often implemented as application specific integrated circuits, AltiVec technology will offer a programmable solution that can easily migrate via software upgrades to follow changing standards and customer requirements. The preferred programming environment is the C and C++ languages favored by embedded systems developers. To more easily express the parallelism presented by AltiVec technology, Motorola has developed a standardized set of C/C++ language extensions. These language extensions allow software developers to use their preferred C/C++ development environment and language syntax while explicitly taking advantage of the parallel functional units and other facilities offered by the AltiVec technology. Motorola is working with leading tools providers to develop simulators, assemblers, linkers and compilers to assure full support for the AltiVec technology.
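
As a rough illustration of what code written with such extensions can look like, the sketch below adds four floating-point values in one step. The `vector float` type and the `vec_ld`, `vec_add` and `vec_st` operations are taken from the AltiVec C interface as later shipped in compilers' `altivec.h` header, not from this paper. The function name `add4_floats` is hypothetical, and a plain scalar fallback is included so the example also builds on compilers without AltiVec support.

```c
#include <stddef.h>
#ifdef __ALTIVEC__
#include <altivec.h>   /* AltiVec C/C++ extensions, when available */
#endif

/* Add four floats element by element. All three pointers are assumed to
   be 16-byte aligned, as the vector load and store operations require. */
static void add4_floats(const float *a, const float *b, float *out) {
#ifdef __ALTIVEC__
    vector float va = vec_ld(0, a);      /* load 128 bits from a */
    vector float vb = vec_ld(0, b);      /* load 128 bits from b */
    vec_st(vec_add(va, vb), 0, out);     /* 4 adds in one operation, then store */
#else
    /* Scalar fallback: same result, one element at a time. */
    for (size_t i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

The caller sees an ordinary C function either way, which is the sense in which the extensions fit into an existing C/C++ development environment.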

While the initial PowerPC microprocessor utilizing AltiVec technology will target very high-performance applications in networking and computing, subsequent Motorola processors with AltiVec technology could address markets and applications in which performance must be balanced with power, price and peripheral integration.

A New Design Model

The introduction of processors containing AltiVec technology creates a new model of system design for high-performance embedded systems. Historically, many high-performance embedded applications have contained a combination of a single RISC processor performing the system control function and one or more DSPs or ASICs performing specialized computations.

The single RISC processor plus multiple DSP system has a number of disadvantages, including two different architectures, code bases, hardware types, and debug environments. Additionally, DSPs have not been on the same performance growth curve as general purpose processors; for example, they often require users to switch to newer, incompatible architectures from generation to generation. As a result, even minor upgrades in a customer's product performance have often required major hardware redesigns, frequently including a change of DSP or controller architecture, with the attendant cost and time-to-market impact.

AltiVec technology-based systems can provide more capable single architecture systems, often at lower cost, power budget, and physical area than controller plus DSP solutions. The use of a single high-performance device for controller and signal processing functions results in quicker time to market and lower overall engineering cost. A single architecture solution provides a simpler development task to both the hardware and software engineer.

Summary

With the introduction of AltiVec technology, Motorola is demonstrating its commitment to the PowerPC architecture and to meeting the requirements of next generation networking, communications and computing applications. AltiVec technology will expand the PowerPC microprocessor capability by providing leading edge general purpose processing performance while concurrently addressing high-bandwidth data processing and algorithm-intensive computations in a single-chip solution. This new class of processor will provide an aggressive performance growth path for embedded and computing systems designers, while lowering development barriers inherent in multiple architecture designs, thereby reducing the time to market and total system development expense.

©1998 Motorola, Inc. All rights reserved. Printed in the U.S.A. Motorola and the Motorola logo are registered trademarks and AltiVec and the AltiVec logo are trademarks of Motorola, Inc. PowerPC, the PowerPC logo and PowerPC Architecture are trademarks of International Business Machines Corporation and used by Motorola, Inc. under license therefrom. This document contains information on a new product under development. Specifications and information herein are subject to change without notice.