graphics processing unit seminars report
#1

ABSTRACT
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics. The processor is built with integrated transform, lighting, triangle setup/clipping, and rendering engines, capable of handling millions of math-intensive processes per second. GPUs allow products such as desktop PCs, portable computers, and game consoles to process real-time 3D graphics that only a few years ago were available only on high-end workstations. Used primarily for 3D applications, a graphics processing unit is a single-chip processor that creates lighting effects and transforms objects every time a 3D scene is redrawn. These are mathematically intensive tasks that would otherwise put quite a strain on the CPU.

INTRODUCTION
There are various applications that require a 3D world to be simulated as realistically as possible on a computer screen. These include 3D animations in games, movies and other real world simulations. It takes a lot of computing power to represent a 3D world due to the great amount of information that must be used to generate a realistic 3D world and the complex mathematical operations that must be used to project this 3D world onto a computer screen. In this situation, the processing time and bandwidth are at a premium due to large amounts of both computation and data.
The functional purpose of a GPU, then, is to provide separate, dedicated graphics resources, including a graphics processor and memory, to relieve some of the burden on the main system resources, namely the Central Processing Unit, main memory, and the system bus, which would otherwise get saturated with graphical operations and I/O requests. The abstract goal of a GPU, however, is to enable a representation of a 3D world that is as realistic as possible. GPUs are therefore designed to provide additional computational power that is customized specifically to perform these 3D tasks.
WHAT'S A GPU?
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics. The processor is built with integrated transform, lighting, triangle setup/clipping, and rendering engines, capable of handling millions of math-intensive processes per second. GPUs form the heart of modern graphics cards, relieving the CPU (central processing unit) of much of the graphics processing load. GPUs allow products such as desktop PCs, portable computers, and game consoles to process real-time 3D graphics that only a few years ago were available only on high-end workstations.
Used primarily for 3D applications, a graphics processing unit is a single-chip processor that creates lighting effects and transforms objects every time a 3D scene is redrawn. These are mathematically intensive tasks that would otherwise put quite a strain on the CPU. Lifting this burden from the CPU frees up cycles that can be used for other jobs.
However, the GPU is not just for playing 3D-intensive video games or for those who create graphics (sometimes referred to as graphics rendering or content creation); it is a crucial component of the PC's overall system speed. To fully appreciate the graphics card's role, one must first understand how it works.
Graphics Processing Unit has many synonyms, the most popular being graphics card. It is also known as a video card, video accelerator, video adapter, video board, graphics accelerator, or graphics adapter.
HISTORY AND STANDARDS
The first graphics cards, introduced in August of 1981 by IBM, were monochrome cards designated as Monochrome Display Adapters (MDAs). The displays that used these cards were typically text-only, with green or white text on a black background. The Hercules Graphics Card (HGC) later added monochrome bitmapped graphics to such displays. Color for IBM-compatible computers appeared on the scene with the Color Graphics Adapter (CGA), which could show 4 colors at a time in its graphics mode, followed by the 16-color Enhanced Graphics Adapter (EGA). During the same time, other computer manufacturers, such as Commodore, were introducing computers with built-in graphics adapters that could handle a varying number of colors.
When IBM introduced the Video Graphics Array (VGA) in 1987, a new graphics standard came into being. A VGA display could support up to 256 colors (out of a possible 262,144-color palette) at resolutions up to 720x400. Perhaps the most interesting difference between VGA and the preceding formats is that VGA was analog, whereas displays had been digital up to that point. Going from digital to analog may seem like a step backward, but it actually provided the ability to vary the signal for more possible combinations than the strict on/off nature of digital.
Over the years, VGA gave way to Super Video Graphics Array (SVGA). SVGA cards were based on VGA, but each card manufacturer added resolutions and increased color depth in different ways. Eventually, the Video Electronics Standards Association (VESA) agreed on a standard implementation of SVGA that provided up to 16.8 million colors and 1280x1024 resolution. Most graphics cards available today support Ultra Extended Graphics Array (UXGA). UXGA can support a palette of up to 16.8 million colors and resolutions up to 1600x1200 pixels.
Even though any card you can buy today will offer higher colors and resolution than the basic VGA specification, VGA mode is the de facto standard for graphics and is the minimum on all cards. In addition to including VGA, a graphics card must be able to connect to your computer. While there are still a number of graphics cards that plug into an Industry Standard Architecture (ISA) or Peripheral Component Interconnect (PCI) slot, most current graphics cards use the Accelerated Graphics Port (AGP).
PERIPHERAL COMPONENT INTERCONNECT (PCI)
There are a lot of incredibly complex components in a computer, and all of these parts need to communicate with each other in a fast and efficient manner. Essentially, a bus is the channel or path between the components in a computer. During the early 1990s, Intel introduced a new bus standard for consideration, the Peripheral Component Interconnect (PCI). It provides direct access to system memory for connected devices, but uses a bridge to connect to the front side bus and therefore to the CPU.

The illustration above shows how the various buses connect to the CPU.
PCI can connect up to five external components. Each of the five connectors for an external component can be replaced with two fixed devices on the motherboard. The PCI bridge chip regulates the speed of the PCI bus independently of the CPU's speed. This provides a higher degree of reliability and ensures that PCI-hardware manufacturers know exactly what to design for.
PCI originally operated at 33 MHz using a 32-bit-wide path. Revisions to the standard include increasing the speed from 33 MHz to 66 MHz and doubling the bit count to 64. Currently, PCI-X provides for 64-bit transfers at a speed of 133 MHz for an amazing 1-GBps (gigabyte per second) transfer rate!
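The transfer rates quoted above follow from the bus width and clock speed. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and it assumes exactly one transfer per clock cycle and ignores protocol overhead, so the results are theoretical peaks.

```python
def peak_bandwidth_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in MB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz

# Original PCI: 32 bits wide at 33 MHz.
pci_original = peak_bandwidth_mb_per_s(32, 33)

# PCI-X: 64 bits wide at 133 MHz, roughly 1 GBps.
pci_x = peak_bandwidth_mb_per_s(64, 133)

print(pci_original, pci_x)
```

Real buses deliver less than these figures because of arbitration and addressing overhead.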
PCI cards use 47 pins to connect (49 pins for a mastering card, which can control the PCI bus without CPU intervention). The PCI bus is able to work with so few pins because of hardware multiplexing, which means that the device sends more than one signal over a single pin. Also, PCI supports devices that use either 5 volts or 3.3 volts. PCI slots are the best choice for network interface cards (NIC), 2-D video cards, and other high-bandwidth devices. On some PCs, PCI has completely superseded the old ISA expansion slots.

Although Intel proposed the PCI standard in 1991, it did not achieve popularity until the arrival of Windows 95 (in 1995). This sudden interest in PCI was due to the fact that Windows 95 supported a feature called Plug and Play (PnP). PnP means that you can connect a device or insert a card into your computer and it is automatically recognized and configured to work in your system. Intel created the PnP standard and incorporated it into the design for PCI. But it wasn't until several years later that a mainstream operating system, Windows 95, provided system-level support for PnP. The introduction of PnP accelerated the demand for computers with PCI.
ACCELERATED GRAPHICS PORT (AGP)
The need for streaming video and real-time-rendered 3-D games requires an even faster throughput than that provided by PCI. In 1996, Intel debuted the Accelerated Graphics Port (AGP), a modification of the PCI bus designed specifically to facilitate the use of streaming video and high-performance graphics.
AGP is a high-performance interconnect between the core-logic chipset and the graphics controller for enhanced graphics performance for 3D applications. AGP relieves the graphics bottleneck by adding a dedicated high-speed interface directly between the chipset and the graphics controller as shown below.

Segments of system memory can be dynamically reserved by the OS for use by the graphics controller. This memory is termed AGP memory or non-local video memory. The net result is that the graphics controller is required to keep fewer texture maps in local memory.
AGP has 32 lines for multiplexed address and data. There are an additional 8 lines for sideband addressing. Local video memory can be expensive and it cannot be used for other purposes by the OS when unneeded by the graphics of the running applications. The graphics controller needs fast access to local video memory for screen refreshes and various pixel elements including Z-buffers, double buffering, overlay planes, and textures.
For these reasons, programmers can always expect to have more texture memory available via AGP system memory. Keeping textures out of the frame buffer allows larger screen resolution, or permits Z-buffering for a given large screen size. As the need for more graphics intensive applications continues to scale upward, the amount of textures stored in system memory will increase. AGP delivers these textures from system memory to the graphics controller at speeds sufficient to make system memory usable as a secondary texture store.
AGP Memory Allocation
During AGP memory initialization, the OS allocates 4K byte pages of AGP memory in main (physical) memory. These pages are usually discontiguous. However, the graphics controller needs contiguous memory. A translation mechanism called the GART (Graphics Address Remapping Table), makes discontiguous memory appear as contiguous memory by translating virtual addresses into physical addresses in main memory through a remapping table.
A block of contiguous memory space, called the Aperture is allocated above the top of memory. The graphics card accesses the Aperture as if it were main memory. The GART is then able to remap these virtual addresses to physical addresses in main memory. These virtual addresses are used to access main memory, the local frame buffer, and AGP memory.
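The remapping described above can be sketched as a simple table lookup. The Python below is only an illustration of the idea, not the actual chipset mechanism: the class name, the aperture base address, and the physical page addresses are all made-up values.

```python
PAGE_SIZE = 4096  # AGP memory is allocated in 4 KB pages


class Gart:
    """Toy remapping table: entry i holds the physical base of aperture page i."""

    def __init__(self, aperture_base, physical_pages):
        self.aperture_base = aperture_base
        self.table = list(physical_pages)

    def translate(self, aperture_addr):
        """Map a contiguous aperture address to a scattered physical address."""
        offset = aperture_addr - self.aperture_base
        if offset < 0 or offset >= len(self.table) * PAGE_SIZE:
            raise ValueError("address outside the aperture")
        page_index, page_offset = divmod(offset, PAGE_SIZE)
        return self.table[page_index] + page_offset


# Three discontiguous physical pages (hypothetical addresses) appear as one
# contiguous 12 KB block starting at the aperture base.
gart = Gart(aperture_base=0xE0000000,
            physical_pages=[0x0031A000, 0x00007000, 0x00F42000])

print(hex(gart.translate(0xE0000000 + PAGE_SIZE + 0x10)))
```

The graphics controller only ever sees the contiguous aperture addresses; the table hides where each page actually lives.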

AGP Transfers
AGP provides two modes for the graphics controller to directly access texture maps in system memory: pipelining and sideband addressing. Using Pipe mode, AGP overlaps the memory or bus access times for a request ("n") with the issuing of following requests ("n+1"..."n+2"... etc.). In the PCI bus, request "n+1" does not begin until the data transfer of request "n" finishes.
With sideband addressing (SBA), AGP uses 8 extra "sideband" address lines which allow the graphics controller to issue new addresses and requests simultaneously while data continues to move from previous requests on the main 32 data/address lines. Using SBA mode improves efficiency and reduces latencies.
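A rough way to see why overlapping requests helps is to compare total time under the two schemes. The Python sketch below uses an idealized model with arbitrary time units: it assumes every request has the same access latency and transfer time, and that enough requests can be in flight for later latencies to be hidden completely. Both function names are hypothetical.

```python
def total_time_serialized(n_requests, latency, transfer):
    """PCI-style: request n+1 starts only after request n's data has arrived."""
    return n_requests * (latency + transfer)


def total_time_pipelined(n_requests, latency, transfer):
    """AGP-style (idealized): only the first latency is exposed."""
    if n_requests == 0:
        return 0
    return latency + n_requests * transfer


# Four texture reads, each with 10 units of access latency and 4 of transfer.
print(total_time_serialized(4, 10, 4), total_time_pipelined(4, 10, 4))
```

The gap widens as the number of outstanding requests grows, which is the benefit that both pipelining and sideband addressing aim for.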
AGP Specifications
The standard 32-bit, 33 MHz PCI bus supports a data transfer rate of up to 132 MB/s, while AGP (at 66 MHz) supports up to 533 MB/s in 2x mode. AGP attains this high transfer rate through its ability to transfer data on both the rising and falling edges of the 66 MHz clock.
Mode   Approximate clock rate   Transfer rate (MBps)
1x     66 MHz                   266
2x     133 MHz                  533
4x     266 MHz                  1066
8x     533 MHz                  2133
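The transfer rates in the table can be reproduced from the 32-bit data path and the number of transfers made per clock cycle. The Python sketch below assumes the nominal 66 MHz clock is 200/3 MHz (about 66.67 MHz), which is why the results land on the rounded figures; the constant and function names are hypothetical.

```python
AGP_BUS_WIDTH_BYTES = 4          # 32 multiplexed address/data lines
AGP_BASE_CLOCK_MHZ = 200 / 3     # assumption: "66 MHz" is nominally 66.67 MHz


def agp_transfer_rate(multiplier):
    """Peak MB/s for an AGP mode making `multiplier` transfers per base clock."""
    return AGP_BUS_WIDTH_BYTES * AGP_BASE_CLOCK_MHZ * multiplier


for mode in (1, 2, 4, 8):
    effective_mhz = int(AGP_BASE_CLOCK_MHZ * mode)
    print(f"{mode}x  {effective_mhz} MHz  {int(agp_transfer_rate(mode))} MBps")
```

The "clock rate" column is therefore an effective rate: the base clock stays near 66 MHz while the number of transfers per cycle goes up.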
The AGP slot typically provides performance which is 4 to 8 times faster than the PCI slots inside your computer.
COMPONENTS OF GPU
There are several components on a typical graphics card:
Graphics Processor
The graphics processor is the brains of the card, and is typically one of three configurations:
Graphics co-processor: A card with this type of processor can handle all of the graphics chores without any assistance from the computer's CPU. Graphics co-processors are typically found on high-end video cards.
Graphics accelerator: In this configuration, the chip on the graphics card renders graphics based on commands from the computer's CPU. This is the most common configuration used today.
Frame buffer: This chip simply controls the memory on the card and sends information to the digital-to-analog converter (DAC). It does no processing of the image data and is rarely used anymore.
Memory: The type of RAM used on graphics cards varies widely, but the most popular types use a dual-ported configuration. A dual-ported card can write to one section of memory while it reads from another, decreasing the time it takes to refresh an image.
Graphics BIOS: Graphics cards have a small ROM chip containing basic information that tells the other components of the card how to function in relation to each other. The BIOS also performs diagnostic tests on the card's memory and input/output (I/O) to ensure that everything is functioning correctly.
Digital-to-Analog Converter (DAC): The DAC on a graphics card is commonly known as a RAMDAC because it takes the data it converts directly from the card's memory. RAMDAC speed greatly affects the image you see on the monitor, because the refresh rate of the image depends on how quickly the analog information gets to the monitor.
Display Connector: Graphics cards use standard connectors. Most cards use the 15-pin connector that was introduced with Video Graphics Array (VGA).
Computer (Bus) Connector: This is usually the Accelerated Graphics Port (AGP). This port enables the video card to directly access system memory. Direct memory access helps to make the peak bandwidth four times higher than that of the Peripheral Component Interconnect (PCI) bus adapter card slots. This allows the central processor to do other tasks while the graphics chip on the video card accesses system memory.
Internal Organization of GPU
HOW IS 3D ACCELERATION DONE?
There are different steps involved in creating a complete 3D scene. They are carried out by different parts of the GPU, each of which is assigned a particular job. During 3D rendering, different types of data travel across the bus. The two most common types are texture and geometry data. The geometry data is the "infrastructure" that the rendered scene is built on. It is made up of polygons (usually triangles) that are represented by vertices, the end-points that define each polygon. Texture data provides much of the detail in a scene, and textures can be used to simulate more complex geometry, add lighting, and give an object a simulated surface.
Many new graphics chips now have an accelerated Transform and Lighting (T&L) unit, which takes a 3D scene's geometry and transforms it into different coordinate spaces. It also performs lighting calculations, again relieving the CPU of these math-intensive tasks.
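To make the two halves of T&L concrete, the Python sketch below transforms a vertex with a 4x4 matrix and computes a simple diffuse lighting term. The text does not say which lighting model the hardware uses, so the Lambert (cosine) model here is an assumption, and all function names are hypothetical.

```python
import math


def translation(tx, ty, tz):
    """4x4 row-major matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def transform(matrix, vertex):
    """Transform step: multiply the matrix by the vertex (x, y, z, 1)."""
    v = (vertex[0], vertex[1], vertex[2], 1.0)
    return tuple(sum(matrix[r][c] * v[c] for c in range(4)) for r in range(3))


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def diffuse_intensity(normal, to_light):
    """Lighting step (assumed Lambert model): cosine of the angle between
    the surface normal and the direction to the light, clamped at zero."""
    n, l = normalize(normal), normalize(to_light)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


moved = transform(translation(1, 2, 3), (1.0, 1.0, 1.0))
facing = diffuse_intensity((0, 0, 1), (0, 0, 1))
print(moved, facing)
```

A dedicated T&L unit performs this kind of arithmetic for every vertex in every frame, which is the load being lifted from the CPU.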
Following the T&L unit on the chip is the triangle setup engine. It takes a scene's transformed geometry and prepares it for the next stages of rendering by converting the scene into a form that the pixel engine can then process. The pixel engine applies assigned texture values to each pixel. This gives each pixel the correct color value so that it appears to have surface texture and does not look like a flat, smooth object. After a pixel has been rendered it must be checked to see whether it is visible by checking the depth value, or Z value.
A Z check unit performs this process by reading from the Z-buffer to see if there are any other pixels rendered to the same location where the new pixel will be rendered. If another pixel is at that location, it compares the Z value of the existing pixel to that of the new pixel. If the new pixel is closer to the view camera, it gets written to the frame buffer. If it's not, it gets discarded. After the complete scene is drawn into the frame buffer the RAMDAC converts this digital data into analog that can be given to the monitor for display.
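The Z check described above amounts to a per-pixel depth comparison. The Python sketch below assumes that a smaller Z value means closer to the camera and that the Z-buffer starts out at "infinitely far"; the function names are hypothetical.

```python
def make_buffers(width, height, background=(0, 0, 0)):
    """Create a Z-buffer cleared to 'infinitely far' and a frame buffer."""
    z_buffer = [[float("inf")] * width for _ in range(height)]
    frame_buffer = [[background] * width for _ in range(height)]
    return z_buffer, frame_buffer


def write_pixel(z_buffer, frame_buffer, x, y, z, color):
    """Write the pixel only if it is closer than what is already stored.

    Assumes smaller z means closer to the view camera.
    """
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z
        frame_buffer[y][x] = color
        return True
    return False  # hidden behind an existing pixel, so discarded


z_buf, frame = make_buffers(4, 4)
write_pixel(z_buf, frame, 1, 1, z=5.0, color=(255, 0, 0))   # first pixel
write_pixel(z_buf, frame, 1, 1, z=9.0, color=(0, 255, 0))   # farther: discarded
write_pixel(z_buf, frame, 1, 1, z=2.0, color=(0, 0, 255))   # closer: replaces
print(frame[1][1])
```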
PERFORMANCE FACTORS OF GPU
There are many factors that affect the performance of a GPU. Some of the factors that are directly visible to a user are given below.

• Fill Rate:
It is defined as the number of pixels or texels (textured pixels) that the GPU renders to memory per second. It shows the true power of the GPU. Modern GPUs have fill rates as high as 3.2 billion pixels per second. The fill rate of a GPU can be increased by raising its clock speed.
• Memory Bandwidth:
It is the data transfer speed between the graphics chip and its local frame buffer. More bandwidth usually gives better performance when the image to be rendered is of high quality and at very high resolution.
• Memory Management:
The performance of the GPU also depends on how efficiently the memory is managed, because memory bandwidth may become the main bottleneck if it is not managed properly.
• Hidden Surface Removal:
A term for reducing overdraw when rendering a scene by not rendering surfaces that are not visible. This helps a lot in increasing the performance of the GPU, because preventing overdraw lets the fill rate be utilized to the maximum.
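The link between fill rate, resolution, and overdraw can be put into numbers. The Python sketch below is a back-of-the-envelope model, not a vendor formula: it assumes peak fill rate is the core clock multiplied by the pixels written per clock, and it treats overdraw as the average number of times each screen pixel is drawn. The function names and the 300 MHz, 4-pixel example are hypothetical.

```python
def peak_fill_rate(core_clock_hz, pixels_per_clock):
    """Theoretical peak in pixels per second (assumed model)."""
    return core_clock_hz * pixels_per_clock


def required_fill_rate(width, height, fps, overdraw):
    """Pixels per second needed to draw every frame with the given overdraw."""
    return width * height * fps * overdraw


# Hypothetical chip: 300 MHz core writing 4 pixels per clock.
available = peak_fill_rate(300e6, 4)

# 1024x768 at 60 frames per second, each pixel drawn 3 times on average,
# versus perfect hidden surface removal (each pixel drawn once).
needed_with_overdraw = required_fill_rate(1024, 768, 60, 3)
needed_without_overdraw = required_fill_rate(1024, 768, 60, 1)

print(available, needed_with_overdraw, needed_without_overdraw)
```

Cutting overdraw from 3 to 1 reduces the fill rate needed for the same scene to a third, which is why hidden surface removal matters as much as raw clock speed.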
Now let's see how far GPUs have come as far as performance is concerned.


TYPES OF GPUS
There are mainly two types of GPUs:
1. Those that can handle all of the graphics processes without any assistance from the computer's CPU. They are typically found on high-end workstations. These are mainly used for digital content creation such as 3D animation, as they support a lot of 3D functions.
Some of them are:
Quadro series from NVIDIA.
Wildcat series from 3D Labs.
FireGL series from ATI.
2. Those in which the chip on the graphics card renders graphics based on commands from the computer's CPU. This is the most common configuration used today. These are used for 3D gaming and other smaller tasks. They are found on normal desktop PCs and are better known as 3D accelerators. They support fewer functions and hence are cheaper.
Some of them are:
Geforce series from NVIDIA.
Radeon series from ATI Technologies.
Kyro series from STMicroelectronics.
Today's GPUs can do what was hoped for and beyond. In the last year a giant leap has been made in GPU technology. The maximum amount of RAM found on a graphics card has jumped from 16MB to a whopping 128MB. ATI, the premier company in GPU manufacturing for the past couple of years, has given way to NVIDIA, whose groundbreaking new technology is leaving ATI to follow.
GEFORCE4
NVIDIA introduced the groundbreaking, top-to-bottom GeForce4 family of GPUs, delivering new levels of graphics performance and display flexibility to desktop and mobile PCs. NVIDIA's latest creation, the GeForce4 GPU, is the fourth edition in the famed GeForce lineup. It has wowed gamers and artists alike with 128MB of DDR memory and a super fast processor. What this means is that the GeForce4 is capable of rendering graphics better than the eye can see. Although this is extraordinary, there are no available monitors that can handle displaying such graphics. Even so, the NVIDIA GeForce4 is truly an extraordinary and groundbreaking graphics processing unit.
GeForce4 is the most complete family of graphics solutions: from the ferocious graphics power of the GeForce4 Ti, the world's fastest GPU; to the multi-display flexibility of the mainstream GeForce4 MX; to the most advanced mobile graphics available, GeForce4 Go.
Three lines have been released:
• GeForce4 Ti series (NV25)
• GeForce4 MX series (NV17)
• GeForce4 Go series
Chip Architecture
The LMA II
In the upper left hand corner lies the LMA II. The LMA II controls the flow of data from the chip to the GPU's memory (the DDR memory on the graphics card). It controls how much data is sent to the memory and how fast it is sent to the memory.
The Accuview AA Engine
The Accuview AA Engine, located to the left and in the middle of the chip, does the antialiasing for the GPU.
The Interface Unit
At the bottom of the GeForce4 chip and to the very left is the interface unit. This simply determines what interface (2X or 4X AGP) the computer has, and adjusts to comply with it.
The Texture Unit
At the bottom of the chip and to the left-center is the texture unit. This unit is dedicated to processing textures that must be rendered.
The nfiniteFX II Engine
This is the striking feature of the GeForce4 series.
Rendering realistic hair and fur has long been one of the classic problems in computer graphics. Animals with smooth skin were easier to render than those with fur.
The nfiniteFX II engine, which includes support for dual vertex shaders, advanced pixel shader pipelines, 3D textures, shadow buffers, and z-correct bump mapping, makes it possible to render these for the first time.
It also improves the rendering of shading.
A demo shot of a wolf man which has been rendered by GeForce4
The Display Unit
In the center to the right of the chip is the display unit. This unit is simple and its only job is to determine the optimal display resolution for the display being used. It also determines what type of display it is and optimizes its performance.
The 2D/Video/HDVP Unit
In the bottom right corner of the chip is the 2D/Video/HDVP unit. This part of the chip is dedicated to all those tasks that don't require 3D rendering. These tasks include movies, 2D pictures, 2D games, and almost any other program that doesn't have 3D graphics.
nView Technology:
Simply put, NVIDIA's Multi-Monitor and Dual Independent Display technology, now called "nView", has been polished up nicely. nView is available on both the GF4 Ti and MX and enables the following:

Multi-desktop tools
• Multi-desktop integration
• Full-featured interface including explorer browser with bird's-eye views of desktops
• Toolbar control available as well for those needing a streamlined, low real-estate interface
Window management
• Individual application control
• Window & dialog repositioning

Application management
• Transparency & colored transparency window options
• Extends functionality of all applications
• Pop-up menu control
GEFORCE4 TI
NVIDIA's flagship graphics card is the GeForce4 Ti, a creation of admirable brilliance. Its basic specifications include:
• 63 million transistors (only 3 million more than GeForce3)
• Manufactured in TSMC's 0.15 µm process
• Chip clock 225 - 300 MHz
• Memory clock 500 - 650 MHz
• Memory bandwidth 8,000 - 10,400 MB/s
• T&L performance of 75 - 100 million vertices/s
• 128 MB frame buffer by default
• nfiniteFX II engine
• Accuview Antialiasing
• Lightspeed Memory Architecture II
• nView
It has a 128-bit memory bus, double the width of a 64-bit bus. The wider bus improves performance dramatically. It is considered the world's fastest GPU available today.
The NV25 series includes four cards: the GeForce4 Ti 4600, GeForce4 Ti 4400, GeForce4 Ti 4200, and GeForce4 Ti 4200 with AGP 8x.

This chart shows comparison of performances of GeForce4 Ti 4600 and its predecessor in FPS.
GEFORCE4 MX
With the GeForce4 MX graphics processing units (GPUs), NVIDIA provides a new level of cost-effective, high-performance graphics to the mainstream PC user. The GeForce4 MX is the cheapest and lowest-performing of the three lines. The standard graphics card comes with 64MB of RAM, a hefty amount that can handle almost any task. The GeForce4 MX has a 64-bit bus, which is also pretty standard for today's graphics cards. For the most part, the only real improvement from the previous MX generation to the GeForce4 MX is the graphics processing unit, which beats its predecessors, the GeForce2 and GeForce3, hands down.

Performance comparisons
GEFORCE4 GO
NVIDIA introduces the fastest, most comprehensive and feature-rich computing experience ever realized on a mobile platform: the GeForce4 Go. With a revolutionary core of integrated technologies, the GeForce4 Go ensures unparalleled performance, battery life, and DVD and video playback. From notebooks powerful enough to replace your desktop PC, to those that are both thin and light, NVIDIA's GeForce4 Go mobile GPUs provide unprecedented mobile computing experiences.
CONCLUSION
From the introduction of the first 3D accelerator from 3dfx in 1996 these units have come a long way to be truly called a Graphics Processing Unit. So it is not a wonder that this piece of hardware is often referred to as an exotic product as far as computer peripherals are concerned. By observing the current pace at which work is going on in developing GPUs we can surely come to a conclusion that we will be able to see better and faster GPUs in the near future.

3D GLOSSARY

Given below are some of the terms that are closely associated with GPUs.
Vertex: Vertices are the basic unit of 3D graphics. All 3D geometry is composed of vertices. Vertices contain X, Y and Z positions plus possible vertex normal and texture mapping information.
Polygons or Triangles:
3D scenes are drawn using only triangles. This vastly simplifies the computer creation of a 3D world. Triangles are defined as three x,y,z coordinates (one for each vertex), a properly oriented texture, and a shading definition. The illusion of curved surfaces (fuselage, engines, wings, etc.) comes from well-applied shading of flat polygons. A good 3D accelerator card will put the textures together without any seams (white pixels that flash where the triangles almost meet).
Vertices
Texture (Bitmap):
A Texture Map is a way of controlling the diffuse color of a surface on a pixel-by-pixel basis, rather than by assigning a single overall value. This is commonly achieved by applying a color bitmap image to the surface.


Rasterization:
The process of finding which pixels an individual polygon covers or, at a more basic level, which pixels an edge of a polygon lies on. Simply put, it is the process of transforming a 3D image into a set of colored pixels.
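Finding the pixels a triangle covers can be sketched with an edge test at each pixel center. The Python below is a deliberately simple illustration: it checks every pixel on a small grid, counts pixels that lie exactly on an edge as covered, and ignores the tie-breaking fill rules that real hardware applies to shared edges. The function names are hypothetical.

```python
def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def covered_pixels(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centers fall inside the 2D triangle."""
    pixels = []
    for y in range(height):
        for x in range(width):
            center = (x + 0.5, y + 0.5)
            w0 = edge(v1, v2, center)
            w1 = edge(v2, v0, center)
            w2 = edge(v0, v1, center)
            all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
            if all_non_negative or all_non_positive:  # either winding order
                pixels.append((x, y))
    return pixels


# A right triangle covering the lower-left half of a 4x4 pixel grid.
pixels = covered_pixels((0, 0), (4, 0), (0, 4), width=4, height=4)
print(len(pixels))
```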
Rendering:
A term which is often used as a synonym for rasterization, but which can also refer to the whole process of creating a 3D image. Rendering is the process of producing bitmapped images from a view of 3D models in a 3D scene. It is, in effect, "taking a picture" of the scene. An animation is a series of such renderings, each with the scene slightly changed.
Anti-aliasing:
A method to remove the jagged edges that appear in computer-generated images.
Filtering:
Filtering is a method to determine the color of a pixel based on texture maps. When you get very close to a polygon the texture map hasn't got enough info to determine the real color of each pixel on the screen. The basic idea is interpolation; this is a technique of using information of the real pixels surrounding the unknown pixel to determine its color based on mathematical averages.
The following are some methods of filtering, ordered from the worst quality to the best:
1. Point sampled filtering
2. Bilinear filtering
3. Trilinear filtering
Point filtering just copies the color of the nearest real pixel, so it effectively enlarges the real pixel. This creates a blocky effect, and when the view moves these blocks can change color quickly, creating weird visual effects. This technique is commonly used in software 3D engines because it requires very little calculation power.
Bilinear filtering uses four adjacent texels (textured pixels) to interpolate the unknown output pixel value. This results in a smoother textured polygon, as the interpolation filters down the blockiness associated with point sampling. The disadvantage of bilinear filtering is that it results in a fourfold increase in texture memory bandwidth.
Trilinear filtering combines bilinear filtering at 2 mip levels. This requires 8 texels per pixel, so memory bandwidth is doubled again relative to bilinear filtering. Memory can therefore suffer serious bandwidth problems, so trilinear filtering is usually offered as an option.
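The three filtering methods can be compared on a tiny grayscale texture. The Python sketch below makes several simplifying assumptions: coordinates are non-negative and given in texel units, texel lookups are clamped at the edges, the coarser mip level is addressed by halving the coordinates, and the blend factor between mip levels is passed in directly. All function names are hypothetical.

```python
def clamp(i, n):
    return max(0, min(n - 1, i))


def point_sample(tex, u, v):
    """Copy the nearest texel (1 texel read)."""
    x = clamp(int(u + 0.5), len(tex[0]))
    y = clamp(int(v + 0.5), len(tex))
    return tex[y][x]


def bilinear_sample(tex, u, v):
    """Interpolate between the 4 surrounding texels (4 texel reads)."""
    w, h = len(tex[0]), len(tex)
    fx, fy = u - int(u), v - int(v)
    x0, y0 = clamp(int(u), w), clamp(int(v), h)
    x1, y1 = clamp(int(u) + 1, w), clamp(int(v) + 1, h)
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def trilinear_sample(fine, coarse, u, v, blend):
    """Blend bilinear samples from 2 mip levels (8 texel reads)."""
    a = bilinear_sample(fine, u, v)
    b = bilinear_sample(coarse, u / 2, v / 2)  # simplified mip addressing
    return a * (1 - blend) + b * blend


checker = [[0.0, 1.0],
           [1.0, 0.0]]          # 2x2 texture
checker_mip = [[1.0]]           # hypothetical 1x1 coarser level

print(point_sample(checker, 0.2, 0.2),
      bilinear_sample(checker, 0.5, 0.5),
      trilinear_sample(checker, checker_mip, 0.5, 0.5, blend=0.5))
```

The texel-read counts in the docstrings are the source of the bandwidth cost: each step up in quality multiplies the number of texture fetches per output pixel.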
Alpha-blending:
It is a technique for rendering transparency. An extra value is added to the pixels of a texture map to define how easy it is to look through the pixel. This makes it possible to look through things, so effects like realistic water and glass become possible.
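A common way to apply that extra value is a weighted average of the new pixel and the pixel already behind it. The Python sketch below assumes the convention that alpha = 1.0 means fully opaque and alpha = 0.0 means fully transparent; the function name is hypothetical.

```python
def alpha_blend(src, dst, alpha):
    """Blend a source color over a destination color.

    Assumes alpha in [0, 1], where 1.0 is opaque and 0.0 is see-through.
    """
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


red_glass = (255, 0, 0)
blue_wall = (0, 0, 255)

# Half-transparent red glass in front of a blue wall.
print(alpha_blend(red_glass, blue_wall, 0.5))
```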
FPS:
FPS stands for Frames per Second. This is the main unit of measure used to describe graphics and video performance.

Texture Mapping:
In 3D graphics, texture mapping is the process of adding a graphic pattern to the polygons of a 3D scene. Unlike simple shading, which applies colors to the underlying polygons of the scene, texture mapping applies simple textured graphics, also known as patterns or more commonly "tiles", to simulate walls, floors, the sky, and so on.
T&L:
Transform and lighting (T&L) are two major steps in the 3D graphics pipeline. They are computationally very intensive. The transform phase converts the source 3D data to a form that can be rendered, and the lighting phase calculates lighting for the 3D environment. These two steps can be applied to a triangle simultaneously or consecutively. Traditionally the CPU performs them, but nowadays some 3D accelerators also have dedicated hardware T&L solutions.
Frame buffer:
The memory used to store the pictures you see on screen. Under Direct3D, there are actually two frame buffers. The front buffer is displayed while the back buffer is being drawn. When the back buffer is complete it becomes the new front buffer, and the old front buffer is cleared and the next frame is drawn there. (This method of buffering is referred to as double buffering.)
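The swap between front and back buffers can be sketched in a few lines. The Python below is an illustration of the idea only and does not use any Direct3D call; the class and method names are hypothetical.

```python
class DoubleBuffer:
    """Two frame buffers: one is displayed while the other is drawn."""

    def __init__(self, width, height, clear_value=0):
        self.clear_value = clear_value
        self.front = [[clear_value] * width for _ in range(height)]
        self.back = [[clear_value] * width for _ in range(height)]

    def draw(self, x, y, value):
        """All drawing goes to the back buffer, never the visible one."""
        self.back[y][x] = value

    def swap(self):
        """Show the finished frame, then clear the old front for reuse."""
        self.front, self.back = self.back, self.front
        for row in self.back:
            row[:] = [self.clear_value] * len(row)


screen = DoubleBuffer(2, 2)
screen.draw(0, 0, 7)
visible_before_swap = screen.front[0][0]
screen.swap()
visible_after_swap = screen.front[0][0]
print(visible_before_swap, visible_after_swap)
```

Because the viewer only ever sees a completed buffer, partially drawn frames never reach the screen.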

Triple buffering:
Here three buffers are used instead of two. If you have very fast refresh rates there isn't much wait for the buffer swap. But if you have slow refresh rates, the wait can be considerable. To stop this, 3Dfx drivers (for the Rush and Voodoo2) can enable a third frame buffer, so when it is done with the back buffer it starts immediately on the next one.
Z-Buffer:
A third buffer (or a fourth if triple buffering is enabled) where depth data is stored, to help hardware to sort out which textures are visible and which are hidden.
API (Application Programming Interface):
It is a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks. Game programmers use APIs to help them program 3D functions more easily and so that the programs they write will run on more types of hardware.
The three most popular graphics APIs are:
• Glide by 3dfx,
• OpenGL by Silicon Graphics (and Microsoft), and
• Direct3D by Microsoft as part of their multimedia DirectX package.

REFERENCES
1. howstuffworks.com
2. tomshardware.com
3. intel.com
4. nvidia.com
5. extremetech.com
6. pcworld.com


ACKNOWLEDGEMENT
I extend my sincere thanks to Prof. P.V.Abdul Hameed, Head of the Department for providing me with the guidance and facilities for the Seminar.
I express my sincere gratitude to the Seminar coordinator, Mr. Berly C.J, Staff in charge, for his cooperation and guidance in preparing and presenting this seminar.
I also extend my sincere thanks to all other faculty members of Electronics and Communication Department and my friends for their support and encouragement.
Anish Salam

CONTENTS
1. INTRODUCTION
2. WHAT'S A GPU?
3. HISTORY AND STANDARDS
4. PERIPHERAL COMPONENT INTERCONNECT
5. ACCELERATED GRAPHICS PORT
6. COMPONENTS OF GPU
7. HOW IS 3D ACCELERATION DONE?
8. PERFORMANCE FACTORS OF GPU
9. TYPES OF GPUS
10. GEFORCE4
11. GEFORCE4 TI
12. GEFORCE4 MX
13. GEFORCE4 GO
14. CONCLUSION
15. 3D GLOSSARY
16. REFERENCES
#2
Graphics Processing Unit


What is a GPU?
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics.

The main purpose of a GPU is to render 3D images as realistically as possible on the computer screen.


A GPU is mainly needed to relieve the CPU of graphical computations so that the CPU can be used for other processes.

History & Standards

The first graphics cards, introduced in 1981 by IBM, were monochrome cards designated as Monochrome Display Adapters (MDAs).

Then came the Colour Graphics Adapter (CGA) and then the Enhanced Graphics Adapter (EGA).

IBM introduced the Video Graphics Array (VGA) in 1987. It could show up to 256 colors at once (at 320x200) and offered resolutions up to 720x400 (in text mode).

Then came the Super Video Graphics Array (SVGA), which supports up to 16.8 million colors and 1280x1024 resolution.





Graphics Hardware Interface

Peripheral Component Interconnect(PCI)

Accelerated Graphics Port (AGP)

Peripheral Component Interconnect Express (PCI-E)


Graphics Processor
The processor is designed specifically to perform floating point calculations, which are fundamental to 3D graphics processing.

Graphics Accelerator
A graphics accelerator assists graphics processing by executing instructions concurrently.

Frame Buffer
A frame buffer is a video output device that drives a video display from a memory buffer containing a complete frame of data.




Memory
Video memory may be used for storing screen image as well as Z-Buffer which manages the depth coordinates in 3D graphics.



Graphics BIOS
This contains the basic program, which is usually hidden, that governs the video card's operations and provides the instructions that allow the computer and software to interact with the card.



Digital-to-Analog Converter (DAC)
The RAMDAC or Random Access Memory Digital-to-Analog Converter, converts digital signals to analog signals for use by a computer display that uses analog inputs such as CRT displays.



Display Connector
This includes the connection system between the video card and display devices such as a monitor or a television.






Need for 3D Acceleration


Everything that is displayed on the computer screen is 2D



The push for more realism, more finely-detailed graphics and faster speeds in such programs as action games, flight simulators, graphics programs and CAD applications means that more 3D work must be done in a shorter period of time, which has resulted in the need for 3D acceleration.




How is 3D acceleration done?


Geometry - The CPU determines where each object or image has to be placed to make up a figure on the screen.


Transform - This involves rotating, scaling and translating an object or image.


Rendering - The information is taken out of memory and sent to the GPU, which applies the bitmap images to it, producing an image in 3D.


Filtering - This is done by smoothing over the blocky bitmaps, so that textures look smoother and more realistic.

Double Buffering -This uses two buffers to speed up the computation process. Data in one buffer is being processed while the next set of data is read into the other one.

Flat Shading -It shades each polygon of an object based on the angle between the polygon's surface normal and the direction of the light source, their respective colors and the intensity of the light source.



Mipmapping - Objects appear bigger as we move towards them and smaller as we move away, as we perceive them in the real world; the texture is swapped for a version of matching size as the distance changes.

Atmosphere - An effect used in outdoor scenes by blurring objects that are in the distance.

Lighting - This causes color shading, light reflection, shadows and other effects to be added to objects based on their position and the position of light sources.

Z-Buffering - This determines which objects, or parts of objects, are visible and which are hidden behind other objects.
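
The Flat Shading entry in the list above can be sketched in a few lines. This is only an illustration under assumptions not stated in the text: colors are RGB triples in the range 0 to 1, the brightness falls off with the cosine of the angle between the normal and the light direction (Lambert's rule), and the function names are made up for this example.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def flat_shade(normal, to_light, surface_color, light_color, light_intensity):
    """Return one color for a whole polygon (hypothetical helper).

    The cosine of the angle between the surface normal and the direction
    to the light scales the product of surface color, light color and
    light intensity. Polygons facing away from the light get black.
    """
    n = normalize(normal)
    l = normalize(to_light)
    cos_angle = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(
        min(1.0, s * c * light_intensity * cos_angle)
        for s, c in zip(surface_color, light_color)
    )

orange = (1.0, 0.5, 0.0)
white_light = (1.0, 1.0, 1.0)

# Polygon facing the light directly: full surface color.
facing = flat_shade((0, 0, 1), (0, 0, 1), orange, white_light, 1.0)
# Polygon facing away from the light: unlit.
away = flat_shade((0, 0, 1), (0, 0, -1), orange, white_light, 1.0)
print(facing, away)
```

Because the normal is constant across a polygon, the whole polygon receives a single color, which is what gives flat-shaded objects their faceted look.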



Performance Factors of GPU

Fillrate
The fillrate usually refers to the number of pixels a video card can render and write to video memory in a second.


Memory Bandwidth
Memory bandwidth is the rate at which data can be read from or stored into a memory by a processor.


Memory Clock
Together with the width of the memory bus, this tells us the amount of memory bandwidth a graphics card has.

Memory Interface (Memory Bus)
The wider the Memory Interface, the more data can travel over it at once.
Core Clock
The actual speed at which the graphics processor on a video card operates.





Some Terms Associated With GPU

ANTI ALIASING - Anti-aliasing is a method of fooling the eye into seeing a jagged edge as smooth.

ROP (Raster Operators) - They take an image (shapes) and convert it into pixels or dots for output on a screen.
Alpha-blending - It is a technique for rendering transparency.

FPS - Frames per second, the main unit of measure used to describe graphics and video performance.


Stream Processing

A stream is simply a set of records that require similar computation.

A technique used to accelerate the processing of many types of video and image computations is called stream processing.

Streams provide data parallelism

GPUs are stream processors: processors that can operate in parallel by running a single kernel on many records in a stream at once.

Kernels are the functions that are applied to each element in the stream.
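
The stream and kernel idea described above can be sketched on an ordinary CPU. This is a minimal illustration, not GPU code: the thread pool merely stands in for the GPU's many processors, and the names `brighten` and `run_kernel` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel):
    """Kernel: the same function is applied to every record in the stream."""
    r, g, b = pixel
    return (min(255, r + 40), min(255, g + 40), min(255, b + 40))

def run_kernel(kernel, stream, workers=4):
    """Apply one kernel to all records; each record is independent,
    so the work can be spread over several workers (data parallelism)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map keeps the results in the order of the input stream.
        return list(pool.map(kernel, stream))

# A stream is a set of records that need similar computation: here, pixels.
stream = [(10, 20, 30), (200, 220, 240), (0, 0, 0)]
result = run_kernel(brighten, stream)
print(result)
```

No record depends on any other, which is the property that lets a GPU process many of them at once.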



API (Application Program Interface)


It is a set of routines, protocols, and tools for building software applications




GLide by 3dfx

OpenGL by Silicon Graphics

Direct3D by Microsoft



TYPES OF GPU


DEDICATED GPU




INTEGRATED GRAPHICS SOLUTIONS




HYBRID SOLUTIONS



Some Examples of GPU

NVIDIA'S GEFORCE
GEFORCE 2 series, GEFORCE 3 series, GEFORCE 4 series, GEFORCE 5 series, GEFORCE 6 series, GEFORCE 7 series, GEFORCE 8 series, GEFORCE 9 series, GEFORCE 100 series, GEFORCE 200 series, GEFORCE 300 series
Latest being the GEFORCE 400 series

AMD'S RADEON
MACH series, RAGE series, RADEON R100 series, RADEON R200 series, RADEON R300 series, RADEON R400 series, RADEON R500 series, RADEON R600 series, RADEON R700 series.
Latest being the EVERGREEN (HD 5000) series


CUDA


CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA

CUDA exploits the parallel computational power of the GPU where hundreds of on-chip processor cores simultaneously communicate and cooperate to solve complex computing problems, transforming the GPU into a massively parallel processor.

The CUDA programming interface lets developers use a familiar programming language (such as NVIDIA's C-like syntax) to code algorithms whose calculations are sent to the GPU.




Conclusion
From the introduction of the first GPUs in the 1970s to the most recent ones manufactured today, the world of graphics has changed enormously, and it would never have been the same without them.

Today many applications have become faster and more efficient by using GPU technology, saving a lot of time in many scenarios.
#3
GRAPHICS PROCESSING UNIT
(GPU)
ABSTRACT
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics. The processor is built with integrated transform, lighting, triangle setup/clipping, and rendering engines, capable of handling millions of math-intensive processes per second. GPUs allow products such as desktop PCs, portable computers, and game consoles to process real-time 3D graphics that only a few years ago were only available on high-end workstations. Used primarily for 3-D applications, a graphics processing unit is a single-chip processor that creates lighting effects and transforms objects every time a 3D scene is redrawn. These are mathematically-intensive tasks, which otherwise, would put quite a strain on the CPU.
CONTENTS
1. INTRODUCTION
2. WHAT'S A GPU
3. DIFFERENCE B/W CPU AND GPU
4. HISTORY & STANDARDS
5. PERIPHERAL COMPONENT INTERCONNECT
6. ACCELERATED GRAPHICS PORT
7. PERIPHERAL COMPONENT INTERCONNECT-EXPRESS
8. COMPONENTS OF GPU
9. HOW IS 3D ACCELERATION DONE
10. PERFORMANCE FACTORS OF GPU
11. TERMS ASSOCIATED WITH GPU
12. STREAM PROCESSING
13. TYPES OF GPU
14. APPLICATION PROGRAM INTERFACE
15. MANUFACTURERS AND EXAMPLES
16. APPLICATION OF GPU
17. LATEST IN GPU
INTRODUCTION

There are various applications that require a 3-dimensional (3D) world to be simulated as realistically as possible on a computer screen. These include 3D animations in games, movies and other real world simulations. It takes immense computing power to represent a 3D image, due to the enormous amount of information and the complex mathematical operations that need to be processed to project this 3D image onto a computer screen. In this situation, the processing time and bandwidth are at a premium due to large amounts of both computation and data.
The functional purpose of a GPU is to provide separate, dedicated graphics resources, including a graphics processor and memory, to relieve some of the burden on the main system resources, namely the Central Processing Unit(CPU), Main Memory, and the System Bus, which would otherwise get saturated with graphical operations and I/O requests. The abstract goal of a GPU, however, is to enable a representation of a 3D world as realistically as possible. So these GPUs are designed to provide additional computational power that is customized specifically to perform these 3D tasks.
WHAT'S A GPU?
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics. The processor is built with integrated transform, lighting, triangle setup/clipping, and rendering engines, capable of handling millions of math-intensive processes per second. GPUs form the heart of modern graphics cards, relieving the CPU (central processing unit) of much of the graphics processing load. GPUs allow products such as desktop PCs, portable computers, and game consoles to process real-time 3D graphics.
Used primarily for 3-D applications, a graphics processing unit is a single-chip processor that creates lighting effects and transforms objects every time a 3D scene is redrawn. These are mathematically-intensive tasks, which otherwise, would put quite a strain on the CPU. Lifting this burden from the CPU frees up cycles that can be used for other jobs.
However, the GPU is not just for playing 3D-intense video games or for those who create graphics (sometimes referred to as graphics rendering or content-creation) but is a crucial component that is critical to the PC's overall system speed. In order to fully appreciate the graphics card's role it must first be understood.
Many synonyms exist for the Graphics Processing Unit, the most popular being the graphics card. It's also known as a video card, video accelerator, video adapter, video board, graphics accelerator, or graphics adapter.
DIFFERENCE B/W CPU & GPU
The Central Processing Unit, or CPU, is where all the program instructions are executed.
The Graphics Processing Unit, or GPU, is a dedicated piece of hardware that processes graphics.

GPUs have many more transistors than CPUs, but GPU transistors run at a much slower rate than CPU transistors.

A GPU is called a parallel processor because there are usually several processors on one chip that are specialized in interpreting and drawing graphics onto a display very quickly.
A CPU, on the other hand, tends to deal with one instruction at a time.
HISTORY AND STANDARDS
The first graphics cards, introduced in August of 1981 by IBM, were monochrome cards designated as Monochrome Display Adapters (MDAs). The displays that used these cards were typically text-only, with green or white text on a black background. The Hercules Graphics Card (HGC) added monochrome graphics to such displays. Color for IBM-compatible computers appeared on the scene with the Color Graphics Adapter (CGA), which could show 4 colors at once in its 320x200 graphics mode, followed by the 16-color Enhanced Graphics Adapter (EGA). During the same time, other computer manufacturers, such as Commodore, were introducing computers with built-in graphics adapters that could handle a varying number of colors.
When IBM introduced the Video Graphics Array (VGA) in 1987, a new graphics standard came into being. A VGA display could show up to 256 colors at once (out of a possible 262,144-color palette) at 320x200, and offered resolutions up to 720x400 in text mode.
Over the years, VGA gave way to Super Video Graphics Array (SVGA). SVGA cards were based on VGA, but each card manufacturer added resolutions and increased color depth in different ways. Eventually, the Video Electronics Standards Association (VESA) agreed on a standard implementation of SVGA that provided up to 16.8 million colors and 1280x1024 resolution. Most graphics cards available today support Ultra Extended Graphics Array (UXGA). UXGA can support a palette of up to 16.8 million colors and resolutions up to 1600x1200 pixels.
Even though any card you can buy today will offer higher colors and resolution than the basic VGA specification, VGA mode is the basic standard for graphics and is the minimum on all cards. Many graphics cards used the Peripheral Component Interconnect (PCI) and the Accelerated Graphics Port (AGP). Nowadays we use Peripheral Component Interconnect Express (PCI-E) slots.
PERIPHERAL COMPONENT INTERCONNECT(PCI)
There are a lot of incredibly complex components in a computer, and all of these parts need to communicate with each other in a fast and efficient manner. Essentially, a bus is the channel or path between the components in a computer. During the early 1990s, Intel introduced a new bus standard for consideration, the Peripheral Component Interconnect (PCI). It provides direct access to system memory for connected devices, but uses a bridge to connect to the front side bus and therefore to the CPU.
PCI originally operated at 33 MHz using a 32-bit-wide path. Revisions to the standard include increasing the speed from 33 MHz to 66 MHz and doubling the bit count to 64. Currently, PCI-X provides for 64-bit transfers at a speed of 133 MHz for an amazing 1-GBps (gigabyte per second) transfer rate!
Although Intel proposed the PCI standard in 1991, it did not achieve popularity until the arrival of Windows 95. This sudden interest in PCI was due to the fact that Windows 95 supported a feature called Plug and Play (PnP). PnP means that you can connect a device into your computer and it is automatically recognized and configured to work in the system.
ACCELERATED GRAPHICS PORT (AGP)

The need for streaming video and real-time-rendered 3-D games requires an even faster throughput than that provided by PCI. In 1996, Intel debuted the Accelerated Graphics Port (AGP), a modification of the PCI bus designed specifically to facilitate the use of streaming video and high-performance graphics.
AGP is a high-performance interconnect between the core-logic chipset and the graphics controller for enhanced graphics performance for 3D applications. AGP relieves the graphics bottleneck by adding a dedicated high-speed interface directly between the chipset and the graphics controller.
The current PCI bus supports a data transfer rate of up to 132 MB/s, while AGP (at 66 MHz) supports up to 266 MB/s! In its 2x mode AGP doubles this again through its ability to transfer data on both the rising and falling edges of the 66 MHz clock. The AGP slot typically provides performance which is 4 to 8 times faster than the PCI slots inside your computer.
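
The transfer rates quoted for PCI, PCI-X and AGP follow from the bus width multiplied by the clock rate. The sketch below is a back-of-the-envelope check, not a model of real bus behavior. It assumes a 32-bit AGP data path (not stated in the text) and uses the rounded clock values from the text; the quoted 266 MB/s comes from the slightly higher nominal 66.66 MHz clock. The function name is made up for this example.

```python
def bus_bandwidth_mb_s(width_bits, clock_mhz, transfers_per_clock=1):
    """Peak bandwidth in MB/s: bytes per transfer times transfers per second."""
    return width_bits / 8 * clock_mhz * transfers_per_clock

pci = bus_bandwidth_mb_s(32, 33)       # original 32-bit, 33 MHz PCI
pci_x = bus_bandwidth_mb_s(64, 133)    # 64-bit, 133 MHz PCI-X
agp_1x = bus_bandwidth_mb_s(32, 66)    # AGP, one transfer per clock
agp_2x = bus_bandwidth_mb_s(32, 66, 2) # AGP using both clock edges

print(pci, pci_x, agp_1x, agp_2x)
```

These are peak figures; sustained throughput on a shared bus such as PCI is lower because several devices compete for the same path.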
PERIPHERAL COMPONENT INTERCONNNECT-EXPRESS(PCI-E)
PCI-E, unlike previous PC expansion standards, is structured around point-to-point serial links, a pair of which (one in each direction) make up lanes; rather than a shared parallel bus. These lanes are routed by a hub on the main-board acting as a crossbar switch. This dynamic point-to-point behavior allows more than one pair of devices to communicate with each other at the same time. In contrast, older PC interfaces had all devices permanently wired to the same bus; therefore, only one device could send information at a time. This format also allows channel grouping, where multiple lanes are bonded to a single device pair in order to provide higher bandwidth.
The PCI-E 2.0 standard supports a data transfer rate of 500 MB/s per lane. It uses a signaling rate of 5.0 GT/s per lane, while the first version operates at 2.5 GT/s.
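
The 500 MB/s figure can be related to the 5.0 and 2.5 rates with a short calculation. The sketch assumes those two numbers are raw per-lane bit rates and that each byte is sent as a 10-bit symbol (8b/10b line coding, which PCI-E 1.x and 2.0 use but the text does not mention); the function name is made up for this example.

```python
def lane_bandwidth_mb_s(line_rate_gbit_s, payload_bits=8, coded_bits=10):
    """Usable bandwidth of one lane in one direction, in MB/s.

    Only payload_bits out of every coded_bits on the wire carry data.
    """
    data_bits_per_second = line_rate_gbit_s * 1e9 * payload_bits / coded_bits
    return data_bits_per_second / 8 / 1e6

gen1_lane = lane_bandwidth_mb_s(2.5)  # first PCI-E version
gen2_lane = lane_bandwidth_mb_s(5.0)  # PCI-E 2.0
gen2_x16 = 16 * gen2_lane             # 16 bonded lanes, as used by graphics cards

print(gen1_lane, gen2_lane, gen2_x16)
```

Bonding lanes multiplies the per-lane figure, which is why a 16-lane graphics slot offers far more bandwidth than a single-lane slot.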
PCI-E 3.0 is currently being developed and is said to have an 8 GT/s signaling rate, together with a number of optimizations for enhanced signaling and data integrity, including transmitter and receiver equalization, PLL improvements, clock data recovery, and channel enhancements for currently supported
COMPONENTS OF GPU
There are several components on a typical graphics card:
Graphics Processor
A GPU is a dedicated processor optimized for accelerating graphics. The processor is designed specifically to perform floating-point calculations, which are fundamental to 3D graphics rendering. The main attributes of the GPU are the core clock frequency, which typically ranges from 250 MHz to 4 GHz and the number of pipelines (vertex and fragment shaders), which translate a 3D image characterized by vertices and lines into a 2D image formed by pixels.

Graphics accelerator
A graphics accelerator assists graphics rendering by supplying primitives that it can execute concurrently with and more efficiently than the x86 CPU. One reason the accelerator can be more efficient than the CPU is because it lives closer to the graphics memory; it does not have to transfer raw pixel data over a slow (relative to the speed of the graphics RAM) general BUS and chipset. But the main reason a graphics accelerator improves overall graphics performance is because it executes concurrently with the CPU. This means that while the CPU is calculating the coordinates for the next set of graphics commands to issue, the graphics accelerator can be busy filling in the polygons for the current set of graphics commands. This dividing up of computation is often referred to as load balancing.
Frame buffer
A frame buffer is a video output device that drives a video display from a memory buffer containing a complete frame of data.
The information in the memory buffer typically consists of color values for every pixel (point that can be displayed) on the screen. Color values are commonly stored in 1-bit monochrome, 4-bit palettized, 8-bit palettized, 16-bit high color and 24-bit true color formats. An additional alpha channel is sometimes used to retain information about pixel transparency. The total amount of the memory required to drive the framebuffer depends on the resolution of the output signal, and on the color depth and palette size.
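
The dependence of framebuffer memory on resolution and color depth can be made concrete with a small calculation. This is a sketch that counts only the pixel data (no palette table or padding), and the function name is made up for this example.

```python
def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Memory needed for the pixel data of one or more frame buffers."""
    total_bits = width * height * bits_per_pixel * buffers
    return (total_bits + 7) // 8  # round up to whole bytes

mono_vga = framebuffer_bytes(640, 480, 1)          # 1-bit monochrome
palette_vga = framebuffer_bytes(640, 480, 8)       # 8-bit palettized
true_color = framebuffer_bytes(1024, 768, 24)      # 24-bit true color
double_buffered = framebuffer_bytes(1024, 768, 24, buffers=2)

print(mono_vga, palette_vga, true_color, double_buffered)
```

Going from 8-bit palettized to 24-bit true color triples the memory for the same resolution, and double buffering doubles it again.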
Memory
The memory capacity of most modern video cards ranges from 128 MB to 4 GB, though very few cards actually go over 1 GB. Since video memory needs to be accessed by the GPU and the display circuitry, it often uses special high-speed or multi-port memory, such as VRAM, WRAM, SGRAM, etc. Around 2003, the video memory was typically based on DDR technology. During and after that year, manufacturers moved towards DDR2, GDDR3, GDDR4, and even GDDR5. The effective memory clock rate in modern cards is generally between 400 MHz and 3.8 GHz. Video memory may be used for storing other data as well as the screen image, such as the Z-buffer, which manages the depth coordinates in 3D graphics, textures, vertex buffers, and compiled shader programs.
Graphics BIOS
The video BIOS or firmware contains the basic program, which is usually hidden, that governs the video card's operations and provides the instructions that allow the computer and software to interact with the card. It may contain information on the memory timing, operating speeds and voltages of the graphics processor, RAM, and other information. It is sometimes possible to change the BIOS (e.g. to enable factory-locked settings for higher performance), although this is typically only done by video card overclockers and has the potential to irreversibly damage the card.
Digital-to-Analog Converter (DAC)
The RAMDAC, or Random Access Memory Digital-to-Analog Converter, converts digital signals to analog signals for use by a computer display that uses analog inputs such as CRT displays. The RAMDAC is a kind of RAM chip that regulates the functioning of the graphics card. Depending on the number of bits used and the RAMDAC data-transfer rate, the converter will be able to support different computer-display refresh rates. With CRT displays, it is best to work over 75 Hz and never under 60 Hz, in order to minimize flicker. (With LCD displays, flicker is not a problem.) Due to the growing popularity of digital computer displays and the integration of the RAMDAC onto the GPU die, it has mostly disappeared as a discrete component. All current LCDs, plasma displays and TVs work in the digital domain and do not require a RAMDAC. There are a few remaining legacy LCD and plasma displays that feature analog inputs (VGA, component, SCART etc.) only. These require a RAMDAC, but they reconvert the analog signal back to digital before they can display it, with the unavoidable loss of quality stemming from this digital-to-analog-to-digital conversion.
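
How the RAMDAC's transfer rate limits the refresh rate can be estimated roughly as follows. This is only a sketch: it treats the RAMDAC speed as a pixel rate, ignores the number of bits per pixel, and assumes a blanking overhead factor of 1.32 for CRT timing, which is a rule-of-thumb value and not a figure from the text. The function name and the example speeds are made up for this example.

```python
def max_refresh_hz(ramdac_mhz, width, height, blanking_overhead=1.32):
    """Rough upper bound on the refresh rate a RAMDAC can drive.

    Every frame needs width * height pixels plus blanking time, modeled
    here by a single overhead factor (an assumption).
    """
    pixels_per_frame = width * height * blanking_overhead
    return ramdac_mhz * 1e6 / pixels_per_frame

slow = max_refresh_hz(135, 1600, 1200)  # hypothetical 135 MHz RAMDAC
fast = max_refresh_hz(400, 1600, 1200)  # hypothetical 400 MHz RAMDAC

# Under these assumptions, only the faster part clears the 75 Hz comfort level.
print(round(slow, 1), round(fast, 1))
```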
Display Connector
The most common connection systems between the video card and the computer display are:
Video Graphics Array (VGA) (DE-15)
Analog-based standard adopted in the late 1980s designed for CRT displays, also called VGA connector. Some problems of this standard are electrical noise, image distortion and sampling error evaluating pixels.
Digital Visual Interface (DVI)
Digital-based standard designed for displays such as flat-panel displays (LCDs, plasma screens, wide high-definition television displays) and video projectors. It avoids image distortion and electrical noise, corresponding each pixel from the computer to a display pixel, using its native resolution.
Video in Video out (VIVO) for S video, composite video and component video
Included to allow the connection with televisions, DVD players, video recorders and video game consoles. They often come in two 9-pin Mini-DIN connector variations, and the VIVO splitter cable generally comes with either 4 connectors (S-Video in and out + composite video in and out), or 6 connectors (S-Video in and out + component PB out + component PR out + component Y out [also composite out] + composite in).
High Definition Multimedia Interface (HDMI)
An advanced digital audio/video interconnect released in 2003 and is commonly used to connect game consoles and DVD players to a display. HDMI supports copy protection through HDCP.
DisplayPort
An advanced license- and royalty-free digital audio/video interconnect released in 2007. DisplayPort intends to replace VGA and DVI for connecting a display to a computer.
Other types of connection systems
Composite video

Analog system with lower resolution; it uses the RCA connector
Component video
It has three cables, each with RCA connector (YCBCR for digital component, or YPBPR for analogue component); it is used in projectors, DVD players and some televisions.
DB13W3
An analog standard once used by Sun Microsystems, SGI and IBM.
DMS-59
A connector that provides two DVI outputs on a single connector.
Computer (Bus) Connector - This is usually Accelerated Graphics Port (AGP). This port enables the video card to directly access system memory. Direct memory access helps to make the peak bandwidth four times higher than the Peripheral Component Interconnect (PCI) bus adapter card slots. This allows the central processor to do other tasks while the graphics chip on the video card accesses system memory.
Internal Organization of GPU
The Need for 3D Acceleration
It may be a valid question to ask why it is that special 3D cards are needed today. After all, everything that is displayed on the computer screen is 2D, even 3D images projected to 2D. And 3D graphics have been used on computers for years.
The reason that specialized 3D accelerators are becoming popular is that software today is trying to do more in 3D than has ever been done before. The push for more realism, more finely-detailed graphics, and faster speeds in such programs as action games, flight simulators, graphics programs and CAD applications, means that more 3D work must be done in a shorter period of time.
There is an obvious parallel between today's quandary with 3D and a similar one that occurred in the early 90s when graphical operating systems became popular. At that time, most video cards had no acceleration functions at all. When people started running Windows, their CPU had to do all the work of drawing all the graphics on the screen, which caused everything to slow down tremendously. To combat this problem, accelerators were designed that did much of this work with specialized hardware, instead of forcing the system processor to do it.
Similarly today, it is not necessary to have a 3D graphics card to do 3D graphics, but the large amount of computation work necessary to translate 3D images to 2D in a realistic manner means that without specialized hardware to do this work, it must be done by the processor, using much slower software. Using a 3D accelerator allows programs--especially games, where the screen image must be recomputed many times per second--to display virtual 3D worlds with a level of detail and color that is impossible with a standard 2D video card.
HOW IS 3D ACCELERATION DONE??????
The primary steps are geometry, transform, and rendering, all of which are handled by both the system CPU and the video card processor. Here are the steps in better perspective. All steps in 3D are handled in scenes.
-The Geometry, well we all know what a triangle looks like, right? Well, your CPU has the job of determining where the triangles have to be placed to make an object on screen. This is usually done over a wire frame as seen in AutoCAD programs.
-Transform, now we have to put these triangles together. Your CPU will put a model of this image together in system memory on a wire frame. Not only this, but we have to figure out the lighting of the triangles while in memory. This usually occurs on your system's CPU and not on the video card processor.
-Render, now after taking information out of memory it is most likely sent to the video card's processor. When information hits the graphics board processor we will put those bitmap images over our triangles, thus making an image in 3D.
Now let's look at some more steps that can come into play with 3D operations. These are added to the basic steps above to make for a better scene.
-Filter, this is bi-linear and tri-linear filtering. This is done by smoothing over those blocky bitmaps. Textures will seem more streamlined and realistic. If you play Quake II you know what I mean. If you have played Quake I you are also a witness to the blocky square looking graphics.
-Double Buffering, as we discussed earlier. We need to display a scene and work on one at the same time. This will give you a streamlined appearance. If your buffering is out of whack you will know it soon.
-Flat Shading, this is similar to the filters above. We are taking those triangles with color and more or less bleeding them together. This really adds more to the realism rather than a red block here and a blue block next to it.
-Mipmapping, this is overlooked but very important in game play. Let's say you're walking inside a scene towards a dog; you want the dog to be bigger when you are standing right in front of it than when you saw it a mile away. This is mipmapping. Textures are basically swapped while moving to add more realism to the scene. Cool huh!?
-Atmosphere, if your dog in the scene above is smoking a cigarette we also want to see the hazy smoke. This is where the atmosphere can come into play. On flight simulators you will see haze or fog while flying; this is another example of atmosphere.
-Lighting, this effect will light up the dog so you can see it or even make his cigarette flare a little. Your dog may even show the light intensifying on his snout along with the haze over him; this is where lighting comes in.
-Z-Buffering, this comes down to objects that are obscured by others. We don't want to draw a hidden part of the scene until whatever covers it has moved. This is another way of improving performance.
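
The Z-buffering step at the end of the list above can be sketched as a per-pixel depth test. This is a simplified illustration, not how any particular card implements it: fragments are given as (x, y, depth, color) tuples, a smaller depth means closer to the viewer, and all names are made up for this example.

```python
def draw_fragments(width, height, fragments, far=float("inf")):
    """Keep, for every pixel, only the fragment closest to the viewer."""
    depth = [[far] * width for _ in range(height)]    # the Z-buffer
    color = [[None] * width for _ in range(height)]   # the frame buffer
    for x, y, z, c in fragments:
        if z < depth[y][x]:       # closer than what is stored so far?
            depth[y][x] = z       # remember the new nearest depth
            color[y][x] = c       # and overwrite the pixel color
    return color

# Three objects cover pixel (0, 0); only the nearest one should survive.
fragments = [
    (0, 0, 5.0, "wall"),
    (0, 0, 2.0, "dog"),
    (1, 0, 3.0, "wall"),
    (0, 0, 9.0, "sky"),
]
image = draw_fragments(2, 1, fragments)
print(image)
```

The result does not depend on the order in which the objects are drawn, which is why the hardware does not need to sort the scene first.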
PERFORMANCE FACTORS OF GPU
FILLRATE
The fillrate usually refers to the number of pixels a video card can render and write to video memory in a second. In this case, fillrates are given in megapixels per second or in gigapixels per second (in the case of newer cards), and they are obtained by multiplying the number of raster operations (ROPs) by the clock frequency of the graphics processor unit (GPU) of a video card.
Texture fill rate is the number of textured pixels the card can render to the screen every second. To render a 3D scene, textures are mapped over the top of polygon meshes. This is called texture mapping and is accomplished by texture mapping units (TMUs) on the video card. Texture fill rate is a measure of the speed with which a particular card can perform texture mapping.
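
The fill rate arithmetic described above is a simple product. In the sketch below the pixel fill rate follows the formula given in the text (ROPs times core clock); computing the texture fill rate as TMUs times core clock is a common convention and is an assumption here, as are the card figures, which do not describe any real product.

```python
def pixel_fillrate_mpixels(rops, core_clock_mhz):
    """Pixel fill rate in megapixels per second: ROPs times core clock."""
    return rops * core_clock_mhz

def texture_fillrate_mtexels(tmus, core_clock_mhz):
    """Texture fill rate in megatexels per second (assumed: TMUs times clock)."""
    return tmus * core_clock_mhz

# Hypothetical card: 16 ROPs, 32 TMUs, 600 MHz core clock.
pixel_rate = pixel_fillrate_mpixels(16, 600)
texture_rate = texture_fillrate_mtexels(32, 600)

print(pixel_rate / 1000, "gigapixels/s")
print(texture_rate / 1000, "gigatexels/s")
```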
MEMORY BANDWIDTH
Memory bandwidth is the rate at which data can be read from or stored into a semiconductor memory by a processor. Memory bandwidth is usually expressed in units of bytes/second, though this can vary for systems with natural data sizes that are not a multiple of the commonly used 8-bit bytes.
Memory bandwidth that is advertised for a given memory or system is usually the maximum theoretical bandwidth. In practice the observed memory bandwidth will be less than (and is guaranteed not to exceed) the advertised bandwidth. A variety of computer benchmarks exist to measure sustained memory bandwidth using a variety of access patterns. These are intended to provide insight into the memory bandwidth that a system should sustain on various classes of real applications.
CORE CLOCK
The actual speed at which the graphics processor on a video card operates. Core clock is measured in megahertz (MHz). The clock speed of a chip, combined with the number/configuration of the pipelines in the chip, give a pretty accurate picture of what the performance of the chip will be.
The core clock speed can sometimes be changed on newer cards where users want to gain a performance boost. This is called overclocking and it can usually be done using third-party utilities or the drivers provided by the video card manufacturer.
MEMORY CLOCK
The memory clock, along with the size of the memory bus, tells us the amount of memory bandwidth a graphics card has. The more memory bandwidth a card has, the better it can handle higher resolutions.
Like the core clock, the memory clock of most cards can be manually increased through the driver, though highly overclockable memory is somewhat rare.
Memory Interface (Memory Bus):
The wider the Memory Interface, the more data can travel over it at once. A good analogy is that an 8-lane highway allows more cars to travel than a 4-lane highway. The unit for measuring the Memory Interface is bits (for example, 256-bit).
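
The way the memory clock and the memory interface width combine into memory bandwidth can be sketched as follows. The calculation gives a theoretical peak only, assumes the clock figure is the effective (data) rate, and uses made-up card figures and a made-up function name.

```python
def memory_bandwidth_gb_s(effective_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9

# Same hypothetical 2000 MHz effective memory clock, two bus widths.
narrow = memory_bandwidth_gb_s(2000, 128)
wide = memory_bandwidth_gb_s(2000, 256)

print(narrow, wide)
```

Doubling the bus width doubles the bandwidth at the same clock, which is the 8-lane versus 4-lane highway analogy in numbers.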
SOME TERMS ASSOCIATED WITH GPU
ANTI ALIASING
Anti-Aliasing is a method of fooling the eye into seeing a jagged edge as smooth. Anti-Aliasing is often referred to in games and on graphics cards. In games especially, the chance to smooth the edges of images goes a long way to creating a realistic 3D image on the screen. Remember though that Anti-Aliasing does not actually smooth any edges of images; it merely fools the eye.
ROP (Raster Operators)
The Raster Operators manage one of the last phases of image processing before the image is displayed on the monitor screen. They make sure the images are compressed neatly (if they come from larger ones), don't overlap one another (Anti-Aliasing) and are sent to the memory (Output Buffer), where they are stored before being displayed on the monitor.
Alpha-blending
It is a technique to do transparency. It is an extra value added to the pixels of a texture map to define how easy it is to look through the pixel. This way it is possible to look through things and effects like realistic water and glass are possible.
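
The role of the extra alpha value can be illustrated with the usual blending formula, result = alpha * source + (1 - alpha) * destination, applied per color channel. This is a minimal sketch, assuming 8-bit RGB colors and an alpha between 0 (fully transparent) and 1 (fully opaque); the names and colors are made up for this example.

```python
def alpha_blend(src, dst, alpha):
    """Blend a source pixel over a destination pixel.

    alpha = 1.0 shows only the source, alpha = 0.0 only the destination.
    """
    return tuple(round(alpha * s + (1 - alpha) * d) for s, d in zip(src, dst))

water = (0, 0, 255)     # the semi-transparent surface being drawn
sand = (200, 180, 100)  # what is already in the frame buffer behind it

see_through = alpha_blend(water, sand, 0.25)
opaque = alpha_blend(water, sand, 1.0)

print(see_through, opaque)
```

With a low alpha the sand remains clearly visible through the water, which is the look-through effect described above.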
TEXTURE MAPPING
In 3D graphics, texture mapping is the process of adding a graphic pattern to the polygons of a 3D scene. Unlike simple shading, which applies plain colors to the underlying polygons of the scene, texture mapping applies simple textured graphics, also known as patterns or more commonly "tiles", to simulate walls, floors, the sky, and so on.
FPS
FPS stands for Frames per Second. This is the main unit of measure used to describe graphics and video performance.
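Since FPS is simply the number of frames drawn per second, it converts directly into a time budget per frame. The sketch below shows that arithmetic; the function names and the sample frame times are hypothetical and only serve to illustrate the conversion.

```python
def frame_time_ms(fps):
    """Time available to render one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def average_fps(frame_times_ms):
    """Average frames per second over a list of measured frame times (ms)."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


budget = frame_time_ms(50)                     # time budget per frame at 50 FPS
measured = average_fps([20.0, 20.0, 10.0])     # three hypothetical frame times
print(budget, measured)
```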
STREAM PROCESSING
A newer concept is to use a general-purpose graphics processing unit (GPGPU) as a modified form of stream processor. This turns the massive floating-point computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power, as opposed to being hard-wired solely to do graphical operations. In certain applications requiring massive vector operations, this can yield much higher performance than a conventional CPU, in some cases by orders of magnitude.
The main feature here is parallel processing: carrying out many operations at the same time. Stream processors are essentially smaller processors within the GPU that support forming and displaying images. Their major function is to act as small processing units for the parallel workloads of the hardware.
For example, stream processors may be used to render (create images of) grass on a field in an open area, or to create walls in a closed room. Having more stream processors generally improves the performance of a graphics card significantly.
TYPES OF GPU
DEDICATED GPU
A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.
Technologies such as SLI by NVIDIA and CrossFire by ATI allow multiple GPUs to be used to draw a single image, increasing the processing power available for graphics.
INTEGRATED GRAPHICS SOLUTIONS (SHARED GRAPHICS SOLUTIONS)
These are graphics processors that use a portion of a computer's system RAM rather than dedicated graphics memory. Computers with integrated graphics account for 90% of all PC shipments[6]. These solutions are less costly to implement than dedicated graphics solutions, but are less capable. Historically, integrated solutions were often considered unfit to play 3D games or run graphically intensive programs such as Adobe Flash. Examples of such IGPs are the offerings from SiS and VIA circa 2004.
HYBRID SOLUTIONS
Hybrid solutions share memory with the system but also have a small dedicated memory cache to make up for the high latency of the system RAM. The most common implementations are ATI's HyperMemory and NVIDIA's TurboCache. Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. Technologies within PCI Express make this approach possible.
API (APPLICATION PROGRAMMING INTERFACE)
An API is a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks. Game programmers use APIs to program 3D functions more easily and so that the programs they write will run on more types of hardware.
The three most popular graphics APIs are:
• Glide by 3dfx,
• OpenGL by Silicon Graphics (and Microsoft), and
• Direct3D by Microsoft as part of their multimedia DirectX package.
GLIDE (3DFX)
Glide is a proprietary 3D graphics API developed by 3dfx Interactive for their Voodoo Graphics 3D accelerator cards. It was dedicated to gaming performance, supporting geometry and texture mapping primarily, in data formats identical to those used internally in their cards. Further refinement of Microsoft's Direct3D and full OpenGL implementations from other graphics card vendors, in addition to growing competition in 3D hardware, eventually caused Glide to become superfluous.
DIRECTX
The DirectX software development kit (SDK) consists of runtime libraries in redistributable binary form, along with accompanying documentation and headers for use in coding.
Direct3D is used to render three dimensional graphics in applications where performance is important, such as games. Direct3D also allows applications to run fullscreen instead of embedded in a window, though they can still run in a window if programmed for that feature. Direct3D uses hardware acceleration if it is available on the graphics card, allowing for hardware acceleration of the entire 3D rendering pipeline or even only partial acceleration. Direct3D exposes the advanced graphics capabilities of 3D graphics hardware, including z-buffering, anti-aliasing, alpha blending, mipmapping, atmospheric effects, and perspective-correct texture mapping. Integration with other DirectX technologies enables Direct3D to deliver such features as video mapping, hardware 3D rendering in 2D overlay planes, and even sprites, providing the use of 2D and 3D graphics in interactive media titles.
OPENGL
OpenGL is the industry-standard interface to modern programmable graphics hardware, widely used for games, animation, CAD/CAM, medical imaging, and other applications that visualize and manipulate high-performance 2D and 3D graphics. Apple's implementation of OpenGL has been optimized for the Macintosh platform and includes a suite of extensions that give developers access to advanced graphics hardware capabilities. OpenGL's basic operation is to accept primitives such as points, lines and polygons, and convert them into pixels. This is done by a graphics pipeline known as the OpenGL state machine. Most OpenGL commands either issue primitives to the graphics pipeline, or configure how the pipeline processes these primitives.
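The split between commands that configure the pipeline and commands that issue primitives can be illustrated with a toy model. The sketch below is not the OpenGL API: the class ToyPipeline and its methods set_color, draw_point and draw_line are hypothetical names invented for this example. It only shows the state-machine idea that a setting, once changed, applies to every primitive issued afterwards.

```python
class ToyPipeline:
    """Toy stand-in for a state-machine style pipeline (not real OpenGL)."""

    def __init__(self):
        self.current_color = (255, 255, 255)   # state applied to new primitives
        self.submitted = []                    # primitives issued so far

    # Command that configures how later primitives are processed.
    def set_color(self, r, g, b):
        self.current_color = (r, g, b)

    # Commands that issue primitives to the pipeline.
    def draw_point(self, x, y):
        self.submitted.append(("point", [(x, y)], self.current_color))

    def draw_line(self, x0, y0, x1, y1):
        self.submitted.append(("line", [(x0, y0), (x1, y1)], self.current_color))


pipeline = ToyPipeline()
pipeline.draw_point(0, 0)         # uses the default white state
pipeline.set_color(255, 0, 0)     # changes state; nothing is drawn by this call
pipeline.draw_line(0, 0, 4, 4)    # picks up the red state
pipeline.draw_point(2, 2)         # state persists until it is changed again

for primitive in pipeline.submitted:
    print(primitive)
```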
MANUFACTURERS OF GPU
AMD (ATI division), Matrox Graphics, Nvidia, S3 Graphics, Intel, SiS, PowerV
SOME EXAMPLES OF GPU
NVIDIA'S GEFORCE
GeForce 2 series, GeForce 3 series, GeForce 4 series, GeForce 5 series, GeForce 6 series, GeForce 7 series, GeForce 8 series, GeForce 9 series, GeForce 100 series, GeForce 200 series.
The latest is the GeForce 300 series.
[Table: GeForce 100 series comparisons]
AMD'S RADEON
Mach series, Rage series, Radeon R100 series, Radeon R200 series, Radeon R300 series, Radeon R400 series, Radeon R500 series, Radeon R600 series, Radeon R700 series.
The latest is the Evergreen (HD 5000) series.
APPLICATIONS OF GPU
Interactive gaming: the games industry
Motion pictures: film production and cinematography
Computer-aided design and manufacturing: tools such as those from Autodesk
Visual simulations: pilot and driver training
General computing: emerging areas of research such as molecular modelling
Consumer devices: handhelds, consoles, cell phones
LATEST IN GPU
CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. CUDA is the computing engine in NVIDIA graphics processing units or GPUs that is accessible to software developers through industry standard programming languages. Programmers use 'C for CUDA' (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler,[1] to code algorithms for execution on the GPU. CUDA architecture supports a range of computational interfaces including OpenCL[2] and DirectCompute[3]. Third party wrappers are also available for Python, Fortran, Java and Matlab.
The latest drivers all contain the necessary CUDA components. CUDA works with all NVIDIA GPUs from the G8X series onwards, including GeForce, Quadro and the Tesla line.
The next-generation CUDA architecture, code-named Fermi, is described by NVIDIA as the most advanced GPU computing architecture built to date. With over three billion transistors and up to 512 CUDA cores, Fermi is claimed to deliver supercomputing features and performance at 1/10th the cost and 1/20th the power of traditional CPU-only servers. Fermi aims to make GPU and CPU co-processing pervasive by addressing the full spectrum of computing applications. Designed for C++ and available with a Visual Studio development environment, it is intended to make parallel programming easier and to accelerate a wider array of applications than before, including ray tracing, physics, finite element analysis, high-precision scientific computing, sparse linear algebra, sorting, and search algorithms.
REFERENCES
1. howstuffworks.com
2. tomshardware.com
3. intel.com
4. nvidia.com
5. extremetech.com
6. entecollege.com
7. pcworld.com
8. en.wikipedia.com
9. pcguide.com
#4
This article is presented by: F. S. Hill, Jr. and S. Kelley
Computer Graphics using OpenGL

Visual Realism Requirements
*Light Sources
*Materials (e.g., plastic, metal)
*Shading Models
*Depth Buffer Hidden Surface Removal
*Textures
*Reflections
*Shadows

Rendering Objects
*We know how to model mesh objects, manipulate a jib camera, view objects, and make pictures.
*Now we want to make these objects look visually interesting, realistic, or both.
*We want to develop methods of rendering a picture of the objects of interest: computing how each pixel of a picture should look.



For more information about this article, please follow the link:
http://googleurl?sa=t&source=web&cd=1&ve...ghting.ppt&ei=xFWpTISxPMG88gaA29iDDQ&usg=AFQjCNGylqpaTEjOMfFStj2Nh6MilWQKkg



#5


Submitted by:
BRIJESH KUMAR PATEL
BT– CS


[attachment=7897]

What is a GPU?
A Graphics Processing Unit (GPU) is a microprocessor that has been designed specifically for the processing of 3D graphics.

The main purpose of a GPU is to render 3D images as realistically as possible on the computer screen.

A GPU is mainly needed to relieve the CPU of graphical computations so that the CPU can be used for other processes.


Difference Between CPU & GPU
The CPU is where all the program instructions are executed.
A CPU performs fewer floating-point operations per second (FLOPS) than a GPU, which can reach about 150 GFLOPS.

History & Standards
The first graphics adapters, introduced in 1981 by IBM, were monochrome cards designated as Monochrome Display Adapters (MDAs).

Then came the Color Graphics Adapter (CGA) and then the Enhanced Graphics Adapter (EGA).

IBM introduced the Video Graphics Array (VGA) in 1987. It could display up to 256 colors at 320x200, or 16 colors at 640x480, with text modes up to 720x400.

Then came the Super Video Graphics Array (SVGA), which supports up to 16.8 million colors and 1280x1024 resolution.
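The color and resolution figures quoted for SVGA translate directly into memory requirements: 16.8 million colors corresponds to 24 bits per pixel (2^24 = 16,777,216). The sketch below works through that arithmetic. The function names are hypothetical, and it assumes an uncompressed, tightly packed framebuffer with no padding.

```python
def bits_per_pixel(num_colors):
    """Smallest number of bits that can represent num_colors distinct colors."""
    return max(1, (num_colors - 1).bit_length())


def framebuffer_bytes(width, height, num_colors):
    """Bytes needed to hold one uncompressed frame at the given color depth."""
    total_bits = width * height * bits_per_pixel(num_colors)
    return (total_bits + 7) // 8          # round up to whole bytes


svga_colors = 2 ** 24                      # about 16.8 million colors
svga_bytes = framebuffer_bytes(1280, 1024, svga_colors)

# Color count, bits per pixel, and frame size in MiB.
print(svga_colors, bits_per_pixel(svga_colors), svga_bytes / 2 ** 20)
```

Growth of this kind in per-frame data is one reason dedicated graphics memory and bandwidth became important.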

Need for 3D Acceleration

Everything that is displayed on the computer screen is 2D.

The push for more realism, more finely detailed graphics and faster speeds means that more 3D work must be done in a shorter period of time, which has resulted in the need for 3D acceleration.

Stream Processing
A stream is simply a set of records that require similar computation.

A technique used to accelerate the processing of many types of video and image computations is called stream processing.

Streams provide data parallelism

GPUs are stream processors – processors that can operate in parallel by running a single kernel on many records in a stream at once.

Kernels are the functions that are applied to each element in the stream.
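The stream and kernel model described above can be sketched on the CPU as follows. This is only an illustration of the programming model, under the assumption that a thread pool stands in for the GPU's many stream processors; Python threads do not provide the hardware parallelism of a GPU. The names brighten and run_kernel are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel):
    """Kernel: the same small function is applied to every record."""
    r, g, b = pixel
    return (min(r + 40, 255), min(g + 40, 255), min(b + 40, 255))


def run_kernel(kernel, stream, workers=4):
    """Apply one kernel to every record in the stream.

    Each record is processed independently of the others, which is what
    makes the work data-parallel. Results come back in stream order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(kernel, stream))


# A stream is a set of records needing similar computation, here RGB pixels.
pixels = [(0, 0, 0), (250, 100, 10), (30, 60, 90)]
result = run_kernel(brighten, pixels)
print(result)
```

Because no record depends on another, the same kernel could be run on thousands of records at once on hardware with enough processing units.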



APPLICATIONS OF GPU
Interactive gaming: the games industry
Motion pictures: film production and cinematography
Computer-aided design and manufacturing: tools such as those from Autodesk
Visual simulations: pilot and driver training

ADVANTAGES OF GPU
Scattered reads – code can read from arbitrary addresses in memory.
Shared memory – CUDA exposes a fast shared memory region (16 KB in size) that can be shared amongst threads.
Faster downloads and readbacks to and from the GPU.
Full support for integer and bitwise operations, including integer texture lookups.

#6
Hello sir, I am NOOR MISBAH, studying in the 8th semester of BE. I have selected GRAPHICS PROCESSING UNIT as my seminar topic and am interested in doing a seminar on it. Could you please help me with how to proceed and tell me what information is most suitable for this topic at my academic level?
I also need the slides, PPTs and report for the same. Kindly oblige.
#7
Can you post an updated topic for GPU, please?
#8

To get information about the topic "accelerated processing unit", including the full report, PPT and related topics, refer to the page link below:


http://studentbank.in/report-graphics-pr...e=threaded