Hints for Faster Computing!

Updated 10/16/2005, but some material relevant to as far back as 1995 remains.

Processor Upgrade Hints
Dual Processor Disappointment
Recent operating system disappointments!
Other hardware Upgrade Hints - HARD DRIVE UPDATE!
(If your computer is slow, first check for adequate memory)
The AGP memory thief, and how to reduce
General Software and Memory Manager Configuration Hints!
Making a Windows DOS prompt session more efficient!
Windows 3.1 hack to use over 64 meg memory
A couple Programming Tips, especially for math-intensive loops
A couple extra tips for Microsoft BASIC programs such as mathptch.obj
(a minor correction 8/17/99)
C Vs. BASIC and a few Microsoft Quick C tips
A couple precision and floating point tricks!

HELP - My Internet speed is slow!

Processor Upgrade Hints

What I mostly hear is that an AMD "Athlon XP" processor generally outperforms an Intel Pentium IV of 20% higher clock speed.

AMD Athlon XP "model numbers" are not the actual core speed in MHz, but supposedly that of a comparably fast Intel P4. The actual core speed of an AMD Athlon XP is usually 2/3 to 3/4 the "model number" and is 2200 MHz for the model number "3200+".

As of 11/26/2004, the current best widely available 32 bit AMD processor is the 3200+ XP (I did say 3.2 GHz before 1/6/2005), and I recommend it since cost does not increase much with speed up to that point. UPDATE 1/6/2005 - Intel P4 with nominal frontside bus at least 800 MHz and core speed at least 3.2 GHz probably outperforms the AMD Athlon XP 3200+, although not by much where Intel core speed and AMD "model number" match.

Motherboards for this one can have a 400 MHz front side bus for DDR memory (64 bit memory bus), or they can do "dual channel DDR" which is 128 bit but probably mostly performs like 64 bit. A higher end dual channel DDR motherboard for AMD XP with 400 MHz frontside bus is the A7N8X-E "Deluxe". Although dual channel DDR probably has little advantage over single channel DDR most of the time, I saw so little cost difference that I recommend dual channel DDR. The DDR memory for 400 MHz is called PC3200, and if you have a dual channel DDR motherboard you need 2 matched pieces - same size, same brand.

The 400 MHz nominal "frontside bus" speed for the AMD Athlon XP 3200+ is a figure that includes "double data rate". The actual external clock speed is 200 MHz for "PC3200" memory.
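
The arithmetic behind these names can be checked with a short C sketch. It assumes the standard 64 bit (8 byte) data width of a single DDR memory module, which is not stated above, and the function names are made up for illustration.

```c
#include <assert.h>

/* Nominal "double data rate" speed: two transfers per external clock cycle. */
unsigned long ddr_nominal_mhz(unsigned long clock_mhz)
{
    return clock_mhz * 2UL;
}

/* Peak transfer rate in megabytes per second for a bus of the given width.
   ASSUMPTION: bus_bytes is 8 (64 bits) for one ordinary DDR module. */
unsigned long ddr_peak_mb_per_s(unsigned long clock_mhz, unsigned long bus_bytes)
{
    assert(bus_bytes > 0);
    return ddr_nominal_mhz(clock_mhz) * bus_bytes;
}
```

With a 200 MHz external clock, ddr_nominal_mhz(200) gives 400 and ddr_peak_mb_per_s(200, 8) gives 3200, which is where the "PC3200" name comes from.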

Now for stuff relevant to older machines:
I have been disappointed by some aspects of Intel x86 type processors beyond Pentium. My impression is that performance per MHz is slightly worse with Pentium II over original Pentium. Pentium III at least did not continue this trend but reversed it slightly. Then came the Pentium IV which has caused Intel some embarrassment with a disappointing (I think negative) advance in work done per clock cycle. For many tasks, a 1.4 GHz P4 is only negligibly better than a 1 GHz P3.

If by any chance you run mostly older software that runs under DOS or Windows 3.1, you may be especially disappointed by how the higher processors handle this software. In extreme cases, a 266 MHz Pentium II runs it no faster than a 133 MHz Pentium does. I did try different external clock speeds and clock multipliers to verify the effects. I even thought for about a year that the Pentium II had half-speed L1 cache due to the surprisingly slow speed that was verifiably proportional to internal clock speed and independent of anything else. I even tried disabling L2 cache to see what that did.

UPDATE 9/10/2000 - I recompiled my personal test "benchmark" software using the "mathpatch" mentioned below. The speed (in proportion to processor internal/nominal clock speed) after this improvement is about equal on Intel Pentium II and AMD K7. It looks like the Intel disadvantage I experience is heavily in "waste instructions" put into executable code by un-patched specific Microsoft compilers, mainly some for BASIC.

As for other Intel 686 types:

The Celeron is a Pentium II to Pentium IV with a small or sometimes no L2 cache. Ones with L2 cache have a full-speed L2 cache, as opposed to the half-speed but much larger L2 cache used by most Intel P2-P3 processors around or under 600 MHz.

Many Celeron machines 366-600 MHz use 66 MHz external clock instead of 100, so there is a slight disadvantage for memory-intensive applications. Most Celerons 400-800 MHz (maybe higher) have some L2 cache, full-speed yet, and most 333 MHz or lower do not.
The Pentium III is basically a Pentium II with some added features such as an extended version of MMX and an electronic serial number.

There are two types of Pentium III, Katmai (512K half speed L2 cache and mainly 600 MHz or less) and Coppermine (256K full speed L2 cache, mainly 600 MHz or more).

For use with modern software requiring 32 bit operating systems, the 686 processors are a major improvement over the 586 ones. Most test results indicate that the Intel P2/P3 and the AMD K7 ones are comparable but Intel P4 mostly fell behind a little.

Note that the best AMD processors as of early 2002 are Thunderbird and XP, which are both especially good K7 Athlons with 256K full-core-speed L2 cache. Some other K7's have fractional speed L2 cache.

As of 10/16/2005 I recommend that anything you upgrade to have either an Intel P4 of at least 2.66 GHz with "PC2700" memory (DDR with nominal frontside bus 333 MHz), or an AMD XP processor of at least "2700" or "2.7+" model number and with a nominal frontside bus speed of at least 333 MHz using "PC2700" memory.

Dual Processor Disappointment

Two or more processors in a machine will not make the machine run any faster than one processor will - at least not for nearly all home and office computer applications. In order to gain any benefit from more than one processor, you need an operating system and software made to multitask with multiple processors. This rules out DOS, Windows 3.1, Windows 95, Windows 98, and Windows ME and all the usual software made to run under these operating systems. If you simultaneously run more than one intensive application, they may benefit from multiple processors if you use Windows NT, Windows 2000, or Linux, properly configured to take advantage of multiple processors.

Multiple processor machines are usually used for file servers.

Recent operating system disappointments!

Windows 2000, Windows ME, and Windows XP are supposed to be so good because they run in protected mode. Not many people will tell you that protected mode is slower than what was used in Windows versions 3.1 to 98! If you have to use 2000 or XP, you will probably want 2.4-plus GHz processor speed and nominal frontside bus speed at least 266 MHz (or at least 800 MHz for Intel "RAMBUS" or whatever they call that now). On top of that, MS Windows versions higher than 98 do not have quite the usual "DOS Prompt" or "DOS Box" or "DOS window" and will at least in some cases not run your old favorite DOS software as well! I have heard that in Windows XP, you have a "Command" program that is essentially a "DOS box", but it does not work as well as the "DOS Prompt" in Windows versions 98 and lower.

Other hardware Upgrade Hints, and a Hard Drive Update!

Updated 8/14/2001, 9/30/2001, 11/26/2004!

One thing that is now inexpensive and that often helps a lot is more memory.

At a computer show in early October 2004, I paid $85 per piece of 512 meg PC3200 DDR memory. I now recommend a minimum of 512 meg, AKA "half a gig".

Requirements for good efficient use have proportionately increased as software gets more bloated and expectations increase with available processor speed and hard drive size.

If you find that your hard disk is getting used a lot and things slow down when that happens, you want to consider adding memory.

Windows 98 and a printer driver can hog up 60 meg, and a minimum 4-8 meg of RAM is reserved for AGP video usage in a usual modern (as of 2000) computer even if you don't use AGP video. I think 128 meg was a minimal amount as of late 2001, more if you use a version of Windows higher than 98. I think 512 meg nowadays is merely adequate, especially if you have to deal with one of the more bloated printer drivers doing photo printing. Even without issues of more bloated printer drivers, I now consider 256 meg minimal.

Slowdowns related to using the hard disk instead of memory are annoying! If your machine does not accept enough memory to use it instead of hard disk space, get a more modern machine.

UPDATE 11/26/2004 - Memory prices are so low that you have no excuse for under 512 meg. If your machine does not accept this much, then it is time to look into a new machine with at least 512 meg of RAM, an external bus speed of nominally at least 400 MHz, and a processor speed of at least 2.6 GHz and a hard drive at least 80 gig and at least 7200 RPM.

One more thing (relevant mostly to original Pentium machines) - if your machine accepts EDO as well as normal / fast page memory on 72 pin SIMM modules and you have normal / fast page memory and you want to upgrade the memory for that machine, consider trading in any normal memory you have to upgrade to EDO. EDO works somewhat faster than normal / fast page.

Note that there are two instances where using more than 64 meg (plus whatever is reserved for AGP video) will not help:

1. If you use DOS 6.x or lower or if you use Windows 3.1. The limit that the operating system will recognize is 64 meg. My favorite Microsoft operating system now is Windows 98 and its associated version of DOS.

2. If you have a motherboard having the 430FX chipset (usually with original Pentium), do not use more than 64 meg. This chipset makes only 64 meg of the memory cacheable. Memory other than the cacheable range will be slow. When your computer boots, look for any numbers displayed by the BIOS in the early stage of a cold boot. This may give indication of the chipset. If this is 430FX, I recommend getting a more modern machine.
The 430HX, VX, and TX chipsets do not have this limitation.

HARD DRIVES - I recommend at least 80 gig in size and at least 7200 RPM. If the drive is an IDE drive (the kind used in nearly all non-Apple personal computers) I also recommend that the drive have ATA-100 or ATA-133 compatibility. The good drives nowadays have ATA-100 (or higher) compatibility. Your motherboard (and any separate hard drive controller, generally unneeded in modern motherboards) and your IDE cable should also support this. Not all IDE cables are alike - the ones for ATA-66 and faster have extra ground conductors. Signal propagation at higher speeds requires cables that work in a way more like radio frequency coaxial cable.

I have 20 gig and 80 gig 7200 RPM ATA-100 Western Digital drives. The 20 gig one has a data reading rate of approx. 12 megabytes per second and the 80 gig has a data read rate of approx. 14 meg per second in my testing. Drives of RPM less than 7200 or capacity less than 20 gig are slower than 20-gig-plus, 7200 RPM ATA-100-compatible drives. In my experience, speed of drives 20 gig and less seems very roughly proportional to their RPM times the square root of their capacity.
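
The rough rule in the last sentence above can be turned into a comparison. This is only a sketch of that rule of thumb, not a benchmark. The function names are hypothetical, and it compares the squares of the figure (RPM squared times capacity) so that no square root routine is needed. The ranking comes out the same either way.

```c
#include <assert.h>

/* Square of the rule-of-thumb figure "RPM times square root of capacity".
   Squaring keeps the ordering and avoids needing the math library. */
double drive_figure_squared(double rpm, double capacity_gig)
{
    assert(rpm > 0.0 && capacity_gig > 0.0);
    return rpm * rpm * capacity_gig;
}

/* Returns 1 if drive A is expected to read faster than drive B under the
   rule of thumb, 0 otherwise. Meant only for drives of 20 gig or less,
   which is the range the rule was observed for. */
int expected_faster(double rpm_a, double gig_a, double rpm_b, double gig_b)
{
    return drive_figure_squared(rpm_a, gig_a) > drive_figure_squared(rpm_b, gig_b);
}
```

For example, expected_faster(7200, 20, 5400, 20) is 1. Less obviously, by this rule a 5400 RPM 20 gig drive comes out slightly ahead of a 7200 RPM 10 gig drive, so expected_faster(5400, 20, 7200, 10) is also 1.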

See below in "software hints" for Windows 95 and higher settings that affect speed of reading and writing hard drives.

Hard Drive Controller - I have found VESA hard drive controllers to be disappointingly slow. PCI ones are faster. Most motherboards for Pentium and higher processors have a built-in PCI controller for IDE hard drives. Modern motherboards as of late 2001 have built-in "ATA-100" hard drive controllers which gain a little over prior versions of PCI-IDE controllers.

The AGP memory thief, and how to reduce

Computers with motherboards supporting AGP video often have some of the RAM reserved for video use and unavailable for the purposes for which the RAM is usually used. This is true even if you use a PCI (or other non-AGP) video card. You can usually reduce but not eliminate this memory theft.

When the computer is in early stages of booting up, there is usually a message, "hit DEL for setup" or something to this effect. You may need to hit "Del" as soon as this message appears in some computers. This gets you into what they call "CMOS Settings". One of the "groups" other than "Standard", "Optimal", "Power Management" and "Save" will have an option of amount of memory to be used for AGP purposes. Reduce this quantity to the smallest amount that will serve your needs.
As for how much? If your video is not AGP, select the minimum - sorry, in my experience, not zero. If you have AGP video, what you need will probably be typically 4 bytes times the number of pixels used in your display, maybe a bit more. Maybe double that for playing movies or video games. Select the smallest amount that matches or exceeds your need.
After adjusting this, back up (usually "Esc") to the CMOS main menu, then select Save Changes and Quit. The boot will then resume.
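
The sizing rule above can be sketched in C. It reads the figure of 4 per pixel as bytes and takes the doubling for movies or games from the text. The list of setup choices passed in is hypothetical, since the amounts offered differ from one BIOS to another.

```c
#include <assert.h>
#include <stddef.h>

/* Estimated AGP memory need in bytes: 4 bytes per displayed pixel,
   doubled when "heavy" (movies or video games) is nonzero. */
unsigned long agp_bytes_needed(unsigned long width, unsigned long height, int heavy)
{
    unsigned long bytes = width * height * 4UL;
    return heavy ? bytes * 2UL : bytes;
}

/* Picks the smallest setup choice (in meg) that matches or exceeds the need.
   choices_meg must be sorted in ascending order. If none is large enough,
   the largest choice is returned. */
unsigned long pick_agp_setting(const unsigned long *choices_meg, size_t count,
                               unsigned long need_bytes)
{
    size_t i;
    assert(count > 0);
    for (i = 0; i < count; i++) {
        if (choices_meg[i] * 1024UL * 1024UL >= need_bytes)
            return choices_meg[i];
    }
    return choices_meg[count - 1];
}
```

A 1024 by 768 display works out to 3145728 bytes, which is exactly 3 meg. If the setup screen offered 4, 8, 16, 32 and 64 meg, this sketch would pick 4 for ordinary use and 8 with the doubling for movies or games.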

General Software Configuration Hints

Here are some tricks that might really speed things up!

1. Do you do anything with heavy usage of temporary files? The Kodak Photo Enhancer software is one example. (I find ULead Photo Plus faster and in most ways better anyway.) If you have enough memory, you can make this get done in a RAM drive. Have a RAM drive set up (usually via a command in config.sys). I recommend a size of at least 30 percent of total memory, but you may not be able to use more than 32 meg. Then in the file autoexec.bat, have a line to make a directory in the RAM drive and have another line to set temp= (whatever that directory is). I recommend against using the root directory of the RAM drive because of a limit of the number of items in the root directory - have autoexec.bat create a subdirectory in the RAM drive and also set temp= that subdirectory.

2. A disk cache such as SMARTDRV can help with keeping your computer from being "stalled" while the hard drive is being written. Hard drives with built in memory buffers of size in the megabytes are also good for this.

3. Try not using expanded/upper memory managers, especially Microsoft's EMM386.EXE. If you need expanded or upper memory, try using Quarterdeck's QEMM386 instead of Microsoft's HIMEM.SYS and EMM386.EXE. Some things work faster if only non-Microsoft upper/expanded memory managers or no upper/expanded memory managers at all are loaded. The slowdown is typically about 6 percent (more often even less) in my experience, but I know a few things that get slowed down very radically (see BASIC tips below).

(This is less relevant to most machines running Windows 95 or higher.)

4. If you have "virtual memory", make it "permanent". With Windows 95 and higher versions, do this by specifying a specific amount of virtual memory, and make the lower and upper limits equal. Windows 3.1 has settings for "temporary" vs. "permanent" virtual memory. Also do not make the quantity of virtual memory excessive since some real memory is used to manage it. If you run into a situation of 6-of-one-halfdozen-of-another, what you probably need is more physical memory (RAM).
Virtual memory is disk space being used to emulate memory if you overflow your RAM. This involves a temporary disk file or sometimes (more usual for "permanent" virtual memory) a hidden disk file.
Even when optimized, virtual memory involves a very serious slowdown so if you see signs of it being used a lot (hard disk light on a lot with slow system speed) you probably want more RAM. If you can't escape heavy hard drive usage, then use a hard drive of at least 80 gig and at least 7200 RPM for faster data input and output to/from the hard drive.

5. Windows 95 and higher has settings that affect speed of using hard drives and CDROM drives!

Go to "My Computer". When there, go to "Control Panel". When there, go to "System". When there, go to "Device Manager". In "Device Manager" you see what resembles a directory tree, but this is your hardware (including some motherboard components). This will include your hard drives and CD drives. You may have to click a "+" or "-" to expand some general "drives" entry to show your hard drives. Right-click each one and select Properties and then Settings. One option should be DMA.
DMA should be checked when the drive is compatible, and in my memory this includes most hard drives 4.3 gig and larger and most CD drives 24x and faster. Maybe even somewhat below these figures! DMA on every hard drive and CD drive in your system is often necessary for most CD writers to work properly!
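
As a small illustration of the RAM drive sizing in hint 1, here is a C sketch. It takes the 30 percent figure and the 32 meg ceiling from that hint. The function name is hypothetical, and the ceiling may not apply to every RAM drive program.

```c
#include <assert.h>

/* Suggested RAM drive size in meg: 30 percent of total memory (rounded up),
   capped at 32 meg, the ceiling mentioned in hint 1. */
unsigned int ramdrive_meg(unsigned int total_meg)
{
    unsigned int size;
    assert(total_meg > 0);
    size = (total_meg * 30U + 99U) / 100U;   /* round up to a whole meg */
    return size > 32U ? 32U : size;
}
```

ramdrive_meg(64) gives 20, and anything from 128 meg of RAM upward gives the 32 meg ceiling.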

Making a Windows DOS prompt session more efficient!

Do you have some DOS programs that run slower in a Windows "DOS box" than when you have DOS but not Windows running? Good chance your DOS applications need extended (or XMS) memory and your Windows DOS session does not have enough.
In Windows 3.1, use PIF Editor (in the "Main" group) to edit dosprmpt.pif. Chances are only 1 meg of extended memory is provided. You can increase the limit - I suggest 8 meg. If you are confident that DOS extended memory hogs and Windows memory hogs won't hog memory simultaneously, then select -1 for the limit to specify no limit.
In Windows 95/98, find the icon for a DOS session if you have one (make an icon if you don't - make a copy and drag it onto the desktop with Windows Explorer). Right-click it and get Properties. Chances are it specifies adequate or unlimited available extended memory. Windows 95 DOS slowdowns on file manipulation are more likely to indicate a need for a disk cache such as Smartdrv - set it up in autoexec.bat by having a line say c:\windows\command\smartdrv.

Windows 3.1 hack to use over 64 meg memory

UPDATE 7/8/2000 - More serious hack! Use io.sys and msdos.sys and command.com from Windows 95 - this is sometimes known as DOS 7. This will run Windows 3.1. Copy all DOS utilities except edit from the \windows\command directory into your DOS directory since DOS 7 will not run most DOS utilities of other versions. You can use setver.exe to make DOS run off-version DOS utilities, but do a help setver.exe while you have not yet gotten rid of DOS 6.
DOS 7 will provide 127 meg of extended memory for Windows 3.1. I know most people would rather just use Windows 95/98/2000 but I know some 3.1 fans out there and I know a major East Coast institution still used Windows 3.1 into early 2001.

A couple Programming Tips, especially for math-intensive loops

1. Get any calculation involving constants out of the loop. For example,
you might have:

a=i*(b+3) in the loop.

If b is the same every time you go through the loop, then before the loop have a line as such:

b3=b+3 before the loop and
a=i*b3 in the loop.

Another example: If you have x squared more than once in the loop and x does not change during a pass through the loop, then have a line before the loop as such:

x2=x*x

and refer to x2 instead of squaring x all over again.

2. You should know that power functions are quite time consuming. How this is usually actually done is by taking a natural log, multiplying the log by the power, then taking the natural antilog. To make things work faster, instead of saying y=x^2 say y=x*x. You can do cubes faster by saying y=x*x*x than by saying y=x^3. And an SQR takes less time than raising something to the .5 power. You can square root something twice faster than raising something to the .25 power.

As for raising something to the fifth power? The fastest way is usually to do:

x2=x*x
y=x2*x2*x
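
The same tips carry over directly to C. The following is a minimal sketch with made-up function names: the first function computes the constant part of the calculation once, before the loop, and the others build powers from plain multiplications instead of calling a general power function.

```c
#include <assert.h>
#include <stddef.h>

/* Tip 1: b + 3 does not change inside the loop, so compute it once. */
void fill_scaled(double *a, size_t n, double b)
{
    double b3 = b + 3.0;            /* taken out of the loop */
    size_t i;
    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        a[i] = (double)i * b3;      /* instead of i * (b + 3) on every pass */
}

/* Tip 2: squares and cubes by plain multiplication. */
double square(double x) { return x * x; }
double cube(double x)   { return x * x * x; }

/* Fifth power with three multiplications: square, square again, times x. */
double fifth_power(double x)
{
    double x2 = x * x;
    return x2 * x2 * x;
}
```

For example, fifth_power(2.0) returns 32.0. An optimizing compiler may do some of this on its own, but writing it out costs nothing.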


A couple extra tips for Microsoft BASIC programs

There is a bug with the coprocessor-compatible floating point math in executables produced by most Microsoft BASIC compilers such as Quick Basic 4.5 and Basic 7.1 ("QBX"). The math processing takes 2-4 times as long as it should even when everything is going right. With EMM386.EXE loaded, the speed is 1/3 to 1/5 the already-slow value. With Windows 3.1 running, a slowdown almost as bad as that of EMM386.EXE occurs. Quarterdeck's QEMM386 is not as bad as EMM386.EXE, causing an additional slowdown of only a few percent.

There is a solution. Dan Barclay (dbarclay@ih2000.net) has published a patch. The patch nearly enough restores full speed of math coprocessor usage and eliminates about 99.5 percent of the Windows 3.1 and EMM386.EXE slowdowns of math coprocessor usage.

One way to use this patch in Quick Basic 4.5 if the program has only one module: Add a line at the beginning of the program:

call PatchINT3D

Then save the program and quit Quick BASIC and return to a DOS prompt. At the DOS prompt, type BC foo (where foo is the name of the BASIC file, without the .BAS extension).
When compiling is complete, link as follows:

link foo.obj mathptch.obj, foo.exe, nul.map, bcom45.lib

Or, for a smaller executable file size, do this:

link foo.obj mathptch.obj smallerr.obj, foo.exe, nul.map, bcom45.lib /e /noe

(the command line may wrap around as you type it - this is OK)

Here is a more proper way to use the patch:

At a DOS prompt, get into whatever directory the library files are. In Quick BASIC 4.5, this is usually the \qb45 directory. In BASIC Compiler 7 ("QBX"), this is usually \bc7\lib.

Then type lib and enter.
Lib will prompt for a name for a new library. Enter a name matching no other library files. Lib will then prompt for creating - answer Y.
Lib will prompt for operations. Enter +mathptch.obj. Lib will prompt for a list file. Enter nul.lst. After all this, you have a new lib file which must be converted to a qlb file by link.

For Quick Basic 4.5, do this, where "foo" refers to the library name without the lib extension:

link /q foo.lib, foo.qlb, nul.map, bqlb45.lib

and hit enter.

For Basic Compiler 7 ("QBX"), only the last library name is different:

link /q foo.lib, foo.qlb, nul.map, qbxqlb.lib

You now have the quick library. To use it, at a DOS prompt type qb /lfoo or qbx /lfoo and then enter. Load the BASIC program, and type into the beginning of the BASIC program the line:

call PatchINT3D

You can then alt-r-x for "make .EXE". This also improves the speed of Quick Basic running the uncompiled BASIC code, but it will be so much faster as an .EXE.

This patch also supposedly works in at least some versions of Visual Basic. There is hope that it may also work with some versions of Microsoft C, although I just tried a bit with Microsoft's Quick C 2.5 and that did not seem to need mathptch.obj to get optimized-looking speed.

As for where to get the patch? You can download a ZIP file (mathptch.zip) containing the patch (mathptch.obj), assembly source code and some usage documentation by its authors (read.me and mathptch.asm), and the above usage documentation by me (usage.txt) by clicking here.

Something else: BC7 has an option for faster executable code requiring at least a 286 processor. If you don't fix the math coprocessor slowdown, it is 10 to 40 percent less severe with 286 code than without. I am sure other things also run faster with 286 code, but I have yet to really investigate this. If you use mathptch.obj, 286 code makes little difference in math coprocessor speed (generally less than 5 percent) - at least on a Pentium II.

C Vs. BASIC and a couple Microsoft Quick C tips

I finally translated my personal benchmark program from BASIC to C. This thing uses lines from an actual engineering application I wrote and mostly does addition, subtraction, multiplication and division.

Results when compiled by Microsoft Quick C 2.5 - slightly faster than with Microsoft Quick BASIC 4.5 or BASIC Compiler 7 (about 12 percent less running time for my test program). And I did not need mathptch.obj. Those not using mathptch.obj will often find C much faster than BASIC, by a factor often at least 2 and sometimes in excess of 10. C is supposedly faster than BASIC generally but I have yet to investigate this in aspects not fixable by mathptch.obj.
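
The benchmark program itself is not shown here. As a rough idea of how such a test can be timed in C, here is a minimal sketch using the standard clock() function. The arithmetic in the loop is invented for illustration and is not taken from the engineering application mentioned above. The function names are hypothetical too.

```c
#include <assert.h>
#include <time.h>

/* A stand-in arithmetic loop: only add, subtract, multiply and divide.
   ASSUMPTION: this formula is made up; it is not the author's benchmark. */
double arith_loop(long passes)
{
    double acc = 1.0;
    long i;
    assert(passes >= 0);
    for (i = 1; i <= passes; i++) {
        double x = (double)i;
        acc = (acc + x) * 0.5 - x / (x + 3.0);
    }
    return acc;
}

/* Runs the loop and returns the processor time used, in seconds.
   The loop result is handed back through "result" (if not null) so that
   the compiler has a reason to keep the loop. */
double time_arith_loop(long passes, double *result)
{
    clock_t start = clock();
    double r = arith_loop(passes);
    clock_t stop = clock();
    if (result)
        *result = r;
    return (double)(stop - start) / (double)CLOCKS_PER_SEC;
}
```

Since clock() is coarse on many systems, it is better to pick a number of passes large enough that the run takes at least a few seconds, and to compare runs only on the same machine.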

Now for optimizations for Microsoft Quick C:

With Quick C running, do an alt-O for Options and then M for Make options. First, I use Release as opposed to Debug. Then there are the compiler options. I use the Medium memory model when it works which is probably close enough to 99 percent of the time. As for Release Optimization options - I used Full as opposed to On or Off.

If alt-O only gives two options including Full Menus, turn on Full Menus and try again.

Link options did not seem to do much.

Now for a bit of bugginess: The size of the name of the .EXE affects how fast my personal test program runs. With the memory model on Medium, it works faster if the name of the .EXE is two letters or less followed by .EXE and slower with longer names. The reverse occurred with Small and Compact memory model size. This may be obscure Microsoft bugs triggered by some unusual combination of aspects of the code including approaching a limit on stack utilization or string space or something like this - I have seen strange things of this sort occur before with Microsoft programming languages! Or I could be just about using all memory locations in the L1 cache and minor changes could be having random effects on L1 cache caching all looped code and data. Exe-file-name-size-dependent results were duplicated on an Intel 400 MHz P2 and an AMD 1.33 GHz K7 TB processor.
Generally, I would use the smallest memory model that works, except I have some experience that Small is sometimes better than Compact when both work. And if the executable is produced by a Microsoft compiler that runs under DOS, try renaming the .EXE to various name lengths plus the .EXE extension. 1, 2, or 6 letters followed by .EXE have a chance of being "magic" faster-running names but try all eight possibilities of name length!

A couple precision and floating point tricks!

Double precision vs. single precision - in my experience, double is not much slower. My guess is that the processes used are double precision compatible even when reading and writing single precision variables.

But there is a trick in Microsoft Quick C and probably most other C compilers for x86 types: In Microsoft products, at least Quick C 2.5, it is called _control87. This affects the precision of the math coprocessor in x86 machines having one (Includes 486DX and higher - the math coprocessor is an add-in option for lower machines.) There are three options for _control87 - 64 bit (default), 53 bit, and 24 bit.

53 bit will compromise double precision, and 24 bit will compromise single precision and make double precision largely useless. But 24 bit precision is good enough for most single precision applications.
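
What the lower settings give up can be illustrated without changing the coprocessor setting at all. The sketch below does not call _control87. It only shows, using an ordinary float (24 bit mantissa) and double (53 bit mantissa), that a whole number needing 25 bits does not survive in 24. The function name is hypothetical.

```c
/* Returns 1 if x survives being stored in a float (24 bit mantissa)
   without change, 0 if bits are lost. The volatile forces a real store,
   so extra precision held in a register cannot hide the rounding. */
int fits_in_24_bits(double x)
{
    volatile float f = (float)x;
    return (double)f == x;
}
```

fits_in_24_bits(16777216.0) is 1 but fits_in_24_bits(16777217.0) is 0, because 16777217 is 2 to the 24th power plus 1 and needs a 25th bit.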

As for speed using my personal test program, which does a lot of adding, subtracting, multiplying and dividing and some comparisons but not much else:

53 bit takes about 10 percent less time than 64 bit on an Intel 400 MHz PII, although only .5% less time than 64 bit on an AMD 1.33 GHz TB.
24 bit takes about 32 percent less time than 64 bit (I only tested on Intel 400 MHz PII so far).
64 bit takes about 7 percent more time using double precision variables than with single precision variables - I only tested on 400 MHz Intel PII so far.

HELP - My Internet speed is slow!

Usually the limiting factor is the modem. If you use a modem slower than 56K then upgrading from the slower modem will help more than anything else - especially if everything else in your computer is newer than about 1995. If you have a 56K modem, try the hints above but chances are high that you are disappointed mostly because a 56K modem only passes about 6K bytes per second (sometimes 10-12K per second or maybe more than that if the data is more compressible - typically text).

Consider getting a cable modem or DSL. I prefer a cable modem over DSL, although both are massive improvements over any regular phone line modem.

Beware that user modem technology (cable and DSL over phone lines especially) has improved more than technology elsewhere on the "Information Superhighway" so your actual download speed will often increase less than proportionately with the rated speed of your connection.

Note that modem speeds - phone, DSL and cable - are mostly rated in bits per second rather than bytes per second. With "stop bits", you usually need to receive 9 bits to receive a byte. Add a little more to this - anywhere from a few percent to sometimes as much as around 20 percent - for headers of each "packet" of data transmitted through the Internet.
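
This arithmetic can be written out as a short C sketch. It uses the rule of thumb of 9 bits per byte from above and treats packet headers as a percentage added on top. The function name and the exact way the overhead is applied are assumptions for illustration.

```c
#include <assert.h>

/* Approximate payload bytes per second for a line rated in bits per second.
   bits_per_byte:    bits sent per data byte (9 by the rule of thumb above)
   overhead_percent: extra allowance for packet headers, e.g. 5 to 20
   Note: with a 32 bit unsigned long, bits_per_s must stay below about
   42 million to avoid overflow in the multiplication. */
unsigned long payload_bytes_per_s(unsigned long bits_per_s,
                                  unsigned long bits_per_byte,
                                  unsigned long overhead_percent)
{
    assert(bits_per_byte > 0);
    return bits_per_s * 100UL / (bits_per_byte * (100UL + overhead_percent));
}
```

For a 56K modem, payload_bytes_per_s(56000, 9, 0) gives 6222, and with a 20 percent allowance for headers it gives 5185. Both are in line with the figure of about 6K bytes per second mentioned earlier.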

When viewing Java-heavy sites, it can help to use a faster, more modern computer (preferably 1.33 GHz or faster) with an operating system of Windows 98 or comparable.

Written by Don Klipstein.

Please read my Copyright and authorship info.
Please read my Disclaimer.