über mich

Startseite

Links

Programmieren
   - CodeOptimierung
   - CodeConventionen
   - GFA-Basic

historisches
Impressum

Quellcode Optimierungs-Tips

Auf dieser Seite sammle ich Tips, die ich zur Optimierungen für geeignet halte. Speziell solche, die kaum Mehraufwand für den Programmierer bedeuten und man sich nur zu benutzen angewöhnen muß. Aber auch andere, die nicht für den ständigen Einsatz gedacht sind.

Vorwort:

Im allgemeinen sollte man vor Codebasteleien zur Optimierung lieber erst über effizientere Algorithmen und geeignetere Datenstrukturen / Repräsentationen nachdenken.

Primitives Beispiel:
Lineare-Suche O(n) wird zur Binären-Suche O(log n), wenn man die Daten sortiert hält. Mit einer Hash-Funktion (wenn ohne Kollisionen) sogar O(1).

Zudem sollte man die 90%-10%-Regel beachten, die besagt, daß erfahrungsgemäß 90% der Laufzeit eines Programms in nur 10% des Quellcodes verbracht werden. Folglich sollte man nicht seine Energie/Zeit auf die Optimierung wenig benutzter Code-Abschnitte verschwenden. Man sollte seine Aufmerksamkeit primär den 10% widmen, die in 90% der Laufzeit ausgeführt werden.

Sollten sie nicht wissen, welche Stellen in ihrem Programm das sind, stoppen sie doch die Zeit für einzelne Abschnitte in ihrem Programm, oder verwenden sie ein Profiling-Tool.

Aus gründen der Verständlichkeit sollten sie immer eine unoptimierte Version ihres Codes behalten. Diese ist in der Regel für Fremde leichter verständlich für und kann auch als Referenzimplementierung bei Funktionstests und als Lieferant für Referenzwerte für Geschwindigkeitsvergleiche u.ä. verwendet werden.

Optimierungen durch Codebasteleien verschlechtern fast immer die Lesbarkeit / Wartbarkeit des Codes und sollten, wenn sie zu dieser Kategorie gehören, nur eingesetzt werden, wenn man nicht vor hat den Code in Zukunft häufig zu Verändern.

Klassifizierung der Tips:

Typ 0)	Optimierungen ohne / fast ohne Mehraufwand für den Programmierer. (gewöhnen sie sich diese Optimierungen an und Profitieren sie in Zukunft ohne weiteres Nachdenken) Sollte man immer nutzen, wenn die Lesbarkeit nicht deutlich darunter leidet.
Typ 1)	Optimierungen, die Extra-Aufwand für den Programmierer bedeuten. Sollten, wegen des Mehraufwandes, nur an kritischen Stellen eingesetzt werden.
Typ 2)	Sonstiges Tips

Index

Java-Spezifisches
Typ 0)	`if-else` statt `%`
Typ 0)	Wahrscheinliche Fälle zuerst
Typ 0)	`final` verwenden
Typ 0)	Mehrfachaufrufe vermeiden
Typ 0)	Mehrfachberechnungen vermeiden
Typ 0)	`&` statt `%`
Typ 0)	`>>` statt `/`
Typ 0)	`<<` statt `*`
Typ 0)	'`7 XOR x`' statt '`7-x`'
Typ 0)	Reihenfolge von Bedingungen durchdenken
Typ 1)	`if-else`-Baum statt `switch`
Typ 1)	Rekursionen vermeiden
Typ 2)	Spezielle Klassen schreiben
Typ 0)	`ArrayList` statt `Vector`
Typ 0)	`StringBuffer` statt `String`
Typ 0)	`arraycopy` verwenden

Wenn sie weitere Optimierungs-Tips kennen, schreiben sie sie mir ruhig, bei Gefallen füge ich diese dann zu dieser Seite hinzu.

Ich freue mich immer über sachliche Kommentare / Erfahrungsberichte zu einzelnen Tips. Schreiben sie mir also ruhig eine E-Mail an Folke_R@gmx.de [Betreff: "HOMEPAGE: programming/opti"].

Wenn Ich den Inhalt ihrer E-Mail auch für andere interessant halte, werde ich ihren Inhalt bei dem entsprechenden Tip erwähnen.

Typ: 0 if .. else statt %

Wenn man den den %-Operator verwendet, aber weiß, daß bei x=i%m gilt i>=0 und i<2*m-1, dann ist es besser auf den %-Operator zu verzichten.

Negativ-Beispiele:	Besser:
`while(...) { ... x=(x+1)%7; ... }//wend`	`while(...) { ... if(6>x) x++; else x=0; ... }//wend`

Grund:

Der Rechenaufwand des %-Operators ist offenbar höher, als der eines Vergleichs und einer Addition.

Bemerkung:

Für 0<=i und m eine Potenz von 2 ist der Tip
" statt %2, %4, %8, %16, ... lieber &1, &3, &7, &15, ..."
effektiver, als dieser.

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Client/Server VM (build 1.4.0_01-b03, mixed mode)"	77% faster

zurück zum Index [^]

Typ: 0 wahrscheinlichste Fälle zuerst

Wenn zwischen mehreren Fällen unterschieden werden muß, dann versuche die wahrscheinlichsten mit möglichst wenig Aufwand zu erkennen.

Für switch bedeutet das, daß oben die wahrscheinlichsten Fälle stehen sollten, da die switch-Anweisung vom Compiler in Code, der Ähnlichkeit mit einer if...else if...else if...else-Konstruktion hat, übersetzt wird (zumindest bei Java).

Negativ-Beispiele:

  int i=0;
  while(true)
  {
    if(1==(i/9))      // 10%
    {
      i=0;
    }
    else if(0==(i/9)) // 90%
    {
      ...
    }
    i++;
  }//wend

  int i=0;
  while(true)
  {
    switch(i/9)
    { case 1: //10%
       i=0;
       break;
      case 0: //90%
       ...
       break;
    }//end switch
    i++;
  }//wend

Besser:

  int i=0;
  while(true)
  {
    if(0==(i/9))      // 90%
    {
      ...
    }
    else if(1==(i/9)) // 10%
    {
      i=0;
    }
    i++;
  }//wend

  int i=0;
  while(true)
  {
    switch(i/9)
    { case 0: //90%
       ...
       break;
      case 1: //10%
       i=0;
       break;
    }//end switch
    i++;
  }//wend

Grund:

Die einzelnen Fälle werden von oben nach unten durchgetestet, bis ein Test positiv ist. Stehen die häufigen Fälle oben, dann werden im Durchschnitt weniger Test durchgeführt.

Weniger durchgeführte Tests bedeuten höhere Geschwindigkeit.

Bemerkung:

Wenn sie nicht wissen, wie häufig welche Fälle sind, lassen sie ihr Programm doch einfach mal zum Test mitzählen.

Wenn es sich um wirklich zeitkritischen Code handelt, lesen sie den Typ 1-Tip:
" if-else-Baum statt switch"

zurück zum Index [^]

Typ: 0	benutze `final` bei Variablen wennimmer möglich
Variablen oder Parameter sollten, wenn sie später nicht nochmal etwas zugewiesen werden als `final` deklariert werden. Vor allem Methoden-Parameter sind, zumindest bei mir, fast immer `final`.
Grund: Hinweis für den Compiler / die VM . Je mehr man diesen Programmen hilft desto höher die Chance, dass diese die Hinweise zur Optimierung nutzen können. bessere Lesbarkeit des Codes(Verständnishilfe) Selbstkontrolle des Programmierers
zurück zum Index [^]

Typ: 0	vermeiden sie unnötige Mehrfachaufrufe von Methoden
Oft werden Methoden unnötigerweise häufiger mit gleichen Parametern aufgerufen, obwohl klar ist, daß sie immer das gleiche Ergebnis liefern. Besonders häufige Beispiele für solche Methoden sind: `.length .length() .size()`
Negativ-Beispiel: `for(int i=0; i<a.length(); i++) { ... //Code der die Länge von a nicht verändert }//next i` (`a.length()` wird in jedem Schleifendurchlauf aufgerufen)
Besser a): `//runter- statt raufzählen for(int i=a.length()-1; 0<=i; i--) { ... //Code der die Länge von a nicht verändert }//next i` (`a.length()` wird nur einmal, bei der Initialisierung von `i`, aufgerufen)
Besser b): (wenn Runterzählen nicht in Frage kommt) `{//start block final int len=a.length() for(int i=0; len>i; i++) { ... //Code der die Länge von a nicht verändert }//next i }//end block` (`a.length()` wird nur einmal, bei der Initialisierung von `len`, aufgerufen)
Besser c): (Alternative zu b) ) `for(int len=a.length(), i=0; len>i; i++) { ... //Code der die Länge von a nicht verändert }//next i` (`a.length()` wird nur einmal, bei der Initialisierung von `len`, aufgerufen)
Grund: Unnötige Arbeit sollte man immer einsparen. Zudem weiß man nie, wie aufwändig die unnötig oft aufgerufene Methode ist.
Bemerkung: Bei `.length` für Arrays wissen wir, daß es relativ billig ist. Aber nicht so billig, wie Einsparen.
zurück zum Index [^]

Typ: 0	verhindere unnötige Mehrfachberechnungen
Negativ-Beispiel: `for(int y=0; 40>y; y++) { for(int x=0; 20>x; x++) { c[y40+x]=1; }//next x }//next y` (`y40` wird unnötigerweise mit gleichem y immer wieder berechnet)
Besser: `for(int y=0; 40>y; y++) { final int y40=y*40; for(int x=0; 20>x; x++) { c[y40+x]=1; }//next x }//next y`
Grund: Mehrfachberechnungen kosten unnötig Performance.
Bemerkung: 0.) Ich weiß, daß einige Compiler selber versuchen derartige Berechnungen aus Schleifen, zur Optimierung, auszulagern. Aber bei komplizierteren Code ist es Compilern meist nicht möglich zu erkennen, ob eine derartige Veränderung des Codes zulässig ist. 1.)Mir ist klar, daß man diesen Code weiter optimieren könnte. Was man z.B. noch machen könnte: `for(int yOffset=0; (40*40)>yOffset; yOffset+=40) { for(int i=yOffset+20-1; 0<=i; i--) { c[i]=1; }//next i }//next yOffset`
zurück zum Index [^]

Typ: 0 statt %2, %4, %8, %16, ... lieber &1, &3, &7, &15, ...

Bei m=2, 4, 8, 16 ... ist
x%m == x&(m-1)
für positive, ganzzahlige x.

Diese Optimierung ist so alt, wie das Programmieren selbst.

Diese Optimierung wird z.B von fast jedem C/C++ -Compiler bei unsigned Variablen automatisch gemacht. Da java aber keine rein positiven Variablentypen (außer char) kennt, darf der java-Compiler derartige Optimierungen nicht machen.

Da der Programmierer aber weiß, ob seine Zahlen positiv sind, oder er damit leben kann, daß er für negative Zahlen auch positive Reste erhält, kann er selbst diese Optimierung nutzen.

Negativ-Beispiele:	Besser:
`y=i%32; d%=64;`	`y=i&31; d&=63;`

Grund:

Der &-Operator ist eine reine Bitverknüpfung ohne Rechenaufwand, während der %-Operator auf quasi jedem Computer langsamer ist.

Bemerkung:

Das Verhalten bei negativen Zahlen ist mir persönlich bei der Variante mit & sogar lieber als das der %-Variante.

Bemerkung2:

Der Entscheidung der Java-Designer keine unsigned Typen in die Sprache aufzunehmen verdanken wir es, daß diese Optimierung nicht vom Compiler gemacht werden kann :-(

Bemerkung3:

Diese Optimierung bring mehr, als der Tip:
" if-else statt %"
Der ist dafür aber für beliebige m anwendbar.

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	41% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	92% faster

zurück zum Index [^]

Typ: 0 statt /2, /4, /8, /16, ... lieber >>1, >>2, >>3, >>4, ...

Teilen durch Potenzen von 2 (2, 4, 8, 16, 32 ...) bei positiven ganzen Zahlen sollte man durch Bitschieben (um 1, 2, 3, 4, 5, ...) ersetzen.

Diese Optimierung ist so alt, wie das Programmieren selbst.

Teilt man eine negative Zahl auf diese Weise, ist Das Ergebnis nicht genau gleich, da das Bitschieben bei negativen Zahlen immer aufrundet statt abzurunden.

Da der Programmierer aber weiß, ob seine Zahlen positiv sind, oder er damit leben kann, daß er für negative Zahlen aufgerundet wird, kann er selbst diese Optimierung nutzen.

Negativ-Beispiele:	Besser:
`i=j/16; h/=256;`	`i=j>>4; h>>=8;`

Grund:

Bitschiebereien sind auf allen mir bekannten Systemen deutlich (um ein vielfaches) schneller als Dividieren.

Bemerkung:

Der Entscheidung der Java-Designer keine unsigned Typen in die Sprache aufzunehmen verdanken wir es, daß diese Optimierung nicht vom Compiler gemacht werden kann :-(

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	92% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	84% faster

zurück zum Index [^]

Typ: 0 statt *2, *4, *8, *16, ... lieber <<1, <<2, <<3, <<4, ...

Multiplizieren mit Potenzen von 2 (2, 4, 8, 16, 32 ...) bei ganzen Zahlen sollte man durch Bitschieben (um 1, 2, 3, 4, 5, ...) ersetzen.

Diese Optimierung ist so alt, wie das Programmieren selbst.

Diese Optimierung wird z.B von fast jedem C/C++ -Compiler Ganzzahlvariablen automatisch gemacht.

Negativ-Beispiele:	Besser:
`i=j16; h=256;`	`i=j<<4; h<<=8;`

Grund:

Bitschiebereien sind auf allen mir bekannten Systemen schneller als Multiplizieren.

Bemerkung:

Auch, wenn man glaubt, daß der Compiler diese Optimierung macht, kann es nicht schaden sie selbst vorzunehmen.

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	17% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	13% slower

zurück zum Index [^]

Typ: 0 statt 3-x, 7-x, 15-x, ... lieber 3 XOR x, 7 XOR x, 15 XOR x, ...

Wenn sie von positiven Zahl n eine Zahl x abziehen wollen, n=2^k-1 und x<=n, dann ist das bitweise exklusiv-oder (XOR) gleich der Subtraktion.

Also nur für Werte von x, so daß das Ergebnis positiv bleibt!

Bemerkung1: Die Bitverknüpfung XOR wird in java mit ^ ausgedrückt.

Bemerkung2: Mir ist klar, daß diese Optimierung nur selten anwendbar ist.

Negativ-Beispiele:	Besser:
`pos=255-x1; x=7-x;`	`pos=255^x1; x=7^x;`

Grund:

Bit-Operationen sind, auf allen mir bekannten Systemen, schneller als Rechenoperationen. (eventuell aber nur sehr wenig)

Bemerkung:

n=2^k-1 sind Zahlen, die in der binären Darstellung nur aus Einsen bestehen. Die oben zur Optimierung angegebenen Subtraktionen (binäre Zahlendarstellung!), sind solche, bei denen beim Berechnen keine Überträge entstehen.

Wenn der Programmierer aufgrund der in seinem Programm vorkommenden Zahlen weiß, daß in der Subtraktion (binäre Zahlendarstellung!) keine Überträge vorkommen, kann auch in solchen Fällen XOR statt der Subtraktion verwenden. (Aber Vorsicht, sie sollten ganz sicher sein!)

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	32% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	8% faster

zurück zum Index [^]

Typ: 0	Reihenfolge von Bedingungen durchdenken
Bei Bedingungen, die aus mehreren Teilbedingungen bestehen sollte man, auf die Reihenfolge achten. Bedingungen werden von links nach rechts ausgewertet! Wenn man z.B. den Fall: `if((Bedingung1) && (Bedingung2))` Wenn beide Bedingungen ähnlich teuer sind, dann sollte man den, der am wahrscheinlichsten `false` liefert nach vorn stellen. (wenn Bedingung1 `false` liefert, wird Bedingung2 gar nicht ausgewertet.) Kriterien der Reihenfolge sollten sein: Aufwand beim Prüfen der Bedingungen bei und-Verknüpfung wahrscheinlicher fehlschlagende Bedingungen nach links bei oder-Verknüpfung wahrscheinlicher erfolgreiche Bedingungen nach links
Grund: So minimiert man den durchschnittlichen Aufwand bei der Auswertung der Gesamtbedingung.
zurück zum Index [^]

Typ: 1 If-else-Baum statt switch

Eine switch-Verzweigung wird vom Compiler in eine Art von hintereinander stehende if-Anweisungen übersetzt. (Da immer die selbe variable getestet wird, gibt es einen speziellere Bytecode als wenn man selbst lauter if-Anweisungen schreibt.)

Bei gleicher Wahrscheinlichkeit aller n Fälle, werden im Durchschnitt n/2 Vergleiche gemacht.

Die Idee diese Tips ist es die Fälle zu ordnen, und dann durch fortlaufende Intervallhalbierung den richtigen Fall zu finden.
Das reduziert die Anzahl der durchschnittlichen Vergleiche auf log₂n. Bildlich ergibt das einen binär-verzweigten Entscheidungsbaum.

Sollten die Fälle nicht alle gleich wahrscheinlich sein, und man die Wahrscheinlichkeiten der Einzelfälle kennen, kann man statt das Intervall in jedem Schritt zu halbieren auch eine andere Aufteilung wählen, so daß die Wahrscheinlichkeit der entstehenden beiden Intervalle annähernd gleich ist. (siehe: Huffman-Kodierung, Shannon-Fano-Kodierung [siehe WIKIPEDIA]).

Negativ-Beispiel: Besser:

  int j=-1;
  switch(i)
  { case  0:j=299;
     break;
    case  1:j=298;
     break;
    case  2:j=297;
     break;
    case  3:j=296;
     break;
    case  4:j=295;
     break;
    case  5:j=294;
     break;
    case  6:j=293;
     break;
    case  7:j=292;
     break;
  }//end-switch
  return j;

  int j=-1;
  if(4>i) // 0 .. 3
  { if(2>i) // 0,1
    { if(0==i)
        j=299;
      else if(1==i)
        j=298;
    }//endif
    else // 2,3
    { if(2==i)
        j=297;
      else if(3==i)
        j=296;
    }//end-else
  }//endif
  else // 4 .. 7
  { if(6>i) // 4,5
    { if(4==i)
        j=295;
      else if(5==i)
        j=294;
    }//endif
    else //6,7
    { if(6==i)
        j=293;
      else if(7==i)
        j=292;
    }//end-else
  }//end-else
  return j;

Grund:

Wie in der Beschreibung erklärt, sinkt die Anzahl der durchschnittlichen Vergleiche von O(n) auf O(log n).

Damit rentiert sich dieser Tip mit zunehmender Anzahl von Fällen immer mehr.

Bemerkung:

In einem Test für gleich wahrscheinliche Fälle hat sich dieser Tip bei mir schon bei 8 verschiedenen Fällen (manchmal sogar schon bei 4) gelohnt.

Bemerkung:

Im Beispiel kann man sogar bei den innersten else if nur else schreiben, da dort immer nur noch der eine Fall möglich ist. (weil alle Zahlen von 0 bis 7 vorkommen).

Gemessene Testwerte (16-Fällle)
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	17% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	46% faster

zurück zum Index [^]

Typ: 2 Rekursionen vermeiden

Viele Algorithmen werden rekursiv erklärt, können aber auch mit Schleifen implementiert werden. In solchen Fällen sollte man die nicht rekursive Variante bevorzugen, auch wenn der Programmcode meist komplizierter erscheint.

Viele Algorithmen basieren auf der Idee "divide and conquer" ("Teile und herrsche") [siehe WIKIPEDIA] und sind daher schon von ihrer Idee her rekursiv. Daher sind sie auch besonders elegant und leicht verständlich, wenn sie als rekursives Programm geschrieben werden. Rekursionen bedeuten aber besonderen Aufwand für den Computer. Deshalb gibt es für viele Algorithmen auch eine nicht-rekursive Variante, die zwar meist komplizierter aussieht, aber fast immer schneller und speichersparender ist.

Negativ-Beispiel:

  int fakultaet(final int n)
  {
    if(1>=n)
      return 1;
    return fakultaet(n-1)*n;
  }

Besser:

  int fakultaet(final int n)
  {
    int erg=1;
    for(int i=n; 1<i; i--)
      erg*=i;
    return erg;
  }

Grund:

Bei einer Rekursion (wenn der Compiler sie nicht selbst zu einer Schleife optimieren kann ("Optimierung von Endrekursion" [WIKIPEDIA])), Müssen, wie bei jedem Methoden-Aufruf alle lokalen Variablen und Prozessor-Register gesichert, ein Sprung zum Code der Methode durchgeführt, ...
pro Rekursionsschritt gemacht werden.

Dieser nicht unerhebliche Aufwand entfällt bei einer nicht-rekursiven Lösung, die dadurch, auch wenn sie komplexer aussieht, meist schneller ist und weniger Speicher benötigt.

Bemerkung:

Die Berechnung der Fakultät ist ein Beispiel, das sich besonders leicht in eine Schleifenvariante bringen läßt, da der rekursive Algorithmus endrekursiv ist [WIKIPEDIA].

Dieser spezielle Rekursionstyp wird von vielen Compilern erkannt und selbst zu einer Schleife optimiert (aber nicht von jedem Compiler :-( ). Komplizierte Rekursionen können aber nicht vom Compiler optimiert werden, da das Auflösen in Schleifen oft zusätzliche Verwaltungsstrukturen benötigt, die der Programmierer sich problemspezifisch überlegen muß.

Gemessene Testwerte
System / VM	Ergebnis
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	35% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	43% faster

zurück zum Index [^]

Typ: 2	Spezielle Klassen schreiben
Wenn es eine Klasse (z.B. in einer großen Bibliothek) gibt, die sie verwenden könnten, dann ist diese nicht unbedingt optimal für ihr Problem. Zum Beispiel Containerklassen sind in Bibliotheken meist so geschrieben, daß sie eine weite Spanne von Zugriffsarten/Manipulationen erlauben. Aber z.B. Einfügen/Löschen ist je nach Implementierung sehr verschieden wenn sie am Position 0, in der Mitte oder am Ende einer Datenstruktur gemacht werden. Deshalb gibt es auch in den Bibliotheken so viele verschiedene Containerklassen mit oft sehr ähnlicher, oder gleicher Pragrammierschnittstelle. Hat man aber einen sehr speziellen Anwendungsfall lohnt es sich oft selbst eine für diesen Fall optimierte Implementation zu schreiben.
Grund: Allgemeine Lösungen sind fast nie optimal für einen speziellen Fall, wie eine maßgeschneiderte Lösung.
zurück zum Index [^]

Java-Spezifisches

java Typ: 0 ArrayList statt Vector

Verwenden sie java.util.ArrayList statt java.util.Vector.

Der Unterschied besteht darin, daß in der Klasse Vector die Methoden synchronized sind, was zu geringerer Geschwindigkeit führt. In der Klasse ArrayList ist nicht synchronized.

Oft braucht man die Synchronisation jeder einzelnen Methode nicht. Wenn doch, kann man die Zugriffe auf ArrayList-Objekte ja selbst synchronisieren.

zurück zum Index [^]

java Typ: 0 StringBuffer statt String

Wenn sie eine Zeichenkette zusammenbauen wollen, dann vermeiden sie es den '+'-Operator für Strings zu verwenden. Dieser ist relativ langsam, da immer neue Objekte erzeugt und der Inhalt kopiert werden.

Vorzuziehen ist es die Zeichenkette in einem java.lang.StringBuffer-Objekt zusammenzubauen und, wenn er fertig ist in einen java.lang.String umzuwandeln. Das ist deutlich schneller.

zurück zum Index [^]

java Typ: 0	`arraycopy` verwenden
Wenn sie ein Array kopieren wollen, verwenden sie nicht eine selbstprogrammierte Schleife, sondern die `java.lang.System.arraycopy(...)` Methode.
zurück zum Index [^]

Hinweis zum Test-Java-Applet

Der Wert der Testergebnisse spiegeln nicht den Gewinn/Verlust der Optimierung wieder, sonder nur die Laufzeiten im Falle meines speziellen Testprogramms. Schleifen, Rechenoperationen u.ä. werden nicht rausgerechnet! Die Test sollen nur zeigen, das tatsächlich eine Optimierung stattfindet und eine Vorstellung verschaffen, ob sie eher gering oder erheblich ist.

Die Werte schwanken auch stark mit der verwendeten Java-VM und können bei ihnen im Browser von den Werten auf der Konsole abweichen.

Hier der Programmcode meiner Tests (testJavaSource.zip)

Sollten sie Verbesserungsvorschläge für die Tests haben, schicken sie mir ruhig den Programmcode eines besseren Tests.

zurück zur Programmier-Seite des Folkes

zurück zur Startseite des Folkes

Lage dieser Seite:	"http://www.Rinneberg.de/programming/opti.htm"
	"http://www.Folke-Rinneberg.de/programming/opti.htm"

Stand: 16.12.2007 (Inhaltlich: 21.08.2007) Brief an Folke:(Folke_R@gmx.de)

Link zu dieser Seite:

Sie haben eine Internetseite und finden meine Seite gut? Dann wäre es nett, wenn Sie einen Link zu meiner hinzufügen.
(währe schön wenn der Linktext 'Folke' 'Rinneberg' und das Thema der Seite beinhalten würden. z.B. 'Programmiertips von Folke Rinneberg')

Haben Sie eine Seite zum gleichen Thema? Schicken Sie mir eine E-Mail mit der URL (Betreff: 'HOMEPAGE: Link (hier Ihre URL einfügen)'). Vielleicht setze ich einen Link.

Benachrichtigung über Änderungen dieser Seite:

Wollen sie informiert werden, wenn sich der Inhalt dieser Seite ändert? Schicken Sie mir eine E-Mail mit dem Betreff: 'HOMEPAGE: informiere mich (www.rinneberg.de/programming/opti.htm)'.
(Ihre E-Mail-Adresse wird nicht weitergegeben und ich versende auch keinerlei Werbung)

Ende des deutschsprachigen Abschnitts

Start of the English translated part

about myself

Home

links

programming
   - CodeOptimisation
   - CodeConventions
   - GFA-basic

Atari

hardware-
modifications

art

history
imprint

Source Code Optimisation Tips

On this page I collect tips, which I think are useful for optimisation. Especially ones, which need no/few extra work when used by the programmer. There are also some other ones, which are not meant to be used wherever possible.

Preface:

In general you should think about better algorithms or data structure before you tweak the source code.

Primitive Example:
Linear-search O(n) becomes a binary-search O(log n), if you hold the data sorted. With a Hash-Function (if there are no collisions) even O(1).

You also should remember the 90%-10%-rule , which says that 90% of the runtime while executing a program is spend on only 10% of the program-code. Because of this you should spend your time/effort to optimise your program on the important 10%.

If you have no idea which parts of your code are these 10%, you can measure the time within your program while a test run, or use a profiling-tool.

Code optimisation often makes your code less easy to read/understand. therefore you should always have a unoptimised version as a reference implementation. This unoptimised version can be used to understand the code more easy and to compare the speed with the optimised one.

optimisation nearly always makes your code harder to read/understand. Therefore optimisations which decrease the readability a lot, should only be used on code which will be seldom or never be modified in the future.

Classification of the tips:

Type 0)	Using this optimisation needs no or nearly no extra effort for the programmer. (get into the habit to use this optimisations and get the reward in the future without any further thinking) You should always use this optimisations, if the code did not get less readable while doing so.
Type 1)	Using this optimisation means extra effort for the programmer. It should therefore only be used in critical sections.
Type 2)	other tips

Index

Java-Specific Tips
Type 0)	`if-else` instead of `%`
Type 0)	most likely first
Type 0)	use `final`
Type 0)	avoid multiple calls of methods
Type 0)	avoid multiple computations
Type 0)	`&` instead of `%`
Type 0)	`>>` instead of `/`
Type 0)	`<<` instead of `*`
Type 0)	'`7 XOR x`' instead of '`7-x`'
Type 0)	think about the order of conditions
Type 1)	`if-else`-tree instead of `switch`
Type 1)	avoid recursions
Type 2)	write specialised classes
Type 0)	`ArrayList` instead of `Vector`
Type 0)	`StringBuffer` instead of `String`
Type 0)	use `arraycopy`

If you know other optimisation tips, you may write me an e-mail, if I like the optimisation, I will add it to this page.

I like to hear your comments/about your experiences to each tip. Write me an e-mail to Folke_R@gmx.de [subject: "HOMEPAGE: programming/opti"].

If I think that the content of your mail is also of interest for the visitors of this page, I may add it to the tip on this page.

Typ: 0 if .. else instead of %

If you use the %-operator. For x=i%m under the condition that i>=0 and i<2*m-1 is true, you should avoid the %-operator.

negative-examples:	better:
`while(...) { ... x=(x+1)%7; ... }//wend`	`while(...) { ... if(6>x) x++; else x=0; ... }//wend`

Explanation:

The computational effort of the %-operators is higher than the comparison and addition.

Remark:

For 0<=i and m is a power of 2 the tip
" instead of %2, %4, %8, %16, ... better use &1, &3, &7, &15, ..."
is more effective than this tip.

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Client/Server VM (build 1.4.0_01-b03, mixed mode)"	77% faster

back to the index [^]

Typ: 0 most likely cases first

If a decision between multiple cases which are of different like lines, you should try to detect the most likely ones with as few effort as possible.

For the switch-statement this means, that the most likely ones should be placed on top, because the compiler translates the switch-statement to something like a if...else if...else if...else-construction (this is true at least for java).

negative examples:

  int i=0;
  while(true)
  {
    if(1==(i/9))      // 10%
    {
      i=0;
    }
    else if(0==(i/9)) // 90%
    {
      ...
    }
    i++;
  }//wend

  int i=0;
  while(true)
  {
    switch(i/9)
    { case 1: //10%
       i=0;
       break;
      case 0: //90%
       ...
       break;
    }//end switch
    i++;
  }//wend

better:

  int i=0;
  while(true)
  {
    if(0==(i/9))      // 90%
    {
      ...
    }
    else if(1==(i/9)) // 10%
    {
      i=0;
    }
    i++;
  }//wend

  int i=0;
  while(true)
  {
    switch(i/9)
    { case 0: //90%
       ...
       break;
      case 1: //10%
       i=0;
       break;
    }//end switch
    i++;
  }//wend

Explanation:

The different cases are tested, starting at the top, until one matches. If the most likely cases are the topmost ones, the average number of tests made decrease in comparison to any other order.

Less tests means after execution.

Remark:

If you have no idea which cases are the most likely ones, count each occurrence while a test run.

If your code is a speed critical part, you should read the Typ 1-Tip:
" if-else-tree instead of switch"

back to the index [^]

Typ: 0	use `final` for variables whenever possible
Variables or parameters should, if they are used like constants, be declared as `final`. Especially method parameters are, in my code, nearly always `final`.
Explanation: This is a hint for the compiler / the VM . The more hints you give to this programs, the more likely it is that they can use them for optimisations. Readability of code also increases A kind of self-control when for the programmer
back to the index [^]

Typ: 0	avoid multiple calls of methods, if possible
Often methods are called with same parameters more often than needed, because the programmer knows, that he always gets the same result. Often seen examples for such methods: `.length .length() .size()`
negative-example: `for(int i=0; i<a.length(); i++) { ... //Code which did not modify the length of a }//next i` (`a.length()` is called in every loop-run)
better a): `//count down instead of up for(int i=a.length()-1; 0<=i; i--) { ... //Code which did not modify the length of a }//next i` (`a.length()` is only called once, while initialising of `i`)
better b): (if count down is not possible) `{//start block final int len=a.length() for(int i=0; len>i; i++) { ... //Code which did not modify the length of a }//next i }//end block` (`a.length()` is called only once, while initialising of `len`)
better c): (alternative to b) ) `for(int len=a.length(), i=0; len>i; i++) { ... //Code which did not modify the length of a }//next i` (`a.length()` is called only once, while initialising of `len`)
Explanation: If you can help the computer to avoid work, make it. When you call methods not implemented yourself, you never know how much time they need. `length()` and `size()` are for most Objects quite cheap to be computed, but you can never relay on this.
Remark: We know that `.length` is cheap for arrays (in java). But no computation is as fast as no computation.
back to the index [^]

Typ: 0	avoid multiple computations
negative-example: `for(int y=0; 40>y; y++) { for(int x=0; 20>x; x++) { c[y40+x]=1; }//next x }//next y` (`y40` is computed over and over again)
better: `for(int y=0; 40>y; y++) { final int y40=y*40; for(int x=0; 20>x; x++) { c[y40+x]=1; }//next x }//next y`
Explanation: avoiding unnecessary computations improve speed.
Remark: 0.) I know that some compilers try to move computations like this out of loops. But when the code is more complex, it get harder / impossible to make this optimisations by the compiler. 1.) I know, that the example can be optimised even more. This is e.g. a more optimised version: `for(int yOffset=0; (40*40)>yOffset; yOffset+=40) { for(int i=yOffset+20-1; 0<=i; i--) { c[i]=1; }//next i }//next yOffset`
back to the index [^]

Typ: 0 instead of %2, %4, %8, %16, ... better use &1, &3, &7, &15, ...

When m=2, 4, 8, 16 ... than
x%m == x&(m-1)
is the same for positive integers x.

This optimisation is as old, as programming itself.

Par example nearly every C/C++ -compiler makes this optimisation for unsigned variables. Because java has no unsigned types (except of char), the java-compiler is not allowed to make this optimisation.

The programmer knows, if the values are always positive, or he can live with the different result for negative values, he is able to use this optimisation.

negative-examples:	better:
`y=i%32; d%=64;`	`y=i&31; d&=63;`

Explanation:

The &-operator is a simple bit-operation without any computation. The %-operator is slower on all CPUs (as far as I know).

Remark:

The result for negative values when you use & is even preferred by me (because it always returns positive values).

Remark2:

The decision of the Java-designer to have no unsigned variable types is the reason why the javacompilers are not allowed to make this optimisation :-(

Remark3:

This optimisation is more effective than the tip:
" if-else instead of %"
which can, in contrary to this tip, be used for every m.

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	41% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	92% faster

back to the index [^]

Typ: 0 instead of /2, /4, /8, /16, ... better use >>1, >>2, >>3, >>4, ...

Divide by powers of 2 (2, 4, 8, 16, 32 ...) for positive integers should be replaced by bit-shifting (by 1, 2, 3, 4, 5, ... bits).

This optimisation is as old, as programming itself.

The result of bit-shifting is not the same as if you use division for negative integers, because because shifting rounds always to the next bigger integer value. Division always rounds to the next smaller integer value. (the difference of the results may be 0 or 1 for negative values)

Par example nearly every C/C++ -compiler makes this optimisation for unsigned variables.

java has no unsigned types (except of char). Therefore no java compiler is allowed to make this optimisation.

The programmer knows if the values are always positive, he can make this optimisation himself (is the different result for negative values did not bother him, he can also use this tip if the values are negative).

negative-examples:	better:
`i=j/16; h/=256;`	`i=j>>4; h>>=8;`

Explanation:

Bit-Manipulations are faster than computations on all system I know. Division is quite expensive on most CPUs, therefore this optimisation speeds up a lot.

Remark:

The decision of the Java-designer to have no unsigned variable types is the reason why the javacompilers are not allowed to make this optimisation :-(

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	92% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	84% faster

back to the index [^]

Typ: 0 instead of *2, *4, *8, *16, ... better use <<1, <<2, <<3, <<4, ...

Multiply with a power of 2 (2, 4, 8, 16, 32 ...) can be replaced for integers by bit-shifting (by 1, 2, 3, 4, 5, ... bits).

This optimisation is as old, as programming itself.

Par example nearly every C/C++ -compiler makes this optimisation for unsigned variables.

negative-examples:	better:
`i=j16; h=256;`	`i=j<<4; h<<=8;`

Explanation:

Bit-manipulations are, on all Systems I know, faster than multiplying (on some systems only a little)

Remark:

Also If you think the compiler will make this optimisation, you can make it yourself (avoids relaying on the compiler optimisation).

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	17% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	13% slower

back to the index [^]

Typ: 0 instead of 3-x, 7-x, 15-x, ... better use 3 XOR x, 7 XOR x, 15 XOR x, ...

If you want to subtract from n the value x and n=2^k-1 and x<=n, than you can use bitwise exclusive-or (XOR) instead of the subtraction.

That means for Values of x, that lead to a positive result!

Remark1: The bitwise operation XOR is written in java as ^ .

Remark2: This optimisation is quite exotic, because it can only be used in a very special kind of cases.

negative-examples:	better:
`pos=255-x1; x=7-x;`	`pos=255^x1; x=7^x;`

Explanation:

Bit-operations are, on all systems known by me, faster than arithmetical operations. (probably only very little faster than subtraction)

Remark:

n=2^k-1 are values where the binary representation has only ones. The subtractions which can be optimised by this tip are therefore once, where no carry occur.

If the programmer knows, that the binary subtraction for his specific values lead to no carrys, he can use this optimisation also for other values than the ones described in this tip. (Be aware that the result is only correct if no carry occur in the binary subtraction!)

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	32% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	8% faster

back to the index [^]

Typ: 0	think about the order of conditions
If a condition is a combination of subconditions, you can often choose the order of the subconditions without changing the behaviour. Conditions are tested from left to right! If you have e.g. the case: `if((condition1) && (condition2))` If testing of each of the both conditions costs about the same, you should make the one which delivers more likely `false` should be placed as condition1. (if the condition1 delivers `false` , the condition2 is not tested at all.) Criteria for the order of sub-conditions should be: costs for the test of the sub-conditions when sub-conditions are connected by a and-operation, the sub-condition which is more likely false should be left when sub-conditions are connected by a or-operation, the sub-condition which is more likely true should be left
Explanation: This way you minimise the number of sub-condition to be tested.
back to the index [^]

Typ: 1 If-else-tree instead of switch

A switch-statement is replaced by the compiler by code, which looks like a row of if-conditions. (Because all condition refer to the same variable there is a special bytecode in java so the code differs form the code generated for a row of if-conditions.)

If all n cases a equal likely, in average n/2 test must be made.

The idea of this tip is to order all cases in a way that you can make a kind of binary search for the case that matches.
This way you reduce the number of test to make to log₂n. This means create a decision-tree.

If not all cases are equal likely, you may create a not balanced decision-tree to minimise the average number of needed comparisons. (see: Huffman-Code, Shannon-Fano-Code [see WIKIPEDIA]).

negative-example: better:

  int j=-1;
  switch(i)
  { case  0:j=299;
     break;
    case  1:j=298;
     break;
    case  2:j=297;
     break;
    case  3:j=296;
     break;
    case  4:j=295;
     break;
    case  5:j=294;
     break;
    case  6:j=293;
     break;
    case  7:j=292;
     break;
  }//end-switch
  return j;

  int j=-1;
  if(4>i) // 0 .. 3
  { if(2>i) // 0,1
    { if(0==i)
        j=299;
      else if(1==i)
        j=298;
    }//endif
    else // 2,3
    { if(2==i)
        j=297;
      else if(3==i)
        j=296;
    }//end-else
  }//endif
  else // 4 .. 7
  { if(6>i) // 4,5
    { if(4==i)
        j=295;
      else if(5==i)
        j=294;
    }//endif
    else //6,7
    { if(6==i)
        j=293;
      else if(7==i)
        j=292;
    }//end-else
  }//end-else
  return j;

Explanation:

Like mentioned in the description the average number of comparisons are reduced from O(n) to O(log n).

This means this optimisation get more effective the more cases you have.

Remark:

In a test, where all cases where equal likely, this tip improves the speed when at least 8 cases exists (for 4 cases nearly no speed-up).

Remark:

In the example shown above you could remove the innermost else if and replace it by else , because all values 0 ... 7 are a case.

Measured Values (16 cases)
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	17% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	46% faster

back to the index [^]

Typ: 2 avoid recursions

A lot algorithms are explained in a recursive way, but can also be implemented by using loops instead of recursion. If this is the case, you should prefer the non recursive version even if the code looks more complicated.

There are a lot of algorithms which are based on the idea of "divide and conquer" ("divide and rule") [see WIKIPEDIA] and are because of this idea recursive. This is the reason why they can be easy understood when described in a recursive way. But recursion is always a lot of work for the computer. Therefore there is often also non recursive version, which looks in most cases more complicated but are mostly faster and need less memory.

negative-example:

  int factorial(final int n)
  {
    if(1>=n)
      return 1;
    return factorial(n-1)*n;
  }

better:

  int factorial(final int n)
  {
    int result=1;
    for(int i=n; 1<i; i--)
      result*=i;
    return result;
  }

Explanation:

For each recursive call (if the compiler cannot itself optimises it to a loop ("optimisation of tail-recursion" [WIKIPEDIA])), all local variables must be stored before the call, also CPU-registers, the jump to the method, ... must be performed.

All this is not needed in an non-recursive implementation, which can therefore be faster and needs less memory, even if it looks more complicated.

Remark:

The computation of n! is an example which can be easily transformed to a loop because the algorithm is tail-recursion [WIKIPEDIA].

This special type of recursion can be detected and optimised to a loop by most compilers (but not every compiler :-( ). More complicated recursions can, in most cases, not be optimised by compilers because special data structures for each problem must be invented by the programmer to implement a loop based implementation.

Measured Values
System / VM	Result
PIII (800MHz) "Java HotSpot(TM) Server VM (build 1.4.0_01-b03, mixed mode)"	35% faster
PIII (800MHz) "Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)"	43% faster

back to the index [^]

Typ: 2	write specialised classes
If there is a class (e.g. part of a big code-library) which can fulfil your needs, it must not be the optimal choice for your special problem. Container classes are par example written to fulfil a wide variety of needs. Because the is no data structure which is optimal for all cases there are often different implementations with equal APIs in the libraries. Insert or delete have very different speeds for different data structures. If you only need delete/insert at special position you may be able to invent (or find) a implementation optimal for your need.
Explanation: General data structures are in most cases as optimal for your special needs as a specially for your case invented/implemented structure.
back to the index [^]

Java Specific Tips

java Typ: 0 ArrayList instead of Vector

Use java.util.ArrayList instead of java.util.Vector.

The difference is, that in the class Vector the methods are synchronized what slows them down. In class ArrayList are the methods not synchronized.

In a lot of cases you do not need the Synchronisation of each method separately. If you need synchronisation, you can synchronise ArrayList-objects of your own.

back to the index [^]

java Typ: 0 StringBuffer instead of String

If you want to create a String by concatenation, avoid to use the '+'-operator for Strings. This operator is quite slow because every time a new Object is created and the contend of the old String must be copied to it.

It is better to use a java.lang.StringBuffer-Object and when you have created the hole String by the .append(...)-methods, call the toString()-method at the end to get a java.lang.String. This way is a lot faster.

back to the index [^]

java Typ: 0	use `arraycopy`
If you want to copy an array, did not use a loop to copy each element. Use the `java.lang.System.arraycopy(...)`-method.
back to the index [^]

Remark to the Test-Java-Applet

The values Measured are not meant to be interpreted as gain/loss to be expected when you use the optimisation. It is only the time needed to perform the special tests. Time needed for loops and other operations not part of the optimisation is not subtracted! The Test should only show that there is in fact a speed improvement and if the improvement is little or big.

The values differ a lot depending on the used Java-VM and can also be different in the web browser in comparison to the execution on the console.

This is the source code of my tests (testJavaSource.zip)

If you have any advises how to improve the test, You can mail me the source code of the improved test.

back to Folke's programming page

back to Folke's homepage

Location of this page:	"http://www.Rinneberg.de/programming/opti.htm"
	"http://www.Folke-Rinneberg.de/programming/opti.htm"

last updated: 16.12.2007 (content: 21.08.2007) e-mail to Folke:(Folke_R@gmx.de)

Link this web site:

You have your own web site and like mine? It would be nice to add a link to mine.
(would be nice, if the link-text would contain the words 'Folke' 'Rinneberg' and something about the content of the page. e.g. 'Source Code optimisation Tips by Folke Rinneberg')

If you have similar content on your web page, write me an e-mail with the URL (subject: 'HOMEPAGE: link (insert your URL here)'). I possibly add a link to it on this site.

Be informed of content changes:

You want to be informed, if the content of this page has changed? Then write me an e-mail with the subject: 'HOMEPAGE: remind me (www.rinneberg.de/programming/opti.htm)'
(Your e-mail address will not be given to anyone else than me and I will also not send any ads)

Help to improve this web page:

Because I am no native English speaker, it is likely that there can be made a lot of improvements to my translation. I am very interested in every proposal how to improve the English part of this web page.