Friends, suppose you want to serialize some implementation of Collection. Which of the standard implementations (ArrayList, HashSet, etc.) would you choose in order to use the least space/network bandwidth?

I think it is easier to run a test. Logically it should be an ArrayList, but logic is not an entirely reliable adviser when it comes to performance.

[edit] I wrote something so silly that I ended up deleting it... lack of attention... [/edit]

Best collection to serialize

I ran the test with Java 5.0. Indeed, intuition is a very poor adviser.

a) You can use pretty much any of the collections interchangeably, except Vector, which has a large overhead (I don't know why).
b) Surprisingly, the best one for me was TreeSet.

import java.util.*;
import java.io.*;


class Funcionario implements Serializable, Comparable<Funcionario> {
    private String nome;
    private double salario;
    public Funcionario (final String pNome, final double pSalario) {
         nome = pNome; salario = pSalario; 
    }
    public int compareTo (Funcionario f) {
        return nome.compareTo (f.nome);
    }
    public int hashCode () {
        return nome.hashCode() * 37 + Double.valueOf(salario).hashCode();
    }
    public boolean equals (Object obj) {
        if (obj != null && obj.getClass() == Funcionario.class) { // null-safe: equals(null) must return false
            Funcionario f = (Funcionario) obj;
            return (f.nome.equals (nome) && Math.abs (f.salario - salario) < 1E-2);
        } else return false;
    }
}

class Projeto implements Serializable, Comparable<Projeto> {
    private String nome;
    private Funcionario chefe;
    private Funcionario[] subordinados;
    public Projeto (final String pNome, final Funcionario pChefe, final Funcionario[] pSubordinados) {
         nome = pNome; chefe = pChefe; subordinados = pSubordinados;
    }
    public int compareTo (Projeto p) {
        return nome.compareTo (p.nome);
    }
    public int hashCode () {
        int ret = nome.hashCode() * 37 + chefe.hashCode();
        for (Funcionario subordinado : subordinados) {
            ret = ret * 37 + subordinado.hashCode();
        }
        return ret;
    }
    public boolean equals (Object obj) {
        if (obj != null && obj.getClass() == Projeto.class) { // null-safe: equals(null) must return false
            Projeto p = (Projeto) obj;
            boolean ret = p.nome.equals (nome) 
                && p.chefe.equals (chefe);
            if (p.subordinados != null && subordinados != null && p.subordinados.length == subordinados.length) {
                for (int i = 0; i < subordinados.length; ++i) {
                    ret &= p.subordinados[i].equals (subordinados[i]);
                }
            } else if (p.subordinados != subordinados) {
                ret = false;
            }
            return ret;
        } else return false;
    }
}

class TesteTamanhoCollection {
    
    public static int tamCollection (Collection<?> col) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream (baos);
        oos.writeObject (col);
        oos.close();
        return baos.toByteArray().length;
    }
    
    public static void main(String[] args) throws IOException {
        Collection<Projeto> treeSet = new TreeSet<Projeto>();
        Collection<Projeto> hashSet = new HashSet<Projeto>();
        Collection<Projeto> linkedList = new LinkedList<Projeto>();
        Collection<Projeto> linkedHashSet = new LinkedHashSet<Projeto>();
        Collection<Projeto> arrayList = new ArrayList<Projeto>();
        Collection<Projeto> vector = new Vector<Projeto>();
        System.out.println ("Printing the empty collections");
        System.out.println ("treeSet =       " + tamCollection (treeSet));       // 46
        System.out.println ("hashSet =       " + tamCollection (hashSet));       // 53
        System.out.println ("linkedList =    " + tamCollection (linkedList));    // 48
        System.out.println ("linkedHashSet = " + tamCollection (linkedHashSet)); // 91
        System.out.println ("arrayList =     " + tamCollection (arrayList));     // 58
        System.out.println ("vector =        " + tamCollection (vector));        // 167
        //== Adding three projects
        Projeto p1 = new Projeto ("javac", new Funcionario ("Peter Ahe", 7000.00), new Funcionario [] {new Funcionario ("Gilad Bracha", 8000.00)});
        Projeto p2 = new Projeto ("netbeans", new Funcionario ("Tim Boudreau", 9000.00), new Funcionario [] {new Funcionario ("Roman Strobl", 8500.00), new Funcionario ("Geertjan Wielenga", 8200.00)});
        Projeto p3 = new Projeto ("sun", new Funcionario ("Jonathan Schwartz", 100000.00), new Funcionario [] {new Funcionario ("Rich Burridge", 50000.00)});
        treeSet.add (p1); treeSet.add (p2); treeSet.add (p3);
        hashSet.add (p1); hashSet.add (p2); hashSet.add (p3);
        linkedList.add (p1); linkedList.add (p2); linkedList.add (p3);
        linkedHashSet.add (p1); linkedHashSet.add (p2); linkedHashSet.add (p3);
        arrayList.add (p1); arrayList.add (p2); arrayList.add (p3);
        vector.add (p1); vector.add (p2); vector.add (p3);
        System.out.println ("Printing the collections with elements");
        System.out.println ("treeSet =       " + tamCollection (treeSet));       // 501
        System.out.println ("hashSet =       " + tamCollection (hashSet));       // 508
        System.out.println ("linkedList =    " + tamCollection (linkedList));    // 503
        System.out.println ("linkedHashSet = " + tamCollection (linkedHashSet)); // 546
        System.out.println ("arrayList =     " + tamCollection (arrayList));     // 513
        System.out.println ("vector =        " + tamCollection (vector));        // 619
    }
}
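
One way to read the numbers in the comments above is to subtract each empty size from the corresponding size with three projects. The sketch below does only that arithmetic. The class and method names (SizeDeltas, elementCost) are made up for illustration, and the figures are copied from the Java 5.0 run above, so they may differ on other JVMs.

```java
class SizeDeltas {

    // Bytes added by the elements: size with three projects minus empty size.
    static int elementCost(int emptySize, int filledSize) {
        return filledSize - emptySize;
    }

    public static void main(String[] args) {
        // Figures copied from the comments in TesteTamanhoCollection (Java 5.0 run).
        String[] names  = {"treeSet", "hashSet", "linkedList", "linkedHashSet", "arrayList", "vector"};
        int[]    empty  = {46, 53, 48, 91, 58, 167};
        int[]    filled = {501, 508, 503, 546, 513, 619};

        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i]
                + ": fixed overhead = " + empty[i]
                + ", cost of the three projects = " + elementCost(empty[i], filled[i]));
        }
    }
}
```

With these figures, five of the six collections show the same 455-byte cost for the three projects, so the ranking appears to come from the fixed per-collection overhead alone. Vector's delta is 452, which would be consistent with three of its pre-allocated null slots being replaced by element references.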

armando:

a) You can use pretty much any of the collections interchangeably, except Vector, which has a large overhead (I don't know why).

Just for the record... this happens because Vector is designed to be thread-safe, so all of its methods are synchronized... that is what causes this overhead...

Regards,

Armando

Right, Vector does have some overhead in serialization, but not because of synchronization.
Comparing the source of java.util.Vector and java.util.ArrayList, you can see that the former uses the default serialization support, while the latter defines its own writeObject and readObject routines. Since Vector allocates an array of at least 10 objects and serializes it as is, whereas ArrayList's writeObject and readObject always copy exactly the number of objects it holds, the difference shows up. I will retest with 10 objects to see what happens.
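
A rough way to check this explanation is to serialize empty lists created with different initial capacities. What follows is only a sketch: the class name CapacityVsSerializedSize and the helper serializedSize are invented here, it assumes Java 8 or later (for UncheckedIOException), and the exact byte counts depend on the JVM, so none are shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Vector;

class CapacityVsSerializedSize {

    // Serializes obj into memory and returns the number of bytes written.
    static int serializedSize(Object obj) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(baos);
            oos.writeObject(obj);
            oos.close(); // flushes any buffered data into baos
            return baos.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same contents (none), different initial capacities.
        System.out.println("Vector(10)      = " + serializedSize(new Vector<String>(10)));
        System.out.println("Vector(1000)    = " + serializedSize(new Vector<String>(1000)));
        System.out.println("ArrayList(10)   = " + serializedSize(new ArrayList<String>(10)));
        System.out.println("ArrayList(1000) = " + serializedSize(new ArrayList<String>(1000)));

        // Shrinking the backing array before serializing a Vector.
        Vector<String> v = new Vector<String>(1000);
        v.add("x");
        System.out.println("before trimToSize = " + serializedSize(v));
        v.trimToSize();
        System.out.println("after trimToSize  = " + serializedSize(v));
    }
}
```

If the explanation holds, the two Vector sizes should differ while the two ArrayList sizes match, and trimToSize() should shrink the serialized Vector.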
