How to get the matching content from Lucene

Folks, in my search I need to show not only the name of the file
that contains the queried text, but also the passage of text
where the query terms appear.

To be safe, I need to get the paragraph before the match and the
paragraph after it as well.

Can anyone tell me how to do this?

I'm using a simple example:

package lia.meetlucene;

import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Hits;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import java.io.File;
import java.util.Date;

/**
 * This code was originally written for
 * Erik's Lucene intro java.net article
 */
public class Buscar {

  public static void main(String[] args) throws Exception {
/*    if (args.length != 2) {
      throw new Exception("Usage: java " + Buscar.class.getName()
        + " <index dir> <query>");
    }
*/
    File indexDir = new File("c:/temp/indices");
    String q = "JOSE";

    if (!indexDir.exists() || !indexDir.isDirectory()) {
      throw new Exception(indexDir +
        " does not exist or is not a directory.");
    }

    search(indexDir, q);
  }

  public static void search(File indexDir, String q)
    throws Exception {
    Directory fsDir = FSDirectory.getDirectory(indexDir, false);
    IndexSearcher is = new IndexSearcher(fsDir);

    Query query = QueryParser.parse(q, "contents",
      new StandardAnalyzer());
      
    long start = new Date().getTime();
    Hits hits = is.search(query);
    long end = new Date().getTime();

    System.err.println("Found " + hits.length() +
      " document(s) (in " + (end - start) +
      " milliseconds) for query '" +
        q + "':");

    for (int i = 0; i < hits.length(); i++) {
      Document doc = hits.doc(i);
      System.out.println(doc.get("filename"));
      System.out.println(doc.get("contents"));
    }
  }
}
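
To make the "paragraph before and paragraph after" requirement concrete, here is a minimal sketch in plain Java of what could be done with the text that doc.get("contents") returns. It rests on several assumptions: the "contents" field is stored in the index (if it is only indexed, doc.get returns null and the text would have to be read again from the file named in "filename"), paragraphs are separated by blank lines, and a case-insensitive substring match is a good enough stand-in for what the analyzer matched. ParagraphContext and around are hypothetical names for illustration, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not a Lucene class): returns a hit's paragraph plus its neighbours. */
final class ParagraphContext {

  /**
   * Returns one snippet per paragraph that contains the term.
   * Each snippet is the previous paragraph, the matching paragraph and
   * the next paragraph, where they exist.
   * Assumption: paragraphs are separated by one or more blank lines.
   * Assumption: a case-insensitive substring test approximates the query match.
   */
  static List<String> around(String text, String term) {
    List<String> snippets = new ArrayList<>();
    if (text == null || term == null || term.isEmpty()) {
      return snippets;
    }

    // Split on a line break, optional spaces or tabs, another line break,
    // then swallow any further blank lines.
    String[] paragraphs = text.trim().split("\\r?\\n[ \\t]*\\r?\\n\\s*");
    String needle = term.toLowerCase(Locale.ROOT);

    for (int i = 0; i < paragraphs.length; i++) {
      if (!paragraphs[i].toLowerCase(Locale.ROOT).contains(needle)) {
        continue;
      }
      // Clamp the window at the start and end of the document.
      int from = Math.max(0, i - 1);
      int to = Math.min(paragraphs.length - 1, i + 1);

      StringBuilder snippet = new StringBuilder();
      for (int j = from; j <= to; j++) {
        if (j > from) {
          snippet.append("\n\n");
        }
        snippet.append(paragraphs[j]);
      }
      snippets.add(snippet.toString());
    }
    return snippets;
  }

  public static void main(String[] args) {
    // Stand-in for the value of doc.get("contents").
    String contents = "Intro paragraph.\n\nReport signed by JOSE.\n\n"
        + "Closing paragraph.\n\nUnrelated.";
    for (String snippet : around(contents, "jose")) {
      System.out.println(snippet);
      System.out.println("---");
    }
  }
}
```

Inside the loop over hits, this would be called as ParagraphContext.around(doc.get("contents"), q). Note that a substring test also matches longer words that merely contain the term, so a token-based comparison may be needed if that matters.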
