Tags

,

My Solr client talks with a proxy application which communicates with remote Solr Server to get data. 
In previous post, Solr: Use JSON(GSon) Streaming to Reduce Memory UsageI described the problem we faced, how to use JSON(GSon) Streaming, and also some other approaches to reduce memory usage. 

In post Solr: Use SAX Parser to Read XML Response to Reduce Memory Usage
I also described how to use SAX to parse response for better performance.

In this post, I will introduce how to use Stax Parser to parse XML response.
Implementation
The code to use Stax to read document one by one from http stream:
— Use Stax parser and Java Executors Future to wait all thread finished: all docs imported.

  private static ImportedResult handleXMLResponseViaStax(
      SolrQueryRequest request, InputStream in, int fetchSize)
      throws XMLStreamException {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader reader = null;
    
    ImportedResult importedResult = new ImportedResult();
    List<Future<Void>> futures = new ArrayList<Future<Void>>();
    
    try {
      reader = factory.createXMLStreamReader(in);
      int fetchedSize = 0;
      int numFound = -1, start = -1;
      
      while (reader.hasNext()) {
        int event = reader.next();
        switch (event) {
          case XMLStreamConstants.START_ELEMENT: {
            if ("result".equals(reader.getLocalName())) {
              numFound = Integer.valueOf(reader.getAttributeValue("",
                  "numFound"));
            } else if ("start".equals(reader.getLocalName())) {
              start = Integer.valueOf(reader.getAttributeValue("", "start"));
            } else if ("doc".equals(reader.getLocalName())) {
              ++fetchedSize;
              futures.add(readOneDoc(request, reader));
            }
            break;
          }
          default:
            break;
        }
        
      }
      importedResult.setFetched(fetchedSize);
      importedResult.setHasMore((fetchedSize + start) < numFound);
      importedResult.setImportedData((fetchedSize != 0));
      return importedResult;
    } finally {
      if (reader != null) {
        reader.close();
      }
    }
  }
  
  private static Future<Void> readOneDoc(SolrQueryRequest request,
      XMLStreamReader reader) throws XMLStreamException {
    String contentid = null, bindoc = null;
    OUTER: while (reader.hasNext()) {
      int event = reader.next();
      INNER: switch (event) {
        case XMLStreamConstants.START_ELEMENT: {
          if ("str".equals(reader.getLocalName())) {
            
            String fieldName = reader.getAttributeValue(0);
            if ("contentid".equals(fieldName)) {
              contentid = reader.getElementText();
            } else if ("bindoc".equals(fieldName)) {
              bindoc = reader.getElementText();
            }
          }
          break INNER;
        }
        case XMLStreamReader.END_ELEMENT: {
          if ("doc".equals(reader.getLocalName())) {
            break OUTER;
          }
        }
        default:
          break;
      }
    }
    return CVSyncDataImporter.getInstance().importData(request, contentid,
        bindoc);
  }
    

Resources
Parsing XML using DOM, SAX and StAX Parser in Java
Java SAX vs. StAX

via Blogger http://lifelongprogrammer.blogspot.com/2013/10/solr-use-stax-parser-to-reduce-memory-usage.html

Advertisements