Robert Capra
http://www.ils.unc.edu/~rcapra




Solr Update with PHP

14-Aug-2007

Solr is a Lucene-based search server. One of the primary ways to add new documents to a Solr index is to send an HTTP POST request, whose body is an XML file describing the new document, to http://yoursolrhost:yoursolrport/solr/update.
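
As a rough illustration of what such an XML file contains, the sketch below builds a minimal Solr add message in PHP. The helper name build_add_xml and the field names "id" and "name" are made up for this example; the fields you actually send must match the ones defined in your own Solr schema.

```php
<?php
// Hypothetical helper: build a Solr <add> message for a single document.
// $fields maps field names to values; the names must exist in your schema.
function build_add_xml(array $fields)
{
    $xml = "<add><doc>";
    foreach ($fields as $name => $value) {
        // Escape &, <, > and " so the message stays well-formed XML.
        $xml .= '<field name="' . htmlspecialchars((string) $name, ENT_COMPAT, 'UTF-8') . '">'
              . htmlspecialchars((string) $value, ENT_COMPAT, 'UTF-8')
              . '</field>';
    }
    return $xml . "</doc></add>";
}

// "id" and "name" are placeholder field names, not part of any real schema.
echo build_add_xml(array("id" => "doc-001", "name" => "Solr & PHP")), "\n";
```

The resulting string can be saved to a file such as foo.xml or POSTed directly as the request body.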

The Apache Solr distribution provides a shell script called post.sh that uses the Unix command-line version of curl to post XML files to the Solr server. For my application, I needed more flexibility and wanted to do the POSTs from PHP.

The Problem

Add a new document to Solr for indexing by POSTing the XML description file to the server using PHP.

The Solution

The following code (solr-post.php) illustrates how to send an XML update file to Solr using PHP. It is based on code from the SolPHP package written by Brian Lucas. I ran into a few problems adapting the sendUpdate function in the SolrUpdate class (version 0.200) of SolPHP to work in my environment. The main problem appears to have been that the POST request needed to specify a Content-type other than the default that PHP's curl uses for a string body (application/x-www-form-urlencoded). My adaptation is therefore very similar to the sendUpdate function, with a few changes.
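    <?php
        // file:   solr-post.php
        // usage:  php solr-post.php -f foo.xml
        //   where foo.xml contains instructions to add
        //   a document to the Solr index
        //   see:  http://lucene.apache.org/solr/tutorial.html#Indexing+Data
        $options = getopt("f:");
        if (!isset($options['f'])) {
           fwrite(STDERR, "usage: php solr-post.php -f foo.xml\n");
           exit(1);
        }
        $infile = $options['f'];

        // Replace the placeholders with your own Solr host, port, and path.
        $url = "http://yoursolrserver:yoursolrport/yoursolrhome/update";

        $post_string = file_get_contents($infile);
        if ($post_string === false) {
           fwrite(STDERR, "could not read $infile\n");
           exit(1);
        }

        // Solr expects XML; without this header, curl sends a string
        // body as application/x-www-form-urlencoded.
        $header = array("Content-type: text/xml; charset=utf-8");

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
        // Return the response as a string instead of printing it directly.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string);
        curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
        // Track the outgoing request headers so curl_getinfo() can report them.
        curl_setopt($ch, CURLINFO_HEADER_OUT, 1);

        $data = curl_exec($ch);

        if (curl_errno($ch)) {
           print "curl_error:" . curl_error($ch) . "\n";
        } else {
           print "curl exited okay\n";
           echo "Data returned...\n";
           echo "------------------------------------\n";
           echo $data;
           echo "------------------------------------\n";
        }
        // Close the handle on both the success and the error path.
        curl_close($ch);
    ?>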
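
Two things the script above leaves out are checking the HTTP status of the response and sending a commit. Solr does not normally make newly added documents searchable until it receives a commit (unless autocommit is configured on the server). The sketch below is one possible way to handle both, not part of SolPHP: the function name solr_post_xml and the file name solr-post-commit.php are invented for this example, and the URL is the same placeholder used in solr-post.php.

```php
<?php
// Hypothetical helper (not part of SolPHP): POST one XML message to Solr's
// update handler and return array(HTTP status code, response body).
function solr_post_xml($url, $xml)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-type: text/xml; charset=utf-8"));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException("curl error: " . curl_error($ch));
    }
    // The HTTP status tells us whether Solr accepted the message.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return array($status, $body);
}

// Only runs when a file name is given on the command line:
//   php solr-post-commit.php foo.xml
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    // Placeholder URL; substitute your own host, port, and path.
    $url = "http://yoursolrserver:yoursolrport/yoursolrhome/update";

    $add = file_get_contents($argv[1]);
    if ($add === false) {
        fwrite(STDERR, "could not read {$argv[1]}\n");
        exit(1);
    }

    // Send the add message first, then the commit.
    foreach (array($add, "<commit/>") as $message) {
        list($status, $body) = solr_post_xml($url, $message);
        echo "HTTP $status\n$body\n";
        if ($status != 200) {
            exit(1);   // stop before committing if the add was rejected
        }
    }
}
```

Keeping the POST in a function makes it easy to send several files and issue a single commit at the end, which is usually cheaper than committing after every document.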

Links/References

  1. SolPHP package and the SolrUpdate class (version 0.200), both written by Brian Lucas
  2. Apache mailing list SolrUpdate posting
  3. xml.com article with information about how to update documents for indexing in Solr
  4. SolrTomcat
  5. PHP curl, specifically the curl_setopt function
  6. Information about cURL, which provides a way to do HTTP POSTs from the command line




Prepared by r c a p r a 3 [at] u n c [dot] e d u
Last modified: August 14 2007 17:59:32
Copyright 2000-2007 by Robert C. Capra III