sheldond / gasohol

A Rails plugin that connects to a Google Search Appliance server and returns results in a nicer format than XML

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Introduction

Gasohol lets you query a Google Search Appliance and get results back in an easily traversable format.

Terms

Let’s get some nomenclature out of the way. It’s a little confusing when we talk about ‘query’ in a couple of different ways.

There are two parts to a Google GSA request:

  1. query (search terms)

  2. options (stuff like ‘site,’ client,‘ and ’num’)

And then there is the actual string that you see in your browser, the “query string,” which contains all of the & and = stuff.

Think of the query as the keywords to search for. However the query to the GSA can actually contain several parts besides just keywords. If you are using metadata then the query can contain several ‘inmeta:’ flags, for example. All of these combined with the keywords become one big long string, each part separated by a space:

pizza inmeta:category=food inmeta:pieSize:12..18

All of that is only the query (comes after ?q=). There are several additional options which the GSA requires, like ‘site’ and ‘client.’ These are the options. The query and all of the options are combined into the final query string and sent to the GSA.

A sample query string might look like:

?q=pizza+inmeta:category=food+inmeta:pieSize:12..18&site=default_collection&client=my_client&num=10

(Note that spaces in the query are turned into + signs.) This full query string is then appended to the URL you provided in the config options when you initialized Gasohol (see Search::new) and the request is made to the GSA. The results come back and are parsed and converted into a nicer format than XML.

Using Gasohol

To get gasohol ready, instantiate a new copy with Gasohol.new(config) where config is a hash of options so that we know how/where to access your GSA instance. This information is saved and used for every request after initializing your gasohol instance. For Google’s reference of what these options do, check out the Search Protocol Reference: code.google.com/apis/searchappliance/documentation/50/xml_reference.html

Required config options

url

the URL to the search results page of your GSA. ie: 127.0.0.1/search

site

the GSA can contain several collections, specify which one to use for this search

client

the GSA can contain several clients, specify which one to use for this search

Optional config options

filter

how to filter the results, defaults to ‘p’

output

the output format of the results, defaults to ‘xml_no_dtd’ (leave this setting alone for gasohol to work correctly)

getfields

which meta tag values to return in the results, defaults to ‘*’ (all meta tags)

num

the default number of results to return, defaults to 25

partialfields

another way to filter results by meta tag values

Example config hash:

config = {  :url => 'http://127.0.0.1',
            :site => 'default_collection',
            :client => 'my_client',
            :num => 25 }

Example usage

So the first thing we’ll need to do is create an instance of the search:

GOOGLE = Gasohol::Search.new(config)

For a simple search now you go:

@answer = GOOGLE.search('pizza')

@answer will now contain some info about the query, what params the GSA returned, etc (see Gasohol::ResultSet) @answer.results is an array of the results (see Gasohol::Result) @answer.featured returns an array of any featured results (appear as ‘sponsored links’ at the top of a regular Google.com search) (see Gasohol::Featured)

Next Steps

  • Check out Gasohol::DEFAULT_OPTIONS to see what other options you can pass in the config hash if you need to further refine those fields.

  • See the documentation for Gasohol::Search.search to see how you can have your own custom query term names and convert those into something that the GSA will understand.

License

(The MIT License)

Copyright © 2009 The Active Network

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

About

A Rails plugin that connects to a Google Search Appliance server and returns results in a nicer format than XML


Languages

Language:Ruby 100.0%