[Pdx-pm] Multisearch engines
Joshua Keroes
joshua at keroes.com
Wed May 2 15:13:52 PDT 2012
At my company we have databases, web servers, web services, and search
engines, dozens of each. Almost every resource is an island unto itself.
Few are linked. Some are redundant. We have so many that every few days I
end up showing someone where a particular resource lives.
A single search engine to rule them all would be a big win.
This problem has been solved before. It goes by several names: federated
search, multisearch, metasearch, and search aggregation. There's a nice
picture at http://en.wikipedia.org/wiki/Metasearch_engine .
A rough overview of the project could look like this:
Web frontend:
1. optionally run auto-completion while user is typing into the form
(that's a whole different topic)
2. send complete query to Metasearch API frontend
Metasearch API frontend:
1. validate query
2. normalize query (improve queries if possible)
3. send normalized query to every subsearch handler
4. optionally inform frontend about all subsearches (to initialize
progress bars, etc.)
5. normalize each subsearch response (add useful info or delete things
the user shouldn't see)
6. return the subsearch responses
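
To make the API frontend steps concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the function
names (validate, normalize, metasearch), the handler names, and the
convention that a handler is a callable returning a list of dicts whose
private keys start with an underscore. It fans the query out on a thread
pool and records a failure per handler rather than failing the whole search.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def validate(query):
    """Step 1: reject empty or whitespace-only queries."""
    if not query or not query.strip():
        raise ValueError("empty query")
    return query


def normalize(query):
    """Step 2: collapse runs of whitespace and lowercase the query."""
    return " ".join(query.split()).lower()


def clean_hit(hit, source):
    """Step 5: drop private keys (hypothetical '_' prefix) and tag the source."""
    public = {k: v for k, v in hit.items() if not k.startswith("_")}
    public["source"] = source
    return public


def metasearch(query, handlers):
    """Steps 3-6: send the query to every handler and collect the responses.

    handlers maps a resource name to a callable taking the normalized
    query and returning a list of dicts (an assumed interface).
    """
    q = normalize(validate(query))
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(handlers))) as pool:
        futures = {pool.submit(fn, q): name for name, fn in handlers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                hits = [clean_hit(h, name) for h in future.result()]
                results[name] = {"ok": True, "hits": hits}
            except Exception as exc:
                # One broken resource should not sink the whole search.
                results[name] = {"ok": False, "error": str(exc)}
    return results


# Two made-up handlers: one that works, one that always fails.
def wiki_handler(q):
    return [{"title": f"Wiki page about {q}", "_internal_id": 7}]


def broken_handler(q):
    raise RuntimeError("backend down")


if __name__ == "__main__":
    print(metasearch("  Perl   Search ", {"wiki": wiki_handler,
                                          "broken": broken_handler}))
```

Threads are a reasonable fit here because the handlers mostly wait on
I/O. A real frontend would also want per-handler timeouts, which this
sketch leaves out.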
Subsearch handlers:
1. optionally validate and normalize query (things specific to just this
resource)
2. search. Depending on the type of resource this can mean many things:
search a database, check an index, call a web service, query a web page
and scrub the results, etc.
3. normalize response
4. return response
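
The four subsearch handler steps map naturally onto a small base class.
The sketch below is in Python and is only one possible shape: the class
names (SubsearchHandler, InMemoryHandler), the method names, and the
normalized result fields (title, url, source) are all assumptions, and
the in-memory list stands in for a real database, index, or web service.

```python
from abc import ABC, abstractmethod


class SubsearchHandler(ABC):
    """One handler per resource; handle() runs steps 1 through 4 in order."""

    name = "unnamed"

    def normalize_query(self, query):
        """Step 1: resource-specific cleanup; the default just trims whitespace."""
        return query.strip()

    @abstractmethod
    def search(self, query):
        """Step 2: return raw results in whatever form the resource gives."""

    @abstractmethod
    def normalize_response(self, raw):
        """Step 3: convert raw results to a list of common-format dicts."""

    def handle(self, query):
        """Step 4: return the normalized response for this resource."""
        q = self.normalize_query(query)
        return self.normalize_response(self.search(q))


class InMemoryHandler(SubsearchHandler):
    """Toy resource: case-insensitive substring match over (title, url) pairs."""

    name = "docs"

    def __init__(self, records):
        self.records = records

    def search(self, query):
        needle = query.lower()
        return [r for r in self.records if needle in r[0].lower()]

    def normalize_response(self, raw):
        return [{"title": t, "url": u, "source": self.name} for t, u in raw]


if __name__ == "__main__":
    handler = InMemoryHandler([
        ("Perl search tips", "http://example.invalid/a"),
        ("Cooking", "http://example.invalid/b"),
    ])
    print(handler.handle("  perl "))
```

A database-backed or web-scraping handler would override the same two
abstract methods, so the API frontend never needs to know what kind of
resource it is talking to.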
Backend:
1. Run indexers
Is anyone familiar with projects in Perl-land (or outside the bubble) for
solving this? Failing that, do you know of any related projects, like
Lucy/Lucene, that I should check out and/or leverage?
Thanks,
Joshua