Hemlock API

Hemlock's API functions are called by sending them as messages directly to Hemlock, which is found in the root view under its application symbol, |Hemlock:seanl|. For example, to use APIDoSimpleSearch("Hello World"), you'd call:

GetRoot().|Hemlock:seanl|:APIDoSimpleSearch("Hello World");

APIDoFullSearch(searchentries,terms,options)

Opens Hemlock and performs a search.

searchentries
An array of entries of search engines selected from APISearchEngineCursor(). Only entries will do -- no entry aliases or clones.
terms
String of search terms separated by spaces.
options
A frame of additional options. Possible option slots:
    overwrite
    If non-nil, old search results are overwritten. If nil, new results are appended to the old search results. If not set, this slot defaults to non-nil.
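
As a rough illustration of the call described above, the following sketch picks one engine entry from APISearchEngineCursor() and appends a new search to the existing results. The engine name "Example Search" and the search terms are made up for the example, and the cursor is assumed to answer the standard Entry() and Next() soup cursor messages.

```newtonscript
// Sketch only: the engine name and search terms are hypothetical.
local hemlock := GetRoot().|Hemlock:seanl|;
local cursor := hemlock:APISearchEngineCursor();
local engines := [];
local entry := cursor:Entry();

// Walk the cursor and keep the real entries we want to search with.
while entry do
begin
    if StrEqual(entry.name, "Example Search") then
        AddArraySlot(engines, entry);
    entry := cursor:Next();
end;

// overwrite: nil appends to the old results instead of replacing them.
hemlock:APIDoFullSearch(engines, "newton messagepad", {overwrite: nil});
```

The entries are taken straight from the cursor rather than copied, since only real entries are accepted.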

APIDoSearch(searchentries,terms)

Opens Hemlock and performs a search, overwriting any old search results. Equivalent to:

APIDoFullSearch(searchentries,terms,{overwrite:'true});

APIDoSimpleSearch(terms)

Opens Hemlock and puts terms on the search line, but does not perform a search yet.

APISearchEngineCursor()

Returns a cursor of all the search engine entries. Entry frames appear ordered alphabetically by name, and may have the following slots:

name
Displayable name of the search engine.
method
"GET" or "POST"
action
The query URL.
update
The URL from which to fetch an updated version of this search engine frame.
updateCheckDays
(Unused) The number of days to wait before updating (default 30)
description
Search engine description
bannerImage
(Unused) Banner image.
bannerLink
Advertising banner URL
bannerText
(Special to Hemlock) Advertising banner text, if any. This is displayed in the advertisement position, and takes precedence over any other advertisement text.
resultListStart
String indicating the start of a list of results.
resultListEnd
String indicating the end of a list of results.
resultItemStart
String indicating the start of a result in a list.
resultItemEnd
String indicating the end of a result in a list.
bannerStart
String indicating the start of an advertising banner.
bannerEnd
String indicating the end of an advertising banner.
relevanceStart
String indicating the start of the relevance value for a result in a list.
relevanceEnd
String indicating the end of the relevance value for a result in a list.
skipLocal
(Unused) Usually "TRUE" or "FALSE", indicating if links from the same host as the web page URL should be ignored when parsing.
charset
(Unused) A string giving the expected character set, according to some list of Macintosh character encoding rules.
resultEncoding
(Unused) A string containing an integer that identifies the result encoding, as defined in some file Apple has.
resultTranslationEncoding
(Unused) A string containing an integer that identifies the encoding the results should be translated to (defined in the same Apple file as resultEncoding).
resultTranslationFont
(Unused) The preferred Macintosh font of the translated text
input
An array of frames that define the query parameters appended to the query URL. These frames may have the following slots:
    name
    The name part (before the "=").
    value
    The value part (after the "=").
    user
    If non-nil, the value is not used but is instead replaced by the user's search terms.
    mode
    Either "browser" or "results" (the default). If "browser", this query parameter is used only when the user will actually view the results in a web browser, rather than having Hemlock parse them.

How Hemlock Submits a Query

To submit a query, Hemlock begins by forming the URL. It starts with the action slot, which is a URL. It then appends a "?", followed by a string of items of the form name=value, determined from the input slot, according to the standard rules for issuing HTTP queries. Hemlock takes care to form only valid URLs. The actual search terms take the value position for any input frame whose user slot is non-nil.
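
The URL-forming steps above can be sketched roughly as follows. This is not Hemlock's actual code: BuildQueryURL is a hypothetical helper, the engine frame and its URL are invented for the example, and the sketch does none of the escaping Hemlock performs to keep the URL valid. It also assumes every input frame without a user slot has a value slot.

```newtonscript
// Hypothetical helper, not part of Hemlock's API.
// engine: a search engine entry frame; terms: the user's search terms;
// forBrowser: non-nil if the results will be viewed in a web browser.
local BuildQueryURL := func(engine, terms, forBrowser)
begin
    local url := engine.action & "?";
    local first := true;
    local browserOnly;
    foreach param in engine.input do
    begin
        // Parameters with mode "browser" are used only for browser viewing.
        browserOnly := param.mode and StrEqual(param.mode, "browser");
        if forBrowser or (not browserOnly) then
        begin
            if not first then url := url & "&";
            // A non-nil user slot means the search terms replace the value.
            url := url & param.name & "=" &
                   (if param.user then terms else param.value);
            first := nil;
        end;
    end;
    return url;
end;

// Made-up engine frame for illustration.
local engine := {
    action: "http://www.example.com/search",
    input: [
        {name: "q", user: true},
        {name: "max", value: "10"},
        {name: "style", value: "full", mode: "browser"}
    ]
};

// The browser-only "style" parameter is left out here, giving
// "http://www.example.com/search?q=newton&max=10".
call BuildQueryURL with (engine, "newton", nil);
```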

How Hemlock Parses

Hemlock uses slots in its search engine entries to determine how to properly parse the results of a query.

List starts and List ends.
If the List start is not specified, there's assumed to be only one list, whose start is the beginning of the file. If the List end is missing, it's assumed that there's only one list, whose end is the end of the file. Otherwise, it's perfectly fine for there to be more than one list in an HTML file.
Item starts and Item ends.
These tell Hemlock where list items are located inside lists. If the Item end is not specified, an item begins with an Item start and ends with the next Item start or with the List end. Hemlock looks for the next item, then grabs all the text inside the first <a ...>...</a> element it finds; this becomes the boldface text in Hemlock's results list, and the element's URL becomes the entry's link. Hemlock then continues grabbing text until the next item begins; this remaining text forms the discussion text (the non-boldface text in Hemlock's results list). Any text between the Item start and the first <a ...> tag forms the "prediscussion" text, which is not displayed in the list but is shown when the user asks to see all the text in the list item. In all cases, text surrounded by relevance delimiters (see below) is excluded from the grabbed text and instead forms the relevance bar, if any. If the Item start is missing, items are simply the <a ...>...</a> elements within a list, and there is no discussion text.
Relevance starts and Relevance ends
These delimiters define the section in an item which gives its relevance (an integer from 0 to 100). Hemlock extracts the first number it finds inside a relevance area.
Advertising banner starts and ends
These delimiters define the start and end of the advertising area.
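
The list and item rules above can be sketched for the simplest case as follows. This is only an illustration, not Hemlock's parser: SplitItems is a hypothetical helper, it handles only the first list on the page, and it assumes resultItemStart is set and resultItemEnd is not. It relies on the built-in StrPos, SubStr, and StrLen string functions.

```newtonscript
// Hypothetical helper, not part of Hemlock's API.
// page: the HTML text of a results page; engine: a search engine entry.
// Returns an array with the raw text of each item in the first list.
local SplitItems := func(page, engine)
begin
    local items := [];
    local listBegin := 0;
    local listEnd := StrLen(page);
    local found, itemBegin, nextBegin, pos;

    // No list start: the list starts at the beginning of the file.
    if engine.resultListStart then
    begin
        found := StrPos(page, engine.resultListStart, 0);
        if not found then return items;
        listBegin := found + StrLen(engine.resultListStart);
    end;

    // No list end: the list ends at the end of the file.
    if engine.resultListEnd then
    begin
        found := StrPos(page, engine.resultListEnd, listBegin);
        if found then listEnd := found;
    end;

    // Each item runs from an Item start to the next Item start,
    // or to the List end if no further Item start precedes it.
    itemBegin := StrPos(page, engine.resultItemStart, listBegin);
    while itemBegin and itemBegin < listEnd do
    begin
        pos := itemBegin + StrLen(engine.resultItemStart);
        nextBegin := StrPos(page, engine.resultItemStart, pos);
        if (not nextBegin) or nextBegin > listEnd then
            AddArraySlot(items, SubStr(page, pos, listEnd - pos))
        else
            AddArraySlot(items, SubStr(page, pos, nextBegin - pos));
        itemBegin := nextBegin;
    end;
    return items;
end;
```

Each returned string would still need the link, discussion, prediscussion, and relevance text pulled out of it as described above.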

Note that the one place where Sherlock is more powerful than Hemlock is in handling foreign encodings and fonts. Hemlock ignores a number of the encoding and translation slots listed above, and currently handles no special encodings specified by the plug-in. Hemlock currently displays text as translated by Newt's Cape using the default encoding the user has provided. However, Hemlock is set to display only in the Simple font (Geneva) at 9 point; this font does not provide Chinese or Japanese characters, sorry.

APIAddSearchEngine(srctext)

This function submits a .src file, stored as plain text in the string srctext, to Hemlock to add to its search engines. Nothing is returned.
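
A minimal sketch of such a call is shown below. The plug-in text is invented for the example and assumes the usual Sherlock plug-in layout of search, input, and interpret tags; the engine name, URL, and delimiters are placeholders, and whether Hemlock accepts a plug-in this small is not confirmed here.

```newtonscript
local hemlock := GetRoot().|Hemlock:seanl|;

// A made-up plug-in: the name, URL, and delimiters are placeholders.
local srctext :=
    "<search name=\"Example Search\" method=\"GET\"\n" &
    "   action=\"http://www.example.com/search\">\n" &
    "<input name=\"q\" user>\n" &
    "<interpret resultListStart=\"<ul>\" resultListEnd=\"</ul>\"\n" &
    "   resultItemStart=\"<li>\" resultItemEnd=\"</li>\">\n" &
    "</search>\n";

// Nothing is returned, so there is no result to check.
hemlock:APIAddSearchEngine(srctext);
```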

APIResultsCursor()

Returns a cursor of all the search results. Search results are of two forms: header entries, which display the search engine name, search terms, and advertising, and non-header entries, which display actual parsed query results. By default, Hemlock displays search results in exactly the order in which they appear in the soup, which is the order in which they were entered into it. Result frames have the following slots:

site
Name of the search engine. Only appears in header entries.
terms
Search terms. Only appears in header entries.
url
URL to jump to when clicking on the entry.
ad
Advertising text for search engine. Only appears in header entries.
adurl
URL to jump to when clicking on the advertisement. Only appears in header entries.
text
Text of the results. Only appears in non-header entries.
discussion
Further discussion of the results. Only appears in non-header entries.
prediscussion
Text that appeared before the contents of the text slot, but after the item start. Only appears in non-header entries.
relevance
Relevance of the results. Only appears in non-header entries.
otherurls
An array of frames of the form { text:"the text", url:"the url" } which contain the additional URLs and related hyperlink text found when parsing. This slot may be nil. Only appears in non-header entries.
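
As a small illustration of reading these frames, the following sketch walks the results cursor and prints one line per entry. It tells header entries apart by the presence of the site slot, which is an assumption drawn from the slot descriptions above, and it assumes the cursor answers the standard Entry() and Next() soup cursor messages.

```newtonscript
local hemlock := GetRoot().|Hemlock:seanl|;
local cursor := hemlock:APIResultsCursor();
local entry := cursor:Entry();

while entry do
begin
    if entry.site then
        // Header entry: names the engine and the terms searched for.
        Print("Results from " & entry.site & " for: " & entry.terms)
    else
        // Non-header entry: one parsed result and its link.
        Print(entry.text & " -> " & entry.url);
    entry := cursor:Next();
end;
```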