Hemlock API
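Hemlock's API functions are methods of the Hemlock application itself. To call one, look Hemlock up in the root view by its application symbol, |Hemlock:seanl|, and send it the message.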
For example, to use APIDoSimpleSearch("Hello World"), you'd call:
GetRoot().|Hemlock:seanl|:APIDoSimpleSearch("Hello World");
APIDoFullSearch(searchentries,terms,options)
Opens Hemlock and performs a search.
- searchentries
- An array of search engine entries obtained from
APISearchEngineCursor(). Only the entries themselves will do -- no entry aliases or clones.
- terms
- String of search terms separated by spaces.
- options
- A frame of additional options. Possible option slots:
- overwrite
- If non-nil, new results overwrite the old search results.
If nil, new results are appended to the old search results.
If not set, this slot defaults to non-nil.
APIDoSearch(searchentries,terms)
Opens Hemlock and performs a search. The equivalent of
APIDoFullSearch(searchentries,terms,{overwrite:'true});
APIDoSimpleSearch(terms)
Opens Hemlock and puts terms on the search line, but does not perform a search yet.
APISearchEngineCursor()
Returns a cursor of all the search engine entries. Entry frames appear ordered alphabetically
by name, and may have the following slots:
- name
- Displayable name of the search engine.
- method
- "GET" or "POST"
- action
- The Query URL
- update
- The URL to update the search frame
- updateCheckDays
- (Unused) The number of days to wait before updating (default 30)
- description
- Search engine description
- bannerImage
- (Unused) banner image.
- bannerLink
- Advertising banner URL
- bannerText
- (Special to Hemlock) Advertising banner text, if any. This is displayed in the advertisement position,
and takes precedence over any other advertisement text.
- resultListStart
- String indicating the start of a list of results.
- resultListEnd
- String indicating the end of a list of results.
- resultItemStart
- String indicating the start of a result in a list.
- resultItemEnd
- String indicating the end of a result in a list.
- bannerStart
- String indicating the start of an advertising banner.
- bannerEnd
- String indicating the end of an advertising banner.
- relevanceStart
- String indicating the start of the relevance value for a result in a list.
- relevanceEnd
- String indicating the end of the relevance value for a result in a list.
- skipLocal
- (Unused) Usually "TRUE" or "FALSE", indicating if links from the same host as the web page URL should be ignored when parsing.
- charset
- (Unused) A string giving the expected character set, according to some list of Macintosh character
encoding rules.
- resultEncoding
- (Unused) A string containing an integer defining the result encoding, as defined in some file Apple has
- resultTranslationEncoding
- (Unused) A string containing an integer defining the encoding the results should be translated to (also in "")
- resultTranslationFont
- (Unused) The preferred Macintosh font of the translated text
- input
- An array of frames which define the query parameters tacked onto the query URL. These frames may have the following slots:
- name
- The name part (before the "=").
- value
- The value part (after the "=").
- user
- If non-nil, the value is not used but instead is replaced by the user's search terms.
- mode
- Either "browser" or "results" (the default). If "browser", this query parameter is only used
if the user will actually see the results in a web browser, rather than having Hemlock parse them.
How Hemlock Submits a Query
To submit a query, Hemlock begins by forming the URL. It starts with the URL in the action slot, then appends a "?" followed by a string of items of the form name=value, built from the input slot according to the standard rules for issuing HTTP queries. Wherever an input entry's user slot is non-nil, the actual search terms take the place of the value. Hemlock takes care to form only valid URLs.
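The URL-forming step can be modelled outside the Newton. The following is a minimal Python sketch, not Hemlock's actual NewtonScript code. The function name build_query_url, the for_browser flag, the sample entry, and the use of urlencode for escaping are all assumptions made for illustration. The text above does not say whether "results" parameters are dropped when the results go to a browser, so this sketch keeps them.

```python
from urllib.parse import urlencode

def build_query_url(entry, terms, for_browser=False):
    """Form a query URL from an entry's action and input slots (illustrative)."""
    pairs = []
    for param in entry.get("input", []):
        mode = param.get("mode", "results")  # "results" is the documented default
        # "browser" parameters apply only when the user views results in a browser.
        if mode == "browser" and not for_browser:
            continue
        # When user is set, the search terms replace the stored value.
        value = terms if param.get("user") else param.get("value", "")
        pairs.append((param["name"], value))
    # Assumption: standard form encoding stands in for Hemlock's URL escaping.
    return entry["action"] + "?" + urlencode(pairs)

# Hypothetical entry; the URL and parameter names are made up.
entry = {
    "action": "http://search.example.com/find",
    "input": [
        {"name": "q", "user": True},
        {"name": "max", "value": "10"},
        {"name": "fmt", "value": "html", "mode": "browser"},
    ],
}

print(build_query_url(entry, "Hello World"))
print(build_query_url(entry, "Hello World", for_browser=True))
```

With this sample entry, the "fmt" parameter appears only in the second URL, because it is marked with mode "browser".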
How Hemlock Parses
Hemlock uses slots in its search engine entries to determine how to properly parse the results of a query.
- List starts and List ends.
- If the List start is not specified, there's assumed to be only one list, whose start is the beginning of the file. If the List end is missing, it's assumed that there's only one list, whose end is the end of the file. Otherwise, it's perfectly fine for there to be more than one list in an HTML file.
- Item starts and Item ends.
- These tell Hemlock where list items are located inside lists. If the Item end is not specified, an Item begins with an Item start and ends with the next Item start or with the list end.
Hemlock looks for the next Item, then grabs all the text inside the first <a ...>...</a> element it finds; this becomes the boldface text in Hemlock's results list, and the element's URL becomes the link for that entry. Hemlock then continues grabbing text until the next Item begins; this remaining text forms the discussion text (the non-boldface text in Hemlock's results list). Any text between the Item start and the first <a ...> tag forms the "prediscussion" text, which is not normally displayed but is shown when the user asks to see all the text in the list item. In every case, text surrounded by relevance delimiters (see below) is left out of the grabbed text and instead forms the relevance bar, if any. If the Item start is missing, items are simply the <a ...>...</a> elements within a list, and there is no discussion text.
- Relevance starts and Relevance ends
- These delimiters define the section in an item which gives its relevance (an integer from 0 to 100). Hemlock extracts the first number it finds inside a relevance area.
- Advertising banner starts and ends
- These delimiters define the start and end of the advertising area.
- If the entry uses the <bannerStart> and <bannerEnd> tags, Hemlock uses the alt attribute of the first <img> tag after the first <a> tag after the banner start as its advertising text. That <a> tag provides the link URL.
- If the entry uses the bannerImage and bannerLink attributes of the <search> tag instead, there is a problem: Apple unwisely forgot to include a text alternative to bannerImage. Hemlock therefore adds the following <search> attribute to the .src format: bannerText. Anything in bannerText is displayed as the advertisement text. bannerText takes precedence over <bannerStart> and <bannerEnd>.
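The list, item, and relevance rules above can be modelled roughly as follows. This is a simplified Python sketch, not Hemlock's parser. The regular expressions, helper names, delimiters, and sample page are assumptions for illustration. It does not handle a missing Item start, advertising banners, or additional links within an item.

```python
import re

# Simplified patterns; real-world HTML may need a proper parser.
ANCHOR = re.compile(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>', re.I | re.S)
TAG = re.compile(r"<[^>]+>")

def strip_tags(html):
    """Remove tags and collapse runs of whitespace."""
    return " ".join(TAG.sub(" ", html).split())

def split_lists(page, start=None, end=None):
    """Yield each list; a missing start or end means a single list."""
    pos = 0
    while True:
        if start:
            begin = page.find(start, pos)
            if begin < 0:
                return
            begin += len(start)
        else:
            begin = pos  # no start marker: the list begins at the top of the file
        stop = page.find(end, begin) if end else -1
        if stop < 0:
            stop = len(page)  # no end marker: the list runs to the end of the file
        yield page[begin:stop]
        pos = stop + len(end or "")
        if not start or pos >= len(page):
            return

def split_items(list_text, item_start, item_end=None):
    """Without an end marker, an item runs to the next item start or the list end."""
    chunks = list_text.split(item_start)[1:]  # drop text before the first item
    if item_end:
        chunks = [c.split(item_end, 1)[0] for c in chunks]
    return chunks

def parse_item(item_html, rel_start=None, rel_end=None):
    relevance = None
    if rel_start and rel_end:
        head, found, rest = item_html.partition(rel_start)
        if found:
            area, _, tail = rest.partition(rel_end)
            number = re.search(r"\d+", strip_tags(area))  # first number in the area
            relevance = int(number.group()) if number else None
            item_html = head + tail  # relevance text is kept out of the item text
    link = ANCHOR.search(item_html)
    if link is None:
        return None
    return {
        "url": link.group(1),
        "text": strip_tags(link.group(2)),
        "prediscussion": strip_tags(item_html[:link.start()]),
        "discussion": strip_tags(item_html[link.end():]),
        "relevance": relevance,
    }

def parse_results(page, entry):
    results = []
    for lst in split_lists(page, entry.get("resultListStart"), entry.get("resultListEnd")):
        for chunk in split_items(lst, entry["resultItemStart"], entry.get("resultItemEnd")):
            item = parse_item(chunk, entry.get("relevanceStart"), entry.get("relevanceEnd"))
            if item:
                results.append(item)
    return results

# Hypothetical delimiters and page content.
entry = {"resultListStart": "<!--list-->", "resultListEnd": "<!--/list-->",
         "resultItemStart": "<li>", "relevanceStart": "<i>", "relevanceEnd": "</i>"}
page = ('<html><!--list-->'
        '<li><b>1.</b> <a href="http://a.example/">First <b>hit</b></a> '
        '<i>Score: 87%</i> About the first page.'
        '<li><a href="http://b.example/">Second hit</a> About the second page.'
        '<!--/list--></html>')
results = parse_results(page, entry)
for r in results:
    print(r)
```

In the sample page, the first item has prediscussion text ("1.") and a relevance area; the second item has neither, so its relevance is left unset.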
Note that the one place where Sherlock is more powerful than Hemlock is in handling foreign encodings and fonts. Hemlock ignores the encoding and translation slots marked (Unused) above, and currently handles none of the special encodings a plug-in may specify. Hemlock displays text as translated by Newt's Cape using the default encoding the user has provided. However, Hemlock is set to display only in the Simple font (Geneva) at 9 point; this font does not provide Chinese or Japanese characters, sorry.
APIAddSearchEngine(srctext)
This function submits a .src file, stored as plain text in the string srctext, to Hemlock to add to its search engines. Nothing is returned.
APIResultsCursor()
Returns a cursor of all the search results. Search results come in two forms: header entries, which
display the search engine name, search terms, and advertising, and non-header entries, which display
actual parsed query results. By default, search results are displayed in Hemlock in exactly
the order in which they appear in the soup (the order in which they were entered into
the soup). Result frames have the following slots:
- site
- Name of the search engine. Only appears in header entries.
- terms
- Search terms. Only appears in header entries.
- url
- URL to jump to when clicking on the entry.
- ad
- Advertising text for search engine. Only appears in header entries.
- adurl
- URL to jump to when clicking on the advertisement. Only appears in header entries.
- text
- Text of the results. Only appears in non-header entries.
- discussion
- Further discussion of the results. Only appears in non-header entries.
- prediscussion
- Text that appeared prior to text, but after the item start. Only appears in non-header entries.
- relevance
- Relevance of the results. Only appears in non-header entries.
- otherurls
- An array of frames of the form { text:"the text", url:"the url" } which contain the additional URLs and related hyperlink text found when parsing. This slot may be nil. Only appears in non-header entries.
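To illustrate how header and non-header entries relate, here is a small Python model of walking the results in soup order. It is a sketch only. It assumes that each header entry precedes the results it describes, which the text above implies but does not state outright. The function name group_results and the sample frames are made up, with Python dictionaries standing in for NewtonScript frames.

```python
def group_results(frames):
    """Group result frames under the header entry that precedes them."""
    groups = []
    for frame in frames:
        if frame.get("site") is not None:  # only header entries carry a site slot
            groups.append({"header": frame, "results": []})
        elif groups:
            groups[-1]["results"].append(frame)
    return groups

# Hypothetical result frames, in soup order.
frames = [
    {"site": "Example Search", "terms": "hello world",
     "url": "http://search.example.com/"},
    {"text": "First hit", "url": "http://a.example/",
     "discussion": "About the first page.", "relevance": 87},
    {"text": "Second hit", "url": "http://b.example/",
     "discussion": "About the second page."},
    {"site": "Other Search", "terms": "hello world",
     "url": "http://other.example.com/"},
    {"text": "Third hit", "url": "http://c.example/"},
]

groups = group_results(frames)
for group in groups:
    print(group["header"]["site"], len(group["results"]))
```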