Parscan

Parscan is a generic search engine for Plexus, contributed by Oscar Nierstrasz at the University of Geneva. Parscan allows you to query any document accessible to your Plexus server on a paragraph-by-paragraph basis. Parscan understands HTML documents, plain text, and Refer bibliography files.

Querying HTML Documents

Suppose that http://<site>/<path> is the URL of an HTML document available at your site. Then you can query this document by simply using the URL:

http://<site>/parscan/<path>
That is, prefix the path with "/parscan". If <base> is <path> without its .html extension, <base> can be used in place of <path>.
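
The rewriting rule above can be sketched in a few lines of Python. This is only an illustration of the URL convention, not part of parscan itself; the function name parscan_url and the host example.org are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def parscan_url(document_url: str) -> str:
    """Return the parscan URL for a document by prefixing its path with /parscan.

    Hypothetical helper: it only rewrites the URL string and does not
    contact any server.
    """
    parts = urlsplit(document_url)
    return urlunsplit(
        (parts.scheme, parts.netloc, "/parscan" + parts.path,
         parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(parscan_url("http://example.org/docs/catalog.html"))
```

Called on http://example.org/docs/catalog.html, the helper returns http://example.org/parscan/docs/catalog.html.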

An example application is the searchable catalog of WWW resources. (The full URL is: http://cui_www.unige.ch/parscan/OSG/Cat/cat.html, but this is hidden in the server's local configuration file.)

If the file contents are list items rather than standalone HTML blocks, parscan can be instructed to bracket the results of the search with <DL> and </DL>, <OL> and </OL> or <UL> and </UL>. The URL to use is:

http://<site>/parscan/<flag>/<path>?<query>
where <flag> is one of: -dl, -ol or -ul. Adjacent blocks of text must still be separated by a blank line, however.
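
As a rough model of what the list flags do, the following Python sketch splits a document into blank-line-separated blocks, keeps the blocks that match a query, and brackets the hits with the chosen list tags. The matching rule (case-insensitive substring) and the names split_blocks and search are assumptions made for this illustration; parscan's actual matching may differ.

```python
import re

# Flag-to-tag mapping as described in the text above.
LIST_TAGS = {"-dl": "DL", "-ol": "OL", "-ul": "UL"}


def split_blocks(text):
    """Split text into blocks separated by one or more blank lines."""
    return [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]


def search(text, query, flag=None):
    """Return matching blocks, bracketed by list tags if a list flag is given.

    Assumption: a block matches when it contains the query as a
    case-insensitive substring.
    """
    hits = [b for b in split_blocks(text) if query.lower() in b.lower()]
    body = "\n\n".join(hits)
    if flag is None:
        return body
    tag = LIST_TAGS[flag]
    return f"<{tag}>\n{body}\n</{tag}>"


if __name__ == "__main__":
    doc = "<LI>Apple pie\n\n<LI>Banana bread\n\n<LI>Apple tart\n"
    print(search(doc, "apple", "-ul"))
```

Each list item is its own block because the items are separated by blank lines, which is why that requirement still applies when a list flag is used.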

Providing Your Own Cover Page

By default, parscan produces only a minimal title and introduction to a searchable document. You can produce your own cover page as follows:

If a header file <base>.hdr exists, parscan will print that instead of the default header. In addition, if <base>.query exists, it will be used whenever a non-empty query is given. (Normally <base>.hdr will be a cover page with introductory information, whereas <base>.query will only contain the title and main headline.)

Note that you must include the tag <ISINDEX> in the header of your file, or the search engine will not be activated.
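
The choice between the header files described above might be modelled as follows. This Python sketch is an interpretation of the rules in this section, not parscan's code: the function name choose_header is hypothetical, and falling back to <base>.hdr when a query is given but <base>.query is missing is an assumption.

```python
from pathlib import Path


def choose_header(base, query):
    """Return the header file to print, or None to use the default header.

    base  -- document path without the .html extension
    query -- the query string (empty when the cover page is requested)
    """
    hdr = Path(base + ".hdr")
    qry = Path(base + ".query")
    # <base>.query is preferred whenever a non-empty query is given.
    if query and qry.exists():
        return qry
    # Assumption: otherwise fall back to <base>.hdr if it exists.
    if hdr.exists():
        return hdr
    return None
```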

Querying Plain Text

The flag -pre can be used if the source document is a plain text file. This will cause special characters to be escaped and each paragraph to be surrounded by <PRE> and </PRE>.
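
The effect of -pre can be approximated with the Python sketch below. Exactly which characters parscan escapes is not stated here; the sketch assumes the usual HTML specials (&, < and >). The function name render_pre is made up for the example.

```python
import html
import re


def render_pre(text):
    """Escape each blank-line-separated paragraph and wrap it in <PRE>...</PRE>.

    Assumption: only &, < and > are escaped (html.escape with quote=False).
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip("\n")) if p.strip()]
    return "\n".join(
        f"<PRE>\n{html.escape(p, quote=False)}\n</PRE>" for p in paragraphs
    )


if __name__ == "__main__":
    print(render_pre("if a < b & c:\n  go\n\nsecond paragraph"))
```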

The flag -url will additionally cause parscan to search for URLs and ftp pointers and convert them into hypertext links. An example application is the Free Compilers List. In this case, flags are just concatenated, so the URL is:

http://cui_www.unige.ch/parscan/-pre-url/OSG/Langlist/free
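
A simplified picture of the -url conversion is sketched below in Python. The pattern used to recognise URLs is an assumption (http, ftp and gopher URLs only, with trailing punctuation left outside the link); how parscan recognises "ftp pointers" is not specified here, and the name linkify is hypothetical.

```python
import re

# Assumed pattern: a scheme, "://", then non-space characters, not ending
# in sentence punctuation.
URL_RE = re.compile(r'\b(?:http|ftp|gopher)://[^\s<>"]*[^\s<>".,;)]')


def linkify(text):
    """Wrap every recognised URL in an <A HREF="..."> anchor."""
    return URL_RE.sub(
        lambda m: f'<A HREF="{m.group(0)}">{m.group(0)}</A>', text
    )


if __name__ == "__main__":
    print(linkify("See ftp://ftp.example.org/pub/gcc.tar.Z."))
```

When combined with -pre, it would be natural to escape the text first and add the links afterwards, so that the generated anchor tags are not themselves escaped.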

Querying Refer Bibliography Files

Parscan can also be used to query a database of refer(1)-style bibliography entries. Use the URL:

http://<site>/parscan/-r/<path>
See, for example, Oscar's OO Bibliography Database.

The -a flag is used internally by parscan. When a bibliography entry contains an abstract (%X field), parscan automatically generates the URL http://<site>/parscan/-a/<path>?<label>, where <label> is the value of the entry's %L field.
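
To make the relationship between the %X and %L fields and the -a URL concrete, here is a small Python sketch. It is an illustration only: the parsing is deliberately minimal, the names refer_fields and abstract_url are hypothetical, and URL-quoting the label is an assumption.

```python
from urllib.parse import quote


def refer_fields(record):
    """Map each refer key letter (e.g. 'L', 'X') to its list of values."""
    fields = {}
    key = None
    for line in record.splitlines():
        if line.startswith("%%"):
            key = None  # lines such as "%% ftp: ..." are not fields
        elif line.startswith("%") and len(line) > 1:
            key = line[1]
            fields.setdefault(key, []).append(line[2:].strip())
        elif key and line.strip():
            # Continuation line: append to the most recent field value.
            fields[key][-1] += " " + line.strip()
    return fields


def abstract_url(site, path, record):
    """Return the -a URL for a record with an abstract, else None."""
    fields = refer_fields(record)
    if "X" not in fields or "L" not in fields:
        return None
    return f"http://{site}/parscan/-a/{path}?{quote(fields['L'][0])}"


if __name__ == "__main__":
    rec = "%A Jane Doe\n%T An Example Paper\n%L Doe93a\n%X A short abstract.\n"
    print(abstract_url("example.org", "bib/oo.bib", rec))
```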

If an ftp location is available for the article and is included in the refer record with a line of the form:

%% ftp: <site>:<file>
then parscan will generate the corresponding hypertext link. For example, see the list of OO papers available by ftp in the same OO bibliography database.
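
The mapping from an ftp line to a link target could look like the Python sketch below. The exact form of the link parscan emits is not documented here, so the ftp:// URL layout and the function name ftp_url are assumptions.

```python
import re

# Matches lines of the form "%% ftp: <site>:<file>".
FTP_LINE = re.compile(r"^%% ftp:\s*([^:\s]+):(\S+)\s*$")


def ftp_url(line):
    """Return an ftp:// URL for a "%% ftp: <site>:<file>" line, else None."""
    m = FTP_LINE.match(line)
    if not m:
        return None
    site, path = m.groups()
    return f"ftp://{site}/{path.lstrip('/')}"


if __name__ == "__main__":
    print(ftp_url("%% ftp: ftp.example.org:/pub/papers/doe93a.ps.Z"))
```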

________________________________________

OMN Oct 6, 1993