[Zope-CMF] Building a query using NOT

Flynt rhess@bic.ch
Mon, 21 May 2001 00:37:36 +0200


Ben Riga wrote:
> 
> I'm having some trouble building a query for a method I'm working on.  Any
> help, suggestions or pointers to reference or further reading material would
> be appreciated.
> 
> I'm trying to build a query on the catalog that includes items that are
> *not* in a particular 'Subject'.   In this case I'd like to get the
> documents in the catalog that would NOT be in the following query:
> 
> <dtml-in "portal_catalog.searchResults(
>     Subject = ['Analysts', 'Press']
>   , sort_order='reverse' , review_state='published')" size="10">
> 
> Thanks,
> Ben
>

Hi Ben

I was not very happy with the *Search Interface*, for three reasons:
first, there is a bug in the form; second, if you don't use the Date
metadata, the time criteria make no sense at all (I made them depend on
the modification time); and third, most irritating of all, if you leave
the subject field empty, you get no hits at all. I would expect that
leaving all fields empty returns all documents, as is the case when you
click the *Go* button with an empty field.

So I introduced a Python script at the top of the *search* method to
modify the query characteristics. When I read your posting, I thought
I'd try to add at least a very simple *NOT* functionality as well, which
might be of interest to you as a starting point for what you want to do.

However, the *NOT* I introduced had to cope with whitespace keywords and
even empty tuples in the subject field of documents. Whitespace keywords
arise in two ways: first, when you put nothing into the subject field
of, e.g., a news item (look at the subject entry of a news item in the
portal_catalog), and second, when users hit newlines or enter other
*invisible* characters while typing a category into the subject field
of their document. So even if you exclude documents in certain
categories from the search with *NOT*, they will show up nevertheless,
**because besides the category they were given, they belong to the
category *''* as well**, without you even being aware of it.
Furthermore, documents that were created automatically (e.g. the
default index_html in each member directory, or favourites) have *()*
as their subject entry in the catalog. All this made playing around and
testing remarkably interesting and time-consuming.
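
To illustrate the effect, here is a small stand-alone sketch (plain
Python, no Zope needed). It assumes the Subject index matches a document
as soon as it shares at least one keyword with the query, which is what
the behaviour described above suggests. The function name and the sample
documents are made up for the illustration.

```python
# Toy model of a keyword index: a document matches when it shares
# at least one keyword with the query. All data here is invented.
def matches(doc_subjects, query):
    return any(keyword in query for keyword in doc_subjects)

docs = {
    "clean_press": ("Press",),
    "sloppy_press": ("Press", ""),  # stray empty keyword from the form
    "auto_created": (),             # e.g. a default index_html
    "news": ("News",),
}

all_keywords = ["", "Analysts", "News", "Press"]

# "NOT Press" expressed as "any known keyword except Press"
query = [k for k in all_keywords if k != "Press"]
leaky = sorted(name for name, subj in docs.items() if matches(subj, query))

# Dropping '' from the query as well keeps the sloppy document out
strict = [k for k in query if k != ""]
tight = sorted(name for name, subj in docs.items() if matches(subj, strict))

print(leaky)  # the sloppy Press document still shows up
print(tight)  # only the News document is left
```

In this model the document with *()* as its subject never matches a
keyword list at all, whichever keywords the list contains.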

So, the *NOT* that I inserted between the *search_form* and the *search*
method excludes not only the categories listed in the subject field, but
also all documents that have *''* as a category in their subject field
in the catalog. I point this out to the user on my customized
*search_form*.


Here is the Python script (named *search_modifySubject*), which might
help you as a starting point for what you would like to do. I call it
just before the line <dtml-let results=portal_catalog> in the search
method:
----SNIP *search* ---------------------------------
<dtml-var standard_html_header>

<div class="Desktop">

<h1> Search Results </h1>

<dtml-call search_modifySubject>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

<dtml-let results=portal_catalog>

<p>Found <dtml-var expr="_.len(results)" thousands_commas>
items<dtml-if name="SearchableText"> matching
"&dtml-SearchableText;"</dtml-if>.</p>
----SNIP *and so on* --------------------------------


And here is the script itself. Essentially, I build a list of all the
existing categories via the *uniqueValuesFor* method of the catalog and
then simply remove the items that are given as tokens after the *NOT*
in the subject field of the *search_form*. You can easily extend it to
allow the *NOT* anywhere on the line, not just at the beginning as I do.
The script also changes the behaviour of a search with all fields
(including the subject field) left empty, so that it returns all
documents.

Small note: in line 6 I have to set REQUEST['Subject'] to '' and not to
*subject*, the list of all unique categories I had already built in
lines 3 and 4. Otherwise, I would not get the documents with () in the
subject field in the portal_catalog (hairy beast that this
portal_catalog is; it took me quite some time to find this out).

## Script (Python) "search_modifySubject"
##bind container=container
##bind context=context
##bind namespace=
##bind script=script
##bind subpath=traverse_subpath
##parameters=
##title=Modify Subject List for Search
##
if context.REQUEST.has_key('Subject'):
        subject = []  # will hold every category known to the catalog
        for item in context.portal_catalog.uniqueValuesFor( 'Subject' ):
                subject.append( item )
        if context.REQUEST.get( 'Subject', () ) == []:
                context.REQUEST.set( 'Subject', '' )  # empty field: return all documents
        elif context.REQUEST.get( 'Subject', () )[0] == 'NOT':
                # remove every token given after NOT from the full list
                for item in context.REQUEST.get( 'Subject', () ):
                        try:
                                subject.remove( item )
                        except ValueError:
                                pass
                # also exclude the empty category ''
                try:
                        subject.remove( '' )
                except ValueError:
                        pass
                context.REQUEST.set( 'Subject', subject )

---- end of script ------
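
For trying out the logic outside Zope, the same idea can be sketched as
a plain function. This is only an approximation of the script above, not
a replacement for it: the name *modify_subject* and its two parameters
are made up here, and the list of all subjects stands in for what
*uniqueValuesFor( 'Subject' )* would return.

```python
def modify_subject(requested, all_subjects):
    """Return the value to store back under 'Subject'.

    requested    -- list of tokens typed into the subject field
    all_subjects -- every keyword known to the catalog (assumed input)
    """
    if not requested:
        # Empty field: use '' rather than the full keyword list, so that
        # documents with () as subject are not lost (see the note above).
        return ""
    if requested[0] != "NOT":
        # Ordinary search: leave the tokens untouched.
        return requested
    # Exclude every token on the line (including 'NOT' itself, as the
    # script does) plus the empty category ''.
    excluded = set(requested)
    excluded.add("")
    return [s for s in all_subjects if s not in excluded]


known = ["", "Analysts", "News", "Press"]
print(modify_subject(["NOT", "Analysts", "Press"], known))
print(modify_subject([], known))
print(modify_subject(["Press"], known))
```

For the query in the original question, the tokens would be
NOT Analysts Press, and the resulting list is what ends up as the
Subject criterion of the catalog search.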


HTH
--Flynt--