[Zope-Checkins] CVS: Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer - CHANGELOG:1.1.2.1 LICENSE.TXT:1.1.2.1 MANIFEST.in:1.1.2.1 Makefile.pre.in:1.1.2.1 README:1.1.2.1 Setup:1.1.2.1 setup.py:1.1.2.1
Andreas Jung
andreas@digicool.com
Wed, 13 Feb 2002 11:26:15 -0500
Update of /cvs-repository/Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer
In directory cvs.zope.org:/tmp/cvs-serv30556/PyStemmer
Added Files:
Tag: ajung-textindexng-branch
CHANGELOG LICENSE.TXT MANIFEST.in Makefile.pre.in README Setup
setup.py
Log Message:
added PyStemmer
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/CHANGELOG ===
0.01 - 12-17-2001
- first public release
0.1 - 01-10-2002
- destructor frees all memory now
- change license from BSD to MIT license
- added internal caching
- improved cleanup
- updated README
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/LICENSE.TXT ===
The MIT License
Copyright (c) 2001 Andreas Jung
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/MANIFEST.in ===
recursive-include * *.txt *.sbl *.html *.c *.h
README
CHANGELOG
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/Makefile.pre.in ===
# Universal Unix Makefile for Python extensions
# =============================================
# Short Instructions
# ------------------
# 1. Build and install Python (1.5 or newer).
# 2. "make -f Makefile.pre.in boot"
# 3. "make"
# You should now have a shared library.
# Long Instructions
# -----------------
# Build *and install* the basic Python 1.5 distribution. See the
# Python README for instructions. (This version of Makefile.pre.in
# only withs with Python 1.5, alpha 3 or newer.)
# Create a file Setup.in for your extension. This file follows the
# format of the Modules/Setup.dist file; see the instructions there.
# For a simple module called "spam" on file "spammodule.c", it can
# contain a single line:
# spam spammodule.c
# You can build as many modules as you want in the same directory --
# just have a separate line for each of them in the Setup.in file.
# If you want to build your extension as a shared library, insert a
# line containing just the string
# *shared*
# at the top of your Setup.in file.
# Note that the build process copies Setup.in to Setup, and then works
# with Setup. It doesn't overwrite Setup when Setup.in is changed, so
# while you're in the process of debugging your Setup.in file, you may
# want to edit Setup instead, and copy it back to Setup.in later.
# (All this is done so you can distribute your extension easily and
# someone else can select the modules they actually want to build by
# commenting out lines in the Setup file, without editing the
# original. Editing Setup is also used to specify nonstandard
# locations for include or library files.)
# Copy this file (Misc/Makefile.pre.in) to the directory containing
# your extension.
# Run "make -f Makefile.pre.in boot". This creates Makefile
# (producing Makefile.pre and sedscript as intermediate files) and
# config.c, incorporating the values for sys.prefix, sys.exec_prefix
# and sys.version from the installed Python binary. For this to work,
# the python binary must be on your path. If this fails, try
# make -f Makefile.pre.in Makefile VERSION=1.5 installdir=<prefix>
# where <prefix> is the prefix used to install Python for installdir
# (and possibly similar for exec_installdir=<exec_prefix>).
# Note: "make boot" implies "make clobber" -- it assumes that when you
# bootstrap you may have changed platforms so it removes all previous
# output files.
# If you are building your extension as a shared library (your
# Setup.in file starts with *shared*), run "make" or "make sharedmods"
# to build the shared library files. If you are building a statically
# linked Python binary (the only solution of your platform doesn't
# support shared libraries, and sometimes handy if you want to
# distribute or install the resulting Python binary), run "make
# python".
# Note: Each time you edit Makefile.pre.in or Setup, you must run
# "make Makefile" before running "make".
# Hint: if you want to use VPATH, you can start in an empty
# subdirectory and say (e.g.):
# make -f ../Makefile.pre.in boot srcdir=.. VPATH=..
# === Bootstrap variables (edited through "make boot") ===
# The prefix used by "make inclinstall libainstall" of core python
installdir= /usr/local
# The exec_prefix used by the same
exec_installdir=$(installdir)
# Source directory and VPATH in case you want to use VPATH.
# (You will have to edit these two lines yourself -- there is no
# automatic support as the Makefile is not generated by
# config.status.)
srcdir= .
VPATH= .
# === Variables that you may want to customize (rarely) ===
# (Static) build target
TARGET= python
# Installed python binary (used only by boot target)
PYTHON= python
# Add more -I and -D options here
CFLAGS= $(OPT) -I$(INCLUDEPY) -I$(EXECINCLUDEPY) $(DEFS)
# These two variables can be set in Setup to merge extensions.
# See example[23].
BASELIB=
BASESETUP=
# === Variables set by makesetup ===
MODOBJS= _MODOBJS_
MODLIBS= _MODLIBS_
# === Definitions added by makesetup ===
# === Variables from configure (through sedscript) ===
VERSION= @VERSION@
CC= @CC@
LINKCC= @LINKCC@
SGI_ABI= @SGI_ABI@
OPT= @OPT@
LDFLAGS= @LDFLAGS@
LDLAST= @LDLAST@
DEFS= @DEFS@
LIBS= @LIBS@
LIBM= @LIBM@
LIBC= @LIBC@
RANLIB= @RANLIB@
MACHDEP= @MACHDEP@
SO= @SO@
LDSHARED= @LDSHARED@
CCSHARED= @CCSHARED@
LINKFORSHARED= @LINKFORSHARED@
CXX= @CXX@
# Install prefix for architecture-independent files
prefix= /usr/local
# Install prefix for architecture-dependent files
exec_prefix= $(prefix)
# Uncomment the following two lines for AIX
#LINKCC= $(LIBPL)/makexp_aix $(LIBPL)/python.exp "" $(LIBRARY); $(PURIFY) $(CC)
#LDSHARED= $(LIBPL)/ld_so_aix $(CC) -bI:$(LIBPL)/python.exp
# === Fixed definitions ===
# Shell used by make (some versions default to the login shell, which is bad)
SHELL= /bin/sh
# Expanded directories
BINDIR= $(exec_installdir)/bin
LIBDIR= $(exec_prefix)/lib
MANDIR= $(installdir)/man
INCLUDEDIR= $(installdir)/include
SCRIPTDIR= $(prefix)/lib
# Detailed destination directories
BINLIBDEST= $(LIBDIR)/python$(VERSION)
LIBDEST= $(SCRIPTDIR)/python$(VERSION)
INCLUDEPY= $(INCLUDEDIR)/python$(VERSION)
EXECINCLUDEPY= $(exec_installdir)/include/python$(VERSION)
LIBP= $(exec_installdir)/lib/python$(VERSION)
DESTSHARED= $(BINLIBDEST)/site-packages
LIBPL= $(LIBP)/config
PYTHONLIBS= $(LIBPL)/libpython$(VERSION).a
MAKESETUP= $(LIBPL)/makesetup
MAKEFILE= $(LIBPL)/Makefile
CONFIGC= $(LIBPL)/config.c
CONFIGCIN= $(LIBPL)/config.c.in
SETUP= $(LIBPL)/Setup.config $(LIBPL)/Setup.local $(LIBPL)/Setup
SYSLIBS= $(LIBM) $(LIBC)
ADDOBJS= $(LIBPL)/python.o config.o
# Portable install script (configure doesn't always guess right)
INSTALL= $(LIBPL)/install-sh -c
# Shared libraries must be installed with executable mode on some systems;
# rather than figuring out exactly which, we always give them executable mode.
# Also, making them read-only seems to be a good idea...
INSTALL_SHARED= ${INSTALL} -m 555
# === Fixed rules ===
# Default target. This builds shared libraries only
default: sharedmods
# Build everything
all: static sharedmods
# Build shared libraries from our extension modules
sharedmods: $(SHAREDMODS)
# Build a static Python binary containing our extension modules
static: $(TARGET)
$(TARGET): $(ADDOBJS) lib.a $(PYTHONLIBS) Makefile $(BASELIB)
$(LINKCC) $(LDFLAGS) $(LINKFORSHARED) \
$(ADDOBJS) lib.a $(PYTHONLIBS) \
$(LINKPATH) $(BASELIB) $(MODLIBS) $(LIBS) $(SYSLIBS) \
-o $(TARGET) $(LDLAST)
install: sharedmods
if test ! -d $(DESTSHARED) ; then \
mkdir $(DESTSHARED) ; else true ; fi
-for i in X $(SHAREDMODS); do \
if test $$i != X; \
then $(INSTALL_SHARED) $$i $(DESTSHARED)/$$i; \
fi; \
done
# Build the library containing our extension modules
lib.a: $(MODOBJS)
-rm -f lib.a
ar cr lib.a $(MODOBJS)
-$(RANLIB) lib.a
# This runs makesetup *twice* to use the BASESETUP definition from Setup
config.c Makefile: Makefile.pre Setup $(BASESETUP) $(MAKESETUP)
$(MAKESETUP) \
-m Makefile.pre -c $(CONFIGCIN) Setup -n $(BASESETUP) $(SETUP)
$(MAKE) -f Makefile do-it-again
# Internal target to run makesetup for the second time
do-it-again:
$(MAKESETUP) \
-m Makefile.pre -c $(CONFIGCIN) Setup -n $(BASESETUP) $(SETUP)
# Make config.o from the config.c created by makesetup
config.o: config.c
$(CC) $(CFLAGS) -c config.c
# Setup is copied from Setup.in *only* if it doesn't yet exist
Setup:
cp $(srcdir)/Setup.in Setup
# Make the intermediate Makefile.pre from Makefile.pre.in
Makefile.pre: Makefile.pre.in sedscript
sed -f sedscript $(srcdir)/Makefile.pre.in >Makefile.pre
# Shortcuts to make the sed arguments on one line
P=prefix
E=exec_prefix
H=Generated automatically from Makefile.pre.in by sedscript.
L=LINKFORSHARED
# Make the sed script used to create Makefile.pre from Makefile.pre.in
sedscript: $(MAKEFILE)
sed -n \
-e '1s/.*/1i\\/p' \
-e '2s%.*%# $H%p' \
-e '/^VERSION=/s/^VERSION=[ ]*\(.*\)/s%@VERSION[@]%\1%/p' \
-e '/^CC=/s/^CC=[ ]*\(.*\)/s%@CC[@]%\1%/p' \
-e '/^CXX=/s/^CXX=[ ]*\(.*\)/s%@CXX[@]%\1%/p' \
-e '/^LINKCC=/s/^LINKCC=[ ]*\(.*\)/s%@LINKCC[@]%\1%/p' \
-e '/^OPT=/s/^OPT=[ ]*\(.*\)/s%@OPT[@]%\1%/p' \
-e '/^LDFLAGS=/s/^LDFLAGS=[ ]*\(.*\)/s%@LDFLAGS[@]%\1%/p' \
-e '/^LDLAST=/s/^LDLAST=[ ]*\(.*\)/s%@LDLAST[@]%\1%/p' \
-e '/^DEFS=/s/^DEFS=[ ]*\(.*\)/s%@DEFS[@]%\1%/p' \
-e '/^LIBS=/s/^LIBS=[ ]*\(.*\)/s%@LIBS[@]%\1%/p' \
-e '/^LIBM=/s/^LIBM=[ ]*\(.*\)/s%@LIBM[@]%\1%/p' \
-e '/^LIBC=/s/^LIBC=[ ]*\(.*\)/s%@LIBC[@]%\1%/p' \
-e '/^RANLIB=/s/^RANLIB=[ ]*\(.*\)/s%@RANLIB[@]%\1%/p' \
-e '/^MACHDEP=/s/^MACHDEP=[ ]*\(.*\)/s%@MACHDEP[@]%\1%/p' \
-e '/^SO=/s/^SO=[ ]*\(.*\)/s%@SO[@]%\1%/p' \
-e '/^LDSHARED=/s/^LDSHARED=[ ]*\(.*\)/s%@LDSHARED[@]%\1%/p' \
-e '/^CCSHARED=/s/^CCSHARED=[ ]*\(.*\)/s%@CCSHARED[@]%\1%/p' \
-e '/^SGI_ABI=/s/^SGI_ABI=[ ]*\(.*\)/s%@SGI_ABI[@]%\1%/p' \
-e '/^$L=/s/^$L=[ ]*\(.*\)/s%@$L[@]%\1%/p' \
-e '/^$P=/s/^$P=\(.*\)/s%^$P=.*%$P=\1%/p' \
-e '/^$E=/s/^$E=\(.*\)/s%^$E=.*%$E=\1%/p' \
$(MAKEFILE) >sedscript
echo "/^installdir=/s%=.*%= $(installdir)%" >>sedscript
echo "/^exec_installdir=/s%=.*%=$(exec_installdir)%" >>sedscript
echo "/^srcdir=/s%=.*%= $(srcdir)%" >>sedscript
echo "/^VPATH=/s%=.*%= $(VPATH)%" >>sedscript
echo "/^LINKPATH=/s%=.*%= $(LINKPATH)%" >>sedscript
echo "/^BASELIB=/s%=.*%= $(BASELIB)%" >>sedscript
echo "/^BASESETUP=/s%=.*%= $(BASESETUP)%" >>sedscript
# Bootstrap target
boot: clobber
VERSION=`$(PYTHON) -c "import sys; print sys.version[:3]"`; \
installdir=`$(PYTHON) -c "import sys; print sys.prefix"`; \
exec_installdir=`$(PYTHON) -c "import sys; print sys.exec_prefix"`; \
$(MAKE) -f $(srcdir)/Makefile.pre.in VPATH=$(VPATH) srcdir=$(srcdir) \
VERSION=$$VERSION \
installdir=$$installdir \
exec_installdir=$$exec_installdir \
Makefile
# Handy target to remove intermediate files and backups
clean:
-rm -f *.o *~
# Handy target to remove everything that is easily regenerated
clobber: clean
-rm -f *.a tags TAGS config.c Makefile.pre $(TARGET) sedscript
-rm -f *.so *.sl so_locations
# Handy target to remove everything you don't want to distribute
distclean: clobber
-rm -f Makefile Setup
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/README ===
PyStemmer
---------
What is PyStemmer ?
PyStemmer provides a unique interface to the SnowBall stemmers
(snowball.sourceforge.net). A stemming algorithm (or stemmer) is a process for
removing the commoner morphological and inflexional endings from words in
English. Its main use is as part of a term normalisation process that is
usually done when setting up Information Retrieval systems. A stemmer is
reduces a given word to its linguistic base form. Stemmesr are language
dependent and are often used in text indexing environments. Stemmers can be
used to make searches more precise. E.g. searching for 'cars' will also find
all documents that contain only 'car' because they share both the same
linguistic base form 'car'.
Snowball (http://snowball.sourceforge.net) is a small string processing
language designed for creating stemming algorithms for use in Information
Retrieval.
Requirements
Python 2.1 or higher (tested with 2.0 - 2.2)
Installation
via Distutils:
python setup.py [build|install]
via Makefile.pre.in:
make -f Makefile.pre.in boot
make
make install
API
import Stemmer
print Stemmer.availableStemmers() # returns a list of all supported languages
ST = Stemmer.Stemmer('german') # create a german Stemmer object
print ST.stem('blabla') # stem one word
print ST.stem(['wort1','wort2']) # stem a list of words
print ST.language() # returns the language of the stemmer object
ST.setCacheSize(10000) # cache up to 10000 stemmed words
print ST.getCacheSize() # return size of internal stemmer cache
License
All this software is covered by the MIT license with
(C) 2001, Andreas Jung (see LICENSE.TXT).
Snowball is published under BSD license (C) 2001, Dr. Martin Porter
Author
Andreas Jung (andreas@andreas-jung.com)
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/Setup ===
*shared*
Stemmer -DDEBUG -Iq -I.\
src/Stemmer.c \
french/frenchstem.c \
porter/porterstem.c \
german/germanstem.c \
dutch/dutchstem.c \
english/englishstem.c \
spanish/spanishstem.c \
italian/italianstem.c \
swedish/swedishstem.c \
portuguese/portuguesestem.c \
russian/russianstem.c \
danish/danishstem.c \
norwegian/norwegianstem.c \
q/api.c q/utilities.c
=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/setup.py ===
#!/usr/bin/env python
from distutils.core import setup,Extension
setup(name = "PyStemmer",
version = "0.10",
author = "Andreas Jung",
author_email = "andreas@andreas-jung.com",
ext_modules=[Extension("Stemmer",
[
"src/Stemmer.c",
"french/frenchstem.c",
"porter/porterstem.c",
"german/germanstem.c",
"dutch/dutchstem.c",
"english/englishstem.c",
"spanish/spanishstem.c",
"italian/italianstem.c",
"swedish/swedishstem.c",
"portuguese/portuguesestem.c",
"russian/russianstem.c",
"danish/danishstem.c",
"norwegian/norwegianstem.c",
"q/api.c",
"q/utilities.c"
],
include_dirs=['q','.'],
)]
)