FreeText Archive

Welcome! This is ^z's directory of the programs he has written for real-time high-bandwidth large-scale free-text information retrieval --- the (in)famous "FreeText Project". Here you will find all the source code and documentation that I currently have available for FreeText IR.

But first, a disclaimer: This directory contains free software under the GNU General Public License. These programs are (virtually) unsupported, and come with no warranty, express or implied, etc., etc. Don't use any of them to control nuclear reactors, aircraft, critical life support systems, or to do anything else dangerous. (Except for those most dangerous of activities, thinking and learning!)

On the positive side, if you send a friendly note to "z (at) his.com" then I'll try to reply to you with what semi-helpful comments I can come up with. I don't usually respond to mail until I've thought about it for several days, so be patient with me. And please don't expect too much --- I haven't worked on these programs for several years now. Worse news, I have no Macintosh hardware, and so probably cannot be of much help on the FreeText/Tex/Texas HyperCard stack front. Sorry!

This page was last modified Sunday, 01-May-2005 20:50:59 EDT.


Quick-Start Instructions

If you want to experiment with a 32-bit DOS version of FreeText, the following may help you begin the journey:
  1. Create a directory and put one or more text files in it which you wish to browse. (The works of Shakespeare, or the Bible, or your class notes, or whatever you prefer.)
  2. Prepare a text file with a name extension ".F" containing a list of the file(s) you wish to browse, one per line. Full paths are ok, or just the file names (with extensions) if you plan to browse from within this directory. Call the file list "FLIST.F" for purposes of these instructions.
  3. Copy the files ZINDEX.EXE, ZMERGE.EXE, and ZBROWSE.EXE into the directory with FLIST.F and the database file(s).
  4. Execute the command: ZINDEX FLIST.F
  5. Execute the command: ZMERGE FLIST.F
  6. The result should be a rewritten FLIST.F file (you may look at it, but do not edit it) and two binary index files named FLIST.K (keywords) and FLIST.P (pointers).
  7. Now execute the command: ZBROWSE word FLIST
    where "word" is any word. You should find yourself browsing a list of all the words in the database file(s), in alphabetical order, with a display showing the number of occurrences of each.
  8. Use the keyboard up- and down-arrow keys (or ^P and ^N) to move up and down in the word list, or type in any word and hit the return key to jump to that word.
  9. With the cursor on a word that interests you, hit the right-arrow key to drop into a key-word-in-context display of all occurrences of your chosen word with half a line of context on each side. You can scroll around in this KWIC display with the up- and down-arrow keys, as in the word list; you can return to the word list with the left-arrow key.
  10. With the cursor on a line of key-word-in-context that interests you, hit the right-arrow key to drop down into the full text of your database at that point. Scroll around in the text with the up- and down-arrow keys, and return to the KWIC or the index word list with the left-arrow.
  11. Type "/?" to get a help screen summarizing available commands. You can use the "subset" browsing feature to do fuzzy boolean proximity searching within a subset of the database, based on chosen words.
  12. Type "/Q" to quit.
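
As an illustration only, the word list of step 7 and the key-word-in-context display of step 9 can be imitated in a few lines of Python. This is a minimal sketch and not the FreeText code: it scans the text in memory instead of using the FLIST.K and FLIST.P index files, and the function names (tokenize, word_list, kwic), the "letters and digits" definition of a word, and the half_width parameter are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Return (word, start_offset) pairs; a word is a lowercased run of letters/digits.

    This definition of a word is an assumption, not FreeText's actual rule.
    """
    return [(m.group().lower(), m.start()) for m in re.finditer(r"[A-Za-z0-9]+", text)]


def word_list(text):
    """Alphabetical list of (word, occurrence_count) pairs."""
    counts = Counter(word for word, _ in tokenize(text))
    return sorted(counts.items())


def kwic(text, keyword, half_width=30):
    """One line per occurrence, with half_width characters of context on each side.

    The keyword always starts at column half_width, so occurrences line up.
    """
    flat = " ".join(text.split())  # collapse newlines so context stays on one line
    lines = []
    for word, start in tokenize(flat):
        if word == keyword.lower():
            left = flat[max(0, start - half_width):start]
            right = flat[start:start + half_width]
            lines.append(left.rjust(half_width) + right.ljust(half_width))
    return lines


if __name__ == "__main__":
    sample = "To be, or not to be: that is the question."
    for word, count in word_list(sample):
        print(count, word)
    for line in kwic(sample, "be", half_width=10):
        print(line)
```

A real index avoids rescanning the text on every query, which is presumably why ZINDEX and ZMERGE build the keyword and pointer files ahead of time; the sketch trades that speed for brevity.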
You can set the environment variable BRWSR_DBASE to a database name, for example via "SET BRWSR_DBASE=FLIST", and then that database will become the default which will be opened when you run ZBROWSE.
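
The file list from step 2 can be typed by hand, but for a directory with many texts it may be easier to generate it. The following Python sketch is one possible way to do that, under stated assumptions: the helper name write_file_list is hypothetical, the texts are assumed to end in ".txt", and the line-ending convention that ZINDEX expects is not stated on this page, so the sketch simply uses the platform default.

```python
from pathlib import Path


def write_file_list(directory, list_name="FLIST.F", pattern="*.txt"):
    """Write the names of matching files in `directory` to `list_name`, one per line.

    Only bare file names (with extensions) are written, which suits browsing
    from within the same directory. Returns the list of names written.
    """
    folder = Path(directory)
    names = sorted(
        p.name
        for p in folder.glob(pattern)
        if p.is_file() and p.name != list_name  # never list the list file itself
    )
    (folder / list_name).write_text("".join(name + "\n" for name in names))
    return names


if __name__ == "__main__":
    for name in write_file_list("."):
        print(name)
```

Run in the directory created in step 1, this should leave a FLIST.F ready for the ZINDEX and ZMERGE commands of steps 4 and 5.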

The PC executables were compiled and provided by an anonymous friend. I think that the *.EXE files were derived from the zndxr.c, zmrgr.c, and zbrwsr.c sources here, with the addition of code to provide a simple windowing interface that responds to cursor movement keys and other commands --- but I have no way to verify that hypothesis. So it is likely that I will be unable to help much in answering questions about these DOS programs. Nevertheless, they seem to work well in my experiments. I found these files on a Numerical Recipes CD-ROM, for which I thank Professor William Press, gentleman and physicist.


FreeText Archive Annotated Directory

In this archive, the files you will find are:

Final Remarks

Since the mid-1980s I have enjoyed struggling with and learning from the FreeText project, and would like to acknowledge the aid of numerous people who provided splendid questions, suggestions, contributions, and encouragement in this activity. Thank you all! And thank you to the thousands of people who have used FreeText for linguistic research, indexing books and CD-ROMs, searching literary archives, and other applications.

There are many areas in which FreeText needs further work. Probably the programs in this archive are best used as starting points, inspirations but not roadmaps for new programs in Java, Perl, Scheme, or other appropriate languages. Among the key development issues to watch, I would recommend focusing early and often on:

Please let me know if you use FreeText software and if it helps you in your work. You may write to me at "z (at) his.com". My paper mail address is:

Mark Zimmermann
P. O. Box 598
Kensington, MD 20895-0598
USA


Best,

^z = Mark Zimmermann

Silver Spring, Maryland, USA
updated November 1999 & February 2001 & May 2005