|
Moved the record handling into reset methods in order to have records encompass |
|
|
Introduced record-oriented reading and writing of files where an array is |
|
|
Changed the files to have an internal array for reading and writing data. |
|
|
Made the read_sequence method simpler to follow and perhaps slightly more |
|
|
Introduced various optimisation attempts. |
|
|
Added a threshold or interval which causes the term dictionary to be flushed |
|
|
For large numbers of positions, sorting afterwards is likely to be much quicker. |
|
|
Permit fields for documents to be spread across partitions, potentially because |
|
|
Avoid identical adjacent tokens being matched to the same document token. |
|
|
Introduced support for higher-level sequential access to indexes. |
|
|
Introduced parameterisation of phrase discovery using different phrase filters |
|
|
Updated the copyright and licensing information. |
|
|
Changed the from_document method to remember the current document and positions, |
|
|
Added support for phrase searching where document positions are specified using |
|
|
Made partition discovery more widely available, adding code to find the next |
|
|
Added integrity checks for appropriate term and position ordering. |
|
|
Introduced support for specifying sequences for document numbers and positions, |
|
|
Introduced code to handle index merging where a large number of partitions |
|
|
Added get_terms convenience methods to the index and term dictionary readers. |
|
|
An experiment adding preceding text to position records. |
|
|
Added a string serialisation function. |
|
|
Introduced position dictionary, file and index iterators which capture the |
|
|
Removed iterators and openers with the intention of having synchronised reading |
|
|
Added a document cache, used when reading fields. |
|
|
Fixed field interval configuration. |
|
|
Fixed field interval configuration. |
|
|
Changed indexing interval configuration to use the Index initialiser. |
|
|
Simplified the IndexWriter document cache, adopting a list of items instead of a |
|
|
Added proper phrase searching. |
|
|
Changed find_positions methods to return an empty list instead of None where no |
|
|
Added elementary phrase searching support. |
|
|
Added support for updating empty indexes. |
|
|
Added measures for the closure of position iterators. |
|
|
Introduced array usage when writing position index entries. |
|
|
Introduced separate vint functions for strings and byte arrays. |
|
|
Introduced various optimisations: increasing the vint cache and introducing |
|
|
Simplified vint implementation, taking advantage of the cache. |
|
|
Removed Pyrex extension result. |
|
|
Use file methods directly. |
|
|
Replaced the partial Pyrex vint implementation with a cache. |
|
|
Removed caching since it does not seem to help significantly. |
|
|
Switched the write caches in FileWriter instances to StringIO instances. |
|
|
Added copyright and licensing information. |
|
|
Added iterator reuse for sequential term dictionary access, along with iterator |
|
|
Added a cache offset attribute to better track available cached data. |
|
|
Removed old module. |
|
|
Made iixr a package with several submodules. |
|
|
Added constants for various measures. |
|
|
Attempted to provide cache navigation without slicing the cache all the time. |
|
|
Made the seek method slightly more efficient at reusing cached data. |
|
|
Moved cache-affected writing methods into the FileWriter class. |
|
|
Attempted to add batch writing to the FileWriter class for supposedly improved |
|
|
Fixed Pyrex implementation for numbers from 0 to 127 inclusive. |
|
|
Added a Pyrex implementation of the vint function. |
|
|
Made minor adjustments to experiment with performance. |
|
|
Attempted to improve performance by collecting written data before writing it. |
|
|
Introduced opener classes to replace the superfluous position and position index |
|
|
Attempted to fix document position merging. |
|
|
Changed the merging classes to take advantage of document-oriented data storage. |
|
|
Fixed set_fields method signature. |
|