Defining integer properties for numeric indexes

When you define integer properties for a numeric index, you tell TEXTML Server how to interpret the numbers it encounters in documents. This ensures that integers are stored properly and that searches are accurate.

If you do not define integer properties, that is, if you leave the <integerindexproperties> element empty, TEXTML Server uses a default definition, and all integers should be recognized and indexed. However, some decimal numbers might also be recognized as integers, so it is recommended that you specify integer properties.

For conceptual information about numeric indexes and properties, refer to the TEXTML Server Administration Guide.

To define integer properties for a numeric index:

  1. In the Index Definition document, locate the <integerindexproperties> element.
  2. If you want to limit the range of values indexed, add an <interval> element, with its child elements, under <integerindexproperties>.
    1. To specify a lower bound for the range, add a <start> element. Set the INCLUSIVE attribute to True if you want the bound included in the range, or to False otherwise. Type the bound number in a <number> element that is a child of <start>.
    2. To specify an upper bound for the range, add an <end> element. Set the INCLUSIVE attribute to True if you want the bound included in the range, or to False otherwise. Type the bound number in a <number> element that is a child of <end>.
      Note: Integers are stored in indexes without thousand separators, so type the bound numbers without thousand separators.
      <integerindexproperties> 
         <interval> 
            <start INCLUSIVE="True"> 
               <number>5000</number> 
            </start> 
         </interval>
      </integerindexproperties>
  3. Under <integerindexproperties>, add a <contenttrim> element if you want TEXTML Server to ignore non-numeric characters around the number. Set the element’s VALUE attribute to one of the following:
    • Left: TEXTML Server ignores non-numeric characters to the left of the number. Numbers with non-numeric characters to the right are not recognized.
    • Right: TEXTML Server ignores non-numeric characters to the right of the number. Numbers with non-numeric characters to the left are not recognized.
    • Both: TEXTML Server ignores non-numeric characters on both sides of the number.
    • None: TEXTML Server assumes the number does not contain non-numeric characters. If it does, the number is neither recognized nor indexed. Use this value if you are sure that numbers in your documents are not preceded or followed by non-numeric characters.
    <integerindexproperties> 
       <contenttrim VALUE="Both"/> 
    </integerindexproperties>
  4. Under <integerindexproperties>, add a <thousandsymbol> element to specify which symbol is interpreted as the thousand separator. Set the element’s VALUE attribute to one of the following:
    • Comma: TEXTML Server recognizes the comma as the thousand separator. Numbers with a different separator or no separator are not recognized.
    • Space: TEXTML Server recognizes the space as the thousand separator. Numbers with a different separator or no separator are not recognized.
    • Both: TEXTML Server recognizes both the comma and the space as thousand separators. Numbers with no separator are not recognized.
    • None: TEXTML Server does not recognize any symbol as a thousand separator. Numbers with no separator are recognized.
    <integerindexproperties> 
       <thousandsymbol VALUE="Comma"/> 
    </integerindexproperties>
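
For reference, the steps above can be combined into a single definition. The following is a sketch only. It assumes that <interval>, <contenttrim>, and <thousandsymbol> can appear together under one <integerindexproperties> element, and in this order, which the procedure implies but does not state. The bound values are arbitrary. Check the element order against the TEXTML Server Administration Guide before using it.

    <integerindexproperties>
       <!-- Assumption: the three child elements may be combined, in this order. -->
       <interval>
          <!-- Lower bound 5000, included in the range. -->
          <start INCLUSIVE="True">
             <number>5000</number>
          </start>
          <!-- Upper bound 100000 (arbitrary example value), excluded from the range. -->
          <end INCLUSIVE="False">
             <number>100000</number>
          </end>
       </interval>
       <!-- Ignore non-numeric characters on both sides of the number. -->
       <contenttrim VALUE="Both"/>
       <!-- Treat the comma as the thousand separator. -->
       <thousandsymbol VALUE="Comma"/>
    </integerindexproperties>

Under these assumptions, a value written as 12,500 with non-numeric characters on either side would be expected to fall inside the indexed range, whereas 100,000 would not, because the upper bound is excluded.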