Defining decimal properties for numeric indexes

When you define decimal properties for your numeric indexes, you tell TEXTML Server how to interpret the decimal numbers it encounters in documents. This ensures that decimal numbers are stored properly and, as a result, that searches are accurate.

If you do not define decimal properties, that is, if you leave the <decimalindexproperties> element empty, TEXTML Server uses a default definition, and most decimal numbers should still be recognized and indexed.

For conceptual information about numeric indexes and properties, refer to the TEXTML Server Administration Guide.

To define decimal properties for a numeric index:

  1. In the Index Definition document, locate the <decimalindexproperties> element.
  2. If you want to limit the range of indexed values, add an <interval> element under <decimalindexproperties>, then add its child elements:
    1. To specify a lower bound, add a <start> element. Set the value of the INCLUSIVE attribute to True if you want the bound included in the range, False otherwise. Type the bound number in a <number> element child of <start>.
    2. To specify an upper bound, add an <end> element. Set the value of the INCLUSIVE attribute to True if you want the bound included in the range, False otherwise. Type the bound number in a <number> element child of <end>.
      Note: Decimals are stored in indexes with no thousand separator and a period as decimal separator; type the bound numbers the same way.
      <decimalindexproperties> 
         <interval> 
            <start INCLUSIVE="True"> 
               <number>5.30</number> 
            </start> 
         </interval>
      </decimalindexproperties>
  3. Under <decimalindexproperties>, add a <contenttrim> element if you want TEXTML Server to ignore non-numeric characters around the number. Set the element's VALUE attribute to one of the following:
    • Left: TEXTML Server ignores non-numeric characters to the left of the number. Numbers with non-numeric characters to the right are not recognized.
    • Right: TEXTML Server ignores non-numeric characters to the right of the number. Numbers with non-numeric characters to the left are not recognized.
    • Both: TEXTML Server ignores non-numeric characters on both sides of the number.
    • None: TEXTML Server assumes that the number is not surrounded by non-numeric characters. If it is, the number is neither recognized nor indexed. Use this value only if you are sure that numbers in your documents are not preceded or followed by non-numeric characters.
    <decimalindexproperties> 
       <contenttrim VALUE="Both"/> 
    </decimalindexproperties>
  4. Under <decimalindexproperties>, add a <thousandsymbol> element to specify which symbol is interpreted as the thousand separator. Set the element's VALUE attribute to one of the following:
    • Comma: TEXTML Server recognizes the comma as the thousand separator. Numbers with different or no separators are not recognized.
    • Space: TEXTML Server recognizes the space as the thousand separator. Numbers with different or no separators are not recognized.
    • Both: TEXTML Server recognizes both the comma and the space as thousand separators. Numbers with no separators are not recognized.
    • None: TEXTML Server does not recognize any symbol as the thousand separator. Numbers with no separator are recognized.
    <decimalindexproperties>
       <thousandsymbol VALUE="Comma"/>
    </decimalindexproperties>
  5. Under <decimalindexproperties>, add a <decimalsymbol> element to specify which symbol is interpreted as the decimal separator. Set the element's VALUE attribute to one of the following:
    • Comma: TEXTML Server recognizes the comma as the decimal separator. Numbers with different or no separators are not recognized.
    • Period: TEXTML Server recognizes the period as the decimal separator. Numbers with different or no separators are not recognized.
    • Both: TEXTML Server recognizes both the comma and the period as decimal separators. Numbers with no separators are not recognized.
    • None: TEXTML Server does not recognize any symbol as the decimal separator. Numbers with no separator are recognized.
    <decimalindexproperties> 
       <decimalsymbol VALUE="Period"/> 
    </decimalindexproperties>
  6. Under <decimalindexproperties>, add a <decimalprecision> element to specify how many decimal places numbers keep when they are indexed. Set the VALUE attribute to the decimal precision.
    Numbers are recognized with as many decimal places as they have, but they are indexed with the number of decimal places you specify.

    For example, with a precision of 2 decimal places and the standard rounding rule, 5.357 is indexed as 5.36. TEXTML Server can retrieve the document that contains 5.357 if you search for 5.357 or for 5.36.

  7. If you want to specify a rounding rule for decimal numbers that have more decimal places than allowed, set the ONOVERFLOW attribute of the <decimalprecision> element to one of the following values. If you do not, TEXTML Server uses the standard rounding rule.
    • Reject: TEXTML Server rejects any decimal number that has more decimal places than allowed.
    • Round: TEXTML Server rounds decimal numbers according to the standard rule. For instance, with one decimal place allowed, .000 to .049 is rounded down and .050 to .099 is rounded up.
    • RoundUp: TEXTML Server rounds decimal numbers up at the last allowed decimal place.
    • RoundDown: TEXTML Server rounds decimal numbers down at the last allowed decimal place.
    <decimalindexproperties> 
       <decimalprecision VALUE="2" ONOVERFLOW="RoundUp"/> 
    </decimalindexproperties>
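
The steps above can be combined into a single definition. The following is a minimal sketch rather than a validated configuration: the bound values (5.30 and 1000.00) are made up for illustration, and the order of the child elements is an assumption that should be checked against the Index Definition schema of your TEXTML Server version.

```xml
<!-- Illustrative sketch only: bound values and sibling order are assumptions. -->
<decimalindexproperties>
   <!-- Index only values from 5.30 (included) up to 1000.00 (excluded). -->
   <interval>
      <start INCLUSIVE="True">
         <number>5.30</number>
      </start>
      <end INCLUSIVE="False">
         <number>1000.00</number>
      </end>
   </interval>
   <!-- Ignore non-numeric characters on both sides of the number. -->
   <contenttrim VALUE="Both"/>
   <!-- Comma as thousand separator, period as decimal separator. -->
   <thousandsymbol VALUE="Comma"/>
   <decimalsymbol VALUE="Period"/>
   <!-- Index with 2 decimal places; round up numbers that have more. -->
   <decimalprecision VALUE="2" ONOVERFLOW="RoundUp"/>
</decimalindexproperties>
```

Note that the bounds are typed with no thousand separator and a period as decimal separator, as required for <number> values, regardless of the symbols declared for the documents themselves.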