Some languages (French is one of them) use special signs to modify Latin alphabet letters:
Such signs (diacritics) make it harder to compare words directly, because an accented letter has a different character code than its base letter.
strips the accents, so é becomes e and so on. Issuing this command when indexing files is recommended, because it reduces the number of distinct keywords in the index database.
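The effect of accent stripping can be illustrated outside the engine. The following is a minimal Python sketch, not the engine's own implementation: it assumes Unicode input, and the function name `strip_accents` is hypothetical. It decomposes each accented letter into a base letter plus combining marks and then drops the marks.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining diacritical marks removed (illustrative helper)."""
    # NFD splits each accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn") and keep everything else.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever remains into the usual composed form.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    for word in ("élève", "café", "garçon"):
        print(word, "->", strip_accents(word))
```

With this kind of normalization, "élève" and "eleve" map to the same keyword, which is why the index ends up with fewer distinct entries.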
HTML and XML formats store some information inside tags <...>. Usually this information is of no interest, so the contents of tags are discarded when files are processed. You can change this behavior with the command:
sets the only permitted encoding for documents. If a document declares a different encoding, it is ignored.
sets the encoding to use when a document carries no information about its own encoding.
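How a forced encoding and a default encoding might interact can be sketched in Python. This is an illustration under stated assumptions, not the engine's actual logic: it assumes that a forced encoding overrides whatever the document declares (rather than causing the document to be skipped), and the names `choose_encoding`, `decode_document`, `forced`, and `default` are hypothetical.

```python
from typing import Optional


def choose_encoding(
    declared: Optional[str],
    forced: Optional[str] = None,
    default: str = "utf-8",
) -> str:
    """Pick the encoding used to decode a document (illustrative helper)."""
    # Assumption: a forced encoding wins over the document's own declaration.
    if forced is not None:
        return forced
    # Otherwise trust the encoding the document declares, if it declares one.
    if declared:
        return declared
    # Fall back to the default when the document says nothing.
    return default


def decode_document(
    data: bytes,
    declared: Optional[str],
    forced: Optional[str] = None,
    default: str = "utf-8",
) -> str:
    """Decode raw bytes, replacing bytes that are invalid in the chosen encoding."""
    encoding = choose_encoding(declared, forced, default)
    return data.decode(encoding, errors="replace")
```

For example, `decode_document(b"caf\xe9", declared=None, default="latin-1")` decodes the bytes with the default encoding because the document declares none.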
sets the list of codepages to be used by codepage
© Mental Computing 2009