Grammatical dictionary API

Download

The grammatical dictionary API is available as an SDK package. The package includes a precompiled MS Windows DLL, binary dictionary files, and a .NET assembly. Several sample programs in C++, C# and Delphi are also included.

Please contact us if you have any questions about the grammatical dictionary and its components.

Implementation details

All function prototypes are declared in solarix_grammar_engine.h.

The dynamic library solarix_grammar_engine.dll exports all the API procedures listed below. Versions of this DLL are available for the x86 and x64 platforms.

The .NET wrapper gren_fx.dll exports the same constants as well as all API procedures.

Delphi programs can access the grammatical dictionary through the GrammarEngineAPI.pas interface module.

API

Constants for classes, coordinates and states

 All constants for grammatical categories are declared in solarix_grammar_engine.h.

Functions

Grammar engine creation

HGREN sol_CreateGrammarEngine()

Returns: engine handle if successful.

You must load a dictionary to finish the initialization (see sol_LoadDictionary).

The engine version can be queried with sol_GetVersion.

Grammar engine deletion

int sol_DeleteGrammarEngine( HGREN hEngine )

Returns:

    0 - the engine handle is destroyed and the dictionary modules are unloaded;

    -1 - an error has occurred.

Load dictionary

int sol_LoadDictionaryW( HGREN hEngine, const wchar_t *Filename )

int sol_LoadDictionaryA( HGREN hEngine, const char *Filename )

Arguments:

    Filename - dictionary definition file name (e.g. dictionary.xml)

Returns:

    0 - the dictionary could not be loaded; possibly the file path is not accessible;

    1 - the dictionary has been loaded successfully;

    2 - a dictionary is already loaded; use sol_UnloadDictionary to unload the previous dictionary first.

Dictionary definition file dictionary.xml is included in all dictionary installation packages.
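The creation, loading and deletion calls described above can be combined into a start-up helper. The following is a minimal sketch, not code from the SDK. The stand-in section replaces solarix_grammar_engine.h so that the sketch compiles on its own; the HGREN typedef and the behaviour of the stubs are assumptions, and the helper name open_dictionary is hypothetical.

```cpp
#include <cassert>

// --- Stand-ins for solarix_grammar_engine.h: assumptions, for illustration only. ---
// A real program includes the SDK header and links against the DLL instead.
typedef void* HGREN;              // assumed: opaque engine handle
static int g_loaded = 0;          // fake engine state shared by the stubs

HGREN sol_CreateGrammarEngine() { return &g_loaded; }
int sol_DeleteGrammarEngine(HGREN) { g_loaded = 0; return 0; }
int sol_IsDictionaryLoaded(HGREN) { return g_loaded; }
int sol_LoadDictionaryA(HGREN, const char* Filename) {
    if (g_loaded) return 2;                  // a dictionary is already loaded
    if (Filename[0] == '\0') return 0;       // fake "path not accessible"
    g_loaded = 1;
    return 1;                                // loaded successfully
}
// --- End of stand-ins. ---

// Hypothetical helper: creates an engine and loads the dictionary at `path`.
// Returns the engine handle, or nullptr if the dictionary could not be loaded.
HGREN open_dictionary(const char* path) {
    assert(path != nullptr);
    HGREN engine = sol_CreateGrammarEngine();
    if (engine == nullptr) return nullptr;

    int rc = sol_LoadDictionaryA(engine, path);
    // 1 = loaded now, 2 = already loaded: in both cases a dictionary is available.
    if (rc == 1 || rc == 2) return engine;

    // 0 = load failed: release the handle so it does not leak.
    sol_DeleteGrammarEngine(engine);
    return nullptr;
}
```

The caller remains responsible for passing the returned handle to sol_DeleteGrammarEngine when it is no longer needed.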

Remove dictionary from memory

int sol_UnloadDictionary( HGREN hEngine )

Another dictionary can be loaded after this call.

Is dictionary loaded

int sol_IsDictionaryLoaded( HGREN hEngine )

Returns: 1 if a dictionary is loaded, 0 otherwise.

Count the entries in dictionary

int sol_CountEntries( HGREN hEngine )

Returns:

    Number of word entries, or -1 if an error has occurred.

Count the wordforms in dictionary

int sol_CountForms( HGREN hEngine )

Returns:

    Number of word forms defined in dictionary.

Maximum length of word in dictionary

int sol_MaxLexemLen( HGREN hEngine )

Returns:

    Maximum length of a word form in the dictionary. No word form in the dictionary is longer than this value, so the application can use it to determine the buffer capacity needed to store strings.

Engine version

int sol_GetVersion( HGREN hEngine, int *Major, int *Minor, int *Build )

Returns:

Version code: 0 - Lite, 1 - Pro, 2 - Premium.

The version number parts are returned in Major, Minor and Build. Pass NULL for any of these pointers if you do not need the value.
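As an illustration of the two ways to call sol_GetVersion, here is a minimal sketch under stated assumptions. The stub stands in for the SDK so that the code compiles on its own, the version numbers it reports are made up, and the helper names edition_name, describe_engine and edition_code are hypothetical.

```cpp
#include <cassert>
#include <string>

// --- Stand-in for solarix_grammar_engine.h: an assumption, for illustration only. ---
typedef void* HGREN;              // assumed: opaque engine handle
// Fake engine: reports the Pro edition, version 1.2.3 (made-up numbers).
int sol_GetVersion(HGREN, int* Major, int* Minor, int* Build) {
    if (Major) *Major = 1;
    if (Minor) *Minor = 2;
    if (Build) *Build = 3;
    return 1;
}
// --- End of stand-in. ---

// Maps the documented version code to its edition name.
std::string edition_name(int code) {
    switch (code) {
        case 0:  return "Lite";
        case 1:  return "Pro";
        case 2:  return "Premium";
        default: return "unknown";
    }
}

// Requests all three version parts and formats them as "Edition major.minor.build".
std::string describe_engine(HGREN engine) {
    int major = 0, minor = 0, build = 0;
    int code = sol_GetVersion(engine, &major, &minor, &build);
    return edition_name(code) + " " + std::to_string(major) + "." +
           std::to_string(minor) + "." + std::to_string(build);
}

// Requests only the edition code; the unused outputs are passed as null pointers.
int edition_code(HGREN engine) {
    return sol_GetVersion(engine, nullptr, nullptr, nullptr);
}
```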

Morphology analysis

Word normalization (single word in result)

int sol_TranslateToBase( HGREN hEng, wchar_t *Word, bool AllowDynforms )

 Arguments:

   Word - source word and result buffer.

   AllowDynforms - enable complex morphology. This feature is available only in Pro version.

Returns:

    1 - normalization succeeded;

    0 - normalization failed because of missing word entry (source word is unknown);

    -1 - error.
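Because Word serves as both input and output, the caller needs a writable buffer large enough for the result. The sketch below shows one way to wrap this; it is not SDK code. The stubs stand in for the real engine (the fake normalizer knows a single English word form), sizing the buffer from sol_MaxLexemLen plus one character for the terminator is an assumption, and the helper name to_base_form is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// --- Stand-ins for solarix_grammar_engine.h: assumptions, for illustration only. ---
typedef void* HGREN;              // assumed: opaque engine handle
int sol_MaxLexemLen(HGREN) { return 32; }
// Fake normalizer that knows exactly one word form.
int sol_TranslateToBase(HGREN, wchar_t* Word, bool) {
    if (std::wcscmp(Word, L"cats") != 0) return 0;   // unknown word
    std::wcscpy(Word, L"cat");                       // result overwrites the input
    return 1;
}
// --- End of stand-ins. ---

// Hypothetical helper: returns the base form of `word`,
// or `word` unchanged if normalization does not succeed.
std::wstring to_base_form(HGREN engine, const std::wstring& word) {
    int max_len = sol_MaxLexemLen(engine);
    if (max_len < 0) max_len = 0;

    // Assumption: max_len excludes the terminator, so reserve one extra character.
    std::size_t capacity =
        std::max(word.size(), static_cast<std::size_t>(max_len)) + 1;
    std::vector<wchar_t> buffer(capacity, L'\0');
    std::copy(word.begin(), word.end(), buffer.begin());

    int rc = sol_TranslateToBase(engine, buffer.data(), false);
    return rc == 1 ? std::wstring(buffer.data()) : word;
}
```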

 

Word normalization (multiple words in result)

HGREN_STR sol_TranslateToBases( HGREN hEng, const wchar_t *Word, bool AllowDynforms )

 Arguments:

   Word - source word (any wordform is acceptable).

   AllowDynforms - enable complex morphology. This feature is available only in Pro version.

Returns:

String list object handle. Strings can be accessed by use of sol_CountStrings and sol_GetStrings.

 

Stemming

int sol_Stemmer( HGREN hEng, const wchar_t *Word )

 Arguments:

    Word - source word and result buffer.

Returns:

    Number of characters in stem.

    0 - stemming has failed;

    -1 - error.

 

Generating all forms of the word including synonyms etc.

HGREN_STR sol_FindStringsEx( HGREN hEng, const wchar_t *Word, bool Allow_Dynforms, bool Synonyms, bool Grammar_Links, bool Translations, bool Semantics, int nJumps )

Arguments:

Word - word name (base form).

Allow_Dynforms - enable complex morphology. This feature is available only in Pro version.

Synonyms, Grammar_Links, Translations, Semantics - flags enabling different types of thesaurus relations.

nJumps - maximum number of thesaurus jumps from original word. Use 1 to select only neighbouring thesaurus nodes.

Returns:

String array handle if successful, or NULL if failed. You can use sol_CountStrings, sol_GetStrings and sol_DeleteStrings to handle this array.

 

Number of items in a string array

int sol_CountStrings( HGREN_STR hStr )

Returns:

Number of string items in array.
 

Copy all strings from a string array

int sol_GetStrings( HGREN_STR hStr, wchar_t** Res )

The first string item is copied to Res[0], the second to Res[1], and so on. You are responsible for allocating the memory. Use sol_CountStrings to get the number of items in the array, and sol_MaxLexemLen to get the maximum capacity needed for each string buffer.
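The allocation scheme described here can be sketched as follows. This is an illustration, not SDK code: the stubs replace the real engine and return a made-up two-item list, reserving one extra character per buffer for the terminator is an assumption, and the helper name take_strings is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// --- Stand-ins for solarix_grammar_engine.h: assumptions, for illustration only. ---
typedef void* HGREN;              // assumed: opaque engine handle
typedef void* HGREN_STR;          // assumed: opaque string array handle
static const wchar_t* const kFakeItems[] = { L"cat", L"cats" };  // made-up content
int sol_MaxLexemLen(HGREN) { return 32; }
int sol_CountStrings(HGREN_STR) { return 2; }
int sol_GetStrings(HGREN_STR, wchar_t** Res) {
    for (int i = 0; i < 2; ++i) std::wcscpy(Res[i], kFakeItems[i]);
    return 0;
}
int sol_DeleteStrings(HGREN_STR) { return 0; }
// --- End of stand-ins. ---

// Hypothetical helper: copies every item of `list` into std::wstring objects
// and then deletes the list. A null handle yields an empty result.
std::vector<std::wstring> take_strings(HGREN engine, HGREN_STR list) {
    std::vector<std::wstring> out;
    if (list == nullptr) return out;

    int count = sol_CountStrings(list);
    int max_len = sol_MaxLexemLen(engine);
    if (count > 0 && max_len > 0) {
        // Assumption: max_len excludes the terminator, so add one character.
        std::size_t capacity = static_cast<std::size_t>(max_len) + 1;

        // One contiguous block holds all buffers; rows[i] points at buffer i.
        std::vector<wchar_t> storage(static_cast<std::size_t>(count) * capacity, L'\0');
        std::vector<wchar_t*> rows(static_cast<std::size_t>(count));
        for (int i = 0; i < count; ++i) rows[i] = storage.data() + i * capacity;

        sol_GetStrings(list, rows.data());
        for (int i = 0; i < count; ++i) out.push_back(std::wstring(rows[i]));
    }
    sol_DeleteStrings(list);
    return out;
}
```

Using a single block of storage keeps the allocation in one place and releases it automatically when the helper returns.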
 

Delete string array

int sol_DeleteStrings( HGREN_STR hStr )

 

Look for a word entry in dictionary 

int sol_FindEntry( HGREN hEng, const wchar_t *Word, int Class, int Language )

Returns: ID of the entry given its name (base form), class and language. Return value -1 stands for missing word entry.

Returned ID is used in many other API calls to specify the dictionary word entry, e.g. sol_GetNounGender.

Integer constants for grammatical class Class and language Language are defined in _sg_api.h.
 

Quick word lookup

int sol_SeekWord( HGREN hEng, const wchar_t *Word, bool AllowDynforms )

This function performs fast lookup in whole lexicon for a given word.

Arguments:

Word - arbitrary word form (not necessarily the base form).

AllowDynforms - enable complex morphology. This feature is available only in Pro version.

Returns: ID of the word entry if successful, -1 if failed.

The returned ID is used in many other API calls to specify the dictionary word entry, e.g. sol_GetNounGender. If several word entries match the given word form, the ID of only one of them is returned.

 

Dictionary wordforms lookup (single entry in result)

int sol_FindWord( HGREN hEng, const wchar_t *Word, int *EntryIndex, int *Form, int *Class )

Slow lexicon lookup for a word form.

Returns: number of matched lexicon word entries, or -1 if the word form is not found.

The word entry ID is returned in EntryIndex. This ID is used in many other API calls to specify the dictionary word entry, e.g. sol_GetNounGender. If several word entries match the given word form, the ID of only one of them is returned.

Dictionary wordforms lookup (multiple entries in result)

HGREN_WCOORD sol_ProjectWord( HGREN hEng, const wchar_t *Word, bool AllowDynforms )

Arguments:

Word - arbitrary word form to analyze.

AllowDynforms - enable complex morphology.

Returns: handle of the array of matched word entries. Use sol_CountProjections, sol_GetIEntry, sol_GetProjCoordState, sol_DeleteProjections to handle this array.

 

HGREN_WCOORD sol_ProjectMisspelledWord( HGREN hEng, const wchar_t *Word, bool AllowDynforms, int nmaxmiss )


Size of the projections array

int sol_CountProjections( HGREN_WCOORD hList )

Arguments:

hList - handle of the projections array returned by sol_ProjectWord or sol_ProjectMisspelledWord.

Returns: number of word forms in hList.

 

int sol_GetIEntry( HGREN_WCOORD hList, int Index )

Arguments:

hList - handle of the projections array returned by sol_ProjectWord or sol_ProjectMisspelledWord.

Returns: dictionary entry ID of the matching word form. Properties of the word entry are accessible via sol_GetEntryClass, sol_GetEntryName and some other functions.

 

int sol_GetProjCoordState( HGREN hEng, HGREN_WCOORD hList, int Index, int Coord )

Arguments:

hList - handle of the projections array returned by sol_ProjectWord or sol_ProjectMisspelledWord.

Index - index of the item in the array.

Coord - ID of the attribute to request. Integer constants of coordinate IDs are declared in _sg_api.h for C++ and _sg_api.cs for C#.

Returns: Coordinate state.

 

int sol_DeleteProjections( HGREN_WCOORD hList )

Destroys the projections array returned by sol_ProjectWord or sol_ProjectMisspelledWord.
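The projection functions are meant to be used together: project a word, walk the resulting array, then destroy it. The following minimal sketch shows that sequence under stated assumptions. The stubs replace the real engine and return made-up entry IDs and coordinate states, treating a null handle as failure is an assumption, and the names Projection and project_word are hypothetical.

```cpp
#include <cassert>
#include <vector>

// --- Stand-ins for solarix_grammar_engine.h: assumptions, for illustration only. ---
typedef void* HGREN;              // assumed: opaque engine handle
typedef void* HGREN_WCOORD;       // assumed: opaque projections array handle
static const int kFakeEntries[] = { 101, 205 };   // made-up entry IDs
static int g_fake_list = 0;                       // address used as a fake handle
HGREN_WCOORD sol_ProjectWord(HGREN, const wchar_t*, bool) { return &g_fake_list; }
int sol_CountProjections(HGREN_WCOORD) { return 2; }
int sol_GetIEntry(HGREN_WCOORD, int Index) { return kFakeEntries[Index]; }
int sol_GetProjCoordState(HGREN, HGREN_WCOORD, int Index, int Coord) {
    return Coord * 10 + Index;                    // made-up state value
}
int sol_DeleteProjections(HGREN_WCOORD) { return 0; }
// --- End of stand-ins. ---

// One matched word entry together with the state of the requested coordinate.
struct Projection {
    int entry_id;
    int coord_state;
};

// Hypothetical helper: returns every entry matched by `word`, with the state of
// coordinate `coord` for each match. The projections array is always destroyed.
std::vector<Projection> project_word(HGREN engine, const wchar_t* word, int coord) {
    std::vector<Projection> out;
    HGREN_WCOORD list = sol_ProjectWord(engine, word, false);
    if (list == nullptr) return out;   // assumed: a null handle signals failure

    int count = sol_CountProjections(list);
    for (int i = 0; i < count; ++i) {
        Projection p;
        p.entry_id = sol_GetIEntry(list, i);
        p.coord_state = sol_GetProjCoordState(engine, list, i, coord);
        out.push_back(p);
    }
    sol_DeleteProjections(list);
    return out;
}
```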

 

Determine the gender of noun

int sol_GetNounGender( HGREN hEng, int EntryIndex )


 

Generate the noun form with given number and case

int sol_GetNounForm( HGREN hEng, int EntryIndex, int Number, int Case, wchar_t *Result )


 

Generate the adjective form with given attributes

int sol_GetAdjectiveForm( HGREN hEng, int EntryIndex, int Number, int Gender, int Case, int Anim, int Shortness, int Compar_Form, wchar_t *Result )


 

Generate the verb form with given number, gender, tense and person attributes

int sol_GetVerbForm( HGREN hEng, int EntryIndex, int Number, int Gender, int Tense, int Person, wchar_t *Result )


 

Generate the proper noun form for a given number value

int sol_CorrNounNumber( HGREN hEng, int EntryIndex, int Value, int Case, int Anim, wchar_t *Result )


 

Generate the proper adjective form for a given number value

int sol_CorrAdjNumber( HGREN hEng, int EntryIndex, int Value, int Case, int Gender, int Anim, wchar_t *Result )


 

Generate the proper verb form for a given number value

int sol_CorrVerbNumber( HGREN hEng, int EntryIndex, int Value, int Gender, int Tense, wchar_t *Result )


 

Convert number to string

int sol_Value2Text( HGREN hEng, wchar_t *Result, int Value, int Gender )


 

Generate all forms for a word

HGREN_STR sol_FindStrings( HGREN hEng, const wchar_t *Word )


 

Get the name (base form) of dictionary entry by entry index

int sol_GetEntryName( HGREN hEngine, int EntryIndex, wchar_t* Result )

EntryIndex - word entry ID.

Result - buffer to receive the entry name. Buffer size must be at least sol_MaxLexemLen chars.

Returns: 0 - success

 

Get grammatical class for entry

int sol_GetEntryClass( HGREN hEngine, int EntryIndex )

Arguments:

EntryIndex - ID of the word entry

Returns: index (ID) of the grammatical class of the word entry. Class name can be fetched by sol_GetClassName.

 

Get the name of grammatical class

int sol_GetClassName( HGREN hEngine, int ClassIndex, wchar_t* Result )

Arguments:

ClassIndex - internal class index. Class indexes are declared as C++ constants in _sg_api.h and C# constants in _sg_api.cs.

Result - buffer to receive class name. Buffer size must be at least sol_MaxLexemLen chars.

Returns: 0 - success

Class name is copied to Result.

 

Get the base noun form associated with an entry

int sol_TranslateToNoun( HGREN hEngine, int EntryIndex )

Arguments:

EntryIndex - ID of the source entry.

Returns: ID of the noun associated with source entry.

 

Get the infinitive form of the verb associated with an entry

int sol_TranslateToInfinitive( HGREN hEngine, int EntryIndex )

Arguments:

EntryIndex - ID of the source entry.

Returns: ID of the infinitive associated with source entry.

 

Look up the entries associated with a given word using the thesaurus

HGREN_INTARRAY sol_SeekThesaurus( HGREN hEngine, int EntryIndex, bool Synonyms, bool Grammar_Links, bool Translation, bool Semantics, int nJumps )



Number of items in the array of integers

int sol_CountInts( HGREN_INTARRAY hArray )


 

Get the item value in the integer array

int sol_GetInt( HGREN_INTARRAY hArray, int Index )


 

Delete the integer array

int sol_DeleteInts( HGREN_INTARRAY hArray )


 

Syntax Analysis

Performing syntax analysis

HGREN_RESPACK sol_SyntaxAnalysis( HFAIND hEngine, const wchar_t *Sentence, bool Allow_Dynforms, bool Allow_Unknown )

Arguments:

Sentence - phrase to analyze (zero-terminated string).

Allow_Dynforms - enable complex morphology.

Allow_Unknown - try to analyze unknown words using their neighborhood (context).

Returns: handle to the syntax analysis results. You must free the handle with sol_DeleteResPack.

This procedure is available in the Pro version only. In the Free version it does nothing.

 

Delete the results of syntax analysis

void sol_DeleteResPack( HGREN_RESPACK hPack )

It frees the memory resources allocated by sol_SyntaxAnalysis.

 

Number of alternative results of syntax analysis

int sol_CountGrafs( HGREN_RESPACK hPack )

Returns: Number of graphs.

 

Get the number of roots

int sol_CountRoots( HGREN_RESPACK hPack, int iGraf )

Returns: Number of trees in a graph.

 

Get the root

HGREN_TREENODE sol_GetRoot( HGREN_RESPACK hPack, int iGraf, int iRoot )

Returns: Root node handle of tree specified by graph index iGraf and tree index iRoot.

 

Number of branches for a tree node

int sol_CountLeafs( HGREN_TREENODE hNode )

Returns: Number of child nodes.

 

Get the pointer to the branch

HGREN_TREENODE sol_GetLeaf( HGREN_TREENODE hNode, int iChild )

Returns: Handle of the child node specified by index iChild.

 

Get the entry index for a tree node

int sol_GetNodeIEntry( HFAIND hEngine, HGREN_TREENODE hNode )

Returns: Node entry index. This index can be used to access the entry properties via sol_GetEntryName, sol_GetEntryClass and many other API functions.

 

Literal contents of the node

void sol_GetNodeContents( HGREN_TREENODE hNode, wchar_t *Buffer )

This function copies the string value of the node hNode to the buffer Buffer. The string value is typically the base form of the word entry.
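The node functions lend themselves to a recursive walk over a tree whose root was obtained from sol_GetRoot. The sketch below is an illustration under stated assumptions, not SDK code: the stubs operate on a fake in-memory tree instead of real analysis results, the buffer size is supplied by the caller (for example derived from sol_MaxLexemLen, which is an assumption since the required capacity is not documented here), and the names FakeNode, NodeList and collect_nodes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <utility>
#include <vector>

// --- Stand-ins for solarix_grammar_engine.h: assumptions, for illustration only. ---
typedef void* HGREN_TREENODE;     // assumed: opaque tree node handle
// Fake tree node; the real engine builds its trees in sol_SyntaxAnalysis.
struct FakeNode {
    const wchar_t* text;
    std::vector<FakeNode*> children;
};
int sol_CountLeafs(HGREN_TREENODE hNode) {
    return static_cast<int>(static_cast<FakeNode*>(hNode)->children.size());
}
HGREN_TREENODE sol_GetLeaf(HGREN_TREENODE hNode, int iChild) {
    return static_cast<FakeNode*>(hNode)->children[iChild];
}
void sol_GetNodeContents(HGREN_TREENODE hNode, wchar_t* Buffer) {
    std::wcscpy(Buffer, static_cast<FakeNode*>(hNode)->text);
}
// --- End of stand-ins. ---

// Each item is (depth in the tree, string value of the node).
typedef std::vector<std::pair<int, std::wstring> > NodeList;

// Hypothetical helper: visits `node` and all its descendants in pre-order and
// appends their contents to `out`. `buffer_chars` is the caller-chosen buffer size.
void collect_nodes(HGREN_TREENODE node, int depth, std::size_t buffer_chars, NodeList& out) {
    assert(node != nullptr && buffer_chars > 0);

    std::vector<wchar_t> buffer(buffer_chars, L'\0');
    sol_GetNodeContents(node, buffer.data());
    out.push_back(std::make_pair(depth, std::wstring(buffer.data())));

    int children = sol_CountLeafs(node);
    for (int i = 0; i < children; ++i)
        collect_nodes(sol_GetLeaf(node, i), depth + 1, buffer_chars, out);
}
```

To cover a complete result pack, the same helper would be called once per root, iterating the graph indexes up to sol_CountGrafs and the root indexes up to sol_CountRoots.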


Mental Computing 2009
last change 16-Aug-11