SoftInform Search Technology provides fast and accurate search for documents of similar content in data volumes of any size. It helps avoid duplicated information, can be integrated into any application system, and supports a wide range of applications for both corporate networks and global Internet solutions. The technology works with all the popular text file formats (txt, doc, rtf, pdf, htm, html) and can connect to most popular database systems (for example Access, MS SQL, Oracle, and any other database system that supports SQL).
SoftInform Search Technology is based on a unique mathematical model that analyzes document structure and selects similar words, word combinations, sentences and text fragments. The search takes into account every word encountered in the text, with proper consideration for all possible word forms (stems) and synonyms.
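The SoftInform model itself is not described here, so the following is only a minimal sketch of the general idea of comparing documents after reducing words to stems and synonyms. It swaps in a plain Jaccard overlap of normalized word sets; the synonym table, the suffix list and all function names are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical synonym table: maps a word to one canonical representative.
SYNONYMS = {"automobile": "car", "vehicle": "car", "quick": "fast", "rapid": "fast"}

# Toy suffix list (an assumption); a real system would use a proper
# stemmer or a morphological dictionary.
SUFFIXES = ("ing", "ed", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Return the set of canonical terms (stemmed, synonyms merged) in text."""
    terms = set()
    for word in re.findall(r"[a-z]+", text.lower()):
        word = stem(SYNONYMS.get(word, word))
        terms.add(SYNONYMS.get(word, word))
    return terms


def similarity(a, b):
    """Jaccard overlap of the normalized term sets, from 0.0 to 1.0."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def most_similar(query, documents):
    """Return document names ordered from most to least similar to query."""
    return sorted(documents, key=lambda name: similarity(query, documents[name]), reverse=True)
```

With this sketch, "quick automobiles" and "fast cars" reduce to the same term set, so a document can be found even when it shares no literal word with the query.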
Software products
SearchInform is an application designed for quick information search in large data volumes, among documents of any type as well as in various databases. It combines phrasal search, which takes stemming and a synonym dictionary into account, with the new SoftInform Search Technology for finding documents of similar content. The application considerably improves the quality of information processing and reduces search time. The search speed is approximately 4-10 times higher than that of existing search systems. To make a search more specific, SearchInform offers an "important words" function: words marked as important are given priority by the application when it conducts additional similarity analysis.
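How SearchInform weights "important words" internally is not stated, so the sketch below only illustrates one plausible reading: a weighted word overlap in which important words count more than ordinary ones. The function names, the `important` parameter and the default `boost` of 3.0 are assumptions made for this example, not part of the product.

```python
import re


def tokens(text):
    """Lower-cased set of alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def weighted_similarity(query, document, important=(), boost=3.0):
    """Weighted overlap: important words count `boost` times, others once."""
    q, d = tokens(query), tokens(document)
    important_set = {word.lower() for word in important}

    def weight(word):
        return boost if word in important_set else 1.0

    total = sum(weight(word) for word in q | d)
    if total == 0:
        return 0.0
    return sum(weight(word) for word in q & d) / total
```

For the query "oracle backup schedule", the documents "oracle backup guide" and "backup schedule guide" score the same without important words; marking "oracle" as important ranks the first document higher.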
Main Features of SearchInform:
- Phrasal search that takes morphology and the synonym dictionary into account;
- SoftInform Search Technology, the new technology for finding similar documents;
- High indexing speed (from 15 to 30 Gb/hour);
- Index size of about 15-25% of the amount of actual text information;
- Support for over 50 widely used text file formats, Outlook and TheBat e-mail messages, mp3 and avi tags, as well as ICQ 99-2005 and Microsoft Messenger logs;
- Flawless operation with archives;
- Universal data sources (indexing of DBMS fields and fields of various information systems).
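The indexing speed and index size figures in the list above can be turned into a rough capacity estimate. The sketch below is plain arithmetic on those published ranges; it assumes that "Gb" in the speed figure means gigabytes and that the corpus size refers to the text content being indexed. The function name and its parameters are hypothetical.

```python
def estimate_indexing(corpus_gb, speed_gb_per_hour=(15.0, 30.0), index_ratio=(0.15, 0.25)):
    """Return best/worst case indexing time and index size for a corpus.

    speed_gb_per_hour and index_ratio default to the ranges quoted in the
    feature list (15-30 per hour, 15-25% of the text volume).
    """
    slowest, fastest = speed_gb_per_hour
    smallest, largest = index_ratio
    return {
        # (best case, worst case) duration in hours
        "hours": (corpus_gb / fastest, corpus_gb / slowest),
        # (smallest, largest) expected index size in the same unit as corpus_gb
        "index_gb": (corpus_gb * smallest, corpus_gb * largest),
    }
```

Under these assumptions, a corpus of 60 units of text would take between two and four hours to index and produce an index of roughly 9 to 15 units.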