You probably already have a controlled vocabulary that you can easily use for your lexicon: for example, your list of products, your site A-Z index, CRM topics, FAQs, knowledge base, or thesaurus. If not, use LexIt to auto-build your lexicon. LexIt is available when you sign up for LookAhead.
LexIt can process up to 10,000 of your documents/web pages per hour (via the Web, and depending on your server bandwidth and latency). Product names, entities, concepts, titles, etc. are automatically extracted, along with links (URLs) to their respective source pages, as part of the LexIt lexicon. LexIt can extract from multiple document types such as HTML pages, PDF files, Word files, etc.
Example extractions from pages/sites (English):
From 500 pages from WebMD.com Health, LexIt extracted these A-B terms/concepts.
From this 95-page Web site VORT Corporation, LexIt extracted these terms/concepts.
From this one Web page from Palm, LexIt extracted these terms/concepts.
Example extractions from pages/sites (non-English; 25 random pages from each site):
From Der Weg, LexIt extracted these terms/concepts.
From Cultura e storia d'Italia, LexIt extracted these terms/concepts.
How does LexIt do that? LexIt is an automated process. An almost magical process, because it doesn't need you to first create a "seed" lexicon or define domain constraints. What is amazing about LexIt is the great starting point it provides for developing your site's lexicon! LexIt does about 95% of the work; you do the remaining 5% of tuning (see Tips for using LexIt). No extraction process is 100% perfect. LexIt will extract some terms/phrases that are not appropriate for your lexicon, and some that belong in it will be missing. The best way to determine whether LexIt will generate terms that help you develop your lexicon is to try it, either via the Trial or by purchasing a Crawl Limit of, for example, 500 or 1,000 pages. Our beta testers have found value in simply studying the terms and concepts extracted, even without the intent of developing a lexicon (e.g., for SEO analysis, business intelligence, etc.). LexIt is priced aggressively.
Your site may have hundreds or thousands of pages. LexIt can crawl your site and extract terms, names, concepts, products, phrases -- all context-laden strings of text that help your users "focus" in on the content of your site. With LexIt, your lexicon for use with LookAhead can be developed rapidly and cost-effectively, and it need not be hierarchical, thanks to the unique features of LookAhead's drop-down search palette (SearchPal).
Terms, products, and concepts extracted by LexIt are downloadable from your LookAhead Account as a .txt file for your editing. You then edit, delete, add to the list and use it as the basis for your LookAhead lexicon. Once you have edited your terms, you use your LookAhead Account to import the lexicon file from your desktop back into LookAhead.
All lexicon terms are automatically "rotated" as they are imported into LookAhead. For example, L.L. Bean users could find "True Comfort Footwear" by starting to type "com...", "fo...", or "tru...". The instant display of rotated terms makes browsing your site fast and encourages discovery.
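
To make the idea of "rotated" terms concrete, here is a minimal Python sketch. It is not LookAhead's actual implementation, which is not described here. It assumes rotation happens at word boundaries (as in a permuted index) and that matching is case-insensitive. The function names rotate_term, build_index, and lookup are hypothetical and chosen only for illustration.

```python
def rotate_term(term):
    """Return every rotation of a term that starts at a word boundary.

    Assumption: rotation is word-based, so a three-word term yields
    three rotations, one beginning with each of its words.
    """
    words = term.split()
    return [" ".join(words[i:] + words[:i]) for i in range(len(words))]


def build_index(terms):
    """Build a sorted list of (lowercased rotation, original term) pairs."""
    index = []
    for term in terms:
        for rotation in rotate_term(term):
            index.append((rotation.lower(), term))
    return sorted(index)


def lookup(index, prefix):
    """Return the original terms that have a rotation starting with prefix."""
    prefix = prefix.lower()
    matches = []
    for rotation, term in index:
        if rotation.startswith(prefix) and term not in matches:
            matches.append(term)
    return matches


if __name__ == "__main__":
    index = build_index(["True Comfort Footwear"])
    for typed in ("com", "fo", "tru"):
        print(typed, "->", lookup(index, typed))
```

Under these assumptions, each of the three typed prefixes leads back to the same original term, because every word of the term begins one of its rotations. A linear scan keeps the sketch short; a larger lexicon would more likely use a binary search over the sorted index or a prefix tree.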