Latent Semantic Indexing in SEO and LSI Keywords based Content.
Over the years, there’s been a lot of speculation about what Latent Semantic Indexing (or LSI, as it’s more commonly known) actually is, both as a system used with SEO and as a content marketing guide. There has been just as much speculation about how it’s used and, more to the point, how a search engine like Google might be using it to help assign meaning or a “relevance” score to a given web page against a given search keyword or term.
There are no clear-cut definitions of how LSI keywords work, because there are derivations of the original scientific subject matter and, along with this, there are “opinions” of what a given search engine’s implementation may actually look like deep inside its algorithm. These, coupled with a nothing short of “astonishing” lack of understanding by the so-called “experts”, have given rise to a plethora of optimization tools, guides and documents, each describing a different thing, or in fact nothing at all, never mind the right thing.
An introductory beginner’s definition of Latent Semantic Indexing for an SEO guide reads as follows :-
“A method of identifying words and terms which are semantically or topically related to a given word or searched for string of words to aid in document retrieval.”
And that’s it in a nutshell. Of course, with more experience we can go on to say how this is achieved and then speculate on how it might be of use as a strategy (in addition to getting reviews etc.) for improving the quality of our SEO. It may help your business marketing and bring in more search traffic without PPC. However, speculation is all it is, unless of course you work as a software engineer for a search engine which has implemented the LSI relationship concept as a page ranking factor.
SEO results change with trends in search engine technologies, so a project which has effective, on-topic page content will get high search traffic only whilst the current modelling remains. Many users in the industry have changed their websites and moved towards PPC services and other advertising as a result of no longer ranking for their keyword searches. In such cases, link building with extra search engine optimisation may not work, even with an excellent SEO service making the changes.
Step 1 – Search for Semantically Related Keywords and LSI Business Terms for Indexing.
So on we go. Let us start by taking an example search term or phrase which we can use for the purpose of this document to query semantically related keywords. I am going to use more than a single word: I am going to use a person’s name, “Albert Einstein”. There we have it. Everyone knows the name, and better still, most people know what Albert is famous for.
As a search engine, prior to indexing a new page, I may want to find and perform a basic analysis on words and short phrases which are associated with “Albert Einstein” (because as the engine, I have already worked out, by means we can only guess at, that the page “wants” to rank for the scientist’s name). Luckily for me, I have billions upon billions of documents describing everything from how to make a cake to how to better understand LSI. Better yet, I have a machine powerful enough to learn about and process these unstructured documents.
The first step I take will be to look for all articles including the term I am searching for with my query. This will be known as my “document set”. Imagine if you will – I have just run a computer program which has collected around 46,000,000 web pages together, all of which contain “Albert Einstein”.
The next step is to extract every single word from every single one of these documents and build a simple matrix which contains each word found, and a count of the number of documents within the set which use that word. Our matrix might look like this :-
and so on….. but you get the picture: relativity will be found in a high percentage of the pages which contain a match with our search term. In the example, you can see that all but a million of the 46,000,000 pages contain the word relativity. This is because, across the entire planet (don’t forget, everyone puts pages up these days), our famous scientist is often written about in conjunction with his theory. His theory is now semantically related to his name by definition of the entire World Wide Web. He is heavily bound to it, and the name is therefore treated as synonymous with the term, even though they are not actually true synonyms! We must keep that in mind, since it is semantics we are learning about here.
The other words you see in the list are also related to Albert but not as strongly as those at the top, and the list goes on within the model until eventually we reach the stage where there are no words left which appear in 2 or more documents. We now have a table which contains exactly every word everyone ever published online alongside his name (more than once).
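The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any search engine is known to do it: the four-sentence corpus, the function name document_frequencies and the min_docs parameter are all made up for the example.

```python
from collections import Counter
import re

def document_frequencies(documents, term, min_docs=2):
    """Count, for each word, how many documents in the set contain it."""
    term = term.lower()
    # The "document set": only documents that mention the search term.
    doc_set = [d for d in documents if term in d.lower()]
    counts = Counter()
    for doc in doc_set:
        # set() so a word is counted once per document, not once per use.
        counts.update(set(re.findall(r"[a-z]+", doc.lower())))
    # Keep only words which appear in at least `min_docs` documents.
    table = {w: n for w, n in counts.items() if n >= min_docs}
    return table, len(doc_set)

# Hypothetical miniature "web" standing in for the 46,000,000 pages.
corpus = [
    "Albert Einstein published the theory of relativity",
    "Relativity made Albert Einstein famous as a physicist",
    "Albert Einstein was a physicist who played the violin",
    "How to bake a sponge cake",
]

table, set_size = document_frequencies(corpus, "Albert Einstein")
for word, n in sorted(table.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{word}: {n} of {set_size} documents")
```

Note that everyday words such as “the” and “a” survive the count as well. A real implementation would presumably down-weight them, for instance with inverse document frequency, but that is beyond this sketch.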
Keywords content for SEO and LSI for a Google search advantage
At this point it is important to know how accurate this keyword content methodology is in determining relevance, and what advantage it gives for Google search SEO. Consider for a moment the word “tarmac”. I haven’t checked, but I very much doubt there are many documents out there which use the word “tarmac” alongside our famous scientist, so it would not show up as an LSI keyword. That is the point: by its very nature, and given such enormous data sets, LSI will collate the most relevant documents by virtue of them containing the most commonly associated (written) words. Sounds useful? I think so.
Step 2 – Better Intent Keywords and Indexing of the Web Site Content Means Free Marketing for Your Business.
We (as a search engine) now have a key search term / query (a name in this case) and a web page, and now we need a relevance “score”, which is what ultimately helps the marketing for your business site get you more traffic. The score itself is hugely variable in how it might be calculated, but you can see that if Latent Semantic Indexing is involved in establishing relevancy within a document collection AND the web page contains a high number of high-scoring (by document frequency) semantically and intent-related words (or an even spread; again this depends on the indexing implementation), then the page will be given a high score and thus rank higher for the term it is indexed as being relevant for.
Another consideration when assigning the score prior to indexing, which is currently under research, is the use of LSI within weighted parts of the document. These might include the page title or h1 tags, for example, where a “sprinkling” of related words could attract a higher rating than the one applied at the time of prevalence counting. Every bit helps to enhance term meanings.
For example, we may see the word atom within a title tag. If it appeared in a p tag section, it might have a score of x, whereas in a title tag it might have a score of x + a title value adjustment. Adjustments like these could, in theory, increase the overall relevance of a document to its associated term.
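To make the idea of a score concrete, here is one possible way it could be calculated, written as a small Python sketch. It is speculation like everything else in this guide: the function relevance_score is hypothetical, and apart from the relativity figure from the example above, the counts are invented for illustration.

```python
import re

def relevance_score(page_text, doc_freq, set_size):
    """Sum the document-frequency share of each distinct related word on the page."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    # Each related word contributes the fraction of the document set using it.
    return sum(doc_freq[w] / set_size for w in words if w in doc_freq)

# Hypothetical counts; only "relativity" follows the article's example.
doc_freq = {"relativity": 45_000_000, "physicist": 30_000_000, "violin": 2_000_000}
set_size = 46_000_000

on_topic = "Einstein the physicist and his theory of relativity"
off_topic = "Einstein played the violin"

print(relevance_score(on_topic, doc_freq, set_size))
print(relevance_score(off_topic, doc_freq, set_size))
```

The page using the strongly associated words scores higher than the page using only a weakly associated one, and a page mentioning nothing from the table scores zero.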
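The title adjustment can be sketched the same way. Again, this is a guess at the shape of the idea rather than a known implementation: weighted_score, the TITLE_ADJUSTMENT value of 0.5 and the atom counts are all assumptions chosen for the example.

```python
import re

TITLE_ADJUSTMENT = 0.5  # hypothetical bonus for a related word found in the title

def tokenize(text):
    """Return the distinct lowercase words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def weighted_score(title, body, doc_freq, set_size, title_adjustment=TITLE_ADJUSTMENT):
    """Score a page, adding a bonus for related words which appear in the title."""
    title_words, body_words = tokenize(title), tokenize(body)
    score = 0.0
    for word in title_words | body_words:
        if word not in doc_freq:
            continue
        x = doc_freq[word] / set_size      # base score, as for a word in a p tag
        if word in title_words:
            x += title_adjustment          # x + title value adjustment
        score += x
    return score

# Hypothetical figures: "atom" appears in 10 of 20 documents in the set.
doc_freq, set_size = {"atom": 10}, 20
print(weighted_score("", "all about the atom", doc_freq, set_size))   # word in body only
print(weighted_score("The atom", "", doc_freq, set_size))             # word in title
```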
The lengths to which a smart computer algorithm can go to in order to adjust document to term relevancy are endless – this is a very simplified look at what may be in place but the available resources would, I think, be sufficient. Great SEO results are the ultimate goal here.
Once the finding and indexing process is complete, the page becomes available for access by the search routines. When the name is entered into the search box – the returned results list can be sorted by those with the best relevancy – amongst a multitude of other factors, which may or may not outweigh the technical relevancy calculations.
Step 3 – When Writing Semantically for SEO, Most LSI Strategy Keyword Tools can Help with the Project Work.
Keyword research strategy and Intent within the context
With the above in mind, to be relevant for a given topic and intent, it is worthwhile to perform keyword research and identify semantically related words within the context of your document. Target these keywords as a strategy and use them to provide a detailed article and improve your content and its intended meaning. By writing semantically with this in mind, your page may see an increase in visitors via SEO and will better answer questions for readers. Our LSI tools can help with this process.
As an example, it is generally accepted that it is no longer (and has not been for a long time now) just a case of identifying a group of words, e.g. “pot, potting, pots, plantpot, potter”, and hoping to rank for pots. Although not proven, it is always worth bringing in specific words actually related to your context, as building a richer document may help with getting a higher ranking. Remember that LSI is about semantics, not just plurals, similar topics and other variations of the same word. Writing about pots, you might need words like “seeds”, “compost”, “roots” etc., depending on the type of information you want to be indexed for on matching queries, along with a boost from Latent Semantic Indexing concepts.
Marketing projects and Google search.
If you’re still not convinced to use this in marketing projects, think about this, and please keep in mind that it is an experiment in language, words and text which shows the extreme. A document you have an idea for is under development. The document is about the concept of knowledge. Imagine where you might rank in Google search if the only word on the page was knowledge. How would a search engine determine any particular relevancy to something more specific? Now add the word “gain”. A user might see that and still not know what the page was about. A search engine would have no idea either. By adding another word, “school”, you can see that your simple three-word page now looks more credible. Think on: to be successful you need more words, so add in “books”. The page is building on what it might mean. Now add in the word “tips”, carry on adding a thousand more related words, and you have the big picture.
Conclusion on Latent Semantic Indexing benefits and your focus and targeting on the terms and search engines.
This is a summary and conclusion, but make no mistake: there is a lot of science and mathematics to understand behind effective latent semantic indexing. The detail of latent semantic analysis covers areas such as singular value decomposition and inverse document frequency, and it also has its known downsides. But at the end of the day, remember that we don’t even know for certain how much of a role LSI actually plays in the day-to-day indexation and ranking of web pages by any search engine or other business in the field. Using the above guide to focus on the terms you are targeting gives a good idea of what might be happening and, if nothing else, you can certainly use the overall concept and our agency services, if needed, to find LSI keywords and enhance your existing documents’ search engine optimization. It may even help you create new ones with thicker content to help with a rank increase overall on the search engines. If you have any questions for us, please send us a message.
If you would like a free Latent Semantic Indexing glossary and concept trial on your domain from our agency – complete the form below to get in touch and we will provide the information you need to address your strategy using our custom LSI keyword tool – free of charge for use in your browser!