LSI (Latent Semantic Indexing) is a way that search engines determine whether your content is really on-topic and in-depth or just spam. The search engines determine this by looking at the words in an article and deciding how relevant they are to each other. For instance, for an article about computers, the search engines know that the following words are closely related to “computers” and will probably appear in any good article about computers: hard drive, cpu, RAM, monitor, motherboard, ghz, mhz, Intel, Nvidia, etc… These are known as LSI terms.
So, here’s what happens when the search engine finds your article:
1. It reads the article
2. It determines the “keyword density” of each word or phrase in the article. This means that it looks at the total number of words in the article and counts how many times particular words or phrases are repeated. Words and phrases that are repeated more often have higher keyword density. This is how the search engine knows what your article is about. So for an article about “desktop PC cases,” that phrase might appear 4 times in your 700-word article. That would give it a density of about 1.7% (12/700 ≈ 1.7%).
We use 12 because “desktop PC cases” is made up of 3 words, so 3 words × 4 appearances = 12. You can help the search engine figure out what your article is about by including the keyword in the title, first paragraph, and last paragraph of your article, as the search engines know to put extra emphasis on these areas of the article.
3. It picks out the words and phrases with the highest keyword density and uses those to determine what the article is about (in essence, the article is assigned a “relevancy score”). So for our “desktop PC cases” example, if it finds a high keyword density of “desktop PC cases” then it knows to expect high densities of other related terms (LSI terms), like: ATX, cooling, power supply, motherboard, gaming case, custom case, etc…
The search engines know what related terms to expect for any given keyword; they have gotten pretty smart. So if they expect to see certain related keywords in the article but they don’t find those keywords, they assign it a lower relevancy score. This directly impacts where that article will rank in the search engines when someone searches for your target keyword.
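To make the arithmetic above concrete, here is a minimal Python sketch of the two checks described in this walkthrough. It is only an illustration of the article's own formula, not how any real search engine scores pages. The function names (`keyword_density`, `related_term_coverage`) and the idea of passing in a hand-picked list of related terms are my assumptions; the search engines do not publish the terms they expect.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Share of the article's words taken up by occurrences of `phrase`.

    Follows the article's formula: (words in phrase x appearances) / total words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    if not words or n == 0:
        return 0.0
    # Count positions where the phrase's words appear consecutively.
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    return hits * n / len(words)


def related_term_coverage(text: str, related_terms: list) -> float:
    """Fraction of the hand-picked related terms that appear at least once."""
    if not related_terms:
        return 0.0
    lowered = text.lower()
    found = [
        term for term in related_terms
        # Word boundaries keep "ram" from matching inside "program".
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    ]
    return len(found) / len(related_terms)
```

Running `keyword_density` on a 700-word article that uses “desktop PC cases” 4 times gives 12/700, the same 1.7% worked out by hand above. A low result from `related_term_coverage` is a hint that the article may be missing terms a reader (or a spider) would expect to see.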
LSI is a key concept in SEO (search engine optimization). Search engine algorithms are always improving, and right now they are rewarding content that balances LSI terms with the article’s main keyword. So if your plan is to use the content that you write to build a website, optimize that website for search engines, and monetize the traffic that comes to it, it’s important that your articles have a good mix of LSI terms. So when I read articles, I read them from two different perspectives:
1. Human (Does it read well?)
2. SEO Specialist/search engine spider (Is there a good keyword density? Are there lots of LSI terms?)
So for future article marketing efforts, try to incorporate LSI terms. Just think about what terms are unique to the niche you’re writing for. Often, these terms will occur organically as you write. But for best results, you should be conscious of how the search engine spiders are going to read your content.
Take this blog post, for example. The main topic of this post is LSI (Latent Semantic Indexing), but I’ve sprinkled a bunch of LSI terms into this post:
All these terms are related to LSI and the broader category of SEO, which the search engines will recognize when their spiders crawl this post. And the result? This blog post will be assigned a higher relevancy score for those categories. And that means it’ll rank higher in the search engine results pages (SERPs). Simple as that.
Now, don’t get me wrong; LSI isn’t the only factor that determines how your content ranks in the SERPs. In fact, it’s only one of hundreds, if not thousands, of factors. That said, it’s gaining importance in the search engine algorithms, so it’s worth thinking about when you are writing content for SEO.
Search trends depend on a number of closely interacting technologies, and you need to be aware of how they’re changing if you want to stay ahead of the competition, especially as rates of change accelerate across the board. Among other classes of technology, there are device technologies, which have given us mobile devices and more sophisticated forms of local search; web technologies, which have made it possible for more companies to make more creative websites; and raw search technologies, which make search faster, easier, and more relevant for users.
Of the search technologies, one of the most fascinating, and the fastest changing, is semantic search: the ability of search engines to recognize and interpret the natural language of their users’ queries. Semantic search is evolving in some astounding ways, and the sooner you start adapting to them, the better.
In the early 2000s, there was no such thing as “semantic search,” and natural language recognition seemed like a distant dream for AI. Search engines functioned using a keyword-based mapping system; they would identify certain keywords and keyword phrases in your query, then generate a list of the places on the web where those terms were used most frequently and most prominently. Over the years, this process became more sophisticated, weeding out unnaturally keyword-stuffed pages and mapping more complicated phrases, but it basically functioned the same way.
Google’s Hummingbird update changed the game when it came out in 2013. Rather than using keywords to find the most relevant results for a query, Hummingbird could interpret the intention of a user query based on its phrasing, and find relevant entries from there. Its emergence marked a significant departure from keyword-based strategies of search optimizers, instead forcing content marketers to try harder to answer user questions, concerns, and interests.
Late last year, Google added a new machine learning algorithm to Hummingbird called RankBrain. The goal of the algorithm is to improve Hummingbird’s semantic search capabilities by gradually learning more about the way people talk (and enter queries into search engines). Though semantic search is already pretty impressive, it struggles when a user’s query is especially wordy, complex, or ambiguous. RankBrain learns from prior experiences, essentially updating itself, and eventually becomes able to break those complex and indecipherable queries down into more manageable chunks. It’s a sign of Google’s commitment to never-ending phases of improvement. Because it doesn’t have to wait for engineers to make manual updates, this automated algorithm will be able to develop faster than ever.
You’ve undoubtedly noticed a surge in “rich answers,” the term for concise entries in SERPs that are given prominence above standard search results. These can take the form of images, sentences, paragraphs, numbers, or any other type of answer that can immediately and concisely address your query (without ever requiring you to click through to a separate page). These are rising in prevalence for three reasons:
This is one of the most important effects of increased semantic search analysis, as it reduces reliance on external web pages to answer questions. It’s been argued that this will eventually stifle search traffic to all websites, but we’ll cross that bridge when we come to it.
Related questions are also seeing a rise—especially over the last few months. You may see these popping up about halfway down your search results, prompting you to investigate similar or frequently asked questions related to your original query. However, you’ll notice the answers to these questions often differ from their rich answer counterparts, implying that a separate algorithm is responsible for generating them. It’s unclear how all this ties together, but it’s clear Google has a long-term plan for query pattern recognition in addition to basic semantic understanding.
If you’ve read this article with an SEO perspective in mind, you may be wondering how all this affects you. Yes, it’s interesting to learn the inner mechanics and history of Google’s semantic search capabilities, but what practical information can you walk away with?
First, understand the key areas that Google is developing (either through more manual updates or with its new machine learning algorithms): voice search, semantic understanding, rich answers, and related questions. Google’s main concern is getting correct, relevant information into the hands of searchers as quickly and easily as possible.
Your goal, therefore, should be to help Google get the job done. Spend more time researching common questions in your industry, and write answers to them. Explore complex, niche topics, and microformat your site so Google can scan it for the answers. Become known as an authority and provide the information that your users want, and you’ll be rewarded in the form of more visibility. It’s as simple as that.
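As one possible way to mark up question-and-answer content so a crawler can find it, here is a small Python sketch that builds structured data for an FAQ page. Note that this swaps in schema.org JSON-LD (the `FAQPage`, `Question`, and `Answer` types) rather than classic microformats; that substitution is mine, as are the function name `faq_jsonld` and the sample question. Nothing here guarantees that Google will show your answers more prominently.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical example content; replace with real questions from your industry.
markup = json.dumps(
    faq_jsonld([("How often should I see a dentist?", "Placeholder answer text.")]),
    indent=2,
)
```

The resulting `markup` string would normally be placed in the page inside a `<script type="application/ld+json">` tag.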
Semantic search instead attempts to analyze the intent behind a user’s query, so in our example above, rather than mapping out the keywords included in the query, it would examine the entire phrase and determine that this user is trying to find the highest rated dentist in the city of Bristol. It would then use contextual clues from sites and offsite indicators to evaluate which dentists operate in Bristol, and of them, which are the best.
Knowing this, you can start making the meaningful changes necessary to ensure your site is evaluated and listed properly.
Your first step is to adjust your page titles (and meta descriptions, while you’re at it). It’s still a good idea to use words that are relevant to your business, and words that people might include in their searches, but there are a few more factors to consider.
First, make sure your phrasing is natural, and not clunky. Keyword-centric optimization might have you writing titles like “Dentist oral surgeon in Bristol TN,” which doesn’t sound like a sentence a normal person might write. Write in full, concise phrases, and be as accurately descriptive of your pages as possible. As long as there’s a strong indication of who you are and what you do, you’ll be in good shape.
Second, be careful of repetition. Keyword-centric optimization would have you repeating a specific phrase on multiple titles and descriptions throughout your site. In semantic search, this can actually work against you. Feel free to target a few phrases that might give you a competitive edge, but keep your pages as diverse as possible.
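If you want a quick way to spot this kind of repetition, the sketch below lists phrases that show up in more than one page title. It is a rough audit aid under my own assumptions (the function name `repeated_title_phrases`, three-word phrases, and a threshold of two pages are all arbitrary choices), not a rule any search engine is known to apply.

```python
from collections import Counter


def repeated_title_phrases(titles, n=3, min_pages=2):
    """Return n-word phrases that appear in at least `min_pages` titles."""
    counts = Counter()
    for title in titles:
        words = title.lower().split()
        # A set ensures each phrase is counted at most once per title.
        phrases = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(phrases)
    return {phrase: c for phrase, c in counts.items() if c >= min_pages}
```

For example, passing in the titles “Dentist oral surgeon in Bristol TN” and “Dentist oral surgeon reviews” would flag “dentist oral surgeon” as a phrase shared by two pages, which is a cue to vary one of them.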
Ongoing content is your best chance to optimize for semantic search. Oftentimes, people will type full questions or long-tail queries into Google, and it then becomes Google’s job to find the content that best answers the user’s question, not the content that shares the most keywords with the query. Accordingly, your content should be focused on succinctly and descriptively answering as many potential user queries as possible.
“How-to,” “why,” and “what” articles are amazing tools for this. Get to know your existing customer base, and figure out what common questions they had when they were first searching for a business like yours. Write posts that directly answer those questions (with descriptive, pointed titles), and you should have little trouble ranking for those queries when they arise. The more specific your niche here, the better.
When it comes to writing onsite content and ongoing articles, there isn’t much you’ll have to change in your approach. However, there are two considerations you should incorporate. First, remind yourself that it’s not necessary to stuff keywords into your articles. Focus your efforts on being concise and descriptive, and the rest should come naturally. Second, know that most semantic queries are long and conversational, so try to make your content a little more conversational accordingly. Conversational, casual tones are more approachable for readers, so in addition to maximizing your potential visibility, you’ll also increase your retention.
Finally, I want to mention RankBrain. RankBrain is Google’s new AI add-on to Hummingbird, designed to update Google’s algorithm automatically and regularly to improve its semantic understanding of queries. Simply put, its job is to figure out complex, ambiguous types of queries and map them to simpler, more natural versions. Accordingly, your content strategy should be focused on the simpler, more natural versions of queries. Instead of shooting for a rare, niche audience with complexly worded phrases, try to keep your voice as natural and concise as possible.
Google is already quite impressive. It’s able to reasonably guess the meaning behind your given search phrase, even with the elementary Hummingbird update. But the future of semantic search will likely extend far beyond the current limits of algorithm technology.
Already, Google is beginning to incorporate various external factors into its search results, based on your own personal data. It might creep you out to learn that this is happening, but it’s also giving you much more relevant results. Google likely knows exactly where you live, and can use your previous search history to customize predictive search results.
If we take those factors and incorporate them into an environment that is built on semantic search, we end up with a search engine that can guess users’ intentions based on their previous behavior, maybe even before they search. By using big data to analyze and interpret patterns of behavior based on individual habits, time of day, social media activity, and even recent news, Google could take the world of search in a direction previously limited to science fiction. We’re likely a decade or more away from building a machine that can accurately guess what you’re thinking, but knowing Google, we’re probably already closer than you think.
In some ways your content marketing strategy shouldn’t change. Presently, subject-focused content strategies tend to pay off. Writing about a given topic will naturally attract people searching for keywords related to that topic. It’s all about giving people what they’re looking for, and that fundamental principle will remain firm.
However, in order to adapt to the coming revolution of semantic search, you need to go a step further. You need to understand why people are searching for a given topic. It’s a fancy way of saying you need to understand your demographics better, through surveys, studies, and big data analysis. Understand exactly what motivates your customers to search for a given topic, and extend your content strategy to cover those peripheral motivators.
Doing so will put you ahead of the search engines—Google will attempt to understand what’s motivating your customer, but you’ll already know. And if you can provide that to them with relevance, uniqueness, and quality, Google will reward you with a high rank.
Semantic search isn’t going away anytime soon, and your competition may already be making plans to conquer it in their own way. Keep this in mind as you audit your website and as you analyze and shape your strategy this year and beyond. Success in SEO isn’t about finding something that works and sticking with it forever; it’s about constantly refining your approach to accommodate these captivating new trends as they emerge.