What is the Google SMITH algorithm?

Written by Josh Westerman

It was French writer Jean-Baptiste Alphonse Karr who wrote in July 1848 “plus ça change, plus c’est la même chose” – or to us English-speaking folk, “the more things change, the more they stay the same.”

Karr’s notion can be applied to Google and its algorithms. Although updates change how the search giant crawls and understands your content, the goal always stays the same – to serve the user the best possible answer to a search query through content.

The introduction of the Google BERT algorithm in 2019 was a landmark occasion for how the context of content was understood, with even Google stating at the time that it was “the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of search.” You can read our more in-depth piece on Google BERT here.

But the limitations of BERT were always clear – great at understanding sentences and shorter pieces of content, but less so longer pieces of content and, in particular, the individual passages within them. Step forward SMITH.

The Siamese Multi-depth Transformer-based Hierarchical (SMITH) Encoder

Google SMITH is a new algorithmic model that tries to understand the entire context of a passage of content. The algorithm will also predict what comes next in a sentence which follows the initial query, to push the user in the right direction of the online universe.

Taken from Google’s research paper, “The SMITH model which enjoys longer input text lengths compared with other standard self-attention models is a better choice for long document representation learning and matching.”
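To make the idea of "long document representation learning and matching" a little more concrete, here is a minimal Python sketch. It is not Google's implementation: word counts stand in for the transformer encoders, and the function names and the two-sentence block size are our own assumptions for illustration. It only shows the general shape: split a long document into smaller blocks, encode each block, combine the blocks into one document representation, then run two documents through the same encoder (the "Siamese" part) and compare them.

```python
import math
import re
from collections import Counter

BLOCK_SIZE = 2  # sentences per block; a hypothetical toy value, not from the paper


def split_into_blocks(text, block_size=BLOCK_SIZE):
    """Split a document into blocks of consecutive sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + block_size])
        for i in range(0, len(sentences), block_size)
    ]


def encode_block(block):
    """Toy block encoder: word counts stand in for a transformer encoder."""
    return Counter(re.findall(r"[a-z0-9']+", block.lower()))


def encode_document(text):
    """Toy document encoder: combine the block vectors into one document vector."""
    doc_vector = Counter()
    for block in split_into_blocks(text):
        doc_vector.update(encode_block(block))
    return doc_vector


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_documents(doc_a, doc_b):
    """'Siamese' matching: the same encoder is applied to both documents."""
    return cosine_similarity(encode_document(doc_a), encode_document(doc_b))
```

Calling match_documents on two long articles gives a score between 0 and 1, with higher scores for documents that share more vocabulary. A real model would learn far richer representations than word counts, but the flow of blocks first, then the whole document, then a comparison is the part worth taking away.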

For the more technical elements, you can read Google’s research paper here which offers more insight into the algorithm’s details.

There hasn’t actually been an official announcement that the SMITH algorithm has been rolled out, but the news that Google is now looking to index specific passages of content within longer pieces is a clear indicator that a move to understanding long passages is on the horizon.

How is SMITH different to BERT?

The major difference between these two algorithms is the length of the content which is contextually understood. Google BERT focuses on the context of a specific sentence within a passage of content, whereas SMITH will be able to digest and understand longer passages of content.

When first launched, BERT was heralded as a gamechanger for the world of search and, make no mistake, it still works wonders in terms of grasping the context and lexical ambiguity of language. However, its weakness is in understanding much longer passages of content. Enter SMITH.

It is worth noting that SMITH isn’t a direct replacement for BERT. Google SMITH will act as a supplement for the BERT algorithm, picking up the heavy lifting of longer content pieces which BERT doesn’t have the capacity to do.

What does the Google SMITH algorithm mean for content creation and search?

In short, when (rather than if) SMITH is rolled out, it won’t change much in terms of how we create content. In-depth, informative content has always been at the forefront, and the introduction of SMITH means it will be better understood by Google and served to users.

The algorithm is merely part of the search evolution Google is undertaking, changing how content is understood so that it can be served to the right user – it’s the same premise as every Google update. Just as Karr wrote, “plus ça change, plus c’est la même chose”.

Looking for support on your content and search strategies? Get in touch with us today to learn more about our integrated approach, or contact Josh directly: e: josh.westerman@brand8pr.com t: 07432 655440