An Introduction to Google’s Programmable Search Engine

Google search is powerful, and we all dream of controlling it, even taming it. In fact, we can. Venerable browser filters such as Google Hit Hider block entire domains with one click. uBlock Origin can select and block any page element that distracts from the main results. Less well known is what used to be called Google Custom Search Engine (CSE), which Google renamed Programmable Search Engine (PSE) in 2020. Any user can create their own PSE and give it a list of public URLs; its Google-like search then shows results from those URLs and nothing else. The user can also see image search results for the same query, constrained in the same way. PSE relevance ranking and speed are the same as with a Google search.

If you last looked at CSE years ago, you’ll find that a lot has changed, and not just the name. What hasn’t changed is that nonprofits and educational institutions can opt out of Google text ads. However, it’s no longer just a matter of flipping a toggle and having Google trust you. Now, a nonprofit organization or university must first register with Google for Nonprofits or Google for Education. These two platforms then offer links for creating ad-free PSEs.

GETTING STARTED

A significant initial concern for readers may be query autocomplete. PSEs now benefit from Google Search’s full suite of autocomplete suggestions. (This was not the case before.) Autocomplete can currently be disabled if it is not wanted.

How long does it take to create a PSE? For those familiar with the control panel, creating and testing a small PSE takes only about two hours. But to do it right and make it great, a beginner may need days of reading, learning, and trial and error. There are many requirements to meet and pitfalls to avoid. For example, a big initial hurdle is the tiny upload size limit: a large list of URLs has to be split into blocks, each small enough to be accepted.
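Splitting a large URL list into blocks can be scripted. The following is a minimal Python sketch, assuming one URL pattern per line and a per-upload limit measured in bytes. The function name `split_into_blocks` and the `max_bytes` parameter are hypothetical, and the real limit is not stated here, so take the current figure from the control panel.

```python
from typing import Iterable, List


def split_into_blocks(patterns: Iterable[str], max_bytes: int) -> List[List[str]]:
    """Group URL patterns into blocks whose newline-joined UTF-8 size
    does not exceed max_bytes (an assumed, caller-supplied limit)."""
    blocks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for raw in patterns:
        pattern = raw.strip()
        if not pattern:
            continue  # ignore blank lines in the source list
        size = len(pattern.encode("utf-8"))
        if size > max_bytes:
            raise ValueError(f"pattern is larger than max_bytes: {pattern!r}")
        # One extra byte for the newline that separates it from the previous pattern.
        extra = size + (1 if current else 0)
        if current and current_size + extra > max_bytes:
            blocks.append(current)  # close the full block and start a new one
            current = [pattern]
            current_size = size
        else:
            current.append(pattern)
            current_size += extra
    if current:
        blocks.append(current)
    return blocks
```

Each returned block can then be joined with `"\n".join(block)` and pasted or uploaded separately.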

CHANGES

Those who want an ad-serving PSE, or want to convert an older ad-free one to serve ads, are currently out of luck. In April 2022, monetization was abruptly suspended, except for those lucky users who were already serving ads on at least one PSE. Google said it is “creating a new system for publishers looking to monetize their search engines,” but it hasn’t announced any details or dates yet.

Also gone are Linked CSEs, a powerful way to use your own huge, self-hosted list of URLs (while Google just handled the query processing). Some changes are for the better. In 2019 Google released a new mobile results layout, and in 2020 it significantly improved the presentation of image search results.

Recent undocumented positive changes include the lifting of the cap of 5,000 URLs (Google calls them “patterns”, since you can use a wildcard /*/ in the path) across all PSEs in your account. I run five on my Google account, and at one time I had reached the 5,000 maximum. But last year I discovered that I could start a new PSE, in addition to the previous ones, and add new URLs there that once would have taken me past the 5,000 total.

There have been many other changes and improvements over the past few years. Some may be unwanted, such as breadcrumb URLs on results, but such changes can often be reversed in the control panel. Note that Google is switching users to a new tablet-centric control panel, which currently seems to lack some very important features, such as XML backup export and the URL pattern search box. Hopefully these will be added by the time the switch-over is complete.

SEARCHING

Note that a PSE needs a more sophisticated search query from users than Google search does. This can helpfully reduce CAPTCHA hurdles for complex searches. But not all users will be aware of the need for some complexity. Casual users may test a PSE with just a few words and then be disappointed by poor or few results. Some user training may be required.

Your control panel displays the top user searches. For example, the top searches on one of my PSEs were recently “hockney falco” (art history), “depression affects business” (business studies), and “post-production house” (movie production). These user search phrases were being misused by the SEO crowd and were removed from the API version, but they remain in the control panel.

You can, of course, build your own equivalent of Google search and then regularly crawl your target URLs, if they let you. That’s fine for a university with maybe 100 websites and a repository, all of which it controls. But many third-party websites only allow known crawlers. Some will only allow crawling by Google. There have also been broader policy changes affecting PSEs. For example, Google services are reported to be blocked in China.
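Whether a site lets your own crawler in is usually declared in its robots.txt file. The sketch below uses Python’s standard `urllib.robotparser` to test a made-up robots.txt that admits only Google’s crawler. The file contents, the helper name `may_crawl`, and the user agent “MyUniversityBot” are illustrative assumptions. Robots.txt is also only a declaration, so a site may enforce its policy by other means.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that welcomes Googlebot and nobody else.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/articles/1"
    print(may_crawl(EXAMPLE_ROBOTS_TXT, "Googlebot", page))
    print(may_crawl(EXAMPLE_ROBOTS_TXT, "MyUniversityBot", page))
```

For a live site, `RobotFileParser.set_url()` followed by `read()` fetches the real file instead of the inline example.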

RESOURCES

PSE help pages

PSE help community

PSE blog

Tip: In the main Google search, you can create a temporary mini PSE using the following:

keyword (inurl:2022) (site:wordpress.com | site:squarespace.com | site:wix.com | site:blogger.com | site:tumblr.com | site:typepad.com)
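A query like the one above can be assembled from a list of sites. This is a minimal Python sketch: the helper names `build_site_query` and `google_search_url` are hypothetical, and it simply reproduces the operator layout shown in the tip rather than any official query syntax.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_site_query(keyword: str, sites: Iterable[str], inurl: str = "") -> str:
    """Compose a query limited to the given sites, in the style of the tip above."""
    site_list = [s.strip() for s in sites if s.strip()]
    if not site_list:
        raise ValueError("at least one site is required")
    parts = [keyword]
    if inurl:
        parts.append(f"(inurl:{inurl})")  # optional restriction on the URL text
    parts.append("(" + " | ".join(f"site:{s}" for s in site_list) + ")")
    return " ".join(parts)


def google_search_url(query: str) -> str:
    """Wrap the query in a standard Google search URL, percent-encoding it."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_site_query(
        "keyword", ["wordpress.com", "squarespace.com", "wix.com"], inurl="2022"
    )
    print(query)
    print(google_search_url(query))
```

Keeping the site list in a plain Python list makes it easy to reuse the same set of domains with different keywords.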

Amanda J. Marsh