Searx instance and search result oddities

probably needs to be posted on the searx repo, but figured i’d ask here first to get some insight…

https://search.privacytools.io/

i’m getting very strange results with the Searx instance - for example, if i disable all engines except Startpage, there are often 0 results returned, even with a generic search term such as ‘dog’, yet if i do the same search at startpage.com, lots of results are returned

same happens if i enable Qwant only

if i enable DDG only, i get lots of hits again

the same behavior holds true for https://search.disroot.org/

why dis?

maybe @jonah knows?

Interesting! I tried it just a second ago and had the same problem. Is it because the private search engines (like DuckDuckGo) are “powered by” some of the major ones?

no? is my answer - if i select Startpage as the only engine for the Searx instance, i would expect the results to be the same as if i searched from startpage.com - i think there’s a :wrench: in the machine somewhere

It’s possible Startpage is blocking our IP or requests for some reason. I’ll look into it in a bit.

thanks Jonah, however note that it isn’t only Startpage, nor is it only the PTIO instance of Searx

i can do more testing and provide further detail if desired

i believe that’s the problem - i did more testing and i can’t come up with any other reason - it’s unfortunate, but it seems many of the public instances are affected

forgot to mention, other users are reporting the same problem on the Searx github repo

@lizmcintyre, would you have any insight into this?

It’s possible the search engines in question either changed how their results are shown so the scraping tools in Searx can’t read them, or they’ve blocked the IP addresses of larger Searx installs.

It’s also possible this has already been fixed in Searx and I just haven’t updated the code yet; it’s a little bit behind. I’ll try to get it updated tonight. Otherwise, since DDG and some other engines are working, I’m leaning towards it being an issue with Startpage and the others.
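A rough way to tell those cases apart is to query one engine at a time and look at what the instance reports back. The sketch below assumes the instance exposes Searx's JSON output (`format=json`), which many public instances disable, and that the response carries `results` and `unresponsive_engines` keys with engine/error pairs; the function names and the sample response are made up for illustration, with the error text modeled on the Google message quoted further down the thread.

```python
from urllib.parse import urlencode


def single_engine_url(instance, query, engine):
    """Build a search URL asking one Searx instance to use a single engine."""
    params = {"q": query, "engines": engine, "format": "json"}
    return instance.rstrip("/") + "/search?" + urlencode(params)


def summarise(response):
    """Return (result count, names of engines the instance reported as failing)."""
    results = response.get("results", [])
    # Assumption: each entry is an [engine_name, error_message] pair.
    failed = [entry[0] for entry in response.get("unresponsive_engines", [])]
    return len(results), failed


# Hypothetical response, not captured from a real instance.
sample = {
    "results": [],
    "unresponsive_engines": [["google", "unexpected crash: CAPTCHA required"]],
}

print(single_engine_url("https://search.privacytools.io/", "dog", "startpage"))
print(summarise(sample))
```

Zero results with the engine listed as failing would point towards blocking or a crash. Zero results with no error reported would fit the idea that the engine's page layout changed and the scraper is finding nothing to parse.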

searx seems to be working now - I tried the “dog” search and got results as usual. I wonder if they patched it?

That doesn’t confirm it’s the software we’re running, because Startpage could’ve blocked our queries and not searx.me’s.

Can confirm this isn’t an issue with outdated code. So this is either an upstream Searx issue I’m unaware of or an issue with Startpage and Qwant, especially since Bing, DuckDuckGo, Yahoo, Wikidata, Wikipedia, and the engines for other searches are still working.

I just tried searching at searx.me and it seemed to be returning results. However, there was a Google error message:

Engines cannot retrieve results:
google (unexpected crash: CAPTCHA required)

the captcha error is typical for google - they present that when google detects too many queries from a given IP

Oh yes I somehow forgot about that right after I read the post, my mistake, how embarrassing.

Would it be possible for Searx to send queries through Tor or would that just move the issue onto the Tor exit node(s) or make it worse due to them already having the problem?

That’s a good question. I assume it’d be possible but I’m not sure how to implement it, or if performance would be good enough for a multi-user application like a public Searx install. I’d have to look into that further.
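On the "how" part, one common approach from Python code is to hand the HTTP client a SOCKS proxy mapping that points at a local Tor daemon. This is only a sketch under assumptions: Tor listening on its default SOCKS port 9050 on localhost, and a client that accepts `requests`-style proxy dictionaries. The helper name `tor_proxies` is hypothetical, and whether Searx itself picks up such a mapping from its settings depends on the version, so check its docs.

```python
def tor_proxies(host="127.0.0.1", port=9050):
    """Return a requests-style proxy mapping pointing at a local Tor SOCKS port."""
    # socks5h (rather than socks5) asks the proxy to resolve hostnames,
    # so DNS lookups go through Tor as well.
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


proxies = tor_proxies()
print(proxies)

# With requests and PySocks installed (pip install "requests[socks]"):
#   requests.get("https://www.startpage.com/", proxies=proxies, timeout=30)
```

This only covers the routing. It doesn't answer whether the engines treat exit-node traffic any better than a busy Searx server's IP, or whether latency would be acceptable for a public instance.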

Hi. Startpage does have a privacy-friendly anti-abuse mechanism, but isn’t blocking searx. If there were abuse, there should be some indication of that.