> New Search Engine, No Read, Only Post

 
post Nov 2 2022, 12:09
Post #1761
Tenboro

Admin




If you want to know more about the rationale for the changes, start by reading the Change Rationale below (original version in this post).

If you think you found a bug, post the EXACT QUERY you were using, not a vague description of it.


FAQ

Q: Why did you change the search engine? (TL;DR version)
A: The way the old search engine worked could no longer scale with the size of the site's index, and was failing on an increasing number of queries. No amount of money or hardware could have fixed this long-term, so the only option was to fundamentally change how it works. The new search engine is the best tradeoff of functionality and performance available.

Q: Can we have page numbers back?
A: No. Read the Change Rationale below.

Q: But I really need page numbers because of reasons. What if I give you money? Can *I* have page numbers back?
A: No. Read the Change Rationale below.

Q: But some rando on the internet told me that it's actually really easy to have page-addressed search results with tens of thousands of pages for database indexes with hundreds of millions of rows and I believe them because I want it to be true which means I think you are lying. Can we have page numbers back?
A: No. Read the Change Rationale below.

Q: What if I threaten to kill your house and burn your dog to the ground? Can we have page numbers back?
A: No. Read the Change Rationale below.

Q: But what if-
A: Just no. Read the Change Rationale below.


Change Rationale

or; (Not having an exact page selector is worse than having an exact page selector / Not having an exact result count is worse than having an exact result count) and the new search engine is therefore worse than the old search engine!

If you ignore all the new and improved functionality and the vastly higher performance, and focus only on "but I want page-addressed results" and "but I want exact result counts", this might be the case. These specific changes were not made because I thought it would be an improvement by itself, but out of necessity.

With the old search engine, because of the ever increasing size of the index, results were taking longer and longer to generate, and it required more and more RAM to do so. Many queries would take on the order of three seconds to generate at the time the search engine was replaced, which would be doubled in a couple of years at the current index growth rate, leading to (non-controllable) timeouts. Furthermore, RAM usage for generating a result was more or less linear with the size of the result plus the size of the index for each of the queried terms, and there are practical limits to how much RAM can be made available for a particular process, so at some point queries would just start failing in unpredictable ways.

Notably, this was already the case for a not insignificant number of complex queries.

In other words, if we kept the old search engine, in a few years, if you tried to search for anything with many results, you would inevitably either get a Cloudflare timeout page, or the query would fail with a memory error. And that's if the site itself isn't completely unusable since all its CPU time might be tied down into trying and failing to create search results. Which, obviously, is bad.

The conclusion is, even with unlimited hardware (which we do not have the necessary unlimited funds for), the old search engine would be effectively unusable for most if not all queries with many results in two to three years without significant changes.

The available options were:

1. Replace the old search engine that uses a naive approach of effectively building the full result for a query (which is necessary for full-range page navigation and exact-ish result counting) with a brand new search engine that is a lot more clever about doing stuff.

2. Put a band-aid on the old search engine by significantly curtailing the maximum size of the search result and/or the range of searchable content.

3. Remove functionality that was expensive in the old search engine, such as hybrid title/tag searching and comment searching.

With the second option, you would have pages, but there might be a maximum of 10 of them with 100 results per page. You'd have "exact" result counts, but it would just be capped to 1000. Many sites use variants of this approach, like Google, Nyaa and most if not all large Boorus, but if you think for a second that this would cause any less of a shitstorm if we changed to it, I guess I should welcome you to the internet, because it's obviously your first time here.

With the third option, you would only be able to search for titles or tags, but not both at the same time. For example, if you searched for "part of title" english you would only be able to find things with those two terms in the title, not galleries tagged with "english". Comment searches and various other functionality would just be removed entirely. Searching would be all around hampered and unintuitive. See; shitstorm.

Alternatively, you might have a curtailment that only galleries posted in the last couple of years are searchable. Shitstorm.

Alternatively alternatively, those things but with donator-only unlocks and higher limits. Shitstorm.

I went with the first option, which involved three months of active development plus an additional month of testing + optimization, and is in my opinion by far the best possible tradeoff of performance and functionality. Even if some people disagree with the changes, this is a hill I'm willing to die on.

You are allowed to both disagree and/or dislike change in general, but if people keep accusing me of lying about the current state of things and the reasoning for the changes because they read a bunch of misinformation and conspiracy theories posted by clueless autists on 4chan, I'll just start handing out bans to preserve my sanity, so stop doing that.


2023-03-10

- When using exclusion terms, it will no longer just flag the search as "about"; instead, it shows how many results were excluded on that particular page. (If a gallery would have been both excluded and filtered, it is counted as excluded.)

- For consistency reasons, when the search result fits on a single page, excluded galleries are now included in the result count.


2023-03-07

- In some filtered search modes, when using multiple search terms, the search engine would previously use an "exact" count based on the unfiltered index that could be significantly off. It now uses the count estimator in these cases. This just flags it as "about" for now, and may be revised later.


2023-01-09

- Corrected a search indexing issue where some substrings consisting of three characters enclosed in square brackets did not get indexed properly.


2023-01-07

- Corrected an issue that prevented the range bar and range jumping from working correctly with searches involving weak tags.

- Corrected an issue with title exclusions where some characters that are stripped from search queries were not stripped from titles before comparison, causing unexpected behavior.


2022-12-22 - Bugfix

- For index searches, if a very rare combination of internal buffer states occurred, the search would act as if there were no more results when there actually were. This should be fixed now.


2022-12-20 - Update

The search engine will now attempt to give a ballpark estimate for result counts in all standard searches except for searches with inclusive comment terms. The estimate is based on internal stats, index sampling, and a history of the span of results found on each page.

The new estimator is primarily used for complex queries where all terms have many hits, where it would previously only say "many". It is also used whenever a page range filter is set, and for index searches when several categories are unselected.

For any query that has not been searched recently, the initial estimate will usually be a vague and conservative lower bound (like "thousands" or "10,000+"). A more precise estimate may be provided when enough samples have been collected.

It generally prefers to under-estimate rather than over-estimate the count. As such, it should generally be interpreted as "probably more than".

The accuracy for the estimate depends on the accuracy of the range map used by the range indicator, and the same caveats apply. Complex searches with non-dependent terms will generally be less accurate.

Note the estimate can fluctuate a bit as you go between pages. This is expected.

Other Changes:

- Fixed some issues with underscores in favorite searches. Similar to username searches, spaces and underscores should now be equivalent for all favorite search usage.

- You can now use favnote:* or -favnote:* to filter favorites with any favorite note.

- Several caching issues were found with the setting to exclude namespaces by default when searching. A fix would be complicated and essentially make searches uncacheable for everyone using it, and since it's only used by a small fraction of a percent of visitors and a lot of people seem to be confused about what it does, this setting has been disabled for now. It will likely be reintroduced in a slightly different form in the future.

- Some fully updated but less than bleeding edge browsers were having issues with the javascript generated by the javascript optimizer we now use. This optimization now targets an older level of compatibility, which should fix this issue.


2022-12-05 - Update

A new result range indicator + range jump mechanism has been added. The range indicator will let you see roughly where you are in a search result and how much of it is found on the current page, while the range jump mechanism will let you jump to an approximate percentage of the way into a search result. Range jumping is done by clicking on the range indicator bar.

This mechanism is almost, but not quite, entirely unlike pages. While it has a fleeting similarity, don't expect it to behave exactly like them.

Most importantly, the range indicator uses various internal statistics to work with basically zero overhead; it does not actually generate pagination for the full search result. This means you should for example not expect the displayed number of pips for each page (or pages per pip for large results) to be fully consistent across the entire result. While it does to a large degree correct for variations in volume and usage over time, there will still be unpredictable natural variations (clusters and gaps) in the distribution of results. This especially applies to comment searches, which cannot make use of any precomputed statistics.

The range indicator is only available for normal searches; that is, not for favorites, watched tags, or in gid/file searches. Favorites will be revisited at a later date, as part of a larger rework of the favorite system.
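
As a rough illustration of how a percentage jump can work without generating pagination for the full result, here is a minimal sketch. It assumes a hypothetical "range map" of GID buckets with estimated hit counts; the function name, the bucket layout and the numbers are all made up for the example and are not the site's actual data structures.

```python
from bisect import bisect_right
from itertools import accumulate

def gid_at_percent(buckets, percent):
    """Pick the GID to start from for an approximate percentage jump.

    buckets: list of (highest_gid_in_bucket, estimated_hits), newest first.
    This layout is an assumption made for the sketch.
    """
    # Running total of estimated hits, bucket by bucket.
    totals = list(accumulate(hits for _, hits in buckets))
    # Number of results that should be skipped to land `percent` into the result.
    target = totals[-1] * percent / 100
    # First bucket whose running total exceeds the target.
    i = bisect_right(totals, target)
    # Clamp so that 100% lands in the last (oldest) bucket.
    return buckets[min(i, len(buckets) - 1)][0]

# Hypothetical range map: three GID buckets with 50, 30 and 20 estimated hits.
demo_buckets = [(3000, 50), (2000, 30), (1000, 20)]
halfway = gid_at_percent(demo_buckets, 50)  # lands on the bucket starting at GID 2000
```

Since only per-bucket estimates are consulted, the landing point is only as precise as the buckets are, which matches the "approximate" nature of range jumps described above.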

Other Changes:

- Added the Jump postfix "g" for GID (Gallery ID) jumps. Using this with any GID will jump to the position in the search result with this gallery as the first (or last) result on the page. (If the gallery does not exist in the search result, it will still work, the gallery just won't be there.)

- Added a new setting to disable the new range indicator.

- Corrected a rare edge case where the search UI would act as if a search had no more results even if it did.


2022-11-25 - Minor Fix

- Fixed an issue where some more characters in uploader usernames were not properly searchable.


2022-11-21 - Improvements

- Added some significant optimizations for a frequently used search strategy for when multiple name+tag/comment search terms are used and at least one of the name+tag terms has fewer than 10,000 hits. (In some cases this will reduce processing time by more than 90%.)

- The search query parser will now handle various cases where repeated or redundant search qualifiers are used, such as weak:tag:foo or tag:tag:tag:bar.


2022-11-18 - Fixes

- The publish date adjustment for galleries created with the old uploaders (predating October 2021) has been completed. This should fix the remaining quirkiness with gallery sort placement as well as with the seek/jump mechanism. Note that these galleries are now considered "published" when the gallery was created rather than when it was actually published, though in most cases this would only shift the date by a few minutes to a few hours.


2022-11-17 - Minor Fixes

- When searching for comments, if the search term was too short after being stripped of non-indexable characters, the term was silently ignored. It now properly fails the search with an error message instead.

- Fixed tags hidden under My Tags not being displayed with search results when filters are disabled.


2022-11-16 - Deployment + Fixes

- This update is now fully deployed.

- Fixed an issue with how some dynamic stats were generated that only manifested under high load.


2022-11-15 - Minor Fixes

- Fixed a bug in favorite searching where, depending on internal state and order of operation, title-only searches could break when multiple terms were used.

- The wording of "default filters" was changed to "custom filters" to make it clearer that it is referring to your personalized/customized tag, uploader and language filters, rather than some global default filter.


2022-11-13 - Minor Fixes

- Fixed some more search issues with uploader usernames with leading or trailing underscores as well as multiple consecutive spaces/underscores.

- We now avoid using the /uploader/ shorthand URLs for uploader usernames containing forward slashes since the resulting URLs are broken.


2022-11-11 - Minor Additions/Tweaks

- When searching for tags (or titles+tags) where there is just one tag match and you have that tag filtered, the system will now specifically ignore that filter. If you actually want the tag filtered, you can use the title: qualifier.

- The search engine will now stop looking for more results for a page if more than 1000 galleries have been filtered. (This is mostly relevant in edge cases where you are intentionally searching for things you heavily filtered.)

- Fixed search warnings not being displayed for favorite searches.

- Added a setting to remove the "Your default filters removed XX galleries from this page" message.

- Added a new qualifier "weak:" to search for weak tags. This replaces the "Search Low-Power Tags" checkbox. Using weak: in front of a keyword works the same as using tag: except it will search weak tags (<10 power) instead of active (10+) ones.

This change allows for some additional flexibility, since you can now search for various combinations of weak tags and active tags - for example, all galleries with an active parody tag from a particular series, and weak character tags from said series.

Weak tags cannot be used for exclusions or searched in favorites. Additionally, if you are using OR searches, either all or none of the OR terms must use the weak: qualifier.

It is not possible to search for both active and weak instances of the same tag at the same time, or mix normal and weak OR terms in general, since they use different indexes. These are not artificial limitations. The weak tag search is there to aid in tagging and cleanup in order to either get rid of them or make them into active tags, not to get "more results" in casual browsing.


2022-11-07 - Bugfixes

- Corrected an issue with tag/name searching in uploader results.

- Corrected glitchy behavior with the new jump/seek selector on the favorite page, as well as an issue with the favorite checkbox selector positioning.

- Corrected seek/jump offsets not being kept if you switched display mode (minimal/compact/etc) right after using it.

- Corrected an issue where some characters weren't properly stripped for name index lookups.

- Corrected an issue with terms that were long enough to search but contained characters that are not valid in tags. The parser would still attempt to treat such a term as a tag with those characters stripped, and if fewer than 3 characters remained after stripping, it would then fail the term as being too short. Terms with characters that cannot be used in tags are now instead parsed as title-only unless a different qualifier is used.


2022-11-06 - Minor Addition

- Incorporated a clickable jump/seek selector based on a suggested code addition from FabulousCupcake.

Note that the date selector uses the built-in browser one, and as such it will use your browser's locale for the date format. (This is automatically translated to the site's date format by your browser.)


2022-11-05 - Update

New Feature: Seek/Jump Navigation

You can now do arbitrary jumps (number of days/weeks/months/years) backwards and forwards in search results, as well as arbitrary seeks to a specific date in the search results, by clicking the new Jump/Seek button in the navigation bar and entering a number or date in the box that appears.

Entering a number will make it jump backwards or forwards by the specified number of days, aligned to the start or end of each day. Adding w, m or y to the number will make it jump by that number of weeks, months or years instead. When jumping forwards (Jump >), the jump is based off the posted time of the oldest (bottom-most) gallery on the current page. When jumping backwards (< Jump), the jump is based off the posted time of the newest (topmost) gallery on the current page.

Entering a date in the YYYY-MM-DD format will make it seek to that date in the search result (inclusive). Note that the semantics of < Seek and Seek > are somewhat different from those of < Next/Jump and Next/Jump > - specifically, which button you use determines whether it uses the date as the starting point or the ending point.

You can also use the YYYY-MM shorthand date. In this case, it will start from the first day in the month when going backwards and the last day in the month when going forward. (In other words, in either case it will include that entire month.)

If you only enter a number (not followed by d, w, m or y) and it is between 2007 and 2099, it will be interpreted as a year. In this case, it will seek to the last day of the year when going forwards and the first day of the year when going backwards.

With the YYYY-MM-DD and YYYY-MM formats, the two first Ys can be left out - in other words, 22-11-05 will be interpreted as 2022-11-05.
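
The input rules above can be sketched roughly as follows. This is an illustration only, not the site's parser: the function names and the tuple return format are made up, and it assumes the year shorthand mirrors the YYYY-MM shorthand (first day when going backwards, last day when going forwards).

```python
import calendar
import re
from datetime import date

_JUMP = re.compile(r"^(\d+)([dwmy]?)$")                          # 3, 3d, 2w, 6m, 1y
_DATE = re.compile(r"^(\d{4}|\d{2})-(\d{1,2})(?:-(\d{1,2}))?$")  # YYYY-MM[-DD], YY-MM[-DD]

def parse_jump_seek(text):
    """Classify the box input as a relative jump or an absolute seek."""
    text = text.strip().lower()
    m = _JUMP.match(text)
    if m:
        number, unit = int(m.group(1)), m.group(2)
        if not unit and 2007 <= number <= 2099:
            return ("seek", number, None, None)  # a bare number in range is a year
        return ("jump", number, unit or "d")     # no suffix means days
    m = _DATE.match(text)
    if m:
        year = int(m.group(1))
        if len(m.group(1)) == 2:
            year += 2000                         # 22-11-05 is read as 2022-11-05
        day = int(m.group(3)) if m.group(3) else None
        return ("seek", year, int(m.group(2)), day)
    raise ValueError(f"not a jump or seek value: {text!r}")

def resolve_seek(year, month, day, forwards):
    """Turn a (possibly shorthand) seek into a concrete date, covering the whole period."""
    if day is not None:
        return date(year, month, day)
    if month is None:
        return date(year, 12, 31) if forwards else date(year, 1, 1)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day if forwards else 1)
```

For example, `parse_jump_seek("2w")` gives a two-week jump, while `resolve_seek(2022, 11, None, forwards=True)` resolves the `2022-11` shorthand to the last day of that month.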

Seeks and Jumps to galleries posted before October 2021 or so will be wonky until I run a script to make some fixes to the publish timestamps to match the behavior of newer galleries. This correction will happen shortly after the update is fully deployed.


Bugfixes

- Corrected an issue where galleries were no longer displayed under favorites if they are unavailable.

- Corrected an issue where, when using the /tag/ URLs (such as when clicking tags from the gallery page), it would keep adding additional quotes if you clicked the navigation links.

- Corrected some issues with uploader usernames with underscores and spaces. Note that for syntax and visual ambiguity reasons, underscores and spaces are now considered equivalent in uploader username searches.

- Corrected excluded categories still appearing on the Popular Pane. (They are still supposed to appear with file, gid and favorite searches.)

- Corrected a potential issue where the file/gid searches weren't including expunged galleries even though they were supposed to.

- Corrected an issue with dashes/hyphens in name searches where they weren't properly stripped for the index lookup.

- Corrected an issue where if you were using advanced search and *only* picked a minimum rating, the navigation wouldn't include it, so it would reset between pages.


2022-11-01 - Original Post

This update is a complete rewrite of the gallery search engine, meaning that the usage and behavior of searches has changed in a number of more or less significant ways.

The most significant and visible fundamental change is that the internal segmenting of search results is now done by gallery ID (GID) ranges rather than "pages". While this means jumping to an arbitrary "page" in the result is no longer supported, this is arguably an improvement since you can now jump to an arbitrary GID instead. This also means each page of results will be fixed on the same set of galleries even if it is refreshed after new galleries are added. The page navigation has been reworked to reflect this.

This also fundamentally fixes a long-standing issue where going backwards in the results via the page navigation (as opposed to the browser back button) would often include results from the following page if you were using any form of filtering.

Overall, these changes allow for massive performance improvements (three orders of magnitude in some common cases) as well as significant new functionality (keep reading), and there are no longer any limits to how large a search result can be. Search terms that were previously capped to 100,000 results (like say "big breasts" which is tagged on 350K+ galleries) can now be browsed in their entirety.
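
To illustrate the difference between numbered pages and GID-range segmenting, here is a minimal sketch using an in-memory SQLite table. The `gallery` table, its columns and the helper names are hypothetical and only serve the example; the actual search engine is not claimed to work like this internally.

```python
import sqlite3

PAGE_SIZE = 3  # tiny page size for the demo

def make_demo_db():
    """Build a throwaway in-memory table; the schema is illustrative only."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE gallery (gid INTEGER PRIMARY KEY, title TEXT)")
    db.executemany(
        "INSERT INTO gallery VALUES (?, ?)",
        [(gid, f"gallery {gid}") for gid in range(1, 11)],
    )
    return db

def page_by_offset(db, page):
    """Numbered pages: every row before the requested page has to be skipped over."""
    rows = db.execute(
        "SELECT gid FROM gallery ORDER BY gid DESC LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    )
    return [gid for (gid,) in rows]

def page_before_gid(db, before_gid=None):
    """GID-range pages: seek straight to the boundary using the primary key."""
    if before_gid is None:
        rows = db.execute(
            "SELECT gid FROM gallery ORDER BY gid DESC LIMIT ?", (PAGE_SIZE,)
        )
    else:
        rows = db.execute(
            "SELECT gid FROM gallery WHERE gid < ? ORDER BY gid DESC LIMIT ?",
            (before_gid, PAGE_SIZE),
        )
    return [gid for (gid,) in rows]

db = make_demo_db()
first = page_before_gid(db)              # [10, 9, 8]
second = page_before_gid(db, first[-1])  # [7, 6, 5]

# A new gallery arrives. The numbered page shifts; the GID-anchored page does not.
db.execute("INSERT INTO gallery VALUES (11, 'gallery 11')")
shifted = page_by_offset(db, 1)          # [8, 7, 6]
stable = page_before_gid(db, first[-1])  # still [7, 6, 5]
```

The second half of the demo shows the "fixed on the same set of galleries" property: anchoring a page on a GID means new uploads do not push results from one page onto the next.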


OR Tag Searching

OR searching is now supported for tags. (Probably the most requested feature of all time.)

To use OR tag searching, prefix the keyword with ~

Example: ~yuri ~"females only" ~f:sole_female$

Specifically, if you have at least two keywords with the OR operator, the search will return all galleries that contain at least one of the tags in question. Using the OR operator will imply the tag: qualifier. If you use it with any other qualifier that isn't a tag namespace, the OR operator is ignored and the keyword will run as a standard AND search.

Using OR searching will "consume" one of the allowed inclusion search terms. If you only specify one OR term, it will be treated as an AND tag-only term. There are no specific limits to how many OR terms you can specify, though it will still be practically limited by the search string length cap. It will additionally bail if the overall OR search is matching more than 1000 tags internally, so consider using exact tags to allow for more terms.

Wildcards cannot be used for OR terms.
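
The matching semantics described above can be sketched as a small predicate. This is a simplified illustration that assumes exact tag matching on plain strings; the function name and argument layout are made up for the example.

```python
def gallery_matches(gallery_tags, and_tags=(), or_tags=()):
    """Return True if a gallery satisfies the AND terms plus the OR group."""
    tags = set(gallery_tags)
    and_tags = list(and_tags)
    or_tags = list(or_tags)
    if len(or_tags) == 1:
        # A lone OR term is treated as a normal AND tag-only term.
        and_tags += or_tags
        or_tags = []
    if not all(tag in tags for tag in and_tags):
        return False
    # With two or more OR terms, at least one of them has to be present.
    return not or_tags or any(tag in tags for tag in or_tags)

# Usage: english AND (yuri OR "females only")
hit = gallery_matches({"english", "yuri"}, ["english"], ["yuri", "females only"])
miss = gallery_matches({"english", "group"}, ["english"], ["yuri", "females only"])
```

Here `hit` is True and `miss` is False; the whole OR group behaves like a single additional inclusion term, which is why it only "consumes" one of the allowed slots.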


Exclude-Only Searching

You can now do exclude-only searches. (Probably the other most requested feature of all time.)

Example: -yaoi -m:footjob -"glory hole" -sole_male$ -title:"novel ai" -comment:pixiv -uploader:BigDickDave69

You can use up to 10 comment+favnote exclusion terms and 10 tag (or hybrid tag+name) exclusion terms in a search.

The gid, uploader, uploaduid and title qualifiers are not specifically limited for exclusions, though they will still be practically limited by the search string length cap.


Tag Watching

The time cutoff for the tag watching page has been significantly increased:

- For non-donators, the cutoff was increased from one week to at least one month. The exact cutoff depends on internal segmenting, the rate new galleries are added, and the total index count for your watched tags. It will generally be somewhere between one and six months.

- For donators (gold star+), there are no longer any cutoffs. In other words, you can browse and search watched tags back to the launch of the site if you want. Note however that searching for terms that have few matches in your watched tags may produce fewer than expected results per page.


UI => Search Syntax Changes

The "Search Gallery Name", "Search Gallery Tags" and "Search Gallery Description" checkboxes as well as the corresponding search checkboxes on the Favorite page have all been removed; this functionality is now part of the search syntax instead.

By default, each search term will be interpreted as a hybrid tag+title search, and will match the gallery name (both english/romaji and japanese) as well as the gallery tags.

To only match gallery names, prefix the term with the title: qualifier
* Example: title:keyword -title:"string of keywords"

To only match gallery tags, prefix the term with a tag namespace, or tag: for all namespaces, or use the exact tag operator $, or use the OR operator ~
* Example: f:"big breasts" tag:group -futanari$ ~twintails

To search uploader gallery comments, prefix the term with the comment: qualifier
* Example: comment:"insightful uploader musings" -comment:"less insightful ones"

Favorite searches only: To search favorite notes, prefix the term with the favnote: qualifier
* Example: favnote:"this is my favorite gallery" -favnote:"on the citadel"

Note that this means combined tag+name+comment/favnote search terms are no longer supported.
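
As a rough sketch of how these qualifiers divide up a query, the following classifies each term by what it would search. It is not the site's parser: the regular expression, the function name and the partial set of tag qualifiers (`tag`, `f`, `m`, `c`, `character`) are assumptions based only on the examples in this post.

```python
import re

# Partial, assumed list of tag qualifiers; real namespaces are not enumerated here.
TAG_QUALIFIERS = {"tag", "f", "m", "c", "character"}

_TERM = re.compile(
    r'(?P<neg>-?)(?P<or>~?)(?:(?P<qual>[a-z]+):)?(?P<body>"[^"]*"\$?|\S+)'
)

def classify(query):
    """Return a list of (kind, excluded, text) for each term in the query."""
    terms = []
    for m in _TERM.finditer(query):
        qual = m.group("qual")
        body = m.group("body")
        exact = body.endswith("$")
        text = body.rstrip("$").strip('"')
        if qual in ("title", "comment", "favnote"):
            kind = qual                # explicit text qualifiers
        elif qual in TAG_QUALIFIERS:
            kind = "tag"               # tag: or a tag namespace
        elif qual:
            kind = qual                # e.g. uploader:, gid:
        elif exact or m.group("or"):
            kind = "tag"               # $ and ~ both imply a tag search
        else:
            kind = "hybrid"            # default: tag+title
        terms.append((kind, bool(m.group("neg")), text))
    return terms

# Usage:
# classify('title:keyword -title:"string of keywords"')
# returns [("title", False, "keyword"), ("title", True, "string of keywords")]
```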


Search Parsing Changes

- When doing unquoted searches with unqualified short and/or non-indexable words (a, an, ai, to, the, and, so, on, and so on), as well as some common adjectives (small, big, huge, gigantic), they will now be automatically appended or combined with the following priority:

* If there is a non-qualified search term immediately following the short word, it will be combined with that one.

For example, searching for "a dick in a box" without quotes will be searched as "a dick" "in a box". Everyone's new favorite "ai generated" without quotes will be searched as if it had quotes.

* If there is a non-qualified search term immediately preceding the short word, it will be combined with that one.

For example, searching for "novel ai" without quotes will be searched as if it had quotes.

* If there are only short words, they will be combined into one quoted word if there is more than one.

For example, searching for "ex on the ox" without quotes will be searched as if it had quotes.

* If there is just one short word, or the short words are between qualified search terms, it will be searched as an exact tag. A warning is printed in this case.

For example, searching for "9s c:a2 2b" without quotes will be searched as "tag:9s$" "character:a2$" "tag:2b$"

To combine short words with a different priority, use quotes or underscores. ("word1 word2 word3" and word1_word2_word3 are equivalent.)

To avoid combining short words when searching tags, use the tag: or tag namespace qualifiers.

Note that there is a single two-character word "3d" that was specifically whitelisted for title searches, but it is not an indexable word for comment searches so it cannot be used for that.

- Support for single-character wildcarding was dropped, and the * wildcard can now only be used at the end of keywords. Title, comment and favnote searches are implicitly wildcarded for indexing reasons, so adding a wildcard will only affect tag searching.
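
The short-word combining priority described above can be sketched roughly like this. It is a simplified model, not the actual parser: the stoplist and the "two characters or fewer" rule are guesses based on the examples, qualified terms are passed through unchanged (namespace expansion such as c: to character: is not modeled), and the handling of several short words stuck between qualified terms is an assumption.

```python
# Hypothetical stoplist built from the examples in the post.
SHORT_WORDS = {"a", "an", "ai", "to", "the", "and", "so", "on",
               "small", "big", "huge", "gigantic"}

def is_short(word):
    # Assumption: anything of two characters or fewer counts as short.
    return len(word) <= 2 or word in SHORT_WORDS

def combine_short_words(query):
    """Return the list of terms an unquoted query would be searched as."""
    out = []            # finished terms; multi-word entries stand for quoted phrases
    pending = []        # short words waiting for a neighbour to attach to
    attachable = False  # True if out[-1] is an unqualified term

    def flush():
        if not pending:
            return
        if attachable:
            out[-1] = out[-1] + " " + " ".join(pending)  # attach to preceding term
        elif len(pending) > 1:
            out.append(" ".join(pending))                # only short words: one phrase
        else:
            out.append(f"tag:{pending[0]}$")             # lone short word: exact tag
        pending.clear()

    for word in query.split():
        if ":" in word:                  # qualified term blocks combining
            flush()
            out.append(word)
            attachable = False
        elif is_short(word):
            pending.append(word)
        else:                            # normal term absorbs preceding short words
            out.append(" ".join(pending + [word]))
            pending.clear()
            attachable = True
    flush()
    return out

# combine_short_words("ai generated")  returns ["ai generated"]
# combine_short_words("novel ai")      returns ["novel ai"]
# combine_short_words("ex on the ox")  returns ["ex on the ox"]
# combine_short_words("9s c:a2 2b")    returns ["tag:9s$", "c:a2", "tag:2b$"]
```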


Search Term Limits

Exclusions and inclusions now have separate limits. A query can have up to 5 name+tag inclusion terms, 10 name+tag exclusion terms, and 10 comment+favnote inclusion+exclusion terms.

For both inclusions and exclusions, uploader:, uploaduid: and gid: terms aren't specifically limited, but would still be limited by the max length of the search string (200 chars).

For exclusions, title: terms are also not limited.
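
A minimal sketch of checking a query against these limits follows. The function name, the term representation and the bucket names are made up for the example, and it assumes that title: inclusion terms count towards the name+tag inclusion limit; OR groups (which consume a single inclusion slot) are not modeled.

```python
from collections import Counter

MAX_QUERY_CHARS = 200
LIMITS = {"include": 5, "exclude": 10, "text": 10}

def check_limits(query, terms):
    """terms: iterable of (kind, excluded) pairs. Returns a list of problems."""
    problems = []
    if len(query) > MAX_QUERY_CHARS:
        problems.append("query too long")
    counts = Counter()
    for kind, excluded in terms:
        if kind in ("comment", "favnote"):
            counts["text"] += 1          # shared by inclusions and exclusions
        elif kind in ("hybrid", "tag", "title"):
            if not excluded:
                counts["include"] += 1
            elif kind != "title":        # title: exclusions are not limited
                counts["exclude"] += 1
        # uploader, uploaduid and gid terms have no specific limit
    for bucket, limit in LIMITS.items():
        if counts[bucket] > limit:
            problems.append(f"too many {bucket} terms: {counts[bucket]} > {limit}")
    return problems

# Usage: six inclusion tags is one too many.
too_many = check_limits("a b c d e f", [("tag", False)] * 6)
```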


GID Searching

You can now use the gid: search qualifier to search (publicly visible) galleries by Gallery ID. If you search a GID that has been replaced, it will list the current gallery instead.

Inclusion gid: terms cannot be combined with keyword searches or used in watch mode. This does not apply to exclusion terms. If used for exclusion, it will not exclude any galleries that replaced the provided GID.

You can specify multiple gid: terms in the same query for an implicit OR search.

This search mode will show both normal and expunged galleries. Default tag, language and uploader filters are automatically disabled for these searches.


Result Counting

For performance reasons, the search engine will no longer count the exact number of results in large result sets; instead result counts will usually be approximated based on various metrics. It will say "about" if the count is an estimate.

For complex multi-term searches with large result sets, it may not have enough information to give a reasonable estimate. In these cases, rather than showing a potentially wildly inaccurate one, it will just show "many". This only affects the count readout; navigation for these search results works the same as for smaller ones.

Smaller result sets (i.e. those that fit on one page) should return the exact count in all cases. Filtered galleries are included in this count, to match the behavior for estimates.

The page range filter, exclusion search terms and default language/uploader/tag filters will not generally be reflected in approximate result count estimates.

If you use the category, rating or torrent filters, it will use precomputed adjustment factors to correct the estimate. For some searches this estimate may be fairly inaccurate, say if you search for terms that are mostly applicable for specific categories then unselect other categories.

Result counts are not displayed in favorite searches or on the popular page. In the former case, it would only be able to display one for small result sets, and in the latter, it's all one page of results anyway. You can however still see the total for each favorite category.


Tag Search Behavior

- Tag searching now defaults to matching on word boundaries to reduce unwanted matches. In other words, searching for "tag:mana" will still match all tags that have "mana" as one of the words (like "secret of mana" [=> seiken densetsu] or "mana inuyama"), but it does not match "manabe", "manatsu", "manami" and so on. Searching for "tag:mana*" will restore the previous behavior.

- If there are too many tag matches for a term, it will now automatically rerun the term as an exact search instead of erroring out.

- Selecting "Search Low-Power Tags" will now only search low-power tags. This mode will also not do hybrid title/tag searches, so if a term is left unqualified (i.e. "big breasts") it will only search the tag. You can still search titles by using the title: qualifier.

- The "Search Downvoted Tags" option was removed.
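
The word-boundary matching for tags can be illustrated with a small sketch. It assumes that a trailing * means "any word starting with this", and it ignores the exact tag operator $; the function name is made up for the example.

```python
def tag_matches(term, tag):
    """Match a search term against a tag on word boundaries."""
    padded = f" {tag} "                    # pad so every word has spaces around it
    if term.endswith("*"):
        return f" {term[:-1]}" in padded   # wildcard: a word may continue after the term
    return f" {term} " in padded           # default: whole word(s) only

# tag_matches("mana", "secret of mana")  is True
# tag_matches("mana", "manabe")          is False
# tag_matches("mana*", "manabe")         is True
```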


Comment Search Behavior

Uploader comments and favorite notes are now searched using the comment: and favnote: qualifiers. favnote: is only available in favorite searches.

The way comments are indexed has been fundamentally changed, and there will be some subtle differences between normal text searches and favorite + exclusion-only text searches, since the former will usually use indexes while the latter do not.

Most notably, some otherwise-searchable common words (like "this" and "with") are not comment-searchable when the index is used but will be searchable when it is not. Also, when the index is used, words starting with these short words (like "withhold", which starts with "with") will not be matched unless you search for the exact word.

Furthermore, when the index is used it will only find word matches that start with the string, but when it's not it will also find matches that have the string as part of a word.

The index is only used for normal inclusion comment searches, but even for those it may not be used for some words and searches depending on various internal factors and thresholds, so you should not rely on this behavior.
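
The difference between the two matching modes can be sketched as follows. Both functions are illustrative stand-ins with made-up names; the real index also skips certain common words, which is not modeled here.

```python
import re

def indexed_match(term, text):
    """Index-style matching: the term has to match the start of a word."""
    return any(word.startswith(term) for word in re.findall(r"\w+", text.lower()))

def scan_match(term, text):
    """Non-indexed matching: the term may appear anywhere, even inside a word."""
    return term in text.lower()

# indexed_match("laughter", "the slaughter")  is False
# scan_match("laughter", "the slaughter")     is True
```

The second case is also why exclusions, which do not use the index, can remove more than expected.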


Other Changes

- Various issues and limitations with favorite searches have been resolved. Searches in favorites should now behave the same as normal searches except for the noted comment/favnote search behavior.

- Exclusion searches for titles, tags (except for exact tags), comments and favnotes will now match any part of a word; i.e. -"laughter" will exclude "slaughter".

- Indexes are now generally updated immediately when the underlying data changes, which should reduce the delay until changes are reflected in searches. (Due to caching, there can still be some delay.)

- Whenever a gallery title has a mixed string of unicode and latin characters without any spaces or other breakable characters, like romaji漢字moreromaji, it would previously only be searchable with terms starting with "rom...", "漢字..." and "字mo..". It is now also searchable for "mor...".

- The "Your default filters removed..." message is now more consistent and specifically counts all galleries filtered by your default uploader, tag and language search filter settings. (When using both filters and exclusions and a gallery would have been removed by both, it is counted as an exclusion.)

- Selecting "Search Expunged Galleries" will now only search expunged galleries in normal searches. (File searches, GID searches and favorite searches will always display both normal and expunged galleries.)

- File searches can no longer be combined with keyword searches or other filters. This search mode will show both normal and expunged galleries. Default tag, language and uploader filters are now automatically disabled for these searches.

- Excessively narrow page range filters (min > 1000, max < 10, min/max > 0.5, min-max < 20) are no longer allowed.

- The max number of results per page is now 100. Paging Enlargement III was removed and will be refunded Soon™.
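
As a rough illustration of the page range rule in the list above, the following Python sketch checks a filter against the four listed conditions. The function name and parameters are hypothetical, and the interpretation is an assumption: "min/max > 0.5" is read as the ratio of the two bounds, "min-max < 20" as the distance between them, and a missing bound is taken to skip the rules that need it. The actual server-side check may differ.

```python
from typing import Optional


def page_range_allowed(min_pages: Optional[int], max_pages: Optional[int]) -> bool:
    """Return False if the page range filter looks excessively narrow."""
    if min_pages is not None and min_pages > 1000:
        return False  # minimum is too high
    if max_pages is not None and max_pages < 10:
        return False  # maximum is too low
    if min_pages is not None and max_pages is not None:
        # max_pages is at least 10 here, so the division is safe.
        if min_pages / max_pages > 0.5:
            return False  # bounds are too close, relatively
        if max_pages - min_pages < 20:
            return False  # bounds are too close, absolutely
    return True
```

Under this reading, a filter of 20 to 40 pages is accepted, while 60 to 100 (ratio above 0.5) and 5 to 20 (less than 20 apart) are rejected.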


Known Issues/Quirks/Complaints/Workingasintendedisms

- You may sometimes see galleries appear out-of-order when going from one page to the next - in other words, going by the posted date, you would have expected the gallery to be on another page. This mostly applies to older galleries that predated the latest uploader update. This is because, prior to said update, a gallery could have been assigned a GID long before it was actually posted. This might eventually be addressed after a future redesign of the gallery metadata tables by renumbering galleries that are significantly out of order.

- If you are browsing from the end of a search result (backwards browsing mode) all the way to the start, the "last" page in the result (the one with the oldest results) will have a full page of results and the "first" page in the result (with the most recent ones) will have the remainder. This is working as intended.

- If you go backwards in a search result and get to the "first" page (with the most recent results), the "<< First" link will be lit up to flip back to the first page in forwards browsing mode even if there are no further pages and "< Prev" is disabled. This is working as intended.

- If you search for several AND inclusion tag terms (or hybrid title+tag terms), where every term has many results (~10K+) and some have a lot of results (~100K+), and there is a low degree of overlap between the tags, you may see fewer than expected results per page. You can usually use exact tags to avoid this.

- In general, "results per page" should be considered a target rather than a guarantee. For example, as an internal optimization, if a result page is at least 95% full after a search cycle, it may return with a couple of results "missing" instead of starting another search cycle (which can be expensive). This does not mean it's withholding results from you; you'll find them on the next page.

- "But $tool/$script needs the ability to access arbitrary pages in search results and/or accurate search result counts" is out of scope/wontfix. Update it to use the new gid-based navigation. And no, the old search engine was not "working just fine the way it was"; it was failing on an ever-increasing number of searches due to running out of RAM when building results, and badly needed a fundamental redesign to cope with the ever-increasing size of the index.
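
For tool and script authors, the gid-based navigation and the backwards browsing quirk noted in the list above can be sketched in Python as follows. This is an in-memory illustration only: the function names are hypothetical, the results are assumed to be ordered by GID with the newest first, and nothing here reflects the site's real queries or URL parameters.

```python
from typing import List, Optional


def page_forward(gids_desc: List[int], older_than: Optional[int], per_page: int) -> List[int]:
    """Next page of results with a GID strictly below the cursor."""
    matches = [g for g in gids_desc if older_than is None or g < older_than]
    return matches[:per_page]


def page_backward(gids_desc: List[int], newer_than: Optional[int], per_page: int) -> List[int]:
    """Previous page: results strictly above the cursor, closest to it."""
    matches = [g for g in gids_desc if newer_than is None or g > newer_than]
    return matches[-per_page:]


def walk_forward(gids_desc: List[int], per_page: int) -> List[List[int]]:
    """Browse from the newest result to the oldest."""
    pages, cursor = [], None
    while True:
        page = page_forward(gids_desc, cursor, per_page)
        if not page:
            return pages
        pages.append(page)
        cursor = page[-1]  # lowest GID shown so far


def walk_backward(gids_desc: List[int], per_page: int) -> List[List[int]]:
    """Browse from the oldest result back to the newest."""
    pages, cursor = [], None
    while True:
        page = page_backward(gids_desc, cursor, per_page)
        if not page:
            return pages
        pages.append(page)
        cursor = page[0]  # highest GID shown so far


gids = list(range(107, 100, -1))  # 7 made-up GIDs, newest first
print(walk_forward(gids, 3))   # [[107, 106, 105], [104, 103, 102], [101]]
print(walk_backward(gids, 3))  # [[103, 102, 101], [106, 105, 104], [107]]
```

Because each page is defined only by the GID it starts from, there is no page number to address. When walking backwards the short page ends up at the newest end, which matches the behavior described above.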


This is likely the most complicated update in the site's history, so there will probably be bugs and other subtle behavioral changes. Please don't hesitate to ask whether something is intentional if it's not noted in these patch notes.