soft 404

English

Etymology

Term introduced in 2004 by Ziv Bar-Yossef et al.

Noun

soft 404 (plural soft 404s)

  1. An error page, shown when a requested page does not exist (for example, after following a broken link within a website), that is served with a "200 OK" HTTP status code rather than "404 Not Found", falsely indicating to the client (a browser or web crawler) that the page exists and was loaded correctly.
    • 2012, Luis Meneses, Richard Furuta, Frank Shipman, “Identifying “Soft 404” Error Pages: Analyzing the Lexical Signatures of Documents in Distributed Collections”, in Panayiotis Zaphiris, George Buchanan, Edie Rasmussen, Fernando Loizides, editors, Theory and Practice of Digital Libraries: Second International Conference, TPDL 2012, Paphos, Cyprus, September 23-27, 2012, Proceedings, Springer-Verlag Berlin Heidelberg, pages 201–202:
      To ensure that the server would return a soft 404, the random sequence of characters was appended to an absolute URL and file path. […] Given that our research deals with identifying and predicting soft 404s, we chose to use a ratio of 50% normal pages and 50% soft 404s.
    • 2013, Richard Rogers, “National Web Studies”, in Digital Methods, The MIT Press, page 141:
      However, redirects also may be “soft 404” messages to hide broken links.
    • 2016, B. Barla Cambazoglu, Ricardo Baeza-Yates, Scalability Challenges in Web Search Engines, Springer Nature Switzerland AG, published 2022, page 23:
      These so-called soft 404 error pages can degrade the performance of a web crawler since they are usually not worth downloading and indexing.
    • 2020, Natércia A. Batista, Michele A. Brandão, Michele B. Pinheiro, Daniel H. Dalip, Mirella M. Moro, “Data from Multiple Web Sources: Crawling, Integrating, Preprocessing, and Designing Applications”, in Valter Roesler, Eduardo Barrére, Roberto Willrich, editors, Special Topics in Multimedia, IoT and Web Technologies, Springer Nature Switzerland AG, part III (Data Collection and Analysis), page 228:
      The data crawling also has specific problems that must be observed such as time between requests, soft-404 error, identification of URL patterns of the pages, and extraction of data from the source code.
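
Usage example (illustrative)

The definition and the 2012 quotation suggest a simple check: request a path that almost certainly does not exist and see whether the server answers "200 OK" instead of "404 Not Found". The Python sketch below illustrates that idea only. The function names (random_probe_url, fetch_status, serves_soft_404) and the https://example.com/ address are hypothetical and are not taken from any of the cited works, which rely on more elaborate techniques such as lexical signatures.

```python
import secrets
import urllib.error
import urllib.request
from urllib.parse import urljoin


def random_probe_url(base_url: str, length: int = 24) -> str:
    """Build a URL under base_url that almost certainly does not exist."""
    # token_hex(n) returns 2*n hexadecimal characters.
    token = secrets.token_hex(length // 2)
    return urljoin(base_url, token)


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return error.code


def serves_soft_404(base_url: str, fetch=fetch_status) -> bool:
    """True if a random, nonexistent path is answered with 200 rather than 404.

    `fetch` is injectable (an assumption of this sketch) so the check can be
    exercised without network access.
    """
    return fetch(random_probe_url(base_url)) == 200


if __name__ == "__main__":
    # Placeholder address; substitute a site you are permitted to probe.
    print(serves_soft_404("https://example.com/"))
```

A single probe is only a heuristic: because fetch_status follows redirects, a site that redirects missing pages to its home page is also reported as serving soft 404s, whereas a site that returns 404 with a friendly error page is not.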

This article is issued from Wiktionary. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.