<?php

/*
 * Enano - an open-source CMS capable of wiki functions, Drupal-like sidebar blocks, and everything in between
 * Version 1.1.4 (Caoineag alpha 4)
 * Copyright (C) 2006-2008 Dan Fuhry
 * search.php - algorithm used to search pages
 *
 * This program is Free Software; you can redistribute and/or modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for details.
 */

/**
 * Implementation of array_merge() that preserves key names. $arr2 takes precedence over $arr1.
 * @param array $arr1
 * @param array $arr2
 * @return array
 */

function enano_safe_array_merge($arr1, $arr2)
{
  $arr3 = $arr1;
  foreach($arr2 as $k => $v)
  {
    $arr3[$k] = $v;
  }
  return $arr3;
}
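
A minimal usage sketch (not part of search.php) showing how the key-preserving merge above differs from PHP's built-in array_merge(), which renumbers integer keys. The sample arrays are made up for illustration, and the function is redeclared under a function_exists() guard only so that the snippet runs on its own.

```php
<?php
// Copy of the helper so this sketch is self-contained.
if (!function_exists('enano_safe_array_merge'))
{
  function enano_safe_array_merge($arr1, $arr2)
  {
    $arr3 = $arr1;
    foreach ($arr2 as $k => $v)
    {
      $arr3[$k] = $v;
    }
    return $arr3;
  }
}

// Illustrative data only: two arrays sharing an integer key and a string key.
$defaults  = array(10 => 'ten', 20 => 'twenty', 'name' => 'default');
$overrides = array(20 => 'TWENTY', 30 => 'thirty', 'name' => 'custom');

// Built-in merge: string keys are overwritten, integer keys are renumbered from 0.
$builtin = array_merge($defaults, $overrides);

// Key-preserving merge: every key is kept, and $overrides wins on collisions.
$safe = enano_safe_array_merge($defaults, $overrides);

print_r($builtin);
print_r($safe);
```

With the built-in function the original keys 10, 20 and 30 are lost, whereas the helper keeps them and lets the second array overwrite key 20 and 'name'.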

/**
 * In Enano versions prior to 1.0.2, this class provided a search function that was keyword-based and allowed boolean searches. It was
 * cut from Coblynau and replaced with perform_search(), later in this file, because of speed issues. Now mostly deprecated. The only
 * thing remaining is the buildIndex function, which is still used by the path manager and the new search framework.
 *
 * @package Enano
 * @subpackage Page management frontend
 * @license GNU General Public License <http://enanocms.org/Special:GNU_General_Public_License>
 */

class Searcher
{

  var $results;
  var $index;
  var $warnings;
  var $match_case = false;

  function buildIndex($texts)
  {
    $this->index = Array();
    $stopwords = get_stopwords();

    foreach($texts as $i => $l)
    {
      // Normalize the text: protect apostrophes with a random placeholder, turn runs of non-word
      // characters and underscores into single spaces, then restore the apostrophes
      $seed = md5(microtime(true) . mt_rand());
      $texts[$i] = str_replace("'", 'xxxApoS'.$seed.'xxx', $texts[$i]);
      $texts[$i] = preg_replace('#([\W_]+)#i', ' ', $texts[$i]);
      $texts[$i] = preg_replace('#([ ]+?)#', ' ', $texts[$i]);
      $texts[$i] = preg_replace('#([\']*){2,}#s', '', $texts[$i]);
      $texts[$i] = str_replace('xxxApoS'.$seed.'xxx', "'", $texts[$i]);
      $l = $texts[$i];
      $words = Array();
      // Drop any remaining character that is not alphanumeric, an apostrophe or a space
      $good_chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789\' ';
      $good_chars = enano_str_split($good_chars, 1);
      $letters = enano_str_split($l, 1);
      foreach($letters as $x => $t)
      {
        if(!in_array($t, $good_chars))
          unset($letters[$x]);
      }
      $letters = implode('', $letters);
      $words = explode(' ', $letters);
      // Discard stopwords, words shorter than 2 or longer than 63 characters, and words with consecutive apostrophes
      foreach($words as $c => $w)
      {
        if(strlen($w) < 2 || in_array($w, $stopwords) || strlen($w) > 63 || preg_match('/[\']{2,}/', $w))
          unset($words[$c]);
        else
          $words[$c] = $w;
      }
      $words = array_values($words);
      // Record the key of this text under every word it contains, without duplicates
      foreach($words as $c => $w)
      {
        if(isset($this->index[$w]))
        {
          if(!in_array($i, $this->index[$w]))
            $this->index[$w][] = $i;
        }
        else
        {
          $this->index[$w] = Array();
          $this->index[$w][] = $i;
        }
      }
    }
    // Flatten each word's list of keys into a comma-separated string
    foreach($this->index as $k => $v)
    {
      $this->index[$k] = implode(',', $this->index[$k]);
    }
  }
}
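
A simplified, standalone sketch of the structure buildIndex() produces: each word maps to a comma-separated list of the keys of the texts that contain it. sketch_build_index() is a hypothetical stand-in, not Enano code. It takes the stopword list as a parameter instead of calling get_stopwords(), and it collapses the character filtering into a single regular expression, so edge cases (for example repeated apostrophes) may differ from the real method. The page keys and texts are made up.

```php
<?php
// Hypothetical, simplified stand-in for Searcher::buildIndex(); not part of Enano.
function sketch_build_index(array $texts, array $stopwords = array())
{
  $index = array();
  foreach ($texts as $page => $text)
  {
    // Keep letters, digits and apostrophes; every other run of characters becomes one space.
    $clean = preg_replace('/[^A-Za-z0-9\']+/', ' ', $text);
    foreach (explode(' ', $clean) as $word)
    {
      $len = strlen($word);
      // Same length limits as buildIndex(): skip words under 2 or over 63 characters, and stopwords.
      if ($len < 2 || $len > 63 || in_array($word, $stopwords, true))
      {
        continue;
      }
      if (!isset($index[$word]))
      {
        $index[$word] = array();
      }
      if (!in_array($page, $index[$word], true))
      {
        $index[$word][] = $page;
      }
    }
  }
  // Flatten each list of page keys into a comma-separated string.
  foreach ($index as $word => $pages)
  {
    $index[$word] = implode(',', $pages);
  }
  return $index;
}

// Illustrative input only.
$index = sketch_build_index(
  array(
    'page_a' => 'Welcome to the wiki',
    'page_b' => 'wiki help, and more help',
  ),
  array('to', 'the', 'and')
);

print_r($index);
```

Like the original method, the sketch does not fold case, so 'Wiki' and 'wiki' would be indexed as separate words.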

/**
 * Searches the site for the specified string and returns an array with each value being an array filled with the following:
 *   page_id: string, self-explanatory
 *   namespace: string, self-explanatory
 *   page_length: integer, the length of the full page in bytes
 *   page_text: string, the contents of the page (trimmed to ~150 bytes if necessary)
 *   score: numerical relevance score, 1-100, rounded to 2 digits and calculated based on which terms were present and which were not
 * @param string Search query
 * @param array|reference Will be filled with any warnings encountered whilst parsing the query
 * @param bool Case sensitivity - defaults to false
 * @param array|reference Will be filled with the parsed list of words.
 * @return array
 */

function perform_search($query, &$warnings, $case_sensitive = false, &$word_list)
{
  global $db, $session, $paths, $template, $plugins; // Common objects
  global $lang;

  $warnings = array();

  $query = parse_search_query($query, $warnings);

  // Segregate search terms containing spaces
  $query_phrase = array(
    'any' => array(),
    'req' => array()
  );

  foreach ( $query['any'] as $i => $_ )
  {
    $term =& $query['any'][$i];
    $term = trim($term);
    // the word index only holds single words, so any term containing characters other than
    // letters and apostrophes (spaces, digits, punctuation) is handled as a phrase instead
    if ( preg_match('/[^A-Za-z\']/', $term) )
    {
      $query_phrase['any'][] = $term;
      unset($term, $query['any'][$i]);
    }
  }
  unset($term);
  $query['any'] = array_values($query['any']);

  foreach ( $query['req'] as $i => $_ )
  {
    $term =& $query['req'][$i];
    $term = trim($term);
    if ( preg_match('/[^A-Za-z\']/', $term) )
    {
      $query_phrase['req'][] = $term;
      unset($term, $query['req'][$i]);
    }
  }
  unset($term);
  $query['req'] = array_values($query['req']);

  $results = array();
  $scores = array();
  $ns_list = '(' . implode('|', array_keys($paths->nslist)) . ')';

  // FIXME: Update to use FULLTEXT algo when available.

  // Build an SQL query to load from the index table
  if ( count($query['any']) < 1 && count($query['req']) < 1 && count($query_phrase['any']) < 1 && count($query_phrase['req']) < 1 )
  {
    // Refuse queries with no positive terms, both because of technical restrictions and because of the
    // load they would put on shared servers and large sites.
    $warnings[] = $lang->get('search_err_query_no_positive');
    return array();
  }

272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 175
//
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 176
// STAGE 1
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 177
// Get all possible result pages from the search index. Tally which pages have the most words, and later sort them by boolean relevance
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 178
//
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 179
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 180
// Skip this if no indexable words are included
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 181
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 182
if ( count($query['any']) > 0 || count($query['req']) > 0 )
1
+ − 183
{
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 184
$where_any = array();
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 185
foreach ( $query['any'] as $term )
1
+ − 186
{
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 187
$term = escape_string_like($term);
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 188
if ( !$case_sensitive )
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 189
$term = strtolower($term);
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 190
$where_any[] = $term;
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 191
}
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 192
foreach ( $query['req'] as $term )
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 193
{
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 194
$term = escape_string_like($term);
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 195
if ( !$case_sensitive )
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 196
$term = strtolower($term);
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 197
$where_any[] = $term;
1
+ − 198
}
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 199
499
6b7fdd898ba3
Fixed some bugs with PostgreSQL and added a word_lcase column to the search_index table because collation is not working under MySQL. TODO: Trigger search index rebuild on upgrade to 1.1.4.
Dan
diff
changeset
+ − 200
$col_word = ( $case_sensitive ) ? 'word' : 'word_lcase';
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 201
$where_any = ( count($where_any) > 0 ) ? '( ' . $col_word . ' = \'' . implode('\' OR ' . $col_word . ' = \'', $where_any) . '\' )' : '';

    // generate query
    // A GROUP BY was used here so that the same word in a different case isn't counted as 2 words. It is disabled now;
    // differently-cased duplicates are melted back into one later in the processing stages.
    // $group_by = ( $case_sensitive ) ? '' : ' GROUP BY lcase(word);';
    $sql = "SELECT word, page_names FROM " . table_prefix . "search_index WHERE {$where_any}";
    if ( !($q = $db->sql_unbuffered_query($sql)) )
      $db->_die('Error is in perform_search(), includes/search.php, query 1');

    $word_tracking = array();
    if ( $row = $db->fetchrow() )
    {
      do
      {
        // get page list
        $pages =& $row['page_names'];
        if ( strpos($pages, ',') )
        {
          // the term occurs in more than one page

          // Find page IDs that contain commas
          // This should never happen because commas are escaped by sanitize_page_id(). Nevertheless, for compatibility with
          // older databases, and as a defensive measure, we accommodate page IDs with commas here by checking for IDs that
          // don't match the pattern for stringified page ID + namespace. If an entry doesn't match, it's a continuation
          // of the previous ID and should be concatenated to the previous entry.
          $matches = explode(',', $pages);
          $prev = false;
          foreach ( $matches as $i => $_ )
          {
            $match =& $matches[$i];
            // compare against false explicitly: the first entry has index 0, which is falsy
            if ( !preg_match("/^ns=$ns_list;pid=(.+)$/", $match) && $prev !== false )
            {
              $matches[$prev] .= ',' . $match;
              unset($match, $matches[$i]);
              continue;
            }
            $prev = $i;
          }
          unset($match);

          // Iterate through each of the results, assigning scores based on how many times the page has shown up.
          // This works because this phase of the search is strongly word-based, not page-based. If a page shows up
          // multiple times while fetching the result rows from the search_index table, it simply means that page
          // contains more than one of the terms the user searched for.

          foreach ( $matches as $match )
          {
            $word_cs = (( $case_sensitive ) ? $row['word'] : strtolower($row['word']));
            if ( isset($word_tracking[$match]) && in_array($word_cs, $word_tracking[$match]) )
            {
              continue;
            }
            if ( isset($word_tracking[$match]) )
            {
              $word_tracking[$match][] = ($word_cs);
            }
            else
            {
              $word_tracking[$match] = array($word_cs);
            }
            $inc = 1;

            // Is this search term present in the page's title? If so, give extra points
            preg_match("/^ns=$ns_list;pid=(.+)$/", $match, $piecesparts);
            $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
            if ( isset($paths->pages[$pathskey]) )
            {
              $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
              if ( $test_func($paths->pages[$pathskey]['name'], $row['word']) || $test_func($paths->pages[$pathskey]['urlname_nons'], $row['word']) )
              {
                $inc = 1.5;
              }
            }
            if ( isset($scores[$match]) )
            {
              $scores[$match] = $scores[$match] + $inc;
            }
            else
            {
              $scores[$match] = $inc;
            }
          }
        }
        else
        {
          // the term only occurs in one page
          $word_cs = (( $case_sensitive ) ? $row['word'] : strtolower($row['word']));
          if ( isset($word_tracking[$pages]) && in_array($word_cs, $word_tracking[$pages]) )
          {
            continue;
          }
          if ( isset($word_tracking[$pages]) )
          {
            $word_tracking[$pages][] = ($word_cs);
          }
          else
          {
            $word_tracking[$pages] = array($word_cs);
          }
          $inc = 1;

          // Is this search term present in the page's title? If so, give extra points
          preg_match("/^ns=$ns_list;pid=(.+)$/", $pages, $piecesparts);
          $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
          if ( isset($paths->pages[$pathskey]) )
          {
            $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
            if ( $test_func($paths->pages[$pathskey]['name'], $row['word']) || $test_func($paths->pages[$pathskey]['urlname_nons'], $row['word']) )
            {
              $inc = 1.5;
            }
          }
          if ( isset($scores[$pages]) )
          {
            $scores[$pages] = $scores[$pages] + $inc;
          }
          else
          {
            $scores[$pages] = $inc;
          }
        }
      }
      while ( $row = $db->fetchrow() );
    }
    $db->free_result();
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
    //
    // STAGE 2: FIRST ELIMINATION ROUND
    // Iterate through the list of required terms. If a given page is not found to have the required term, eliminate it
    //

    foreach ( $query['req'] as $term )
    {
      foreach ( $word_tracking as $i => $page )
      {
        if ( !in_array($term, $page) )
        {
          unset($word_tracking[$i], $scores[$i]);
        }
      }
    }
  }
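  // Illustrative sketch, not part of search.php: the elimination round above, run on
  // made-up data. The helper name example_eliminate_missing_required() and the sample
  // page keys are assumptions; only the nested-loop logic mirrors the code above.

```php
<?php
// Hypothetical helper: drops every page whose word list lacks at least one
// required term, along with that page's score.
function example_eliminate_missing_required($required, $word_tracking, $scores)
{
  foreach ( $required as $term )
  {
    foreach ( $word_tracking as $id => $words )
    {
      if ( !in_array($term, $words) )
      {
        unset($word_tracking[$id], $scores[$id]);
      }
    }
  }
  return array($word_tracking, $scores);
}

// Made-up sample data, keyed by an assumed 'ns=...;pid=...' page identifier
$word_tracking = array(
  'ns=Article;pid=Main_Page' => array('enano', 'wiki'),
  'ns=Article;pid=Install'   => array('enano'),
);
$scores = array(
  'ns=Article;pid=Main_Page' => 2,
  'ns=Article;pid=Install'   => 1,
);

list($word_tracking, $scores) = example_eliminate_missing_required(array('wiki'), $word_tracking, $scores);
print_r(array_keys($scores));
```

  // Unsetting entries of an array while iterating over it by value does not disturb
  // the loop in PHP, which is why the elimination can happen in place.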

  //
  // STAGE 3: PHRASE SEARCHING
  // Use LIKE to find pages with specified phrases. We can do a super-picky single query without another elimination round because
  // at this stage we can search the full page_text column instead of relying on a word list.
  //

  // We can skip this stage if none of these special terms apply

  $text_col = ( $case_sensitive ) ? 'page_text' : ENANO_SQLFUNC_LOWERCASE . '(page_text)';
  $name_col = ( $case_sensitive ) ? 'name' : ENANO_SQLFUNC_LOWERCASE . '(name)';
  $text_col_join = ( $case_sensitive ) ? 't.page_text' : ENANO_SQLFUNC_LOWERCASE . '(t.page_text)';
  $name_col_join = ( $case_sensitive ) ? 'p.name' : ENANO_SQLFUNC_LOWERCASE . '(p.name)';

  $concat_column = ( ENANO_DBLAYER == 'MYSQL' ) ?
    'CONCAT(\'ns=\',t.namespace,\';pid=\',t.page_id)' :
    "'ns=' || t.namespace || ';pid=' || t.page_id";

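  // Illustrative sketch, not part of search.php: $concat_column yields one string key
  // per page of the form ns=<namespace>;pid=<page_id>, and the same key format is
  // matched against a regular expression later in this function. The namespace list and
  // the helper name below are assumptions; the real $ns_list is built elsewhere and
  // may differ.

```php
<?php
// Hypothetical namespace list; the real code derives its namespaces from $paths->nslist
$namespaces = array('Article', 'User', 'Help');
$ns_list = '(' . implode('|', array_map('preg_quote', $namespaces)) . ')';

// Hypothetical helper: splits an 'ns=...;pid=...' key back into its two parts,
// or returns false if the key does not have that shape.
function example_parse_result_id($id, $ns_list)
{
  if ( !preg_match("/^ns=$ns_list;pid=(.+)$/", $id, $parts) )
    return false;
  return array('namespace' => $parts[1], 'page_id' => $parts[2]);
}

$parsed = example_parse_result_id('ns=Help;pid=Search_tips', $ns_list);
print_r($parsed);
```

  // Checking the return value of preg_match() before reading the captured groups
  // avoids undefined-index notices when a key does not match the expected format.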
  if ( count($query_phrase['any']) > 0 || count($query_phrase['req']) > 0 )
  {

    $where_any = array();
    foreach ( $query_phrase['any'] as $term )
    {
      $term = escape_string_like($term);
      if ( !$case_sensitive )
        $term = strtolower($term);
      $where_any[] = "( $text_col LIKE '%$term%' OR $name_col LIKE '%$term%' )";
    }

    // Parenthesize the OR group: AND binds tighter than OR in SQL, so without the parentheses
    // a page matching any one optional phrase would bypass the required phrases appended below
    $where_any = ( count($where_any) > 0 ) ? '( ' . implode(" OR\n ", $where_any) . ' )' : '';

    // Also do required terms, but use AND to ensure that all required terms are included
    $where_req = array();
    foreach ( $query_phrase['req'] as $term )
    {
      $term = escape_string_like($term);
      if ( !$case_sensitive )
        $term = strtolower($term);
      $where_req[] = "( $text_col LIKE '%$term%' OR $name_col LIKE '%$term%' )";
    }
    $and_clause = ( $where_any != '' ) ? 'AND ' : '';
    $where_req = ( count($where_req) > 0 ) ? "{$and_clause}" . implode(" AND\n ", $where_req) : '';

    $sql = 'SELECT ' . $concat_column . ' AS id, p.name FROM ' . table_prefix . "page_text AS t\n"
         . " LEFT JOIN " . table_prefix . "pages AS p\n"
         . " ON ( p.urlname = t.page_id AND p.namespace = t.namespace )\n"
         . " WHERE\n $where_any\n $where_req;";
    if ( !($q = $db->sql_unbuffered_query($sql)) )
      $db->_die('Error is in perform_search(), includes/search.php, query 2. Parsed query dump follows:<pre>(indexable) ' . htmlspecialchars(print_r($query, true)) . '(non-indexable) ' . htmlspecialchars(print_r($query_phrase, true)) . '</pre>');

    if ( $row = $db->fetchrow() )
    {
      do
      {
        $id =& $row['id'];
        $inc = 1;

        // Is this search term present in the page's title? If so, give extra points
        preg_match("/^ns=$ns_list;pid=(.+)$/", $id, $piecesparts);
        $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
        if ( isset($paths->pages[$pathskey]) )
        {
          $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
          foreach ( array_merge($query_phrase['any'], $query_phrase['req']) as $term )
          {
            if ( $test_func($paths->pages[$pathskey]['name'], $term) || $test_func($paths->pages[$pathskey]['urlname_nons'], $term) )
            {
              $inc = 1.5;
              break;
            }
          }
        }
        if ( isset($scores[$id]) )
        {
          $scores[$id] = $scores[$id] + $inc;
        }
        else
        {
          $scores[$id] = $inc;
        }
      }
      while ( $row = $db->fetchrow() );
    }
    $db->free_result();
  }
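  // Illustrative sketch, not part of search.php: the scoring rule in the loop above,
  // where a phrase hit adds 1 to a page's score, or 1.5 when a phrase also occurs in
  // the page title. The helper name, the row layout and the sample data are assumptions.

```php
<?php
// Hypothetical helper: returns 1.5 if any phrase occurs in the title, otherwise 1.
function example_phrase_increment($title, $phrases, $case_sensitive = false)
{
  foreach ( $phrases as $phrase )
  {
    $pos = $case_sensitive ? strpos($title, $phrase) : stripos($title, $phrase);
    if ( $pos !== false )
      return 1.5;
  }
  return 1;
}

// Made-up starting scores and result rows
$scores = array('ns=Article;pid=Main_Page' => 2);
$rows = array(
  array('id' => 'ns=Article;pid=Main_Page', 'title' => 'Main Page'),
  array('id' => 'ns=Article;pid=Install',   'title' => 'Installation guide'),
);

foreach ( $rows as $row )
{
  $inc = example_phrase_increment($row['title'], array('main page'));
  // Add to an existing score, or start a new one for a page not seen before
  $scores[$row['id']] = ( isset($scores[$row['id']]) ? $scores[$row['id']] : 0 ) + $inc;
}
print_r($scores);
```

  // The sketch compares the result of strpos()/stripos() against false instead of
  // relying on the truthiness of strstr()/stristr(), whose return value is the matched
  // tail of the string and evaluates to false when that tail is exactly "0".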

  //
  // STAGE 4 - SELECT PAGE TEXT AND ELIMINATE NOTS
  // At this point, we have a complete list of all the possible pages. Now we want to obtain the page text, and within the same query
  // eliminate any pages containing terms that shouldn't be in there.
  //

  // Generate master word list for the highlighter
  $word_list = array_values(array_merge($query['any'], $query['req'], $query_phrase['any'], $query_phrase['req']));

  $text_where = array();
  foreach ( $scores as $page_id => $_ )
  {
    $text_where[] = $db->escape($page_id);
  }
  $text_where = '( ' . $concat_column . ' = \'' . implode('\' OR ' . $concat_column . ' = \'', $text_where) . '\' )';

  if ( count($query['not']) > 0 )
    $text_where .= ' AND';

  $where_not = array();
  foreach ( $query['not'] as $term )
  {
    $term = escape_string_like($term);
    if ( !$case_sensitive )
      $term = strtolower($term);
    $where_not[] = $term;
  }
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 464
$where_not = ( count($where_not) > 0 ) ? "$text_col NOT LIKE '%" . implode("%' AND $text_col NOT LIKE '%", $where_not) . "%'" : '';
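  // Illustration only (the term values here are hypothetical): with
  // $query['not'] = array('foo', 'bar'), the expression above produces
  //   <text_col> NOT LIKE '%foo%' AND <text_col> NOT LIKE '%bar%'
  // where <text_col> stands for whatever $text_col holds. The ' AND' appended to
  // $text_where above joins this clause onto the page ID filter in the WHERE.
  // With no excluded terms, $where_not is an empty string and no ' AND' is added.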

  $sql = 'SELECT ' . $concat_column . ' AS id, t.page_id, t.namespace, CHAR_LENGTH(t.page_text) AS page_length, t.page_text, p.name AS page_name FROM ' . table_prefix . "page_text AS t
            LEFT JOIN " . table_prefix . "pages AS p
              ON ( p.urlname = t.page_id AND p.namespace = t.namespace )
            WHERE $text_where $where_not;";
  if ( !($q = $db->sql_unbuffered_query($sql)) )
    $db->_die('Error is in perform_search(), includes/search.php, query 3');

  $page_data = array();
  if ( $row = $db->fetchrow() )
  {
    do
    {
      $row['page_text'] = htmlspecialchars($row['page_text']);
      $row['page_name'] = htmlspecialchars($row['page_name']);

      // Highlight results (this is wonderfully automated)
      $row['page_text'] = highlight_and_clip_search_result($row['page_text'], $word_list, $case_sensitive);
      if ( strlen($row['page_text']) > 250 && !preg_match('/^\.\.\.(.+)\.\.\.$/', $row['page_text']) )
      {
        $row['page_text'] = substr($row['page_text'], 0, 150) . '...';
      }
      $row['page_name'] = highlight_search_result($row['page_name'], $word_list, $case_sensitive);

      $page_data[$row['id']] = $row;
    }
    while ( $row = $db->fetchrow() );
  }
  $db->free_result();

  //
  // STAGE 5 - SPECIAL PAGE TITLE SEARCH
  // Iterate through $paths->pages and check the titles for search terms. Score accordingly.
  //

  foreach ( $paths->pages as $id => $page )
  {
    if ( $page['namespace'] != 'Special' )
      continue;
    if ( !is_int($id) )
      continue;
    $idstring = 'ns=' . $page['namespace'] . ';pid=' . $page['urlname_nons'];
    $any = array_values(array_unique(array_merge($query['any'], $query_phrase['any'])));
    foreach ( $any as $term )
    {
      if ( $case_sensitive )
      {
        if ( strstr($page['name'], $term) || strstr($page['urlname_nons'], $term) )
        {
          ( isset($scores[$idstring]) ) ? $scores[$idstring] = $scores[$idstring] + 1.5 : $scores[$idstring] = 1.5;
        }
      }
      else
      {
        if ( stristr($page['name'], $term) || stristr($page['urlname_nons'], $term) )
        {
          ( isset($scores[$idstring]) ) ? $scores[$idstring] = $scores[$idstring] + 1.5 : $scores[$idstring] = 1.5;
        }
      }
    }
    if ( isset($scores[$idstring]) )
    {
      $page_data[$idstring] = array(
        'page_name' => highlight_search_result($page['name'], $word_list, $case_sensitive),
        'page_text' => '',
        'page_id' => $page['urlname_nons'],
        'namespace' => $page['namespace'],
        'score' => $scores[$idstring],
        'page_length' => 1,
        'page_note' => '[' . $lang->get('search_result_tag_special') . ']'
      );
    }
  }

  //
  // STAGE 6 - SECOND ELIMINATION ROUND
  // Iterate through the list of required terms. If a given page does not contain a required term, eliminate it.
  //

  $required = array_merge($query['req'], $query_phrase['req']);
  foreach ( $required as $term )
  {
    foreach ( $page_data as $id => $page )
    {
      // Special pages have no text, so only their ID and name are checked; other pages are also checked against their text.
      // strstr() is case-sensitive, so this check does not take $case_sensitive into account.
      if ( ( $page['namespace'] == 'Special' || ( $page['namespace'] != 'Special' && !strstr($page['page_text'], $term) ) ) && !strstr($page['page_id'], $term) && !strstr($page['page_name'], $term) )
      {
        unset($page_data[$id]);
      }
    }
  }

  // At this point, all of our normal results are in. However, we can also allow plugins to hook into the system and score their own
  // pages and add text, etc. as necessary.
  // Plugins are COMPLETELY responsible for using the search terms and handling Boolean logic properly

  $code = $plugins->setHook('search_global_inner');
  foreach ( $code as $cmd )
  {
    eval($cmd);
  }
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 565
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 566
// a marvelous debugging aid :-)
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 567
// die('<pre>' . htmlspecialchars(print_r($page_data, true)) . '</pre>');
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 568
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 569
//
// STAGE 7 - HIGHLIGHT, TRIM, AND SCORE RESULTS
// We now have the complete results of the search. We need to trim text down to show only portions of the page containing search
// terms, highlight any search terms within the page, and sort the final results array in descending order of score.
//
// Sort scores array
arsort($scores);
// Divisor for calculating relevance scores
$divisor = ( count($query['any']) + count($query_phrase['any']) + count($query['req']) + count($query['not']) ) * 1.5;
foreach ( $scores as $page_id => $score )
{
if ( !isset($page_data[$page_id]) )
// It's possible that $scores contains a score for a page that was later eliminated because it contained a disallowed term
continue;
// Make a copy of the datum, then delete the original (it frees up a LOT of RAM)
$datum = $page_data[$page_id];
unset($page_data[$page_id]);
// This is an internal value used for sorting - it's no longer needed.
unset($datum['id']);
// Calculate score
// if ( $score > $divisor )
// $score = $divisor;
$datum['score'] = round($score / $divisor, 2) * 100;
// Highlight the URL
$datum['url_highlight'] = makeUrlComplete($datum['namespace'], $datum['page_id']);
$datum['url_highlight'] = preg_replace('/\?.+$/', '', $datum['url_highlight']);
$datum['url_highlight'] = highlight_search_result($datum['url_highlight'], $word_list, $case_sensitive);
// Store it in our until-now-unused results array
$results[] = $datum;
}
// Our work here is done. :-D
return $results;
}
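
The Stage 7 ranking step above (sort by raw score, drop pages that were filtered out, convert each score to a percentage) can be sketched on its own. This is a minimal illustration, not Enano's API: the function name `example_rank_results()` and its `$term_count` parameter are hypothetical, and the term count is assumed to be the total that the original code derives from the parsed query arrays.

```php
<?php
// Hypothetical helper for illustration only; it is not part of Enano.
// $scores:     page ID => raw score
// $page_data:  page ID => result row (only pages that survived filtering)
// $term_count: total number of parsed search terms (assumed, see lead-in)
function example_rank_results(array $scores, array $page_data, $term_count)
{
  $results = array();
  if ( $term_count < 1 )
    return $results; // nothing to rank; also avoids dividing by zero

  // Highest raw score first; arsort() keeps the page ID keys intact
  arsort($scores);

  // Same scaling as the code above: a raw score of 1.5 per term maps to 100
  $divisor = $term_count * 1.5;

  foreach ( $scores as $page_id => $score )
  {
    // A score may exist for a page that was later eliminated
    if ( !isset($page_data[$page_id]) )
      continue;

    $datum = $page_data[$page_id];
    $datum['score'] = round($score / $divisor, 2) * 100;
    $results[] = $datum;
  }
  return $results;
}

// Usage with made-up data: 'Deleted' has a score but no page data, so it is skipped.
$ranked = example_rank_results(
  array('Main_Page' => 1.5, 'Sandbox' => 3, 'Deleted' => 9),
  array(
    'Main_Page' => array('page_id' => 'Main_Page'),
    'Sandbox'   => array('page_id' => 'Sandbox')
  ),
  2
);
// $ranked holds Sandbox (score 100) followed by Main_Page (score 50)
```

Because the cap on `$score` is commented out in the original, a page whose raw score exceeds the divisor ends up with a value above 100; the sketch keeps that behavior rather than clamping.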
/**
* Parses a search query into an associative array. The resultant array will be filled with the following values, each an array:
* any: Search terms that can optionally be present
* req: Search terms that must be present
* not: Search terms that should not be present
* @param string Search query
* @param array Will be filled with parser warnings, such as query too short, words too short, etc.
* @return array
*/
function parse_search_query($query, &$warnings)
{
global $lang;
$stopwords = get_stopwords();
$ret = array(
'any' => array(),
'req' => array(),
'not' => array()
);
$warnings = array();
$terms = array();
$in_quote = false;
$start_term = 0;
$just_finished = false;
for ( $i = 0; $i < strlen($query); $i++ )
{
$chr = $query[$i];
$prev = ( $i > 0 ) ? $query[ $i - 1 ] : '';
$next = ( ( $i + 1 ) < strlen($query) ) ? $query[ $i + 1 ] : '';
if ( ( $chr == ' ' && !$in_quote ) || ( $i + 1 == strlen ( $query ) ) )
{
$len = ( $next == '' ) ? $i + 1 : $i - $start_term;
$word = substr ( $query, $start_term, $len );
$terms[] = $word;
$start_term = $i + 1;
}
elseif ( $chr == '"' && $in_quote && $prev != '\\' )
{
$word = substr ( $query, $start_term, $i - $start_term + 1 );
$start_pos = ( $next == ' ' ) ? $i + 2 : $i + 1;
$in_quote = false;
}
elseif ( $chr == '"' && !$in_quote )
{
$in_quote = true;
$start_pos = $i;
}
}
$ticker = 0;
foreach ( $terms as $element => $__unused )
{
$atom =& $terms[$element];
$ticker++;
if ( $ticker == 20 )
{
$warnings[] = $lang->get('search_err_query_too_many_terms');
break;
}
if ( substr ( $atom, 0, 2 ) == '+"' && substr ( $atom, ( strlen ( $atom ) - 1 ), 1 ) == '"' )
{
$word = substr ( $atom, 2, ( strlen( $atom ) - 3 ) );
if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
{
$warnings[] = $lang->get('search_err_query_has_stopwords');
$ticker--;
continue;
}
if(in_array($word, $ret['req']))
{
$warnings[] = $lang->get('search_err_query_dup_terms');
$ticker--;
continue;
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 694
}
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 695
$ret['req'][] = $word;
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 696
}
    elseif ( substr ( $atom, 0, 2 ) == '-"' && substr ( $atom, ( strlen ( $atom ) - 1 ), 1 ) == '"' )
    {
      $word = substr ( $atom, 2, ( strlen( $atom ) - 3 ) );
      if ( strlen ( $word ) < 4 )
      {
        $warnings[] = $lang->get('search_err_query_term_too_short');
        $ticker--;
        continue;
      }
      if(in_array($word, $ret['not']))
      {
        $warnings[] = $lang->get('search_err_query_dup_terms');
        $ticker--;
        continue;
      }
      $ret['not'][] = $word;
    }
    elseif ( substr ( $atom, 0, 1 ) == '+' )
    {
      $word = substr ( $atom, 1 );
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
      {
        $warnings[] = $lang->get('search_err_query_has_stopwords');
        $ticker--;
        continue;
      }
      if(in_array($word, $ret['req']))
      {
        $warnings[] = $lang->get('search_err_query_dup_terms');
        $ticker--;
        continue;
      }
      $ret['req'][] = $word;
    }
    elseif ( substr ( $atom, 0, 1 ) == '-' )
    {
      $word = substr ( $atom, 1 );
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
      {
        $warnings[] = $lang->get('search_err_query_has_stopwords');
        $ticker--;
        continue;
      }
      if(in_array($word, $ret['not']))
      {
        $warnings[] = $lang->get('search_err_query_dup_terms');
        $ticker--;
        continue;
      }
      $ret['not'][] = $word;
    }
    elseif ( substr ( $atom, 0, 1 ) == '"' && substr ( $atom, ( strlen($atom) - 1 ), 1 ) == '"' )
    {
      $word = substr ( $atom, 1, ( strlen ( $atom ) - 2 ) );
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
      {
        $warnings[] = $lang->get('search_err_query_has_stopwords');
        $ticker--;
        continue;
      }
      if(in_array($word, $ret['any']))
      {
        $warnings[] = $lang->get('search_err_query_dup_terms');
        $ticker--;
        continue;
      }
      $ret['any'][] = $word;
    }
    else
    {
      $word = $atom;
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
      {
        $warnings[] = $lang->get('search_err_query_has_stopwords');
        $ticker--;
        continue;
      }
      if(in_array($word, $ret['any']))
      {
        $warnings[] = $lang->get('search_err_query_dup_terms');
        $ticker--;
        continue;
      }
      $ret['any'][] = $word;
    }
  }
  return $ret;
}

/**
 * Escapes a string for use in a LIKE clause.
 * @param string
 * @return string
 */

function escape_string_like($string)
{
  global $db, $session, $paths, $template, $plugins; // Common objects
  $string = $db->escape($string);
  $string = str_replace(array('%', '_'), array('\%', '\_'), $string);
  return $string;
}

/**
 * Wraps <highlight></highlight> tags around all words in the specified array. Does not perform any clipping.
 * @param string Text to process
 * @param array Word list
 * @param bool If true, searches case-sensitively when highlighting words
 * @return string
 */

function highlight_search_result($pt, $words, $case_sensitive = false)
{
  $words2 = array();
  for ( $i = 0; $i < sizeof($words); $i++)
  {
    if(!empty($words[$i]))
      // Quote regex metacharacters, including the "/" delimiter used in the pattern below
      $words2[] = preg_quote($words[$i], '/');
  }

  $flag = ( $case_sensitive ) ? '' : 'i';
  $regex = '/(' . implode('|', $words2) . ')/' . $flag;
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 819
$pt = preg_replace($regex, '<highlight>\\1</highlight>', $pt);
292
b3cfaf0a505c
Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
diff
changeset
+ − 820
272
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 821
return $pt;
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 822
}
e0ec986c0af3
Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
diff
changeset
+ − 823
/**
 * Wraps <highlight></highlight> tags around every word from the specified array that occurs in the specified text, then clips
 * the text to roughly 150 characters around the first match.
 * @param string Text to process
 * @param array Word list
 * @param bool If true, searches case-sensitively when highlighting words
 * @return string
 */

function highlight_and_clip_search_result($pt, $words, $case_sensitive = false)
{
  $cut_off = false;

  $space_chars = array("\t", "\n", "\r", " ");

  $pt = highlight_search_result($pt, $words, $case_sensitive);

  foreach ( $words as $word )
  {
    // Find the first case-insensitive occurrence of this word; highlighting was already applied above
    $ptlen = strlen($pt);
    for ( $i = 0; $i < $ptlen; $i++ )
    {
      $len = strlen($word);
      if ( strtolower(substr($pt, $i, $len)) == strtolower($word) )
      {
        // $chunk1 = text before the match, $chunk2 = the match itself, $chunk3 = text after it
        $chunk1 = substr($pt, 0, $i);
        $chunk2 = substr($pt, $i, $len);
        $chunk3 = substr($pt, ( $i + $len ));
        $pt = $chunk1 . $chunk2 . $chunk3;
        $ptlen = strlen($pt);
        // Cut off text to 150 chars or so
        if ( !$cut_off )
        {
          $cut_off = true;
          if ( $i - 75 > 0 )
          {
            // Navigate backwards until a space character is found
            $chunk = substr($pt, 0, ( $i - 75 ));
            $final_chunk = $chunk;
            for ( $j = strlen($chunk) - 1; $j > 0; $j = $j - 1 )
            {
              if ( in_array($chunk[$j], $space_chars) )
              {
                $final_chunk = substr($chunk, $j + 1);
                break;
              }
            }
            $mid_chunk = substr($pt, ( $i - 75 ), 75);

            $clipped = '...' . $final_chunk . $mid_chunk . $chunk2;

            // Navigate forwards until a space character is found
            $chunk = substr($pt, ( $i + strlen($chunk2) + 75 ));
            $final_chunk = $chunk;
            for ( $j = 0; $j < strlen($chunk); $j++ )
            {
              if ( in_array($chunk[$j], $space_chars) )
              {
                $final_chunk = substr($chunk, 0, $j);
                break;
              }
            }

            $end_chunk = substr($pt, ( $i + strlen($chunk2) ), 75 );

            $clipped .= $end_chunk . $final_chunk . '...';

            $pt = $clipped;
          }
          else if ( strlen($pt) > 200 )
          {
            // The match lies within the first 75 characters, so keep everything before it
            $clipped = $chunk1 . $chunk2;

            // Navigate forwards until a space character is found
            $chunk = substr($pt, ( $i + strlen($chunk2) + 75 ));
            $final_chunk = $chunk;
            for ( $j = 0; $j < strlen($chunk); $j++ )
            {
              if ( in_array($chunk[$j], $space_chars) )
              {
                $final_chunk = substr($chunk, 0, $j);
                break;
              }
            }

            $end_chunk = substr($pt, ( $i + strlen($chunk2) ), 75 );

            $clipped .= $end_chunk . $final_chunk . '...';

            $pt = $clipped;

          }
          break 2;
        }
      }
    }
    $cut_off = false;
  }
  return $pt;
}

/**
 * Returns a list of words that shouldn't under most circumstances be indexed for searching.
 * @return array
 */

function get_stopwords()
{
  // Build the list once per request and reuse it on later calls
  static $stopwords;
  if ( is_array($stopwords) )
    return $stopwords;

  $stopwords = array('I', 'a', 'about', 'an', 'are', 'as', 'at', 'be', 'by', 'com', 'de', 'en', 'for', 'from', 'how', 'in', 'is', 'it',
                     'la', 'of', 'on', 'or', 'that', 'the', 'this', 'to', 'was', 'what', 'when', 'where', 'who', 'will', 'with', 'and');

  return $stopwords;
}

?>