| |
| |
| |
|
Page: 1 2 3 4 5 6 7 8 9 10
Comments:
<Setien> is there a way to index a longtext field that will allow me to do LIKE '%foo%' operations really fast? <Setien> I tried fulltext and MATCH, but that does full word matches and is not useful to what I am doing <Therion> No, that sort of ridiculous pattern requires scanning all values <Therion> It's as efficient as it can be <domas> Setien: give a list of keywords and it would be really fast <Setien> domas, how do you mean? <domas> if that keyword list is known before building the index ;-) <Setien> domas, it's not <domas> pity then <Setien> yes :) <domas> you totally miss the concept of indexing <domas> :(( <Setien> you mean I don't understand it? <domas> that's right. <domas> I wonder which mysql version optimzied %blah% lookups quite a lot <domas> maybe it was 4.1 or 4.0 <Setien> no, I understand it alright, but I was hoping the fine folks at mysql had some clever trickery up their sleeve that would at least allow them to do it faster <domas> Setien: yeah. guessing. <Sneaky_Bastard> it is possible to index all the words in mysql text fields <domas> !m Sneaky_Bastard fulltext <SQL> Sneaky_Bastard: (Full-Text Search Functions) : http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html <Sneaky_Bastard> yeah <domas> here. <domas> sure it is! <Setien> I know it wouldn't be an index as such, but I am sure it would be possible to make some kind of preprocessing that would make partial matches faster <Sneaky_Bastard> fulltext fields can be indexed, too <Setien> domas, I am using mysql 4.1 by the way <Sneaky_Bastard> to speed up the searches <domas> Setien: in theory - maybe. <Setien> Sneaky_Bastard, but not with partial matches, right? <domas> say... split all text into fixed width lexemes <Setien> I seem to remember seeing something about being able to use wildcards in the MATCH, but I can't find it <domas> Setien: you can use prefix wildcard matching in fulltext <Setien> domas, that would be one possibility <Sneaky_Bastard> Setien: there is no environmenton earth, there there are signficant speedups for regular expression searches <Setien> domas, only prefix? <Sneaky_Bastard> environment on earth <Setien> Sneaky_Bastard, I didn't get that <Sneaky_Bastard> "partial" match is a regular expression <Setien> Sneaky_Bastard, you can't seriously be saying that a '%foo%' match uses the full overhead of a regular expression match? <Darien> no, it doesn't <Sneaky_Bastard> I said exactly and precisely what I said <Darien> but it does do a table scan <domas> if all keywords are longer than X, then there's a possibility to split the text into X/2 lexemes and use shifting of lexeme for key lookups.. <domas> ergh <domas> that's terribly inefficient but.. works in some situations <Darien> '%foo%' doesn't use regular expressions <Sneaky_Bastard> domas, the question is what does MySQL support internally <domas> and still.. is search. <domas> Sneaky_Bastard: prefix keyword matching for fulltext. <domas> and... custom parsers. in 5.1 <Sneaky_Bastard> if you index the fulltext <Sneaky_Bastard> then the search on it's index is doing to have to find all keys that match the partial match <domas> usually the magic of fulltext indexing lies in how you tokenize your text. <domas> if you tokenize into keywords you use, it will work. <Sneaky_Bastard> is going <domas> if not, it won't. <Sneaky_Bastard> I mean, that's the mechanics <Sneaky_Bastard> of how it *has* to be done internally <domas> haha, it's not about 'internally', it's about 'done', ha ha ha <Sneaky_Bastard> <phrase>* match still requires running through sorted keys starting with first match <domas> yes. <Sneaky_Bastard> sorry, but the question was wether MySQL had an efficient internal method for doing this <domas> for <phrase>*? yes. <Sneaky_Bastard> my response was that there is not <Sneaky_Bastard> you do realize that a regular expression gets compiled into a little state machine for doing the compare ? <Sneaky_Bastard> and that <pattern>* is just a relatively easy one to test against ? <dammit> it's an index. <dammit> you can walk in indexes <firewire> is the question will mysql use an index for a regular expression? <dammit> sometimes. if ordered. <Sneaky_Bastard> walking the index STILL REQUIRES TESTING EACH ENTRY FOR MATCH <Sneaky_Bastard> binary search will work <firewire> or is someone just trolling <Sneaky_Bastard> but you still have to do the compare <Sneaky_Bastard> index is ordered, by definition!!! <Sneaky_Bastard> WTF? <dammit> well, index can be unordered. <dammit> like... in... hash ;-) <Setien> so what I am going to do <Sneaky_Bastard> ok, you're right <dammit> or... rd-tree <dammit> or... <Sneaky_Bastard> a hash index has to be randomly searched <Sneaky_Bastard> sequentially searched <Setien> nm <Sneaky_Bastard> but then, a database designer who is not a moron, doesn't use hash-based indexes for keys that have to searched for partial matches :p <dammit> :-) <dammit> well, you can always scan a hash! :) <Sneaky_Bastard> yeah, sure. easy to say <Sneaky_Bastard> no fast internal implementation that doesn't involve a regular expression in some form or other, even simplified <dammit> why do you call simple string matching 'regular expression'? <Sneaky_Bastard> even if you don't think it looks like a regular expression. <Sneaky_Bastard> partial string match <Sneaky_Bastard> is not a simple string match <Sneaky_Bastard> that's why <dammit> boy, look at etymology of 'regular expression' <firewire> :-D <dammit> partial string match might be cheaper than 'simple string match'; <Sneaky_Bastard> don't call me boy, child <Sneaky_Bastard> >:( <dammit> boooo <firewire> now now children <dammit> look at mysys/ <dammit> it has some of optimized functions for string matching <dammit> or wait, string/ <dammit> none of these things resemble 'regular expressions' <dammit> compiled or not. <firewire> I'm still trying to figure out what you two are arguing about <dammit> firewire: I don't know <firewire> hahaha <dammit> firewire: something about regexps <Sneaky_Bastard> I suggest that you study "Introduction to Automata Theory, Language and Computation", if you think a partial string match does not involve a regular expression <firewire> partial string match involes ranges not regular expressions in the PCRE sense <Sneaky_Bastard> even if it is possible code that particular cl*** of expression match highly efficiently. <Sneaky_Bastard> I never said it did <dammit> introduction sounds a bit too difficult for me. I need basics! <Sneaky_Bastard> we were discussing internals <firewire> col like 'foo%' is exapnded to col >= 'foo' and col < 'fop' <Sneaky_Bastard> uh, no.\ <firewire> are you sure? <firewire> because that's how mysql does it :) <Sneaky_Bastard> that is not descriptive of how the code that does the match operates <dammit> 'A regular expression (abbreviated as regexp, regex, or regxp, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. ' <firewire> like 'foo%' has NOTHING to do with PCRE <dammit> 'foo%' by itself is regexp <dammit> though, internally it is not handled as such <Sneaky_Bastard> I see we are having basic Comp Sci terminology problem here. <firewire> yes <dammit> yes! :) <firewire> why don't you define regular expression for us <Sneaky_Bastard> so I'm just going to drop it as it is apparently not possible to communicate what I mean <dammit> 'expression'. mind this word. <dammit> and don't invent terminology ;-) <Sneaky_Bastard> I referenced the definitive text <Sneaky_Bastard> not my problem if you have not studied it. <Sneaky_Bastard> have fun <firewire> heh <firewire> I didn't realize we had a definitive text <Sneaky_Bastard> you do, like just about everything in Comp. Sci. <firewire> regular expression in my world refers to PCRE or what grep uses for "pattern matching" <dammit> somewhat in my world too. <Sneaky_Bastard> if I had meant that, I would have specified PCRE <Sneaky_Bastard> which I did not <dammit> so... unix regexps are not pcre, what are they then? <dammit> posix ones that is <dammit> or is it for another PCRE? <dammit> as in posix-compatible instead of perl-compatible? <firewire> pcre is expanded posix compatible expressions <Sneaky_Bastard> languages for representing "regular expressions" is what they are <Sneaky_Bastard> there being more than one.
Return to
#mysql or Go to some related
logs:
pcap_tPtr download MzBot gaim trebuchet MS ubuntu dvd::rip encrypted python hex range(* #web #linux deauthenticated due to local deauth request #perl tuxtheslacker
|
|