@# Quotes DB     useful, funny, interesting





Google
 
Web www.quotesdb.info


Undernet  |  EFnet  |  Quakenet  |  Freenode  |  Dalnet  |  Ircnet  |  Galaxynet
Page: 1 2 3 4 5 6 7 8 9 10



Comments:

<Setien> is there a way to index a longtext field that will allow me to do LIKE '%foo%' operations really fast?
<Setien> I tried fulltext and MATCH, but that does full word matches and is not useful to what I am doing
<Therion> No, that sort of ridiculous pattern requires scanning all values
<Therion> It's as efficient as it can be
<domas> Setien: give a list of keywords and it would be really fast
<Setien> domas, how do you mean?
<domas> if that keyword list is known before building the index ;-)
<Setien> domas, it's not
<domas> pity then
<Setien> yes :)
<domas> you totally miss the concept of indexing
<domas> :((
<Setien> you mean I don't understand it?
<domas> that's right.
<domas> I wonder which mysql version optimzied %blah% lookups quite a lot
<domas> maybe it was 4.1 or 4.0
<Setien> no, I understand it alright, but I was hoping the fine folks at mysql had some clever trickery up their sleeve that would at least allow them to do it faster
<domas> Setien: yeah. guessing.
<Sneaky_Bastard> it is possible to index all the words in mysql text fields
<domas> !m Sneaky_Bastard fulltext
<SQL> Sneaky_Bastard: (Full-Text Search Functions) : http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html
<Sneaky_Bastard> yeah
<domas> here.
<domas> sure it is!
<Setien> I know it wouldn't be an index as such, but I am sure it would be possible to make some kind of preprocessing that would make partial matches faster
<Sneaky_Bastard> fulltext fields can be indexed, too
<Setien> domas, I am using mysql 4.1 by the way
<Sneaky_Bastard> to speed up the searches
<domas> Setien: in theory - maybe.
<Setien> Sneaky_Bastard, but not with partial matches, right?
<domas> say... split all text into fixed width lexemes
<Setien> I seem to remember seeing something about being able to use wildcards in the MATCH, but I can't find it
<domas> Setien: you can use prefix wildcard matching in fulltext
<Setien> domas, that would be one possibility
<Sneaky_Bastard> Setien: there is no environmenton earth, there there are signficant speedups for regular expression searches
<Setien> domas, only prefix?
<Sneaky_Bastard> environment on earth
<Setien> Sneaky_Bastard, I didn't get that
<Sneaky_Bastard> "partial" match is a regular expression
<Setien> Sneaky_Bastard, you can't seriously be saying that a '%foo%' match uses the full overhead of a regular expression match?
<Darien> no, it doesn't
<Sneaky_Bastard> I said exactly and precisely what I said
<Darien> but it does do a table scan
<domas> if all keywords are longer than X, then there's a possibility to split the text into X/2 lexemes and use shifting of lexeme for key lookups..
<domas> ergh
<domas> that's terribly inefficient but.. works in some situations
<Darien> '%foo%' doesn't use regular expressions
<Sneaky_Bastard> domas, the question is what does MySQL support internally
<domas> and still.. is search.
<domas> Sneaky_Bastard: prefix keyword matching for fulltext.
<domas> and... custom parsers. in 5.1
<Sneaky_Bastard> if you index the fulltext
<Sneaky_Bastard> then the search on it's index is doing to have to find all keys that match the partial match
<domas> usually the magic of fulltext indexing lies in how you tokenize your text.
<domas> if you tokenize into keywords you use, it will work.
<Sneaky_Bastard> is going
<domas> if not, it won't.
<Sneaky_Bastard> I mean, that's the mechanics
<Sneaky_Bastard> of how it *has* to be done internally
<domas> haha, it's not about 'internally', it's about 'done', ha ha ha
<Sneaky_Bastard> <phrase>* match still requires running through sorted keys starting with first match
<domas> yes.
<Sneaky_Bastard> sorry, but the question was wether MySQL had an efficient internal method for doing this
<domas> for <phrase>*? yes.
<Sneaky_Bastard> my response was that there is not
<Sneaky_Bastard> you do realize that a regular expression gets compiled into a little state machine for doing the compare ?
<Sneaky_Bastard> and that <pattern>* is just a relatively easy one to test against ?
<dammit> it's an index.
<dammit> you can walk in indexes
<firewire> is the question will mysql use an index for a regular expression?
<dammit> sometimes. if ordered.
<Sneaky_Bastard> walking the index STILL REQUIRES TESTING EACH ENTRY FOR MATCH
<Sneaky_Bastard> binary search will work
<firewire> or is someone just trolling
<Sneaky_Bastard> but you still have to do the compare
<Sneaky_Bastard> index is ordered, by definition!!!
<Sneaky_Bastard> WTF?
<dammit> well, index can be unordered.
<dammit> like... in... hash ;-)
<Setien> so what I am going to do
<Sneaky_Bastard> ok, you're right
<dammit> or... rd-tree
<dammit> or...
<Sneaky_Bastard> a hash index has to be randomly searched
<Sneaky_Bastard> sequentially searched
<Setien> nm
<Sneaky_Bastard> but then, a database designer who is not a moron, doesn't use hash-based indexes for keys that have to searched for partial matches :p
<dammit> :-)
<dammit> well, you can always scan a hash! :)
<Sneaky_Bastard> yeah, sure. easy to say
<Sneaky_Bastard> no fast internal implementation that doesn't involve a regular expression in some form or other, even simplified
<dammit> why do you call simple string matching 'regular expression'?
<Sneaky_Bastard> even if you don't think it looks like a regular expression.
<Sneaky_Bastard> partial string match
<Sneaky_Bastard> is not a simple string match
<Sneaky_Bastard> that's why
<dammit> boy, look at etymology of 'regular expression'
<firewire> :-D
<dammit> partial string match might be cheaper than 'simple string match';
<Sneaky_Bastard> don't call me boy, child
<Sneaky_Bastard> >:(
<dammit> boooo
<firewire> now now children
<dammit> look at mysys/
<dammit> it has some of optimized functions for string matching
<dammit> or wait, string/
<dammit> none of these things resemble 'regular expressions'
<dammit> compiled or not.
<firewire> I'm still trying to figure out what you two are arguing about
<dammit> firewire: I don't know
<firewire> hahaha
<dammit> firewire: something about regexps
<Sneaky_Bastard> I suggest that you study "Introduction to Automata Theory, Language and Computation", if you think a partial string match does not involve a regular expression
<firewire> partial string match involes ranges not regular expressions in the PCRE sense
<Sneaky_Bastard> even if it is possible code that particular cl*** of expression match highly efficiently.
<Sneaky_Bastard> I never said it did
<dammit> introduction sounds a bit too difficult for me. I need basics!
<Sneaky_Bastard> we were discussing internals
<firewire> col like 'foo%' is exapnded to col >= 'foo' and col < 'fop'
<Sneaky_Bastard> uh, no.\
<firewire> are you sure?
<firewire> because that's how mysql does it :)
<Sneaky_Bastard> that is not descriptive of how the code that does the match operates
<dammit> 'A regular expression (abbreviated as regexp, regex, or regxp, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. '
<firewire> like 'foo%' has NOTHING to do with PCRE
<dammit> 'foo%' by itself is regexp
<dammit> though, internally it is not handled as such
<Sneaky_Bastard> I see we are having basic Comp Sci terminology problem here.
<firewire> yes
<dammit> yes! :)
<firewire> why don't you define regular expression for us
<Sneaky_Bastard> so I'm just going to drop it as it is apparently not possible to communicate what I mean
<dammit> 'expression'. mind this word.
<dammit> and don't invent terminology ;-)
<Sneaky_Bastard> I referenced the definitive text
<Sneaky_Bastard> not my problem if you have not studied it.
<Sneaky_Bastard> have fun
<firewire> heh
<firewire> I didn't realize we had a definitive text
<Sneaky_Bastard> you do, like just about everything in Comp. Sci.
<firewire> regular expression in my world refers to PCRE or what grep uses for "pattern matching"
<dammit> somewhat in my world too.
<Sneaky_Bastard> if I had meant that, I would have specified PCRE
<Sneaky_Bastard> which I did not
<dammit> so... unix regexps are not pcre, what are they then?
<dammit> posix ones that is
<dammit> or is it for another PCRE?
<dammit> as in posix-compatible instead of perl-compatible?
<firewire> pcre is expanded posix compatible expressions
<Sneaky_Bastard> languages for representing "regular expressions" is what they are
<Sneaky_Bastard> there being more than one.


Name:

Comments:

Please enter the result of the sum 63 + 46 (to avoid spam):






Return to #mysql
or
Go to some related logs:

pcap_tPtr
download MzBot
gaim trebuchet MS
ubuntu dvd::rip encrypted
python hex range(*
#web
#linux
deauthenticated due to local deauth request
#perl
tuxtheslacker



Home  |  disclaimer  |  contact  |  submit quotes