Tycoon Talk
Become a Big fish!
The number 1 forum for online business!
Post topics, ask questions, share your knowledge.
Tycoon Talk is part of Freelancer.com - find skilled workers online at a fraction of the cost.

The Database Forum


You are currently viewing our The Database Forum as a guest. Please register to participate.
Login



Reply
Can we direct PHP/MySQL to search only a subset of the total records?
Old 12-15-2008, 08:09 AM Can we direct PHP/MySQL to search only a subset of the total records?
TWD
TWD's Avatar
King Spam Talker

Posts: 1,112
Trades: 0
For example imagine a simple two field table.

The first column contains all dates from the 20th century.
The second column contains the description of weather (rainy, snowy, sunny, cloudy) for that day.
I want to know how many days were rainy in 1955.

Rather than SELECT * WHERE weather = "rainy" AND ("date" > "Dec 31, 1954" AND "date"<"Jan 1, 1956") for the entire table is there a more efficient way of just directing PHP/MySQL to search just the year 1955?
TWD is offline
Reply With Quote
View Public Profile
 
 
Register now for full access!
Old 12-15-2008, 08:20 AM Re: Can we direct PHP/MySQL to search only a subset of the total records?
tripy's Avatar
Do not try this at home!

Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Trades: 0
This should be a bit more effective, as the query planner drop the time information when scanning rows.
The function extract()allows you to take just 1 part of the date in your where clause:
http://dev.mysql.com/doc/refman/5.1/...nction_extract
Code:
SELECT count(*) 
FROM table
WHERE weather = "rainy" 
AND extract(year from dateField)=1955
__________________
Only a biker knows why a dog sticks his head out the window.
tripy is offline
Reply With Quote
View Public Profile Visit tripy's homepage!
 
Old 12-15-2008, 01:57 PM Re: Can we direct PHP/MySQL to search only a subset of the total records?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
First, what you described is exactly what you want to do. A database tries to offer good performance and scalability, and, to do so, it needs optimization, like the type you describe.

Now, that said, the query engine shouldn't need much help (or "tricking") on your part to find those optimizations. The query you typed should NOT read in and then evaluate every row in the table - only the ones from 1955. You can verify that it's actually doing this in the query execution plan.

If it's doing a table scan, you're dead in the water - as far as performance goes. If that's what you see, the way to fix it is with a data structure called an index. You would want one on the date column, so that even given the whole table, your database will know exactly where all of the 1955 entries are, and not even have to touch any of the others. On the other hand, if it was "rainy" so few times over the course of the century that you have fewer of these than days in 1955, an index here would be even more beneficial. Here's the cool part - if you build both indexes, the query optimizer will decide when you run the query which one is better, and use it as appropriate.

And this might be the first time I've disagreed with Tripy, but in fact, at least in SQL Server, the query he provided will be slower. It forces a table scan, which means (potentially) massive disc IO. When you ask for a function to be applied to column data, as opposed to column data to match the results of a function, the database has to load the data and apply the function, to make sure it matches (or doesn't) to satisfy your query. (Newer db engines can optimize this away if the function is deterministic.)
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE

Last edited by Learning Newbie; 12-15-2008 at 02:03 PM..
Learning Newbie is offline
Reply With Quote
View Public Profile
 
Old 12-15-2008, 03:24 PM Re: Can we direct PHP/MySQL to search only a subset of the total records?
tripy's Avatar
Do not try this at home!

Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Trades: 0
Quote:
And this might be the first time I've disagreed with Tripy, but in fact, at least in SQL Server, the query he provided will be slower. It forces a table scan, which means (potentially) massive disc IO.
Never, ever reply in a hurry 5 minutes before a meeting...
Proof reading what I wrote, I can only admit John is right.

That being said, and for my defense, I would never use something like that on a non indexed field. Or if I knew I would use the year as filtering token, I would add a numeric column with the extracted year in it.

At first, I had wrote the query with a "between x and y", and I should have left it that way.
My bad.
Thanks for pointing this out John, I still have a bit of difficulties to determine what would be deterministic or not before firing the query with the plan to look at it. And particularly over datetime columns.
__________________
Only a biker knows why a dog sticks his head out the window.
tripy is offline
Reply With Quote
View Public Profile Visit tripy's homepage!
 
Old 12-15-2008, 05:38 PM Re: Can we direct PHP/MySQL to search only a subset of the total records?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
Quote:
Originally Posted by tripy View Post
Thanks for pointing this out John, I still have a bit of difficulties to determine what would be deterministic or not before firing the query with the plan to look at it. And particularly over datetime columns.
Well, this forum is for learning, so I wanted to make sure the info we share is reliable.

As for deterministic, this is more of a brain dump than a 100 % reliable set of rules. But maybe it's useful anyway?

Determinism = cause and effect - randomness. A deterministic function is one that will always return the same result given the same input. When this is most important is if you use a calculated column and index it, or want an indexed view, because you'll have data corruption if the return value changes! Some things that make a function non deterministic are
  • Data access - if it's looking something up in a table, the data in that table can change.
  • Calling out to other functions - calling another deterministic function is allowed, thought. Any calls to GetDate(), Rand(), or anything like that, means your function is not deterministic.
  • Global variables - Even considering things like @@Identity or @@TranCount means your function is non deterministic.
If you're making SQL CLR functions, however, you just decorate your method with an attribute and tell SQL whether or not the function is deterministic. When you write a T-SQL UDF, the query compiler decides for itself, but if you're writing .NET functions, you can lie to SQL about this. It's not recommended, obviously, unless you want to practice disaster recovery - but that's how it works, currently, in the CLR.

For some fun background, see Nietzsche's theory of determinism.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
Learning Newbie is offline
Reply With Quote
View Public Profile
 
Reply     « Reply to Can we direct PHP/MySQL to search only a subset of the total records?
 

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off





   
RSS Feed  Feeds: RSS   JS   XML
RSS Feed  Feeds for this forum: RSS   JS   XML



Page generated in 0.74242 seconds with 12 queries