|
What SQL features do you use the most?
05-08-2008, 02:18 PM
|
What SQL features do you use the most?
|
Posts: 1,228
|
I'm making a database abstraction layer and I need to write methods for my database object. I'd like to create a useful class, but I don't want to overload it with features that most people won't use. I know which ones I use, but I'm curious to know which SQL features you use most to build your queries (regardless of what type of SQL database you're running). What would absolutely appall you if I left it out?
I'm still building it, but some of my current methods are: left_join, where, order_by, etc. In your replies, if you could please post what you commonly use and then also tell me what database you work with, it would help me a lot. Also, if you have any general suggestions for methods, that would be useful too. Thank you.
Last edited by VirtuosiMedia; 05-08-2008 at 05:31 PM..
|
|
|
|
05-08-2008, 06:53 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
All the basics, like what you've mentioned. (Select, Update, Insert, Delete; Where, From, Joins, Having, Order By, Top N, all the usual basic stuff you'll find in MS Access.) Custom aggregates are amazingly valuable, but I don't know how you might hope to implement them.
I doubt I could do much meaningful SQL work without subqueries, or, failing that, CTEs. The S(tructured) in SQL comes from the concept of a subquery.
Check constraints and join hints are pretty big. Locking hints to a lesser degree.
Ranking and partitioning functions are an absolute must have.
Error shielding and logging, plus good client statistics and (but this is less often than the others) access to the query plan.
Hope this helps. I don't know how I would go about trying to model these into a class, but if I were going to use that class in a web project, these are the things I would immediately look for in it.
|
|
|
|
05-08-2008, 07:41 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
Thank you, sir, exactly the type of reply I was looking for.  I'm definitely going to have my work cut out for me, but I'll just start on the basics and work my way from there. As just a quick follow up, do you use all join types, or just a few?
Last edited by VirtuosiMedia; 05-08-2008 at 08:08 PM..
|
|
|
|
05-09-2008, 02:17 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
Wow. Going over your list and as I dig deeper, I'm starting to discover why MySQL has been maligned by a lot of people. As far as I can tell, it doesn't support CTEs or check constraints and ranking functions have to be emulated. However, from a market perspective, because I hope to offer alternatives to Wordpress, Joomla, and Drupal, I can't rule MySQL out.
I think my best option right now is to make a simple base database object with features that are shared across databases (even if their syntax is slightly different) and then make it easily extensible. Many of the operations for a CMS of this sort are going to be simplistic CRUD functions and having simple base DBObject will ensure module portability for that kind of use. However, if someone wants or needs to take advantage of a database specific feature, they can extend the base object. The new modules that are dependent on the extension won't be portable, but the extension class shouldn't break the old modules either as long as they don't rewrite any existing methods. I'm fairly certain I can figure out a way to easily change which class the table-specific classes extend.
My problem today is figuring out how to write methods for building subqueries and top N. I don't think it's going to be very easy.
Yesterday I was able to get the class working for building basic select statements using methods for selecting field names, assigning aliases, left joins, where clauses, order by, limit, and offset. I also have it so that it can return a single row accessible through magic methods, a multiple row result set that can be accessed by a simple foreach loop, a number for getting a count, and a debug option that prints the built query. I haven't tried inserting, updating or deleting yet. Once I finish a prototype, I'm going to do some performance testing and see how much overhead it adds and if I can improve it at all. I'll probably also see if I can simplify or improve the methods at that time as well.
Last edited by VirtuosiMedia; 05-09-2008 at 02:29 PM..
|
|
|
|
05-09-2008, 02:18 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
I use both join types.  Inner and outer.
A left join and a right join are the same thing, you just list the tables in different order. A cross or Cartesian join can be achieved with inner or outer, it's really more about the join condition or where clause. I tend to never use full outer joins. If I want that functionality I combine a left and a right join via the union operator. ( This gives me a little more control.)
But one thing that's important to note when we're talking about joins, is that I always, in every case, use correlation names with joins. Here's an example of what I mean
PHP Code:
Select O.Name, C.Name From sysObjects O Join sysColumns C On O.ID = C.ID
|
|
|
|
05-09-2008, 03:08 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
Thanks again.
Quote:
Originally Posted by Learning Newbie
But one thing that's important to note when we're talking about joins, is that I always, in every case, use correlation names with joins.
|
I think I found a decent way to do that, except I just named my method 'alias'. I'm going to allow for the database tables to be prefixed upon installation, so the actual table name would be something like VM_users. The class takes "users" as the parameter, prepends the prefix for the table name as "VM_users", but then automatically sets "users" as the alias unless an alternate alias is used, in which case the alternate is used. That way, it lets me use the table name without having to know what the prefix is. I'm sure it's been done thousands of times, but it was kind of cool to get it working on my own.
As a code example:
PHP Code:
$users = new users(); $users->left_join("userInfo", "users.user_id = ui.user_id", "ui"); $users->where("users.gender = 'M'"); $users->limit(10); $users->offset(5); $users->order_by("ui.DOB", "ASC"); $users->select("debug");
will produce the following query for MySQL:
Code:
SELECT * FROM VM_users AS users LEFT JOIN VM_userInfo AS ui ON user.user_id = ui.user_id WHERE users.gender = 'M' ORDER BY ui.DOB ASC LIMIT 5, 10
And this code:
PHP Code:
$users = new users(); $users->alias('u'); $users->left_join("userInfo", "u.user_id = userInfo.user_id"); $users->where("u.gender = 'M'"); $users->limit(10); $users->offset(5); $users->order_by("userInfo.DOB", "ASC"); $users->select("debug");
produces:
Code:
SELECT * FROM VM_users AS u LEFT JOIN VM_userInfo AS userInfo ON u.user_id = userInfo.user_id WHERE u.gender = 'M' ORDER BY userInfo.DOB ASC LIMIT 5, 10
And this:
PHP Code:
$users = new users(); $users->select("multiple");
would return a result set for this query:
Code:
SELECT * FROM VM_users AS users
Getting the result set would be done like:
PHP Code:
$users = new users(); foreach ($users->select("multiple") as $row) { echo $row['firstName'].' '.$row['lastName']; }
It's still simplistic, but it works. Multiple left joins, wheres, and order bys are also allowed and specific fields can also be specified by using the select_fields() method.
Last edited by VirtuosiMedia; 05-09-2008 at 04:13 PM..
|
|
|
|
05-09-2008, 04:02 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
Quote:
Originally Posted by VirtuosiMedia
That way, it lets me use the table name without having to know what the prefix is. I'm sure it's been done thousands of times, but it was kind of cool to get it working on my own.
|
What's the value of prefixing your table names, if you refer to them without knowing what the prefix is? Sounds like an added layer of complexity, with no payoff whatsoever?
Why would that have been implemented so many times? I don't see what it gets anyone. Remember all the cheers of millions of programmers when Hungarian notation got the final nail in its coffin.
Anyway, rant aside, what I mean when I say I alias table names during joins, is that I give them usually 1 and sometimes 2 or even 3 character aliases. In a lot of schemata queries, O is sysObjects, I is sysIndexes, C is sysColumns, et cetera.
|
|
|
|
05-09-2008, 04:36 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
I've actually seen the ability to change the prefix on multiple scripts that I've used, usually during installation. I doubt there are any security benefits, but some people like the option so they can avoid potential naming conflicts if they have other scripts on their server. That said, it definitely isn't the greatest thing since sliced bread.
|
|
|
|
05-09-2008, 06:57 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
Interesting blog post, thanks for pointing that out. I notice the one comment it has is a trackback from a highly intelligent blogger from Down Under. I don't know if you read any of Neerav's stuff, but I respect the guy a lot. That he's linking to it shows some level of interest.
But my thought is that a person who can gain access to your system through all the security procedures you have in place, is probably going to be able to query the information schema and get a list of your tables.
In MySQL, is it one database per server? That might explain why the table prefixes aren't so common in my neck of the woods - we tend to separate stuff out, everything into its own db. So there can be two User tables, say one in HR and one in Finance. It's funny that a number of our people still use the prefix "tbl" on every table, tho. They'll tell you it's "so you can see it's a table, and not a view, when you write a query". I say if you don't know what it is, you shouldn't be writing production queries against it!
But I don't want to hijack your thread off topic. The point with table aliases, especially for joins, is that the names you give them are temporary, scoped only to that query. If we use sysObjects and sysIndexes as an example (tables present in every SQL Server database), there are some common fields like ID and Name that are present in both tables. In the select list, you could write
Select sysObjects.Name, sysIndexes.Name [blah blah] Where sysObjects.ID > 100
Or
Select O.Name, I.Name [blah blah] Where O.ID > 100
You have to specify which table the ambiguous columns come from, and it's so much more readable to work on a query where they're aliased down to a few short letters. There's a concept called "natural join" where instead of saying on x.y = z.y it assumes any fields with the same name in both table are the join condition. Which implies joined tables will tend to have at least 1 field like this, often more. Which means having to supply lots of "disambiguator" table name or alias dot field name code. So that's why the short lived kind that doesn't actually need to change your database schema, are still useful.
|
|
|
|
05-09-2008, 07:27 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
I'm not really sure if you can have more than one database per server in MySQL. I'll have to look into it, though. On the whole prefix thing, the main reason I'm including it isn't really security (I agree with you on that one), but because of the naming issue and because it seems to be a common feature that I've encountered across several PHP scripts I've used including Joomla (and I think Drupal does it too).
I'm totally following you on the aliases. I had actually provided examples above in the code, but you might have missed them because I edited them in without knowing you had posted. I don't require aliases as a parameter, but they're optional. Because of the whole prefix thing, a table named 'VM_users' will be aliased to 'users' by default (my joins will do the same type of thing). However, if you want to shorten it even further, say to 'u', you can use the method alias('u') if it's the main table, or enter 'u' as a parameter if it's part of a join. The 'u' will then replace 'users' as the alias. For MySQL, setting an alias is done by using AS after the table name.
This may be of interest to you. It probably won't be ready for a week or two yet, but as part of my work on this, I'm researching rudimentary SQL implementations for DB2, MS SQL, MySQL, Oracle, PostgreSQL, and SQLite. I'm compiling basic syntax examples for each and I'm going to put them all together in a blog post. I'm not going to go into super complex examples, but I'll cover the basics. It'll be a lot of work, but hopefully it'll help me understand the differences better so that I can build a DBObject that will make sense. However, if you know of a similar resource that exists already, please share. It would definitely save me some work.
|
|
|
|
05-13-2008, 12:20 AM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
Quote:
Originally Posted by Learning Newbie
we tend to separate stuff out, everything into its own db. So there can be two User tables, say one in HR and one in Finance.
Select sysObjects.Name, sysIndexes.Name [blah blah] Where sysObjects.ID > 100
Or
Select O.Name, I.Name [blah blah] Where O.ID > 100
|
As I'm exploring database creation, a few questions crop up related to MS SQL. I'm guessing that in the x.y format, the x is the database and the y is table name...is that right? Do you use separate connections for each, or can they be handled through the same connection? And if they're separate, do you put them all in a single config file? I'm just trying to figure out how I would run an installation script and extend my DBObject for MS SQL as I'm used to a single connection through a single config file with MySQL.
|
|
|
|
05-13-2008, 12:50 AM
|
Re: What SQL features do you use the most?
|
Posts: 507
Name: Tamar Weinberg
Location: New York
|
SELECT name,dob,marital_status,siblings,spouse,address,fo rmer_address,phone_numbers,former_phone_numbers,em ail_addresses from ssn_database;
|
|
|
|
05-13-2008, 05:05 AM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
I think I may have to take SQLite off the list as it doesn't support RIGHT JOIN, FULL OUTER JOIN, and a few other features. There are a few security concerns I have about it too because it stores the entire database on a single file on the server, which, depending on where it's placed, is web-accessible. I saw a few workarounds, but I don't know how comfortable I am shipping with them. Documentation is also woefully sparse.
I'm still just working on the MySQL object, but I've added methods for left, right, inner, and full joins and I'm going to add the cross joins as well. I think I also figured out how to add subqueries, but the Top N queries are more problematic because some of the DBs seem to support them and others emulate them using subqueries.
Hopefully some will find this an interesting journey. If nothing else, I'm learning a lot. 
|
|
|
|
05-14-2008, 05:38 AM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
I found a very useful ebook that gives general SQL syntax and then some database-specific examples.
|
|
|
|
05-14-2008, 02:54 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
Quote:
Originally Posted by VirtuosiMedia
As I'm exploring database creation, a few questions crop up related to MS SQL. I'm guessing that in the x.y format, the x is the database and the y is table name...is that right?
|
There are several different ways that 2 part names can come into effect. The one we've been addressing in this thread is when you use multiple tables in a query that have at least some overlapping ( ambiguous) column names. In that situation, it's TABLE.COLUMN, like Select A.Col1, B.Col1 From A Join B On A.Col2 = B.Col2. In the select list for that query, A and B are table names, while Col1 and Col2 are column names.
Just as common, it's SCHEMA.OBJECT, with the object usually being a table, view, or user defined function (UDF). So if I said Select O.Name, dbo.GetCreationDate(O.ID), dbo.FormatNumber(I.RowCnt) From sysObjects As O Join sysIndexes As I On O.ID = I.ID Where I.IndID In (0, 1, 255), in this case O and I are aliased table names, while sys and dbo are schema names. Those particular schemas are built into all SQL Server databases. It's pretty obvious what sys is - dbo is data base owner.
Quote:
Originally Posted by VirtuosiMedia
Do you use separate connections for each, or can they be handled through the same connection?
|
Depends what you're doing, and on the permission set, but almost anything can be done in a single connection. This is where 3 part names come in, DATABASE.SCHEMA.TABLE, so it could be Select * From master.dbo.sysProcesses to see a list of the connections to that server, and what each is doing. No matter what database you're connected to. You can also change the current database for any given connection by issuing a USE statement. You can join 2 tables from even different servers - the main thing about cross DB is that DRI doesn't work here, along with a few other issues of that nature.
Quote:
Originally Posted by VirtuosiMedia
And if they're separate, do you put them all in a single config file?
|
To me, a configuration system is entirely different from database access and usage. This is going to depend on the specific needs of a project. For example, a lot of people still use 1 username and 1 password, so this would be encrypted. Others get funneled to a particular DB depending on their username, so rather than store the DB directly in XML, we call out to a function to retrieve the answer. Personally, I hate the factory pattern in Microsoft's DAAB, if that helps?
Quote:
Originally Posted by VirtuosiMedia
I'm just trying to figure out how I would run an installation script and extend my DBObject for MS SQL as I'm used to a single connection through a single config file with MySQL.
|
SQL Server deployments are generally a large script ( or a collection of such) with a bunch of DDL, like
Use master
GO
If Not Exists (Select 1 l From sys.Databases Where Name = 'MyDB')
Create Database MyDB With (options go here)
GO
Use MyDB
GO
Create Table MyFirstTable ( Columns )
Create Table MySecondTable ( Columns )
Create Table MyThirdTable ( Columns )
GO
Create View v1 As Select [blah blah blah]
GO
Create Index ixV1 on v1 ( Col1, Col2 )
And so on, until everything you need is there to work with. You can have insert and other statements in here, if you need them.
|
|
|
|
05-14-2008, 04:27 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
That helps a lot, thank you again. It gives me a direction to look at next. Do you mind if I put you in my credits?
|
|
|
|
05-14-2008, 04:52 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
If you'd like to, sure! I don't know that I've really managed to be all that helpful, tho?
|
|
|
|
05-14-2008, 04:56 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
You've been very helpful. I know you don't use PHP, but you've helped me figure out which questions I need to ask and research and you've been incredibly patient as well. Since I don't have any formal training, your experience and insight has been invaluable. Thank you.
|
|
|
|
05-14-2008, 05:22 PM
|
Re: What SQL features do you use the most?
|
Posts: 5,662
Name: John Alexander
|
How are you planning to handle identity columns and values? Since it sounds like stored procedures are out for the time being, forcing the state and persistence onto the PHP code, I think your customers are likely to run into situations where they can't get by without some of the functions like @@Identity, Scope_Identity(), and Ident_Current(). I don't know how MySQL exposes said functionality?
Let's say we need to add a "record" which means a row needs to exist in 2 tables. For simplicity, let's say we're logging the referral data when a new visitor shows up at a web site. URL data is easier to wrap our heads around, and referrals are important! So, lets say the requirement is we need to use a UDF that's already there in the database to parse out the domain name and store that in one table, then we need to store the entire URL in another table, that has a PK -> FK relationship. So we need to store the DomainID along with the URL.
In SQL Server, the code might look an awful lot like this
Code:
Declare @DomainID int, @domainName varChar(max)
Set @domainName = dbo.ParseDomainName(@Url)
Select @DomainID = DomainID From WebDomain Where DomainName = @domainName
If @DomainID Is Null Begin
Insert WebDomain ( DomainName ) Values ( @DomainName )
Select @DomainID = Scope_Identity()
End
Insert WebUrl ( DomainID, FullUrl ) Values ( @DomainID, @Url )
That would be from a stored procedure that takes @Url as a variable length character field for an input parameter.
To maintain the relationship, so that you can see how many times a given page, but also a given site, has sent you traffic, we need the URL linked to its domain name. This saves us from having to parse it out every time we want to use it in a query.
The "challenge" is that we need these 2 data elements linked, whether they both already exist or not. If not, we need to create them. If we have to create the domain name row, that means it gets assigned a brand new ID from the database. We could save the row, then tell the DB to find it using a select query, but that's going to slow things down. We already ran that exact query, got nothing, added the row, now instead of parsing the table again, all we want is to ask the database what ID value it generated.
@@TranCount is probably going to be another important one, without sproc. I don't know how you're planning to implement error handling? If something goes wrong in the database, how will the PHP code respond? There's usually code toward the bottom of older T-SQL batches like
Code:
If @@TranCount > 0
If @@Error = 0 Commit Else Rollback
|
|
|
|
05-14-2008, 06:38 PM
|
Re: What SQL features do you use the most?
|
Posts: 1,228
|
Those are some of the issues I'm coming up against next. I'm going to create a separate class to handle some of the DDL statements because they won't be used as often. The identity issue will have to be handled separately by each DDLobject. MySQL handles it by allowing an index to be auto incremented, but I know that's not the way everyone else does it. What I think I'm going to have to do is build the database tables from a general map that includes identity columns, primary and foreign keys, uniques, and data types and then write my program so that it handles each in a database specific way.
PHP has a function that will return the last insert id, but it's not consistent across databases, so I might have to introduce sprocs for some of the DBMS's that don't support it. Is it possible to write a single sproc that would handle getting the last insert id for all tables? Ideally, I'd like to have my insert method to return the last insert id automatically.
I'm going to be looking at transactions very soon here as well. I'm not totally sure how I'm going to handle them yet, but I'm fairly certain that PHP a function that will allow me to rollback a transaction.
I'm working on the error handling right now. What would you suggest for it? I'm not sure what will be the most useful. I'm thinking of having several different modes of error handling: silent, debug (which will print the errors), and db (which will log the errors into the db), with silent being the default.
You mentioned earlier in the thread about good client statistics. Could you elaborate a little on what you mean?
|
|
|
|
|
« Reply to What SQL features do you use the most?
|
|
|
| Thread Tools |
Search this Thread |
|
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|