Forgot your password?
typodupeerror
Databases

Diagramming Tool For SQL Select Statements 156

Posted by kdawson
from the unique-like-a-snowflake dept.
alxtoth writes "Snowflake is a new BSD-licensed tool that parses SQL Select statements and generates a diagram. It shows parts of the underlying SQL directly in the diagram. For example: x=30, GROUP BY (year), SUM (sales), HAVING MIN (age) > 18. The primary reason for the tool was to avoid Cartesian joins and loops in SQL written by hand, with many joined tables. The database will execute such a statement, if syntactically correct, resulting in runaway queries that can bring the database down. If you sit close to the DBAs, you can hear them screaming... "
This discussion has been archived. No new comments can be posted.

Diagramming Tool For SQL Select Statements

Comments Filter:
  • by larry bagina (561269) on Sunday August 03, 2008 @09:09PM (#24461177) Journal
    Posted by kdawson
  • by Craig Ringer (302899) on Sunday August 03, 2008 @09:14PM (#24461217) Homepage Journal

    Execution of SQL statements can require the RDBMS to perform nested loops over parts of the query execution.

    This can be an issue if the DBMS is forced to do something like perform a sequential scan of one table for each record matched in another table. That gets expensive *fast*.

    There are many other possible performance issues, of course.

    However, I don't see how SQL parsing can tell you much about the performance characteristics of the query. The database's query optimiser makes choices about how to execute the query, and is free to change its mind depending on configuration parameters, available resources, system load, disk bandwidth, present indexes, statistics gathered about data in the table, etc. PostgreSQL's planner for example does make heavy use of table statistics, so query plans may change depending on the quantity and distribution of data in a table.

    Any decent database can already tell you how it will execute a query (and usually give you a performance readout from an actual execution of the query). There are plenty of GUI tools for displaying the resulting query plan output graphically. PgAdmin-II can do it, for example.

    A simple SQL parser can have no idea about what indexes are configured, the distribution of the data, how much working memory the database has available for sorts and joins, etc. The database knows these things - and can already tell you how it will, or did, execute a query - so why not let it do its job?

    The whole project doesn't make much sense.

  • EXPLAIN (Score:5, Insightful)

    by Craig Ringer (302899) on Sunday August 03, 2008 @09:22PM (#24461257) Homepage Journal

    I don't see what this has over EXPLAIN [postgresql.org] and an appropriate graphical display tool like PgAdmin-III [pgadmin.org]. There are large numbers of tools that display graphical query plans [postgresonline.com] - and unlike this simple SQL parser, they know how the database will actually execute the query once the query optimiser [wikipedia.org] is done with it.

    Furthermore, a simple SQL parser has no idea about what indexes [wikipedia.org] are present, available working memory for sorts and joins, etc. It can't know how the DB will really execute the query, without which it's hard to tell what performance issues may or may not arise.

    See comment 24461217 [slashdot.org] for a more detailed explanation of why this whole idea makes very little sense.

  • by SQLGuru (980662) on Sunday August 03, 2008 @09:25PM (#24461281) Journal

    I use diagrams as a tuning tool, but only to look for paths that don't make sense or alternate paths through tables or for "dead-ends"......but these are things that a computer can't really tell you because they require an understanding of the data.

    But you're right, the explain plan is the single most useful tool for tuning a query. If you understand how the engine is going to execute the query you know what areas you can affect. And tuning is manipulaing those effects in a way that makes the query faster.

    Layne

  • by lastchance_000 (847415) on Sunday August 03, 2008 @09:27PM (#24461295)

    Quis custodiet ipsos custodes?

  • by scdeimos (632778) on Monday August 04, 2008 @01:46AM (#24462819)

    No single query will ever bring a (real) RDBMS down. Even on a terrabyte of data or more, doing a crazy multi-hundred-table cross join, you're not going to bring it down.

    You've obviously not tried anything simple on MS-SQL, like expanding a varchar(4) column to nvarchar(10) on a table with a few million rows. MS-SQL spins its wheels filling-up the transaction log until it overflows, then rolls it all back again. A 4GB log file, filled with a 250meg table (and no indexes because they were already dropped)?

    In the end we had to drop all FK refs, select * into another table, drop the original table then select * (with conversions) into newTableWithOriginal's name and reset all the FK's. *shakes head*

  • by weicco (645927) on Monday August 04, 2008 @02:29AM (#24463027)

    Oh damn. I had to buy a whole PC set to run Debian. I guess Linux isn't free after all.

  • by Venik (915777) on Monday August 04, 2008 @03:03AM (#24463183)
    SQL code is usually developed on some small server or the DBA's own workstation. The dev database is representative of the prod version only in structure and not in size. So this type of errors sometimes go unnoticed until the code is migrated to the prod environment. The effect of such errors vary depending on server architecture. The most sensitive are HA cluster environments, where the clustering engine overreacts and starts failing things over, exacerbating the problem.
  • by lucm (889690) on Tuesday August 05, 2008 @11:10PM (#24491289)

    > myspace.com has been supplated by Facebook.

    Facebook being more popular than mySpace has nothing to do with the database back-end. If you need more big customers for SQL Server 2005, they are easy to find: Barnes & Nobles, HMV online music store, NASDAQ (over 5000 transactions/sec).

    So basically your statement that SQL Server is a toy database might have attracted a few claps 6 or 7 years ago on Slashdot, but the reality is that SQL Server is a robust product finding its way in many markets. As one could say: "2001 just called, they want their SQL Server rant back".

    > And DB2 is the granddaddy that is being trusted by ALL banks.

    I do not have data about banks and database, however I suspect that since many smaller banks are still using OS/400, there must be a lot of DB2 out there. For the bigger banks, I sincerely doubt there is any RDBMS laying around, except for OLAP or e-banking, in which case DB2 won't be in the contenders (e-banking is the land of Oracle, SQL Server and sadly Interbase). For the real backbone, big money usually sticks to big iron, which usually means hierarchical databases.

    According to Gartner, the current market share is the following: Oracle 47%, IBM 21%, Microsoft 17%, with Microsoft closing the gap on IBM every year. And let's not forget that IBM's 21% includes Informix/Cheetah, Cloudscape, etc, not only DB2.

    > And banks are the most thorough corporate IT customers.

    This is an urban legend. Banks are not even the most conservative IT customers. I've been involved in three e-banking projects, and it amazed me how careless those people can be with data integrity and maintenance.

    The most impressive IT customers I met are insurance companies and space industry companies (not defense contractors). I've seen my way in many datacenters, and only in insurance companies did I see figures about the heat dissipation of network cable.

"Turn on, tune up, rock out." -- Billy Gibbons

Working...